Analyzing in-game tweets from OKC vs Memphis Game 4

During an NBA playoff game, Twitter can be a fun place.  With all the momentum changes and big shots, tweets stream in so quickly that it can be hard to keep up with them.  Twitter’s streaming API allows you to gather tweets in real time using various search parameters.  You only get a small sample of tweets, about 1% from what I have read, but it still allows for some fun analysis.  Before last night’s Game 4 of the Memphis-OKC series I decided to gather tweets containing the words “westbrook”, “durant” and “scott brooks”.  I graphed the number of tweets each minute, starting a half hour or so before the game and ending shortly after it, and drew some vertical lines to show the start and end of each period.
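A rough sketch of the per-minute counting behind a graph like this. It assumes the tweets have already been collected as (timestamp, text) pairs; the collection step, the function name and the sample rows are all placeholders rather than the script actually used.

```python
from collections import Counter
from datetime import datetime

KEYWORDS = ("westbrook", "durant", "scott brooks")


def tweets_per_minute(tweets, keywords=KEYWORDS):
    """Count tweets mentioning each keyword, bucketed by minute.

    `tweets` is an iterable of (timestamp, text) pairs (assumed shape).
    A tweet that mentions two keywords is counted once for each.
    """
    counts = {kw: Counter() for kw in keywords}
    for timestamp, text in tweets:
        # Truncate the timestamp to the start of its minute.
        minute = timestamp.replace(second=0, microsecond=0)
        lowered = text.lower()
        for kw in keywords:
            if kw in lowered:
                counts[kw][minute] += 1
    return counts


# Made-up rows with an arbitrary date, only to show the input shape.
sample = [
    (datetime(2000, 1, 1, 20, 0, 5), "Westbrook for three!"),
    (datetime(2000, 1, 1, 20, 0, 40), "westbrook needs to pass to Durant"),
    (datetime(2000, 1, 1, 20, 1, 2), "Scott Brooks has to call timeout"),
]
counts = tweets_per_minute(sample)
```

Each keyword's counter can then be plotted as one line, with the minute on the x axis.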


Not surprisingly, the biggest spikes in tweet volume center around Russell Westbrook.  The first big spike is from the 3 pointer he made right before halftime.  The two big spikes for Westbrook in overtime came after he held the ball for an entire possession and then missed a 3 as the shot clock was ticking down, and after his offensive rebound was immediately followed by a turnover with about 30 seconds left.  Some common words in Westbrook tweets were “jackson”, “fuck”, “trade” and “pass”.

Durant-related tweets came in at a steadier rate, increasing as the game neared the end of the 4th quarter and he continued to miss shots.

Scott Brooks-related tweets were pretty quiet until Memphis pulled close in the 4th quarter and OKC’s iso-heavy offense was coming up empty on most possessions.  Over 10% of the Scott Brooks tweets contained the word “fire”.


Front Court Touches by Game in the Playoffs

Front court touches data for each team from the SportVU touches table, broken down by game.  I will try to update these daily.







Golden State








San Antonio





Playoff Series Scoring Distributions

I decided to look at the scoring distributions for each series so far, using the SportVU stats and box score stats to visualize how teams have been scoring in each game.  From these I came up with six categories: Drives, Close, Catch and Shoot, Pull Up, Free Throws and Fast Break.  The first four come from the SportVU numbers and are defined as follows:

  • Drives: Any touch that starts at least 20 feet from the hoop and is dribbled within 10 feet of the hoop, excluding fast breaks
  • Close: Points that are scored by a player on any touch that starts within 12 feet of the basket, excluding drives
  • Catch and Shoot: Any jump shot outside of 10 feet where a player possessed the ball for 2 seconds or less and took no dribbles
  • Pull Up Shots: Any jump shot outside 10 feet where a player took 1 or more dribbles before shooting

The totals from all six categories don’t sum exactly to a team’s final score.  Some points a team scores might not fall into any of these categories and some points might be counted in multiple categories, but for most games the total comes very close to the team’s final score.  It’s not perfect, but I think this still gives a good visualization of how teams are scoring their points.
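The four SportVU definitions above can be written as simple rules, which also shows how one play can land in two categories. This is only a sketch: the dictionary keys are hypothetical field names, and treating "within 10 feet" and "within 12 feet" as inclusive is an assumption.

```python
def categories(touch):
    """Return the set of SportVU-style categories a scoring touch falls into.

    `touch` is a dict with hypothetical keys:
      start_dist  distance from the hoop (feet) where the touch began
      min_dist    closest the ball handler got to the hoop (feet)
      shot_dist   distance of the shot (feet)
      dribbles    number of dribbles before the shot
      touch_time  seconds the player held the ball
      jump_shot   True if the attempt was a jump shot
      fast_break  True if the touch came on a fast break
    """
    cats = set()
    is_drive = (
        touch["start_dist"] >= 20
        and touch["dribbles"] > 0
        and touch["min_dist"] <= 10
        and not touch["fast_break"]
    )
    if is_drive:
        cats.add("Drives")
    if touch["start_dist"] <= 12 and not is_drive:
        cats.add("Close")
    if touch["jump_shot"] and touch["shot_dist"] > 10:
        if touch["dribbles"] == 0 and touch["touch_time"] <= 2:
            cats.add("Catch and Shoot")
        elif touch["dribbles"] >= 1:
            cats.add("Pull Up")
    return cats


# Made-up plays: a drive that ends in a step-back jumper, a spot-up three
# and a post-up finish.
step_back = dict(start_dist=25, min_dist=9, shot_dist=11, dribbles=3,
                 touch_time=4.0, jump_shot=True, fast_break=False)
spot_up = dict(start_dist=24, min_dist=24, shot_dist=24, dribbles=0,
               touch_time=1.0, jump_shot=True, fast_break=False)
post_up = dict(start_dist=8, min_dist=4, shot_dist=4, dribbles=2,
               touch_time=3.0, jump_shot=False, fast_break=False)
```

The step-back play satisfies both the drive rule and the pull-up rule, which is the kind of double counting mentioned above.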

Indiana vs Atlanta


Miami vs Charlotte


Toronto vs Brooklyn


Chicago vs Washington


San Antonio vs Dallas


OKC vs Memphis


Clippers vs Golden State


Houston vs Portland



Usage, efficiency and uncontested shot player similarities

I posted a while ago about hierarchical clustering of player shot distributions to visualize similarities between shot distributions among players.  With the new player tracking boxscores including uncontested field goal attempts, I decided to do a similar analysis to see similarities between a player’s usage, effective field goal percentage and percentage of their shots that are uncontested.  I only used players in the top 100 in field goal attempts because the graph gets too cluttered with more players than that.  Here is the graph:

Hierarchical clustering of usage, efficiency and uncontested shots


Players very similar to each other are connected with shorter lines, so Damian Lillard and Mike Conley are very similar.  LeBron James and Stephen Curry are most similar to each other, but not all that similar, and they aren’t really similar to any other players: a long line connects the two of them and another long line connects their cluster to the next one.  The same is true to a lesser extent for Kevin Durant and Dwyane Wade.  You can really see the clustering of the low post big men (high eFG%, low % of uncontested shots) at the bottom and the guys who get a lot of catch and shoot 3s at the top right (low usage, high eFG%, high % of uncontested shots).
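To make "connected with shorter lines" concrete, here is a small sketch of the first step of agglomerative clustering: put the three stats on a common scale and find the two players closest together. Standardizing with z-scores and using Euclidean distance are assumptions, and the players and numbers are placeholders, not real data.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so usage, eFG% and uncontested % share a scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def closest_pair(names, rows):
    """Return the two players with the smallest standardized distance.

    This is the pair a hierarchical clustering would merge first, so it
    ends up joined by the shortest line in the dendrogram.
    """
    z = standardize(rows)
    pairs = combinations(range(len(names)), 2)
    i, j = min(pairs, key=lambda p: dist(z[p[0]], z[p[1]]))
    return names[i], names[j]


# Placeholder players: (usage %, eFG%, share of shots uncontested).
names = ["Player A", "Player B", "Player C", "Player D"]
rows = [
    (30, 0.55, 0.40),
    (29, 0.54, 0.41),
    (15, 0.60, 0.20),
    (14, 0.58, 0.70),
]
pair = closest_pair(names, rows)
```

A full clustering repeats this merge step, treating each merged pair as a new cluster, until everything is connected.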

When it may be good to “Hack-a-Shaq”

There is a good post here evaluating the math behind hacking a bad free throw shooter and finding the break-even point for hacking.  Doing the math, we can see that unless the free throw shooter is really bad, it is often a bad long-term decision to start hacking.  But since there are a finite number of possessions in a game and the object is to have more points than the other team at the end, there can be times when the best play to win the game isn’t the optimal long-term play.  A simple example is a tie game where a team can hold for a game-winning shot at the buzzer: the team is better off shooting a long 2 than a 3.  The 3 point shot has a higher expected value, but since it will be the last shot in a tie game, all that matters is how often you score a basket, which makes a long 2 that you will make 40% of the time better than a 3 that you will make 35% of the time.

Applying the same line of thinking to hacking a bad free throw shooter, we can look for times when hacking is a good short-term decision despite being a bad long-term one.  For example, we have seen teams use it to slow down the opposing team in an attempt to get back into the game.  Maybe a team down 10 with 4 minutes left and needing a quick run would be better off hacking a bad free throw shooter despite it being a losing play long term.
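The long 2 versus 3 example works out as follows, using the 40% and 35% figures from the text. The variable names are just for illustration.

```python
def expected_points(make_prob, value):
    """Long-run points per attempt."""
    return make_prob * value


long_two = {"make_prob": 0.40, "value": 2}
three = {"make_prob": 0.35, "value": 3}

ev_two = expected_points(**long_two)
ev_three = expected_points(**three)

# In a tie game on the final shot, any made basket wins, so the chance of
# winning in regulation is just the make probability.
win_two = long_two["make_prob"]
win_three = three["make_prob"]

print(f"long 2: EV {ev_two:.2f}, wins at the buzzer {win_two:.0%}")
print(f"3:      EV {ev_three:.2f}, wins at the buzzer {win_three:.0%}")
```

The 3 is worth more per attempt over the long run, but the long 2 wins the game more often when only one shot is left.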

To look for these types of situations I ran a simple model giving each team 5 possessions and comparing the outcomes of hacking and not hacking.  Using play by play data I got the percentage of the time a possession ends in 0, 1, 2, 3 and 4 points and used these percentages to get the outcomes for the team not getting hacked.  For the team getting hacked I used a 50% free throw shooter, since that is roughly the long-run break-even point.  I used an offensive rebounding rate on missed free throws of 13.8%, since in my previous post I found that to be the offensive rebounding rate on free throws shot by 50% and under free throw shooters.  I assumed that if they do get the offensive rebound, the hacking team will hack the bad free throw shooter again.  These are all league average numbers, so the results would be slightly different if you used team-specific numbers to determine the effectiveness of the hacking strategy against a specific team.
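A minimal sketch of a model like the one described above. The 50% free throw shooter, the 13.8% offensive rebound rate and the 5 possessions per team come from the text. The per-possession scoring weights are NOT the play by play numbers used for the post (those are not listed), so they are placeholder assumptions, as is counting a re-hack after an offensive rebound as part of the same possession.

```python
import random
from collections import Counter

# ASSUMED chance of a normal possession ending in 0, 1, 2, 3 or 4 points.
# Placeholder weights, not the league numbers used in the post.
POINT_VALUES = [0, 1, 2, 3, 4]
POINT_WEIGHTS = [0.50, 0.03, 0.36, 0.10, 0.01]

FT_PCT = 0.50      # free throw percentage of the hacked shooter
FT_OREB = 0.138    # offensive rebound rate on a missed free throw
POSSESSIONS = 5    # possessions per team


def normal_possession(rng):
    return rng.choices(POINT_VALUES, weights=POINT_WEIGHTS)[0]


def hacked_possession(rng):
    """Two free throws; an offensive rebound on a missed second shot
    leads to another hack (assumed to be part of the same possession)."""
    points = 0
    while True:
        if rng.random() < FT_PCT:       # first free throw
            points += 1
        if rng.random() < FT_PCT:       # second free throw made
            return points + 1
        if rng.random() >= FT_OREB:     # missed, defense rebounds
            return points
        # Offensive rebound: the shooter is hacked again.


def simulate(hack, runs=100_000, seed=0):
    """Percentage of runs at each point differential, from the point of
    view of the team deciding whether to hack."""
    rng = random.Random(seed)
    opponent = hacked_possession if hack else normal_possession
    diffs = Counter()
    for _ in range(runs):
        ours = sum(normal_possession(rng) for _ in range(POSSESSIONS))
        theirs = sum(opponent(rng) for _ in range(POSSESSIONS))
        diffs[ours - theirs] += 1
    return {d: 100 * n / runs for d, n in sorted(diffs.items())}


def tail(distribution, cutoff=-4):
    """Chance (in %) of losing at least `-cutoff` points of ground."""
    return sum(p for d, p in distribution.items() if d <= cutoff)


# Fewer runs than the 100,000 in the post, to keep the example quick.
no_hack = simulate(hack=False, runs=20_000)
with_hack = simulate(hack=True, runs=20_000)
```

Because the weights are placeholders, the percentages will not match the table in the post; the sketch is only meant to show the structure of the simulation.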

I simulated 100,000 runs of this model and got the percentage of times each point differential occurred over the 10 total possessions.  Here is the distribution of outcomes for hacking and not hacking from the perspective of the team doing the hacking.

Points gained   No hacking (%)   Hacking (%)
-10             0.17             0.01
-9              0.41             0.06
-8              0.86             0.23
-7              1.63             0.71
-6              2.78             1.69
-5              4.13             3.3
-4              6.31             5.54
-3              7.52             8.18
-2              10.14            10.7
-1              9.98             12.49
0               11.97            13.08
1               9.95             12.45
2               10.19            10.63
3               7.47             8.15
4               6.31             5.72
5               4.16             3.53
6               2.8              1.96
7               1.62             0.96
8               0.84             0.41
9               0.42             0.14
10              0.17             0.05
11              0.06             0.01

From this it looks like the best time to hack a bad free throw shooter is when a team has a lead and wants to prevent its opponent from making a run to get back in the game.  The chances of the hacking team giving up 4 or more points of ground over the 5 possessions each are 11.5% when hacking and 16.3% when not hacking.  So when a team is up 10 with 4 minutes left against a team with a bad free throw shooter, it might be a good time to start hacking.
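The 11.5% and 16.3% figures are the sums of the rows from -10 through -4 in the table above, which can be checked directly:

```python
# (points gained, % with no hacking, % with hacking), copied from the
# rows of the table for -10 through -4.
ROWS = [
    (-10, 0.17, 0.01),
    (-9, 0.41, 0.06),
    (-8, 0.86, 0.23),
    (-7, 1.63, 0.71),
    (-6, 2.78, 1.69),
    (-5, 4.13, 3.3),
    (-4, 6.31, 5.54),
]

no_hack_tail = round(sum(row[1] for row in ROWS), 1)
hack_tail = round(sum(row[2] for row in ROWS), 1)

print(f"no hacking: {no_hack_tail}%  hacking: {hack_tail}%")
```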

This model obviously isn’t perfect.  I think it underpredicts significant runs since it doesn’t account for how the possessions interact with each other.  For the sake of simplicity, this model just assumes a team will score 0, 1, 2, 3 or 4 points at the league average rate no matter what the other team did on the previous possession, which obviously isn’t realistic.  For example, a live-ball turnover will often lead to a fast break with a much higher expected value than an average possession.  I think if I expanded the model to account for this it would lead to more outcomes with 5 or more points gained or lost.  If that were the case, it would actually make hacking to protect a lead a better decision than the results this simple model produces.

Another thing this model doesn’t account for is how a team will change its strategy depending on the score.  For example, a team looking to make a comeback may try shooting more 3s in an effort to get back in the game.  This would also probably lead to an increase in the number of outcomes in the 5 or more points gained or lost range.

All in all it’s a simple model but the results make sense.  It’s hard for a team to make a run to get back into a game when all their possessions are simply a 50% free throw shooter shooting free throws.

Offensive Rebounding on Missed Free Throws

I decided to look at offensive rebound rates on missed free throws and see if there is any difference depending on the shooter’s free throw percentage. I used the play by play data from the NBA Stats website for all seasons from 1996-97 to 2012-13.  I grouped the data by 5% intervals using the player’s career free throw percentage.  The OREB% for all missed free throws is 12%.

From the results it looks like if you are a bad free throw shooter there is a slight increase in the offensive rebound rate.


Career FT%     OREB rate
50 or under    0.13794339
50-55          0.13431176
55-60          0.12192710
60-65          0.12684097
65-70          0.12978457
70-75          0.11811789
75-80          0.12121092
80-85          0.10648898
85+            0.08268259
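The grouping behind a table like this can be sketched as below. The input shape (one record per missed free throw that could be rebounded) and the handling of values that fall exactly on a bucket edge are assumptions; the sample records are made up.

```python
from collections import defaultdict


def ft_bin(career_ft_pct):
    """Map a career FT% (0-100) to the 5-point buckets used in the table.

    Edge handling is an assumption: exactly 50 goes to "50 or under",
    and other edges go to the higher bucket.
    """
    if career_ft_pct <= 50:
        return "50 or under"
    if career_ft_pct >= 85:
        return "85+"
    low = int(career_ft_pct // 5) * 5
    return f"{low}-{low + 5}"


def oreb_rate_by_bin(missed_fts):
    """`missed_fts` is an iterable of (career_ft_pct, offensive_rebound)
    pairs, one per missed free throw."""
    totals = defaultdict(lambda: [0, 0])  # bucket -> [misses, off. rebounds]
    for pct, oreb in missed_fts:
        bucket = totals[ft_bin(pct)]
        bucket[0] += 1
        bucket[1] += 1 if oreb else 0
    return {name: orebs / misses for name, (misses, orebs) in totals.items()}


# Made-up records, only to show the calculation.
sample = [
    (45, True), (45, False),
    (72, True), (72, False), (72, False), (72, False),
    (88, False), (88, False),
]
rates = oreb_rate_by_bin(sample)
```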

Best Offensive and Defensive teams since 1996-97

Using the data from the NBA Stats website, I calculated the top offensive and defensive teams since the 1996-97 season (that’s as far back as the stats go on the NBA website) by comparing how many standard deviations from average a team was in each season.  It is interesting to note that Steve Nash was the starting point guard for the top 5 offences, twice with Dallas and three times with Phoenix.  Also note how good this year’s Pacers team is defensively.

Top 20 Offences:

Team Season Standard Deviations from mean
Phoenix Suns 2006_07 2.9
Dallas Mavericks 2003_04 2.8
Phoenix Suns 2004_05 2.5
Dallas Mavericks 2001_02 2.3
Phoenix Suns 2009_10 2.3
Chicago Bulls 1996_97 2.2
Sacramento Kings 2003_04 2.2
Utah Jazz 1996_97 2.1
Miami Heat 2012_13 2.1
San Antonio Spurs 2011_12 2
Oklahoma City Thunder 2012_13 2
Phoenix Suns 2005_06 2
Dallas Mavericks 2002_03 1.9
Phoenix Suns 2008_09 1.9
Indiana Pacers 1998_99 1.9
Utah Jazz 1997_98 1.9
Los Angeles Lakers 1997_98 1.8
Dallas Mavericks 2006_07 1.8
Dallas Mavericks 2005_06 1.8
Milwaukee Bucks 2000_01 1.8

Top 20 Defences:

Team Season Standard Deviations from mean
Indiana Pacers 2013_14 -3.1
Boston Celtics 2007_08 -2.6
San Antonio Spurs 2003_04 -2.3
San Antonio Spurs 2004_05 -2.3
Chicago Bulls 2010_11 -2.1
Indiana Pacers 2012_13 -2.1
Chicago Bulls 2006_07 -2.1
Detroit Pistons 2003_04 -2
Boston Celtics 2010_11 -2
Chicago Bulls 2011_12 -2
Orlando Magic 2008_09 -1.9
San Antonio Spurs 2006_07 -1.9
San Antonio Spurs 1998_99 -1.9
San Antonio Spurs 2005_06 -1.9
Houston Rockets 2006_07 -1.9
Boston Celtics 2011_12 -1.9
New Jersey Nets 2002_03 -1.9
Houston Rockets 2007_08 -1.8
Memphis Grizzlies 2012_13 -1.8
San Antonio Spurs 2001_02 -1.8
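The "standard deviations from mean" columns above are per-season z-scores. A sketch of that calculation follows, assuming the input is a list of (team, season, rating) rows and that the population standard deviation is used; the post does not say which rating or which form of standard deviation was used, and the sample rows are made up.

```python
from collections import defaultdict
from statistics import mean, pstdev


def season_z_scores(ratings):
    """Return (team, season, z) for each (team, season, rating) row,
    where z is measured against that season's league mean and spread."""
    by_season = defaultdict(list)
    for _team, season, rating in ratings:
        by_season[season].append(rating)
    stats = {s: (mean(v), pstdev(v)) for s, v in by_season.items()}
    return [
        (team, season, (rating - stats[season][0]) / stats[season][1])
        for team, season, rating in ratings
    ]


# Made-up ratings for a three-team league.
sample = [
    ("Team A", "S1", 110.0),
    ("Team B", "S1", 100.0),
    ("Team C", "S1", 90.0),
]
z_scores = season_z_scores(sample)
```

Sorting the result by z in descending order gives a ranking of offences, and sorting in ascending order gives a ranking of defences, since a lower rating allowed is better.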

Minnesota’s crunch time possessions

Minnesota is 0-10 in games decided by less than 4 points.  Most of this is due to their inability to score in the half court.  Thanks to the new NBA video play by play on the NBA Stats website, all of their crunch time possessions can be gathered.  Here are all their offensive possessions in the last minute of one possession games and with between 1:00 and 2:00 remaining in games within 5 points.  I made the script to get these pretty quickly, so it might not be a complete list, and there are some duplicated videos for plays that happen in quick succession (e.g. a missed shot followed by a tip in).  Click on the description to see video of the play.  It’s certainly not just bad luck that they are losing all these games.  There are a lot of ugly possessions.

game_id time score_margin description
0021300010 1:34 1 Missed Field Goal
0021300010 1:05 1 Turnover
0021300010 0:27 1 Turnover
0021300010 0:15 -1 Missed Field Goal
0021300010 0:10 -3 Made Field Goal
0021300047 1:38 -5 Missed Field Goal
0021300047 1:16 -5 Made Field Goal
0021300047 0:39 -3 Missed Field Goal
0021300047 0:38 -3 Made Field Goal
0021300047 0:01 -1 Missed Field Goal
0021300106 1:39 -4 Turnover
0021300106 1:04 -4 Made Field Goal
0021300106 0:03 -2 Missed Field Goal
0021300106 0:01 -2 Missed Field Goal
0021300106 0:00 -2 Missed Field Goal
0021300156 1:57 0 Missed Field Goal
0021300156 1:44 0 Missed Field Goal
0021300156 1:11 -2 Made Field Goal
0021300156 0:25 -2 Missed Field Goal
0021300168 1:51 -4 Missed Field Goal
0021300223 1:13 -5 Made Field Goal
0021300223 0:44 -3 Missed Field Goal
0021300322 1:16 4 Missed Field Goal
0021300322 1:15 4 Missed Field Goal
0021300322 1:15 4 Made Field Goal
0021300338 1:36 -3 Missed Field Goal
0021300338 1:08 -5 Missed Field Goal
0021300351 1:32 5 Foul
0021300358 1:58 -3 Missed Field Goal
0021300358 1:31 -3 Missed Field Goal
0021300358 1:06 -3 Made Field Goal
0021300358 0:24 -1 Missed Field Goal
0021300358 0:15 -3 Missed Field Goal
0021300406 1:29 3 Made Field Goal
0021300406 0:08 2 Turnover
0021300406 0:01 0 Missed Field Goal
0021300406 0:00 0 Missed Field Goal
0021300406 1:30 0 Missed Field Goal
0021300406 1:29 0 Turnover
0021300406 0:46 0 Made Field Goal
0021300406 0:24 -1 Missed Field Goal
0021300406 0:15 -3 Made Field Goal
0021300406 0:05 -2 Missed Field Goal
0021300455 1:38 -4 Made Field Goal
0021300455 0:59 -2 Turnover
0021300455 0:07 -2 Missed Field Goal
0021300455 0:00 -2 Missed Field Goal
0021300493 1:36 0 Missed Field Goal
0021300493 1:07 -2 Made Field Goal
0021300493 0:31 0 Missed Field Goal
0021300493 0:27 -2 Made Field Goal
0021300493 0:27 0 Foul
0021300493 0:02 -2 Foul
0021300524 1:51 5 Foul
0021300524 1:15 4 Missed Field Goal
0021300524 0:45 2 Turnover
0021300524 0:24 1 Turnover
0021300524 0:00 -1 Missed Field Goal
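The filter described before the table can be sketched as a small predicate on the game clock and score margin. Treating "one possession" as a margin of 3 or fewer and putting exactly 1:00 in the second window are assumptions; the period check (4th quarter or overtime) is left out because the table has no period column.

```python
def clock_to_seconds(clock):
    """Convert an "M:SS" game clock string to seconds remaining."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def is_crunch_time(clock, score_margin):
    """True if a play falls in one of the two windows used for the table."""
    remaining = clock_to_seconds(clock)
    if remaining < 60:
        return abs(score_margin) <= 3   # last minute, one possession game
    if remaining <= 120:
        return abs(score_margin) <= 5   # 1:00 to 2:00, within 5 points
    return False


# Two rows from the table plus two made-up plays that should be excluded.
plays = [
    ("0:10", -3),   # in the table
    ("1:32", 5),    # in the table
    ("0:30", 5),    # made up: last minute but more than one possession
    ("2:30", 0),    # made up: too early
]
kept = [play for play in plays if is_crunch_time(*play)]
```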

Effective Field Goal Percentage Breakdown

After reading today that the Miami Heat currently have the highest effective field goal percentage in NBA history, I decided to look deeper into each team’s effective field goal percentage and essentially see where their eFG% is coming from: skilled shooting or shot distribution (getting efficient shots).  To do this I got all the shots from this season and got a breakdown by shooting zone for each team and the league average eFG% for each zone.  Then I calculated what each team’s eFG% would be if their shot distribution remained the same but they shot at the league average from each zone.  I will call this the team’s average shooting eFG%.  If you subtract the team’s average shooting eFG% from the team’s actual eFG% you get the team’s percentage points from the league average eFG% due to shooting skill.  If you subtract the league average eFG% from the team’s average shooting eFG% you get the team’s percentage points from the league average eFG% due to shot distribution.
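The two subtractions described above can be sketched like this. The three zone names and every number are made up for illustration; the real breakdown uses more shooting zones.

```python
def efg_breakdown(team_attempts, team_efg, league_efg_by_zone, league_efg):
    """Split a team's eFG% gap to the league into skill and distribution.

    team_attempts       zone -> shot attempts by the team
    team_efg            zone -> the team's eFG% from that zone
    league_efg_by_zone  zone -> league average eFG% from that zone
    league_efg          overall league average eFG%
    """
    total = sum(team_attempts.values())
    shares = {zone: n / total for zone, n in team_attempts.items()}

    actual = sum(shares[z] * team_efg[z] for z in shares)
    # Same shot mix, but shooting the league average from every zone.
    average_shooting = sum(shares[z] * league_efg_by_zone[z] for z in shares)

    return {
        "actual": actual,
        "average_shooting": average_shooting,
        "skill": actual - average_shooting,
        "distribution": average_shooting - league_efg,
    }


# Made-up team and league numbers.
result = efg_breakdown(
    team_attempts={"rim": 40, "mid": 20, "three": 40},
    team_efg={"rim": 0.65, "mid": 0.40, "three": 0.57},
    league_efg_by_zone={"rim": 0.60, "mid": 0.40, "three": 0.54},
    league_efg=0.50,
)
```

By construction the skill and distribution pieces add up to the team’s actual eFG% minus the league average eFG%.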

Not surprisingly the Houston Rockets are the team with the highest difference due to shot distribution by a pretty substantial margin followed by Miami and Philadelphia.  The three worst are Cleveland, Memphis and Milwaukee.  Miami has the biggest difference due to shooting ability, also by a large margin, followed by San Antonio and Dallas.  I added a graph of the results below.  Teams in the top right are good at both shooting ability and getting efficient shots while teams in the bottom left are bad at both.  As you can see in the graph, Miami and Houston are both way out on their own.

eFG breakdown

Defending the Rim and 3s

Same thing as my last post but for defense.  The ratio of an opponent’s shots in the restricted area and behind the 3 point line graphed against defensive efficiency for all seasons since 1996-97.  Correlation is 0.53 – a little higher than it is for offensive efficiency.
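For reference, a correlation like the 0.53 above can be computed with a plain Pearson coefficient. This sketch assumes the x value for each team-season is the share of opponent shots that come from the restricted area or behind the 3 point line; that reading of "ratio" is an assumption and the helper names are hypothetical.

```python
from math import sqrt


def rim_and_three_share(restricted_area_fga, three_pt_fga, total_fga):
    """Share of opponent attempts taken at the rim or from 3 (assumed metric)."""
    return (restricted_area_fga + three_pt_fga) / total_fga


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

Calling `pearson(shares, defensive_efficiencies)` over all team-seasons would give the number quoted, with one entry per team per season in each list.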