There’s an interesting paradox when it comes to Sloan research papers, one of the major talking points of the conference this year. Some papers reach conclusions that likely won’t do much to affect actual game play but are really freaking interesting; the Hot Hand paper is one example. It gives a new and convincing angle on an age-old debate and is, again, really interesting. But real basketball applications of the findings are lacking: it confirms a belief already accepted among basketball coaches and players. A list of the “streakiest” players in the league would be fun, but it means nothing to NBA front offices.
On the other side of the equation are papers with wacky results that will mean a lot in the long run. “POINTWISE: Predicting Points and Valuing Decisions in Real Time with NBA Optical Tracking Data” falls into this category. Co-author Kirk Goldsberry summarized it on Grantland.
This paper is less about providing any sort of conclusion and more about providing the framework to build something useful. The most important thing the authors come up with is the possession model. Being able to calculate an expected point value (EPV) at any moment of a possession, given the configuration of the players, is a springboard for nearly endless possibilities.
Converting EPV into useful metrics will be tough, but there are lots of interesting ideas. The most obvious direction is to see how discrete actions add to or subtract from an individual possession’s EPV. Theoretically, it makes sense, but there are a lot of flaws at this stage.
One major issue is the comparison to a baseline player. Chandler Parsons was a good three-point shooter last year, while Dirk Nowitzki was a superlative shooter from midrange. In EPV-added, Dirk is given a lot more credit in this situation because he beats the baseline at his spot by a wider margin, despite Parsons’ shot being much more valuable.
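To make the baseline problem concrete, here is a minimal sketch of the arithmetic. Every number in it is invented for illustration (these are not Parsons’ or Dirk’s actual percentages, and not figures from the paper), and the function names are my own; it only shows how a “value over baseline” number can rank a less valuable shot above a more valuable one.

```python
# Hypothetical illustration only: all shooting percentages are made up.

def expected_points(make_prob: float, shot_value: int) -> float:
    """Expected points from a single shot attempt."""
    return make_prob * shot_value

def value_over_baseline(player_prob: float, baseline_prob: float, shot_value: int) -> float:
    """Expected points the shooter adds relative to a baseline shooter at the same spot."""
    return expected_points(player_prob, shot_value) - expected_points(baseline_prob, shot_value)

# A good (not great) three-point shooter vs. an assumed baseline shooter.
three_raw = expected_points(0.38, 3)
three_added = value_over_baseline(0.38, 0.36, 3)

# An elite midrange shooter vs. an assumed baseline shooter.
mid_raw = expected_points(0.48, 2)
mid_added = value_over_baseline(0.48, 0.40, 2)

print(f"three:    raw {three_raw:.2f}, over baseline {three_added:+.2f}")
print(f"midrange: raw {mid_raw:.2f}, over baseline {mid_added:+.2f}")
```

With these made-up inputs, the three is worth more in raw expected points, but the midrange shot earns more credit over its baseline, which is the ranking problem described above.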
The baseline comparison can’t account for player role and usage either. The authors pointed this out with respect to Jose Calderon outranking LeBron James in EPV-added this season, but didn’t yet offer any potential solutions. Russell Westbrook suffers in this metric because of his propensity to take inefficient shots. His decision to take a contested, midrange, pull-up jumper is insane when Thabo Sefolosha is semi-open for a three on the wing. Westbrook decreases the EPV of the possession by making this decision. But were he to pass it to Sefolosha on the wing every time he dribbled down the court, defenses would key on Sefolosha and he would flounder in a higher usage role. That’s an extreme example, but it highlights a flaw in giving set values to discrete decisions.
The authors only quantified on-ball decision making this season, for simplicity’s sake. But for this to become a useful tool for front offices, everything will have to be taken into account. Quantifying off-ball decision making and complex interactions like the Westbrook/Sefolosha one will be no small task. Instead of one guy with the ball having a thousand possible actions, now 10 guys each have a thousand possible actions, many of which depend on the decisions other players make. The question of how to split credit between Tony Parker and Kawhi Leonard (and every other player on the floor, for that matter) in the authors’ critical example will require some serious number crunching.
I get confused even thinking about how to do it. If the authors needed Harvard’s advanced cluster computing service for their more basic approach in the paper, what will NBA teams need to build on this and create metrics that can be useful? Looking at it this way makes what the Raptors have done all the more impressive.
These issues of assigning credit and accounting for off-ball movement and spacing are the ones that have plagued most advanced stats. Some things are easy to figure out: a player, on average, only deserves a third of the credit for a defensive rebound. Other things, like assists, have been tougher to determine. It’s partly for this reason that, when it comes to accuracy, statistics that work from the top down with only 12 pieces of data (the scoring margin, the period of time, and the 10 players on the floor) trounce stats that build player values from the bottom up by adding up lots of discrete player actions. It’s easier to not worry about diminishing returns and interactions between players and just look at who plays when teams win. The PER approach has no way to model interactions between players. On the other hand, the adjusted plus-minus (APM) approach is a black box; Amir Johnson’s teams are always better when he’s on the floor, but why? Does he set good screens? Play good help defense? That’s not to mention major issues with APM like sample size, role, coaching and collinearity.
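For anyone who hasn’t seen the top-down idea spelled out, here is a toy sketch of it. The data is entirely synthetic (four imaginary players in two-on-two stints), the names `stints`, `predict` and `fit_apm` are mine, and the fit uses plain gradient descent on squared error as a dependency-free stand-in for whatever regression solver a real APM implementation would use. The only inputs are who was on the floor and the scoring margin.

```python
# Toy adjusted plus-minus on synthetic data; nothing here comes from real games.
# Each stint: (players on team A, players on team B, scoring margin for team A).
stints = [
    ((0, 1), (2, 3), 8.0),
    ((0, 2), (1, 3), 4.0),
    ((0, 3), (1, 2), 0.0),
]
n_players = 4

def predict(ratings, team_a, team_b):
    """Predicted margin: sum of team A's ratings minus sum of team B's."""
    return sum(ratings[p] for p in team_a) - sum(ratings[p] for p in team_b)

def fit_apm(stints, n_players, lr=0.1, steps=200):
    """Least-squares player ratings, fit by gradient descent from all zeros."""
    ratings = [0.0] * n_players
    for _ in range(steps):
        grad = [0.0] * n_players
        for team_a, team_b, margin in stints:
            err = predict(ratings, team_a, team_b) - margin
            for p in team_a:
                grad[p] += err  # team A players enter the prediction with +1
            for p in team_b:
                grad[p] -= err  # team B players enter the prediction with -1
        ratings = [r - lr * g for r, g in zip(ratings, grad)]
    return ratings

ratings = fit_apm(stints, n_players)
print([round(r, 2) for r in ratings])
```

Notice what the sketch can’t tell you: it produces a number per player and nothing about screens or help defense. Adding the same constant to every rating also leaves every prediction unchanged, which is the collinearity problem in miniature.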
SportVU has the potential to solve many of these problems. Every movement on the floor is tracked. Putting those movements together into something meaningful will be a hell of a task. Though the returns aren’t immediate, Goldsberry’s paper is the first step in a direction that many in the world of analytics have dreamt about.