It is fairly easy to construct a retrospective efficiency rating. Take the efficiencies for each game, correct for location and rest, and then solve using an OLS regression for each team’s true efficiency rating. Nice and neat.
However, how should a predictive rating work? The best approach would be to adjust for what players are playing and when, but in this investigation I’ll restrict myself to team-level data. If I have the same data (efficiencies for each game), how would I best predict game n+1?
The obvious approach is to attack the issue from a Bayesian perspective, so I shall.
Suppose we assume that the teams' true efficiencies are normally distributed with mean 0. That's a pretty good assumption; the overall results tend to look like the famous bell curve: lots of teams close to average, a few outliers. We'll roll with that assumption.
Assume, also, that each team's performances are a random sample drawn from a normal distribution around that team's true efficiency level. This is not as safe an assumption. If LeBron were out for 35 games, those games are not drawn from the same distribution as the games WITH LeBron. Also, teams tend to pull their starters in blowouts, producing results that look like "playing to the level of the opponent"; Dean Oliver discussed this back in 1997, in the context of removing the covariance of team offense and defense. Furthermore, teams make trades during the season, some of which help the team's current level and some of which hurt. Finally, opponents have all of these same issues, which makes opponent adjustments (which assume the opponent has the same true level all year) rather dicey.
Nevertheless, for this first exercise in Bayesian inference, I will make the assumption that the sampled performances are from a normal distribution.
Okay, so we have a prior distribution–the overall league distribution, centered on 0. Standard deviation is right around 5 (I just checked the average efficiency differentials for the 30 teams, and the standard deviation was about 5).
Now the hard part: we have lots of observations of how teams performed in given games. Since this is a predictive rating, we are considering the possibility that older samples are not as good as more recent samples (i.e. there is a transformation function applied between each game). How do we incorporate these observations in a Bayesian framework?
Well, we need some sort of "standard error" to associate with each game's observation. In general, a team's performances over the season (adjusted for opponent, rest, and location) have a standard deviation of about 12.5, but that includes the variation we are trying to adjust for (the change through the season). I'll use 12 as the starting point per game.
So, with the Bayesian prior (mean 0, stdev 5) and a first sample, we can get an updated distribution. Of course, to adjust this game for opponent, we must have some idea of how good the opponent is; in this case, I'll take the opponent's adjusted efficiency over the course of the whole season as a reasonable approximation. For this game, we'll assume our team's actual adjusted efficiency margin was +15, so the observation's distribution has a mean of 15 and a stdev of 12.
Plot it up and we get a posterior distribution:
Wait… how did I calculate that? I multiplied the two density functions together and rescaled the result so the area under the new curve still equals 1. (Yes, that is called calculus.) We can still describe the posterior distribution simply, though, because it's still a normal distribution: the mean is now 2.22 and the new standard deviation is 4.615.
There is a shortcut: closed-form formulas give the new mean and the new standard deviation directly, rather than using calculus and MathCAD to do the work:
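For concreteness, here is a minimal sketch of that shortcut, the standard conjugate normal-normal update (the function name and structure are my own illustration, not from the post):

```python
def bayes_update(prior_mean, prior_sd, obs, obs_sd):
    """Conjugate normal-normal update: the posterior mean is a
    precision-weighted average of the prior mean and the observation,
    and precisions (1/variance) simply add."""
    prior_prec = 1.0 / prior_sd**2
    obs_prec = 1.0 / obs_sd**2
    post_prec = prior_prec + obs_prec
    post_mean = (prior_mean * prior_prec + obs * obs_prec) / post_prec
    return post_mean, post_prec ** -0.5

# Prior (mean 0, stdev 5) updated with the +15 game (stdev 12):
mean, sd = bayes_update(0.0, 5.0, 15.0, 12.0)
print(round(mean, 2), round(sd, 3))  # 2.22 4.615
```

This reproduces the numbers above: the posterior mean (2.22) sits much closer to the prior than to the +15 game, because the prior is far more precise than a single noisy observation.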
Well, here comes the complex part. We could keep applying that same update over and over, but then we'd be weighting all of the games equally, essentially just homing in on the regressed year-long average efficiency differential. That would look like this, for the first 10 games:
But, we had originally intended to look at “predictive” efficiency ratings. So that means we are willing to deprecate older game data somewhat. How much? That really can’t be answered without some empirical analysis.
The basic concept is this: between each game, apply a transformation that increases the standard deviation by a fixed amount. Thus, if the current game has a standard deviation of 12, then the previous game would have a standard deviation of 12 + a, the game before that 12 + 2a, and so on. This has the effect of weighting the games 1/(12)^2, then 1/(12+a)^2, then 1/(12+2a)^2, and so on. That can be restated, approximately, as weights of the form b, b-c, b-c^2, b-c^3, etc. In this case, b would be 1/144, and c would be… ugly: (24a + a^2)/[144*(144+24a+a^2)], I believe. Anyway, moving along…
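To make the depreciation scheme concrete, here is a small sketch (my own illustration) of the effective weight each game gets once its standard deviation has been inflated by its age:

```python
def game_weights(n_games, a, base_sd=12.0):
    """Effective regression weight of each game under the inflation
    scheme: game i (i = 0 is the most recent) has standard deviation
    base_sd + a*i, and a normal observation's weight in the posterior
    is its precision, 1/sd^2."""
    return [1.0 / (base_sd + a * i) ** 2 for i in range(n_games)]

w = game_weights(5, 0.2)
# Weights relative to the most recent game fall off gradually:
rel = [round(x / w[0], 3) for x in w]
print(rel)  # [1.0, 0.967, 0.937, 0.907, 0.879]
```

With a = 0.2, a game played four games ago still carries about 88% of the weight of the most recent game, so the depreciation is gentle.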
Adding the “penalty factor” depreciates the value of the older samples. The choice of how much depreciation to use requires the empirical work. To do this, I compiled last year’s NBA data and this year’s NBA data, adjusted all of the games for opponent, location, and rest, and set to work. I created a framework using the above methods to create the Bayesian-updated projection for each game, based on the games so far that season.
Finally, I solved for "a" by minimizing the sum of the squared errors of the pre-game predictions. The best fit for "a" was 0.20. Going back and checking whether that "12" is correct for the game-level standard error yielded slightly less than 12, but the difference in overall error was so insignificant (thousandths of one percent) that I'll just use 12.
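The fitting step might look something like the following sketch. The data here are hypothetical (a random-walk true level standing in for the adjusted NBA game logs, so depreciating old games should help); with the real data, the minimum landed at a = 0.20:

```python
import random

def pregame_prediction(margins, a, prior_sd=5.0, game_sd=12.0):
    """Posterior-mean efficiency estimate from the games played so far,
    with older games' standard deviations inflated by a per game of age."""
    n = len(margins)
    prec, num = 1.0 / prior_sd**2, 0.0
    for i, m in enumerate(margins):
        sd = game_sd + a * (n - 1 - i)
        num += m / sd**2
        prec += 1.0 / sd**2
    return num / prec

def sum_sq_error(a, seasons):
    """Sum of squared pre-game prediction errors over all teams/games."""
    err = 0.0
    for margins in seasons:
        for g in range(len(margins)):
            pred = pregame_prediction(margins[:g], a)
            err += (margins[g] - pred) ** 2
    return err

# Hypothetical data: 30 teams, 82 games, true level drifting slowly.
random.seed(1)
seasons = []
for _ in range(30):
    level, games = random.gauss(0, 5), []
    for _ in range(82):
        level += random.gauss(0, 0.5)               # slow drift in true level
        games.append(level + random.gauss(0, 12))   # noisy game outcome
    seasons.append(games)

grid = [x / 20 for x in range(21)]  # a in {0.00, 0.05, ..., 1.00}
best_a = min(grid, key=lambda a: sum_sq_error(a, seasons))
```

A simple grid search is enough here since the error surface in "a" is one-dimensional and smooth.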
And so we have a model: the Bayesian prior has mean 0, stdev 5. Each game is set as mean=adjusted performance, standard deviation = 12. Older games have 0.2*n added to their standard deviation, where n is the number of games since that game was played.
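Putting the whole model in one place, here is a sketch as I understand it (names are my own; I assume the most recent game keeps the base standard deviation of 12):

```python
def predictive_rating(margins, prior_mean=0.0, prior_sd=5.0,
                      game_sd=12.0, a=0.2):
    """Posterior (mean, stdev) for a team's true efficiency before its
    next game. margins are adjusted game margins in chronological order;
    a game played n games ago has standard deviation game_sd + a*n."""
    n = len(margins)
    prec = 1.0 / prior_sd**2
    num = prior_mean * prec
    for i, m in enumerate(margins):
        sd = game_sd + a * (n - 1 - i)  # most recent game: sd = game_sd
        num += m / sd**2
        prec += 1.0 / sd**2
    return num / prec, prec ** -0.5

# The posterior stdev depends only on how many games were played,
# not on what the margins were:
_, sd_a = predictive_rating([15.0, -4.0, 8.0])
_, sd_b = predictive_rating([0.0, 0.0, 0.0])
print(sd_a == sd_b)  # True
```

That last point is worth noting: under these assumptions, every team's uncertainty shrinks on the same schedule, which is why the standard errors in the rankings below are all essentially identical.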
Here’s how that looks, for the same 10 games in the graph above (the first 10 games for Atlanta this year, incidentally):
How does that look compared to the previous version without the "penalty"? The distributions are a bit wider, and the mean responds a bit more to recent data.
Okay, let’s look at the final product. Here is Chicago through last night:
And here is a look at the current rankings (the standard errors for all teams are around 2.05; remember, the error in this model depends only on the number of observations, not on their observed variance):
| Rank | Team | Bayesian Eff. Dif. | Unweighted | Hollinger |
|------|------|--------------------|------------|-----------|
Unweighted indicates a ranking with all games in the season weighted equally. Hollinger comes from today’s Hollinger Power Rankings on ESPN, the primary other power rankings I know of that consider recency effects.
There’s an elite 6 right now, and then… Philly, of course!
To truly show how this Bayesian updating method works, here is how each team’s Bayesian efficiency differential has progressed through the year, with one of those lovely Google Motion Charts:
This concludes a remarkably long ramble about Bayesian methods and NBA efficiency ratings. Hopefully the reader has been enlightened, not confused!