APBRmetrics

jsill · Joined: 19 Aug 2009 Posts: 73

I have some results at my website, hoopnumbers.com, which I'm hoping those of you with an interest in adjusted +/- (APM) will find interesting.

Mike Tamada and maybe some others of you here have mentioned the idea of using ridge regression (a.k.a. regularization) in conjunction with APM. Coincidentally,. this is what I've been working on off and on for the last few months, and I finally got it to the point where I'm ready to put it up on my website.

My main finding is that APM with a carefully chosen regularization parameter (which I'll call RAPM) is about twice as accurate as an APM using standard regression and using 3 years of data, where the weighting of past years of data and the reference player minutes cutoff has also been carefully optimized. Interestingly, this is more or less true even if you only use 1 year of data in conjunction with regularization, since the accuracy boost from using 3 years of data is measurable but fairly minor when regularization is used. The parameter estimates resulting from RAPM using 3 years of data are more intuitively reasonable than the 1 year estimates, although I think even the 1 year estimates look more reasonable than the 1 year estimates you get with standard regression.

My basis for claiming that RAPM is twice as accurate is explained at hoopnumbers.com, but I'll sketch it here. I evaluate the models by testing their predictions on unseen data, i.e., on games which were not included in the dataset used to fit the model. You can take the substitution history of a game and take a previously fitted APM model and generate predictions for each game snippet. Then you can add up all the snippet predictions, appropriately possessions-weighted, to get a prediction for the game's final margin of victory. You can compare this prediction to the actual margin of victory and evaluate the accuracy. I did this for the 342 games in March and April of last year, after fitting models on the games through February (and, additionally in some cases, also the games from '07-'08 and '06-'07). The best I managed to do with standard regression-based APM, using 3 years of data, was an R-squared of about 9% on the March and April games. With regularization and using 3 years of data, I got the R-squared up to 17%. Surprisingly, even with 1 year of data, I could get an R-squared of 16% if regularization was used appropriately, without even using a minutes cutoff, i.e., without lumping any players into the reference player bucket.

Again the claim of a near-doubling in accuracy is relative to a 3-year time-weighted APM using standard regression. If we were to compare RAPM to standard APM on 1 year of data, the boost would be bigger. In fact, I'm not even sure how to define the boost in that case, since the accuracy of 1 year standard APM is just not very good at all, according to my experiments.

In addition to introducing RAPM, a secondary goal of this post is to encourage the use of out-of-sample testing techniques like cross-validation as a way of evaluating methods to see what kind of predictive power they have, and also as a way to make choices which otherwise sometimes seem sort of arbitrary, like the minutes cutoff for the reference player or the weighting of past years of data. Looking at the standard errors around parameter estimates and so forth has its place, in my opinion, but ultimately these models are usually used to predict the future (implicitly and indirectly or otherwise) so I think it's important to gauge their success in doing so by testing their success on a holdout set which the model was not fit on.

Here is the writeup of my results:

http://hoopnumbers.com/allAnalysisView?analysis=RAPM&discussion=True

Here are my 3-year RAPM results:

http://hoopnumbers.com/allAnalysisView?analysis=RAPM&discussion=False&leaders=True&year=2009multiYear

Here are my 1-year RAPM results:

http://hoopnumbers.com/allAnalysisView?analysis=RAPM&discussion=False&leaders=True&year=2009

Thanks for any feedback!

Crow · Joined: 20 Jan 2009 Posts: 817

Very timely. Thanks and good luck with this and what might come out of it.

I've talked over time about a concern that traditional Adjusted was overstretching the data. Your writeup also mentions that concern and 1 year and 3 year RAPM gives a tighter range between +8 and -8. I have just started looking at your data but I like this much.

In the NESSiS thread I was essentially reaching for ways to achieve what cross-validation and regularization could achieve. I had temporarily neglected the Ridge Regression talk. I am glad to see it implemented.

The moves of players on good and bad teams under regularization compared to without might address the issue Nick S. noted about minutes weighted team Adjusted +/- of existing Adjusted models have some notable variance with actual team performance. How much better does with regularization do on that than without at the minutes-weighted team level?

Even after regularization it isn't explaining a lot is it? Sounds really low, lower than I expected. But this is a the stint or play by play level? I guess that shouldn't be so surprising or alarming. But how well does it explain at the game level or the season series between teams or playoff series level?

What do you think about the idea of some sort of SPM-APM blend or SPM influenced input for APM?

Do you have interest / plans in taking this technique to the lineup level? Player pairs?

Anything further to say about the multicollinearity issue or possible improved ways to address and reduce its effects?

What if you instead of trying to find a single value for each player applied to every stint on the court to solve the league puzzle as best you can, you allowed the model to assign a player a value from a limited number of different values, say 3 or 5 of them, to model a player who doesn't perform exactly the same all the time? Would that help reduce average errors and outliers? Can that be made to work? Would that help get at where the good and bad contexts and player fits with role and context are? Then the player's value instead of being a single point estimate of + this or - that would be for example 20% +4, 40% +2, and 40% -3 or some such. I think that could be useful. And I guess you could look at which of these partial Adjusted scores come during a more heavy ratio of more win-meaningful moments or less meaningful moments. Players will vary on that and it would be worth gauging. Winston addresses this issue and uses it in determining the average win impact estimate but seeing it at a lower split of the data might be useful for addressing the rotation based on game situation.

Mike G · Posted: Wed Nov 04, 2009 6:46 am Post subject:

Wow, 447 ranked players.

Misspelled "analyses" in the headline.

21 players are ranked as better than playing at home:

DSMok1 · Posted: Wed Nov 04, 2009 11:59 am Post subject:

Good work, jsill!

Did you happen to calculate the standard error for each player? That would be immensely useful in understanding the confidence associated with each evaluation.

Ryan J. Parker · Joined: 23 Mar 2007 Posts: 711 Location: Raleigh, NC

Great stuff Joe, but I have a few questions. I'm in the process of becoming more familiar with using cross-validating to measure prediction error, so I'm very much interested in some of the stuff you've done here.

Would it be appropriate to say you're using 10-fold cross-validation using the data up to February? Did you do any cross-validation using an entire season worth of data? Can you calculate standard errors for your cross-validation estimates?

I would also be interested in seeing the mean absolute error of the cross-validation instead of just RMSE. Lastly, would it be possible to succinctly describe the difference(s) between ridge regression and lasso?
_________________
I am a basketball geek.

jsill · Joined: 19 Aug 2009 Posts: 73

Crow:

Thanks for all the feedback.

schtevie · Joined: 18 Apr 2005 Posts: 412

All else equal, bigger R-squareds are nice. But do the collective results make sense?

Take an arbitrary cut-off of a pretty darn good player, someone who delivers a net 4 points per 100 possessions. In the three-year data shown, there are nine players who accomplished this. Just nine.

The greatest of these is KG, who following the estimate, when he was in the game for his approximate 30 minutes, his contribution on the scoreboard above that of an average (0 APM) player was about 4.5 points. And for Chauncey Billups at #9, playing 35 minutes per game, his contribution was about 2.9 above average.

By contrast, the straight APM gives a dramatically different result (that is consistent whether it uses one or more seasons). Take Stephen Ilardi's stabilized results for last year (using six years of weighted data). Here we have 43 players with APMs above 4. And the stars are starrier.

Never mind the particular players cited; that isn't the point. The issue is whether it is plausible that the biggest stars in the league have such small impacts on the scoreboard and that there are apparently so few of them.

I am skeptical.

basketballvalue · Joined: 07 Mar 2006 Posts: 208

Joe,

I think this looks very interesting and I'm looking forward to really reading through your links in detail. I particularly appreciate you've used the estimates to predict segments not in the dataset used for estimating, I agree this is very important.

For our reference, have you compared your predictions to predictions using other approaches (e.g. PER, Win Score,...)? This would help set our reference point for how good 9% or 17% is. Of course, this is venturing into the territory of Dan's presentation at NESSIS a couple of years ago.

Thanks,
Aaron
_________________
www.basketballvalue.com
Follow on Twitter

jsill · Joined: 19 Aug 2009 Posts: 73

Mike G:

I agree that Dwight Howard's ranking is lower than we would expect. On the other hand, at least his APM based on '08-'09 alone is 2.515, or 34th in the league. At basketballvalue he is at 1.04, or barely above average, for '08-'09 after looking tremendous in '07-'08.

The Spurs results and Matt Bonner's APM in particular are a little funky. If you look at his raw plus/minus per 48 minutes relative to the other Spurs last year, though, he looks awfully good. It's amazing to me, in particular, that they defended so well with him on the floor (90.7 vs. 92.6 for Duncan). As I mention in my writeup, by no means do I think Bonner was a top 10 player last year or the best on the Spurs. The numbers are what they are, though.

DSMok1: I do yet not have the standard errors for each player. Because I'm using regularization, this becomes more complicated than getting standard errors in a classic regression. In theory, we should be able to get an "a posteriori" distribution for the parameters which is a consequence of combining the a prior distribution from which the regularization term stems with the data. I need to do some research on how to do this, though.

jsill · Joined: 19 Aug 2009 Posts: 73

Ryan:

deepak · Joined: 26 Apr 2006 Posts: 665

If you have the numbers readily available, could you publish the leaders in fast break points per game (team-wise, or even player-wise) over the last several years? I can't find that information elsewhere.

Crow · Joined: 20 Jan 2009 Posts: 817

Thanks jsill for the replies to my questions and the others.

Ryan J. Parker · Joined: 23 Mar 2007 Posts: 711 Location: Raleigh, NC

Thanks for the response Joe. Very insightful.

As for the standard error, I'm talking about the standard error of the RMSE. More specifically, in The Elements of Statistical Learning, Hastie et al. refer to "... the importance of reporting the estimated standard error of the CV estimate" (pg 249). I'm still going through this section of the book, so I don't know exactly how you go about calculating it, but I figure you might know how to do so. Very Happy

_________________
I am a basketball geek.

DSMok1 · Posted: Wed Nov 04, 2009 2:48 pm Post subject:

I was considering your Lambda (a-priori distribution) and realized that what you are getting, because of its inclusion, is a "regressed to the mean" APM. Since you calculated your Lambda based on one year of data, the regression to the mean is greater. If you used multiple years of data, the Lambda should change such that there is a greater spread, or at least more outliers. That said, because of "regression to the mean" most players' APM does balance out over several years, reducing outliers that way....

I would be interested if you looked into this.

Basically, this is analogous to a Bayesian "best estimate" of the player's true current APM, similar to what I discussed here. The issue, however, is that all players are regressed toward 0--which is not accurate. I would prefer to see the player's regressed toward a value based on their "minutes per game," which I see as approximately based on APM and thus providing a good frame of reference.

Crow · Joined: 20 Jan 2009 Posts: 817