APBRmetrics

**Ilardi** · Joined: 15 May 2008 Posts: 257 Location: Lawrence, KS

As many of you know, Dan Rosenbaum pioneered the use of "statistical plus-minus" (APM estimates based on boxscore stats) to help bring down the error levels of traditional APM estimates.

In his seminal 2004 paper (http://www.82games.com/comm30.htm) he describes the generation of a composite Statistical + Pure APM measure, as follows:

Ryan J. Parker · Joined: 23 Mar 2007 Posts: 706 Location: Raleigh, NC

What was your idea Steve?

My thinking is you'd have a std err for both PURE and STATS, and when you multiply them by a and 1-a, and then add them together, your new standard error is:

SE(OVERALL) = sqrt( [a*SE(PURE)]^2 + [(1-a)*SE(STATS)]^2 )

Then you find the a that minimizes this.

I worry about how these might be correlated, but I'm no expert in adjusting for that.
_________________
I am a basketball geek.
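
Editorial sketch of the minimization described above, not taken from the thread: if the PURE and STATS errors are assumed independent, the weight a that minimizes SE(OVERALL) has a closed form (inverse-variance weighting). The function names, the optional `rho` correlation parameter, and the example numbers are all hypothetical, added only for illustration.

```python
import math

def combined_se(a, se_pure, se_stats, rho=0.0):
    """SE of a*PURE + (1-a)*STATS.

    rho is an ASSUMED correlation between the two errors; rho=0
    reproduces the formula quoted in the post above.
    """
    var = ((a * se_pure) ** 2
           + ((1 - a) * se_stats) ** 2
           + 2 * a * (1 - a) * rho * se_pure * se_stats)
    return math.sqrt(var)

def optimal_weight(se_pure, se_stats, rho=0.0):
    """Weight a on PURE that minimizes combined_se.

    Found by setting d(Var)/da = 0. The result can fall outside [0, 1]
    when rho is large, and the denominator is zero in the degenerate
    case se_pure == se_stats with rho == 1.
    """
    cov = rho * se_pure * se_stats
    return (se_stats ** 2 - cov) / (se_pure ** 2 + se_stats ** 2 - 2 * cov)

# Hypothetical standard errors for one player (not real data).
a_star = optimal_weight(se_pure=4.0, se_stats=2.0)
print(a_star, combined_se(a_star, 4.0, 2.0))
```

With these made-up inputs the weight on PURE comes out to 0.2 and the combined SE to about 1.79, below either input SE. A nonzero `rho` is one way to express the correlation worry raised above, but how to estimate it is left open here.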

**Ilardi** · Joined: 15 May 2008 Posts: 257 Location: Lawrence, KS

Ryan,

Yes, that's what I wound up with, as well, but then I ran into a snag:

How do you find the se associated with each player's STAT measure?

With the PURE measure, each player is treated as a separate regression variable, so the regression output actually gives you the se for each estimate . . . but with STAT, you're simply calculating each player's STAT value based on a set of pre-existing one-size-fits-all regression weights applied to his boxscore stats (and/or other relevant stats from 82games, etc.). How does one derive an se estimate for each resulting STAT estimate?

Ryan J. Parker · Joined: 23 Mar 2007 Posts: 706 Location: Raleigh, NC

Well, you should have a std error for each predictor's coefficient, so I would probably do it with simulation (since you should have a covariance matrix to work with).

If you didn't want to do that, then you should be able to just add and subtract as necessary using the covariance matrix to come up with the overall std error on the STAT rating.
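
Editorial sketch of the simulation route, under stated assumptions: the STAT regression weights are treated as multivariate normal around their fitted values, with the covariance matrix taken from the regression output. The weights, covariance matrix, and player stat line below are invented for illustration, and the helper names are hypothetical. Only the standard library is used.

```python
import math
import random
import statistics

def cholesky(m):
    """Lower-triangular L with L * L^T == m (m must be positive definite)."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low

def simulate_stat_se(beta_hat, cov_beta, player_stats, n_draws=50_000, seed=0):
    """Monte Carlo SE of one player's STAT rating."""
    rng = random.Random(seed)
    low = cholesky(cov_beta)
    n = len(beta_hat)
    ratings = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # One plausible set of regression weights: beta_hat + L z
        beta = [beta_hat[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                for i in range(n)]
        ratings.append(sum(b * x for b, x in zip(beta, player_stats)))
    return statistics.stdev(ratings)

# Hypothetical inputs (not real data): three boxscore predictors.
beta_hat = [0.8, -1.2, 0.5]
cov_beta = [[0.040, 0.010, 0.000],
            [0.010, 0.090, -0.020],
            [0.000, -0.020, 0.025]]
player_stats = [6.0, 2.5, 4.0]

se_sim = simulate_stat_se(beta_hat, cov_beta, player_stats)
print(se_sim)
```

This captures only the uncertainty in the regression weights. It does not include the residual scatter of the STAT model itself, which would add to the error of any single player's rating.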

DSMok1 · Posted: Mon Aug 10, 2009 1:20 pm

It looks like you all are on the right track. Here's a quick reference PDF on combining errors: Combining Errors. I don't know how to measure the error covariance for the PURE and STAT interaction.

**Ilardi** · Joined: 15 May 2008 Posts: 257 Location: Lawrence, KS

Ryan J. Parker · Joined: 23 Mar 2007 Posts: 706 Location: Raleigh, NC

I think you can use: http://en.wikipedia.org/wiki/Variance

You want to look at "In general, for the sum of N variables...", while making sure you keep track of negatives if the coefficients are < 0.
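
For reference (editorial addition), the general result for a weighted sum can be written as below. Reading it for the STAT case is an assumption on my part: the X_i would be the estimated regression weights (the random quantities) and the a_i the player's own stat values (treated as fixed constants).

```latex
\operatorname{Var}\!\left(\sum_{i=1}^{N} a_i X_i\right)
  = \sum_{i=1}^{N} a_i^{2}\,\operatorname{Var}(X_i)
  + 2 \sum_{1 \le i < j \le N} a_i a_j \,\operatorname{Cov}(X_i, X_j)
```

Negative constants need no special handling in the first sum, since they are squared, but their signs do carry through the products a_i a_j in the covariance terms.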

**Ilardi** · Joined: 15 May 2008 Posts: 257 Location: Lawrence, KS

Thanks - gotta love Wikipedia.

But wouldn't this method yield the exact same variance estimate for each player's STAT rating in any given model, or am I missing something important?

DSMok1 · Posted: Mon Aug 10, 2009 2:22 pm

Ryan J. Parker · Joined: 23 Mar 2007 Posts: 706 Location: Raleigh, NC

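Well, the variance would also be a function of coefficients x stats, so that is part of the calculation of Var(STAT), no? Like in the example on Wikipedia, we have some constant a (here, the player's stat) multiplied by the random variable X (here, the estimated coefficient), so players with different stats end up with different variances.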
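
Editorial sketch of that point, with invented numbers: applying the same coefficient covariance matrix to two different stat lines gives two different standard errors, because the variance of a weighted sum depends on the weights. `stat_se`, the covariance matrix, and both players' stat lines are hypothetical.

```python
import math

def stat_se(player_stats, cov_beta):
    """sqrt(x' * Sigma * x): SE of a STAT rating from coefficient uncertainty only."""
    n = len(player_stats)
    var = sum(player_stats[i] * cov_beta[i][j] * player_stats[j]
              for i in range(n) for j in range(n))
    return math.sqrt(var)

# Hypothetical covariance matrix of three regression weights (not real data).
cov_beta = [[0.040, 0.010, 0.000],
            [0.010, 0.090, -0.020],
            [0.000, -0.020, 0.025]]

player_a = [6.0, 2.5, 4.0]   # made-up stat line
player_b = [1.0, 0.5, 8.0]   # a different made-up stat line

se_a = stat_se(player_a, cov_beta)
se_b = stat_se(player_b, cov_beta)
print(se_a, se_b)
```

With these inputs the two SEs are roughly 1.52 and 1.23, so the model is shared but the per-player uncertainty is not.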

Crow · Joined: 20 Jan 2009 Posts: 746

Steve, I hope you are heading to publishing new, multi-year overall plus/minus ratings. That is what is needed.

I've supported that in recent years, as that was the direction Dan immediately moved in after his first paper. Then for years we had just the pure adjusted. And eventually the different flavors of pure, the offensive/defensive splits, and newer data for statistical by itself. All quite helpful for considering impact, but multi-year overall plus/minus ratings might give the closest estimate of true overall impact. But I guess you'll have more information on that when you compute the errors.

While I want to see the new roll-up, I'd keep all the layers though. It is about understanding a complex story.

DLew · Joined: 13 Nov 2006 Posts: 222

Steve,

Footnote #4 on that page is relevant to this discussion.

**Ilardi** · Joined: 15 May 2008 Posts: 257 Location: Lawrence, KS

Ryan J. Parker · Joined: 23 Mar 2007 Posts: 706 Location: Raleigh, NC

I think I'd want to try and reproduce his results to understand exactly what is going on. Without that, I'm not exactly sure how to construct the SE for each player using the STAT formula.

**Ilardi** · Joined: 15 May 2008 Posts: 257 Location: Lawrence, KS