APBRmetrics

**Ilardi** · Joined: 15 May 2008 Posts: 265 Location: Lawrence, KS

As several have noted in this forum in recent months, retrodiction can be quite useful as a means of evaluating the utility of player metrics. However, to my knowledge, there has never been a systematic comparison of the relative retrodictive performance of the various widely used "omnibus" metrics - e.g., Win Shares, Wins Produced, APM, SPM, PER, etc.

Accordingly, I thought it might be informative (and fun) to see if a few interested parties might want to collaborate on such an undertaking under a shared set of guidelines for generating retrodictive estimates for each measure. Although it would ultimately be ideal to conduct the investigation across several seasons' worth of data, I'd like to suggest a more modest "proof of concept" investigation to begin with: retrodiction of team performance (net efficiency) during the 2008-2009 season.

Following are some basic guidelines (and I'm open to friendly amendments on any of them):

1) Only data on each metric from prior seasons (i.e., up through 2007-2008) can be used;

2) Each player's actual minutes from 2008-2009 will be used;

3) Systematic age-adjustments to each metric are permitted in projecting each player's 08-09 values;

4) An average rookie metric value will be used for all rookies (a similar procedure will be employed for all players who logged minimal minutes prior to the 08-09 season);

5) Team-by-team estimates of net efficiency (pts scored/100 poss - pts allowed/100 poss) will be generated on the basis of aggregated teamwise projected metric values. For metrics that generate estimated wins or raw point differentials, appropriate conversions will be used to derive each team's net efficiency value. (Secondary analyses can also look at wins as an outcome variable of interest, to see if choice of DV makes any difference.)

6) Finally, each metric will be evaluated on the basis of the mean observed difference (absolute value) between each team's actual and projected efficiency during the 2008-2009 season.
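
As a rough illustration of guidelines 5 and 6, here is a minimal Python sketch. It assumes the metric is already on a points-per-100-possessions scale relative to league average (as with APM-style ratings), so that a team's projected net efficiency is five times the minutes-weighted mean of its players' projected values. The function names and the sample numbers are made up for illustration; metrics on other scales (wins, raw point differential) would need their own conversion first.

```python
from collections import defaultdict


def team_projections(players):
    """Aggregate player projections into team net-efficiency estimates.

    `players` is a list of (team, actual_minutes, projected_rating) tuples.
    ASSUMPTION: projected_rating is in points per 100 possessions relative
    to league average, so ratings of the five players on court simply add.
    """
    weighted = defaultdict(float)
    minutes = defaultdict(float)
    for team, mins, rating in players:
        weighted[team] += mins * rating
        minutes[team] += mins
    # Five players share the floor: team rating = 5 * minutes-weighted mean.
    return {t: 5.0 * weighted[t] / minutes[t] for t in weighted}


def mean_absolute_error(projected, actual):
    """Mean absolute difference between projected and actual net efficiency."""
    teams = projected.keys() & actual.keys()
    return sum(abs(projected[t] - actual[t]) for t in teams) / len(teams)


# Made-up numbers, purely to show the calling convention.
players = [
    ("A", 1000, 2.0), ("A", 1000, 0.0),
    ("B", 500, -1.0), ("B", 1500, 1.0),
]
projected = team_projections(players)   # {'A': 5.0, 'B': 2.5}
actual = {"A": 4.0, "B": 4.5}
print(mean_absolute_error(projected, actual))  # 1.5
```

Each metric would be run through the same two functions, and the metric with the lowest mean absolute error would be the best retrodictor under these rules.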

If there is sufficient interest, I will be happy to supply relatively low-noise APM estimates for the Retrodiction Challenge. (Since the ultimate focus of the exercise is net efficiency, I'll need to re-run my APM model to generate direct estimates of Total APM rather than deriving them indirectly on the basis of Offensive and Defensive APM, as the former method generates lower-noise Total APM estimates.)

Neil Paine · Joined: 13 Oct 2005 Posts: 774 Location: Atlanta, GA

http://www.basketball-reference.com/blog/?p=2264

**Ilardi** · Joined: 15 May 2008 Posts: 265 Location: Lawrence, KS

battaile · Joined: 27 Jul 2009 Posts: 38

I'm up for this.

"4) An average rookie metric value will be used for all rookies (a similar procedure will be employed for all players who logged minimal minutes prior to the 08-09 season); "

I'd like to be able to estimate rookies' performance based on draft position (and possibly other factors such as ht/wt/position). Would this be allowed?

**Ilardi** · Joined: 15 May 2008 Posts: 265 Location: Lawrence, KS

battaile · Joined: 27 Jul 2009 Posts: 38

[quote="Ilardi"][quote="battaile"]I'm up for this.

"4) An average rookie metric value will be used for all rookies (a similar procedure will be employed for all players who logged minimal minutes prior to the 08-09 season); "

I'd like to be able to estimate rookies' performance based on draft position (and possibly other factors such as ht/wt/position). Would this be allowed?[/quote]

I think this would only make sense if we could find an agreed-upon, systematic method of doing so for each metric . . .[/quote]

Ah, I was thinking it'd be part of what differentiated the challenge entries. What will differentiate them? Just the aging adjustment and how you weight prior seasons?

**Ilardi** · Joined: 15 May 2008 Posts: 265 Location: Lawrence, KS

But then we might inadvertently conflate metric retrodictive performance with an individual modeler's skill in projecting rookie performance . . .

battaile · Joined: 27 Jul 2009 Posts: 38

[quote="Ilardi"]But then we might inadvertently conflate metric retrodictive performance with an individual modeler's skill in projecting rookie performance . . .[/quote]

Ah, I gotcha. One question, though: is that same risk not present with retrodictive performance and aging adjustment?

Ryan J. Parker · Joined: 23 Mar 2007 Posts: 711 Location: Raleigh, NC

We could always have two measures: one with rookies and one without. As Steve mentioned, although we want to project rookies as well as possible, that's a separate modeling challenge that could be issued.

The intent (hopefully?) of this challenge is to use prior-season data to best predict the future efficiency of a team. Projecting rookies with data from other leagues (college, NBDL, Euro, etc.) requires more models that we don't really care about at this point.

I'm not sure how well this will fall in line with Steve's outline, but I'm most interested in retrodictions where you're given specific information for each "shift" of players.

The intent of this type of retrodiction is to understand who predicts the best with a specific set of information, and to try and understand what information is valuable to know. For example, I'll tell you which lineup starts on offense, which players are on the court, and how many possessions they each had. We could then extend this to provide other relevant information, like lead/deficit, quarter, time left, etc. When you make all of these predictions, we aggregate them all together and determine the predicted offensive and defensive efficiency for each team.
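
A minimal sketch of the aggregation step described above, assuming each shift record carries the offensive team, the defensive team, the number of possessions, and a predicted points-per-100-possessions figure from whatever model is being tested. The record layout and function name are hypothetical; the actual analysis may be structured differently.

```python
from collections import defaultdict


def aggregate_shifts(shifts):
    """Roll shift-level predictions up to team offensive/defensive efficiency.

    `shifts` is an iterable of
        (offense_team, defense_team, possessions, predicted_pts_per_100)
    tuples. Each prediction is weighted by its number of possessions.
    """
    off_pts, off_poss = defaultdict(float), defaultdict(float)
    def_pts, def_poss = defaultdict(float), defaultdict(float)
    for off_team, def_team, poss, pred in shifts:
        off_pts[off_team] += poss * pred
        off_poss[off_team] += poss
        def_pts[def_team] += poss * pred
        def_poss[def_team] += poss
    offense = {t: off_pts[t] / off_poss[t] for t in off_pts}
    defense = {t: def_pts[t] / def_poss[t] for t in def_pts}
    return offense, defense


# Made-up shifts, purely for illustration.
shifts = [
    ("A", "B", 10, 110.0),
    ("A", "B", 30, 100.0),
    ("B", "A", 20, 95.0),
]
offense, defense = aggregate_shifts(shifts)
print(offense["A"], defense["A"])  # 102.5 95.0
```

Extra context such as lead/deficit, quarter, or time left would only change how each shift's prediction is produced, not how the predictions are aggregated.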

I'll be posting some results from this type of analysis soon, so hopefully that will give a better idea of how exactly that works.
_________________
I am a basketball geek.

**Ilardi** · Joined: 15 May 2008 Posts: 265 Location: Lawrence, KS

battaile · Joined: 27 Jul 2009 Posts: 38

[quote="Ilardi"][quote="battaile"][quote="Ilardi"]But then we might inadvertently conflate metric retrodictive performance with an individual modeler's skill in projecting rookie performance . . .[/quote]

Ah, I gotcha. One question, though: is that same risk not present with retrodictive performance and aging adjustment?[/quote]

Potentially . . . that's why I'm hoping we can find some agreed-upon method of age adjustment to apply to all metrics. If not, then maybe we could do a multi-part analysis:

1) Retrodiction with no adjustments for age and using a simple average rookie value for all rookies;

2) Retrodiction with age adjustment (derived in custom-tailored fashion for each metric)

3) Retrodiction with age adjustment and rookie adjustment (both derived in custom-tailored fashion)[/quote]

Ah ok, I interpreted this
"Systematic age-adjustments to each metric are permitted in projecting each player's 08-09 values;"
incorrectly, as meaning that each entry would come up with its own system (though you'd have to show that it was systematic and provide backing, not just throw out numbers), when this would actually be something standardized across all entries. OK, it all makes sense now. :)

I'd vote for number one, as I think aging adjustments are something that you can do a lot with in their own right, so for a baseline on APM retrodiction I'd rather see them left out. Then, once the baseline is established, start trying to improve on it with specific aging formulas (competition number two?).

**Ilardi** · Joined: 15 May 2008 Posts: 265 Location: Lawrence, KS

Neil Paine · Joined: 13 Oct 2005 Posts: 774 Location: Atlanta, GA

You know, I'm not sure where I got the -1.17 value for rookies, either. It's been a while.

Incidentally, I think the best way for the challenge might be this:
*Use per-minute rates
*No age adjustment
*The weighting is as follows: 3 parts Y-1, 2 parts Y-2, 1 part Y-3
*For all seasons where no data exists for a player, use the league average
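
The weighting rule above can be sketched in a few lines of Python. The function name and the `None` convention for a missing season are assumptions made for the example; the 3/2/1 weights and the league-average fill are as listed.

```python
WEIGHTS = (3, 2, 1)  # Y-1, Y-2, Y-3


def project_rate(rates, league_avg):
    """Weighted per-minute projection with league-average fill.

    `rates` holds the player's per-minute metric values for
    (Y-1, Y-2, Y-3); use None for any season with no data.
    """
    filled = [league_avg if r is None else r for r in rates]
    return sum(w * r for w, r in zip(WEIGHTS, filled)) / sum(WEIGHTS)


# A rookie (no prior seasons) projects to exactly the league average.
print(project_rate((None, None, None), 0.0))  # 0.0
```

Because a missing season is replaced by the league average, a player with only one season of data is pulled halfway toward average (3 parts own data, 3 parts league average).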

Clearly this is not going to produce the best results any of our metrics can do, but then again, that's not exactly the point here, is it? The point is to test the predictive value of past results. This method is going to be easily reproducible (no possibility of cheating), simple to execute, and not reliant on the ability to project playing time or the ability to fit a model for existing players or rookies.

It will be all about the metrics themselves and how predictive they are, with no external factors muddying the waters.

Kevin Pelton · Posted: Fri Aug 28, 2009 12:33 pm

I think Neil's proposed rules are pretty good, with the caveat that we might just zero out years where the player was not in the league, for the sake of players who are well below or above average during their smaller sample.

If we do it this way, it's pretty much a matter of providing three years' worth of numbers, right?
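
One possible reading of "zero out", sketched below, is that a missing season gets zero weight and the remaining weights are renormalized, so a player's small sample is not diluted toward league average. This interpretation, the function name, and the `None` convention for a missing season are all assumptions made for the example.

```python
def project_rate_skip_missing(rates, weights=(3, 2, 1)):
    """Weighted per-minute projection that ignores missing seasons.

    `rates` holds per-minute values for (Y-1, Y-2, Y-3); None marks a
    season in which the player was not in the league. Missing seasons
    get zero weight and the remaining weights are renormalized.
    """
    pairs = [(w, r) for w, r in zip(weights, rates) if r is not None]
    if not pairs:
        return None  # no prior data at all; caller must supply a value
    total_weight = sum(w for w, _ in pairs)
    return sum(w * r for w, r in pairs) / total_weight


# With only Y-1 available, the projection is just the Y-1 rate.
print(project_rate_skip_missing((2.0, None, None)))  # 2.0
```

Under this reading, players with no prior seasons at all still need a separately agreed value, since there is nothing left to average.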

battaile: Are you disabling BBCode in all of your posts? Your quotes are consistently messed up ...

Mike G · Posted: Fri Aug 28, 2009 1:23 pm

If I'm creating an age-effect prediction algorithm, I'm going to do it by averaging trends over the last few years. In essence, a multi-year retrodiction would be precisely the exercise that would result in said algorithm.

So, all our retrodictions with their age-effects are 'suspect', in that they in effect create the best fit with our models. Hence, it's essential that we each submit a set of retros without age-effect factors, as well as with.

Then, an actual prediction has more validity.

I guess I'm just agreeing with what battaile wrote.
_________________
36% of all statistics are wrong