APBRmetrics

kbche · Joined: 19 Jul 2005 Posts: 51 Location: washington d.c.

Hi Dan,

I am working on mathematically characterizing the teamwork aspect of NBA basketball. I am a chemical engineer and an avid fan of NBA basketball (Washington Wizards and Miami Heat). I recently read your WINVAL analysis paper dated May 30, 2004. Thanks for sharing your creative methods for adjusting plus-minus statistics.

Have you done any more work on the OLS estimates? I noticed that the r-squared value was low. Have you looked at choosing a different set of variables?

kbche (Kimberly Brown)

I presume that you are talking about the estimates that relate box score statistics to adjusted plus/minus ratings. In my line of work the R^2 values in those regressions are pretty high. In wage regressions in labor economics R^2 values of 0.01 are not uncommon.

But in a lot of regressions like these, we know that we are not explaining everything, so a "low" R^2 is to be expected. A more critical issue is whether we can get precise estimates, and in this case the estimates are not too bad. I have tried other combinations of variables, but R^2 probably will not rise a lot more until I start using non-box score statistics. I suspect that variables such as opponents' PER would improve R^2. That said, the adjusted plus/minus ratings are measured imprecisely themselves, so there is no way any set of variables is going to fully explain them.
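To make the last point concrete, here is a minimal simulation sketch, not anything from the WINVAL paper. It assumes a single made-up explanatory variable `x`, a "true" rating that is exactly `2.0 * x`, and measurement noise with standard deviation 4.0; all of those numbers are invented for illustration. Even though `x` explains the true rating perfectly, the R^2 against the noisy measured rating is capped well below 1, while the slope is still estimated fairly precisely.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

x = rng.normal(size=n)                  # hypothetical box-score composite
true_rating = 2.0 * x                   # assumed: rating fully determined by x
noise = rng.normal(scale=4.0, size=n)   # assumed error in the measured rating
measured = true_rating + noise

def r_squared(x, y):
    """R^2 of a one-variable least-squares line fitted to (x, y)."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    return 1.0 - resid.var() / y.var()

r2_true = r_squared(x, true_rating)       # essentially 1: nothing left unexplained
r2_measured = r_squared(x, measured)      # capped near 4 / (4 + 16) = 0.2
slope_measured = np.polyfit(x, measured, 1)[0]

print(f"R^2 vs true rating:     {r2_true:.3f}")
print(f"R^2 vs measured rating: {r2_measured:.3f}")
print(f"slope from noisy data:  {slope_measured:.3f}")
```

The ceiling is the share of the measured rating's variance that is signal rather than noise, so no choice of explanatory variables can push R^2 past it.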

Rather than continuing this conversation in PM, we could just take it to the board. If you would like to, feel free to copy both your post and my response.

Welcome to the board.

Best wishes,
Dan

Jon Cohodas · Joined: 08 Jul 2005 Posts: 31 Location: Richmond, VA

How amusing. I too was composing questions about Dan's regressions for email when I decided that it might be better to ask them here.

My question is the following:
I see your base equation has each observation defined as no substitutions and you estimate Margin based on average points per possession. Have you tried running the data by possession?

In other words, each possession ends with an event such as a turnover, defensive rebound, foul, score, etc.

It would seem to me that one could get at the propensities for certain events to occur on offense or defense even if they do not get a direct tabulated statistic. For example, a player who boxes out well might not get a rebound, but the probability of a rebound occurring on his watch may be higher.

It also may be possible that "weighting" multiple observations of the same 10 player combos may affect the +/- statistics.

**Dan Rosenbaum** · Posted: Thu Jul 28, 2005 10:35 am

Jon Cohodas · Joined: 08 Jul 2005 Posts: 31 Location: Richmond, VA

**Dan Rosenbaum** · Posted: Fri Jul 29, 2005 5:22 pm

Jon, every observation is a shift of a game with no substitutions. The dependent variable is the points scored by the home team during that shift minus the points scored by the away team. This point differential is expressed in points per 100 possessions.

The explanatory variables include variables for every (non-replacement) player in the league indicating whether he is playing during that shift for the home team, the away team, or not at all. In my latest version I also include some variables that account for the ages and experience of the home and away teams.

I weight each shift by the number of possessions (and factors that account for garbage/clutch play).
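The setup described in this post can be sketched as a weighted least-squares fit. This is only an illustration under stated assumptions, not Dan's actual code: the +1 / -1 / 0 coding of the player variables, the function name `fit_adjusted_plus_minus`, the toy two-on-two "league" of six players, and all the numbers are made up, and the age/experience variables and the garbage/clutch factors are left out. One player is held out as the reference (rating fixed at 0) because with equal numbers of home and away players the indicator columns sum to zero on every shift, so ratings are only identified relative to someone.

```python
import numpy as np

def fit_adjusted_plus_minus(home, away, margin_per100, possessions,
                            n_players, reference):
    """Weighted least squares of shift margin on player indicators.

    home[i], away[i]: integer arrays of players on the floor in shift i.
    reference: index of the player whose rating is fixed at 0.
    Returns (home_court_advantage, ratings).
    """
    n_shifts = len(home)
    X = np.zeros((n_shifts, n_players + 1))
    X[:, 0] = 1.0                                   # intercept: home-court edge
    for i in range(n_shifts):
        X[i, 1 + np.asarray(home[i])] = 1.0         # playing for the home team
        X[i, 1 + np.asarray(away[i])] = -1.0        # playing for the away team
    keep = [c for c in range(n_players + 1) if c != reference + 1]
    # Weighting by possessions == scaling each row by sqrt(possessions).
    sw = np.sqrt(np.asarray(possessions, dtype=float))
    A = X[:, keep] * sw[:, None]
    b = np.asarray(margin_per100, dtype=float) * sw
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    full = np.zeros(n_players + 1)
    full[keep] = coef
    return full[0], full[1:]

# Toy data: 6 hypothetical players, 2-on-2 shifts, player 5 is the reference.
rng = np.random.default_rng(1)
true_ratings = np.array([4.0, 2.0, 0.5, -1.0, -3.0, 0.0])
home, away, poss, margin = [], [], [], []
for _ in range(400):
    on_floor = rng.permutation(6)[:4]
    h, a = on_floor[:2], on_floor[2:]
    p = int(rng.integers(5, 30))                    # possessions in the shift
    noise = rng.normal(scale=30.0 / np.sqrt(p))     # short shifts are noisier
    home.append(h); away.append(a); poss.append(p)
    margin.append(3.0 + true_ratings[h].sum() - true_ratings[a].sum() + noise)

home_adv, ratings = fit_adjusted_plus_minus(home, away, margin, poss,
                                            n_players=6, reference=5)
print("home advantage:", round(home_adv, 2))
print("ratings:", np.round(ratings, 2))
```

Weighting by possessions is the natural choice if the noise in a shift's per-100-possession margin shrinks as the shift gets longer, which is what the toy data assume.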

I don't know if this answers your question, because I am not quite sure what you are asking.

The results are so noisy because it is hard for the data to assign credit to all ten players during a given shift. Lots of players play together a lot and so it is difficult to statistically separate them. The regression does not give more credit to the player who gets more blocks, steals, points, assists, etc. during a shift like we do when we watch a game. So in that sense it is less efficient than a person watching a game.

In the extreme, if two players always played together, we would never be able to get separate adjusted plus/minus ratings for them. If players were assigned to teams randomly throughout the season (switching teams every game), we could get very precise estimates using adjusted plus/minus ratings. But they are not, so the estimates have a lot of "noise."
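A small numerical sketch of that point, with made-up shifts and hypothetical players A, B, and C (opponents and weights are ignored to keep it short). When A and B are always on the floor together their columns are identical and the design matrix loses rank; a single shift apart restores the rank, but the variance of the estimate is still far larger than when the two are observed separately.

```python
import numpy as np

# Columns: players A, B, C; rows: shifts; 1 = on the floor (toy data).
X_together = np.array([
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [1, 1, 0],
], dtype=float)
rank_together = np.linalg.matrix_rank(X_together)   # A and B cannot be separated

X_split = X_together.copy()
X_split[3] = [1, 0, 0]                               # one shift with A but not B
rank_split = np.linalg.matrix_rank(X_split)

def variance_factor(n_together, n_apart):
    """Diagonal entry of (X'X)^-1 for player A in a two-player design.

    n_together shifts have both players on the floor; each player also has
    n_apart shifts alone. Larger values mean noisier estimates.
    """
    rows = ([[1.0, 1.0]] * n_together
            + [[1.0, 0.0]] * n_apart
            + [[0.0, 1.0]] * n_apart)
    X = np.array(rows)
    return np.linalg.inv(X.T @ X)[0, 0]

mostly_together = variance_factor(n_together=100, n_apart=1)
always_apart = variance_factor(n_together=0, n_apart=101)

print("rank when always together:", rank_together, "of 3 columns")
print("rank after one shift apart:", rank_split, "of 3 columns")
print("variance ratio:", round(mostly_together / always_apart, 1))
```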

Jon Cohodas · Joined: 08 Jul 2005 Posts: 31 Location: Richmond, VA

Dan,
First of all, thanks for patiently answering my questions.

**Dan Rosenbaum** · Posted: Tue Aug 02, 2005 4:12 am

I have not looked at the clutch/garbage time adjustment this year, but last season it really did not make that big of a difference. It is an ad hoc adjustment that could be improved upon.

back2newbelf · Joined: 21 Jun 2005 Posts: 260

i have a question too...
dan, i still don't fully understand your concept of reference players. in your analysis, do you only compare a specific player with a reference player or do you also compare a non-reference player with a non-reference player? also, since you have different reference players for each team, do you assume that all reference players are the same?
the thing is, i'm gonna start my pure adjusted +/- analysis soon, what i do is:

(A, B etc stands for a specific player)
A B C D E vs F G H I J
and
K(changed) B C D E vs F G H I J

to compare A and K. (basically just pure adjusted +/- if i'm not mistaken)
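A rough sketch of that pairing step, assuming each shift is stored as a tuple of (own lineup, opposing lineup, margin); the data layout, the function name `one_player_swaps`, and the margins are made up for illustration.

```python
from itertools import combinations

def one_player_swaps(shifts):
    """Find pairs of shifts with the same opponents whose lineups differ by
    exactly one player; return (player_out, player_in, margin difference)."""
    comparisons = []
    for (team1, opp1, m1), (team2, opp2, m2) in combinations(shifts, 2):
        if opp1 == opp2 and len(team1 - team2) == 1 and len(team2 - team1) == 1:
            (player_out,) = team1 - team2
            (player_in,) = team2 - team1
            comparisons.append((player_out, player_in, m1 - m2))
    return comparisons

shifts = [
    (frozenset("ABCDE"), frozenset("FGHIJ"), 5.0),
    (frozenset("KBCDE"), frozenset("FGHIJ"), 2.0),
    (frozenset("KLCDE"), frozenset("FGHIJ"), 1.0),
]
swaps = one_player_swaps(shifts)
print(swaps)
```

Only the A-for-K and B-for-L pairs qualify here; the first and third lineups differ in two players, so they are never compared directly.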

this does not allow for any cross team comparison of 2 players... i can only compare players from the same team...

[in some way it's probably not a bad thing not to be able to give a rating for all nba players in one chart since every player's value changes for whichever team he plays (an offensive oriented player might be better in a team with good defenders and vice versa). nevertheless, sometimes it might be pretty useful]

thanks for any explanations...

**Dan Rosenbaum** · Posted: Wed Aug 03, 2005 8:02 am

In essence, I treat all players who played fewer than 250 minutes in the last three seasons as one player for estimating purposes. And it is no problem that occasionally more than one reference player plays on the same team or that sometimes they play against each other.
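One way to picture the pooling, as a hedged sketch rather than the actual implementation: every player under the minutes threshold is relabeled as a single reference player, and that reference player gets no variable of its own, so its rating is the zero point everyone else is measured against. The names `REFERENCE`, `pool_reference_players`, and `encode_shift`, the +1 / -1 coding, and the minutes totals are all assumptions for illustration.

```python
REFERENCE = "REFERENCE"

def pool_reference_players(lineup, minutes_played, min_minutes=250):
    """Relabel every player under the minutes threshold as the reference."""
    return tuple(p if minutes_played.get(p, 0) >= min_minutes else REFERENCE
                 for p in lineup)

def encode_shift(home, away, minutes_played, min_minutes=250):
    """Map each non-reference player to +1 (home) or -1 (away).

    Reference players get no entry, so several of them on the floor at once,
    on the same side or on opposite sides, need no special handling.
    """
    row = {}
    for p in pool_reference_players(home, minutes_played, min_minutes):
        if p != REFERENCE:
            row[p] = 1
    for p in pool_reference_players(away, minutes_played, min_minutes):
        if p != REFERENCE:
            row[p] = -1
    return row

# Hypothetical three-season minutes totals.
minutes = {"A": 3000, "B": 120, "C": 900, "D": 40, "E": 2500}
pooled = pool_reference_players(("A", "B", "C", "D"), minutes)
row = encode_shift(home=("A", "B"), away=("D", "E"), minutes_played=minutes)
print(pooled)
print(row)
```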

Ben · Joined: 13 Jan 2005 Posts: 264 Location: Iowa City

**Dan Rosenbaum** · Posted: Wed Aug 03, 2005 11:08 am

Ben · Joined: 13 Jan 2005 Posts: 264 Location: Iowa City

back2newbelf · Joined: 21 Jun 2005 Posts: 260

- you treat all reference players as equally good?
- you compare 2 lineups only when they're all the same except for the reference player? do you use lineups that differ in a lot more than just the reference player, but differ only in one position from each other?

**Dan Rosenbaum** · Posted: Wed Aug 03, 2005 7:39 pm

back2newbelf · Joined: 21 Jun 2005 Posts: 260

don't you think that hurts the objectivity of this research?