|
APBRmetrics The statistical revolution will not be televised.
Dan Rosenbaum
Joined: 03 Jan 2005 Posts: 541 Location: Greensboro, North Carolina
|
Posted: Wed Aug 10, 2005 2:10 pm Post subject: Predictors of Adjusted Offensive and Defensive +/- Ratings |
|
|
(This was inspired by the thread started by Nikos, but once this post got really long, I figured it probably made sense to start a new thread.)
Below I have included regression results relating my adjusted offensive and defensive plus/minus ratings to various box score statistics. These are not quite the series of regressions I used to produce my statistical plus/minus ratings, but I think they should be useful in talking about how box score statistics relate to these offensive and defensive adjusted plus/minus ratings.
Here are some important points. The coefficients should be read as follows. In the offensive regression, the offensive adjusted plus/minus rating increases by 0.70 points per 40 minutes with every one-point increase in points per 40 minutes, holding all of the other variables (including true shot attempts) constant.
1. The offensive regression has a much higher R-squared than the defensive regression, which confirms the expectation that our box score statistics do a better job explaining offensive effectiveness.
2. There is slightly more variation in offensive ratings than defensive ratings.
3. The results suggest that holding the other variables constant, a player with better than a 37% true shooting percentage tends to increase offensive efficiency. My interpretation is that the reason this is not higher is that players who can create shots are valuable, i.e. that most players will see their true shooting percentage fall as their true shot attempts increase.
4. Interestingly, even after accounting for points scored, players with more three point attempts tend to have higher offensive adjusted plus/minus ratings. This suggests that in addition to the points they score, the ability of three point shooters to spread the floor is very important for offenses. Also, note that more three point attempts are not associated with worse defense. It does not appear that the long rebounds from missed three pointers are typically leading to easy transition points.
5. Players who go to the line more, holding the other variables constant, tend to be more effective on offense and defense. In fact, the effect is larger on defense.
6. As expected, offensive rebounds predict offensive effectiveness and defensive rebounds predict defensive effectiveness. Note, however, that the offensive rebounds appear to be more important.
7. Holding the other variables constant, players who turn the ball over tend to be not only less effective offensive players, but also less effective defenders.
8. Holding the other variables constant, steals are almost as important a predictor of offensive effectiveness as they are of defensive effectiveness. Part of this may be that steals often generate high percentage scoring opportunities for teammates, but I wonder if part of this isn't that players who get steals tend to do a better job of spacing and collecting loose balls on the offensive end. This may help explain why steals may not be a good predictor of team success, but are important at the individual level. Players who steal the ball a lot may do a better job helping their teammates avoid turnovers and bad shots.
9. For the whole sample there is not a strong relationship between assists and defensive effectiveness. But when I limit the sample to big men, I find that better assisters tend to be better defensive players, holding the other variables constant.
10. Holding the other variables constant, blocks predict defensive effectiveness, but not offensive effectiveness.
11. Holding the other variables constant, players with more personal fouls tend to be more effective defenders, with little effect on offensive effectiveness.
12. Holding the other variables constant, players who play more minutes per game tend to be more effective on offense and defense. This is pretty strong evidence that coaches do observe contributions that players make that are not picked up in box score statistics.
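The 37% figure in point 3 can be recovered from the PTS and TSA coefficients in the offensive regression. The sketch below is only an illustration of that arithmetic, not the procedure used to build the ratings; the function name `marginal_effect` is made up here, and it assumes the extra attempts leave every other variable unchanged.

```python
# Offensive-regression coefficients copied from the SAS output in this post.
B_PTS = 0.702730   # points per 40 minutes
B_TSA = -0.525243  # true shooting attempts per 40 minutes


def marginal_effect(ts_pct, extra_tsa=1.0):
    """Predicted change in offensive rating from extra attempts at a given TS%.

    TS% = (PTS / 2) / TSA, so each true shooting attempt yields 2 * TS% points.
    Hypothetical helper: all other predictors are assumed to stay fixed.
    """
    extra_pts = 2.0 * ts_pct * extra_tsa
    return B_PTS * extra_pts + B_TSA * extra_tsa


# The effect is zero where B_PTS * 2 * TS% + B_TSA = 0.
break_even_ts = -B_TSA / (2.0 * B_PTS)

print(f"break-even TS%: {break_even_ts:.3f}")
print(f"effect of one extra attempt at 50% TS: {marginal_effect(0.50):+.3f}")
print(f"effect of one extra attempt at 30% TS: {marginal_effect(0.30):+.3f}")
```

The break-even value comes out a little above 0.37, which is where the "better than a 37% true shooting percentage" statement comes from.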
Code: | The SAS System 14:08 Wednesday, August 10, 2005 63
Model: MODEL1
Dependent Variable: OFF1

Analysis of Variance
                       Sum of           Mean
Source      DF        Squares         Square   F Value   Prob>F
Model       12   9291008.6501   774250.72084   118.118   0.0001
Error     1081   7085843.2932   6554.8966634
C Total   1093   16376851.943

Root MSE   80.96232       R-square   0.5673
Dep Mean   -0.42353       Adj R-sq   0.5625
C.V.       -19115.86189

Parameter Estimates
                Parameter     Standard    T for H0:
Variable  DF     Estimate        Error  Parameter=0  Prob > |T|
INTERCEP   1    -7.056284   0.61411305      -11.490      0.0001
PTS        1     0.702730   0.06387650       11.001      0.0001
TSA        1    -0.525243   0.06276998       -8.368      0.0001
FTA        1     0.083834   0.06323568        1.326      0.1852
TA         1     0.327152   0.04249266        7.699      0.0001
AS         1     0.640857   0.04863086       13.178      0.0001
OR         1     0.733202   0.10084425        7.271      0.0001
DR         1    -0.138614   0.05560930       -2.493      0.0128
TO         1    -1.042591   0.14327755       -7.277      0.0001
ST         1     0.713849   0.14956205        4.773      0.0001
BK         1    -0.111075   0.10316250       -1.077      0.2819
PF         1    -0.093128   0.08545434       -1.090      0.2760
MPG        1     0.043603   0.01161761        3.753      0.0002

The SAS System 14:08 Wednesday, August 10, 2005 64
Model: MODEL2
Dependent Variable: DEF1

Analysis of Variance
                       Sum of           Mean
Source      DF        Squares         Square   F Value   Prob>F
Model       12   4432015.5876    369334.6323    48.579   0.0001
Error     1081   8218599.2007   7602.7744687
C Total   1093   12650614.788

Root MSE   87.19389       R-square   0.3503
Dep Mean   0.30028        Adj R-sq   0.3431
C.V.       29037.83409

Parameter Estimates
                Parameter     Standard    T for H0:
Variable  DF     Estimate        Error  Parameter=0  Prob > |T|
INTERCEP   1    -3.683703   0.66138060       -5.570      0.0001
PTS        1    -0.067574   0.06879300       -0.982      0.3262
TSA        1    -0.105195   0.06760132       -1.556      0.1200
FTA        1     0.179179   0.06810286        2.631      0.0086
TA         1     0.007954   0.04576327        0.174      0.8620
AS         1     0.035210   0.05237392        0.672      0.5015
OR         1    -0.126936   0.10860612       -1.169      0.2428
DR         1     0.393748   0.05988948        6.575      0.0001
TO         1    -0.382290   0.15430545       -2.477      0.0134
ST         1     1.080512   0.16107366        6.708      0.0001
BK         1     1.014717   0.11110280        9.133      0.0001
PF         1     0.309126   0.09203166        3.359      0.0008
MPG        1     0.057194   0.01251181        4.571      0.0001 |
Dependent variable first regression - adjusted offensive plus/minus rating
Dependent variable second regression - adjusted defensive plus/minus rating
PTS - points per 40 minutes
TSA - true shooting attempts per 40 minutes
FTA - free throw attempts per 40 minutes
TA - three point attempts per 40 minutes
AS - assists per 40 minutes
OR - offensive rebounds per 40 minutes
DR - defensive rebounds per 40 minutes
TO - turnovers per 40 minutes
ST - steals per 40 minutes
BK - blocks per 40 minutes
PF - personal fouls per 40 minutes
MPG - minutes per game
All variables are pace adjusted. The regressions use data from 2002-03 through 2004-05 and are weighted by minutes played. |
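As a rough illustration of how the posted coefficients turn a stat line into predicted ratings, here is a minimal sketch. The coefficients are copied from the two tables above; the `predict` helper and the `player` stat line are made up for this example, and the post itself notes that these are not quite the regressions behind the published statistical plus/minus ratings.

```python
# Coefficients copied from the two SAS tables in the post above.
OFF_COEF = {
    "INTERCEP": -7.056284, "PTS": 0.702730, "TSA": -0.525243, "FTA": 0.083834,
    "TA": 0.327152, "AS": 0.640857, "OR": 0.733202, "DR": -0.138614,
    "TO": -1.042591, "ST": 0.713849, "BK": -0.111075, "PF": -0.093128,
    "MPG": 0.043603,
}
DEF_COEF = {
    "INTERCEP": -3.683703, "PTS": -0.067574, "TSA": -0.105195, "FTA": 0.179179,
    "TA": 0.007954, "AS": 0.035210, "OR": -0.126936, "DR": 0.393748,
    "TO": -0.382290, "ST": 1.080512, "BK": 1.014717, "PF": 0.309126,
    "MPG": 0.057194,
}


def predict(coef, stats):
    """Intercept plus the sum of coefficient * stat over every predictor."""
    return coef["INTERCEP"] + sum(
        coef[name] * stats[name] for name in coef if name != "INTERCEP"
    )


# Hypothetical pace-adjusted per-40-minute stat line (not a real player).
player = {
    "PTS": 20.0, "TSA": 18.0, "FTA": 6.0, "TA": 4.0, "AS": 5.0, "OR": 1.5,
    "DR": 5.0, "TO": 3.0, "ST": 1.5, "BK": 0.5, "PF": 3.0, "MPG": 34.0,
}

off_rating = predict(OFF_COEF, player)
def_rating = predict(DEF_COEF, player)
print(f"predicted offensive rating: {off_rating:+.2f}")
print(f"predicted defensive rating: {def_rating:+.2f}")
```

Because the model is linear, raising PTS by one while holding everything else fixed moves the offensive prediction by exactly the PTS coefficient, which is the reading of the coefficients described at the top of the post.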
|
|
|
kjb
Joined: 03 Jan 2005 Posts: 865 Location: Washington, DC
|
Posted: Wed Aug 10, 2005 3:01 pm Post subject: |
|
|
I think this is GREAT stuff. I'll let others comment on the math, but I love stuff like:
Quote: | 7. Holding the other variables constant, players who turn the ball over tend to be not only less effective offensive players, but also less effective defenders. |
|
|
|
|
Eli W
Joined: 01 Feb 2005 Posts: 402
|
Posted: Wed Aug 10, 2005 3:06 pm Post subject: |
|
|
Sorry if this is obvious, but what's the formula for true shooting attempts? I haven't seen that terminology. |
|
|
|
Dan Rosenbaum
Joined: 03 Jan 2005 Posts: 541 Location: Greensboro, North Carolina
|
Posted: Wed Aug 10, 2005 3:12 pm Post subject: |
|
|
John Quincy wrote: | Sorry if this is obvious, but what's the formula for true shooting attempts? I haven't seen that terminology. |
Good question. It is just the denominator of true shooting percentage.
True shooting attempts = FGA + 0.44*FTA
True shooting percentage = (PTS/2)/(FGA + 0.44*FTA)
Sometimes I think I also refer to "true shooting attempts" as "true shot attempts." |
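The two formulas above translate directly into code. This is just a small sketch of those definitions; the function names and the sample numbers are made up for illustration.

```python
def true_shooting_attempts(fga, fta):
    """True shooting attempts = FGA + 0.44 * FTA."""
    return fga + 0.44 * fta


def true_shooting_pct(pts, fga, fta):
    """True shooting percentage = (PTS / 2) / (FGA + 0.44 * FTA)."""
    return (pts / 2.0) / true_shooting_attempts(fga, fta)


# Hypothetical per-40-minute line: 20 points on 15 field goal attempts
# and 5 free throw attempts.
tsa = true_shooting_attempts(15.0, 5.0)
ts_pct = true_shooting_pct(20.0, 15.0, 5.0)
print(f"TSA = {tsa:.1f}, TS% = {ts_pct:.3f}")
```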
|
|
|
olcoach43
Joined: 10 Aug 2005 Posts: 28 Location: Indianapolis, Indiana
|
Posted: Wed Aug 10, 2005 3:45 pm Post subject: Defensive predictor #7 |
|
|
Dan, I assume that these ratings are a function of what happens to the team when a given player is on the floor? (If not, can you clarify?)
If so, the defensive performance is most likely negative when a poor ballhandler (turnovers) is on the floor, because of two perfectly logical factors:
1. There may well be a points-off-turnovers factor, depending upon where the turnover occurs and what type it is, e.g., a quick basket.
2. Even if the turnovers do not lead directly to quick baskets, points per possession will suffer while turnovers are occurring, and chemistry and defensive effort will break down within the team concept.
On the bench we would be focused upon breaking a negative momentum and assistants would be hollering for me to get his a** out of there! |
|
|
|
Eli W
Joined: 01 Feb 2005 Posts: 402
|
Posted: Wed Aug 10, 2005 4:03 pm Post subject: |
|
|
Dan, this stuff is great. I've already plugged in the coefficients to an Excel formula and gone to work. One initial thing I noticed was how poorly Ben Gordon came out defensively. I didn't use any pace adjustments, but I have him ranking third worst in the league for 04-05 (behind Troy Hudson and DeShawn Stevenson). You have Hudson and Stevenson both ranked in the 0 percentile in defensive statistical plus/minus in your 82games article, but Gordon is listed as in the 52nd percentile. Is this big difference for Gordon due to the fact that you used slightly different regressions (or due to pace adjustments)? Or did I err in my calculations? |
|
|
|
olcoach43
Joined: 10 Aug 2005 Posts: 28 Location: Indianapolis, Indiana
|
Posted: Wed Aug 10, 2005 4:17 pm Post subject: Ben Gordon |
|
|
John,
I am not capable of doing the statistical work you just described. However, I am heartened by your findings. My work shows Gordon to be one of the worst "off the ball" players in the NBA, ranked 5th from the bottom of all 500+ players.
His A/TO ratio is .88 and he contributes virtually nothing in any other categories. He is a known poor defender, so I have been wondering about this issue as well.
Last edited by olcoach43 on Wed Aug 10, 2005 4:42 pm; edited 1 time in total |
|
|
|
jkubatko
Joined: 05 Jan 2005 Posts: 702 Location: Columbus, OH
|
Posted: Wed Aug 10, 2005 4:31 pm Post subject: |
|
|
Dan, did you look into the problem of multicollinearity at all? If you are just interested in obtaining predictions, then multicollinearity is not a big deal. But if your goal is to understand how each predictor influences the response, then multicollinearity is a *big* problem. Based on your initial post it seems like you are interested in the latter. _________________ Regards,
Justin Kubatko
Basketball-Reference.com |
|
|
|
Dan Rosenbaum
Joined: 03 Jan 2005 Posts: 541 Location: Greensboro, North Carolina
|
Posted: Wed Aug 10, 2005 6:28 pm Post subject: |
|
|
John Quincy wrote: | Dan, this stuff is great. I've already plugged in the coefficients to an Excel formula and gone to work. One initial thing I noticed was how poorly Ben Gordon came out defensively. I didn't use any pace adjustments, but I have him ranking third worst in the league for 04-05 (behind Troy Hudson and DeShawn Stevenson). You have Hudson and Stevenson both ranked in the 0 percentile in defensive statistical plus/minus in your 82games article, but Gordon is listed as in the 52nd percentile. Is this big difference for Gordon due to the fact that you used slightly different regressions (or due to pace adjustments)? Or did I err in my calculations? |
It won't give the same results for a few different reasons.
1. Pace adjustments - although for the Bulls I cannot imagine this matters much.
2. Different coefficients - First, I use a slightly different specification of variables, which probably makes a slight difference. Second, I use the results from several regressions to get the statistical plus/minus ratings. One regression uses the whole sample, but others limit the sample by position or by true shooting attempts. Thus, the coefficients that I use are different than those in the post above.
3. I adjust the statistical defensive plus/minus ratings by team so that they add up to the defensive rating. I do the same with the adjusted plus/minus ratings, but those adjustments are small. But the stats do such a poor job predicting defensive effectiveness, the adjustment is sometimes pretty large, especially for a team like the Bulls in 2004-05.
4. Remember my ratings are predictions for 2005-06. And remember I find that rookies improve a lot. Throw in the improvement due to getting older as well, and I bet that adjustment adds a point or two to Gordon's rating.
Another thing to remember is that the percentiles that I give are weighted and by position. Shooting guards tend to have low defensive adjusted plus/minus ratings, so overall Gordon is rated much lower than the 52nd percentile with the statistical rating.
But none of this takes away from the fact that in 2004-05, Gordon's defensive adjusted plus/minus rating was phenomenal. This is not affected by the adjustments above, and it was higher than what I am projecting for 2005-06. So yes, the stats are saying that Gordon is a so-so defender or worse if I don't make the adjustments above, but the Bulls played great defense when he was in the game in 2004-05. |
|
|
|
Dan Rosenbaum
Joined: 03 Jan 2005 Posts: 541 Location: Greensboro, North Carolina
|
Posted: Wed Aug 10, 2005 6:39 pm Post subject: |
|
|
jkubatko wrote: | Dan, did you look into the problem of multicollinearity at all? If you are just interested in obtaining predictions, then multicollinearity is not a big deal. But if your goal is to understand how each predictor influences the response, then multicollinearity is a *big* problem. Based on your initial post it seems like you are interested in the latter. |
In general, I am mostly concerned about prediction with these results, but for the discussion in this thread I am trying to make statements about how each variable is correlated with the adjusted plus/minus ratings, holding the other variables constant.
Multicollinearity, in the classic case, does not bias coefficient estimates. It just results in larger standard errors. But most of the coefficient estimates are fairly precisely estimated, so I do not really have a standard error problem. (The standard errors are probably a bit overstated, because I have not accounted for autocorrelation in the errors of multiple observations of the same player.)
Now it is possible that in cases of severe multicollinearity where the model is misspecified, multicollinearity might bias coefficient estimates. But except for the two rebounding variables, this is probably not a big problem in these regressions. So I disagree that multicollinearity is a "big" problem in these regressions. |
|
|
|
jkubatko
Joined: 05 Jan 2005 Posts: 702 Location: Columbus, OH
|
Posted: Wed Aug 10, 2005 9:40 pm Post subject: |
|
|
Dan Rosenbaum wrote: | Players who go to the line more, holding the other variables constant, tend to be more effective on offense and defense. In fact, the effect is larger on defense. |
Based on your SAS output, the coefficient for free throw attempts is not statistically significantly different from 0. Since true shooting attempts is a linear combination that includes free throw attempts, I'm wondering if the correlation between true shooting attempts and free throw attempts (which is likely to be at least 0.85) is causing this. _________________ Regards,
Justin Kubatko
Basketball-Reference.com |
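The significance statement above can be checked from the FTA row of the offensive regression (estimate 0.083834, standard error 0.06323568). The sketch below uses a normal approximation, which is an assumption made here for simplicity; with 1081 error degrees of freedom the t distribution that SAS uses is very close to it.

```python
import math

# FTA row of the offensive regression (MODEL1) in the SAS output.
estimate = 0.083834
std_error = 0.06323568

# t statistic for H0: coefficient = 0.
t_stat = estimate / std_error

# Two-sided p-value under the standard normal approximation.
p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))

# Approximate 95% confidence interval (1.96 is the normal critical value).
half_width = 1.96 * std_error
ci_low, ci_high = estimate - half_width, estimate + half_width

print(f"t = {t_stat:.3f}, two-sided p = {p_value:.4f}")
print(f"approx. 95% CI: ({ci_low:.3f}, {ci_high:.3f})")
```

The interval straddles zero, which is the same statement as the p-value being above 0.05.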
|
|
|
Dan Rosenbaum
Joined: 03 Jan 2005 Posts: 541 Location: Greensboro, North Carolina
|
Posted: Wed Aug 10, 2005 10:27 pm Post subject: |
|
|
jkubatko wrote: | Dan Rosenbaum wrote: | Players who go to the line more, holding the other variables constant, tend to be more effective on offense and defense. In fact, the effect is larger on defense. |
Based on your SAS output, the coefficient for free throw attempts is not statistically significantly different from 0. Since true shooting attempts is a linear combination that includes free throw attempts, I'm wondering if the correlation between true shooting attempts and free throw attempts (which is likely to be at least 0.85) is causing this. |
First, I should add that looking at the FTA coefficient (or the TA coefficient), holding the other variables constant, means that we are holding true shot attempts and points constant. In other words, it is giving us the effect of substituting free throw attempts for two or three point field goal attempts without total points scored changing.
The correlation is a bit lower than 0.85 at 0.66842 and yes, that contributes to a larger standard error. But a 95% confidence interval between -0.05 and 0.22 seems reasonably precise to me.
The most highly correlated variables are points and true shot attempts, which have a correlation over 0.96. I am able to separately identify these two variables in the offensive rating equation but, as expected, not in the defensive rating equation. Partially for this reason, I use a slightly different specification when I compute the statistical ratings. |
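To get a feel for how much correlations of this size inflate standard errors, one rough gauge is the variance inflation factor. The sketch below uses only the pairwise correlations quoted in this post; that is an assumption made for illustration, since the full VIF for a predictor uses the R-squared from regressing it on all of the other predictors, so these pairwise figures are lower bounds. The function name `pairwise_vif` is made up here.

```python
import math


def pairwise_vif(r):
    """Variance inflation implied by a single pairwise correlation r.

    With only two predictors, VIF = 1 / (1 - r^2). In a model with more
    predictors this is a lower bound on the actual VIF.
    """
    return 1.0 / (1.0 - r * r)


# Correlations quoted in the post; 0.96 is itself a lower bound ("over 0.96").
quoted = [("TSA vs FTA", 0.66842), ("PTS vs TSA", 0.96)]

for label, r in quoted:
    vif = pairwise_vif(r)
    # The standard error is inflated by the square root of the VIF.
    print(f"{label}: r = {r:.5f}, VIF >= {vif:.2f}, SE inflation >= {math.sqrt(vif):.2f}x")
```

On this rough measure, the TSA/FTA correlation inflates the standard error by well under a factor of two, while the PTS/TSA correlation inflates it several times over, which fits the remark that those two are the hardest pair to separate.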
|
|
|
jkubatko
Joined: 05 Jan 2005 Posts: 702 Location: Columbus, OH
|
Posted: Wed Aug 10, 2005 10:35 pm Post subject: |
|
|
Dan Rosenbaum wrote: | The correlation is a bit lower than 0.85 at 0.66842 and yes, that contributes to a larger standard error. But a 95% confidence interval between -0.05 and 0.22 seems reasonably precise to me. |
Ah, I forgot you were using rates per 40 minutes rather than the raw totals. That explains my overestimate of the correlation coefficient. _________________ Regards,
Justin Kubatko
Basketball-Reference.com |
|
|
|
jkubatko
Joined: 05 Jan 2005 Posts: 702 Location: Columbus, OH
|
Posted: Wed Aug 10, 2005 10:49 pm Post subject: |
|
|
Dan, have you thought about using FTA/FGA instead of FTA? I'm just curious how that would influence the results. _________________ Regards,
Justin Kubatko
Basketball-Reference.com |
|
|
|
Eli W
Joined: 01 Feb 2005 Posts: 402
|
Posted: Thu Aug 11, 2005 2:07 pm Post subject: |
|
|
What are the coefficients for estimating overall adjusted plus/minus? Are they just the sums of the offensive and defensive coefficients? |
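The thread does not answer this question, so the following is only a general property of least squares, not a statement about how the overall ratings are actually computed. If the overall rating were defined as the offensive rating plus the defensive rating, and both were regressed on the same predictors over the same sample with the same weights, the coefficients would add. The sketch checks this for a one-predictor regression on made-up numbers; `ols_slope` and all of the data are hypothetical.

```python
def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Made-up data: one predictor and two outcomes for five hypothetical players.
pts = [10.0, 14.0, 18.0, 22.0, 26.0]
off = [-3.0, -1.0, 0.5, 2.0, 4.5]
dfn = [0.5, -0.5, 0.0, 1.0, -1.0]
total = [o + d for o, d in zip(off, dfn)]

slope_off = ols_slope(pts, off)
slope_def = ols_slope(pts, dfn)
slope_total = ols_slope(pts, total)

print(f"offense slope + defense slope = {slope_off + slope_def:.6f}")
print(f"slope for (offense + defense) = {slope_total:.6f}")
```

The same additivity holds with many predictors and with weights, as long as both regressions share the same design. It would not hold if the overall rating were estimated separately or with a different specification.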
|
|
|