Author |
Message |
Joe
Joined: 27 Sep 2009 Posts: 94 Location: Long Island, NY
|
Posted: Wed Jan 27, 2010 3:19 pm Post subject: APM case study |
|
|
I've never really gotten too much into APM, aside from occasionally glancing at it when trying to analyze a player's defense. I just can't understand when it spits out results like the following, so maybe someone can give me some feedback as to how exactly this is possible.
http://basketballvalue.com/teamplayers.php?year=2009-2010&team=DEN
Going by 1 year APM, the Nuggets only have three above average players this season, and they all play the PF/C positions (Birdman, Nene, Kenyon). Lawson and Billups are massively below average players (so far below average that being above average for either of them doesn't even fall within the SE), and even Melo is below average.
I just want to know, what on earth is going on in games to make the statistic spit out this result? How is it coming to the conclusion that the Nuggets' entire backcourt sucks (including their four highest usage players) in spite of them being the second best team in the West? How are Chauncey Billups and Tywon Lawson playing at a level that suggests they're not even replacement level players?
I get APM is inexact and there's standard error and all this, but when I see stuff like this, it makes me wary about trusting anything, because these results are so unbelievably wonky that it just makes me question everything. I'm really curious as to how exactly this happens, because maybe it can help me better understand what I'm missing here. _________________ http://www.hoopdata.com |
|
Joe
Joined: 27 Sep 2009 Posts: 94 Location: Long Island, NY
|
Posted: Wed Jan 27, 2010 3:21 pm Post subject: |
|
|
And on a slightly related note, if Joe Sill reads this, I'd be curious if you plan on cranking out some RAPM for this season, seeing how we're at the midway point. Would be curious to see how your numbers evaluate this situation, and maybe the Four Factors numbers could give some more insight. _________________ http://www.hoopdata.com |
|
Crow
Joined: 20 Jan 2009 Posts: 800
|
Posted: Wed Jan 27, 2010 3:39 pm Post subject: |
|
|
Based on the 2 year data at basketballvalue the basic story with Hilario, Martin and Andersen is still positive overall, just far less extreme.
Based on hoopnumbers previous 3 year regularized data Hilario's defensive performance helped more than his offense hurt. Martin helped with both but mainly on defense. Andersen just barely helped on defense and the offensive drag was bigger. The effects are all much more muted and much more believable to me.
Billups, Smith and Melo were all big positives on offense and light to moderate (Melo) defensive liabilities by hoopnumbers' regularized numbers.
The stories between these versions of Adjusted +/- are very different.
I too hope to see regularized Adjusted data for this season at some point. I'd feel more comfortable with that and I think less extreme cases / outlier results would go over better with more people. Though that data is still worth seeing and thinking about.
Last edited by Crow on Wed Jan 27, 2010 3:46 pm; edited 1 time in total |
|
Joe
Joined: 27 Sep 2009 Posts: 94 Location: Long Island, NY
|
Posted: Wed Jan 27, 2010 3:45 pm Post subject: |
|
|
Crow wrote: | Based on the 2 year data at basketballvalue the basic story with Hilario, Martin and Andersen is still positive overall, just far less extreme.
Based on hoopnumbers previous 3 year regularized data Hilario's defensive performance helped more than his offense hurt. Martin helped with both but mainly on defense. Andersen just barely helped on defense and the offensive drag was bigger. The effects are all much more muted and much more believable to me.
Billups, Smith and Melo were all big positives on offense and light to moderate (Melo) on defensive liability.
The stories between the versions of Adjusted are very different.
I too hope to see regularized data for this season at some point. I'd feel more comfortable with that and I think less extreme cases / outlier results would go over better with more people. Though that data is still worth seeing and thinking about. |
I only want to know about what's going on this season. I have no interest in previous seasons. I want to know how Billups/Lawson are massively below average players THIS SEASON, along with the entirety of the Nuggets' roster aside from three players at two positions. _________________ http://www.hoopdata.com |
|
DSMok1
Joined: 05 Aug 2009 Posts: 593 Location: Where the wind comes sweeping down the plains
|
Posted: Wed Jan 27, 2010 3:53 pm Post subject: |
|
|
I thought I'd post an approximation of what perhaps we SHOULD see: the statistical plus/minus for Denver so far. This should give us an idea of where things may be a little weird:
Code: |
Player               SPM  Minutes  Contrib  StdErr    PARP
Carmelo Anthony     4.97    67.9%     3.37    4.18    6.68
Nene Hilario        4.23    71.0%     3.00    2.73    6.47
Chauncey Billups    5.16    55.1%     2.84    3.67    5.53
Chris Andersen      1.37    44.3%     0.61    2.66    2.77
Kenyon Martin      -0.77    67.2%    -0.52    2.64    2.76
Ty Lawson          -0.12    42.0%    -0.05    3.29    2.00
Arron Afflalo      -1.44    53.0%    -0.76    2.56    1.82
J.R. Smith         -1.92    47.3%    -0.91    3.85    1.40
Anthony Carter     -3.38    22.9%    -0.77    3.45    0.34
Renaldo Balkman     0.10     3.0%     0.00    9.17    0.15
Malik Allen        -9.90     6.6%    -0.65    5.66   -0.33
Johan Petro       -19.62     2.4%    -0.47   11.59   -0.35
Joey Graham        -7.46    17.3%    -1.29    4.04   -0.45 |
SPM = Statistical Plus/Minus (in pts/100 possessions), Contrib = SPM*Min%, StdErr = Standard Error of the SPM estimate, PARP = Points Above Replacement Player (in pts/100 team possessions)
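The column definitions can be checked against the table itself. The sketch below recomputes Contrib as SPM × Min% for a few rows and back-solves the replacement level that the PARP column seems to imply. The post does not state a replacement level; the value of roughly -4.88 pts/100 possessions is only what the printed numbers suggest, under the assumption that PARP = (SPM - replacement) × Min%.

```python
# Rows copied from the table: (player, SPM, minutes share, Contrib, PARP)
rows = [
    ("Carmelo Anthony", 4.97, 0.679, 3.37, 6.68),
    ("Nene Hilario", 4.23, 0.710, 3.00, 6.47),
    ("Chauncey Billups", 5.16, 0.551, 2.84, 5.53),
    ("Kenyon Martin", -0.77, 0.672, -0.52, 2.76),
    ("Ty Lawson", -0.12, 0.420, -0.05, 2.00),
]


def contrib(spm, min_share):
    """Contrib = SPM * Min%, as defined in the post."""
    return spm * min_share


def implied_replacement(spm, min_share, parp):
    """Back-solve the replacement level.

    Assumption (not stated in the post): PARP = (SPM - replacement) * Min%.
    """
    return spm - parp / min_share


for name, spm, share, listed_contrib, parp in rows:
    print(
        f"{name:17s} contrib={contrib(spm, share):6.2f} "
        f"(listed {listed_contrib:6.2f})  "
        f"implied replacement={implied_replacement(spm, share, parp):6.2f}"
    )
```

If the assumption holds, every row should land near the same implied replacement value, which is a quick sanity check on how the PARP column was built.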
Here is my interpretation of what is happening with Denver's APM: Nene, Andersen, and Martin are highly collinear. From looking at their units, it appears that 2 of the 3 are ALWAYS on the floor. Basically, the frontcourt and backcourt rotations have no overlap, so their contributions cannot be decoupled.
Based on 82games data, only a very few Nuggets players have played time at both SF and PF (the link between the backcourt and frontcourt). EVERY one of those players has had a far higher net +/- at SF than at PF, thus overvaluing all the frontcourt players. This VERY SMALL sample is dominating the interaction between frontcourt and backcourt. Does that make sense?
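A toy illustration of the decoupling problem, as a minimal sketch rather than the basketballvalue model: it ignores opponents and assumes, as an idealization of the observation above, that every lineup uses exactly 2 of the 3 bigs and 3 of 5 perimeter players. Under that assumption, rating points can be moved from the backcourt to the frontcourt without changing the prediction for any lineup, so the data alone cannot tell the two sets of ratings apart.

```python
from itertools import combinations

bigs = ["Nene", "Martin", "Andersen"]
perimeter = ["Billups", "Anthony", "Afflalo", "Lawson", "Smith"]

# Idealized rotation: every lineup has exactly 2 bigs and 3 perimeter players.
lineups = [
    set(big_pair) | set(perimeter_trio)
    for big_pair in combinations(bigs, 2)
    for perimeter_trio in combinations(perimeter, 3)
]


def predict(ratings, lineup):
    """Additive model: lineup rating is the sum of its players' ratings."""
    return sum(ratings[name] for name in lineup)


# Hypothetical ratings A: everybody is exactly average.
ratings_a = {name: 0.0 for name in bigs + perimeter}

# Ratings B: shift credit to the bigs and take it from the perimeter players.
# Each lineup gains 2 * delta from its bigs and loses 3 * (2 * delta / 3).
delta = 6.0
ratings_b = dict(ratings_a)
for name in bigs:
    ratings_b[name] += delta
for name in perimeter:
    ratings_b[name] -= 2 * delta / 3

max_gap = max(
    abs(predict(ratings_a, lineup) - predict(ratings_b, lineup))
    for lineup in lineups
)
print(len(lineups), "lineups, largest prediction gap:", max_gap)
```

In this toy setup "all bigs +6, all perimeter players -4" fits every lineup exactly as well as "everyone is average". In real data the tie is broken only by the few minutes that violate the pattern, which is why a very small sample can end up deciding the split.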
Last edited by DSMok1 on Wed Jan 27, 2010 3:54 pm; edited 1 time in total |
|
Crow
Joined: 20 Jan 2009 Posts: 800
|
Posted: Wed Jan 27, 2010 3:53 pm Post subject: |
|
|
Okay, then we differ some.
I think the multi-year products may be more useful than the one-year versions, even for understanding this season.
But hoopnumbers can produce 1-year regularized numbers, and it sounds like we'd both be interested in that.
In the end I'll be as interested, or more, in a freshened multi-year regularized product as my main source, though I'm for having and using them all. |
|
Crow
Joined: 20 Jan 2009 Posts: 800
|
Posted: Wed Jan 27, 2010 3:56 pm Post subject: |
|
|
Would you agree that SPM misses shot defense and therefore might rate the offensive-biased perimeter guys too high and the defensive-biased interior guys too low? |
|
Crow
Joined: 20 Jan 2009 Posts: 800
|
Posted: Wed Jan 27, 2010 4:05 pm Post subject: |
|
|
DSMok1 you've used stat weights for SPM based on non-regularized Adjusted +/- so far, correct? Any interest in pushing on to using stat weights for SPM based on regularized Adjusted +/-?
Any other ways you see for the two methods to help improve each other?
Any valid way for SPM to help improve the initial APM that in turn helps the SPM and so on?
What is your view on after-production blends of them? Equal or mostly statistical or different ratios for offense and defense? |
|
Mike G
Joined: 14 Jan 2005 Posts: 3552 Location: Hendersonville, NC
|
Posted: Wed Jan 27, 2010 4:09 pm Post subject: |
|
|
I was browsing bv.com also, and was struck by how many teams, if they went with their 5 best (APM) players, would be using either 4 bigs and a SF, or 4 guards and a F.
As emphatically as I don't understand these things, DSMok's synopsis has a definite ring of believability: Teams which very rarely use 'big' or 'small' lineups can get wacky results from APM, based on the results of relatively few minutes. _________________ `
36% of all statistics are wrong |
|
DSMok1
Joined: 05 Aug 2009 Posts: 593 Location: Where the wind comes sweeping down the plains
|
Posted: Wed Jan 27, 2010 4:24 pm Post subject: |
|
|
Crow wrote: | DSMok1 you've used stat weights for SPM based on non-regularized Adjusted +/- so far, correct? Any interest in pushing on to using stat weights for SPM based on regularized Adjusted +/-?
Any other ways you see for the two methods to help improve each other?
Any valid way for SPM to help improve the initial APM that in turn helps the SPM and so on?
What is your view on after-production blends of them? Equal or mostly statistical or different ratios for offense and defense? |
I don't think regularized APM is better--it simply is regressing all players to 0 based on the standard error of their estimate (roughly). Using SPM weights based on that would most certainly not be better. The SPM Neil derived is based on enough years to balance out the noise in the APM.
I don't like APM much, raw, because it has SO much noise. I don't think one can get much useful info out of a 1-year APM, and any longer samples don't focus in on the year in question. |
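The "regressing to 0 based on the standard error (roughly)" description can be sketched one player at a time. This is a simplification and an assumption on my part: actual regularized APM fits all players jointly, while the function below treats each estimate independently and shrinks it by a reliability factor. `PRIOR_SD` and the example values are hypothetical, not taken from basketballvalue or hoopnumbers.

```python
def shrink_toward_zero(raw, std_err, prior_sd):
    """Shrink a raw estimate toward 0; noisier estimates shrink more.

    reliability = prior variance / (prior variance + error variance)
    """
    reliability = prior_sd ** 2 / (prior_sd ** 2 + std_err ** 2)
    return raw * reliability


PRIOR_SD = 3.0  # hypothetical spread of true player ratings, pts/100 poss

# Hypothetical (raw APM, standard error) pairs, for illustration only.
examples = [(-12.0, 6.0), (-12.0, 3.0), (8.0, 1.5)]

for raw, std_err in examples:
    shrunk = shrink_toward_zero(raw, std_err, PRIOR_SD)
    print(f"raw={raw:6.1f}  se={std_err:4.1f}  shrunk={shrunk:6.2f}")
```

The same raw -12 ends up much closer to 0 when its standard error is 6 than when it is 3, which is the behavior described above.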
|
Crow
Joined: 20 Jan 2009 Posts: 800
|
Posted: Wed Jan 27, 2010 4:34 pm Post subject: |
|
|
Mike, I was about to check into how polarized overall Adjusted +/- is for interiors vs perimeters and try to understand more about the teams that are polarized one way vs. another vs. not. I assume that the offensive and defensive splits tend to be even more polarized.
DSMok1 so would you consider going roughly 80% Statistical / 20% Adjusted like Dan Rosenbaum did with his Overall Statistical shortly after the beginning of all this as 1) a step back or 2) not much of an improvement or 3) select your own phrase?
I think the results of regularized APM are overall better than the non-regularized, more so for players at each extreme, but leave that to jsill to address if he wishes. |
|
DSMok1
Joined: 05 Aug 2009 Posts: 593 Location: Where the wind comes sweeping down the plains
|
Posted: Wed Jan 27, 2010 4:55 pm Post subject: |
|
|
Crow wrote: |
DSMok1 so would you consider going roughly 80% Statistical / 20% Adjusted like Dan Rosenbaum did with his Overall Statistical shortly after the beginning of all this as 1) a step back or 2) not much of an improvement or 3) select your own phrase?
I think the results of regularized APM are overall better than the non-regularized, more so for players at each extreme, but leave that to jsill to address if he wishes. |
I tend to prefer looking at each rating separately to get a good overall feel. Regularized adds skew to the results; I prefer to add in the skew manually by regressing towards a moving target based on the player's MPG, age, position, and other values that are relatively orthogonal to the plus/minus valuation.
Rosenbaum's idea was to find the minimum possible stderr for each player's estimate. SPMs have a good bit less noise, so for small sample sizes (like 1/2 of a season) SPM would dominate the average. |
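One textbook way to get a minimum-stderr combination of two estimates is inverse-variance weighting, sketched below. Whether this matches Rosenbaum's actual procedure is not confirmed here, and the sketch assumes the SPM and APM errors are independent, which is doubtful if the SPM weights were themselves fit to APM. The SPM value and stderr are Billups' from the table earlier in the thread; the APM value and stderr are hypothetical.

```python
import math


def inverse_variance_blend(spm, spm_se, apm, apm_se):
    """Blend two estimates, weighting each by 1 / stderr**2.

    Returns (weight on SPM, blended estimate, stderr of the blend).
    Assumes the two errors are independent.
    """
    w_spm = apm_se ** 2 / (spm_se ** 2 + apm_se ** 2)
    blended = w_spm * spm + (1.0 - w_spm) * apm
    blended_se = math.sqrt(1.0 / (1.0 / spm_se ** 2 + 1.0 / apm_se ** 2))
    return w_spm, blended, blended_se


# Billups' SPM and stderr from the table; APM numbers are made up.
spm, spm_se = 5.16, 3.67
apm, apm_se = -10.0, 2 * 3.67  # hypothetical: APM twice as noisy as SPM

w_spm, blended, blended_se = inverse_variance_blend(spm, spm_se, apm, apm_se)
print(f"weight on SPM={w_spm:.2f}  blend={blended:.2f}  stderr={blended_se:.2f}")
```

Under these assumptions, an 80/20 split falls out whenever the APM stderr is twice the SPM stderr, and the SPM weight grows as the APM sample gets smaller and noisier.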
|
biggles
Joined: 15 Apr 2009 Posts: 2
|
Posted: Wed Jan 27, 2010 5:09 pm Post subject: |
|
|
Preface: A good player not falling within a standard error of the zero average (using a small sample size): not a cause for concern. Many good players not falling within 2 SEs of zero: maybe the model is busted.
If (the basketballvalue version of) APM had a thought process, it would be something like:
Afflalo/Anthony/Billups/Nene/Martin: 398 minutes, raw rating 10.61
"hey the Nuggets starting lineup is doing reasonably well, some of the players must be good"
Afflalo/Andersen/Nene/Lawson/Smith: 58 min, rating 38.75
"whoa this lineup is even better! but it's a small sample so let's keep looking"
Andersen/Anthony/Carter/Nene/Smith: 38 min, rating 50.16
"this is their best common lineup! in fact all the common lineups with Nene seem to do well. it's probably because Nene's awesome. but if Nene's awesome, then why doesn't the starting lineup do better? somebody on it must suck."
Andersen/Anthony/Billups/Lawson/Martin: 15 min, rating -42.86
"wow this lineup sucks. it can't be because of Andersen, because most lineups with him do better than the starters. that means all the rest of the guys suck. especially Billups, he never turns up on any great lineups, he must be horrible. oh but this is only 15 mins so let's put huge standard errors on everything."
***
Even though APM depends less on low minute lineups than high minute ones, it's still heavily influenced by outliers, and maybe if you want to use APM for single seasons for some reason, you should downweight the low minute lineups even more. (I don't have a good mathematical idea of how best to do this.) Or just go back to elementary school and use unadjusted +/-, which for small samples is probably more accurate in some sense, at the expense of losing the ability to differentiate between teammates. |
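A minimal sketch of both suggestions, using only the four lineups quoted above (so the numbers are not season totals). It computes a minutes-weighted unadjusted on-court rating per player, then repeats the calculation with a damped weight that penalizes short stints further. The damping formula and its constant `k` are hypothetical choices for illustration, not an established method.

```python
# The four lineups quoted above: (players, minutes, raw rating)
lineups = [
    ({"Afflalo", "Anthony", "Billups", "Nene", "Martin"}, 398, 10.61),
    ({"Afflalo", "Andersen", "Nene", "Lawson", "Smith"}, 58, 38.75),
    ({"Andersen", "Anthony", "Carter", "Nene", "Smith"}, 38, 50.16),
    ({"Andersen", "Anthony", "Billups", "Lawson", "Martin"}, 15, -42.86),
]


def minutes_weight(minutes):
    """Plain weighting: a lineup counts in proportion to its minutes."""
    return float(minutes)


def damped_weight(minutes, k=50.0):
    """Hypothetical extra downweighting of short stints.

    Close to `minutes` when minutes >> k, much smaller when minutes << k.
    """
    return minutes * minutes / (minutes + k)


def on_court_rating(player, weight_fn=minutes_weight):
    """Weighted average rating of the lineups that include `player`."""
    pairs = [
        (weight_fn(minutes), rating)
        for players, minutes, rating in lineups
        if player in players
    ]
    total_weight = sum(weight for weight, _ in pairs)
    return sum(weight * rating for weight, rating in pairs) / total_weight


for player in ("Nene", "Billups", "Andersen"):
    plain = on_court_rating(player)
    damped = on_court_rating(player, damped_weight)
    print(f"{player:9s} minutes-weighted={plain:6.2f}  damped={damped:6.2f}")
```

With damping, the 15-minute lineup that does so much damage to Billups carries under a quarter of its plain weight, while the 398-minute starting unit keeps most of its weight.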
|
Joe
Joined: 27 Sep 2009 Posts: 94 Location: Long Island, NY
|
Posted: Wed Jan 27, 2010 5:15 pm Post subject: |
|
|
Crow wrote: | Okay, then we differ some.
i think the multi-year products may be more useful than the one year versions even for understanding this season.
But hoopnumbers can produce 1 year regularized and it sounds like we'd both be interested in that.
In the end I'll be as or more interested in a freshened multi-year regularized product. As my main source though I'm for having and using them all. |
I just want to know what is going on THIS SEASON that explains Lawson/Billups being massively below average and almost all of Denver's team being below average. I don't care if adding other seasons to the mix helps make things look better, because it probably will. I want to know what is going on in the model that is causing these results, because maybe it will help me understand what on earth any of this means.
Quote: | Here is my interpretation of what is happening with Denver's APM: Nene, Andersen, and Martin are highly collinear. From looking at their units, it appears that 2 of the 3 are ALWAYS on the floor. Basically, the frontcourt and backcourt rotations have no overlap, so their contributions cannot be decoupled.
Based on 82games data, only a very few Nuggets players have played time at both SF and PF (the link between the backcourt and frontcourt). EVERY one of those players has had a far higher net +/- at SF than at PF, thus overvaluing all the frontcourt players. This VERY SMALL sample is dominating the interaction between frontcourt and backcourt. Does that make sense? |
Thanks. That's what I wanted to know. Is this kind of thing fixable? Because this appears to be a massive, massive flaw in the model that basically makes looking at Denver's stats completely useless. That's how I see it anyway. _________________ http://www.hoopdata.com |
|
Crow
Joined: 20 Jan 2009 Posts: 800
|
Posted: Wed Jan 27, 2010 5:20 pm Post subject: |
|
|
I too would like to see a run that "downweights the low minute lineups even more" or even eliminates the really small ones, to see how much difference that makes.
In the case of Denver this season, using a lineup cutoff of 0.5% of total team minutes, you'd still have 2/3rds of the data. With a 1% cutoff you'd still have half the data, but that may not be enough, especially in a single-season set. Just downweighting might be better, but if you had the time, check out each. |
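The cutoff idea is easy to prototype. The helper below reports what share of total minutes survives when lineups under a given share of team minutes are dropped. The minutes list is entirely hypothetical and is not Denver's actual distribution; it only shows the shape of the calculation.

```python
def retained_share(lineup_minutes, cutoff_share):
    """Share of total minutes kept after dropping lineups below the cutoff.

    `cutoff_share` is a fraction of total minutes, e.g. 0.005 for 0.5%.
    """
    total = sum(lineup_minutes)
    kept = [m for m in lineup_minutes if m / total >= cutoff_share]
    return sum(kept) / total


# Hypothetical minutes per lineup: a few regular units and a long tail.
minutes = [380, 70, 45, 20] + [8] * 30 + [3] * 80

for cutoff in (0.0, 0.005, 0.01):
    share = retained_share(minutes, cutoff)
    print(f"cutoff={cutoff:.1%}  minutes retained={share:.1%}")
```

Running the regression once per cutoff, and once with downweighting instead of a hard cutoff, would show how sensitive the player estimates are to the tail of tiny lineups.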
|