Posted: Sun Jan 13, 2008 8:19 am Post subject: A new idea on testing the accuracy of a player rating system
This just occurred to me yesterday when I was trying to figure out how closely my college player rating system would have predicted an outcome AFTER the fact - after we already know the exact minutes played by every player in a game on each team. I did a one-game test of my system (Arizona vs Houston) - and the ratio of Arizona's rating to Houston's actually came out almost exactly equal to the actual ratio of Arizona's points to Houston's points in the game (pure luck it happened the first game I tried). Here's the thread where I mention it - it's the 9th post down where I do this "test":
Anyway - I realized - with all this debate about the accuracy of different systems (usually PER vs. Berri's Win Score or whatever it's called) in evaluating players - couldn't one retroactively test these rating systems on a game-to-game basis over an entire season, taking the standard deviation of the differences between predicted and actual outcome ratios? The system with the smallest standard deviation from game to game would in essence be the most "accurate".
For example - for one game, Team A beats team B 110 to 100, so the TRUE ratio of points scored for the game ends up 1.10. Now - if you take the PER (final PER for the season mind you) of every player that played and multiplied it by the minutes the player played, and summed each team - the goal in a perfect world (if PER were 100% accurate) would be that if you divided the sum of Team A by the sum of team B, you'd get 1.10. If it actually ends up, say, 1.25, then the prediction deviated by 0.15.
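In code terms, the single-game check would look something like this - a minimal sketch only, with made-up ratings and minutes rather than real PER figures or a real box score:
[code]
# A minimal sketch of the one-game test described above. The player ratings
# and minutes below are made-up numbers, not real PER values or a real game.

def predicted_ratio(team_a, team_b):
    """Sum rating * minutes for each team, then return the ratio of the sums."""
    sum_a = sum(rating * minutes for rating, minutes in team_a)
    sum_b = sum(rating * minutes for rating, minutes in team_b)
    return sum_a / sum_b

# (season rating, minutes played in this game) for each player who appeared
team_a = [(22.0, 40), (18.5, 36), (15.0, 34), (12.0, 30), (11.0, 28),
          (13.0, 26), (10.0, 24), (8.5, 22)]
team_b = [(20.0, 38), (16.0, 36), (14.5, 32), (12.5, 30), (10.5, 28),
          (11.0, 26), (9.0, 26), (7.0, 24)]

predicted = predicted_ratio(team_a, team_b)   # about 1.1 with these numbers
actual = 110 / 100                            # true ratio from the final score
deviation = predicted - actual
print(f"predicted {predicted:.3f}, actual {actual:.3f}, deviation {deviation:+.3f}")
[/code]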
Does that make sense? Has anyone tried this on a large scale? Wouldn't this maybe be the best way to "test" these systems? The closer a system gets to mimicking actual results when going BACK over the season, the better that system reflects true individual player performance.
Is there a better way to grade a rating system than trying to retroactively test final season ratings against past game outcomes? _________________ Statman
Posted: Sun Jan 13, 2008 11:36 am Post subject: Re: A new idea on testing the accuracy of a player rating sy
Statman wrote:
For example - for one game, Team A beats team B 110 to 100, so the TRUE ratio of points scored for the game ends up 1.10. Now - if you take the PER (final PER for the season mind you) of every player that played and multiplied it by the minutes the player played, and summed each team - the goal in a perfect world (if PER were 100% accurate) would be that if you divided the sum of Team A by the sum of team B, you'd get 1.10. If it actually ends up, say, 1.25, then the prediction deviated by 0.15.
I'm not sure you're measuring how accurate those rating systems are.
Once you sum it all back together it's very likely going to work, because the tough part is splitting the credit for every change in the scoreboard within the team => most of the errors are going to cancel out once you put everything back together again!
I really don't see how to test them at the player level in a quantitative way, actually.
I'm not sure I understood. A player's performance against a given team, or in a given game, is not necessarily correlated with his average season performance. If such a prediction at the individual game level were possible, Las Vegas would know it and betting would disappear. The closest correlation I think can be done is with wins (not margins).
I don't have the background to review Rosenbaum's tests, but my common sense tells me that if you compare the ratings with the four factors (eight, really) and the win% value of each action (scoring, ballhandling, rebounding and FTs/fouling), PER's lack of points allowed will produce problems in the defensive eFG% prediction, while the win% relationship of eFG% + R% supposedly should produce more problems in WP, but that is not the case. I think - and I'm not sure either - it's because although rebounds are overrated, the weight of the rebounding rating (20%) is reduced; scoring is underrated but the weight of efficiency (40%) is raised, and that is probably why WP remains relatively predictive. Somebody please correct this if possible.
Posted: Sun Jan 13, 2008 5:30 pm Post subject: Re: A new idea on testing the accuracy of a player rating sy
Ryoga Hibiki wrote:
Statman wrote:
For example - for one game, Team A beats team B 110 to 100, so the TRUE ratio of points scored for the game ends up 1.10. Now - if you take the PER (final PER for the season mind you) of every player that played and multiplied it by the minutes the player played, and summed each team - the goal in a perfect world (if PER were 100% accurate) would be that if you divided the sum of Team A by the sum of team B, you'd get 1.10. If it actually ends up, say, 1.25, then the prediction deviated by 0.15.
I'm not sure you're measuring how accurate those rating systems are.
Once you sum it all back together it's very likely going to work, because the tough part is splitting the credit for every change in the scoreboard within the team => most of the errors are going to cancel out once you put everything back together again!
I really don't see how to test them at the player level in a quantitative way, actually.
Not if you sum the absolute values of the deviations of each game.
But yes - overall, a system should sum fairly close to zero (if you are just summing the differences, some games negative, some positive) - or it's probably a bit off. _________________ Statman
I'm not sure I understood. A player's performance against a given team, or in a given game, is not necessarily correlated with his average season performance. If such a prediction at the individual game level were possible, Las Vegas would know it and betting would disappear. The closest correlation I think can be done is with wins (not margins).
I don't have the background to review Rosenbaum's tests, but my common sense tells me that if you compare the ratings with the four factors (eight, really) and the win% value of each action (scoring, ballhandling, rebounding and FTs/fouling), PER's lack of points allowed will produce problems in the defensive eFG% prediction, while the win% relationship of eFG% + R% supposedly should produce more problems in WP, but that is not the case. I think - and I'm not sure either - it's because although rebounds are overrated, the weight of the rebounding rating (20%) is reduced; scoring is underrated but the weight of efficiency (40%) is raised, and that is probably why WP remains relatively predictive. Somebody please correct this if possible.
As for the first part - yes, this is true. However, we aren't looking at the players as individuals, but summing the players' contributions every game in proportion to their minutes. Of course every game will have variance (many games a large variance) - even if the rating were quite "accurate" - but over the course of a whole season, the sum of the absolute values of the deviations (I forgot to mention absolute value before) would still be lower for the "more accurate" systems, I would think.
I think some may be confused a little by my one-game example - one game being "accurate" obviously doesn't tell us that much (although I was happy the one game I tested in mine wasn't WAY off) - but over thousands of games, I would think it could tell us a lot when comparing one system to another.
If I were a programmer, I could probably have all the data pulled from every game box score (all I need is the final score and individual player minutes). From there one could test PER, Berri's, maybe Mike G's, etc. against each other (IF I had all their final player ratings) and see whose system does the best job, on average, of retroactively "predicting" the past results.
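For a programmer, the loop would look roughly like this - a sketch only; the box score layout and the ratings dictionaries are hypothetical placeholders, not a real data feed or anyone's published numbers:
[code]
# Sketch of the season-long test. The game dictionaries and ratings are
# hypothetical placeholders; a real implementation needs a box score source.

def season_deviation(games, ratings):
    """Mean absolute deviation between predicted and actual point ratios.

    games   -- list of dicts with per-player minutes and the final score
    ratings -- dict mapping player name to that metric's full-season rating
    """
    deviations = []
    for g in games:
        pred_home = sum(ratings[p] * mins for p, mins in g["home_minutes"].items())
        pred_away = sum(ratings[p] * mins for p, mins in g["away_minutes"].items())
        predicted = pred_home / pred_away
        actual = g["home_points"] / g["away_points"]
        deviations.append(abs(predicted - actual))
    return sum(deviations) / len(deviations)

# Run the identical test on each metric's season-ending ratings and compare:
# for name, ratings in {"PER": per, "Win Score": win_score, "eWins": ewins}.items():
#     print(name, season_deviation(all_games, ratings))
[/code]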
As for your second point - I'm not sure I follow all of what you are getting at, but I agree PER's biggest problem would probably lie in not measuring points allowed, and Berri's problem would probably lie elsewhere (overvaluing rebounding, undervaluing usage?) - I'm just curious which would be more "off" from real-life results.
The system I use does use opposing points as part of a factor, which is why it's a ratio (100 being average): the players' summed results (scaled to actual team points) divided by the opponent's points. I also take playing time into account in the final result as kind of a "Bruce Bowen" adjustment (low-rated guys who play a lot on good teams are probably doing more things not reflected in the box score, and vice versa). For college it's trickier because of the inclusion of SoS, which could be skipped in the NBA without too much "error" (though there would be some, i.e. the East being much weaker than the West in a given season). _________________ Statman
Posted: Tue Jan 15, 2008 10:32 am Post subject:
I would expect a system that (like mine) matches point-differential to player rates to yield closer predictions than one which doesn't. However, credit to individual players is still not adequate. If Bowen doesn't play, do the Spurs suffer a little, or do they suffer a lot?
Last year, team expected wins (based on my eWins-producing stats) averaged an 'error' of 2.5 wins from pythagorean-expected. But actual wins 'erred' by an average of 2.7 from pyth. Can you beat that? _________________ 40% of all statistics are wrong.
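For anyone who wants to reproduce that kind of comparison, it would run roughly like this - a sketch only; the Pythagorean exponent differs by author (13.91 and 16.5 are both in use), and the win lists are placeholders rather than my actual numbers:
[code]
# Sketch of comparing player-based expected wins and actual wins against
# Pythagorean expectation. The exponent is an assumption; different authors
# use 13.91, 14, or 16.5.

def pythagorean_wins(points_for, points_against, games=82, exponent=14):
    ratio = points_for ** exponent
    return games * ratio / (ratio + points_against ** exponent)

def mean_abs_error(pyth, other):
    """Average absolute gap between Pythagorean wins and another win estimate."""
    return sum(abs(p - o) for p, o in zip(pyth, other)) / len(pyth)

# pyth_wins, expected_wins (from a player-based system), and actual_wins would
# each hold one value per team for the season being tested:
# print(mean_abs_error(pyth_wins, expected_wins))   # ~2.5 in the example above
# print(mean_abs_error(pyth_wins, actual_wins))     # ~2.7 in the example above
[/code]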
Posted: Tue Jan 15, 2008 6:25 pm Post subject: Re: A new idea on testing the accuracy of a player rating sy
Statman wrote:
For example - for one game, Team A beats team B 110 to 100, so the TRUE ratio of points scored for the game ends up 1.10. Now - if you take the PER (final PER for the season mind you) of every player that played and multiplied it by the minutes the player played, and summed each team - the goal in a perfect world (if PER were 100% accurate) would be that if you divided the sum of Team A by the sum of team B, you'd get 1.10. If it actually ends up, say, 1.25, then the prediction deviated by 0.15.
Does that make sense? Has anyone tried this on a large scale? Wouldn't this maybe be the best way to "test" these systems? The closer a system gets to mimicking actual results when going BACK over the season - then the better that system reflects true individual player performance.
I've tossed around the idea of something like that too. It seems incredibly monumental though.
One question: are you referring to the PERs over the entire season, or the PERs calculated based on stats only up to that game?
Posted: Tue Jan 15, 2008 7:05 pm Post subject: Re: A new idea on testing the accuracy of a player rating sy
gabefarkas wrote:
One question: are you referring to the PERs over the entire season, or the PERs calculated based on stats only up to that game?
I was thinking whole season PER.
I do think Mike G's would probably be more "accurate" in this type of testing than John H's or Berri's. Mine (which I haven't worked on recently since I've been doing a lot of college stuff) would probably be as well, since Mike G & I have a number of similarities in our approach (I make sure the linear weights totals exactly match team point totals, and I also use opposition scoring as a factor). _________________ Statman
Posted: Tue Jan 15, 2008 11:28 pm Post subject:
This idea is interesting, but does not work for a few reasons. For instance, if our only measure of player productivity is points, then if we add up all the players on that measure, the team whose players add up to the higher point total will win. Similarly, if we just use Offensive Rating as a measure and add all of that up, the team with the higher offensive rating will win. If we "believe what we say we believe" - that shot creation has value, that efficiency goes down as usage goes up - then such a system does not work. The difficulty in basketball analysis is not figuring out what leads to wins. The difficulty is in figuring out how to apportion credit for these stats among players.
I'm really not sure what you are saying here.
IF you are saying that there isn't a metric that can accurately reflect true player impact on a team - that may be somewhat true.
However, I do think it is possible to have a metric that does a fairly solid job of measuring player performance. While not perfect by any means, or necessarily fair to every type of player (i.e. Bruce Bowen), I would think it is possible to put together something that has some substance.
I was wondering HOW to figure out if a metric seems to hold water - and this idea was the best I've come up with. IF we are going to assign some type of number to a player's performance for the season (say PER), then I would think if we were to go back and test the results, we shouldn't see a huge deviation from the ACTUAL results on average. What would be considered too big a deviation, I really don't know, unless we tested many metrics and saw what kind of results we got. I don't think it would take long to figure out which metrics are obviously WAY off, and which aren't too bad.
But, there will obviously NEVER be a perfect metric to accurately measure true player performance. _________________ Statman
Posted: Wed Jan 16, 2008 12:07 am Post subject:
Well, I think that Dave Berri accomplished this with Wages of Wins. He took the end results of every team in every year since they kept all the stats (I think '76?) and found out which team totals had the highest correlation to winning. (You can download pretty much every result ever from www.basketballdatabase.com.) So this would seem to answer the "whose stats add up to most wins" question. This is where the big debates between Wages and BOP come in. In Wages, the rebounder pretty much gets credit for the forced miss and the rebound. Also, Wages gives no value to shot creation. This isn't to say that the "what stats add up to wins" question is not worth asking. I do not want to seem like I am dismissing the idea. Nor do I want to say that I do not find value in a lot of what Wages has to say about basketball. I'm just trying to question the soundness of using a system of measurement, "what adds up to the most wins," that actually provides one of the main forks in the road to agreement between different schools of basketball thought.
He did this at the team level - that is a different ball of wax from what I am saying.
Did he later, AFTER he created his ratings and got his results for a given season, go back and test these INDIVIDUAL results against the actual results? If he did, what was his standard deviation in an average NBA game (of the summed individual results) from actual results? Did he test this deviation against individual factors on their own (like, say, the standard deviation if one used only scoring rate, or rebound rate, etc.)? Did he make sure his metric had a lower deviation than any others he could test?
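To make that concrete, the single-factor comparison could reuse the season_deviation() sketch from earlier in the thread - something like this, with hypothetical data structures, and using mean absolute deviation rather than a true standard deviation for simplicity:
[code]
# Sketch of testing a full metric against single-stat baselines, reusing the
# season_deviation() function sketched earlier. Data structures are hypothetical.

def per_minute_rate(season_totals, stat):
    """Turn raw season totals into a one-stat rating: stat per minute played."""
    return {p: t[stat] / t["minutes"]
            for p, t in season_totals.items() if t["minutes"] > 0}

# candidates = {
#     "Win Score":     win_score_ratings,
#     "points only":   per_minute_rate(season_totals, "points"),
#     "rebounds only": per_minute_rate(season_totals, "rebounds"),
# }
# for name, ratings in candidates.items():
#     print(name, season_deviation(all_games, ratings))
[/code]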
They have the Win Score metric at the individual game level. I think its standard deviation would be similar to any other metric without team adjustments, except for the out-of-scale weighting of rebounding (probably neither of them is right about scoring efficiency).
The basketball win "laws" were already discovered in the four (really eight) factors, and every metric is obliged to fit those proportions and the main one of all (the zero-sum approach to every action of the game). What remains in metrics is the weighting of skills (shot creation, ballhandling, rebounding skill, etc.), a secondary thing. Then there is the worth of usage (linked with the weighting of skills): why players are above or below average in attempts (defensive attempts too) at every factor, and what that is worth. Another thing would be the quality of playing time and how that changes the weight of stats. And finally decision making (linked with usage).