Snapshot of viewtopic.php?t=343 as it appeared on Jan 16, 2011 17:51:18 GMT.

APBRmetrics :: View topic - Possession Estimators
The statistical revolution will not be televised.
 

Possession Estimators
jkubatko



Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

Posted: Wed Aug 17, 2005 3:16 pm    Post subject: Possession Estimators

Kevin Broom recently e-mailed me to ask why the league efficiency on the Basketball-Reference.com team pages was 103, but the league rating on the player pages was 106. League efficiency and league rating are both estimates of points per 100 possessions. "Efficiency" is John Hollinger's term, while "rating" is Dean Oliver's term. Kevin went to 82games.com and recorded the actual possessions per game for each team over the last three years. I then computed estimated possessions using Hollinger's formula and Oliver's formula:

Code:

Hollinger = FGA + 0.44*FTA - ORB + TO
Oliver    = FGA + 0.4*FTA - (ORB/(ORB+(oppTRB-oppORB)))*(FGA-FG)*1.07 + TO


Using the data collected by Kevin, the root mean square error (rmse) of Hollinger's estimates is 3.91 possessions per game, while the rmse of Oliver's estimates is 1.49 possessions per game. It is interesting to note that Hollinger's formula always produces an overestimate of team possessions per game (errors ranging from -6.21 to -2.37, with error calculated as actual minus estimated). Oliver's formula produced an overestimate in more than 90% of all cases, with errors ranging from -3.79 to 0.31.
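
A minimal sketch of the two estimators and the RMSE calculation described above, in Python. The formulas are the ones quoted in this post; the function names and the sample team line are made up for illustration and are not Kevin's 82games data.

```python
import math

def hollinger_poss(fga, fta, orb, to):
    """Hollinger's possession estimate as quoted in the post."""
    return fga + 0.44 * fta - orb + to

def oliver_poss(fga, fg, fta, orb, opp_trb, opp_orb, to):
    """Oliver's possession estimate as quoted in the post."""
    opp_drb = opp_trb - opp_orb          # opponent defensive rebounds
    orb_pct = orb / (orb + opp_drb)      # offensive rebounding percentage
    return fga + 0.4 * fta - orb_pct * (fga - fg) * 1.07 + to

def rmse(actual, estimated):
    """Root mean square error, with error = actual minus estimated."""
    errors = [a - e for a, e in zip(actual, estimated)]
    return math.sqrt(sum(err * err for err in errors) / len(errors))

# Hypothetical per-game team line (not real data).
team = dict(fga=80.0, fg=36.0, fta=25.0, orb=12.0,
            opp_trb=41.0, opp_orb=11.0, to=14.0)

h = hollinger_poss(team["fga"], team["fta"], team["orb"], team["to"])
o = oliver_poss(**team)
print("Hollinger:", round(h, 2), "Oliver:", round(o, 2))
```

With real data, `rmse` would be called once per formula with the list of actual possessions per game and the matching list of estimates.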

The results above lead me to believe that I should stop using Hollinger's formula to estimate team points per 100 possessions and use Oliver's formula instead. Any thoughts?
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
Ed Küpfer



Joined: 30 Dec 2004
Posts: 762
Location: Toronto

Posted: Wed Aug 17, 2005 5:25 pm    Post subject: Re: Possession Estimators

jkubatko wrote:
The results above lead me to believe that I should stop using Hollinger's formula to estimate team points per 100 possessions and use Oliver's formula instead. Any thoughts?


Short answer: yes.

Longer answer: I've been screwing around trying to find a better possession estimator, and I haven't got anywhere. I think we may be at the limit of resolution, at least for season-level data. Game-level data may produce better results.
_________________
ed
Ben



Joined: 13 Jan 2005
Posts: 264
Location: Iowa City

Posted: Wed Aug 17, 2005 8:30 pm

Why not regress the 82games.com possessions on the variables used in the standard formulas?
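
A rough sketch of what such a regression could look like in Python, using only the standard library. It assumes a no-intercept linear model on FGA, FTA, ORB, and TO. The rows are invented, and the "actual" possessions are generated from Hollinger's formula purely so the fit can be checked; real 82games figures would replace `y`.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ols(rows, y):
    """Least-squares coefficients for y ~ rows, via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical team lines: (FGA, FTA, ORB, TO) per game.
rows = [(80, 25, 12, 14), (82, 22, 10, 15), (78, 28, 13, 13),
        (85, 20, 11, 16), (79, 26, 14, 12), (81, 24, 9, 17)]
# Synthetic "actual" possessions, built from Hollinger's formula for checking only.
y = [fga + 0.44 * fta - orb + to for fga, fta, orb, to in rows]

coefs = ols(rows, y)
print([round(c, 3) for c in coefs])   # coefficients on FGA, FTA, ORB, TO
```

On real data the fitted coefficients would show how far the best linear combination sits from the hand-built formulas.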
Dan Rosenbaum



Joined: 03 Jan 2005
Posts: 540
Location: Greensboro, North Carolina

Posted: Wed Aug 17, 2005 9:04 pm    Post subject: Re: Possession Estimators

jkubatko wrote:
Kevin Broom recently e-mailed me to ask why the league efficiency on the Basketball-Reference.com team pages was 103, but the league rating on the player pages was 106. League efficiency and league rating are both estimates of points per 100 possessions. "Efficiency" is John Hollinger's term, while "rating" is Dean Oliver's term. Kevin went to 82games.com and recorded the actual possessions per game for each team over the last three years. I then computed estimated possessions using Hollinger's formula and Oliver's formula:

Code:

Hollinger = FGA + 0.44*FTA - ORB + TO
Oliver    = FGA + 0.4*FTA - (ORB/(ORB+(oppTRB-oppORB)))*(FGA-FG)*1.07 + TO


Using the data collected by Kevin, the root mean square error (rmse) of Hollinger's estimates is 3.91 possessions per game, while the rmse of Oliver's estimates is 1.49 possessions per game. It is interesting to note that Hollinger's formula always produces an overestimate of team possessions per game (errors ranging from -6.21 to -2.37, with error calculated as actual minus estimated). Oliver's formula produced an overestimate in more than 90% of all cases, with errors ranging from -3.79 to 0.31.

The results above lead me to believe that I should stop using Hollinger's formula to estimate team points per 100 possessions and use Oliver's formula instead. Any thoughts?

Interestingly, it looks like the range for Hollinger's formula is a little less than the range for DeanO's formula. If we subtracted the mean differences (relative to 82games) from DeanO's and Hollinger's formulas, which one then ends up having the smaller RMSE? (It would probably be better to do this in percentage terms, multiplying each by one minus the average percentage they are off from the 82games figures.)

The reason I ask this is that if they are pretty comparable on this measure, then I might prefer Hollinger's measure just because it is easier. It would suggest that Hollinger's formula just misses a few possessions and that DeanO's formula doesn't correct the estimate much more than a simple addition or multiplication correction would do.


Last edited by Dan Rosenbaum on Thu Aug 18, 2005 12:42 am; edited 2 times in total
jkubatko



Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

Posted: Wed Aug 17, 2005 9:44 pm    Post subject: Re: Possession Estimators

Dan Rosenbaum wrote:
Interestingly it looks like the range for Hollinger's formula is a little less than the range for Dean's formula.


Yes, but not by much. The range for Hollinger's errors is 3.84, while the range for Oliver's errors is 4.10.

Dan Rosenbaum wrote:
If we simply added 1.49 to DeanO's formula and 3.91 to Hollinger's, which one then ends up having the smallest RMSE? (It would probably be better to do this in percentage terms, multiplying each by one minus the average percentage they are off from the percentages that KevinB calculated.)


We should actually be subtracting 1.49 and 3.91 from Oliver's formula and Hollinger's formula, respectively. Hollinger's formula always overestimates possessions for the sampled years, and Oliver's formula overestimates possessions in more than 90% of the sampled years. If you use this fudge factor, the rmse for Hollinger's formula is 0.96 and the rmse for Oliver's formula is 1.01.
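
A small Python sketch of an additive correction of this kind, assuming the shift applied to each formula is its mean error (actual minus estimated) rather than its RMSE. The function name and sample numbers are hypothetical.

```python
import math

def bias_corrected_rmse(actual, estimated):
    """Shift the estimates by their mean error, then recompute the RMSE.

    Returns (mean_error, rmse_after_shift), with error = actual minus estimated.
    """
    errors = [a - e for a, e in zip(actual, estimated)]
    mean_error = sum(errors) / len(errors)
    # Adding a negative mean error is the same as subtracting the average overestimate.
    adjusted = [e + mean_error for e in estimated]
    resid = [a - adj for a, adj in zip(actual, adjusted)]
    return mean_error, math.sqrt(sum(r * r for r in resid) / len(resid))

# Made-up possessions per game, for illustration only.
actual = [90.0, 92.0, 94.0]
estimated = [93.0, 96.0, 97.0]
mean_error, corrected = bias_corrected_rmse(actual, estimated)
print(round(mean_error, 3), round(corrected, 3))
```

After this shift the remaining RMSE is just the spread of the errors around their mean, so it measures how consistent a formula's miss is rather than how large it is.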

Dan Rosenbaum wrote:
The reason I ask this is that if they are pretty comparable on this measure, then I might prefer Hollinger's measure just because it is easier. It would suggest that Hollinger's formula just misses a few possessions and that DeanO's formula doesn't correct the estimate much more than a simple addition or multiplication correction would do.


For a quick calculation Hollinger's is okay, but for my web site I prefer the more accurate estimate. If they were close I wouldn't worry about it, but Oliver's points per 100 possession estimates are much closer to reality than Hollinger's are.
_________________
Regards,
Justin Kubatko
Basketball-Reference.com


Last edited by jkubatko on Thu Aug 18, 2005 9:21 am; edited 1 time in total
Dan Rosenbaum



Joined: 03 Jan 2005
Posts: 540
Location: Greensboro, North Carolina

Posted: Wed Aug 17, 2005 9:50 pm    Post subject: Re: Possession Estimators

jkubatko wrote:
Dan Rosenbaum wrote:
If we simply added 1.49 to DeanO's formula and 3.91 to Hollinger's, which one then ends up having the smallest RMSE? (It would probably be better to do this in percentage terms, multiplying each by one minus the average percentage they are off from the percentages that KevinB calculated.)


We should actually be subtracting 1.49 and 3.91 from Oliver's formula and Hollinger's formula, respectively. Hollinger's formula always overestimates possessions for the sampled years, and Oliver's formula overestimates possessions in more than 90% of the sampled years. If you use this fudge factor, the rmse for Hollinger's formula is 0.96 and the rmse for Oliver's formula is 1.01.

Dan Rosenbaum wrote:
The reason I ask this is that if they are pretty comparable on this measure, then I might prefer Hollinger's measure just because it is easier. It would suggest that Hollinger's formula just misses a few possessions and that DeanO's formula doesn't correct the estimate much more than a simple addition or multiplication correction would do.


For a quick calculation Hollinger's is okay, but for my web site I prefer the more accurate estimate. If they were close I wouldn't worry about it, but Oliver's points per 100 possession estimates are much closer to reality than Hollinger's are.

Actually, with your RMSE results (if they hold up), it looks like the most accurate approach would be to use Hollinger's formula and subtract the mean difference from the 82games figures. My guess is that it would be a little more accurate if you did it in percentage terms using the following formula.

JH Adjusted Possessions = JH Possessions *
[1 - ((Average JH Possessions - Average 82games Possessions)/Average JH Possessions)]
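
The percentage adjustment above can be sketched in Python as follows. The function name and the sample values are made up; note that the bracketed factor simplifies to average 82games possessions divided by average JH possessions.

```python
def adjust_by_percentage(estimates, actuals):
    """Scale each estimate by 1 - (avg_est - avg_act) / avg_est."""
    avg_est = sum(estimates) / len(estimates)
    avg_act = sum(actuals) / len(actuals)
    factor = 1 - (avg_est - avg_act) / avg_est   # equals avg_act / avg_est
    return [e * factor for e in estimates]

# Hypothetical values: estimates run 5% above the actual figures.
jh_possessions = [100.0, 110.0]
actual_possessions = [95.0, 104.5]
adjusted = adjust_by_percentage(jh_possessions, actual_possessions)
print([round(a, 2) for a in adjusted])
```

Unlike the additive version, this rescales high-pace and low-pace teams in proportion to their estimated possessions.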


Last edited by Dan Rosenbaum on Thu Aug 18, 2005 12:48 am; edited 2 times in total
Kevin Pelton
Site Admin


Joined: 30 Dec 2004
Posts: 965
Location: Seattle

Posted: Thu Aug 18, 2005 12:03 am

I think given how important the concept of possessions is to our work, the ability to explain it to people is more important than chasing percentage points of accuracy.

Having different measures out there seems to me to be a worse scenario than either one on its own.

We've made some standardization efforts on this forum (True Shooting Percentage, notably), but I'd like to see those extended. Hollinger and B-R.com tend to be the two best-known sources, so when you're in lock step (as they are on using per-40 minute stats as opposed to per-48 minute now, for example), that probably should be the de facto standard.

But there are other issues that should be resolved including possessions.
Dan Rosenbaum



Joined: 03 Jan 2005
Posts: 540
Location: Greensboro, North Carolina

Posted: Thu Aug 18, 2005 12:46 am

I went back and edited my two posts in this thread because I for some reason started to think the RMSEs reported in the first post were mean differences. Thus, the numbers I was plugging into the formulas were wrong.
jkubatko



Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

Posted: Thu Aug 18, 2005 9:59 am

admin wrote:
I think given how important the concept of possessions is to our work, the ability to explain it to people is more important than chasing percentage points of accuracy.


I would agree, which is why I wouldn't do something like add a multiplier obtained from a regression analysis to improve the accuracy of either formula.

admin wrote:
We've made some standardization efforts on this forum (True Shooting Percentage, notably), but I'd like to see those extended. Hollinger and B-R.com tend to be the two best-known sources, so when you're in lock step (as they are on using per-40 minute stats as opposed to per-48 minute now, for example), that probably should be the de facto standard.


I prefer Oliver's formula, as those estimates are much closer to reality. Also, I use many of Oliver's metrics on the player pages. One benefit of a switch is that people won't be confused about why the league rating on the player pages (106) differs from the league efficiency on the team pages (103).
_________________
Regards,
Justin Kubatko
Basketball-Reference.com


Last edited by jkubatko on Thu Aug 18, 2005 1:30 pm; edited 1 time in total
Ben



Joined: 13 Jan 2005
Posts: 264
Location: Iowa City

Posted: Thu Aug 18, 2005 1:16 pm

jkubatko wrote:
admin wrote:
I think given how important the concept of possessions is to our work, the ability to explain it to people is more important than chasing percentage points of accuracy.


I would agree, which is why I wouldn't do something like add a multiplier obtained from a regression analysis to improve the accuracy of either formula.


To each his own, but IMHO any linear function of FGA, FTA, OR, and TOs is much simpler than Oliver's formula.

In any event, I mostly use these for efficiency measures and I'm mostly interested in relative values rather than absolutes. Even if Hollinger's is off, as far as I know, there's no systematic bias against any types of players or teams. Certainly, I could be missing something though.
jkubatko



Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

Posted: Thu Aug 18, 2005 2:02 pm

Ben wrote:
To each his own, but IMHO any linear function of FGA, FTA, OR, and TOs is much simpler than Oliver's formula.


I would agree with this. I know Hollinger's formula off the top of my head, but I can't say the same about Oliver's.

Ben wrote:
In any event, I mostly use these for efficiency measures and I'm mostly interested in relative values rather than absolutes. Even if Hollinger's is off, as far as I know, there's no systematic bias against any types of players or teams. Certainly, I could be missing something though.


You're right, it does depend on how you're going to use them. If I want the better estimate of a team's points per 100 possessions, then Oliver's formula is the way to go. What's interesting is that Hollinger's formula seems to do a better job of preserving the actual rankings. For the years 2003-2005 I ranked each team by actual points per 100 possessions, Efficiency (Hollinger's term), and Rating (Oliver's term). I then calculated how far the ranking for each estimator was from the actual ranking. For example, using the actual data the 2005 Wizards ranked 13th in points per 100 possessions, 11th in Efficiency, and 10th in Rating. The Efficiency ranking was off by two places (the absolute value of 13 minus 11) and the Rating ranking was off by three places. In this case the Efficiency ranking is closer to the actual ranking. I did this for all 88 cases and obtained the following results:

* In 50 of the 88 cases the rankings of the two estimators were equally close to the actual rankings.

* In 25 of the 88 cases the Efficiency ranking was closer to the actual ranking than the Rating ranking.

* In 13 of the 88 cases the Rating ranking was closer to the actual ranking than the Efficiency ranking.

* In 40 of the 88 cases the Efficiency ranking was equal to the actual ranking, and in 68 of the 88 cases the Efficiency ranking was within 2 of the actual ranking.

* In 31 of the 88 cases the Rating ranking was equal to the actual ranking, and in 69 of the 88 cases the Rating ranking was within 2 of the actual ranking.

* The biggest miss by both estimators was the 2003 Detroit Pistons. Using the actual data they were 6th, but they were ranked 14th in Efficiency and 16th in Rating.
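
The ranking comparison described above could be tallied along these lines in Python. This is only a sketch: the function names and the four-team sample are invented, and ties in a ranking are broken by input order, which may differ from how the original counts were made.

```python
def ranks(values):
    """Rank 1 = highest value; ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for place, idx in enumerate(order, start=1):
        result[idx] = place
    return result

def compare_rankings(actual, efficiency, rating):
    """Count which estimator's rank lands closer to the actual rank for each team."""
    tally = {"equal": 0, "efficiency_closer": 0, "rating_closer": 0}
    for a, e, r in zip(ranks(actual), ranks(efficiency), ranks(rating)):
        miss_eff, miss_rtg = abs(a - e), abs(a - r)
        if miss_eff == miss_rtg:
            tally["equal"] += 1
        elif miss_eff < miss_rtg:
            tally["efficiency_closer"] += 1
        else:
            tally["rating_closer"] += 1
    return tally

# Hypothetical points per 100 possessions for four teams in one season.
actual = [108.0, 105.0, 103.0, 100.0]
efficiency = [104.0, 105.5, 101.0, 99.0]   # Hollinger-style estimate
rating = [107.0, 104.0, 102.5, 103.0]      # Oliver-style estimate
result = compare_rankings(actual, efficiency, rating)
print(result)
```

In practice the ranking would be done separately within each season before the tallies are combined.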

All of this just confuses me...
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
Ed Küpfer



Joined: 30 Dec 2004
Posts: 762
Location: Toronto

Posted: Thu Aug 18, 2005 3:13 pm

Let's get some numbers happening. Here are the team stats for 2004 and 2005, including possessions: http://ca.geocities.com/edkupfer/basketballstuff/possessionstats.txt
The numbers in the possessions field were calculated by

1. Going to 82games.com player on/off pages.
2. For each player, calculating the number of team possessions while "on" and while "off," for offense and defense. Possessions = Points/(RTG/100)
3. Summing the on/off possession numbers.
4. Taking the average for each team, rounding it to whole numbers.
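
The four steps above might look roughly like this in Python, for one side of the ball. The tuple layout and the numbers are assumptions made for illustration (they are not taken from the 82games pages), and reading step 4 as an average across a team's players is an interpretation of the description.

```python
def possessions(points, rating):
    """Step 2: Possessions = Points / (RTG / 100)."""
    return points / (rating / 100.0)

def team_possessions(players):
    """Steps 2-4 for one team.

    players: list of (on_points, on_rating, off_points, off_rating) tuples,
    one per player, covering team points scored while that player was on
    and off the floor.
    """
    totals = [
        possessions(on_pts, on_rtg) + possessions(off_pts, off_rtg)   # step 3
        for on_pts, on_rtg, off_pts, off_rtg in players
    ]
    return round(sum(totals) / len(totals))                            # step 4

# Hypothetical on/off lines for two players on the same team.
players = [
    (4000.0, 100.0, 4200.0, 105.0),   # 4000 + 4000 possessions
    (6150.0, 102.5, 2040.2, 101.0),   # 6000 + 2020 possessions
]
team_total = team_possessions(players)
print(team_total)
```

Each player's on plus off total is a separate estimate of the same team figure, so the averaging step mostly smooths out rounding in the published ratings.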

Code:
           HOLL     DEAN
FTA Coef   0.40     0.40
RMSE     222.42   102.72
Const   -264.39  -119.59
Coef       1.01     1.01

Centiles
0.10    -300.24  -159.22
0.20    -282.00  -145.43
0.30    -272.70  -135.51
0.40    -250.02  -123.59
0.50    -200.34   -59.47
0.60    -168.30   -33.17
0.70    -154.85   -26.95
0.80    -141.31    -9.61
0.90    -129.00     3.86




Those numbers seem awful strange. The coefficient that minimizes Hollinger's RMSE is 0.3 -- which is just wrong. We know that it's somewhere between 0.4 and 0.45. The optimal coefficient for DeanO's is 0.36, which is better, but still off.

Code:
           HOLL     DEAN
FTA Coef   0.30     0.36
RMSE      77.48    69.58
Const    -12.42   -59.47
Coef       1.00     1.01

Centiles
0.10    -103.83   -80.15
0.20     -90.74   -65.96
0.30     -67.82   -56.35
0.40     -52.65   -41.16
0.50       3.23    32.55
0.60      38.35    54.22
0.70      56.93    64.53
0.80      73.96    73.98
0.90      89.12    87.23
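
A minimal Python sketch of a search for the RMSE-minimizing FTA coefficient, applied to Hollinger's formula. It scans a grid of coefficients only and does not fit the constant and slope shown in the tables, so it is not a reconstruction of how those numbers were produced. The team lines are invented, and the "actual" possessions are generated with a coefficient of 0.44 so the search has a known answer.

```python
import math

def rmse_for_coef(coef, teams):
    """RMSE of Hollinger-style estimates for a given FTA coefficient."""
    sq = 0.0
    for t in teams:
        est = t["fga"] + coef * t["fta"] - t["orb"] + t["to"]
        sq += (t["poss"] - est) ** 2
    return math.sqrt(sq / len(teams))

def best_fta_coef(teams, lo=0.20, hi=0.60, step=0.01):
    """Grid search over FTA coefficients in [lo, hi]."""
    n = int(round((hi - lo) / step))
    candidates = [round(lo + i * step, 2) for i in range(n + 1)]
    return min(candidates, key=lambda c: rmse_for_coef(c, teams))

# Hypothetical season totals; "poss" is synthetic, built with a 0.44 coefficient.
teams = [
    {"fga": 6560.0, "fta": 2050.0, "orb": 984.0, "to": 1148.0},
    {"fga": 6724.0, "fta": 1804.0, "orb": 820.0, "to": 1230.0},
    {"fga": 6396.0, "fta": 2296.0, "orb": 1066.0, "to": 1066.0},
]
for t in teams:
    t["poss"] = t["fga"] + 0.44 * t["fta"] - t["orb"] + t["to"]

best = best_fta_coef(teams)
print(best)
```

If the same scan on real data lands well below 0.4, that points at either the possession counts or some other term in the formula absorbing the error, which is the puzzle raised here.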



What's going on here? Could Roland's possession numbers be wrong?
_________________
ed
Ben



Joined: 13 Jan 2005
Posts: 264
Location: Iowa City

Posted: Thu Aug 18, 2005 3:17 pm

Those results are actually worse than I would have expected. That is confusing.

Edit: I was referring to Justin's post, and hadn't seen Ed's yet.
Ben



Joined: 13 Jan 2005
Posts: 264
Location: Iowa City

Posted: Thu Aug 18, 2005 3:22 pm

How are possessions defined at 82games.com? Are they the same thing that we've been trying to estimate?
jkubatko



Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

Posted: Thu Aug 18, 2005 3:45 pm

Ben wrote:
How are possessions defined at 82games.com? Are they the same thing that we've been trying to estimate?


That was my question as well. If they aren't defined the same way, then I have wasted a lot of time the last two days. :-)
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
All times are GMT - 5 Hours
Page 1 of 4

 