APBRmetrics Forum Index APBRmetrics
The statistical revolution will not be televised.
 

Searching for Data
Mountain



Joined: 13 Mar 2007
Posts: 374

Posted: Thu Jan 31, 2008 12:17 pm

Some thoughts for the discussion:

5-man lineups each have their own offensive and defensive efficiencies. A team's efficiency is, in a sense, the sum of all the 5-man lineups it uses, but that mix can vary a little or a lot from game to game, so the team average efficiencies are not fixed. They are clearly affected game by game by the opponent, but also by the choice of how to match up against that opponent. There may be a stat-based optimal set and timing of 5-man lineups, on average or for a particular game (based on study of past direct matchups, or extended to similar teams), but performance in each game will still vary.

I share these comments in relation to seeking a "real" or "a priori" team efficiency. The values you get for that are influenced by all the coaching choices regarding lineups and matchups. Some teams vary their lineup and matchup strategy more than others, and I would guess this may be affecting how well the actual results match the predictions you cited above. I don't think you are getting to the underlying real team efficiency unless you assume that all coaches have optimized equally.

An alternative to the search for underlying "team" efficiency would be to adjust the 5-man lineup data compiled at basketballvalue.com so that you know the adjusted efficiencies. Then perhaps you can go from 5-man lineups to optimized matchups and get to predicted team performance, in a game or as a season average, that way. That is the way coaches do it, or should.
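A rough Python sketch of the "team as the sum of its 5-man lineups" idea. The lineup names, possession counts and efficiencies are invented for illustration, and `team_efficiency` is a hypothetical helper, not something taken from basketballvalue.com.

```python
def team_efficiency(lineups):
    """Possession-weighted average of lineup efficiencies (points per 100 possessions).

    lineups: dict mapping a lineup label to (possessions, efficiency).
    """
    total_poss = sum(poss for poss, _ in lineups.values())
    if total_poss == 0:
        raise ValueError("no possessions recorded")
    return sum(poss * eff for poss, eff in lineups.values()) / total_poss

# Hypothetical data: the same three lineups, used in different proportions.
season = {"starters": (600, 112.0), "bench_mix": (300, 104.0), "small_ball": (100, 118.0)}
one_game = {"starters": (40, 112.0), "bench_mix": (50, 104.0), "small_ball": (5, 118.0)}

season_off = team_efficiency(season)
game_off = team_efficiency(one_game)
print(season_off, game_off)
```

The lineup efficiencies are identical in both cases. Only the mix of possessions changes, and that alone moves the team number, which is the point about coaching choices being baked into any "team" efficiency.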

That is my reaction, for what it might be worth, Dream-. Getting to your ultimate goal may take additional steps and be very hard. But certainly more can be learned toward that goal from your efforts, and I look forward to hearing more about your method and findings.


Last edited by Mountain on Thu Jan 31, 2008 4:03 pm; edited 1 time in total
Chicago76



Joined: 06 Nov 2005
Posts: 52

Posted: Thu Jan 31, 2008 3:05 pm

Dream- wrote:
It is very interesting indeed. I am already finding surprising results with respect to home court advantage and how rested teams are.

I am adjusting the strengths individually for these factors because it seems that different teams react much differently to them.

I am also finding that some teams are incredibly hard to predict (with 50% hit rate) while others are very consistent (95% hit rate). And it is not the ones on the very top or bottom of the standings.


So you're saying that teams at the very top and bottom are inconsistent? I would think this is the case. Take the Celtics. They win a lot, but if you're using O and D ratings of them and their opponents, coming up with an adjusted spread is going to get complicated. Primarily because they rest guys when they have a comfortable lead. They're outscoring people through 3 quarters by more than 10 pts a game, but in the 4th, they're coasting at +0.5 pts. Very good teams and very bad teams are difficult to model because they frequently find themselves in games where no one cares mid-way through the 4th.

The other difficulty I see in what you're trying to achieve is accounting for the frenzied part of a close game in the final few minutes. Your model might appropriately predict a 4 pt victory for team A over B. Team A might be ahead 4 with 2 minutes left. Then they end up winning by 10 due to a foulathon at the end. We see games all the time where the final score isn't really indicative of the play for 98% of the game.
Dream-



Joined: 26 Jan 2008
Posts: 12

Posted: Thu Jan 31, 2008 3:29 pm

Mountain,

Yes, the 5-man lineup is important, and so is what you call the coach's optimization against a given team. I may look at that too once I am done with this first analysis.

Chicago76,

Yes, you are right: the last minutes make it very difficult, but I think there is a definite trend.

Regarding the inconsistency of teams: the top and bottom teams are consistent, but not as consistent as some teams that are near the edges of the standings without being at them. There also seems to be less consistency with bad teams. While bad teams lose most games, they do it very erratically (they may suddenly play a great game against a good team).

Here are some more results:

-The first 5 games of the season are very unpredictable, the ratings jump all over. After 5 games the ratings settle down and the accuracy increases as the season goes by.

-The last 5 games of the season are very unpredictable again, possibly because some teams give up, others rest their good players and stop caring, and yet others push themselves trying to get into the playoffs.

-The playoffs are very unpredictable on a game-by-game basis, but they are predictable as a series.

So far I am using unadjusted strengths (I am still looking for proper data and writing the code that will retrieve the data and format it to something I can use).

I am using an Elo-type convergence method and evaluating performance purely on a win-loss basis for each game.

I simulate the whole season and at each game I make a prediction, compare it with the result, and adjust the ratings.
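The loop described above might look roughly like this in Python. This is a generic Elo update, not the actual code behind these results: the starting rating of 1500, the 400-point scale and K = 20 are the conventional chess defaults and are assumptions here, and the game list is made up.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Standard Elo win expectancy for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))

def simulate_season(games, k=20.0, start=1500.0):
    """Walk through games in order: predict, compare with the result, adjust.

    games: list of (home, away, home_won) tuples, with home_won True or False.
    Returns (ratings, hit_rate).
    """
    ratings = {}
    hits = 0
    for home, away, home_won in games:
        r_home = ratings.setdefault(home, start)
        r_away = ratings.setdefault(away, start)
        p_home = expected_score(r_home, r_away)
        # A prediction is a "hit" if the favourite won (a 50/50 call counts as picking home).
        if (p_home >= 0.5) == home_won:
            hits += 1
        actual = 1.0 if home_won else 0.0
        delta = k * (actual - p_home)
        ratings[home] = r_home + delta   # winner gains what the loser gives up
        ratings[away] = r_away - delta
    return ratings, (hits / len(games) if games else 0.0)

# Made-up results, purely to exercise the loop.
games = [("BOS", "MIA", True), ("MIA", "BOS", False), ("BOS", "NYK", True)]
ratings, hit_rate = simulate_season(games)
print(ratings, hit_rate)
```

No home court or rest adjustment is included here. Those would enter as an offset added to the rating difference before computing `p_home`.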

Doing just a W-L rating does provide some predictability, but doing it with points is even more accurate. To use the points, I am converting point differentials to a score value (0 is a strong loss, 1 is a strong win, while 0.51 is a weak win). I am using an inverse exponential function, where the exponent seems to have a great deal of importance when it comes to accuracy of prediction. Right now the best exponential makes the curve almost a straight line.
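One possible shape for such a curve, sketched in Python. The exact function is not given in this post, so the form below (0.5 for a tie, approaching 1 for big wins and 0 for big losses) is only a guess at the general idea, and `tau` is a hypothetical name for the adjustable parameter.

```python
import math

def margin_to_score(margin, tau=25.0):
    """Map a point differential to a value strictly between 0 and 1.

    0.5 is a tie; large wins approach 1 and large losses approach 0.
    tau (in points) sets how fast the curve flattens: a large tau keeps the
    curve close to a straight line over typical margins.
    """
    saturation = 1.0 - math.exp(-abs(margin) / tau)   # 0 at a tie, tends to 1 for blowouts
    return 0.5 + 0.5 * math.copysign(saturation, margin)

for m in (-30, -1, 0, 1, 14, 30):
    print(m, round(margin_to_score(m), 3))
```

The returned value would take the place of the plain 0-or-1 game result in the rating update, so a narrow win moves the ratings less than a blowout.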
Chicago76



Joined: 06 Nov 2005
Posts: 52

Posted: Fri Feb 01, 2008 2:00 am

Dream- wrote:
Doing just a W-L rating does provide some predictability, but doing it with points is even more accurate. To use the points, I am converting point differentials to a score value (0 is a strong loss, 1 is a strong win, while 0.51 is a weak win). I am using an inverse exponential function, where the exponent seems to have a great deal of importance when it comes to accuracy of prediction. Right now the best exponential makes the curve almost a straight line.


Question: why convert point differentials to a subjective definition of strong loss/win or weak loss/win? Why not just use the point differentials?

An observation:

There is quite a bit of great data out there on a lot of sites, but I'm partial to something that might be helpful to you on b-r. You were mentioning the unpredictable nature of some teams. Apart from the blowout scenario and the score differential due to end of game free throw scenario, I've got one other that could lead to volatility in team performance.

Individual player performance generally is more variable game to game than team performance. Intuitively, teams that rely upon one guy to carry a large load may have greater performance volatility.

b-r has a nice tool here: http://www.basketball-reference.com/fc/psl_finder.cgi

I pulled up usage% for 2006-07 (essentially the % of a team's touches a player gets while on the court). I weighted this by the % of time the player was actually on the court to get % of total team touches:

Bryant 26.8%
James 25.1%
Carter 24.5%
Arenas 23.5%
McGrady 22.6%
Anthony 21.1%
Garnett 20.9%
Nowitzki 20.7%
Iverson 20.6%
Randolph 20.4%
Gordon 20.1%

Injuries and trades aside, I would expect these players' teams to generally have a higher degree of unpredictability than other teams, especially teams without another high-usage player whose usage is currently constrained (like an Iverson on the Nuggets). On those teams, other players can't absorb the burden on an off night.
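The weighting behind the list above (usage% times share of time on court) is simple arithmetic. A small Python sketch, where `team_touch_share` and the sample numbers are hypothetical and not the figures for any player listed:

```python
def team_touch_share(usage_pct, minutes_played, team_game_minutes):
    """Share of all team touches, in percent.

    usage_pct: % of team touches the player uses while on the court.
    minutes_played: the player's total minutes.
    team_game_minutes: total minutes of team game time (e.g. 82 games * 48, ignoring overtime).
    """
    on_court_fraction = minutes_played / team_game_minutes
    return usage_pct * on_court_fraction

# Hypothetical player: 30% usage while on the floor, on the floor 75% of the time.
share = team_touch_share(30.0, 2952, 82 * 48)
print(share)
```

A player with lower usage but heavier minutes can end up with the same share of total team touches as a higher-usage player who sits more.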
Dream-



Joined: 26 Jan 2008
Posts: 12

Posted: Fri Feb 01, 2008 12:41 pm

Chicago,

Thanks for the data, I will post the consistency values for said teams, but right now, from memory, I think the Lakers had a pretty decent consistency, so did the Mavericks. I think the Mavericks and the Spurs were the most consistent for 2006-07.

As for why I turn the point difference into an artificial curve: the main reason is that at some point a score difference of more than a certain number of points is not directly proportional to the strength difference between the teams. The current exponential reaches 0.707 at about 14 points.

Another thing I am finding, though it needs more research, is that for some reason rest does not affect team performance in the way I would expect. For most teams, playing a game on 0 days of rest drops performance quite a bit, but resting for 3 or more days also seems to give a drop in performance. It varies from team to team: some lose performance with 2 days of rest and get it back with 3+. At this point the only "safe" assumption is that 0 days of rest causes a drop.
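One way to check a rest effect like this is to group prediction errors by days of rest. A minimal Python sketch, assuming each game is stored as (rest days, predicted margin, actual margin); the sample rows are made up and `residual_by_rest` is a hypothetical helper:

```python
from collections import defaultdict

def residual_by_rest(games):
    """Average (actual - predicted) margin per rest bucket: '0', '1', '2', '3+'."""
    buckets = defaultdict(list)
    for rest_days, predicted, actual in games:
        key = "3+" if rest_days >= 3 else str(rest_days)
        buckets[key].append(actual - predicted)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}

# Made-up rows: (days of rest, predicted margin, actual margin)
sample = [
    (0, 3.0, -2.0), (0, -1.0, -4.0),
    (1, 2.0, 3.0),
    (2, 0.0, 1.0),
    (3, 4.0, 1.0), (5, 2.0, 1.0),
]
result = residual_by_rest(sample)
print(result)
```

A negative average means teams in that bucket did worse than predicted. Running this per team rather than league-wide would show the team-to-team variation, though with far fewer games per bucket the averages get noisy.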
gabefarkas



Joined: 31 Dec 2004
Posts: 958
Location: Durham, NC

Posted: Fri Feb 01, 2008 3:12 pm

Dream- wrote:
The main reason is that at some point a score difference of more than a certain number of points is not directly proportional to the strength difference between the teams. The current exponential reaches 0.707 at about 14 points.

For the uninitiated, what does "the current exponential" mean? Also, what is it supposed to be (if there is an assumed value) before it reaches 0.707? And how did you get 14 points as a cut-off?


Dream- wrote:
Another thing I am finding, though it needs more research, is that for some reason rest does not affect team performance in the way I would expect. For most teams, playing a game on 0 days of rest drops performance quite a bit, but resting for 3 or more days also seems to give a drop in performance. It varies from team to team: some lose performance with 2 days of rest and get it back with 3+. At this point the only "safe" assumption is that 0 days of rest causes a drop.

I'm not sure what's surprising about that. Heuristically, I would think too much rest can cause players to be "rusty" or "out of practice", no?
Dream-



Joined: 26 Jan 2008
Posts: 12

Posted: Fri Feb 01, 2008 3:50 pm

The "current exponential" is my own inverse exponential function that gives a curve from 0 to 1, where 1 is reached asymptotically (meaning that the function never reaches 1 but gets infinitely close to it).

By adjusting the function we can vary the relationship between the point differential in a given game and the strength of victory (or loss). After the exponent was adjusted to maximize prediction hits, I got a curve where a strength of victory of 0.707 is reached at a winning margin of 14 points (this point is called the half-power point in some areas of physics, particularly electronics).
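To make the "0.707 at 14 points" statement concrete, here is a small Python sketch that solves for the curve parameter from that one calibration point. The functional form is an assumption, since the actual formula is not given: a tie scores 0.5 and a winning margin d scores 0.5 + 0.5 * (1 - exp(-d / tau)), with `tau` a hypothetical name for the adjustable parameter.

```python
import math

def solve_tau(margin, target):
    """tau such that 0.5 + 0.5 * (1 - exp(-margin / tau)) == target, for margin > 0."""
    if margin <= 0 or not (0.5 < target < 1.0):
        raise ValueError("need margin > 0 and 0.5 < target < 1")
    # Rearranged: exp(-margin / tau) = 2 * (1 - target)
    return -margin / math.log(2.0 * (1.0 - target))

def score(margin, tau):
    """Assumed curve: 0.5 for a tie, approaching 1 as the winning margin grows."""
    return 0.5 + 0.5 * (1.0 - math.exp(-margin / tau))

tau = solve_tau(14, 0.707)
check = score(14, tau)   # should reproduce 0.707 up to rounding
print(tau, check)
```

Under a different assumed form (for example a curve that starts at 0 rather than 0.5 for a tie) the same calibration point gives a different parameter, so the number is only meaningful together with the formula.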

The current algorithm is still quite sensitive to changes in parameters (and there may be no solution to that). For example, increasing the K factor in the Elo formula produces results that are more variable and perhaps better for predictions, because it weights recent performance more heavily, but it also requires lowering the exponential curve to a point where it looks artificial to me.
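The K factor trade-off can be seen in a toy Python example: one team plays a fixed-strength opponent, wins ten in a row and then loses three. The standard Elo expectancy with a 400-point scale is assumed, and the result sequence and K values are invented.

```python
def expected(r_a, r_b):
    """Standard Elo win expectancy for A against B (400-point scale assumed)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def rating_path(results, k, start=1500.0, opponent=1500.0):
    """Rating of one team after each game against an opponent of fixed strength."""
    rating, path = start, []
    for won in results:
        rating += k * ((1.0 if won else 0.0) - expected(rating, opponent))
        path.append(rating)
    return path

results = [True] * 10 + [False] * 3   # long winning run, then a short slump
low_k = rating_path(results, k=10)
high_k = rating_path(results, k=40)

# How far each rating falls during the three-game slump.
low_drop = low_k[9] - low_k[-1]
high_drop = high_k[9] - high_k[-1]
print(low_drop, high_drop)
```

With the larger K the rating climbs higher during the run and falls much further during the slump, which is the "weights recent performance more heavily" behaviour and also the source of the extra variability.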

So I am still figuring this out. :)