APBRmetrics

farbror · Joined: 13 Oct 2005 Posts: 15 Location: Sweden

I am a fairly qualified statistician with a craving for power ratings. What is the recommended reading on "How to create meaningful Power Ratings"?

Cheers, farbror
SWEDEN

kjb · Joined: 03 Jan 2005 Posts: 865 Location: Washington, DC

farbror · Joined: 13 Oct 2005 Posts: 15 Location: Sweden

Thank you for the input and the starting points. Another very interesting question: how should the ratings be updated?

Also, I think I read a version of the Pythagorean approach you mentioned where the parameters are raised to the power 16.5. What is the state of the art?

...and has anyone looked into things like how power ratings should be adjusted for trades? I get the impression that most of the well-thought-out material posted here is "player focused"?
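For reference, the Pythagorean form in question is usually written as expected win% = PF^x / (PF^x + PA^x), with PF and PA being points scored and allowed. A minimal Python sketch, where the function name and the default exponent of 16.5 are just illustrative choices and not a claim about the best value:

```python
def pythagorean_win_pct(points_for, points_against, exponent=16.5):
    """Expected winning percentage from points scored and allowed.

    Algebraically equal to PF**x / (PF**x + PA**x), but written with the
    ratio PA/PF so that large season totals do not overflow.
    """
    ratio = points_against / points_for
    return 1.0 / (1.0 + ratio ** exponent)


# A team that scores exactly as much as it allows is expected to be .500.
print(pythagorean_win_pct(8000, 8000))
# Outscoring opponents pushes the expectation above .500.
print(pythagorean_win_pct(8200, 8000))
```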

cheers, farbror
SWEDEN

Neil Paine · Joined: 13 Oct 2005 Posts: 774 Location: Atlanta, GA

Okay, here's a super-laborious method I use for the NFL. It's based on Wes Colley's NCAA Football model, but much more simplified. I'm convinced it can also work for the NBA, but it would be a bitch to set up and update, which is why I haven't done it yet...

Start with something called Laplace's Method (also known as the rule of succession) to estimate a team's true probability of winning a game: r = (1 + wins) / (2 + wins + losses). (This method was first introduced by Pierre-Simon Laplace, the French mathematician, in hopes of determining where an unseen marker on a craps table is, solely by trial-and-error shots of dice.)
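As a quick illustration, here is the rule of succession in its usual form, with total games played in the denominator. The function name is made up for this sketch:

```python
def laplace_rating(wins, losses):
    """Rule of succession: (1 + wins) / (2 + games played)."""
    return (1 + wins) / (2 + wins + losses)


# With no games played the estimate is an even 0.5.
print(laplace_rating(0, 0))
# A 10-6 team gets 11/18, pulled slightly toward 0.5 from its raw 10/16.
print(laplace_rating(10, 6))
```

The pull toward 0.5 is strongest early in the season and fades as games accumulate.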

Now, account for strength of schedule, as not all NBA schedules are created equal. This is where I deviate from Colley's method, because we need not calculate opponents' opponents' records in the rating (the NBA only has 30 teams; one layer of schedule strength should do the trick). Simply sum the "r" values for every opponent a team plays (once per game played, updated continuously throughout the season), and let that sum equal the variable "s". Now introduce a new variable, "nEff", to the mix:

nEff = ((wins - losses) / 2) + s

To form a raw rating, then:

Raw = (1 + nEff) / (2 + wins + losses)

This effectively deals with schedule strength by giving the winning team only a fraction of a win that varies based on opponent strength, and conversely punishing the losing team with only a fraction of a loss, also variable based on opponent skill.

If you're really crazy (like Wes Colley, for instance), you calculate "nEff" not by summing "r"-values, but instead by taking half the sum of the convergent infinite series of the "r" values to the nth power, where n is the number of iterations... In other words, instead of using the simple Laplace values (which don't fluctuate depending on who you play) for schedule strength, he instantaneously updates the raw ratings, plugging them in as "r" in each successive iteration, and then iterates an infinite number of times. But nobody's that crazy!
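The one-layer version described above can be sketched roughly as follows in Python. The input format (a list of (winner, loser) pairs) and the function name are assumptions made for this example, and "s" is summed once per game, so an opponent faced twice counts twice:

```python
from collections import defaultdict


def raw_ratings(games):
    """games: iterable of (winner, loser) pairs. Returns {team: Raw}."""
    wins = defaultdict(int)
    losses = defaultdict(int)
    opponents = defaultdict(list)  # one entry per game played
    for winner, loser in games:
        wins[winner] += 1
        losses[loser] += 1
        opponents[winner].append(loser)
        opponents[loser].append(winner)

    teams = list(opponents)
    # Laplace value for each team, ignoring schedule.
    r = {t: (1 + wins[t]) / (2 + wins[t] + losses[t]) for t in teams}

    raw = {}
    for t in teams:
        s = sum(r[opp] for opp in opponents[t])
        n_eff = (wins[t] - losses[t]) / 2 + s
        raw[t] = (1 + n_eff) / (2 + wins[t] + losses[t])
    return raw


# Tiny round robin: A beats B and C, B beats C.
ratings = raw_ratings([("A", "B"), ("A", "C"), ("B", "C")])
print(ratings)  # A: 0.6875, B: 0.5, C: 0.3125
```

In this toy schedule A's two wins came against weaker opposition, so its Raw value (0.6875) ends up a bit below its plain Laplace value (0.75).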

Finally, I take the "Raw" number and run a regression on it to make it look like conventional power ratings: 100 is the best possible, 75 is average, etc. It's more arbitrary, but it makes it easier to look at an individual team rating and make a value judgment.
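The regression itself isn't spelled out here, so the following is only a stand-in: a fixed linear map that sends a Raw value of 0.5 (roughly an average team) to 75 and a Raw value of 1.0 to 100. The function name, its parameters, and the choice of a fixed map rather than a fitted one are all assumptions:

```python
def to_power_rating(raw, average=75.0, best=100.0):
    """Linearly rescale a Raw value in [0, 1] to a power-rating scale.

    Assumption: raw = 0.5 maps to `average` and raw = 1.0 maps to `best`.
    """
    slope = (best - average) / 0.5
    return average + slope * (raw - 0.5)


print(to_power_rating(0.5))     # 75.0
print(to_power_rating(1.0))     # 100.0
print(to_power_rating(0.6875))  # 84.375
```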

If you want the really gory mathematical details, check this bad boy out.

Hope this helped, though I can't believe that it possibly did...

kjb · Joined: 03 Jan 2005 Posts: 865 Location: Washington, DC

farbror · Joined: 13 Oct 2005 Posts: 15 Location: Sweden

Hehe, I just might have. I would really appreciate a few hints on where the best sources of (free) data are available. I am fluent in statistical programming and might be of some assistance in that area.

cheers, farbror
sweden

Jon Cohodas · Joined: 08 Jul 2005 Posts: 31 Location: Richmond, VA

Neil Paine wrote:
If you're really crazy (like Wes Colley, for instance), you calculate "nEff" not by summing "r"-values, but instead by taking half the sum of the convergent infinite series of the "r" values to the nth power, where n is the number of iterations... In other words, instead of using the simple Laplace values (which don't fluctuate depending on who you play) for schedule strength, he instantaneously updates the raw ratings, plugging them in as "r" in each successive iteration, and then iterates an infinite number of times. But nobody's that crazy!

I am that crazy, I guess, but you don't have to be. I calculate my "power ratings" for college football by maximizing a log-likelihood function, which involves iterating not an infinite number of times, but rather until the rating values are stable to a large number of significant figures.
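To give a flavor of the "iterate until stable" idea, here is a minimal sketch that fits a Bradley-Terry model by its standard fixed-point update. The Bradley-Terry model is an assumption picked for illustration and is not necessarily the likelihood described above; the function name and tolerance are also made up. It assumes every team has at least one win and one loss, otherwise the maximum-likelihood strengths are not finite:

```python
from collections import defaultdict


def bradley_terry(games, tol=1e-10, max_iter=10_000):
    """games: iterable of (winner, loser). Returns {team: strength}."""
    wins = defaultdict(int)
    pair_games = defaultdict(int)  # games played between each pair
    teams = set()
    for winner, loser in games:
        wins[winner] += 1
        pair_games[frozenset((winner, loser))] += 1
        teams.update((winner, loser))
    teams = sorted(teams)

    p = {t: 1.0 for t in teams}
    for _ in range(max_iter):
        new_p = {}
        for i in teams:
            denom = sum(
                pair_games[frozenset((i, j))] / (p[i] + p[j])
                for j in teams
                if j != i
            )
            new_p[i] = wins[i] / denom
        # Strengths are only defined up to scale, so fix the mean at 1.
        total = sum(new_p.values())
        new_p = {t: v * len(teams) / total for t, v in new_p.items()}
        # Stop once no rating moves by more than the tolerance.
        if max(abs(new_p[t] - p[t]) for t in teams) < tol:
            return new_p
        p = new_p
    return p


# A beat B twice and lost once, so A should come out twice as strong.
strengths = bradley_terry([("A", "B"), ("A", "B"), ("B", "A")])
print(strengths["A"] / strengths["B"])
```

Under this model the estimated probability that team i beats team j is p_i / (p_i + p_j).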

You should note that for Colley's method, you do not need to iterate; rather, you may just invert the C matrix in section 6.1 of the paper you linked. I was able to exactly replicate his calculations in Excel (with the help of a freeware large-matrix add-in) using this method. (I have the file somewhere if you want to see it.)

Since the NBA has fewer than 40 teams, Excel might be able to invert the matrix without the add-in.
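Outside Excel, the same calculation can be sketched in plain Python. In Colley's setup the matrix C has 2 + (games played) on the diagonal and minus the number of head-to-head games off the diagonal, and the right-hand side is b = 1 + (wins - losses) / 2. The sketch below solves C r = b by Gaussian elimination, which gives the same answer as inverting C; the function names and the (winner, loser) input format are assumptions for this example:

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            f = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def colley_ratings(games):
    """games: list of (winner, loser) pairs. Returns {team: rating}."""
    teams = sorted({t for g in games for t in g})
    idx = {t: k for k, t in enumerate(teams)}
    n = len(teams)
    C = [[0.0] * n for _ in range(n)]
    b = [1.0] * n
    for k in range(n):
        C[k][k] = 2.0
    for winner, loser in games:
        w, l = idx[winner], idx[loser]
        C[w][w] += 1  # each game adds one to both diagonals
        C[l][l] += 1
        C[w][l] -= 1  # and subtracts one from the pair's off-diagonals
        C[l][w] -= 1
        b[w] += 0.5
        b[l] -= 0.5
    return dict(zip(teams, solve(C, b)))


# Tiny round robin: A beats B and C, B beats C.
print(colley_ratings([("A", "B"), ("A", "C"), ("B", "C")]))
# Approximately A: 0.7, B: 0.5, C: 0.3
```

For a 30-team league this is a 30 x 30 system, which solves essentially instantly.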