APBRmetrics

mtamada · Joined: 28 Jan 2005 Posts: 377

Here's a report on NESSIS, focusing on Wayne Winston's presentation about his and Sagarin's version of adjusted plus-minus (WinVal).

When the publicity for WinVal first came out, with the descriptions of how it accounted for the ability of the other 9 players on the court, I assumed that Winston and Sagarin were looking at all 450-odd NBA players by solving a system of 450 equations with 450 variables. But it turns out that they run a least squares regression, at its heart the same as DanR's adjusted +/-.
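
For anyone who hasn't seen the regression written out, here is a minimal sketch of the general adjusted +/- setup in Python with NumPy. It is not Winston's or DanR's actual code: the 12-player toy league, the noiseless margins, and the names `make_stints`, `true_rating` and `margin` are all assumptions made for illustration. Each row is one stint, with +1 for the five "home" players on the floor and -1 for the five "away" players.

```python
import numpy as np

def make_stints(n_players, n_stints, rng):
    """Synthetic stint matrix: +1 for five 'home' players on the floor,
    -1 for five 'away' players, 0 for everyone on the bench."""
    X = np.zeros((n_stints, n_players))
    for row in X:  # each row is a view, so assignments write into X
        on_court = rng.choice(n_players, size=10, replace=False)
        row[on_court[:5]] = 1.0
        row[on_court[5:]] = -1.0
    return X

rng = np.random.default_rng(0)
n_players = 12                       # toy league (assumption), not 450
true_rating = rng.normal(0.0, 3.0, n_players)
true_rating -= true_rating.mean()    # ratings are only identified up to a constant

X = make_stints(n_players, 200, rng)
margin = X @ true_rating             # noiseless stint margins (assumption)

# Least squares: one equation per stint, one unknown per player.
estimate, _, rank, _ = np.linalg.lstsq(X, margin, rcond=None)
```

Every row of `X` sums to zero, so the design has rank `n_players - 1` and `lstsq` returns the minimum-norm (mean-zero) solution. That is one reason this is a regression problem rather than a square system of 450 equations in 450 unknowns.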

Winston highlighted two ways in which his APM differs from what unnamed others who use OLS might get. He noted that there can be high degrees of collinearity in the data, and said that he has a "proprietary fix for this". More on this later.

He also said that his technique for dealing with players with little playing time differs ... I haven't done APM myself nor looked in detail at how DanR, AaronB, etc. do it, but Winston said that some people treat those players as generic replacement-level players (I presume in order to get the estimation to run). Winston said he does not do this (but gave no details about what he does do), which can cause some players' results to differ markedly (Dirk Nowitzki changing from -1 to +7).

I think it's fair to say that many in the audience thought that Winston seemed overly sure about his results and seemed not to pay attention to standard errors. Some of the questions after his presentation directly or indirectly addressed this, but IMO he dismissed them with hand-waving. E.g. one of his startling results was that Lamar Odom had a better APM in the 2009 playoffs than Kobe did. But I worry about sample size: over a few games, entire teams might, for random reasons, score more or fewer points than they "should" against a given opponent. Individual players' estimated APMs would be subject to even more random variability, i.e. high standard errors.
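
The sample-size worry can be illustrated with a rough simulation, sketched below in Python. Everything here is assumed for illustration: the toy 12-player league, the noise level of 12 points per stint, and the stint counts of 60 and 2000 standing in for a short playoff run and a longer sample. The helper names `ols_standard_errors` and `mean_se` are hypothetical. Because the +1/-1 design is rank-deficient, the sketch uses a pseudo-inverse, so the standard errors refer to mean-centered ratings.

```python
import numpy as np

def make_stints(n_players, n_stints, rng):
    """+1 for five 'home' players, -1 for five 'away' players, else 0."""
    X = np.zeros((n_stints, n_players))
    for row in X:
        on_court = rng.choice(n_players, size=10, replace=False)
        row[on_court[:5]] = 1.0
        row[on_court[5:]] = -1.0
    return X

def ols_standard_errors(X, y):
    """OLS estimates and standard errors via the pseudo-inverse of X'X."""
    XtX_pinv = np.linalg.pinv(X.T @ X)
    beta = XtX_pinv @ X.T @ y
    resid = y - X @ beta
    dof = X.shape[0] - np.linalg.matrix_rank(X)   # residual degrees of freedom
    sigma2 = resid @ resid / dof                  # estimated noise variance
    return beta, np.sqrt(sigma2 * np.diag(XtX_pinv))

rng = np.random.default_rng(1)
n_players = 12
true_rating = rng.normal(0.0, 3.0, n_players)
true_rating -= true_rating.mean()

def mean_se(n_stints):
    """Average standard error across players for a sample of n_stints."""
    X = make_stints(n_players, n_stints, rng)
    y = X @ true_rating + rng.normal(0.0, 12.0, n_stints)  # noisy margins
    _, se = ols_standard_errors(X, y)
    return se.mean()

se_playoffs = mean_se(60)     # few stints (illustrative)
se_season = mean_se(2000)     # many stints (illustrative)
```

In this toy setup `se_playoffs` comes out well above `se_season`, since standard errors fall roughly with the square root of the number of stints. Real substitution patterns are far more collinear than random draws, so the real-world picture is likely worse, not better.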

Slightly different but related to this: Winston acknowledged the debate about whether to use one season or more than one season of data when coming up with APM estimates. Given his predilections above, it's not surprising that he opts to use one season of data.

A comment on collinearity: no surprise that it's there given NBA substitution patterns, and it is a common bugaboo in doing linear regression. Econometricians will tell you that there is only one true solution: get more data (i.e. increase the sample size). Kind of hard because we can't ask the players to go out and play some more games for us (but we can opt to use 2 or even 3 seasons of data, which was the reason I voted for AaronB to use multiple seasons on bbval.com, though I knew the masses would ask for APMs based on one season).

Other solutions? One common one is ridge regression. It's a weird-sounding technique when you read about how to do it, but it makes sense when you think of it as a Bayesian estimator: in the face of regression coefficients with unreliable estimates (due to the collinearity), we opt to bias the estimates toward zero. It's similar to regressing estimates to the mean, but in this case we regress them to zero.
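
A small sketch of that idea in Python, under stated assumptions: three hypothetical players with 0/1 on-court dummies, where players A and B always share the floor (perfect collinearity), and a made-up penalty `lam`. Nothing here reflects what Winston actually does. The second half shows the "Bayesian" reading: the ridge estimate equals ordinary least squares on data augmented with pseudo-observations that say every coefficient is zero.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam*I)^-1 X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(2)
n = 300
a = rng.integers(0, 2, n).astype(float)  # player A on the floor (0/1 dummy)
b = a.copy()                             # player B always plays alongside A
c = rng.integers(0, 2, n).astype(float)  # player C substitutes independently
X = np.column_stack([a, b, c])
y = 4.0 * a + 1.0 * c + rng.normal(0.0, 5.0, n)  # A+B pair worth +4 combined

# OLS cannot split the +4 between A and B; ridge splits it evenly
# and shrinks everything toward zero as lam grows.
beta_small = ridge(X, y, 1.0)
beta_large = ridge(X, y, 100.0)

# Same estimate as OLS on augmented data: sqrt(lam)*I rows with target 0
# act as prior "observations" pulling each coefficient to zero.
lam = 1.0
X_aug = np.vstack([X, np.sqrt(lam) * np.eye(3)])
y_aug = np.concatenate([y, np.zeros(3)])
beta_aug, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
```

Here `beta_aug` matches `beta_small`, the A and B coefficients are equal, and `beta_large` has a smaller norm than `beta_small`. The price is bias: with a large `lam` even a genuinely good player's rating gets pulled toward zero.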

But my guess is that Winston isn't doing something like that. What is he doing? No idea. Another common suggestion is to transform one or more of the variables, but there's not much transforming that one can do with dummy variables.

So Winston's APM is similar to the other APMs out there, but with some modifications. Do these modifications result in much smaller standard errors? It'd be hard to imagine how they could, but Winston didn't provide standard errors nor address them much in the Question & Answer period.

He did also mention that he's been focusing more on lineups than on individual player ratings, consistent with some of the recent conversations and interests in this group.

The conference concluded with a panel discussion with MikeZ, AaronB, and Ken Catanella from the NBA Office. Ken said that the highly detailed video data that will soon be produced by the NBA will eventually be made available to the public. The panelists gave some interesting descriptions about their work situations, and also some useful advice about getting a job in the NBA -- but most of that same advice has I think already been mentioned on this site.

The final highlight was finally meeting MikeZ, who'd reserved the top floor of a sports bar across the street from where the Celtics play (I refuse to learn whatever dumb corporate name they've given to what should be called the Boston Garden). So after the conference a good proportion of the attendees went there and I got to spend time over food and drinks with him, AaronB, EdK, EliW, and Ben Alamar (editor of JQAS; I hadn't realized that he'd also consulted for the Sonics/Thunder). Swell group of fellows.

Ryan J. Parker · Joined: 23 Mar 2007 Posts: 711 Location: Raleigh, NC

Thanks for the recap!

My sense is that there is only so much we can squeeze out of the adj +/- model as currently specified. No idea what he could be doing to deal with the collinearity issues, but I think that there is still only so much confidence you can get out of the model. Although multiple years of data reduce std error, I have reservations about what we assume when we do this (not assumptions in the statistical/model sense, but in the basketball sense).

Great stuff, and I hope there will be video posted online!
_________________
I am a basketball geek.

John Hollinger · Joined: 14 Feb 2005 Posts: 175

"Post" entertainment at the Fours has rapidly become a rather enjoyable tradition at the Boston events ... wish I could have been there but the calendar wouldn't let me.

mikez · Joined: 14 Mar 2005 Posts: 75

Simply put, something is wrong with you if you don't love the Fours.

Anyways, it was great to see everyone, and especially to finally meet mtamada. Hopefully we'll all do it again in two years (which is how often the organizers have said they plan to do NESSIS).

-MZ

mtamada · Joined: 28 Jan 2005 Posts: 377

Correcting something that I wrote:

Ed Küpfer · Joined: 30 Dec 2004 Posts: 787 Location: Toronto

I think I speak for everyone when I say it was great meeting me.
_________________
ed

mikez · Joined: 14 Mar 2005 Posts: 75

HoopStudies · Posted: Fri Oct 02, 2009 9:23 am

I know there was other basketball work besides Winston's stuff at NESSIS. Can someone give some info on how those were? I've read the abstracts...
_________________
Dean Oliver
Author, Basketball on Paper
The postings are my own & don't necess represent positions, strategies or opinions of employers.

mtamada · Joined: 28 Jan 2005 Posts: 377

gabefarkas · Joined: 31 Dec 2004 Posts: 1313 Location: Durham, NC

tpryan · Joined: 11 Feb 2005 Posts: 100

MoonbeamLevels · Joined: 09 Apr 2007 Posts: 10

I'm a PhD student in statistics, and it's fascinating to hear topics such as ridge regression emerge here! It (along with other penalized regression models like the LASSO) plays with the concept of the "bias-variance" tradeoff, essentially reducing the variance significantly at the expense of a little bias. Different shrinkage methods have different advantages: the LASSO generates parameter estimates and handles the model selection issue at the same time, by shrinking the parameters toward zero (and setting a number of them exactly equal to zero, in most cases), while ridge regression shrinks the coefficients in a way that circumvents collinearity but does not necessarily set any of them equal to zero. A useful combination of these two methods, called the "elastic net", seems to be quite promising according to the statistics literature. I'm not sure how well these methods would translate to basketball data, but I would imagine they could indeed be quite useful.
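
To show the difference between the three penalties, here is a minimal Python sketch under a strong simplifying assumption: an orthonormal design (X'X = I), where each penalized estimate has a closed form in terms of the OLS coefficients. Basketball lineup data is nowhere near orthonormal, so this only illustrates the shrinkage behavior; real fits need an iterative solver. The coefficients in `b_ols` and the penalty values are made up. The objectives assumed are 0.5*||y - Xb||^2 plus l1*||b||_1 (LASSO), plus 0.5*l2*||b||^2 (ridge), or plus both (elastic net).

```python
import numpy as np

def soft_threshold(b, t):
    """Shrink each coefficient toward zero by t, clipping at exactly zero."""
    return np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

# Hypothetical OLS coefficients under an orthonormal design (assumption).
b_ols = np.array([3.0, -0.5, 0.2, -2.0])

l1, l2 = 1.0, 1.0  # illustrative penalty strengths

# LASSO: soft-thresholding, so small coefficients become exactly zero.
lasso = soft_threshold(b_ols, l1)

# Ridge: proportional shrinkage, so nothing is set exactly to zero.
ridge = b_ols / (1.0 + l2)

# Elastic net: soft-threshold (with a smaller l1 here), then shrink proportionally.
enet = soft_threshold(b_ols, 0.5) / (1.0 + l2)
```

With these numbers the LASSO zeroes out the two small coefficients and keeps 2 and -1, ridge halves all four, and the elastic net does a bit of both (1.25, 0, 0, -0.75).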

HoopStudies · Posted: Tue Oct 06, 2009 1:36 pm

Good to see some of the people with that more advanced stat knowledge providing the input. I didn't know about ridge regression until this discussion. Relating it to Bayesian stuff is definitely helpful... So many tools and you have to know when to use which ones...

mtamada · Joined: 28 Jan 2005 Posts: 377

Ryan J. Parker · Joined: 23 Mar 2007 Posts: 711 Location: Raleigh, NC

Thanks for that link mtamada. I haven't seen much work in trying to look at the underlying assumptions and testing if they work or not. In fact, I haven't seen any such talk about when you'd have to revert back to a fixed effects model!