APBRmetrics The statistical revolution will not be televised.
Mountain
Joined: 13 Mar 2007 Posts: 1527
|
Posted: Thu Jun 18, 2009 1:56 am Post subject: |
|
|
Thanks. I've pitched this sketch several times, hoping for some feedback on feasibility, challenges, or ways to accomplish it.
When time permits I'll check further into the other topics you raise.
I can see that the significance level might need to shift. As with all this adjusted data, at best you will end up fairly confident about most of the worst and best and not that confident about the level of the rest. Still, that has some value, and then you can check the tape or memory and decide how far to believe or adjust in specific cases. Believe rather than "know" for sure.
Until the partial Factor level adjusted data is derived, you can look at the Factor and partial Factor level raw data, the adjusted +/- or offensive and defensive splits, and other data, make some guesses about what the most significant adjusted partial Factors might be, and get a sense of their sign and in some cases magnitude. But there could be multiple sets of Factor or partial Factor solutions to the composite level adjusted scores for players of roughly similar power.
Last edited by Mountain on Fri Jun 19, 2009 8:04 am; edited 1 time in total |
|
gabefarkas
Joined: 31 Dec 2004 Posts: 1313 Location: Durham, NC
|
Posted: Thu Jun 18, 2009 1:25 pm Post subject: |
|
|
That Gelman Bayesian book looks interesting, but the only mention of multiplicity is somewhere in the references, as far as I can tell.
The article you linked to seems to be on the right track.
The classic reference in my day job is this, but I'm not sure where you can find a copy of it since it's fairly old. |
|
Ryan J. Parker
Joined: 23 Mar 2007 Posts: 708 Location: Raleigh, NC
|
|
mtamada
Joined: 28 Jan 2005 Posts: 376
|
Posted: Thu Jun 18, 2009 4:25 pm Post subject: |
|
|
By coincidence I attended a seminar a couple of weeks ago where Brad Efron presented a paper on using Empirical Bayesian techniques to reduce one aspect of the multiplicity problem: selecting variables to use in a multivariate regression. He has even made available a program, written in R, which does the estimation (and in a typical joke, calls his program "EBay"). The paper and program are on his webpage, under the entries for 2008.
One side note that he mentioned during the seminar, which I wasn't familiar with, is that Bayesian estimation techniques are immune (perhaps under certain conditions?) to the multiplicity problems of classical/frequentist statistics: i.e. no need to shrink or regress estimates to the mean. At least I think that's what he said, it was an offhand comment, and I do not have a good knowledge of Bayesian statistics.
But now this paper says that Empirical Bayesian estimates may often differ significantly from Bayesian estimates, suggesting perhaps that Efron's EBay solution may not be adequate. |
|
Ryan J. Parker
Joined: 23 Mar 2007 Posts: 708 Location: Raleigh, NC
|
Posted: Thu Jun 18, 2009 4:29 pm Post subject: |
|
|
Interesting stuff mtamada. I'm still learning, so when someone says "empirical bayes" I'm not exactly sure what they're referring to. I know the general idea is that you're using data to create priors in which you then use those priors with the data, in a sense using the data twice. Gelman doesn't prefer this terminology, calling the empirical part redundant. Should be some good reading there, though. _________________ I am a basketball geek. |
|
mtamada
Joined: 28 Jan 2005 Posts: 376
|
Posted: Thu Jun 18, 2009 4:48 pm Post subject: |
|
|
Ryan J. Parker wrote: | Interesting stuff mtamada. I'm still learning, so when someone says "empirical bayes" I'm not exactly sure what they're referring to. I know the general idea is that you're using data to create priors in which you then use those priors with the data, in a sense using the data twice. Gelman doesn't prefer this terminology, calling the empirical part redundant. Should be some good reading there, though. |
Yeah, I'm no expert; here's a nice short summary of some views about Empirical Bayesian techniques, including Gelman's viewpoint.
From Efron's talk, I gather that one of the problems with Empirical Bayesian techniques is that the estimates have larger standard errors (greater uncertainty) than calculated, and maybe bias as well -- presumably because the estimates are based, not on true priors, but on parameters estimated from the data. But you don't know the standard errors of those estimated parameters ... or maybe it's hard to calculate how that uncertainty leads to additional uncertainty in the final Empirical Bayes estimates. |
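For readers who want to see the "use the data to build the prior, then use the prior with the data" idea concretely, here is a minimal sketch. It assumes a beta-binomial setup for shooting percentages and a simple method-of-moments fit; the player names and numbers are made up, and the function names are hypothetical. It is not Efron's EBay program, just one common textbook form of the technique.

```python
from statistics import mean, pvariance


def fit_beta_prior(rates):
    """Method-of-moments fit of a Beta(a, b) prior to observed rates.

    Simplification (assumption): the raw rates are treated as if they were
    the true rates, so binomial sampling noise inflates the variance and the
    fitted prior is wider than it should be.
    """
    m = mean(rates)
    v = pvariance(rates)
    if not 0 < v < m * (1 - m):
        raise ValueError("moments do not correspond to a Beta distribution")
    k = m * (1 - m) / v - 1
    return m * k, (1 - m) * k


def shrink(makes, attempts, a, b):
    """Posterior mean of the rate under a Beta(a, b) prior."""
    return (makes + a) / (attempts + a + b)


# Made-up (makes, attempts) data for four hypothetical players.
players = {"A": (9, 10), "B": (45, 100), "C": (230, 500), "D": (2, 10)}

raw = {name: m / n for name, (m, n) in players.items()}
a, b = fit_beta_prior(list(raw.values()))  # prior estimated from the data
estimates = {name: shrink(m, n, a, b) for name, (m, n) in players.items()}

for name in players:
    print(f"{name}: raw={raw[name]:.3f} shrunk={estimates[name]:.3f}")
```

Note that `a` and `b` are plugged in as if they were known exactly. That is the source of the understated uncertainty described above: the final estimates carry no allowance for the error in the fitted prior.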
|
gabefarkas
Joined: 31 Dec 2004 Posts: 1313 Location: Durham, NC
|
Posted: Fri Jun 19, 2009 7:42 am Post subject: |
|
|
Yup, that's it. I can't tell if you can get the full article or not, but if you can it's definitely worth reading. |
|
tpryan
Joined: 11 Feb 2005 Posts: 99
|
Posted: Sun Jun 21, 2009 4:16 am Post subject: |
|
|
Of course what Gabe was saying is that if many tests are performed, a "significant" result or two could be obtained due to chance alone. There is not a simple solution to that problem, in general, and adjusting alpha levels for individual tests can be too conservative.
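To put rough numbers on the chance-alone problem, here is a small sketch. It assumes the tests are independent and that every null hypothesis is true, which real adjusted +/- tests will not satisfy exactly; the function names are made up for illustration. The Bonferroni correction shown is just one standard way of adjusting alpha, not necessarily the one anyone in this thread had in mind.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one false positive in m independent tests,
    each run at level alpha, when every null hypothesis is true."""
    return 1 - (1 - alpha) ** m


def bonferroni_alpha(alpha, m):
    """Per-test level that keeps the familywise rate at or below alpha."""
    return alpha / m


for m in (1, 10, 20, 100):
    unadjusted = familywise_error_rate(0.05, m)
    adjusted = familywise_error_rate(bonferroni_alpha(0.05, m), m)
    print(f"m={m:3d}  unadjusted={unadjusted:.3f}  bonferroni={adjusted:.3f}")
```

With 20 tests the per-test threshold drops from 0.05 to 0.0025, which is why the adjustment can be too conservative: real effects of modest size stop registering as significant.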
I am not a Bayesian, nor an expert on it, but I tend to agree with Gelman regarding terminology. One starts with a reasonable prior, maybe even a noninformative prior, then combines that with data to obtain the posterior. Systems change over time, as George Box has emphasized, preferring to think of rapid change, so the posterior becomes the next prior, then posterior_2 is produced from more data, etc.
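The prior-to-posterior-to-next-prior cycle can be sketched in a few lines. This assumes a Beta prior on a make probability with binomial data, chosen only because the update is simple arithmetic; the batches are made up and `update` is a hypothetical helper. It does not model the system changing over time, which would need something like down-weighting older batches.

```python
def update(prior, makes, misses):
    """Beta(a, b) prior plus binomial data gives a Beta(a + makes, b + misses) posterior."""
    a, b = prior
    return (a + makes, b + misses)


belief = (1.0, 1.0)  # noninformative Beta(1, 1) starting prior
batches = [(4, 6), (7, 3), (5, 5)]  # made-up (makes, misses) per batch

for makes, misses in batches:
    # The posterior from this batch becomes the prior for the next one.
    belief = update(belief, makes, misses)
    a, b = belief
    print(f"posterior Beta({a}, {b}), mean = {a / (a + b):.3f}")
```

In this static setup, updating batch by batch gives the same answer as one update with all the data at once.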
"EBay". Clever. |
|
mtamada
Joined: 28 Jan 2005 Posts: 376
|
Posted: Mon Jul 20, 2009 3:47 pm Post subject: |
|
|
Back to the video-based data capture that started this thread: Sportsvision (the same company that brings you those yellow virtual first-down lines on TV football broadcasts, as well as baseball's Pitch F/X data) recently unveiled the next generation beyond Pitch F/X: tracking and timing of balls and players.
The prototype system is in place in San Francisco (I refuse to even attempt to keep up with the commercially-based name changes of ballparks and arenas; it was originally called Pac Bell Park). They recently had an all-day mini-conference in San Francisco to talk about the logistics and ins and outs of this technology. There was even a presentation about creating "heat maps" of Pitch F/X data (rather than scatterplots), which sounds similar to the colorful shot charts recently discussed here.
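The difference between a scatterplot and a heat map is essentially a binning step: instead of drawing every point, you count points per grid cell and color the cells. A minimal sketch, assuming locations arrive as (x, y) pairs in some length unit; the coordinates are made up and `bin_points` is a hypothetical helper, not anything from Sportsvision.

```python
from collections import Counter


def bin_points(points, cell):
    """Count points per square grid cell of side length `cell`."""
    counts = Counter()
    for x, y in points:
        counts[(int(x // cell), int(y // cell))] += 1
    return counts


# Made-up locations; each pair would be one plotted dot on a scatterplot.
points = [(0.5, 0.5), (0.9, 0.1), (1.5, 0.2), (2.2, 2.9)]
grid = bin_points(points, cell=1.0)

for cell_index, n in sorted(grid.items()):
    print(cell_index, n)
```

The per-cell counts (or per-cell averages of some outcome) are what a plotting library would then map to colors.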
Although the nature and flow of basketball games are very different from those in baseball, I hope that the NBA and MLB and their contractors are communicating and cooperating; this is all new stuff, and rather than independently re-inventing the wheel, I think all of the sports and technologists could probably learn a lot about new techniques and best practices from each other. The NBA's upcoming system has been described as being provided by STATS LLC, but I don't know if they're literally doing the hardware, technology, etc.; I think of them as being a data company rather than a technology company. Maybe STATS is already partnering with Sportsvision? Sportsvision's website says that they are the source of Hoops F/X data, evidently used by TV broadcasters. Did any NBA reps attend the Pitch F/X mini-conference (which evidently was open to anybody; all it lacked was publicity)?
Additional hopes for the future: whatever the NBA and STATS end up calling their 6-HD-camera setup ("Hoopsvision"?), I hope they make the data publicly available and organize conferences (or participate in existing ones such as Sloan, NESSIS, or NCSSORS). At NCSSORS, someone mentioned the reams of data that the NFL has -- but doesn't share. I think that's a mistake on the NFL's part, an example of 20th century thinking. Yes, it probably cost them millions of dollars to create and collect those data, but by only sharing it within the NFL (or licensing the data for a very high price) they limit the amount of research that can utilize the data. 21st century thinking would tell them to make the data freely available; there are literally hundreds if not thousands of fans and would-be analysts who would love nothing more than to jump on those data and start doing analysis -- all for free. If the NBA and STATS make their Hoopsvision data freely available, somewhere out there is the next Dean Oliver who'll make some revolutionary findings with the data. (Or come to think of it, the original DeanO is still around too!) |
|
HoopStudies
Joined: 30 Dec 2004 Posts: 705 Location: Near Philadelphia, PA
|
Posted: Mon Jul 20, 2009 4:45 pm Post subject: |
|
|
mtamada wrote: | ...If the NBA and STATS make their Hoopsvision data freely available, somewhere out there is the next Dean Oliver who'll make some revolutionary findings with the data. (Or come to think of it, the original DeanO is still around too!) |
And, yes, I know what to do with the data. Definitely a good challenge, bringing every ounce of PhD training I got. _________________ Dean Oliver
Author, Basketball on Paper
The postings are my own & don't necess represent positions, strategies or opinions of employers. |
|
Crow
Joined: 20 Jan 2009 Posts: 798
|
Posted: Mon Jul 20, 2009 8:39 pm Post subject: |
|
|
Detailed video translated into a multi-factor database would get at the situational FG%s of mid-range shots: open or degree of contest, along with time on the shot clock and perhaps catch-and-shoot versus off the dribble. That would aid the management / reduction of mid-range shots.
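As a rough illustration of what "situational FG%" would look like once such a database exists, here is a sketch that groups shot records by whichever factors you name. The record fields (`contested`, `shot_type`, `made`) and the data are invented for the example; a real video-derived feed would have its own schema.

```python
from collections import defaultdict


def fg_pct_by(shots, keys):
    """FG% for each combination of the given factor keys."""
    totals = defaultdict(lambda: [0, 0])  # situation -> [makes, attempts]
    for shot in shots:
        situation = tuple(shot[k] for k in keys)
        totals[situation][0] += int(shot["made"])
        totals[situation][1] += 1
    return {s: makes / att for s, (makes, att) in totals.items()}


# Made-up mid-range shot records.
shots = [
    {"contested": False, "shot_type": "catch_and_shoot", "made": True},
    {"contested": False, "shot_type": "catch_and_shoot", "made": True},
    {"contested": False, "shot_type": "catch_and_shoot", "made": False},
    {"contested": True, "shot_type": "catch_and_shoot", "made": False},
    {"contested": True, "shot_type": "off_dribble", "made": True},
    {"contested": True, "shot_type": "off_dribble", "made": False},
    {"contested": True, "shot_type": "off_dribble", "made": False},
    {"contested": False, "shot_type": "off_dribble", "made": True},
]

result = fg_pct_by(shots, ("contested", "shot_type"))
for situation, pct in sorted(result.items()):
    print(situation, f"{pct:.3f}")
```

With real data the cells get thin quickly as factors are added (shot clock bucket, defender distance, and so on), so the sample size behind each percentage matters as much as the percentage itself.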
Ideally you could use such a database to look at play sequences and try to find optimized sequences for your team vs. different team types / lineup mixes and defensive schemes (based perhaps largely on where you get a shot and its expected payoff instead of the actual one?), using the mid-range as a part of overall strategy, to the extent that you normally have to and not beyond that. In chess, the masters often think in what, 10- or 20-move sequences? Do the best NBA coaches?
And going beyond sequences, you could usefully examine plays and how the swirl of motion, and the player attributes in that motion with their potentialities, leads to more or less open and good shots. And then try to repeat the most successful plays and the critical pieces of plays precisely. If the cameras are fixed, you could compare a successful, pretty play to other real-game versions of it down to inches, or practice it until it sufficiently fits the pattern. |
|