APBRmetrics The statistical revolution will not be televised.
mdelp
Joined: 30 Oct 2008 Posts: 3 Location: edmonton
Posted: Wed Nov 19, 2008 3:34 pm Post subject: Data Mining Project Ideas/Advice?
Hello everyone. This is a great forum. I'm a Master's student in computer science and a lifelong basketball fan. I managed to get a professor to agree to let me do a data mining project in the basketball realm.
I'm wondering if anybody has any good project ideas. Some ideas I was thinking of follow:
I've read "A Starting Point for Analyzing Basketball Statistics", and the authors mention that one "unsolved" item in basketball is figuring out what it is about star players that makes them so effective and valuable even though their per-minute numbers and shooting efficiency may not be as good as those of some of the top shooters. Intuitively, this has to do with creating their own shots and getting a shot off before the shot clock runs out. A star player like Allen Iverson or Kobe Bryant can create his own shot, unlike a pinpoint shooter like Sasha Vujacic or Steve Kerr, who needs an open, assisted shot but can hit it with higher efficiency. Also, someone has to take a shot within 24 seconds. The ball is often put in the hands of the star player with a few seconds left on the clock: since everything else about the play has failed, he gives the team the best chance to score in the end. The idea of the project would be to formulate a hypothesis and test it (i.e. that star players take a higher percentage of unassisted shots, and make a higher percentage of their shots than others with only a few seconds left on the shot clock), although this could probably be verified or falsified easily by looking at some of the 82Games analysis.
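One way to sketch this check in Python, assuming a shot log with one row per field goal attempt. The row layout, the player labels, the 5-second cutoff and every number below are made up for illustration; none of it comes from a real data source.

```python
from collections import defaultdict

# Hypothetical shot log, invented only to exercise the code (not real data).
# Each row: (player, seconds_left_on_shot_clock, made, assisted).
# "assisted" is only meaningful on makes, as in standard play-by-play.
SHOTS = [
    ("star", 3, True, False),
    ("star", 2, False, False),
    ("star", 15, True, False),
    ("star", 4, True, True),
    ("shooter", 14, True, True),
    ("shooter", 12, True, True),
    ("shooter", 3, False, False),
    ("shooter", 10, True, False),
]

LATE_CLOCK_SECONDS = 5  # assumed cutoff for "a few seconds left"


def summarize(shots, late_cutoff=LATE_CLOCK_SECONDS):
    """Return {player: (unassisted share of makes, late-clock FG%)}."""
    totals = defaultdict(
        lambda: {"fgm": 0, "unassisted": 0, "late_fga": 0, "late_fgm": 0}
    )
    for player, seconds_left, made, assisted in shots:
        t = totals[player]
        if made:
            t["fgm"] += 1
            if not assisted:
                t["unassisted"] += 1
        if seconds_left <= late_cutoff:
            t["late_fga"] += 1
            t["late_fgm"] += int(made)
    summary = {}
    for player, t in totals.items():
        # None marks "no makes" or "no late-clock attempts" instead of dividing by zero.
        share = t["unassisted"] / t["fgm"] if t["fgm"] else None
        late_pct = t["late_fgm"] / t["late_fga"] if t["late_fga"] else None
        summary[player] = (share, late_pct)
    return summary
```

Calling `summarize(SHOTS)` gives one pair per player. With real play-by-play the same two numbers could be compared between a hand-picked list of stars and everyone else.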
So a better project would be to find other features that separate out the star players, and/or to correlate star players with up-and-coming players, to help predict which players will become stars. For example, do Rodney Stuckey's per-minute stats indicate he will be a star one day because they correlate with those of other stars?
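A minimal sketch of the "which star does this prospect resemble" idea, assuming each player is described by a few per-36-minute numbers. The three stats chosen, the player labels and the values are hypothetical; standardizing each stat and taking Euclidean distance is just one simple similarity measure among many.

```python
import math

# Hypothetical per-36-minute lines: (points, assists, free-throw attempts).
# Invented values for illustration only.
STARS = {
    "star_a": (25.0, 6.0, 8.0),
    "star_b": (18.0, 10.0, 4.0),
}
PROSPECT = (17.0, 9.0, 4.5)


def column_stats(rows):
    """Mean and population standard deviation of each stat column."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = []
    for col, mean in zip(columns, means):
        variance = sum((x - mean) ** 2 for x in col) / len(col)
        stds.append(math.sqrt(variance) or 1.0)  # avoid dividing by zero
    return means, stds


def most_similar(prospect, stars):
    """Name of the star closest to the prospect in standardized stat space."""
    means, stds = column_stats(list(stars.values()) + [prospect])

    def scale(row):
        return [(x - m) / s for x, m, s in zip(row, means, stds)]

    scaled_prospect = scale(prospect)
    return min(stars, key=lambda name: math.dist(scaled_prospect, scale(stars[name])))
```

Standardizing first keeps a high-volume stat like points from swamping the others. A real project would likely also adjust for age and minutes, which this sketch ignores.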
Another problem that seems big and still "open" is determining the best defensive players from statistics. I know there are defensive efficiency measures out there, but I also know that it is a difficult problem, since who a player plays with can really influence his defensive plus/minus numbers. I'd like to look at this problem, but I'm not sure how I can add to what is already out there.
If you like my ideas, can recommend where to get statistics or what data mining techniques to use, know of anybody who has built a freely available play-by-play parser, or have some other good ideas to work on, please reply!
|
MattB
Joined: 22 Jun 2006 Posts: 38 Location: Lowell
Posted: Wed Nov 19, 2008 4:06 pm
Something I have thought about from time to time is creating a central database and stat gathering app that would aggregate statistics from multiple sources...
Basically it would consist of a flexible interface that would allow for tracking very specific, customizable events. If I wanted to track shots contested or potential assists (basically anything more subjective or requiring human calculation), then I could use this interface to keep track of the player/time/position and so on.
This data would then get uploaded to a db and be a new data point that gets correlated to the game/team/player it's relevant to.
To make this a bit more feasible, it might be worth tracking only a few statistics (rather than opening it up to user-defined ones).
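A rough sketch of what the upload side could look like, using SQLite from the Python standard library. The table name, the columns and the event type strings are all assumptions made for illustration, not a proposed standard.

```python
import sqlite3


def open_tracker(path=":memory:"):
    """Open (or create) the event database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS tracked_event (
            id INTEGER PRIMARY KEY,
            game_id TEXT NOT NULL,
            team TEXT NOT NULL,
            player TEXT NOT NULL,
            period INTEGER NOT NULL,
            game_clock_seconds INTEGER NOT NULL,
            event_type TEXT NOT NULL,   -- e.g. 'contested_shot'
            tracker TEXT NOT NULL       -- who charted the event
        )
        """
    )
    return conn


def record_event(conn, game_id, team, player, period, clock, event_type, tracker):
    """Store one hand-charted event tied to a game, team and player."""
    conn.execute(
        "INSERT INTO tracked_event "
        "(game_id, team, player, period, game_clock_seconds, event_type, tracker) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (game_id, team, player, period, clock, event_type, tracker),
    )
    conn.commit()


def counts_by_player(conn, event_type):
    """Return {player: number of events of the given type}."""
    rows = conn.execute(
        "SELECT player, COUNT(*) FROM tracked_event "
        "WHERE event_type = ? GROUP BY player",
        (event_type,),
    )
    return dict(rows.fetchall())
```

Keeping the tracker's name on every row would make it possible to compare two people charting the same game, which matters for the more subjective events.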
I think one of the most frustrating things for someone who does this as a hobby is the lack of availability of some of the more subjective stats. The benefit of this system would be that the few of us who are willing to take the time to keep track of these things would also benefit from others doing the same thing. Basically, building a community of trackers.
I always thought it would be interesting anyway.
|
Mountain
Joined: 13 Mar 2007 Posts: 1527
Posted: Thu Nov 20, 2008 4:39 am
If you wanted to look at the shot creation question, it might be interesting to try to separate, or even just identify, normal shots and so-called "created" shots. Put shots into a table by time on shot clock, floor location, game clock and situation, and look at the FG%s.
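A small sketch of such a table, assuming each shot has already been tagged with seconds left on the shot clock and a floor zone. The three clock buckets and the zone names are arbitrary choices for illustration; game clock and score situation could be added as further key fields in the same way.

```python
from collections import defaultdict


def clock_bucket(seconds_left):
    """Assign a shot to a shot-clock bucket (boundaries are assumptions)."""
    if seconds_left >= 16:
        return "early (16-24s)"
    if seconds_left >= 6:
        return "middle (6-15s)"
    return "late (0-5s)"


def fg_table(shots):
    """shots: iterable of (seconds_left, zone, made).

    Returns {(clock bucket, zone): (makes, attempts, FG%)}.
    """
    cells = defaultdict(lambda: [0, 0])  # [makes, attempts]
    for seconds_left, zone, made in shots:
        cell = cells[(clock_bucket(seconds_left), zone)]
        cell[0] += int(made)
        cell[1] += 1
    return {
        key: (makes, attempts, makes / attempts)
        for key, (makes, attempts) in cells.items()
    }
```

Keeping the attempt count next to each FG% matters here, because many cells will be nearly empty once the shots are split this finely.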
If Kobe jacks up a 21-footer moving sideways, is it a "created" shot? Created, yes, literally, but is it a good or necessary created shot? Does it matter if it comes with 17 seconds on the shot clock or 3? I'd say yes. Does it matter if it is down by 6 or up by 6? Yes. Hand in face or clean look? Yes. Teammates in better spots with less pressure? Yes. Break Kobe's shots into 20, 50, even 100 segments (for a season or as many as you can get data for) and look at the patterns. If he is creating good created shots, exactly where, when and why is he doing so? Were his teammates cold or hot? Did he notice or care in his early years, in the championship years, after Shaq? Does he now?
Check him and some others.
If you wanted to work on defense, maybe correlate the sum of the overall counterpart defensive stats available at 82games with 5-man lineup performance. How well does the sum of average counterpart defensive stats match specific lineup defensive performance? Does a defender above or below average by x deviations cause an accelerated, non-linear impact?
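A minimal sketch of that correlation, assuming you already have one counterpart defensive number per player and an observed defensive rating per 5-man lineup. The input shapes are hypothetical, and plain Pearson correlation only measures the linear part of the relationship.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (spread_x * spread_y)


def lineup_correlation(player_rating, lineups):
    """Correlate summed individual ratings with observed lineup defense.

    player_rating: {player: counterpart defensive number} (hypothetical input)
    lineups: list of (tuple of five player names, observed lineup rating)
    """
    summed = [sum(player_rating[p] for p in players) for players, _ in lineups]
    observed = [rating for _, rating in lineups]
    return pearson(summed, observed)
```

To look at the non-linear question, one option would be to plot the residuals (observed minus a linear fit on the summed ratings) against the most extreme individual rating in each lineup and see whether they fan out.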
Or maybe assemble player-level allowed-shot charts of various kinds and then analyze them. These could be based on the counterpart defense assumption, or on the entire opponent with the player on the court compared to off, and compared against the league-average chart or against players who are above average, average or below average on counterpart data, adjusted +/- or team defensive rating. If a player looks above or below average by one or more of these, where more specifically is this lean coming from?
|
Harold Almonte
Joined: 04 Aug 2006 Posts: 616
Posted: Thu Nov 20, 2008 10:43 am
Creating a shot after receiving a pass is mainly a matter of an extra skill, ballhandling, and then of physical advantage. Creators judge this advantage over their matchups and make decisions. I think an individual attempt with a high success % is always better than lengthening the team's offensive possession, no matter the clock time, but studies tell us that extra passes increase FG%.
It would be interesting to look at Mountain's idea using scorers' FG% (or points per possession) on "unassisted driving FGAs only" vs. the rest of the team's overall ppp at different clock times, and see when extra passes are not a good option.
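A sketch of that comparison, assuming each possession has been tagged with the shot clock time at which it ended, whether it ended in an unassisted drive by the scorer, and the points it produced. The 8-second bucket width and the input layout are assumptions.

```python
from collections import defaultdict


def ppp_by_clock(possessions, bucket_size=8):
    """possessions: iterable of (seconds_left, is_scorer_drive, points).

    Returns {bucket: (scorer drive ppp, rest-of-team ppp)}, where with the
    default width bucket 0 is 0-7s left, 1 is 8-15s and 2 is 16-24s.
    Buckets missing either kind of possession are left out.
    """
    sums = defaultdict(lambda: {"drive": [0, 0], "rest": [0, 0]})
    for seconds_left, is_scorer_drive, points in possessions:
        bucket = min(seconds_left // bucket_size, 2)
        group = sums[bucket]["drive" if is_scorer_drive else "rest"]
        group[0] += points  # total points
        group[1] += 1       # possessions
    result = {}
    for bucket, groups in sums.items():
        drive_points, drive_n = groups["drive"]
        rest_points, rest_n = groups["rest"]
        if drive_n and rest_n:
            result[bucket] = (drive_points / drive_n, rest_points / rest_n)
    return result
```

Buckets where the first number is higher than the second would be the clock times at which the scorer's own drive beats what the rest of the offense produces.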
|
thref23
Joined: 13 Aug 2007 Posts: 90
Posted: Thu Nov 20, 2008 12:28 pm
MattB wrote: | Something I have thought about from time to time is creating a central database and stat gathering app that would aggregate statistics from multiple sources...
I agree with this. I have thought a lot about the same thing, although I lack the computer programming know-how, and possibly the finances and time, to make it happen.
I don't think the focus should be solely on subjective stats (like potential assists, as you mention), but also on detailed play-by-play data (so when all is said and done you can click a button and view stats on possessions when a specific player dribbles between his legs near the right baseline, for example).
If an application is simple and user-friendly enough, maybe even able to run on a Pocket PC, it would likely be easy enough to find volunteers (and you could pay $XX per game). All anybody would need is a TV and a DVR/TiVo (or some other means of recording), and they get paid to pay close attention to a basketball game. The idea wouldn’t be solely to track NBA stats, but also college/NBA-DL/Euro stats. It is a hefty project, especially considering the quality control needed to make it real legit, but it would have enough value to recoup any investment required. nbaplaycharting.com seems to be initiating something similar, but I don’t know how in depth they are planning to get.
Anyways, re: subjective stats, other than what is proposed in this thread already… I like the idea of scoring assists subjectively as they are scored in hockey, i.e. there can be more than one assist per basket. I like refining rebounding percentages based on a player’s distance to the rim relative to teammates, so a rebound only counts as an opportunity (or perhaps a semi-opportunity) when it can actually be considered one, and I like keeping track of who appears to be really responsible for a turnover. I like tracking altered (but not blocked) shots (which would involve a lot of subjectivity), and keeping garbage-time / desperation shots separate from other shot attempts. I also like the idea of noting whether a shot is taken with teammates under the rim in position to come away with the rebound or without, and whether a player is being single- or double-teamed or what not. And of course, it is worth tracking good fouls versus bad fouls versus bad calls.
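The bookkeeping for hockey-style assists could be sketched as below, assuming a tracker has already recorded, for each made basket, the passers who contributed, in order. Which passes deserve credit stays a human judgment. The two-assist cap and the rule of skipping the scorer's own earlier passes are assumptions borrowed loosely from hockey scoring.

```python
from collections import Counter


def credit_assists(made_baskets, max_assists=2):
    """made_baskets: list of (scorer, [passers in order, most recent last]).

    Returns a Counter keyed by (passer, "primary" or "secondary").
    """
    credits = Counter()
    for scorer, passers in made_baskets:
        credited = []
        # Walk back from the most recent pass to find distinct helpers.
        for passer in reversed(passers):
            if passer != scorer and passer not in credited:
                credited.append(passer)
            if len(credited) == max_assists:
                break
        for rank, passer in enumerate(credited):
            kind = "primary" if rank == 0 else "secondary"
            credits[(passer, kind)] += 1
    return credits
```

Primary and secondary assists are counted separately so they can be weighted differently later if the pass before the pass turns out to be worth less.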