Center Comparison Chart (and K-Means Clustering)

January 19, 2011
Last week, I unveiled a Google Motion Chart that included a large number of advanced stats comparing point guards. This week, we’ll start at the other end: centers. I am actually including players classified as either C or PF/C by BasketballValue, which is where I got the position information.

Most people feel that the position of center is changing, morphing into something different from what it once was. The presence of numerous “centers” who hang around on the perimeter shooting 3’s is an indicator of this phenomenon. Still, there is a defined way a center plays, and to define it, let’s turn to the lovely tool known as K-Means Clustering.

Tom Haberstroh had a series at Hardwood Paroxysm discussing the positional revolution. I commented at the time that a K-Means clustering analysis may be the way to attack the problem of what “positions” there really are in the NBA. Of course, the positional spectrum is really a continuum, but since there are 5 players on the floor, perhaps the 5 most common “roles” could be discerned.

Well, thanks to Dr. Wagner Kamakura at Duke, there is a free K-Means clustering plugin for Excel. I compiled 4 years’ worth of Hoopdata statistics (the shot location data is critical for this analysis) and set to work. I’m not exactly sure what I found. I explored all sorts of clustering options: 5, 6, 7, 9, and 10 clusters; weighted clusters; different sets of statistics.
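For readers who want to try this outside Excel, here is a minimal sketch of weighted K-Means in plain Python. It is not the Excel plugin's method: the farthest-first starting points, the z-score standardization, the three statistics, and all the numbers are assumptions made up for illustration. The per-statistic weights simply scale each statistic's contribution to the distance.

```python
def weighted_sq_dist(a, b, weights):
    """Squared Euclidean distance with a per-statistic weight."""
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights))

def standardize(rows):
    """Convert each column to z-scores so no statistic dominates by scale."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def farthest_first(rows, k, weights):
    """Deterministic starting centroids (an assumption, not the plugin's rule)."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        nxt = max(rows, key=lambda r: min(weighted_sq_dist(r, c, weights)
                                          for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(rows, k, weights, iters=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = farthest_first(rows, k, weights)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k),
                          key=lambda j: weighted_sq_dist(r, centroids[j], weights))
                      for r in rows]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

# Hypothetical player-seasons: [AST%, share of shots from 3, share of shots at rim]
raw = [
    [35.0, 0.30, 0.20],
    [32.0, 0.35, 0.18],
    [38.0, 0.28, 0.22],
    [5.0, 0.00, 0.70],
    [6.0, 0.02, 0.65],
    [4.0, 0.01, 0.72],
]
weights = [2.0, 1.0, 1.0]  # hypothetical "Wt" values, one per statistic
labels, centroids = kmeans(standardize(raw), k=2, weights=weights)
print(labels)
```

Changing `k` or the weights and rerunning is the code equivalent of the different clustering options described above.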

Here is a taste of the results: a table from a 5-cluster run, showing the archetype of each cluster. The “Wt” column at the left shows the weight I put on each statistic.

5 K-Means Offensive Clusters

Cool, huh? I don’t know what it means, either. I never thought Jason Kidd and Troy Murphy played the same position.
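The post does not say how each cluster's archetype row was chosen. One common convention, sketched below purely as an assumption, is to take the cluster member closest to the cluster mean. The player names, the two standardized statistics, and the labels are all hypothetical.

```python
def centroid(points):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(points) for col in zip(*points)]

def archetypes(names, rows, labels):
    """Map each cluster label to the member nearest that cluster's mean."""
    result = {}
    for lab in sorted(set(labels)):
        members = [(n, r) for n, r, l in zip(names, rows, labels) if l == lab]
        center = centroid([r for _, r in members])
        nearest = min(members,
                      key=lambda m: sum((x - c) ** 2 for x, c in zip(m[1], center)))
        result[lab] = nearest[0]
    return result

# Hypothetical standardized stats and cluster labels
names = ["Player A", "Player B", "Player C", "Player D", "Player E", "Player F"]
rows = [
    [1.0, 0.9], [0.8, 1.3], [1.2, 0.8],        # cluster 0
    [-1.0, -1.1], [-0.9, -0.8], [-1.1, -1.1],  # cluster 1
]
labels = [0, 0, 0, 1, 1, 1]
picked = archetypes(names, rows, labels)
print(picked)
```

A nearest-to-mean archetype is a real player rather than an average, which makes a cluster easier to name.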

What is obvious, though, is that there is a well-defined “center” position. The center has extreme values in nearly every statistical category, from AST% to shot location and FG%. And Tim Duncan and Kevin Garnett don’t play center, at least on the offensive end–but David Lee does, or should I say did. Lee was a “Center” only in 2007 but has since switched to cluster 3, “Post/PF”.

Well, to get back to the Center Comparison Chart. To simplify matters, I just used BasketballValue’s positions, so as not to get stuck in the positionality swamp. May I present 61 “Centers or Center/Power Forwards”. Three of the clusters from above are represented.

Interesting points:

  • Centers are pretty average on offense a lot of the time: most of their baskets are created by others, and they don’t create many themselves. No one would dispute that Kevin Love is a really good offensive player, though.
  • Dwight Howard can play some D.
  • Bargnani is a Center/PF?  He doesn’t even get 10% of available rebounds!
  • How can the same player be the best rebounder and the worst shot-blocker?
  • Tyson Chandler is efficient on offense this year, in that he scores when he shoots. He just isn’t very good at offense otherwise!
  • OKC has 3 “centers” but… ouch.
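For anyone wondering what “10% of available rebounds” means in practice, here is a small sketch of total rebound percentage using the formula from the Basketball Reference glossary. The function name and the box-score numbers are hypothetical, chosen only to show the arithmetic.

```python
def trb_pct(trb, mp, team_mp, team_trb, opp_trb):
    """Estimated share of available rebounds a player grabbed while on the floor.

    TRB% = 100 * (TRB * (Tm MP / 5)) / (MP * (Tm TRB + Opp TRB))
    """
    return 100.0 * (trb * (team_mp / 5)) / (mp * (team_trb + opp_trb))

# Hypothetical season line: 300 rebounds in 2000 minutes, on a team that
# played 19680 total player-minutes, with 3400 rebounds for each side.
example = trb_pct(trb=300, mp=2000, team_mp=19680, team_trb=3400, opp_trb=3400)
print(round(example, 2))
```

With five players on each side sharing the floor, an evenly split share would be about 10%, which is why a figure below that stands out for a big man.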

Glossary Table:

| # | Label | Meaning | More Information |
|---|-------|---------|------------------|
| 1 | Player | Player name | |
| 2 | Age | Age of player | |
| 3 | Tm | Team of player | |
| 4 | TS% | True Shooting % | see Basketball Reference Glossary |
| 5 | eFG% | Effective Field Goal % | see Basketball Reference Glossary |
| 6 | ORB% | Offensive Rebounding % | see Basketball Reference Glossary |
| 7 | DRB% | Defensive Rebounding % | see Basketball Reference Glossary |
| 8 | TRB% | Total Rebounding % | see Basketball Reference Glossary |
| 9 | AST% | Assist % | see Basketball Reference Glossary |
| 10 | STL% | Steal % | see Basketball Reference Glossary |
| 11 | BLK% | Block % | see Basketball Reference Glossary |
| 12 | TOV% | Turnover % | see Basketball Reference Glossary |
| 13 | USG% | Usage % | see Basketball Reference Glossary |
| 14 | PER | PER | see Basketball Reference Glossary |
| 15 | ORtg | Offensive Rating | see Basketball Reference Glossary |
| 16 | DRtg | Defensive Rating | see Basketball Reference Glossary |
| 17 | OWS | Offensive Win Shares | see Basketball Reference Glossary |
| 18 | DWS | Defensive Win Shares | see Basketball Reference Glossary |
| 19 | WS | Win Shares | see Basketball Reference Glossary |
| 20 | WS/48 | Win Shares/48 minutes | see Basketball Reference Glossary |
| 21 | ASPM | Advanced Statistical Plus/Minus | |
| 22 | O ASPM | Offensive Advanced Statistical Plus/Minus | |
| 23 | D ASPM | Defensive Advanced Statistical Plus/Minus | |
| 24 | OVORP | Offensive Value over Replacement Player | |
| 25 | DVORP | Defensive Value over Replacement Player | |
| 26 | VORP | Value over Replacement Player | |
| 27 | VORP-GM | Value over Replacement Player, in games played | |
| 28 | MPG | Minutes per Game | |
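As a quick illustration of the two shooting-efficiency entries in the glossary, here is a sketch of TS% and eFG% using the standard Basketball Reference definitions. The function names and the single-game stat line are hypothetical.

```python
def true_shooting(pts, fga, fta):
    """TS% = PTS / (2 * (FGA + 0.44 * FTA)), returned as a fraction."""
    return pts / (2 * (fga + 0.44 * fta))

def effective_fg(fg, fg3, fga):
    """eFG% = (FG + 0.5 * 3P) / FGA, returned as a fraction."""
    return (fg + 0.5 * fg3) / fga

# Hypothetical line: 7-of-15 from the field (2 threes), 4-of-5 at the line, 20 points
ts = true_shooting(pts=20, fga=15, fta=5)
efg = effective_fg(fg=7, fg3=2, fga=15)
print(round(ts, 3), round(efg, 3))
```

TS% folds free throws into the denominator, while eFG% only credits the extra point on made threes, so the two can rank the same player quite differently.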

8 Responses to Center Comparison Chart (and K-Means Clustering)

  1. Crow on January 19, 2011 at 7:56 pm

    To be clear, what is “AWS/48” mentioned in the chart? WS48 or something else?

    What are the distributions of traditional position labels assigned to players for each cluster?

    • DanielM on January 19, 2011 at 11:04 pm

      Alternative Win Score per 48 minutes. It’s a Hoopdata stat.

    • DanielM on January 20, 2011 at 3:33 pm

      I’ll explore clustering more some other time, and how it relates to “traditional” positions.

  2. AC on January 19, 2011 at 11:10 pm

    I’m not sure, and I can’t even find it on the chart. Tough game today, but it’s been even tougher watching the DT comments devolve into stupid arguments and extremism. I don’t know about you guys (Crow and Daniel), but it’s getting pretty overwhelming at times, and I worry one of my favorite places on the internet is fast declining.

    • AC on January 19, 2011 at 11:11 pm

      oh see it now…

    • DanielM on January 20, 2011 at 6:03 am

      AWS/48 was used as a generic all-in-one stat for the K-Means clustering “example players”.

      I avoid Daily Thunder during games. And yes, the quality of commenting has declined. Royce has hinted at some sort of additional moderation of the comments, but it’s really hard to deal with negativity and backbiting.

  3. DanielM on January 20, 2011 at 3:32 pm

    Anybody think I should add contract status/contract value to the Google motion charts?

  4. Crow on January 21, 2011 at 10:13 am

    More options are usually better.

    If you found an Adjusted +/- dataset you were comfortable adding, that would be another good option to have.

    I thought it was probably Alternative Win Score per 48 minutes, but I prefer to be sure, and not everyone else would necessarily know it.

To-Do List

  1. Salary and contract value discussions and charts
  2. Multi-year APM/RAPM with aging incorporated
  3. Revise ASPM based on multi-year RAPM with aging
  4. ASPM within-year stability/cross validation
  5. Historical ASPM Tableau visualizations
  6. Create Excel VBA recursive web scraping tutorial
  7. Comparison of residual exponents for rankings
  8. Comparison of various "value metrics" ability to "explain" wins
  9. Publication of spreadsheets used
  10. Work on using Bayesian priors in Adjusted +/-
  11. Work on K-Means clustering for player categorization
  12. Learn ridge regression
  13. Temporally locally-weighted rankings
  14. WOWY as validation of replacement level
  15. Revise ASPM with latest RAPM data
  16. Conversion of ASPM to "wins"
  17. Lineup Bayesian APM
  18. Lineup RAPM
  19. Learn SQL