That’s right, golf. I’m taking up where Ken Pomeroy left off. A year or two ago, he developed a rating system for golfers–basically, he created a huge regression of all players and all specific rounds at tournaments. Each round was assigned a level of difficulty, and each player was assigned an overall rating. His numbers, prior to the PGA Championship in 2009, are on his website, including odds for each player to win.
I’m attempting to both continue the effort and take it a step further. I’m creating a Bayesian rating system that best projects out-of-sample (future) performance.
To do this, I compiled all tournaments on the European and PGA tours for this year and the previous 2 years. I didn’t grab any other tour’s data, since it was a little harder to get a hold of.
Next, I counted the unknowns: ~2000 player ratings plus ~800 round difficulties (~200 tournaments with ~4 rounds each). That’s 2800 unknowns right off the bat! Ouch.
So I simplified. I chose a subset of “baseline golfers” who played a bunch of rounds in the last 2+ years, across both tours. I constrained the ratings of these ~80 golfers to sum to 0, which sets the baseline for each course. I then took the 800 tournament rounds, split them into 7 chunks, and solved for each chunk and the 80 golfers simultaneously. The 80 golfers could vary amongst themselves, but they had to sum to 0, and the tournament round difficulties were estimated against them. As a result, some rounds were assigned a difficulty of 73, others 68.
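Here is a minimal sketch of the sum-to-zero idea, assuming each score is modeled as round difficulty plus golfer skill. It swaps the regression for a simple alternating-averages fit, and the names `fit_baseline`, `scores`, `skill` and `difficulty` are made up for illustration.

```python
def fit_baseline(scores, n_iter=200):
    """Fit score = difficulty[round] + skill[golfer], with skills summing to 0.

    scores: dict mapping (golfer, round_id) -> strokes for that round.
    Alternating averages are used here as a stand-in for a full regression.
    """
    golfers = sorted({g for g, _ in scores})
    rounds = sorted({r for _, r in scores})
    skill = {g: 0.0 for g in golfers}
    difficulty = {r: 0.0 for r in rounds}
    for _ in range(n_iter):
        # Round difficulty: average score after removing each golfer's skill.
        for r in rounds:
            vals = [s - skill[g] for (g, rr), s in scores.items() if rr == r]
            difficulty[r] = sum(vals) / len(vals)
        # Golfer skill: average score relative to the round difficulty.
        for g in golfers:
            vals = [s - difficulty[r] for (gg, r), s in scores.items() if gg == g]
            skill[g] = sum(vals) / len(vals)
        # Enforce the sum-to-zero constraint; difficulties absorb the shift,
        # so the fitted scores are unchanged.
        shift = sum(skill.values()) / len(golfers)
        for g in golfers:
            skill[g] -= shift
        for r in rounds:
            difficulty[r] += shift
    return skill, difficulty


# Toy data: three baseline golfers, two rounds, one missing entry.
toy = {("A", "r1"): 72, ("A", "r2"): 67,
       ("B", "r1"): 73, ("B", "r2"): 68,
       ("C", "r1"): 74}
skill, difficulty = fit_baseline(toy)
```

On the toy data the skills come out near -1, 0 and +1, and the two rounds near 73 and 68, so a golfer who skipped a round still gets rated against the same baseline.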
Once I had set a difficulty level for each round in the past 2+ years, it was time to get Bayesian. I didn’t do it explicitly like I have previously. I gave a weight parameter, slightly less than 1, and weighted the results of each round for each player by the (weight parameter)^n, where n is the number of weeks since that tournament. I then added in a regression toward a fixed value (A), with a weight (R). All ready, then!
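As a rough sketch, the weighting described above amounts to a weighted average in which the regression term acts like R extra rounds scored at A. The function name and the default values below are placeholders for illustration, not the fitted parameters.

```python
def bayesian_rating(results, decay=0.975, prior_mean=2.5, prior_weight=6.5):
    """Recency-weighted rating, regressed toward a fixed value.

    results: list of (weeks_ago, strokes_vs_round_difficulty), one per round.
    decay, prior_mean (A) and prior_weight (R) are illustrative defaults.
    """
    # Start with the regression term: R pseudo-rounds at value A.
    num = prior_weight * prior_mean
    den = prior_weight
    for weeks_ago, rel_score in results:
        w = decay ** weeks_ago  # older rounds count for less
        num += w * rel_score
        den += w
    return num / den


# A player with only a few recent rounds stays close to the prior...
few = bayesian_rating([(1, -1.0), (1, -2.0), (2, 0.0)])
# ...while 200 rounds at the same level pull the rating much closer to -1.
many = bayesian_rating([(w // 2, -1.0) for w in range(200)])
```

This is why the regression term matters most for players with few rounds: with 200 rounds on record, the 6 or so pseudo-rounds barely move the average.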
I took the players who have played more than 140 rounds in the past 2+ years, and minimized the squared prediction error for each round in their past 10 tournaments played. This gave me the weight parameter, the fixed value (A), and its weight (R).
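One way to picture the fitting step is a brute-force search over candidate parameter values, keeping whichever combination gives the lowest total squared error on rounds the prediction never saw. This is only a sketch: the grid search stands in for whatever optimizer was actually used, and `predict`, `fit_parameters` and the sample layout are assumptions.

```python
from itertools import product


def predict(history, decay, prior_mean, prior_weight):
    """Weighted average of past rounds plus prior_weight rounds at prior_mean."""
    num, den = prior_weight * prior_mean, prior_weight
    for weeks_before, rel_score in history:
        w = decay ** weeks_before
        num += w * rel_score
        den += w
    return num / den


def fit_parameters(samples, decays, prior_means, prior_weights):
    """Grid search for the parameters with the lowest sum of squared errors.

    samples: list of (history, actual) pairs, where history contains only
    rounds played BEFORE the round being predicted (out of sample).
    prior_weights should be positive so an empty history is still defined.
    Returns (sse, decay, prior_mean, prior_weight).
    """
    best = None
    for d, a, r in product(decays, prior_means, prior_weights):
        sse = sum((predict(h, d, a, r) - actual) ** 2 for h, actual in samples)
        if best is None or sse < best[0]:
            best = (sse, d, a, r)
    return best


# Tiny example: one prior round at 0.0, and the next round came in at 1.0.
best = fit_parameters([([(1, 0.0)], 1.0)],
                      decays=[1.0], prior_means=[2.0],
                      prior_weights=[0.5, 1.0, 2.0])
```

The important part is that each `history` excludes the round being predicted; otherwise the search would simply reward fitting the past.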
To predict The Masters, I had to find out how much the players varied from round to round. So I calculated sqrt(average(predictionerror^2)) for the last 15 tournaments for each player. For the players with the most data, this averaged 2.78. I then regressed each player’s standard deviation toward that mean of 2.78, to get a true estimate of the standard deviation going forward.
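A sketch of that calculation might look like the following. The post doesn't spell out how the regression toward 2.78 is done, so this assumes the prior is blended in as a fixed number of pseudo-rounds on the variance scale; `prior_rounds=60` is an invented value.

```python
import math


def rmse(errors):
    """sqrt(average(prediction_error^2)) over a player's rounds."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def shrunk_stdev(errors, prior_sd=2.78, prior_rounds=60):
    """Regress a player's per-round standard deviation toward prior_sd.

    The prior counts as prior_rounds pseudo-rounds with variance prior_sd^2.
    prior_rounds is a made-up value, not one taken from the post.
    """
    n = len(errors)
    sample_var = sum(e * e for e in errors) / n if n else 0.0
    var = (n * sample_var + prior_rounds * prior_sd ** 2) / (n + prior_rounds)
    return math.sqrt(var)


# A player with no usable data just gets the prior, 2.78.
no_data = shrunk_stdev([])
# A streaky player (raw RMSE of 4.0 over 60 rounds) is pulled partway back.
streaky = shrunk_stdev([4.0, -4.0] * 30)
```

With this kind of blend, a player with little or no data lands at or near 2.78, while a player with lots of rounds keeps most of his own spread.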
Well then. That’s about it! We’ve got a Bayesian prediction for the next tournament, and a per-round standard deviation. Perfect for a Monte Carlo!
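For anyone curious what such a simulation could look like, here is a stripped-down sketch. It assumes every round is an independent normal draw around the player's rating, ignores the cut, and treats the lowest four-round total as the winner; `simulate_wins` and its arguments are hypothetical names.

```python
import random


def simulate_wins(players, n_sims=20000, n_rounds=4, seed=0):
    """Estimate each player's win probability by simulating tournaments.

    players: dict mapping name -> (rating, per_round_stdev), in strokes
    relative to the baseline (lower is better).
    Simplifications: independent normal rounds, no cut, no playoff.
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in players}
    for _ in range(n_sims):
        best_name, best_total = None, float("inf")
        for name, (rating, sd) in players.items():
            total = sum(rng.gauss(rating, sd) for _ in range(n_rounds))
            if total < best_total:  # lowest total strokes wins
                best_name, best_total = name, total
        wins[best_name] += 1
    return {name: w / n_sims for name, w in wins.items()}


# Two-player toy field using a rating/stdev pair from each end of the scale.
odds = simulate_wins({"strong": (-1.45, 2.73), "weak": (2.15, 2.78)},
                     n_sims=2000)
```

In a big field, a higher standard deviation helps a player's chance of winning outright even when his rating is slightly worse, because the winner needs an unusually good week.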
First, the ratings themselves, for the top 100 players of the US/Euro tours:
Rank | Player | Total Rounds | PGA Rounds | Euro Rounds | Bayesian Rating | Average Rating | Stdev |
---|---|---|---|---|---|---|---|
1 | Martin Kaymer | 188 | 74 | 114 | -1.45 | -1.82 | 2.73 |
2 | Graeme McDowell | 203 | 80 | 123 | -1.40 | -1.60 | 2.82 |
3 | Charl Schwartzel | 233 | 68 | 165 | -1.40 | -1.68 | 2.92 |
4 | Lee Westwood | 186 | 76 | 110 | -1.39 | -1.98 | 2.79 |
5 | Matt Kuchar | 205 | 201 | 4 | -1.32 | -1.41 | 2.47 |
6 | Francesco Molinari | 224 | 52 | 172 | -1.31 | -1.71 | 3.00 |
7 | Luke Donald | 202 | 158 | 44 | -1.21 | -1.44 | 2.70 |
8 | Steve Stricker | 175 | 171 | 4 | -1.18 | -1.77 | 2.70 |
9 | Nick Watney | 206 | 194 | 12 | -1.18 | -1.32 | 2.95 |
10 | Rory McIlroy | 208 | 98 | 110 | -1.17 | -1.68 | 2.89 |
11 | Paul Casey | 166 | 106 | 60 | -1.11 | -1.64 | 2.50 |
12 | Phil Mickelson | 193 | 167 | 26 | -1.08 | -1.39 | 2.82 |
13 | Tiger Woods | 135 | 119 | 16 | -1.04 | -2.20 | 2.84 |
14 | Louis Oosthuizen | 197 | 40 | 157 | -1.01 | -1.34 | 2.96 |
15 | Retief Goosen | 236 | 164 | 72 | -0.98 | -1.39 | 2.77 |
16 | Raphael Jacquelin | 232 | 6 | 226 | -0.91 | -0.98 | 2.56 |
17 | Thomas Aiken | 196 | 14 | 182 | -0.90 | -1.10 | 2.63 |
18 | Anders Hansen | 196 | 24 | 172 | -0.88 | -1.19 | 2.61 |
19 | Peter Hanson | 202 | 48 | 154 | -0.84 | -1.37 | 2.87 |
20 | Justin Rose | 210 | 176 | 34 | -0.82 | -0.98 | 2.69 |
21 | Dustin Johnson | 190 | 186 | 4 | -0.81 | -1.13 | 2.86 |
22 | Hunter Mahan | 209 | 205 | 4 | -0.81 | -1.13 | 2.69 |
23 | Richard Green | 164 | 6 | 158 | -0.80 | -1.29 | 2.78 |
24 | Joost Luiten | 145 | 0 | 145 | -0.77 | -1.03 | 2.75 |
25 | Alvaro Quiros | 210 | 54 | 156 | -0.75 | -1.08 | 2.75 |
26 | Jamie Donaldson | 209 | 0 | 209 | -0.72 | -1.01 | 2.83 |
27 | Edoardo Molinari | 145 | 38 | 107 | -0.71 | -1.24 | 2.75 |
28 | Padraig Harrington | 202 | 139 | 63 | -0.71 | -1.17 | 2.86 |
29 | Stephen Gallacher | 163 | 8 | 155 | -0.70 | -0.83 | 2.60 |
30 | Robert Allenby | 192 | 164 | 28 | -0.70 | -1.08 | 2.53 |
31 | Ernie Els | 237 | 164 | 73 | -0.70 | -1.25 | 2.76 |
32 | Miguel Angel Jimenez | 226 | 50 | 176 | -0.69 | -1.12 | 2.93 |
33 | Ian Poulter | 188 | 118 | 70 | -0.69 | -1.15 | 2.80 |
34 | Anthony Wall | 204 | 6 | 198 | -0.69 | -1.05 | 2.41 |
35 | Jim Furyk | 182 | 182 | 0 | -0.69 | -1.32 | 2.57 |
36 | Rickie Fowler | 150 | 144 | 6 | -0.68 | -0.79 | 2.72 |
37 | Tim Clark | 197 | 175 | 22 | -0.65 | -1.12 | 2.47 |
38 | David Lynn | 197 | 0 | 197 | -0.65 | -0.93 | 2.83 |
39 | Chris Wood | 195 | 16 | 179 | -0.63 | -1.02 | 2.62 |
40 | Martin Laird | 206 | 190 | 16 | -0.62 | -0.36 | 3.04 |
41 | Charles Howell III | 234 | 234 | 0 | -0.62 | -0.72 | 2.75 |
42 | Robert-Jan Derksen | 207 | 0 | 207 | -0.61 | -1.01 | 2.40 |
43 | Jean-Baptiste Gonnet | 194 | 0 | 194 | -0.61 | -0.64 | 2.94 |
44 | Spencer Levin | 225 | 225 | 0 | -0.59 | -0.55 | 2.88 |
45 | Bill Haas | 214 | 210 | 4 | -0.59 | -0.68 | 2.82 |
46 | Gregory Bourdy | 230 | 8 | 222 | -0.58 | -0.76 | 2.27 |
47 | Ben Crane | 194 | 190 | 4 | -0.57 | -0.89 | 2.72 |
48 | Ross Fisher | 187 | 70 | 117 | -0.57 | -0.98 | 2.86 |
49 | Rafael Cabrera-Bello | 227 | 4 | 223 | -0.57 | -0.72 | 2.66 |
50 | David Toms | 204 | 204 | 0 | -0.56 | -0.92 | 2.64 |
51 | Kevin Na | 211 | 211 | 0 | -0.56 | -0.90 | 2.52 |
52 | Sergio Garcia | 188 | 118 | 70 | -0.55 | -1.00 | 2.81 |
53 | Thongchai Jaidee | 207 | 36 | 171 | -0.55 | -1.09 | 2.75 |
54 | Vijay Singh | 171 | 169 | 2 | -0.54 | -0.65 | 2.63 |
55 | Zach Johnson | 200 | 200 | 0 | -0.54 | -1.06 | 2.39 |
56 | K.J. Choi | 189 | 171 | 18 | -0.53 | -0.79 | 2.60 |
57 | Matteo Manassero | 96 | 14 | 82 | -0.53 | -0.97 | 3.02 |
58 | Rory Sabbatini | 223 | 193 | 30 | -0.52 | -0.61 | 3.02 |
59 | Stewart Cink | 178 | 174 | 4 | -0.52 | -0.68 | 2.56 |
60 | Bo Van Pelt | 220 | 220 | 0 | -0.51 | -0.80 | 2.56 |
61 | Darren Clarke | 211 | 30 | 181 | -0.51 | -0.72 | 2.95 |
62 | Bubba Watson | 174 | 174 | 0 | -0.51 | -0.83 | 2.82 |
63 | Thomas Bjorn | 168 | 6 | 162 | -0.50 | -0.70 | 2.81 |
64 | Robert Dinwiddie | 139 | 0 | 139 | -0.50 | -0.63 | 2.78 |
65 | J.B. Holmes | 203 | 201 | 2 | -0.49 | -0.51 | 2.84 |
66 | Robert Karlsson | 151 | 64 | 87 | -0.48 | -0.90 | 2.99 |
67 | Peter Lawrie | 215 | 0 | 215 | -0.47 | -0.97 | 2.84 |
68 | Brendon de Jonge | 239 | 239 | 0 | -0.47 | -0.43 | 2.86 |
69 | Gary Woodland | 100 | 100 | 0 | -0.46 | 0.02 | 2.93 |
70 | Johan Edfors | 195 | 10 | 185 | -0.46 | -0.86 | 3.00 |
71 | Bradley Dredge | 207 | 4 | 203 | -0.46 | -0.90 | 2.85 |
72 | Ryan Moore | 190 | 186 | 4 | -0.46 | -0.71 | 2.98 |
73 | Ignacio Garrido | 229 | 4 | 225 | -0.46 | -0.89 | 2.86 |
74 | John Senden | 240 | 234 | 6 | -0.46 | -0.74 | 2.69 |
75 | Simon Dyson | 222 | 26 | 196 | -0.46 | -1.01 | 2.83 |
76 | Damien McGrane | 245 | 2 | 243 | -0.45 | -0.91 | 2.92 |
77 | J.J. Henry | 220 | 220 | 0 | -0.44 | -0.51 | 2.74 |
78 | Brandt Snedeker | 195 | 193 | 2 | -0.42 | -0.65 | 2.88 |
79 | Gonzalo Fernandez-Casta | 198 | 28 | 170 | -0.42 | -0.93 | 2.90 |
80 | Paul Lawrie | 181 | 6 | 175 | -0.42 | -0.76 | 2.68 |
81 | Steve Marino | 205 | 199 | 6 | -0.42 | -0.77 | 2.82 |
82 | Jaco Van Zyl | 46 | 0 | 46 | -0.42 | -1.32 | 2.78 |
83 | Brian Gay | 229 | 221 | 8 | -0.42 | -0.58 | 2.49 |
84 | James Kingston | 187 | 12 | 175 | -0.41 | -0.60 | 2.99 |
85 | Soren Hansen | 222 | 42 | 180 | -0.41 | -1.00 | 2.31 |
86 | Geoff Ogilvy | 181 | 151 | 30 | -0.41 | -0.99 | 2.68 |
87 | Nicolas Colsaerts | 102 | 0 | 102 | -0.40 | -0.83 | 2.78 |
88 | Gareth Maybin | 216 | 6 | 210 | -0.40 | -0.92 | 3.04 |
89 | Jonathan Byrd | 194 | 194 | 0 | -0.39 | -0.59 | 2.86 |
90 | Adam Scott | 172 | 136 | 36 | -0.38 | -0.67 | 2.88 |
91 | Aaron Baddeley | 195 | 183 | 12 | -0.38 | -0.41 | 2.90 |
92 | Fredrik Jacobson | 192 | 192 | 0 | -0.38 | -0.55 | 2.69 |
93 | Jerry Kelly | 211 | 207 | 4 | -0.36 | -0.57 | 2.72 |
94 | Soren Kjeldsen | 211 | 42 | 169 | -0.36 | -0.89 | 2.68 |
95 | Scott Verplank | 172 | 172 | 0 | -0.33 | -0.64 | 2.83 |
96 | Jeev Milkha Singh | 230 | 104 | 126 | -0.33 | -0.64 | 2.77 |
97 | Danny Willett | 195 | 2 | 193 | -0.33 | -0.95 | 3.12 |
98 | Webb Simpson | 215 | 215 | 0 | -0.33 | -0.34 | 2.70 |
99 | Alexander Noren | 188 | 6 | 182 | -0.30 | -0.82 | 2.78 |
100 | Marc Leishman | 224 | 216 | 8 | -0.30 | -0.39 | 2.88 |
And here are the Monte Carlo simulation results, for this year’s Masters. Some players did not have enough data to make a projection; I’ll assume none of them win (that would be old champs, a few Asian players, and the amateurs.)
Notes on the results:
- There is no clear favorite, as opposed to Pomeroy’s ratings for the ’09 PGA. Tiger has the best average rating over the past 2+ years, but has not done well recently. For the ’09 PGA, Tiger had a 25% chance of winning! Nothing like that this year.
- The European contingent looks really strong.
- Who is Charl Schshwartszel? Did I misspell that? Will we set an all-time record for misspellings of contenders?
- Tiger and Phil have similar odds? Remember, injuries aren’t accounted for explicitly, so Phil’s playing with arthritis and not doing well may be counted too strongly.
- Compare with Jason Sobel’s Rankings.
Rank | Player | Bayesian Rating | Stdev | Wins/Million | Win% | Avg. Rank | Odds (1 in N) |
---|---|---|---|---|---|---|---|
1 | Charl Schwartzel | -1.40 | 2.92 | 47126 | 4.71% | 28.2 | 21.2 |
2 | Francesco Molinari | -1.31 | 3.00 | 45299 | 4.53% | 29.7 | 22.1 |
3 | Graeme McDowell | -1.40 | 2.82 | 42684 | 4.27% | 27.9 | 23.4 |
4 | Martin Kaymer | -1.45 | 2.73 | 41458 | 4.15% | 26.8 | 24.1 |
5 | Lee Westwood | -1.39 | 2.79 | 40848 | 4.08% | 27.9 | 24.5 |
6 | Nick Watney | -1.18 | 2.95 | 35881 | 3.59% | 31.5 | 27.9 |
7 | Rory McIlroy | -1.17 | 2.89 | 33364 | 3.34% | 31.5 | 30.0 |
8 | Louis Oosthuizen | -1.01 | 2.96 | 28707 | 2.87% | 34.2 | 34.8 |
9 | Luke Donald | -1.21 | 2.70 | 27406 | 2.74% | 30.5 | 36.5 |
10 | Phil Mickelson | -1.08 | 2.82 | 26358 | 2.64% | 32.9 | 37.9 |
11 | Steve Stricker | -1.18 | 2.70 | 25994 | 2.60% | 30.9 | 38.5 |
12 | Tiger Woods | -1.04 | 2.84 | 25499 | 2.55% | 33.5 | 39.2 |
13 | Matt Kuchar | -1.32 | 2.47 | 23285 | 2.33% | 28.2 | 42.9 |
14 | Retief Goosen | -0.98 | 2.77 | 20961 | 2.10% | 34.4 | 47.7 |
15 | Peter Hanson | -0.84 | 2.87 | 19204 | 1.92% | 36.9 | 52.1 |
16 | Dustin Johnson | -0.81 | 2.86 | 18010 | 1.80% | 37.3 | 55.5 |
17 | Martin Laird | -0.62 | 3.04 | 17665 | 1.77% | 40.5 | 56.6 |
18 | Paul Casey | -1.11 | 2.50 | 16726 | 1.67% | 31.7 | 59.8 |
19 | Miguel Angel Jimenez | -0.69 | 2.93 | 16514 | 1.65% | 39.3 | 60.6 |
20 | Padraig Harrington | -0.71 | 2.86 | 15171 | 1.52% | 39.0 | 65.9 |
21 | Rory Sabbatini | -0.52 | 3.02 | 14543 | 1.45% | 42.2 | 68.8 |
22 | Alvaro Quiros | -0.75 | 2.75 | 13577 | 1.36% | 38.2 | 73.7 |
23 | Justin Rose | -0.82 | 2.69 | 13543 | 1.35% | 37.0 | 73.8 |
24 | Ian Poulter | -0.69 | 2.80 | 13472 | 1.35% | 39.1 | 74.2 |
25 | Anders Hansen | -0.88 | 2.61 | 13417 | 1.34% | 35.9 | 74.5 |
26 | Hunter Mahan | -0.81 | 2.69 | 13323 | 1.33% | 37.1 | 75.1 |
27 | Robert Karlsson | -0.48 | 2.99 | 13165 | 1.32% | 42.9 | 76.0 |
28 | Edoardo Molinari | -0.71 | 2.75 | 12721 | 1.27% | 38.8 | 78.6 |
29 | Ernie Els | -0.70 | 2.76 | 12635 | 1.26% | 39.1 | 79.1 |
30 | Ryan Moore | -0.46 | 2.98 | 12291 | 1.23% | 43.2 | 81.4 |
31 | Ross Fisher | -0.57 | 2.86 | 11960 | 1.20% | 41.3 | 83.6 |
32 | Gary Woodland | -0.46 | 2.93 | 11412 | 1.14% | 43.0 | 87.6 |
33 | Bill Haas | -0.59 | 2.82 | 11345 | 1.13% | 40.9 | 88.1 |
34 | Rickie Fowler | -0.68 | 2.72 | 11035 | 1.10% | 39.4 | 90.6 |
35 | Sergio Garcia | -0.55 | 2.81 | 10576 | 1.06% | 41.6 | 94.6 |
36 | Bubba Watson | -0.51 | 2.82 | 10154 | 1.02% | 42.3 | 98.5 |
37 | Brandt Snedeker | -0.42 | 2.88 | 9695 | 0.97% | 43.7 | 103.1 |
38 | Aaron Baddeley | -0.38 | 2.90 | 9489 | 0.95% | 44.5 | 105.4 |
39 | Ben Crane | -0.57 | 2.72 | 9197 | 0.92% | 41.2 | 108.7 |
40 | Adam Scott | -0.38 | 2.88 | 9193 | 0.92% | 44.4 | 108.8 |
41 | Jonathan Byrd | -0.39 | 2.86 | 8710 | 0.87% | 44.4 | 114.8 |
42 | Steve Marino | -0.42 | 2.82 | 8629 | 0.86% | 43.8 | 115.9 |
43 | Jim Furyk | -0.69 | 2.57 | 8468 | 0.85% | 39.1 | 118.1 |
44 | Anthony Kim | -0.19 | 3.00 | 8198 | 0.82% | 47.6 | 122.0 |
45 | Robert Allenby | -0.70 | 2.53 | 8027 | 0.80% | 38.8 | 124.6 |
46 | Y.E. Yang | -0.29 | 2.90 | 7954 | 0.80% | 45.9 | 125.7 |
47 | David Toms | -0.56 | 2.64 | 7566 | 0.76% | 41.4 | 132.2 |
48 | Vijay Singh | -0.54 | 2.63 | 7311 | 0.73% | 41.6 | 136.8 |
49 | K.J. Choi | -0.53 | 2.60 | 6636 | 0.66% | 41.8 | 150.7 |
50 | Charley Hoffman | -0.13 | 2.94 | 6521 | 0.65% | 48.6 | 153.4 |
51 | D.A. Points | -0.22 | 2.85 | 6397 | 0.64% | 47.1 | 156.3 |
52 | Henrik Stenson | 0.08 | 3.12 | 6333 | 0.63% | 51.9 | 157.9 |
53 | Tim Clark | -0.65 | 2.47 | 6228 | 0.62% | 39.6 | 160.6 |
54 | Jerry Kelly | -0.36 | 2.72 | 6228 | 0.62% | 44.8 | 160.6 |
55 | Geoff Ogilvy | -0.41 | 2.68 | 6170 | 0.62% | 44.0 | 162.1 |
56 | Alex Cejka | -0.15 | 2.90 | 6062 | 0.61% | 48.4 | 165.0 |
57 | Stewart Cink | -0.52 | 2.56 | 5976 | 0.60% | 42.0 | 167.3 |
58 | Bo Van Pelt | -0.51 | 2.56 | 5861 | 0.59% | 42.1 | 170.6 |
59 | Jeff Overton | -0.17 | 2.84 | 5625 | 0.56% | 48.1 | 177.8 |
60 | Kevin Na | -0.56 | 2.52 | 5624 | 0.56% | 41.3 | 177.8 |
61 | Mark Wilson | -0.20 | 2.72 | 4540 | 0.45% | 47.5 | 220.3 |
62 | Fred Couples | 0.18 | 3.00 | 4162 | 0.42% | 53.5 | 240.3 |
63 | Ricky Barnes | -0.10 | 2.76 | 4128 | 0.41% | 49.3 | 242.2 |
64 | Stuart Appleby | 0.11 | 2.93 | 3954 | 0.40% | 52.6 | 252.9 |
65 | Zach Johnson | -0.54 | 2.39 | 3952 | 0.40% | 41.6 | 253.0 |
66 | Jason Day | -0.25 | 2.61 | 3679 | 0.37% | 46.8 | 271.8 |
67 | Jhonattan Vegas | -0.02 | 2.78 | 3552 | 0.36% | 50.6 | 281.5 |
68 | Sean O'Hair | -0.20 | 2.58 | 3088 | 0.31% | 47.8 | 323.8 |
69 | Camilo Villegas | 0.02 | 2.74 | 3061 | 0.31% | 51.3 | 326.7 |
70 | Lucas Glover | -0.13 | 2.63 | 3045 | 0.30% | 48.8 | 328.4 |
71 | Kevin Streelman | -0.01 | 2.59 | 2180 | 0.22% | 50.9 | 458.7 |
72 | Carl Pettersson | 0.07 | 2.65 | 2107 | 0.21% | 52.3 | 474.6 |
73 | Trevor Immelman | 0.29 | 2.79 | 1932 | 0.19% | 55.5 | 517.6 |
74 | Gregory Havret | -0.05 | 2.54 | 1925 | 0.19% | 50.4 | 519.5 |
75 | Heath Slocum | 0.19 | 2.70 | 1811 | 0.18% | 54.2 | 552.2 |
76 | Arjun Atwal | 0.42 | 2.85 | 1763 | 0.18% | 57.7 | 567.2 |
77 | Angel Cabrera | -0.05 | 2.47 | 1559 | 0.16% | 50.3 | 641.4 |
78 | Ryan Palmer | 0.20 | 2.58 | 1203 | 0.12% | 54.6 | 831.3 |
79 | Jose Maria Olazabal | 0.97 | 3.07 | 1116 | 0.11% | 64.9 | 896.1 |
80 | Jason Bohn | 0.30 | 2.57 | 971 | 0.10% | 56.3 | 1029.9 |
81 | Davis Love III | 0.08 | 2.40 | 857 | 0.09% | 52.8 | 1166.9 |
82 | Kyung-Tae Kim | 0.75 | 2.78 | 640 | 0.06% | 62.8 | 1562.5 |
83 | Hiroyuki Fujita | 0.94 | 2.78 | 452 | 0.05% | 65.5 | 2212.4 |
84 | Ryo Ishikawa | 0.92 | 2.78 | 451 | 0.05% | 65.2 | 2217.3 |
85 | Tom Watson | 0.92 | 2.78 | 445 | 0.04% | 65.2 | 2247.2 |
86 | Mike Weir | 1.21 | 3.03 | 0 | 0.00% | 68.2 | 1000000.0 |
87 | Yuta Ikeda | 1.28 | 2.78 | 0 | 0.00% | 69.8 | 1000000.0 |
88 | Mark O'Meara | 2.15 | 2.78 | 0 | 0.00% | 78.6 | 1000000.0 |
Players left out:
- Nathan Smith
- David Chung
- Hideki Matsuyama
- Jin Jeong
- Lion Kim
- Peter Uihlein
- Ben Crenshaw
- Craig Stadler
- Ian Woosnam
- Larry Mize
- Sandy Lyle
rexfordbuzzsaw
First off, this is good stuff, it’s nice to see someone with a brain try to rate golfers as opposed to Jason Sobel.
For the past couple of years, I’ve calculated a world golf ranking based on standardized scores across the world’s three biggest tours (PGA, NW, EPGA). You can see my full Masters rankings here.
I think you have a couple of problems, though. For one, the European Tour is weighted too highly. I can tell you this because I think I did a similar thing. I used to make no difference between playing on the European and PGA Tour, and my odds looked a lot like that. In reality there is about a .21 standard deviation difference (~.6 strokes per round) between the PGA and European Tours. It’s about .35 for the NW Tour to PGA Tour.
I think that’s the reason you are so high on the European Tour players like Schwartzel and Molinari.
I’d like to know how much of an impact recent play has in your rankings, because just looking at these off the top of my head I think it might be too much. I know in my rankings a player who is playing well recently should get a max bonus of around .2-.3 strokes per round. That is somewhat empirically derived, but mostly I came up with it through common-sense observation and by adjusting Vegas odds to my rankings.
Finally, I don’t think using the standard deviation of the sample is accurate. I’m not positive about this, but from what I’ve done, a player’s true standard deviation correlates directly with the player’s average score in relation to the field. There is no correlation in a single player’s standard deviation from year to year. Some years Tiger has a really low standard deviation; other years he’s had a really high one. This applies to almost every single player who has played a large sample of rounds every year since 2002.
Hope that gives you something to think about and helps going forward.
DanielM
Thanks for the very informative post!
1) Theoretically, if I have good connectivity between tours, the European events will be adjusted appropriately automatically. I’m rating each player at each event NOT vs. “other players” but against the “baseline” for that round. This baseline is calculated as the average of how the 80 “baseline golfers” did in that round.
That group could be only 10 in any given round, and not all of those golfers played on both tours. My regression may not properly capture the difference between the tours in difficulty of round. I’ll look at it a bit more today.
That said–the 0.6 per round. Is that just a general average? Because the difficulty of each round varies WIDELY based on weather/course/etc.
2) The best fit for projecting out-of-sample tournament results yielded a weighting that dropped from 1 in the most recent events to about 0.05 in the events at the start of 2009. Then, everybody is regressed with about 6.5 rounds (weighted at 1) of +2.5 or so golf. The weighting towards recency seemed a bit strong to me, as well.
3) Regarding standard deviation: you may be right on this. Perhaps I should just assign everyone a consistent standard deviation? That said, I did regress each player’s standard deviation to the mean pretty hard via Bayesian inference with a prior of Avg.
Thanks again for your insights!
bradluen
Do “Average Rating” and “Bayesian Rating” have the same baseline? If so, that’s a huge regression effect. Though maybe you need a huge regression effect with only two years of data.
DanielM
Yes, they have the same baseline. There’s more than just the regression effect going on, though. There’s also the depreciation of the older results. Tiger’s oldest results are also his best results, so when they get weighted only 10% as strongly as his most recent results, that drops him a ton. There’s also regression to a baseline of +2.5 or so, but there are only about 6 rounds of +2.5 added to the weighted average, and a lot of guys have over 200 rounds. That affects the players with few rounds played a lot more.
DanielM
I’ll probably, if I have time today (I’m busy), try to revise the way I did the 80 baseline golfers to include a larger group and get a better connection between European and PGA tours. I agree with the various comments (twitter and here): it does look like the Euro players are rated unusually highly.
rexfordbuzzsaw
“That said–the 0.6 per round. Is that just a general average? Because the difficulty of each round varies WIDELY based on weather/course/etc.”
Yes.
The first thing I do is standardize the scores against everyone in the field. Obviously, it doesn’t matter if the course average is 75 or 69, the player’s relation to the field average is all that counts. That’s not to say a course plays at the same difficulty for someone playing at 8 a.m. as opposed to 1 p.m. I just hope most of that is randomness and balanced out in the long run, which I think it is pretty well.
The next step is to assign a field difficulty based on everyone’s raw average from above over a three-year period.
Finally, I take the adjusted z-scores and compare players’ rounds across tours. Over about 2000 rounds each way, players are on average .6 strokes better in relation to the field when they play on the European Tour. However, it does vary, like you said. The Dubai Desert Classic still boasts a stronger field than the Puerto Rico Open even though the European players’ raw rankings are inflated.
DanielM
I see. I’m taking a different approach, but if we’re each doing it right the answer should be the same. I’m standardizing against a group of “baseline players”, and there should be enough of them that play each round to get a good grasp of how hard the round was. I’m going to re-run the regression with over 140 players as baseline, using a slightly different procedure.
I wish I weren’t getting a “cannot allocate memory for vector” error in R when I try to run it all that way. Apparently, there’s no 200-300 MB block of contiguous RAM available?
EvanZ
Daniel, you had McIlroy top 10. Pretty good!
Ben
Daniel’s looking even better today!
Neil Paine
Schwartzel!!!!!!
EvanZ
lol
“Who is Charl Schshwartszel? Did I misspell that? Will we set an all-time record for misspellings of contenders?”