April 15, 2014

And the loser of the 2014 #Masters is? #538 might shoot a #FiveThirtyEight (updated)

Neil Paine duffs a chip shot from off the green with his piece on possible winners of this year's Masters.

First, he states the obvious, that length is key at Augusta National. However, he also says greens in regulation is important. While it's true that a golfer can be both short and inaccurate, or long and accurate, in general, there's some correlation between length and inaccuracy.

And, while being closer to a green means the use of shorter irons, with more likelihood of staying on a green, the flip side is that hitting irons out of even lighter rough affects control a bit.

(Update, April 15: Paine now stands somewhat refudiated by Golf Digest. Greens in regulation may not be sabermetric, but it was the top commonality for this year's subpar finishers. Driving distance? Not so much.)

He then follows:
"And for all of the breathless reverence given to Augusta’s trademark slippery greens, putting skill isn’t a significant predictor of those who will stray from expectations, either.

"I suspect this is because putts per round is one of the least consistent performance indicators from season to season, ranking only above sand save percentage."
Which is true, but only in a trivial sense.

The PGA Tour actually has a sabermetric-type stat for putting called "Strokes Gained-Putting," described here. Paine does note elsewhere (on Twitter, but not in an updated story) that putts per round has a 75 percent correlation with strokes-gained putting.

He then says he wanted long-term data. Well, now he's trying to slice his bread and make it both white and whole wheat, because he goes on to dismiss SG-P as having been around only a couple of years. But, if that's the case, how "firm," then, is that 75 percent correlation? Maybe 2-3 more years of study lower it to 70 percent?
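To put a rough number on how "firm" a correlation like that is, here's a minimal sketch using the standard Fisher z-transform to get an approximate 95 percent interval around a Pearson correlation. Two loud assumptions: I'm reading "75 percent correlation" as r = 0.75, and the sample sizes below are hypothetical, since Paine's tweet doesn't give the number of players or seasons behind the figure.

```python
import math


def pearson_r_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z-transform. `r` is the observed correlation and `n` is
    the number of paired observations (hypothetical here; the source does not
    state it). `z_crit` = 1.96 corresponds to roughly 95 percent coverage.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must be strictly between -1 and 1")
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = math.atanh(r)                    # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)          # standard error on the z scale
    low = math.tanh(z - z_crit * se)     # transform back to the r scale
    high = math.tanh(z + z_crit * se)
    return low, high


if __name__ == "__main__":
    # Hypothetical sample sizes, purely to show how the interval tightens.
    for n in (30, 150, 500):
        low, high = pearson_r_interval(0.75, n)
        print(f"n={n}: r=0.75, approx. 95% interval {low:.2f} to {high:.2f}")
```

The point of the sketch is only that with a small sample, a 0.75 could easily sit next to a 0.70 inside the same interval; with a few hundred paired observations, much less so.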

By the same token, I'll admit that, because of dramatic images, we may overrate the value of putting at Augusta. But, the 75 percent leaves enough room to wonder whether Paine shouldn't recrunch his story in a few years.

Because, as Zach Johnson showed in 2007, a short hitter with strategery in mind can still win at Augusta. (And Haney said the Tiger-proofing had worked on T. Woods more because of how the course now shaped up than because of distance per se.) Beyond Johnson, shorter (but not minuscule) drivers who putt well and have a chance to win would include Ian Poulter, Ryan Moore and Jason Dufner. At the same time, Paine rules out Henrik Stenson based on past performance, but maybe that was due to his putting as much as anything else. (ESPN doesn't have full stats for the European Tour, so I don't know; he is fifth in driving distance, though, so it's not a length issue.)

I appreciate Paine's attempt to be a contrarian about "received logic" at Augusta. At least, that's how I'm diagnosing it. However, if he's not wrong, I think he's at least a bit early. (That said, I do agree with Paine that both scrambling and sand saves aren't important at Augusta.)

And, there's something else.

Greens in regulation itself is perhaps outdated. I could go across the pond and hit every green in regulation at an Open course with the classic double-hole greens yet face nothing but 50-foot putts. This isn't a fault of Paine's. To riff on Don Rumsfeld, you go to Big Data with the data you have.

To address my issue, we need a standard distance, not just "greens." Speaking as an occasional duffer in the past, I think that putting from 20 feet away, even for the pros, involves too much luck. Ten feet, though, might not be far enough away to separate the men from the boys in putting. So, maybe "greens inside 15 feet in regulation" is what we need?
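For what it's worth, the stat I'm proposing is easy to compute if you have hole-by-hole data. Here's a rough sketch, not anybody's official definition: the `Hole` record and its field names are hypothetical, the 15-foot cutoff is just my suggestion from above (and I'm counting exactly 15 feet as "inside"), and the only standard piece is the usual greens-in-regulation rule of being on the putting surface in par minus two strokes or fewer.

```python
from dataclasses import dataclass


@dataclass
class Hole:
    """One played hole. Field names are hypothetical, for illustration only."""
    par: int
    strokes_to_green: int    # strokes taken to get the ball onto the green
    first_putt_feet: float   # distance left for the first putt


def hit_in_regulation(hole):
    # Standard GIR rule: on the green in (par - 2) strokes or fewer.
    return hole.strokes_to_green <= hole.par - 2


def gir_rate(holes):
    """Plain greens in regulation, as a fraction of holes played."""
    if not holes:
        raise ValueError("no holes to score")
    return sum(hit_in_regulation(h) for h in holes) / len(holes)


def gir_inside_rate(holes, max_feet=15.0):
    """Proposed stat: greens hit in regulation AND within `max_feet` of the hole."""
    if not holes:
        raise ValueError("no holes to score")
    hits = sum(
        hit_in_regulation(h) and h.first_putt_feet <= max_feet for h in holes
    )
    return hits / len(holes)


if __name__ == "__main__":
    # Made-up holes, not real tournament data.
    sample = [
        Hole(par=4, strokes_to_green=2, first_putt_feet=8.0),
        Hole(par=4, strokes_to_green=2, first_putt_feet=50.0),  # GIR, but a lag putt
        Hole(par=5, strokes_to_green=3, first_putt_feet=14.0),
        Hole(par=3, strokes_to_green=2, first_putt_feet=5.0),   # missed the green
    ]
    print("GIR:", gir_rate(sample))
    print("GIR inside 15 feet:", gir_inside_rate(sample))
```

The second hole in the made-up sample is the double-green problem in miniature: it counts as a green in regulation but does nothing for the proposed stat.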

Paine does expect there to be more "sabermetric" stats for golf coming in the future, per a Twitter exchange. However, they're not here yet. And, per Don Rumsfeld again, we're at "known unknowns" right now. We know that our current golf stats are lacking in rigor and revelation, but we're still not sure how much they're lacking and how.

This then leads to a larger-yet issue, part of what several people, far beyond those named Paul Krugman, have said about the FiveThirtyEight brand so far. And that's that it doesn't actually do that much analysis for all the Big Data it crunches. Go here for my previous critique of the FiveThirtyEight "brand."

If a statistic is inadequate, and there's none better to replace it, then let's propose a new one, like I just did. To go back to Don Rumsfeld: other people, unlike him, didn't stay in the fog of war. When Humvees got hit by IEDs, we improvised, then replaced most of them with MRAPs. And, in a note to Neil's boss, doing that loses you some of the hubris that kept you in the fog of war in the first place.

And, Big Data still can't tell us everything. It couldn't tell us in advance that Billy Beane was afraid of failure, assuming that was part of his problem as a baseball player. (I think it was, and also wonder if that's why he turned down the offer to run the Red Sox.) Back in the golf world, it can't explain why, per an old witticism, "Scott Hoch rhymes with 'choke'."

And, you can overcrunch data, too. Some baseball sabermetric stats, like how much value to give steals and how much to detract for being thrown out stealing, are subjective to a degree themselves.

In case anybody is wondering, here's my look at possibilities to win this year.

