Former Mets statistical analyst shares insight into his time with the Mets, his views on sabermetrics, and his newly released book.
The third part of our interview with Professor Benjamin Baumer deals with his recently published book co-written with Professor Andrew Zimbalist, The Sabermetric Revolution. The book assesses sabermetrics chronologically, starting by revisiting Michael Lewis's Moneyball, then moving on to the growth of data analytics in sports and what the future may hold.
Your book, The Sabermetric Revolution, took a very ground-up approach to explaining the growth of sabermetrics today. Using Moneyball as a point of reference made sense in this way, because it undoubtedly spurred a new wave in the sabermetric world. It also served as a good refresher for many of us who had read the book years ago and may have forgotten some of its quirks that would seem outlandish in today's sabermetric community, and as a good way to lure in the casual baseball fan who may not know much about sabermetrics.
My question is: Who is your target audience with this book? Is it the casual reader who seeks to learn about the basics of sabermetrics or an avid sabermetrician who would like an educated take on where the field is going?
The strength of Moneyball was that it told a very compelling story. And the story was so compelling and so well told that not only was it a bestseller and a Hollywood blockbuster, but it also had a profound effect on the baseball industry. I don't think there is any question that most of the jobs like mine wouldn't have existed had it not been for Moneyball. When I first read Moneyball, the sabermetric concepts that were discussed in the book (e.g. DIPS, individual fielder analysis, etc.) were completely fascinating to me, but the nitty-gritty details that supported those concepts weren't presented in the book (or the movie).
So as a reader/viewer, you get the idea, and you see how the idea fits into the bigger picture, but you don't actually see the evidence in favor of that idea. For those of us with a scientific mindset, we typically don't believe things—even sensible things—without evidence, and so Moneyball left me wanting more. Part of what I hope that our book can do for people is to bridge the gap between people who read or saw Moneyball and are interested in sabermetrics but are not practicing sabermetricians, and those who are really on the cutting-edge of sabermetrics themselves.
Our book provides context, in terms of taking readers from Moneyball to the present, thinking critically about some of the inaccuracies in Moneyball and people's perceptions of it, and reviewing the changes that have taken place in baseball since then. But also, we provide content, in that we present a fairly lengthy, but coherent and readable treatment of basic sabermetric thought. What do sabermetricians believe? And why do they believe it? We present the evidence that was left out of Moneyball. Once we've developed that, we are able to look around (at other sports, and the business side of baseball) and forward (to the future), and provide some insight and analysis into whether there is evidence that sabermetrics has actually worked, and where it is headed. So I think that there is something for everyone, but those who want to take the next steps from Moneyball will have the most to gain. Judging from Moneyball's popularity, this seems to be a fairly large group of people!
Keeping with the order of the book, what do you think is left with regard to hitting statistics? Will HITf/x data soon prove to make a big difference?
I hope so, but one of the difficulties is that not many people have access to that proprietary data set. At the major league level, I think we have a pretty good handle on who the good hitters are, but where I think we have room to improve is in forecasting the major league performance of minor league players. Why do some players who hit well in Double-A fail to produce at the major league level? Is it a question of bat speed, vision, pitch recognition, etc.?
I could see the HITf/x data giving us some insight into those kinds of questions, but I fear that not enough people will be able to get their hands on it, and those who do won't be able to share what they've found.
That seems to be the general consensus: We can quantify the best hitters in the majors, but translating minor league statistics is still an arduous task. In the book, you go into great detail about the significance of these hitting statistics, their reliability, and their history. But do you see what we consider to be basic tenets of sabermetrics changing in the long term? For example, will we soon see what was once new and revolutionary, such as the importance of walks, become outdated dogma?
It depends on what you consider the dogma to be. If we're talking about walks being important, then I don't see that as ever changing—barring some unforeseen dramatic rule change—because the importance of getting on base stems directly from the rules of the game. It's important at every skill level in which the game is played. If, on the other hand, we're talking about walks being undervalued, then clearly I think the validity of that insight will change over time. The question of value is relative to the market, which by nature changes constantly. We sabermetricians need to be clear about making these kinds of distinctions, lest we end up in a situation where the public is misinterpreting what sabermetricians believe.
But on another level I don't like the whole idea of "sabermetric dogma." To me, sabermetrics is an inquisitive, scientific, quantitative approach to understanding baseball—not a set of fixed prescriptions.
On that note, Christina Kahrl of ESPN asked many attendees of the SABR Analytics Conference whether sabermetrics is more revolutionary or evolutionary at this point in time. Your book was titled "The Sabermetric Revolution", but was that something of the past, or is it presently ongoing? What's your take on Ms. Kahrl's question?
Hmm. Is the question about the adoption of sabermetrics within baseball? Or sabermetrics itself? To the former, our premise for the book was to examine the changes in the baseball industry since the publication of Moneyball, which I think is properly considered a revolution. For example, one of the things we noticed was that while only a couple of teams employed sabermetricians in 2003, now nearly (but not quite) all of them do. With a distinct catalyst, thinking about a change that dramatic as a revolution seems appropriate. But given where we are now, it seems likely that changes will be more incremental—evolutionary. The latter question may be more complicated, but I suppose it would be difficult to smell a new paradigm shift before it was upon you, right? There's no question in my mind that DIPS was a monumental discovery that did bring about a paradigm shift, but how do we know when the next one will come? I've heard some people talk about individual fielder analysis and catcher framing in those terms.
It should be interesting to see just how much value can be gleaned from quantifying those defensive aspects. In the book, you mentioned Shane Jensen's SAFE. Can you elaborate on the methodology of calculating that metric, why it's useful, and why it isn't better known?
The SAFE model is really well designed, but there are a few things holding it back. First, it's based on a hierarchical Bayesian model, which is a statistical modeling technique that most people don't see until graduate school. So it's challenging for most people (including me). The second hang-up is that the numbers never got published in a high-profile, updateable forum, like on FanGraphs or some website where people can see them. And the third snag is that the confidence intervals were so large that the inferences for most players weren't that interesting. The problem, of course, is that no one else is reporting confidence intervals, so it seems likely to me that most people are lured into a false sense of precision by the point estimates offered for UZR, Plus/Minus, or whatever else.
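To make the false-precision point concrete, here is a toy calculation; it is not drawn from SAFE or from any real player's data (the 205-of-250 line below is invented), but it shows how wide the uncertainty around a single season's fielding rate can be:

```python
# Toy illustration (not SAFE) of why confidence intervals matter.
# Suppose a fielder converts 205 of 250 chances (82%) against a league
# average of 80%. The point estimate says "above average," but a simple
# bootstrap of his chances shows how uncertain that really is.
import numpy as np

rng = np.random.default_rng(1)
chances = np.array([1] * 205 + [0] * 45)   # 1 = out recorded, 0 = not

# Resample his 250 chances with replacement 10,000 times.
boot_rates = rng.choice(chances, size=(10_000, chances.size), replace=True).mean(axis=1)
lo, hi = np.percentile(boot_rates, [2.5, 97.5])
print(f"95% CI for his conversion rate: ({lo:.3f}, {hi:.3f})")
# Roughly (0.77, 0.87): the interval comfortably includes the league average,
# so one season of chances can't establish that he is truly above average.
```

With intervals that wide, a point estimate a couple of percentage points above the league rate says very little, which is exactly the caution that reporting confidence intervals makes explicit and bare point estimates hide.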
All of these fielding models are based on the same idea: estimate the probability p that a particular ball in play will be fielded by an average player at each fielding position, and then compare that estimate to what actually happened (e.g. did the player field it or not?). In a general sense, you build a model to estimate p, and then you examine the residuals from that model. Players with larger residuals are good because they fielded more balls successfully than your model thought they would. The question is how to build that model, and everyone does it differently (openWAR uses a logistic regression model with quadratic terms).
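As a rough sketch of that recipe, the snippet below fits a logistic regression with quadratic terms to simulated ball-in-play data and then credits each fielder with the sum of his residuals. Every column name and coefficient here is invented for illustration; openWAR itself is an R package, and its real model works from actual batted-ball data rather than this toy setup.

```python
# Sketch of the generic residual-based fielding evaluation described above,
# using simulated data and a logistic regression with quadratic terms
# (in the spirit of, but not identical to, openWAR's fielding model).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 5000
bip = pd.DataFrame({
    "angle": rng.uniform(-45, 45, n),           # spray angle of the ball in play (degrees)
    "distance": rng.uniform(60, 220, n),        # distance from home plate (feet)
    "fielder": rng.choice(["A", "B", "C"], n),  # which shortstop was on the field
})
# Simulated "truth": fielders A, B, and C differ slightly in skill.
skill = bip["fielder"].map({"A": 0.3, "B": 0.0, "C": -0.3})
logit_p = 4.0 - 0.03 * bip["distance"] - (bip["angle"] / 60) ** 2 + skill
bip["fielded"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# Step 1: model the probability that the *average* fielder converts the play.
model = smf.logit(
    "fielded ~ angle + I(angle**2) + distance + I(distance**2)", data=bip
).fit(disp=0)

# Step 2: compare what happened to what the model expected, and credit each
# fielder with his total residual (plays made above or below expectation).
bip["expected"] = model.predict(bip)
bip["residual"] = bip["fielded"] - bip["expected"]
print(bip.groupby("fielder")["residual"].sum().sort_values(ascending=False))
```

The hard part in practice is Step 1: the quality of the resulting defensive metric depends entirely on how well the model of the average fielder captures the difficulty of each ball in play, which is exactly where the various systems differ.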
SAFE is a bit different in that the thing you are building the model for is the fielding ability of each individual player—not just the fielding ability of the average player. The average player's fielding ability is still taken into account, but the observations allow the model to be updated to conform to the data available on each player. This is the advantage of the hierarchical Bayesian model.
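The shrinkage behavior described here can be illustrated with a far simpler conjugate model than the one SAFE actually uses. The beta-binomial sketch below, with a made-up league rate and prior strength, shows only the core idea: a player's estimated ability starts at the league average and moves toward his own observed data as his chances accumulate.

```python
# Simplified illustration (NOT Jensen's SAFE model) of hierarchical shrinkage:
# each player's out-conversion rate gets a Beta prior centered on the league
# average, and his own chances update that prior. The league rate and prior
# strength below are arbitrary numbers chosen for the example.

def shrunk_fielding_rate(made, opportunities, league_rate=0.80, prior_strength=200):
    """Posterior mean of a player's conversion rate under a Beta prior
    worth `prior_strength` pseudo-chances at the league-average rate."""
    alpha0 = league_rate * prior_strength        # prior "plays made"
    beta0 = (1 - league_rate) * prior_strength   # prior "plays missed"
    return (alpha0 + made) / (alpha0 + beta0 + opportunities)

# A player who converts 45 of 50 chances (90%) is not credited with a true
# 90% ability; with so little data, his estimate stays near the league rate.
print(shrunk_fielding_rate(45, 50))     # ~0.82
print(shrunk_fielding_rate(450, 500))   # ~0.87 -- ten times the data, less shrinkage
```

SAFE's full model does this kind of pooling jointly for every player, and over the distribution of balls in play rather than a single conversion rate, which is where the graduate-level machinery Baumer mentions comes in.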