Twins Daily

Debates about WAR


gunnarthor


ESPN linked to this piece by Posnanski on Bill James.  Bill James doesn't have nice things to say about WAR.  

 

http://sportsworld.nbcsports.com/bill-james-statistical-revolution/

 

“Well, my math skills are limited and my data-processing skills are essentially nonexistent. The younger guys are way, way beyond me in those areas. I’m fine with that, and I don’t struggle against it, and I hope that I don’t deny them credit for what they can do that I can’t.

But because that is true, I ASSUMED that these were complex, nuanced, sophisticated systems. I never really looked; I just assumed that the details were out of my depth. But sometime in the last year I was doing some research that relied on these WAR systems, so I took a look at them, and … they’re not very impressive. They’re not well thought through; they haven’t made a convincing effort to address many of the inherent difficulties that the undertaking presents. They tend to get so far into the data, throw up their arms and make a wild guess. I don’t know if I’m going to get the time to do better of it, or if it will be left to others, but … we’re not at anything like an end point here. I assumed that these systems were a lot better than they actually are.”

 

A few years ago, Posnanski also noted that the A's internal WAR system ranked Cabrera ahead of Trout. (Although I don't have the link for that, sorry).

 

I know a lot of us have used WAR as a crutch to evaluate players and performance.  But perhaps we should show a healthier skepticism toward it?

Precision vs. direction........

 

The offensive side of WAR doesn't seem to be up for debate. Some very smart people have proposed that you can estimate (note that word, estimate) defensive runs saved. 

 

Not ONE PERSON has suggested an alternative that takes into account defense and running. All they do is say "I trust my eyes", except none of those guys go and chart EVERY SINGLE play, and use their eyes and write down what happens in all of MLB.

 

What is a better alternative, right now? By this logic, science should never posit theories, and use the knowledge we have, because no knowledge is perfect.

 

I prefer the imperfect to the "don't even try". That's how you end up with the world is flat, mankind will never fly, etc.....


Precision vs. direction........

 

The offensive side of WAR doesn't seem to be up for debate. Some very smart people have proposed that you can estimate (note that word, estimate) defensive runs saved. 

 

Not ONE PERSON has suggested an alternative that takes into account defense and running. All they do is say "I trust my eyes", except none of those guys go and chart EVERY SINGLE play, and use their eyes and write down what happens in all of MLB.

 

What is a better alternative, right now? By this logic, science should never posit theories, and use the knowledge we have, because no knowledge is perfect.

 

I prefer the imperfect to the "don't even try". That's how you end up with the world is flat, mankind will never fly, etc.....

Pretty much summed up my thoughts. I use WAR quite a bit but try to do so loosely... For example, a 3.0 WAR player is not necessarily worth "exactly" one win more than a 2.0 WAR player. The 3 WAR player is *probably* a better player but that's not always the case because of the influence of dWAR. I'm also extremely skeptical of the defensive outliers like Kinsler (roughly three wins with the glove).

 

Basically, the higher the percentage of a player's "worth" that comes from his dWAR, the more skeptical I am of the stat as a whole. I think it has done a fine job on the offensive side of the spectrum and tries to value defensive worth with mixed results. It's not completely off-base most of the time but dWAR should be used with a grain of salt, especially for guys who derive 50%+ of their value from defense.


Whatever the flaws in WAR (mostly that it only includes what has been quantified) it's miles ahead of Bill James' Win Shares.

Pretty much summed up my thoughts. :)  James's approach from many many years ago was (to oversimplify) to start with a team's actual wins, say that they had to come from somewhere, and divvy them up according to production.  Just coming at it from the stats is going to rub him the wrong way, I think.


but dWAR should be used with a grain of salt, especially for guys who derive 50%+ of their value from defense.

Is it that nobody is worth 50% on defense (well, Florimon was 100%, make that nobody worth talking about :) ), or that the valuations on defense are that far off-kilter?


Not ONE PERSON has suggested an alternative that takes into account defense and running. All they do is say "I trust my eyes", except none of those guys go and chart EVERY SINGLE play, and use their eyes and write down what happens in all of MLB.

 

Flawed is still flawed, regardless of the alternatives.  It's valuable, when used very loosely and with full disclosure about its flaws, but that's rarely done and that's the true problem.  It's quoted as both a predictive stat and a reflective stat, rarely with a caveat about how it was compiled or how a player may have been unfairly helped or hurt by the calculation.

 

I tend to be with James - half the equation for WAR is little more than a sham.  And while I agree there aren't many better alternatives, there are too many people content with WAR.  That's the problem.


Sham? That is a huge exaggeration. Huge. 

 

So, every time someone says it is about scouting.....they should say that there are flaws in scouting, and it isn't exact? That's not how communication works, it just isn't.

 

WAR is not presented as having a large subjective component, that's the sham.  People know scouting is subjective.


Really? Have you actually read these?

 

http://www.fangraphs.com/library/misc/war/

 

"Given the nature of the calculation and potential measurement errors, WAR should be used as a guide for separating groups of players and not as a precise estimate."

 

"Our measures of both are more uncertain than our measures of offense, so players who get a good amount of their value through their defensive ratings likely have more uncertainty around their WAR value than players who have defensive value closer to average. This does not mean that WAR is wrong or biased, but rather that it is not yet capable of perfect accuracy and should be used as such."

 

http://www.fangraphs.com/blogs/the-fangraphs-uzr-primer/

 

because they admit, over and over, that they are estimating things and that it is not precise. They freely state exactly what they do to calculate the stat. If one can't figure out that it is based on observation and estimates of defense, then they aren't reading the definition and explanation at all.

 

"So, even after regression, there is no guarantee that our UZR number reflects what the player actually did or his true defensive talent over that time period. But, it is the best we can do (not knowing anything else about that player)!"

 

http://www.fangraphs.com/library/defense/drs/

 

"Before drawing any conclusions about a player’s defense, look at a full three years of defensive data, drop the decimal points and take an average, and compare DRS scores with other defensive metrics (UZR, TZL, etc.). By taking a broader picture, you will help ensure that you’re not being over-confident or overstating a player’s defensive abilities."
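For anyone who wants to see that last piece of advice as arithmetic, here is a minimal Python sketch of "three years, drop the decimal points, take an average". The function name `multi_year_defense` and the sample numbers are made up for illustration; they are not real DRS values and this is not how FanGraphs computes anything, just one reading of the quoted rule of thumb.

```python
def multi_year_defense(season_values):
    """Average per-season defensive runs after dropping the decimals.

    Assumption: "drop the decimal points" is read as truncating each
    season's value toward zero before averaging.
    """
    if len(season_values) < 3:
        # The quoted advice asks for a full three years of data.
        raise ValueError("need at least three seasons of data")
    whole_runs = [int(v) for v in season_values]  # int() truncates toward zero
    return sum(whole_runs) / len(whole_runs)


# Hypothetical values for a made-up player, NOT real data.
hypothetical_drs = [12.4, -3.7, 6.9]
print(multi_year_defense(hypothetical_drs))  # 12, -3 and 6 average to 5.0
```

The point of the sketch is only that a single noisy season (the 12.4 here) gets pulled back toward something more modest once neighboring seasons are included.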


WAR is not presented as having a large subjective component, that's the sham.  People know scouting is subjective.

 

I know you are smart, so I know you know that in math and science......and process control, and other uses of statistics, MUCH of what we do/rely on is "subjective". Many, many things are not measurable other than through observation and the scientists' use of those observations. Doesn't make them a "sham", makes them biased by the observer's knowledge and lens (for all science, btw, not just "subjective" science).


Really? Have you actually read these?

 

http://www.fangraphs.com/library/misc/war/

 

"Given the nature of the calculation and potential measurement errors, WAR should be used as a guide for separating groups of players and not as a precise estimate."

 

"Our measures of both are more uncertain than our measures of offense, so players who get a good amount of their value through their defensive ratings likely have more uncertainty around their WAR value than players who have defensive value closer to average. This does not mean that WAR is wrong or biased, but rather that it is not yet capable of perfect accuracy and should be used as such."

 

http://www.fangraphs.com/blogs/the-fangraphs-uzr-primer/

 

because they admit, over and over, that they are estimating things and that it is not precise. They freely state exactly what they do to calculate the stat. If one can't figure out that it is based on observation and estimates of defense, then they aren't reading the definition and explanation at all.

 

"So, even after regression, there is no guarantee that our UZR number reflects what the player actually did or his true defensive talent over that time period. But, it is the best we can do (not knowing anything else about that player)!"

 

http://www.fangraphs.com/library/defense/drs/

 

"Before drawing any conclusions about a player’s defense, look at a full three years of defensive data, drop the decimal points and take an average, and compare DRS scores with other defensive metrics (UZR, TZL, etc.). By taking a broader picture, you will help ensure that you’re not being over-confident or overstating a player’s defensive abilities."

Mike, that's one of the problems/concerns - nobody has read that, yet everyone is willing to pass judgment on WAR.  Heck, people on this site posted that Dozier was more valuable than Miguel Cabrera this year.


Is it that nobody is worth 50% on defense (well, Florimon was 100%, make that nobody worth talking about :) ), or that the valuations on defense are that far off-kilter?

The latter, IMO. I'm sure there are very valuable guys defensively, I just don't really trust WAR to quantify those numbers. I think WAR is generally accurate, particularly in the aggregate but I don't really trust dWAR on a player-by-player basis. It's a decent guideline to estimate the defensive acumen of a player but I don't trust that it can accurately calculate a player's defense at, say, 1.2 wins a season.


I know you are smart, so I know you know that in math and science......and process control, and other uses of statistics, MUCH of what we do/rely on is "subjective". Many, many things are not measurable other than through observation and the scientists' use of those observations. Doesn't make them a "sham", makes them biased by the observer's knowledge and lens (for all science, btw, not just "subjective" science).

 

I don't know, if we ran into ourselves 25 years ago and our younger version was talking about how the best player in baseball led the league in RBI, batting average or runs, we'd probably tell our past self to get real as those numbers were a bit of a sham and the real indicators hadn't been invented yet.


I know you are smart, so I know you know that in math and science......and process control, and other uses of statistics, MUCH of what we do/rely on is "subjective". Many, many things are not measurable other than through observation and the scientists' use of those observations. Doesn't make them a "sham", makes them biased by the observer's knowledge and lens (for all science, btw, not just "subjective" science).

 

Except science has a method and process designed to filter out observer bias - WAR just accepts observer bias because they haven't found a method to filter it out.  What is subjective about WAR (in particular - UZR) is not even close to the same kind of subjectivity in controlled scientific experiments.  Not even remotely.  

 

That and WAR is juxtaposed with statistics that are almost entirely non-subjective - like OPS or BABIP or many of the other quantifiable components of baseball that make the study of its statistics so unusual.  Baseball is a sport played with controls, but WAR is a stat that steps outside of that.

 

The links you posted show that Fangraphs and their primers understand the limitations of the stat, but most people who employ it do not.  For example, the last few years the loudest argument from people backing Trout for MVP over Cabrera was WAR.  Except, by the very links you posted, using a single-season WAR stat as a measurement of who was better is definitionally misusing the statistic.  Yet people do it ALL THE TIME.

 

I can't remember the last time I heard someone argue WAR as a composite of 3-5 years of data.  It's used almost exclusively, and incorrectly, as a seasonal measurement.


So, every time someone argues about something, they have to post everything about it? Do those that love scouting post about its limitations when arguing it is a good idea?

 

What would you propose people do, to take into account running and defense, when talking about player value?

 

And, both sites talk about what they do to reduce the effect of observer bias. It is right in the quotes I posted....they use regression, and they use multiple years of data. In lieu of what they do, what should be done? Nothing?


So, every time someone argues about something, they have to post everything about it? Do those that love scouting post about its limitations when arguing it is a good idea?

 

What would you propose people do, to take into account running and defense, when talking about player value?

 

And, both sites talk about what they do to reduce the effect of observer bias. It is right in the quotes I posted....they use regression, and they use multiple years of data. In lieu of what they do, what should be done? Nothing?

 

I would suggest that if you misuse a stat by its own admission, you probably should at least acknowledge the flaws.

 

You earlier insinuated that there are regularly accepted limitations of the stat, and yet its most ardent supporters often use it erroneously.  Here's the truth - anytime you use WAR on a one-year basis you have fundamentally misused it.  And people do this constantly.

 

I would propose WAR be treated the same way as defensive stats - with a gigantic grain of salt and only in larger sample sizes.


Ok with me, don't use it if you don't want. 

 

But you still haven't answered the question.....what should people do that is better for understanding defense and base running?

 

Right now for defense?  Trust their eye and consult the stats for comparison's sake.  If you absolutely have to use a stat, never use it with less than 3 years of data.


I don't know, if we ran into ourselves 25 years ago and our younger version was talking about how the best player in baseball led the league in RBI, batting average or runs, we'd probably tell our past self to get real as those numbers were a bit of a sham and the real indicators hadn't been invented yet.

 

But, we can't go back in time. Given what people knew at the time......those stats had data that indicated performance. It's not as good as the data we have now, though.


Trust their eye, but not multiple eyes watching every play, writing down information about that play, and accumulating data?

 

You trust your eye, more than you trust WAR?

 

I'm genuinely curious about that.

 

That's why I consult both.  I trust my eye equally to most defensive calculations to date.  I'm very open to hearing one that makes me trust it more, but I've yet to see that. 


Why is the eye better than WAR?  My eye only catches something like 60-80 games a year on TV and basically just for the Twins.  WAR captures every play for every player through observers that are trained better than I am.  How can I rank Florimon amongst SS using the eye test when I don't have enough eye data on his competition?

 

It seems to me that WAR is much like Democracy, it is the worst stat available except for all of the others.


Why is the eye better than WAR?  My eye only catches something like 60-80 games a year on TV and basically just for the Twins.  WAR captures every play for every player through observers that are trained better than I am.  How can I rank Florimon amongst SS using the eye test when I don't have enough eye data on his competition?

 

It seems to me that WAR is much like Democracy, it is the worst stat available except for all of the others.

 

Except there are better measures for a player's offensive contribution.  The issue is defense.  I don't know why people feel the need for a catch-all.


I can't remember the last time I heard someone argue WAR as a composite of 3-5 years of data.  It's used almost exclusively, and incorrectly, as a seasonal measurement.

Incorrectly, yes, but not necessarily inaccurate (or at least "not useless"). Defensively, most good/decent players seem to fall into the .5-1.5 win spectrum over the course of a season, well under what a good player will deliver in oWAR, which I think we can all agree is relatively accurate. That means, on average, that a player's dWAR swing will be something less than one win a season while their oWAR can easily swing from 0 to 6.0 or better. A small highlight that should be pointed out about dWAR is that the "replacement player" on defense is much better than the "replacement player" on offense so dWAR's ability to screw up WAR as a whole is limited in most cases while it still delivers a single "catch-all" stat to quickly measure a player's value.

 

In 2014, the NL dWAR leader was Simmons at 3.9 (which I think we can all agree is absurdly high, though Simmons is an incredible defender). The AL leader in dWAR was Kinsler at 2.9. Compare that to offense where McCutchen led the NL with a 7.8 oWAR and Trout led the AL with an 8.7 oWAR.

 

And that's why I'm skeptical of the outliers in dWAR, both good and bad. If the player slots into a dWAR around 1.0 over the course of a season, the inaccuracies of dWAR are marginalized. If that player slots into a dWAR around 3.0 while his oWAR is 2.5, then alarm bells should ring.
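To make the "alarm bells" check concrete, here is a rough Python sketch. It leans on two assumptions that are mine and not part of any published WAR method: total value is approximated as oWAR + dWAR (a simplification, since the published components do not necessarily sum to total WAR), and the 50% cutoff is borrowed from the rule of thumb earlier in this thread. The function names are hypothetical.

```python
def defensive_share(o_war, d_war):
    """Fraction of a player's combined value that comes from defense.

    Assumption: combined value is approximated as o_war + d_war.
    Returns None when the combined value is not positive, because a
    share of zero or negative value is not meaningful.
    """
    total = o_war + d_war
    if total <= 0:
        return None
    return d_war / total


def needs_skepticism(o_war, d_war, threshold=0.5):
    """True when defense supplies at least `threshold` of the combined value."""
    share = defensive_share(o_war, d_war)
    return share is not None and share >= threshold


# The example from the post: dWAR around 3.0 with an oWAR of 2.5.
print(needs_skepticism(o_war=2.5, d_war=3.0))  # True (share is 3.0 / 5.5)

# A dWAR around 1.0; the 4.0 oWAR is a made-up number for illustration.
print(needs_skepticism(o_war=4.0, d_war=1.0))  # False (share is 1.0 / 5.0)
```

This does not say the flagged number is wrong, only that more of it rests on the noisier defensive estimate.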

 

WAR is a flawed stat but not necessarily a bad stat, especially considering the other stats we (don't) have at our disposal that are obviously better.


WAR is a flawed stat but not necessarily a bad stat, especially considering the other stats we (don't) have at our disposal that are obviously better.

 

I'm not suggesting it's "bad", but I am suggesting it is far more dubious than it is often presented.  I'll consult it for sure, it has some value, but it has some serious issues for it to be relied on as some sort of end-all, be-all stat.


Archived

This topic is now archived and is closed to further replies.
