Twins Daily

Debates about WAR


gunnarthor


I'm not suggesting it's "bad", but I am suggesting it is far more dubious than it is often presented.  I'll consult it for sure, it has some value, but it has some serious issues for it to be relied on as some sort of end-all, be-all stat.

 

I would argue that the stat itself is far less dubious than many claim, but many who utilize it understand it poorly.  OPS was/is the same way, as many utilize that stat without context of era, ballpark, etc.  WAR is rarely used properly, and if it were, it'd be seen as a good comparative tool, not the valuing tool many have used it as.

Link to comment
Share on other sites


I would argue that the stat itself is far less dubious than many claim, but many who utilize it understand it poorly.  OPS was/is the same way, as many utilize that stat without context of era, ballpark, etc.  WAR is rarely used properly, and if it were, it'd be seen as a good comparative tool, not the valuing tool many have used it as.

 

I'm totally behind most of this.  However, I would suggest that the hypothetical "replacement player", the reliance on UZR, and the inability to discern positional defensive value provide plenty of dubiosity to the stat.

 

Mostly I just wanted to say dubiosity.

Link to comment
Share on other sites

I'm totally behind most of this.  However, I would suggest that the hypothetical "replacement player", the reliance on UZR, and the inability to discern positional defensive value provide plenty of dubiosity to the stat.

 

Mostly I just wanted to say dubiosity.

 

LOL! It's far from perfect, but it's the best thing out there so far.  I tend to utilize the three major WAR values out there as a scope because they each utilize different things for defense.

Link to comment
Share on other sites

LOL! It's far from perfect, but it's the best thing out there so far.  I tend to utilize the three major WAR values out there as a scope because they each utilize different things for defense.

 

Use it in moderation and as a data point.  I think it does a good job ball-parking relative values.  It also gives you some scope: a 70-win team is not a star player or two away from the World Series.  And it provides some context for free agent signings.

 

Whether a player is actually worth 2.4 WAR or 2.8 WAR is not that important to me.  It is not like you can prove or disprove it anyway.

 

I also think that Bill James would seem biased here.

Link to comment
Share on other sites

In 2013 Perkins saved 36 games with 4 blown saves. Fangraphs assigned him WAR of 1.7.

 

This tells us a replacement player or some other back end bullpen guy can save 30+ games for us, right?

 

That tells me as much about the save stat as it does WAR.  Perkins was in 40 save situations and another 20 non-save appearances.  He gave up 16 ER but only blew four saves.  Granted, his ERA was a very good 2.30.  But if someone else had an ERA of 3.50, it would only have been 8 more ER.  5 of those would have been in save situations.  So maybe another blown save or two at the most.
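For anyone who wants to check that arithmetic, here is a rough sketch. It backs the innings out of the 16 ER and 2.30 ERA quoted above, and it assumes (my assumption, not a stated fact) that the extra runs would be spread evenly across the 40 save situations and 20 non-save appearances.

```python
# Rough check of the Perkins numbers quoted above.
# ERA = earned_runs * 9 / innings, so innings = earned_runs * 9 / ERA.

def innings_from_era(earned_runs, era):
    """Back out innings pitched from earned runs and ERA."""
    return earned_runs * 9 / era

def earned_runs_at_era(innings, era):
    """Earned runs a pitcher with the given ERA would allow over those innings."""
    return era * innings / 9

innings = innings_from_era(16, 2.30)           # roughly 62.6 innings
er_at_350 = earned_runs_at_era(innings, 3.50)  # roughly 24.3 earned runs
extra_er = er_at_350 - 16                      # roughly 8 more earned runs

# Assumption: extra runs are spread evenly over all 60 appearances,
# 40 of which were save situations.
save_share = 40 / 60
extra_er_in_saves = extra_er * save_share      # between 5 and 6 runs

print(f"innings: {innings:.1f}")
print(f"extra earned runs at a 3.50 ERA: {extra_er:.1f}")
print(f"extra earned runs in save situations: {extra_er_in_saves:.1f}")
```

So the "8 more ER, about 5 in save situations" estimate holds up as a ballpark figure under that even-spread assumption.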

 

Analysis of the save stat has shown that since its creation, closer salaries have skyrocketed but the winning percentage after 8 innings has remained the same.

 

It seems like the Tigers roll with a guy every year who has an ERA in the 4's and ends with 45 saves.

Link to comment
Share on other sites

Many good points in this thread - all of these stats and systems will continue to improve but I do tire of people actually arguing with others using WAR as some lynchpin piece of evidence.

 

Offensively, all the data is really helpful.

 

Defensively, you lost me when Josh Willingham was rated a better defender than Torii Hunter (I can't remember which stat it was).

Link to comment
Share on other sites

Whatever.

 

Using WAR to evaluate closers makes less sense than using Saves, K%, etc.

K%, yes.

 

Saves, no.

 

The Save stat is basically junk for the reasons listed above. Its rules are completely arbitrary and give very little insight into the actual quality of the pitcher. I'd use WAR over the Save statistic, for sure. No question at all.

Link to comment
Share on other sites

K%, yes.

 

Saves, no.

 

The Save stat is basically junk for the reasons listed above. Its rules are completely arbitrary and give very little insight into the actual quality of the pitcher. I'd use WAR over the Save statistic, for sure. No question at all.

 

My big two for relievers/closers, ERA and runs inherited that are allowed to score. Given the nature of 1 run games, WHIP as well.

 

(Cue me beating a dead horse)

 

Rodney was a closer in basically 6 seasons.  He has 220 career saves (36 a year).  One year he had a 4.40 ERA and had 37 saves.  2 wins and 5 losses.  His career ERA is 3.61 which is not great for a reliever.

Link to comment
Share on other sites

My big two for relievers/closers, ERA and runs inherited that are allowed to score.

I generally look at BB% and K%. For relievers, I tend to avoid most stats and look at whether he issues free passes and misses bats (I also glance at HR and H/9).

 

But to each their own. There are plenty of ways to evaluate a reliever but the Save stat is one of the worst.

Link to comment
Share on other sites

In 2014, the NL dWAR leader was Simmons at 3.9 (which I think we can all agree is absurdly high, though Simmons is an incredible defender). The AL leader in dWAR was Kinsler at 2.9. Compare that to offense where McCutchen led the NL with a 7.8 oWAR and Trout led the AL with an 8.7 oWAR.

 

And that's why I'm skeptical of the outliers in dWAR, both good and bad. If the player slots into a dWAR around 1.0 over the course of a season, the inaccuracies of dWAR are marginalized. If that player slots into a dWAR around 3.0 while his oWAR is 2.5, then alarm bells should ring.

I don't necessarily think that the "absurdly high" outliers of dWAR should be dismissed so quickly. A 3.9 dWAR may seem like a lot, but I don't think it is impossible. A quick back-of-the-envelope calculation:

 

3.9 WAR is roughly 35 runs in the 2014 run environment. Using linear weights (the same methods used to calculate offensive WAR that most people are comfortable with), the value of a single versus a ground-out is approximately 0.7 runs (it varies based on run environments and other factors, but that is in the ballpark). So roughly speaking, during 2014 Simmons needed to convert 50 plays that a replacement level shortstop wouldn't make. Over the course of 150 games, that is one every three games - essentially twice a week.

 

I don't think that kind of performance is unreasonable at all, especially for the best of the best defenders. Maybe I'm misunderstanding your argument, but it seems like limiting dWAR to a realistic level of 1.0 or so and ignoring outliers is equally absurd. That is essentially arguing that the difference between the best and the worst defenders is just 12 plays a season. 
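Here is the same back-of-the-envelope calculation as a small sketch. The 9 runs per win is my own rounding of "3.9 WAR is roughly 35 runs", and the 0.7 runs per play is the approximate single-versus-ground-out value mentioned above; both are rough assumptions, not official constants.

```python
# Back-of-the-envelope: how many extra plays does a given dWAR imply?
RUNS_PER_WIN = 9.0    # assumption: 35 runs / 3.9 WAR is about 9 runs per win
RUNS_PER_PLAY = 0.7   # assumption: approximate value of an out versus a single

def extra_plays(dwar, runs_per_win=RUNS_PER_WIN, runs_per_play=RUNS_PER_PLAY):
    """Plays made beyond a replacement-level fielder implied by a dWAR value."""
    return dwar * runs_per_win / runs_per_play

simmons_plays = extra_plays(3.9)        # roughly 50 plays
typical_plays = extra_plays(1.0)        # roughly 12 to 13 plays
games_per_play = 150 / simmons_plays    # roughly one extra play every 3 games

print(f"3.9 dWAR: about {simmons_plays:.0f} extra plays")
print(f"1.0 dWAR: about {typical_plays:.0f} extra plays")
print(f"one extra play every {games_per_play:.1f} games over 150 games")
```

Changing either constant moves the totals, but not by enough to change the basic picture of a play or two a week.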

Link to comment
Share on other sites

I remember when the Twins traded for Capps, somebody posted some stats about how overrated "closers" were.  It went something like this: "closers" starting the 9th inning with nobody on base pitched a scoreless inning 92% of the time.

 

Other pitchers starting an inning with nobody on base pitched a scoreless inning 90% of the time.  I can't remember if it was any random inning but the point remains.

 

That's why I hated the Capps deal so much.  You didn't need any special stats to tell you that he wasn't going to be much different than Jon Rauch was the first half of the year.  To this day Bill Smith still maintains that was a key acquisition for the playoff run that year; I'll refrain from editorial comment on his tenure.
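To put the 92% versus 90% figures in season terms, here is a quick sketch. The 60 appearances is a hypothetical workload I picked for illustration, and the percentages are the ones recalled above from memory, so treat the output as a rough scale check only.

```python
# How many more non-scoreless innings does a 90% pitcher allow than a 92% one?

def expected_non_scoreless(appearances, scoreless_rate):
    """Expected number of clean-start innings in which at least one run scores."""
    return appearances * (1 - scoreless_rate)

APPEARANCES = 60  # hypothetical season workload, not from the quoted stats

closer_bad_innings = expected_non_scoreless(APPEARANCES, 0.92)  # about 4.8
other_bad_innings = expected_non_scoreless(APPEARANCES, 0.90)   # about 6.0
difference = other_bad_innings - closer_bad_innings             # about 1.2

print(f"closer: {closer_bad_innings:.1f} non-scoreless innings")
print(f"other pitcher: {other_bad_innings:.1f} non-scoreless innings")
print(f"difference over {APPEARANCES} appearances: {difference:.1f}")
```

Under those assumptions the gap is a little over one inning per season, and not every one of those innings would even cost a lead.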

Link to comment
Share on other sites

I don't necessarily think that the "absurdly high" outliers of dWAR should be dismissed so quickly. A 3.9 dWAR may seem like a lot, but I don't think it is impossible. A quick back-of-the-envelope calculation:

 

3.9 WAR is roughly 35 runs in the 2014 run environment. Using linear weights (the same methods used to calculate offensive WAR that most people are comfortable with), the value of a single versus a ground-out is approximately 0.7 runs (it varies based on run environments and other factors, but that is in the ballpark). So roughly speaking, during 2014 Simmons needed to convert 50 plays that a replacement level shortstop wouldn't make. Over the course of 150 games, that is one every three games - essentially twice a week.

 

I don't think that kind of performance is unreasonable at all, especially for the best of the best defenders. Maybe I'm misunderstanding your argument, but it seems like limiting dWAR to a realistic level of 1.0 or so and ignoring outliers is equally absurd. That is essentially arguing that the difference between the best and the worst defenders is just 12 plays a season. 

I'm not suggesting that outliers don't exist... I'm simply not confident that dWAR (and UZR) is capable of accurately measuring those players.

 

Simmons is a great defender. He might be worth four wins with the glove. Do I trust dWAR to tell me that? No, not really.

Link to comment
Share on other sites

So I'm going to basically sum up the argument against WAR from the first page without having read the second page.

 

People use WAR poorly and it isn't perfect, therefore we shouldn't put much weight into it.

 

Just because people ignore speed limits and some roads are inaccurately limited (speed too high or low) doesn't mean we shouldn't use speed limits.  It means we need to continue to develop our guidelines for how we designate proper speed limits, and then make sure people are using them correctly.

 

No one will make the argument right now that WAR, particularly the defensive aspect of it, is perfected.  And if you think people are using the statistic incorrectly, then work to educate them on the statistic.  It's far and away the best statistic out there, but like every other statistic, shouldn't be used on its own.

 

Someone said there are better offensive statistics, and I'd love to hear what they are.  The eye test does not count.  It is almost impossible for one person to watch every player at one position in every game and accurately quantify their performances, let alone all 10 positions (including DH).

Link to comment
Share on other sites

People use WAR poorly and it isn't perfect, therefore we shouldn't put much weight into it.

 

Just because people ignore speed limits and some roads are inaccurately limited (speed too high or low) doesn't mean we shouldn't use speed limits.  It means we need to continue to develop our guidelines for how we designate proper speed limits, and then make sure people are using them correctly.

 

No one will make the argument right now that WAR, particularly the defensive aspect of it, is perfected.  And if you think people are using the statistic incorrectly, then work to educate them on the statistic.  It's far and away the best statistic out there, but like every other statistic, shouldn't be used on its own.

 

Someone said there are better offensive statistics, and I'd love to hear what they are.  The eye test does not count.  It is almost impossible for one person to watch every player at one position in every game and accurately quantify their performances, let alone all 10 positions (including DH).

 

You are derailing a thread with a scarecrow here.  No one said it shouldn't be used, but the presentation of that statistic is consistently off-track.  It says something about the statistic if it's that poorly used just as it says something about a speed limit if no one (including cops) pays attention to it.

 

If you want to bring us all back to 55 with WAR - by all means, but we're well past that point with how the vast majority of people employ the statistic.  (Including Fangraphs and its writers).  So step one for pro-WAR people should be to get their own camp in-line with the proper use of the statistic rather than continuing to laud how great it is.  

 

I also bristle at the notion of dWAR being "not perfected"; that's roughly akin to saying Michael Bay's movie direction is "not perfected".  The defensive component of WAR is built on observer bias, small sample sizes, total disregard for positional importance, and a host of other issues.  And rarely, if ever, are these problems brought to light.  Instead it's just "not perfected".

 

A lack of alternatives doesn't make a bad thing a good thing - that's a poor argument if that's the best you got in WAR's favor.

Link to comment
Share on other sites

Two ex-Twins demonstrate to me many of the flaws of WAR. Nick Punto gets great marks for his defense and has a career WAR of 15.0. Cuddyer gets dWAR deducted and has a career WAR of 15.9 with more time on the field than Punto. Without bringing contracts into the discussion, would any team trade Cuddyer to get Punto? Regarding contracts, Punto signed as a free agent in 2014 and made $2.75M last year. Since he left the Twins he's made $750,000 (2011), $1.5M (2012 & 2013), and the aforementioned $2.75M. Cuddyer left one year later and has made $10.5M each of the last three years and will make $21M over the next two years.

Link to comment
Share on other sites

I think there are many flaws in WAR, enough to render it not worth serious consideration in and of themselves. Many of those flaws have been detailed here very well by others.

 

But I find the idea of "positional adjustments" so extremely wrong, at such a basic level, to render the entire premise faulty in and of itself. Ignoring the defensive issues, WAR purports to identify the worth of a given set of offensive outcomes, and translate those outcomes into wins. Fine. But then we're asked to make the leap of logic that the exact same outcomes are somehow worth more wins if performed by someone who lines up at SS, rather than 1b.

 

If we've identified the worth of a given offensive performance, and translated that into a number...what possible difference does it make where that comes from? Do we put 1.05 runs on the board if a SS hits a HR, and .95 runs if from a 1b man?

 

And yet a SS starts with a WAR advantage over a first baseman.

 

Why? Every team has one of each in the lineup. Just because big offensive numbers are harder to come by from a SS doesn't mean they add up to more runs. Those big numbers are already included in the WAR total.

Link to comment
Share on other sites

I think there are many flaws in WAR, enough to render it not worth serious consideration in and of themselves. Many of those flaws have been detailed here very well by others.

 

But I find the idea of "positional adjustments" so extremely wrong, at such a basic level, to render the entire premise faulty in and of itself. Ignoring the defensive issues, WAR purports to identify the worth of a given set of offensive outcomes, and translate those outcomes into wins. Fine. But then we're asked to make the leap of logic that the exact same outcomes are somehow worth more wins if performed by someone who lines up at SS, rather than 1b.

 

If we've identified the worth of a given offensive performance, and translated that into a number...what possible difference does it make where that comes from? Do we put 1.05 runs on the board if a SS hits a HR, and .95 runs if from a 1b man?

 

And yet a SS starts with a WAR advantage over a first baseman.

 

Why? Every team has one of each in the lineup. Just because big offensive numbers are harder to come by from a SS doesn't mean they add up to more runs. Those big numbers are already included in the WAR total.

A SS who hits 20 HR a year while batting .250 would be rarer than a 1B who does the same, so in terms of how much better this player is than a replacement-level player, those numbers mean more coming from a SS than from a 1B.

Link to comment
Share on other sites

WAR was stated to be a measure of what a player's contribution was toward the team winning. As any defensive metric is going to be subjective, there will be a hole in any all-encompassing so-called advanced metric. Fangraphs clearly states it is not a precise tool. I don't think Smith ever called it precise. If the curators or creators do not call it a precise tool, how can the imprecision be assailed? It is the users who appear not to know what the tool is used for and assail the tool for what they have forgotten.

Link to comment
Share on other sites

I think there are many flaws in WAR, enough to render it not worth serious consideration in and of themselves. Many of those flaws have been detailed here very well by others.

 

But I find the idea of "positional adjustments" so extremely wrong, at such a basic level, to render the entire premise faulty in and of itself. Ignoring the defensive issues, WAR purports to identify the worth of a given set of offensive outcomes, and translate those outcomes into wins. Fine. But then we're asked to make the leap of logic that the exact same outcomes are somehow worth more wins if performed by someone who lines up at SS, rather than 1b.

 

If we've identified the worth of a given offensive performance, and translated that into a number...what possible difference does it make where that comes from? Do we put 1.05 runs on the board if a SS hits a HR, and .95 runs if from a 1b man?

 

And yet a SS starts with a WAR advantage over a first baseman.

 

Why? Every team has one of each in the lineup. Just because big offensive numbers are harder to come by from a SS doesn't mean they add up to more runs. Those big numbers are already included in the WAR total.

I think you are missing a step in the WAR calculation that may help explain the positional adjustment. I would characterize WAR this way:

"WAR purports to identify the worth of a given set of offensive outcomes, finds the difference between that value and league average production, adjusts that difference to replacement level, and then translates the adjusted difference into wins."

WAR doesn't directly care about the raw number of runs that a player created during the season. Everything is adjusted to be relative to the context of the league. That allows comparisons to be made between players in different eras or run environments.

 

This is why the positional adjustments make sense. Like you said, every team has one of each in the lineup. Last year, the league average OPS was .700, and let's say my team had a 1B and a SS who both had .700 OPS. Last year, 22 other teams had above-average production from 1B, which means that my 1B was actively hurting my team relative to the rest of the league. However, only 6 other teams had above-average SS, which means that my SS was helping my team win relative to the rest of the league. So even though both my 1B and SS produced the same number of runs from their offensive performance, relative to the rest of the league the 1B was hurting my team, while the SS was helping. That is why I think it makes sense for the SS to get a positive adjustment, while 1B gets a negative adjustment. I do think there are many (potential) issues with the current implementation of positional adjustments, but overall I think the concept is correct.
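A toy sketch of that idea, for illustration only. The .700 league OPS comes from the example above, but the positional averages below are made-up numbers, and this is not the formula any actual WAR implementation uses for its positional adjustment; it just shows why the same bat grades differently against different peers.

```python
# Toy illustration: the same .700 OPS measured against positional peers.
LEAGUE_OPS = 0.700  # from the example above

# Hypothetical positional averages, invented for this sketch.
POSITION_AVG_OPS = {"1B": 0.750, "SS": 0.670}

def ops_vs_position(player_ops, position):
    """Player OPS minus the (hypothetical) average OPS at his position."""
    return player_ops - POSITION_AVG_OPS[position]

first_baseman = ops_vs_position(LEAGUE_OPS, "1B")  # negative: below his peers
shortstop = ops_vs_position(LEAGUE_OPS, "SS")      # positive: above his peers

print(f"1B with .700 OPS vs. average 1B: {first_baseman:+.3f}")
print(f"SS with .700 OPS vs. average SS: {shortstop:+.3f}")
```

Both players put the same runs on the board, but the team with the .700 shortstop gains ground on the league at that lineup spot while the team with the .700 first baseman loses it.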

Link to comment
Share on other sites

You are derailing a thread with a scarecrow here.  No one said it shouldn't be used, but the presentation of that statistic is consistently off-track.  It says something about the statistic if it's that poorly used just as it says something about a speed limit if no one (including cops) pays attention to it.

 

If you want to bring us all back to 55 with WAR - by all means, but we're well past that point with how the vast majority of people employ the statistic.  (Including Fangraphs and its writers).  So step one for pro-WAR people should be to get their own camp in-line with the proper use of the statistic rather than continuing to laud how great it is.  

 

I also bristle at the notion of dWAR being "not perfected"; that's roughly akin to saying Michael Bay's movie direction is "not perfected".  The defensive component of WAR is built on observer bias, small sample sizes, total disregard for positional importance, and a host of other issues.  And rarely, if ever, are these problems brought to light.  Instead it's just "not perfected".

 

A lack of alternatives doesn't make a bad thing a good thing - that's a poor argument if that's the best you got in WAR's favor.

 

Let's talk about derailing a conversation and strawmen.  At any point did I make a statement about the quality of WAR as a statistic that wasn't comparative?  I never said WAR was a great statistic.  I even said it was flawed, particularly the fielding aspect of it (I don't see much debate on oWAR).  StatCast is going to play a huge role in developing fielding statistics once it comes out and will make dWAR's current iterations obsolete.  But until we have access to better data, there's just not a lot to do about defensive statistics.  What I did say is that there isn't currently a better statistic out there.

 

At the end of the day, WAR is nothing more than a descriptive statistic.  It's not meant to make predictions, it's meant to evaluate how a player performed in a given season.  It is the best statistic we have right now at doing so, but anyone who lauds WAR to be anything more (or rails against it for not doing more) is using it incorrectly.

Link to comment
Share on other sites

Two ex-Twins demonstrate to me many of the flaws of WAR. Nick Punto gets great marks for his defense and has a career WAR of 15.0. Cuddyer gets dWAR deducted and has a career WAR of 15.9 with more time on the field than Punto. Without bringing contracts into the discussion, would any team trade Cuddyer to get Punto? Regarding contracts, Punto signed as a free agent in 2014 and made $2.75M last year. Since he left the Twins he's made $750,000 (2011), $1.5M (2012 & 2013), and the aforementioned $2.75M. Cuddyer left one year later and has made $10.5M each of the last three years and will make $21M over the next two years.

 

This is an instance of really not using the statistic well, though.  Comparing Punto to another middle infielder would be a valid use of WAR, but Cuddyer should be compared with other RF/1B types for a more accurate comparison.

Link to comment
Share on other sites

I never said WAR was a great statistic.  

 

At the end of the day, WAR is nothing more than a descriptive statistic.  It's not meant to make predictions, it's meant to evaluate how a player performed in a given season.  It is the best statistic we have right now at doing so, but anyone who lauds WAR to be anything more (or rails against it for not doing more) is using it incorrectly.

 

Except it's not even a good measure of how a player performed in a given season because UZR is a very poor measure of defense in a one year sample.  (Among a variety of other factors) So while you are partially using the statistic at its best (for comparisons), you are also failing to take into account known problems with the statistic to use for a year-to-year comparison.  WAR is pretty poor at the function you just described as well.

 

What you quoted still stands - people are arguing this is a good stat or the "far and away the best" - and yet that argument essentially boils down to "it's the least sucky available".  Just because Brussels sprouts are the last thing left to eat doesn't make them good, nor does WAR being the least flawed composite stat make it any good.  Which, to take it back to the original post, is all Bill James said: WAR isn't nearly the golden goose it's often made out to be.  It's still pretty unimpressive itself.

Link to comment
Share on other sites

Except it's not even a good measure of how a player performed in a given season because UZR is a very poor measure of defense in a one year sample.  (Among a variety of other factors) So while you are partially using the statistic at its best (for comparisons), you are also failing to take into account known problems with the statistic to use for a year-to-year comparison.  WAR is pretty poor at the function you just described as well.

 

What you quoted still stands - people are arguing this is a good stat or the "far and away the best" - and yet that argument essentially boils down to "it's the least sucky available".  Just because Brussels sprouts are the last thing left to eat doesn't make them good, nor does WAR being the least flawed composite stat make it any good.  Which, to take it back to the original post, is all Bill James said: WAR isn't nearly the golden goose it's often made out to be.  It's still pretty unimpressive itself.

 

Again... I never made an argument for the validity of the stat, only its placement in comparison to other currently available stats.  If you're looking for me to tell you, "Wow, you sure are right about dWAR," don't hold your breath.  If you want to sit here and rail against the statistic because one component of it is lacking due to poor data availability, then be my guest.

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.
