Article: Mailbag: Available Pitchers, Buxton Hype, Baseball Time Machine

TheLeviathan · March 9, 2019

Defensive metrics continue to improve. No doubt. I think all that skeptics like Chief and myself say is to acknowledge the blind spots. Many say they do, but don't really.

Correct. I hear the proponents proclaim loudly how they understand that one-year samples are not necessarily a great use of the metrics, but then think nothing of citing WAR in a one-year sample.

I look at defensive metrics, I'm not as skeptical as Chief, but the blinders go on REAL fast for a lot of people. It's not purposeful, I think people just like having the quick and dirty stats and frequently forget how dirty they really are.

Edited March 9, 2019 by TheLeviathan

Jim Hahn · March 9, 2019

The problem with WAR is that people like it. It allows them to do things that are otherwise difficult to do. Like comparing players that play different positions. Or even comparing the value a pitcher brings to a team with that of a position player.

The issue with this is that WAR is a subjective stat. It is taking stats that someone thinks are important, weighing them to emphasize the most important, and then combining them and putting them on a scale. Assuming the math is legitimate, we are, when using WAR, agreeing that the author used the correct stats and weighed them the way we individuals would if we were comparing two or more players. That is nonsense, of course. Not everybody on this site would agree on something as simple as whether OB% or slugging % is more important, for example, much less what stats we should use or which are most important.

WAR is largely a lazy way to compare players. Since it is subjective, it is not WRONG, merely an opinion about how players compare.

Many of the new stats like FIP and most defensive stats, are like that. They are subjective stats that give you an opinion about what value players bring to their teams.

ashbury · March 9, 2019

The issue with this is that WAR is a subjective stat.

I really think the issue you are getting at is abstraction, not subjectivity.

A subjective stat would be if I created Ashbury's All-around High-performers (AAH!), which is batting average, plus a value that I choose for each player. Bryce Harper batted .249 last year but his AAH is .749 because I think he's a hell of a player and I give him .500 more. Ehire's AAH is just his .251 BA because I think he's only OK. You don't want to know what AAH I assigned Pedro Florimon a few years ago.

Harper had exactly 100 RBI last year, which sounds like a made up round number but is actually computed in a way that MLB has defined for over a century now. Harper had a bWAR of 1.3 because of a formula some guy developed in 1995, or 2007, or maybe 2013. (Actually RBI has gone through changes in its time too - I'd have to go back and check the specifics but runs scored on sac flies and double plays counted or didn't count depending on which season you are looking up.) If you accept the framework, you can verify the numbers in each case.

What both of these objective stats leave unanswered is, "... and this is important, why?" RBI is situational, so that for instance a player who was hurt for half a season will appear less valuable, and a guy with weak-hitting teammates will get fewer chances. WAR tries to figure out how a player correlates to his team's chances of winning. The importance of either stat depends on what you want to use it for, and subjectivity definitely comes in there. As a different example, FIP has importance to some people, because studies have shown that it is a better predictor for the next season's ERA, than is ERA itself. FIP is also not subjective, as you can check any website's number by computing it yourself, but its use certainly is subjective, as we saw for instance with Ricky Nolasco.

The difference, to me, is that RBI faithfully records what happened. It's extremely concrete. WAR and FIP are both abstractions. There's no such thing as an actual Win in the standings being allocated to anyone but a particular pitcher (unless you are a Bill James acolyte) so WAR tries to remedy that - and there are other similar remedies to consider instead, for instance BJ's own Win Shares. FIP tries to recognize that pitchers sometimes run into hard luck, and again there are other remedies for that if FIP itself bothers you, or you can stick with ERA and its ability to record what demonstrably did happen.

Sorry to wax philosophical. My recollection is that you aren't averse to peeling back the layers of the onion a bit, now and then.

Mr. Brooks · March 10, 2019

The problem with WAR is that people like it. It allows them to do things that are otherwise difficult to do. Like comparing players that play different positions. Or even comparing the value a pitcher brings to a team with that of a position player.

The issue with this is that WAR is a subjective stat. It is taking stats that someone thinks are important, weighing them to emphasize the most important, and then combining them and putting them on a scale. Assuming the math is legitimate, we are, when using WAR, agreeing that the author used the correct stats and weighed them the way we individuals would if we were comparing two or more players. That is nonsense, of course. Not everybody on this site would agree on something as simple as whether OB% or slugging % is more important, for example, much less what stats we should use or which are most important.

WAR is largely a lazy way to compare players. Since it is subjective, it is not WRONG, merely an opinion about how players compare.

Many of the new stats like FIP and most defensive stats, are like that. They are subjective stats that give you an opinion about what value players bring to their teams.

There is nothing subjective that goes into FIP. It's a measurement of something that happened, just like ERA or OBP.

jorgenswest · March 10, 2019

There is nothing subjective that goes into FIP. It's a measurement of something that happened, just like ERA or OBP.

(13*HR+3*(HBP+BB)-2*K)/IP + constant (usually around 3.2)

Some might argue that the weights assigned to each event are worthy of debate.
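As a rough sketch of the formula quoted above, here is how it could be computed. The function name `fip`, the default constant of 3.2, and the season line are illustrative assumptions, not figures from any site; innings are assumed to be in true decimal form (200 and one third innings as 200.333, not the box-score notation 200.1).

```python
def fip(hr, bb, hbp, k, ip, constant=3.2):
    """Fielding Independent Pitching using the weights quoted in the thread.

    `constant` defaults to the "usually around 3.2" figure from the post;
    the real value varies by season, so treat it as an assumption.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (hbp + bb) - 2 * k) / ip + constant


# Hypothetical season line, made up for illustration only.
season = {"hr": 20, "bb": 45, "hbp": 5, "k": 180, "ip": 200.0}
print(f"FIP: {fip(**season):.2f}")
```

Anyone who plugs in the same inputs and the same constant gets the same number. The debatable part is the 13, 3, and 2, not the arithmetic.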

Mr. Brooks · March 10, 2019

(13*HR+3*(HBP+BB)-2*K)/IP + constant (usually around 3.2)

Some might argue that the weights assigned to each event are worthy of debate.

Sure, but still not subjective. It's the exact same formula for every pitcher, and every part of the equation is a yes/no actual occurrence.

Jham · March 10, 2019

Sure, but still not subjective. It's the exact same formula for every pitcher, and every part of the equation is a yes/no actual occurrence.

No, the weighting matters. Presumably, you're comparing pitchers based on FIP and suggesting the pitcher with the lower FIP (objective) is a better (subjective) pitcher. But the score is weighted, and the weighting may favor Ks over limiting walks or getting ground balls.

The real question is why the heck we try to predict ERA at all. Shouldn't FIP and xFIP be relatively constant?

Platoon · March 11, 2019

I tell old people to watch their step on ice.
With the kids, you hope they get a million-dollar contract playing for the Vancouver Canucks. And if that’s not possible... you hope that they will hold the old guy's arm as they navigate the slippery sidewalk.

I dread to consider what you consider "old people", but I do acknowledge that ice (and concrete) are far harder than I remember them to be in my past!

Mr. Brooks · March 11, 2019

No, the weighting matters. Presumably, you're comparing pitchers based on FIP and suggesting the pitcher with the lower FIP (objective) is a better (subjective) pitcher. But the score is weighted, and the weighting may favor Ks over limiting walks or getting ground balls.

The real question is why the heck we try to predict ERA at all. Shouldn't FIP and xFIP be relatively constant?

I never said it didn't matter.

I said it's still objective.

For FIP to be subjective, it would require the ability for two different people to give the exact same sequence a different score, like figure skating scores.

FIP itself is not subjective, regardless of what one thinks of the formula behind it. If 1000 people calculate the FIP of a player, assuming their math is correct, all 1000 of them mathematically MUST come to the exact same number. Therefore, it's factually not subjective.

jorgenswest · March 11, 2019

Wouldn’t every calculation of UZR come out the same?

The subjective piece has always been part of baseball. A score keeper is deciding whether a hit is an error and 1000 people might not always make the same decision. An umpire has to decide if it is a ball or a strike. Once determined the calculation is the same. I think UZR and DRS would be the same.

If we want to use a measure to project the future, it is important to consider how well it correlates year to year. The defensive measures correlate similarly to many outcome-based hitting stats like slugging percentage. Like those stats, they need a full season of data or more to be reliable.
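One way to check how well a measure "correlates year to year" is a plain Pearson correlation between the same players' values in consecutive seasons. The sketch below is only an illustration: the helper name `pearson` and both lists of numbers are made up, and a real study would use many more players and control for playing time.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical values of one stat for the same five players in two seasons.
year1 = [3.10, 3.85, 4.40, 2.95, 5.05]
year2 = [3.40, 3.60, 4.75, 3.20, 4.50]
print(f"year-to-year r = {pearson(year1, year2):.2f}")
```

A value near 1 would suggest the stat is stable from one season to the next; a value near 0 would suggest a single season tells you little about the following one.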

Riverbrian · March 11, 2019

I dread to consider what you consider "old people", but I do acknowledge that ice (and concrete) are far harder than I remember them to be in my past!

As of March 11th, 2019

Anybody older than 53 Years 5 Months and 21 Days is "Old People"

As of March 12th, 2019

It will be anybody older than 53 Years 5 Months and 22 Days.

Mike Sixel · March 11, 2019

The problem with WAR is that people like it. It allows them to do things that are otherwise difficult to do. Like comparing players that play different positions. Or even comparing the value a pitcher brings to a team with that of a position player.

The issue with this is that WAR is a subjective stat. It is taking stats that someone thinks are important, weighing them to emphasize the most important, and then combining them and putting them on a scale. Assuming the math is legitimate, we are, when using WAR, agreeing that the author used the correct stats and weighed them the way we individuals would if we were comparing two or more players. That is nonsense, of course. Not everybody on this site would agree on something as simple as whether OB% or slugging % is more important, for example, much less what stats we should use or which are most important.

WAR is largely a lazy way to compare players. Since it is subjective, it is not WRONG, merely an opinion about how players compare.

Many of the new stats like FIP and most defensive stats, are like that. They are subjective stats that give you an opinion about what value players bring to their teams.

Uh, isn't choosing what stats to use "biased" under your terms no matter what you choose (or subjective)? I fail to understand your argument, unless you are arguing you literally can't compare players at all....because that's the logical conclusion of this post.

BrianTrottier · March 11, 2019

There is nothing subjective that goes into FIP. It's a measurement of something that happened, just like ERA or OBP.

That's not entirely true. There is occasionally a scorekeeper impact on what's counted as an earned or unearned run. Similarly, OBP is influenced by subjective inputs, such as what's a ball and what's a strike.

Mike Sixel · March 11, 2019

That's not entirely true. There is occasionally a scorekeeper impact on what's counted as an earned or unearned run. Similarly, OBP is influenced by subjective inputs, such as what's a ball and what's a strike.

By that logic, every single play is subjective. And no measure will ever be useful to some people then....

BrianTrottier · March 11, 2019

by that logic, every single play is subjective. And no measure will ever be useful to some people then....

I'm not saying that those measures aren't useful. I'm just disagreeing that "there's nothing subjective" that goes into them.

And you're probably correct that there are some people who will never view statistics or analytics as useful.

yarnivek1972 · March 11, 2019

The problem with WAR is that people like it. It allows them to do things that are otherwise difficult to do. Like comparing players that play different positions. Or even comparing the value a pitcher brings to a team with that of a position player.

The issue with this is that WAR is a subjective stat. It is taking stats that someone thinks are important, weighing them to emphasize the most important, and then combining them and putting them on a scale. Assuming the math is legitimate, we are, when using WAR, agreeing that the author used the correct stats and weighed them the way we individuals would if we were comparing two or more players. That is nonsense, of course. Not everybody on this site would agree on something as simple as whether OB% or slugging % is more important, for example, much less what stats we should use or which are most important.

WAR is largely a lazy way to compare players. Since it is subjective, it is not WRONG, merely an opinion about how players compare.

Many of the new stats like FIP and most defensive stats, are like that. They are subjective stats that give you an opinion about what value players bring to their teams.

Kinda like QB rating in football. No one truly knows how to measure it, but everyone believes in it.

Mr. Brooks · March 12, 2019

I'm not saying that those measures aren't useful. I'm just disagreeing that "there's nothing subjective" that goes into them.

And you're probably correct that there are some people who will never view statistics or analytics as useful.

Again, I never argued that there is nothing subjective behind it.

Just that FIP itself isn't subjective, and it's factually not.

Balls and strikes, wild pitches, passed balls, errors, these things are subjective. FIP is not.

TheLeviathan · March 12, 2019

Again, I never argued that there is nothing subjective behind it.
Just that FIP itself isn't subjective, and it's factually not.
Balls and strikes, wild pitches, passed balls, errors, these things are subjective. FIP is not.

The conversation is not about whether people can read a formula and put it in a calculator, the question is whether the formula itself is too subjective. I don't think that was unclear from the discussion.

Your point seems to be an unnecessary quibble given that.

Mr. Brooks · March 12, 2019

The conversation is not about whether people can read a formula and put it in a calculator, the question is whether the formula itself is too subjective. I don't think that was unclear from the discussion.

Your point seems to be an unnecessary quibble given that.

It may have morphed into that, but that wasn't what I originally responded to.

In fact, the original post I responded to wouldn't even make sense in the context you suggest, as pretty much every stat has some level of subjectivity behind the formula.

How is the formula behind FIP any more subjective than the formula behind ERA?

TheLeviathan · March 12, 2019

It may have morphed into that, but that wasn't what I originally responded to.
In fact, the original post I responded to wouldn't even make sense in the context you suggest, as pretty much every stat has some level of subjectivity behind the formula.
How is the formula behind FIP any more subjective than the formula behind ERA?

Jorgenswest already tried to help you on this. And the original poster's point was on the same thing. I'm not sure how you got on this tangent, but it's a strawman.

No one doubts that if you are given 5 x 2 you'll get 10. Of course that's objective. But it isn't whether 5 x 2 is objective or subjective, it's the question of whether it should be 5 or should be 2 and the decisions made in that regard. What's subjective is deciding it should be 5 or should be 2.

What you decide to incorporate and how you weigh it is an act of choosing what is most important. That is, by definition, subjective. Now, you can have strong reasons for doing so. You can test it and find the results in line with what you are trying to measure, but nonetheless there are elements that are subjective.

Hell, on WAR alone we have multiple ways to calculate it based on what different places choose to weigh. That doesn't eliminate them from being useful, but it pretty much squashes the "objective" argument.
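To make the "should it be 5 or should it be 2" point concrete, here is a small sketch showing that the same two stat lines can rank in opposite order under two different sets of weights. Everything in it is hypothetical: the pitchers, their numbers, and the second weight set (15, 3, 2) are invented for illustration, walks and hit batters are lumped into one `bb` field, and the additive constant is left out because it does not affect the ordering.

```python
def weighted_score(line, w_hr, w_bb, w_k):
    """FIP-style score per inning without the constant; lower is better."""
    return (w_hr * line["hr"] + w_bb * line["bb"] - w_k * line["k"]) / line["ip"]


def rank(pitchers, weights):
    """Return pitcher names sorted from best (lowest score) to worst."""
    return sorted(pitchers, key=lambda name: weighted_score(pitchers[name], *weights))


# Two made-up pitchers: A strikes out more but allows more home runs.
pitchers = {
    "A": {"hr": 30, "bb": 50, "k": 250, "ip": 200.0},
    "B": {"hr": 15, "bb": 50, "k": 150, "ip": 200.0},
}

print(rank(pitchers, (13, 3, 2)))  # weights quoted in the thread: ['A', 'B']
print(rank(pitchers, (15, 3, 2)))  # hypothetical heavier HR weight: ['B', 'A']
```

Each ranking is fully reproducible once the weights are fixed. The disagreement lives entirely in which weights to fix.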

Edited March 12, 2019 by TheLeviathan

Otto von Ballpark · March 12, 2019

The problem with WAR is that people like it. It allows them to do things that are otherwise difficult to do. Like comparing players that play different positions. Or even comparing the value a pitcher brings to a team with that of a position player.

The issue with this is that WAR is a subjective stat. It is taking stats that someone thinks are important, weighing them to emphasize the most important, and then combining them and putting them on a scale. Assuming the math is legitimate, we are, when using WAR, agreeing that the author used the correct stats and weighed them the way we individuals would if we were comparing two or more players. That is nonsense, of course. Not everybody on this site would agree on something as simple as whether OB% or slugging % is more important, for example, much less what stats we should use or which are most important.

WAR is largely a lazy way to compare players. Since it is subjective, it is not WRONG, merely an opinion about how players compare.

Many of the new stats like FIP and most defensive stats, are like that. They are subjective stats that give you an opinion about what value players bring to their teams.

Every single method of baseball player evaluation since the dawn of time has been subjective. That's not unique to "new stats".

Mr. Brooks · March 13, 2019

Jorgenswest already tried to help you on this. And the original poster's point was on the same thing. I'm not sure how you got on this tangent, but it's a strawman.

No one doubts that if you are given 5 x 2 you'll get 10. Of course that's objective. But it isn't whether 5 x 2 is objective or subjective, it's the question of whether it should be 5 or should be 2 and the decisions made in that regard. What's subjective is deciding it should be 5 or should be 2.

What you decide to incorporate and how you weigh it is an act of choosing what is most important. That is, by definition, subjective. Now, you can have strong reasons for doing so. You can test it and find the results in line with what you are trying to measure, but nonetheless there are elements that are subjective.

Hell, on WAR alone we have multiple ways to calculate it based on what different places choose to weigh. That doesn't eliminate them from being useful, but it pretty much squashes the "objective" argument.

Ok, in that context, then what was the point of the post I responded to? Almost every stat has a level of subjectivity behind how it's calculated, so why did he single out FIP?

TheLeviathan · March 13, 2019

Ok, in that context, then what was the point of the post I responded to? Almost every stat has a level of subjectivity behind how it's calculated, so why did he single out FIP?

FIP has more subjectivity than, say, OBP. It's about the levels, stats like WAR and a few others certainly have a higher component than many others.

Basically, any "value" stats are going to be more subjective than others.

Mr. Brooks · March 13, 2019

FIP has more subjectivity than, say, OBP. It's about the levels, stats like WAR and a few others certainly have a higher component than many others.

Basically, any "value" stats are going to be more subjective than others.

Well I don't agree.

FIP has hardly any subjectivity behind it. The multipliers only exist to put it on a comparable scale to ERA, not as an attempt to place more value on certain components of it.

TheLeviathan · March 13, 2019

Well I don't agree.
FIP has hardly any subjectivity behind it. The multipliers only exist to put it on a comparable scale to ERA, not as an attempt to place more value on certain components of it.

From Fangraphs: The individual weights for home runs, walks/HBP, and strikeouts are based on the relative values of those actions with respect to run prevention.

The constant is what puts it on scale with ERA. The multipliers are value driven.

And that is different than calculating ERA or OBP. There are no multipliers or value adjustments. (And, before some new tangent arrives, I'm not saying that makes the stat lesser. It just makes it some degree more subjective)

Jham · March 13, 2019

From Fangraphs: The individual weights for home runs, walks/HBP, and strikeouts are based on the relative values of those actions with respect to run prevention.

The constant is what puts it on scale with ERA. The multipliers are value driven.

And that is different than calculating ERA or OBP. There are no multipliers or value adjustments. (And, before some new tangent arrives, I'm not saying that makes the stat lesser. It just makes it some degree more subjective)

Fangraphs even explicitly cautions users on the weighting, particularly of HR, which typically occur in small samples. Thus it created a separate stat, xFIP, which assumes league-average HR rates.

That said, FIP, xFIP, and SIERA are all good stats that carefully and scientifically weight the input data based on historical correlations with run prevention. But... the game is changing in regard to, and perhaps because of, the areas these stats emphasize: Ks, BBs and HRs. There are two things I'd like to see. The first is whether the counting stats have the same run-prevention correlations in the modern game as they used to.

The second is how much more stable FIP is from year to year compared to ERA. Still don't get trying to predict ERA.

Mr. Brooks · March 13, 2019

Fair enough. Argument withdrawn.

Don Walcott · March 13, 2019

I think some of the confusion with this argument is that, as Yarnivek said, WAR and FIP are like the QB rating system. They aren't "statistics," though their formulas are based on statistics.

And as has been pointed out by others, it's how the rating system's formula is composed that is "subjective" to an extent. However, it's not that "subjective" formulas are inherently unreliable. Rather, they are as reliable as the person on whose expert opinion we rely, and the data that they are using. Even without diving deep into the composition of the formulas, we can look at different formulations of WAR, for example, and determine which outcome most closely aligns with what we believe reality is.

I prefer not to use WAR at all, because I believe it is generally used by people who believe it is authoritative, like a statistic, proving definitively how one player is better than another or how one player's season or career is better than another player's season or career. I think it's a lazy way of making an argument from authority, without really knowing much about the authority. And it's also not the correct way to use these rating systems, which are really intended to be predictive (attempting to equalize things like luck and park factors) and not a reflection of what actually happened.

Jim Hahn · March 14, 2019

I seem to have derailed this thread, unintentionally, by my use of the word "subjective". As Mr. Walcott said rather well above, my intent was to point out that WAR and FIP are subjective in the CHOICE of which stats they are constructed from and also the WEIGHING of those stats. Hence the different versions of these stats. Again, I am not claiming these stats are WRONG. Merely that these types of stats are more like opinions about the value players bring to their teams.

Back when I was young, there was debate about who was the best player on the Twins: Killebrew, Oliva, or Carew. They were hugely different players, bringing much different skill sets to the game. Two were Hall of Fame players; the third probably would have been except for injuries. You could use WAR to "settle" this debate. Except that this would still be an opinion based on what someone else thinks are the most important stats, weighed in a manner to reflect his views of which is more important. That view is not WRONG. It is merely an opinion. You can agree or disagree with it.

I think some people use WAR as a conversation stopper. WAR says so, so it must be right. To me, it is more of a place to start. It gives you one view of how players compare. You are free to come up with your own view. By the way, Oliva was the best player, if for too short a time. In my opinion.

Don Walcott · March 14, 2019

No way, Rod Carew was the best player who ever played for the Twins!!

. . .

And I'm coming up with a formula of statistics to prove it!!
