Twins Daily

How PECOTA Projections Could Miss on the Twins (Part One)



Projection systems are more robust than ever... but there are still whole swaths of baseball's unpredictability for which they cannot account.

According to Baseball Prospectus’s PECOTA system, the Twins have a 76.6-percent chance of winning the AL Central in 2020. FanGraphs pegs them lower, at 55.0 percent to win the division. FanGraphs gives them a 68.3-percent chance to make the playoffs in any form; BP has them at a gaudy 87.5 percent. Both systems project them for over 90 wins, though fewer than 95. However, as I was preparing the Twins’ team preview for Prospectus last week, I found myself noting ways in which PECOTA (and all projection systems, in fact) is unable to properly capture the risk and the upside of a team like Minnesota.

 

First, consider injury risk. At Baseball Prospectus, the only way in which those risks are accounted for is through the manual process of building team depth charts. Because Byron Buxton strongly tends to get hurt and miss portions of each season, he’s not projected for as full a share of the playing time in center field as he’ll enjoy whenever he’s healthy. However, the playing time he is projected to receive (75 percent of center field innings, and 490 total plate appearances) isn’t treated as a variable by the system.

 

The format for team previews at Prospectus this spring involves building out and breaking down the decile projections for each team—that is, the record each club would have given a 90th-percentile outcome, an 80th-percentile outcome, and so on, down to 10th. The idea is to sketch out how PECOTA captures the variance possible at a team level, based on the vagaries of individual performances. The decile projections are created using the individual decile projections PECOTA generates, with each team projection representing the outcome of 500 simulations wherein each team’s players all perform at that decile and the rest of the league is kept at their median projection.
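To make the mechanics concrete, here is a minimal sketch of a decile-style team simulation. It is not PECOTA's actual method: the runs-to-wins conversion (about 10 runs per win), the roster values, and every function name are assumptions for illustration, and "the rest of the league at median" is approximated by treating the opponent as a .500 team.

```python
import random

RUNS_PER_WIN = 10.0  # common rule of thumb, assumed here
GAMES = 162

def team_win_prob(player_runs_at_decile):
    """Turn summed runs above average into a per-game win probability."""
    wins_above_500 = sum(player_runs_at_decile) / RUNS_PER_WIN
    prob = 0.5 + wins_above_500 / GAMES
    return min(max(prob, 0.0), 1.0)

def simulate_decile(player_runs_at_decile, n_sims=500, seed=0):
    """Average wins over n_sims seasons with every player held at one decile."""
    rng = random.Random(seed)
    p = team_win_prob(player_runs_at_decile)
    wins = [sum(rng.random() < p for _ in range(GAMES)) for _ in range(n_sims)]
    return sum(wins) / n_sims

# Hypothetical roster: runs above average per player at two deciles.
roster = {
    "median": [15, 10, 5, 0, -5],
    "90th": [35, 28, 20, 12, 5],
}
for label, runs in roster.items():
    print(label, round(simulate_decile(runs), 1))
```

Note that nothing in the sketch varies except how well each player performs. Playing time never enters as a random quantity.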

 

Here are the Twins’ decile projections for 2020, as of last week:

  • 90th percentile: 109-53
  • 80th percentile: 104-58
  • 70th percentile: 100-62
  • 60th percentile: 98-64
  • 50th percentile: 94-68
  • 40th percentile: 90-72
  • 30th percentile: 86-76
  • 20th percentile: 81-81
  • 10th percentile: 74-88
That seems, at first blush, like a fair depiction of the real possibility spectrum for this team. The Twins are likely to be very good, and the only way they won’t at least be a respectable Wild Card contender is if the wheels truly come off.

 

However, in all nine of those amalgamated projections, and in all 4,500 simulations that underpin them, Byron Buxton goes to the plate 490 times, and is the Twins’ center fielder for 75 percent of their defensive innings. In all of the simulations, José Berríos pitches 184 innings, Michael Pineda pitches 100, and Rich Hill pitches 52. The system is only capturing variance created by players performing in unexpected ways; it doesn’t see injury risk.
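A rough sketch of what that omission costs, using the 490 plate appearances cited above. The per-plate-appearance run value and the triangular playing-time distribution are invented for illustration; they are not PECOTA inputs.

```python
import random
import statistics

def simulate_runs(runs_per_pa, pa_sampler, n_sims=4500, seed=1):
    """Season run totals when playing time comes from pa_sampler."""
    rng = random.Random(seed)
    return [runs_per_pa * pa_sampler(rng) for _ in range(n_sims)]

def fixed_pa(rng):
    # What the depth chart does: the same 490 PA in every simulation.
    return 490

def variable_pa(rng):
    # Hypothetical injury model: anywhere from 150 to 650 PA, most likely 490.
    return rng.triangular(150, 650, 490)

RUNS_PER_PA = 0.03  # hypothetical rate, held constant to isolate playing time

fixed = simulate_runs(RUNS_PER_PA, fixed_pa)
variable = simulate_runs(RUNS_PER_PA, variable_pa)

# No spread at all when playing time is fixed; a real spread when it is not.
print(statistics.pstdev(fixed))
print(statistics.pstdev(variable))
```

Holding the rate constant isolates the playing-time effect. A fuller model would vary both at once.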

 

For that matter, the system also only reflects variance in offensive performance. In all those simulations, Buxton (with his fixed playing time) is worth 20 runs as a defender in center field. In all of them, Miguel Sanó (whom BP has pegged for 65 percent of the playing time at first base, five percent at third base, and five percent at DH, reflecting some of his own injury history) is worth eight runs as a defender at first base, because third basemen who make the transition to first base while still in their prime tend to be good fielders there.

 

PECOTA has no engine for systematically scaling defensive production upward or downward, and even if it had one, it would be hard to decide how to use it. Would a player having a bad season at the plate be expected to struggle to the same extent in the field? Would a guy who turned out to be a better-than-expected defender lose something at bat because of his focus on defense and the wear and tear of playing harder in the field?

 

PECOTA isn’t alone in not being able to answer these questions, and BP isn’t alone in being unwilling to attempt to calibrate all of that manually. Playoff odds and other team-level projections, like the atomic player projections of which they consist, are fraught with oversimplifications and unexamined axes of upside and downside.

 

This is part one of a two-part exploration of the ways projection systems miss certain elements of the Twins' risk and upside. Check back tomorrow for five specific examples drawn from the Twins' individual projections.

 


I always laugh at the NFL coverage - 6 days of speculation and then one day of games.  In MLB it is 4 months of speculation and 8 months of games - I like that best. What no system can predict beyond injuries is the wild career year of some player who will impact the game from the mound or the batter's box.

 

If I give myself a challenge of predicting the season I would start with last year - then subtract any losses and add the WAR of the newest additions.  Of course that would mean a 110 win season for the Twins so I have to adjust for quality of opponents and that brings me back to last year's record.  Unlike the past, I expect us to win the division and my nerves are set on edge only as I look at the playoffs. 


I tried to ask this in the BP article comments section and I think you addressed it here, but I want to confirm...

 

The Twins' 90th percentile projection is just the 90th percentile result for all of the Twins' players? While that might be simpler for the projection system to calculate, a 90th percentile result for all Twins players logically seems like it would be a 99.99th percentile result for the team.


I had assumed this was the case with these projections, but it's good to understand it a little more clearly.

 

I also wonder how much it makes sense to say a team's 90th percentile outcome is when every player performs at their 90th percentile outcome. Somehow the latter seems far more unlikely to me? A team full of 90th percentile players seems more like a 99th percentile team. Maybe I'm not thinking of it correctly. Or maybe that indirectly accounts for a bit of the injury/playing time issue that Matthew is getting at.
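That intuition can be checked with a quick back-of-the-envelope sketch. It assumes player outcomes are independent and identically normal, which real teammates are not, so treat it as an upper bound on how extreme the stacked outcome is rather than a statement about how PECOTA works. The function names are made up for the example.

```python
from statistics import NormalDist

def all_at_percentile_prob(n_players, percentile=0.90):
    """Chance that every one of n independent players lands at or above the percentile."""
    return (1 - percentile) ** n_players

def team_percentile_of_stacked_outcome(n_players, percentile=0.90):
    """Where 'everyone at their own percentile' falls in the team-total distribution.

    Each player is modeled as a standard normal, so the team total is
    normal with mean 0 and standard deviation sqrt(n_players).
    """
    z = NormalDist().inv_cdf(percentile)
    team_total = n_players * z
    team_sd = n_players ** 0.5
    return NormalDist(0, team_sd).cdf(team_total)

for n in (1, 9, 25):
    print(n, all_at_percentile_prob(n), team_percentile_of_stacked_outcome(n))
```

Under those assumptions, nine hitters all reaching their 90th percentile at once is a one-in-a-billion event, and the stacked total sits well past the 99.9th percentile of team outcomes. If player outcomes move together (shared park, shared health luck, shared coaching), the gap shrinks.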


Maybe it doesn't make sense to do percentiles for health, but it seems like a really important missing component and certainly the variable that leads to the most inaccuracy in the numbers.

 

Seems like they should try to incorporate Very Bad/Bad/Normal/Good/Very Good health variables into the system, where "normal" is their current baseline and you dial innings/games played up or down from there.
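Something like this, as a rough sketch. The multipliers and the plate appearance cap are numbers I made up to show the idea, and the 490 baseline is the Buxton figure from the article.

```python
# Hypothetical health tiers: each scales the depth-chart baseline.
HEALTH_MULTIPLIERS = {
    "very_bad": 0.40,
    "bad": 0.70,
    "normal": 1.00,  # the current depth-chart projection
    "good": 1.15,
    "very_good": 1.30,
}
MAX_PA = 700  # assumed ceiling for a full season

def scaled_playing_time(baseline_pa, health):
    """Dial a player's projected plate appearances up or down by health tier."""
    scaled = round(baseline_pa * HEALTH_MULTIPLIERS[health])
    return min(scaled, MAX_PA)

for tier in HEALTH_MULTIPLIERS:
    print(tier, scaled_playing_time(490, tier))
```

The hard part would be deciding how likely each tier is for each player, which is probably why nobody does it.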
