What We Can Learn From 2019 Hitter Projection Accuracy (Fantasy Baseball)
“Prediction is very difficult, especially if it’s about the future.” – Niels Bohr, Nobel Laureate in Physics
When it really boils down to it, fantasy sports comes down to two things.
- Who gets the luckiest
- Who can predict the future the least horribly
Prediction is what we make our living on here at FantasyPros, and look, man, we aren’t that good at it. The easiest way to spot a charlatan is to measure his confidence. If he says he’s always right – he’s probably always wrong. I’m just one guy among a staff of writers here, but I do speak for all of us – we will never be right about the future anywhere near 100% of the time. What keeps the wheels turning here is that we can beat the majority of people, and also that we show up near the top of Google searches for fantasy sports advice.
Every prediction can, at some point, be checked. That’s what I’m here to do. I have compiled last year’s FantasyPros Zeile projections along with last year’s actual end-of-year statistics. Have a look at these three beautiful projections we made last year:
Getting down to business now, the main goal here is to be informative. I want to tell you what the projections and accuracy looked like at a high level. A secondary goal is to provide some actionable information: good evidence showing which statistics are most and least predictable.
You can check out last year’s hitting projections through the ridiculous power of the Wayback Machine, by clicking here. The Wayback Machine is an archive of the internet; through it, you can travel back in time and see archived versions of websites.
Disclaimer: it is really scary stuff if it happened to archive your middle school Xanga blog – I know from experience.
There are a few statistical issues that get in the way of doing this kind of analysis fairly. I have done my best to get around those problems, but I really hope no mathematics professors are reading this (or, at least, I hope they don’t also have Twitter accounts). Here are the issues we have, and how I have chosen to deal with them.
Issue 1: Injuries
The easiest way to be insanely wrong about a hitter projection is for that player to see 300 fewer plate appearances than you predicted. We aren’t going to pretend to be able to predict injury with any real accuracy here.
My solution was to turn every hitter’s projection and actual stats into a per-600 plate appearance rate. That means if a hitter popped 20 homers in 400 plate appearances, he will show up here as having a 30 homer season, because that’s what he would have done in 200 more plate appearances at the same rate. I did compare the projected vs. actual plate appearances too, just for integrity’s sake – that will show up later.
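The per-600 scaling can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above; the function name `per_600` is made up for this example and is not taken from the actual analysis.

```python
def per_600(stat_total, plate_appearances, basis=600):
    """Scale a counting stat to a fixed number of plate appearances.

    Assumes the hitter would keep producing at the same rate, which is
    the simplification the article makes to sidestep injuries.
    """
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return stat_total * basis / plate_appearances


# The example from the text: 20 homers in 400 plate appearances.
homers_per_600 = per_600(20, 400)
print(homers_per_600)  # 30.0
```

The same function would be applied to both the projected line and the actual line for each hitter, so the two are compared on equal playing time.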
Issue 2: Uneven Distributions
I cannot sit here and say that we were better at predicting stolen bases than predicting RBIs because on average we only missed our steals projections by 5 steals while we missed our RBIs projection by an average of 15. The range of steals totals last year was 46 (0 for a ton of hitters, 46 for Mallex Smith), and the range of RBI (in our sample of hitters that had 200+ PAs) was 114 (12 for Jeff Mathis, 126 for Anthony Rendon). It is very, very hard to not be more wrong about RBIs than steals in this situation.
My answer to this was to do percentiles for each category, for both projected and actual. Doing this essentially ranks every player in every category from zero to one based on where they fell among the total population of hitters. Rendon, for example, would have a 1 in RBI since he led the league, and Mathis would have a 0. This makes it possible to compare projections in all categories evenly since they are all now inside the same 0-1 range.
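Here is a rough Python sketch of that kind of 0-to-1 ranking. It is one reasonable way to do it, not necessarily the exact method used for the numbers in this article. The function name and the handling of ties are assumptions, and "Player C" and "Player D" are invented filler rows so there is something to rank between Rendon and Mathis.

```python
def percentile_ranks(values):
    """Return each value's position in the group on a 0-1 scale.

    Assumption: a hitter's rank is the share of the *other* hitters he
    beats, so tied hitters get the same rank. The article does not say
    how ties were handled.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two hitters to rank")
    return [sum(other < v for other in values) / (n - 1) for v in values]


# Rendon (126) and Mathis (12) are the RBI totals quoted in the text;
# the other two rows are invented for illustration.
names = ["Anthony Rendon", "Jeff Mathis", "Player C", "Player D"]
rbi = [126, 12, 80, 60]

ranks = percentile_ranks(rbi)
for name, rank in zip(names, ranks):
    print(f"{name}: {rank:.2f}")
```

Running the same ranking on steals, homers, and every other category puts them all on the same scale, no matter how wide the raw range was.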
Finally, some actual results
I took every hitter with over 200 projected AND 200 actual plate appearances in 2019, found the difference in their projected percentile and actual percentile in each category, and then averaged them all out. A result closer to zero means we were more accurate. Here are the numbers:
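For anyone who wants to try this at home, the averaging step might look something like the Python sketch below. Two assumptions are baked in: the difference is taken as an absolute value (otherwise the misses in each direction would cancel out), and "over 200" is treated as 200 or more. The hitters and their numbers are invented for illustration, and the percentiles are taken as given rather than computed here.

```python
from statistics import mean

MIN_PA = 200  # playing-time cutoff described in the text


def mean_percentile_miss(hitters, min_pa=MIN_PA):
    """Average absolute gap between projected and actual percentile.

    Only hitters who cleared the cutoff in BOTH projected and actual
    plate appearances are counted.
    """
    qualified = [
        h for h in hitters
        if h["proj_pa"] >= min_pa and h["actual_pa"] >= min_pa
    ]
    if not qualified:
        raise ValueError("no hitters met the plate appearance cutoff")
    return mean(abs(h["proj_pct"] - h["actual_pct"]) for h in qualified)


# Invented sample rows; Hitter C is dropped for missing the cutoff.
sample = [
    {"name": "Hitter A", "proj_pa": 600, "actual_pa": 620, "proj_pct": 0.80, "actual_pct": 0.70},
    {"name": "Hitter B", "proj_pa": 550, "actual_pa": 480, "proj_pct": 0.40, "actual_pct": 0.55},
    {"name": "Hitter C", "proj_pa": 450, "actual_pa": 150, "proj_pct": 0.90, "actual_pct": 0.10},
]

result = mean_percentile_miss(sample)
print(round(result, 3))  # 0.125
```

A result near zero means the projected and actual percentiles lined up well for that category.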
Strikeout rate, walk rate, stolen bases, and home runs are the winners here. That is not surprising since these four things are the statistics that the hitter has the most control over. You cannot pass the buck to anybody besides the umpire for a strikeout or a walk, stolen bases are all about a specific skill that doesn’t often go away for a player, and the only non-pitcher factor in a home run is where the fence is.
The rest of these statistics have all kinds of random noise involved, making them much harder to project. You can have most of your hits come with nobody on base (affecting your RBI total), you can have a really low BABIP or play in a tough ballpark (affecting your AVG, OBP, SLG), you can be in a terrible offensive unit (affecting your runs total), etc. All of these external factors make it tougher to predict these categories, and this result supports that.
Our Worst Projections, Too Low on Players
These are some of the standout bad projections that were the most surprising to us (doesn’t include rookies, injuries, etc.)
R: Dansby Swanson (Projected 10th percentile, finished 69th)
HR: Ketel Marte (Projected 19th percentile, finished 75th)
RBI: Ketel Marte (Projected 18th percentile, finished 75th)
SB: Marcell Ozuna (Projected 15th percentile, finished 85th)
AVG: Yoan Moncada (Projected 17th percentile, finished 97th)
OBP: Mark Canha (Projected 19th percentile, finished 97th)
BB%: Harrison Bader (Projected 28th percentile, finished 80th)
K%: Tommy Pham (Projected 16th percentile, finished 72nd)
Our Worst Projections, Too High on Players
R: Travis Shaw (Projected 80th percentile, finished 0.3)
HR: Travis Shaw (Projected 89th percentile, finished 20th)
RBI: Travis Shaw (Projected 91st percentile, finished 1st)
SB: Eric Hosmer (Projected 49th percentile, finished 0)
AVG: Marcell Ozuna (Projected 90th percentile, finished 32nd)
OBP: Travis Shaw (Projected 76th percentile, finished 8th)
BB%: Cesar Hernandez (Projected 91st percentile, finished 26th)
K%: Eloy Jimenez (Projected 66th percentile, finished 23rd)
Our Best Projections
R: Trea Turner, Mookie Betts
HR: Billy Hamilton, Albert Pujols
RBI: Billy Hamilton, Mike Trout
SB: Freddie Freeman, Matt Adams
AVG: Aaron Judge, Nomar Mazara
OBP: Lewis Brinson, Mike Trout
BB%: Mike Trout, Josh Donaldson
K%: Miguel Sano, Andrelton Simmons
You can see that we are pretty good at getting the extreme ends of the spectrum (Billy Hamilton not hitting for power, Mike Trout getting on base, Miguel Sano striking out, Andrelton Simmons not striking out). We were way wrong about Travis Shaw and Marcell Ozuna.
What We Can Learn
Most people will judge their draft by how their projections look relative to the rest of their league. That is really all you can do before games start being played. This is a pretty fickle exercise, of course, because so many things change during the season. Now we know that you can feel a lot better about the projections in some categories than others.
Plate Appearances Are Key
You will not be right about a player unless you get close to the plate appearance projection. This is very tough to do, even for the wisest. Here is a scatter plot of all of the players we projected for 200+ plate appearances along with their actual plate appearance total.
The actual data is on the x-axis, and the projected on the y-axis. If we were exactly right about every player, you would see a straight line going from the bottom left corner of this graph to the top right. There were some good projections and some bad, but this is just not something that is possible to do with any high degree of accuracy.
That makes projection tough to rely on, but it does not make it useless. Your best guess is better than no guess at all, and the best fantasy baseball players in the world still do rely on projection systems.
Go With What You Know
If you are very aggressive at drafting steals or home runs, you are much more likely to succeed in those categories over the course of the year than if you had attacked batting average and runs. You can use this to your advantage by doing just that – draft to lead your league in steals and home runs. There is a lot more random variation in the other categories, which means it’s easier to be competitive in those categories with a team that does not project to be able to do so.
It is true that players who have been in the league for some time are easier to project. The hardest players to project are rookies, and that is not only because of their uncertain playing time. Since minor league data is not nearly as plentiful as Major League data, rookies come into the league as relative statistical unknowns. They also have to face much better competition than they did in the minors, which makes young players really tough to project. You can rely more heavily on the projections for established players, so if you are going to judge your team by its projections, that’s where I would start. See what strengths and weaknesses you have among your veterans, and try not to weigh the young player projections as heavily.
That’s it, and that’s all! I hope you enjoyed this. Please let me know if you have any questions or additional thoughts! If you are interested, you can view the 2019 projections vs. reality data here.