Which Hitting Categories Are Most Secure? (2020 Fantasy Baseball)
Drafting for a 60-game season is unlike anything a fantasy baseball player has had to do before, and hopefully unlike anything we will ever have to do again. Playing that few games really does change everything.
Luck is sure to be the main factor in who wins fantasy baseball championships this year, but that does not mean it’s the only factor. There are still ways to give ourselves an edge over the competition. One of them is to understand how each category behaves over small samples.
Explaining My Approach
What I did was explore last year’s data and look into a bunch of different 60 game samples to see if we could learn something about the variance of each statistical category. I used the five standard hitting categories: runs, runs batted in, homers, steals, and batting average.
Once we have the individual game logs for each player, we can slide a rolling 60-game window over the data and see how every stretch of 60 consecutive games looked. For my purposes here, I ignored games where a hitter was not in the starting lineup.
Say that a hitter started 150 games last year. While you can only put 60 into 150 a little more than two times (long division, anyone?), a 150-game season actually gives us 91 different 60-game samples. The first sample is games 1-60, the second is games 2-61, the third is games 3-62, and so on, all the way to the last sample, which is games 91-150.
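The window arithmetic is easy to check in code. Here is a minimal sketch, assuming games are numbered from 1 and counting only games started; the function name `window_ranges` is made up for illustration and is not taken from the original analysis.

```python
def window_ranges(games_started, window=60):
    """Return (first_game, last_game) pairs, 1-indexed and inclusive.

    A hitter with fewer starts than the window size has no full window.
    """
    if games_started < window:
        return []
    # The last window starts at game (games_started - window + 1).
    return [(start, start + window - 1)
            for start in range(1, games_started - window + 2)]


ranges = window_ranges(150)
print(len(ranges))             # number of 60-game windows in 150 starts
print(ranges[:3], ranges[-1])  # first three windows and the final one
```

In general, a hitter with N starts has N - 60 + 1 overlapping windows, so the number of data points differs from player to player.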
To show what I am talking about a little bit better, let’s take a look at the season that reigning NL MVP Cody Bellinger put together.
I drew red lines on the chart around where I think the league average for a full-time starting-caliber fantasy player should be. You can see that Bellinger started on an absolute tear. The first several data points on all of those graphs (the furthest left) are way up. Over his first 60 games, he was on a 162-game pace of something like 138 runs, 54 homers, 150 RBI, 20 steals, and a .365 batting average.
Like any player, he had his ups and downs. The batting average was on a pretty steady decline all season long, and the stolen bases disappeared in the middle of the season.
Anyway, you see what we’re doing here. For every qualified hitter (I used 400 plate appearances as my cutoff), we can generate one data point per category for each 60-game window (about 90 for a hitter with 150 starts), with each point expressed as the 162-game pace over that window.
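As a rough sketch of that calculation, assuming each player's game log is a plain list of per-game totals (the function names and input layout below are hypothetical, not the code used for the article): counting stats are summed over the window and scaled by 162/60, while batting average is total hits over total at-bats within the window.

```python
def rolling_pace(per_game, window=60, season=162):
    """162-game pace of a counting stat for every full rolling window."""
    paces = []
    total = sum(per_game[:window])
    for end in range(window, len(per_game) + 1):
        paces.append(total * season / window)
        if end < len(per_game):
            # Slide the window: add the next game, drop the oldest one.
            total += per_game[end] - per_game[end - window]
    return paces


def rolling_batting_average(hits, at_bats, window=60):
    """Hits divided by at-bats over every full rolling window."""
    averages = []
    for start in range(len(hits) - window + 1):
        h = sum(hits[start:start + window])
        ab = sum(at_bats[start:start + window])
        # Assumption: a window with no at-bats is reported as 0.0.
        averages.append(h / ab if ab else 0.0)
    return averages


# A hitter who drives in one run in every start is on a 162-RBI pace.
print(rolling_pace([1] * 61))
```

Batting average is handled separately because it is a rate; scaling it to a 162-game pace would not make sense.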
Running the Test
The question I wanted to try to answer was how the categories line up in terms of the variance we see when we take a look at all of the players together.
To get a single number that gives us a rough idea of the variance for each category, I calculated the standard deviation of each player’s 60-game results in that category and then averaged those standard deviations across all players. The standard deviation is a measure of the spread of the data. If a category were wildly random, you would see large peaks and valleys across different 60-game samples, creating a large standard deviation. A stable category would show the opposite.
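Here is one way that summary number could be computed, as a sketch under stated assumptions: the input is a dictionary mapping each player to his list of rolling paces for one category, and the population standard deviation (`statistics.pstdev`) is used. The article does not say whether the population or sample formula was applied, and `category_spread` is a made-up name.

```python
from statistics import mean, pstdev


def category_spread(paces_by_player):
    """Average of the per-player standard deviations for one category."""
    # Assumption: players with no full 60-game window are skipped.
    spreads = [pstdev(paces) for paces in paces_by_player.values() if paces]
    return mean(spreads)


# Made-up steal paces: one hitter who never runs, one who runs in streaks.
steal_paces = {
    "non_runner": [0, 0, 0, 0],
    "streaky_runner": [10, 30, 10, 30],
}
print(category_spread(steal_paces))
```

Note how the flat line of zeroes pulls the category average down, which is the same effect hitters who never attempt a steal have on the stolen base number.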
Here are the results:
This suggests that RBI is the toughest category to predict over a 60 game sample size. This makes some intuitive sense because the biggest part of getting an RBI is out of a hitter’s control – having men on base in front of him. Anything a hitter has less control over will naturally lean towards more randomness.
Here is an example of just how wildly variant the RBI category can be:
For the first three months of the season, Yuli was on a really poor RBI pace before a ridiculous hot streak catapulted all of his numbers. That’s an extreme example, of course, but it does show how wide a range the RBI category can produce for a single hitter.
Stolen bases are the least variant. This is not a huge surprise given how many players had flat lines all season, like this:
Nelson Cruz never stole a base last year, so he posted nothing but zeroes in this study, which gives a standard deviation of zero. However, even when we look at the most frequent runners, we see a pretty narrow spread:
I expected home runs to be more random than they proved to be in this study, as they came in just above stolen bases in the standard deviation list. Looking through the plots, you see a ton of graphs like this:
While there are notable spikes there, Chapman kept his range between a 30 homer pace and a 40 homer pace pretty much all season long – a pretty narrow lane.
The season will be controlled by randomness; there is no doubt about it. You can be the best fantasy baseball player in your league by a mile and still finish last this year. However, one way to give yourself at least a small edge is to attack the categories that are more predictable over short samples.
From our study here, those categories seem to be homers and steals. I would not feel confident about any RBI projection you make for a player, because the randomness of a 60 game season really has a strong grip on that category. Runs and batting average are somewhere in the middle.
If I had to give someone bold advice from the results of this study, I would say hammer steals and homers early and often in the draft. Those are the two categories where the projections are most likely to be right. You can luck into a strong RBI total, but it’s not so easy to do that with homers and steals.