
Fantasy Football Draft Strategy Has a Replication Crisis

Sep 6, 2020

Mock drafts are a staple of fantasy football off-seasons. But what if we’re not doing enough of them?

What’s the risk of not testing enough?

What does a child’s response to marshmallows tell us about their expected academic and personal achievements in the future? In 1972, the answer was simple: plenty. 

That was the year that a study – one that would become, perhaps, the most famous modern-day study in the field of psychology – was published by researchers at Stanford, led by professor Walter Mischel. In the study, young children were brought into a room and offered an immediate reward (usually a marshmallow). But, if the child was able to wait about fifteen minutes, they would be presented with an even better reward (like, two marshmallows). The researcher exited the room, leaving the single marshmallow in front of the kid. And some children were able to keep themselves from gobbling it up, thereby receiving double the prize for their restraint. And others . . . were not. 


From these findings, the researchers derived insights on delayed gratification – and here’s where things got interesting. About fifteen years later, Dr. Mischel and his team followed up with the children from the original study, and found that the kids who were able to wait longer for the bigger reward had earned better grades in school, scored higher on the SAT, and even had healthier body mass indices. Their conclusion: a child’s ability to delay gratification – to exhibit self-control – predicted competency later in life. It was a story told and retold in academia, and applied to everything from parenting (of course) to business strategy (really).

But, here’s the thing: the first story wasn’t the full story. 

The Replication Crisis

In 2018, researchers from New York University published a study that added a vital component to the original marshmallow test: nuance. 

It turns out that the original study had its shortcomings: most prominent among them, sample size and representation (it was a small group, and many of the kids were children of Stanford professors). In the new study, the researchers enlarged and diversified the participant group, and the headline finding weakened dramatically: while a child’s success at the marshmallow test at age 4 still predicted achievement at age 15, the correlation was half that of the original study, and it became negligible when controlling for family background. Simply stated, the famous “marshmallow study” failed to replicate in full (to be fair, the original researchers did warn against extrapolating their results into sweeping conclusions or policies).

In broad strokes, this finding is not surprising. In fact, the field of psychology has had a replication crisis for some time now. Some of psychology’s most famous studies – studies on priming, ego depletion, and even smiling – have failed to replicate upon new attempts by researchers.

Now, just as with the field itself, psychology’s so-called replication crisis requires nuance to understand. You see, there are arguments on both sides of this debate. The original researchers – the ones whose studies failed to replicate – contend that the new studies lack the exhaustive attention to detail required to reproduce them (in other words, the new researchers didn’t closely follow the original instructions on how to set up the experiments, thus arriving at differing results). And researchers from the new studies contend that the original psychologists conducted their work with small or homogeneous samples, yielding weak correlations that should never have been published in the first place. Or worse, that they tinkered until they got something interesting, published it, and hid the failures in the drawer.

But regardless of whose side you’re on, there is at least one shared belief among both camps in this debate: reconstructing lab or environmental conditions from one study to another is really hard. Even when dutiful, deliberate considerations have been made to control for a wide array of variables, there are inconsistencies and anomalies that could influence the results. 

Said differently: People vary. Conditions vary. Interpretations vary. 

So, let me ask you this: if it’s tough to reproduce consistent, reliable results when you’re being purposeful about recreating your environment, what about when you’re being haphazard in reproducing it? 

Let’s talk about mock drafts. 

What Do Mock Drafts Really Tell Us? 

Stop and consider all of the ways in which mock drafts simulate an environment that bears little resemblance to the real environment of draft day. Start with the computer-driven mock draft software: yes, predictive algorithms continue to advance every year. But still, computers aren’t people. They aren’t prone to our biases and overreactions – to recency bias, or name recognition. And because of that, computer-driven mock drafts typically generate results that hew closely to conventional rankings and player ADPs.

The problem with that: trendy “sleeper” picks like J.K. Dobbins or Daniel Jones never go off the board as late in real life as they do in computer-generated mock drafts. Plus, universally-known players – everyone from Patrick Mahomes to Tom Brady – will usually be grabbed quicker by human beings than their ADPs suggest. Computers can control for some of this – but not all.

Now, what about human-centered mock drafts? Surely these will generate valid, reliable results – after all, human beings add that tough-to-replicate unpredictability that software may fail to simulate. And yes, putting people in your practice run will add bias, preference, and partiality. But who are these people? And do they resemble your real league opponents whatsoever? Can you simulate your co-worker who always picks a quarterback early? Can you replicate your buddy who overdrafts Eagles players? Further, what if your fellow mock drafters are testing radical draft strategies? If even a minority of fantasy managers are making rare-to-unimaginable decisions in your mock, wouldn’t that taint the results considerably?

Ultimately: if the environment is different, if the people are different, and if the technique is different, how reliable are the findings? 

Here’s How to Control for the Replication Crisis: More Mock Drafts & Diversification

Now, let me be clear: my contention is not that mock drafts (computer or human-centered) are worthless. Burdened with imperfect data sets, the answer isn’t to disregard data altogether. The real solution mirrors what the NYU researchers did to pressure-test Stanford’s original marshmallow study: one, increase your sample size, and two, diversify. If you’re only going to conduct one or two mock drafts before the big night, the results from those practice runs are worthless. The same is true for those who only do computer-driven mocks, or only do human-centered ones. Just as in many fields, valid trends can only emerge from big, diverse data sets. Don’t stop once you generate your “perfect” draft simulation – the one that landed you every player you hoped for – and bury your failures in the drawer. Plan for as many scenarios as possible.
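To put a number on the sample-size point, here is a minimal sketch in Python of how wildly the read from one or two mocks can swing. Everything in it is a hypothetical illustration, not FantasyPros code or real draft data: it simply assumes a made-up player who, in “truth,” falls to your pick in 60% of drafts, and asks what a handful of mocks would tell you about him.

```python
# Hypothetical sketch: how noisy is an availability estimate from N mocks?
# P_AVAILABLE is an assumed, made-up "true" rate, not real ADP data.
import random

random.seed(42)
P_AVAILABLE = 0.60  # assumed: target falls to your pick in 60% of drafts

def estimate_availability(n_mocks: int) -> float:
    """Estimate availability from n_mocks simulated drafts (coin flips)."""
    hits = sum(random.random() < P_AVAILABLE for _ in range(n_mocks))
    return hits / n_mocks

for n in (2, 10, 50, 500):
    # Run the whole experiment five times to show run-to-run spread.
    estimates = [estimate_availability(n) for _ in range(5)]
    print(f"{n:>3} mocks -> five estimates: "
          + ", ".join(f"{e:.0%}" for e in estimates))
```

Under those assumptions, two mocks will read the same 60% player as 0%, 50%, or 100% depending on the run, while hundreds of mocks cluster near the truth. That, in a few lines of arithmetic, is the case for more reps.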

Next, and I can’t stress this one enough: don’t be married to expert rankings or ADP. If you like the guy, go and get the guy. Mock drafts often lull us into this false sense of security (“well, I really like Jonathan Taylor, but his ADP is two rounds later, so he’ll be there for me”). Remember that many of your sleepers, breakouts, and busts are everyone’s sleepers, breakouts, and busts. Mock drafts tend to frame every other drafter as conventional, and you as the maverick – when in fact, we all see ourselves as the maverick. Which means multiple teams will employ Zero-RB or Zero-WR strategies, or take players ahead of ADP, or overdraft those projected breakout stars like Clyde Edwards-Helaire or Kyler Murray. 
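And here is an equally hedged sketch of the “his ADP is two rounds later, so he’ll be there for me” trap, built on assumed numbers rather than real ones: suppose your pick is #51 overall, your target’s ADP is #54, and (a modeling assumption) his actual draft slot scatters around that ADP with a spread of about 12 picks, since sleepers vary widely from room to room.

```python
# Hypothetical sketch: ADP is an average, not a guarantee.
# MY_PICK, TARGET_ADP, and SPREAD are all assumed, illustrative numbers.
import random

random.seed(7)
MY_PICK = 51      # your next selection (overall pick number)
TARGET_ADP = 54   # the target's average draft position
SPREAD = 12.0     # std dev of his actual slot: reaches, runs, homer picks

trials = 10_000
# Count the drafts where his simulated slot comes before your pick.
gone = sum(random.gauss(TARGET_ADP, SPREAD) < MY_PICK for _ in range(trials))
print(f"Target gone before pick {MY_PICK}: {gone / trials:.0%} of {trials} drafts")
```

With those made-up inputs, the “safe” player is off the board before your pick roughly 40% of the time. An ADP three picks after your slot is a coin flip dressed up as a promise.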

Diversify your trial runs. Plan for the worst. Get the guy you want, regardless of ADP. Because, when it comes down to it, you don’t know if your fellow league-mates are the kinds of people that are willing to wait out one marshmallow so they can land two later. And, even if you did, you don’t know if that insight is relevant, or predictive of anything. 

Because some fields have a replication crisis. 

And fantasy football is one of them.



David Giardino is a featured writer at FantasyPros. For more from David, check out his archive and follow him @DavidGiardino.
