This blog post is based on a pre-print by Coles, Tiokhin, Scheel,
Isager, and Lakens, “The Costs and Benefits of Replications”, submitted to Behavioral
and Brain Sciences as a commentary on “Making Replication Mainstream”.
In a summary of
recent discussions about the role of direct replications in psychological
science, Zwaan, Etz, Lucas, and Donnellan (2017; henceforth ZELD) argue that
replications should be more mainstream. The debate about
the importance of replication research is essentially driven by disagreements
about the value of replication studies,
in a world where we need to carefully think about the best way to allocate
limited resources when pursuing
scientific knowledge. The real
question, we believe, is when replication studies are worthwhile to perform.
Goldin-Meadow (2016)
stated that “it’s just too costly or unwieldy to generate hypotheses on
one sample and test them on another when, for example, we’re conducting a large
field study or testing hard-to-find participants”. A similar comment is made by Tackett
and McShane (2018) in their comment on ZELD: “Specifically, large-scale replications are typically only possible
when data collection is fast and not particularly costly, and thus they are,
practically speaking, constrained to certain domains of psychology (e.g.,
cognitive and social).”
Such statements imply a cost-benefit
analysis. But these scholars do not quantify their costs and benefits. They
hide their subjective expected utility (what is a large-scale replication study
worth to me?) behind absolute
statements, as they write “is” and “are” but really mean “it is my subjective
belief that”. Their statements are empty, scientifically speaking, because they
are not quantifiable. What is “costly”? We cannot have a discussion about such
an important topic if researchers do not specify their assumptions in
quantifiable terms.
Some studies may be deemed valuable enough
to justify even quite substantial investments to guarantee that a replication
study is performed. For instance, because it is unlikely that anyone will build
a Large Hadron Collider to replicate the studies at CERN, there are two
detectors (ATLAS and CMS) so that independent teams can replicate each other’s
work. That is, not only do these researchers consider it important to have a
very low (5 sigma) alpha level when they analyze data, they also believe it is
worthwhile to let two teams independently do the same thing. As a physicist
remarks: “Replication is, in the end, the most important part of error control.
Scientists are human, they make mistakes, they are deluded, and they cheat. It
is only through attempted replication that errors, delusions, and outright fraud
can be caught.” Thus, high cost is not by itself a conclusive argument against
replication. Instead, one must make the case that the benefits do not justify
the costs. Again, I ask: what is “costly”?
Decision theory is a formal framework that
allows researchers to decide when replication
studies are worthwhile. It requires researchers to specify their assumptions in
quantifiable terms. For example, the expected utility of a direct replication (compared to a conceptual
replication) depends on the probability that a specific theory or effect is
true. If you believe that many published findings are false, then directly
replicating prior work may be a cost-efficient way to prevent researchers from
building on unreliable findings. If you believe that psychological theories
usually make accurate predictions, then conceptual extensions may lead to more
efficient knowledge gains than direct replications. Instead of wasting time
arguing about whether direct replications are important or whether conceptual
replications are important, do the freaking math. Tell us how probable it must
be that H0 is true before you consider it efficient to weed out false positives
from the literature through direct replications. Show us, by pre-registering
all your main analyses, that you are building on strong theories that allow you
to make correct predictions with a 92% success rate, and that you therefore do
not feel direct replications are the more efficient way to gain knowledge in
your area.
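
To make "do the math" concrete, here is a minimal sketch of such a comparison in Python. It is a toy model, not the framework from our pre-print: the payoff structure, the function names, and every number (costs, gains, alpha, power) are hypothetical assumptions chosen only for illustration. Note that p_true is the probability that the original effect is real, so it equals 1 minus the probability that H0 is true.

```python
"""Toy expected-utility comparison: direct replication vs. conceptual extension.

Every number and functional form here is a hypothetical assumption chosen for
illustration; none of them come from the pre-print.
"""


def eu_direct(p_true, waste_avoided, cost, alpha=0.05):
    """Expected utility of a direct replication.

    Assumed payoff: if the original finding is false (probability 1 - p_true),
    the replication correctly fails to find the effect with probability
    1 - alpha and saves `waste_avoided` units of follow-up work.
    """
    return (1 - p_true) * (1 - alpha) * waste_avoided - cost


def eu_conceptual(p_true, knowledge_gain, cost, power=0.90):
    """Expected utility of a conceptual extension.

    Assumed payoff: if the original finding is true (probability p_true), the
    extension detects its new effect with probability `power` and yields
    `knowledge_gain` units of new knowledge.
    """
    return p_true * power * knowledge_gain - cost


def break_even_probability(waste_avoided, knowledge_gain, cost_direct,
                           cost_conceptual, alpha=0.05, power=0.90):
    """Value of p_true at which both options have equal expected utility.

    Obtained by solving eu_direct(p) == eu_conceptual(p) for p. The result is
    not clamped, so values outside [0, 1] mean one option dominates everywhere.
    """
    saved = (1 - alpha) * waste_avoided
    gained = power * knowledge_gain
    return (saved - cost_direct + cost_conceptual) / (saved + gained)


if __name__ == "__main__":
    # Hypothetical stakes, all in the same arbitrary utility units.
    stakes = dict(waste_avoided=10.0, knowledge_gain=10.0,
                  cost_direct=1.0, cost_conceptual=1.0)
    p_star = break_even_probability(**stakes)
    print(f"break-even P(effect is true) = {p_star:.3f}")
    for p in (0.2, 0.5, 0.92):
        direct = eu_direct(p, stakes["waste_avoided"], stakes["cost_direct"])
        concept = eu_conceptual(p, stakes["knowledge_gain"],
                                stakes["cost_conceptual"])
        better = "direct" if direct > concept else "conceptual"
        print(f"p = {p:.2f}: direct = {direct:.2f}, "
              f"conceptual = {concept:.2f} -> {better}")
```

With these made-up inputs the break-even point sits at roughly 0.51: below it the direct replication has the higher expected utility, above it the conceptual extension does. The point is not the specific number but that anyone who disagrees now has to say which input they would change, and by how much.
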
I am happy to see that our ideas about the
importance of using decision theory to determine when replications are important enough to perform were
independently replicated in this commentary on ZELD by Hardwicke, Tessler, Peloquin, and Frank. We
have collaboratively been working on a manuscript to specify the Replication
Value of replication studies for several years, and with the recent funding I
received, I’m happy that we can finally dedicate the time to complete this
work. I look forward to scientists explicitly thinking about the utility of the
research they perform, and I can’t wait for our field to start discussing how
we can quantify that utility. This will not be easy. But unless you never think about
how to spend your resources, you are making these choices implicitly all the
time, and this question is too important to give up without even trying. In our
pre-print, we illustrate how all
concerns raised against replication studies basically boil down to a discussion
about their costs and benefits, and how formalizing these costs and benefits
would improve the way researchers discuss this topic.