The 20% Statistician: January 2018

This blog post is based on a pre-print by Coles, Tiokhin, Scheel, Isager, and Lakens, “The Costs and Benefits of Replications”, submitted to Behavioral and Brain Sciences as a commentary on “Making Replication Mainstream”.

In a summary of recent discussions about the role of direct replications in psychological science, Zwaan, Etz, Lucas, and Donnellan (2017) argue that replications should be more mainstream. The debate about the importance of replication research is essentially driven by disagreements about the value of replication studies, in a world where we need to carefully think about the best way to allocate limited resources when pursuing scientific knowledge. The real question, we believe, is when replication studies are worthwhile to perform.

Goldin-Meadow (2016) stated that "it’s just too costly or unwieldy to generate hypotheses on one sample and test them on another when, for example, we’re conducting a large field study or testing hard-to-find participants". A similar point is made by Tackett and McShane (2018) in their comment on Zwaan et al. (2017; henceforth ZELD): “Specifically, large-scale replications are typically only possible when data collection is fast and not particularly costly, and thus they are, practically speaking, constrained to certain domains of psychology (e.g., cognitive and social).”

Such statements imply a cost-benefit analysis. But these scholars do not quantify their costs and benefits. They hide their subjective expected utility (what is a large-scale replication study worth to me?) behind absolute statements: they write “is” and “are” but really mean “it is my subjective belief that”. Their statements are empty, scientifically speaking, because they are not quantifiable. What is “costly”? We cannot have a discussion about such an important topic if researchers do not specify their assumptions in quantifiable terms.

Some studies may be deemed valuable enough to justify even quite substantial investments to guarantee that a replication study is performed. For instance, because it is unlikely that anyone will build a second Large Hadron Collider to replicate the studies at CERN, there are two detectors (ATLAS and CMS) so that independent teams can replicate each other’s work. That is, not only do these researchers consider it important to have a very low (5 sigma) alpha level when they analyze data, they also believe it is worthwhile to let two teams independently do the same thing. As a physicist remarks: “Replication is, in the end, the most important part of error control. Scientists are human, they make mistakes, they are deluded, and they cheat. It is only through attempted replication that errors, delusions, and outright fraud can be caught.” Thus, high cost is not by itself a conclusive argument against replication. Instead, one must make the case that the benefits do not justify the costs. Again, I ask: what is “costly”?

Decision theory is a formal framework that allows researchers to decide when replication studies are worthwhile. It requires researchers to specify their assumptions in quantifiable terms. For example, the expected utility of a direct replication (compared to a conceptual replication) depends on the probability that a specific theory or effect is true. If you believe that many published findings are false, then directly replicating prior work may be a cost-efficient way to prevent researchers from building on unreliable findings. If you believe that psychological theories usually make accurate predictions, then conceptual extensions may lead to more efficient knowledge gains than direct replications. Instead of wasting time arguing about whether direct replications are important or whether conceptual replications are important, do the freaking math. Tell us how probable it must be that H0 is true before you consider direct replications an efficient way to weed out false positives from the literature. Show us, by pre-registering all your main analyses, that you are building on strong theories that allow you to make correct predictions with a 92% success rate, and that you therefore do not feel direct replications are the more efficient way to gain knowledge in your area.
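To make "do the math" concrete, here is a minimal sketch in Python of what such a calculation could look like. It is not the model from our pre-print. Every name and number in it (the `Assumptions` fields, the benefit and cost values, the simplification that a conceptual extension of a false positive yields no benefit) is a hypothetical assumption chosen purely for illustration. The point is only that once the assumptions are written down, the probability that H0 is true at which direct replication becomes the better investment can be computed instead of asserted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assumptions:
    """Hypothetical inputs in arbitrary utility units (illustration only)."""
    alpha: float = 0.05      # chance a replication of a null effect is itself a false positive
    power: float = 0.90      # chance a study detects an effect that really exists
    b_weed: float = 10.0     # benefit of exposing a false positive in the literature
    b_confirm: float = 2.0   # benefit of confirming an effect that is real
    b_extend: float = 8.0    # benefit of a successful conceptual extension
    c_direct: float = 1.0    # cost of running a direct replication
    c_concept: float = 1.0   # cost of running a conceptual extension


def eu_direct(p_h0: float, a: Assumptions) -> float:
    """Expected utility of a direct replication when P(H0 is true) = p_h0."""
    weed_out = p_h0 * (1 - a.alpha) * a.b_weed
    confirm = (1 - p_h0) * a.power * a.b_confirm
    return weed_out + confirm - a.c_direct


def eu_conceptual(p_h0: float, a: Assumptions) -> float:
    """Expected utility of a conceptual extension.

    Simplifying assumption: if the original effect is a false positive,
    the extension teaches us nothing (benefit 0).
    """
    return (1 - p_h0) * a.power * a.b_extend - a.c_concept


def break_even_p_h0(a: Assumptions) -> float:
    """P(H0 is true) at which both options have equal expected utility.

    Both utilities are linear in p_h0, so the crossing point has a closed form.
    The result can fall outside [0, 1], meaning one option always wins.
    """
    extra_gain = a.power * (a.b_extend - a.b_confirm)
    denominator = (1 - a.alpha) * a.b_weed + extra_gain
    if denominator == 0:
        raise ValueError("The two options never cross under these assumptions.")
    return ((a.c_direct - a.c_concept) + extra_gain) / denominator


if __name__ == "__main__":
    a = Assumptions()
    print(f"Break-even P(H0 true): {break_even_p_h0(a):.3f}")
    for p in (0.1, 0.3, 0.5, 0.8):
        better = "direct" if eu_direct(p, a) > eu_conceptual(p, a) else "conceptual"
        print(f"P(H0 true) = {p:.1f}: {better} replication has higher expected utility")
```

Anyone who disagrees with the outcome can change the numbers and show which assumption they reject, which is exactly the kind of discussion absolute statements about what "is" too costly do not allow.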

I am happy to see that our ideas about the importance of using decision theory to determine when replications are important enough to perform were independently replicated in this commentary on ZELD by Hardwicke, Tessler, Peloquin, and Frank. We have been working collaboratively on a manuscript to specify the Replication Value of replication studies for several years, and with the recent funding I received, I’m happy that we can finally dedicate the time to complete this work. I look forward to scientists explicitly thinking about the utility of the research they perform. This is an important question, and I can’t wait for our field to start discussing how we can quantify that utility. This will not be easy. But unless you never think about how to spend your resources, you are making these choices implicitly all the time, and this question is too important to give up without even trying. In our pre-print, we illustrate how all concerns raised against replication studies basically boil down to a discussion about their costs and benefits, and how formalizing these costs and benefits would improve the way researchers discuss this topic.

Thursday, January 18, 2018

The Costs and Benefits of Replications