A blog on statistics, methods, philosophy of science, and open science. Understanding 20% of statistics will improve 80% of your inferences.

Saturday, September 21, 2024

The distinction between logical justifications and empirical justifications: A reply to Prof. Hullman

In my previous blog post I explained why debates about whether or not we should preregister will not be solved through empirical means. That post was inspired by a preprint by my good friend Leonie Dudda, which contains a scoping review of interventions to improve replicability and reproducibility. The scoping review finds that authors concluded interventions had positive effects in 60 out of 104 research conclusions (and negative effects in only 1). This is good news. At the same time, evidence for interventions was often lacking. I argued in my blog that this is not problematic for all interventions, as it would probably be better if my peers could provide a logical argument for why they preregister. I strongly stressed the importance of coherence: scientists should work in a way in which their methods, their aims of science, and the theories they work on are coherently aligned. I have taken this idea from Larry Laudan (see the figure below, from Laudan, 1986), who provides what I think is the best take on how we deal with disagreements in science. Scientists disagree about many things. Should we use Bayesian statistics, or frequentist statistics? The main thing I learned from discussing this on Twitter for a decade is that there is no universal answer. The only answer is conditional: if your aim is X, then, given a valid and logical justification, you use method Y. Laudan refers to this as the triadic network of justification. And if you know me, you know I would like scientists to #JustifyEverything.



Prof Hullman read my blog, and in a blog post writes that she is left “with more questions than answers”. Because the point I was making is so important, I will explain it in more detail. Hullman titles her blog post ‘Getting a pass on evaluating ways to improve science’. One of the main points in my blog was to delineate which proposed improvements do not get a ‘free pass’ (i.e., which improvements need to be empirically justified). These are all improvements that are not ‘principled’: they are not directly tied to an aim in science. I wrote: “After the 2010 crisis in psychology, scientists did make changes to how they work. Some of these changes were principled, others less so. For example, badges were introduced for certain open science practices, and researchers implementing these open science practices would get a badge presented alongside their article. This was not a principled change, but a nudge to change behavior.” And then I said: “And for some changes to science, such as the introduction of Open Science Badges, there might not be any logical justifications (or if they exist, I have not seen them). For those changes, empirical justifications are the only possibility.”

 

But some discussions we have in science are not empirical disagreements. They are disagreements in philosophy of science. Prof Hullman agrees with me that “Logic is obviously an important part of rigor, and I can certainly relate to being annoyed with the undervaluing of logic in fields where evidence is conventionally empirical” but thinks that my arguments for preregistration were not just logical arguments based on a philosophy of science. She writes “Beyond your philosophy of scientific progress, it comes down to the extent to which you think that scientists owe it to others to “prove” that they followed the method they said they did. It’s about how much transparency (versus trust) we feel we owe our fellow scientists, not to mention how committed we are to the idea that lying or bad behavior on the part of scientists are the big limiter of scientific progress.” The point Prof Hullman makes does not go ‘beyond’ philosophy of science, because whether and how much we should trust scientists is a core topic in social epistemology. It is part of your philosophy of science. As I wrote “This in itself is not a sufficient argument for preregistration, because there are many procedures that we could rely on. For example, we can trust scientists. If they do not say anything about flexibly analyzing their data, we can trust that they did not flexibly analyze their data. You can also believe that science should not be based on trust. Instead, you might believe that scientists should be able to scrutinize claims by peers, and that they should not have to take their word for it: Nullius in Verba. If so, then science should be transparent. You do not need to agree with this, of course”. So the point really is one about philosophy of science, and we can make certain logical arguments about which practices follow from certain philosophies, as I did in my blog post.

 

Prof Hullman writes: “It reads a bit as if it’s a defense of preregistration, delivered with an assurance that this logical argument could not possibly be paralleled by empirical evidence: ‘A little bit of logic is worth more than two centuries of cliometric metatheory.’” I am not providing a ‘defense’ of preregistration; that is not the right way to think about this topic. I simply pointed out that your aims can logically justify your methods. For example, if my aim is to generate knowledge by withstanding criticism, then I need to be transparent about what I have done. Note the ‘if-then’ relationship. One of my main points was to get empirical scientists to realize the difference between a logical justification and an empirical justification.

 

Then Prof Hullman makes a big, but very insightful, mistake. She writes “He argues that all rational individuals who agree with the premise (i.e., share his philosophical commitments) should accept the logical view, whereas empirical evidence has to be “strong enough” to convince and may still be critiqued. And so while he seems to start out by admitting that we’ll never know if science would be better if preregistration was ubiquitous, he ends up concluding that if one shares his views on science, it’s logically necessary to preregister for science to improve.” She confuses the two things my post set out to educate scientists about: there is a difference between implementing a change because you claim it will improve science, and implementing a change because it logically follows from your assumptions. I guess I did not do a good job explaining the distinction.

 

As I wrote: “There are two ways to respond to the question why scientific practices need to change. The first justification is ‘because science will improve’. This is an empirical justification. The world is currently in a certain observable state, and if we change things about our world, it will be in a different, but better, observable state. The second justification is ‘because it logically follows’.” Hullman’s statement that “he ends up concluding that if one shares his views on science, it’s logically necessary to preregister for science to improve” is exactly what I was *not* saying. Let me explain this again, because if Prof Hullman did not understand this, others might be confused as well.

 

It is essential to distinguish a coherent way of working from a better way of working. Is it better to be a subjective Bayesian, ignore error control, and aim to update beliefs until all scientists rationally believe the same thing? Or is it better to be a frequentist, make error-controlled claims that you do or do not believe, and create a collection of severely tested claims? As the last century has shown, there will never be evidence that settles which of these two approaches is an improvement. An all-knowing entity could tell us, but for us mere mortals, an answer to that question is beyond our ability to achieve.

 

If a subjective Bayesian changes their research practice from using a ‘default’ prior to actually quantifying their prior beliefs, that practice becomes more *coherent*. After all, the aim is to update *your* beliefs, not some generic default belief. Maybe the use of subjective priors will slow down knowledge generation compared to the use of default priors. It might not be an improvement. But it is more coherent, and in the absence of an empirical guarantee about the single best way to do science, my argument is that we should at least be coherent as scientists. We preregister because it makes our approach to science more *coherent*, and we evaluate coherence based on logical arguments, not based on empirical data.
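To make this concrete, here is a minimal sketch of the difference. The numbers are hypothetical (12 successes in 20 trials, and a Beta(8, 2) subjective prior I made up purely for illustration): the same data give different posteriors depending on whether you update a default prior or a prior that actually encodes your beliefs.

```python
# Minimal illustrative sketch (hypothetical numbers, not from any real study):
# the same binomial data updated under a 'default' prior versus a subjective prior.
from scipy import stats

successes, trials = 12, 20          # hypothetical data
failures = trials - successes

# 'Default' prior: a uniform Beta(1, 1), encoding no researcher-specific beliefs.
default_posterior = stats.beta(1 + successes, 1 + failures)

# Subjective prior: Beta(8, 2), encoding a prior belief that the proportion is high.
# Updating *this* prior is what is coherent with the aim of updating your own beliefs.
subjective_posterior = stats.beta(8 + successes, 2 + failures)

print(f"Posterior mean under the default prior:    {default_posterior.mean():.3f}")
print(f"Posterior mean under the subjective prior: {subjective_posterior.mean():.3f}")
```

No empirical result tells us which of these two posteriors reflects a ‘better’ way of doing science; the second is simply the coherent choice if your stated aim is to update your own beliefs.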

 

In my blog, I wrote that empirical evidence can be useful to convince some people to implement policies. I think this section was too short to clearly explain my point. I say this because Prof Hullman writes “It strikes me as contradictory to say that it is a flaw that “Psychologists are empirically inclined creatures, and to their detriment, they often trust empirical data more than logical arguments” while at the same time saying it’s ok to produce weak empirical evidence to convince some people.” In the comments to the blog post Prof Hullman writes “I suspect he knows that his logical argument is conditional on a lot of assumptions but he wants to sell it as something more universal. That would be one explanation for why he then seems to walk it back by adding the part about how empirical evidence sometimes has value.” Prof Hullman’s main worry about my blog post seems to be: “For example, is the implication that logical justification should be enough for journals to require preregistration to publish, or that lack of preregistration should be valid ground for rejecting a paper that makes claims requiring error control?” Because this last point is exactly what I argued against, I must not have explained myself clearly enough. Let’s try again.

 

A logical justification can never lead to a policy such as ‘require preregistration to publish’ or ‘a lack of preregistration is grounds for rejecting a paper’. The logical arguments I discussed have a premise: ‘if you aim to do X’. Studies that do not aim to do X do not have to use method Y. My blog is not just a reminder of the importance of a coherent approach to science, but also a reminder for people who do not want to preregister to develop a logically coherent ‘if-then’ of their own. What are your aims, if not to make error-controlled claims? Which methods are logically coherent with those aims? Write this down clearly, address the criticism you will get from peers, sharpen your argument, implement your ideas formally in your papers, and you are all set and never have to worry about not preregistering. Just as I have developed a coherent argument for preregistration over the last decade, tied to a specific philosophy of science, you should, if you want to be taken seriously, have a well-developed alternative philosophy explaining why preregistration is not in line with your aims.

 

If policy makers were smart and rational, they would create policies based on logical justifications where possible. Regrettably, policy makers are typically not very smart and rational. Here is the kind of policy I want to see: “If preregistration is a logically coherent step in your scientific method, we want you to implement it.” This is the same kind of logically principled justification of a research practice as ‘if we think scientists should discover the truth, they should not lie’. The policy requires scientists to act in a logically coherent manner. In practice, this means that if you set an alpha level, control your type 2 error rate through a power analysis, and make claims based on statistical tests that have sufficiently low error rates, you have decided to adopt Mayo’s error-statistical philosophy of science. As I explained in my blog, if we add a second assumption about the aims of science, namely that the aim is to make claims that can withstand scrutiny by peers, then it logically follows that we adopt a procedure that enables scrutiny. Of course, as Laudan’s figure above illustrates, the methods we choose should ‘exhibit the realizability’ of our aims. If we believe it is important to scrutinize claims, but the only way to achieve this would be to have every scientist in the world wear a body-cam, and to have all of us watch all footage related to a study before believing the claim, the aim of scrutiny might not be ‘realizable’. But preregistration can be implemented in practice, so the method and the aim can be aligned in practice.
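As a purely illustrative sketch of what ‘setting an alpha level and controlling your type 2 error rate through a power analysis’ looks like in practice, consider the following a priori power analysis. The smallest effect size of interest (Cohen’s d = 0.5) and the desired power (.90) are assumptions I am making up for this example, not recommendations.

```python
# Illustrative a priori power analysis for an independent-samples t-test
# (hypothetical design choices): alpha controls the type 1 error rate,
# and power = 1 - beta controls the type 2 error rate.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,          # assumed smallest effect size of interest (Cohen's d)
    alpha=0.05,               # chosen type 1 error rate
    power=0.90,               # desired power, i.e., a 10% type 2 error rate
    alternative='two-sided',
)
print(f"Required sample size per group: {n_per_group:.1f}")
```

Nothing in such a calculation tells you whether to preregister; it only shows what error control looks like in practice. It is the logical argument above that ties error control, and the aim of withstanding scrutiny, to transparency about the planned analysis.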

 

I would hope that if scientists embrace my view that there is a distinction between logical justifications for preregistration and empirical justifications for preregistration, they will actually gain a very strong argument to push back against the universal implementation of preregistration. All you need to do is pursue different aims than making error-controlled claims, or develop a different coherent approach to scientific knowledge generation than the dominant approach we now see in psychology, based on Mayo’s error-statistical framework, and any rational editor should accept your arguments.

 

Now, I did not want to completely dismiss empirical research on the consequences of interventions to improve science. I said it could be useful when implementing policies. I wrote: “I think [empirical] work can be valuable, and it might convince some people, and it might even lead to a sufficient evidence base to warrant policy change by some organizations. After all, policies need to be set anyway, and the evidence base for most of the policies in science are based on weak evidence, at best.” But this short section led to confusion.

 

Let me make some things clearer. First, in this example, I am not talking about the specific intervention to adopt preregistration. A policy about preregistration can be implemented based on logical arguments, and if it is implemented, it should be implemented as I stated above: “If you aim to do X, and you believe a principle in science is Y, you need to preregister”. But there are many policies that need to be set for everyone, regardless of their philosophy of science. An example would be the implementation of badges, which, as I mentioned in my blog, cannot be justified logically. Furthermore, badges apply to every article in a journal: you get a preregistration badge, or you do not. Although in principle we could have a badge for preregistration, a badge for a logically coherent argument why you do not need preregistration, and no badge, this would go beyond the simple nudge idea behind badges. Empirical data can be useful if researchers want to convince editors to implement badges. Prof Hullman writes “It strikes me as contradictory to say that it is a flaw that “Psychologists are empirically inclined creatures, and to their detriment, they often trust empirical data more than logical arguments” while at the same time saying it’s ok to produce weak empirical evidence to convince some people.” She does not summarize what I wrote correctly. I do not say it is ‘ok to produce weak empirical evidence to convince some people’. I am simply saying this is how some people choose to go about things: in the absence of strong empirical evidence, and given the political interests that some scientists have, they will use empirical arguments to convince others, and that can work. I much prefer a logical basis for policies, and I prefer not to engage in policies that do not have a logical basis (which is also why I do not like open science badges). But often, such a logical basis is not available, strong evidence is not available, and there are people who want to change the status quo.

 

The goal of my blog was to make scientists aware of the possibility of developing logical arguments, given some premises, for preregistration. I think convincing logical arguments exist, and I have developed them for one philosophy, the error-statistical philosophy that is arguably dominant in my own discipline. A lack of evidence for preregistration is not problematic, and if you ask me, realistically we should not expect such evidence to emerge. Anyone who carefully reads my blog will see that it provides ammunition for scientists to fight back against exactly the overgeneralized policies Prof Hullman is worried about (i.e., that you will need to preregister to get published). The ‘free pass’ we should be worried about in science is not the absence of empirical data, but the absence of a logical argument.
