The 20% Statistician: Surely God loves 51 km/h nearly as much as 49 km/h?

Next time you get a fine for speeding, I suggest you try the following line of defense. First, you explain to the judge that a speed limit of 50 km/h in densely populated areas is a convention. It could just as easily have been set at 51 km/h, or 49 km/h. There is no bright line at 50 km/h that prevents all accidents, because the number of deaths due to speeding is a continuous variable. Tell the judge that you believe it is better if drivers ignore speed limits, and instead drive in a thoughtful, open, and modest manner. Tell the judge that, surely, God loves 51 kilometers per hour nearly as much as 49 kilometers an hour. I strongly suspect the judge will roll their eyes, and instructs you to pay the fine.

And yet, when it comes to statistics, these are exactly the arguments statisticians bring forward to criticize current rules in science that exist to regulate when scientists can make claims based on tests. They will argue against dichotomous decisions, and in favor of being “thoughtful, open, and modest” (Wasserstein et al., 2019). Statisticians are like drivers. They deal with individual studies, like drivers deal with their own car. Driving one kilometer faster or slower feels like an arbitrary choice, just as how making a claim based on p < 0.06 or p < 0.04 is arbitrary for a statistician. And in the individual world of drivers and statisticians, there is no logical argument to treat driving 51 km/h at this time and on this street differently from driving 49 km/h.

Philosophers of science are like the government. They do not deal with individual studies, but they deal with the scientific system, just as governments deal with the traffic management system. At this higher level it sometimes becomes necessary to set rules, and enforce them. For example, as explained by the European Road Safety Observatory, originally the driving speed was largely determined by drivers, and fixed at the 85^th percentile of the speed on a road. If all drivers drive at a high speed, the maximum speed would be high, if drivers drove at a low speed, the maximum speed would be lower. Such a system is sometimes also advocated by statisticians: Let the community decide what is best, without strict top-down rules. Regrettably, it is often not responsible to let the community make their own rules. As the European Road Safety Observatory notes: “However, many behavioural observations, attention measurements, and the high number of traffic crashes caused by excessive speed have shown that one cannot always rely on the judgement of drivers to set a suitable speed limit.”

The reason why drivers can not determine how fast they should be allowed to drive is because as a society we want to prevent accidents. The reward structure for drivers is such that if they speed and do not get into an accident, they get to where they want to be more quickly, and when they speed and do get into an accident, they might kill a pedestrian or bicyclist. If we ignore having a bad conscience (combined the inability of drivers to adequately estimate the probability they will get into an accident), the reward structure would lead unacceptable risks for pedestrians. If we left the criteria that allow a scientists to make a claim up to scientists themselves, the reward structure would lead to unacceptable rates of false claims.

If someone told you they were speeding to be on time for a meeting, you would likely not scold them, or report them to the police. It is relatively accepted behavior, at least when someone does not violate the speed limit by too much. According to the European Road Safety Observatory “67% of Europeans admit to having speeded on rural roads over the previous 30 days”. And yet, reducing the average driving speed with 1 additional kilometer would save more than 2000 lives a year. Those small violations that we find acceptable have real consequences that we are often not aware of when we violate the rules. Scientists also admit to practices that increase the probability of false claims, and no one will be fired for not correcting for multiple comparisons, even if this in practice leads to a higher Type 1 error rate than the 5% they say they will use to make scientific claims.

Enforcing rules can prevent accidents and errors. And therefore, the driver who tries to convince the judge that ‘surely God loves 51 km/h nearly as much as 49 km/h’ will have little success. The judge knows that enforcing violations saves lives. In practice, drivers need to speed by more than 1 km/h to get a fine, due to corrections for measurement uncertainty. In The Netherlands, 3 km is subtracted from the speed measurement to guarantee a driver was speeding, given imperfect measurement equipment. According to the European Road Safety Observatory “The detection equipment is often set in such a way that there is a margin of tolerance with regard to the speed limit. The use of such margins of tolerance serves to filter out minor, accidental violations and to deal with the possible unreliability of the equipment. A disadvantage of this approach is, however, that it strengthens drivers' opinion that a minor offence is not so serious”. Similarly, in science, if we allow author to not correct for multiple comparisons, or make claims based on ‘marginally significant’ findings of p = 0.06 they might similarly feel the consequences are not so serious.

But at the system level, a Type 1 error rate of 10% instead of 5% has a massive impact on the safety and efficiency of the scientific system. Whether acceptable safety is reached by setting the alpha at 5% deserves to be empirically studied (just as the acceptable driving speed is determined empirically). Just as driving speeds, we might find different alpha levels acceptable in different research lines. But that a driving speed has to be established and enforced will remain important on a system level.

Some drivers will continue to complain about being fined for speeding, convinced as they are that they can determine how fast they can drive at a specific time at a specific road. Some people will never like being told what to do. Some statisticians will continue to complain that they need to adhere to a 5% error rate when making scientific claims, when they strongly believe that they can determine when they should make a claim, and when not, on a case by case basis.

We allow drivers to voice their complaints, and if there are signals that traffic rules lead to problems, they might be adjusted. And of course, no government is perfect, so suboptimal decisions will sometimes be made. But we will never abandon traffic rules, and will at best change the driving speed that will be enforced, or the parts of the road where driving speeds will be enforced. Similarly, we allow statisticians to complain about the use of significance levels to make claims. But we will never abandon the use of enforced criteria that regulate when scientists can make claims, and will at best change alpha levels, or decrease the amount of research questions that test claims in favor of descriptive research. When it comes to decisions about how we should organize the traffic management system, we don’t ask drivers. Similarly, when it comes to decisions about how to organize scientific knowledge generation, we don’t ask statisticians. Scientific knowledge generation is studied by social epistemologists. Science, like driving, is a social system with a specific goal. It is of course beneficial if those government employees that create the traffic management system are also drivers, just as it is useful if social epistemologists understand statistics. But social epistemology is it’s own specialization.

Some scientists don’t like to think of science as a large ‘knowledge production system’. Maybe it makes them feel like a cog in a machine. I like to think of scientists as part of a system. Our jobs are very similar to garbage collectors. It’s a large and essential system that exists because society needs it and is willing to pay for it, that aims to achieve a goal efficiently, with a strong social component. Therefore, it makes sense to me that science needs a set of rules to reduce errors in the system. Not all scientists will agree, just as not all civilians agree with the government. From a statistical perspective, there might not be a difference between driving 49 or 51 km/h, but from a social epistemological perspective it is justifiable to fine a driver if they drive 51 km/h inside city limits, and not fine them if they drive 49 km/h.

The 20% Statistician

Thursday, January 11, 2024

Surely God loves 51 km/h nearly as much as 49 km/h?

No comments:

Post a Comment