Comments on The 20% Statistician: Bayes Factors and p-values for independent t-tests

p values just tell you the probability of getting ...

2018-05-15T03:04:17.765+02:00

p values just tell you the probability, assuming the null hypothesis is true, of getting results at least as extreme (in the direction of the alternative hypothesis) as the observed value of the test statistic for the actual data. Thus, you are looking at a multitude of possible samples that might have occurred and yielded more extreme results than your actual sample.

The Bayes factor does a better job: you are focusing on your actual data and not on other (virtual) samples that might have occurred. Most importantly, however, the Bayes factor directly compares two different models: the null model and an alternative model (representing the alternative hypothesis).
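
The "multitude of possible samples" idea can be made concrete with a small simulation. The sketch below is only an illustration, not anything taken from the post: it assumes two groups of normally distributed scores with equal variance, draws many samples under the null model, and reports the share whose pooled-variance t statistic is at least as large as the observed one. The function names, group sizes, and observed t value are made up for the example.

```python
import math
import random
import statistics


def t_statistic(x, y):
    """Student's two-sample t statistic using the pooled variance."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / se


def simulated_p_value(t_obs, n_x, n_y, n_sims=5000, seed=1):
    """One-sided p-value estimate: share of null samples with t >= t_obs.

    Assumption: under H0 both groups come from the same standard normal.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        x = [rng.gauss(0, 1) for _ in range(n_x)]
        y = [rng.gauss(0, 1) for _ in range(n_y)]
        if t_statistic(x, y) >= t_obs:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    # Hypothetical observed result: t = 2.5 with 20 people per group.
    print(simulated_p_value(2.5, 20, 20))
```

Every simulated sample here is one of the "virtual" samples: none of them is the actual data, which enters only through the observed t value.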

Dear Daniel, I struggled with the following senten...

2017-09-19T16:31:24.220+02:00

Dear Daniel,
I struggled with the following sentence: "Bayes Factors tell you something about the probability H0 or H1 are true, given some data (as opposed to p-values, which give you the probability of some data, given the H0)."
A Bayes Factor is defined as BF_10 = P(y|H1)/P(y|H0)
So, the Bayes Factor is the ratio of the likelihood of data under H1 and the likelihood of data under H0.
This confuses me when I contrast it with the p-value, which also gives the probability of data given the null hypothesis.

Is it that you mean by "Bayes Factors tell you something about the probability H0 or H1 are true, given some data" they are a central component in updating the prior odds to arrive at the posterior odds?
But then, wouldn't the posterior odds, i.e. P(H1|y)/P(H0|y) constitute the central difference between Bayesian and Frequentist inference?
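
For reference, the relation this question points to can be written out. It is simply Bayes' rule in odds form, using the same symbols as above (y for the data):

$$
\frac{P(H_1 \mid y)}{P(H_0 \mid y)}
= \underbrace{\frac{P(y \mid H_1)}{P(y \mid H_0)}}_{BF_{10}}
\times \frac{P(H_1)}{P(H_0)}
$$

As an illustration with made-up numbers: BF_10 = 3 combined with prior odds of 1:1 gives posterior odds of 3:1, i.e. P(H1|y) = 0.75. The same BF_10 = 3 combined with prior odds of 1:99 gives posterior odds of 3:99, i.e. P(H1|y) = 3/102, or about 0.03. The Bayes factor is identical in both cases; only the posterior odds change.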

Doing the right thing means interpreting data as s...

2014-09-28T10:41:39.769+02:00

Doing the right thing means interpreting data as support for your hypothesis when they support your hypothesis. Let's say I expect two groups (randomly assigned) to differ on some variable (a rather boring, unpretentious hypothesis, but ok). I do a test, find the difference is statistically greater than 0. I conclude the data support my hypothesis.

Did I do anything wrong? No. I don't know how likely it is my hypothesis is correct, but it is supported by the data. This could be a freak accident, a one in a million error. The prior for H0 might be 99%. But the data I have still support my hypothesis.

Why would we conclude this only when p < .05? Or perhaps p < .01? First of all, no one is saying you should. If you find a p = .77 and still want to continue believing in your hypothesis, go ahead - try again, better. Is it nonsense to require some threshold researchers should aim for when they want to convince others? I don't think so. Which value works best is a matter of opinion, and we cautiously allow higher p-values every now and then. But asking researchers to provide sufficient improbability of their data makes sense.

Now you have a problem with scientists saying 'I found this p-value, now I make the scientific claim that H1 is true'. But the data can be 'in line with' or 'supporting' H1. We are not talking about truth, but about collecting observations in line with what might be the truth. P-values quantify the extent to which these observations should be regarded in line with the hypothesis, not whether the hypothesis is correct.

This is more or less the core of a future blog post, so if I'm completely bonkers, stop me now :)

I can justify the necessity of using some prior,...

2014-09-28T09:28:35.519+02:00

I can justify the necessity of using *some* prior, and then we can argue over the details of that prior. Fine, but that's a detail, not part of the basic logical structure of the method. But my question was why use *any* criterion on the p value to justify scientific claims? Using the p value in this way leads to logically invalid arguments (choosing a ridiculous prior leads to silly arguments, but not *logically invalid* ones).

Scientists are lifelong learners; learning new things is what we do for a living. Asking scientists to stop using a moribund, completely unreasonable method -- and to learn one that is (or can be) reasonable -- is within the bounds of the job description. After all, method is the *core* of science, not a peripheral concern. Without reasonable method, science is nothing.

As for "[r]ecommendations [about p values] are not perfect, but useful in getting people to do the right thing, most of the time": I'm more interested in making reasonable judgments than in "do[ing] the right thing". I don't know what "do[ing] the right thing" means when we're talking about interpreting the results of an experiment.

Hi Richard, why would you want to use a Cauchy pri...

2014-09-24T11:30:10.141+02:00

Hi Richard, why would you want to use a Cauchy prior? Why would you make any fixed recommendation except 'use your brain'? If we could live in a world where everyone had the time to build the expertise to know everything about all statistics they use (factor analysis, mixed models, Bayes Factors, etc.) in addition to all measurement techniques they use (physiological data, scales, etc.) in all theories they test, people would probably spend 20 years before they feel comfortable publishing anything. Recommendations are not perfect, but useful in getting people to do the right thing, most of the time. I think that's why. If you know of a more efficient system (except the 'use your brain' alternative), I think we all want to know.

Why would we want a significance threshold for any...

2014-09-22T21:36:37.424+02:00

Why would we want a significance threshold for anything? "Significance" -- that is, a threshold on the p value itself -- doesn't really mean anything of value. It's fine to set a "more strict and conservative" threshold on something of meaning -- for instance, an evidential threshold, perhaps -- but we really need to stop thinking in terms of "significance" altogether. It's a useless, arbitrary idea.

That is right, if you want to interpret your own da...

2014-09-16T13:39:55.089+02:00

That is right, if you want to interpret your own data or your study is preregistered, it makes sense. But as a general rule, we would want it the other way around, i.e. a lower (more strict and conservative) significance threshold for smaller studies.

Hi, the correction is intended to prevent situatio...

2014-09-16T13:12:03.706+02:00

Hi, the correction is intended to prevent situations where a p-value says there is support for H1, but a Bayes factor says there is stronger support for H0. This post is written for researchers who want to interpret their own data. Publication bias is a problem, but not in interpreting and reporting your own data. Obviously, people should pre-register their hypotheses if they want their statistical inferences to be taken seriously by others.
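
A rough way to see how such a disagreement can arise is to hold the t-value fixed and let the sample size grow. The sketch below is not the Bayes factor used in the post (the thread refers to Rouder et al. (2009) and a Cauchy prior). As a stand-in it uses the simpler BIC approximation, which for a two-sample t-test with N observations in total and N - 2 degrees of freedom gives BF01 ≈ sqrt(N) * (1 + t²/(N - 2))^(-N/2). The function name and the sample sizes are made up for the example.

```python
import math


def bic_bf01(t, n1, n2):
    """BIC approximation to the Bayes factor for H0 over H1.

    Assumption: independent two-sample t-test with pooled variance,
    so the residual degrees of freedom are n1 + n2 - 2.
    """
    n = n1 + n2
    df = n - 2
    return math.sqrt(n) * (1 + t * t / df) ** (-n / 2)


if __name__ == "__main__":
    # Same t-value throughout; only the (hypothetical) group size changes.
    for n_per_group in (20, 50, 200, 1000):
        bf01 = bic_bf01(2.0, n_per_group, n_per_group)
        print(f"n per group = {n_per_group:5d}  "
              f"BF01 = {bf01:.2f}  BF10 = {1 / bf01:.2f}")
```

Under this approximation, a t-value of 2 is close to p = .05 (two-sided) at every one of these sample sizes, yet BF01 keeps growing with N, so with large samples the same t-value ends up favoring H0.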

Thanks - fixed the link!

2014-09-16T13:09:11.905+02:00

Thanks - fixed the link!

Thanks! You are right. Fixed it - graph is correct...

2014-09-16T13:08:59.550+02:00

Thanks! You are right. Fixed it - graph is correct, but should be lower (or now: better) Bayes Factor. That's how counterintuitive it was ;)

"If we would require a higher t-value (and th...

2014-09-16T13:07:46.796+02:00

"If we would require a higher t-value (and thus lower p-value) in larger samples, we would at least prevent the rather ridiculous situations where we interpret data as support for H1, when the BF actually favors H0."

This doesn't take into account publication bias. If we assume that publication bias is a larger problem for smaller studies (which I think we can), we would require lower p-values for studies with smaller samples, i.e. the opposite of what you suggest.

great post. I think something went wrong in Figure...

2014-09-16T10:36:41.733+02:00

great post. I think something went wrong in Figure 2. You write 'smaller sample sizes actually yield higher Bayes Factors for the same t-value' even though the figure shows the exact opposite. I guess the legend got mixed up.

Hey Daniel, thanks for this post. Just wanted to l...

2014-09-16T10:15:50.306+02:00

Hey Daniel, thanks for this post. Just wanted to let you know that the link to Rouder et al. (2009) is set incorrectly.

Readers who want more of the theory can check out ...

2014-09-15T22:35:12.280+02:00

Readers who want more of the theory can check out my post "Bayes factor t tests, part 2: Two-sample tests" and the previous posts linked there. (bayesfactor.blogspot.com)
