Comments on The 20% Statistician: "The relation between p-values and the probability H0 is true is not weak enough to ban p-values" by Daniel Lakens

Anonymous (2016-01-13 13:51):
This comment has been removed by a blog administrator.

Perezgonzalez (2015-11-25 11:01):
Hi.
Thanks. I think it is more coherent to use alpha when following Neyman-Pearson's approach, and p when following Fisher's approach (but with the latter we then have the problem of what to do with power).

Because Trafimow et al. criticize NHST, my reasonable guess is that they are using the hybrid model that mixes Neyman-Pearson's and Fisher's approaches, so their results may not be entirely consistent.

I also agree with you that their reasons to ditch NHST are lame, statistically speaking. I would recommend ditching NHST simply because it is an inconsistent hybrid model (http://dx.doi.org/10.3389/fpsyg.2015.01293).

Daniel Lakens (2015-11-25 02:55):
Thanks for your comment. I agree the difference between alpha and p is important, and worthwhile to explore. To be clear, Trafimow & Rice (2009) use p, and that is what I am reproducing and checking. The question whether they should have used alpha is very interesting. We are talking about a priori power in the formula.

Anonymous (2015-11-24 17:09):
Hi,
I don't think Cohen actually suggested the formula "p / (p + power)" (or a related one). It may seem so, but I think Cohen simply chose a bad example, rightly criticized by Hagen (1997). Still, Cohen did not go on to propose the formula; Hagen did.

Regarding Hagen, he uses "alpha / (alpha + power)". You substitute "p / (p + power)", a new formulation. My question is: which power are you talking about, a priori power or a posteriori power?

Cheers,

(Perezgonzalez, Massey University)

Daniel Lakens (2015-11-24 06:34):
Hey - this blog is my sandbox, where I play around. There is a reason this is a blog and not an article - I'm putting it online, and hope to get feedback. Again, I'm following Cohen and Hagen - but my PS shows I disagreed. I'd love to hear other viewpoints, related articles, etc. I'm not clear yet whether it is better to use the exact p, or the alpha level, and why. I'll look into it, but please educate us if you already know this. And if this blog is confusing - that's a good thing. I'll not be the only person struggling with the confusing aspects of this, so it's better to get it out here and discuss it.

matus (2015-11-24 06:29):
Re-reading Anon's comment, I also see a lot of confusion in my previous comment :) Let me give it some more thought.

matus (2015-11-24 06:22):
Great comment by Anon.
The second pointer goes back to the simple fact that the p-value is not p(D|H0). The p-value is p(D;H0). H0 is not a random variable. As a consequence, p(D;H0) is a number while p(D|H0) is a random variable. Now what you, Daniel (and Cohen?), do is take the formula which says p(D|H0) and plug in p(D;H0) instead of p(D|H0). To be clear, it's perfectly valid for a frequentist to investigate p(D|H0) or p(H0|D); you just should not mix p(D;H0) with p(D|H0) in a single formula.

Geez, Daniel, so much confusion in this blog :)

Daniel Lakens (2015-11-24 06:04):
Feel free to run with my unsupported claims and criticize them. I think this blog is part of a discussion - feel free to write about the next part of the discussion, here, or on your blog.

matus (2015-11-24 05:46):
I haven't read the paper, but from the passages you quoted, the conclusions of T&R are certainly not supported. Nor do I think that it is possible to obtain any evidence to support a ban of p-values. (Still, I think the ban is a beneficial decision so long as it is implemented by a minority of journals.) On the other hand, your blog doesn't show me that the relation is "not weak". Instead you unload more unsupported claims. That's a pity because, as I tried to point out, there are interesting questions behind your disagreement with T&R.
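(Editorial aside, not part of the original thread: the Hagen-style "alpha / (alpha + power)" heuristic debated in these comments can be sketched numerically. The function names and the worked numbers below are illustrative assumptions, treating "the result was significant" as the observed datum.)

```python
# Sketch of how Hagen's "alpha / (alpha + power)" shortcut relates to the
# full Bayesian expression for P(H0 | significant result).
# All names and numbers here are illustrative, not any commenter's code.

def p_h0_given_sig(alpha, power, prior_h0):
    """P(H0 | p < alpha) via Bayes' rule, with 'significant' as the datum.

    Note: 1 minus this quantity is the positive predictive value (PPV).
    """
    num = alpha * prior_h0                 # P(sig | H0) * P(H0)
    den = num + power * (1.0 - prior_h0)   # + P(sig | not-H0) * P(not-H0)
    return num / den

def hagen_heuristic(alpha, power):
    """Hagen's shortcut; it equals the formula above only when P(H0) = 0.5."""
    return alpha / (alpha + power)

alpha, power = 0.05, 0.8
# With P(H0) = 0.5 the two expressions coincide:
assert abs(p_h0_given_sig(alpha, power, 0.5) - hagen_heuristic(alpha, power)) < 1e-12
# With a less favorable prior the shortcut is badly off:
print(p_h0_given_sig(alpha, power, 0.9))  # far above alpha / (alpha + power)
```

The point of the sketch is only that the shortcut silently fixes P(H0) at 0.5; changing the prior moves the posterior substantially while the heuristic stays put.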
Daniel Lakens (2015-11-24 04:58):
Why does Cohen's formula assume P(H0) is 0.5? P(H0) is part of the formula - you can fill in any value you want, and Cohen surely doesn't use 0.5 in his 1994 article. I completely agree p-values should be deemphasized. I'm just reacting against people who want to ban them. Especially because I think they are useful to report from a Neyman-Pearson perspective. Yes, you could say p < alpha, but as a heuristic, having the exact p-value is not worthless. In lines of studies (I also think single studies should be deemphasized) they are good to report. After the studies, many people will want to evaluate the evidence - Bayesian stats is fine, or just estimate the effect size.

Ülo Maiväli (2015-11-24 04:49):
Any meaningful use of a p-value as a heuristic for Bayesian posteriors presupposes that we "heuristically" understand its simultaneous dependence on variation (both the biological and technical kinds), effect size, sample size, power, and P(H0). For example, the Cohen (1994) formula silently assumes that P(H0) = 0.5, which is not an assumption to be made lightly, as it would indicate that a scientist is exclusively testing rather likely (and thus boring) hypotheses. For this reason alone, this formula is not a good heuristic for most scientists, whose priors tend to be below 0.5 and vary greatly from experiment to experiment.
I believe that p-values are very useful in modern biology, especially when used in bulk for FDR calculations, or even in the classical Neyman-Pearson approach.
Having said that, there are good reasons to think that the p-value should at the very least be de-emphasized as a single-number heuristic for evidence for (or against) a single hypothesis. The reasons for this are logical, philosophical, and, most importantly, practical. It looks like the p-value is simply not worth the trouble in this context. In practice, when I help innocent biologists to calculate p-values, the reason is always that the referees want them. Meanwhile, the actual effect is either so large that calculating the p-value will not add much to our already high confidence, or the effect is not large enough to be really lifted by p = 0.04 (a small effect is more likely to be caused by bias, especially when our experimental hypothesis predicts a larger effect). Because the primary concern of working scientists is the experimental hypothesis vs. bias, not sampling error, the p-value does not bring much to the party.

Daniel Lakens (2015-11-23 23:59):
P-values are not logically inconsistent when used from a Neyman-Pearson perspective. If you want to use them to formally quantify posterior probabilities, they are not consistent. Their use as a heuristic is an entirely different matter. It does not need to be consistent, and it deserves to be evaluated given 1) the likelihood people will understand Bayesian statistics, and 2) difficulties in using Bayesian statistics when the goal is to agree on evidence (see the ESP discussions).

Alexander Etz (2015-11-23 20:11):
Wait, what? The p-value *is* logically inconsistent.
I don't see how someone pointing that out is "arguing with affect." The bogus science comment is clearly debatable, but the p-value's inconsistency isn't.

Daniel Lakens (2015-11-23 09:24):
Hi Patrick - I agree. The comment was a waste of time, so I deleted it. I don't like dismissive anonymous commenters.

Daniel Lakens (2015-11-23 09:23):
Hi Anonymous (such a useful comment deserves a name!) - The formula comes directly from Cohen, 1994, who uses a precise p-value himself. Now it makes total sense to replace this by 0.05 (the logic is the same as in my PS, so I agree), but it's not what has been done in the literature (Cohen, 1994; Hagen, 1997) nor by Trafimow & Rice, 2009 (see their example in the post). But thanks for the literature reference, I'll definitely read it!

Anonymous (2015-11-23 09:06):
Please note that the formula attributed to Cohen is incorrect. A correct form is PPV = ((1 - P(H0)) * power) / ((1 - P(H0)) * power + P(H0) * alpha), where PPV - the positive predictive value - is the ratio of the false H0s over all the rejected H0s, and alpha is the significance level. To see why alpha cannot easily be substituted by exact p-values, see section 3 of Hooper (2009) J Clin Epidemiol. Dec;62(12):1242-7:
"The Bayesian interpretation of a P-value depends only weakly on statistical power in realistic situations."

Also, the p-value cannot be directly used in Bayesian calculations, because while the likelihood P(Data | H0) is about the exact sample data, the p-value is about data as extreme as, or more extreme than, the sample data. Thus, any Bayesian use of a single p-value (derivation of the posterior probability of H0 using the p-value) is likely to be so much of a hassle as to negate the only positive aspect a single p-value has - the ease of calculation.

I would say that prohibiting the evidentiary use of single p-values seems like a good idea. Especially since the theoretical maximum we can suck out of a p-value is the probability of sampling error as the cause of an effect - which pretty much ignores the scientific question of whether the experimental effect is strong evidence in favor of one's scientific hypothesis, or in favor of some other, often unspecified, hypothesis (like the presence of bias).

Patrick Forscher (2015-11-23 08:49):
You have your wish: http://www.sciencemag.org/content/349/6251/aac4716

As a side note, I find your remark rather absurd considering how much of the reproducibility movement is coming from within psychology.
Ah, well -- trolls will be trolls.

cellocgw (2015-11-23 08:26):
This comment has been removed by a blog administrator.

Daniel Lakens (2015-11-23 06:52):
Yes, I've fixed it! Thanks

Anonymous (2015-11-23 06:45):
Hi Daniel,
"P(D|H0) is the probability (P) of the data (D), given that the null hypothesis (H0) is true. In Cohen's approach, this is the p-value of a study." The p-value is actually the probability of the data, or more extreme data, given that the null hypothesis (H0) is true, isn't it?

Thanks

Daniel Lakens (2015-11-23 06:34):
It was indeed their right to ask the question they asked - but if you read their conclusions, they go beyond the question they address and draw conclusions not in line with the data - which is my main problem here. Acknowledging that in most common situations a p < 0.05 is related to a decrease in the probability H0 is true, and, oh I don't know, showing that in a plot, would have been nice, which is why I wrote this blog. I think it adds to the discussion - but feel free to disagree.

Daniel Lakens (2015-11-23 06:31):
These kinds of extreme statements about 'logically inconsistent' and 'bogus science' are common in people who discuss statistics with affect instead of logic. I find it very unproductive and ignore it.

Anonymous (2015-11-23 05:33):
That does not change the fact that the p-value is logically inconsistent and that NHST makes people do bogus science.
The relation between p-values and posterior distributions is context dependent; it does not make sense to claim that "it is not weak enough", because that claim only makes sense in a specific context and it is irrelevant to the ban itself.

matus (2015-11-23 00:58):
First, correlation is a non-parametric concept and I don't see any issue with comparing non-linear variables with it (as long as one doesn't report confidence intervals or p-values for the correlation). Furthermore, T&R are interested in comparing p(H0|D) and p(D|H0), not in exp(p(H0|D)) or log(p(D|H0)).

Second, similar to conditioning of covariance, conditioning is defined for correlation and is a helpful concept for understanding where you disagree with T&R. They report Corr[p(H0|D), p(D|H0)], while you report Corr[p(H0|D), p(D|H0) | p(H0), p(D|-H0)]. There is nothing wrong with either of the two quantities. If one were to assume a directed causal relation (e.g. p(p(H0|D) | p(D|H0)), or p(p(H0|D) | p(D|H0), p(H0), p(D|-H0))), the two approaches correspond to computing the direct and the total causal effect. So this is just another researchers-report-two-different-quantities-but-give-them-the-same-meaning drama. In particular, you don't offer any arguments for Corr[p(H0|D), p(D|H0) | p(H0), p(D|-H0)] over Corr[p(H0|D), p(D|H0)], just empty assertions like "large simulations where random values are chosen for A and B do not clearly reveal a relationship between C and D". Surely there is nothing wrong with random sampling and marginalizing of random variables for the purpose of simulation?!

We can actually compare the plausibility of the two measures by asking whether a reader of a particular scientific publication will be able to estimate them.
Corr[p(H0|D), p(D|H0) | p(H0), p(D|-H0)] assumes that p(H0) and p(D|-H0) are known. Is this the case? p(H0) is not known, not as a point value, and p(D|-H0) in many older publications is also not known. Hence, Corr[p(H0|D), p(D|H0)] seems like a much better idea. The issue I have with T&R is that Corr[p(H0|D), p(D|H0)] may change for different choices of prior for p(H0) and p(D|-H0), and this is something that should have been checked.

Samuel Zehdenick (2015-11-22 06:51):
Funny, I published a post on p-values today. However, the best mathematics cannot save p-values from being used incorrectly. I have seen several papers applying wrong methodology and then presenting p-values in order to report statistically significant results. One can do so many things.

http://samuel-zehdenick.blogspot.no/2015/11/why-does-mystical-p-value-sometimes-do.html
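(Editorial aside, not part of the original thread: the contrast matus draws between the marginal correlation Corr[p(H0|D), p(D|H0)] and its conditional counterpart can be sketched with a toy simulation. Everything below is an illustrative construction under uniform random inputs, not any commenter's code; treating p(D|H0) as a uniform random quantity is a deliberate simplification.)

```python
# Toy simulation (our construction): correlation between p(D|H0) and the
# Bayesian posterior p(H0|D), with the prior and p(D|-H0) either varying
# across "studies" (marginal) or held fixed (conditional).
import random
import math

def posterior_h0(p_d_h0, p_d_not_h0, prior_h0):
    """p(H0|D) by Bayes' rule from the three inputs."""
    num = p_d_h0 * prior_h0
    return num / (num + p_d_not_h0 * (1.0 - prior_h0))

def corr(xs, ys):
    """Pearson correlation, plain stdlib."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

random.seed(1)
n = 20000

# Marginal version: prior and p(D|-H0) vary from "study" to "study".
p_d_h0 = [random.random() for _ in range(n)]
p_d_not = [random.random() for _ in range(n)]
prior = [random.random() for _ in range(n)]
post = [posterior_h0(a, b, c) for a, b, c in zip(p_d_h0, p_d_not, prior)]
print("marginal:", corr(p_d_h0, post))

# Conditional version: p(H0) = 0.5 and p(D|-H0) = 0.8 held fixed.
post_fixed = [posterior_h0(a, 0.8, 0.5) for a in p_d_h0]
print("conditional:", corr(p_d_h0, post_fixed))
```

Under these assumptions the conditional correlation is near-perfect, because the posterior is then a monotone function of p(D|H0) alone, while the marginal correlation is diluted by the variation in the prior and in p(D|-H0) — which is exactly the distinction between the two quantities the thread argues over.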