Comments on The 20% Statistician: Why you don't need to adjust your alpha level for all tests you'll do in your lifetime.
Daniel Lakens
Updated 2018-09-20T02:31:36.203-07:00
13 comments

[2018-02-10T00:39:33.577-08:00]
I did not. Please provide the quote you are referring to. Then we can discuss what I meant.
Daniel Lakens (https://www.blogger.com/profile/18143834258497875354)

[2018-02-09T19:39:57.718-08:00]
In your online class you just said the opposite of this post.
Anonymous

[2016-11-13T04:28:26.357-08:00]
I can understand Fisher's dismay, but it remains true :)
Daniel Lakens (https://www.blogger.com/profile/18143834258497875354)

[2016-11-13T02:55:12.671-08:00]
Nice post! We should definitely pay more attention to the logical structure of the inferences we make, in particular whether multiple pieces of evidence are combined in a disjunctive (OR operator) or conjunctive (AND operator) manner.
I also think it is sometimes sensible to do neither and simply "average" multiple pieces of evidence (without correction) when we interpret our results.

On a different topic, Fisher would have probably hated to read this sentence at the end: "There is only one reason to calculate p-values, and that is to control Type 1 error rates using a Neyman-Pearson approach"

See (Gigerenzer, 2004) http://library.mpib-berlin.mpg.de/ft/gg/GG_Mindless_2004.pdf
Pierre Dragicevic (https://www.lri.fr/~dragice)

[2016-10-27T10:21:50.377-07:00]
Here's another puzzler for folks interested in the issue of error control over families of tests: Should researchers be correcting for multiple tests, even when they themselves did not run the tests, but all of the tests were run on the same data? The link is here: http://doingbayesiandataanalysis.blogspot.com/2016/10/should-researchers-be-correcting-for.html?showComment=1477588413700#c2320193427321062542
John K. Kruschke (https://www.blogger.com/profile/17323153789716653784)

[2016-05-06T21:57:54.087-07:00]
If you have some data, you would be better off using a meta-analysis. A chi-square would be a dichotomous test (significant: yes or no), whereas a meta-analysis is continuous. Alternatively, you might be interested in literature on controlling the false discovery rate (instead of the Type 1 error rate) - see Benjamini & Hochberg, 1995.
Daniel Lakens (https://www.blogger.com/profile/18143834258497875354)

[2016-05-06T15:53:37.899-07:00]
Hi Daniel,

Thinking about multiple tests: What about calculating the number of significant findings that you'd expect to observe due to chance (given the number of tests), and then running a chi-squared test to determine whether the number of significant results you obtained is itself significantly different from what you'd expect due to chance?

Intuitively I feel like this makes sense... what do you think?
Simon

[2016-02-20T02:05:52.271-08:00]
You cannot combine p-values (a Bayesian t-test does not give a p-value). You could do both tests, interpret the p-value in terms of a Neyman-Pearson (NP) approach (in the long run, I would rarely be wrong if I act as if there is an effect) and then interpret the evidence at hand (and the current data provide strong/weak evidence for the alternative hypothesis).
Daniel Lakens (https://www.blogger.com/profile/18143834258497875354)

[2016-02-20T02:04:20.655-08:00]
No, it would not. You perform separate tests for each individual study. If you want to evaluate all the studies, you need to do a meta-analysis. This has a new theoretical prediction (is there an effect, if I combine all these studies). If this was really one big investigation, it would not make sense to publish these papers separately, right?
And if it makes sense to publish them separately, then you don't need to control the error rate across all studies.
Daniel Lakens (https://www.blogger.com/profile/18143834258497875354)

[2016-02-15T12:53:25.100-08:00]
The post's title is a question that has been nagging me for ages. So thanks! But in a sense some of us do test 1 theory at least some of our lives. For example, if I am testing a specific mechanism or different parts of the same mechanism and not pitting a few "theories" one against the other, would this (theoretical relationship between questions) entail a correction for all these years of testing?!
baruce (https://www.blogger.com/profile/14268921406075621023)

[2016-02-15T12:45:55.141-08:00]
If I got it correct (maybe I didn't), when you say 'Combining both approaches is probably a win-win, where long run error rates are controlled, after which the evidential value in individual studies is interpreted (and, because why not, parameters are estimated)', does it mean one could perform, say, a Bayesian t-test and a Welch t-test on a pairwise comparison and report both Bayesian and frequentist p-values to come up with a decision? Even, would it be OK to combine those p-values?
Fernando Marmolejo-Ramos (https://www.blogger.com/profile/11250153541833828503)

[2016-02-14T11:18:35.343-08:00]
Hi Etienne, I'm thinking of registered reports. There, you could pre-register a set of 2 studies, and they will be published regardless. Let's say the second p-value is 0.8.
If you indeed had high power for a minimum effect (e.g., 95%), you could decide that the effect is small, or null. That should be good to know, right?
Daniel Lakens (https://www.blogger.com/profile/18143834258497875354)

[2016-02-14T11:14:47.026-08:00]
>>>> For example, it is perfectly fine to pre-register a set of two experiments, the second a close replication of the first, where you will choose to reject the null-hypothesis if the p-value is smaller than 0.2236 in both experiments. The probability that you will reject the null hypothesis twice in a row if the null hypothesis is true is α * α, or 0.2236 * 0.2236 = 0.05.

Interesting logic. In practice, however, what would happen if after your first experiment, the *one* and only target statistical test yields p < .30? Do you still run the second experiment?

I guess you have to, given you publicly pre-registered the study. But if the first experiment was highly powered (e.g., 95%) to detect a plausible effect size (e.g., d=.20), doesn't it seem odd to still run the second experiment?
Etienne LeBel (https://www.blogger.com/profile/02395031595701212100)
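
The arithmetic in the passage quoted in the last comment (reject only if p < 0.2236 in both experiments, giving α * α = 0.05) can be checked with a short sketch. It is only an illustration, not code from the post: the function names are hypothetical, and it assumes the two experiments are independent and that p-values are uniformly distributed when the null hypothesis is true.

```python
import random


def per_test_threshold(overall_alpha: float, n_tests: int) -> float:
    """Per-experiment p-value threshold so that rejecting in ALL n_tests
    independent experiments has probability overall_alpha under the null."""
    return overall_alpha ** (1.0 / n_tests)


def joint_alpha(threshold: float, n_tests: int) -> float:
    """Probability of p < threshold in every one of n_tests independent
    experiments when the null is true (threshold multiplied n_tests times)."""
    return threshold ** n_tests


def simulate_joint_rejection_rate(threshold: float, n_tests: int,
                                  n_sims: int = 100_000, seed: int = 1) -> float:
    """Monte Carlo check: draw uniform p-values (assumption: the null is true
    and the tests are independent) and count how often all fall below threshold."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        if all(rng.random() < threshold for _ in range(n_tests)):
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    threshold = per_test_threshold(0.05, 2)
    print("per-experiment threshold:", round(threshold, 4))
    print("joint alpha:", round(joint_alpha(threshold, 2), 4))
    print("simulated joint rejection rate:",
          simulate_joint_rejection_rate(threshold, 2))
```

The same functions accept more than two experiments, but the conjunctive rule only controls the overall error rate at the stated level if the rule is fixed in advance, as in the pre-registered design the post describes.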