The 20% Statistician: reader comments (blog by Daniel Lakens). Feed retrieved 2021-09-27.

Ulrich Schimmack (2021-09-21):
Thank you for sharing. He deserves more credit.

Stephen Senn (2021-09-21):
An interesting post but misleading in some respects. Neyman deserves praise for his support of David Blackwell, but the implication that in this he was somehow different from Fisher is false. Fisher supported and collaborated with many Indian statisticians (see the Sankhya obituary here: http://www.senns.uk/Sankhya_Obit_RAF.pdf), and his doctoral students included CR Rao and the Ghanaian Ebenezer Laing. (It is interesting to note that PV Sukhatme, whose work Fisher liked, studied for a PhD with Neyman and a DSc with Fisher, consistent with the view that attitudes to non-European researchers do not really separate them.) Furthermore, Neyman's enthusiasm for communism had its negative side: it proved to be an embarrassment for Polish statisticians who had to live the reality of the paradise he imagined.

As regards significance and hypothesis testing, in my opinion the difference between Neyman and Pearson on the one hand and Fisher on the other has little to do with p-values versus rejection, and more to do with the role of alternative hypotheses (crucial for Neyman, not needed by Fisher) and conditioning (important for Fisher, but less obviously so for Neyman).

(Small point: there is a typo. Neyman was born in 1894. Also, I suppose it is debatable whether Bender, Neyman's place of birth, should be described as being in Russia.)

Beatrice G. Kuhlmann (2020-11-30):
Thanks for this informative blog post. I fully agree on the nonsense of an observed power analysis with the effect size estimated from the current data. However, as you state later, a post-hoc power analysis based on an effect size of theoretical interest can be very useful. In the case of cognitive modeling, the data points going into the estimation of conditional parameters further back in the model crucially depend on the level of performance achieved in earlier processes and thus cannot be estimated a priori. Here, a post-hoc power analysis based on the level of performance achieved in the earlier parameters, asking whether meaningful, theoretically predicted differences in the parameters further back in the model would have been detected with sufficient power, is the only informative power analysis that can be conducted. I would greatly appreciate it if you could clarify that post-hoc power analysis is more than observed power analysis, and that this rightful critique only concerns the specific subset of post-hoc power analyses referred to as observed power analysis. Thank you.
Nathalia Fernandes (2020-11-29):
I feel privileged to be a scientist in a world where bias and lack of reproducibility are widely discussed.

Alistair Cullum (2020-10-21):
Daniel,

Thanks as always for your work. I don't have a lesson of my own to offer, but I did have a comment on a small part of your first assignment that I think could be problematic.

On page 2 of the posted version of lesson 1.1, you write about the first figure: "There is a horizontal red dotted line that indicates an alpha of 5% (located at a frequency of 100.000*0.05 = 5000)". But that seems like a confusing or misleading statement. First, since the line indicates a Y value, it must be a frequency of observed outcomes for p; a line showing alpha would have to indicate an X value. And even given that the line indicates the expected frequency of outcomes, it's the expectation *under the null hypothesis*, which is not explained here. More importantly, though, even if you do mean that the line shows the expected height of the bars under the null hypothesis, the only reason you can use N*0.05 to predict that height is that you've divided the distribution into 20 bars - it's not because alpha is 0.05. If you'd chosen to divide the graph into increments of 0.01 (as you do later), the height of the red line would be N*0.01 despite alpha being 0.05 (but now there would be five bars in the alpha region instead of just one). So the height of the line is based on the number of divisions, not alpha.

Does that critique make sense? I can try to explain more fully if not.

Cheers,
Alistair Cullum
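[A short simulation can make Alistair's bin-width point concrete. This is an illustrative sketch added here, not part of the original comment; it assumes p-values that are uniformly distributed under the null hypothesis, which holds for any continuous test statistic.]

```python
import numpy as np

# Under H0, p-values of a continuous test are uniform on [0, 1).
rng = np.random.default_rng(42)
n_sim = 100_000
p_values = rng.uniform(0, 1, n_sim)

# With 20 bins (width 0.05), the expected bar height is n_sim * 0.05 = 5000.
counts_20, _ = np.histogram(p_values, bins=20, range=(0, 1))

# With 100 bins (width 0.01), it is n_sim * 0.01 = 1000, even though alpha
# is still 0.05: the reference line tracks the bin width, not alpha.
counts_100, _ = np.histogram(p_values, bins=100, range=(0, 1))

print(counts_20.mean())   # exactly n_sim / 20  = 5000.0
print(counts_100.mean())  # exactly n_sim / 100 = 1000.0
```

Each individual bar fluctuates around its expectation, but the level of the flat reference line is determined entirely by the number of divisions.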
Andy (2020-08-28):
Nice.

I am convinced that if Psychology took Popper seriously, it would take care of so many problems.

Mahmood S. Zargar (2020-08-25):
Awesome post, Daniel. Just what I needed for the students. The color-coding is the best part of it indeed!

Anonymous (2020-08-09):
Thank you very much, Daniel! That's exactly what I need right now for interpreting and reporting my results! I would like to see this in a journal! Can you suggest some literature that points out some of the recommendations you mentioned?

William Chernoff (2020-07-23):
Love this post. What's the correspondence between the two-way table and the vector of means? Is it mu = c(A1B1, A1B2, A2B1, A2B2)?

Naaz (2020-07-21):
Do you have an Excel sheet to calculate omega squared for a 2x2 ANOVA? Please let me know. Thank you for your work!

Nick Brown (2020-07-13):
> (so no, don't even think about dividing that p = .08 you get from an
> F-test by two and reporting p = .04, one-sided)

https://twitter.com/doinkboy/status/1280820213647368204?s=20

¯\_(ツ)_/¯
Trevor (2020-06-05):
Hi Daniel,

I was wondering why this post seemingly goes against what Hopkins and peers recommend on his peer-reviewed website:

https://sportscience.sportsci.org/2020/index.html

even though here you seemed happy with the method outlined on sportscience.sportsci.org:

https://discourse.datamethods.org/t/what-are-credible-priors-and-what-are-skeptical-priors/580/22

Alistair Cullum (2020-05-12):
There's also a Welch's version of ANOVA. This blog post provides a short discussion: https://statisticsbyjim.com/anova/welchs-anova-compared-to-classic-one-way-anova/

Vít (2020-05-11):
Hello Daniel,
This is a very nice blog post (as always). I have read the article, I am using Jamovi and the TOSTER module, and I have played with the provided Excel sheet. There is one thing I do not understand, though. I would like to perform an equivalence test on two independent samples where I do not know the effect size to test for; I want to use raw scores instead. Both the Jamovi module and the Excel sheet allow doing so. Let's say I want to know whether the two samples differ (or are equal) on a 7-point scale, where +/- 0.5 is the threshold I am interested in. I can specify the raw scores as -0.5 and 0.5, but where can I specify the length of my scale? I suppose 0.5 point on a 5-point scale is different from 0.5 on a 7-, 9-, or 11-point scale. Or am I missing something?

I hope you can help me...

Vít

schw.stefan (2020-05-11):
Here is the enlightening response from the author:
https://www.talyarkoni.org/blog/2020/05/06/induction-is-not-optional-if-youre-using-inferential-statistics-reply-to-lakens/

TheRandomTexan (2020-04-01):
OK, now I'm getting nervous. The F statistic in one-way ANOVA uses the MSE as its denominator, and that's just pooled variance on steroids. Should we re-think the F test with more carefully designed factors?
Margaux Labonne (2020-02-06):
Our reviewer's comment is this: "The authors should provide a back-of-the-envelope assessment of what the power of the tests is, given the sample size (a classic reference to look at would be Andrews, Donald W. K. 1989. 'Power in Econometric Applications.' Econometrica 57(5):1059–1090)." Are you familiar with this approach? It is talking about an inverse power function. Thank you.

Noah Motion (2020-01-24):
I haven't read Yarkoni's paper (yet), so this is not meant as a defense of his position. Rather, it's just a response to this post.

What exactly do you mean, and/or what do you take Yarkoni to mean, by "alignment between theories and tests"? And what counts as "close" alignment? I ask because I don't find your argument compelling that close alignment doesn't matter in a (hypothetico-)deductive approach to theory evaluation. I assume you agree that a theory predicting that "cleanliness reduces the severity of moral judgments" would not be well tested (indeed, would not be tested at all) by, say, measuring and comparing walking speed after priming or not priming people with age-related words. If so, then some degree of alignment between tests and theories is required even for a deductive approach.

It's also not clear to me exactly what role statistical testing plays for you in all this. I'm at least approximately on the same page as you with respect to induction (if it's plebeian induction (https://twitter.com/annemscheel/status/1198718886436311041) we're talking about, anyway), but statistical tests are explicitly concerned with drawing inferences about populations, not just observing what occurs in samples. That is, statistical tests are explicitly about *generalization*, at least in part.
Gustav Holst (2019-10-25):
Hi,

Not a statistician, but a physician trying to learn statistics. A bit tired today, so perhaps I have misunderstood something.

I think I have some comments on this. Interesting about decision theory, though.

Here goes: In finance it's clear whether your result is "good" or "bad", "true" or "false": you have an economic return or loss at a certain level. In science, for example (but also in medicine), you get a result either way. The "value" lies (somewhat) in whether you can trust the result or not. The "return" or "loss" could perhaps be seen as whether the applications of the results turn out to be useful in practice or not.

I'm not sure that evaluation through such implementation is in general the best way to go, though.

Instead, I think one could start by setting the level of certainty that is needed to deem a scientific question answered or not answered. When claiming a scientific hypothesis is answered, what is the acceptable likelihood that the answer we have is a false null result, or a false positive result? (Either for a single hypothesis, or in general, for a number of them.) Here I think decision theory may have its place: what the sought-for level of certainty should be in a given situation (with given economic constraints, etc.), or in the scientific community as a whole, can probably be examined with some form of decision theory, in combination with known facts, etc.

The levels of certainty perhaps don't have to be stated in numbers; perhaps "very highly likely" or "very, very unlikely" are good enough.

Then, once one knows that level of requested certainty, one can probably use a stepwise process to reach it, similar to the "stepwise diagnostic process" in medicine or psychology that I think you are familiar with, where you often use several tests in a row; in science this would be equivalent to several studies in a row for a given hypothesis. There, in general, depending on the level of prior probability, etc., I think it may be smart to go for an appropriate level of beta or alpha to obtain the requested level of certainty for either nulls or positives in a first run, and then examine either the positives or the nulls further, depending on which category is known to contain too many false ones. Perhaps this is similar to the Bayesian decision theory that you mention.

This could probably be tested with some sort of simulation.

I may be wrong, but I think that is a somewhat easier approach than the one you propose. (Perhaps also a bit more informative or effective. I think it's better in the long run to know that 3% of null results and 25% of positive ones are probably false than to know that about 10% of each are false. In the first case you mostly have to test the positive ones further; in the second you more or less have to test both positives and negatives further.)

Best wishes!
Rosana Ferrero (2019-10-15):
Great, thank you very much!
I think we should also improve education about confidence intervals.
Do you know this paper?
http://learnbayes.org/papers/confidenceIntervalsFallacy/fundamentalError_PBR.pdf

Anonymous (2019-10-11):
Thank you for the wonderful package.

Is there a good way to determine equivalence margins?

Anonymous (2019-10-11):
I know that editorials are mass-produced, including similar criticisms.

However, I don't know of anyone who has shown how to calculate post-hoc power in the case of Welch's t-test, especially when the sample sizes of the two groups are different.

I have asked for a calculation code or mathematical formula in the following community, but...

https://stats.stackexchange.com/questions/430030/what-is-the-post-hoc-power-in-my-experiment-how-to-calculate-this

Se (2019-10-04):
Gelman nails it down: "It's fine to estimate power (or, more generally, statistical properties of estimates) after the data have come in, but only only only only only if you do this based on a scientifically grounded assumed effect size. One should not not not not not estimate the power (or other statistical properties) of a study based on the 'effect size observed in that study.'"

https://statmodeling.stat.columbia.edu/2018/09/24/dont-calculate-post-hoc-power-using-observed-estimate-effect-size/

awfoot (2019-09-15):
This resource is available for repeated-measures (and mixed) designs, though I think only for two-way designs:
https://www.aggieerin.com/shiny-server/tests/omegaprmss.html

It also provides CIs.

Peter (2019-09-15):
Here's the thing: the problem really isn't how to explain p-values better. The problem is that people generally don't know (a) what the aim of science is and (b) why we would want to use p-values in furtherance of that aim.

Long story short: there can be no such thing as certain (or even probable) knowledge. Knowledge can be objective, but it will always remain relative to fundamental assumptions. That implies that we can only achieve *successively better* knowledge. For that, we can employ valid, deductive logic, which enables us to make choices (www.theopensociety.net/2011/08/the-power-of-logic) that can in turn be informed by (a distribution of!) p-values.
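[Several comments above ask how power can legitimately be computed after the data are in, for example for Welch's t-test with unequal group sizes. The sketch below is an added illustration, not from the original thread: it estimates power by simulation for an assumed, scientifically grounded effect size, in line with the Gelman quote; the effect size and group sizes used are hypothetical.]

```python
import numpy as np
from scipy import stats

# Power of Welch's t-test by simulation, for an *assumed* effect size
# (never the effect size observed in the same study). The values of
# n1, n2, and d below are made-up assumptions for illustration.
rng = np.random.default_rng(1)
n1, n2 = 40, 25        # unequal sample sizes
d = 0.5                # assumed standardized effect size
alpha = 0.05
n_sim = 10_000

rejections = 0
for _ in range(n_sim):
    x = rng.normal(0.0, 1.0, n1)
    y = rng.normal(d, 1.0, n2)
    # equal_var=False makes scipy use Welch's t-test
    _, p = stats.ttest_ind(x, y, equal_var=False)
    rejections += p < alpha

power = rejections / n_sim
print(round(power, 3))
```

The same approach extends to any design for which data can be simulated; the key point is that the effect size fed in must come from theory or prior evidence, not from the study being evaluated.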