About a year ago, I wrote on this blog: If I ever make a follow-up to my current MOOC, I will call it ‘Improving Your Statistical Questions’. The more I learn about how people use statistics, the more I believe the main problem is not how people interpret the numbers they get from statistical tests. The real issue is which statistical questions researchers ask of their data. If you approach a statistician for help with your data analysis, they will spend most of their time asking you ‘but what is your question?’. I hope this course helps you take a step back, reflect on this question, and get some practical advice on how to answer it.
There are 5 modules, with 15 videos and 13 assignments that provide hands-on explanations of how to use the insights from the lectures in your own research. The first week discusses different questions you might want to ask. Only one of these is a hypothesis test, and I examine in detail whether you really want to test a hypothesis, or are simply going through the motions of the statistical ritual. I also discuss why NHST is often not a very risky prediction, and why range predictions are a more exciting question to ask
(if you can). Module 2 focuses on falsification in practice and theory, including a lecture and some assignments on how to determine the smallest effect size of interest in the studies you perform. I also share my favorite colloquium question for whenever you doze off and wake up at the end only to find no one else is asking a question: you can always raise your hand and ask ‘so, what would falsify your hypothesis?’ Module 3 discusses the importance of justifying error rates, offers a more detailed discussion of power analysis (following up on the ‘sample size justification’ lecture in MOOC1), and includes a lecture on the many uses of learning how to simulate data. Module 4 moves beyond single
studies, and asks what you can expect from lines of research, how to perform a
meta-analysis, and why the scientific literature does not look like reality
(and how you can detect, and prevent contributing to, a biased literature). I
was tempted to add this to MOOC1, but I am happy I didn’t, as there has been a
lot of exciting work on bias detection that is now part of the lecture. The
last module covers three different topics I think are important: computational reproducibility, philosophy of science (this video would also have been a good first lecture, but I don’t want to scare people away!), and maybe my favorite lecture in the MOOC, on scientific integrity in practice. All are accompanied by assignments, and the assignments are where the real learning happens.
If after this course some people feel more comfortable abandoning hypothesis testing and just describing their data, making their predictions a bit more falsifiable, designing more informative studies, publishing sets of studies that look a bit more like reality, and making their work more computationally reproducible, I’ll be very happy.
The content of this MOOC is based on more than 40 workshops and talks I have given in the 3 years since my previous MOOC came out, testing this material on live audiences. It comes with some of the pressure a recording artist might feel for a second record when their first was somewhat successful. As my first MOOC hits 30k enrolled learners (many of whom engage with only a small part of the content, but with thousands of people taking in a lot of the material), I hope this one comes close and lives up to expectations.
I’m very grateful to Chelsea Parlett-Pelleriti, who checked all assignments for statistical errors or incorrect statements, and provided feedback that made every exercise in this MOOC better. If you need a statistics editor, you can find her at https://cmparlettpelleriti.github.io/TheChatistician.html.
Special thanks to Tim de Jonge who populated the Coursera environment as a
student assistant, and Sascha Prudon for recording and editing the videos. Thanks
to Uri Simonsohn for feedback on Assignment 2.1, Lars Penke for suggesting the
SESOI example in lecture 2.2, Lisa DeBruine for co-developing Assignment 2.4, Joe
Hilgard for the PET-PEESE code in assignment 4.3, Matti Heino for the GRIM test
example in lecture 4.3, and Michelle Nuijten for feedback on assignment 4.4. Thanks
to Seth Green, Russ Zack and Xu Fei at Code Ocean for help in using their
platform to make it possible to run the R code online. I am extremely grateful
for all alpha testers who provided feedback on early versions of the
assignments: Daniel Dunleavy, Robert Görsch, Emma Henderson, Martine Jansen, Niklas Johannes, Kristin Jankowsky, Cian McGinley, Chris Noone,
Alex Riina, Burak Tunca, Laura Vowels, and Lara Warmelink, as well as the
beta-testers who gave feedback on the material on Coursera: Johannes Breuer,
Marie Delacre, Fabienne Ennigkeit, Marton L. Gy, and Sebastian Skejø. Finally,
thanks to my wife for buying me six new shirts because ‘your audience has
expectations’ (and for accepting how I worked through the summer holiday to
complete this MOOC).
All material in the MOOC is shared under a CC-BY-NC-SA license, and you can access it all for free (and use it in your own education). Improving Your Statistical Questions is available from today. I hope you enjoy it!