A blog on statistics, methods, philosophy of science, and open science. Understanding 20% of statistics will improve 80% of your inferences.

Tuesday, July 14, 2015

Increased depression after the MH17 crash: How convincing is the data?

This blog post is now in press in the American Journal of Epidemiology.

Based on data from pregnant Dutch women participating in a longitudinal study, Truijens and colleagues (1) have drawn the provocative conclusion that the MH17 crash increased depressive symptoms among pregnant women in The Netherlands (Dutch newspaper article on it here). They compare the scores on the Edinburgh Depression Scale (EDS) of 126 women in the month after the MH17 crash with the EDS scores of 106 pregnant women collected during the summer of 2013, and observe a small effect, t(230) = 2.12, P = 0.03, d = 0.3.
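For readers who want to check how the reported statistics hang together, here is a minimal sketch in Python that recovers the effect size and the two-sided p-value from the t-value and the two group sizes given above. It assumes a standard independent-samples t-test with pooled variance. The function names are my own, and the p-value is obtained by simple numerical integration of the t density, so that only the standard library is needed.

```python
import math


def cohens_d_from_t(t, n1, n2):
    """Cohen's d implied by an independent-samples t statistic."""
    return t * math.sqrt(1 / n1 + 1 / n2)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central_half


# Values reported in the paper: t(230) = 2.12, n = 126 (2014) and n = 106 (2013)
d = cohens_d_from_t(2.12, 126, 106)
p = two_sided_p(2.12, 230)
print(f"d = {d:.2f}, p = {p:.3f}")
```

This should give a d of about 0.28 and a p-value of about 0.035, in line with the reported d = 0.3 and P = 0.03.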

The women participated in the HAPPY study (2), which started in January 2013, in which the EDS is administered at 12, 22, and 32 weeks of gestation, as well as one and six weeks after giving birth. This means the researchers have access to the EDS scores from more women at different stages during or after their pregnancy. Instead of ‘defining’ two groups at 32 weeks of gestation, the authors could have chosen to compare the data from all women in the HAPPY study to examine whether there was an increase in depressive symptoms. If the MH17 crash impacted pregnant women, similar patterns should be observed for the other women whose data was available. The authors do not explain why they report a subset of their data, even though selective reporting requires a strong justification.

The evidence itself is weak. High p-values (e.g., above 0.02) are rather unlikely when there is a true effect (3). In addition, a Bayesian t-test (4) using a Cauchy prior with an r scale of 0.707 yields a JZS Bayes Factor of 1.96, indicating the observed data are only 1.96 times more likely under the alternative hypothesis than under the null hypothesis, which is considered anecdotal evidence.

If the MH17 crash increased depressive symptoms, we might expect women to be more depressed at 32 weeks of gestation (after the MH17 crash) than at 22 weeks of gestation (before the MH17 crash). A dependent samples t-test comparing the mean at 22 weeks (mean = 5.09) against the mean at 32 weeks (mean = 5.21) for women measured in 2014 is the most direct test of the hypothesis. Even if the two measurements are highly correlated, this difference is not statistically significant. In other words, there is no significant increase in depressive symptoms after the MH17 crash in Dutch women who were 32 weeks pregnant compared to how depressed the same women were 10 weeks earlier, before the crash had happened.
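To illustrate why the correlation between the two measurements matters for a dependent samples t-test, here is a rough sketch. Only the two means (5.09 and 5.21) come from the text above. The standard deviation of 4.0 and the sample size of 126 are assumptions I make purely for illustration, since the post gives no standard deviations or the exact n for this comparison. The function name is hypothetical as well.

```python
import math


def paired_t(mean_diff, sd1, sd2, r, n):
    """t statistic for a dependent samples t-test from summary statistics.

    The standard deviation of the difference scores shrinks as the
    correlation r between the two measurements increases.
    """
    sd_diff = math.sqrt(sd1 ** 2 + sd2 ** 2 - 2 * r * sd1 * sd2)
    return mean_diff / (sd_diff / math.sqrt(n))


MEAN_DIFF = 5.21 - 5.09  # means at 32 and 22 weeks, from the text
ASSUMED_SD = 4.0         # hypothetical SD of EDS scores, NOT from the paper
ASSUMED_N = 126          # assumption: the same 126 women measured in 2014

results = {}
for r in (0.5, 0.7, 0.9):
    results[r] = paired_t(MEAN_DIFF, ASSUMED_SD, ASSUMED_SD, r, ASSUMED_N)
    print(f"r = {r}: t({ASSUMED_N - 1}) = {results[r]:.2f}")
```

Under these assumed values the t-value grows as the correlation increases, but it stays well below the critical value of roughly 1.98 even at r = 0.9. Different standard deviations would change the exact numbers, so treat this as a sketch of the logic, not a reanalysis.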

Not only the presence of a control condition, but also the choice of the specific control condition, is open for debate. Had the 2014 data been compared to a larger Dutch validation study (5), EDS scores in week 22 and 32 would have differed significantly between years. Comparisons between subjects over time do not allow for strong conclusions. Other factors, such as the weather, differed between 2013 and 2014. The authors note both summers had a ‘comparable number of hours of sunshine’. However, the summer of 2013 was warm and dry (with an official heatwave at the end of July), while the summer of 2014 was cold and wet (with August being the coldest month in 20 years). Since they explicitly considered the influence of the weather, the authors could have given a more accurate overview of the differences between these two summers.

Although I understand the researchers were tempted to look at their data after the MH17 disaster occurred, the observed difference might be spurious or caused by a confound. Replication of the observed pattern in the remaining data the authors have access to would take away concerns about the reliability (although not the validity) of the observed difference.


1) Truijens, SEM, Boerekamp, CAM, Spek, V, Son, MJM, Oei, SG, & Pop, VJM. Increased levels of depressive symptoms among pregnant women in The Netherlands after the crash of flight MH17. American Journal of Epidemiology. (doi: 10.1093/aje/kwv161)

2) Truijens, SE, Meems, M, Kuppens, SM, Broeren, MA, Nabbe, KC, Wijnen, HA, & Pop, VJ. The HAPPY study (Holistic Approach to Pregnancy and the first Postpartum Year): Design of a large prospective cohort study. BMC Pregnancy and Childbirth. 2014;14:312.

3) Lakens, D, & Evers, ERK. Sailing from the seas of chaos into the corridor of stability: Practical recommendations to increase the informational value of studies. Perspectives on Psychological Science. 2014;9:278-292.

4) Rouder, JN, Speckman, PL, Sun, D, Morey, RD, & Iverson, G. Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review. 2009;16:225-237.

5) Bergink, V, Kooistra, L, Lambregtse-van den Berg, MP, Wijnen, H, Bunevicius, R, van Baar, A, & Pop, VJM. Validation of the Edinburgh Depression Scale during pregnancy. Journal of Psychosomatic Research. 2011;70:385-389.


  1. From the graph in the article (https://pbs.twimg.com/media/CJ4nV8hWgAAGmqP.png) it seems to be equally plausible that there was a drop in depression in the "control" (2013) group from 22 to 32 weeks. I suppose one would have to know the general etiology of depression across the span of pregnancy to say whether the 2013 or 2014 numbers were unusual.

    1. Hi Nick, in the norming data (collected by some of the same authors) there is no difference in depression scores in the second and third trimester (weeks 22 and 32). There is at best not enough data to argue there is normally a dip in depression scores in this time, and at worst good reason to believe the very nice summer in 2013 is the reason for the dip in depression scores for women in 2013. The women in 2014 did not become more depressed after the MH17 disaster - that's all we know for sure.