In this blog post I will analyse the arguments that Dr. Amy Cuddy provided in a blog post “The "Power Posing Was Debunked" Myth: What the Research Actually Shows — and Why Scientific Discourse Matters” on February 26. You can find the LinkedIn post here:
In the post, Cuddy says she was “effectively silenced” by an “attempt to shut down
this line” of research. She credits “the courage of the individual scientists
who kept going despite enormous pressure not to” for the fact that she can
still summarize “what the evidence now shows”.
Power posing has two categories of claimed effects. The first effect is on self-reported
feelings. For example, if we instruct people to stand in an expanded versus a
constricted posture, they will self-report feeling more powerful. There is an
ongoing debate about whether, or how much, this effect is caused by a demand
effect (i.e., people report what they think the investigator wants them to say,
not what they actually feel). A meta-analysis has shown this self-report effect
is larger in within-subject designs, and in studies without a cover story (Körner
et al., 2022). The second effect is on physiological or
behavioral outcomes. This is the contested area, and the research outcome that
Cuddy is mainly trying to defend in her blog post. If you want to explore a
meta-analysis on these two categories of effects, you can do so at https://metaanalyses.shinyapps.io/bodypositions/ (made by
Körner et al., 2022). I would especially recommend exploring the
QRP/Publication bias tab for the physiological and behavioral outcomes.
At the end
of the post, Cuddy writes that she is thankful that not everyone stopped doing
research on power poses, because then: “We would not know what we now know —
which is that these effects are real, that they matter, and that the story
people were told was wrong.” She
concludes with: “The evidence is there. It has been there for years. All I am
asking is that people look at it.”
I am happy
to do so. Let’s go.
Trying to find the references
I tried to
look up the references cited by Cuddy in her post. However, this reference:
Andolfi,
V. R., & Antonietti, A. (2020). Contractive vs. expansive body posture
effects on convergent-integrative thinking tasks. Journal of Creative Behavior,
54(4), 871–880.
does not exist in
literature databases, and the authors (who do exist) do not list this paper on
their own websites. An
inspection of the journal’s website shows that a different article was published in volume 54, issue 4 on these pages. This raises questions about how this reference was generated, with
generation by AI being a plausible candidate (also in view of the 4 malformed
references I will point out below). The reference appears in the following sentence in Cuddy’s
LinkedIn post:
Andolfi
and Antonietti (2020, Journal of Creative Behavior) provided further evidence
that contractive postures specifically benefited convergent-integrative
thinking tasks. That level of specificity — where the direction of the effect
depends on the type of cognitive task — is exactly the kind of finding that
emerges when a field matures.
When Cuddy
says ‘The evidence is there’, this is not correct for the Andolfi and
Antonietti article, which does not seem to exist in the scholarly record.
An additional 4 references suggest that the literature review may in part have
been generated by automated tools. For these 4 references, however, real papers
exist that match the content discussed in the literature review in the blog post.
| Reference in blog post | Actual Reference |
| --- | --- |
| Michinov, E., & Michinov, N. (2020). Creativity connected with body posture: The effects of expansive and contractive postures on creative performance. Psychology of Aesthetics, Creativity, and the Arts, 14(1), 116–127 | Michinov, N., & Michinov, E. (2022). Do open or closed postures boost creative performance? The effects of postural feedback on divergent and convergent thinking. Psychology of Aesthetics, Creativity, and the Arts, 16(3), 504–518. https://doi.org/10.1037/aca0000306 |
| Wainio-Theberge, S., Bhatt, M., Bhattacharyya, K., et al. (2025). Neural correlates of power-related postures and their behavioural consequences: A preliminary electrophysiological investigation. Social Cognitive and Affective Neuroscience, 20(1), nsaf03 | Wainio-Theberge, S., & Armony, J. L. (2025). Neural correlates of power-related postures and their behavioural consequences: A preliminary electrophysiological investigation. Social Cognitive and Affective Neuroscience, 20(1), nsaf036. https://doi.org/10.1093/scan/nsaf036 |
| Elkjær, E., Mikkelsen, M. B., Michalak, J., Mennin, D. S., & O'Toole, M. S. (2023). Using bodily displays to facilitate approach action outcomes within the context of a personally relevant task. Frontiers in Psychology, 14, 1147printing | Elkjær, E., Mikkelsen, M. B., Tramm, G., Michalak, J., Mennin, D. S., & O’Toole, M. S. (2022). Using bodily displays to facilitating approach action outcomes within the context of a personally relevant task. Brain and Behavior, 13(1), e2855. https://doi.org/10.1002/brb3.2855 |
| Körner, R., Köhler, H., & Schütz, A. (2020). Powerful and confident children through expansive body postures? A preregistered test of the effects of power posing on children. School Psychology International, 41(4), 315–330. | Körner, R., Köhler, H., & Schütz, A. (2020). Powerful and confident children through expansive body postures? A preregistered study of fourth graders. School Psychology International, 41(4), 315–330. https://doi.org/10.1177/0143034320912306 |
All of these incorrect references point to the later literature, the work that
summarizes the research of the people who ‘kept going’. These references are at the
core of the argument Cuddy is making.
Evaluating the evidence: Three examples
Cuddy wrote a narrative review, which requires that the validity of the conclusions
and the strength of the evidence be evaluated for every study. Let’s carefully
examine some of the papers she cited and evaluate the evidence. Cuddy writes
about a first study:
Wainio-Theberge
and colleagues (2025, Social Cognitive and Affective Neuroscience) published
the first EEG study of power posing, finding significant effects on arousal and
valence, with suggestive differences in frontal brain activity between
expansive and contractive postures. A new neural methodology for a question
people said was already answered.
From the
description in Cuddy’s blog, you might assume the “significant effects on arousal and valence, with suggestive
differences in frontal brain activity between expansive and contractive
postures” would support the hypothesis. But this is not the case. The
significant effects were actually in the opposite direction of the
hypothesis. This is not mentioned in the abstract of the Wainio-Theberge et al. article, and one would need to read the
paper to get this information:
We found
no significant posture differences in the EEG spectral exponent (t(101) = 1.01,
P = .32). In contrast, a significant posture effect was observed for frontal
asymmetry (t(101) = −2.63, P = .01); however, post hoc t-tests in each group
separately (‘Models 1c and 1e’) revealed that the effect was in the opposite
direction as hypothesized (see Discussion). Namely, we observed a significant
right-lateralized frontal alpha asymmetry (FAA) in the contractive group (t(45)
= 2.17, P = .04) and a left-lateralized FAA in the expansive one which failed
to reach significance (t(55) = −1.63, P = .11).
Cuddy writes in the blog that she responded to journalists skeptical about power posing: “I spent more than ten hours responding — reviewing the literature, pulling citations, writing carefully, anticipating distortions”. In this case, her review of the literature presented a finding as support for power posing when in fact the effect was in the opposite direction of the hypothesis.
As a second paper, let’s take Barel and colleagues (2024). First, I want to thank
the authors for sharing their data, after I tried to access it by clicking the
Google Drive link in the article. All numbers were reproducible. Cuddy cites the
paper as follows:
“As other
researchers began testing that broader construct, using different measures in
different populations, they found effects consistently: action orientation
(Huang et al., 2011, Psychological Science), […], and risk-taking itself,
partially (Barel et al., 2024, BMC Psychology).”
It is
unclear what is meant by 'partially', as the authors are clear that they found that power posing did not affect risk-taking: "There was no statistically
significant distribution in risk-taking between high and low power conditions
[χ2 = 0.00, p > 0.99]." The risk-taking outcome that Cuddy cites the study
for is a clear null result.
The basis
for "partially" is presumably a separate analysis reported in the paper: in a logistic
regression predicting risk-taking, the authors found a significant interaction
between power condition and cortisol change and write that they "did
partially replicate an effect of changes in cortisol levels on
risk-taking." But note that they claim an effect of cortisol changes
on risk, not of power posing on risk. For power posing to affect risk through
cortisol, power posing would first have to change cortisol, and it did not: the
authors report no main effects of time or power on cortisol. With that first
link missing, the high-power participants whose cortisol fell are not a subgroup of people for whom the power pose worked, as their cortisol would have moved the same
way without any pose. The significant effect is a within-group
association between two measures, which can't be attributed to the power posing manipulation.
To their
credit, the authors themselves
never claim power posing affected risk-taking. This framing comes from Cuddy,
who presents the paper as a partial replication of a risk-taking effect after a power posing manipulation, which
the study did not support.
When
discussing a third paper, Cuddy writes: “Körner, Köhler, and Schütz (2020,
School Psychology International) conducted a preregistered study of 108 German
fourth graders — children — and found that expansive postures increased
self-esteem, positive feelings, feelings of power, and even children's
perceptions of their relationship with their teacher. The strongest effects
were on school-related self-esteem. This is exactly the kind of applied,
developmentally informed research that matters — taking findings from the lab
and asking whether they help real children in real classrooms.”
The Körner et al. study was preregistered (https://aspredicted.org/blind.php?x=sn4su9)
with 4 t-tests to examine 4 dependent variables of interest. Of the 4 tests, 2 are
significant (p = 0.04 and p = 0.013), but neither survives a correction for
multiple comparisons (0.05/4 = 0.0125), which was necessary in this analysis.
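The arithmetic behind that correction can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and only the two significant p-values quoted above are included because the other two are not reported in this post.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test alpha that keeps the familywise error rate at `alpha`."""
    return alpha / n_tests

alpha = 0.05
n_tests = 4  # the four preregistered t-tests
threshold = bonferroni_threshold(alpha, n_tests)

# The two significant p-values quoted above (the other two are not listed here).
reported_p = [0.04, 0.013]
survives = [p < threshold for p in reported_p]

print(f"corrected threshold: {threshold:.4f}")
for p, ok in zip(reported_p, survives):
    print(f"p = {p}: {'survives' if ok else 'does not survive'} the correction")
```

Bonferroni is the simplest correction and somewhat conservative; it is used here because it matches the 0.05/4 calculation in the text.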
The blog by
Cuddy states “The strongest effects were on school-related self-esteem.” But
the biggest effect is actually on the student-teacher relationship:
Finally,
there was a significant difference between the two groups regarding the
pictures related to the student–teacher relationship: high power posers more
frequently chose the picture showing a good student–teacher relationship than
low power posers, Χ²(1) = 11.181, p = .001, φ = –.322.
But there is a
problem with this finding. Students spent months building a relationship with their
teacher. Then, as part of the
experiment, the students posed for 60 seconds and self-reported on that
relationship, without any further interaction with the teacher. There is no possible causal mechanism for the
power pose to impact the relationship with teachers. Although unintended, this
question is an excellent probe for demand effects. As the power pose can’t
change history and impact the actual relationship between students and
teachers, the observed effect can only be caused by a demand effect. Neither
Cuddy nor the original authors realized this. Cuddy instead concludes: “This is
exactly the kind of applied, developmentally informed research that matters —
taking findings from the lab and asking whether they help real children in real
classrooms.”
Evaluating the Research Line
Evaluating
evidence is effortful and messy. Single studies always have weaknesses, and the reader might reasonably wonder whether
I’m cherry-picking a few bad apples from an otherwise strong set. I don’t think I am, and I will
explain the more general pattern I observed when reading all the cited papers.
Exploratory claims
The Körner
et al (2020) study above was preregistered, and therefore we were able to
evaluate that the claims were not severely tested, as they would not survive
the required correction for multiple comparisons (Lakens,
2019). But most claims in the papers that
Cuddy cites are based on exploratory analyses. The studies all have many
dependent variables, and a large number of tests can be performed. These
studies observe a mix of significant and non-significant results, but the
significant results have a high probability of being Type 1 errors and can’t be
presented as evidence. If researchers in this field performed more direct
replication studies and preregistered their studies more often, they could
address this problem. Some preregister their studies, which is excellent, but others do not, even though they work in a highly
contested research area, and the significant results primarily come from exploratory analyses.
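To see why many dependent variables are a problem, here is a minimal sketch of how quickly the chance of at least one false positive grows with the number of tests. It assumes independent tests at an alpha of 0.05 with every null hypothesis true, which is a simplification, since outcomes measured in the same study are usually correlated. The function name is illustrative.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests

for k in (1, 4, 10, 20):
    rate = familywise_error_rate(0.05, k)
    print(f"{k:>2} tests: {rate:.3f}")
```

With four uncorrected tests the rate is already close to 19%, and it keeps climbing as more outcomes or analysis variants are tried.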
Researchers
in the field are often honest about this, but especially in a narrative
summary, it is easy to lose track of the fact that most of the authors of studies
cited by Cuddy do not consider their own findings to be strong evidence. For
example, Metzler et al (2023) write “Finally, it is important to
transparently report on the level of evidence this study provides for power
pose effects on low-level social behavior. This requires mentioning its
exploratory nature [...] we are convinced that the medium effect sizes, given
our sample size, would require replication before strong conclusions can be
drawn”. I would say this is especially important given that the main result was
a 3-way interaction with a p-value of 0.03: “the predicted three-fold
interaction suggested that this effect of emotion on action choices (more
avoidance for anger than fear) changed between sessions as a function of
adopted pose (OR = 1.19, 95% CI[1.02, 1.38], z = 2.18, p = .029)”.
Another example comes from Elkjær et al (2022). The main finding is: “Concerning approach tendencies, the 2 × 3 interaction analysis on DAT “approach threat 1” was significant (F(1, 87) = 3.27, p = .043, ηp2 = .07). Regarding DAT avoid threat (1 + 2), the overall 2 × 3 interaction analysis was significant (F(1, 87) = 6.39, p = .003, ηp2 = .13).” The study was preregistered (https://aspredicted.org/blind.php?x=9j3b38) which allows us to see that the preregistered predictions are not supported. The authors predicted significant effects for the expansive condition compared to both the constricted condition and the control condition. However, they did not find effects compared to the control condition.
Such patterns of mixed results are present in many studies in the literature. On the one hand, this is part of normal research, especially early on in research lines, when researchers have not figured out how to reliably produce the effect they are examining. On the other hand, power posing has been studied since 2010, and a research line can never get a strong basis if it does not move beyond a literature where all significant results are based on exploratory partial confirmations.
If you want to see the exploration of data in action, I would recommend looking at the OSF repository related to the paper by Michinov and Michinov (2024): www.osf.io/c9mzh, and see which variables and ways of computing variables are reported in the final paper, and which are not.
Underpowered studies and selection for significance
The sample
sizes in the studies cited by Cuddy are often small – especially for key
sub-group analyses, when the total sample size might be distributed across
cells in a 2x3 design. This would not be problematic if the effects of power
posing were known to be large. But even the self-report effect where
participants indicate they feel more or less powerful has a rather small effect size of only g =
0.37 (see https://metaanalyses.shinyapps.io/bodypositions/). Less direct effects, for example
on behavior, are likely to be much smaller (unless researchers can
propose strong theoretical arguments why more indirect effects would be larger,
see Anvari et al., 2023). In one-tailed independent t-tests, 80% power to detect
an effect of g = 0.37 would require 184 participants (92 per condition), but none
of the studies come close to such sample sizes.
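As a rough check on that number, the required sample size can be approximated with the standard normal-approximation formula. This is a sketch under stated assumptions: an alpha of 0.05 (not stated explicitly above), equal group sizes, and a normal approximation that slightly underestimates the exact t-based result. That is why it lands just below the 92 per condition quoted in the text. The function name is illustrative.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80, one_tailed=True):
    """Approximate n per group for an independent-samples t-test,
    using the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha) if one_tailed else z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

n = n_per_group(0.37)
print(n, "per group,", 2 * n, "in total")

# Smaller (more realistic) behavioral effects need far larger samples.
print(n_per_group(0.20), "per group for d = 0.20")
```

Because the required n scales with 1/d², halving the assumed effect size roughly quadruples the sample that is needed.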
The
research area of power posing is also characterized by the selective reporting of
significant results. This combination of underpowered studies and selection for
significance leads to highly inflated effect sizes. We can see these effects in
Andolfi et al (2017):
The effect
sizes of an open or closed posture simply can’t be in the range of d = 1.22, or
even d = 0.69 (for examples of realistic effect sizes to expect based on group
differences, see DataColada 18). The effects are inflated, and
there is no way of knowing what the true effect sizes are. They might be zero,
as many replication studies of exactly such implausibly large effects based on
studies with tiny samples have turned out to be.
The study
by Michinov and Michinov similarly shows effects for significant tests that are
too large. Adopting a posture for a few minutes can’t plausibly influence
creative tasks with effects such as d = 0.634. When you evaluate evidence, thinking
about selective reporting and inflated effects should be part of the
evaluation.
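The inflation that results from combining small samples with selection for significance can be illustrated with a small simulation. Everything here is hypothetical: the true effect (d = 0.2), the sample size (15 per group), and the function name are assumptions chosen for illustration and are not taken from any of the cited studies.

```python
import random
import statistics

def simulate_significant_effects(true_d, n_per_group, t_crit, n_sims=5000, seed=1):
    """Return the observed Cohen's d of every simulated two-group study
    that reached significance in the predicted direction."""
    rng = random.Random(seed)
    significant = []
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treatment = [rng.gauss(true_d, 1.0) for _ in range(n_per_group)]
        pooled_sd = ((statistics.variance(control)
                      + statistics.variance(treatment)) / 2) ** 0.5
        d = (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd
        t = d * (n_per_group / 2) ** 0.5  # t statistic for equal group sizes
        if t > t_crit:  # keep only "publishable" results
            significant.append(d)
    return significant

# Hypothetical scenario: true d = 0.2 and 15 participants per group.
# 2.048 is the two-sided 5% critical t value for df = 28.
sig = simulate_significant_effects(true_d=0.2, n_per_group=15, t_crit=2.048)
print(f"{len(sig)} of 5000 simulated studies were significant")
print(f"mean observed d among them: {statistics.mean(sig):.2f}")
```

With 15 participants per group, a study can only reach significance if the observed d exceeds roughly 0.75, so every "successful" study in this scenario overestimates the true effect several times over.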
Quality of the design and analysis
I could not
help noticing that there is a lot of room to improve the quality of the study
design and analysis, as reported in papers in this literature. This in itself
does not mean that the evidence is unreliable, but it does not make it easier
for a research field to generate high quality evidence. For example, Elkjær et
al (2022) report the following power
analysis:
“Based on a
priori power calculations, using a repeated-measures ANOVA interaction
analysis, 2 (time; before vs. after the manipulation) × 3 (condition; EXP, CON,
N), 90 participants were required to detect a small effect size (d = 0.34),
with an alpha of .05 and a beta of .20.”
At first sight, this looks like best practice. They acknowledge power posing effects are
small (d = 0.34 is very much in line with the meta-analysis they published in
the same year). Regrettably, what the authors actually did was enter an f =
.34, not a d, as you can see in the screenshot below, which leads to a sample
size that is much lower than what they would have needed to achieve
high power according to their own meta-analytic effect size estimate:
This means that
despite the power analysis, the study was still massively underpowered. The
sample size justifications in all studies cited by Cuddy are problematic. This
is probably true for many research lines, but it is especially problematic for
a research line where researchers are still trying to establish if the basic
effect exists or not.
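To get a feel for how much the d versus f mix-up matters, here is a minimal sketch based on the textbook relation f = d / 2 for two equally sized groups. That relation is an assumption made for illustration: the study used a 2 × 3 mixed design, where the exact power calculation depends on additional inputs such as the correlation between repeated measures. The function names are hypothetical, and the entered value is taken as 0.34 in magnitude.

```python
def d_to_f(d):
    """Cohen's f for a two-group comparison with equal group sizes: f = d / 2."""
    return d / 2

def f_to_d(f):
    """Inverse conversion for the same two-group case: d = 2 * f."""
    return 2 * f

intended_d = 0.34  # the effect size the authors say they powered for
entered_f = 0.34   # the value entered as f instead

print(f"d = {intended_d} corresponds to f = {d_to_f(intended_d):.2f}")
print(f"f = {entered_f} corresponds to d = {f_to_d(entered_f):.2f}")

# Required sample size scales roughly with 1 / effect_size ** 2, so doubling
# the assumed effect cuts the required sample to about a quarter.
ratio = (f_to_d(entered_f) / intended_d) ** 2
print(f"approximate sample size shortfall: {ratio:.0f}x")
```

Under this simplified two-group view, entering 0.34 as f amounts to assuming an effect twice as large as intended, which is why the resulting sample size falls so far short.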
While reading the articles, I also noticed many of the issues that we often see in other literatures when research teams lack statistical expertise. There are often small inconsistencies in the reported degrees of freedom, incorrectly performed statistical tests, an overreliance on p-values despite underpowered studies, and misinterpretations of non-significant results. I don’t want to single out more examples, but it would probably be good for the field if researchers enlisted some methodological and statistical expertise if they want to generate reliable evidence.
Tools to evaluate claims
Cuddy writes: “When people are told that research is fake — without being given the tools to evaluate that claim — it doesn't just affect one researcher or one line of work. It feeds a broader cynicism: that science can't be trusted, that findings are arbitrary, that expertise is performance.” I strongly agree. This is why I have created a free textbook, Improving Your Statistical Inferences, to learn how to evaluate the actual evidence in scientific papers. Here are three decent heuristics to follow when you evaluate the evidence in a research line:
- If a finding shows what you want to be true, be extra skeptical.
- If you have a strong conflict of interest, be extra skeptical.
- Low power due to small sample sizes, a lack of preregistration, no direct replications, strong indications of selective reporting, low methodological quality, limitations that are repeated in discussion sections without being addressed, implausibly large effect sizes, a lack of impact on other research areas, significant claims that mainly come from exploratory analyses, continued uncertainty about the basic effect after more than a decade and dozens of studies, and a research community disengaging from a literature are all signs of a lack of evidence.
According
to Cuddy, she “live[s] inside a false narrative” where power posing is
incorrectly believed to be a ‘myth’, and she believes that “none of this would
have happened if the methods guys, and the journalists who trusted them without
doing proper research, hadn't created the conditions that made it happen.”
Scientific criticism is a cornerstone of a healthy science
When I read
Cuddy’s LinkedIn post, I was highly skeptical of the claim that there was
evidence for effects of power posing on measures other than self-report, and that the debunking was a 'myth'. But my
first response was to ignore the post. I did not want to examine the evidence
behind the claims Cuddy made, because I am clearly one of the “methods guys” who,
according to Cuddy, “manufactured the "debunked" narrative and aimed
it, with great precision, at a single researcher”. If I criticized her post,
would I be seen as contributing to “the bullying I was subjected to”, as Cuddy
writes?
But I care
about criticism in science. And I think it is important that we can criticize scientific
claims. My decision not to follow up on examining the claims in the blog post kept
nagging me. Cuddy has 900,000 followers on LinkedIn, many of whom will have read the very
strong statement that it is a “myth” that power posing was debunked. If the
evidence she presented was overstated – as I feared – scientific criticism
would be needed to correct the record. I think it is essential to increase
social safety in academia, while being able to criticize each other. I do not
want bullying and scientific criticism to become conflated. Scientific
criticism is too important for a healthy science to shy away from it, for fear
of being called a bully.
I think scientific criticism is a cornerstone of a reliable science. We have a
responsibility to criticize public claims that we believe to be incorrect
(because they are AI-generated, miscite the literature, or overstate the
evidence). When I asked whether criticism like this should be voiced publicly (here, here,
here, and here),
most of the people in my network remarked that such criticisms should be voiced
publicly. Others thought I should share these issues privately. In a way, I
have always found it comforting to face choices that I know will upset some
scientists either way. It makes it easier to act on my own principles. And I
believe it is essential for a science that aims to contribute to society to
maintain a healthy culture of public scientific criticism.
Thanks to Nina,
Sajedeh, Nick and Lisa for feedback on this blog post.
References
Andolfi, V. R., Di Nuzzo, C., & Antonietti, A. (2017). Opening the mind through the body: The effects of posture on creative processes. Thinking Skills and Creativity, 24, 20–28. https://doi.org/10.1016/j.tsc.2017.02.012
Anvari, F., Kievit, R., Lakens, D., Pennington, C. R., Przybylski, A. K., Tiokhin, L., Wiernik, B. M., & Orben, A. (2023). Not All Effects Are Indispensable: Psychological Science Requires Verifiable Lines of Reasoning for Whether an Effect Matters. Perspectives on Psychological Science, 18(2), 503–507. https://doi.org/10.1177/17456916221091565
Barel, E., Shahrabani, S., Mahagna, L., Massalha, R., Colodner, R., & Tzischinsky, O. (2024). The effects of power posing on neuroendocrine levels and risk-taking. BMC Psychology, 12(1), 726. https://doi.org/10.1186/s40359-024-02194-7
Elkjær, E., Mikkelsen, M. B., Tramm, G., Michalak, J., Mennin, D. S., & O’Toole, M. S. (2022). Using bodily displays to facilitating approach action outcomes within the context of a personally relevant task. Brain and Behavior, 13(1), e2855. https://doi.org/10.1002/brb3.2855
Körner, R., Röseler, L., Schütz, A., & Bushman, B. J. (2022). Dominance and prestige: Meta-analytic review of experimentally induced body position effects on behavioral, self-report, and physiological dependent variables. Psychological Bulletin, 148(1–2), 67–85. https://doi.org/10.1037/bul0000356
Lakens, D. (2019). The value of preregistration for psychological science: A conceptual analysis. Japanese Psychological Review, 62(3), 221–230. https://doi.org/10.24602/sjpr.62.3_221
Metzler, H., Vilarem, E., Petschen, A., & Grèzes, J. (2023). Power pose effects on approach and avoidance decisions in response to social threat. PLOS ONE, 18(8), e0286904. https://doi.org/10.1371/journal.pone.0286904
Michinov, N., & Michinov, E. (2024). Can Sitting Postures Influence the Creative Mind? Positive Effect of Contractive Posture on Convergent-Integrative Thinking. Creativity Research Journal, 36(1), 58–69. https://doi.org/10.1080/10400419.2022.2072557