Professor priming – or not

This was my first contribution to a Registered Replication Report (RRR). Being one of 40 participating labs was an interesting exercise – it might seem straightforward to run the same study in different labs, but we learned that such small things as ü, ä and ö can generate a huge number of problems and a lot of work (read this if you are into these kinds of things).

Here is one of the central results:

So overall not a lot of action … our lab was actually the one with the largest effect size (in the predicted direction).

Here is the abstract of the whole paper, and here is the commentary by Ap Dijksterhuis. Naturally, he sees things a bit differently …

Dijksterhuis and van Knippenberg (1998) reported that participants primed with an intelligent category (“professor”) subsequently performed 13.1% better on a trivia test than participants primed with an unintelligent category (“soccer hooligans”). Two unpublished replications of this study by the original authors, designed to verify the appropriate testing procedures, observed a smaller difference between conditions (2-3%) as well as a gender difference: men showed the effect (9.3% and 7.6%) but women did not (0.3% and -0.3%). The procedure used in those replications served as the basis for this multi-lab Registered Replication Report (RRR). A total of 40 laboratories collected data for this project, with 23 laboratories meeting all inclusion criteria. Here we report the meta-analytic result of those 23 direct replications (total N = 4,493) of the updated version of the original study, examining the difference between priming with professor and hooligan on a 30-item general knowledge trivia task (a supplementary analysis reports results with all 40 labs, N = 6,454). We observed no overall difference in trivia performance between participants primed with professor and those primed with hooligan (0.14%) and no moderation by gender.

The root of the problem

One of the root causes of where we are (as a science) in psychology and many other disciplines, in terms of the reproducibility of key (and other) results, could not be summed up better than by the man himself, Daryl Bem (2002):

“If a datum suggests a new hypothesis, try to find additional evidence for it elsewhere in the data. If you see dim traces of interesting patterns, try to reorganize the data to bring them into bolder relief. If there are participants you don’t like, or trials, observers, or interviewers who gave you anomalous results, drop them (temporarily). Go on a fishing expedition for something — anything — interesting. ”

‘Go on a fishing expedition’ – why should anything good come from such advice? Bem goes on …

“No, this is not immoral (SIC!). The rules of scientific and statistical inference that we overlearn in graduate school apply to the “Context of Justification.” They tell us what we can conclude in the articles we write for public consumption, and they give our readers criteria for deciding whether or not to believe us. But in the “Context of Discovery,” there are no formal rules, only heuristics or strategies.”

I disagree with this statement, because the idea of finding something through torturing the data (until they confess) is a huge source of false positive results. We find an effect and falsely conclude that something is there when in fact there is nothing. I found the above quote when reading this paper by Zwaan, Etz, Lucas & Donnellan (2017) – a target article for BBS which presents six common arguments against replication and a set of really good responses for such discussions.
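To put a rough number on the false positive problem, here is a back-of-the-envelope sketch. It is only an illustration and comes from neither Bem nor Zwaan et al. It assumes that every "look" at the data is an independent significance test at α = 0.05 and that there is truly no effect. The function name `familywise_error_rate` is made up for this example. Real fishing expeditions involve correlated tests, so treat the numbers as ballpark figures.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent
    tests, assuming there is no true effect in any of them."""
    # Probability that all n_tests stay non-significant is (1 - alpha)^n_tests;
    # the complement is the chance that at least one comes out "significant".
    return 1.0 - (1.0 - alpha) ** n_tests


# Assumed scenario: alpha = 0.05 and an increasing number of looks at the data.
for k in (1, 5, 20, 50):
    rate = familywise_error_rate(0.05, k)
    print(f"{k:>2} looks at the data: {rate:.0%} chance of a spurious 'finding'")
```

Under these assumptions, twenty looks at the data already give roughly a 64% chance of at least one "significant" result when nothing is there.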

Here are the six ‘concerns’ the authors discuss:

Concern I: Context Is Too Variable
Concern II: The Theoretical Value of Direct Replications is Limited
Concern III: Direct Replications Are Not Feasible in Certain Domains
Concern IV: Replications are a Distraction
Concern V: Replications Affect Reputations
Concern VI: There is no Standard Method to Evaluate Replication Results

Both are really good reads – for very different reasons.


Bem, D. (2002). Writing the empirical journal article. In Darley, J. M., Zanna, M. P., & Roediger III, H. L. (Eds.), The Compleat Academic: A Career Guide. Washington, DC: American Psychological Association.

Zwaan, R. A., Etz, A., Lucas, R. E., & Donnellan, B. (2017, November 1). Making Replication Mainstream. Retrieved from

Everything you believe in is wrong – or is it simply terrorism?

The replication crisis has many interesting effects on how people (and scientists) think about psychology (and, of course, other fields) … Here is a nice summary of effects that are hard to replicate, among them ‘classics’ like the power pose or the big brother eyes.

A lot is happening because of these new insights in terms of research (e.g., replication studies) and communication (e.g., Fritz Strack on Facebook).

And then this: Susan Fiske in an upcoming piece in the APS Observer … I am really struggling with this rhetoric – Daniel Lakens to the rescue 🙂

Ah – and of course Gelman.

Everything is fucked …

This syllabus of an (obviously) awesome class has a ton of good reads:

Everything is fucked: The syllabus

by Sanjay Srivastava

I would have two additions:

  1. A multi-lab replication project on ego-depletion (Hagger & Chatzisarantis, 2016)
  2. And the response from Roy Baumeister and Kathleen D. Vohs

It’s a really good statement of how f… up things are (in addition to all the other good examples above) …

“A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.” – Max Planck