A letter to the Black Goat

I wrote this letter to the Black Goat podcast … will update here if I hear back from them …


Dear goaters (is this a good way to address the three of you?),

I attended SIPS some weeks ago (first-timer). I was unsure what to expect but got a lot of bang for my buck (which is great). As a side note, I would recommend that first-timers go there with a concrete project, question, or problem – there is a good chance of finding people with similar issues who are interested in collaborating.
Here is an observation I would be interested to hear your thoughts on: in the SIPS world everything seems to be really straightforward – we want to ‘unfuck psych science’ (Lindsay, 2018), we want to pre-reg studies, upload data, learn R, comment on code, accelerate science, ‘do it right this time’, collaborate, respect, be inclusive (I really learned a lot in that regard listening to your podcast and talking to people at SIPS) …
All of that is great. I subscribe to all of these points.
Here is the twist – people @SIPS talk about ‘a movement’ (something that seems very American to me); maybe a movement is needed. People @SIPS talk about ‘a revolution’ – again, great! Obviously there is a need to rattle the cage and accelerate things beyond ‘paradigm shifts at funerals’ (Planck?).
What happens if you go ‘Outside the SIPS Bubble’ (OSIPSB)?
I have no data (other than my own experiences), but I wonder whether the world (psychology, other sciences) is actually that ready, willing, and open to adapt to these new standards and to actually make a paradigm shift in how we do science. I work at a business school (consumer psych and JDM), where there is a lot of finger-pointing toward psychology (I have a similar feeling that within psychology there is a lot of finger-pointing toward social psych) – ‘this is clearly a problem of psychologists but not of us [economists, consumer psychologists, business, accounting researchers …]’.
Another OSIPSB experience I had was talking to an Action Editor of the Journal of Consumer Research (JCR) last year – we got into a heated debate about the most basic issues, e.g., sharing data, pre-registering studies …
Is this an observation you share? What would be good steps to address these issues? Should we talk about this within SIPS to strike a better balance between enthusiasm and real-world requirements (e.g., hiring decisions are still made mostly by senior faculty who, assuming my observation holds, are less interested in replication than in ‘new and exciting effects’ (quote from an anonymous senior AE of JCR))?
Thanks for your thoughts!

Blind Haste (aka im Blindflug)

Chance encounters sometimes lead to interesting new projects. This is one of those cases … I got to know Emanuel de Bellis during my time at Nestlé, and we have never stopped collaborating since (he actually topped my 2017 ‘most skyped’ statistics, with my wife a close second …)

We got our hands on a rich dataset of speed measurements that the police in Zürich (Switzerland) collect throughout the year for planning purposes, unbeknownst to drivers. The radar is put into a small black box that hardly anybody notices when driving by:

(it’s the black box above the bin – not the bin!)

So – we got these data from over a million cars and got to work, trying to answer a question in perception research: Do people perceive their environment differently when light conditions deteriorate? And (even more important) do drivers change their driving speed accordingly?

Well – they don’t … and here is correlational proof 🙂

Stay tuned for a causal demonstration – oh yes!

Blind haste: As light decreases, speeding increases

Worldwide, more than one million people die on the roads each year. A third of these fatal accidents are attributed to speeding, with properties of the individual driver and the environment regarded as key contributing factors. We examine real-world speeding behavior and its interaction with illuminance, an environmental property defined as the luminous flux incident on a surface. Drawing on an analysis of 1.2 million vehicle movements, we show that reduced illuminance levels are associated with increased speeding. This relationship persists when we control for factors known to influence speeding (e.g., fluctuations in traffic volume) and consider proxies of illuminance (e.g., sight distance). Our findings add to a long-standing debate about how the quality of visual conditions affects drivers’ speed perception and driving speed. Policy makers can intervene by educating drivers about the inverse illuminance‒speeding relationship and by testing how improved vehicle headlights and smart road lighting can attenuate speeding.
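The flavor of such a controlled correlational analysis can be sketched in a few lines. The sketch below uses simulated data, and the variable names (`log_lux`, `traffic`) are hypothetical stand-ins, not the actual Zurich dataset: a regression of speed on illuminance while controlling for traffic volume.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000  # toy stand-in for the 1.2 million vehicle movements

# Hypothetical variables: log illuminance, traffic volume, and speed
# with a built-in negative illuminance effect (darker -> faster).
log_lux = rng.uniform(-1, 5, n)            # darkness ... bright daylight
traffic = rng.poisson(20, n).astype(float)  # vehicles per interval
speed = 52 - 1.5 * log_lux - 0.1 * traffic + rng.normal(0, 3, n)

# OLS: speed ~ intercept + log_lux + traffic (controls for traffic volume)
X = np.column_stack([np.ones(n), log_lux, traffic])
beta, *_ = np.linalg.lstsq(X, speed, rcond=None)
print(f"illuminance coefficient: {beta[1]:.2f}")  # negative: less light, more speed
```

The sign of the illuminance coefficient, with traffic held constant, is what carries the “as light decreases, speeding increases” claim; the real analysis of course involves many more controls.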

Everything you believe in is wrong – or is it simply terrorism?

The replication crisis has many interesting effects on how people (and scientists) think about psychology (and, of course, other fields) … Here is a nice summary of effects that are hard to replicate. Among them are ‘classics’ like the power pose or the big brother eyes.

A lot is happening because of these new insights in terms of research (e.g., replication studies) and communication (e.g., Fritz Strack on Facebook).

And then this: Susan Fiske in an upcoming piece in the APS Observer … I am really struggling with this rhetoric – Daniel Lakens to the rescue 🙂

Ah – and of course Gelman.

Everything is fucked …

This syllabus of an (obviously) awesome class has a ton of good reads:

Everything is fucked: The syllabus

by Sanjay Srivastava

I would have two additions:

  1. A multi-lab replication project on ego depletion (Hagger & Chatzisarantis, 2016)
  2. And the response from Roy Baumeister and Kathleen D. Vohs

It’s a really good statement of how f…ed up things are (in addition to all the other good examples above) …

“A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.” – Max Planck


How WEIRD subjects can be overcome … a comment on Henrich et al.

Joe Henrich published a target article in BBS talking about how economics and psychology base their research on WEIRD (Western, Educated, Industrialized, Rich and Democratic) subjects.

Here is the whole abstract:

Behavioral scientists routinely publish broad claims about human psychology and behavior in the world’s top journals based on samples drawn entirely from Western, Educated, Industrialized, Rich and Democratic (WEIRD) societies. Researchers—often implicitly—assume that either there is little variation across human populations, or that these “standard subjects” are as representative of the species as any other population. Are these assumptions justified? Here, our review of the comparative database from across the behavioral sciences suggests both that there is substantial variability in experimental results across populations and that WEIRD subjects are particularly unusual compared with the rest of the species—frequent outliers. The domains reviewed include visual perception, fairness, cooperation, spatial reasoning, categorization and inferential induction, moral reasoning, reasoning styles, self-concepts and related motivations, and the heritability of IQ. The findings suggest that members of WEIRD societies, including young children, are among the least representative populations one could find for generalizing about humans. Many of these findings involve domains that are associated with fundamental aspects of psychology, motivation, and behavior—hence, there are no obvious a priori grounds for claiming that a particular behavioral phenomenon is universal based on sampling from a single subpopulation. Overall, these empirical patterns suggest that we need to be less cavalier in addressing questions of human nature on the basis of data drawn from this particularly thin, and rather unusual, slice of humanity. We close by proposing ways to structurally re-organize the behavioral sciences to best tackle these challenges.

I would like to make three suggestions that could help to overcome the era of WEIRD subjects and generate more reliable and representative data. These suggestions mainly touch contrasts 2, 3 and 4 elaborated by Henrich, Heine and Norenzayan. While my suggestions tackle these contrasts from a technical and experimental perspective, they do not provide a general solution for the first contrast between industrialized and small-scale societies. Here are my suggestions: 1) replications in multiple labs, 2) drawing representative samples from a population and 3) internet-based experimentation.
The first suggestion, replication in multiple labs, foremost touches aspects like replication, multiple populations and open data access. For publication in a journal, a replication of the experiment in a different lab would be obligatory. The replication would then be published with the original, e.g., in the form of a comment. This would ensure that other research labs in other states or countries are involved and that very different parts of the population are sampled. Results of experiments would also be freely available to the public, and the data-sharing problem in psychology, described in the target article but also documented in other fields like medicine (Savage & Vickers, 2009), would be a problem of the past. Of course, such a step would be closely linked with certain standards, on the one hand for building experiments and on the other for storing data. While a standard way to build experiments seems unlikely, there are many methods available in computer science to store data in a reusable form, for example through the use of XML (Extensible Markup Language).
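As a toy illustration of what XML-based, reusable data storage could look like, here is a minimal sketch using Python’s standard library. The schema (`experiment`/`trial`/`condition`/`reaction_time_ms`) is entirely made up for this example, not a proposed standard:

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal schema for sharing trial-level experiment data.
experiment = ET.Element("experiment", id="exp-001", lab="lab-A")
for trial_no, (condition, rt) in enumerate([("control", 532), ("treatment", 487)], 1):
    trial = ET.SubElement(experiment, "trial", number=str(trial_no))
    ET.SubElement(trial, "condition").text = condition
    ET.SubElement(trial, "reaction_time_ms").text = str(rt)

xml_string = ET.tostring(experiment, encoding="unicode")
print(xml_string)

# Any other lab can parse the shared file back into usable data:
parsed = ET.fromstring(xml_string)
rts = [int(t.find("reaction_time_ms").text) for t in parsed.findall("trial")]
print(rts)  # [532, 487]
```

The point is not this particular schema but the round trip: a replication lab can read the original lab’s data without ever seeing its analysis scripts.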
The second suggestion is based on drawing representative samples from the population. As described in the target article, research often suffers from a restriction to extreme subgroups of the population, from which generalized results are drawn. However, there is published work that overcomes these restrictions. As an example, consider the Hertwig, Zangerl, Biedert and Margraf (2008) paper on probabilistic numeracy. The authors based their study on a random-quota sample from the Swiss population, with indicators such as language, region of residence, gender and age. To fulfill all the necessary criteria, 1,000 participants were recruited through telephone interviews. Such studies are certainly more expensive and somewhat restricted to simpler experimental setups (Hertwig et al. used telephone interviews based on questionnaires).
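The quota-sampling logic behind such a design can be sketched in a few lines. The quota proportions and the simulated contact pool below are hypothetical (loosely styled after Swiss language groups), not the actual quotas Hertwig et al. used:

```python
import random

random.seed(1)

# Hypothetical quotas for one indicator (language group); a real design
# would cross several indicators such as region, gender and age.
quotas = {"German": 0.63, "French": 0.23, "Italian": 0.08, "Other": 0.06}
target_n = 1000

# Simulated pool of phone contacts, each tagged with a language group.
pool = [random.choices(list(quotas), weights=quotas.values())[0]
        for _ in range(20_000)]

# Fill each quota cell until its share of the target sample is reached.
needed = {g: round(p * target_n) for g, p in quotas.items()}
sample = []
for person in pool:
    if needed[person] > 0:
        sample.append(person)
        needed[person] -= 1
    if len(sample) == target_n:
        break

print(len(sample), {g: sample.count(g) for g in quotas})
```

The resulting sample matches the population proportions by construction, which is exactly what a convenience sample of undergraduates cannot guarantee.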
The third suggestion adds data collection in a second location: the Internet. The emphasis in the last sentence should be on ‘adds’. Purely Internet-based data collection is of course possible, is already often performed, and is published in high-impact journals. Online experimentation is technically much less demanding than ten years ago due to the availability of ready-made solutions for questionnaires or even experiments. The point I would like to make here is not built on a separation of lab-based and online experiments. My suggestion combines the two research locations and enables a researcher to profit from the many resulting benefits. A possible scenario could include running an experiment in the laboratory first to guarantee, among other things, high control over the situation in order to show an effect with a small, restricted sample. In a second step, the experiment is transferred to the Web and run online, admittedly giving away some of that control but providing the large benefit of easy access to diverse, large samples of participants from different populations. As an example, I would like to point to a recent blog and related experiments started by Paolacci and Warglien (2009) at the University of Venice, Italy. These researchers started replicating well-known experiments from the decision-making literature, like framing, anchoring and the conjunction fallacy, with a service called Mechanical Turk provided by Amazon. This service is based on the idea of crowdsourcing (outsourcing a task to a large group of people) and gives a researcher easy access to a large group of motivated participants.
Some final words on the combination and possible restrictions of the three suggestions. What would a combination of all three look like? It would be a replication of experiments, using representative samples of different populations, in online experiments. This seems useful from a data-quality, logistics and price point of view. However, several issues were left untouched in my discussion, such as the question of the independence of the second lab in replication studies, the restriction of representative samples to one country (as opposed to the multiple comparisons routinely found in, e.g., anthropological studies), the differences between online and lab-based experimentation, or the instances where equipment needed for an experiment (e.g., eye trackers or fMRI) does not allow for online experimentation. Keeping that in mind, the above suggestions draw an idealized picture of how to run experiments and re-use the collected data; nevertheless, I would argue that such steps could help to substantially reduce the share of WEIRD subjects in research.

Hertwig, R., Zangerl, M.A., Biedert, E., & Margraf, J. (2008). The Public’s Probabilistic Numeracy: How Tasks, Education and Exposure to Games of Chance Shape It. Journal of Behavioral Decision Making, 21, 457-475.

Paolacci, G., & Warglien, M. (2009). Experimental turk: A blog on social science experiments on Amazon Mechanical Turk. Accessed on November 17th 2009:

Savage, C.J., & Vickers, A.J. (2009). Empirical Study of Data Sharing by Authors Publishing in PLoS Journals. PLoS ONE, 4(9), e7078. doi:10.1371/journal.pone.0007078

Flashlight paper draft

We submitted our Flashlight paper today. Find a draft at the address below:
Schulte-Mecklenbeck, Michael, Murphy, Ryan O., & Hutzler, Florian, Flashlight – an Online Eye-Tracking Tool (July 13, 2009). Available at SSRN: http://ssrn.com/abstract=1433225

The software will be uploaded to the project page http://vlab.ethz.ch/vLab_Decision_Theory_and_Behavioral_Game_Theory/Flashlight.html.

Have fun playing with it!