Blind Haste (aka im Blindflug)

Chance encounters sometimes lead to interesting and new projects. This is one of those cases … I got to know Emanuel de Bellis during my time at Nestle and we never stopped collaborating ever since (he actually lead my 2017 ‘skype to’ statistics with my wife as a close second …)

We got our hands on a rich dataset of speed measures the police in Zürich (Switzerland) does unbeknownst to the drivers during they year for planning purposes. The radar is put into a small black box hardly anybody realises driving by:

(it’s the black box above the bin – not the bin!)

So – we got these data from over a million cars and got to work with them trying to find an answer to a question in perception research: Do people perceive their environment differently when light conditions deteriorate? and (even more important) Do drivers change their driving speed accordingly.

Well – they don’t … and here is correlational proof 🙂

Stay tuned for a causal demonstration – oh yes!

Blind haste: As light decreases, speeding increases

Worldwide, more than one million people die on the roads each year. A third of these fatal accidents are attributed to speeding, with properties of the individual driver and the environment regarded as key contributing factors. We examine real-world speeding behavior and its interaction with illuminance, an environmental property defined as the luminous flux incident on a surface. Drawing on an analysis of 1.2 million vehicle movements, we show that reduced illuminance levels are associated with increased speeding. This relationship persists when we control for factors known to influence speeding (e.g., fluctuations in traffic volume) and consider proxies of illuminance (e.g., sight distance). Our findings add to a long-standing debate about how the quality of visual conditions affects drivers’ speed perception and driving speed. Policy makers can intervene by educating drivers about the inverse illuminance‒speeding relationship and by testing how improved vehicle headlights and smart road lighting can attenuate speeding.

Everything you believe in is wrong – or is it simply terrorism?

The replication crisis has many interesting effects on how people (and scientists) think about Psychology (and, of course, other fields) … Here is a nice summary of effects that are hard to replicate. Among them ‘classics’ like the power pose or the big brother eyes.

A lot is happening because of these new insights in terms of research (e.g., replication studies) and communication (e.g., Fritz Strack on Facebook).

And then this: Susan Fiske in an upcoming piece in the APS Observer … I am really struggling with this rhetoric – Daniel Lakens to the rescue 🙂

Ah – and of course Gelman.

Everything is fucked …

This syllabus of an (obviously) awesome class has a ton of good reads:

Everything is fucked: The syllabus

by Sanjay Srivastava

I would have two additions:

  1. A multi lab replication project on ego-depletion (Hagger & Chatzisarantis, 2016)
  2. And the response from Roy Baumeister and Kathleen D. Vohs

It’s a really good statement of how f… up things are (in addition to all the other good examples above) …

“A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.” – Max Planck


How WEIRD subjects can be overcome … a comment on Henrich et al.

Joe Henrich published a target article in BBS talking about how economics and psychology base their research on WEIRD (Western, Educated, Industrialized, Rich and Democratic) subjects.

Here is the whole abstract:

Behavioral scientists routinely publish broad claims about human psychology and behavior in the world’s top journals based on samples drawn entirely from Western, Educated, Industrialized, Rich and Democratic (WEIRD) societies. Researchers—often implicitly—assume that either there is little variation across human populations, or that these “standard subjects” are as representative of the species as any other population. Are these assumptions justified? Here, our review of the comparative database from across the behavioral sciences suggests both that there is substantial variability in experimental results across populations and that WEIRD subjects are particularly unusual compared with the rest of the species—frequent outliers. The domains reviewed include visual perception, fairness, cooperation, spatial reasoning, categorization and inferential induction, moral reasoning, reasoning styles, self-concepts and related motivations, and the heritability of IQ. The findings suggest that members of WEIRD societies, including young children, are among the least representative populations one could find for generalizing about humans. Many of these findings involve domains that are associated with fundamental aspects of psychology, motivation, and behavior—hence, there are no obvious a priori grounds for claiming that a particular behavioral phenomenon is universal based on sampling from a single subpopulation. Overall, these empirical patterns suggests that we need to be less cavalier in addressing questions of human nature on the basis of data drawn from this particularly thin, and rather unusual, slice of humanity. We close by proposing ways to structurally re-organize the behavioral sciences to best tackle these challenges.

I would like to make three suggestions that could help to overcome the era of WEIRD subjects and generate more reliable and representative data. These suggestions will mainly touch contrasts 2, 3 and 4 elaborated by Henrich, Heine and Norezayan. While my suggestions tackle these contrasts from a technical and experimental perspective they do not provide a general solution for the first contrast on industrialized versus small scale societies. Here are my suggestions: 1) replications in multiple labs, 2) internet based experimentation and 3) drawing representative samples from a population.
The first suggestion, replication in multiple labs, foremost touches aspects like replication, multiple populations and open data access. For a publication in a journal a replication of an experiment in a different lab would be obligatory. The replication would then be published with the original, e.g., in the form of a comment. This would ensure that other research labs in other states or countries are involved and very different parts of the population could be sampled. Also results of experiments would be freely available to the public and the data sharing problem in Psychology, as described in the target article, but also in other fields like Medicine (Savage & Vieckers, 2009) would be a problem of the past. Of course such a step would be closely linked with certain standards on the one hand in building experiments and on the other hand in storing data. While a standard way to build experiments seems unlikely there are many methods available in computer science to store data in a reusable, for example through the usage of XML (Extensible Markup Language).
The second suggestion is based on the drawing of representative samples from the population. As described in the target article, research often suffers from a restriction to extreme subgroups from the population, from which generalized results are drawn. However, there is published work that overcomes these restrictions. As an example I would like to use the Hertwig, Zangerl, Biedert and Margraf (2008) paper on probabilistic numeracy. The authors based their study on a random-quote sample from the Swiss population including indicators as language, area where participant is living, gender and age. To fulfill all the necessary criteria 1000 participants were recruited using telephone interviews. Such studies are certainly more expensive and somewhat restricted to simpler experimental setups (Hertwig et al., used telephone interviews based on questionnaires).
The third suggestion adds additional data collection in a second location: the Internet. The emphasis in the last sentence should be set on ‘add’. Data collection solely Internet based is of course possible, already often performed and published in high impact journals. Online experimentation is technically much less demanding than ten years ago due to the availability of ready made solutions for questionnaires or even experiments. The point I would like to make here should not be built on a separation of lab and online based experiments. My suggestion combines these two research locations and enables a researcher to profit from the many benefits arising. A possible scenario could include running an experiment in the laboratory first to guarantee, among other things, high control on the situation in order to show an effect with a small, restricted sample. In a second step the experiment is transferred to the Web and run online, admittedly giving away some of the control but providing the large benefit of having access to a diverse, large samples of participants from different populations easily. As an example I would like to point to a recent blog and related experiments started by Paolacci and Warglien (2009) at the University of Venice, Italy. These researchers started replicating well known experiments from the decision making literature like framing, anchoring or the conjunction fallacy with a service called the Mechanical Turk provided by Amazon. This service is based on the idea of crowdsourcing (outsourcing a task to a large group of people) and lets a researcher have easy access to a large group of motivated participants.
Some final words on the combination and possible restrictions of the three suggestions. What would a combination of all three suggestions look like? It would be a replication of experiments, using representative samples of different populations in online experiments. This seems useful from a data quality, logistics and prize point of view. However, several issues were left untouched in my discussion, such as the question of independence of the second lab for replication studies, the restriction of representative samples to one country (as opposed to multiple comparisons as routinely found in, e.g., anthropological studies), the differences between online and lab based experimentation or the instances where equipment needed for an experiments (e.g., eye trackers or fMRI) does not allow for online experimentation. Keeping that in mind the above suggestions draw an idealized picture of how to run experiments and re-use the collected data, nevertheless I would argue that such steps could help to reduce the percentage of WEIRD subjects in research substantially.

Hertwig, R., Zangerl, M.A., Biedert, E., & Margraf, J. (2008). The Public’s Probabilistic Numeracy: How Tasks, Education and Exposure to Games of Chance Shape It. Journal of Behavioral Decision Making, 21, 457-570.

Paolacci, G., & Warglien, M. (2009). Experimental turk: A blog on social science experiments on Amazon Mechanical Turk. Accessed on November 17th 2009:

Savage, C.J., & Vickers, A.J. (2009). Empirical Study of Data Sharing by Authors Publishing in PLoS Journals. PLoS ONE 4(9): e7078.doi:10.1371/journal.pone.0007078

Flashlight paper draft

We submitted our Flashlight paper today. Find a draft at the address below:
Schulte-Mecklenbeck, Michael , Murphy, Ryan O. and Hutzler, Florian,Flashlight – an Online Eye-Tracking Tool(July 13, 2009). Available at SSRN:

The software will be uploaded to the project page

Have fun playing with it!

double, triple or even sextuple blind?

There is an interesting discussion on how the scientific review process should be handled going on at blog. The point is that the obvious shortcomings in the current review system (the authors know who the editor is (and vice versa), the reviewer knows (or can easily infer) who the author is …) can be handle through triple blind reviews: authors, reviewers AND editors are included (anonymous upload to a webpage (id through a code), quatruple blind reviews: no one know who the editor of the journal is, quintimple blind: after publication of a paper the authors name is kept secret for some years or, and that’s the actual kicker: sextuple blind: there is no journal name any more – just the paper and the users decide whether it is worth citing or not …
In a more detailed way you can find this approach explained here