Something about reverse inference

Often, when we run process tracing studies (e.g., eye-tracking, mouse-tracking, thinking-aloud), we talk about cognitive processes (things we can’t observe) as if they were directly observable. This is pretty weird – which becomes obvious when looking at the data from the paper below. In this paper we simply instructed participants to follow a strategy when choosing between risky gambles. Taking the example of fixation duration, we see that there is surprisingly little difference between calculating an expected value, using a heuristic (the priority heuristic) and just making decisions without instructions (no instruction) … maybe we should rethink our mapping of observations to cognitive processes a bit?

Here is the paper:

Schulte-Mecklenbeck, M., Kühberger, A., Gagl, S., & Hutzler, F. (in press). Inducing thought processes: Bringing process measures and cognitive processes closer together. Journal of Behavioral Decision Making. [ PDF ]


The challenge in inferring cognitive processes from observational data is to correctly align overt behavior with its covert cognitive process. To improve our understanding of the overt–covert mapping in the domain of decision making, we collected eye-movement data during decisions between gamble-problems. Participants were either free to choose or instructed to use a specific choice strategy (maximizing expected value or a choice heuristic). We found large differences in looking patterns between free and instructed choices. Looking patterns provided no support for the common assumption that attention is equally distributed between outcomes and probabilities, even when participants were instructed to maximize expected value. Eye-movement data are to some extent ambiguous with respect to underlying cognitive processes.
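To make the two instructed strategies a bit more concrete, here is a minimal sketch in R. It is not the material used in the paper: the two gambles are hypothetical, and the priority heuristic follows my reading of Brandstätter, Gigerenzer, and Hertwig (2006) for gain gambles, simplified by skipping the rounding of the aspiration level to a prominent number.

```r
# Hypothetical gain gambles: outcomes and their probabilities.
gamble_a <- data.frame(outcome = c(4000, 0), prob = c(0.8, 0.2))
gamble_b <- data.frame(outcome = 3000, prob = 1)

# Strategy 1: maximize expected value (ties go to A).
expected_value <- function(g) sum(g$outcome * g$prob)

choose_ev <- function(a, b) {
  if (expected_value(a) >= expected_value(b)) "A" else "B"
}

# Strategy 2: priority heuristic for gains (simplified, no rounding).
choose_priority <- function(a, b) {
  min_a <- min(a$outcome)
  min_b <- min(b$outcome)
  max_gain <- max(a$outcome, b$outcome)

  # Step 1: stop if the minimum gains differ by at least 1/10 of the maximum gain.
  if (abs(min_a - min_b) >= 0.1 * max_gain) {
    return(if (min_a > min_b) "A" else "B")
  }

  # Step 2: stop if the probabilities of the minimum gains differ by at least 0.1.
  p_min_a <- sum(a$prob[a$outcome == min_a])
  p_min_b <- sum(b$prob[b$outcome == min_b])
  if (abs(p_min_a - p_min_b) >= 0.1) {
    return(if (p_min_a < p_min_b) "A" else "B")
  }

  # Step 3: otherwise take the gamble with the higher maximum gain (ties go to A).
  if (max(a$outcome) >= max(b$outcome)) "A" else "B"
}

ev_choice <- choose_ev(gamble_a, gamble_b)
ph_choice <- choose_priority(gamble_a, gamble_b)
```

With these numbers the two strategies disagree: gamble A has the higher expected value (3200 vs. 3000), while the priority heuristic already stops at the first step and picks the sure gain B. The strategies also need different information, which is why one might expect them to leave different traces in the eye-movement data.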

Eye-Tracking with N > 1

This is one of the fastest papers I have ever written. It was a great collaboration with Tomás Lejarraga from the Universitat de les Illes Balears. Why was it great? Because it is one of the rare cases (at least in my academic life) where all people involved in a project contributed equally and quickly. Often, the weight of a contribution lies with one person, which slows things down – with Tomás this was different – we were often sitting in front of a computer writing together (I had never done this before and thought it would not work). Surprisingly, this collaborative writing worked out very well and we had the skeleton of the paper within an afternoon. This was followed by many hours of tuning and tacking turns – but in principle we wrote the most important parts together – which was pretty cool.

Even cooler – you can do eye-tracking in groups, using our code.

Here is the [PDF] and abstract:

The recent introduction of inexpensive eye-trackers has opened up a wealth of opportunities for researchers to study attention in interactive tasks. No software package was previously available to help researchers exploit those opportunities. We created “the pyeTribe”, a software package that offers, among others, the following features: First, a communication platform between many eye-trackers to allow simultaneous recording of multiple participants. Second, the simultaneous calibration of multiple eye-trackers without the experimenter’s supervision. Third, data collection restricted to periods of interest, thus reducing the volume of data and easing analysis. We used a standard economic game (the public goods game) to examine data quality and demonstrate the potential of our software package. Moreover, we conducted a modeling analysis, which illustrates how combining process and behavioral data can improve models of human decision making behavior in social situations. Our software is open source and can thus be used and improved by others.


The exams package

I gave the R package exams a shot for my decision making lecture. Here is what it does:

“Automatic generation of exams based on exercises in Sweave (R/LaTeX) or R/Markdown format, including multiple-choice questions and arithmetic problems. Exams can be produced in various formats, including PDF, HTML, Moodle XML, QTI 1.2 (for OLAT/OpenOLAT), QTI 2.1, ARSnova, and TCExam. In addition to fully customizable PDF exams, a standardized PDF format is provided that can be printed, scanned, and automatically evaluated.”

After some fiddling and help from one of the authors (the incredibly nice Achim Zeileis, Uni Innsbruck) I got the following setup going:

  • pool of ~ 100 questions in .Rmd format (all multiple choice, 3-6 answer options) grouped into lectures
  • sampling out of the pool (e.g., 5 questions out of each lecture)
  • random order of questions in each version of the exam (while keeping the lecture order, which I think is useful to give students more structure to work from)
  • random order of the answers for each question
  • exam with the correct answers
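
The exams package takes care of all of this internally. Just to illustrate the sampling logic from the list above, here is a toy sketch in base R. The pool, the question labels and the function names are made up for this example and have nothing to do with the package's own code.

```r
set.seed(2016)  # only to make the toy example reproducible

# Hypothetical pool: 4 lectures with 25 questions each.
pool <- data.frame(
  lecture  = rep(1:4, each = 25),
  question = sprintf("L%02d_Q%02d", rep(1:4, each = 25), rep(1:25, times = 4)),
  stringsAsFactors = FALSE
)

# Draw n_per_lecture questions from every lecture. split() keeps the
# lecture order, sample() returns the drawn questions in random order.
draw_exam <- function(pool, n_per_lecture = 5) {
  per_lecture <- lapply(split(pool$question, pool$lecture), function(q) {
    sample(q, n_per_lecture)
  })
  unlist(per_lecture, use.names = FALSE)
}

# Shuffle the answer options of a single question.
shuffle_answers <- function(answers) sample(answers)

exam_v1 <- draw_exam(pool)
exam_v2 <- draw_exam(pool)
shuffled <- shuffle_answers(c("answer a", "answer b", "answer c"))
```

Each call to draw_exam() gives a different version of the exam, but the questions always appear lecture by lecture.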

[Image: Screen Shot 2016-06-11 at 11.00.22]

There are three parts:

  1. questions[] defining the answers to a question
  2. solutions[] defining the correct answers
  3. in LaTeX the actual question

All of this information goes into an .Rmd file.
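
As a rough sketch, the first two parts could look like this inside the R chunk of such a file. The item itself is invented for illustration. As far as I know the exams package ships a helper, mchoice2string(), that turns the solutions into a 0/1 string for the answer key; the last line mimics that in base R so the snippet runs without the package.

```r
# Hypothetical multiple-choice item: answer options and which of them are correct.
questions <- c(
  "The expected value is the probability-weighted sum of the outcomes.",
  "The expected value is always the most likely outcome.",
  "The expected value ignores probabilities."
)
solutions <- c(TRUE, FALSE, FALSE)

# Every answer option needs exactly one TRUE/FALSE entry.
stopifnot(length(questions) == length(solutions))

# 0/1 answer key, a base R stand-in for what mchoice2string() is meant to produce.
solution_string <- paste(as.integer(solutions), collapse = "")
```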

Once this is done one has to define the questions to be included (the pool) and set the details for the selection process:

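library(exams)

# myexam (the question pool, grouped into blocks) and odir (the output
# directory) have to be defined beforehand.
sol <- exams2pdf(myexam,
  n = 2,                                # number of exam versions
  nsamp = 5,                            # questions sampled from each block
  dir = odir,
  template = c("my_exam", "solution"),  # exam and solution templates
  encoding = "UTF-8",
  header = list(Date = "10.06.2016")
)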

This code would give me 2 exams with a sample of 5 questions out of each block of questions.

Pretty awesome (after some setup work).

Thanks Achim et al. !!


all that mutate() and summarise() beauty

The friendly people from RStudio recently started a webinar series with talks on the following topics (among others):

Data wrangling with R and RStudio
The Grammar and Graphics of Data Science (both dplyr happiness)
RStudio and Shiny

… and many more.

Our friend Dr. Nathaniel D. Phillips also started a cool R course with videos, Shiny apps and many other new goodies.



*apply in all its variations …

Here is an excellent Stack Overflow post on how *apply in all its variations can be used.
One of the follow-ups points to plyr (from demi-R-god Hadley Wickham), which provides a consistent naming convention for all the *apply variations. I like plyr a lot because, like ggplot, it is easy to grasp, and even for tricky problems it is relatively intuitive to find an answer.

Here is the translation from *apply to plyr …

Base function   Input   Output   plyr function 
aggregate        d       d       ddply + colwise 
apply            a       a/l     aaply / alply 
by               d       l       dlply 
lapply           l       l       llply  
mapply           a       a/l     maply / mlply 
replicate        r       a/l     raply / rlply 
sapply           l       a       laply 
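
To read the table: in the plyr function names the first letter stands for the input and the second for the output (a = array, d = data frame, l = list; the r in raply / rlply stands for replicate). Here is a small sketch of two rows of the table, lapply vs. llply and sapply vs. laply. The toy function is mine, and the plyr part only runs if the package is installed.

```r
square <- function(x) x^2

# Base R: list in, list out / list in, vector (array) out.
base_list <- lapply(1:3, square)
base_vec  <- sapply(1:3, square)

# plyr equivalents, following the l -> l and l -> a naming scheme.
if (requireNamespace("plyr", quietly = TRUE)) {
  plyr_list <- plyr::llply(1:3, square)
  plyr_vec  <- plyr::laply(1:3, square)
  print(plyr_vec)
}
```

The nice part is that once you know the input and the output you want, the function name follows from the two letters.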


R Style Guide

This is mainly a note to self:

There are several style guides for R out there. I particularly like the one from Google and the somewhat lighter version of Hadley (ggplot god).

All of that style guide thinking started after a question about R workflow: how do we organize large R projects? Hadley (again) favors a Load-Clean-Func-Do approach, which looks somewhat like this:

  • load.R # load data
  • clean.R # clean up crap
  • func.R # add functions
  • do.R # do the work
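
Here is a minimal sketch of how the four files could hang together, with do.R pulling in the other three via source(). The file contents are invented toy examples, and the files are written to a temporary directory only so that the snippet can be run as is.

```r
# Create a throwaway project directory for the demo.
project_dir <- file.path(tempdir(), "lcfd_demo")
dir.create(project_dir, showWarnings = FALSE)

# load.R: load data (here just a tiny made-up data frame).
writeLines("raw <- data.frame(id = 1:4, rt = c(512, NA, 430, 615))",
           file.path(project_dir, "load.R"))
# clean.R: clean up (drop missing response times).
writeLines("clean <- raw[!is.na(raw$rt), ]",
           file.path(project_dir, "clean.R"))
# func.R: add functions.
writeLines("mean_rt <- function(d) mean(d$rt)",
           file.path(project_dir, "func.R"))
# do.R: do the work, sourcing the other files in order.
writeLines(c('source("load.R")',
             'source("clean.R")',
             'source("func.R")',
             "result <- mean_rt(clean)"),
           file.path(project_dir, "do.R"))

# Run the whole pipeline from inside the project directory.
old_wd <- setwd(project_dir)
source("do.R")
setwd(old_wd)
```

In a real project the four files would of course live in the project folder and only do.R would be run.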

I kind of started doing something along these lines, by splitting files into load/clean (still together, could go separate …), cleaning, graphing (which does not make a lot of sense in an extra file) and large chunks of analysis … got to redo some directories now …

Why anybody should learn/use R …

I had a discussion the other day on the recurring topic of why one should learn R …
I took the list below from R-bloggers, which argues why grad students should learn R:

  • R is free, and lets grad students escape the burdens of commercial license costs.
  • R has really good online documentation; and the community is unparalleled.  
  • The command-line interface is perfect for learning by doing. 
  • R is on the cutting edge, and expanding rapidly.
  • The R programming language is intuitive.  
  • R creates stunning visuals. 
  • R and LaTeX work together — seamlessly. 
  • R is used by practitioners in a plethora of academic disciplines. 
  • R makes you think.  
  • There’s always more than one way to accomplish something.

This is a great list – I would add that from the perspective of a university it makes sense to save a lot of money by not having to buy licenses. And reproducibility is great with R because the code is always written in a text file and not bound by software versions (as in other three- or four-letter packages (feel free to combine from: [A, P, S])).