Much behavioural science research involves experimental testing, and any extensive programme of experimental research requires fast and inexpensive access to participants. Amazon Mechanical Turk (MTurk) is becoming the default option for experimental behavioural science research. Researchers post “HITs” (human intelligence tasks) on MTurk and can get hundreds of participants per day at low rates (from $.10-$.20 per participant). MTurk is very popular in psychology, where the need for large quantities of experimental participants is greatest, but it is also used by experimental economists and other behavioural scientists. But how does the quality of MTurk data compare to that from traditional university-based participant pools?
Traditional university-based participant pools are often criticised as W.E.I.R.D. – drawn from western, educated, industrialised, rich and democratic countries – the argument being that this is a rather small slice of all potential participants. MTurk has some defining features in this regard. University participant pools consist predominately of undergraduate students, while MTurkers come from a wide range of ages and backgrounds. MTurk can be a relatively small participant pool for researchers planning a series of experiments in which no participant takes part more than once. MTurkers are also often non-naïve, with many having participated in thousands of academic experiments. This is problematic for the external validity of classical problems such as the cognitive reflection test when run on MTurk. But non-naivety can also be useful for testing theories of rationality, which often claim that performance errors are caused by participant naivety and are not relevant to the long-run decisions that define real-world economic behaviour.
Researchers often worry that MTurk data is unreliable because MTurkers may be participating purely for (fairly low) pay and not paying attention. MTurk researchers often control for this via “attention filters”, which require participants to carefully read a large block of text and enter an unconventional response in order to proceed to the experiment, verifying that they are paying attention. A forthcoming paper by David Hauser and Norbert Schwarz tests the performance of participants from MTurk and traditional university-based pools on attention filters. Hauser and Schwarz find that MTurkers outperform university-based participants on both previously-used and novel attention filters, and that MTurkers show larger between-group effect sizes on a later measure (effect size being another proxy for attention). In short, MTurkers pay more attention than university-based participants, as reflected in both pass rates on attention checks and standardised effect sizes. Notably, the university-based participants completed the experiments online at their own computers, so both groups were in similarly distracting environments. This is an important finding, since university-based participants are traditionally assumed to be sufficiently attentive for findings based on their data to be readily generalised.
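The attention-filter logic described above can be sketched in a few lines. This is a minimal, hypothetical illustration (the instructed phrase, function names, and response data are invented for this example, not taken from Hauser and Schwarz): participants are told, inside a long instructions block, to ignore the visible question and type a specific unconventional response, and the pass rate per pool is then computed.

```python
def passes_attention_filter(response: str,
                            required: str = "i read the instructions") -> bool:
    """A response passes only if it matches the instructed unconventional phrase."""
    return response.strip().lower() == required

def pass_rate(responses) -> float:
    """Share of participants who passed the attention filter."""
    return sum(passes_attention_filter(r) for r in responses) / len(responses)

# Hypothetical response sets for two participant pools:
mturk_responses = ["I read the instructions", "i read the instructions",
                   "Yes", "i read the instructions"]
lab_responses = ["Yes", "i read the instructions", "Agree", "No"]

print(pass_rate(mturk_responses))  # 0.75
print(pass_rate(lab_responses))    # 0.25
```

In practice the filter is embedded in the survey software itself, and responses failing the check are typically excluded before analysis.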
MTurk researchers have always been wary of the reliability of their data given the high speed and low cost, and often run far more participants per cell than traditional lab-based experiments. Researchers are also converging on a set of best practices to maximise the reliability of MTurk data. This can give us greater confidence in the replicability of research findings from MTurk.
MTurk has become the default option for collecting high-quality experimental data in the behavioural sciences. It is cheap, fast, and more diverse than traditional university-based participant pools. MTurkers are not naïve, but often this simply raises the bar for experimental manipulations to attain statistical significance. And while it is always dangerous to extrapolate conclusions about the wider population from small samples, this can be dealt with by replicating effects established on MTurk with traditional participant pools. Psychologists have been quick to embrace this research tool, but MTurk offers the same advantages to experimental researchers from across the behavioural sciences.