Sunday, February 15, 2015

Are participants paying attention?

Much behavioural science research involves experimental testing. And any extensive programme of experimental research requires fast and inexpensive access to experimental participants. Amazon Mechanical Turk (MTurk) is becoming the default option for experimental behavioural science research. Researchers post “HITs” (human intelligence tasks) on MTurk and can recruit hundreds of participants per day at low rates (roughly $0.10 to $0.20 per participant). MTurk is most popular in psychology, where the demand for experimental participants is greatest, but it is equally available to experimental economists and other behavioural scientists. But how does the quality of MTurk data compare to traditional university-based participant pools?
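
For readers curious about the mechanics, here is a minimal sketch of posting a HIT programmatically through Amazon's requester API, using the Python boto3 client. The experiment URL, reward, and sample size are placeholders of my own, and the sandbox endpoint lets you test a HIT without paying real workers:

```python
# Sketch: posting a HIT via the MTurk requester API (boto3).
# All task details below are illustrative placeholders.
import boto3

mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    # Sandbox endpoint; drop this argument to post to the live site.
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# An ExternalQuestion points workers at an experiment hosted elsewhere
# (e.g. a survey platform) and embeds it in a frame on MTurk.
question_xml = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.org/my-experiment</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>
"""

hit = mturk.create_hit(
    Title="Short decision-making study (5 minutes)",
    Description="Answer a few questions about everyday choices.",
    Keywords="survey, psychology, experiment",
    Reward="0.15",                     # dollars, passed as a string
    MaxAssignments=200,                # number of unique participants
    LifetimeInSeconds=86400,           # HIT stays visible for one day
    AssignmentDurationInSeconds=1800,  # 30 minutes allowed per worker
    Question=question_xml,
)
print("HIT created:", hit["HIT"]["HITId"])
```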

Participants drawn from traditional university-based pools are often described as W.E.I.R.D. – coming from western, educated, industrialised, rich and democratic countries – the argument being that they represent a rather narrow slice of all potential participants. MTurk differs in some defining ways. University participant pools are predominantly undergraduate students, while MTurkers span a wide range of ages and backgrounds. On the other hand, MTurk can be a relatively small participant pool for researchers planning a series of experiments in which no participant takes part more than once. MTurkers are also often non-naïve, with many having participated in thousands of academic experiments. This threatens the external validity of classic tasks such as the cognitive reflection test when run on MTurk. But it can also be useful for testing theories of rationality, which often claim that performance errors are caused by participant naïvety and so should not appear in the experienced, long-run choices that characterise real-world economic behaviour.

Researchers often worry that MTurk data is unreliable because MTurkers may be participating purely for (fairly low) pay without paying attention. MTurk researchers often control for this with “attention filters”: questions that require participants to carefully read a large block of text and enter an unconventional response in order to proceed, thereby verifying that they are reading the instructions. A forthcoming paper by David Hauser and Norbert Schwarz tests participants from MTurk and from traditional university-based pools on these attention filters. Hauser and Schwarz find that MTurkers outperform university participants on both previously-used and novel attention filters, and that MTurkers show larger between-group effect sizes on a subsequent measure (effect size serving as another proxy for attention). In short, MTurkers pay more attention, and this shows up both in pass rates on attention checks and in standardised effect sizes. Importantly, the university-based participants completed the experiments online at their own computers, so both groups faced similarly distracting environments. This is an important finding, since university-based participants are traditionally assumed to be sufficiently attentive for findings based on their data to generalise readily.
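
To make the idea concrete, here is a toy sketch of such an attention filter in Python. The wording and the scoring rule are my own illustrations, not taken from Hauser and Schwarz:

```python
# Sketch of a simple attention filter (instructional manipulation
# check): the instructions, buried in a block of text, tell the
# participant to ignore the apparent question and give an
# unconventional response instead.

INSTRUCTIONS = (
    "Recent research shows that response quality varies widely online. "
    "To confirm you are reading these instructions, please ignore the "
    "question below and type the word 'attend' in the answer box.\n\n"
    "Which sports do you play regularly?"
)

def passes_attention_check(response: str) -> bool:
    """A participant passes only by giving the instructed response."""
    return response.strip().lower() == "attend"

# Example scoring of raw responses collected from the task:
responses = ["football, tennis", "attend", "Attend "]
pass_rate = sum(passes_attention_check(r) for r in responses) / len(responses)
print(f"Pass rate: {pass_rate:.0%}")  # -> Pass rate: 67%
```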

MTurk researchers have long been wary of the reliability of their data, given the high speed and low cost of collection, and often run far more participants per cell than traditional lab-based experiments. Researchers are also converging on a set of best practices for maximising the reliability of MTurk data. Both of these help give us greater confidence in the replicability of research findings from MTurk.
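
A quick power calculation illustrates why per-cell samples get pushed up. The sketch below uses the statsmodels library, and the effect sizes are hypothetical: if non-naïve participants attenuate the expected effect, the number of participants needed per cell grows rapidly:

```python
# Sketch: required sample size per cell for a two-group comparison
# at 80% power and alpha = .05, across a few hypothetical effect
# sizes (Cohen's d). Smaller anticipated effects demand far more
# participants per cell.
from statsmodels.stats.power import TTestIndPower

power_analysis = TTestIndPower()
for d in (0.5, 0.3, 0.2):
    n = power_analysis.solve_power(effect_size=d, alpha=0.05, power=0.8)
    print(f"d = {d}: ~{n:.0f} participants per cell")
# d = 0.5: ~64 per cell; d = 0.3: ~175; d = 0.2: ~393
```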

MTurk has become a default option for collecting high-quality experimental data in the behavioural sciences. It is cheap, fast, and more demographically diverse than traditional university-based participant pools. MTurkers are not naïve, but in practice this often just sets a higher bar for experimental manipulations to reach statistical significance. And while it is always risky to extrapolate conclusions about the wider population from small samples, effects established on MTurk can be replicated with traditional participant pools. Psychologists have been quick to embrace MTurk, but it has just as much to offer experimental researchers across the rest of the behavioural sciences.

4 comments:

Spamgirl said...

Turkers pay more attention to attention check questions. Turkers also have scripts available which highlight these questions so they stand out and are not missed. That doesn't mean they're paying more attention to all questions.

Unknown said...

Spamgirl,
I believe you, but why go to the effort of using a script if they're just going to skim through the task questions? It seems like a person who spends hours preparing to cheat on an exam rather than just using the time to study.
Mark

Amélie said...

Their payment is sometimes dependent on passing the attention checks, so it can make sense to look for those and pay attention to them, before just skimming the rest.

Spamgirl said...

@Mark - we take that same exam 10-20 times per day, so one script does 'em all. Scripts based on text, not tags, make it simple.

And @Amélie is right, we will get a rejection if we don't pass the ACs, so we better pass those ACs, and if we can do it while satisficing, even better.