Much behavioural science research involves experimental
testing. And any extensive programme of experimental research requires fast and
inexpensive access to experimental participants. Amazon Mechanical Turk (MTurk)
is becoming the default option for experimental behavioural science research.
Researchers post “HITs” (human intelligence tasks) on MTurk and can recruit
hundreds of participants per day at low rates (roughly $0.10 to $0.20 per
participant). MTurk is most popular in psychology, where the need for large
numbers of experimental participants is greatest, but it is also used by
experimental economists and other behavioural scientists. But how does the
quality of MTurk
data compare to traditional university-based participant pools?
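For readers curious about the mechanics, here is a minimal sketch of posting a HIT programmatically through the MTurk Requester API using Amazon’s boto3 Python client. The survey URL, reward, and participant counts are illustrative placeholders rather than recommendations, and the sandbox endpoint is used so no real payments are made:

# Minimal sketch: posting a HIT via the MTurk Requester API with boto3.
# The survey URL, reward, and counts are illustrative placeholders.
import boto3

mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    # Sandbox endpoint: HITs are visible to test workers only, no payments.
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# An ExternalQuestion embeds your own survey page inside the worker's view.
question_xml = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.com/my-survey</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>
"""

hit = mturk.create_hit(
    Title="Short decision-making survey (5 minutes)",
    Description="Answer a brief series of judgement questions.",
    Keywords="survey, psychology, experiment",
    Reward="0.20",                     # dollars, passed as a string
    MaxAssignments=100,                # number of unique participants
    AssignmentDurationInSeconds=1800,  # time allowed per worker
    LifetimeInSeconds=86400,           # how long the HIT stays listed
    Question=question_xml,
)
print("HIT id:", hit["HIT"]["HITId"])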
Traditional university-based participant pools are often criticised as
W.E.I.R.D. – drawn from western, educated, industrialised, rich and democratic
societies – the argument being that they represent a rather small slice of all
potential participants. MTurk has some defining features in this regard.
University participant pools are predominantly undergraduate students, while
MTurkers come from a wide range of ages and backgrounds. MTurk can, however, be
a relatively small participant pool for researchers planning a series of
experiments in which no participant takes part more than once. MTurkers are
often non-naïve, with many having participated in thousands of academic
experiments. This is problematic for the external validity of classic problems
such as the cognitive reflection test when run on MTurk. But it can also be
useful for testing theories of rationality, which often claim that performance
errors are caused by participant naivety and are not relevant to the long-run
decisions that characterise real-world economic behaviour.
Researchers often worry that MTurk data is unreliable
because MTurkers may be participating purely for the (fairly low) pay and not
paying attention. MTurk researchers often control for this with “attention
filters”, which require MTurkers to carefully read a large block of text and
enter an unconventional response in order to proceed to the experiment, the aim
being to guarantee that they are paying attention. A
forthcoming paper by David Hauser and Norbert Schwarz tests the performance of
participants from MTurk and traditional university-based participant pools on
attention filters. Hauser and Schwarz find that MTurkers outperform
university-based participants on both
previously-used and novel attention filters, and that MTurkers show larger
between-group effect sizes on a later measure (with effect size serving as
another proxy for attention). In short, MTurkers pay more attention than
university-based participants, and this shows up both in pass rates on
attention checks and in standardised effect sizes. Notably, the
university-based participants completed the experiments online at their own
computers, so both groups were in similarly
distracting environments. This is an important finding, since university-based
participants are traditionally assumed to be sufficiently attentive for
findings based on their data to be readily generalised.
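To make the two attention proxies concrete, the sketch below computes an attention-check pass rate and a standardised between-group effect size (Cohen’s d) for each pool. The data, column names, and pass rates are invented for illustration; this is not Hauser and Schwarz’s actual analysis:

# Illustrative sketch (not Hauser & Schwarz's analysis): computing the two
# attention proxies discussed above -- attention-check pass rates and a
# standardised between-group effect size (Cohen's d) -- on made-up data.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "pool": ["mturk"] * 200 + ["university"] * 200,   # hypothetical pools
    # Hypothetical pass rates: 95% for MTurkers, 75% for students.
    "passed_check": rng.random(400) < np.r_[np.full(200, 0.95),
                                            np.full(200, 0.75)],
    "condition": rng.choice(["treatment", "control"], size=400),
    "dv": rng.normal(0.0, 1.0, size=400),
})
# Inattentive responding adds noise: simulate a treatment effect only for
# participants who actually passed the attention check.
df.loc[df["passed_check"] & (df["condition"] == "treatment"), "dv"] += 0.5

def cohens_d(a, b):
    """Standardised mean difference with a pooled standard deviation."""
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

for pool, g in df.groupby("pool"):
    d = cohens_d(g.loc[g["condition"] == "treatment", "dv"],
                 g.loc[g["condition"] == "control", "dv"])
    print(f"{pool}: pass rate {g['passed_check'].mean():.0%}, "
          f"Cohen's d {d:.2f}")

Because the treatment effect only registers for attentive participants, the more attentive pool shows the larger standardised effect, which is the logic behind treating effect size as a second attention proxy.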
MTurk researchers have always been somewhat distrustful of the reliability of
their data, given the high speed and low cost of collection, and often run far
more participants per cell than traditional lab-based experiments. Researchers
are also converging on a set of best practices for maximising the reliability
of MTurk data. Both of these give us greater confidence in the replicability of
research findings from MTurk.
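As a rough illustration of why cell sizes balloon when data are noisier, a quick power calculation with statsmodels shows that halving the effect size you expect to detect roughly quadruples the required sample per cell (the effect sizes below are assumptions, not estimates from any study):

# Quick power calculation: smaller expected effects demand far larger cells.
# The effect sizes are assumed values for illustration only.
from statsmodels.stats.power import TTestIndPower

power = TTestIndPower()
for d in (0.5, 0.25):
    n = power.solve_power(effect_size=d, alpha=0.05, power=0.8)
    print(f"Cohen's d = {d}: about {n:.0f} participants per cell")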
MTurk has become the default option for collecting high-quality
experimental data in the behavioural sciences. It is cheap, fast, and
more diverse than traditional university-based participant pools. MTurkers are
not naïve, but in practice this mainly raises the bar for experimental
manipulations to reach statistical significance. And while it is always
dangerous to extrapolate conclusions about the wider population from small,
unrepresentative samples, this can be addressed by replicating effects
established on MTurk with traditional participant pools. Given these
advantages, psychologists have been quick to embrace MTurk, but it has much to
offer experimental researchers from across the behavioural sciences too.
5 comments:
Turkers pay more attention to attention check questions. Turkers also have scripts available which highlight these questions so they stand out and are not missed. That doesn't mean they're paying more attention to all questions.
Spamgirl,
I believe you, but why go to the effort of using a script if they're just going to skim through the task questions? It seems like a person who spends hours preparing to cheat on an exam rather than just using the time to study.
Mark
Their payment is sometimes dependent on passing the attention checks, so it can make sense to look for those and pay attention to them before just skimming through the rest.
@Mark - we take that same exam 10-20 times per day, so one script does 'em all. Scripts based on text, not tags, make it simple.
And @Améli is right, we will get a rejection if we don't pass the ACs, so we better pass those ACs, and if we can do it while satisficing, even better.