Amazon Mechanical Turk (mTurk, or AMT) is a service often used in psychology research, where workers are paid a small amount of money to complete tasks. In psychology research the task is typically a survey. The consensus appears to be that mTurk data is at least as high quality as data from student samples (Bartneck et al., 2015).
To assess the quality of the data collected on this website, the same survey was run here and then on Mechanical Turk.
The survey consisted of two pages. The first had 26 items rated on a five-point scale, and the second had six additional demographic questions.
The survey was designed so that data validity could be looked at in several ways.
The first comparison is the rate of endorsement of items that are very unlikely to be true. The survey contained two such items.
The base rate of severe electrical burns is presumably low. Gordon, Reid, and Awwaad (1986) report a rate of 2.6 cases per million per year, so agreement with this item should be extremely rare.
Response selected | Current website (n=54,631) | mTurk (n=1,403) |
---|---|---|
[NONE] | 1.10% | 0.42% |
Strongly disagree | 88.23% | 76.81% |
Disagree | 7.92% | 11.05% |
Neutral | 1.60% | 4.63% |
Agree | 0.62% | 4.77% |
Strongly agree | 0.82% | 2.28% |
Both sources of respondents reported implausibly high rates of electrical burns, but this presumably invalid responding was more prevalent among AMT users. AMT users were 4.9 times more likely to select Agree or Strongly agree in response to this item (7.05% versus 1.44%).
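The 4.9 figure can be reproduced from the table above. The following is a minimal sketch, not the original analysis: the helper names `agreement_rate` and `agreement_ratio` are made up for illustration, and the inputs are the rounded percentages as printed in the table.

```python
# Percentages copied from the electrical-burns table above (rounded as printed).
site = {"Agree": 0.62, "Strongly agree": 0.82}
mturk = {"Agree": 4.77, "Strongly agree": 2.28}


def agreement_rate(row):
    """Combined percentage selecting Agree or Strongly agree."""
    return row["Agree"] + row["Strongly agree"]


def agreement_ratio(reference, comparison):
    """How many times more often `comparison` agrees than `reference`."""
    return agreement_rate(comparison) / agreement_rate(reference)


print(f"website: {agreement_rate(site):.2f}%")   # 1.44%
print(f"mTurk:   {agreement_rate(mturk):.2f}%")  # 7.05%
print(f"ratio:   {agreement_ratio(site, mturk):.1f}")  # 4.9
```

Because the inputs are already rounded to two decimals, the ratio is only reliable to about one decimal place.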
The proportion of people who own goats is presumably low. Statistics on this were hard to find, but this low-quality source suggests that 53,000 households in the UK keep goats (compared to a population of 65 million), which is on the order of 0.1%, so agreement with this item should be extremely rare.
Response selected | Current website (n=54,631) | mTurk (n=1,403) |
---|---|---|
[NONE] | 1.14% | 1.06% |
Strongly disagree | 88.55% | 79.52% |
Disagree | 6.91% | 8.55% |
Neutral | 1.71% | 4.27% |
Agree | 0.74% | 4.27% |
Strongly agree | 1.21% | 2.28% |
Respondents from both sources reported an implausibly high rate of goat ownership, but again AMT users were worse. AMT users were 3.9 times more likely to select Agree or Strongly agree.
The second comparison is of inconsistent responding. If a respondent is providing valid responses, they should not give responses that are incompatible with each other.
This survey contained one pair of items where if you agreed with one you should disagree with the other. These two items were "I am tall" and "I am short". The table below takes the data from people who strongly agreed with "I am short" and breaks down their responses to the item "I am tall".
Response to "I am tall" | Users of this website who selected "Strongly agree" for "I am short" (n=7,132) | AMT users who selected "Strongly agree" for "I am short" (n=189) |
---|---|---|
[NONE] | 0.42% | 0.52% |
Strongly disagree | 92.31% | 79.36% |
Disagree | 4.64% | 5.82% |
Neutral | 0.85% | 0.52% |
Agree | 0.70% | 5.29% |
Strongly agree | 1.06% | 8.46% |
AMT respondents were 7.8 times more likely to select Agree or Strongly agree for the statement "I am tall" after already selecting Strongly agree for the statement "I am short". Note that the 13.7% rate of invalid responding on this question cannot be taken as an absolute rate, because it applies to only a subset of the respondents, and this subset could be more likely to contain low-quality responders. For example, if valid responders give responses with a peak at 3 and invalid responders give responses distributed randomly, then a higher percentage of the people responding 1 or 5 will be invalid responders than of those responding 3.
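The selection effect described above can be made concrete with Bayes' rule. The sketch below uses entirely hypothetical numbers: a 10% overall share of invalid responders, a valid-response distribution peaked at 3, and invalid responses spread uniformly over the five options. None of these values come from the survey data.

```python
# All numbers below are hypothetical, chosen only to illustrate the argument.
RESPONSES = [1, 2, 3, 4, 5]
VALID_DIST = {1: 0.05, 2: 0.20, 3: 0.50, 4: 0.20, 5: 0.05}  # peaked at 3
INVALID_DIST = {r: 0.20 for r in RESPONSES}                  # uniform
INVALID_SHARE = 0.10  # assumed overall fraction of invalid responders


def p_invalid_given_response(response, invalid_share=INVALID_SHARE):
    """Bayes' rule: P(invalid responder | they selected `response`)."""
    invalid = invalid_share * INVALID_DIST[response]
    valid = (1 - invalid_share) * VALID_DIST[response]
    return invalid / (invalid + valid)


for r in RESPONSES:
    print(f"response {r}: {p_invalid_given_response(r):.1%} invalid")
```

Under these made-up assumptions, roughly 31% of the people who chose 1 or 5 are invalid responders, compared with about 4% of those who chose 3, even though only 10% of all respondents are invalid. A rate measured among extreme responders therefore overstates the rate in the full sample.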