SLIDE 1 Everyone On Mechanical Turk is Above a Threshold
Using Facebook Ads to Measure Online Media Effects
Kevin Munger, Mario Luca, Jonathan Nagler, Joshua Tucker
Penn State University and Princeton University
September 7, 2018
SLIDE 2
Overview
Online media effects are rapidly changing—how do we keep up?
SLIDE 3
Overview
Online media effects are rapidly changing—how do we keep up? Online media effects are uniquely heterogeneous—how do we generalize, how do we study the correct populations?
SLIDE 4
Overview
Online media effects are rapidly changing—how do we keep up? Online media effects are uniquely heterogeneous—how do we generalize, how do we study the correct populations? Our intuitions about social media can be actively misleading—how do we adjust?
SLIDE 5
Overview
Online media effects are rapidly changing—how do we keep up? Online media effects are uniquely heterogeneous—how do we generalize, how do we study the correct populations? Our intuitions about social media can be actively misleading—how do we adjust? Takeaway from this paper: traditional sampling and traditional survey experiments fail to allow us to study low digital literacy populations
SLIDE 6
Overview
Online media effects are rapidly changing—how do we keep up? Online media effects are uniquely heterogeneous—how do we generalize, how do we study the correct populations? Our intuitions about social media can be actively misleading—how do we adjust? Takeaway from this paper: traditional sampling and traditional survey experiments fail to allow us to study low digital literacy populations Case study: clickbait!
SLIDE 7
What is clickbait?
“Clickbait” is a new term for an old phenomenon.
SLIDE 8
What is clickbait?
“Clickbait” is a new term for an old phenomenon. Media companies’ strategy always determined by technological, political, regulatory contexts
SLIDE 9
What is clickbait?
“Clickbait” is a new term for an old phenomenon. Media companies’ strategy always determined by technological, political, regulatory contexts New technology lowers cost of news production/distribution, so new entrants compete for attention
SLIDE 10
Classic Clickbait
SLIDE 11
Modern clickbait
Various formulations have come and gone; ongoing battle with Facebook
SLIDE 12
Modern clickbait
Various formulations have come and gone; ongoing battle with Facebook Political clickbait is necessarily partisan
SLIDE 13 Modern clickbait
Various formulations have come and gone; ongoing battle with Facebook Political clickbait is necessarily partisan
◮ High levels of affective polarization
SLIDE 14 Modern clickbait
Various formulations have come and gone; ongoing battle with Facebook Political clickbait is necessarily partisan
◮ High levels of affective polarization
◮ Partisans are the biggest consumers of political news
SLIDE 15 Modern clickbait
Various formulations have come and gone; ongoing battle with Facebook Political clickbait is necessarily partisan
◮ High levels of affective polarization
◮ Partisans are the biggest consumers of political news
◮ Signalling partisanship can supplant source cues in establishing credibility
SLIDE 16 Modern clickbait
Various formulations have come and gone; ongoing battle with Facebook Political clickbait is necessarily partisan
◮ High levels of affective polarization
◮ Partisans are the biggest consumers of political news
◮ Signalling partisanship can supplant source cues in establishing credibility
One common form of partisan clickbait with potentially damaging consequences: emotional clickbait
SLIDE 17
Emotional clickbait
Turn a partisan headline into emotional clickbait by adding one of these phrases
SLIDE 18 Emotional clickbait
Turn a partisan headline into emotional clickbait by adding one of these phrases
◮ People are loving this:
SLIDE 19 Emotional clickbait
Turn a partisan headline into emotional clickbait by adding one of these phrases
◮ People are loving this:
◮ Democrats are freaking out:
SLIDE 20 Emotional clickbait
Turn a partisan headline into emotional clickbait by adding one of these phrases
◮ People are loving this:
◮ Democrats are freaking out:
◮ This will make you furious:
SLIDE 21 Emotional clickbait
Turn a partisan headline into emotional clickbait by adding one of these phrases
◮ People are loving this:
◮ Democrats are freaking out:
◮ This will make you furious:
◮ Republicans are shocked...
SLIDE 22
What we tried
Online survey experiment
SLIDE 23
What we tried
Online survey experiment Randomly assign respondents to one of four different headlines, keeping the story constant
SLIDE 24
What we tried
Online survey experiment Randomly assign respondents to one of four different headlines, keeping the story constant Look for effects on affective polarization, trust in media and information retention questions
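The assignment step can be sketched in a few lines. This is a minimal illustration, not the pre-registered analysis code: the condition labels, the seed, and the choice of complete (balanced) randomization are all assumptions, since the slides say only that respondents were randomly assigned to one of four headlines.

```python
import random
from collections import Counter

# Hypothetical labels: the slides do not name the four headline conditions.
CONDITIONS = ["headline_a", "headline_b", "headline_c", "headline_d"]


def assign_conditions(respondent_ids, seed=0):
    """Complete random assignment: shuffle respondents, then deal them
    into the conditions in turn so group sizes differ by at most one."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    ids = list(respondent_ids)
    rng.shuffle(ids)
    return {rid: CONDITIONS[i % len(CONDITIONS)] for i, rid in enumerate(ids)}


# Toy usage with 400 made-up respondent ids.
assignment = assign_conditions(range(400))
counts = Counter(assignment.values())
```

Every respondent sees the same story; only the headline attached to it depends on `assignment`.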
SLIDE 25
Null Results from First MTurk Study
Tried again: made the subject matter more topical, added a placebo condition
SLIDE 26
Null Results from First MTurk Study
Tried again: made the subject matter more topical, added a placebo condition (After the pilot, we pre-registered the R code we used to analyze all results)
SLIDE 27
Null Results from Second MTurk Study
Tried again: shortened the survey, removed “preference for clickbait” questionnaire which could dampen treatment effects
SLIDE 28
Null Results from Third MTurk Study
SLIDE 29
Null Results from Third MTurk Study
Is MTurk the problem?
SLIDE 30
Null Results from Third MTurk Study
Is MTurk the problem? There were pretty big differences between MTurk and CCES in 2012 (Huff and Tingley, 2015)
SLIDE 31
Null Results from Third MTurk Study
Is MTurk the problem? There were pretty big differences between MTurk and CCES in 2012 (Huff and Tingley, 2015) Many classic (non-digital) experiments replicate on MTurk (Coppock, 2018)
SLIDE 32
Null Results from Third MTurk Study
Is MTurk the problem? There were pretty big differences between MTurk and CCES in 2012 (Huff and Tingley, 2015) Many classic (non-digital) experiments replicate on MTurk (Coppock, 2018) Econ-style experiments also largely replicate on MTurk compared to students or a nationally representative sample (Snowberg and Yariv, 2018)
SLIDE 33
Null Results from Third MTurk Study
Is MTurk the problem? There were pretty big differences between MTurk and CCES in 2012 (Huff and Tingley, 2015) Many classic (non-digital) experiments replicate on MTurk (Coppock, 2018) Econ-style experiments also largely replicate on MTurk compared to students or a nationally representative sample (Snowberg and Yariv, 2018) But: MTurk users are all above a certain threshold of digital literacy
SLIDE 34
Null Results from Third MTurk Study
Is MTurk the problem? There were pretty big differences between MTurk and CCES in 2012 (Huff and Tingley, 2015) Many classic (non-digital) experiments replicate on MTurk (Coppock, 2018) Econ-style experiments also largely replicate on MTurk compared to students or a nationally representative sample (Snowberg and Yariv, 2018) But: MTurk users are all above a certain threshold of digital literacy Actually interested in the effect of clickbait on the clickers (Leeper, 2016; Knox et al., 2014)
SLIDE 35
The Clickers
SLIDE 36
[Figure: Ages of Online Samples. Density of respondents by age, for three groups: Facebook, MTurk, USA.]
SLIDE 37
Null Results from the FB Study
We got the right sample and didn’t find results
SLIDE 38
[Figure: Attrition from Online Samples. Percentage of respondents at each stage (False start, Roll Off, New Tab, DVs), for two groups: Facebook, MTurk.]
SLIDE 39
Null Results from the FB Study
We got the right sample and didn’t find results Attrition was non-random and covaried with demographics of interest
SLIDE 40
[Figure: Ages of FB Sample at Attrition Points. Density of respondents by age, for three groups: Finished, New Tab, Other.]
SLIDE 41
[Figure: Ages of MTurk Sample at Attrition Points. Density of respondents by age, for three groups: Finished, New Tab, Other.]
SLIDE 42
Examine Predictors of Stopping at New Tab
Combine the data, run a fully interacted model to look at differential effects in the two samples
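A minimal sketch of what "fully interacted" means here, under loud assumptions: the slides do not say which model family was used, so this uses a linear probability model fit by least squares, with a single covariate (age) and invented toy rows. With one binary sample indicator, the fully interacted model `stopped = b0 + b1*age + b2*facebook + b3*age*facebook` is algebraically the same as fitting one line per sample, and `b3` is the difference between the two age slopes. The function and variable names are hypothetical.

```python
def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fully_interacted(rows):
    """rows: (facebook, age, stopped) triples, with facebook and stopped coded 0/1.

    Returns the four coefficients of the interacted linear probability model,
    recovered from separate per-sample fits.
    """
    mturk = [(age, y) for fb, age, y in rows if fb == 0]
    facebook = [(age, y) for fb, age, y in rows if fb == 1]
    b0, b1 = ols_line(*zip(*mturk))      # MTurk is the reference sample
    c0, c1 = ols_line(*zip(*facebook))
    return {
        "intercept": b0,
        "age": b1,                    # age slope within MTurk
        "facebook": c0 - b0,          # shift in intercept for Facebook
        "age_x_facebook": c1 - b1,    # differential effect of age
    }


# Invented toy data, NOT the study's data: (facebook, age, stopped_at_new_tab).
toy_rows = [
    (0, 20, 0), (0, 30, 1), (0, 40, 0), (0, 50, 0), (0, 60, 1), (0, 70, 0),
    (1, 20, 0), (1, 30, 0), (1, 40, 0), (1, 50, 1), (1, 60, 1), (1, 70, 1),
]
fit = fully_interacted(toy_rows)
```

In the toy rows, age is unrelated to stopping within the MTurk rows but positively related within the Facebook rows, so `fit["age_x_facebook"]` is positive. A real analysis would add the other covariates and standard errors, and would likely use a logit link.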
SLIDE 43
- Effect of Age on Stopping at New Tab: MTurk v Facebook
SLIDE 44
“Attention Checks” With Digitally Naive Populations
Passed attention check MTurk: 82% Passed attention check FB: 57%
SLIDE 45 Time Spent on Headline Choice Sets
- Attention Check Time on Stopping at New Tab: MTurk v Facebook
SLIDE 46 Time Spent on Headline Choice Sets
- Attention Check Time on Stopping at New Tab: MTurk v Facebook
- Placebo Choice Time on Stopping at New Tab: MTurk v Facebook
SLIDE 47
Attention Checks
Combine the data, run a fully interacted model to predict missing the attention check
SLIDE 48
Attention Checks
Combine the data, run a fully interacted model to predict missing the attention check Differential effects (similar to above)
SLIDE 49 Attention Checks
Combine the data, run a fully interacted model to predict missing the attention check Differential effects (similar to above)
- Effect of Age on Missing Attention Check: MTurk v Facebook
SLIDE 50 Attention Checks
Combine the data, run a fully interacted model to predict missing the attention check Differential effects (similar to above)
- Effect of Age on Missing Attention Check: MTurk v Facebook
Conditional on all covariates, “effect” of being in the Facebook sample is negative and significant (p < .05)
SLIDE 51
Non-Numeric Ages Entered Into Text Box: Digital Dexterity
Open-response question: “How old are you”
SLIDE 52
Non-Numeric Ages Entered Into Text Box: Digital Dexterity
Open-response question: “How old are you” All but 1 MTurker entered a number
SLIDE 53
Non-Numeric Ages Entered Into Text Box: Digital Dexterity
Open-response question: “How old are you” All but 1 MTurker entered a number 39 Facebookers entered [eg]:
SLIDE 54 Non-Numeric Ages Entered Into Text Box: Digital Dexterity
Open-response question: “How old are you” All but 1 MTurker entered a number 39 Facebookers entered [eg]:
◮ Seventy one years
SLIDE 55 Non-Numeric Ages Entered Into Text Box: Digital Dexterity
Open-response question: “How old are you” All but 1 MTurker entered a number 39 Facebookers entered [eg]:
◮ Seventy one years
◮ 78 and not senile.
SLIDE 56 Non-Numeric Ages Entered Into Text Box: Digital Dexterity
Open-response question: “How old are you” All but 1 MTurker entered a number 39 Facebookers entered [eg]:
◮ Seventy one years
◮ 78 and not senile.
◮ 68 yrs. Old. Live. Chicago. With. My. Sister. And. Her.
SLIDE 57 Non-Numeric Ages Entered Into Text Box: Digital Dexterity
Open-response question: “How old are you” All but 1 MTurker entered a number 39 Facebookers entered [eg]:
◮ Seventy one years
◮ 78 and not senile.
◮ 68 yrs. Old. Live. Chicago. With. My. Sister. And. Her.
◮ ,67
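Responses like these can still be recovered rather than dropped. The following is a rough sketch of one way to do that, not the procedure used in the study: `parse_age` and its word table are hypothetical, and the rules (take the first run of digits, otherwise add up leading number words) are a simplifying assumption that would need checking against the full set of responses.

```python
import re

# Number words needed for spelled-out ages below 100.
_NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19, "twenty": 20, "thirty": 30, "forty": 40,
    "fifty": 50, "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}


def parse_age(text):
    """Return an integer age from a free-text answer, or None if none is found."""
    digits = re.search(r"\d{1,3}", text)
    if digits:
        return int(digits.group())
    total, found = 0, False
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in _NUMBER_WORDS:
            total += _NUMBER_WORDS[word]
            found = True
        elif found:
            break  # stop at the first non-number word after the number
    return total if found else None


examples = [
    "Seventy one years",
    "78 and not senile.",
    "68 yrs. Old. Live. Chicago. With. My. Sister. And. Her.",
    ",67",
]
parsed = [parse_age(e) for e in examples]
```

Keeping a flag for "answer was not a bare number" alongside the parsed age preserves the signal the slide is about: the form of the answer is itself a rough indicator of digital dexterity.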
SLIDE 58
Older People Using Mechanical Turk
Qualitative study of older, non-MTurk users: they can’t do basic tasks on MTurk (Brewer, Morris, and Piper, 2016)
SLIDE 59
Older People Using Mechanical Turk
Qualitative study of older, non-MTurk users: they can’t do basic tasks on MTurk (Brewer, Morris, and Piper, 2016) Preliminary survey: biggest barrier to crowdwork is not knowing what crowdwork is
SLIDE 60
Older People Using Mechanical Turk
Qualitative study of older, non-MTurk users: they can’t do basic tasks on MTurk (Brewer, Morris, and Piper, 2016) Preliminary survey: biggest barrier to crowdwork is not knowing what crowdwork is In-depth study of non-crowdworkers encouraged to sign up and complete basic tasks
SLIDE 61 Older People Using Mechanical Turk
Qualitative study of older, non-MTurk users: they can’t do basic tasks on MTurk (Brewer, Morris, and Piper, 2016) Preliminary survey: biggest barrier to crowdwork is not knowing what crowdwork is In-depth study of non-crowdworkers encouraged to sign up and complete basic tasks Modal respondent reported having used the internet for 10 years
SLIDE 62
Older People Using Mechanical Turk
many participants were not familiar or comfortable with opening content in new tabs/windows....‘How do I get back to the instructions? (P7)’....P3 explained: ‘There’s too many things to remember all at once...One of my complaints about some things on a computer is that, you know, if there’s a bunch of instructions or stuff to know and you have to open up a box and then if you go back to what you’re working on the box is gone, and you can’t just look up [sic] and reference it.’ These barriers, which may seem trivial from a requester’s perspective, significantly affected older adults’ abilities and time required to complete the tasks. (Brewer, Morris, and Piper, 2016)
SLIDE 63
“Reflexivity”
Twitter bot experiments—isn’t it obvious that these are bots?
SLIDE 64
“Reflexivity”
Twitter bot experiments—isn’t it obvious that these are bots? The intuitions of Computational Social Scientists are not merely insufficient; they can be actively misleading
SLIDE 65
“Reflexivity”
Twitter bot experiments—isn’t it obvious that these are bots? The intuitions of Computational Social Scientists are not merely insufficient; they can be actively misleading We all know this, but the problem is pernicious
SLIDE 66 “Reflexivity”
Twitter bot experiments—isn’t it obvious that these are bots? The intuitions of Computational Social Scientists are not merely insufficient; they can be actively misleading We all know this, but the problem is pernicious
◮ As an American studying other Americans engaged with online
politics, my intuition seems like it should be more applicable than eg in cross-national ethnography, or study of online racists
SLIDE 67 “Reflexivity”
Twitter bot experiments—isn’t it obvious that these are bots? The intuitions of Computational Social Scientists are not merely insufficient; they can be actively misleading We all know this, but the problem is pernicious
◮ As an American studying other Americans engaged with online
politics, my intuition seems like it should be more applicable than eg in cross-national ethnography, or study of online racists
◮ “But I have hundreds of friends on Twitter who are like me”-effect
SLIDE 68 “Reflexivity”
Twitter bot experiments—isn’t it obvious that these are bots? The intuitions of Computational Social Scientists are not merely insufficient; they can be actively misleading We all know this, but the problem is pernicious
◮ As an American studying other Americans engaged with online
politics, my intuition seems like it should be more applicable than eg in cross-national ethnography, or study of online racists
◮ “But I have hundreds of friends on Twitter who are like me”-effect
Models that describe us don’t describe most people
SLIDE 69 “Reflexivity”: Online Echo Chambers
Considerable energy has been spent investigating the phenomenon
SLIDE 70 “Reflexivity”: Online Echo Chambers
Considerable energy has been spent investigating the phenomenon
My read of the literature: they don’t exist...except among users in specialized (partisan or professional) networks
SLIDE 71 “Reflexivity”: Online Echo Chambers
Considerable energy has been spent investigating the phenomenon
My read of the literature: they don’t exist...except among users in specialized (partisan or professional) networks
◮ Including, of course, Computational Social Scientists and
journalists
SLIDE 72 “Reflexivity”: Online Echo Chambers
Considerable energy has been spent investigating the phenomenon
My read of the literature: they don’t exist...except among users in specialized (partisan or professional) networks
◮ Including, of course, Computational Social Scientists and
journalists
Science works: empirical consensus falsified the theory of ubiquitous echo chambers
SLIDE 73 “Reflexivity”: Online Echo Chambers
Considerable energy has been spent investigating the phenomenon
My read of the literature: they don’t exist...except among users in specialized (partisan or professional) networks
◮ Including, of course, Computational Social Scientists and
journalists
Science works: empirical consensus falsified the theory of ubiquitous echo chambers But the supply of social science research is inelastic, so there are serious opportunity costs
SLIDE 74
Costs of the Bourdieusian Scholastic View
                          Online Echo Chambers   Fake News
Experienced by CSS        YES                    NO
SLIDE 75
Costs of the Bourdieusian Scholastic View
                          Online Echo Chambers   Fake News
Experienced by CSS        YES                    NO
Experienced by public     NO                     NO
SLIDE 76
Costs of the Bourdieusian Scholastic View
                          Online Echo Chambers   Fake News
Experienced by CSS        YES                    NO
Experienced by public     NO                     NO
Specific sub-populations  YES                    YES
SLIDE 77
Costs of the Bourdieusian Scholastic View
                          Online Echo Chambers   Fake News
Experienced by CSS        YES                    NO
Experienced by public     NO                     NO
Specific sub-populations  YES                    YES
Studied by CSS            TOO MUCH               TOO LATE
SLIDE 78
[Figure: Shifting Focus of Political Science: Knowledge Production. Number of Hits (Source: Political Science) by time period, 2005−2015 vs. 2016−2018, for two groups: Echo Chamber, Fake News.]
SLIDE 79
Survey Generalizability
“The Generalizability of Survey Experiments” (Mullinix et al., 2015):
SLIDE 80 Survey Generalizability
“The Generalizability of Survey Experiments” (Mullinix et al., 2015):
◮ General point: survey experiments generalize...
SLIDE 81 Survey Generalizability
“The Generalizability of Survey Experiments” (Mullinix et al., 2015):
◮ General point: survey experiments generalize...
◮ “some convenience samples would be inappropriate such as a student sample where a moderator is age”
SLIDE 82 Survey Generalizability
“The Generalizability of Survey Experiments” (Mullinix et al., 2015):
◮ General point: survey experiments generalize...
◮ “some convenience samples would be inappropriate such as a student sample where a moderator is age”
Any sample with a hard digital literacy cutoff is inappropriate for making generalizations about online behaviors
SLIDE 83 Survey Generalizability
“The Generalizability of Survey Experiments” (Mullinix et al., 2015):
◮ General point: survey experiments generalize...
◮ “some convenience samples would be inappropriate such as a student sample where a moderator is age”
Any sample with a hard digital literacy cutoff is inappropriate for making generalizations about online behaviors Clearly excludes MTurk
SLIDE 84 Survey Generalizability
“The Generalizability of Survey Experiments” (Mullinix et al., 2015):
◮ General point: survey experiments generalize...
◮ “some convenience samples would be inappropriate such as a student sample where a moderator is age”
Any sample with a hard digital literacy cutoff is inappropriate for making generalizations about online behaviors Clearly excludes MTurk Excludes even our over-sample of digital naives due to difficulty of survey instruments we never thought existed
SLIDE 85 Survey Generalizability
“The Generalizability of Survey Experiments” (Mullinix et al., 2015):
◮ General point: survey experiments generalize...
◮ “some convenience samples would be inappropriate such as a student sample where a moderator is age”
Any sample with a hard digital literacy cutoff is inappropriate for making generalizations about online behaviors Clearly excludes MTurk Excludes even our over-sample of digital naives due to difficulty of survey instruments we never thought existed Moving forward: develop better sampling and survey techniques for studying low digital literacy populations
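The logic of the cutoff argument can be made concrete with a toy calculation. Everything below is a hypothetical illustration, not an estimate from the study: the literacy scale, the cutoff of 0.25, and the effect size of 0.3 are invented. The point is only that if an effect is concentrated below some literacy threshold, a sample drawn entirely from above that threshold will show none of it.

```python
LITERACY_CUTOFF = 0.25  # hypothetical threshold on a 0-1 digital literacy scale


def treatment_effect(literacy):
    """Hypothetical effect of clickbait: present only for low digital literacy."""
    return 0.3 if literacy < LITERACY_CUTOFF else 0.0


def average_effect(literacies):
    """Average treatment effect over a set of respondents."""
    effects = [treatment_effect(lit) for lit in literacies]
    return sum(effects) / len(effects)


# A made-up population spread evenly over the literacy scale.
population = [i / 1000 for i in range(1000)]

# A platform that only admits people at or above the cutoff.
above_cutoff_sample = [lit for lit in population if lit >= LITERACY_CUTOFF]

population_effect = average_effect(population)
sample_effect = average_effect(above_cutoff_sample)
```

Here `population_effect` is 0.075 (a quarter of the population times 0.3) while `sample_effect` is exactly zero, however large the above-cutoff sample is. Reweighting on observed demographics cannot fix this, because the affected respondents are absent from the sample rather than under-represented.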
SLIDE 86
Thank You
SLIDE 87
Brewer, Robin, Meredith Ringel Morris, and Anne Marie Piper. 2016. “Why would anybody do this?: Understanding older adults’ motivations and challenges in crowd work.” In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. ACM pp. 2246–2257.
Coppock, Alexander. 2018. “Generalizing from survey experiments conducted on Mechanical Turk: A replication approach.” Political Science Research and Methods pp. 1–16.
Huff, Connor, and Dustin Tingley. 2015. “‘Who Are These People?’ Evaluating the Demographic Characteristics and Political Preferences of MTurk Survey Respondents.” Research and Politics 2 (1): 1–12.
Knox, Dean, Teppei Yamamoto, Matthew A Baum, and Adam J Berinsky. 2014. Design, Identification, and Sensitivity Analysis for Patient Preference Trials. Technical report Working Paper.
Leeper, Thomas J. 2016. “How does treatment self-selection affect inferences about political communication?” Journal of Experimental Political Science.
SLIDE 88
Mullinix, Kevin J, Thomas J Leeper, James N Druckman, and Jeremy Freese. 2015. “The generalizability of survey experiments.” Journal of Experimental Political Science 2 (2): 109–138.
Snowberg, Erik, and Leeat Yariv. 2018. “Testing the Waters: Behavior across Participant Pools.”