Everyone On Mechanical Turk is Above a Threshold of Digital - - PowerPoint PPT Presentation


Everyone On Mechanical Turk is Above a Threshold of Digital Literacy: Using Facebook Ads to Measure Online Media Effects Kevin Munger, Mario Luca, Jonathan Nagler, Joshua Tucker Penn State University and Princeton University September 7, 2018


slide-1
SLIDE 1

Everyone On Mechanical Turk is Above a Threshold of Digital Literacy:

Using Facebook Ads to Measure Online Media Effects

Kevin Munger, Mario Luca, Jonathan Nagler, Joshua Tucker

Penn State University and Princeton University

September 7, 2018

slide-6
SLIDE 6

Overview

Online media effects are rapidly changing: how do we keep up?

Online media effects are uniquely heterogeneous: how do we generalize, how do we study the correct populations?

Our intuitions about social media can be actively misleading: how do we adjust?

Takeaway from this paper: traditional sampling and traditional survey experiments fail to allow us to study low digital literacy populations

Case study: clickbait!

slide-9
SLIDE 9

What is clickbait?

“Clickbait” is a new term for an old phenomenon.

Media companies’ strategy has always been determined by technological, political, and regulatory contexts

New technology lowers the cost of news production/distribution, so new entrants compete for attention

slide-10
SLIDE 10

Classic Clickbait

slide-16
SLIDE 16

Modern clickbait

Various formulations have come and gone; ongoing battle with Facebook

Political clickbait is necessarily partisan

◮ High levels of affective polarization
◮ Partisans are the biggest consumers of political news
◮ Signalling partisanship can supplant source cues in establishing credibility

One common form of partisan clickbait with potentially damaging consequences: emotional clickbait

slide-21
SLIDE 21

Emotional clickbait

Turn a partisan headline into emotional clickbait by adding one of these phrases

◮ People are loving this:
◮ Democrats are freaking out:
◮ This will make you furious:
◮ Republicans are shocked...

slide-24
SLIDE 24

What we tried

Online survey experiment

Randomly assign respondents to one of four different headlines, keeping the story constant

Look for effects on affective polarization, trust in media, and information retention questions
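
The assignment step described on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the condition labels, the `assign_conditions` helper, and the choice of balanced (complete) randomization are all assumptions, since the slides say only that respondents were randomly assigned to one of four headlines.

```python
import random
from collections import Counter

# Hypothetical condition labels; the slides do not name the four headlines.
CONDITIONS = ["headline_a", "headline_b", "headline_c", "headline_d"]


def assign_conditions(respondent_ids, conditions=CONDITIONS, seed=0):
    """Complete random assignment: shuffle a balanced pool of conditions."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    n = len(respondent_ids)
    # Repeat the condition list until it covers every respondent, then trim.
    pool = (conditions * (n // len(conditions) + 1))[:n]
    rng.shuffle(pool)
    return dict(zip(respondent_ids, pool))


assignment = assign_conditions(list(range(8)))
counts = Counter(assignment.values())
```

With a sample size that is a multiple of four, every headline is shown to the same number of respondents; the story text itself would be held constant outside this function.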

slide-26
SLIDE 26

Null Results from First MTurk Study

Tried again: made the subject matter more topical, added a placebo condition

(After the pilot, we pre-registered the R code we used to analyze all results)

slide-27
SLIDE 27

Null Results from Second MTurk Study

Tried again: shortened the survey, removed “preference for clickbait” questionnaire which could dampen treatment effects

slide-28
SLIDE 28

Null Results from Third MTurk Study

slide-34
SLIDE 34

Null Results from Third MTurk Study

Is MTurk the problem?

There were pretty big differences between MTurk and CCES in 2012 (Huff and Tingley, 2015)

Many classic (non-digital) experiments replicate on MTurk (Coppock, 2018)

Econ-style experiments also largely replicate on MTurk compared to students or a nationally representative sample (Snowberg and Yariv, 2018)

But: MTurk users are all above a certain threshold of digital literacy

Actually interested in the effect of clickbait on the clickers (Leeper, 2016; Knox et al., 2014)

slide-35
SLIDE 35

The Clickers

slide-36
SLIDE 36

[Figure: Ages of Online Samples. Density of respondents by age (x-axis: Age, 25 to 100; y-axis: Density of Respondents), by group: Facebook, MTurk, USA.]

slide-37
SLIDE 37

Null Results from the FB Study

We got the right sample and didn’t find results

slide-38
SLIDE 38

[Figure: Attrition from Online Samples. Percentage of respondents at each stage (False start, Roll Off, New Tab, DVs), by group: Facebook, MTurk.]

slide-39
SLIDE 39

Null Results from the FB Study

We got the right sample and didn’t find results

Attrition was non-random and covaried with demographics of interest

slide-40
SLIDE 40

[Figure: Ages of FB Sample at Attrition Points. Density of respondents by age (x-axis: Age; y-axis: Density of Respondents), by group: Finished, New Tab, Other.]

slide-41
SLIDE 41

[Figure: Ages of MTurk Sample at Attrition Points. Density of respondents by age (x-axis: Age; y-axis: Density of Respondents), by group: Finished, New Tab, Other.]

slide-42
SLIDE 42

Examine Predictors of Stopping at New Tab

Combine the data, run a fully interacted model to look at differential effects in the two samples

slide-43
SLIDE 43

Effect of Age on Stopping at New Tab: MTurk v Facebook

slide-44
SLIDE 44

“Attention Checks” With Digitally Naive Populations

Passed attention check, MTurk: 82%

Passed attention check, FB: 57%

slide-46
SLIDE 46

Time Spent on Headline Choice Sets

  • Attention Check Time on Stopping at New Tab: MTurk v Facebook
  • Placebo Choice Time on Stopping at New Tab: MTurk v Facebook
slide-50
SLIDE 50

Attention Checks

Combine the data, run a fully interacted model to predict missing the attention check

Differential effects (similar to above)

  • Effect of Age on Missing Attention Check: MTurk v Facebook

Conditional on all covariates, “effect” of being in the Facebook sample is negative and significant (p < .05)

slide-57
SLIDE 57

Non-Numeric Ages Entered Into Text Box: Digital Dexterity

Open-response question: “How old are you”

All but 1 MTurker entered a number

39 Facebookers entered [e.g.]:

◮ Seventy one years
◮ 78 and not senile.
◮ 68 yrs. Old. Live. Chicago. With. My. Sister. And. Her. Husband. I am. Wildow
◮ ,67
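
Responses like these can still be recovered with a little cleaning. The sketch below is one hypothetical way to do it, not the authors' pipeline: `parse_age` and its small word vocabulary are assumptions made for illustration, and it deliberately ignores cases such as teens ("nineteen") or hyphenated words.

```python
import re

# Small vocabulary for spelled-out ages; an assumption made for this sketch.
UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}


def parse_age(response):
    """Return an integer age from a free-text answer, or None if none is found."""
    # Prefer the first run of digits, e.g. ",67" or "78 and not senile."
    digits = re.search(r"\d{1,3}", response)
    if digits:
        return int(digits.group())
    # Otherwise add up the first contiguous run of number words.
    total, found = 0, False
    for word in re.findall(r"[a-z]+", response.lower()):
        if word in TENS:
            total, found = total + TENS[word], True
        elif word in UNITS:
            total, found = total + UNITS[word], True
        elif found:
            break  # the number phrase has ended
    return total if found else None


examples = ["Seventy one years", "78 and not senile.", ",67"]
parsed = [parse_age(text) for text in examples]
```

Whether to recode such answers or treat them as a signal of low digital dexterity is a design decision; the slide uses them as the latter.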

slide-61
SLIDE 61

Older People Using Mechanical Turk

Qualitative study of older, non-MTurk users: they can’t do basic tasks on MTurk (Brewer, Morris, and Piper, 2016)

Preliminary survey: biggest barrier to crowdwork is not knowing what crowdwork is

In-depth study of non-crowdworkers encouraged to sign up and complete basic tasks

Modal respondent reported having used the internet for 10 years or more
slide-62
SLIDE 62

Older People Using Mechanical Turk

many participants were not familiar or comfortable with opening content in new tabs/windows....‘How do I get back to the instructions? (P7)’....P3 explained: ‘There’s too many things to remember all at once...One of my complaints about some things on a computer is that, you know, if there’s a bunch of instructions or stuff to know and you have to open up a box and then if you go back to what you’re working on the box is gone, and you can’t just look up [sic] and reference it.’ These barriers, which may seem trivial from a requester’s perspective, significantly affected older adults’ abilities and time required to complete the tasks. (Brewer, Morris, and Piper, 2016)

slide-68
SLIDE 68

“Reflexivity”

Twitter bot experiments: isn’t it obvious that these are bots?

The intuitions of Computational Social Scientists are not merely insufficient; they can be actively misleading

We all know this, but the problem is pernicious

◮ As an American studying other Americans engaged with online politics, my intuition seems like it should be more applicable than, e.g., in cross-national ethnography or the study of online racists
◮ “But I have hundreds of friends on Twitter who are like me”-effect

Models that describe us don’t describe most people

slide-73
SLIDE 73

“Reflexivity”: Online Echo Chambers

Considerable energy has been spent investigating the phenomenon of echo chambers

My read of the literature: they don’t exist...except among users in specialized (partisan or professional) networks

◮ Including, of course, Computational Social Scientists and journalists

Science works: empirical consensus falsified the theory of ubiquitous echo chambers

But the supply of social science research is inelastic, so there are serious opportunity costs

slide-77
SLIDE 77

Costs of the Bourdieusian Scholastic View

                            Online Echo Chambers    Fake News
Experienced by CSS          YES                     NO
Experienced by public       NO                      NO
Specific sub-populations    YES                     YES
Studied by CSS              TOO MUCH                TOO LATE

slide-78
SLIDE 78

[Figure: Shifting Focus of Political Science: Knowledge Production. Number of hits (y-axis, 25 to 125) by time period (x-axis: 2005−2015, 2016−2018), by group: Echo Chamber, Fake News.]

slide-85
SLIDE 85

Survey Generalizability

“The Generalizability of Survey Experiments” (Mullinix et al., 2015):

◮ General point: survey experiments generalize...
◮ “some convenience samples would be inappropriate such as a student sample where a moderator is age”

Any sample with a hard digital literacy cutoff is inappropriate for making generalizations about online behaviors

Clearly excludes MTurk

Excludes even our over-sample of digital naives due to difficulty of survey instruments we never thought existed

Moving forward: develop better sampling and survey techniques for studying low digital literacy populations

slide-86
SLIDE 86

Thank You

slide-87
SLIDE 87

Brewer, Robin, Meredith Ringel Morris, and Anne Marie Piper. 2016. Why would anybody do this?: Understanding older adults’ motivations and challenges in crowd work. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. ACM pp. 2246–2257.

Coppock, Alexander. 2018. “Generalizing from survey experiments conducted on Mechanical Turk: A replication approach.” Political Science Research and Methods pp. 1–16.

Huff, Connor, and Dustin Tingley. 2015. “‘Who Are These People?’ Evaluating the Demographic Characteristics and Political Preferences of MTurk Survey Respondents.” Research and Politics 2 (1): 1–12.

Knox, Dean, Teppei Yamamoto, Matthew A Baum, and Adam Berinsky. 2014. Design, Identification, and Sensitivity Analysis for Patient Preference Trials. Technical report Working Paper.

Leeper, Thomas J. 2016. “How does treatment self-selection affect inferences about political communication?” Journal of Experimental Political Science.

slide-88
SLIDE 88

Mullinix, Kevin J, Thomas J Leeper, James N Druckman, and Jeremy Freese. 2015. “The generalizability of survey experiments.” Journal of Experimental Political Science 2 (2): 109–138.

Snowberg, Erik, and Leeat Yariv. 2018. “Testing the Waters: Behavior across Participant Pools.”