Online probing for questionnaire evaluation: Effects of sample source and analysis method
Reanne Townsend, Rosalynn Yang, Kristin Chen, Gonzalo Rivero, & Terisa Davis (Westat)
Gordon Willis, Stephanie Fowler, & Richard Moser (NIH)
AAPOR 2018: Taking Survey and Public Opinion Research to New Heights

Background and Introduction
• Online Probing (OP) is a questionnaire evaluation methodology that administers probe questions within a web survey to assess targeted items (see Edgar, Murphy & Keating 2016; Meitinger & Behr 2016).
• Some researchers have experimented with Online Probing procedures to determine whether features such as text box size and probe placement affect data quality (e.g. Behr, Bandilla, Kaczmirek & Braun 2014; Fowler et al. 2017).
• However, many questions remain about how other features of Online Probing study design may influence results.

Research Questions
1. How do the amount and quality of data provided in response to Online Probing differ by sample source or recruitment strategy?
– Probability, nonprobability with quotas, convenience sample
2. How do content analysis results differ by analysis method?
– Traditional “hand-coding” vs. unsupervised keyword extraction

Design
• Short 10-minute web survey completed by 3,089 respondents
– Questionnaire composed of items from the Health Information National Trends Survey (HINTS)
• Respondents come from 3 different web panels, using varied sampling or recruitment methodologies

Panel | n | Sample type
GfK | 1,033 | Probability-based sample (ABS/RDD)
YouGov | 1,000 | Nonprobability sample with demographic quotas
mTurk | 1,056 | Nonprobability convenience sample

Design
• Online Probes were administered as open-ended questions at the end of the questionnaire (retrospectively)
– One probe each for 4 items, with a mix of question and probe types
• 2 ask respondents to list examples of a construct (“social media” and “medical information”)
• 1 asks for the method used to calculate whether the respondent had smoked 100 cigarettes
• 1 asks for the reason behind the evaluation of cancer likelihood

Design
Cognitive probe used for thematic analysis

Research Question 1: Sample Source

RQ1 Analysis
How do the amount and quality of data provided in response to Online Probing differ by sample source or recruitment strategy?
• Outcome 1: Proportion of respondents giving a “useful” response
– Coded by hand
– Three codes: nonresponse/off-topic, minimal response, potentially useful response
• Outcome 2: Average character count among potentially useful responses
– Excluding spaces
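
The two outcomes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the code labels (`"useful"`, `"minimal"`, `"nonresponse"`), the record layout, the panel names, and the example responses are all hypothetical and are not taken from the study data.

```python
from collections import defaultdict

# Hypothetical labels for the three hand codes named on the slide.
USEFUL = "useful"
MINIMAL = "minimal"
NONRESPONSE = "nonresponse"

def char_count(text: str) -> int:
    """Count characters excluding all whitespace."""
    return sum(1 for ch in text if not ch.isspace())

def summarize(records):
    """records: iterable of (source, code, response_text) tuples.

    Returns {source: (percent_useful, mean_chars_among_useful)}.
    """
    totals = defaultdict(int)
    useful_lengths = defaultdict(list)
    for source, code, text in records:
        totals[source] += 1
        if code == USEFUL:
            useful_lengths[source].append(char_count(text))
    summary = {}
    for source, n in totals.items():
        lengths = useful_lengths[source]
        pct = 100.0 * len(lengths) / n
        mean_len = sum(lengths) / len(lengths) if lengths else 0.0
        summary[source] = (pct, mean_len)
    return summary

# Made-up records for illustration only; not study data.
records = [
    ("panel_a", USEFUL, "My mother had breast cancer"),
    ("panel_a", MINIMAL, "idk"),
    ("panel_b", USEFUL, "I smoke a lot"),
    ("panel_b", USEFUL, "family history"),
]
print(summarize(records))
```

Keeping the two outcomes separate matters: the character average is computed only over responses coded as useful, so a panel with many minimal answers is not penalized twice.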

RQ1 Results
Outcome 1: Proportion of useful responses, by sample source
[Figure: bar chart. Y-axis: percent of respondents (0 to 100). Bars for GfK, YouGov, and mTurk within each of four probes: cancer chance reason, medical info examples, 100 cigarettes calculation, social media site examples. Legend: useful response, minimal response. Data labels preserved from the chart: 98.7, 85.6, 78.9.]

RQ1 Results
Outcome 2: Average number of characters per useful response, by sample source
[Figure: bar chart. Y-axis: number of characters (0 to 100). Bars for GfK, YouGov, and mTurk within each of four probes: cancer chance reason, medical info examples, 100 cigarettes calculation, social media site examples. Data labels preserved from the chart: 83.7, 73.8, 60.4.]

RQ1 Results Summary
• mTurk respondents consistently provide longer and more useful responses than respondents from the other web panels
– mTurk workers may satisfice less because mTurk requesters can reject unsatisfactory work (resulting in no payment)
• Length and usefulness of responses also vary by type of probe
– Probes asking respondents to list examples seem to have shorter and less useful responses

Research Question 2: Analysis Method

RQ2 Analysis
How does analysis method affect a thematic content analysis of the open-ended web probe responses?
• Method 1: Traditional “by-hand” coding
– 2 coders, categories determined jointly by coders, all responses coded (75 double-coded)
• Method 2: Natural Language Processing (NLP): unsupervised keyword extraction and topic model
– Identify relevant keywords
– Group keywords into “topics” based on contextual similarity
• Pre-trained word embedding model Word2Vec, trained on Google News
– Associate individual responses with topics based on the occurrence of keywords
*Note: both methods allowed for multiple categories per response
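
The Method 2 steps can be illustrated with a small, self-contained sketch. It is not the authors' pipeline: the slides do not say how keywords were grouped, so the greedy cosine-similarity grouping, the 0.8 threshold, and the toy two-dimensional vectors (standing in for pre-trained Word2Vec embeddings) are all assumptions made for illustration.

```python
import math
import re

# Toy 2-d vectors standing in for pre-trained embeddings; values are invented.
EMBEDDINGS = {
    "mother": (0.9, 0.1),
    "family": (1.0, 0.0),
    "genetics": (0.8, 0.2),
    "smoker": (0.1, 0.9),
    "cigarette": (0.0, 1.0),
}

def cosine(u, v):
    """Cosine similarity between two 2-d vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def group_keywords(embeddings, threshold=0.8):
    """Greedy grouping (an assumed stand-in for the study's topic model):
    a keyword joins the first topic whose seed word is within the cosine
    threshold, otherwise it starts a new topic."""
    topics = []  # each topic is a list; its first element is the seed word
    for word, vec in embeddings.items():
        for topic in topics:
            if cosine(vec, embeddings[topic[0]]) >= threshold:
                topic.append(word)
                break
        else:
            topics.append([word])
    return topics

def assign_topics(response, topics):
    """Return the index of every topic with at least one keyword in the
    response, so a response can belong to multiple topics."""
    tokens = set(re.findall(r"[a-z']+", response.lower()))
    return [i for i, topic in enumerate(topics) if tokens & set(topic)]

topics = group_keywords(EMBEDDINGS)
print(topics)
print(assign_topics("My mother was a smoker", topics))
```

Because assignment depends only on keyword occurrence, a response such as "My mother was a smoker" lands in both toy topics, which matches the note that multiple categories per response were allowed.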

RQ2 Results
Results of Thematic Coding using traditional, by-hand method

Category | % of responses | Inter-rater agreement | Description
Family | 47.2 | 1.00 | Family history, genetics
Lifestyle | 31.0 | 0.93 | Smoking, lifestyle & environment (incl. diet, exercise, pollution, "chemicals")
Random | 19.7 | 0.62 | Can't know, no way to know, 50/50 chance, can't control it, it's random, it's luck of the draw
Common | 12.5 | 0.79 | Cancer is common, everyone gets it, everything causes cancer
Don't know | 7.4 | 0.60 | Don't know/no idea, don't care, not concerned, don't think about it, why worry
Other | 7.6 | 0.58 | All other responses (e.g. current age, other health issues, medical advances)
Faith | 4.6 | 0.65 | Faith, feeling, intuition, positive thinking
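
The slide does not name the inter-rater agreement statistic. As one possibility, the sketch below computes Cohen's kappa per category, treating each category as a binary applied/not-applied decision by each coder (which fits a scheme allowing multiple categories per response). The choice of kappa and the example codes are assumptions, not details from the study.

```python
def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' binary (0/1) decisions on the same
    set of responses, for a single category."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two equal-length, non-empty code lists")
    n = len(coder_a)
    # Observed agreement: share of responses where both coders match.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from each coder's marginal rate of applying the code.
    p_a = sum(coder_a) / n
    p_b = sum(coder_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0  # both coders gave the same single code to every response
    return (observed - expected) / (1 - expected)

# Invented codes for one category across eight double-coded responses.
coder_1 = [1, 1, 0, 0, 1, 0, 1, 0]
coder_2 = [1, 1, 0, 0, 1, 0, 0, 1]
print(cohens_kappa(coder_1, coder_2))
```

In this invented example the coders agree on 6 of 8 responses, but half of that agreement is expected by chance, so kappa is lower than the raw 75% agreement.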

RQ2 Results
Results of Thematic Coding using unsupervised NLP model

Category | % | Example keywords
Family | 37.9 | parent, ancestor, grandparent, family, sibling, uncle; baby, man, woman, teenager, friend
Belief/Certainty | 34.2 | luck, presume, uncertain, unsure, hunch, gut, prediction, hopeful, hope; paranoid, everyone, anybody, anytime, jesus, christ, optimism, god, faith
Sun & Other | 30.3 | sunshine, sun, sunny, beach; environment, industry, metal, research, knowledge, capability, technology, future
Disease, age, lifestyle | 15.4 | disorder, death, disease, insurance, sick, treatment, condition, lifestyle, longevity; prostate, stomach, heart, freckle, skin, colon, lung, bone, testicular, depression
Actions | 13.6 | take, try, address, counteract, visit, focus, avoid, help, prevent, maintain, protect, exercise, combat, minimize, limit
Risk & fear | 10.5 | prone, cause, trigger, culprit, precursor, tendency, predisposition; fear, risk, danger, paranoia, harmful, damage, chemtrails
Diet & Smoking | 6.8 | eating, sugar, vegetable, nutrient, pollution, additive, toxic, chemical; smoker, cigarette, drinker, substance

RQ2 Results
Compare conclusions between analysis methods
• Similarities
– Family history and genetics as most common response
• 47% of hand-coded responses, 38% of NLP responses
– Many respondents feel they can’t predict or control whether they get cancer
• “Random” and “Faith” from hand coding (24%), “Belief/Certainty” for NLP (34%)

RQ2 Results
Compare conclusions between analysis methods
• Differences
– “Environmental & Lifestyle factors”
• Hand-coding grouped all lifestyle factors together (incl. smoking, diet, exercise, pollution, sun)
• NLP has “Actions”, “Diet & Smoking”, “Sun & Other”, and “Disease, age, lifestyle”
– These overlap heavily with “Environmental & Lifestyle”, but not completely
– “Cancer is common” sentiment did not show up as a category in NLP analysis (13% in hand coding)

Discussion
RQ1. Amount and quality of data by sample source
• The amount and quality of information elicited from Online Probing can differ depending on the source of the sample
– mTurkers provide more information, but are they “professional respondents” whose responses do not generalize?
• Possible next steps:
– Examine whether thematic coding results differ by sample source
– Further explore how question and probe type affect the amount and quality of information

Discussion
RQ2. Thematic coding results by analysis method
• Thematic categories defined by keywords that can be identified outside of syntactic context can be similar between hand coding and NLP (e.g. family & genetics)
– Concepts that require context beyond individual keywords are not as easily categorized by unsupervised keyword extraction (e.g. “Sun & Other”)
– Possible next step: classification could be improved by using more sophisticated NLP methods, such as n-grams instead of single keywords and a probabilistic rather than deterministic framework
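
To make the n-gram suggestion concrete, here is a minimal sketch of counting bigrams across responses. The tokenizer, the helper name `ngrams`, and the example responses are invented for illustration; the presentation does not describe an implementation.

```python
import re
from collections import Counter

def ngrams(text, n):
    """Return the word n-grams of a response as space-joined strings."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Invented responses for illustration only; not study data.
responses = [
    "No family history of cancer",
    "Family history on my mother's side",
    "There is no way to know",
]

# Count bigrams across all responses.
bigram_counts = Counter(g for r in responses for g in ngrams(r, 2))
print(bigram_counts.most_common(3))
```

A bigram such as "family history" or "to know" carries context that the single keywords "history" or "know" lose, which is the kind of concept the slide notes single-keyword extraction handles poorly.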

Thank you! Contact: ReanneTownsend@Westat.com
