Conducting rigorous research on large open-access developmental datasets




slide-1
SLIDE 1

Amy Orben Department of Experimental Psychology, University of Oxford

ABCD Workshop, Portland @OrbenAmy

Conducting rigorous research on large open-access developmental datasets

1

slide-2
SLIDE 2

2

  • 1. Curbing analytical flexibility
  • 2. Preregistration + Registered Reports
  • 3. Specification Curve Analysis
  • 4. Effect Sizes
slide-3
SLIDE 3

Derren Brown: The System

3 (Kate Button)

slide-4
SLIDE 4

While there was a system to guarantee that she won, it wasn’t the system she thought it was.

4

slide-5
SLIDE 5

Race 1: 7776 people, randomly allocated a horse
She was the 1 / 7776 who by chance had 5 consecutive wins

5

slide-6
SLIDE 6

Race 1: 7776 people, randomly allocated a horse
Race 2: 1296 race 1 winners, randomly allocated a horse

6

slide-7
SLIDE 7

Race 1: 7776 people, randomly allocated a horse
Race 2: 1296 race 1 winners, randomly allocated a horse
Race 3: 216 race 2 winners, randomly allocated a horse

7

slide-8
SLIDE 8

Race 1: 7776 people, randomly allocated a horse
Race 2: 1296 race 1 winners, randomly allocated a horse
Race 3: 216 race 2 winners, randomly allocated a horse
Race 4: 36 race 3 winners, randomly allocated a horse

8

slide-9
SLIDE 9

Race 1: 7776 people, randomly allocated a horse
Race 2: 1296 race 1 winners, randomly allocated a horse
Race 3: 216 race 2 winners, randomly allocated a horse
Race 4: 36 race 3 winners, randomly allocated a horse
Race 5: 6 race 4 winners, randomly allocated a horse

9

slide-10
SLIDE 10

Race 1: 7776 people, randomly allocated a horse
Race 2: 1296 race 1 winners, randomly allocated a horse
Race 3: 216 race 2 winners, randomly allocated a horse
Race 4: 36 race 3 winners, randomly allocated a horse
Race 5: 6 race 4 winners, randomly allocated a horse
She was the 1 / 7776 who by chance had 5 consecutive wins
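
The arithmetic behind the system can be checked with a short sketch. It assumes six horses per race and an even split of the remaining bettors across horses; the slides do not state this, it is inferred from 7776 shrinking to 1296. The function name is hypothetical.

```python
def winners_per_race(people, horses, races):
    """Return how many bettors are still unbeaten before race 1 and after each race."""
    counts = [people]
    for _ in range(races):
        # Bettors are spread evenly over the horses, so exactly 1/horses of them win.
        people //= horses
        counts.append(people)
    return counts


counts = winners_per_race(people=7776, horses=6, races=5)
print(counts)  # [7776, 1296, 216, 36, 6, 1]
```

Someone always ends up with five straight wins, whatever the horses do; the only thing chance decides is who that person is.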

10

slide-11
SLIDE 11

11

The “Winning Streak”

slide-12
SLIDE 12

12

Data

Gelman: http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf

slide-13
SLIDE 13

13

Data

slide-14
SLIDE 14

14

Data

slide-15
SLIDE 15

15

Data

slide-16
SLIDE 16

16

Data

Statistically Significant Result

slide-17
SLIDE 17

17

The Scientific Headline

Data

slide-18
SLIDE 18

Garden of Forking Paths

“The researcher degrees of freedom do not feel like degrees of freedom because, conditional on the data, each choice appears to be deterministic. But if we average over all possible data that could have occurred, we need to look at the entire garden of forking paths and recognize how each path can lead to statistical significance in its own way."

18 Gelman: http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf
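
A rough simulation can make the forking-paths point concrete. The sketch below is illustrative only: it assumes two groups of 20 drawn from the same normal distribution (so the null hypothesis is true), treats each "path" as an independent outcome measure, and uses a simple z-test with known unit variance. None of these choices come from the slides.

```python
import random
from statistics import NormalDist, fmean


def p_value(group_a, group_b):
    """Two-sided z-test for a mean difference, assuming unit variance in both groups."""
    n = len(group_a)
    z = (fmean(group_a) - fmean(group_b)) / (2 / n) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_outcomes, n_per_group=20, n_studies=4000, seed=1):
    """Share of null studies where at least one outcome reaches p < .05."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        p_values = []
        for _ in range(n_outcomes):
            # Both groups come from the same distribution: there is no true effect.
            a = [rng.gauss(0, 1) for _ in range(n_per_group)]
            b = [rng.gauss(0, 1) for _ in range(n_per_group)]
            p_values.append(p_value(a, b))
        # The "researcher" reports whichever outcome happened to work.
        hits += min(p_values) < 0.05
    return hits / n_studies


one_path = false_positive_rate(n_outcomes=1)
five_paths = false_positive_rate(n_outcomes=5)
print(f"one outcome: {one_path:.3f}, best of five outcomes: {five_paths:.3f}")
```

With one prespecified outcome the false-positive rate should sit near the nominal 5%. With five independent outcomes to choose from it should approach 1 - 0.95^5, or about 23%.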

slide-19
SLIDE 19

19

slide-20
SLIDE 20

20 University of Pennsylvania undergraduates

20

Does listening to the song “When I’m Sixty-Four” cause people to become older?

Listen to “When I’m Sixty-Four” or “Kalimba”
Indicate birthday and father’s age (to control for baseline age across participants)

slide-21
SLIDE 21

20 University of Pennsylvania undergraduates

21

Does listening to the song “When I’m Sixty-Four” cause people to become older?

Listen to “When I’m Sixty-Four” or “Kalimba”
Indicate birthday and father’s age (to control for baseline age across participants)

People were 1½ years younger after “When I’m Sixty-Four”
F(1,17) = 4.92, p = 0.040

slide-22
SLIDE 22

22 Simmons, Nelson, Simonsohn (2011)

slide-23
SLIDE 23

23 Simmons, Nelson, Simonsohn (2011)

slide-24
SLIDE 24

24

slide-25
SLIDE 25

25

slide-26
SLIDE 26

26

slide-27
SLIDE 27

27

Why might these problems be amplified by large-scale openly accessible data?

slide-28
SLIDE 28

28

An Example

slide-29
SLIDE 29
slide-30
SLIDE 30
slide-31
SLIDE 31

31

slide-32
SLIDE 32

Data from Twenge et al. (2017), Orben (2017)

slide-33
SLIDE 33

33

Big Data – Small Effects

slide-34
SLIDE 34

34 Orben and Przybylski (Nature Human Behaviour, 2019)

slide-35
SLIDE 35

35

The Garden of Forking Paths

slide-36
SLIDE 36

Data that is “Too Big To Fail”

  • Large numbers of participants ensure that even extremely modest covariations (e.g. r’s < 0.05) between self-report items will result in alpha levels typically interpreted as compelling evidence for rejecting the null hypothesis by psychological scientists (i.e. p’s < 0.05)
  • Large batteries of ill-defined questions lead to an explosion of possible analytical pathways (researcher degrees of freedom)

Orben and Przybylski (Nature Human Behaviour, 2019)
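
The first point can be illustrated with a quick calculation. This sketch uses the Fisher z-transform as a standard large-sample approximation for the p-value of a Pearson correlation; the sample sizes are arbitrary examples and the function name is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_p_value(r, n):
    """Approximate two-sided p-value for a Pearson r via the Fisher z-transform."""
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The same tiny correlation, evaluated at increasingly large sample sizes.
for n in (100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7,}: p = {correlation_p_value(0.05, n):.2g}")
```

The correlation stays at r = 0.05 throughout; only the sample size changes, and with it whether the result crosses the conventional p < 0.05 threshold.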

slide-37
SLIDE 37

37

What can we do?

slide-38
SLIDE 38

Solutions to Analytical Flexibility

  • Transparency:
  • Number of variables
  • Termination rules
  • All experimental conditions
  • Observations that are eliminated
  • Covariates

38 Simmons, Nelson, Simonsohn (2011)

slide-39
SLIDE 39

The 21-Word Solution

We report how we determined our sample size, all data exclusions (if any), all manipulations, and all measures in the study.

39 Felix Schönbrodt: A voluntary commitment to research transparency

slide-40
SLIDE 40

Solution #1

Decide on one analytical pathway beforehand using pre-registration or registered report methodologies

(Chambers, 2013; Munafò et al., 2017; van ’t Veer, 2016; Lakens, 2014)

Pro: Simple way to decrease researcher degrees of freedom

http://blogs.discovermagazine.com/neuroskeptic/2013/10/16/the-f-problem/

40

slide-41
SLIDE 41

Solution #1

Decide on one analytical pathway beforehand using pre-registration or registered report methodologies

(Chambers, 2013; Munafò et al., 2017; van ’t Veer, 2016; Lakens, 2014)

Pro: Simple way to decrease researcher degrees of freedom

Con: Researcher needs to prove that they have not previously seen or engaged with the data

41

slide-42
SLIDE 42

Preregistration

42

slide-43
SLIDE 43

43, taken from Chris Chambers

slide-44
SLIDE 44

Stage 1 at Cortex

44

slide-45
SLIDE 45

45 Simonsohn, Simmons, Nelson (2015)

Solution #2

Examine all possible analytical pathways using Specification Curve Analysis

(SCA; Simonsohn, Simmons, & Nelson, 2015)

Pro: Works around researcher degrees of freedom even when data has been previously accessed

slide-46
SLIDE 46

46 Simonsohn, Simmons, Nelson (2015)

slide-47
SLIDE 47

1 Identify Specifications: Decide on all possible analytical pathways
2 Implementing Specifications: Run all possible analyses and graph outcomes
3 Statistical Inferences: Run bootstraps to test whether the original dataset has more significant specifications than a dataset where the null hypothesis is true
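
The three steps can be sketched on toy data. Everything here is an assumption made for illustration: the synthetic variables, the 14 specifications (any non-empty subset of three outcome items, with or without one covariate), and the use of a simple permutation of the predictor in step 3 in place of the bootstrap procedure named above. It is not the published SCA implementation.

```python
import itertools
import random
import statistics


def slope(x, y):
    """OLS slope of y on x (single predictor with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def residualize(v, c):
    """Remove the linear effect of covariate c from v."""
    b = slope(c, v)
    mc, mv = statistics.fmean(c), statistics.fmean(v)
    return [vi - mv - b * (ci - mc) for vi, ci in zip(v, c)]


def run_specifications(x, items, c):
    """Steps 1 and 2: enumerate every specification and estimate each one."""
    results = []
    for k in range(1, len(items) + 1):
        for subset in itertools.combinations(range(len(items)), k):
            # Outcome = mean of the chosen questionnaire items.
            y = [statistics.fmean(items[j][i] for j in subset) for i in range(len(x))]
            for use_cov in (False, True):
                if use_cov:
                    est = slope(residualize(x, c), residualize(y, c))
                else:
                    est = slope(x, y)
                results.append((est, subset, use_cov))
    return sorted(results)  # sorted estimates form the specification curve


# Synthetic data (hypothetical): a covariate, a predictor, three outcome items.
rng = random.Random(1)
n = 400
c = [rng.gauss(0, 1) for _ in range(n)]
x = [0.5 * ci + rng.gauss(0, 1) for ci in c]
items = [[-0.3 * xi + 0.2 * ci + rng.gauss(0, 1) for xi, ci in zip(x, c)]
         for _ in range(3)]

curve = run_specifications(x, items, c)
median_effect = statistics.median(est for est, _, _ in curve)

# Step 3 (simplified): shuffle the predictor to mimic a dataset with no effect.
null_medians = []
for _ in range(100):
    shuffled = x[:]
    rng.shuffle(shuffled)
    null_curve = run_specifications(shuffled, items, c)
    null_medians.append(statistics.median(est for est, _, _ in null_curve))
p_value = sum(abs(m) >= abs(median_effect) for m in null_medians) / len(null_medians)

print(f"{len(curve)} specifications, median estimate {median_effect:.3f}, p = {p_value:.2f}")
```

Plotting the sorted estimates in `curve` against their rank gives the specification curve; the accompanying `subset` and `use_cov` fields show which analytical choices produced each point.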

47

slide-48
SLIDE 48
  • SCREENSHOT OF MEDIA ARTICLE ABOUT JUNG ET AL 2014

48

slide-49
SLIDE 49

49

slide-50
SLIDE 50

50

slide-51
SLIDE 51

51

slide-52
SLIDE 52

52

slide-53
SLIDE 53

Specification Curve Analysis

53 Simonsohn, Simmons, Nelson (2015)

slide-54
SLIDE 54

Specification Curve Analysis

54 Simonsohn, Simmons, Nelson (2015)

slide-55
SLIDE 55
  • ADD STUFF ABOUT MULTIVERSE

55

slide-56
SLIDE 56

56

slide-57
SLIDE 57

57

slide-58
SLIDE 58

58

slide-59
SLIDE 59

59 Poldrack et al. (2017)

slide-60
SLIDE 60

Well-being: Any possible combination of 24 questions about well-being, self-esteem and feelings (cohort members) or of 25 questions of the Strengths and Difficulties Questionnaire (caregivers)

Technology Use: Mean of any possible combination of 5 questions concerning TV use, electronic games, social media use, owning a computer and using internet at home

Covariates: Included or not (mother’s ethnicity, education, employment, psychological distress, equivalised household income, whether biological father is present, number of siblings in household, conflict in mother-child relationship, frequency of mother-child interaction, long-term illness, negative attitudes towards school, mother’s word activity score)

Total 3,221,225,472 specifications

MCS

1 Identify Specifications: Decide on all possible analytical pathways
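
One way to arrive at the total on this slide is to count every subset of each question set. This is a reconstruction that happens to match the stated figure, not a documented derivation; note that it also counts the empty subsets.

```python
# Number of subsets of a set of k questions is 2 ** k (this includes the empty set).
wellbeing_cohort = 2 ** 24     # 24 cohort-member questions
wellbeing_caregiver = 2 ** 25  # 25 caregiver questionnaire questions
technology_use = 2 ** 5        # 5 technology-use questions
covariates = 2                 # covariates included or not

# Well-being comes from one respondent type OR the other, so those counts add;
# the remaining choices are independent, so they multiply.
total = (wellbeing_cohort + wellbeing_caregiver) * technology_use * covariates
print(f"{total:,}")  # 3,221,225,472
```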

60

slide-61
SLIDE 61

2 Implementing Specifications: Run all possible analyses and graph outcomes

Orben and Przybylski (Nature Human Behaviour, 2019)

slide-62
SLIDE 62

2 Implementing Specifications: Run all possible analyses and graph outcomes

Orben and Przybylski (Nature Human Behaviour, 2019)

slide-63
SLIDE 63

2 Implementing Specifications: Run all possible analyses and graph outcomes

Orben and Przybylski (Nature Human Behaviour, 2019)

slide-64
SLIDE 64

2 Implementing Specifications: Run all possible analyses and graph outcomes

Orben and Przybylski (Nature Human Behaviour, 2019)

slide-65
SLIDE 65

2 Implementing Specifications: Run all possible analyses and graph outcomes

Orben and Przybylski (Nature Human Behaviour, 2019)

slide-66
SLIDE 66

2 Implementing Specifications: Run all possible analyses and graph outcomes

Orben and Przybylski (Nature Human Behaviour, 2019)

slide-67
SLIDE 67

2 Implementing Specifications: Run all possible analyses and graph outcomes

Orben and Przybylski (Nature Human Behaviour, 2019)

slide-68
SLIDE 68

Other Examples

Preregistered with 3 datasets: Orben and Przybylski (Psychological Science, 2019)
Longitudinal: Orben, Dienlin and Przybylski (PNAS, 2019)

slide-69
SLIDE 69

Solution #3

Include extra transparency about effect sizes

This can mean putting effect sizes into perspective using other variables, Smallest Effect Sizes of Interest or real-life cut-offs
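
A minimal sketch of what such a comparison might look like. All numbers are made up for illustration: the observed correlation, the comparison variable and the smallest effect size of interest (SESOI) are hypothetical, as are the function names.

```python
def variance_explained(r):
    """Percentage of variance in the outcome associated with the predictor."""
    return 100 * r ** 2


def exceeds_sesoi(r, sesoi):
    """True if the absolute correlation is at least the smallest effect size of interest."""
    return abs(r) >= sesoi


# Hypothetical values, chosen only to show the comparison.
observed_r = -0.05    # association of interest
comparison_r = -0.20  # association of some other variable with the same outcome
sesoi = 0.10          # threshold below which an effect is judged too small to matter

print(f"observed: {variance_explained(observed_r):.2f}% of variance")
print(f"comparison: {variance_explained(comparison_r):.2f}% of variance")
print(f"observed effect reaches SESOI: {exceeds_sesoi(observed_r, sesoi)}")
```

Reporting the estimate next to a benchmark and a prespecified threshold lets readers judge practical importance, which a p-value from a very large sample cannot do on its own.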

slide-70
SLIDE 70
slide-71
SLIDE 71
slide-72
SLIDE 72
slide-73
SLIDE 73
slide-74
SLIDE 74

74

Or: https://psyarxiv.com/syp5a/

slide-75
SLIDE 75

75

slide-76
SLIDE 76

76

Good analysis of large-scale data is inherently rooted in transparency

Some of the tools to help are:

  • 1. Preregistration + Registered Reports
  • 2. Specification Curve Analysis
  • 3. Considering Effect Sizes
slide-77
SLIDE 77

Thank you

Professor Robin Dunbar

Professor Andrew Przybylski

77

Professor Dorothy Bishop

slide-78
SLIDE 78

Amy Orben Department of Experimental Psychology, University of Oxford

ABCD Workshop, Portland @OrbenAmy

Conducting rigorous research on large open-access developmental datasets

78