Political Science 209 - Fall 2018: Observational Studies

SLIDE 1

Political Science 209 - Fall 2018

Observational Studies

Florian Hollenbach 24th September 2018

SLIDE 2

Review

What is the fundamental problem of causal inference?

SLIDE 3

Review

What about randomized control trials allows us to credibly estimate a causal effect?

SLIDE 4

Get out the Vote Study

What can induce citizens to vote?

SLIDE 6

What was the experiment?

Households were randomly assigned to one of four conditions; the three treatment groups received a letter:

  • 1. Naming and Shaming: your neighbors will know
  • 2. Civic Duty
  • 3. Hawthorne Effect Message
  • 4. Control (no letter)
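
Because households were assigned to conditions at random, the effect of each letter can be estimated by comparing average turnout in that group with average turnout in the control group. The following is a minimal Python sketch of that comparison. The records are invented for illustration and are not the study's data, and `turnout_by_group` is a hypothetical helper name.

```python
from statistics import mean

# Hypothetical records: (condition, voted) with voted = 1 for yes, 0 for no.
# These values are made up; they are NOT the data from the actual study.
records = [
    ("Control", 0), ("Control", 1), ("Control", 0), ("Control", 0),
    ("Civic Duty", 1), ("Civic Duty", 0), ("Civic Duty", 0), ("Civic Duty", 1),
    ("Hawthorne", 1), ("Hawthorne", 0), ("Hawthorne", 1), ("Hawthorne", 0),
    ("Naming and Shaming", 1), ("Naming and Shaming", 1),
    ("Naming and Shaming", 0), ("Naming and Shaming", 1),
]


def turnout_by_group(rows):
    """Average turnout (share who voted) within each condition."""
    groups = {}
    for condition, voted in rows:
        groups.setdefault(condition, []).append(voted)
    return {condition: mean(votes) for condition, votes in groups.items()}


turnout = turnout_by_group(records)

# Difference in means: each treatment group compared with the control group.
effects = {
    condition: turnout[condition] - turnout["Control"]
    for condition in turnout
    if condition != "Control"
}

for condition, effect in effects.items():
    print(f"{condition}: estimated effect on turnout = {effect:+.2f}")
```

With real data the same comparison would be run on the full set of households, and the estimates would carry sampling uncertainty that this sketch ignores.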

SLIDE 7

Let’s go to RStudio quickly

SLIDE 9

Observational Studies and Causal Inference

What is the main problem for observational studies?

  • Confounders: variables that are associated with both treatment and outcome

SLIDE 12

What is the Problem with Confounders?

  • If pre-treatment characteristics are associated with treatment and outcome, we can’t disentangle the causal effect from confounding bias
  • Selection into treatment example: Maybe the minimum wage was increased because unemployment was particularly low in NJ, but not in PA

SLIDE 14

Examples of Confounding

  • Are incumbents more likely to win elections? Yes, but...
  • Incumbents receive more campaign contributions
  • Incumbents have more staff

SLIDE 16

Examples of Confounding

  • Does higher income lead countries to democratize?
  • Higher income countries have more educated populations

SLIDE 18

What can we do about confounding in observational studies?

  • Make treatment and control groups as similar to each other as possible
  • Especially on variables that might matter for treatment status and outcome
  • Analyze subsets or use statistical control, so that we compare treated and control units that have the same value on the confounder
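
To make the subsetting idea concrete, here is a small Python sketch that compares a naive difference in means with the difference computed separately within each level of a confounder. The data are invented so that the true effect is 1 in both subsets, and `diff_in_means` is a hypothetical helper name; treat this as an illustration under those assumptions, not as a general adjustment method.

```python
from statistics import mean

# Hypothetical units: (treated, confounder level, outcome). Made-up numbers.
# Units with a "high" confounder value are more likely to be treated AND
# have higher outcomes regardless of treatment.
units = [
    (1, "low", 3), (0, "low", 2), (0, "low", 2), (0, "low", 2),
    (1, "high", 7), (1, "high", 7), (1, "high", 7), (0, "high", 6),
]


def diff_in_means(rows):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated = [y for t, _, y in rows if t == 1]
    control = [y for t, _, y in rows if t == 0]
    return mean(treated) - mean(control)


# Naive comparison ignores the confounder and mixes both subsets.
naive = diff_in_means(units)

# Comparison within subsets: treated and control units share the same
# confounder value, so the confounder cannot drive the difference.
within = {
    level: diff_in_means([u for u in units if u[1] == level])
    for level in ("low", "high")
}

print("naive difference:", naive)
print("within-subset differences:", within)
```

In this constructed example the naive difference is three times the within-subset difference, because treated units are concentrated in the high-outcome subset.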

SLIDE 20

Another problem with observational studies:

  • Reverse causality
  • Example: Does economic growth cause democratization, or does democratization cause growth?

Why do experiments not suffer from the threat of reverse causality?

SLIDE 21

Observational studies

Difference-in-Differences Design

SLIDE 22

Difference-in-Differences Design

  • Compare trends before and after the treatment across the same units

  • Takes initial conditions into account

SLIDE 23

Difference-in-Differences Design

  • Need data measured for both treatment and control at two different time periods: before and after treatment
  • Total difference between P2 and S2 cannot be attributed to treatment. Why?

SLIDE 25

Difference-in-Differences Design

What might be a necessary condition for Diff-in-Diff to work?

Parallel trends assumption
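
The difference-in-differences estimate subtracts the control group's change over time from the treated group's change over time. Under the parallel trends assumption, the control group's change stands in for what would have happened to the treated group without treatment. A minimal Python sketch, using hypothetical group averages (not real data) and a hypothetical function name `diff_in_diff`:

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the control group."""
    treated_change = treated_after - treated_before
    control_change = control_after - control_before
    return treated_change - control_change


# Hypothetical average outcomes (for example, employment per restaurant).
treated_before, treated_after = 20, 22   # state that raised the minimum wage
control_before, control_after = 24, 23   # neighboring state that did not

# Comparing only the post-treatment period ignores initial conditions:
# the two groups already differed before the treatment.
after_only = treated_after - control_after

did = diff_in_diff(treated_before, treated_after, control_before, control_after)

print("after-only difference:", after_only)
print("difference-in-differences:", did)
```

Here the after-only comparison is negative while the difference-in-differences estimate is positive, which is why the total post-treatment gap between the groups cannot be read as the treatment effect.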

SLIDE 26

Difference-in-Differences Design

SLIDE 27

Describing numeric variables:

  • Mean
  • Median
  • Quantiles

SLIDE 28

Quantiles

  • Splitting observations into equally sized groups, e.g., quartiles or quintiles
  • 75th percentile is the threshold under which 75% of observations lie
  • What percentile is the median?

SLIDE 30

Describing the spread of numeric variables:

  • IQR: difference between the 75th percentile and the 25th percentile
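
As a small illustration of quartiles and the IQR, the Python sketch below uses the standard library's `statistics.quantiles`. The data are made up, and the choice of `method="inclusive"` is an assumption; different software uses slightly different interpolation rules, so quartiles can differ a little across tools.

```python
from statistics import median, quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up values

# n=4 asks for the three cut points that split the data into four groups.
# method="inclusive" treats the smallest and largest values as the 0th and
# 100th percentiles.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# The interquartile range is the distance between the 75th and 25th percentiles.
iqr = q3 - q1

print("25th percentile:", q1)
print("50th percentile:", q2, "(median:", median(data), ")")
print("75th percentile:", q3)
print("IQR:", iqr)
```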

SLIDE 32

Describing the spread of numeric variables:

Standard Deviation

$$ SD = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2} $$
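
The formula can be computed step by step: take each observation's deviation from the mean, square it, average the squares, and take the square root. The Python sketch below does this for made-up data, dividing by n as in the formula on the slide; `standard_deviation` is a hypothetical function name. Note that some software, including R's `sd()`, divides by n - 1 instead, so results can differ slightly.

```python
import math
from statistics import pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values


def standard_deviation(values):
    """Square root of the average squared deviation from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / n)


sd = standard_deviation(data)

print("standard deviation:", sd)
# The standard library's pstdev also divides by n, so it should agree.
print("statistics.pstdev:", pstdev(data))
```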

SLIDE 33

Standard Deviation

SLIDE 34

Describing single Variables

  • Barplots can be used to summarize factor (categorical) variables
  • The height of each bar is the proportion of observations in that category
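
The bar heights are just category proportions. A minimal Python sketch with made-up responses shows the computation and draws a rough text version of the bars; the category labels are hypothetical.

```python
from collections import Counter

responses = ["voted", "did not vote", "voted", "voted"]  # made-up data

# Count observations per category, then divide by the total to get the
# proportion that a barplot would show as each bar's height.
counts = Counter(responses)
proportions = {
    category: count / len(responses) for category, count in counts.items()
}

for category, share in proportions.items():
    bar = "#" * round(share * 20)  # crude text bar, 20 characters = 100%
    print(f"{category:>12} {bar} {share:.2f}")
```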

SLIDE 35

Barplots

SLIDE 36

Histograms

  • Histograms look similar to barplots
  • Used for numeric variables
  • Numeric variables are binned into groups

SLIDE 39

Histograms

  • Each bar is for one bin
  • Height of each bar is the density of the bin
  • Important: Height is the share of observations in the bin divided by the bin width
  • Unit of the vertical axis (y-axis) is interpreted as percentage per horizontal (x-axis) unit

SLIDE 40

Histograms

  • Area of each bar is the share of observations that fall into that bin
  • The areas of all bars sum to one
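
The relationship between height, width, and area is easiest to see with bins of unequal width. In the Python sketch below the ages and bin edges are made up, and `histogram_density` is a hypothetical helper; each bar's height is the bin's share of observations divided by the bin width, so height times width recovers the share.

```python
def histogram_density(values, edges):
    """Return (left, right, density) per bin; each bin covers [left, right)."""
    n = len(values)
    bins = []
    for left, right in zip(edges[:-1], edges[1:]):
        share = sum(left <= v < right for v in values) / n
        bins.append((left, right, share / (right - left)))
    return bins


ages = [22, 25, 27, 31, 34, 38, 41, 45, 52, 68]  # made-up ages
edges = [20, 30, 40, 50, 70]                      # the last bin is twice as wide

bins = histogram_density(ages, edges)

# Area of a bar = width * height = share of observations in that bin.
areas = [(right - left) * density for left, right, density in bins]

for (left, right, density), area in zip(bins, areas):
    print(f"[{left}, {right}): height = {density:.3f}, area = {area:.2f}")
print("total area:", round(sum(areas), 10))
```

The last two bins each hold 20% of the observations, but the wider bin is drawn half as tall, which is why heights alone should not be read as shares.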

SLIDE 41

Histograms

[Figure: histogram titled "Distribution of Subjects' Age"; x-axis: Age (20 to 70); y-axis: Density (0.000 to 0.035)]

SLIDE 42

Boxplots

  • Boxplots also display the distribution of a numeric variable
  • Boxplots show the median, quartiles, and IQR

SLIDE 43

Boxplots

SLIDE 44

Boxplots can show how two variables covary

[Figure: boxplots titled "Income by Treatment Status"; axis: Income (50000 to 300000)]

SLIDE 46

Survey Sampling

  • A sample is a small share of the population that we are interested in
  • How do we draw samples in such a way that polls accurately reflect what is going to happen?
  • How do we construct samples that will represent the population?

SLIDE 48

Survey Sampling

  • Example: We want to know the voting intentions of Texans (or Americans)

  • We can hardly ask all eligible voters about their intention
  • We take a sample

SLIDE 49

Survey Sampling

  • The size of the sample is less important than its composition

SLIDE 51

Literary Digest Sample

  • Mail questionnaire to 10 million people
  • Addresses came from phone books and club memberships
  • Problems?
  • Biased sample
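
A small simulation can illustrate why a very large sample from a biased frame can do worse than a small random sample. The Python sketch below builds a hypothetical population in which people listed in the phone book hold different views from everyone else; all numbers are invented and do not describe the actual Literary Digest poll.

```python
import random

# Hypothetical population of 10,000 voters: (in_phone_book, supports_candidate).
# The numbers are invented so that listed people differ from the rest.
population = (
    [(True, 1)] * 1200 + [(True, 0)] * 1800      # 3,000 listed, 40% support
    + [(False, 1)] * 4900 + [(False, 0)] * 2100  # 7,000 unlisted, 70% support
)


def support_share(people):
    """Share of people who support the candidate."""
    return sum(vote for _, vote in people) / len(people)


true_share = support_share(population)

# Biased frame: survey everyone in the phone book (3,000 respondents).
frame = [person for person in population if person[0]]
frame_share = support_share(frame)

# Simple random sample of only 1,000 people from the whole population.
rng = random.Random(209)  # fixed seed so the sketch is reproducible
srs_share = support_share(rng.sample(population, 1000))

print(f"true share:            {true_share:.2f}")
print(f"phone-book frame:      {frame_share:.2f}")
print(f"simple random sample:  {srs_share:.2f}")
```

The frame-based estimate is far from the true share no matter how many listed people respond, while the smaller random sample lands close to it.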

SLIDE 52

Quota Sampling

  • Sample certain groups until quota is filled
  • Filling quotas on observed characteristics does not mean the sample is representative on unobserved ones

SLIDE 53

Simple Random Sampling

  • Think of all voters sitting in a box; the survey firm randomly draws voters
  • Random draws without replacement give us unbiased estimates of population characteristics
  • Everybody has the same chance of being in the sample

SLIDE 54

Simple Random Sampling

  • A pre-determined number of units is randomly selected from the population
  • The sample will be representative of the population on observed and unobserved characteristics

SLIDE 55

Simple Random Sampling

  • Not every single sample will be exactly representative
  • If we were to take a lot of random samples (say 1000 samples of 1000 respondents), on average the samples would be representative
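
The thought experiment of 1000 samples of 1000 respondents can be run directly. The Python sketch below draws repeated simple random samples without replacement from a hypothetical population in which exactly 55% support a candidate; the population and the seed are assumptions made for illustration.

```python
import random

rng = random.Random(2018)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 voters, exactly 55% support the candidate.
population = [1] * 5500 + [0] * 4500
true_share = sum(population) / len(population)

# Draw 1000 simple random samples of 1000 respondents each (without
# replacement within a sample) and record each sample's estimated share.
estimates = []
for _ in range(1000):
    sample = rng.sample(population, 1000)
    estimates.append(sum(sample) / len(sample))

average_estimate = sum(estimates) / len(estimates)

print(f"true share:               {true_share:.3f}")
print(f"average of the estimates: {average_estimate:.3f}")
print(f"smallest single estimate: {min(estimates):.3f}")
print(f"largest single estimate:  {max(estimates):.3f}")
```

Individual samples miss the true share in both directions, but their average sits very close to it, which is the sense in which random sampling is representative on average.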

SLIDE 56

Simple Random Sampling

  • Each single sample can be off and different
  • Polls are associated with uncertainty

SLIDE 58

Random Sampling is hard

  • How to create sampling frame?
  • Random digit dialing? Walking to random houses?
  • Multi-stage cluster sampling

SLIDE 59

Non-response bias

  • Unit non-response bias:

SLIDE 60

Non-response bias

  • Item non-response bias: What was the last crime you committed?
  • Sensitive questions: non-response, social desirability bias (e.g., turnout, racial prejudice, corruption)

SLIDE 61

Why could this be a problem in the Afghanistan example?
