SLIDE 1

Definition Treatment Effects Statistical Inference

Statistical Foundations I

Department of Government London School of Economics and Political Science

SLIDE 2

1 What is an experiment?
2 Treatment Effects
3 Statistical Inference

SLIDE 4

Principles of causality

1 Correlation/Relationship
2 Nonconfounding
3 Direction (“temporal precedence”)
4 Mechanism
5 Appropriate level of analysis

SLIDE 10

Experiments are different

1 Draw causal inferences through design
2 Randomization breaks selection bias and fixes temporal precedence
3 We don’t need to “control” for anything
4 We see “causal effects” in the comparison of experimental groups

SLIDE 13

Definitions I

A randomized experiment, or randomized controlled trial (RCT), is:

The observation of units after, and possibly before, a randomly assigned intervention in a controlled setting, which tests one or more precise causal expectations

If we manipulate the thing we want to know the effect of (X), and control (i.e., hold constant) everything we do not want to know the effect of (Z), the only thing that can affect the outcome (Y) is X.

SLIDE 15

Definitions II

Unit: A physical object at a particular point in time

SLIDE 16

Definitions II

Treatment: An intervention whose effect(s) we wish to assess relative to some other (non-)intervention

SLIDE 17

Definitions II

Outcome: The variable we are trying to explain

SLIDE 18

Definitions II

Potential outcomes: The outcome value for each unit that we would observe if that unit received each treatment
There are multiple potential outcomes for each unit, but we only observe one of them

SLIDE 19

Definitions II

Causal effect: The comparison between the unit-level potential outcomes under each intervention
This is what we want to know!

SLIDE 21

Example

Unit: Schools in Kenya

SLIDE 22

Example

Outcome: Student learning

SLIDE 23

Example

Treatment: An additional teacher per class, reducing effective class size

SLIDE 24

Example

Potential outcomes:

1 Knowledge in a “large” class
2 Knowledge in a “small” class

SLIDE 25

Example

Causal effect: Difference in knowledge between the two conditions

SLIDE 26

Units

Units can be almost anything
Common units in experimental designs:
  Individual people
  Sites (schools, classes, surgeries)
  Areas (districts, states)
Units are period-specific
  Randomization can occur over time

SLIDE 27

Outcomes

Experiments can have many outcome concepts/measures
Quite common to think about just one at a time
Outcomes can be anything that:
  Is observable/measurable
  Can be measured at the level of randomization or lower

SLIDE 28

Treatments

Synonyms: manipulation, intervention, factor, condition, cell
Treatments are operationalizations of independent variables in a causal theory
A set of treatments generates observable variation in X

SLIDE 29

Developing Treatments

From theory, we derive testable hypotheses
  Hypotheses are expectations about differences in outcomes across levels of a putatively causal variable
  In an experiment, a hypothesis must be testable by an ATE
The experimental manipulations induce variation in the causal variable that enables tests of the hypotheses

SLIDE 30

Example: Framing and Attention¹

Theory: Presentation of information affects politicians’ attention
Hypothesis: Information framed as a conflict draws more attention from political elites than information not framed as a conflict.
Manipulation:
  Control group: Presentation of headline information
  Treatment group: Same information presented as conflict
Outcome: How likely legislators are to read the full article

¹ Walgrave, Sevenans, Van Camp, Loewen (2017), “What Draws Politicians’ Attention? An Experimental Study of Issue Framing and its Effect on Individual Political Elites”

SLIDE 31

Ex.: Presence/Absence

Theory: Legislators vote in line with constituents’ preferences
Hypothesis: Exposure to a poll of constituent views shifts legislative votes.
Manipulation:
  Control group receives no polling information.
  Treatment group receives a letter containing polling information.
Outcome: How legislators vote on relevant piece of legislation

SLIDE 32

Ex.: Levels/doses

Theory: Legislators vote in line with constituents’ preferences
Hypothesis: Exposure to a poll of constituent views shifts legislative votes.
Manipulation:
  Control group receives no polling information.
  Treatment group 1 receives a letter containing polling information.
  Treatment group 2 receives two letters containing polling information.
  etc.
Outcome: How legislators vote on relevant piece of legislation

SLIDE 33

Ex.: Qualitative variation

Theory: Legislators vote in line with constituents’ preferences
Hypothesis: Exposure to a poll of constituent views shifts legislative votes.
Manipulation:
  Control group receives no polling information.
  Treatment group 1 receives a letter containing polling information suggesting public support.
  Treatment group 2 receives a letter containing polling information suggesting public opposition.
Outcome: How legislators vote on relevant piece of legislation

SLIDE 37

Treatments Test Hypotheses!

Derive experimental design from hypotheses
Experimental “factors” are expressions of hypotheses as randomized groups
What intervention each group receives depends on hypotheses
  presence/absence
  levels/doses
  qualitative variations

SLIDE 38

Questions?

SLIDE 39

Complexities

Experiments can have additional “moving parts”
  Control groups and placebo groups
  Pre-treatment outcome measurement
  Within-subjects design features
  Repeated measures of outcomes
  Cluster randomization
  Sampling from a population
  . . .
None of these are necessary for causal inference

SLIDE 40

1 What is an experiment?
2 Treatment Effects
3 Statistical Inference

SLIDE 41

The Fundamental Problem of Causal Inference!

Units have multiple potential outcomes
We can only observe one of them!
Thus we never know the individual-level causal effect of a treatment for a given unit

SLIDE 42

Two Solutions!

1 Assume units are all “homogeneous” (i.e., identical)
2 Randomly assign units to treatments and compare average outcomes

SLIDE 43

“The Perfect Doctor”

Unit  Y0   Y1
1     ?    ?
2     ?    ?
3     ?    ?
4     ?    ?
5     ?    ?
6     ?    ?
7     ?    ?
8     ?    ?
Mean  ?    ?

SLIDE 44

“The Perfect Doctor”

Unit  Y0   Y1
1     ?    14
2     6    ?
3     4    ?
4     5    ?
5     6    ?
6     6    ?
7     ?    10
8     ?    9
Mean  5.4  11

SLIDE 45

“The Perfect Doctor”

Unit  Y0   Y1
1     13   14
2     6    0
3     4    1
4     5    2
5     6    3
6     6    1
7     8    10
8     8    9
Mean  7    5
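
The completed table can be checked numerically. The R sketch below assumes the values on the slide above, reading unit 2's missing Y1 as 0 (the value implied by the column mean of 5), and takes the treated units to be 1, 7 and 8, the units whose Y1 was revealed on the previous slide. Object names are illustrative and not from the slides.

```r
# Potential outcomes from the completed "Perfect Doctor" table
y0 <- c(13, 6, 4, 5, 6, 6, 8, 8)   # outcome without treatment
y1 <- c(14, 0, 1, 2, 3, 1, 10, 9)  # outcome with treatment

# Units whose Y1 is the observed outcome (assumed from the partially revealed table)
treated <- c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE)

# True average treatment effect: needs both columns, which we never have in practice
ate_true <- mean(y1 - y0)

# Naive comparison: observed treated mean minus observed control mean
ate_naive <- mean(y1[treated]) - mean(y0[!treated])

print(c(true = ate_true, naive = ate_naive))
```

In this example the doctor treats exactly the units that benefit, so the naive comparison of observed means has the opposite sign from the true average effect.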

SLIDE 48

Experimental Inference I

We cannot see individual-level causal effects
We can see average causal effects
  Ex.: Average difference in cancer between those who do and do not smoke
We want to know: $TE_i = Y_{1i} - Y_{0i}$

SLIDE 52

Experimental Inference II

We want to know: $TE_i = Y_{1i} - Y_{0i}$
We can average: $ATE = E[Y_{1i} - Y_{0i}] = E[Y_{1i}] - E[Y_{0i}]$
But we still only see one potential outcome for each unit: $ATE_{naive} = E[Y_{1i} \mid X = 1] - E[Y_{0i} \mid X = 0]$
Is this what we want to know?

SLIDE 55

Experimental Inference III

What we want and what we have:

$ATE = E[Y_{1i}] - E[Y_{0i}]$ (1)
$ATE_{naive} = E[Y_{1i} \mid X = 1] - E[Y_{0i} \mid X = 0]$ (2)

Are the following statements true?

$E[Y_{1i}] = E[Y_{1i} \mid X = 1]$
$E[Y_{0i}] = E[Y_{0i} \mid X = 0]$

Not in general!
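
One way to see why these equalities can fail is to add and subtract the unobserved quantity $E[Y_{0i} \mid X = 1]$ in the naive comparison. This is a standard decomposition offered as a sketch; it is not shown on the slides, and the labels under the braces are the usual textbook names.

```latex
\begin{align*}
ATE_{naive} &= E[Y_{1i} \mid X = 1] - E[Y_{0i} \mid X = 0] \\
  &= \underbrace{E[Y_{1i} \mid X = 1] - E[Y_{0i} \mid X = 1]}_{\text{average effect among the treated}}
   + \underbrace{E[Y_{0i} \mid X = 1] - E[Y_{0i} \mid X = 0]}_{\text{selection bias}}
\end{align*}
```

The second term vanishes only when treated and untreated units have the same average $Y_{0i}$. Even then, the first term equals the ATE only if the average effect among the treated matches the average effect among all units.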

SLIDE 56

Experimental Inference IV

Only true when both of the following hold:

$E[Y_{1i}] = E[Y_{1i} \mid X = 1] = E[Y_{1i} \mid X = 0]$ (3)
$E[Y_{0i}] = E[Y_{0i} \mid X = 1] = E[Y_{0i} \mid X = 0]$ (4)

In that case, potential outcomes are independent of treatment assignment

If true, then:

$ATE_{naive} = E[Y_{1i} \mid X = 1] - E[Y_{0i} \mid X = 0]$ (5)
$= E[Y_{1i}] - E[Y_{0i}]$
$= ATE$

SLIDE 57

Experimental Inference V

This holds in experiments because of randomization, which is a special, physical process of unpredictable sorting²
Units differ only in which side of the coin was up
Experiments randomly reveal potential outcomes
Randomization balances Z in expectation

² Not “random” in the casual, everyday sense of the word
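
To illustrate why randomization makes the naive comparison work, the R sketch below reuses the “Perfect Doctor” potential outcomes as a hypothetical full schedule and enumerates every way of assigning 4 of the 8 units to treatment. The choice of 4 treated units and all object names are assumptions made for illustration.

```r
# Hypothetical full schedule of potential outcomes (the "Perfect Doctor" values)
y0 <- c(13, 6, 4, 5, 6, 6, 8, 8)
y1 <- c(14, 0, 1, 2, 3, 1, 10, 9)
ate_true <- mean(y1) - mean(y0)

# Every way of assigning 4 of the 8 units to treatment (one column per assignment)
assignments <- combn(8, 4)

# Difference in observed group means under each possible assignment
estimates <- apply(assignments, 2, function(treated) {
  mean(y1[treated]) - mean(y0[-treated])
})

# Averaging over all equally likely assignments recovers the true ATE
print(c(true = ate_true, mean_of_estimates = mean(estimates)))
```

Any single assignment can give an estimate far from the truth, but the average over all possible randomizations equals the ATE, which is the sense in which randomization balances potential outcomes in expectation.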

SLIDE 60

Experimental Analysis I

The statistic of interest in an experiment is the (sample) average treatment effect (SATE)
This boils down to a mean difference between two groups:

$\widehat{SATE} = \frac{1}{n_1} \sum_{i=1}^{n_1} Y_{1i} - \frac{1}{n_0} \sum_{i=1}^{n_0} Y_{0i}$ (5)

Experiments do not require “controlling for” anything, if randomization occurred successfully

SLIDE 61

Experimental Data Structures

An experimental data structure looks like:

unit  treatment  outcome
A     0          5
B     0          7
C     0          9
D     0          4
E     1          9
F     1          4
G     1          13
H     1          12
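
As a minimal sketch of how the SATE is computed from data in this shape, the R code below builds the table as a data frame and takes the difference in group means. It assumes that the blank treatment cells for units A to D stand for 0 (control); the data frame name `d` is illustrative.

```r
# Data from the table: units A-D are controls (treatment = 0), E-H are treated (treatment = 1)
d <- data.frame(
  unit      = LETTERS[1:8],
  treatment = c(0, 0, 0, 0, 1, 1, 1, 1),
  outcome   = c(5, 7, 9, 4, 9, 4, 13, 12)
)

# SATE estimate: treated-group mean minus control-group mean
sate <- mean(d$outcome[d$treatment == 1]) - mean(d$outcome[d$treatment == 0])
print(sate)

# The same number is the slope from a regression of outcome on treatment
slope <- unname(coef(lm(outcome ~ treatment, data = d))["treatment"])
print(slope)
```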

SLIDE 62

Questions?

SLIDE 63

1 What is an experiment?
2 Treatment Effects
3 Statistical Inference

SLIDE 64

Experimental Analysis I

We don’t just care about the size of the SATE. We also want to measure it precisely and know whether it is significantly different from zero (i.e., different from no effect/difference)
To know that, we need to estimate the variance of the SATE
The variance is influenced by:
  Total sample size
  Variance of the outcome, Y
  Relative size of each treatment group
  “Advanced” design features

SLIDE 65

Experimental Analysis II

Formula for the variance of the SATE is:

$\widehat{Var}(SATE) = \left( \frac{\widehat{Var}(Y_0)}{n_0} \right) + \left( \frac{\widehat{Var}(Y_1)}{n_1} \right)$

$\widehat{Var}(Y_0)$ is control group variance
$\widehat{Var}(Y_1)$ is treatment group variance

We often express this as the standard error of the estimate:

$\widehat{SE}_{SATE} = \sqrt{\frac{\widehat{Var}(Y_0)}{n_0} + \frac{\widehat{Var}(Y_1)}{n_1}}$
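
The variance and standard error formulas can be applied directly to the eight-unit example data. The R sketch below assumes units A to D are the control group and E to H the treatment group; variable names are illustrative. The final cross-check relies on the `stderr` element that `t.test()` returns in recent versions of R.

```r
y0 <- c(5, 7, 9, 4)     # control-group outcomes (units A-D in the data table)
y1 <- c(9, 4, 13, 12)   # treatment-group outcomes (units E-H)

# Estimated variance of the SATE: each group's sample variance over its size, summed
var_sate <- var(y0) / length(y0) + var(y1) / length(y1)

# Standard error is the square root of that variance
se_sate <- sqrt(var_sate)
print(c(variance = var_sate, se = se_sate))

# Cross-check against the standard error used by Welch's t-test (R's default)
se_welch <- t.test(y1, y0)$stderr
print(se_welch)
```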

SLIDE 66

Intuition about Variance

Bigger sample → smaller SEs
Smaller variance → smaller SEs
Efficient use of sample size:
  When treatment group variances are equal, equal sample sizes are most efficient
  When variances differ, sample units are better allocated to the group with higher variance in Y
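
The allocation point can be illustrated by evaluating the standard error formula over every possible split of a fixed sample. This R sketch uses made-up values (a total of 90 units, group variances of 1 and 4) and a hypothetical helper `se_for_split()`; none of these come from the slides.

```r
# Standard error of the difference in means when n1 of N units are treated
se_for_split <- function(n1, N, var0, var1) {
  sqrt(var0 / (N - n1) + var1 / n1)
}

N  <- 90
n1 <- 1:(N - 1)   # every feasible treatment-group size

# Equal variances: the SE is smallest at an even split
best_equal <- n1[which.min(se_for_split(n1, N, var0 = 1, var1 = 1))]

# Treatment-group variance four times larger: more units should go to treatment
best_unequal <- n1[which.min(se_for_split(n1, N, var0 = 1, var1 = 4))]

print(c(equal = best_equal, unequal = best_unequal))
```

With these illustrative numbers the even split is best under equal variances, while the unequal case favors the noisier group roughly in proportion to its standard deviation.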

SLIDE 67

Statistical Inference

To assess whether an effect differs from zero, we need to know the sampling distribution of the ATE
Two major ways to do this:
  1 Assume a parametric distribution (e.g., t-test)
  2 Randomization inference
In large samples, the latter approaches the former

SLIDE 68

Randomization Inference I

The randomization (or permutation) distribution is an empirical sampling distribution
It conveys the variation we would observe in $\widehat{ATE}$ if a null hypothesis, $H_0: ATE = 0$, were true
If this null hypothesis is true, then treatment had no effect; the variation in permuted ATEs therefore only reflects sampling variance

SLIDE 69

unit  treatment  outcome
A     0          5
B     0          7
C     0          9
D     0          4
E     1          9
F     1          4
G     1          13
H     1          12

$\widehat{ATE} = 3.25$

SLIDE 70

unit  treatment  outcome
A     0          5
B     1          7
C     0          9
D     1          4
E     0          9
F     1          4
G     0          13
H     1          12

$\widehat{ATE} = −2.25$

SLIDE 71

unit  treatment  outcome
A     1          5
B     1          7
C     0          9
D     0          4
E     1          9
F     0          4
G     0          13
H     1          12

$\widehat{ATE} = 0.75$

SLIDE 72

Randomization Distribution

Randomization  ATE
1              3.25
2              −2.25
3              0.75
4              . . .
. . .          . . .

In a two-condition experiment, the number of possible permutations is given by $\binom{n}{n_1}$

SLIDE 73

Randomization Inference II

Randomization inference works as follows:

1 Generate every possible randomization scheme
  Or sample from all possible randomizations
2 Calculate ATE under each randomization
3 The distribution of those estimates is the randomization distribution
4 Its variance is $\widehat{Var}(ATE)$
5 Proportion of values further from 0 than the observed $\widehat{ATE}$ is the p-value for a test of the null hypothesis ($H_0: ATE = 0$)
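
The five steps can be sketched in R by enumerating every assignment exactly, which is feasible for the eight-unit example. The code assumes the blank treatment cells for units A to D mean 0, uses illustrative object names, and counts estimates at least as far from zero as the observed one (using "at least as far" rather than "further" is a choice made here).

```r
# Observed data (treatment = 0 for A-D is assumed; those cells are blank in the table)
treatment <- c(0, 0, 0, 0, 1, 1, 1, 1)
outcome   <- c(5, 7, 9, 4, 9, 4, 13, 12)

# Difference in means for a given set of treated unit indices
diff_means <- function(treated) mean(outcome[treated]) - mean(outcome[-treated])
observed <- diff_means(which(treatment == 1))

# Step 1: every possible assignment of 4 of the 8 units to treatment
schemes <- combn(length(outcome), sum(treatment))

# Steps 2-3: the estimate under each scheme; together they form the randomization distribution
rand_dist <- apply(schemes, 2, diff_means)

# Step 4: variance of the randomization distribution
# (mean squared deviation, since this is a full enumeration rather than a sample)
rand_var <- mean((rand_dist - mean(rand_dist))^2)

# Step 5: two-sided p-value, the share of estimates at least as far from 0 as the observed one
p_two_sided <- mean(abs(rand_dist) >= abs(observed))

print(c(schemes = ncol(schemes), observed = observed, variance = rand_var, p = p_two_sided))
```

Full enumeration gives an exact p-value. With larger samples the number of schemes grows quickly, which is why sampling from the possible randomizations is the usual alternative.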

SLIDE 74

Randomization Distribution

[Figure: histogram of the randomization distribution; x-axis: Permuted ATE, y-axis: Frequency]

SLIDE 75

Randomization Inference in R

# construct data (same values as the experimental data structure table)
d <- data.frame(x = c(0,0,0,0,1,1,1,1),
                y = c(5,7,9,4,9,4,13,12))
# observed ATE estimate: the coefficient on x is the difference in group means
ate <- coef(lm(y ~ x, data = d))[2L]
# calculate ATE from each randomization
set.seed(1) # set random number seed
n <- 10000 # number of randomizations
rd <- replicate(n, coef(lm(d$y ~ sample(d$x)))[2L])
# visualize the randomization distribution
hist(rd)
abline(v = ate, col = "red")
# one-tailed significance test
sum(rd >= ate)/n
# two-tailed significance test
sum(abs(rd) >= abs(ate))/n

SLIDE 76

Parametric Analysis Stata/R

R:

t.test(outcome ~ treatment, data = data)
lm(outcome ~ factor(treatment), data = data)

Stata:

ttest outcome, by(treatment)
reg outcome i.treatment
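
As a runnable variant of the R calls above, the sketch below applies them to the eight-unit example data. The data frame name `d` and the zeros for units A to D are assumptions, and treatment is entered as a numeric 0/1 variable rather than a factor to keep the coefficient name simple.

```r
# Example data; the data frame name `d` and the zeros for units A-D are assumptions
d <- data.frame(
  treatment = c(0, 0, 0, 0, 1, 1, 1, 1),
  outcome   = c(5, 7, 9, 4, 9, 4, 13, 12)
)

# Welch two-sample t-test (R's default does not assume equal variances)
tt <- t.test(outcome ~ treatment, data = d)
print(tt)

# Regression version: the coefficient on treatment is the difference in means
fit <- lm(outcome ~ treatment, data = d)
print(coef(summary(fit)))
```

With the formula interface, `t.test()` reports the group 0 mean minus the group 1 mean, so the sign of its t statistic is the reverse of the regression coefficient's.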

SLIDE 77

Questions?

SLIDE 78