slide-1
SLIDE 1

ECON 626: Applied Microeconomics Lecture 8: Permutations and Bootstraps

Professors: Pamela Jakiela and Owen Ozier

slide-2
SLIDE 2

Part I: Randomization Inference

slide-3
SLIDE 3

Randomization Inference vs Confidence Intervals

  • See Imbens and Rubin, Causal Inference, first chapters.
  • 100 years ago, Fisher was after a “sharp null,” whereas Neyman and Gosset (Student) were concerned with average effects.

UMD Economics 626: Applied Microeconomics Lecture 8: Permutation and Bootstrap, Slide 3

slide-11
SLIDE 11

Randomization Inference

How can we do hypothesis testing without asymptotic approximations? Begin with the idea of a sharp null: $Y_{1i} = Y_{0i}\ \forall i$ (Gerber and Green, p. 62).

  • If $Y_{1i} = Y_{0i}\ \forall i$, then if we observe either, we have seen both.
  • All possible treatment arrangements would yield the same Y values.
  • We could then calculate all possible treatment effect estimates under the sharp null.
  • The distribution of these possible treatment effects allows us to compute p-values: the probability that, under the null, something this large or larger would occur at random. (For the two-sided test, “large” means in absolute value terms.)
  • This extends naturally to the case where treatment assignments are restricted in some way. Recall, for example, the Bruhn and McKenzie (2009) discussion of the many different restrictions that can be used to yield balanced randomization.
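When the sample is tiny, the enumeration described on this slide can be carried out directly. The following is a minimal sketch, assuming complete randomization with a fixed number of treated units and the difference in means as the test statistic; the function name `exact_ri_pvalue` and the six-unit data set are hypothetical, chosen only for illustration.

```python
from itertools import combinations
from statistics import mean


def exact_ri_pvalue(y, t):
    """Two-sided exact p-value for the difference in means under the sharp null."""
    n, n_treated = len(y), sum(t)

    def diff_in_means(treated_idx):
        treated = set(treated_idx)
        y1 = [y[i] for i in range(n) if i in treated]
        y0 = [y[i] for i in range(n) if i not in treated]
        return mean(y1) - mean(y0)

    observed = abs(diff_in_means([i for i in range(n) if t[i] == 1]))
    # Under the sharp null the outcomes do not change with the assignment,
    # so every assignment with the same number of treated units is equally likely.
    draws = [abs(diff_in_means(c)) for c in combinations(range(n), n_treated)]
    # Small tolerance guards against floating-point ties.
    return sum(d >= observed - 1e-12 for d in draws) / len(draws)


# Hypothetical data: six units, three treated.
y = [3.0, 5.0, 4.0, 1.0, 2.0, 0.0]
t = [1, 1, 1, 0, 0, 0]
p = exact_ri_pvalue(y, t)  # 2 of the 20 possible assignments are this extreme, so p = 0.1
```

A restricted randomization (for example, within strata) would replace `combinations` with an enumeration of only the assignments the design allows.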

slide-17
SLIDE 17

Randomization Inference

It is often impractical to enumerate all possible treatment effects. Instead, we sample a large number of them:

  • Regress Y on T. Note the absolute value of the coefficient on T.
  • For a large number of iterations:

◮ Devise an alternative random assignment of treatment. In the simplest, unrestricted case, this means scrambling the relationship between Y and T randomly, preserving the number of treatment and comparison units in T. Call this assignment AlternativeTreatment.

◮ Regress Y on AlternativeTreatment.

◮ Note whether the absolute value of the coefficient on AlternativeTreatment equals or exceeds the absolute value of the original (true) coefficient on T. If so, increment a counter.

  • Divide the counter by the number of iterations. You have a p-value!

Gerber and Green, p.63: “... the calculation of p-values based on an inventory of possible randomizations is called randomization inference.”
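The sampling procedure above can be sketched in a few lines. This is an illustrative version for the simplest, unrestricted case only; it assumes NumPy is available, and the names `ri_pvalue`, `n_iter`, and `seed` are hypothetical. With a single binary regressor and an intercept, the OLS coefficient on T equals the difference in means, so the sketch computes that directly.

```python
import numpy as np


def ri_pvalue(y, t, n_iter=10_000, seed=0):
    """Randomization-inference p-value from randomly sampled reassignments."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    t = np.asarray(t)

    def coef(assign):
        # OLS slope of y on a binary regressor = treated mean minus comparison mean.
        return y[assign == 1].mean() - y[assign == 0].mean()

    observed = abs(coef(t))
    counter = 0
    for _ in range(n_iter):
        # AlternativeTreatment: shuffle T, which keeps the number of treated units fixed.
        alternative_treatment = rng.permutation(t)
        if abs(coef(alternative_treatment)) >= observed:
            counter += 1
    return counter / n_iter
```

A design with blocking or clustering would need the shuffle replaced by a draw that respects the same restrictions as the original randomization.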

slide-24
SLIDE 24

Randomization Inference

Gerber and Green, p.64:

  • “The sampling distribution of the test statistic under the null hypothesis is computed by simulating all possible random assignments. When the number of random assignments is too large to simulate, the sampling distribution may be approximated by a large random sample of possible assignments. p-values are calculated by comparing the observed test statistic to the distribution of test statistics under the null hypothesis.”

NOTE: How large a random sample?

  • What is the standard deviation of a binary outcome with mean 0.05? About 0.22.
  • Standard error (of this estimated p-value) in a sample of size 100 alternative randomizations? About 0.022.
  • In a sample of size 10,000 alternative randomizations? About 0.0022.
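The arithmetic behind the note can be checked directly. Each sampled randomization either is or is not at least as extreme as the observed statistic, so the estimated p-value is a binomial proportion; the sketch below assumes a true p-value of 0.05 and uses a hypothetical helper name.

```python
from math import sqrt


def simulated_pvalue_se(p, n_draws):
    """Standard error of a p-value estimated from n_draws sampled randomizations."""
    return sqrt(p * (1 - p) / n_draws)


sd = sqrt(0.05 * 0.95)                        # about 0.22
se_100 = simulated_pvalue_se(0.05, 100)       # about 0.022
se_10000 = simulated_pvalue_se(0.05, 10_000)  # about 0.0022
```

Cutting the standard error by a factor of ten takes one hundred times as many randomizations.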


slide-25
SLIDE 25

Randomization Inference Confidence Intervals

Major drawback

  • This doesn’t give you a confidence interval automatically.
  • Under assumptions, you can construct them (Gerber and Green, section 3.5):

◮ “The most straightforward method for filling in missing potential outcomes is to assume that the treatment effect $\tau_i$ is the same for all subjects.”

slide-30
SLIDE 30

Randomization Inference Confidence Intervals

Gerber and Green, section 3.5:

  • “For subjects in the control condition, missing $Y_i(1)$ values are imputed by adding the estimated ATE to the observed values of $Y_i(0)$.”
  • “Similarly, for subjects in the treatment condition, missing $Y_i(0)$ values are imputed by subtracting the estimated ATE from the observed values of $Y_i(1)$.”
  • “This approach yields a complete schedule of potential outcomes, which we may then use to simulate all possible random allocations.”
  • “In order to form a 95% confidence interval, we list the estimated ATE from each random allocation in ascending order. The estimate at the 2.5th percentile marks the bottom of the interval, and the estimate at the 97.5th percentile marks the top of the interval.”
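The quoted steps can be sketched as follows. This is an illustration under the constant-treatment-effect assumption stated in the quotes, with random allocations sampled rather than fully enumerated; it assumes NumPy, and the function and argument names are hypothetical.

```python
import numpy as np


def ri_confidence_interval(y, t, n_iter=5_000, level=0.95, seed=0):
    """Percentile interval from simulated allocations, assuming a constant effect."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    t = np.asarray(t)
    ate_hat = y[t == 1].mean() - y[t == 0].mean()

    # Complete schedule of potential outcomes: impute the missing one for each unit.
    y0 = np.where(t == 1, y - ate_hat, y)
    y1 = np.where(t == 1, y, y + ate_hat)

    estimates = np.empty(n_iter)
    for b in range(n_iter):
        alt = rng.permutation(t)
        y_alt = np.where(alt == 1, y1, y0)  # outcomes this allocation would reveal
        estimates[b] = y_alt[alt == 1].mean() - y_alt[alt == 0].mean()

    tail = (1 - level) / 2
    lower, upper = np.quantile(estimates, [tail, 1 - tail])
    return ate_hat, lower, upper
```

The interval is only as credible as the constant-effect assumption used to fill in the missing potential outcomes.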

slide-43
SLIDE 43

Activity 1

Remember the Lady Tasting Tea from the first class? Suppose she gets a certain number right. For example:

  • Eight cups, four of which had milk added first.
  • After tasting, suppose she correctly says there are four cups which had milk added first, but while she correctly identifies three cups, she gets one wrong. What is the probability of being that correlated with the truth, or better (in absolute value terms)?

$$\binom{4}{3}\binom{4}{1} + \binom{4}{1}\binom{4}{3} + \binom{4}{4}\binom{4}{0} + \binom{4}{0}\binom{4}{4} = 4 \cdot 4 + 4 \cdot 4 + 1 \cdot 1 + 1 \cdot 1 = 16 + 16 + 1 + 1 = 34$$

out of $\binom{8}{4} = 70$. The p-value is just under 50 percent.

Or what if it were ten cups, and she got 4 out of 5? This becomes unwieldy to calculate exactly. Activity: randomly sample.
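One way to approach the activity is to compare an exact count with a random sample of guesses. The sketch below assumes that exactly half the cups had milk added first and that the lady always names that many cups, so "as correlated or better in absolute value" means at least as many right, or at least as many wrong, as observed. All function names are hypothetical.

```python
import random
from math import comb


def exact_tea_pvalue(n_cups, n_milk, n_correct):
    """Two-sided exact p-value; assumes n_milk is half of n_cups."""
    extreme = max(n_correct, n_milk - n_correct)
    ways = sum(
        comb(n_milk, k) * comb(n_cups - n_milk, n_milk - k)
        for k in range(n_milk + 1)
        if max(k, n_milk - k) >= extreme
    )
    return ways / comb(n_cups, n_milk)


def simulated_tea_pvalue(n_cups, n_milk, n_correct, n_draws=20_000, seed=0):
    """Same p-value, approximated by sampling random guesses."""
    rng = random.Random(seed)
    truth = set(range(n_milk))  # label the milk-first cups 0 .. n_milk - 1
    extreme = max(n_correct, n_milk - n_correct)
    hits = 0
    for _ in range(n_draws):
        guess = rng.sample(range(n_cups), n_milk)
        k = len(truth.intersection(guess))
        if max(k, n_milk - k) >= extreme:
            hits += 1
    return hits / n_draws


p_eight = exact_tea_pvalue(8, 4, 3)     # 34 / 70
p_ten = exact_tea_pvalue(10, 5, 4)      # 52 / 252
p_ten_sim = simulated_tea_pvalue(10, 5, 4)
```

The simulated value should land close to the exact one, with the sampling error shrinking as `n_draws` grows.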


slide-44
SLIDE 44

Part IIa: Bootstrap

slide-45
SLIDE 45

Bootstrap basics

  • See Angrist and Pischke, pp. 300-301 (Bootstrap).
  • Sampling $\{Y_i, X_i\}$ with replacement: “pairs bootstrap” or “nonparametric bootstrap.”
  • Keeping $X_i$ fixed, sampling $\hat{e}_i$ with replacement, constructing new outcomes $Y_i$ treating $X_i$ as fixed using the original $\hat{\beta}$: one kind of “parametric bootstrap.”
  • Keeping $X_i$ fixed, constructing new outcomes $Y_i$ treating $X_i$ as fixed using the original $\hat{\beta}$, but randomly flipping the sign of $\hat{e}_i$, preserving relationships between $X_i$ and the variance of the residual: “wild bootstrap.”
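The three resampling schemes differ only in how each bootstrap sample is built. The sketch below illustrates them for a bivariate regression with an intercept; it assumes NumPy, and the function names and the `kind` labels ("pairs", "residual", "wild") are hypothetical shorthand for the three bullets above.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def bootstrap_slopes(x, y, kind, n_boot=1_000, seed=0):
    """Bootstrap distribution of the slope under one of three resampling schemes."""
    rng = np.random.default_rng(seed)
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    beta_hat = ols(X, y)
    resid = y - X @ beta_hat
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        if kind == "pairs":
            # Resample (Y_i, X_i) rows together, with replacement.
            idx = rng.integers(0, n, size=n)
            slopes[b] = ols(X[idx], y[idx])[1]
        elif kind == "residual":
            # Keep X fixed; resample residuals with replacement.
            e_star = rng.choice(resid, size=n, replace=True)
            slopes[b] = ols(X, X @ beta_hat + e_star)[1]
        elif kind == "wild":
            # Keep X fixed; each residual stays with its own X_i but its sign may flip.
            signs = rng.choice([-1.0, 1.0], size=n)
            slopes[b] = ols(X, X @ beta_hat + signs * resid)[1]
        else:
            raise ValueError(f"unknown bootstrap kind: {kind}")
    return slopes
```

The standard deviation of the returned slopes is one bootstrap estimate of the standard error; only the wild scheme keeps each residual attached to its own $X_i$, which is why it preserves heteroskedasticity.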


slide-46
SLIDE 46

Part IIb: Few Clusters; Wild Cluster Bootstrap

slide-48
SLIDE 48

What is the problem with having too few clusters?

[Figure: CDF of p-values (vertical axis, 0 to 1) plotted against p-values (horizontal axis, 0 to 1). Series: Ideal; Six uneven clusters.]

slide-53
SLIDE 53

How will “bootstrapping” solve the cluster problem?

Bootstrapping is drawing (often with replacement) from some aspect of the data to quantify variability of a statistic.

We cluster standard errors because we are concerned that the error may be heteroskedastic and correlated within clusters. So we could not sensibly use a bootstrapping procedure that ignored covariates or the cluster structure.

Cameron, Gelbach, and Miller (2008) and Cameron and Miller (2015) discuss a procedure that respects covariate structure (“wild”) and cluster structure (“cluster”) while drawing alternative residuals (“bootstrap”).

slide-55
SLIDE 55

The Wild Cluster Bootstrap

Cameron, Gelbach, and Miller procedure goes as follows (Cameron and Miller 2015, Section VI.C.2):

  • “First, estimate the main model, imposing (forcing) the null hypothesis that you wish to test... For example, for test of statistical significance of a single variable regress $y_{ig}$ on all components of $x_{ig}$ except the variable that has regressor with coefficient zero under the null hypothesis.”
  • “Form the residual $\tilde{u}_{ig} = y_{ig} - x_{ig}'\tilde{\beta}_{H_0}$”

slide-59
SLIDE 59

The Wild Cluster Bootstrap

Cameron, Gelbach, and Miller procedure goes as follows (Cameron and Miller 2015, Section VI.C.2):

  • In each resampling:

◮ “Randomly assign cluster g the weight $d_g = -1$ with probability 0.5 and the weight $d_g = 1$ with probability 0.5. All observations in cluster g get the same value of the weight.” (Rademacher weights)

◮ “Generate new pseudo-residuals $u^*_{ig} = d_g \times \tilde{u}_{ig}$, and hence new outcome variables $y^*_{ig} = x_{ig}'\tilde{\beta}_{H_0} + u^*_{ig}$. Then proceed with step 2 as before, regressing $y^*_{ig}$ on $x_{ig}$ [not imposing the null], and calculate $w^*$ [the t-statistic from this regression, with clustered standard errors.]”

  • The p-value for the test based on the original sample statistic $w$ equals the proportion of bootstrap replications $b$ in which $|w^*_b| > |w|$.
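The procedure can be sketched for a test that a single coefficient is zero. This is an illustration rather than the authors' implementation: it assumes NumPy, uses the plain sandwich cluster-robust variance with no finite-sample correction (a constant correction factor would scale $w$ and every $w^*$ alike, so the comparison is unaffected), and all function and argument names are hypothetical.

```python
import numpy as np


def cluster_t(X, y, groups, j):
    """t-statistic for coefficient j using a cluster-robust sandwich variance."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        score = X[groups == g].T @ u[groups == g]  # sum of x_ig * u_ig within cluster g
        meat += np.outer(score, score)
    V = XtX_inv @ meat @ XtX_inv
    return beta[j] / np.sqrt(V[j, j])


def wild_cluster_bootstrap_p(X, y, groups, j, n_boot=999, seed=0):
    """Wild cluster bootstrap p-value for H0: coefficient j equals zero."""
    rng = np.random.default_rng(seed)
    w = cluster_t(X, y, groups, j)

    # Impose the null: drop regressor j, then keep fitted values and residuals.
    X0 = np.delete(X, j, axis=1)
    beta0 = np.linalg.lstsq(X0, y, rcond=None)[0]
    fitted0 = X0 @ beta0
    u0 = y - fitted0

    ids, inverse = np.unique(groups, return_inverse=True)
    exceed = 0
    for _ in range(n_boot):
        d = rng.choice([-1.0, 1.0], size=len(ids))  # one Rademacher weight per cluster
        y_star = fitted0 + d[inverse] * u0          # same weight for every row in a cluster
        if abs(cluster_t(X, y_star, groups, j)) > abs(w):
            exceed += 1
    return exceed / n_boot
```

Here `X` is expected to include a column of ones, `groups` holds one cluster label per row, and `j` indexes the column being tested.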

slide-61
SLIDE 61

What happens with six clusters

[Figure: CDF of p-values (vertical axis, 0 to 1) plotted against p-values (horizontal axis, 0 to 1). Series: Ideal; Six uneven clusters; WCB Rademacher.]

slide-63
SLIDE 63

So-called Rademacher and Webb weights

[Figure: two histograms of the bootstrap weight distributions, with percent on the vertical axis and weight values from −2 to 2 on the horizontal axis. Panels: Rademacher weights ($d_r$); Webb weights ($d_w$).]
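For reference, the two weight distributions can be written out explicitly. The sketch below assumes the standard definitions: Rademacher weights take the values −1 and 1 with probability 1/2 each, and Webb's six-point weights take the values ±√(1/2), ±1, ±√(3/2) with probability 1/6 each. The slide itself shows only the histograms, and the names used here are hypothetical.

```python
import numpy as np

RADEMACHER = np.array([-1.0, 1.0])
WEBB = np.array([-np.sqrt(1.5), -1.0, -np.sqrt(0.5), np.sqrt(0.5), 1.0, np.sqrt(1.5)])


def draw_cluster_weights(n_clusters, kind="rademacher", seed=None):
    """Draw one weight per cluster, each support point being equally likely."""
    rng = np.random.default_rng(seed)
    if kind == "rademacher":
        support = RADEMACHER
    elif kind == "webb":
        support = WEBB
    else:
        raise ValueError(f"unknown weight kind: {kind}")
    return rng.choice(support, size=n_clusters)


# Both distributions have mean 0 and variance 1. With G clusters, Rademacher
# weights allow only 2**G distinct weight vectors (64 when G = 6), whereas the
# six-point weights allow 6**G, giving a finer set of possible p-values.
n_rademacher_patterns = 2 ** 6
n_webb_patterns = 6 ** 6
```

The small number of distinct Rademacher patterns is one reason a six-point alternative is considered when there are very few clusters.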

slide-65
SLIDE 65

What happens with six clusters

[Figure: CDF of p-values (vertical axis, 0 to 1) plotted against p-values (horizontal axis, 0 to 1). Series: Ideal; Six uneven clusters; WCB Rademacher; WCB Webb.]


slide-66
SLIDE 66

What happens with eight clusters

[Figure: CDF of p-values (vertical axis, 0 to 1) plotted against p-values (horizontal axis, 0 to 1). Series: Ideal; Eight uneven clusters; WCB Rademacher; WCB Webb.]


slide-67
SLIDE 67

What happens with six clusters (zoom)

[Figure: zoomed view of the CDF of p-values (vertical axis, up to 0.5) plotted against p-values (horizontal axis, up to 0.2). Series: Ideal; Six uneven clusters; WCB Rademacher; WCB Webb.]


slide-68
SLIDE 68

What happens with eight clusters (zoom)

[Figure: zoomed view of the CDF of p-values (vertical axis, up to 0.5) plotted against p-values (horizontal axis, up to 0.2). Series: Ideal; Eight uneven clusters; WCB Rademacher; WCB Webb.]


slide-69
SLIDE 69

Activity 2
