Introduction to Impact Evaluation of RBF Programs, by Damien de Walque (PowerPoint presentation)

SLIDE 1

Introduction to Impact Evaluation of RBF Programs

Damien de Walque | Gil Shapira

SLIDE 2

RBF for Health Impact Evaluation

  • Build evidence on what works, what doesn't, and why
  • Characteristics of RBF for Health impact evaluations:
  • Built into program operations
  • Government ownership
  • Feedback loop for evidence-based decision making
  • Valid treatment and control groups
SLIDE 3

Policy questions we want to answer: Does RBF work?

  • What is the impact of RBF on:
  • utilization of services?
  • health outcomes?
  • Does it affect different populations differently?
  • Are there unintended consequences of RBF?
  • Is RBF cost-effective relative to other interventions?
SLIDE 4

Policy questions we want to answer: How can RBF work better?

  • What components of an RBF "package" matter most: performance incentives? Increased financing? Autonomy? Improved supervision?
  • What are the right incentives?
  • Who should be incentivized? Providers? Households? Communities?
  • How can reporting errors and corruption be reduced?
  • What are the optimal provider capabilities?
  • What are the key organizational building blocks to make RBF work?
SLIDE 5

An Example: The Impact Evaluation of the Rwanda Performance-Based Financing Project

SLIDE 6

Rwanda Performance-Based Financing project (Basinga et al. 2011)

  • Improved prenatal care quality (+0.16 std. dev.), increased utilization of skilled delivery (+8.1 pp) and child preventive care services (+11 pp)
  • No impact on timely prenatal care
  • Greatest effect on services that are under the provider's control and had the highest payment rates
  • Financial performance incentives can improve both use and quality of health services.
  • An equal amount of financial resources without the incentives would not have achieved the same gain in outcomes.
SLIDE 7

Impact of Rwanda PBF on Child Preventive Care Utilization

[Chart] Visit by child 0-23 months in last 4 weeks (=1): reported values 21%, 24%, 33%, 23%.
SLIDE 8

Impact of Rwanda PBF on Institutional delivery

[Chart] Delivery in-facility for birth in last 18 months: reported values 35%, 36%, 56%, 50%.
SLIDE 9

Rwanda Performance-Based Financing project (Gertler & Vermeersch forthcoming)

  • No impact on family planning
  • Large impacts on child health outcomes (weight 0-11 months, height 24-47 months)
  • Impacts are larger for better-skilled providers
  • PBF worked through incentives, not so much through increased knowledge
SLIDE 10

Measuring Impact

Impact Evaluation Methods for Policy Makers

Slides by Sebastian Martinez, Christel Vermeersch and Paul Gertler. We thank Patrick Premand and Martin Ruegenberg for their contributions. The content of this presentation reflects the views of the authors and not necessarily those of the World Bank.
SLIDE 11

Impact Evaluation

  • Logical framework: how the program works in theory
  • Measuring impact: identification strategy
  • Data
  • Operational plan
  • Resources
SLIDE 12

Causal Inference

  • Counterfactuals
  • False counterfactuals: Before & After (Pre & Post); Enrolled & Not Enrolled (Apples & Oranges)
SLIDE 13

IE Methods Toolbox

  • Randomized Assignment
  • Randomized Promotion
  • Discontinuity Design (RDD)
  • Difference-in-Differences (Diff-in-Diff)
  • Matching (P-Score matching)
SLIDE 14

Causal Inference

  • Counterfactuals
  • False counterfactuals: Before & After (Pre & Post); Enrolled & Not Enrolled (Apples & Oranges)
SLIDE 15

Our Objective

Estimate the causal effect (impact) of an intervention (P) on an outcome (Y).

(P) = Program or Treatment
(Y) = Indicator, Measure of Success

Example: What is the effect of a Cash Transfer Program (P) on Household Consumption (Y)?
SLIDE 16

Causal Inference

What is the impact of (P) on (Y)?

α = (Y | P=1) - (Y | P=0)

Can we all go home?
SLIDE 17

Problem of Missing Data

For a program beneficiary:

α= (Y | P=1)-(Y | P=0)

we observe

(Y | P=1): Household Consumption (Y) with a cash transfer program (P=1)

but we do not observe

(Y | P=0): Household Consumption (Y) without a cash transfer program (P=0)
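The missing-data problem can be made concrete with a small potential-outcomes sketch. The household numbers below are hypothetical, not Progresa data: every household has two potential outcomes, but only one is ever observed.

```python
# Hypothetical potential outcomes: y0 = consumption without the transfer,
# y1 = consumption with it. Only one of the two is ever observed.
households = [
    {"y0": 200.0, "y1": 230.0, "treated": True},
    {"y0": 250.0, "y1": 270.0, "treated": False},
    {"y0": 180.0, "y1": 210.0, "treated": True},
]

for h in households:
    observed = h["y1"] if h["treated"] else h["y0"]
    counterfactual = h["y0"] if h["treated"] else h["y1"]
    # The true effect y1 - y0 exists for every household, but the
    # counterfactual term is never observed in real data.
    print(observed, counterfactual)
```

This is exactly why the rest of the deck is about constructing a credible estimate of the unobserved term.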

SLIDE 18

Solution

Estimate what would have happened to Y in the absence of P.

We call this the Counterfactual.
SLIDE 19

Estimating impact of P on Y

OBSERVE (Y | P=1): outcome with treatment
  • Intention to Treat (ITT): those offered treatment
  • Treatment on the Treated (TOT): those receiving treatment

ESTIMATE (Y | P=0): the counterfactual
  • Use a comparison or control group

α = (Y | P=1) - (Y | P=0)

IMPACT = outcome with treatment - counterfactual
SLIDE 20

Example: What is the Impact of…

giving Fulanito additional pocket money (P) on Fulanito's consumption of candies (Y)?
SLIDE 21

The Perfect Clone

Fulanito: 6 candies. Fulanito's clone: 4 candies.

IMPACT = 6 - 4 = 2 candies
SLIDE 22

In reality, use statistics

Treatment group: average Y = 6 candies. Comparison group: average Y = 4 candies.

IMPACT = 6 - 4 = 2 candies
SLIDE 23

Finding good comparison groups

We want to find clones for the Fulanitos in our programs. The treatment and comparison groups should

  • have identical characteristics
  • except for benefiting from the intervention.

In practice, use program eligibility and assignment rules to construct valid estimates of the counterfactual.
SLIDE 24

Case Study: Progresa

National anti-poverty program in Mexico

  • Started 1997
  • 5 million beneficiaries by 2004
  • Eligibility – based on poverty index

Cash Transfers

  • Conditional on school and health care attendance.
SLIDE 25

Case Study: Progresa

Rigorous impact evaluation with rich data

  • 506 communities, 24,000 households
  • Baseline 1997, follow-up 1998

Many outcomes of interest

Here: Consumption per capita

What is the effect of Progresa (P) on Consumption Per Capita (Y)?

If the impact is an increase of $20 or more, then scale up nationally.

SLIDE 26

Eligibility and Enrollment

[Diagram] Ineligibles (Non-Poor); Eligibles (Poor), split into Enrolled and Not Enrolled.
SLIDE 27

Causal Inference

  • Counterfactuals
  • False counterfactuals: Before & After (Pre & Post); Enrolled & Not Enrolled (Apples & Oranges)
SLIDE 28

Counterfeit Counterfactual #1

Before & After

[Chart] Outcome Y over time: B at T=0 (baseline), A at T=1 (endline), C the counterfactual at T=1. A - B = 4, but A - C = 2. Impact?
SLIDE 29

Case 1: Before & After

What is the effect of Progresa (P) on consumption (Y)?

[Chart] Consumption rises from B = 233 in T=1997 to A = 268 in T=1998; estimated IMPACT = A - B = α = $35.

(1) Observe only beneficiaries (P=1). (2) Two observations in time: consumption at T=0 and consumption at T=1.
SLIDE 30

Case 1: Before & After

Consumption (Y):
  • Outcome with treatment (after): 268.7
  • Counterfactual (before): 233.4
  • Impact (Y | P=1) - (Y | P=0): 35.3**

Estimated impact on Consumption (Y):
  • Linear Regression: 35.27**
  • Multivariate Linear Regression: 34.28**

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
SLIDE 31

Case 1: What’s the problem?

[Chart] Consumption rises from B = 233 at T=0 to A = 268 at T=1 (α = $35), but we do not observe the counterfactual points C (boom) or D (recession).

Economic boom:
  • Real impact = A - C
  • A - B is an overestimate

Economic recession:
  • Real impact = A - D
  • A - B is an underestimate
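A numerical sketch of the boom case (all numbers hypothetical): the before-and-after difference bundles the economy-wide trend into the estimate.

```python
baseline = 233.0      # consumption at T=0 (point B)
true_impact = 20.0    # assumed program effect
trend = 15.0          # assumed economy-wide growth, with or without the program

counterfactual = baseline + trend                 # point C: no program, boom
observed_after = baseline + trend + true_impact   # point A: program plus boom

before_after_estimate = observed_after - baseline   # A - B, the Case 1 estimate
real_impact = observed_after - counterfactual       # A - C

print(before_after_estimate)  # 35.0, an overestimate
print(real_impact)            # 20.0
```

In a recession the trend term is negative and the same arithmetic makes A - B an underestimate.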

SLIDE 32

Causal Inference

  • Counterfactuals
  • False counterfactuals: Before & After (Pre & Post); Enrolled & Not Enrolled (Apples & Oranges)
SLIDE 33

False Counterfactual #2: Enrolled & Not Enrolled

If we have post-treatment data on:
  • Enrolled: treatment group
  • Not enrolled: "control" group (counterfactual): those ineligible to participate, or those who chose NOT to participate

Selection bias:
  • The reason for not enrolling may be correlated with the outcome (Y)
  • We can control for observables, but not for un-observables!
  • The estimated impact is confounded with other things
SLIDE 34

Case 2: Enrolled & Not Enrolled

Measure outcomes post-treatment (T=1):
  • Eligibles (Poor): Enrolled Y = 268; Not Enrolled Y = 290
  • Ineligibles (Non-Poor)

In what ways might the enrolled and not enrolled differ, other than their enrollment in the program?
SLIDE 35

Case 2: Enrolled & Not Enrolled

Consumption (Y):
  • Outcome with treatment (Enrolled): 268
  • Counterfactual (Not Enrolled): 290
  • Impact (Y | P=1) - (Y | P=0): -22**

Estimated impact on Consumption (Y):
  • Linear Regression: -22**
  • Multivariate Linear Regression: -4.15

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
SLIDE 36

Progresa Policy Recommendation?

Will you recommend scaling up Progresa?
  • B&A: Are there other time-varying factors that also influence consumption?
  • E&NE: Are the reasons for enrolling correlated with consumption? Selection bias.

Impact on Consumption (Y):
  • Case 1: Before & After: Linear Regression 35.27**; Multivariate Linear Regression 34.28**
  • Case 2: Enrolled & Not Enrolled: Linear Regression -22**; Multivariate Linear Regression -4.15

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
SLIDE 37

B&A

Compare: Same individuals Before and After they receive P. Problem: Other things may have happened over time.

E&NE

Compare: Group of individuals Enrolled in a program with group that chooses not to enroll. Problem: Selection Bias. We don’t know why they are not enrolled.

Keep in Mind

Both false counterfactuals may lead to biased estimates of the counterfactual, and hence of the impact.
SLIDE 38

IE Methods Toolbox

  • Randomized Assignment
  • Randomized Promotion
  • Discontinuity Design (RDD)
  • Difference-in-Differences (Diff-in-Diff)
  • Matching (P-Score matching)
SLIDE 40

Randomized Treatments & Controls

  • Randomize! A lottery for who is offered benefits is a fair, transparent and ethical way to assign benefits to equally deserving populations when eligibles > number of benefits.

Oversubscription:
  • Give each eligible unit the same chance of receiving treatment.
  • Compare those offered treatment with those not offered treatment (controls).

Randomized phase-in:
  • Give each eligible unit the same chance of receiving treatment first, second, third…
  • Compare those offered treatment first with those offered later (controls).
SLIDE 41

Randomized treatments and controls

[Diagram] 1. Population (eligible vs. ineligible units): external validity. 2. Evaluation sample. 3. Randomize treatment vs. comparison: internal validity.
SLIDE 42

Unit of Randomization

Choose according to the type of program:
  • Individual/Household
  • School/Health clinic/Catchment area
  • Block/Village/Community
  • Ward/District/Region

Keep in mind:
  • You need a "sufficiently large" number of units to detect the minimum desired impact: power.
  • Spillovers/contamination.
  • Operational and survey costs.
SLIDE 43

Case 3: Randomized Assignment

Progresa CCT program. Unit of randomization: the community.
  • 320 treatment communities (14,446 households): first transfers in April 1998.
  • 186 control communities (9,630 households): first transfers in November 1999.

506 communities in the evaluation sample: a randomized phase-in.
SLIDE 44

Case 3: Randomized Assignment

[Diagram] 320 treatment communities and 186 control communities, compared between T=0 and T=1.
SLIDE 45

Case 3: Randomized Assignment

How do we know we have good clones?

In the absence of Progresa, the treatment and comparison groups should be identical. Let's compare their characteristics at baseline (T=0).
SLIDE 46

Case 3: Randomized Assignment: Balance at Baseline

Variable: Control / Treatment / t-stat
  • Consumption ($ monthly per capita): 233.47 / 233.4 / -0.39
  • Head's age (years): 42.3 / 41.6 / 1.2
  • Spouse's age (years): 36.8 / 36.8 / -0.38
  • Head's education (years): 2.8 / 2.9 / -2.16**
  • Spouse's education (years): 2.6 / 2.7 / -0.006

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
SLIDE 47

Case 3: Randomized Assignment: Balance at Baseline (continued)

Variable: Control / Treatment / t-stat
  • Head is female (=1): 0.07 / 0.07 / 0.66
  • Indigenous (=1): 0.42 / 0.42 / 0.21
  • Number of household members: 5.7 / 5.7 / -1.21
  • Bathroom (=1): 0.56 / 0.57 / -1.04
  • Hectares of land: 1.71 / 1.67 / 1.35
  • Distance to hospital (km): 106 / 109 / -1.02

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
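The balance checks above are two-sample t-tests on baseline characteristics. A minimal sketch with simulated data (the group sizes and distribution are hypothetical, not the Progresa sample):

```python
import random
from statistics import mean, stdev

random.seed(0)
# Simulated baseline consumption for 2,000 households.
baseline = [random.gauss(233.4, 50.0) for _ in range(2000)]

# Random assignment: shuffle, then split in half.
random.shuffle(baseline)
treat, control = baseline[:1000], baseline[1000:]

# Two-sample t-statistic for the difference in baseline means
# (unpooled standard error).
se = (stdev(treat) ** 2 / len(treat) + stdev(control) ** 2 / len(control)) ** 0.5
t_stat = (mean(treat) - mean(control)) / se
print(round(t_stat, 2))  # small in absolute value: randomization balances groups
```

With many variables tested, an occasional significant t-stat (like head's education above) is expected by chance alone.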

SLIDE 48

Case 3: Randomized Assignment

Consumption (Y): Treatment group (randomized to treatment) / Counterfactual (randomized to comparison) / Impact (Y | P=1) - (Y | P=0)
  • Baseline (T=0): 233.47 / 233.40 / 0.07
  • Follow-up (T=1): 268.75 / 239.5 / 29.25**

Estimated impact on Consumption (Y):
  • Linear Regression: 29.25**
  • Multivariate Linear Regression: 29.75**

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
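The Case 3 arithmetic, reproduced in code from the figures above:

```python
treatment = {"baseline": 233.47, "followup": 268.75}
control = {"baseline": 233.40, "followup": 239.50}

# At baseline the randomized groups are nearly identical...
baseline_gap = treatment["baseline"] - control["baseline"]
# ...so the follow-up difference is the impact estimate.
impact = treatment["followup"] - control["followup"]

print(round(baseline_gap, 2))  # 0.07
print(round(impact, 2))        # 29.25
```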

SLIDE 49

Progresa Policy Recommendation?

Impact of Progresa on Consumption (Y):
  • Case 1: Before & After, Multivariate Linear Regression: 34.28**
  • Case 2: Enrolled & Not Enrolled, Linear Regression: -22**; Multivariate Linear Regression: -4.15
  • Case 3: Randomized Assignment, Multivariate Linear Regression: 29.75**

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
SLIDE 50

Keep in Mind

Randomized Assignment

Randomized assignment, with large enough samples, produces 2 statistically equivalent groups: a randomized beneficiary group and a randomized comparison group. We have identified the perfect clone.

Feasible for prospective evaluations with over-subscription/excess demand. Most pilots and new programs fall into this category.
SLIDE 51

Randomized assignment with different benefit levels

Traditional impact evaluation question:
  • What is the impact of a program on an outcome?

Other policy questions of interest:
  • What is the optimal level of program benefits?
  • What is the impact of a "higher-intensity" treatment compared to a "lower-intensity" treatment?

Randomized assignment with 2 levels of benefits: comparison group, low-benefit group, high-benefit group.
SLIDE 52

Randomized assignment with different benefit levels

[Diagram] 1. Eligible population (eligible vs. ineligible units). 2. Evaluation sample. 3. Randomize treatment (2 benefit levels) vs. comparison.
SLIDE 53

Randomized assignment with multiple interventions

Other key policy questions for a program with various benefits:
  • What is the impact of one intervention compared to another?
  • Are there complementarities between the various interventions?

Randomized assignment with 2 benefit packages: four groups (A, B, C, D) formed by crossing treatment and comparison for each intervention.
SLIDE 54

Randomized assignment with multiple interventions

[Diagram] 1. Eligible population (eligible vs. ineligible units). 2. Evaluation sample. 3. Randomize intervention 1. 4. Randomize intervention 2.
SLIDE 55

IE Methods Toolbox

  • Randomized Assignment
  • Randomized Promotion
  • Discontinuity Design (RDD)
  • Difference-in-Differences (Diff-in-Diff)
  • Matching (P-Score matching)
SLIDE 56

Difference-in-Differences (Diff-in-Diff)

Y = girls' school attendance; P = tutoring program

Diff-in-Diff: Impact = (Yt1 - Yt0) - (Yc1 - Yc0)

              Enrolled   Not Enrolled
After         0.74       0.81
Before        0.60       0.78
Difference    +0.14      +0.03

Impact = 0.14 - 0.03 = 0.11
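The diff-in-diff arithmetic from the table, in code:

```python
enrolled = {"before": 0.60, "after": 0.74}
not_enrolled = {"before": 0.78, "after": 0.81}

# Impact = (change for enrolled) - (change for not enrolled)
impact = (enrolled["after"] - enrolled["before"]) - (
    not_enrolled["after"] - not_enrolled["before"]
)
print(round(impact, 2))  # 0.11
```

The change in the not-enrolled group nets out the common time trend that Before & After alone would wrongly attribute to the program.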
SLIDE 57

Difference-in-Differences (Diff-in-Diff)

Diff-in-Diff: Impact = (Yt1 - Yc1) - (Yt0 - Yc0)

Y = girls' school attendance; P = tutoring program

           Enrolled   Not Enrolled   Difference
After      0.74       0.81           -0.07
Before     0.60       0.78           -0.18

Impact = -0.07 - (-0.18) = 0.11
SLIDE 58

Impact = (A-B) - (C-D) = (A-C) - (B-D)

[Chart] School attendance over time: the enrolled group rises from B = 0.60 at T=0 to A = 0.74 at T=1; the not-enrolled group from D = 0.78 to C = 0.81. Impact = 0.11.
SLIDE 59

Impact = (A-B) - (C-D) = (A-C) - (B-D)

[Chart] Same points, but if the enrolled group's underlying trend differs from the not-enrolled group's, the true impact < 0.11.
SLIDE 60

Case 6: Difference-in-Differences

Consumption (Y): Enrolled / Not Enrolled / Difference
  • Baseline (T=0): 233.47 / 281.74 / -48.27
  • Follow-up (T=1): 268.75 / 290 / -21.25
  • Difference: 35.28 / 8.26 / 27.02

Estimated impact on Consumption (Y):
  • Linear Regression: 27.06**
  • Multivariate Linear Regression: 25.53**

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
SLIDE 61

Progresa Policy Recommendation?

Impact of Progresa on Consumption (Y):
  • Case 1: Before & After: 34.28**
  • Case 2: Enrolled & Not Enrolled: -4.15
  • Case 3: Randomized Assignment: 29.75**
  • Case 4: Randomized Promotion: 30.4**
  • Case 5: Discontinuity Design: 30.58**
  • Case 6: Difference-in-Differences: 25.53**

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
SLIDE 62

Keep in Mind: Difference-in-Differences

Difference-in-Differences combines Enrolled & Not Enrolled with Before & After:
  • The slope generates the counterfactual for the change in outcome.
  • Fundamental assumption: trends (slopes) are the same in treatments and controls.
  • To test this, at least 3 observations in time are needed: 2 observations before and 1 observation after.
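A sketch of the pre-trend comparison the slide recommends, adding a hypothetical second pre-program observation to the attendance example (the t-1 values are assumptions for illustration):

```python
# Two pre-program observations (t_minus1 is hypothetical) plus one after.
enrolled = {"t_minus1": 0.55, "t0": 0.60, "t1": 0.74}
not_enrolled = {"t_minus1": 0.73, "t0": 0.78, "t1": 0.81}

pre_trend_enrolled = enrolled["t0"] - enrolled["t_minus1"]
pre_trend_not_enrolled = not_enrolled["t0"] - not_enrolled["t_minus1"]

# If pre-program trends match, the parallel-trends assumption is more credible.
print(round(pre_trend_enrolled, 2), round(pre_trend_not_enrolled, 2))  # 0.05 0.05
```

Matching pre-trends do not prove the assumption holds after the program starts, but diverging pre-trends are strong evidence against it.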

SLIDE 63

IE Methods Toolbox

  • Randomized Assignment
  • Randomized Promotion
  • Discontinuity Design (RDD)
  • Difference-in-Differences (Diff-in-Diff)
  • Matching (P-Score matching)
SLIDE 64

Choosing your IE method(s)

Key information you will need to identify the right method for your program:
  • Prospective or retrospective evaluation?
  • Eligibility rules and criteria? Poverty targeting? Geographic targeting?
  • Roll-out plan (pipeline)?
  • Is the number of eligible units larger than the available resources at a given point in time? Budget and capacity constraints? Excess demand for the program?
  • Etc.
SLIDE 65

Choosing your IE method(s)

Choose the best possible design given the operational context:
  • Best design: the best comparison group you can find, with the least operational risk.
  • Have we controlled for everything? Internal validity: a good comparison group.
  • Is the result valid for everyone? External validity: local versus global treatment effect; do the evaluation results apply to the population we're interested in?
SLIDE 66

Choosing your method

Targeted (eligibility cut-off), phased implementation over time:
  • Limited resources (never able to achieve scale): Randomized Assignment; RDD
  • Fully resourced (able to achieve scale): Randomized Assignment (roll-out); RDD

Targeted (eligibility cut-off), immediate implementation:
  • Limited resources: Randomized Assignment; RDD
  • Fully resourced: Randomized Promotion; RDD

Universal (no eligibility cut-off), phased implementation over time:
  • Limited resources: Randomized Assignment; Matching with DiD
  • Fully resourced: Randomized Assignment (roll-out); Matching with DiD

Universal (no eligibility cut-off), immediate implementation:
  • Limited resources: Randomized Assignment; Matching with DiD
  • Fully resourced: Randomized Promotion
SLIDE 67

Remember

The objective of impact evaluation is to estimate the causal effect or impact of a program on outcomes of interest.
SLIDE 68

Remember

To estimate impact, we need to estimate the counterfactual:
  • what would have happened in the absence of the program, and
  • use comparison or control groups.
SLIDE 69

Remember

We have a toolbox with 5 methods to identify good comparison groups.

SLIDE 70

Remember

Choose the best evaluation method that is feasible in the program’s operational context.

SLIDE 71

Reference

This material constitutes supporting material for the "Impact Evaluation in Practice" book. This additional material is made freely available, but please acknowledge its use as follows: Gertler, P. J., Martinez, S., Premand, P., Rawlings, L. B. and Vermeersch, C. M. J., 2010, Impact Evaluation in Practice: Ancillary Material, The World Bank, Washington DC (www.worldbank.org/ieinpractice). The content of this presentation reflects the views of the authors and not necessarily those of the World Bank.
SLIDE 72

Thank You