Introduction to Impact Evaluation of RBF Programs
Damien de Walque | Gil Shapira
RBF for Health Impact Evaluation
o Build evidence on what works, what doesn't and why
o RBF for Health impact evaluation characteristics
o Built into program
Policy questions we are interested in answering: Does RBF work?
Policy questions we are interested in answering: How can RBF work better?
Improved supervision?
Communities?
RBF work?
Rwanda Performance-Based Financing project (Basinga et al. 2011)
utilization of skilled delivery (+8.1pp) and child preventive care services (+11 pp)
control and had the highest payment rates
and quality of health services.
incentives would not have achieved the same gain in
[Figure: Impact of Rwanda PBF on Child Preventive Care Utilization. Outcome: visit by child 0-23 months in last 4 weeks (=1). Bar values: 21%, 24%, 33%, 23%.]
[Figure: Impact of Rwanda PBF on Institutional Delivery. Outcome: delivery in-facility for birth in last 18 months. Bar values: 35%, 36%, 56%, 50%.]
Rwanda Performance-Based Financing project (Gertler & Vermeersch forthcoming)
months, height 24-47 months)
increased knowledge
Slides by Sebastian Martinez, Christel Vermeersch and Paul Gertler. We thank Patrick Premand and Martin Ruegenberg for
Logical Framework
Measuring Impact Data Operational Plan Resources
How the program works in theory Identification Strategy
Counterfactuals False Counterfactuals
Before & After (Pre & Post) Enrolled & Not Enrolled
(Apples & Oranges)
Randomized Assignment Discontinuity Design
Diff-in-Diff
Randomized Promotion
Difference-in-Differences P-Score matching
Matching
Estimate the causal effect (impact)
(P) = Program or Treatment
(Y) = Indicator, Measure of Success
Example: What is the effect of a Cash Transfer Program (P)
What is the impact of (P) on (Y)?
Can we all go home?
For a program beneficiary:
we observe
(Y | P=1): Household Consumption (Y) with a cash transfer program (P=1)
but we do not observe
(Y | P=0): Household Consumption (Y) without a cash transfer program (P=0)
We call this the Counterfactual.
OBSERVE (Y | P=1): Outcome with treatment
ESTIMATE (Y | P=0): The Counterfactual
Those offered treatment
(TOT) – Those receiving treatment
control group
IMPACT = Outcome with treatment - counterfactual
giving Fulanito
(P) (Y)?
additional pocket money
Fulanito Fulanito’s Clone
IMPACT=6-4=2 Candies
6 candies 4 candies
Treatment Comparison
Average Y=6 candies Average Y=4 Candies
IMPACT=6-4=2 Candies
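The group comparison above can be written as a short calculation. This is a minimal sketch, not part of the original slides: the individual candy counts are made-up values chosen only so that the group averages match the slide (6 and 4), and the function name is illustrative.

```python
from statistics import mean

def estimated_impact(treatment_outcomes, comparison_outcomes):
    """Impact = average outcome with treatment - average outcome in the comparison group."""
    return mean(treatment_outcomes) - mean(comparison_outcomes)

# Hypothetical candy counts; only the group averages (6 and 4) come from the slide.
treatment = [5, 6, 7, 6]
comparison = [4, 3, 5, 4]

impact = estimated_impact(treatment, comparison)
print(f"Estimated impact: {impact} candies")
```

The estimate is only as good as the comparison group: it equals the true impact when the comparison group is a valid stand-in for the counterfactual.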
We want to find clones for the Fulanitos in our programs. The treatment and comparison groups should
In practice, use program eligibility & assignment rules to construct valid estimates of the counterfactuals
National anti-poverty program in Mexico
Cash Transfers
Rigorous impact evaluation with rich data
Many outcomes of interest
Here: Consumption per capita
What is the effect of Progresa (P) on Consumption Per Capita (Y)?
If impact is an increase of $20 or more, then scale up nationally
Ineligibles
(Non-Poor)
Eligibles
(Poor)
Enrolled Not Enrolled
Counterfactuals False Counterfactuals
Before & After (Pre & Post) Enrolled & Not Enrolled
(Apples & Oranges)
[Figure: outcome Y over time, from baseline (T=0) to endline (T=1). Points: A, B, C (counterfactual). A-B = 4, A-C = 2. IMPACT?]
What is the effect of Progresa (P) on consumption (Y)?
[Figure: consumption (Y) over time, T=1997 to T=1998. B = 233, A = 268. IMPACT = A-B = α = $35]
(1) Observe only beneficiaries (P=1) (2) Two observations in time: Consumption at T=0 and consumption at T=1.
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
                                   Consumption (Y)
Outcome with Treatment (After)     268.7
Counterfactual (Before)            233.4
Impact (Y | P=1) - (Y | P=0)       35.3***
Estimated Impact on Consumption (Y)
Linear Regression                  35.27**
Multivariate Linear Regression     34.28**
[Figure: consumption (Y) over time, T=0 to T=1, with B = 233, A = 268 and α = $35; possible counterfactuals C? and D?, each marked "Impact?"]
Economic Boom:
Economic Recession: underestimate
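The boom and recession cases can be illustrated with a small calculation. This is a sketch under stated assumptions: the baseline and endline figures (233.4 and 268.7) are the ones reported in the slides, while the +10 and -10 changes in consumption that would have happened without the program are hypothetical, as is the function name.

```python
def before_after_bias(baseline, endline, secular_change):
    """Return (naive estimate, true impact, bias) for a before-and-after comparison.

    secular_change is the change in the outcome that would have occurred
    without the program (hypothetical; it is never observed in practice).
    """
    naive = endline - baseline                  # before-and-after estimate (A - B)
    counterfactual = baseline + secular_change  # outcome at T=1 without the program
    true_impact = endline - counterfactual
    return naive, true_impact, naive - true_impact

baseline, endline = 233.4, 268.7  # consumption figures from the slides

for label, change in [("Economic boom", 10.0), ("Economic recession", -10.0)]:
    naive, true_impact, bias = before_after_bias(baseline, endline, change)
    verdict = "overestimate" if bias > 0 else "underestimate"
    print(f"{label}: naive={naive:.1f}, true={true_impact:.1f} -> {verdict}")
```

In this sketch the bias of the before-and-after estimate is exactly the change that would have happened anyway, which is why the method needs the strong assumption that nothing else changed over time.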
Counterfactuals False Counterfactuals
Before & After (Pre & Post) Enrolled & Not Enrolled
(Apples & Oranges)
If we have post-treatment data on
Those ineligible to participate. Or those that choose NOT to participate.
Selection Bias
with outcome (Y)
Control for observables. But not un-observables!
things.
Enrolled & Not Enrolled
Measure outcomes in post-treatment (T=1)
Enrolled Y=268 Not Enrolled Y=290
Ineligibles
(Non-Poor)
Eligibles
(Poor)
In what ways might E&NE be different, other than their enrollment in the program?
                                    Consumption (Y)
Outcome with Treatment (Enrolled)   268
Counterfactual (Not Enrolled)       290
Impact (Y | P=1) - (Y | P=0)
Estimated Impact on Consumption (Y)
Linear Regression
Multivariate Linear Regression
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
Will you recommend scaling up Progresa?
B&A: Are there other time-varying factors that also influence consumption?
E&NE:
Impact on Consumption (Y)
Case 1: Before & After
  Linear Regression                  35.27**
  Multivariate Linear Regression     34.28**
Case 2: Enrolled & Not Enrolled
  Linear Regression
  Multivariate Linear Regression
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
Compare: Same individuals Before and After they receive P. Problem: Other things may have happened over time.
Compare: Group of individuals Enrolled in a program with group that chooses not to enroll. Problem: Selection Bias. We don’t know why they are not enrolled.
Both counterfactuals may lead to biased estimates of the counterfactual and the impact.
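Selection bias in the Enrolled & Not Enrolled comparison can be illustrated with a small simulation. This is a sketch with made-up data, not the Progresa data: it assumes a true impact of 30 for every household and assumes that only poorer households enroll. All names and parameter values are hypothetical.

```python
import random
from statistics import mean

TRUE_EFFECT = 30.0  # hypothetical true program impact, identical for every household

def simulate_enrolled_vs_not(n_households=10_000, seed=0):
    """Compare enrolled and non-enrolled households when poorer households select in."""
    rng = random.Random(seed)
    enrolled_y, not_enrolled_y = [], []
    for _ in range(n_households):
        wealth = rng.gauss(250.0, 30.0)  # hypothetical consumption without the program
        enrolls = wealth < 250.0         # selection: only poorer households enroll
        noise = rng.gauss(0.0, 5.0)
        if enrolls:
            enrolled_y.append(wealth + TRUE_EFFECT + noise)
        else:
            not_enrolled_y.append(wealth + noise)
    # Naive estimate: difference in average outcomes after the program
    return mean(enrolled_y) - mean(not_enrolled_y)

naive_estimate = simulate_enrolled_vs_not()
print(f"True effect: {TRUE_EFFECT:.1f}; enrolled vs. not enrolled: {naive_estimate:.1f}")
```

Because the two groups differed before the program started, the naive comparison mixes the program's effect with the pre-existing gap, and here it even gets the sign wrong.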
Randomized Assignment Discontinuity Design
Diff-in-Diff
Randomized Promotion
Difference-in-Differences P-Score matching
Matching
deserving populations.
Eligibles > Number of Benefits
treatment (controls).
Oversubscription
first, second, third…
Randomized Phase In
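A randomized phase-in can be sketched as a lottery over the eligible units. This is an illustrative example, not the procedure used by any particular program: the community names, the number of phases and the function name are all assumptions.

```python
import random

def randomized_phase_in(eligible_units, n_phases, seed=0):
    """Shuffle eligible units and split them into n_phases groups of near-equal size.

    Units in earlier phases receive the program first; units in later phases
    serve as the comparison group until they are phased in.
    """
    rng = random.Random(seed)  # fixed seed so the lottery can be reproduced and audited
    shuffled = list(eligible_units)
    rng.shuffle(shuffled)
    return [shuffled[i::n_phases] for i in range(n_phases)]

# Hypothetical list of eligible communities
communities = [f"community_{i}" for i in range(1, 11)]
phases = randomized_phase_in(communities, n_phases=3)

for number, units in enumerate(phases, start=1):
    print(f"Phase {number}: {units}")
```

With oversubscription, the same function with two groups gives a simple lottery: the first group is offered the program and the second is the comparison group.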
= Ineligible
= Eligible
External Validity
treatment Internal Validity
Comparison
Choose according to type of program
Clinic/catchment area
Keep in mind
detect minimum desired impact: Power.
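A rough idea of the sample needed to detect a minimum desired impact can be obtained from the standard formula for comparing two means. This is a simplified sketch: it assumes individual-level randomization, equal group sizes and a known standard deviation. The standard deviation of 100 and the minimum effect of 20 are hypothetical inputs, and when whole communities or clinics are randomized the required sample is larger because of clustering.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(min_effect, sd, alpha=0.05, power=0.80):
    """Units needed in each group to detect min_effect with a two-sided test.

    Uses n = 2 * (z_{1-alpha/2} + z_{power})^2 * sd^2 / min_effect^2,
    which ignores clustering and attrition.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / min_effect ** 2)

# Hypothetical inputs: minimum desired impact of 20, outcome standard deviation of 100
n_per_arm = sample_size_per_arm(min_effect=20.0, sd=100.0)
print(f"Units needed per group: {n_per_arm}")
```

Halving the minimum detectable impact roughly quadruples the required sample, which is why the minimum desired impact should be agreed on before the design is fixed.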
Progresa CCT program Unit of randomization: Community
First transfers in April 1998.
First transfers November 1999
506 communities in the evaluation sample Randomized phase-in
Treatment Communities
Control Communities
Time T=1 T=0 Comparison Period
How do we know we have good clones?
In the absence of Progresa, treatment and comparisons should be identical Let’s compare their characteristics at baseline (T=0)
Case 3: Randomized Assignment
                                      Control   Treatment   T-stat
Consumption ($ monthly per capita)    233.47    233.4
Head’s age (years)                    42.3      41.6        1.2
Spouse’s age (years)                  36.8      36.8
Head’s education (years)              2.8       2.9
Spouse’s education (years)            2.6       2.7
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
Case 3: Randomized Assignment
                                      Control   Treatment   T-stat
Head is female=1                      0.07      0.07        0.66
Indigenous=1                          0.42      0.42        0.21
Number of household members           5.7       5.7
Bathroom=1                            0.56      0.57
Hectares of Land                      1.71      1.67        1.35
Distance to Hospital (km)             106       109
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
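The T-stat column in a balance table of this kind can be computed with a two-sample test on each baseline characteristic. The sketch below uses simulated ages, not the Progresa data, and a Welch-type statistic that ignores the clustering of households within communities. The variable names and the simulated distribution are assumptions.

```python
import math
import random
from statistics import mean, variance

def balance_t_stat(treatment, comparison):
    """Welch t-statistic for the difference in means of a baseline characteristic."""
    se = math.sqrt(variance(treatment) / len(treatment)
                   + variance(comparison) / len(comparison))
    return (mean(treatment) - mean(comparison)) / se

# Simulated baseline characteristic (head's age), randomly split into two groups
rng = random.Random(1)
head_age = [rng.gauss(42.0, 12.0) for _ in range(2000)]
rng.shuffle(head_age)
treatment_age, comparison_age = head_age[:1000], head_age[1000:]

t_stat = balance_t_stat(treatment_age, comparison_age)
print(f"Head's age: treatment={mean(treatment_age):.1f}, "
      f"comparison={mean(comparison_age):.1f}, t-stat={t_stat:.2f}")
```

A t-statistic below roughly 1.96 in absolute value means the difference is not statistically significant at the 5% level, which is what we expect for most characteristics when assignment is random.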
                                  Treatment Group              Counterfactual                Impact
                                  (Randomized to treatment)    (Randomized to Comparison)    (Y | P=1) - (Y | P=0)
Baseline (T=0) Consumption (Y)    233.47                       233.40                        0.07
Follow-up (T=1) Consumption (Y)   268.75                       239.5                         29.25**
Estimated Impact on Consumption (Y)
Linear Regression                  29.25**
Multivariate Linear Regression     29.75**
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
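Under randomized assignment, the simple linear regression estimate is the difference in average follow-up outcomes between the two groups. The sketch below shows this equivalence with a hand-written least-squares slope. The six household values are made up so that the group means match the follow-up figures in the table (268.75 and 239.5), and the function name is illustrative.

```python
from statistics import mean

def ols_slope(x, y):
    """Slope from a least-squares regression of y on a single regressor x."""
    mx, my = mean(x), mean(y)
    return (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
            / sum((xi - mx) ** 2 for xi in x))

# Hypothetical household consumption at follow-up; only the group means
# (268.75 and 239.5) come from the table.
treatment_y = [260.0, 270.0, 276.25]
comparison_y = [230.0, 240.0, 248.5]

y = treatment_y + comparison_y
p = [1] * len(treatment_y) + [0] * len(comparison_y)  # P=1 treatment, P=0 comparison

difference_in_means = mean(treatment_y) - mean(comparison_y)
regression_estimate = ols_slope(p, y)
print(f"Difference in means: {difference_in_means:.2f}")
print(f"Regression coefficient on P: {regression_estimate:.2f}")
```

The multivariate regression adds baseline controls to the same comparison; with successful randomization this mainly improves precision rather than changing the estimate much.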
Impact of Progresa on Consumption (Y)
Case 1: Before & After
  Multivariate Linear Regression     34.28**
Case 2: Enrolled & Not Enrolled
  Linear Regression
  Multivariate Linear Regression
Case 3: Randomized Assignment
  Multivariate Linear Regression     29.75**
Randomized assignment, with large enough samples, produces two statistically equivalent groups. We have identified the perfect clone.
Randomized beneficiary Randomized comparison
Feasible for prospective evaluations with oversubscription/excess demand. Most pilots and new programs fall into this category.
Traditional impact evaluation question:
Other policy question of interest:
compared to a “lower-intensity” treatment?
Randomized assignment with 2 levels of benefits:
Comparison Low Benefit High Benefit
= Ineligible
= Eligible
treatment
(2 benefit levels)
Comparison
Other key policy question for a program with various benefits:
Randomized assignment with 2 benefit packages:
Intervention 2 Comparison Treatment Intervention 1 Comparison Group A
Group C Treatment Group B Group D
= Ineligible
= Eligible
intervention 1
intervention 2
Randomized Assignment Discontinuity Design
Diff-in-Diff
Randomized Promotion
Difference-in-Differences P-Score matching
Matching
Y = Girl’s school attendance, P = Tutoring program
Diff-in-Diff: Impact = (Y_{t1} - Y_{t0}) - (Y_{c1} - Y_{c0})
             Enrolled   Not Enrolled
After        0.74       0.81
Before       0.60       0.78
Difference   +0.14      +0.03
Impact = 0.14 - 0.03 = 0.11
Y = Girl’s school attendance, P = Tutoring program
Diff-in-Diff: Impact = (Y_{t1} - Y_{c1}) - (Y_{t0} - Y_{c0})
             Enrolled   Not Enrolled
After        0.74       0.81
Before       0.60       0.78
Impact = 0.11
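The two ways of writing the difference-in-differences estimate give the same number. The sketch below checks this with the school attendance figures from the tables above; the function names are illustrative.

```python
def did_from_changes(t1, t0, c1, c0):
    """(Change over time for enrolled) - (change over time for not enrolled)."""
    return (t1 - t0) - (c1 - c0)

def did_from_gaps(t1, t0, c1, c0):
    """(Gap between groups after) - (gap between groups before)."""
    return (t1 - c1) - (t0 - c0)

# School attendance rates from the slides
enrolled_after, enrolled_before = 0.74, 0.60
not_enrolled_after, not_enrolled_before = 0.81, 0.78

impact_changes = did_from_changes(enrolled_after, enrolled_before,
                                  not_enrolled_after, not_enrolled_before)
impact_gaps = did_from_gaps(enrolled_after, enrolled_before,
                            not_enrolled_after, not_enrolled_before)
print(f"From changes: {impact_changes:.2f}, from gaps: {impact_gaps:.2f}")
```

Both expressions rearrange the same four averages, so the choice between them is a matter of presentation.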
[Figure: school attendance over time (T=0 to T=1). Enrolled: B=0.60 to A=0.74. Not enrolled: D=0.78 to C=0.81. Impact=0.11]
[Figure: school attendance over time (T=0 to T=1). Enrolled: B=0.60 to A=0.74. Not enrolled: D=0.78 to C=0.81. Impact<0.11]
                                  Enrolled   Not Enrolled   Difference
Baseline (T=0) Consumption (Y)    233.47     281.74
Follow-up (T=1) Consumption (Y)   268.75     290
Difference                        35.28      8.26           27.02
Estimated Impact on Consumption (Y)
Linear Regression                  27.06**
Multivariate Linear Regression     25.53**
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
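In regression form, difference-in-differences is usually written as Y = b0 + b1*Enrolled + b2*After + b3*(Enrolled*After), where b3 is the impact estimate. The sketch below is not the estimation code behind the slides. It uses the fact that, with only these four group averages and no other controls, the least-squares coefficients can be read directly off the averages. The function name is an assumption; the four consumption averages are taken from the table.

```python
def did_regression_coefficients(enrolled_before, enrolled_after,
                                not_enrolled_before, not_enrolled_after):
    """Coefficients of Y = b0 + b1*Enrolled + b2*After + b3*Enrolled*After.

    With two groups, two periods and no other controls, the fitted values
    equal the four group averages, so each coefficient is a simple difference.
    """
    b0 = not_enrolled_before                       # not enrolled, before
    b1 = enrolled_before - not_enrolled_before     # baseline gap between groups
    b2 = not_enrolled_after - not_enrolled_before  # time trend of the not enrolled
    b3 = (enrolled_after - enrolled_before) - b2   # difference-in-differences
    return b0, b1, b2, b3

# Average consumption from the table
b0, b1, b2, b3 = did_regression_coefficients(233.47, 268.75, 281.74, 290.0)
print(f"Baseline gap (b1): {b1:.2f}")
print(f"Time trend (b2): {b2:.2f}")
print(f"Impact estimate (b3): {b3:.2f}")
```

The large negative baseline gap (b1) is the pre-existing difference that biased the Enrolled & Not Enrolled comparison; difference-in-differences removes it as long as both groups would have followed the same trend without the program.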
Impact of Progresa on Consumption (Y)
Case 1: Before & After               34.28**
Case 2: Enrolled & Not Enrolled
Case 3: Randomized Assignment        29.75**
Case 4: Randomized Promotion         30.4**
Case 5: Discontinuity Design         30.58**
Case 6: Difference-in-Differences    25.53**
Differences in Differences combines Enrolled & Not Enrolled with Before & After. Slope: Generate counterfactual for change in
Trends (slopes) are the same in treatments and controls (fundamental assumption).
To test this, at least 3
needed:
Randomized Assignment Discontinuity Design
Diff-in-Diff
Randomized Promotion
Difference-in-Differences P-Score matching
Matching
Prospective/Retrospective Evaluation?
Eligibility rules and criteria?
Roll-out plan (pipeline)?
Is the number of eligible units larger than available resources at a given point in time?
targeting?
constraints?
program?
Key information you will need for identifying the right method for your program:
Best Design Have we controlled for everything? Is the result valid for everyone?
can find + least operational risk
effect
population we’re interested in
Choose the best possible design given the
Targeted (Eligibility Cut-off) Universal (No Eligibility Cut-off) Limited Resources (Never Able to Achieve Scale) Fully Resourced (Able to Achieve Scale) Limited Resources (Never Able to Achieve Scale) Fully Resourced (Able to Achieve Scale) Phased Implementation Over Time
Assignment
Assignment (roll-out)
Assignment
with DiD
Assignment (roll-out)
with DiD
Immediate Implementation
Assignment
Promotion
Assignment
with DiD
Promotion
the program and
This material constitutes supporting material for the "Impact Evaluation in Practice" book. This additional material is made freely available, but please acknowledge its use as follows: Gertler, P. J.; Martinez, S.; Premand, P.; Rawlings, L. B. and Christel M. J. Vermeersch, 2010, Impact Evaluation in Practice: Ancillary Material, The World Bank, Washington DC (www.worldbank.org/ieinpractice). The content of this presentation reflects the views of the authors and not necessarily those of the World Bank.