SLIDE 1

Performance Evaluation of Policies and Programmes

Adam B. Jaffe, Director, Motu Economic and Public Policy Research
Treasury/MBIE Seminar, 27 August 2013

SLIDE 2

Background

Evidence-based policy, etc.

Skepticism?

SLIDE 3

The Problem

A policy or a programme is like a new drug. We would like to know if it is effective, and how its effectiveness compares to alternatives.

With a drug, it is not enough that the patient gets better. With a policy, it is not enough that the policy goal is met.

Want to measure the treatment effect, i.e. how the state of the policy objectives compares to what it would have been without the policy.
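
A minimal sketch of why meeting the goal is not the same as a positive treatment effect. The function name, the 6.0 percent target, and both outcome values are invented for illustration; in practice the counterfactual is never observed and has to be estimated.

```python
def treatment_effect(observed: float, counterfactual: float) -> float:
    """Outcome with the policy minus the estimated outcome without it."""
    return observed - counterfactual


# Hypothetical example: the policy goal is unemployment at or below 6.0 percent.
GOAL = 6.0
observed = 5.5        # unemployment rate with the policy in place
counterfactual = 5.4  # assumed "but for" rate; never directly observable

goal_met = observed <= GOAL
effect = treatment_effect(observed, counterfactual)

print(f"goal met: {goal_met}, treatment effect: {effect:+.1f} points")
```

With these made-up numbers the goal is met even though the outcome is slightly worse than it would have been without the policy.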

SLIDE 4

We’d like to know…

Magnitude of impacts (“outputs” and “outcomes”)
Magnitude of impacts relative to resources required (cost-effectiveness)
Relative effectiveness of different instruments or approaches
Relative effectiveness in different contexts (conditional cost-effectiveness)
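
One way to make "impact relative to resources" concrete is a simple ratio. The sketch below is illustrative only: the `Programme` class, the two instruments, and every figure in it are assumptions, and it presumes both impacts are measured in the same outcome units.

```python
from dataclasses import dataclass


@dataclass
class Programme:
    name: str
    impact: float  # estimated treatment effect, in outcome units (e.g. jobs)
    cost: float    # resources required, in $ million


def cost_effectiveness(p: Programme) -> float:
    """Impact per unit of resources."""
    if p.cost <= 0:
        raise ValueError("cost must be positive")
    return p.impact / p.cost


def rank_by_cost_effectiveness(programmes):
    """Return programmes ordered from most to least cost-effective."""
    return sorted(programmes, key=cost_effectiveness, reverse=True)


# Invented figures for two instruments aimed at the same outcome.
options = [
    Programme("tax subsidy", impact=200.0, cost=8.0),
    Programme("grant programme", impact=120.0, cost=3.0),
]
ranked = rank_by_cost_effectiveness(options)
for p in ranked:
    print(f"{p.name}: {cost_effectiveness(p):.1f} per $ million")
```

In this made-up comparison the instrument with the larger total impact is the less cost-effective one, which is why magnitude and cost-effectiveness are listed separately.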

SLIDE 5

Examples

Health service delivery modes
Scholarships
Tax subsidies
Regulations
Grant programs
……

SLIDE 6

Analytical Issues

Outputs and outcomes that are hard to measure
Long and/or uncertain lags between action and outcomes
Characterizing the unobserved “but for” world
Selection bias in programme participation
Others I will not say much about:
Incremental versus average impact
General equilibrium effects

SLIDE 7

Thoughts on metrics

Quantify where possible, but…
Non-quantifiable doesn’t mean unimportant
Multiple metrics
Tradeoff between comparability and precision
Almost always proxy or indicator rather than “true” variable
Measurement (random) error
Behavioral changes in response to evaluation
Long/uncertain lags → ongoing evaluation

SLIDE 8

Isolating the Treatment Effect

Typically, start by comparing performance of treated group before and after the treatment
Issues
Placebo effect
Regression to the mean
Sectoral trends
Compare change in treated group to change in “control group”
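
Regression to the mean, listed among the issues above, can be seen in a small simulation with no treatment at all. This is a sketch under invented assumptions (normally distributed ability and measurement noise, selection of the worst decile); none of the numbers come from the presentation.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

N = 10_000
ability = [random.gauss(0, 1) for _ in range(N)]
# Two noisy measurements of the same unchanged ability; nobody is treated.
before = [a + random.gauss(0, 1) for a in ability]
after = [a + random.gauss(0, 1) for a in ability]

# "Target" the worst-performing 10 percent on the first measurement.
cutoff = sorted(before)[N // 10]
selected = [i for i in range(N) if before[i] < cutoff]

before_mean = mean(before[i] for i in selected)
after_mean = mean(after[i] for i in selected)
apparent_gain = after_mean - before_mean

print(f"apparent before/after gain with no treatment: {apparent_gain:.2f}")
```

The selected group appears to improve simply because it was chosen on an unusually bad draw, so a before/after comparison alone would credit a programme with a gain it did not cause.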

SLIDE 9

“Difference in difference” approach

“Gold Standard” is DID with Random Assignment (“RA”) to treatment group and control group
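
The difference-in-difference calculation itself is short. The sketch below uses invented outcome data for three firms per group; the function name and all values are assumptions for illustration.

```python
from statistics import mean


def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Invented outcome data (e.g. annual sales) for three firms per group.
treated_before = [10.0, 12.0, 14.0]
treated_after = [18.0, 20.0, 22.0]
control_before = [9.0, 11.0, 13.0]
control_after = [14.0, 16.0, 18.0]

before_after = mean(treated_after) - mean(treated_before)
did = diff_in_diff(treated_before, treated_after, control_before, control_after)

print(f"before/after change in treated group: {before_after:.1f}")
print(f"difference-in-difference estimate:    {did:.1f}")
```

Subtracting the control group's change removes whatever would have happened anyway (for example a sectoral trend), provided the two groups would otherwise have moved in parallel.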

SLIDE 10

Hypothetical Comparison of Mean Sales Growth for Funded and Unfunded Firms Ignoring Selection Bias

[Figure: bar chart of mean sales growth. Unfunded firms: mean = 12.5. Funded firms: mean = 20.8. "Treatment Effect" = 8.3.]

SLIDE 11

Selection Bias

Frequently, government program provides assistance to some individuals or firms but not to others
Makes those not provided assistance a natural control group, but…
Programme targets are chosen on the basis of need (unemployed; under-achieving students), or expectation of success (scholarships; research grants)
Creates selection bias in difference-in-difference analysis
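
A sketch of how selection on expected success inflates a naive funded-versus-unfunded comparison. The outcome model, its coefficients, and the assumed true effect of 3.0 are all invented; only the setup (30 ranked applicants, the better-ranked half funded) echoes the hypothetical firm example in these slides.

```python
from statistics import mean

TRUE_EFFECT = 3.0  # assumed effect of funding on sales growth

# 30 hypothetical applicants ranked 1 (best) to 30; the top 15 are funded.
ranks = list(range(1, 31))
funded = [rank <= 15 for rank in ranks]


def sales_growth(rank: int, is_funded: bool) -> float:
    """Invented outcome model: better-ranked firms grow faster regardless."""
    return 16.0 - 0.35 * rank + (TRUE_EFFECT if is_funded else 0.0)


growth = [sales_growth(r, f) for r, f in zip(ranks, funded)]

funded_mean = mean(g for g, f in zip(growth, funded) if f)
unfunded_mean = mean(g for g, f in zip(growth, funded) if not f)
naive_effect = funded_mean - unfunded_mean
selection_bias = naive_effect - TRUE_EFFECT

print(f"naive: {naive_effect:.2f}, true: {TRUE_EFFECT:.2f}, bias: {selection_bias:.2f}")
```

The naive gap mixes the effect of funding with the fact that funded firms were better prospects to begin with.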

SLIDE 12

Regression Discontinuity (“RD”) Approach to Selection Bias

Retain information on ranking used to select individuals or firms for participation in the program
Use this measure of qualification or need as regressor in explaining subsequent success of treated and untreated groups
Dummy variable for program participation then captures treatment effect after controlling for selection effect
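
The regression described above can be sketched as follows: regress the outcome on the ranking and a participation dummy, and read the treatment effect off the dummy's coefficient. The data are invented and noiseless so that the recovery is exact, and the small `ols` helper (normal equations solved by Gauss-Jordan elimination) is an illustrative stand-in for a statistics package, not part of the presentation.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# 30 hypothetical applicants ranked 1 (best) to 30; the top 15 are funded.
ranks = list(range(1, 31))
funded = [1.0 if r <= 15 else 0.0 for r in ranks]
# Invented outcome: growth falls with rank; funding adds 3.0 (assumed).
growth = [16.0 - 0.35 * r + 3.0 * d for r, d in zip(ranks, funded)]

# Regressors: constant, ranking at application, participation dummy.
X = [[1.0, float(r), d] for r, d in zip(ranks, funded)]
intercept, rank_slope, treatment_effect = ols(X, growth)

print(f"slope on ranking: {rank_slope:.2f}")
print(f"treatment effect (dummy coefficient): {treatment_effect:.2f}")
```

Because the ranking is the only reason funded and unfunded firms differ at the outset, controlling for it leaves the dummy to pick up the jump at the funding cutoff. Real data would add noise, and the linear form is itself an assumption that should be checked.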

SLIDE 13

Hypothetical Comparison of Mean Sales Growth for Funded and Unfunded Firms Controlling for Selection Bias via Project Ranking at Application

[Figure: sales growth (vertical axis) against project ranking at application (horizontal axis, 30 to 1), for unfunded firms and funded firms. Treatment effect = regression discontinuity = 3.]

SLIDE 14

Regression Discontinuity (“RD”) Approach to Selectivity Bias

Statistically controls for the source of non-random difference between the treated and untreated groups
Works for positive or negative selection effect
Requires retention of information about criteria for selection
Requires ability to measure success of both treated and untreated individuals/firms
Note: if the selection criteria are not, in fact, correlated with success, then slope will be zero but RD measure of treatment effect is still unbiased

SLIDE 15

RD versus Random Assignment

Both approaches measure the average treatment effect for treated entities
If the treatment effect were uniform for all entities, then RD reproduces the result of random assignment
More likely, the magnitude of the treatment effect may be correlated with the selection measure
Most appropriate targets may get biggest boost; or
Decreasing returns may limit effect for most qualified
Has implications for potential expansion of program to previously untreated group
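
If the treatment effect varies with the selection measure, one possible extension is to add an interaction between the participation dummy and the ranking. The sketch below is an assumption-laden illustration, not the presenter's method: the data are invented and noiseless, the effect is assumed linear in rank, and the `ols` helper is a minimal stand-in for a statistics package.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


ranks = list(range(1, 31))                       # 1 = best applicant
funded = [1.0 if r <= 15 else 0.0 for r in ranks]
# Invented model: the boost from funding shrinks as rank worsens.
growth = [16.0 - 0.35 * r + d * (4.2 - 0.15 * r) for r, d in zip(ranks, funded)]

# Regressors: constant, rank, dummy, dummy x rank interaction.
X = [[1.0, float(r), d, d * r] for r, d in zip(ranks, funded)]
b0, b_rank, b_funded, b_interact = ols(X, growth)


def effect_at(rank: float) -> float:
    """Estimated treatment effect for an applicant at the given rank."""
    return b_funded + b_interact * rank


treated_ranks = [r for r, d in zip(ranks, funded) if d]
avg_effect_on_treated = sum(effect_at(r) for r in treated_ranks) / len(treated_ranks)

print(f"average effect on treated: {avg_effect_on_treated:.2f}")
print(f"effect at rank 1: {effect_at(1):.2f}, at rank 15: {effect_at(15):.2f}")
print(f"extrapolated effect at rank 20: {effect_at(20):.2f}")
```

The extrapolation to rank 20 is only as good as the assumed linear relationship, since no firm beyond the cutoff was actually treated. It illustrates why the average effect on the treated may overstate what an expansion of the program would deliver.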

SLIDE 16

Hypothetical Comparison of Mean Sales Growth for Funded and Unfunded Firms Controlling for Selection Bias via Project Ranking at Application

[Figure: sales growth (vertical axis) against project ranking at application (horizontal axis, 30 to 1), for unfunded firms and funded firms. Treatment effect = regression discontinuity = 3.]

SLIDE 17

RD versus RA

RA always produces unbiased estimate of average effect, but tells you nothing about the underlying variation in efficacy
Note that in social settings, neither typically deals with placebo effect
Both methods require tracking of untreated group; not clear which approach makes this easier
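
A small simulation of the first point, under invented assumptions: each firm has an unobserved quality between 0 and 1, the effect of treatment rises with quality, and a fair coin assigns treatment. None of the parameters come from the presentation.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

N = 20_000
quality = [random.random() for _ in range(N)]        # unobserved, 0 to 1
effect = [2.0 + 2.0 * q for q in quality]            # varies across firms
untreated_outcome = [5.0 + 7.0 * q for q in quality]

# Random assignment: a fair coin decides treatment, independent of quality.
treated = [random.random() < 0.5 for _ in range(N)]
outcome = [y0 + (e if t else 0.0)
           for y0, e, t in zip(untreated_outcome, effect, treated)]

ra_estimate = (mean(y for y, t in zip(outcome, treated) if t)
               - mean(y for y, t in zip(outcome, treated) if not t))
true_average_effect = mean(effect)

print(f"RA estimate: {ra_estimate:.2f}, true average effect: {true_average_effect:.2f}")
print(f"true effects range from {min(effect):.2f} to {max(effect):.2f}")
```

The difference in means lands close to the true average effect, but it is a single number: the spread of individual effects printed on the last line is invisible to the estimator.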

SLIDE 18

Example of RD Approach

“Reading First” was a billion-dollar program to introduce new pedagogy, new student evaluation measures, and specific teacher training methods to improve reading performance of 1st-3rd graders
Schools were chosen for the program using a ranking index based on poverty rates and fraction of students reading below grade level
Evaluation was carried out over three years in 248 schools, 125 of which were Reading First Schools

SLIDE 19

RD Analysis of Impact of Reading First

Source: Abt Associates, Reading First Final Report, 2008

SLIDE 20

Public Research Programmes

Need to track performance of unsuccessful applicants
Condition for eligibility to begin with?
System of identifiers combined with external data (StarMetrics approach)
Outputs and outcomes are hard to measure and subject to measurement response
Routine/ongoing rather than episodic

SLIDE 21

Concluding Thoughts

Combination of faith and hard-to-measure outcomes
Accept that some questions are not answerable:
Relative effectiveness across policies with incommensurable outcomes
Incremental versus marginal
GE effects
Perfect should not be the enemy of good
But a little knowledge is a dangerous thing
Long lags as an advantage?

SLIDE 22

Advert

Science and Innovation Policy for New Zealand
Motu Public Policy Seminar
Wednesday 04 September
Spectrum Theater, BP House
Veronica Jacobsen, Discussant