Performance Evaluation of Policies and Programmes



slide-1
SLIDE 1

Performance Evaluation of Policies and Programmes

Adam B. Jaffe
Director, Motu Economic and Public Policy Research
Treasury/MBIE Seminar
27 August 2013

slide-2
SLIDE 2

Background

  • Evidence-based policy, etc.
  • Skepticism?
slide-3
SLIDE 3

The Problem

  • A policy or a programme is like a new drug. We would like to know if it is effective, and how its effectiveness compares to alternatives.
  • With a drug, it is not enough that the patient gets better. With a policy, it is not enough that the policy goal is met.
  • Want to measure the treatment effect, i.e. how the state of the policy objectives compares to what it would have been without the policy.
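
In potential-outcomes notation (an assumption here; the slides do not use symbols), the treatment effect for a unit $i$ can be sketched as:

```latex
\tau_i = Y_i(1) - Y_i(0)
```

where $Y_i(1)$ is the outcome with the policy and $Y_i(0)$ is the outcome without it. Only one of the two is ever observed for a given unit, so the "but for" outcome has to be estimated.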

slide-4
SLIDE 4

We’d like to know…

  • Magnitude of impacts (“outputs” and “outcomes”)
  • Magnitude of impacts relative to resources required (cost-effectiveness)
  • Relative effectiveness of different instruments or approaches
  • Relative effectiveness in different contexts (conditional cost-effectiveness)

slide-5
SLIDE 5

Examples

  • Health service delivery modes
  • Scholarships
  • Tax subsidies
  • Regulations
  • Grant programs
  • ……
slide-6
SLIDE 6

Analytical Issues

  • Outputs and outcomes that are hard to measure
  • Long and/or uncertain lags between action and outcomes
  • Characterizing the unobserved “but for” world
  • Selection bias in programme participation
  • Others I will not say much about:
    • Incremental versus average impact
    • General equilibrium effects
slide-7
SLIDE 7

Thoughts on metrics

  • Quantify where possible, but…
  • Non-quantifiable doesn’t mean unimportant
  • Multiple metrics
  • Tradeoff between comparability and precision
  • Almost always proxy or indicator rather than “true” variable
  • Measurement (random) error
  • Behavioral changes in response to evaluation
  • Long/uncertain lags → ongoing evaluation
slide-8
SLIDE 8

Isolating the Treatment Effect

  • Typically, start by comparing performance of treated group before and after the treatment
  • Issues
    • Placebo effect
    • Regression to the mean
    • Sectoral trends
  • Compare change in treated group to change in “control group”
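
The comparison in the last bullet can be sketched as a small calculation. The numbers and the function name `did_estimate` are hypothetical and chosen only for illustration; this is a sketch under those assumptions, not the speaker's own method or data.

```python
from statistics import mean


def did_estimate(treated_before, treated_after, control_before, control_after):
    """Difference-in-difference: change in treated mean minus change in control mean."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical outcomes (e.g. sales growth, %) for illustration only.
treated_before = [10.0, 12.0, 14.0]
treated_after = [18.0, 20.0, 22.0]
control_before = [9.0, 11.0, 13.0]
control_after = [12.0, 14.0, 16.0]

# Naive before/after change for the treated group alone.
before_after = mean(treated_after) - mean(treated_before)

# Change in treated group net of the change in the control group.
did = did_estimate(treated_before, treated_after, control_before, control_after)
print(before_after, did)
```

In this made-up data the treated group improves by 8 points, but the control group also improves by 3 (for example, a sectoral trend), so the difference-in-difference estimate attributes only 5 points to the treatment.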

slide-9
SLIDE 9

“Difference in difference” approach

  • “Gold Standard” is DID with Random Assignment (“RA”) to treatment group and control group

slide-10
SLIDE 10

Hypothetical Comparison of Mean Sales Growth for Funded and Unfunded Firms Ignoring Selection Bias

[Figure: bar chart of sales growth for unfunded and funded firms; group means of 12.5 and 20.8; "Treatment Effect" = 8.3]
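
The arithmetic behind the figure, as a minimal sketch. It assumes that the larger mean (20.8) belongs to the funded firms and the smaller (12.5) to the unfunded firms, which the extracted slide text does not state explicitly.

```python
# Means read from the slide; which group each belongs to is an assumption.
mean_funded = 20.8
mean_unfunded = 12.5

# Naive "treatment effect": simple difference in group means, ignoring selection bias.
naive_effect = round(mean_funded - mean_unfunded, 1)
print(naive_effect)
```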

slide-11
SLIDE 11

Selection Bias

  • Frequently, government program provides assistance to some individuals or firms but not to others
  • Makes those not provided assistance a natural control group, but…
  • Programme targets are chosen on the basis of need (unemployed; under-achieving students), or expectation of success (scholarships; research grants)
  • Creates selection bias in difference-in-difference analysis

slide-12
SLIDE 12

Regression Discontinuity (“RD”) Approach to Selection Bias

  • Retain information on ranking used to select individuals or firms for participation in the program
  • Use this measure of qualification or need as regressor in explaining subsequent success of treated and untreated groups
  • Dummy variable for program participation then captures treatment effect after controlling for selection effect
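
The regression described in these bullets can be sketched as follows. Everything here is an assumption made for illustration: the noise-free data, the `scores` variable (a hypothetical qualification score where higher means a stronger applicant), the funding cutoff, and the helper `ols`, which solves the normal equations in plain Python. Real data would be noisy and would normally go through a statistics package.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical applicants: score 1..30, the top 15 are funded.
scores = list(range(1, 31))
funded = [1 if s >= 16 else 0 for s in scores]

# Made-up outcome: growth rises with the score (selection effect)
# and funded firms get an extra 3 points (true treatment effect).
growth = [5.0 + 0.4 * s + 3.0 * d for s, d in zip(scores, funded)]

# Regress growth on a constant, the selection score, and the funding dummy.
X = [[1.0, float(s), float(d)] for s, d in zip(scores, funded)]
intercept, slope, effect = ols(X, growth)

# Naive comparison of group means, which mixes selection and treatment.
naive = (sum(g for g, d in zip(growth, funded) if d) / 15
         - sum(g for g, d in zip(growth, funded) if not d) / 15)
print(round(naive, 2), round(effect, 2))
```

With this construction the naive difference in means is about 9, while the coefficient on the funding dummy recovers the assumed effect of 3, because the score term absorbs the part of the gap that comes from selection.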

slide-13
SLIDE 13

Hypothetical Comparison of Mean Sales Growth for Funded and Unfunded Firms Controlling for Selection Bias via Project Ranking at Application

[Figure: sales growth plotted against project ranking at application (30 to 1) for unfunded and funded firms; Treatment Effect = Regression Discontinuity = 3]

slide-14
SLIDE 14

Regression Discontinuity (“RD”) Approach to Selectivity Bias

  • Statistically controls for the source of non-random difference between the treated and untreated groups
  • Works for positive or negative selection effect
  • Requires retention of information about criteria for selection
  • Requires ability to measure success of both treated and untreated individuals/firms
  • Note: if the selection criteria are not, in fact, correlated with success, then slope will be zero but RD measure of treatment effect is still unbiased

slide-15
SLIDE 15

RD versus Random Assignment

  • Both approaches measure the average treatment effect for treated entities
  • If the treatment effect were uniform for all entities, then RD reproduces the result of random assignment
  • More likely, the magnitude of the treatment effect may be correlated with the selection measure
    • Most appropriate targets may get biggest boost; or
    • Decreasing returns may limit effect for most qualified
  • Has implications for potential expansion of program to previously untreated group
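
To see why a treatment effect that varies with the selection measure matters for expansion, here is a minimal sketch. The function `effect`, its parameters, and the score ranges are all hypothetical; it illustrates only the case where the most appropriate targets get the biggest boost.

```python
def effect(score, cutoff=15, base=2.0, gain=0.2):
    """Hypothetical treatment effect that rises with the selection score."""
    return base + gain * (score - cutoff)


treated_scores = range(16, 31)    # currently funded (hypothetical)
untreated_scores = range(1, 16)   # candidates for an expanded programme

# Average effect among those already treated versus those who would be added.
avg_treated = sum(effect(s) for s in treated_scores) / len(treated_scores)
avg_untreated = sum(effect(s) for s in untreated_scores) / len(untreated_scores)
print(round(avg_treated, 2), round(avg_untreated, 2))
```

Under these assumptions the average effect for the currently treated group is about 3.6, but the same programme extended to the previously untreated group would yield only about 0.6 on average, so the measured effect on the treated overstates the payoff from expansion. With decreasing returns (a negative `gain`) the ordering would reverse.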

slide-16
SLIDE 16

Hypothetical Comparison of Mean Sales Growth for Funded and Unfunded Firms Controlling for Selection Bias via Project Ranking at Application

[Figure: sales growth plotted against project ranking at application (30 to 1) for unfunded and funded firms; Treatment Effect = Regression Discontinuity = 3]

slide-17
SLIDE 17

RD versus RA

  • RA always produces unbiased estimate of average effect, but tells you nothing about the underlying variation in efficacy
  • Note that in social settings, neither typically deals with placebo effect
  • Both methods require tracking of untreated group; not clear which approach makes this easier

slide-18
SLIDE 18

Example of RD Approach

  • “Reading First” was a billion-dollar program to introduce new pedagogy, new student evaluation measures, and specific teacher training methods to improve reading performance of 1st-3rd graders
  • Schools were chosen for the program using a ranking index based on poverty rates and fraction of students reading below grade level
  • Evaluation was carried out over three years in 248 schools, 125 of which were Reading First Schools

slide-19
SLIDE 19

RD Analysis of Impact of Reading First

Source: Abt Associates, Reading First Final Report, 2008

slide-20
SLIDE 20

Public Research Programmes

  • Need to track performance of unsuccessful applicants
  • Condition for eligibility to begin with?
  • System of identifiers combined with external data: StarMetrics approach
  • Outputs and outcomes are hard to measure and subject to measurement response
  • Routine/ongoing rather than episodic
slide-21
SLIDE 21

Concluding Thoughts

  • Combination of faith and hard-to-measure outcomes
  • Accept that some questions are not answerable:
    • Relative effectiveness across policies with incommensurable outcomes
    • Incremental versus marginal
    • GE effects
  • Perfect should not be the enemy of good
  • But a little knowledge is a dangerous thing
  • Long lags as an advantage?
slide-22
SLIDE 22

Advert

Science and Innovation Policy for New Zealand
Motu Public Policy Seminar
Wednesday 04 September
Spectrum Theater, BP House
Veronica Jacobsen, Discussant