

SLIDE 1

MAPA Mapping

Scorecard Calibration using a Monotone Adjacent Pooling Algorithm

Raymond Anderson, Standard Bank Group, Johannesburg, South Africa

Presented at Edinburgh Credit Scoring and Control IX, September 7-9, 2005

SLIDE 2

Why Calibrate?

  • Consistent meaning
    – Across Scorecards
    – Over Time
    – Across Products
  • Good/Bad Definition
    – Bad = 60 days past due
    – Good = Current
    – Indeterminate = Mid Range
  • Traditional Retail Norm
    – Focus on ranking ability
    – Initially an Accept/Reject paradigm
  • Moving Forward
    – Focus on predictive accuracy
    – Pricing
    – Provisioning
    – Capital Adequacy

On What Measure?

  • Basel II
    – Bad = 90 days past due
    – Good = Not Bad
    – No Mid Range

NOTE: Basel is based upon corporate methodologies (DEFAULT / NOT DEFAULT).

SLIDE 3

Predictive Modeling Techniques

  • Linear Probability Modeling
    – Ranks well, but scores unreliable as estimates
  • Generalised Additive Non-parametric Regression
  • Logistic Regression
    – Reasonable estimates, if Basel definition used
  • Decision Trees
    – Use historical results directly

If the Basel definition is not used, or the probability estimates are unreliable, then mapping is necessary.

SLIDE 4

LPM Score Results

Score Distribution

[Chart: LPM score (200–1200) vs. score distribution and actual P(Not Default)]

Score by Natural Log Odds

[Chart: LPM score (880–1045) vs. actual Ln(Not Default/Default odds), with linear fit y = 0.0492x − 0.0779, R² = 0.5786]

Gini = 55.52%

SLIDE 5

Assumptions

  • Ranking Ability paramount!
  • Estimates necessary, but secondary

Possible Methodologies

  • Score banding / Risk Indicators
    – Use Historical Figures
    – Grouping Unscientific
  • Logit, with score as sole independent variable
    – Simple, but assumes linearity
  • Fitting of Lorenz curve (Glößner 2003)
    – Very complicated

SLIDE 6

MAPA Mapping - Process

I. Data Selection & Preparation
II. 1st Pass: MAPA Interpolation
III. 2nd Pass: Correct for Errors
IV. Create Mapping Table
V. Implement Mapping Table

SLIDE 7

Data Selection & Preparation

A. Apply MAPA to identify Pools B. Calculate Ln(odds) per Pool C. Interpolate High and Low Ln(Odds) for each Pool D. Interpolate Ln(Odds) for each Record A. Out of time/out of sample? B. Within Universe C. Rank by Score D. Set Target Variable

1st Pass: MAPA Interpolation

SLIDE 8

Pool Definition

If monotonic, then P(Good) increases with score. Use an iterative process: a) find the score with the lowest cumulative P(Good); b) set that score as the upper bound of the pool; c) clear and repeat with the remaining scores until all scores are pooled. (A sketch in code follows the charts below.)

Pool 1 = 53.97% @ 903; Pool 2 = 62.50% @ 911; Pool 3 = 62.79% @ 914; Pool 4 = 72.90% @ 928

[Chart A: Score lower range (880–929) vs. cumulative P(Good) (50%–95%), with counts and break scores marking Pools 1–4. Gini = 56.22%]

[Chart B: Score full range (913–1038) vs. natural log odds (0.0–8.0), showing break scores and average Ln(Odds)]
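A minimal Python sketch of the pooling step described above, assuming per-score (good, total) counts are available. This is an illustration rather than the author's implementation, and the counts in the usage line are made up:

```python
def mapa_pools(scores, counts):
    """Monotone Adjacent Pooling Algorithm: find pool boundaries.

    scores: score values in ascending order
    counts: matching list of (n_good, n_total) tuples per score
    """
    pools, start = [], 0
    while start < len(scores):
        goods = totals = 0
        best_rate, best_end = None, start
        # a) walk forward, tracking the cumulative P(Good) ...
        for i in range(start, len(scores)):
            g, t = counts[i]
            goods, totals = goods + g, totals + t
            rate = goods / totals
            # ... and remember where it is lowest (ties extend the pool)
            if best_rate is None or rate <= best_rate:
                best_rate, best_end = rate, i
        # b) close the pool at that score; c) repeat on the rest
        pools.append((scores[start], scores[best_end], best_rate))
        start = best_end + 1
    return pools

# Illustrative (made-up) counts: the dip at 903 is absorbed into Pool 1,
# so pool-average P(Good) comes out monotone non-decreasing.
print(mapa_pools([901, 903, 911], [(60, 100), (50, 100), (70, 100)]))
# -> [(901, 903, 0.55), (911, 911, 0.7)]
```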

SLIDE 9

Interpolation

Use the average P(Good) per pool [B] to interpolate Ln(Odds) for the break records [C], and use those to interpolate Ln(Odds) for all records [D]. Aggregate by score, and we have a smoothed score-to-Ln(Odds) mapping... but with errors. (A sketch in code follows the charts below.)

[Chart: Record number vs. natural log odds across the 33,440 records in total, showing average Ln(Odds) [B], interpolated break scores [C], and record Ln(Odds) [D]]

[Chart: 1st Pass Results — original score (880–1039) vs. natural log odds: actual, pool, and smoothed]
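A simplified sketch of the interpolation step, assuming linear interpolation of Ln(Odds) between the pool break scores from the previous slide; the record scores here are illustrative:

```python
import numpy as np

# Break scores and pool P(Good) values from the previous slide.
break_scores = np.array([903.0, 911.0, 914.0, 928.0])
pool_p_good = np.array([0.5397, 0.6250, 0.6279, 0.7290])
break_ln_odds = np.log(pool_p_good / (1.0 - pool_p_good))  # Ln(Good/Bad odds)

record_scores = np.array([905.0, 910.0, 920.0])            # illustrative
record_ln_odds = np.interp(record_scores, break_scores, break_ln_odds)
print(record_ln_odds)  # smoothed Ln(Odds), one value per record
```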

SLIDE 10

2nd Pass: Error Correction

The first pass leaves a net deficiency of 31.3 bads out of 1,990 (1.6%), distributed in the same fashion as the bads. The error is corrected by spreading it over the bads, assuming a normal distribution with Z-values from −3 to +3; the P(Good) estimates are adjusted downwards.

[Chart: Score Errors — score (880–1039) vs. error counts]

[Chart: Bad Distribution — actual vs. error]

[Chart: 2nd Pass Spread — record number vs. adjusted P(Good) and bad spread (normal, std dev = 3)]
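The slide does not spell out the exact spreading scheme, so the following is a hedged sketch of one reading: spread the deficiency over the bads with normal-density weights on Z in [−3, +3].

```python
import numpy as np

deficiency, n_bads = 31.3, 1990           # figures from the slide

z = np.linspace(-3.0, 3.0, n_bads)        # one Z-value per bad, in score order
weights = np.exp(-0.5 * z**2)             # unnormalised normal density
weights /= weights.sum()                  # normalise so the weights sum to 1

extra_bads = deficiency * weights         # deficiency spread across the bads
# The P(Good) estimate at each affected score is then adjusted downwards.
```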

SLIDE 11

Step 3: Mapping Table

[Chart: Original score (LPM, 850–1100) vs. new score (loglinear, 200–900)]

[Chart: Original score (880–1039) vs. natural log odds — actual, pool, and 2nd pass]

We now have the final P(Good), and can map Ln(Odds) onto new loglinear scores. The example at right has 32/1 odds at a baseline score of 500, doubling every fifty points.

$$S' = S_{BASE} + S_{INCR} \times \frac{\ln(Odds_S) - \ln(Odds_{BASE})}{\ln(Odds_{INCR})}$$

Gini = 55.51%
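As a sketch of this step, the mapping table can be built by applying the transform to each score's final odds. The parameters are the next slide's; `final_odds` is a hypothetical stand-in for the 2nd-pass score-to-odds output:

```python
import math

# Loglinear scale: base score 500 at 32/1 odds, doubling every 50 points.
S_BASE, ODDS_BASE, S_INCR, ODDS_INCR = 500, 32.0, 50, 2.0

final_odds = {880: 10.5, 989: 60.47, 1039: 150.0}  # illustrative values

mapping_table = {
    score: round(S_BASE + S_INCR * (math.log(odds) - math.log(ODDS_BASE))
                 / math.log(ODDS_INCR))
    for score, odds in final_odds.items()
}
print(mapping_table[989])  # 546, matching the worked example on the next slide
```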

SLIDE 12

Rescale

Base Score = 500; Base Odds = 32; Double every 50 points

$$Odds_{S=989} = 60.47$$

$$S' = S_{BASE} + S_{INCR} \times \frac{\ln(Odds_S) - \ln(Odds_{BASE})}{\ln(Odds_{INCR})}$$

$$S'_{989} = 500 + 50 \times \frac{\ln(60.47) - \ln(32)}{\ln(2)} = 545.9$$

Thus, a score of 989 maps to 546, which converts to a bad rate of 1.625%:

$$P(Bad) = \left(1 + \exp\left(\frac{(S' - S_{BASE}) \times \ln(Odds_{INCR})}{S_{INCR}} + \ln(Odds_{BASE})\right)\right)^{-1}$$

$$P(Bad \mid S' = 546) = \left(1 + \exp\left(\frac{(546 - 500) \times \ln(2)}{50} + \ln(32)\right)\right)^{-1} = 1.625\%$$
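A minimal sketch verifying the worked example with the slide's parameters:

```python
import math

# Loglinear rescale: base score 500 at 32/1 odds, doubling every 50 points.
S_BASE, ODDS_BASE = 500, 32.0
S_INCR, ODDS_INCR = 50, 2.0

def rescale(odds):
    """Map Good/Bad odds onto the loglinear score scale."""
    return S_BASE + S_INCR * (math.log(odds) - math.log(ODDS_BASE)) / math.log(ODDS_INCR)

def p_bad(new_score):
    """Convert a loglinear score back to a bad rate."""
    ln_odds = (new_score - S_BASE) * math.log(ODDS_INCR) / S_INCR + math.log(ODDS_BASE)
    return 1.0 / (1.0 + math.exp(ln_odds))

print(rescale(60.47))  # ~545.9: the worked example's score of 989 maps to 546
print(p_bad(546))      # ~0.01625: a 1.625% bad rate
```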

SLIDE 13

Conclusions

  • Historical focus on ranking ability (power)
  • Need reasonable estimates (accuracy)
  • Problems arise where the scorecard-build and required definitions differ, where estimates are unreliable, or where there are significant changes to the business environment.
  • Requirement:
    – Business issues drive scorecard development
    – Apply transformations to obtain PD estimates for Basel II

SLIDE 14

Conclusion cont’d

ADVANTAGES

  • Conceptually Simple
  • Non-Linear
  • No Power Loss
  • Handles any Binary Transformation
  • Allows updates using latest performance
    – Historical (Detailed)
    – Informed (Constant)

ISSUES

  • Always Backward Looking!!!
  • Requires Mapping Table
  • Small Numbers? Bias?
  • Raw scores still needed
    – Scorecard Monitoring
    – Strategy???
  • Endpoint Treatment?
  • Other variations may provide improvements