MAPA Mapping
Scorecard Calibration using a Monotone Adjacent Pooling Algorithm
Raymond Anderson
Standard Bank Group, Johannesburg, South Africa
Presented at Edinburgh Credit Scoring and Control IX September 7-9, 2005
Why Calibrate?
– Across Scorecards
– Over Time
– Across Products

– Bad = 60 days past due
– Good = Current
– Indeterminate = Mid Range

– Focus on predictive accuracy
– Pricing
– Provisioning
– Capital Adequacy

– Focus on ranking ability
– Initially an Accept/Reject paradigm

NOTE: Basel is based upon corporate methodologies: DEFAULT / NOT DEFAULT.

– Bad = 90 days past due
– Good = Not Bad
– No Mid Range
[Figure: Score Distribution. Labels: LPM Score, Score Distribution, Actual P(Not Default).]
[Figure: Score by Natural Log Odds. LPM Score against Ln(Odds); series: Actual Ln(Not Default/Default Odds) with a linear fit, y = 0.0492x - 0.0779, R² = 0.5786.]
A. Apply MAPA to identify Pools
B. Calculate Ln(Odds) per Pool
C. Interpolate High and Low Ln(Odds) for each Pool
D. Interpolate Ln(Odds) for each Record

A. Out of time/out of sample?
B. Within Universe
C. Rank by Score
D. Set Target Variable
If the relationship is monotonic, then P(Good) increases with score. Use an iterative process: a) find the score with the lowest cumulative P(Good); b) set that score as the upper bound of the pool; c) clear and repeat with the remaining scores until all scores are pooled.
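The iterative process above can be sketched in a few lines of Python. This is a minimal illustration rather than the presenter's implementation: the function name `mapa_pools`, the per-score input lists, and the tie-breaking rule (on equal cumulative rates, take the higher score) are all assumptions made for the example.

```python
def mapa_pools(scores, goods, totals):
    """Pool adjacent scores so that pooled P(Good) rises with score.

    scores: distinct scores in ascending order
    goods:  number of goods at each score
    totals: number of records at each score (each > 0)
    Returns a list of (upper_bound_score, pooled_p_good), lowest pool first.
    """
    pools = []
    start = 0
    n = len(scores)
    while start < n:
        cum_good = 0
        cum_total = 0
        best_idx = start
        best_rate = None
        # a) find the score with the lowest cumulative P(Good),
        #    accumulating from the first score not yet pooled
        for i in range(start, n):
            cum_good += goods[i]
            cum_total += totals[i]
            rate = cum_good / cum_total
            # "<=" is an assumed tie-break: prefer the higher score on ties
            if best_rate is None or rate <= best_rate:
                best_rate, best_idx = rate, i
        # b) that score becomes the upper bound of the pool
        pools.append((scores[best_idx], best_rate))
        # c) clear and repeat with the remaining scores
        start = best_idx + 1
    return pools


# Made-up illustrative counts (not the data from the presentation)
example = mapa_pools(
    scores=[880, 890, 900, 910, 920],
    goods=[6, 4, 7, 6, 9],
    totals=[10, 10, 10, 10, 10],
)
```

With these made-up counts the raw P(Good) per score is not monotone (0.6, 0.4, 0.7, 0.6, 0.9), and pooling merges the reversals into three pools with upper bounds 890, 910 and 920.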
Pool 1 = 53.97% @ 903; Pool 2 = 62.50% @ 911; Pool 3 = 62.79% @ 914; Pool 4 = 72.90% @ 928
[Figure: P(Good) by score, lower range; series: BreakScores, Pool 1 to Pool 4.]
[Figure: Natural Log Odds by score, full range; series: BreakScores, Average Ln(Odds).]
Use the average P(Good) per pool [B] to interpolate Ln(Odds) for the break records [C], then use those to interpolate Ln(Odds) for all records [D]. Aggregating by score gives a smoothed score-to-Ln(Odds) mapping, but with errors.
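One way to read steps B to D is as a piecewise-linear interpolation over record numbers. The sketch below is a guess at the mechanics, not the presenter's method: it assumes records are sorted by score, that each pool's average Ln(Odds) sits at the pool's midpoint record, and that values are held flat below the first midpoint and above the last. The name `interpolate_record_ln_odds` is hypothetical.

```python
import math


def ln_odds(p_good):
    """Natural log of the good/bad odds for a given P(Good)."""
    return math.log(p_good / (1.0 - p_good))


def interpolate_record_ln_odds(pool_sizes, pool_p_good):
    """Return one interpolated Ln(Odds) per record, lowest score first.

    pool_sizes:  number of records in each pool, lowest pool first
    pool_p_good: average P(Good) of each pool (strictly between 0 and 1)
    """
    # B) anchor each pool's average Ln(Odds) at its midpoint record (assumption)
    xs, ys = [], []
    first = 0
    for size, p in zip(pool_sizes, pool_p_good):
        xs.append(first + (size - 1) / 2.0)
        ys.append(ln_odds(p))
        first += size

    # C/D) linear interpolation between neighbouring anchors, record by record
    out = []
    k = 0
    for r in range(first):
        if r <= xs[0]:
            out.append(ys[0])    # assumed flat below the first midpoint
        elif r >= xs[-1]:
            out.append(ys[-1])   # assumed flat above the last midpoint
        else:
            while xs[k + 1] < r:
                k += 1
            w = (r - xs[k]) / (xs[k + 1] - xs[k])
            out.append(ys[k] + w * (ys[k + 1] - ys[k]))
    return out


# Made-up example: two pools of four records each
record_ln_odds = interpolate_record_ln_odds([4, 4], [0.5, 0.8])
```

Averaging the per-record values by score would then give the smoothed mapping; as the slide notes, the resulting estimates need not reproduce the observed number of bads exactly.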
[Figure: Natural Log Odds by record number, with steps B, C and D marked; series: Average Ln(Odds), Interpolated BreakScores, Record Ln(Odds). 33,440 records total.]
[Figure: Natural Log Odds by original score; series: Actual, Pool, Smoothed.]
The smoothed estimates leave a net deficiency of 31.3 bads out of 1,990 (1.6%), distributed in the same fashion as the bads. The error is corrected by spreading it over the bads, assuming a normal distribution with Z-values from –3 to +3. The P(Good) estimate is adjusted downwards.
[Figure: Score Errors by score; series: Bad Distribution, Actual Error.]
[Figure: Adjusted P(Good) and Bad Spread by record number; 2nd Pass Spread, Normal Std Dev = 3.]
[Figure: Original Score (LPM) against New Score (LogLinear).]
[Figure: Natural Log Odds by original score; series: Actual, Pool, 2nd Pass.]
We now have the final P(Good), and can map Ln(Odds) onto new log-linear scores. The example at right has 32/1 odds at a baseline score of 500, doubling every fifty points.
$$S' = S_{BASE} + S_{INCR} \times \frac{\ln(Odds_{S}) - \ln(Odds_{BASE})}{\ln(Odds_{INCR})}$$

$$Odds_{S=989} = 60.47$$

$$S'_{989} = 500 + 50 \times \frac{\ln 60.47 - \ln 32}{\ln 2} = 545.9$$
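The mapping from odds to the new log-linear score can be checked numerically. The sketch below uses the slide's example parameters (32/1 odds at 500, doubling every 50 points) as defaults; the function and parameter names are illustrative only.

```python
import math


def loglinear_score(odds, base_score=500.0, base_odds=32.0,
                    incr_score=50.0, incr_odds=2.0):
    """Map good/bad odds onto a log-linear score.

    Defaults follow the slide's example: odds of 32/1 at a score of 500,
    with the odds doubling every 50 points.
    """
    return base_score + incr_score * (
        (math.log(odds) - math.log(base_odds)) / math.log(incr_odds)
    )


# Odds of 60.47 at original score 989, as in the worked example
new_score = loglinear_score(60.47)
print(round(new_score, 1))
```

By construction, odds of 32 map to 500 and odds of 64 map to 550.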
$$P(Bad \mid S') = \left(1 + \exp\left((S' - S_{BASE}) \times \frac{\ln(Odds_{INCR})}{S_{INCR}} + \ln(Odds_{BASE})\right)\right)^{-1}$$

$$P(Bad \mid S' = 546) = \left(1 + \exp\left((546 - 500) \times \frac{\ln 2}{50} + \ln 32\right)\right)^{-1} = 1.625\%$$
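The reverse calculation, from a log-linear score back to P(Bad), can be sketched the same way. Again the defaults are the slide's example parameters, and the names are illustrative.

```python
import math


def p_bad(score, base_score=500.0, base_odds=32.0,
          incr_score=50.0, incr_odds=2.0):
    """Probability of bad implied by a log-linear score.

    Recovers Ln(Odds) from the score, then converts good/bad odds
    to P(Bad) = 1 / (1 + odds).
    """
    ln_odds = ((score - base_score) * math.log(incr_odds) / incr_score
               + math.log(base_odds))
    return 1.0 / (1.0 + math.exp(ln_odds))


# Worked example: a new score of 546
print(f"{p_bad(546):.3%}")
```

At the baseline score of 500 the odds are 32/1, so P(Bad) is 1/33, roughly 3.03%.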
– Historical (Detailed)
– Informed (Constant)