Model evaluation and implementation

Model evaluation and implementation CREDIT RISK MODEL - PowerPoint PPT Presentation



  1. Model evaluation and implementation
     CREDIT RISK MODELING IN PYTHON
     Michael Crabtree, Data Scientist, Ford Motor Company

  2. Comparing classification reports
     - Create the reports with classification_report() and compare
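
A minimal sketch of such a comparison, on synthetic data. The dataset, the two models, and names such as `reports` are assumptions for illustration; scikit-learn's `GradientBoostingClassifier` stands in here for whichever gradient boosted tree library is actually used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for credit data (about 20% defaults)
X, y = make_classification(n_samples=2000, n_features=8,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Two candidate models (illustrative choices)
models = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Gradient boosted trees": GradientBoostingClassifier(random_state=0),
}

target_names = ["Non-Default", "Default"]
reports = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    # Text report for reading, dict report for programmatic comparison
    print(name)
    print(classification_report(y_test, preds, target_names=target_names))
    reports[name] = classification_report(
        y_test, preds, target_names=target_names, output_dict=True)

# Compare default recall side by side
for name, report in reports.items():
    print(name, "default recall:", round(report["Default"]["recall"], 3))
```

Default recall is usually the number to look at first, since a missed default is the costly error.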

  3. ROC and AUC analysis
     - Models with better performance will have more lift
     - More lift means the AUC score is higher
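
The relationship between lift and AUC can be sketched with two sets of scores on synthetic labels. The score distributions and variable names below are assumptions for illustration, not course data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(42)
y_true = rng.integers(0, 2, size=1000)  # 1 = default, 0 = non-default

# Informative scores: defaults are shifted toward higher probabilities
strong_scores = np.clip(0.5 * y_true + rng.normal(0.25, 0.2, size=1000), 0, 1)
# Uninformative scores: pure random guessing
weak_scores = rng.uniform(0, 1, size=1000)

auc_strong = roc_auc_score(y_true, strong_scores)
auc_weak = roc_auc_score(y_true, weak_scores)

# Points of the ROC curve, ready to plot as fpr (x) against tpr (y)
fpr, tpr, thresholds = roc_curve(y_true, strong_scores)

print("AUC, informative model:", round(auc_strong, 3))
print("AUC, random guessing:  ", round(auc_weak, 3))
```

The informative scores bow the ROC curve away from the diagonal, and that extra area is what the higher AUC measures.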

  4. Model calibration
     - We want our probabilities of default to accurately represent the model's confidence level
     - The probability of default has a degree of uncertainty in its predictions
     - A sample of loans and their predicted probabilities of default should be close to the percentage of defaults in that sample

     Sample of loans   Average predicted PD   Sample percentage of actual defaults   Calibrated?
     10                0.12                   0.12                                   Yes
     10                0.25                   0.65                                   No

     1 http://datascienceassn.org/sites/default/files/Predicting%20good%20probabilities%20with%20supervised%20le

  5. Calculating calibration
     - Shows percentage of true defaults for each predicted probability
     - Essentially a line plot of the results of calibration_curve()

         from sklearn.calibration import calibration_curve
         calibration_curve(y_test, probabilities_of_default, n_bins = 5)

         # Fraction of positives
         (array([0.09602649, 0.19521012, 0.62035996, 0.67361111]),
         # Average probability
          array([0.09543535, 0.29196742, 0.46898465, 0.65512207]))

  6. Plotting calibration curves

         plt.plot(mean_predicted_value, fraction_of_positives,
                  label="%s" % "Example Model")
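
A self-contained sketch of the full plot, assuming synthetic predictions that are well calibrated by construction. The data, the `Agg` backend choice, and the axis labels are assumptions; only `calibration_curve()` and `plt.plot()` come from the slides.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(0)
# Synthetic predicted probabilities of default
probabilities_of_default = rng.uniform(0, 1, size=5000)
# Outcomes drawn so the true default rate matches each predicted probability
y_test = (rng.uniform(0, 1, size=5000) < probabilities_of_default).astype(int)

# calibration_curve returns (fraction of positives, mean predicted value)
fraction_of_positives, mean_predicted_value = calibration_curve(
    y_test, probabilities_of_default, n_bins=5)

fig, ax = plt.subplots()
# Reference diagonal: predicted probability equals observed default rate
ax.plot([0, 1], [0, 1], "k:", label="Perfectly calibrated")
ax.plot(mean_predicted_value, fraction_of_positives, "s-",
        label="%s" % "Example Model")
ax.set_xlabel("Average predicted probability")
ax.set_ylabel("Fraction of positives")
ax.legend()
```

Points above the diagonal mean the model under-predicts default in that bin; points below mean it over-predicts.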

  7. Checking calibration curves
     - As an example, two events selected (above and below perfect line)

  8. Calibration curve interpretation

  9. Calibration curve interpretation

  10. Let's practice!

  11. Credit acceptance rates

  12. Thresholds and loan status
      - Previously we set a threshold for a range of prob_default values
      - This was used to change the predicted loan_status of the loan

          preds_df['loan_status'] = preds_df['prob_default'].apply(lambda x: 1 if x > 0.4 else 0)

      Loan   prob_default   threshold   loan_status
      1      0.25           0.4         0
      2      0.42           0.4         1
      3      0.75           0.4         1

  13. Thresholds and acceptance rate
      - Use model predictions to set better thresholds
      - Can also be used to approve or deny new loans
      - For all new loans, we want to deny probable defaults
      - Use the test data as an example of new loans
      - Acceptance rate: what percentage of new loans are accepted to keep the number of defaults in a portfolio low
      - Accepted loans which are defaults have an impact similar to false negatives

  14. Understanding acceptance rate
      - Example: Accept 85% of loans with the lowest prob_default

  15. Calculating the threshold
      - Calculate the threshold value for an 85% acceptance rate

          import numpy as np
          # Compute the threshold for 85% acceptance rate
          threshold = np.quantile(prob_default, 0.85)

          0.804

      Loan   prob_default   Threshold   Predicted loan_status   Accept or Reject
      1      0.65           0.804       0                       Accept
      2      0.85           0.804       1                       Reject

  16. Implementing the calculated threshold
      - Reassign loan_status values using the new threshold

          # Reassign loan_status using the threshold computed from the quantile
          preds_df['loan_status'] = preds_df['prob_default'].apply(lambda x: 1 if x > 0.804 else 0)
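
The two steps above can be sketched end to end on synthetic probabilities. The beta-distributed `prob_default` values and the `accept_rate` and `accepted_share` names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Synthetic predicted probabilities of default for 1,000 new loans
preds_df = pd.DataFrame({"prob_default": rng.beta(2, 5, size=1000)})

# The threshold is the quantile of prob_default at the acceptance rate
accept_rate = 0.85
threshold = np.quantile(preds_df["prob_default"], accept_rate)

# Loans above the threshold are predicted defaults (1) and are rejected
preds_df["loan_status"] = (preds_df["prob_default"] > threshold).astype(int)

# Share of loans that stay accepted (predicted non-default)
accepted_share = (preds_df["loan_status"] == 0).mean()
print("Threshold:", round(float(threshold), 3))
print("Accepted share:", round(float(accepted_share), 3))
```

Computing the threshold from the data rather than hard-coding it keeps the accepted share tied to the chosen rate when the predictions change.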

  17. Bad rate
      - Even with a calculated threshold, some of the accepted loans will be defaults
      - These are loans with prob_default values around where our model is not well calibrated

  18. Bad rate calculation

          # Calculate the bad rate
          np.sum(accepted_loans['true_loan_status']) / accepted_loans['true_loan_status'].count()

      - If non-default is 0, and default is 1 then the sum() is the count of defaults
      - The .count() of a single column is the same as the row count for the data frame
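
A small worked example of the same calculation, using six hypothetical loans and a threshold of 0.4. All values are made up for illustration.

```python
import pandas as pd

# Six hypothetical loans with predicted probability and true outcome
test_pred_df = pd.DataFrame({
    "prob_default": [0.05, 0.10, 0.20, 0.35, 0.50, 0.90],
    "true_loan_status": [0, 0, 1, 0, 1, 1],
})

# Predict default (1) for loans above the threshold
threshold = 0.4
test_pred_df["pred_loan_status"] = (
    test_pred_df["prob_default"] > threshold).astype(int)

# Accepted loans are the predicted non-defaults
accepted_loans = test_pred_df[test_pred_df["pred_loan_status"] == 0]

# Bad rate: true defaults among accepted loans / number of accepted loans
bad_rate = (accepted_loans["true_loan_status"].sum()
            / accepted_loans["true_loan_status"].count())
print("Accepted loans:", len(accepted_loans))
print("Bad rate:", bad_rate)
```

Four loans fall at or below the threshold and one of them truly defaulted, so the bad rate is one in four.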

  19. Let's practice!

  20. Credit strategy and minimum expected loss

  21. Selecting acceptance rates
      - First acceptance rate was set to 85%, but other rates might be selected as well
      - Two options to test different rates:
          - Calculate the threshold, bad rate, and losses manually
          - Automatically create a table of these values and select an acceptance rate
      - The table of all the possible values is called a strategy table

  22. Setting up the strategy table
      - Set up arrays or lists to store each value

          # Set all the acceptance rates to test
          accept_rates = [1.0, 0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55,
                          0.5, 0.45, 0.4, 0.35, 0.3, 0.25, 0.2, 0.15, 0.1, 0.05]
          # Create lists to store thresholds and bad rates
          thresholds = []
          bad_rates = []

  23. Calculating the table values
      - Calculate the threshold and bad rate for all acceptance rates

          for rate in accept_rates:
              # Calculate the threshold for this acceptance rate
              threshold = np.quantile(test_pred_df['prob_default'], rate).round(3)
              # Store threshold value in a list
              thresholds.append(threshold)
              # Apply the threshold to reassign loan_status
              test_pred_df['pred_loan_status'] = \
                  test_pred_df['prob_default'].apply(lambda x: 1 if x > threshold else 0)
              # Create accepted loans set of predicted non-defaults
              accepted_loans = test_pred_df[test_pred_df['pred_loan_status'] == 0]
              # Calculate and store bad rate
              bad_rates.append((np.sum(accepted_loans['true_loan_status'])
                                / accepted_loans['true_loan_status'].count()).round(3))

  24. Strategy table interpretation

          strat_df = pd.DataFrame(zip(accept_rates, thresholds, bad_rates),
                                  columns = ['Acceptance Rate','Threshold','Bad Rate'])
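
A compact, runnable sketch of building a strategy table, assuming synthetic predictions in place of real model output. The data, the shortened list of acceptance rates, and the `rows` helper list are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n = 2000
# Synthetic predictions; outcomes are drawn to match them (calibrated by design)
prob_default = rng.beta(2, 6, size=n)
test_pred_df = pd.DataFrame({
    "prob_default": prob_default,
    "true_loan_status": (rng.uniform(size=n) < prob_default).astype(int),
})

accept_rates = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]
rows = []
for rate in accept_rates:
    # Threshold is the quantile of prob_default at this acceptance rate
    threshold = np.quantile(test_pred_df["prob_default"], rate)
    # Accepted loans are those not above the threshold
    accepted_loans = test_pred_df[test_pred_df["prob_default"] <= threshold]
    # With 0/1 labels, the mean of the true status is the bad rate
    bad_rate = accepted_loans["true_loan_status"].mean()
    rows.append((rate, round(float(threshold), 3), round(float(bad_rate), 3)))

strat_df = pd.DataFrame(
    rows, columns=["Acceptance Rate", "Threshold", "Bad Rate"])
print(strat_df)
```

Reading down the table, a lower acceptance rate gives a lower threshold, and on these synthetic predictions a lower bad rate as well; the trade-off is fewer accepted loans.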

  25. Adding accepted loans
      - The number of loans accepted for each acceptance rate
      - Can use len() or .count()

  26. Adding average loan amount
      - Average loan_amnt from the test set data

  27. Estimating portfolio value
      - Average value of accepted loan non-defaults minus average value of accepted defaults
      - Assumes each default is a loss of the loan_amnt
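
One way the last three columns might be added to a strategy table is sketched below. The three-row table, `n_test_loans`, `avg_loan_amnt`, and the column names are hypothetical; in practice the average would come from the test set's `loan_amnt` column.

```python
import pandas as pd

# Hypothetical strategy table with three acceptance rates
strat_df = pd.DataFrame({
    "Acceptance Rate": [1.0, 0.85, 0.7],
    "Bad Rate": [0.20, 0.10, 0.05],
})

n_test_loans = 1000      # hypothetical size of the test set
avg_loan_amnt = 9500.0   # hypothetical average loan_amnt in the test set

# Number of loans accepted at each acceptance rate
strat_df["Num Accepted Loans"] = (
    strat_df["Acceptance Rate"] * n_test_loans).round().astype(int)
strat_df["Avg Loan Amnt"] = avg_loan_amnt

# Value of accepted non-defaults minus value lost on accepted defaults,
# assuming each default loses the full average loan amount
good_value = (strat_df["Num Accepted Loans"] * (1 - strat_df["Bad Rate"])
              * strat_df["Avg Loan Amnt"])
bad_value = (strat_df["Num Accepted Loans"] * strat_df["Bad Rate"]
             * strat_df["Avg Loan Amnt"])
strat_df["Estimated Value"] = good_value - bad_value

best = strat_df.loc[strat_df["Estimated Value"].idxmax()]
print(strat_df)
print("Best acceptance rate:", best["Acceptance Rate"])
```

With these made-up numbers the middle rate has the highest estimated value: accepting everything takes on too many defaults, while accepting too few loans gives up good business.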

  28. Total expected loss
      - How much we expect to lose on the defaults in our portfolio

          # Probability of default (PD)
          test_pred_df['prob_default']
          # Exposure at default = loan amount (EAD)
          test_pred_df['loan_amnt']
          # Loss given default = 1.0 for total loss (LGD)
          test_pred_df['loss_given_default']
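
Combining the three columns, total expected loss is the sum over loans of PD x EAD x LGD. A minimal sketch with three hypothetical loans follows; the values and the `expected_loss` column name are assumptions.

```python
import pandas as pd

# Three hypothetical loans
test_pred_df = pd.DataFrame({
    "prob_default": [0.05, 0.20, 0.60],        # PD
    "loan_amnt": [10000, 5000, 2000],          # EAD, taken as the loan amount
    "loss_given_default": [1.0, 1.0, 1.0],     # LGD, 1.0 = total loss
})

# Expected loss per loan: PD * EAD * LGD
test_pred_df["expected_loss"] = (
    test_pred_df["prob_default"]
    * test_pred_df["loan_amnt"]
    * test_pred_df["loss_given_default"])

# Total expected loss across the portfolio
total_expected_loss = test_pred_df["expected_loss"].sum()
print("Total expected loss:", round(float(total_expected_loss), 2))
```

Here the smallest loan contributes the most expected loss because its probability of default is highest.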

  29. Let's practice!

  30. Course wrap up

  31. Your journey... so far
      - Prepare credit data for machine learning models
      - Important to understand the data
      - Improving the data allows for high performing simple models
      - Develop, score, and understand logistic regressions and gradient boosted trees
      - Analyze the performance of models by changing the data
      - Understand the financial impact of results
      - Implement the model with an understanding of strategy

  32. Risk modeling techniques
      The models and framework in this course:
      - Discrete-time hazard model (point in time): the probability of default is a point-in-time event
      - Structural model framework: the model explains the default event based on other factors
      Other techniques:
      - Through-the-cycle model (continuous time): macro-economic conditions and other effects are used, but the risk is seen as an independent event
      - Reduced-form model framework: a statistical approach estimating probability of default as an independent Poisson-based event

  33. Choosing models
      - Many machine learning models available, but logistic regression and tree models were used
      - These models are simple and explainable
      - Their performance on probabilities is acceptable
      - Many financial sectors prefer model interpretability
      - Complex or "black-box" models are a risk because the business cannot explain their decisions fully
      - Deep neural networks are often too complex

  34. Tips from me to you
      - Focus on the data
      - Gather as much data as possible
      - Use many different techniques to prepare and enhance the data
      - Learn about the business
      - Increase value through data
      - Model complexity can be a two-edged sword
      - Really complex models may perform well, but are seen as a "black-box"
      - In many cases, business users will not accept a model they cannot understand
      - Complex models can be very large and difficult to put into production

  35. Thank you!
