Understanding credit risk: Credit Risk Modeling in Python

  1. Understanding credit risk
     Credit Risk Modeling in Python
     Michael Crabtree, Data Scientist, Ford Motor Company

  2. What is credit risk?
     - The possibility that someone who has borrowed money will not repay it all
     - Calculated risk difference between lending someone money and a government bond
     - When someone fails to repay a loan, it is said to be in default
     - The likelihood that someone will default on a loan is the probability of default (PD)

  3. What is credit risk? (continued)

     Payment   Payment Date   Loan Status
     -------   ------------   -----------
     $100      Jun 15         Non-Default
     $100      Jul 15         Non-Default
     $0        Aug 15         Default

  4. Expected loss
     - The dollar amount the firm loses as a result of loan default
     - Three primary components:
         - Probability of Default (PD)
         - Exposure at Default (EAD)
         - Loss Given Default (LGD)
     - Formula for expected loss:
         expected_loss = PD * EAD * LGD
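
To make the formula concrete, here is a minimal sketch that evaluates it for a single loan. The function name, its parameter names, and the example figures (5% PD, $10,000 EAD, 40% LGD) are illustrative assumptions, not values from the slides.

```python
def expected_loss(prob_default, exposure, loss_given_default):
    """Return PD * EAD * LGD for a single loan."""
    # PD and LGD are treated as fractions between 0 and 1 (an assumption of this sketch)
    if not 0.0 <= prob_default <= 1.0:
        raise ValueError("prob_default must be between 0 and 1")
    if not 0.0 <= loss_given_default <= 1.0:
        raise ValueError("loss_given_default must be between 0 and 1")
    return prob_default * exposure * loss_given_default


# Hypothetical loan: 5% chance of default, $10,000 outstanding, 40% of it lost on default
loss = expected_loss(prob_default=0.05, exposure=10_000, loss_given_default=0.40)
print(round(loss, 2))
```

Because the three factors multiply, halving any one of them halves the expected loss.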

  5. Types of data used
     Two primary types of data used:
     - Application data
     - Behavioral data

     Application      Behavioral
     -----------      ----------
     Interest Rate    Employment Length
     Grade            Historical Default
     Amount           Income

  6. Data columns
     - Mix of behavioral and application
     - Contain columns simulating credit bureau data

     Column               Column
     ------               ------
     Income               Loan grade
     Age                  Loan amount
     Home ownership       Interest rate
     Employment length    Loan status
     Loan intent          Historical default
     Percent Income       Credit history length

  7. Exploring with cross tables

         pd.crosstab(cr_loan['person_home_ownership'], cr_loan['loan_status'],
                     values=cr_loan['loan_int_rate'], aggfunc='mean').round(2)
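
The call above can be tried on a small frame to see what it returns. This is a sketch only: the `cr_loan` built here is a tiny synthetic stand-in with made-up values, and the result name `rate_by_group` is hypothetical.

```python
import pandas as pd

# Synthetic stand-in for cr_loan; values are invented for illustration
cr_loan = pd.DataFrame({
    "person_home_ownership": ["RENT", "RENT", "RENT", "OWN", "OWN", "MORTGAGE", "MORTGAGE"],
    "loan_status": [0, 1, 1, 0, 1, 0, 1],
    "loan_int_rate": [10.0, 15.0, 17.0, 8.0, 12.0, 9.0, 14.0],
})

# Rows: home ownership; columns: loan status; cells: mean interest rate
rate_by_group = pd.crosstab(
    cr_loan["person_home_ownership"],
    cr_loan["loan_status"],
    values=cr_loan["loan_int_rate"],
    aggfunc="mean",
).round(2)

print(rate_by_group)
```

Each cell is the average interest rate for one combination of ownership type and loan status, so a cell that looks far out of line with its neighbors is worth a closer look.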

  8. Exploring with visuals

         plt.scatter(cr_loan['person_income'], cr_loan['loan_int_rate'], c='blue', alpha=0.5)
         plt.xlabel("Personal Income")
         plt.ylabel("Loan Interest Rate")
         plt.show()

  9. Let's practice!

  10. Outliers in Credit Data

  11. Data processing
      - Prepared data allows models to train faster
      - Often positively impacts model performance

  12. Outliers and performance
      Possible causes of outliers:
      - Problems with data entry systems (human error)
      - Issues with data ingestion tools

  13. Outliers and performance (continued)

      Feature              Coefficient With Outliers    Coefficient Without Outliers
      -------              -------------------------    ----------------------------
      Interest Rate        0.2                          0.01
      Employment Length    0.5                          0.6
      Income               0.6                          0.75
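
The table shows fitted coefficients shifting once outliers are removed. The slides do not say which model produced those numbers, so the sketch below swaps in an ordinary least-squares line fit with NumPy on synthetic data, just to illustrate how one bad record can move a coefficient.

```python
import numpy as np

# Synthetic data where y is roughly 2 * x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8, 12.1])

# Fit a straight line; polyfit returns (slope, intercept) for degree 1
slope_clean, _ = np.polyfit(x, y, 1)

# Append one implausible record, such as a data-entry error
x_out = np.append(x, 7.0)
y_out = np.append(y, 100.0)
slope_outlier, _ = np.polyfit(x_out, y_out, 1)

print(f"slope without outlier: {slope_clean:.2f}")
print(f"slope with outlier:    {slope_outlier:.2f}")
```

A single extreme point is enough to pull the slope far away from the value the remaining six points agree on.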

  14. Detecting outliers with cross tables
      Use cross tables with aggregate functions

          pd.crosstab(cr_loan['person_home_ownership'], cr_loan['loan_status'],
                      values=cr_loan['loan_int_rate'], aggfunc='mean').round(2)

  15. Detecting outliers visually
      - Histograms
      - Scatter plots
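
Both plot types can be sketched with matplotlib. The data below is synthetic, with one implausible age and one implausible employment length planted on purpose. The `Agg` backend is an assumption made so the snippet runs without a display; in a notebook you would typically drop that line and call `plt.show()`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic stand-in for cr_loan with two planted outliers (age 144, employment length 123)
cr_loan = pd.DataFrame({
    "person_age": [22, 25, 31, 40, 28, 35, 144],
    "person_emp_length": [1.0, 3.0, 6.0, 12.0, 4.0, 123.0, 5.0],
})

fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: an isolated bar far to the right hints at an outlier
counts, bin_edges, _ = ax_hist.hist(cr_loan["person_emp_length"], bins=10, color="blue")
ax_hist.set_xlabel("Employment Length")
ax_hist.set_ylabel("Count")

# Scatter plot: outliers sit far away from the main cluster
ax_scatter.scatter(cr_loan["person_age"], cr_loan["person_emp_length"], c="blue", alpha=0.5)
ax_scatter.set_xlabel("Person Age")
ax_scatter.set_ylabel("Employment Length")

fig.tight_layout()
```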

  16. Removing outliers
      Use the .drop() method within Pandas

          indices = cr_loan[cr_loan['person_emp_length'] >= 60].index
          cr_loan.drop(indices, inplace=True)
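
A small sketch of the same pattern on synthetic data follows. It returns a new frame instead of using `inplace=True`, which is a stylistic choice of this example and not something the slides prescribe. It also shows one subtlety: dropping by index keeps rows whose employment length is missing, whereas filtering with `< 60` silently discards them. The names `cr_clean`, `cr_mask`, and `cr_naive` are hypothetical.

```python
import pandas as pd

# Synthetic stand-in for cr_loan; None becomes NaN in the float column
cr_loan = pd.DataFrame({
    "person_emp_length": [2.0, 5.0, 123.0, None, 60.0],
    "loan_int_rate": [11.0, 9.5, 13.2, 10.1, 12.4],
})

# Same logic as the slide: collect the index labels of the outliers, then drop them
indices = cr_loan[cr_loan["person_emp_length"] >= 60].index
cr_clean = cr_loan.drop(indices)

# Equivalent boolean-mask form; NaN >= 60 is False, so the NaN row is kept
cr_mask = cr_loan[~(cr_loan["person_emp_length"] >= 60)]

# Not equivalent: NaN < 60 is also False, so this version loses the NaN row too
cr_naive = cr_loan[cr_loan["person_emp_length"] < 60]

print(len(cr_clean), len(cr_mask), len(cr_naive))
```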

  17. Let's practice!

  18. Risk with missing data in loan data

  19. What is missing data?
      - NULLs in a row instead of an actual value
      - An empty string ''
      - Not an entirely empty row
      - Can occur in any column in the data

  20. Similarities with outliers
      - Negatively affect machine learning model performance
      - May bias models in unanticipated ways
      - May cause errors for some machine learning models

  21. Similarities with outliers (continued)

      Missing Data Type         Possible Result
      -----------------         ---------------
      NULL in numeric column    Error
      NULL in string column     Error

  22. How to handle missing data
      Generally three ways to handle missing data:
      - Replace values where the data is missing
      - Remove the rows containing missing data
      - Leave the rows with missing data unchanged
      Understanding the data determines the course of action

  23. How to handle missing data (continued)

      Missing Data           Interpretation                   Action
      ------------           --------------                   ------
      NULL in loan_status    Loan recently approved           Remove from prediction data
      NULL in person_age     Age not recorded or disclosed    Replace with median

  24. Finding missing data
      - Null values are easily found by using the isnull() function
      - Null records can easily be counted with the sum() function
      - .any() method checks all columns

          null_columns = cr_loan.columns[cr_loan.isnull().any()]
          cr_loan[null_columns].isnull().sum()

          # Total number of null values per column
          person_home_ownership          25
          person_emp_length             895
          loan_intent                    25
          loan_int_rate                3140
          cb_person_default_on_file      15
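
The same two lines can be run on a toy frame. The data here is synthetic, and the extra step that converts empty strings to NaN is an assumption added for this sketch: `isnull()` does not flag `''`, so empty strings must be converted first if they should count as missing.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for cr_loan with a few planted gaps
cr_loan = pd.DataFrame({
    "person_age": [22, 35, 41, 29],
    "person_emp_length": [1.0, np.nan, 7.0, np.nan],
    "loan_intent": ["EDUCATION", "", "MEDICAL", "PERSONAL"],
    "loan_int_rate": [11.5, 9.9, np.nan, 13.2],
})

# Empty strings are not null to pandas; convert them so they are counted
cr_loan["loan_intent"] = cr_loan["loan_intent"].replace("", np.nan)

# Columns that contain at least one null, then the null count for each
null_columns = cr_loan.columns[cr_loan.isnull().any()]
null_counts = cr_loan[null_columns].isnull().sum()

print(null_counts)
```

Columns without any nulls, such as `person_age` here, do not appear in the result at all.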

  25. Replacing missing data
      Replace the missing data using methods like .fillna() with aggregate functions and methods

          cr_loan['loan_int_rate'].fillna((cr_loan['loan_int_rate'].mean()), inplace = True)
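
Here is a sketch of mean imputation on synthetic data. It assigns the filled column back to the frame instead of calling `fillna(..., inplace=True)` on a selected column, because recent pandas releases warn about that chained in-place pattern. The variable name `mean_rate` is hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for cr_loan with two missing interest rates
cr_loan = pd.DataFrame({"loan_int_rate": [10.0, np.nan, 14.0, np.nan, 12.0]})

# mean() skips NaN by default, so this is the mean of 10, 14 and 12
mean_rate = cr_loan["loan_int_rate"].mean()

# Assign the result back to the column instead of modifying it in place
cr_loan["loan_int_rate"] = cr_loan["loan_int_rate"].fillna(mean_rate)

print(cr_loan)
```

Swapping `.mean()` for `.median()` gives median imputation, which is less sensitive to any outliers still in the column.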

  26. Dropping missing data
      - Uses indices to identify records the same as with outliers
      - Remove the records entirely using the .drop() method

          indices = cr_loan[cr_loan['person_emp_length'].isnull()].index
          cr_loan.drop(indices, inplace=True)
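
The index-based removal can be checked against pandas' `dropna`, which does the same job in one call. The frame below is synthetic, and `dropna(subset=...)` is offered as an alternative for comparison, not as the approach taken in the slides.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for cr_loan with two missing employment lengths
cr_loan = pd.DataFrame({
    "person_emp_length": [3.0, np.nan, 8.0, np.nan],
    "loan_int_rate": [10.5, 12.0, 9.1, 13.3],
})

# Slide approach: find the index labels of the null rows, then drop them
indices = cr_loan[cr_loan["person_emp_length"].isnull()].index
dropped_by_index = cr_loan.drop(indices)

# Alternative: drop rows that are null in the named column only
dropped_by_dropna = cr_loan.dropna(subset=["person_emp_length"])

print(dropped_by_index.equals(dropped_by_dropna))
```

Restricting `dropna` with `subset` matters: without it, rows that are null in any column would be removed.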

  27. Let's practice!
