
(Machine) Learning To Detect Fraudsters, by Hany Elemary and Sarah LeBlanc



  1. (Machine) Learning To Detect Fraudsters. Hany Elemary, Sarah LeBlanc

  2. CREDIT CARD FRAUD: TRANSACTION, APPLICATION, CARD NOT FOUND

  3. FRAUD DETECTION MODEL PLOT [Figure: Application Count against Model Score (0 to 1) for the Not Fraud and Fraud classes, with the False Negatives and False Positives regions marked]

  4. MINIMIZE LOSSES
 Lost Profitability = (Fraud Cost * FN) + (Opportunity Cost * FP)
 Legend: FN = fraud missed (false negatives), FP = mistaken fraud (false positives)
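
The formula on slide 4 can be turned into a small threshold search. The following is a minimal sketch, not the speakers' implementation: it assumes an application is flagged as fraud when its model score is at or above a threshold, and the function names and candidate thresholds are hypothetical.

```python
def lost_profitability(fraud_cost, opportunity_cost, fn, fp):
    """Lost Profitability = (Fraud Cost * FN) + (Opportunity Cost * FP)."""
    return fraud_cost * fn + opportunity_cost * fp


def count_errors(scores, labels, threshold):
    """Count false negatives and false positives at a score threshold.

    Assumption: a score >= threshold means "flag as fraud".
    labels use 1 for fraud and 0 for not fraud.
    """
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return fn, fp


def best_threshold(scores, labels, fraud_cost, opportunity_cost, candidates):
    """Return the candidate threshold with the lowest lost profitability."""
    def loss_at(threshold):
        fn, fp = count_errors(scores, labels, threshold)
        return lost_profitability(fraud_cost, opportunity_cost, fn, fp)

    return min(candidates, key=loss_at)
```

Because the two error types carry different costs, the cheapest threshold is generally not the one that minimizes the raw error count.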

  5. CURRENT STATE [Diagram labels: Customer, Application Service, Vendor Fraud Detection, Rules, Model, Strategies]

  6. PROPOSED STATE: CHAMPION CHALLENGER MODELS [Diagram labels: Customer, Application Service, Vendor Fraud Detection, Fraud Detection, Rules, Strategies]

  7. MODEL TRAINING: Supervised Learning [Diagram labels: Historical Data, Training, Model, Classification (Fraud, Not Fraud)]

  8. DATA PATTERNS: Filter, Transform, Impute, Features

  9. DATA FILTERING: Low Cardinality

  10. DATA FILTERING: High Cardinality

  11. DATA FILTERING: Medium Cardinality

  12. DATA FILTERING: Medium Cardinality, Predictive Model Training
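
Slides 9 to 12 appear to say that low- and high-cardinality columns are filtered out while medium-cardinality columns go on to predictive model training. That reading is inferred from the slide titles alone. The sketch below illustrates it under that assumption; the thresholds `min_distinct` and `max_ratio` and the column names are hypothetical, not values from the talk.

```python
def cardinality_ratio(values):
    """Distinct non-missing values divided by the non-missing row count."""
    present = [v for v in values if v is not None]
    if not present:
        return 0.0
    return len(set(present)) / len(present)


def filter_by_cardinality(columns, min_distinct=2, max_ratio=0.9):
    """Keep only medium-cardinality columns.

    Drops columns with fewer than min_distinct distinct values (low
    cardinality, e.g. a constant) and columns whose distinct-to-row ratio
    exceeds max_ratio (high cardinality, e.g. a unique identifier).
    Both thresholds are illustrative assumptions.
    """
    kept = {}
    for name, values in columns.items():
        distinct = len({v for v in values if v is not None})
        if distinct >= min_distinct and cardinality_ratio(values) <= max_ratio:
            kept[name] = values
    return kept


# Hypothetical columns: a constant, a unique ID, and a repeating category.
columns = {
    "country": ["US", "US", "US", "US"],
    "application_id": [1, 2, 3, 4],
    "email_domain": ["gmail.com", "fraudster.com", "gmail.com", "fraudster.com"],
}
kept = filter_by_cardinality(columns)
```

A constant column cannot separate fraud from non-fraud, and a unique identifier lets a model memorize rows rather than generalize, which is the usual motivation for this kind of filter.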

  13. DATA TRANSFORMATION [Table columns: Email, Fraud Status] Email: jack.smith@gmail.com, annie.may@fraudster.com, freddy.jr@gmail.com, nicole.jack@fraudster.com, jon.johnston@gmail.com, claudia.penns@us.gov, walter.carson@gmail.com, ben.benjamin@fraudster.com

  14. DATA TRANSFORMATION [Table columns: Domain name, Fraud Status] Domain name: gmail.com, fraudster.com, gmail.com, fraudster.com, gmail.com, us.gov, gmail.com, fraudster.com
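
The step from slide 13 to slide 14 replaces each email address with its domain name. A minimal sketch of that transformation, using the addresses shown on the slide; the helper name `email_domain` and the handling of malformed input are assumptions.

```python
def email_domain(email):
    """Return the lowercased domain of an email address, or None if malformed."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        return None
    return domain.lower()


# Addresses taken from slide 13.
emails = [
    "jack.smith@gmail.com",
    "annie.may@fraudster.com",
    "freddy.jr@gmail.com",
    "nicole.jack@fraudster.com",
    "jon.johnston@gmail.com",
    "claudia.penns@us.gov",
    "walter.carson@gmail.com",
    "ben.benjamin@fraudster.com",
]
domains = [email_domain(e) for e in emails]
```

The full address is nearly unique per applicant, whereas the domain repeats across applicants and can therefore carry a learnable signal.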

  15. DATA IMPUTATION: Handling Missing Data [Table columns: Column 1, Column 2, Column 3, Column 4]

  16. DATA IMPUTATION: Handling Missing Data [Table columns: Column 1, Column 2, Column 3, Column 4]
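
The slides do not say which imputation strategy was used. As one common option, the sketch below fills missing numeric values with the column median and records which rows were filled; the choice of the median and the function name are assumptions, not the speakers' method.

```python
from statistics import median


def impute_median(values):
    """Fill None entries with the median of the present values.

    Returns (filled_values, was_missing). The was_missing flags can be kept
    as an extra feature so the model still sees that a value was absent.
    If every value is missing, the column is returned unchanged.
    """
    was_missing = [v is None for v in values]
    present = [v for v in values if v is not None]
    if not present:
        return list(values), was_missing
    fill = median(present)
    return [fill if v is None else v for v in values], was_missing


filled, flags = impute_median([1, None, 3, 5])
```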

  17. FEATURE SELECTION: IP to Zip Proximity
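
One plausible way to compute an IP-to-zip proximity feature is the great-circle distance between the geolocated IP address and the centroid of the applicant's zip code. The sketch below assumes both coordinate pairs have already been looked up (the lookups themselves are not shown and would depend on whatever geolocation data is available); the slides do not state how the feature was actually derived.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against floating-point values slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def ip_zip_proximity_km(ip_latlon, zip_latlon):
    """Distance between the IP location and the zip-code centroid (hypothetical inputs)."""
    return haversine_km(ip_latlon[0], ip_latlon[1], zip_latlon[0], zip_latlon[1])
```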

  18. ARCHITECTURE. DATA SCIENTIST WORKFLOW: Raw Data, Transformed Data, Trained Model. DEVELOPER WORKFLOW: Applications, Trained Model, Score

  19. DATA SCIENTIST WORKFLOW: Raw Data, Transformed Data, Trained Model [Diagram labels: Historical Data Store, Clean, Transform, Impute, Binary Repository]

  20. DEVELOPER WORKFLOW: Applications, Trained Model, Score [Diagram labels: Vendor, Rules, Model, Strategies, Application Service, Decisioning & Analytics Platform, Model Predictions Store, Message Queue, Model 1, Model 1, Model 2, Model 3, Binary Repository, Shadow Mode]

  21. DEVELOPER WORKFLOW: Applications, Trained Model, Score [Diagram labels: Vendor, Rules, Strategies, Application Service, Decisioning & Analytics Platform, Model Predictions Store, Message Queue, Champion Model, Model 2, Model 1, Model 3, Binary Repository, Shadow Mode]
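
Slides 20 and 21 show several models scoring applications in shadow mode, with predictions sent to a store, and one model promoted to champion. A minimal sketch of that pattern follows, assuming the champion's score is the only one returned to the caller and challenger scores are only recorded. The function name, the record layout, and the plain list standing in for the message queue are all hypothetical.

```python
def score_application(features, champion, challengers, prediction_log):
    """Score one application with a champion and shadow-mode challengers.

    champion: (name, scoring_function) whose score drives the live decision.
    challengers: dict of name -> scoring_function, run in shadow mode.
    prediction_log: list standing in for the message queue / predictions store.
    """
    champion_name, champion_fn = champion
    champion_score = champion_fn(features)
    prediction_log.append(
        {"model": champion_name, "score": champion_score, "shadow": False}
    )

    for name, challenger_fn in challengers.items():
        try:
            record = {"model": name, "score": challenger_fn(features), "shadow": True}
        except Exception as exc:
            # A failing challenger must never affect the live decision.
            record = {"model": name, "error": repr(exc), "shadow": True}
        prediction_log.append(record)

    return champion_score
```

Recording every model's score for the same application is what makes a later offline comparison, and a possible change of champion, feasible.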

  22. ARCHITECTURE: DATA SCIENTIST WORKFLOW, DEVELOPER WORKFLOW

  23. VALUE STREAM: Data Ingestion, Model Training, Governance, Publish Service, Publish Model, Shadow Mode, Governance, Evaluation, Champion Model

  24. THANK YOU. Questions? Sarah LeBlanc (sleblanc@thoughtworks.com, @sarah_g_leblanc), Hany Elemary (helemary@thoughtworks.com, @hanyelemary)
