

SLIDE 1

#StrataData - Predicting real-time transaction fraud using supervised learning

Predicting Real-Time Transaction Fraud

Sami Niemi, PhD
Barclays, Quantitative Analytics, Fraud Detection

SLIDE 2

Contents

  1. Background
  2. Raw Data
  3. Data Processing
  4. Development
  5. Validation
  6. Implementation
  7. Summary

SLIDE 3

Background – Definitions and Examples

3rd Party Fraud

  • An individual, or group of people, creates or uses a third party's identity to apply for products or take over an account without the consent or knowledge of that third party.

Card Transaction Fraud

  • Card Present (CP), e.g. lost, stolen, counterfeit/clone
  • Card not Present (CnP), e.g. identity theft, hacking, fake online shops

SLIDE 4

Background – Motivation (global view)

SLIDE 5

Background – Motivation (UK view)

Source: Fraud the Facts 2018 by UK Finance

SLIDE 6

Background – Challenges

  • Fraudsters Adapt and Invent New MOs
  • Real-Time Runtime Requirements
  • Front Page News Material

SLIDE 7

The aim of the project was to develop and implement new Debit CP and CnP real-time fraud detection models that reduce fraud losses and protect genuine customers.

SLIDE 8

Raw Data – Sources

  • Debit Card Transactions
  • Non-Monetary (Non-Mon) Events
  • Other Cards and Accounts
  • Customer Info
  • Account Info
  • Payment Instrument Info
  • Confirmed Frauds

SLIDE 9

Data Processing – Quality Assurance and Data Exploration

Data Quality

  • Reconciliation, Volumes, and Amounts
  • Daily and Monthly Summary Statistics
  • Anomaly and Outlier Detection

Exploration

  • Trend Analysis and Anomaly Detection
  • Distributions (PDFs and bar charts for fraud vs. non-fraud)
  • Correlations (covariance, correlation w/ target, etc.)

Report

  • Thresholding
  • Issue Generation and Resolution
  • Documentation and Governance
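
As a rough illustration of the daily summary statistics and outlier checks listed above, the sketch below flags days whose transaction volume deviates strongly from the median, using the median absolute deviation (MAD). The function name, the 3.5 threshold, and the toy counts are assumptions made for illustration; the talk does not say which outlier method was used.

```python
from statistics import median

def flag_volume_outliers(daily_counts, threshold=3.5):
    """Return the days whose volume has a modified z-score above threshold.

    daily_counts maps a day label to that day's transaction count.
    The modified z-score uses the median and the MAD, which are robust to
    the very outliers we are trying to find.
    """
    values = list(daily_counts.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread at all, so nothing can be flagged
    flagged = []
    for day, count in daily_counts.items():
        score = 0.6745 * (count - med) / mad
        if abs(score) > threshold:
            flagged.append(day)
    return flagged

# Toy data (hypothetical): one day has roughly half the usual volume.
daily_counts = {
    "2018-01-01": 9_800_000,
    "2018-01-02": 10_100_000,
    "2018-01-03": 10_050_000,
    "2018-01-04": 4_900_000,
    "2018-01-05": 9_950_000,
    "2018-01-06": 10_200_000,
}
outlier_days = flag_volume_outliers(daily_counts)
```

A flagged day would then feed the "Issue Generation and Resolution" step rather than being dropped silently.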

SLIDE 10

Data – High Level Statistics

Volumes

  • Total: 220 – 300M debit card transactions with a total value of £9 – 11B per month
  • CP: 110M contactless, 20M ATM
  • CnP: 85M e-commerce + telephony

Customers

  • 10M unique customers per month
  • transacting in 220 countries
  • using 12M debit cards
  • with 1.9M different merchants

Frauds

  • CP fraud rate: less than 0.01%, depending on segment
  • CnP fraud rate: less than 0.15%, depending on segment

SLIDE 11

Development – Datasets

Debit: 14 months

CP

  • Train: 12 months
  • Sample: 45M transactions
  • OOT (out-of-time): 2 recent months

CnP

  • Train: 12 months
  • Sample: 55M transactions
  • OOT (out-of-time): 2 recent months
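
To make the 12-month train / 2-month out-of-time (OOT) layout concrete, here is a minimal sketch of a date-based split. The function names, the record layout, and the January 2017 start date are hypothetical; only the 12 + 2 month structure comes from the slide.

```python
from datetime import date

def add_months(d, months):
    """Return the first day of the month that is `months` after d's month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)

def split_out_of_time(transactions, start, train_months=12, oot_months=2):
    """Split (tx_date, record) pairs into train and out-of-time (OOT) sets.

    Train covers the first `train_months` months from `start`; OOT covers the
    following `oot_months` months, so the model is always evaluated on data
    that is strictly later than anything it was trained on.
    `start` is assumed to be the first day of a month.
    """
    train_end = add_months(start, train_months)
    oot_end = add_months(train_end, oot_months)
    train = [t for t in transactions if start <= t[0] < train_end]
    oot = [t for t in transactions if train_end <= t[0] < oot_end]
    return train, oot

# Hypothetical 14-month window starting 1 January 2017.
transactions = [
    (date(2017, 1, 15), "tx-a"),
    (date(2017, 12, 31), "tx-b"),
    (date(2018, 1, 1), "tx-c"),
    (date(2018, 2, 28), "tx-d"),
    (date(2018, 3, 1), "tx-e"),  # outside the 14-month window
]
train, oot = split_out_of_time(transactions, start=date(2017, 1, 1))
```

Splitting by time rather than at random matters here because fraud patterns drift, so a random split would likely overstate performance on future traffic.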

SLIDE 12

Data Processing – Feature Engineering

and many more (e.g. merchant)… and finally, ratios between these values and the current transaction.
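
One way to picture a "ratio between a value and the current transaction" is the feature below: the current amount divided by the largest amount seen on the same card in a preceding time window. This is a sketch under assumptions, not the production feature code: the tuple layout, the 7-day window, and the function name are hypothetical, and amounts are assumed to be positive.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def amount_to_prior_max(transactions, window_days=7):
    """For each transaction, return amount / max amount on the same card in
    the preceding `window_days` days, or None when there is no history.

    `transactions` is a list of (card_id, timestamp, amount) tuples.
    Only strictly earlier transactions are used, so the feature could be
    computed at authorisation time without looking into the future.
    Results are returned in timestamp order.
    """
    window = timedelta(days=window_days)
    history = defaultdict(list)  # card_id -> [(timestamp, amount), ...]
    ratios = []
    for card_id, ts, amount in sorted(transactions, key=lambda t: t[1]):
        recent = [a for (t, a) in history[card_id] if ts - window <= t < ts]
        ratios.append((card_id, ts, amount / max(recent) if recent else None))
        history[card_id].append((ts, amount))
    return ratios

# Toy transactions (hypothetical cards and amounts).
txs = [
    ("card-1", datetime(2018, 1, 1, 9, 0), 10.0),
    ("card-1", datetime(2018, 1, 2, 9, 0), 30.0),
    ("card-1", datetime(2018, 1, 3, 9, 0), 15.0),
    ("card-1", datetime(2018, 1, 23, 9, 0), 5.0),  # no history within 7 days
    ("card-2", datetime(2018, 1, 2, 12, 0), 50.0),
]
features = amount_to_prior_max(txs)
```

The list scan keeps the sketch short; a real-time system would more plausibly maintain incremental per-card aggregates instead of rescanning history.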

SLIDE 13

Development – Feature Selection

Univariate

  • Remove features with zero or extremely low variance
  • Remove features with all, or a very high share of, values missing
  • Remove features with very low Information Value or Spearman rank correlation

Model

  • Lasso coefficient importance
  • Random Forest feature importance

Wrapper

  • Recursive Feature Elimination

Domain

  • Business Review
  • Implementation Considerations

Feature count through the selection funnel: 20k → 10k → 1k → 500
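
For the univariate stage, Information Value (IV) is one of the filters named above. The sketch below computes IV for a feature that has already been binned; the additive smoothing constant and the toy counts are assumptions, and the talk does not state the cut-off that was applied.

```python
import math

def information_value(bin_counts, smoothing=0.5):
    """Compute the Information Value (IV) of a binned feature.

    bin_counts is a list of (n_genuine, n_fraud) pairs, one per bin.
    IV = sum over bins of (p_genuine - p_fraud) * ln(p_genuine / p_fraud),
    where p_* is the share of that class falling into the bin.
    `smoothing` is added to every count to avoid log(0) in empty bins.
    """
    genuine = [g + smoothing for g, _ in bin_counts]
    fraud = [f + smoothing for _, f in bin_counts]
    total_genuine, total_fraud = sum(genuine), sum(fraud)
    iv = 0.0
    for g, f in zip(genuine, fraud):
        p_g, p_f = g / total_genuine, f / total_fraud
        iv += (p_g - p_f) * math.log(p_g / p_f)
    return iv

# Fraud is spread evenly over both bins: the feature carries no signal.
uninformative = information_value([(5000, 50), (5000, 50)])
# Fraud concentrates in the second bin: the feature separates the classes.
informative = information_value([(9000, 10), (1000, 90)])
```

Features whose IV falls below a chosen threshold would be dropped before the more expensive model-based and wrapper stages.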

SLIDE 14

Development – Feature Selection & Business Review

[Figure: Debit CP model feature, the ratio of the current transaction amount to the maximum contactless transaction in the last X days, shown for Fraud and Genuine transactions]

SLIDE 15

Development – Model Development Cycle

Select Features → Pick Model and Train → Optimize → Evaluate → Review

SLIDE 16

Development – Hyper-parameter Optimization

[Figure: Example of Bayesian hyper-parameter optimization using hyperopt. Axes: Fraction of Features in a Split, Number of Iterations, Performance]

SLIDE 17

Validation – CP Model Performance

  • Precision Recall Curve: AUC ~0.23
  • ROC Curve: AUC ~0.95
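
A ROC AUC near 0.95 alongside a precision-recall AUC near 0.23 is the kind of gap one would expect when fraud is very rare. The sketch below shows the arithmetic: the same ROC operating point yields very different precision depending on the fraud rate. The operating point (50% detection at a 0.1% false positive rate) is hypothetical; the 0.01% rate echoes the "less than 0.01%" CP figure quoted earlier in the deck.

```python
def precision_at(tpr, fpr, fraud_rate):
    """Precision implied by a ROC operating point and the fraud prevalence.

    precision = TP / (TP + FP)
              = tpr * fraud_rate / (tpr * fraud_rate + fpr * (1 - fraud_rate))
    """
    true_pos = tpr * fraud_rate
    false_pos = fpr * (1.0 - fraud_rate)
    return true_pos / (true_pos + false_pos)

# Hypothetical operating point: catch 50% of fraud at a 0.1% false positive rate.
balanced = precision_at(tpr=0.5, fpr=0.001, fraud_rate=0.5)
rare = precision_at(tpr=0.5, fpr=0.001, fraud_rate=0.0001)  # 0.01% fraud rate
```

With balanced classes almost every alert is a fraud; at a 0.01% fraud rate fewer than one alert in twenty is, even though the ROC point is identical.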

SLIDE 18

Validation – CP Model Performance

[Figure: two panels comparing the incumbent model with the new model: Transaction Detection Rate vs. False Positive Rate [bps], and Value Detection Rate vs. False Positive Rate [bps]]
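
The two metrics on this slide can be read as: at a given false positive rate in basis points, what share of fraudulent transactions (transaction detection rate) and what share of fraudulent value (value detection rate) is caught. A minimal sketch under that reading follows; the tuple layout, function name, and toy scores are assumptions, and at least one fraud is assumed to be present.

```python
def detection_rates(scored, fpr_bps):
    """Transaction and value detection rates at a false positive budget.

    `scored` is a list of (score, amount, is_fraud) tuples. The alert
    threshold is set so that at most `fpr_bps` basis points (1 bp = 0.01%)
    of genuine transactions are flagged. Score ties are ignored for brevity.
    """
    genuine_scores = sorted((s for s, _, f in scored if not f), reverse=True)
    allowed_fp = int(len(genuine_scores) * fpr_bps / 10_000)
    # Alert on scores strictly above the first genuine score we may not flag.
    if allowed_fp < len(genuine_scores):
        threshold = genuine_scores[allowed_fp]
    else:
        threshold = float("-inf")
    frauds = [(s, a) for s, a, f in scored if f]
    caught = [(s, a) for s, a in frauds if s > threshold]
    tdr = len(caught) / len(frauds)
    vdr = sum(a for _, a in caught) / sum(a for _, a in frauds)
    return tdr, vdr

# Toy data: 10,000 genuine transactions and 4 frauds (hypothetical values).
genuine = [(i / 10_000, 20.0, False) for i in range(10_000)]
frauds = [
    (0.99995, 500.0, True),
    (0.9995, 100.0, True),
    (0.5, 300.0, True),
    (0.1, 100.0, True),
]
tdr, vdr = detection_rates(genuine + frauds, fpr_bps=10)
```

Reporting both rates is useful because a model can catch many small frauds yet miss the few large ones, or the reverse.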

SLIDE 19

Validation – CnP Model Performance

[Figure: two panels comparing the incumbent model with the new model: Transaction Detection Rate vs. False Positive Rate [bps], and Value Detection Rate vs. False Positive Rate [bps]]

SLIDE 20

Validation – CP Model Interrogation

[Figure: Fraud Risk vs. Time since a new card was issued, for Chip used and Chip not-used]

SLIDE 21

Implementation – Development Artefacts

Model Artefacts

  • Model Specification (JSON)
  • Model File (txt)
  • Validation Data (parquet)

Feature Artefacts

  • Feature Specification (JSON)
  • Validation Data (parquet)

SLIDE 22

Implementation – Process

  • Artefacts from Nexus using Jenkins
  • Model File and Implementation Validation
  • Feature Code Gen and Validation
  • Feature Maturation and Shadow Operations
  • Production Validation and Go-Live
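
The validation steps in this process are not described in detail. One plausible reading, offered only as an assumption, is that the re-implemented model is run on the validation data shipped with the artefacts and its scores are compared with those produced during development. A minimal sketch of such a check, with hypothetical names and tolerance:

```python
import math

def validate_implementation(expected_scores, produced_scores, tol=1e-6):
    """Compare development scores with scores from a re-implementation.

    Returns the indices of records whose scores differ by more than `tol`.
    An empty list means the implementation reproduced every score.
    """
    if len(expected_scores) != len(produced_scores):
        raise ValueError("score lists must have the same length")
    return [
        i
        for i, (e, p) in enumerate(zip(expected_scores, produced_scores))
        if not math.isclose(e, p, rel_tol=0.0, abs_tol=tol)
    ]

# Toy scores (hypothetical): the third record drifts beyond the tolerance.
expected = [0.0123, 0.9871, 0.4500]
produced = [0.0123, 0.9871, 0.4512]
mismatches = validate_implementation(expected, produced)
```

Returning the offending indices, rather than a single pass/fail flag, makes it easier to trace a discrepancy back to a specific feature or record.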

SLIDE 23

Summary

  • An increasing number of customers become victims of fraud, especially remote purchase fraud (e.g. e-commerce).
  • To improve fraud prevention and customer experience, we:
      – Developed generation 1 models for Debit CP and CnP using tree ensemble algorithms
      – Converted 12 months of training data into about 20k features to develop the best possible models
      – Have both models in implementation, with shadow operations due in May and go-live during the summer
  • R&D for generation 2 models (e.g. RNNs, autoencoders) is ongoing with promising results, but implementation requires more work…
