The European Commission’s science and knowledge service Joint Research Centre

Why machine learning may lead to unfairness
Songül Tolan (1), Marius Miron (1), Emilia Gomez (1,2), Carlos Castillo (2)
(1) European Commission’s Joint Research Centre; (2) Universitat Pompeu Fabra

Machine learning for decision making

The criminal justice case. Trade-off: predictive performance vs. fairness

Criminal recidivism

Criminal recidivism prediction: Prisoner -> Human expert -> Decision / Sentence

Criminal recidivism prediction: Prisoner -> Human expert -> Decision / Sentence -> Outcome

Criminal recidivism prediction: Features -> Machine learning model -> Prediction -> Outcome

Criminal recidivism prediction. Examples of static features: age at crime, sex, nationality, previous number of crimes, sentence, year of crime, probation

Fairness: a decision is fair if it does not discriminate against people based on their membership in a protected group.

Fairness. Examples of protected features, drawn from the static features: age at crime, sex, nationality, previous number of crimes, sentence, year of crime, probation

Measuring unfairness: Features (sex, nationality) -> Machine learning model -> Prediction -> Outcome

Measuring unfairness: comparing Prediction with Outcome yields false negatives and false positives

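False negative rate = Miss rate: $\mathrm{FNR} = \frac{\sum \text{false negatives}}{\sum \text{actual positives}}$

False positive rate = False alarm rate: $\mathrm{FPR} = \frac{\sum \text{false positives}}{\sum \text{actual negatives}}$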

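Group fairness - sex: each rate is computed within one group, e.g. $\mathrm{FNR}_{\text{male}} = \frac{\sum_{\text{sex}=\text{Male}} \text{false negatives}}{\sum_{\text{sex}=\text{Male}} \text{actual positives}}$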

False negative rate disparity: $\mathrm{FNR\ disparity} = \frac{\mathrm{FNR}_{\text{female}}}{\mathrm{FNR}_{\text{male}}}$. The FNR measures how likely it is for a member of a group to be wrongfully labeled as non-recidivist.
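
The group-wise rates and the disparity ratio above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the label coding (1 = recidivist) and the toy data are assumptions made up for the example.

```python
from collections import defaultdict


def group_error_rates(y_true, y_pred, groups):
    """Return {group: (FNR, FPR)} for binary labels, where 1 = recidivist."""
    counts = defaultdict(lambda: {"fn": 0, "tp": 0, "fp": 0, "tn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            key = "tp" if pred == 1 else "fn"
        else:
            key = "fp" if pred == 1 else "tn"
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        positives = c["fn"] + c["tp"]  # actual recidivists in the group
        negatives = c["fp"] + c["tn"]  # actual non-recidivists in the group
        fnr = c["fn"] / positives if positives else float("nan")
        fpr = c["fp"] / negatives if negatives else float("nan")
        rates[group] = (fnr, fpr)
    return rates


def fnr_disparity(rates, group, reference):
    """Ratio of the group's FNR to the reference group's FNR."""
    return rates[group][0] / rates[reference][0]


# Toy data, invented for illustration only.
groups = ["female"] * 4 + ["male"] * 6
y_true = [1, 1, 0, 0] + [1, 1, 1, 1, 0, 0]
y_pred = [0, 1, 0, 1] + [1, 1, 1, 0, 0, 0]

rates = group_error_rates(y_true, y_pred, groups)
disparity = fnr_disparity(rates, "female", "male")
```

A disparity of 1 means both groups have the same miss rate; on the toy data the female group's miss rate is twice the male group's.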

Headache?

Too complicated? The fairness in machine learning literature comprises at least 21 disparity metrics.

Juvenile recidivism

Risk assessment tools: Structured Assessment of Violence Risk in Youth (SAVRY)
● high degree of involvement from human experts
● open and interpretable (in comparison with COMPAS)
● 24 risk factors scored low, medium or high

SAVRY. Examples of SAVRY features: early violence, self-harm history, home violence, poor school achievement, stress and poor coping, substance abuse, criminal parent/caregiver

Criminal recidivism prediction: SAVRY features -> Σ (SAVRY sum) -> Outcome; SAVRY features -> Expert -> Final expert evaluation -> Outcome
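
As a rough sketch of the SAVRY sum baseline, the 24 risk-factor ratings can be mapped to numbers and added up. The numeric coding (low = 0, medium = 1, high = 2) and the names below are assumptions for illustration; the slides only state that each factor is scored low, medium or high.

```python
# Assumed numeric coding of the ratings; not stated in the slides.
RATING_TO_SCORE = {"low": 0, "medium": 1, "high": 2}
N_RISK_FACTORS = 24  # from the slides: 24 risk factors


def savry_sum(ratings):
    """Sum the numeric scores of the 24 risk-factor ratings."""
    if len(ratings) != N_RISK_FACTORS:
        raise ValueError(f"expected {N_RISK_FACTORS} ratings, got {len(ratings)}")
    return sum(RATING_TO_SCORE[rating] for rating in ratings)


# Invented example: 10 low, 8 medium and 6 high ratings.
example_ratings = ["low"] * 10 + ["medium"] * 8 + ["high"] * 6
total = savry_sum(example_ratings)
```

The resulting sum can then be used directly as a risk score, with higher values read as higher risk.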

Static ML: Features -> Machine learning model -> Prediction -> Outcome

SAVRY ML: SAVRY features -> Machine learning model -> Prediction -> Outcome

Static + SAVRY ML: Features -> Machine learning model -> Prediction -> Outcome

Dataset: juvenile offenders in Catalonia (1)
● 855 people
● crimes between 2002 and 2010, release in 2010
● age at crime between 12 and 17 years old
● status followed up in 2013 and 2015
(1) Open data: http://cejfe.gencat.cat/en/recerca/opendata/jjuvenil/reincidencia-justicia-menors/index.html

Experimental setup: training a set of ML methods
● logistic regression (logit), multi-layer perceptron (mlp), support vector machines (lsvm), k-nearest neighbors (knn), random forest (rf), naive Bayes (nb)
● k-fold cross-validation with k=10 (10% test, 10% validation, 80% training)
● we run 50 different experiments with different initial conditions
● we compute feature importance with LIME (1)
(1) LIME: https://github.com/marcotcr/lime
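
One way to obtain the 10% test, 10% validation, 80% training split from 10 folds is sketched below. Choosing the fold after the test fold as the validation fold, and using a seed as the "initial condition", are assumptions of this sketch. The slides do not say how the authors assigned folds.

```python
import random


def kfold_splits(n_samples, k=10, seed=0):
    """Yield (train, validation, test) index lists, one triple per fold.

    Fold i is the test set and fold (i + 1) % k is the validation set
    (an assumed convention); the remaining k - 2 folds form the training set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        val_fold = (i + 1) % k
        train = [
            idx
            for j, fold in enumerate(folds)
            if j not in (i, val_fold)
            for idx in fold
        ]
        yield train, folds[val_fold], folds[i]


# 855 people, as in the dataset described above.
splits = list(kfold_splits(855, k=10, seed=0))
```

Repeating the loop with seeds 0 to 49 would give 50 runs with different shuffles, which is one possible reading of "different initial conditions".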

Predictive performance - AUC ROC

Results, predictive performance (AUC): SAVRY sum has 0.64 AUC; Expert has 0.66 AUC
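
For readers unfamiliar with the metric, AUC ROC can be read as the probability that a randomly chosen recidivist receives a higher score than a randomly chosen non-recidivist. The sketch below computes it by direct pairwise comparison. The function name and the toy scores are made up for illustration, and this is not the code used to produce the figures above.

```python
def roc_auc(y_true, scores):
    """AUC ROC by pairwise comparison; ties count as half a win."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking.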

Results: disparity, sex (panels: false alarm rates, miss rates)

Results: disparity, nationality (panels: false alarm rates, miss rates)

Results: feature importance for logit

Results: feature importance for mlp

Results: difference in base rates (prevalence)

Results: difference in base rates
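
The base rate (prevalence) of a group is the share of its members who actually recidivate. A minimal sketch of the per-group computation follows. The names, the label coding (1 = recidivist) and the toy data are assumptions for illustration and do not reproduce the study's figures.

```python
def base_rates(y_true, groups):
    """Return {group: share of members with outcome 1 (recidivist)}."""
    totals, positives = {}, {}
    for truth, group in zip(y_true, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if truth == 1 else 0)
    return {group: positives[group] / totals[group] for group in totals}


# Toy data, invented for illustration only.
toy_groups = ["male"] * 4 + ["female"] * 4
toy_outcomes = [1, 1, 1, 0] + [1, 0, 0, 0]
prevalence = base_rates(toy_outcomes, toy_groups)
```

Comparing these values across groups shows whether the outcome itself is unevenly distributed in the data before any model is trained.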

Conclusions
● ML models have better predictive performance
● ML models tend to discriminate more
● static features outweigh SAVRY features in importance
● preliminary study: the cause may be in the data (base rates)

Contributions: we propose a methodology and an ML framework (1)
● to easily train ML models on tabular data (csv files)
● to evaluate these models in terms of predictive performance and fairness
● to connect to interpretability frameworks
● to easily reproduce results and research
(1) Open framework: https://gitlab.com/HUMAINT/humaint-fatml

Thank you! Any questions? You can find me at @nkundiushuti & marius.miron@ec.europa.eu & mariusmiron.com
