Fairness and bias in Machine Learning
Thierry Silbermann, Tech Lead Data Science at Nubank
QCon 2019
A quick review on tools to detect biases in machine learning model
- thierry.silbermann@nubank.com.br
Data collection
- Today's systems collect and process vast amounts of personal information.
- This scale of data collection creates new challenges.
- It is a major threat to traditional notions of individual privacy.
- Automated decision making can have unintended and harmful consequences, such as unfair or discriminatory treatment of users.
Machine learning models are used to support decision making in high-stakes applications such as credit scoring, criminal justice, and hiring.
Fairness Definitions Explained (Verma & Rubin): http://fairware.cs.umass.edu/papers/Verma.pdf
- Multiple fairness metrics cannot all be satisfied at the same time [Kleinberg et al., 2017]
- Consensus on which bias metrics and bias mitigation strategies are best is yet to be achieved [Friedler et al., 2018]
- Different bias handling algorithms address different parts of the model life-cycle; understanding each contribution, and how, when and why to use it, is challenging even for experts in algorithmic fairness.
Tutorial: 21 fairness definitions and their politics: https://www.youtube.com/watch?v=jIXIuYdnyyk

                     Labeled low-risk    Labeled high-risk
Did not recidivate   True Negative       False Positive
Recidivated          False Negative      True Positive

Each stakeholder asks a different question of this matrix:
- Decision maker: "Of those I've labeled high-risk, how many will recidivate?" -> Predictive value
- Defendant: "What's the probability I'll be incorrectly classified as high-risk?" -> False positive rate
- Society [think hiring rather than criminal justice]: "Is the selected set demographically balanced?" -> Demographic parity
https://en.wikipedia.org/wiki/Confusion_matrix
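To make the three perspectives concrete, here is a minimal sketch in plain Python; the counts are made up for illustration:

```python
# Hypothetical counts for the recidivism confusion matrix above.
tn, fp = 400, 100   # did not recidivate: labeled low-risk / labeled high-risk
fn, tp = 150, 350   # recidivated:        labeled low-risk / labeled high-risk

# Decision maker: of those labeled high-risk, how many recidivated?
ppv = tp / (tp + fp)                      # positive predictive value

# Defendant: probability of being incorrectly labeled high-risk.
fpr = fp / (fp + tn)                      # false positive rate

# Society: what fraction of everyone ends up in the selected (high-risk) set?
# Comparing this rate across demographic groups is the demographic parity check.
selection_rate = (tp + fp) / (tn + fp + fn + tp)

print(f"PPV = {ppv:.2f}, FPR = {fpr:.2f}, selection rate = {selection_rate:.2f}")
```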
- A favorable label is a label whose value corresponds to an outcome that provides an advantage to the recipient.
- A protected attribute partitions a population into groups that should have parity in terms of benefit received.
- A privileged value of a protected attribute indicates a group that has historically been at a systematic advantage.
- Group fairness is the goal of groups defined by protected attributes receiving similar treatments or outcomes.
- Fairness is the absence of unwanted bias that places privileged groups at a systematic advantage and unprivileged groups at a systematic disadvantage.
- A fairness metric is a quantification of unwanted bias in training data or models.
- A bias mitigation algorithm is a procedure for reducing unwanted bias in training data or models.
Which fairness definition and metric to use depends on the application!
Racial Dot Map (Chicago Area, IL, USA): https://demographics.virginia.edu/DotMap/index.html
Per-group confusion-matrix metrics: TP, FP, TN, FN, TPR, FPR, TNR, FNR. Computing and comparing them across groups == benchmarking
Group fairness (statistical parity): protected and unprotected groups have equal probability of being assigned to the positive predicted class.
Example: equal probability for male and female applicants to have a good predicted credit score, so that applicants have an equivalent opportunity to obtain a good credit score, regardless of their gender.
                      X=0    X=1
Predicted condition
FALSE                  A      B
TRUE                   C      D
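A minimal sketch of the statistical parity check on this table; the cell counts A-D are invented for illustration:

```python
# Hypothetical cell counts for the table above (X is the protected attribute).
A, B = 70, 40   # predicted FALSE for X=0 / X=1
C, D = 30, 60   # predicted TRUE  for X=0 / X=1

p_pos_x0 = C / (A + C)   # P(predicted TRUE | X = 0)
p_pos_x1 = D / (B + D)   # P(predicted TRUE | X = 1)

# Statistical parity difference: 0 means both groups receive the
# positive prediction at the same rate.
spd = p_pos_x0 - p_pos_x1
print(f"SPD = {spd:.2f}")
```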
The 80% test was originally framed by a panel of 32 professionals assembled by the State of California Fair Employment Practice Commission (FEPC) in 1971
The 80% rule can then be quantified as:
P(TRUE | X=0) / P(TRUE | X=1) = (C / (A + C)) / (D / (B + D)) >= 0.8, i.e. the positive prediction rate of group X=0 must be at least 80% of the rate of group X=1.
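The same counts, checked against the 80% rule (again a sketch with invented numbers):

```python
# Hypothetical cell counts, as in the statistical parity sketch.
A, B, C, D = 70, 40, 30, 60

# Disparate impact: ratio of positive prediction rates, unprivileged (X=0)
# over privileged (X=1).
di = (C / (A + C)) / (D / (B + D))

print(f"DI = {di:.2f}: {'passes' if di >= 0.8 else 'fails'} the 80% rule")
```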
Aequitas, a bias and fairness audit toolkit: https://dsapp.uchicago.edu/projects/aequitas/
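A sketch of what an Aequitas audit looks like in code. The column names 'score' and 'label_value' are the ones the toolkit expects; the tiny dataframe and the 'race' attribute are invented for illustration, and the API reflects the library as of this talk:

```python
import pandas as pd
from aequitas.group import Group
from aequitas.bias import Bias

# Aequitas audits a dataframe with binary 'score' (model decision) and
# 'label_value' (ground truth) columns plus one column per attribute.
df = pd.DataFrame({
    "score":       [1, 0, 1, 1, 0, 1, 0, 1],
    "label_value": [1, 0, 0, 1, 0, 1, 1, 1],
    "race":        ["white", "black", "black", "white",
                    "black", "white", "black", "white"],
})

# Cross-tabulate the confusion-matrix metrics (TPR, FPR, ...) per group.
xtab, _ = Group().get_crosstabs(df)

# Express each group's metrics as disparities vs. a reference group.
bias_df = Bias().get_disparity_predefined_groups(
    xtab, original_df=df, ref_groups_dict={"race": "white"}, alpha=0.05)
print(bias_df[["attribute_value", "fpr_disparity", "ppr_disparity"]])
```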
Bias mitigation algorithms, by stage of the pipeline:
- Pre-processing: Disparate impact remover, Relabelling, Reweighing, Optimised preprocessing, Learning fair representation
- In-processing: Prejudice remover regulariser, Adversarial debiasing, Additive counterfactually fair estimator, Meta-algorithm for fair classification
- Post-processing: Reject option classification, Equalised odds post-processing, Calibrated equalised odds post-processing
Bias mitigation can intervene at three points: pre-processing the training data, in-processing during model training, or post-processing the model's predictions:

AIF360, https://arxiv.org/abs/1810.01943

- Reweighing (Kamiran & Calders, 2012) generates weights for the training examples in each (group, label) combination differently to ensure fairness before classification.
- Optimized preprocessing (Calmon et al., 2017) learns a probabilistic transformation that edits the features and labels in the data with group fairness, individual distortion, and data fidelity constraints and objectives.
- Learning fair representations (Zemel et al., 2013) finds a latent representation that encodes the data well but obfuscates information about protected attributes.
- Disparate impact remover (Feldman et al., 2015) edits feature values to increase group fairness while preserving rank-ordering within groups.
- Adversarial debiasing (Zhang et al., 2018) learns a classifier to maximize prediction accuracy and simultaneously reduce an adversary's ability to determine the protected attribute from the predictions. This approach leads to a fair classifier, as the predictions cannot carry any group discrimination information that the adversary can exploit.
- Prejudice remover (Kamishima et al., 2012) adds a discrimination-aware regularization term to the learning objective.
- Equalized odds post-processing (Hardt et al., 2016) solves a linear program to find probabilities with which to change output labels with an equalized odds objective.
- Calibrated equalized odds post-processing (Pleiss et al., 2017) optimizes over calibrated classifier score outputs to find probabilities with which to change output labels with an equalized odds objective.
- Reject option classification (Kamiran et al., 2012) gives favorable outcomes to unprivileged groups and unfavorable outcomes to privileged groups in a confidence band around the decision boundary with the highest uncertainty.
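As a concrete example of the pre-processing route, a minimal sketch using AIF360's Reweighing on the German Credit dataset (the loader and the 'sex' encoding follow AIF360's own examples; the raw data files must be downloaded separately as described in the AIF360 docs):

```python
from aif360.datasets import GermanDataset
from aif360.algorithms.preprocessing import Reweighing
from aif360.metrics import BinaryLabelDatasetMetric

privileged = [{"sex": 1}]      # in AIF360's encoding of this dataset, male == 1
unprivileged = [{"sex": 0}]

data = GermanDataset()         # raw files go in aif360's data directory

# Bias in the training data before mitigation (fair value: 0).
before = BinaryLabelDatasetMetric(
    data, unprivileged_groups=unprivileged, privileged_groups=privileged)
print("SPD before:", before.statistical_parity_difference())

# Reweighing assigns a weight to each (group, label) combination so that
# the weighted data satisfies statistical parity.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
data_rw = rw.fit_transform(data)

after = BinaryLabelDatasetMetric(
    data_rw, unprivileged_groups=unprivileged, privileged_groups=privileged)
print("SPD after:", after.statistical_parity_difference())
```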
Benchmark setup:
- Datasets: Adult Census Income, German Credit, COMPAS
- Metrics: Disparate impact, Statistical parity difference, Average odds difference, Equal opportunity difference
- Classifiers: Logistic Regression (LR), Random Forest Classifier (RF), Neural Network (NN)
- Pre-processing algorithms: Reweighing (Kamiran & Calders, 2012); Optimized pre-processing (Calmon et al., 2017); Learning fair representations (Zemel et al., 2013); Disparate impact remover (Feldman et al., 2015)
- In-processing algorithms: Adversarial debiasing (Zhang et al., 2018); Prejudice remover (Kamishima et al., 2012)
- Post-processing algorithms: Equalized odds post-processing (Hardt et al., 2016); Calibrated eq. odds post-processing (Pleiss et al., 2017); Reject option classification (Kamiran et al., 2012)
AIF360, https://arxiv.org/abs/1810.01943
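The four benchmark metrics can be read off AIF360's ClassificationMetric, which compares a dataset holding true labels against a copy holding a model's predictions. A sketch (here the "model" just echoes the true labels, so the error-rate metrics sit at their fair value of 0 while DI and SPD reflect the bias already present in the labels):

```python
from aif360.datasets import GermanDataset
from aif360.metrics import ClassificationMetric

data = GermanDataset()

# Stand-in for a real classifier: copy the dataset and treat its labels
# as if they were model predictions.
data_pred = data.copy(deepcopy=True)

metric = ClassificationMetric(
    data, data_pred,
    unprivileged_groups=[{"sex": 0}], privileged_groups=[{"sex": 1}])

print("Disparate impact:             ", metric.disparate_impact())
print("Statistical parity difference:", metric.statistical_parity_difference())
print("Average odds difference:      ", metric.average_odds_difference())
print("Equal opportunity difference: ", metric.equal_opportunity_difference())
```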
Benchmark results: statistical parity difference (SPD); fair value is 0
AIF360, https://arxiv.org/abs/1810.01943
Benchmark results: disparate impact (DI); fair value is 1
AIF360, https://arxiv.org/abs/1810.01943
Adult Census Income dataset; protected attribute: race
AIF360, https://arxiv.org/abs/1810.01943
References
- ACM Conference on Fairness, Accountability, and Transparency (ACM FAT*): https://fatconference.org/
- IJCAI Workshop on Explainable Artificial Intelligence (XAI): http://home.earthlink.net/~dwaha/research/meetings/ijcai17-xai/
- Interpretable ML Symposium: http://interpretable.ml/
- Fairness Definitions Explained (Verma & Rubin): http://fairware.cs.umass.edu/papers/Verma.pdf
- AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias: https://arxiv.org/pdf/1810.01943.pdf
- Aequitas: A Bias and Fairness Audit Toolkit: https://arxiv.org/pdf/1811.05577.pdf
- FairTest: Discovering Unwarranted Associations in Data-Driven Applications: https://arxiv.org/pdf/1510.02377.pdf
- Tutorial: 21 fairness definitions and their politics: https://www.youtube.com/watch?v=jIXIuYdnyyk
- https://www.youtube.com/watch?v=XCFDckvyC0M