Bounding the fairness and accuracy of classifiers from population statistics
ICML 2020
Sivan Sabato and Elad Yom-Tov (Microsoft & BGU)

The 1-slide summary
We show how to study a classifier without even black-box access to the classifier and without validation data.
Our methodology makes provable inferences about classifier quality.
The quality combines the accuracy and the fairness of the classifier.
We make inferences using a small number of aggregate statistics.
We demonstrate a wide range of possible applications in experiments.

Introduction
Classifiers affect many aspects of our lives.
But some of these classifiers cannot be directly validated:
◮ Unavailability of representative individual-level validation data
◮ Company or government secret: not even black-box access
What can we infer about a classifier using only aggregate statistics?

What can we tell about an unpublished classifier?
A motivating example: A health insurance company classifies each client as “at risk” or not for some medical condition.
We do not know how this classification is done, and we have no individual classification data.
But we would still like to study the properties of the classifier:
◮ Accuracy
◮ Fairness
Can this be done with minimal information about the classifier?

Fairness
Fairness is defined with respect to some attribute of the individual.
◮ E.g., race, age, gender, state of residence
We will be interested in attributes with several different values.
A sub-population includes the individuals who share the attribute value (e.g., same race, age bracket, or state).
A fair classifier treats all sub-populations the same.
Equalized Odds [Hardt et al., 2016]: The false positive rate (FPR) and the false negative rate (FNR) are the same across all sub-populations.
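To make the Equalized Odds definition concrete, here is a minimal sketch that checks it from per-sub-population confusion counts. The `GroupCounts` container, the function names, and the sample counts are hypothetical and not from the talk. Note that such counts require individual-level data, which the setting of this talk assumes is unavailable; the sketch only illustrates the definition.

```python
from dataclasses import dataclass


@dataclass
class GroupCounts:
    """Confusion counts for one sub-population (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def rates(c):
    """Return (FPR, FNR) for one sub-population."""
    fpr = c.fp / (c.fp + c.tn)  # share of true negatives predicted positive
    fnr = c.fn / (c.fn + c.tp)  # share of true positives predicted negative
    return fpr, fnr


def satisfies_equalized_odds(groups, tol=1e-9):
    """True if FPR and FNR each agree across all sub-populations, up to tol."""
    all_rates = [rates(c) for c in groups.values()]
    fprs = [r[0] for r in all_rates]
    fnrs = [r[1] for r in all_rates]
    return max(fprs) - min(fprs) <= tol and max(fnrs) - min(fnrs) <= tol


# Made-up counts: both groups have FPR = 0.2 and FNR = 0.2.
fair_groups = {
    "A": GroupCounts(tp=40, fp=10, tn=40, fn=10),
    "B": GroupCounts(tp=80, fp=5, tn=20, fn=20),
}
# Made-up counts: group B now has a higher FPR than group A.
unfair_groups = {
    "A": GroupCounts(tp=40, fp=10, tn=40, fn=10),
    "B": GroupCounts(tp=80, fp=15, tn=10, fn=20),
}
print(satisfies_equalized_odds(fair_groups))
print(satisfies_equalized_odds(unfair_groups))
```

The two groups in `fair_groups` have very different prevalence and positive prediction rates, yet they satisfy Equalized Odds, because the definition constrains only the two error rates.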

Using population statistics
Back to the example: use available information:
Size of each sub-population
Prevalence rate of the condition in each sub-population
Fraction of positive predictions in each sub-population

State        Population fraction   Have condition   Classified as positive
California   12.2%                 0.3%             0.4%
Texas        8.6%                  1.2%             5%
...          ...                   ...              ...

What is the accuracy of this classifier? What is the fairness?
Without individual data, there are many possibilities.
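One way to see how many possibilities remain is to compute, for a single sub-population, the range of error, FPR, and FNR that is consistent with its prevalence and its fraction of positive predictions. The sketch below uses only the elementary bounds on the overlap of two events; it is not the method of the talk, and the function name `subpopulation_bounds` is an assumption made for illustration.

```python
def subpopulation_bounds(prevalence, positive_rate):
    """Ranges of error, FPR and FNR within one sub-population.

    prevalence    p = fraction that has the condition
    positive_rate q = fraction classified as positive
    The true-positive mass t = P(has condition and classified positive)
    is only known to satisfy max(0, p + q - 1) <= t <= min(p, q).
    """
    p, q = prevalence, positive_rate
    if not (0.0 < p < 1.0 and 0.0 <= q <= 1.0):
        raise ValueError("need 0 < prevalence < 1 and 0 <= positive_rate <= 1")
    tp_min = max(0.0, p + q - 1.0)
    tp_max = min(p, q)
    return {
        # error = false positives + false negatives = (q - t) + (p - t)
        "error": (p + q - 2 * tp_max, p + q - 2 * tp_min),
        "fpr": ((q - tp_max) / (1 - p), (q - tp_min) / (1 - p)),
        "fnr": ((p - tp_max) / p, (p - tp_min) / p),
    }


# Rows of the table above, as (prevalence, positive prediction rate).
table = {"California": (0.003, 0.004), "Texas": (0.012, 0.05)}
for state, (p, q) in table.items():
    print(state, subpopulation_bounds(p, q))
```

For the Texas row, the error within the state is at least |1.2% − 5%| = 3.8%, while the FNR can be anywhere between 0 and 1. Each sub-population taken alone is therefore weakly constrained; the constraints become informative once a fairness requirement ties the sub-populations together.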

The relationship between accuracy and fairness
If either fairness or error is constrained, the other is constrained as well.
Example:

          Population fraction   Have condition   Classified as positive
State A   1/2                   1/3              1/2
State B   1/2                   2/3              2/3

◮ True positives:
◮ Which are the predicted positives?
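The two-state example can be worked through numerically. The sketch below is our own working of the example, not necessarily the computation shown on the original slide. It assumes exact Equalized Odds, so both states share one FPR and one FNR, and uses the identity q = p(1 − FNR) + (1 − p)·FPR in each state, where p is the prevalence and q the fraction classified as positive. The function names are hypothetical.

```python
from fractions import Fraction as F

# (population fraction, prevalence p, positive prediction rate q) per state
state_a = (F(1, 2), F(1, 3), F(1, 2))
state_b = (F(1, 2), F(2, 3), F(2, 3))


def equalized_odds_rates(g1, g2):
    """Solve for the single (FPR, FNR) pair shared by two sub-populations.

    Each state gives one linear equation: (1 - p) * FPR - p * FNR = q - p.
    The 2x2 system is solved with Cramer's rule.
    """
    (_, p1, q1), (_, p2, q2) = g1, g2
    a1, b1, c1 = 1 - p1, -p1, q1 - p1
    a2, b2, c2 = 1 - p2, -p2, q2 - p2
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("equal prevalences: the rates are not determined")
    fpr = (c1 * b2 - c2 * b1) / det
    fnr = (a1 * c2 - a2 * c1) / det
    return fpr, fnr


def overall_error(groups, fpr, fnr):
    """Population error when every group has the given FPR and FNR."""
    return sum(w * (p * fnr + (1 - p) * fpr) for w, p, _ in groups)


def minimum_error(groups):
    """Smallest error consistent with the statistics, ignoring fairness."""
    return sum(w * abs(p - q) for w, p, q in groups)


fpr, fnr = equalized_odds_rates(state_a, state_b)
print("FPR, FNR under Equalized Odds:", fpr, fnr)
print("error under Equalized Odds:", overall_error([state_a, state_b], fpr, fnr))
print("minimum error without fairness:", minimum_error([state_a, state_b]))
```

Under these assumptions, a classifier that satisfies Equalized Odds exactly must have FPR 1/3 and FNR 1/6, which gives an overall error of 1/4. The smallest error consistent with the table is 1/12, but it is attained only when the FPR is 1/4 in State A and 0 in State B, which violates Equalized Odds. Constraining one quantity thus constrains the other.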
