Repairing without Retraining: Avoiding Disparate Impact with Counterfactual Distributions

Hao Wang, Berk Ustun, Flavio P. Calmon
hao_wang@g.harvard.edu, {berk,Flavio}@seas.harvard.edu

Outline
• Use cases
  • A bank enters a new market and discovers its credit score underperforms on customers over 60 years of age
  • A rural clinic purchases a classification model to detect lung cancer and discovers that patients in a certain subgroup have high FPR
• Framework and methodology
  • “Counterfactual distribution”
  • Local perturbation and influence function
  • Model repair

Disparate Impact
• Input variables X (e.g. age, criminal history)
• Sensitive attribute S (binary, e.g. race)
• Classifier outcome Ŷ (binary, e.g. recidivism risk)

[Figure: classifier performance for the target group and the baseline group; the difference between the two is the disparate impact.]

Changes in the input distribution can lead to different performance.
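The setup above can be made concrete with a minimal sketch that computes the three disparity metrics used later in the talk (SP, FNR, FPR) for each group. The function names, the encoding of the target group as S = 0, and the sign convention (baseline minus target) are assumptions made for illustration, not taken from the slides.

```python
def rate(values):
    """Mean of a list of 0/1 values; None if the list is empty."""
    return sum(values) / len(values) if values else None


def group_metrics(y_true, y_pred):
    """SP, FNR and FPR for one group of binary labels and predictions."""
    return {
        "SP": rate(y_pred),                                              # P(Yhat = 1)
        "FNR": rate([1 - p for t, p in zip(y_true, y_pred) if t == 1]),  # P(Yhat = 0 | Y = 1)
        "FPR": rate([p for t, p in zip(y_true, y_pred) if t == 0]),      # P(Yhat = 1 | Y = 0)
    }


def disparity(y_true, y_pred, s, target=0, baseline=1):
    """Gap per metric, computed as baseline value minus target value (assumed convention)."""
    def select(group):
        idx = [i for i, si in enumerate(s) if si == group]
        return [y_true[i] for i in idx], [y_pred[i] for i in idx]

    tgt = group_metrics(*select(target))
    base = group_metrics(*select(baseline))
    return {k: base[k] - tgt[k] for k in tgt}


# Toy data: four samples in the target group (S = 0), four in the baseline group (S = 1).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
s      = [0, 0, 0, 0, 1, 1, 1, 1]
gaps = disparity(y_true, y_pred, s)
```

A gap of zero on a metric means the classifier treats both groups alike under that metric; the sketch assumes each group contains at least one positive and one negative label.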

Counterfactual Distribution

Definition. For a given disparity metric $M(\cdot)$, a counterfactual distribution is a distribution of input variables over the target group such that:

$$Q_X \in \operatorname*{argmin}_{Q'_X \in \mathcal{P}} \left| M(Q'_X) \right|,$$

where $\mathcal{P}$ is the set of probability distributions over $X$.

Distributions over input (figure labels: $P_{X|S=0}$, $Q_X$, $P_{X|S=1}$)

                                Observed          Counterfactual
                                                  SP      FNR     FPR
Input variable                  Female  Male      Female  Female  Male
Married                         18%     63%       39%     23%     54%
Immigrant                       10%     11%       11%     11%     12%
HighestDegree is HS             32%     32%       24%     28%     37%
HighestDegree is AS             7%      8%        9%      9%      6%
HighestDegree is BS             15%     18%       21%     17%     13%
HighestDegree is MSorPhD        6%      7%        13%     8%      5%
AnyCapitalLoss                  3%      5%        8%      5%      4%
Age ≤ 30                        39%     29%       29%     38%     35%
WorkHrsPerWeek < 40             38%     17%       33%     37%     19%
JobType is WhiteCollar          34%     19%       36%     35%     15%
JobType is BlueCollar           5%      34%       4%      5%      39%
JobType is Specialized          23%     21%       29%     23%     20%
JobType is ArmedOrProtective    1%      2%        1%      1%      3%
Industry is Private             73%     69%       64%     69%     70%
Industry is Government          15%     12%       22%     17%     12%
Industry is SelfEmployed        5%      15%       8%      6%      13%
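To illustrate the definition, here is a minimal sketch that finds one distribution $Q_X$ with $|M(Q_X)| \approx 0$ on a small finite input space, taking $M$ to be the SP gap. It is not the descent procedure from the talk: it swaps in exponential tilting of the target distribution with a bisection search, chosen only because it fits in a few lines. All names (`tilt`, `counterfactual_sp`, `baseline_rate`) are hypothetical.

```python
import math


def tilt(p, yhat, t):
    """Reweight p(x) by exp(t * yhat(x)) and renormalize."""
    w = [pi * math.exp(t * yi) for pi, yi in zip(p, yhat)]
    z = sum(w)
    return [wi / z for wi in w]


def expected(q, yhat):
    """E_Q[yhat]: the positive prediction rate under distribution q."""
    return sum(qi * yi for qi, yi in zip(q, yhat))


def counterfactual_sp(p_target, yhat, baseline_rate, lo=-20.0, hi=20.0, iters=100):
    """Bisection on t so that E_Q[yhat] matches the baseline group's rate.

    E_Q[yhat] is non-decreasing in t, so bisection applies. Assumes
    0 < baseline_rate < 1 and that yhat is not constant on the support of p_target.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2
        if expected(tilt(p_target, yhat, mid), yhat) < baseline_rate:
            lo = mid
        else:
            hi = mid
    return tilt(p_target, yhat, (lo + hi) / 2)


# Toy example: four input values, classifier predicts 1 on the last two.
p_target = [0.4, 0.3, 0.2, 0.1]   # observed target-group distribution
yhat = [0, 0, 1, 1]               # classifier output per input value
q = counterfactual_sp(p_target, yhat, baseline_rate=0.6)
sp_gap = 0.6 - expected(q, yhat)  # M(Q) for statistical parity
```

The minimizer is generally not unique; tilting picks one member of a one-parameter family that stays close to the observed distribution.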

Goal: Model Repair

Goal: repair a classifier that has disparate impact by preprocessing the data.

[Figure: a new sample x from the baseline group goes directly to the classifier; a new sample x from the target group first passes through a preprocessor T(·), and T(x) goes to the classifier, reducing the performance disparity between the target group and the baseline group.]

We build the preprocessor in two steps:
1) Compute a counterfactual distribution that minimizes disparate impact.
2) Solve an optimal transport problem between the distribution of the target group population and the counterfactual distribution.
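Step 2 can be sketched for the simplest case: a single scalar feature on a sorted finite support with cost |x - x'|, where the monotone (north-west corner) coupling is an optimal transport plan. This is an illustrative assumption, not the solver used by the authors; `monotone_coupling` and `make_preprocessor` are hypothetical names.

```python
import random


def monotone_coupling(p, q, tol=1e-12):
    """Optimal plan between p and q on a shared sorted 1-D support.

    Returns {(i, j): mass}, the mass moved from support point i to point j.
    """
    plan = {}
    i = j = 0
    pi, qj = p[0], q[0]
    while i < len(p) and j < len(q):
        m = min(pi, qj)
        if m > tol:
            plan[(i, j)] = plan.get((i, j), 0.0) + m
        pi -= m
        qj -= m
        if pi <= tol:   # source point i is exhausted
            i += 1
            pi = p[i] if i < len(p) else 0.0
        if qj <= tol:   # destination point j is full
            j += 1
            qj = q[j] if j < len(q) else 0.0
    return plan


def make_preprocessor(xs, plan, rng=None):
    """Randomized map T: send xs[i] to xs[j] with probability plan[i, j] / p[i]."""
    rng = rng or random.Random(0)
    rows = {}
    for (i, j), m in plan.items():
        rows.setdefault(i, []).append((j, m))

    def T(x):
        i = xs.index(x)          # raises ValueError if x is outside the support
        if i not in rows:        # no mass at x under p: leave it unchanged
            return x
        js, ms = zip(*rows[i])
        return xs[rng.choices(js, weights=ms)[0]]

    return T


xs = [0, 1, 2]             # sorted support of the scalar feature
p = [0.5, 0.3, 0.2]        # target-group distribution
q = [0.2, 0.3, 0.5]        # counterfactual distribution
plan = monotone_coupling(p, q)
cost = sum(m * abs(xs[i] - xs[j]) for (i, j), m in plan.items())
T = make_preprocessor(xs, plan)
```

Applying `T` to samples drawn from `p` yields samples distributed according to `q`, and the classifier itself is never retrained. For multivariate inputs a general linear-programming or entropic solver would be needed instead.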

Numerical Experiments: COMPAS [Angwin et al., 2016] and UCI Adult [Bache and Lichman, 2013]

                                Original Model                Repaired Model      Target Group AUC
Dataset  Metric  Target Group   Target   Baseline  Disc.      Target   Disc.      Before   After
                                Group    Group     Gap        Group    Gap        Repair   Repair
adult    SP      Female         0.696    0.874     0.178      0.688    -0.007     0.895    0.758
adult    FNR     Female         0.478    0.639     0.161      0.483     0.004     0.895    0.880
adult    FPR     Male           0.021    0.119     0.098      0.023     0.002     0.829    0.714
compas   SP      White          0.514    0.594     0.079      0.533     0.018     0.704    0.667
compas   FNR     White          0.350    0.487     0.137      0.439     0.088     0.704    0.699
compas   FPR     Non-white      0.190    0.278     0.087      0.160    -0.029     0.732    0.680

Poster Session: Thursday 06:30 -- 09:00 PM, Pacific Ballroom
http://github.com/ustunb/ctfdist
