Predict Responsibly: Increasing Fairness by Learning to Defer

SLIDE 1

Predict Responsibly: Increasing Fairness by Learning to Defer

David Madras, Toniann Pitassi, Richard Zemel

University of Toronto, Vector Institute

December 8, 2017

David Madras, Toniann Pitassi, Richard Zemel (University of Toronto, Vector Institute) Predict Responsibly December 8, 2017 1 / 15

SLIDE 5

The Judge and the Black-Box

“0.6”

What does the prediction “0.6” mean? What qualities should it have?

SLIDE 9

What We Want From Black Box Predictions

1. Accuracy
2. Fairness
3. Responsibility: the ability to say “I Don’t Know” (IDK)

SLIDE 10

Why Say IDK?

The judge is an external decision maker (DM) who may have more knowledge:
  - Can seek out extra information on difficult cases
  - Can assess qualitative or difficult-to-codify features
  - Can access privacy-sensitive information

SLIDE 11

Learning to Punt

Three possible outputs: “Positive”, “Negative”, and “IDK”
Learn two thresholds: $t_0$, $t_1$
At test time, punt to the DM if $t_0 < x_i < t_1$; otherwise, output the prediction
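
The two-threshold rule can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that x_i is the model's score for example i, that scores at or above t1 map to “Positive”, and that scores at or below t0 map to “Negative”. The function name `punt_or_predict` and the threshold values are hypothetical.

```python
def punt_or_predict(score, t0, t1):
    """Return "Negative", "IDK", or "Positive" for a single example.

    Assumptions (not stated on the slide): `score` is the model's output
    for the example, t0 <= t1, and high scores mean "Positive".
    """
    if t0 > t1:
        raise ValueError("expected t0 <= t1")
    if t0 < score < t1:
        return "IDK"  # punt to the external decision maker (DM)
    return "Positive" if score >= t1 else "Negative"


# Hypothetical scores and thresholds, chosen only for illustration.
scores = [0.1, 0.45, 0.6, 0.9]
decisions = [punt_or_predict(s, t0=0.4, t1=0.7) for s in scores]
```

With these made-up thresholds, the two middle scores fall inside the interval and are punted, while the outer two receive a prediction.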

SLIDE 12

Results - Punting

Trained our model (2-layer NN) with fair regularization:
$L_{fair} = \text{Accuracy} + \alpha \cdot \text{Fairness}$
Simulated external DM by training a separate (unfair) model
This DM received some extra attributes in training, simulating a possible real-life imbalance between DM and model
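
One way to read the regularized objective is as a prediction loss plus a weighted fairness penalty. The sketch below is an assumption-heavy illustration, not the authors' implementation: the slide does not specify either term, so binary cross-entropy stands in for the accuracy term and the absolute gap in mean predicted score between two groups stands in for the fairness term. All function names are hypothetical.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy (assumed stand-in for the accuracy term)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def group_gap(y_pred, group):
    """Absolute gap in mean score between groups 0 and 1.

    This is an illustrative fairness penalty, not the measure from the talk.
    """
    g0 = [p for p, a in zip(y_pred, group) if a == 0]
    g1 = [p for p, a in zip(y_pred, group) if a == 1]
    return abs(sum(g0) / len(g0) - sum(g1) / len(g1))


def fair_loss(y_true, y_pred, group, alpha):
    """Prediction loss plus alpha times the fairness penalty."""
    return cross_entropy(y_true, y_pred) + alpha * group_gap(y_pred, group)


# Made-up labels, scores, and group memberships for illustration.
y_true = [1, 0, 1, 0]
y_pred = [0.9, 0.2, 0.6, 0.1]
group = [0, 0, 1, 1]
unregularized = fair_loss(y_true, y_pred, group, alpha=0.0)
regularized = fair_loss(y_true, y_pred, group, alpha=1.0)
```

Setting alpha to 0 recovers the plain prediction loss; larger values of alpha put more weight on closing the gap between groups.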

SLIDE 13

Results - COMPAS

[Figure: COMPAS results for learning to punt. Axes: Error Rate (0.22 to 0.34) and Disparate Impact (0.00 to 0.20). Series: baseline-acc, DM, punt-fair, punt-unfair, binary-fair.]

SLIDE 14

Results - Heritage Health

[Figure: Heritage Health results for learning to punt. Axes: Error Rate (0.16 to 0.24) and Disparate Impact (0.00 to 0.45). Series: baseline-acc, DM, punt-fair, punt-unfair, binary-fair.]

SLIDE 15

DM-Aware Learning

What if judge has access to extra info on some defendants?

Detailed written analysis, classified info, further inquiry

What if judge is biased towards some types of defendants?

Unfairness may be concentrated on a few examples

By using info about the DM during learning, we could punt more intelligently.
This is learning to defer.

SLIDE 16

Learning to Defer

Modify our model to take DM scores $Y_{DM}$ on the training set
Use the IDK output as a mixing parameter $\pi_i$
Can describe the system output $Y_{sys}$ as a function of $s \sim \mathrm{Bernoulli}(\pi_i)$, $Y_{DM}$, and $Y_{model}$:
$Y_{sys} = s \cdot Y_{DM} + (1 - s) \cdot Y_{model}$
$s \in \{0, 1\}$; $Y_{sys}, Y_{DM}, Y_{model} \in [0, 1]$
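
The mixing rule can be sketched as a sampling step followed by a selection. In this illustrative Python sketch, the function name `system_output` and the example values are hypothetical; only the formula for the system output comes from the slide.

```python
import random


def system_output(pi, y_dm, y_model, rng):
    """Sample s ~ Bernoulli(pi) and return s * y_dm + (1 - s) * y_model.

    `pi` is the model's deferral probability for one example; `rng` is a
    random.Random instance so that runs are reproducible.
    """
    s = 1 if rng.random() < pi else 0  # s = 1 means "defer to the DM"
    return s * y_dm + (1 - s) * y_model


rng = random.Random(0)
# With pi = 1 the system always returns the DM's score;
# with pi = 0 it always returns the model's own score.
always_defer = system_output(1.0, y_dm=0.8, y_model=0.3, rng=rng)
never_defer = system_output(0.0, y_dm=0.8, y_model=0.3, rng=rng)
```

For intermediate values of pi, the output is the DM's score on a fraction pi of draws (on average) and the model's score otherwise.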

SLIDE 17

Learning to Defer

Suppose we are optimizing some loss function $L(Y, Y_{sys})$ over ground truth labels $Y$ and system output $Y_{sys}$
We can then define a new loss function $L_{Defer}$:
$L_{Defer}(Y, Y_{sys}) = \mathbb{E}_s L(Y, Y_{sys}) = \mathbb{E}_s L(Y, s \cdot Y_{DM} + (1 - s) \cdot Y_{model})$
Penalty for IDK ≈ DM loss on that example
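
Because s takes only the values 0 and 1, with probability π_i of being 1, the expectation expands into a weighted sum: π_i · L(Y, Y_DM) + (1 − π_i) · L(Y, Y_model). The Python sketch below evaluates that sum for a single example. The choice of log loss for L and the names `log_loss` and `defer_loss` are assumptions made for illustration; the slide leaves L unspecified.

```python
import math


def log_loss(y, p, eps=1e-12):
    """Binary log loss for one example (assumed choice of L)."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def defer_loss(y, pi, y_dm, y_model, loss=log_loss):
    """Expected loss over s ~ Bernoulli(pi) for a single example."""
    return pi * loss(y, y_dm) + (1.0 - pi) * loss(y, y_model)


# Made-up values: true label 1, DM score 0.8, model score 0.3.
full_defer = defer_loss(1, pi=1.0, y_dm=0.8, y_model=0.3)
half_defer = defer_loss(1, pi=0.5, y_dm=0.8, y_model=0.3)
no_defer = defer_loss(1, pi=0.0, y_dm=0.8, y_model=0.3)
```

When pi is 1, the loss on the example equals the DM's loss, which matches the reading that the penalty for saying IDK is roughly the DM's loss on that example.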

SLIDE 18

Results (Learning to Defer) - COMPAS

[Figure: COMPAS results for learning to defer. Axes: Error Rate (0.22 to 0.34) and Disparate Impact (0.00 to 0.20). Series: defer-fair, punt-fair, binary-fair, DM, baseline-acc.]

SLIDE 19

Results (Learning to Defer) - Heritage Health

[Figure: Heritage Health results for learning to defer. Axes: Error Rate (0.16 to 0.24) and Disparate Impact (0.00 to 0.45). Series: defer-fair, punt-fair, binary-fair, DM, baseline-acc.]

SLIDE 20

Conclusion

We argue that it is important to consider IDK models as part of a larger pipeline
We demonstrate that learning to defer can provide benefits above and beyond learning to punt
Deferring intelligently can improve the entire pipeline in both accuracy and fairness

SLIDE 21

Thank you!
