SLIDE 1

Fairness in ML 2: Equal opportunity and odds

Privacy & Fairness in Data Science CS848 Fall 2019

Slides adapted from https://fairmlclass.github.io/4.html

SLIDE 2

Outline

  • Recap: Disparate Impact
      – Issues with Disparate Impact
  • Observational measures of fairness
      – Equal Opportunity and Equalized Odds
      – Predictive Value Parity
      – Tradeoff
  • Achieving Equalized Odds
      – Binary classifier

SLIDE 3

Recap: Disparate Impact

  • Let D = (X, Y, C) be a labeled data set, where X = 0 means protected, C = 1 is the positive class (e.g., admitted), and Y is everything else (the features).
  • We say that a classifier f has disparate impact (DI) of υ (0 < υ < 1) if

        Pr[f(Y) = 1 | X = 0] / Pr[f(Y) = 1 | X = 1] ≤ υ

    that is, if the protected class is positively classified less than υ times as often as the unprotected class (legally, υ = 0.8 is common).
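As a quick illustration of the definition, the DI ratio can be computed directly from predictions and group labels. A minimal sketch in Python (the function name and list-based interface are my own, not from the slides):

```python
def disparate_impact(predictions, protected):
    """Ratio Pr[f(Y)=1 | X=0] / Pr[f(Y)=1 | X=1].

    predictions: list of 0/1 classifier outputs f(Y)
    protected:   list of 0/1 group labels X (0 = protected group)
    """
    def rate(x):
        # positive-classification rate within group X = x
        return (sum(p for p, g in zip(predictions, protected) if g == x)
                / sum(1 for g in protected if g == x))
    return rate(0) / rate(1)

# f has DI of v if the ratio is at most v; legally v = 0.8 is common.
preds = [1, 0, 0, 0, 1, 1, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
print(disparate_impact(preds, group))  # 0.25 / 0.75 = 1/3, well below 0.8
```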

SLIDE 4

Recap: Disparate Impact

[Table: each row is a person, with features Y (columns X1, …), protected attribute X (Race), and label C (Bail: 1 = Y, 0 = N); f(Y) is the prediction.]

    Notation:
        Q_{X=1}[F] = Pr[F | X = 1]
        Q_{X=0}[F] = Pr[F | X = 0]   (the protected group)

SLIDE 5

Recap: Disparate Impact

[Same table: features Y, prediction f(Y), protected attribute X; the protected group is X = 0.]

    Classifier f has DI of υ:

        Q_{X=0}[f(Y) = 1] / Q_{X=1}[f(Y) = 1] ≤ υ

SLIDE 6

Demographic parity (or the reverse of disparate impact)

  • Definition. Classifier f satisfies demographic parity if f(Y) is independent of X.
  • When f is a binary 0/1-variable, this means that for all groups y and y′,

        Q_{X=y}[f(Y) = 1] = Q_{X=y′}[f(Y) = 1]

  • Approximate versions:

        Q_{X=y}[f(Y) = 1] / Q_{X=y′}[f(Y) = 1] ≥ 1 − ϑ

        | Q_{X=y}[f(Y) = 1] − Q_{X=y′}[f(Y) = 1] | ≤ ϑ
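The exact condition and the additive approximate version can be sketched as a small check (names and data are illustrative, not from the slides):

```python
def positive_rate(predictions, groups, x):
    """Q_{X=x}[f(Y) = 1]: positive-classification rate within group x."""
    members = [p for p, g in zip(predictions, groups) if g == x]
    return sum(members) / len(members)

def demographic_parity_gap(predictions, groups):
    """Largest |Q_{X=y}[f(Y)=1] - Q_{X=y'}[f(Y)=1]| over group pairs."""
    rates = [positive_rate(predictions, groups, x) for x in set(groups)]
    return max(rates) - min(rates)

# Both groups are positively classified at rate 1/2, so the gap is 0.
preds = [1, 1, 0, 0, 1, 0, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
print(demographic_parity_gap(preds, group))  # 0.0
```

Approximate demographic parity with tolerance ϑ then amounts to `demographic_parity_gap(...) <= ϑ`.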

SLIDE 7

Demographic parity: Issues

[Figure: groups X = 1 and X = 0 with the positive class C = 1 overlaid, showing the two groups need not have the same base rate.]

SLIDE 8

Demographic parity: Issues

  • Does not seem "fair" to allow random performance on X = 0
  • Perfect classification is impossible

[Figure: a classifier that performs well on X = 1 but labels X = 0 at random can still satisfy demographic parity; conversely, when base rates differ, a perfect classifier violates it.]

SLIDE 9

Outline

  • Recap: Disparate Impact
      – Issues with Disparate Impact
  • Observational measures of fairness
      – Equal Opportunity and Equalized Odds
      – Predictive Value Parity
      – Tradeoff
  • Achieving Equalized Odds
      – Binary classifier

SLIDE 10

True Positive Parity (TPP) (or equal opportunity)

  • Assume classifier f and label C are binary 0/1-variables
  • Definition. Classifier f satisfies true positive parity if for all groups y and y′,

        Q_{X=y}[f(Y) = 1 | C = 1] = Q_{X=y′}[f(Y) = 1 | C = 1]

  • Appropriate when the positive outcome (1) is desirable
  • Equivalently, when the primary harm is due to false negatives
      – e.g., denying bail to a person who will not recidivate

SLIDE 11

TPP

  • Forces similar performance on C = 1

[Figure: both groups X = 1 and X = 0 are classified equally well on the positive class C = 1.]

SLIDE 12

False Positive Parity (FPP)

  • Assume classifier f and label C are binary 0/1-variables
  • Definition. Classifier f satisfies false positive parity if for all groups y and y′,

        Q_{X=y}[f(Y) = 1 | C = 0] = Q_{X=y′}[f(Y) = 1 | C = 0]

  • TPP & FPP together: Equalized Odds, or Positive Rate Parity

        f satisfies equalized odds if f(Y) is conditionally independent of X given C.
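Checking TPP and FPP together amounts to comparing two group-conditional rates. A minimal sketch (function names and the toy data are mine; the data mirrors the TPR = 1, FPR = 1/2 example on a later slide):

```python
def conditional_rate(preds, labels, groups, x, c):
    """Q_{X=x}[f(Y) = 1 | C = c]: group-conditional positive rate."""
    sel = [p for p, y, g in zip(preds, labels, groups) if g == x and y == c]
    return sum(sel) / len(sel)

def equalized_odds_holds(preds, labels, groups, tol=1e-9):
    """TPP (C = 1) and FPP (C = 0) between groups X = 1 and X = 0."""
    gaps = [abs(conditional_rate(preds, labels, groups, 1, c)
                - conditional_rate(preds, labels, groups, 0, c))
            for c in (1, 0)]
    return max(gaps) <= tol

# Both groups: TPR = 1, FPR = 1/2, so equalized odds holds.
preds  = [1, 1, 1, 0,  1, 1, 0, 1]
labels = [1, 1, 0, 0,  1, 1, 0, 0]
groups = [1, 1, 1, 1,  0, 0, 0, 0]
print(equalized_odds_holds(preds, labels, groups))  # True
```

Equal opportunity alone is the same check restricted to c = 1.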

SLIDE 13

Positive Rate Parity

[Figure: example predictions (✔ = positively classified) for groups X = 1 and X = 0.]

    Q_{X=1}[f(Y) = 1 | C = 1] = ?
    Q_{X=0}[f(Y) = 1 | C = 1] = ?
    Q_{X=1}[f(Y) = 1 | C = 0] = ?
    Q_{X=0}[f(Y) = 1 | C = 0] = ?

SLIDE 14

Positive Rate Parity

[Figure: same example predictions for groups X = 1 and X = 0.]

    Q_{X=1}[f(Y) = 1 | C = 1] = 1
    Q_{X=0}[f(Y) = 1 | C = 1] = 1
    Q_{X=1}[f(Y) = 1 | C = 0] = 1/2
    Q_{X=0}[f(Y) = 1 | C = 0] = 1/2

SLIDE 15

Outline

  • Recap: Disparate Impact
      – Issues with Disparate Impact
  • Observational measures of fairness
      – Equal Opportunity and Equalized Odds
      – Predictive Value Parity
      – Tradeoff
  • Achieving Equalized Odds
      – Binary classifier

SLIDE 16

Predictive Value Parity

  • Assume classifier f and label C are binary 0/1-variables
  • Definition. Classifier f satisfies
      – positive predictive value parity if, for all groups y and y′,

            Q_{X=y}[C = 1 | f(Y) = 1] = Q_{X=y′}[C = 1 | f(Y) = 1]

      – negative predictive value parity if, for all groups y and y′,

            Q_{X=y}[C = 1 | f(Y) = 0] = Q_{X=y′}[C = 1 | f(Y) = 0]

      – predictive value parity if it satisfies both of the above.
  • Equalized chance of success given acceptance.
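The predictive-value direction conditions on the prediction rather than the label. A small sketch (names and toy data are mine, not from the slides):

```python
def predictive_value(preds, labels, groups, x, d):
    """Q_{X=x}[C = 1 | f(Y) = d]: success rate given prediction d in group x."""
    sel = [y for p, y, g in zip(preds, labels, groups) if g == x and p == d]
    return sum(sel) / len(sel)

# Illustrative data: everyone is accepted, but success rates differ by group,
# so positive predictive value parity fails.
preds  = [1, 1, 1,  1, 1, 1]
labels = [1, 1, 0,  1, 0, 0]
groups = [1, 1, 1,  0, 0, 0]
ppv_x1 = predictive_value(preds, labels, groups, 1, 1)  # 2/3
ppv_x0 = predictive_value(preds, labels, groups, 0, 1)  # 1/3
print(ppv_x1, ppv_x0)
```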

SLIDE 17

Predictive Value Parity

[Figure: example predictions for groups X = 1 and X = 0.]

    Q_{X=1}[C = 1 | f(Y) = 1] = ?
    Q_{X=0}[C = 1 | f(Y) = 1] = ?
    Q_{X=1}[C = 1 | f(Y) = 0] = ?
    Q_{X=0}[C = 1 | f(Y) = 0] = ?

SLIDE 18

Predictive Value Parity

[Figure: same example predictions for groups X = 1 and X = 0.]

    Q_{X=1}[C = 1 | f(Y) = 1] = 8/9
    Q_{X=0}[C = 1 | f(Y) = 1] = 1/3
    Q_{X=1}[C = 1 | f(Y) = 0] = 0
    Q_{X=0}[C = 1 | f(Y) = 0] = 0

SLIDE 19

Trade-off

  • Proposition. Assume differing base rates and an imperfect classifier (f(Y) ≠ C). Then either
      – positive rate parity fails, or
      – predictive value parity fails.
  • We will look at a similar result later in the course, due to Kleinberg, Mullainathan and Raghavan (2016).
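A quick sanity check of the proposition: fix the same TPR and FPR in both groups (equalized odds) and, by Bayes' rule, PPV = TPR·p / (TPR·p + FPR·(1 − p)) depends on the group's base rate p, so with differing base rates the predictive values must differ. The numbers below are my own illustration:

```python
def ppv(tpr, fpr, base_rate):
    """Positive predictive value via Bayes' rule,
    PPV = TPR*p / (TPR*p + FPR*(1 - p)) with p = Pr[C = 1]."""
    p = base_rate
    return tpr * p / (tpr * p + fpr * (1 - p))

# Same TPR/FPR in both groups, imperfect classifier (FPR > 0),
# but differing base rates:
tpr, fpr = 0.8, 0.2
ppv_a = ppv(tpr, fpr, base_rate=0.50)  # 0.8
ppv_b = ppv(tpr, fpr, base_rate=0.25)  # 4/7, about 0.57
print(ppv_a, ppv_b)  # predictive value parity fails
```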

SLIDE 20

Intuition

  • So far, the predictor is perfect.
  • Let's introduce an error.

SLIDE 21

Intuition

  • But this doesn't satisfy positive rate parity!
  • Let's fix that!

SLIDE 22

Intuition

  • Satisfies positive rate parity!

SLIDE 23

Intuition

  • Does not satisfy predictive value parity!

SLIDE 24

[Figure only.]

SLIDE 25

Outline

  • Recap: Disparate Impact
      – Issues with Disparate Impact
  • Observational measures of fairness
      – Equal Opportunity and Equalized Odds
      – Predictive Value Parity
      – Tradeoff
  • Achieving Equalized Odds
      – Binary classifier

SLIDE 26

Equalized Odds

    f satisfies equalized odds if f(Y) is conditionally independent of the protected attribute X given the outcome C.

  • Let f̂ be any classifier from the existing training pipeline for the problem at hand that fails to satisfy equalized odds.

SLIDE 27

Classifier f̂ that does not satisfy equalized odds

[Figure: example predictions for groups X = 1 and X = 0 where]

    Q_{X=1}[f̂(Y) = 1 | C = 0] ≠ Q_{X=0}[f̂(Y) = 1 | C = 0]

SLIDE 28

Derived Classifier

  • A new classifier f̃ is derived from f̂ and the protected attribute X
      – f̃ is independent of the features Y conditional on (f̂, X)
      – Q_{X=1}[f̃(Y) = d | C = 1] = Σ_{d′ ∈ {0,1}} Pr[f̃ = d | f̂(Y) = d′, X = 1] · Q_{X=1}[f̂(Y) = d′ | C = 1]
      – Q_{X=1}[f̃(Y) = d | C = 0] = Σ_{d′ ∈ {0,1}} Pr[f̃ = d | f̂(Y) = d′, X = 1] · Q_{X=1}[f̂(Y) = d′ | C = 0]
      – likewise for Q_{X=0}[f̃(Y) = d | C = 1] and Q_{X=0}[f̃(Y) = d | C = 0]

    Flip probabilities Pr[f̃ | f̂, X]:

    X = 1:          f̂ = 0    f̂ = 1
      f̃ = 0         p0       p1
      f̃ = 1         1 − p0   1 − p1

    X = 0:          f̂ = 0    f̂ = 1
      f̃ = 0         p2       p3
      f̃ = 1         1 − p2   1 − p3
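Since the derived classifier looks only at (f̂, X) and the four parameters p0–p3, it can be sketched as a tiny sampler (the function name and interface are my own; parameter names follow the table):

```python
import random

def derived_prediction(f_hat, x, p, rng=random):
    """Sample the derived prediction f~ from (f_hat, X) only.

    p = (p0, p1, p2, p3): Pr[f~ = 0 | f_hat, X], with p0, p1 used for
    X = 1 (f_hat = 0 and 1 respectively) and p2, p3 for X = 0.
    """
    p0, p1, p2, p3 = p
    prob_zero = (p0, p1) if x == 1 else (p2, p3)
    return 0 if rng.random() < prob_zero[f_hat] else 1

# p = (1, 0, 1, 0) keeps f~ = f_hat deterministically; p = (0, 1, 0, 1) flips it.
print(derived_prediction(1, 1, (1, 0, 1, 0)))  # 1
print(derived_prediction(1, 1, (0, 1, 0, 1)))  # 0
```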

SLIDE 29

Derived Classifier

  • Options for f̃:
      – f̃ = f̂ (marked + in the figure)
      – f̃ = 1 − f̂ (marked x)
      – f̃ = 1 always
      – f̃ = 0 always
      – or some randomized combination of these

[Figure: plot of Q_{X=1}[f̃(Y) = 1 | C = 0] against Q_{X=1}[f̃(Y) = 1 | C = 1], both axes from 0.0 to 1.0; f̃ lies in the region enclosed by the four options above.]

SLIDE 30

Derived Classifier

[Figure: in the plane of Q[f̃(Y) = 1 | C = 0] vs. Q[f̃(Y) = 1 | C = 1], f̃ lies in one region for X = 0 and another for X = 1; an equalized-odds f̃ must lie in both regions at once.]

SLIDE 31

Derived Classifier

  • Loss function ℓ: {0,1}² → ℝ
      – ℓ(d, d′′) is the loss of predicting f̃(Y) = d when the correct label is d′′
  • Minimize the expected loss E[ℓ(f̃(Y), C)] s.t.
      – f̃ is derived
      – f̃ satisfies equalized odds:

            Q_{X=1}[f̃(Y) = 1 | C = 1] = Q_{X=0}[f̃(Y) = 1 | C = 1]
            Q_{X=1}[f̃(Y) = 1 | C = 0] = Q_{X=0}[f̃(Y) = 1 | C = 0]

SLIDE 32

Derived Classifier

  • E[ℓ(f̃(Y), C)] = Σ_{d, d′′ ∈ {0,1}} ℓ(d, d′′) · Pr[f̃(Y) = d, C = d′′]
  • Pr[f̃ = d, C = d′′]
        = Pr[f̃ = d, C = d′′ | f̃ = f̂] Pr[f̃ = f̂] + Pr[f̃ = d, C = d′′ | f̃ ≠ f̂] Pr[f̃ ≠ f̂]
        = Pr[f̂ = d, C = d′′] Pr[f̃ = f̂] + Pr[f̂ = 1 − d, C = d′′] Pr[f̃ ≠ f̂]

    All of these terms follow from the joint distribution of (f̂, C) within each group and the flip-probability tables p0, …, p3.
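Putting the last few slides together: the expected loss is linear in the flip parameters, so the optimum can be found as a linear program; as a rough sketch of the same idea, a coarse grid search over the per-group probabilities Pr[f̃ = 1 | f̂ = d] works for intuition. The joint distribution numbers below are made up for illustration, not from the slides:

```python
import itertools

# Illustrative joint distributions Pr[f_hat = d, C = c] per group X
# (made-up numbers for this sketch; keys are (d, c)).
joint = {
    1: {(0, 0): 0.30, (1, 0): 0.10, (0, 1): 0.05, (1, 1): 0.55},
    0: {(0, 0): 0.35, (1, 0): 0.25, (0, 1): 0.15, (1, 1): 0.25},
}

def rate(q, x, c):
    """Q_x[f~ = 1 | C = c] when Pr[f~ = 1 | f_hat = d, X = x] = q[d]."""
    mass = joint[x][(0, c)] + joint[x][(1, c)]
    return (joint[x][(0, c)] * q[0] + joint[x][(1, c)] * q[1]) / mass

def loss(q, x):
    """Expected 0/1 loss within group x under flip parameters q."""
    return sum(pr * (q[d] if c == 0 else 1 - q[d])
               for (d, c), pr in joint[x].items())

# Grid search over q = (Pr[f~=1 | f_hat=0], Pr[f~=1 | f_hat=1]) per group,
# keeping equalized odds within a small tolerance.
grid = [i / 10 for i in range(11)]
best = None
for qa in itertools.product(grid, repeat=2):      # group X = 1
    for qb in itertools.product(grid, repeat=2):  # group X = 0
        if abs(rate(qa, 1, 1) - rate(qb, 0, 1)) > 0.01:
            continue  # true positive rates differ: equalized odds fails
        if abs(rate(qa, 1, 0) - rate(qb, 0, 0)) > 0.01:
            continue  # false positive rates differ
        total = 0.5 * (loss(qa, 1) + loss(qb, 0))  # equal group weights
        if best is None or total < best[0]:
            best = (total, qa, qb)

print(best)  # (expected loss, q for X = 1, q for X = 0)
```

A constant randomized classifier (q = (t, t) in both groups) always satisfies the constraints, so the search is never infeasible; the LP in Hardt et al.'s formulation solves the same problem exactly.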

SLIDE 33

Summary: Multiple fairness measures

  • Demographic parity / disparate impact
      – Pro: used in the law
      – Con: perfect classification is impossible
      – Achieved by modifying the data
  • Equalized odds / equal opportunity
      – Pro: perfect classification is possible
      – Con: different groups can get different rates of positive prediction
      – Achieved by post-processing the classifier

SLIDE 34

Summary: Multiple fairness measures

  • Equalized odds / equal opportunity
      – Different groups may be treated unequally
          • maybe due to the problem itself
          • maybe due to bias in the dataset
  • While demographic parity seems like a good fairness goal for society, equalized odds / equal opportunity seems to measure whether an algorithm is fair (independent of other factors, like the input data).

SLIDE 35

Summary: Multiple fairness measures

  • Fairness through Awareness:
      – Needs a distance function d(x, x′)
      – A guarantee at the individual level (rather than on groups)
      – How does this connect to other notions of fairness?