
SLIDE 1

Why is My Classifier Discriminatory?

Irene Y. Chen, Fredrik D. Johansson, David Sontag
Massachusetts Institute of Technology (MIT)

NeurIPS 2018, Poster #120 Thurs 12/6 10:45am – 12:45pm @ 210 & 230

SLIDE 2

It is surprisingly easy to make a discriminatory algorithm.

SLIDE 3

Source: Shutterstock

SLIDE 4

[Figure: zero-one loss by race (White, Other, Hispanic, Black, Asian), roughly 0.16 to 0.22]

SLIDE 5

In this paper

  • 1. We want to find the sources of unfairness to guide resource allocation.

SLIDE 6

In this paper

  • 1. We want to find the sources of unfairness to guide resource allocation.
  • 2. We decompose unfairness into bias, variance, and noise.

SLIDE 7

In this paper

  • 1. We want to find the sources of unfairness to guide resource allocation.
  • 2. We decompose unfairness into bias, variance, and noise.
  • 3. We demonstrate methods to guide feature augmentation and training data collection to fix unfairness.

SLIDE 8

Classification fairness: many factors

Model

  • Loss function constraints (Kamiran et al., 2010; Zafar et al., 2017)
  • Representation learning (Zemel et al., 2013)
  • Regularization (Kamishima et al., 2007; Bechavod and Ligett, 2017)
  • Tradeoffs (Chouldechova, 2017; Kleinberg et al., 2016; Corbett-Davies et al., 2017)

SLIDE 9

Classification fairness: many factors

Model

  • Loss function constraints (Kamiran et al., 2010; Zafar et al., 2017)
  • Representation learning (Zemel et al., 2013)
  • Regularization (Kamishima et al., 2007; Bechavod and Ligett, 2017)
  • Tradeoffs (Chouldechova, 2017; Kleinberg et al., 2016; Corbett-Davies et al., 2017)

Data

SLIDE 10

Classification fairness: many factors

Model

  • Loss function constraints (Kamiran et al., 2010; Zafar et al., 2017)
  • Representation learning (Zemel et al., 2013)
  • Regularization (Kamishima et al., 2007; Bechavod and Ligett, 2017)
  • Tradeoffs (Chouldechova, 2017; Kleinberg et al., 2016; Corbett-Davies et al., 2017)

Data

  • Data processing (Hajian and Domingo-Ferrer, 2013; Feldman et al., 2015)
  • Cohort selection
  • Sample size
  • Number of features
  • Group distribution
SLIDE 11

Classification fairness: many factors

Model

  • Loss function constraints (Kamiran et al., 2010; Zafar et al., 2017)
  • Representation learning (Zemel et al., 2013)
  • Regularization (Kamishima et al., 2007; Bechavod and Ligett, 2017)
  • Tradeoffs (Chouldechova, 2017; Kleinberg et al., 2016; Corbett-Davies et al., 2017)

Data

  • Data processing (Hajian and Domingo-Ferrer, 2013; Feldman et al., 2015)
  • Cohort selection
  • Sample size
  • Number of features
  • Group distribution

We should examine fairness algorithms in the context of the data and the model.

SLIDE 12

Why might my classifier be unfair?


SLIDE 14

Why might my classifier be unfair?

True data function

SLIDE 15

Why might my classifier be unfair?

SLIDE 16

Why might my classifier be unfair?

Learned model


SLIDE 18

Why might my classifier be unfair?

Learned model True data function

SLIDE 19

Why might my classifier be unfair?

Learned model True data function

Error from variance can be solved by collecting more samples.

SLIDE 20

Why might my classifier be unfair?

SLIDE 21

Why might my classifier be unfair?

Learned model


SLIDE 23

Why might my classifier be unfair?

Learned model Orange dot model error

SLIDE 24

Why might my classifier be unfair?

Learned model Orange dot model error Blue dot model error

SLIDE 25

Why might my classifier be unfair?

True data functions: $z = 1.6y^3$ and $z = y - 2$

SLIDE 26

Why might my classifier be unfair?

True data functions: $z = 1.6y^3$ and $z = y - 2$

Error from bias can be solved by changing the model class.
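As a rough illustration of this slide's toy setup (the sampling range, the squared-error metric, and fitting one model per group are assumptions, not taken from the figure), the sketch below compares the per-group error gap under a linear model class versus a cubic one:

```python
# Toy sketch of the slide's example (assumed setup, not the authors' figure):
# group 0 follows z = 1.6*y**3, group 1 follows z = y - 2. Fit one model of a
# given class to each group and compare per-group mean squared error.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
y = rng.uniform(-2, 2, size=(2000, 1))
group = rng.integers(0, 2, size=2000)
z = np.where(group == 0, 1.6 * y[:, 0] ** 3, y[:, 0] - 2)

def per_group_mse(make_model):
    """Fit a fresh model of the given class to each group's data."""
    out = {}
    for g in (0, 1):
        yg, zg = y[group == g], z[group == g]
        pred = make_model().fit(yg, zg).predict(yg)
        out[g] = float(np.mean((pred - zg) ** 2))
    return out

print("linear class:", per_group_mse(LinearRegression))
print("cubic class: ", per_group_mse(
    lambda: make_pipeline(PolynomialFeatures(degree=3), LinearRegression())))
```

The linear class can represent $z = y - 2$ exactly but not $z = 1.6y^3$, so one group carries almost all of the bias; switching to a cubic class closes most of that gap, no matter how much data is collected.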

SLIDE 27

Why might my classifier be unfair?

SLIDE 28

Why might my classifier be unfair?

Learned model

SLIDE 29

Why might my classifier be unfair?

Learned model Orange dot model error

SLIDE 30

Why might my classifier be unfair?

Learned model Orange dot model error Blue dot model error

SLIDE 31

Why might my classifier be unfair?

Learned model Orange dot model error Blue dot model error

Error from noise can be solved by collecting more features.
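To make the noise point concrete, here is a small sketch (the data-generating process and all names are assumptions, not from the slides): when the label depends on a feature the model never observes, every model hits an error floor, and adding that feature removes it.

```python
# Toy sketch (assumed setup): the label depends on two features, but the model
# only observes x1. The unobserved x2 acts as irreducible noise; adding x2 as
# a feature removes the error floor.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(20000, 2))
labels = (X[:, 0] + X[:, 1] > 0).astype(int)      # label uses both features

Xtr, Xte, ytr, yte = train_test_split(X, labels, test_size=0.5, random_state=0)

for cols, name in [([0], "x1 only"), ([0, 1], "x1 and x2")]:
    clf = LogisticRegression().fit(Xtr[:, cols], ytr)
    err = np.mean(clf.predict(Xte[:, cols]) != yte)
    print(f"{name}: zero-one loss = {err:.3f}")
# With x1 only, the loss plateaus around 0.25 regardless of sample size or
# model class; with both features it drops toward zero.
```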

SLIDE 32

How do we define fairness?

SLIDE 33

How do we define fairness?

We define fairness in terms of a loss, such as false positive rate, false negative rate, etc. For example, the zero-one loss for data $D$ and prediction $\hat{Y}$ in group $a$ is

$\gamma_a(\hat{Y}) := \Pr_D(\hat{Y} \neq Y \mid A = a)$

We can then formalize unfairness as the group difference

$\Gamma(\hat{Y}) := |\gamma_a(\hat{Y}) - \gamma_{a'}(\hat{Y})|$

We rely on accurate Y labels and focus on algorithmic error.
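As a minimal sketch of this definition (array names and the toy values are placeholders, not from the slides), the per-group zero-one losses and their gap $\Gamma$ can be computed directly from predictions, labels, and group membership:

```python
# Minimal sketch: per-group zero-one loss and the unfairness gap Gamma.
# y_true, y_pred, and group are placeholders; plug in real labels,
# predictions, and protected-group membership.
import numpy as np

def group_losses(y_true, y_pred, group):
    """Zero-one loss gamma_a for each protected group a."""
    return {g: np.mean(y_pred[group == g] != y_true[group == g])
            for g in np.unique(group)}

def unfairness_gap(y_true, y_pred, group):
    """Gamma: largest absolute difference |gamma_a - gamma_a'| across groups."""
    vals = list(group_losses(y_true, y_pred, group).values())
    return max(vals) - min(vals)

# Toy example:
y_true = np.array([0, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 0, 0, 0, 1])
group  = np.array(["a", "a", "a", "b", "b", "b"])
print(group_losses(y_true, y_pred, group))    # {'a': 0.33..., 'b': 0.67...}
print(unfairness_gap(y_true, y_pred, group))  # 0.33...
```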

SLIDE 35

Why might my classifier be unfair?

Theorem 1: For the error of a given predictor $\hat{Y}$ over group $a$,

$\bar{\gamma}_a(\hat{Y}) = \bar{B}_a(\hat{Y}) + \bar{V}_a(\hat{Y}) + \bar{N}_a$

where $\bar{N}_a$ denotes the expectation of the noise over $X$ and the data $D$. Accordingly, the expected discrimination level $\bar{\Gamma} := |\bar{\gamma}_a - \bar{\gamma}_{a'}|$ can be decomposed into differences in bias, differences in variance, and differences in noise:

$\bar{\Gamma} = |(\bar{B}_a - \bar{B}_{a'}) + (\bar{V}_a - \bar{V}_{a'}) + (\bar{N}_a - \bar{N}_{a'})|$
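Bias, variance, and noise are not directly observable, but the variance term can be probed empirically. Below is a rough sketch (assumed names and a logistic-regression stand-in, not the authors' estimator) that retrains on bootstrap resamples and measures how often each group's predictions flip relative to the majority-vote prediction:

```python
# Rough sketch (not the authors' estimator): probe the per-group variance term
# by retraining on bootstrap resamples and measuring how often each group's
# predictions disagree with the majority-vote ("main") prediction.
import numpy as np
from sklearn.linear_model import LogisticRegression

def per_group_variance(X, y, group, n_boot=50, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    preds = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)              # bootstrap resample
        model = LogisticRegression().fit(X[idx], y[idx])
        preds.append(model.predict(X))
    preds = np.array(preds)                           # shape (n_boot, n)
    main = (preds.mean(axis=0) > 0.5).astype(int)     # majority-vote prediction
    disagree = (preds != main).mean(axis=0)           # per-example flip rate
    return {g: disagree[group == g].mean() for g in np.unique(group)}

# Tiny synthetic demo:
rng = np.random.default_rng(1)
X = rng.normal(size=(600, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=600) > 0).astype(int)
group = rng.integers(0, 2, size=600)
print(per_group_variance(X, y, group, n_boot=20))
```

A group whose predictions flip more often under resampling suffers more from variance, which points toward collecting more training data for that group.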


SLIDE 37

Mortality prediction from MIMIC-III clinical notes

[Figure: zero-one loss by race (Asian, Black, Hispanic, Other, White), roughly 0.16 to 0.22]

1. We found statistically significant racial differences in zero-one loss.

SLIDE 38

Mortality prediction from MIMIC-III clinical notes

[Figure: zero-one loss (about 0.19 to 0.27) vs. training data size (5,000 to 15,000) per racial group]

1. We found statistically significant racial differences in zero-one loss.
2. By subsampling data, we fit inverse power laws to estimate the benefit of more data in reducing variance.
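A sketch of how such an extrapolation might look (the functional form $\text{loss}(n) \approx a + b\,n^{-c}$ follows the slide's inverse power law, but the numbers and names below are placeholders, not the paper's fits):

```python
# Sketch (placeholder data, not the authors' code): fit loss(n) ~ a + b*n^(-c)
# to per-group losses measured after training on subsamples of size n, then
# extrapolate to estimate the benefit of collecting more data.
import numpy as np
from scipy.optimize import curve_fit

def inverse_power_law(n, a, b, c):
    return a + b * np.power(n, -c)

# These would come from retraining the model on random subsets of one group's
# training data; the values here are illustrative only.
sizes = np.array([1000, 2000, 4000, 8000, 15000], dtype=float)
losses = np.array([0.27, 0.25, 0.23, 0.21, 0.20])

params, _ = curve_fit(inverse_power_law, sizes, losses,
                      p0=[0.15, 5.0, 0.5], maxfev=10000)
a, b, c = params
print(f"estimated irreducible loss ~ {a:.3f}")
print(f"predicted loss at n=30000: {inverse_power_law(30000, a, b, c):.3f}")
```

Repeating the fit per racial group shows which groups' losses are still falling with sample size (variance-dominated) and which have flattened out (bias- or noise-dominated).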

SLIDE 39

Mortality prediction from MIMIC-III clinical notes

[Figure: error enrichment (0.00 to 0.35) for cancer and cardiac patient subgroups by race, with per-group cohort sizes]

1. We found statistically significant racial differences in zero-one loss.
2. By subsampling data, we fit inverse power laws to estimate the benefit of more data in reducing variance.
3. Using topic modeling, we identified subpopulations where gathering more features could reduce noise.
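A hedged sketch of that third step (LDA is used here as a stand-in; the preprocessing, model choice, and names are assumptions rather than the authors' pipeline): cluster notes into topics, then check which topics are enriched among misclassified patients.

```python
# Sketch (LDA as a stand-in, not the authors' pipeline): discover topics in
# clinical notes, then check which topics are enriched among misclassified
# patients. notes, y_true, y_pred are assumed placeholders.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

def error_enriched_topics(notes, y_true, y_pred, n_topics=10, seed=0):
    counts = CountVectorizer(max_features=5000, stop_words="english").fit_transform(notes)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    doc_topics = lda.fit_transform(counts)           # (n_docs, n_topics)
    dominant = doc_topics.argmax(axis=1)             # dominant topic per note
    errors = (np.asarray(y_pred) != np.asarray(y_true))
    overall = errors.mean()
    # Enrichment: a topic's error rate relative to the overall error rate.
    return {t: errors[dominant == t].mean() / overall
            for t in range(n_topics) if np.any(dominant == t)}
```

Topics with enrichment well above 1 mark subpopulations, like the cancer and cardiac cohorts on this slide, where collecting additional features may reduce noise.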

SLIDE 40

Where do we go from here?

  • 1. For accurate and fair models deployed in real-world applications, both the data and the model should be considered.
  • 2. Using easily implemented fairness checks, we hope others will check their algorithms for bias, variance, and noise, which will guide further efforts to reduce unfairness.

Come to poster #120 in Room 210 & 230.
