Fair Regression: Quantitative Definitions and Reduction-Based Algorithms


SLIDE 1

Fair Regression:

Quantitative Definitions and Reduction-Based Algorithms

Steven Wu (University of Minnesota) Joint work with: Alekh Agarwal and Miro Dudík (Microsoft Research)

SLIDE 2

Problem setting

  • Distribution D over examples: (X, A, Y)
  • X: feature vector
  • A: discrete protected attribute (e.g. racial groups, gender)
  • Y ∈ [0, 1]: real-valued label (e.g. risk score, recidivism rate)
  • Prediction task: given a loss function ℓ (e.g. square loss, logistic loss), find a predictor f ∈ F to minimize E[ℓ(Y, f(X))]
  • ℓ is 1-Lipschitz:

|ℓ(y, u) − ℓ(y′, u′)| ≤ |y − y′| + |u − u′|
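
The 1-Lipschitz condition can be checked numerically for a concrete loss. The slide does not say how the square loss is scaled, so the sketch below assumes ℓ(y, u) = (y − u)²/2 on [0, 1]; the function name `half_square_loss` is illustrative only. It also shows one pair of points where the unscaled square loss (y − u)² breaks the bound.

```python
import numpy as np

def half_square_loss(y, u):
    """Square loss scaled by 1/2 (the scaling is an assumption of this sketch)."""
    return 0.5 * (y - u) ** 2

rng = np.random.default_rng(0)
# Random pairs (y, u) and (y2, u2), all drawn from [0, 1].
y, u, y2, u2 = rng.uniform(0.0, 1.0, size=(4, 100_000))

# Left and right sides of the 1-Lipschitz inequality.
lhs = np.abs(half_square_loss(y, u) - half_square_loss(y2, u2))
rhs = np.abs(y - y2) + np.abs(u - u2)
lipschitz_ok = bool(np.all(lhs <= rhs + 1e-12))

# Unscaled square loss at (y, u) = (1, 0) versus (0.9, 0):
plain_gap = abs((1.0 - 0.0) ** 2 - (0.9 - 0.0) ** 2)  # change in the loss
budget = abs(1.0 - 0.9) + abs(0.0 - 0.0)              # allowed change
unscaled_violates = plain_gap > budget

print(lipschitz_ok, unscaled_violates)
```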

SLIDE 3

Fairness notion: Statistical Parity

  • Statistical parity (SP): f(X) is independent of the protected attribute A

P[f(X) ≥ z | A = a] = P[f(X) ≥ z] for all groups a and z ∈ [0, 1]

  • Implies any thresholding of f(X) is fair!
  • Motivated by practice of affirmative action as well as four-fifths rule
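
A violation of statistical parity can be measured on a sample by comparing each group's threshold-exceedance frequencies with those of the whole population. The following is a minimal sketch, assuming predictions in [0, 1] and a finite grid of thresholds; the function name `sp_disparity` and the grid are illustrative, and the exact disparity measure used in the talk may differ.

```python
import numpy as np

def sp_disparity(pred, group, thresholds=None):
    """Largest gap |P[f(X) >= z | A = a] - P[f(X) >= z]| over groups a and grid points z."""
    pred = np.asarray(pred, dtype=float)
    group = np.asarray(group)
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 101)  # assumed grid over [0, 1]
    # overall[j] = fraction of all predictions that are >= thresholds[j]
    overall = (pred[:, None] >= thresholds[None, :]).mean(axis=0)
    worst = 0.0
    for a in np.unique(group):
        # Same exceedance frequencies, restricted to group a.
        in_group = (pred[group == a][:, None] >= thresholds[None, :]).mean(axis=0)
        worst = max(worst, float(np.max(np.abs(in_group - overall))))
    return worst

# Both groups receive the same predictions: no disparity.
same = sp_disparity([0.2, 0.8, 0.2, 0.8], [0, 0, 1, 1])
# One group always scores low, the other always high: large disparity.
split = sp_disparity([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1])
```

A value of 0 means every threshold on the grid treats the groups identically; larger values mean some threshold separates a group from the population.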
SLIDE 4

Fairness notion: Bounded Group Loss

  • Bounded group loss (BGL): bounded group loss at level ζ

E[ℓ(Y, f(X)) | A = a] ≤ ζ

for all groups a.

  • Enforces minimum prediction quality for each group
  • Diagnostic to detect groups requiring further data collection, better features, etc.

  • Similar to minmax fairness
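
Used as a diagnostic, bounded group loss amounts to computing the average loss per group and comparing each value with the chosen level. Below is a minimal sketch, assuming the half square loss (y − u)²/2; the names `group_losses` and `satisfies_bgl` and the toy data are hypothetical.

```python
import numpy as np

def group_losses(y, pred, group):
    """Average half square loss within each protected group."""
    y = np.asarray(y, dtype=float)
    pred = np.asarray(pred, dtype=float)
    group = np.asarray(group)
    loss = 0.5 * (y - pred) ** 2  # assumed loss for this sketch
    return {a: float(loss[group == a].mean()) for a in np.unique(group).tolist()}

def satisfies_bgl(y, pred, group, level):
    """True if every group's average loss is at most `level`."""
    return all(v <= level for v in group_losses(y, pred, group).values())

# Toy data: group "a" is predicted perfectly, group "b" is not.
y = [0.0, 1.0, 0.0, 1.0]
pred = [0.0, 1.0, 0.5, 0.5]
group = ["a", "a", "b", "b"]

losses = group_losses(y, pred, group)
worst_group = max(losses, key=losses.get)  # group that most needs attention
```

The group with the largest average loss is the natural candidate for further data collection or better features.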
SLIDE 5

Main results

  • Finite sample guarantees on:
      • Accuracy
      • Fairness violations
  • Reduction-based algorithm: a provably efficient algorithm that iteratively solves a sequence of supervised learning problems (without fairness constraints):
      • Risk minimization under ℓ
      • Square loss minimization
      • Cost-sensitive classification (or weighted classification problem)
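
To illustrate how a fairness-constrained problem can turn into a sequence of ordinary supervised problems, here is a simplified sketch for the bounded group loss constraint. It is not the algorithm from the talk. It assumes a linear predictor, the half square loss, a weighted least-squares solver as the supervised learner, and a plain projected-gradient update of the Lagrange multipliers. All function names and parameters (`rounds`, `step`, `lam_max`) are hypothetical.

```python
import numpy as np

def weighted_least_squares(X, y, w):
    """Supervised learner: minimize sum_i w_i * (y_i - x_i . theta)^2 / 2."""
    s = np.sqrt(w)
    theta, *_ = np.linalg.lstsq(X * s[:, None], y * s, rcond=None)
    return theta

def bgl_sample_weights(group, lam):
    """Per-example weights of: mean loss + sum_a lam[a] * (mean loss of group a)."""
    group = np.asarray(group)
    n = len(group)
    w = np.full(n, 1.0 / n)
    for a, lam_a in lam.items():
        mask = group == a
        w[mask] += lam_a / mask.sum()
    return w

def fit_bgl(X, y, group, zeta, rounds=200, step=1.0, lam_max=10.0):
    """Alternate between an unconstrained weighted fit and a multiplier update."""
    group = np.asarray(group)
    groups = np.unique(group).tolist()
    lam = {a: 0.0 for a in groups}
    thetas = []
    for _ in range(rounds):
        # Step 1: ordinary weighted regression, no fairness constraint.
        theta = weighted_least_squares(X, y, bgl_sample_weights(group, lam))
        thetas.append(theta)
        # Step 2: raise the multiplier of each group whose loss exceeds zeta,
        # lower it otherwise, and keep it inside [0, lam_max].
        loss = 0.5 * (y - X @ theta) ** 2
        for a in groups:
            violation = loss[group == a].mean() - zeta
            lam[a] = min(lam_max, max(0.0, lam[a] + step * violation))
    # Return the average of the iterates together with the final multipliers.
    return np.mean(thetas, axis=0), lam
```

Each round calls only a standard regression routine. The fairness constraint enters solely through the example weights, which is the sense in which the constrained problem is reduced to unconstrained ones.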
SLIDE 6

Empirical Evaluation

  • Fairness constraint: statistical parity
  • Data sets: Adult, Law School, Communities & Crime
  • Losses: square loss, logistic loss
  • Reductions:
      • Cost-sensitive classification (CS)
      • Square loss minimization (LS)
      • Logistic loss minimization (LR)
  • Predictor classes: linear and tree ensemble
SLIDE 7

Statistical Parity Disparity (CDF distance)

SLIDE 8

Statistical Parity Disparity (CDF distance)

SLIDE 9

Fair Regression:

Quantitative Definitions and Reduction-Based Algorithms

Poster: Thurs @ Pacific Ballroom #132