Yuji Roh, Data Intelligence Lab, KAIST
FR-Train: A Mutual Information-based Fair and Robust Training
Yuji Roh, Kangwook Lee, Steven E. Whang, Changho Suh
Trustworthy AI
“AI has significant potential to help solve challenging problems, including by advancing medicine, understanding language, and fueling scientific discovery. To realize that potential, it’s critical that AI is used and developed responsibly.”
“Moving forward, ‘build for performance’ will not suffice as an AI design paradigm. We must learn how to build, evaluate and monitor for trust.”
Two-step approach: Poisoned Dataset → Sanitization → Fair Training
Holistic approach: Poisoned Dataset → Fair and Robust Training
Demographic Parity (⇔ Disparate Impact)
Equalized Odds
Notation: feature, label, group attribute, predicted label
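The formulas on this slide did not survive extraction. As a reminder, the standard textbook definitions of the two fairness measures can be written as below. The symbols are assumptions chosen for illustration (X for the feature, Y for the label, Z for a binary group attribute, \hat{Y} for the predicted label) and may differ from the ones used on the original slide.

```latex
% Assumed notation: Y = label, Z \in \{0,1\} = group attribute, \hat{Y} = predicted label.

% Demographic parity: the positive prediction rate is the same in both groups.
\Pr(\hat{Y} = 1 \mid Z = 0) = \Pr(\hat{Y} = 1 \mid Z = 1)

% Disparate impact (DI): ratio form of the same condition; DI = 1 iff demographic parity holds.
\mathrm{DI} = \min\!\left(
  \frac{\Pr(\hat{Y} = 1 \mid Z = 0)}{\Pr(\hat{Y} = 1 \mid Z = 1)},\;
  \frac{\Pr(\hat{Y} = 1 \mid Z = 1)}{\Pr(\hat{Y} = 1 \mid Z = 0)}
\right)

% Equalized odds: the same condition, additionally conditioned on the true label.
\Pr(\hat{Y} = 1 \mid Z = 0, Y = y) = \Pr(\hat{Y} = 1 \mid Z = 1, Y = y),
\qquad y \in \{0, 1\}
```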
A, B: sensitive groups. The markers in the figure denote positive and negative labels.

Clean
Vanilla classifier: (Acc, DI) = (1, 0.5)
Fair classifier: (Acc, DI) = (0.8, 1)

Poisoned
Vanilla classifier: Acc_poi = 0.9, (Acc_clean, DI) = (0.9, 0.67). Acc: ↓, DI: ↑
Fair classifier: Acc_poi = 0.8, (Acc_clean, DI) = (0.6, 1). Acc: ↓, DI: unchanged
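To make the (Acc, DI) pairs concrete, here is a minimal Python sketch of how accuracy and disparate impact can be computed from predictions. The 10-point dataset and both prediction vectors are hypothetical, chosen only so that the clean-data numbers come out as (1, 0.5) and (0.8, 1); they are not the points shown in the slide's figure. The helper names `accuracy` and `disparate_impact` are likewise illustrative.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def disparate_impact(y_pred, z):
    """Ratio of the smaller to the larger positive-prediction rate across two groups."""
    y_pred, z = np.asarray(y_pred), np.asarray(z)
    rate_a = y_pred[z == 0].mean()  # positive rate in group A
    rate_b = y_pred[z == 1].mean()  # positive rate in group B
    low, high = min(rate_a, rate_b), max(rate_a, rate_b)
    return float(low / high) if high > 0 else 1.0

# Hypothetical data (an assumption, not the slide's figure): z = 0 is group A, z = 1 is group B.
z = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
y = np.array([1, 1, 0, 0, 0, 1, 1, 1, 1, 0])

# A "vanilla" classifier that fits the labels perfectly inherits their group imbalance.
vanilla_pred = y.copy()

# A "fair" classifier that equalizes positive rates at the cost of two errors.
fair_pred = np.array([1, 1, 1, 0, 0, 1, 1, 1, 0, 0])

print(accuracy(y, vanilla_pred), disparate_impact(vanilla_pred, z))
print(accuracy(y, fair_pred), disparate_impact(fair_pred, z))
```

With this data the perfect classifier predicts positives at rates 0.4 and 0.8 in the two groups, so its DI is 0.5, while the fair classifier trades two misclassified points for equal rates of 0.6.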
Classifier: Giving loans
Discriminator for fairness (Disc. Fairness): Distinguish the gender
Discriminator for robustness (Disc. Robustness): Distinguish whether poisoned or clean
Inputs to Disc. Robustness: Clean validation set; Poisoned training set + Predicted label affected by poisoning
[Architecture figure: Classifier, Disc. Fairness, Disc. Robustness, Softmax]
Theorem 1 - Fairness
Theorem 2 - Robustness
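The talk title frames fairness in terms of mutual information. One standard fact behind that framing is that the predicted label and the group attribute are independent (demographic parity) exactly when their mutual information is zero. The sketch below computes a plug-in estimate of that quantity from samples. It only illustrates the quantity itself and is not the FR-Train training procedure, which relies on discriminators; the function name `mutual_information` and the toy arrays are assumptions made for this example.

```python
import numpy as np

def mutual_information(a, b):
    """Plug-in estimate (in nats) of the mutual information between two discrete arrays."""
    a, b = np.asarray(a), np.asarray(b)
    mi = 0.0
    for va in np.unique(a):
        for vb in np.unique(b):
            p_ab = np.mean((a == va) & (b == vb))  # empirical joint probability
            if p_ab > 0:
                p_a = np.mean(a == va)  # empirical marginals
                p_b = np.mean(b == vb)
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return float(mi)

# Hypothetical group attribute and predictions (illustrative only).
z = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
unfair_pred = np.array([1, 1, 0, 0, 0, 1, 1, 1, 1, 0])  # positive rates 0.4 vs 0.8
fair_pred = np.array([1, 1, 1, 0, 0, 1, 1, 1, 0, 0])    # positive rates 0.6 vs 0.6

mi_unfair = mutual_information(unfair_pred, z)
mi_fair = mutual_information(fair_pred, z)
print(mi_unfair, mi_fair)
```

The unfair predictions carry information about the group, so their estimate is strictly positive, while the predictions with equal positive rates give an estimate of (numerically) zero.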
Methods compared:
Two-step approach: Data sanitization + Fair training
Fair-only algorithms
Data sanitization using clean val. set
Logistic regression

Observations, in the order highlighted on the slides:
Low accuracy
Low fairness
Also low accuracy
Holistic approach = high fairness & accuracy