

SLIDE 1

Flexibly Fair Representation Learning by Disentanglement

Elliot Creager 1,2 · David Madras 1,2 · Jörn-Henrik Jacobsen 2 · Marissa A. Weis 2,3 · Kevin Swersky 4 · Toniann Pitassi 1,2 · Richard Zemel 1,2

June 13, 2019

1 University of Toronto · 2 Vector Institute · 3 University of Tübingen · 4 Google Research

Creager et al. 2019, arXiv:1906.02589, poster #131

SLIDE 2

Why Fair Representation Learning?

Fair Representation: [x, a] → z via encoder f, then z → ŷ_1, ŷ_2, ŷ_3, … via downstream classifiers g_1, g_2, g_3, …

Given sensitive attribute a ∈ {0, 1}, we want:

  • z ⊥ a (demographic parity) with z = f(x, a)
  • z maintains as much info about x as possible

A fair representation acts as a group parity bottleneck. Current approaches are flexible w.r.t. downstream task labels (Madras et al., 2018) but inflexible w.r.t. sensitive attributes; a sketch of this two-stage pipeline follows.
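A minimal sketch of the two-stage pipeline above, assuming a frozen pretrained encoder f and simple logistic-regression heads g_i (all module and variable names here are illustrative, not from the paper's released code):

```python
import torch
import torch.nn as nn

# Illustrative two-stage pipeline: one shared fair encoder f,
# many downstream task heads g_i trained on the frozen code z.
class Encoder(nn.Module):  # f: [x, a] -> z
    def __init__(self, x_dim, a_dim, z_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim + a_dim, 64), nn.ReLU(),
                                 nn.Linear(64, z_dim))

    def forward(self, x, a):
        return self.net(torch.cat([x, a], dim=-1))

f = Encoder(x_dim=10, a_dim=1, z_dim=8)
heads = [nn.Linear(8, 1) for _ in range(3)]  # g1, g2, g3: z -> logit(y_i)

x, a = torch.randn(32, 10), torch.randint(0, 2, (32, 1)).float()
with torch.no_grad():   # z is computed once and frozen;
    z = f(x, a)         # only the task heads are trained downstream
y_hats = [torch.sigmoid(g(z)) for g in heads]
```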


SLIDE 3

Further Motivation

Subgroup discrimination

  • We would like to handle the case where a ∈ {0, 1}^{N_a} is a vector of sensitive attributes
  • ML systems can discriminate against subgroups defined via conjunctions of sensitive attributes (Buolamwini & Gebru, 2018)

[Figure (Kim & Mnih, 2018)]

Disentangled Representation Learning

  • Each dimension of z should correspond to no more than one semantic factor of variation (object shape, position, etc.) in the data
  • VAE variants encourage a factorized posterior q(z|x) (Higgins et al., 2017) or aggregate posterior q(z) (Kim & Mnih, 2018; Chen et al., 2018); see the sketch below
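A minimal sketch of the standard density-ratio trick used by FactorVAE-style methods to penalize total correlation in the aggregate posterior (the discriminator architecture and the `permute_dims` helper are illustrative, not from any released code):

```python
import torch
import torch.nn as nn

def permute_dims(z):
    """Shuffle each latent dimension independently across the batch,
    producing samples from the product of marginals q(z_1)...q(z_d)."""
    return torch.stack([z[torch.randperm(z.size(0)), j]
                        for j in range(z.size(1))], dim=1)

# Discriminator D is trained (not shown) to tell "joint" codes apart from
# permuted ones; its logit gap then estimates the density ratio
# q(z) / prod_j q(z_j), i.e. the total correlation KL(q(z) || prod_j q(z_j)).
D = nn.Sequential(nn.Linear(8, 64), nn.LeakyReLU(0.2), nn.Linear(64, 2))

z = torch.randn(32, 8)                      # codes sampled from q(z|x)
logits_joint = D(z)                         # samples treated as joint q(z)
logits_marg = D(permute_dims(z).detach())   # product-of-marginals samples

# TC estimate used to penalize the encoder (Kim & Mnih, 2018):
tc_estimate = (logits_joint[:, 0] - logits_joint[:, 1]).mean()
```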


SLIDE 4

Flexibly Fair VAE

[Figure: data flow at train time (left) and test time (right) for FFVAE]

Desiderata

  • z ⊥ b_j ∀ j
  • b_i ⊥ b_j ∀ i ≠ j
  • MutualInfo(a_j, b_j) is large ∀ j

Latent Code Modification

  • To achieve DP w.r.t. some a_i, use [z, b] \ b_i
  • To achieve DP w.r.t. a conjunction of binary attributes a_i ∧ a_j ∧ a_k, use [z, b] \ {b_i, b_j, b_k} (see the masking sketch below)
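A minimal sketch of this post-hoc code modification, assuming the learned code is stored as separate tensors `z` and `b` with one b-dimension per sensitive attribute (function and variable names are illustrative):

```python
import torch

def fair_code(z, b, drop):
    """Return the modified representation [z, b] \ {b_i : i in drop},
    e.g. drop = {i, j, k} for DP w.r.t. the conjunction a_i ∧ a_j ∧ a_k."""
    keep = [i for i in range(b.size(1)) if i not in drop]
    return torch.cat([z, b[:, keep]], dim=-1)

z = torch.randn(32, 8)   # non-sensitive latent code
b = torch.randn(32, 3)   # one dimension per sensitive attribute a_1..a_3
r = fair_code(z, b, drop={0, 2})  # DP w.r.t. the conjunction a_1 ∧ a_3
```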


SLIDE 5

Flexibly Fair VAE

[Figure: data flow at train time (left) and test time (right) for FFVAE]

Learning Objective

L_FFVAE(p, q) = E_{q(z,b|x)}[ log p(x|z, b) + α ∑_j log p(a_j|b_j) ]
                − γ D_KL( q(z, b) ‖ q(z) ∏_j q(b_j) )
                − D_KL( q(z, b|x) ‖ p(z, b) )

α encourages predictiveness in the latent code; γ encourages disentanglement. A sketch of this objective follows.
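A minimal sketch of the objective, assuming a Gaussian posterior over the code [z, b], Bernoulli decoders, and a FactorVAE-style discriminator for the γ term as sketched earlier (all module and variable names are illustrative, not the authors' implementation):

```python
import torch
import torch.nn.functional as F

def ffvae_loss(x, a, enc, dec, clf, disc, alpha, gamma):
    """One-sample Monte Carlo estimate of -L_FFVAE (a quantity to minimize).
    enc: x -> (mu, logvar) for the code [z, b]; dec: [z, b] -> Bernoulli
    logits for x; clf[j]: b_j -> logit of a_j; disc: 2-logit density-ratio
    discriminator, trained separately to tell q(z, b) apart from samples of
    q(z) * prod_j q(b_j) (z kept as a block, each b_j permuted independently)."""
    n_a = a.size(1)
    mu, logvar = enc(x)
    code = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
    b = code[:, -n_a:]

    # E_q[log p(x | z, b)]: Bernoulli reconstruction term
    recon = -F.binary_cross_entropy_with_logits(dec(code), x)

    # alpha * sum_j log p(a_j | b_j): predictiveness term
    pred = -sum(F.binary_cross_entropy_with_logits(clf[j](b[:, j:j + 1]),
                                                   a[:, j:j + 1])
                for j in range(n_a))

    # gamma * D_KL(q(z, b) || q(z) prod_j q(b_j)), estimated from the
    # discriminator's logit gap (density-ratio trick)
    logits = disc(code)
    dis = (logits[:, 0] - logits[:, 1]).mean()

    # D_KL(q(z, b | x) || p(z, b)) with a standard normal prior, closed form
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()

    return -(recon + alpha * pred - gamma * dis - kl)
```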


SLIDE 6

Results - Synthetic Data

[Figure: DSpritesUnfair samples. With correlated factors of variation, a fair classification task is predicting Shape without discriminating against XPosition]

[Figure: Pareto fronts showing fairness-accuracy tradeoff curves (∆DP vs. Accuracy) on the DSpritesUnfair dataset, for FFVAE, FactorVAE, CVAE, β-VAE, and MLP; panels for a = Scale, a = Shape, a = Shape∧Scale, and a = Shape∨Scale. Optimal point is the top-left corner (perfect accuracy, no unfairness). y = XPosition.]

∆DP = |Pr[ŷ = 1 | a = 1] − Pr[ŷ = 1 | a = 0]|, where ŷ ∈ {0, 1}
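A minimal sketch of this metric on arrays of hard predictions and group labels (the function name is illustrative):

```python
import numpy as np

def delta_dp(y_hat, a):
    """Demographic parity gap: |Pr[y_hat = 1 | a = 1] - Pr[y_hat = 1 | a = 0]|."""
    y_hat, a = np.asarray(y_hat), np.asarray(a)
    return abs(y_hat[a == 1].mean() - y_hat[a == 0].mean())

# e.g. a perfectly parity-respecting classifier gives delta_dp == 0
print(delta_dp([1, 0, 1, 0], [1, 1, 0, 0]))  # -> 0.0
```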


SLIDE 7

Results - Tabular and Image Data

Communities & Crime

[Figure: Pareto fronts (∆DP vs. Accuracy) for FFVAE, FactorVAE, CVAE, and β-VAE; typical success: a = R∧P, typical failure: a = R∨B]

  • Neighborhood-level population statistics: 120 features for 1,994 neighborhoods
  • We choose racePctBlack (R), blackPerCap (B), and pctNotSpeakEnglWell (P) as sensitive attributes
  • Held-out label y = violentCrimesPerCapita

Celeb-A

[Figure: Pareto fronts (∆DP vs. Accuracy) for FFVAE and β-VAE; typical success: a = ¬E ∧ M, typical failure: a = C ∧ M]

  • Over 200,000 images of celebrity faces, each associated with 40 binary attributes (OvalFace, HeavyMakeup, etc.)
  • We choose Chubby (C), Eyeglasses (E) and Male (M) as sensitive attributes; conjunction subgroup labels such as ¬E ∧ M are formed directly from these attributes (see the sketch below)
  • Held-out label y = HeavyMakeup (H)
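A minimal sketch of forming conjunction subgroup labels from binary attribute columns (the attribute arrays here are illustrative placeholders for CelebA annotations):

```python
import numpy as np

# Illustrative binary attribute columns (1 = attribute present)
E = np.array([0, 1, 0, 1])  # Eyeglasses
M = np.array([1, 1, 0, 1])  # Male
C = np.array([0, 0, 1, 1])  # Chubby

a_success = (1 - E) & M   # a = ¬E ∧ M ("typical success" subgroup)
a_failure = C & M         # a = C ∧ M  ("typical failure" subgroup)
print(a_success, a_failure)  # -> [1 0 0 0] [0 0 0 1]
```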


SLIDE 8

Conclusion

  • FFVAE enables flexibly fair downstream classification by disentangling information from multiple sensitive attributes
  • Future work: extending to other group fairness definitions, and studying robustness of disentangled/fair representation learners to distribution shift
  • Visit us at poster #131 tonight!
