Guarantees for Spectral Clustering with Fairness Constraints



SLIDE 1

Guarantees for Spectral Clustering with Fairness Constraints

Matthäus Kleindessner, Samira Samadi, Pranjal Awasthi & Jamie Morgenstern

SLIDE 2

Spectral Clustering (SC) and Fairness

SC is the method of choice for clustering the nodes of a graph.

Friendship network: SC can result in a highly unfair clustering with respect to the two demographic groups.

Fair clustering (Chierichetti et al. 2017): in every cluster, each group $V_s$ should be represented with (approximately) the same fraction as in the whole data set $V$.

Goal: Study spectral clustering with fairness constraints.
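The fairness notion above can be checked numerically. The following is a minimal sketch, not taken from the slides: the function names `group_fractions` and `cluster_balance` are hypothetical, and the balance measure (smallest group count divided by largest group count within a cluster) is an assumed way of quantifying how evenly groups are represented.

```python
import numpy as np

def group_fractions(labels, groups):
    """Fraction of each group inside every cluster, plus the overall fractions."""
    labels, groups = np.asarray(labels), np.asarray(groups)
    group_ids = np.unique(groups)
    overall = np.array([(groups == s).mean() for s in group_ids])
    per_cluster = np.array([
        [(groups[labels == l] == s).mean() for s in group_ids]
        for l in np.unique(labels)
    ])
    return per_cluster, overall

def cluster_balance(labels, groups):
    """Assumed measure: per cluster, smallest group count over largest group count."""
    labels, groups = np.asarray(labels), np.asarray(groups)
    group_ids = np.unique(groups)
    balances = []
    for l in np.unique(labels):
        counts = np.array([np.sum((labels == l) & (groups == s)) for s in group_ids])
        balances.append(counts.min() / counts.max())
    return np.array(balances)

# Toy data: 8 nodes, two groups of equal size (illustrative values only).
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
unfair = groups.copy()                        # clusters coincide with the groups
fair = np.array([0, 0, 1, 1, 0, 0, 1, 1])     # each cluster is half group 0, half group 1

per_cluster, overall = group_fractions(fair, groups)
print(per_cluster, overall)
print(cluster_balance(unfair, groups), cluster_balance(fair, groups))
```

In this toy case the fair clustering matches the overall group fractions in every cluster, while the unfair clustering has a balance of zero in both clusters.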

SLIDE 3

Spectral Clustering

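Goal: Partition $V$ into $k$ clusters with minimum RatioCut objective value.

⋄ Encode a clustering $V = C_1 \,\dot\cup\, \cdots \,\dot\cup\, C_k$ by $H \in \mathbb{R}^{n \times k}$ with

$$H_{il} = \begin{cases} 1/\sqrt{|C_l|}, & i \in C_l \\ 0, & i \notin C_l \end{cases} \tag{1}$$

Then $\mathrm{RatioCut}(C_1, \ldots, C_k) = \mathrm{Tr}(H^T L H)$, where $L$ is the graph Laplacian matrix.

⋄ The exact problem:

$$\min_{H \in \mathbb{R}^{n \times k}} \mathrm{Tr}(H^T L H) \quad \text{subject to } H \text{ is of form (1)}$$

⋄ Solve the relaxed version:

$$\min_{H \in \mathbb{R}^{n \times k}} \mathrm{Tr}(H^T L H) \quad \text{subject to } H^T H = I_k$$

⋄ Apply k-means clustering to the rows of $H$.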
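The steps above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' implementation: it assumes the unnormalized Laplacian $L = D - W$ for a symmetric weight matrix `W`, and it uses a small hand-written k-means with a deterministic farthest-point initialization (the helper names `kmeans`, `indicator_matrix`, `ratio_cut`, and `spectral_clustering` are hypothetical).

```python
import numpy as np

def laplacian(W):
    """Unnormalized graph Laplacian L = D - W."""
    return np.diag(W.sum(axis=1)) - W

def indicator_matrix(labels, k):
    """Encode a clustering as H with H[i, l] = 1/sqrt(|C_l|) if i is in C_l, else 0."""
    H = np.zeros((len(labels), k))
    for l in range(k):
        members = labels == l
        H[members, l] = 1.0 / np.sqrt(members.sum())
    return H

def ratio_cut(W, labels, k):
    """RatioCut value computed as Tr(H^T L H)."""
    H = indicator_matrix(labels, k)
    return np.trace(H.T @ laplacian(W) @ H)

def kmeans(X, k, n_iter=100):
    """Plain Lloyd iterations with farthest-point initialization (deterministic)."""
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        new = np.array([X[labels == l].mean(axis=0) if np.any(labels == l)
                        else centers[l] for l in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def spectral_clustering(W, k):
    """Relaxed problem: take the k eigenvectors of L with smallest eigenvalues."""
    _, eigvecs = np.linalg.eigh(laplacian(W))   # eigenvalues come back ascending
    H = eigvecs[:, :k]                          # satisfies H^T H = I_k
    return kmeans(H, k)

# Toy graph: two 4-node cliques joined by a single edge (illustrative only).
W = np.zeros((8, 8))
W[:4, :4] = 1.0
W[4:, 4:] = 1.0
W[3, 4] = W[4, 3] = 1.0
np.fill_diagonal(W, 0.0)

labels = spectral_clustering(W, 2)
print(labels, ratio_cut(W, labels, 2))
```

On this toy graph the two cliques are separated, and the trace formula agrees with summing, over clusters, the weight of cut edges divided by the cluster size.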

SLIDE 4

Spectral Clustering with Fairness Constraints

Approach: Incorporate fairness as a linear constraint:

$$\min_{H \in \mathbb{R}^{n \times k}} \mathrm{Tr}(H^T L H) \quad \text{subject to } H^T H = I_k \text{ and } F^T H = 0$$

Convert the program to the standard form and solve.

Our approach is analogous to existing versions of constrained SC that try to incorporate must-link constraints (e.g. Yu and Shi ’04).

Friendship network: Our algorithm finds a fair clustering with respect to the two demographic groups.

SLIDE 5

Analysis on Variant of Stochastic Block Model

Given $V$ with a fair ground-truth clustering, e.g., $V = C_1 \,\dot\cup\, C_2$:

$$\Pr(i, j) = \begin{cases} a, & i \text{ and } j \text{ in same group and in same cluster} \\ b, & i \text{ and } j \text{ in same group, but in different clusters} \\ c, & i \text{ and } j \text{ in different groups, but in same cluster} \\ d, & i \text{ and } j \text{ in different groups, and in different clusters} \end{cases}$$

for some $a > b > c > d$.

Theorem (informal): Fair SC recovers the ground-truth clustering $C_1 \,\dot\cup\, C_2$ with high probability.

[Figure: diagram of the data set with groups V1, V2 and clusters C1, C2.]

Standard SC is likely to return $V_1 \,\dot\cup\, V_2$.
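A graph from this model can be sampled as follows. This is a minimal sketch under stated assumptions: `Pr(i, j)` is read as the probability of an edge between nodes `i` and `j`, edges are drawn independently, and the function name `sample_sbm`, the group sizes, and the probability values are all illustrative choices that do not come from the slides.

```python
import numpy as np

def sample_sbm(groups, clusters, a, b, c, d, rng):
    """Sample a symmetric 0/1 adjacency matrix with the four-case edge probabilities."""
    groups, clusters = np.asarray(groups), np.asarray(clusters)
    same_group = groups[:, None] == groups[None, :]
    same_cluster = clusters[:, None] == clusters[None, :]
    # Edge probability for every pair, following the case distinction above.
    P = np.where(same_group,
                 np.where(same_cluster, a, b),
                 np.where(same_cluster, c, d))
    n = len(groups)
    # Draw each pair once (strict upper triangle), then mirror; no self-loops.
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(float)

# Fair ground truth: each cluster holds 25 nodes from each of the two groups.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 0, 1], 25)
clusters = np.repeat([0, 0, 1, 1], 25)
W = sample_sbm(groups, clusters, a=0.4, b=0.3, c=0.2, d=0.1, rng=rng)
print(W.shape, W.sum() / 2)
```

Because $b > c$, nodes of the same group in different clusters are more likely to be connected than nodes of different groups in the same cluster, which is the regime where the unconstrained method tends to recover the groups rather than the clusters.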

SLIDE 6

Experiments on Real Networks

FriendshipNet, FacebookNet, DrugNet

[Figure: three panels, FriendshipNet (gender), FacebookNet (gender), and DrugNet (ethnicity), each plotting Balance and RatioCut against the number of clusters k. Legend: Balance of data set, Standard SC, Algorithm 1, Normalized SC, Alg. 3.]

Average balance of clusters and RatioCut value as a function of the number of clusters.

SLIDE 7

Thank you! Poster #195
