

SLIDE 1

Fairness Constraints for Graph Embeddings*

William L. Hamilton
Assistant Professor at McGill University and Mila
Canada CIFAR Chair in AI
Visiting Researcher at Facebook AI Research

*Joint work with my PhD student Joey Bose, to appear in ICML 2019.

SLIDE 2

Graph embeddings


SLIDE 3

Application: Node classification

[Diagram: a graph with unlabeled nodes ("?") fed into a machine learning model that predicts their labels.]

SLIDE 4

Application: Link prediction

[Diagram: a machine learning model scoring candidate edges ("?") between nodes.]

SLIDE 5

Becoming ubiquitous in social applications

§ Graph embedding techniques are a powerful approach for social recommendations, bot detection, content screening, behavior prediction, geo-localization, etc.

§ E.g., Facebook, Huawei, Uber Eats, Pinterest, LinkedIn, WeChat

§ Classic collaborative filtering approaches can be re-interpreted in a more general graph embedding framework.

SLIDE 6

But what about fairness and privacy?

§ Graph embeddings are designed to capture everything that might be useful for the objective.
§ Even if we don't provide the model with information about sensitive attributes (e.g., gender or age), the model will use this information.
§ What if a user doesn't want this information used?

SLIDE 7

Fairness from a pragmatic perspective

§ Strict privacy and discrimination concerns are one motivation.
§ But what if users just don't want their recommendations to depend on certain attributes?
§ What if users want the system to "ignore" parts of their demographics or past behavior?

SLIDE 8

Fairness in graph embeddings

§ Basic idea: How can we learn node embeddings that are invariant to particular sensitive attributes?
§ Challenges:
  § Graph data is not i.i.d.
  § There is not just one classification task that we are trying to enforce fairness on.
  § There are often many possible sensitive attributes.

SLIDE 9

Our work: Fairness in graph embeddings


SLIDE 10

Preliminaries and set-up

§ Learning an encoder function to map nodes to embeddings: $\text{ENC}(u) = z_u$
§ Using these embeddings to "score" the likelihood of a relationship between nodes: $s(e) = s(\langle z_u, r, z_v \rangle)$

SLIDE 11

Preliminaries and set-up

§ Learning an encoder function to map nodes to embeddings: $\text{ENC}(u) = z_u$
§ Using these embeddings to "score" the likelihood of a relationship between nodes: $s(e) = s(\langle z_u, r, z_v \rangle)$

Score of a (possible) edge is a function of the two node embeddings and the relation type.

SLIDE 12

Preliminaries and set-up

§ Learning an encoder function to map nodes to embeddings: $\text{ENC}(u) = z_u$
§ Using these embeddings to "score" the likelihood of a relationship between nodes: $s(e) = s(\langle z_u, r, z_v \rangle)$

Goal: Train the embeddings (with a subset of the true edges) so that the scores for all real edges are larger than the scores for non-edges.
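As a concrete (hypothetical) sketch of these two pieces in PyTorch, assuming the simplest possible setup, a shallow embedding-lookup encoder and a single relation type scored by a dot product. All names here are illustrative, not the talk's actual code:

```python
import torch
import torch.nn as nn

class ShallowEncoder(nn.Module):
    """ENC: maps a node ID to a learned embedding vector (a lookup table)."""
    def __init__(self, num_nodes: int, dim: int):
        super().__init__()
        self.emb = nn.Embedding(num_nodes, dim)

    def forward(self, node_ids: torch.Tensor) -> torch.Tensor:
        return self.emb(node_ids)

def score(z_u: torch.Tensor, z_v: torch.Tensor) -> torch.Tensor:
    """Score of a (possible) edge as a function of the two node embeddings."""
    return (z_u * z_v).sum(dim=-1)
```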

SLIDE 13

Preliminaries and set-up

§ Generic loss function:

$$\sum_{e \in E_{\text{train}}} L_{\text{edge}}\big(s(e),\, s(e_1^-),\, \ldots,\, s(e_m^-)\big)$$

The sum runs over (a batch of) training edges; $L_{\text{edge}}$ is a task-specific loss function; $s(e)$ is the score assigned to the positive/real edge; and $s(e_1^-), \ldots, s(e_m^-)$ are the scores assigned to random negative sample edges.
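A sketch of this objective in code, reusing `ShallowEncoder` and `score` from the previous sketch and leaving the task-specific `edge_loss_fn` pluggable; the uniform tail-corruption negative sampler is one common choice, not necessarily the talk's:

```python
import torch

def batch_loss(encoder, edges, num_nodes, edge_loss_fn, num_negatives=5):
    """Sum a task-specific L_edge over a batch of training edges,
    comparing each real edge against m random negative samples."""
    u, v = edges[:, 0], edges[:, 1]
    pos_score = score(encoder(u), encoder(v))                    # s(e)
    # Corrupt the tail node uniformly at random to get e_1^-, ..., e_m^-.
    v_neg = torch.randint(0, num_nodes, (len(edges), num_negatives))
    neg_scores = score(encoder(u).unsqueeze(1), encoder(v_neg))  # s(e_i^-)
    return edge_loss_fn(pos_score, neg_scores).sum()
```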

SLIDE 14

Preliminaries and set-up: Concrete examples

§ Score functions:
§ Loss functions:

SLIDE 15

Preliminaries and set-up: Concrete examples

§ Score functions:
  § Dot-product: $s(e) = s(\langle z_u, r, z_v \rangle) = z_u^\top z_v$
§ Loss functions:

SLIDE 16

Preliminaries and set-up: Concrete examples

§ Score functions:
  § Dot-product: $s(e) = s(\langle z_u, r, z_v \rangle) = z_u^\top z_v$
  § TransE: $s(e) = s(\langle z_u, r, z_v \rangle) = -\|z_u + r - z_v\|_2^2$
§ Loss functions:
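The TransE scorer as code; the leading minus sign is an assumption (the extracted slide drops signs) chosen so that more plausible edges receive higher scores:

```python
import torch

def transe_score(z_u: torch.Tensor, r: torch.Tensor, z_v: torch.Tensor) -> torch.Tensor:
    """TransE: (negated) squared L2 distance between z_u + r and z_v."""
    return -((z_u + r - z_v) ** 2).sum(dim=-1)
```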

SLIDE 17

Preliminaries and set-up: Concrete examples

§ Score functions:
  § Dot-product: $s(e) = s(\langle z_u, r, z_v \rangle) = z_u^\top z_v$
  § TransE: $s(e) = s(\langle z_u, r, z_v \rangle) = -\|z_u + r - z_v\|_2^2$
§ Loss functions:
  § Max-margin: $L_{\text{edge}}\big(s(e), s(e_1^-), \ldots, s(e_m^-)\big) = \sum_{i=1}^{m} \max\big(1 - s(e) + s(e_i^-),\, 0\big)$
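The max-margin loss as code, with the margin fixed at 1 as in the formula; this is exactly the kind of `edge_loss_fn` the earlier `batch_loss` sketch expects:

```python
import torch

def max_margin_loss(pos_score: torch.Tensor, neg_scores: torch.Tensor) -> torch.Tensor:
    """sum_i max(1 - s(e) + s(e_i^-), 0), broadcast over the m negatives."""
    return torch.clamp(1.0 - pos_score.unsqueeze(-1) + neg_scores, min=0.0).sum(dim=-1)
```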

SLIDE 18

Preliminaries and set-up: Concrete examples

§ Score functions:
  § Dot-product: $s(e) = s(\langle z_u, r, z_v \rangle) = z_u^\top z_v$
  § TransE: $s(e) = s(\langle z_u, r, z_v \rangle) = -\|z_u + r - z_v\|_2^2$
§ Loss functions:
  § Max-margin: $L_{\text{edge}}\big(s(e), s(e_1^-), \ldots, s(e_m^-)\big) = \sum_{i=1}^{m} \max\big(1 - s(e) + s(e_i^-),\, 0\big)$
  § Cross-entropy: $L_{\text{edge}}\big(s(e), s(e_1^-), \ldots, s(e_m^-)\big) = -\log\big(\sigma(s(e))\big) - \sum_{i=1}^{m} \log\big(1 - \sigma(s(e_i^-))\big)$
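And the cross-entropy loss, written in the numerically stabler log-sigmoid form (mathematically equivalent to the formula above, since $\log(1 - \sigma(x)) = \log\sigma(-x)$):

```python
import torch.nn.functional as F
import torch

def cross_entropy_loss(pos_score: torch.Tensor, neg_scores: torch.Tensor) -> torch.Tensor:
    """-log sigma(s(e)) - sum_i log(1 - sigma(s(e_i^-)))."""
    return -F.logsigmoid(pos_score) - F.logsigmoid(-neg_scores).sum(dim=-1)
```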

SLIDE 19

Formalizing fairness

§ How do we ensure fairness in this context?


SLIDE 20

Formalizing fairness

§ How do we ensure fairness in this context?
§ Solution: representational invariance.
  § Want the embeddings to be independent of the sensitive attributes.
  § This is equivalent to minimizing the mutual information between the embeddings and the attributes.

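The formulas dropped from this slide presumably state these two conditions; a hedged reconstruction, writing $a_u^k$ for node $u$'s value of sensitive attribute $k$ in the set $S$:

$$z_u \;\perp\; a_u^k \quad \forall k \in S \qquad\Longleftrightarrow\qquad I\big(z_u;\, a_u^k\big) = 0 \quad \forall k \in S$$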

SLIDE 21

Enforcing fairness through an adversary


SLIDE 22

Enforcing fairness through an adversary

§ Key component 1: Compositional encoder.
  § Given a set of attributes, it outputs "filtered" embeddings that should be invariant to those attributes.

[Equation annotations: a trainable filter function (a neural network) outputs an embedding that is invariant to attribute k; the input is a node ID and a set of sensitive attributes; the filter outputs are summed over all sensitive attributes.]
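A minimal PyTorch sketch of such a compositional encoder, reusing `ShallowEncoder` as the base; the sum over the requested attributes follows the annotation above, while the small-MLP filter architecture and the `CompositionalEncoder` name are my assumptions:

```python
import torch
import torch.nn as nn

class CompositionalEncoder(nn.Module):
    """One trainable filter per sensitive attribute; the filtered embedding
    sums the filter outputs over the requested set of attributes."""
    def __init__(self, base: nn.Module, num_attrs: int, dim: int):
        super().__init__()
        self.base = base  # e.g., a ShallowEncoder
        self.filters = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(num_attrs)
        )

    def forward(self, node_ids: torch.Tensor, sensitive: list) -> torch.Tensor:
        z = self.base(node_ids)
        if not sensitive:  # nothing to hide: return the raw embedding
            return z
        return torch.stack([self.filters[k](z) for k in sensitive]).sum(dim=0)
```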

SLIDE 23

Enforcing fairness through an adversary

§ Key component 2: Adversarial discriminators.
  § For each sensitive attribute, train an adversarial discriminator that tries to predict that sensitive attribute from the filtered embeddings.

[Equation annotations: the discriminator for sensitive attribute k takes as input the filtered embedding for node u and an attribute value, and outputs the likelihood that node u has that attribute value.]
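A matching discriminator sketch in the same style; the two-layer architecture is an assumption:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """D_k: predicts the value of sensitive attribute k from the filtered
    embedding; outputs per-value log-likelihoods."""
    def __init__(self, dim: int, num_values: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                 nn.Linear(dim, num_values))

    def forward(self, z_filtered: torch.Tensor) -> torch.Tensor:
        return self.net(z_filtered).log_softmax(dim=-1)
```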

SLIDE 24

Enforcing fairness through an adversary

§ Putting it all together in an adversarial loss:

[Equation annotations: the original loss function for the edge prediction task; the likelihood of the discriminators predicting the sensitive attributes; a constant that determines the strength of the fairness constraints.]
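Assembling the annotated pieces, the dropped formula is presumably of this general form (a hedged reconstruction, with $\tilde{z}_u$ the filtered embedding, $a_u^k$ the value of attribute $k$ for node $u$, and $\lambda$ the fairness-strength constant used in the trade-off slide later):

$$L \;=\; L_{\text{edge}}\big(s(e), s(e_1^-), \ldots, s(e_m^-)\big) \;+\; \lambda \sum_{k \in S} \log D_k\big(\tilde{z}_u, a_u^k\big)$$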

SLIDE 25

Enforcing fairness through an adversary

§ Putting it all together in an adversarial loss:

§ During training, the encoder tries to minimize this loss, while the adversarial discriminators are trained to maximize it.
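A sketch of one alternating update, reusing `score`, `max_margin_loss`, `CompositionalEncoder`, and `Discriminator` from the earlier sketches; the optimizers, the uniform negative sampler, and the choice of the max-margin task loss are all assumptions, and `attrs[k]` is assumed to be a per-node tensor of attribute values:

```python
import torch

def train_step(encoder, discriminators, enc_opt, disc_opt,
               edges, attrs, num_nodes, sensitive, lam=1.0):
    u, v = edges[:, 0], edges[:, 1]

    def disc_loglik(z):
        # Sum of the log-likelihoods the discriminators assign to the
        # true attribute values of the head nodes.
        return sum(discriminators[k](z)
                   .gather(-1, attrs[k][u].unsqueeze(-1)).sum()
                   for k in sensitive)

    # (1) Discriminators maximize the log-likelihood (encoder detached).
    disc_opt.zero_grad()
    (-disc_loglik(encoder(u, sensitive).detach())).backward()
    disc_opt.step()

    # (2) Encoder minimizes L_edge + lambda * (discriminator log-likelihood).
    enc_opt.zero_grad()
    z_u, z_v = encoder(u, sensitive), encoder(v, sensitive)
    v_neg = torch.randint(0, num_nodes, (len(edges), 5))
    loss = max_margin_loss(score(z_u, z_v),
                           score(z_u.unsqueeze(1), encoder(v_neg, sensitive))).sum()
    (loss + lam * disc_loglik(z_u)).backward()
    enc_opt.step()
```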

SLIDE 26

Enforcing fairness through an adversary


SLIDE 27

Dataset 1: MovieLens-1M

§ Classic recommender system benchmark.
§ Bipartite graph between users and movies.
§ Nodes (~10,000): users and movies.
§ Edges (~1,000,000): the rating a user gives a movie.
§ Sensitive attributes:
  § Gender
  § Age (binned to become a categorical attribute)
  § Occupation

SLIDE 28

Dataset 2: Reddit

§ Derived from public Reddit comments.
§ Bipartite graph between users and communities.
§ Nodes (~300,000): users and communities.
§ Edges (~7,000,000): whether a user commented in that community.
§ Sensitive attributes: randomly select 50 communities to be "sensitive" communities.

SLIDE 29

Dataset 3: Freebase 15k-237

§ Derived from a classic knowledge base completion benchmark.
§ Knowledge graph between a set of typed entities.
§ Nodes (~15,000): entities.
§ Edges (~150,000): 237 different relation types (e.g., married_to, born_in, capital_of, director_of).
§ Sensitive attributes: randomly selected 3 entity type annotations (e.g., is_actor) to be "sensitive" attributes.

SLIDE 30

Experiments: Three questions

  • 1. What is the cost of invariance?
  • 2. What is the impact of compositionality?
  • 3. Can we generalize to unseen combinations of attributes?

SLIDE 31

MovieLens: Fairness results

§ How strongly can we enforce fairness?
§ Compare three approaches to enforcing fairness:
  § No adversary (i.e., just train on the recommendation task)
  § An independent adversarial model for each attribute
  § The full compositional model

SLIDE 32

MovieLens: Fairness results

§ How strongly can we enforce fairness?
§ Evaluate how well a two-layer MLP can classify the sensitive attributes from the learned node embeddings:
  § AUC for the binary gender attribute.
  § Micro-averaged F1-score for the age and occupation attributes.
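A sketch of this leakage evaluation with scikit-learn; the train/test split handling, hidden sizes, and iteration cap are illustrative:

```python
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score, f1_score

def attribute_leakage(z_train, y_train, z_test, y_test, binary: bool):
    """Train an MLP to recover a sensitive attribute from the learned
    embeddings; lower scores mean the embeddings are more invariant."""
    clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500)
    clf.fit(z_train, y_train)
    if binary:   # e.g., gender -> AUC
        return roc_auc_score(y_test, clf.predict_proba(z_test)[:, 1])
    # e.g., age or occupation -> micro-averaged F1
    return f1_score(y_test, clf.predict(z_test), average="micro")
```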

SLIDE 33

MovieLens: Fairness results

§ Key takeaways:
  § After applying the compositional adversary, accuracy is no better than a majority classifier!
  § Performance of the compositional adversary is on par with the independent adversaries!

SLIDE 34

MovieLens: Impact on recommendations

§ Evaluate recommendation performance (RMSE) with and without enforcing fairness.
§ There is a drop in accuracy, but it is not catastrophic.

[Figure: RMSE (y-axis, 0.8 to 1.8) over training epochs (25 to 200) for the Gender, Age, Occupation, and Compositional adversaries, and the no-adversary baseline.]

SLIDE 35

MovieLens: Trade-off

[Figure: two panels sweeping the adversarial weight λ on a log scale from 10⁻⁴ to 10. Left: compositional gender AUC vs. the baseline gender AUC (y-axis 0.50 to 0.70). Right: compositional adversary RMSE vs. the baseline RMSE (y-axis 0.90 to 1.00).]

§ λ allows a trade-off between fairness and recommendation performance.

SLIDE 36

Reddit results: Fairness


[Figure: left panel: accuracy predicting the sensitive attributes (AUC score, y-axis 0.0 to 0.8); right panel: edge-prediction accuracy (AUC, y-axis 0.70 to 0.82) over 50 training epochs. Legend: Baseline, Non-Compositional, Held-Out Compositional, No-Held-Out Compositional.]

§ Same set-up as MovieLens, but here we have 10 sensitive attributes.
§ Again, we are able to strongly enforce fairness, but at a non-trivial cost.

SLIDE 37

Freebase results

Ability to predict the sensitive attributes (measured in AUC) and the impact on task performance (mean rank).

§ On the synthetic Freebase data we see that enforcing fairness leads to a significant drop in task performance.

SLIDE 38

Conclusions and outlook

§ Fairness in network representation learning is an understudied issue.
§ We can enforce fairness in a flexible way, but at a cost.
§ There is no perfect notion of fairness.