Quick Question • A doctor is walking down the street with a boy. The boy is the doctor’s son, but the doctor is not the boy’s father. How is that possible? GENDER BIAS IN WORD EMBEDDINGS 1

Simple Answer • The doctor is the boy’s mother…

Gender Bias in Word Embeddings
Hila Gonen and Yoav Goldberg, Bar Ilan University
WIDS TLV, 5/3/19
Accepted to NAACL 2019

Outline
• Background
  • Gender Bias
  • Word embeddings
• Current debiasing methods
  • Post-processing (Bolukbasi et al.)
  • During training (Zhao et al.)
• Analyzing debiased embeddings
• Conclusion

What do we mean by gender bias?

What do we mean by gender bias? (Zhao et al., NAACL, 2018)

What do we mean by gender bias? (Hendricks et al., 2018)

Word Embeddings • We will focus on gender bias in word embeddings. What are word embeddings?

Word Embeddings
• Each word in the vocabulary is represented by a low-dimensional vector (~ )
• All words are embedded into the same space
• Similar words have similar vectors (= their vectors are close to each other in the vector space)
• Word embeddings are successfully used for various NLP applications
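
A minimal sketch of what "similar words have similar vectors" means in practice, assuming cosine similarity as the closeness measure. The three-dimensional vectors and the helper names (`cosine`, `nearest`) are invented for illustration; real embeddings have far more dimensions and are learned from data.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors, invented for illustration only (not real embeddings).
toy = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "table": [0.0, 0.2, 0.9],
}

def nearest(word, vectors, k=1):
    """Return the k words whose vectors are most similar to `word`."""
    others = [w for w in vectors if w != word]
    ranked = sorted(others, key=lambda w: cosine(vectors[word], vectors[w]), reverse=True)
    return ranked[:k]

print(nearest("dog", toy))
```

Ranking every other word by similarity to a query word in this way is also how a top-k neighbor list is produced.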

Training Word Embeddings
• Learned from raw data
• The Distributional Hypothesis:
  • Words that occur in the same contexts tend to have similar meanings (Harris, 1954)
  • “You shall know a word by the company it keeps” (Firth, 1957)

Word Embeddings • Top-k lists for dog, food, and nurse (Mikolov et al., 2013)

Bias in Word Embeddings

Bias in Word Embeddings
• Caliskan et al. replicate a spectrum of known biases from the literature using word embeddings
• They show that text corpora contain several types of biases:
  • morally neutral, as toward insects or flowers
  • problematic, as toward race or gender
  • veridical, reflecting the distribution of gender with respect to careers or first names
• They introduce methods for identifying these biases

Bias in Word Embeddings
Concepts 1 | Concepts 2 | Attributes 1 | Attributes 2
Flowers: buttercup, daisy, lily | Insects: ant, caterpillar, flea | Pleasant: freedom, health, love | Unpleasant: abuse, crash, filth
European American names: Brad, Brendan | African American names: Darnell, Lakisha | Pleasant: joy, love, peace | Unpleasant: agony, terrible
Male attributes: male, man, boy | Female attributes: female, woman, girl | Math words: math, algebra, geometry | Arts words: poetry, art, dance

Definition of Gender Bias in Word Embeddings (NIPS, 2016)

Definition of Gender Bias in Word Embeddings
• We check how similar a word is to “he” and to “she” (cosine similarity)
• Note that we care about the difference between the two
• This difference is the projection of the word on the direction “he – she”*
* This is the gender direction; it can also be computed using several pairs together (e.g. man-woman, brother-sister)
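
The definition above can be sketched as follows, assuming all vectors are normalized to unit length so that cosine similarity is a plain dot product. The two-dimensional vectors and the function name `bias_by_projection` are invented for illustration; only the sign convention (positive leans toward "he", negative toward "she") follows the slide.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bias_by_projection(word_vec, he_vec, she_vec):
    """Projection of the word on the direction he - she (all vectors unit length)."""
    w, he, she = normalize(word_vec), normalize(he_vec), normalize(she_vec)
    direction = [h - s for h, s in zip(he, she)]
    return dot(w, direction)

# Toy 2-d vectors, invented for illustration only.
he, she = [1.0, 1.0], [-1.0, 1.0]
toy = {"nurse": [-0.5, 1.0], "captain": [0.4, 1.0], "table": [0.0, 1.0]}

for word, vec in toy.items():
    print(word, round(bias_by_projection(vec, he, she), 4))
```

Because the vectors are unit length, this projection equals cos(word, he) minus cos(word, she), which is the "difference between the two" mentioned above.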

Definition of Gender Bias in Word Embeddings
• bias(consultant) = -0.0023
• bias(nurse) = -0.2471
• bias(captain) = 0.1521
• bias(table) = -0.0003
Zhao et al.

Reduce Bias after Training
• Bolukbasi et al. suggest removing bias after training:
  • Define a gender direction
  • Define inherently neutral words (nurse as opposed to mother)
  • Zero the projection of all neutral words on the gender direction: x := x − (projection of x on the gender direction)
• The bias of all neutral words is now zero by definition
• We will refer to these embeddings as HARD-DEBIASED
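
The zeroing step above can be sketched as subtracting from each neutral word its component along the gender direction. This covers only that single step, under the assumption that the gender direction is already given; the vectors and the name `remove_projection` are invented for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    """Scale a vector to unit length."""
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]

def remove_projection(x, direction):
    """Return x minus its projection on `direction`."""
    g = unit(direction)
    scale = dot(x, g)  # length of the component of x along g
    return [a - scale * b for a, b in zip(x, g)]

# Toy values, invented for illustration only.
gender_direction = [2.0, 0.0, 0.0]
nurse = [-0.5, 1.0, 0.3]

debiased = remove_projection(nurse, gender_direction)
print(debiased)
```

After this step the projection of the word on the gender direction is zero, which is why the bias of every neutral word is zero by definition under the projection-based measure.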

Reduce bias during training (EMNLP, 2018)

Reduce bias during training
• Zhao et al. suggest reducing bias during training:
  • Train word embeddings using GloVe (Pennington et al., 2014)
  • Alter the loss to encourage the gender information to concentrate in the last coordinate
  • To ignore gender information, simply remove the last coordinate

Reduce bias during training
• Details:
  • Use two groups of male/female seed words, and encourage words from different groups to differ in their last coordinate
  • Encourage the representation of neutral-gender words (excluding the last coordinate) to be orthogonal to the gender direction
• We will refer to these embeddings as GN-GLOVE
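
A minimal sketch of the "remove the last coordinate" step, assuming training has already pushed the gender information into that coordinate. The four-dimensional vectors are invented and deliberately idealized (the toy "he" and "she" differ only in the last coordinate, which real trained vectors would not do exactly); the altered training loss itself is not shown.

```python
def drop_gender_coordinate(vector):
    """Return a copy of the vector without its last coordinate."""
    return vector[:-1]

# Toy 4-d vectors, invented for illustration: by assumption the last
# coordinate is the one that training reserved for gender information.
toy = {
    "he": [0.2, 0.5, 0.1, 0.9],
    "she": [0.2, 0.5, 0.1, -0.9],
    "nurse": [0.7, 0.1, 0.4, -0.3],
}

gender_free = {word: drop_gender_coordinate(vec) for word, vec in toy.items()}
print(gender_free["nurse"])
```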

Some Results
• Bolukbasi et al.:
  • Bias of all inherently-neutral words is zero by definition
  • Generated analogies are less stereotyped
• Zhao et al.:
  • Decrease bias in co-reference resolution

Problem solved? • Not so fast…

Clustering male- and female-biased words
• We take the most biased words in the vocabulary according to the original bias (500 male-biased and 500 female-biased)
• We cluster them into two clusters using K-means
• The clusters align with gender with an accuracy of:
  • 92.5% after debiasing, compared to 99.99% before (HARD-DEBIASED)
  • 85.6% after debiasing, compared to 100% before (GN-GLOVE)
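
The experiment above can be sketched with a small hand-written two-cluster K-means. The slide names neither a library nor an initialization, so the deterministic farthest-point initialization here is an assumption, as are the toy points and the function names. Cluster ids are arbitrary, so alignment is scored under the better of the two possible labelings.

```python
def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def mean(points):
    return [sum(col) / len(points) for col in zip(*points)]

def two_means(points, iterations=20):
    """Cluster points into two groups; returns a list of 0/1 assignments."""
    # Assumed initialization: the first point and the point farthest from it.
    c0 = points[0]
    c1 = max(points, key=lambda p: sq_dist(p, c0))
    assign = []
    for _ in range(iterations):
        new_assign = [0 if sq_dist(p, c0) <= sq_dist(p, c1) else 1 for p in points]
        if new_assign == assign:
            break  # converged
        assign = new_assign
        group0 = [p for p, a in zip(points, assign) if a == 0]
        group1 = [p for p, a in zip(points, assign) if a == 1]
        if group0:
            c0 = mean(group0)
        if group1:
            c1 = mean(group1)
    return assign

def alignment_accuracy(assign, labels):
    """Share of points whose cluster matches their original gender label."""
    acc = sum(a == l for a, l in zip(assign, labels)) / len(labels)
    return max(acc, 1 - acc)

# Toy "debiased" vectors, invented for illustration; labels come from the
# ORIGINAL bias (0 = male-biased, 1 = female-biased).
points = [[1.0, 0.1], [0.9, -0.2], [1.1, 0.0], [-1.0, 0.2], [-0.8, -0.1], [-1.2, 0.0]]
labels = [0, 0, 0, 1, 1, 1]

print(alignment_accuracy(two_means(points), labels))
```

A high accuracy on debiased vectors means the two gender groups are still separable, even though each word's projection-based bias was reduced.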

Clustering male- and female-biased words [Figure: clustering plots for HARD-DEBIASED and GN-GLOVE]

Bias-by-neighbors
• Bias is still manifested by the word being close to socially-marked feminine words
• A new mechanism for measuring bias:
  • The percentage of male/female socially-biased words among the k nearest neighbors of the target word
• Pearson correlation with bias-by-projection:
  • 0.686 after debiasing, compared to 0.741 before (HARD-DEBIASED)
  • 0.736 after debiasing, compared to 0.773 before (GN-GLOVE)
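
A sketch of the neighbor-based measure and its correlation with the projection-based one. It assumes "male socially-biased" means a positive original bias and uses cosine similarity to find neighbors; the slide does not give k, so `k=2` is arbitrary, and every vector and bias value below is invented for illustration.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def bias_by_neighbors(target, vectors, original_bias, k):
    """Fraction of the k nearest neighbors of `target` with male (> 0) original bias."""
    others = [w for w in vectors if w != target]
    ranked = sorted(others, key=lambda w: cosine(vectors[target], vectors[w]), reverse=True)
    return sum(original_bias[w] > 0 for w in ranked[:k]) / k

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy "debiased" vectors and ORIGINAL projection biases, all invented.
vectors = {
    "captain": [1.0, 0.1], "soldier": [0.9, 0.2], "engineer": [1.0, 0.3],
    "nurse": [0.1, 1.0], "nanny": [0.2, 0.9], "receptionist": [0.3, 1.0],
}
original_bias = {
    "captain": 0.15, "soldier": 0.12, "engineer": 0.10,
    "nurse": -0.25, "nanny": -0.20, "receptionist": -0.18,
}

words = list(vectors)
by_projection = [original_bias[w] for w in words]
by_neighbors = [bias_by_neighbors(w, vectors, original_bias, k=2) for w in words]
print(pearson(by_projection, by_neighbors))
```

A correlation that stays high after debiasing indicates that a word's neighborhood still reflects its original bias.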

Professions
• We take a predefined list of professions
• We show the correlation between bias-by-projection and bias-by-neighbors, before and after debiasing

Professions [Figure: plots for HARD-DEBIASED and GN-GLOVE]

Association with stereotyped words
• We evaluate the association between female/male names and female/male stereotyped words (experiments taken from Caliskan et al.)
Category | Female-associated | Male-associated
names | Amy, Joan, Lisa | John, Paul, Mike
family vs. career | home, parents, children | executive, management, professional
arts vs. math | poetry, art, dance | math, algebra, geometry
arts vs. science | dance, literature, novel | science, technology, physics
• All the associations have very small p-values

Classifying to gender
• Can a classifier learn to generalize from some gendered words to others based only on their representations?
• Setup: the 5000 most biased words are split into 1000 for training and 4000 for testing; the classifier is an SVM

Classifying to gender • Results:

Conclusion
• Word embeddings exhibit gender bias
• Debiasing is hard!
• Social gender bias is picked up from the data by the models
• A lot of the bias information is still recoverable (even when the bias is low/zero according to the definition usually used)
• The way we define the bias is important, and needs to be reconsidered when trying to solve the problem

Questions?

Thank you!
