Kai-Wei Chang UCLA References: http://kwchang.net Kai-Wei Chang - PowerPoint PPT Presentation

What It Takes to Control Societal Bias in Natural Language Processing Kai-Wei Chang UCLA References: http://kwchang.net Kai-Wei Chang (http://kwchang.net/talks/genderbias/) 1

Always working?!
Performance on benchmarks vs. performance in reality
(Image source: http://viralscape.com/travel-expectations-vs-reality/)

NLP Models are Brittle
• Generating Natural Language Adversarial Examples [ASEHSC (EMNLP 18)]
• Retrofitting Contextualized Word Embeddings with Paraphrases [SCZC (EMNLP 19)]

Training NLP Models Requires Large Data
• How about low-resource languages?
• How about domains where annotations are expensive?
[Image: BIG DATA]

NLP Model is Biased
• Coreference resolution is biased [1, 2]
• Model fails for the female case when given the same context
[Example panels: Semantics Only; w/ Syntactic Cues]
[1] Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods [ZWYOC NAACL 2018]
[2] Rudinger et al. Gender Bias in Coreference Resolution. NAACL 2018

Wino-bias data [ZWYOC NAACL 2018]
• Stereotypical dataset
• Anti-stereotypical dataset

Gender bias in Coref System
[Bar chart: scores for E2E, E2E (Debiased WE), and E2E (Full model) on the Stereotype set, the Anti-Stereotype set, and their average (Avg); y-axis from 48 to 78]

NLP Model is Biased
• Language generation is biased
Reference: The Woman Worked as a Babysitter: On Biases in Language Generation [SCNP EMNLP 2019]

Outline
• Gender Bias in NLP
• Representational harm
• Performance gap in downstream applications [ACL 2019]
• Cross-lingual Dependency Parsing

I will show you…
• How to *unlearn* unwanted bias in training data
• How to inject knowledge that is not present in training data
• Some ILP formulations

A cartoon of the ML (NLP) pipeline
[Diagram: pipeline stages Data, Representation (e.g., word embedding), (Structured) Inference, and Prediction, together with Auxiliary Corpus/Models]

Representational Harm in NLP: Word Embeddings can be Sexist
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings [BCZSK NeurIPS16]
• $w_{\text{woman}} - w_{\text{man}} + w_{\text{uncle}} \sim w_{\text{aunt}}$
• Analogies found along the he/she direction:
    he:          she:
    brother      sister
    beer         cocktail
    physician    registered_nurse
    professor    associate professor
• We use the Google w2v embedding trained on news text
• Concurrent work: replicated IAT findings using word embeddings
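
The analogy arithmetic on this slide (solve a : b :: c : ? by taking the word nearest to w_b − w_a + w_c) can be sketched as follows. This is a minimal illustration only: the 2-dimensional vectors are made-up values, not the Google w2v embedding, and the helper names `cosine` and `analogy` are hypothetical.

```python
import math

# Toy 2-d embeddings (hypothetical values chosen for illustration):
# axis 0 loosely encodes gender, axis 1 encodes "family relation".
emb = {
    "man":   [1.0, 0.0],
    "woman": [-1.0, 0.0],
    "uncle": [1.0, 1.0],
    "aunt":  [-1.0, 1.0],
    "beer":  [0.3, -1.0],
}

def cosine(u, v):
    """Cosine similarity between two vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c, emb):
    """Solve a : b :: c : ? via the word nearest to w_b - w_a + w_c."""
    target = [wb - wa + wc for wa, wb, wc in zip(emb[a], emb[b], emb[c])]
    # Exclude the query words themselves, as is common practice.
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))

print(analogy("man", "woman", "uncle", emb))
```

With real embeddings the same query returns stereotyped pairs such as those in the he/she list above, which is what makes the arithmetic a useful probe for bias.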

[Figure: embedding plot with the words she, he, father, mother, king, queen]


May cause allocative harms in downstream applications

Debiasing word embeddings [BCZSK; NeurIPS 16]
• This can be done by projecting the gender direction out of gender-neutral words using linear operations
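
The projection step can be sketched as w' = w − (w · g) g for a unit gender direction g. The code below is a sketch under stated assumptions: the gender direction is estimated from a single hypothetical he/she pair, the 3-dimensional vectors are made up, and `remove_direction` is an illustrative name rather than an API from the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale v to unit length."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def remove_direction(w, g):
    """Project direction g out of w: w - (w . g_hat) g_hat."""
    g_hat = normalize(g)
    s = dot(w, g_hat)
    return [wi - s * gi for wi, gi in zip(w, g_hat)]

# Hypothetical toy vectors (assumption: axis 0 carries the gender signal).
he = [1.0, 0.2, 0.0]
she = [-1.0, 0.2, 0.0]
gender_direction = [a - b for a, b in zip(he, she)]

programmer = [0.4, 0.5, 0.7]  # a gender-neutral word with a gender component
neutral = remove_direction(programmer, gender_direction)

# After projection the word has no component along the gender direction.
print(neutral, dot(neutral, normalize(gender_direction)))
```

Only the component along g changes; the remaining coordinates of the word vector are left untouched, which is why the operation is purely linear.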

Make Gender Information Transparent in Word Embedding
Learning Gender-Neutral Word Embeddings [ZZLWC; EMNLP18]
• $x_a$: dimensions for other latent aspects
• $x_g$: dimensions reserved for gender information
[Figure: gender-dimension values 1, -1, and ? shown for the example words mother, doctor, and father]


Gender bias in Coref System
[Bar chart: scores for E2E, E2E (Debiased WE), and E2E (Full model) on the Stereotype set, the Anti-Stereotype set, and their average (Avg); y-axis from 48 to 78]

How about…
• languages with grammatical gender
• bilingual word embedding
• contextualized embedding

How about other languages? [ZSZHCCC EMNLP19]
• Language with grammatical gender
• Morphological agreement

• Linear Discriminant Analysis (LDA)
• Identify the grammatical gender direction
[Figure labels: feminine words; masculine words]
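
One way to picture how LDA yields a grammatical gender direction is two-class Fisher LDA, where the direction is proportional to S_w^{-1}(μ_feminine − μ_masculine). The sketch below works in 2 dimensions with made-up points standing in for noun embeddings; the function names and data are assumptions for illustration, not the setup used in [ZSZHCCC EMNLP19].

```python
import math

def mean(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]

def scatter(points, mu):
    """2x2 within-class scatter matrix: sum of (p - mu)(p - mu)^T."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mu[0], p[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def lda_direction(class_a, class_b):
    """Unit vector proportional to S_w^{-1} (mu_a - mu_b), for 2-d data."""
    mu_a, mu_b = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    norm = math.hypot(w[0], w[1])
    return [w[0] / norm, w[1] / norm]

# Hypothetical 2-d "embeddings" of grammatically feminine and masculine nouns.
feminine = [(1.0, 0.1), (1.2, -0.1), (0.8, 0.0)]
masculine = [(-1.0, 0.1), (-1.2, -0.1), (-0.8, 0.0)]

direction = lda_direction(feminine, masculine)
print(direction)
```

Projecting a word onto `direction` then gives a signed score: positive for the feminine class and negative for the masculine class in this toy data.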

[Figure labels: masculine, feminine, Female, Male]

How about bilingual embedding? [ZSZHCCC EMNLP19]
[Figure labels: female doctor in Spanish; male doctor in Spanish]

How about Contextualized Representation?
Gender Bias in Contextualized Word Embeddings [ZWYCOC; NAACL19]
• Gender direction: ELMo(driver) in one sentence − ELMo(driver) in the other
    (Feminine) The driver stopped the car at the hospital because she was paid to do so
    (Masculine) The driver stopped the car at the hospital because he was paid to do so
• The first two principal components explain more variance than the others

Unequal Treatment of Gender
• Classifier f : ELMo(occupation) → context gender
[Diagram: ELMo embeddings → f → gender prediction]
Example: The driver stopped the car at the hospital because she was paid to do so

Unequal Treatment of Gender
Example: The writer taught himself to play violin.
• Classifier f : ELMo(occupation) → context gender
• ELMo propagates gender information to other words
• Male information is 14% more accurately propagated than female
[Bar chart: Acc (%) for Male Context vs. Female Context; y-axis from 80 to 100]
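
The accuracy comparison on this slide amounts to scoring the classifier f separately on male-context and female-context sentences. A minimal sketch of that bookkeeping follows; the records are invented counts for illustration (they are not the talk's results), and `accuracy_by_context` is a hypothetical helper name.

```python
def accuracy_by_context(records):
    """records: iterable of (context_gender, predicted_gender) pairs.

    Returns accuracy in percent for each context gender.
    """
    totals, correct = {}, {}
    for gold, pred in records:
        totals[gold] = totals.get(gold, 0) + 1
        correct[gold] = correct.get(gold, 0) + (gold == pred)
    return {g: 100.0 * correct[g] / totals[g] for g in totals}

# Made-up predictions: 20 male-context and 20 female-context sentences.
records = (
    [("male", "male")] * 19 + [("male", "female")] * 1
    + [("female", "female")] * 16 + [("female", "male")] * 4
)

acc = accuracy_by_context(records)
gap = acc["male"] - acc["female"]
print(acc, gap)
```

A nonzero gap means the occupation embedding carries the gender of one context more reliably than the other.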

Coreference with Contextualized Embedding
• ELMo boosts the performance
• However, it enlarges the bias (Δ)
[Bar chart: GloVe vs. + ELMo on OntoNotes, Pro., and Anti.; y-axis from 40 to 80; annotations Δ: 29.6 and Δ: 26.6]

Should We Debias Word Embedding?
• Awareness is better than blindness (Caliskan et al. 17)
• Completely removing bias from embedding is hard if not impossible (Gonen & Goldberg 19)
[Diagram: the pipeline (Data, Representation (e.g., word embedding), Auxiliary Corpus/Models, (Structured) Inference, Prediction) annotated with Data Augmentation and Calibration]
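
The slide names Data Augmentation without spelling out a procedure. One common form, assumed here purely for illustration, is gender swapping: each training sentence is paired with a copy in which gendered words are exchanged. The swap table below is a tiny hypothetical list; a realistic one would be far larger and would need to handle ambiguous forms (for example "her" can map to "him" or "his").

```python
import re

# Small illustrative swap table (an assumption, deliberately unambiguous).
SWAP = {
    "he": "she", "she": "he",
    "himself": "herself", "herself": "himself",
    "father": "mother", "mother": "father",
}
_PATTERN = re.compile(r"\b(" + "|".join(SWAP) + r")\b", re.IGNORECASE)

def gender_swap(sentence):
    """Return the sentence with each listed gendered word swapped."""
    def repl(match):
        word = match.group(0)
        swapped = SWAP[word.lower()]
        # Preserve capitalization of the original token.
        return swapped.capitalize() if word[0].isupper() else swapped
    return _PATTERN.sub(repl, sentence)

def augment(corpus):
    """Original sentences plus their gender-swapped copies."""
    return corpus + [gender_swap(s) for s in corpus]

corpus = [
    "The driver stopped the car at the hospital because she was paid to do so",
    "The writer taught himself to play violin .",
]
for sentence in augment(corpus):
    print(sentence)
```

Training on the augmented corpus exposes the model to both genders in every context, which targets bias at the Data stage of the pipeline rather than at the Representation stage.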
