Algorithms for NLP, IITP, Fall 2019. Lecture 25: Computational Ethics



SLIDE 1

Tsvetkov – Socially Responsible NLP

Yulia Tsvetkov

Algorithms for NLP

IITP, Fall 2019

Lecture 25: Computational Ethics

SLIDE 2

What Does NLP Have To Do With Ethics?

  • Applications

    ○ Machine Translation
    ○ Information Retrieval
    ○ Question Answering
    ○ Dialogue Systems
    ○ Information Extraction
    ○ Summarization
    ○ Sentiment Analysis
    ○ ...

SLIDE 3

Language, People, and Web

The common misconception is that language has to do with words and what they mean. It doesn’t. It has to do with people and what they mean.

Herbert H. Clark & Michael F. Schober, 1992

SLIDE 4

Both Ethics and NLP are Interdisciplinary Fields

  • Philosophy
  • Sociology
  • Psychology
  • Linguistics
  • Sociolinguistics
  • Social psychology
  • Computational Social Science
  • Machine Learning
SLIDE 5

What is Ethics?

“Ethics is a study of what are good and bad ends to pursue in life and what it is right and wrong to do in the conduct of life. It is therefore, above all, a practical discipline. Its primary aim is to determine how one ought to live and what actions one ought to do in the conduct of one’s life.”

– Introduction to Ethics, John Deigh
SLIDE 6

What is Ethics?

It’s the good things. It’s the right things.

SLIDE 7

What is Ethics?

It’s the good things. It’s the right things.

How simple is it to define what’s good and what’s right?

SLIDE 8

The Trolley Dilemma

Should you pull the lever to divert the trolley?

[From Wikipedia]

SLIDE 9

The Chicken Dilemma

[Images: a hen and a rooster]

Ethical?

SLIDE 10

The Chicken Dilemma

[Images: a hen and a rooster]

➔ Ethics is the inner guiding, moral principles, and values of people and society
➔ There are grey areas; often there are no binary answers
➔ Ethics changes over time with the values and beliefs of people
➔ Legal ≠ ethical

SLIDE 11

Ethics ≠ Law

  • Illegal + immoral:
  • Legal + immoral:
  • Illegal + moral:
  • Legal + moral:
SLIDE 12

Ethics ≠ Law

  • Illegal + immoral: murder
  • Legal + immoral:
  • Illegal + moral:
  • Legal + moral:
SLIDE 13

Ethics ≠ Law

  • Illegal + immoral: murder
  • Legal + immoral: cheating on a spouse
  • Illegal + moral:
  • Legal + moral:
SLIDE 14

Ethics ≠ Law

  • Illegal + immoral: murder
  • Legal + immoral: cheating on a spouse
  • Illegal + moral: civil disobedience
  • Legal + moral: eating ice cream
SLIDE 15

Ethics ≠ Law

  • Illegal + immoral: murder
    ○ capital punishment
  • Legal + immoral: cheating on a spouse
    ○ cancelling Game of Thrones
  • Illegal + moral: civil disobedience
    ○ assassination of a dictator
  • Legal + moral: eating ice cream
    ○ eating the last ice cream in the freezer
SLIDE 16

Ethical Considerations are Time-Dependent

SLIDE 17

We Cannot Foresee All Possible Uses of Technology

SLIDE 18

Working on Ethical Issues in AI

  • Ethics is hard even to define; it is subjective and it changes over time.
    Should we then be trying to quantify and evaluate ethics in AI?
    ○ It is another problem with an ill-defined answer
      ■ It still has some definition of good and bad
      ■ Not everyone agrees on all examples
      ■ But they do agree on some examples
      ■ Judgments do correlate between people
    ○ Complex NLP problems are also hard to quantify and evaluate
      ■ Summarization, QA, dialogue, speech synthesis

SLIDE 19

Let’s Train an IQ Classifier

  • Intelligence Quotient: a number used to express the apparent relative intelligence of a person
SLIDE 20

An IQ Classifier

Let’s train a classifier to predict people’s IQ from their photos.

  • Who could benefit from such a classifier?
SLIDE 21

An IQ Classifier

Let’s train a classifier to predict people’s IQ from their photos.

  • Who could benefit from such a classifier?
  • Assume the classifier is 100% accurate. Who can be harmed by such a classifier?

SLIDE 22

An IQ Classifier

Let’s train a classifier to predict people’s IQ from their photos.

  • Who could benefit from such a classifier?
  • Who can be harmed by such a classifier?
  • Suppose our test results show 90% accuracy
SLIDE 23

An IQ Classifier

Let’s train a classifier to predict people’s IQ from their photos.

  • Who could benefit from such a classifier?
  • Who can be harmed by such a classifier?
  • Suppose our test results show 90% accuracy

    ○ Evaluation reveals that white females have 95% accuracy
    ○ People with blond hair under the age of 25 have only 60% accuracy
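Gaps like these only become visible when metrics are reported per subgroup, not just in aggregate. A minimal sketch of such a breakdown, using toy labels and hypothetical group tags rather than any real dataset:

```python
from collections import defaultdict

def per_group_accuracy(y_true, y_pred, groups):
    """Accuracy broken down by demographic group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}

# Toy illustration: overall accuracy (5/8 = 0.625) masks a large gap.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
groups = ["A", "A", "A", "B", "B", "A", "A", "B"]
print(per_group_accuracy(y_true, y_pred, groups))  # {'A': 1.0, 'B': 0.0}
```

The point of the sketch: a model can look acceptable on average while failing completely on a minority subgroup.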

SLIDE 24

An IQ Classifier

Let’s train a classifier to predict people’s IQ from their photos.

  • Who could benefit from such a classifier?
  • Who can be harmed by such a classifier?
  • Are there biases in data?
SLIDE 25

An IQ Classifier

Let’s train a classifier to predict people’s IQ from their photos.

  • Who could benefit from such a classifier?
  • Who can be harmed by such a classifier?
  • Are there biases in data?
  • What personal data was used as training data? Privacy concerns?
  • Who is responsible?

○ Researcher/developer? Reviewer? University? Society?

SLIDE 26

What’s the Difference?

SLIDE 27

AI and People

Applications are pervasive in our daily lives!

SLIDE 28

Learn to Assess Computational Systems Adversarially

  • Who could benefit from such a technology?
  • Who can be harmed by such a technology?
  • Representativeness of training data
  • Could sharing this data have major effect on people’s lives?
  • What are confounding variables and corner cases to control for?
  • Does the system optimize for the “right” objective?
  • Could prediction errors have major effect on people’s lives?
SLIDE 29

Learn to Assess Computational Systems Adversarially

  • Who could benefit from your technology?
  • Who can be harmed by your technology?
  • Representativeness of your training data
  • Could sharing this data have a negative effect on people’s lives?
  • What are confounding variables and corner cases for you to control for?
  • Does your system optimize for the “right” objective?
  • Could prediction errors of your technology have a major effect on people’s lives?

SLIDE 30

Topics in the Intersection of Ethics and NLP

  • Misrepresentation and human biases in NLP data and models
  • Hate speech and civility in online communication
  • Privacy and security
  • Democracy and the language of manipulation: bias in narratives, censorship, fake news, targeted content

  • NLP for social good: low-resource NLP, NLP for disaster response
SLIDE 31

Topics in the Intersection of Ethics and NLP

  • Bias and fairness concerns
    ○ Is my NLP model capturing social stereotypes?
    ○ Are my classifier’s predictions fair?
  • Dual-use NLP applications
    ○ E.g., persuasive language generation
      ■ in targeted advertisement, say, in payday loan ads?
  • Privacy concerns
    ○ Demographic factor prediction (gender, age, etc.)
    ○ Sexual orientation prediction
  • Socially beneficial applications
    ○ Hate speech detection
    ○ Monitoring disease outbreaks, etc.
    ○ Psychological monitoring/counseling
    ○ Low-resource NLP
    ○ + many more

SLIDE 32

Misrepresentation and Bias

SLIDE 33

Stereotypes

Which word is more likely to be used by a female? (Preotiuc-Pietro et al. ‘16)

Giggle – Laugh

SLIDE 34

Stereotypes

Which word is more likely to be used by a female? (Preotiuc-Pietro et al. ‘16)

Giggle – Laugh

SLIDE 35

Stereotypes

Which word is more likely to be used by a female? (Preotiuc-Pietro et al. ‘16)

Brutal – Fierce

SLIDE 36

Stereotypes

Which word is more likely to be used by a female? (Preotiuc-Pietro et al. ‘16)

Brutal – Fierce

SLIDE 37

Stereotypes

Which word is more likely to be used by an older person? (Preotiuc-Pietro et al. ‘16)

Impressive – Amazing

SLIDE 38

Stereotypes

Which word is more likely to be used by an older person? (Preotiuc-Pietro et al. ‘16)

Impressive – Amazing

SLIDE 39

Stereotypes

Which word is more likely to be used by a person of higher occupational class? (Preotiuc-Pietro et al. ‘16)

Suggestions – Proposals

SLIDE 40

Stereotypes

Which word is more likely to be used by a person of higher occupational class? (Preotiuc-Pietro et al. ‘16)

Suggestions – Proposals
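Comparisons like the ones above are typically computed from demographic-labeled corpora with an association score such as smoothed log-odds. A minimal sketch with hypothetical word counts (the numbers below are invented for illustration, not from Preotiuc-Pietro et al.):

```python
from math import log

def log_odds(word, group_counts, other_counts):
    """Smoothed log-odds of a word appearing in one group's text
    versus another's; positive means the word is more strongly
    associated with the first group."""
    a = group_counts.get(word, 0) + 0.5   # add-0.5 smoothing
    b = other_counts.get(word, 0) + 0.5
    na = sum(group_counts.values())
    nb = sum(other_counts.values())
    return log(a / (na - a)) - log(b / (nb - b))

# Hypothetical counts from two demographic subcorpora.
female = {"giggle": 30, "laugh": 100}
male = {"giggle": 5, "laugh": 110}
print(log_odds("giggle", female, male))   # positive: skews "female"
print(log_odds("laugh", female, male))    # negative: skews "male"
```

The sign of the score answers the quiz question; the magnitude indicates how skewed the usage is.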

SLIDE 41

Why do we intuitively recognize a default social group?

SLIDE 42

Why do we intuitively recognize a default social group? Implicit Bias

SLIDE 43

A Note On Terminology

Bias in ML ⬄ Cognitive bias ⬄ Human biases in ML

  • Bias in ML
    ○ Bias of an estimator: the difference between the estimator’s expected value and the true value of the parameter being estimated
    ○ Inductive bias: assumptions made by the model to learn the target function and to generalize beyond training data
  • Cognitive biases in cognitive science, social psychology, and behavioral economics
    ○ Our brains are evolutionarily hard-wired to store learned information for rapid retrieval and automatic judgments. Stereotypes inevitably form because of the innate tendency of the human mind to categorize the world to simplify processing
  • Human biases in ML
    ○ A mismatch between the data and assumptions used to build a model and the actual populations who would benefit from the technology
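For reference, the estimator-bias definition above can be stated compactly: for an estimator of a parameter,

```latex
\operatorname{Bias}(\hat{\theta}) \;=\; \mathbb{E}\!\left[\hat{\theta}\right] - \theta,
\qquad \text{and } \hat{\theta} \text{ is unbiased iff } \mathbb{E}\!\left[\hat{\theta}\right] = \theta .
```

This statistical sense of "bias" is distinct from the social-bias sense used in the rest of the lecture.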

SLIDE 44

How Do We Make Decisions

Kahneman & Tversky 1973, 1974, 2002

System 1: automatic, fast, parallel, effortless, associative, slow-learning

System 2: effortful, slow, serial, controlled, rule-governed, flexible

SLIDE 45

[Figure: ~10MP]

SLIDE 46

How Do We Make Decisions

Our brains are evolutionarily hard-wired to store learned information for rapid retrieval and automatic judgments. Over 95% of cognition is relegated to the System 1 “auto-pilot.”

System 1: automatic
System 2: effortful

SLIDE 47

Psychological Perspective on Implicit Bias

Stereotypes inevitably form because of the innate tendency of the human mind to:

  • Categorize the world to simplify processing
  • Store learned information in mental representations (called schemas)
  • Automatically and unconsciously activate stored information whenever one encounters a category member

SLIDE 48

SLIDE 49

SLIDE 50

Stereotypes are internalized as associations through natural processes of learning and categorization

[Image credit: Geoff Kaufman]

SLIDE 51

Social stereotypes are not necessarily negative, but still have negative effect

[Image credit: Geoff Kaufman]

SLIDE 52

Implicit biases are distressingly pervasive, operate largely unconsciously, and can automatically influence the ways in which we see and treat others, even when we are determined to be fair and objective.

[Image credit: Geoff Kaufman]

SLIDE 53

How to Measure Implicit Bias?

SLIDE 54

Implicit Association Test - Greenwald et al. 1998

Category   Items
Good       Spectacular, Appealing, Love, Triumph, Joyous, Fabulous, Excitement, Excellent
Bad        Angry, Disgust, Rotten, Selfish, Abuse, Dirty, Hatred, Ugly

Target categories: African Americans, European Americans

SLIDE 55

How Does Implicit Bias Manifest?

SLIDE 56

Terrell et al. (2016)

SLIDE 57

Implicit Bias Manifests in Subtle Ways in the Form of Micro-inequities

Micro-inequities: ephemeral, covert, unintentional, frequently unrecognized events that reinforce power dynamics or perceptions of “difference”

slights, exclusions, slips of the tongue, nonverbal signals, unchecked assumptions, unequal expectations, etc.

SLIDE 58

And in Overt Toxic Comments

Online Disinhibition Effect (Suler ’04): benign disinhibition and toxic disinhibition

➔ Dissociative anonymity (“You don’t know me”)
➔ Invisibility (“You can’t see me”)
➔ Asynchronicity (“See you later”)
➔ Solipsistic introjection (“It’s all in my head”)
➔ Dissociative imagination (“It’s just a game”)
➔ Minimization of status and authority (“Your rules don’t apply here”)

SLIDE 59

Consequences

SLIDE 60

Stereotype Threat

Fear of confirming a negative stereotype about one’s group (Steele & Aronson, 1995)

  • Often leads to anxiety and negative feelings that can use up mental resources and undermine one’s confidence and ability to succeed
  • Exacerbated by repeated experiences with microaggressions reducing one’s sense of belonging or self-belief in a particular domain
    ○ e.g., women in STEM: Beasley & Fischer ’12; Shapiro & Williams ’12; Cimpian & Leslie ’17

SLIDE 61

Stereotype Threat

  • Groups: Blacks and Whites
  • Threat: Intellectual ability
  • J. Aronson, C.M. Steele, M.F. Salinas, M.J. Lustina, Readings About the Social Animal, 8th edition, ed. E. Aronson

SLIDE 62

[Slide credit: Geoff Kaufman]

SLIDE 63

Ways to Mitigate Biases

SLIDE 64

Implicit Bias is Malleable! Devine’s (1999) Dissociation Model

System 1: Stereotype Activation

  • Stereotypes are firmly implanted (and reinforced) by learning and exposure, cognitive processes of categorization, etc.
  • Thus, stereotypes are automatically activated whenever a cue is present, regardless of personal prejudice level; a “mental habit”

System 2: Preventing Stereotype Application

  • Once a stereotype is activated, people can use System 2 processes to overcome the influence of the stereotype
  • Must first be aware of the activation of stereotypes, then take steps to mitigate their impact or weaken their power…

SLIDE 65

Techniques To Identify And Mitigate Implicit Biases

  • Implicit Association Test (Greenwald et al. 1998)
  • Counter-stereotypic training (Blair et al. 2001; Kang et al. 2012; Kawakami et al. 2000; Wittenbrink et al. 2001)
    ○ Deliberately and repeatedly negating stereotypes or associating individuals with counter-stereotypic traits or attributes
  • Mindset training (Beattie et al. 2013; Sassenberg & Moskowitz 2005; Stone & Moskowitz 2011)
    ○ Cultivating a deliberative mindset, reminding oneself of egalitarian goals, reinforcing curiosity and constructive uncertainty about others
  • Meditation
  • etc.
SLIDE 66

Back to AI

SLIDE 67

Back to AI

AI is only System 1

SLIDE 68

  • Conversational agents
  • Personal assistants
  • Search engines
  • Translation engines
  • Medical research assistants


SLIDE 69

Online data is riddled with SOCIAL STEREOTYPES → BIASED AI

SLIDE 70

Racial Stereotypes

  • June 2016: web search query “three black teenagers”
SLIDE 71

Gender/Race/Age Stereotypes

  • June 2017: image search query “Doctor”
SLIDE 72

Gender/Race/Age Stereotypes

  • June 2017: image search query “Nurse”
SLIDE 73

Gender/Race/Age Stereotypes

  • June 2017: image search query “Homemaker”
SLIDE 74

Gender/Race/Age Stereotypes

  • June 2017: image search query “CEO”
SLIDE 75

Gender/Race/Age Stereotypes

  • June 2017: image search query “Professor”
SLIDE 76

Stereotypes in User-Generated Content

SLIDE 77

Consequence: models are biased


SLIDE 78

Biased AI Technologies

SLIDE 79

Is my NLP model capturing social stereotypes?

SLIDE 80

Bias in NLP Models

1. Bolukbasi, T., Chang, K.-W., Zou, J., Saligrama, V., Kalai, A. (2016) Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. NIPS
2. Caliskan, A., Bryson, J. J., and Narayanan, A. (2017) Semantics derived automatically from language corpora contain human-like biases. Science
3. Garg, N., Schiebinger, L., Jurafsky, D., Zou, J. (2018) Word embeddings quantify 100 years of gender and ethnic stereotypes. PNAS

Slide from SRNLP Tutorial at NAACL 2018

SLIDE 81

Bias in NLP Models

1. Bolukbasi et al. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. NIPS (2016)
2. Caliskan et al. Semantics derived automatically from language corpora contain human-like biases. Science (2017)
3. Garg et al. Word embeddings quantify 100 years of gender and ethnic stereotypes. PNAS (2018)
4. Zhao et al. Men also like shopping: Reducing gender bias amplification using corpus-level constraints. arXiv (2017)
5. Zhao et al. Gender bias in coreference resolution: Evaluation and debiasing methods. arXiv (2018)
6. Zhang et al. Mitigating unwanted biases with adversarial learning. AIES (2018)
7. Webster et al. Mind the GAP: A Balanced Corpus of Gendered Ambiguous Pronouns. TACL (2018)
8. Kiritchenko and Mohammad. Examining gender and race bias in two hundred sentiment analysis systems. arXiv (2018)
9. Díaz et al. Addressing age-related bias in sentiment analysis. CHI (2018)
10. Dixon et al. Measuring and mitigating unintended bias in text classification. AIES (2018)
11. Prates et al. Assessing gender bias in machine translation: a case study with Google Translate. Neural Computing and Applications (2018)
12. Park et al. Reducing gender bias in abusive language detection. arXiv (2018)
13. Zhao et al. Learning gender-neutral word embeddings. arXiv (2018)
14. Anne Hendricks et al. Women also snowboard: Overcoming bias in captioning models. ECCV (2018)
15. Elazar and Goldberg. Adversarial removal of demographic attributes from text data. arXiv (2018)
16. Hu and Strout. Exploring Stereotypes and Biased Data with the Crowd. arXiv (2018)
17. Swinger, De-Arteaga, et al. What are the biases in my word embedding? AIES (2019)
18. De-Arteaga et al. Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting. FAT* (2019)
19. Gonen and Goldberg. Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them. NAACL (2019)
20. Manzini et al. Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings. NAACL (2019)
21. …

SLIDE 82

Biases in NLP Representations

  • Bolukbasi et al. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. NIPS (2016)
  • Caliskan et al. Semantics derived automatically from language corpora contain human-like biases. Science (2017)
  • Garg et al. Word embeddings quantify 100 years of gender and ethnic stereotypes. PNAS (2018)
  • Swinger, De-Arteaga, et al. What are the biases in my word embedding? AIES (2019)
  • Manzini et al. Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings. NAACL (2019)
  • ...
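The kind of embedding bias these papers document can be probed, and partially removed, with a few lines of vector arithmetic. A simplified sketch of the "neutralize" step from Bolukbasi et al. (2016), using toy 3-d vectors rather than real learned embeddings (real systems work with ~300-d vectors and also equalize definitional word pairs):

```python
import numpy as np

def debias(word_vec, bias_dir):
    """Neutralize: remove the component of a word vector that lies
    along the bias direction (one step of Bolukbasi et al. 2016)."""
    bias_dir = bias_dir / np.linalg.norm(bias_dir)
    return word_vec - np.dot(word_vec, bias_dir) * bias_dir

# Toy vectors for illustration only.
he = np.array([1.0, 0.2, 0.0])
she = np.array([-1.0, 0.2, 0.0])
bias_dir = he - she                       # crude "gender direction"
programmer = np.array([0.8, 0.5, 0.3])    # hypothetical biased vector
neutral = debias(programmer, bias_dir)
print(np.dot(neutral, bias_dir / np.linalg.norm(bias_dir)))  # 0.0
```

Note that Gonen and Goldberg (2019), cited in the slide above, show this projection removes the bias direction but not the clustering structure that encodes bias.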
SLIDE 83

Biases in NLP Classifiers/Taggers

  • Gender bias in coreference resolution
    ○ Zhao et al. Gender bias in coreference resolution: Evaluation and debiasing methods. arXiv (2018)
    ○ Webster et al. Mind the GAP: A Balanced Corpus of Gendered Ambiguous Pronouns. TACL (2018)
  • Gender, race, and age bias in sentiment analysis
    ○ Kiritchenko and Mohammad. Examining gender and race bias in two hundred sentiment analysis systems. arXiv (2018)
    ○ Díaz et al. Addressing age-related bias in sentiment analysis. CHI Conference on Human Factors in Computing Systems (2018)
  • LGBTQ identity-term bias in toxicity classification
    ○ Dixon et al. Measuring and mitigating unintended bias in text classification. AIES (2018)
  • Gender bias in occupation classification
    ○ De-Arteaga et al. Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting. FAT* (2019)
  • Gender bias in machine translation
    ○ Prates et al. Assessing gender bias in machine translation: a case study with Google Translate. Neural Computing and Applications (2018)

SLIDE 84

Bias Amplification

  • Zhao et al. Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints. EMNLP (2017)
  • De-Arteaga et al. Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting. FAT* (2019)

SLIDE 85

Why Is It Especially Relevant Now?

  • Data: the exponential growth of user-generated content
  • Tools: machine learning tools have become accessible to everyone
SLIDE 86

Discussion

  • Applications that are built from online data, generated by people, also learn real-world stereotypes
  • Should our ML models represent the “real world”?
  • Or should we artificially skew the data distribution?
  • If we modify our data, what are the guiding principles on what our models should or shouldn’t learn?
SLIDE 87

Considerations for Debiasing Data and Models

  • Ethical considerations
    ○ Preventing discrimination in AI-based technologies
      ■ in consumer products and services
      ■ in diagnostics and medical systems
      ■ in parole decisions
      ■ in mortgage lending, credit scores, and other financial decisions
      ■ in educational applications
      ■ in search → access to information and knowledge
  • Practical considerations
    ○ Improving performance particularly where our model’s accuracy is lower

SLIDE 88

http://demo.clab.cs.cmu.edu/ethical_nlp/

SLIDE 89

Thank You!