Paraphrase generation:
adversarial examples / data augmentation
CS 685, Fall 2020
Advanced Natural Language Processing
Mohit Iyyer College of Information and Computer Sciences
University of Massachusetts Amherst
[figure, credit: openai: an image correctly classified as "panda" (57.7% confidence) plus an imperceptible adversarial perturbation is misclassified as "gibbon" (99.3% confidence)]
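the panda/gibbon example comes from the fast gradient sign method (Goodfellow et al., 2015); here is a minimal PyTorch sketch of that attack, assuming a differentiable image classifier `model`:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.007):
    """Fast gradient sign method: take one step of size epsilon in the
    direction that increases the loss, producing a visually imperceptible
    perturbation that can flip the model's prediction."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()   # one signed-gradient step
    return x_adv.clamp(0, 1).detach()     # keep pixels in the valid range
```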
The movie was very bad. → ???
what would an adversarial perturbation of this sentence look like?
demo from allennlp.org, example from Yoav Goldberg
the build-it-break-it workshop at EMNLP 2017 challenged humans to “break” existing systems by coming up with linguistically-adversarial examples
“iid development data is unlikely to exhibit all the linguistic phenomena that we might be interested in testing”
Ettinger et al., 2017
“NLP systems are quite brittle in the face of infrequent linguistic phenomena, a characteristic which stands in stark contrast to human language users.”
ex from Ettinger et al., 2017
input sentence → model prediction
"Exactly the kind of thrill one hopes for every time the lights go down" → positive
"Exactly the kind of unexpected delight one hopes for every time the lights go down" → negative
such examples can be created by word replacement, using a thesaurus, WordNet, or word embedding similarity (e.g., Jia & Liang, EMNLP 2017)
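a rough sketch of the WordNet-based replacement strategy, using NLTK (real attacks additionally filter candidates by part of speech, embedding similarity, and label preservation):

```python
from nltk.corpus import wordnet  # requires nltk.download('wordnet')

def synonym_swap_candidates(tokens):
    """Generate candidate adversarial inputs by replacing one word at a
    time with a WordNet synonym of any of its senses."""
    for i, tok in enumerate(tokens):
        for synset in wordnet.synsets(tok):
            for lemma in synset.lemmas():
                cand = lemma.name().replace("_", " ")
                if cand.lower() != tok.lower():
                    yield " ".join(tokens[:i] + [cand] + tokens[i + 1:])

for cand in synonym_swap_candidates("the movie was very bad".split()):
    print(cand)  # candidates include sense mismatches that change meaning
```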
input sentence → model prediction
"Doesn’t get any more meaty and muscular than this American drama." → positive
"American drama doesn’t get any more meaty and muscular than this." → negative
how do we automatically create such examples? can we use a paraphrase generation system?
ex from Ettinger et al., 2017
a good paraphrase should retain the meaning of the original sentence,
and there are many syntactically different ways to paraphrase the same input:
[slide: several example paraphrases of the input "Usually you require inventory only when you plan to sell your assets.", ending "…you were planning to sell the assets.", "…usually require inventory.", and "…your assets."]
desiderata: high syntactic diversity, minimal lexical substitution, preserved input semantics, grammaticality
prior approaches to paraphrase generation:
syntactic transformations (e.g., McKeown, 1983; Carl et al., 2005)
pivoting through bilingual translation data (e.g., Bannard & Callison-Burch, 2005; Quirk et al., 2004)
controlled generation with conditional encoder/decoder architectures (e.g., Ficler & Goldberg, 2017; Shen et al., 2017)
this work: create labeled data through neural backtranslation, then train on that labeled data to produce a paraphrase given the input sentence and a target syntactic form
input: are you sure that's not a topic for you to discuss with your priest ?
translate to Czech: není to více téma pro tvého kněze?
translate back to English: isn't that more a topic for your priest ?

backtranslate the CzEng parallel corpus (Bojar et al., 2016) using a state-of-the-art NMT system, which yields ~50 million paraphrase pairs
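the paper backtranslates CzEng with its own NMT system; as an illustrative sketch, the same round trip can be run with off-the-shelf MarianMT checkpoints from Hugging Face:

```python
from transformers import MarianMTModel, MarianTokenizer

def load(name):
    return MarianTokenizer.from_pretrained(name), MarianMTModel.from_pretrained(name)

en_cs_tok, en_cs = load("Helsinki-NLP/opus-mt-en-cs")
cs_en_tok, cs_en = load("Helsinki-NLP/opus-mt-cs-en")

def backtranslate(sentence):
    """English -> Czech -> English; the round trip yields a paraphrase."""
    cs_ids = en_cs.generate(**en_cs_tok(sentence, return_tensors="pt"))
    czech = en_cs_tok.batch_decode(cs_ids, skip_special_tokens=True)[0]
    en_ids = cs_en.generate(**cs_en_tok(czech, return_tensors="pt"))
    return cs_en_tok.batch_decode(en_ids, skip_special_tokens=True)[0]

print(backtranslate("are you sure that's not a topic for you to discuss with your priest?"))
```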
through neural backtranslation, we can generate uncontrolled paraphrases. how can we achieve syntactic control?
idea: parse both sentences to expose the syntactic differences between the paraphrase pairs
s1: are you sure that's not a topic for you to discuss with your priest ?
s2: isn't that more a topic for your priest ?
p1: ( ROOT ( SBARQ ( SQ ( VBP ) ( NP ( PRP ) ) ( ADJP ( JJ ) ( SBAR ( S ( NP ( DT ) ) ( VP ( VBZ ) ( RB ) ( NP ( DT ) ( NN ) ) ( SBAR ( IN ) ( S ( NP ( PRP ) ) ( VP ( TO ) ( VP ( VB ) ( PRT ( RP ) ) ( PP ( IN ) ( NP ( PRP$ ) ( NN ) ) ) ) ) ) ) ) ) ) ) ) ( . ) ) )
p2: ( ROOT ( S ( VP ( VBZ ) ( RB ) ( SBARQ ( IN ) ( NP ( NP ( JJR ) ( NP ( NP ( DT ) ( NN ) ) ( PP ( IN ) ( NP ( PRP$ ) ( NN ) ) ) ) ) ) ) ) ( . ) ) )

training setup: given the input sentence s1 and the target parse p2, generate the target sentence s2
paraphrase generator:
input sentence s1: The man is standing in the water at the base of a waterfall
+ target parse p2: ( ROOT ( S ( NP ( NP ( DT ) ( NN ) ) ( , ) ( PP ( IN ) ( NP ( NP ( DT ) ( NN ) ) ( PP ( IN ) …
→ target sentence s2: The man, at the base of the waterfall, is standing in the water
input sentence s1 (The man is standing in the water …) → encoder (e.g., BERT)
target parse p2 (( ROOT ( S ( NP ( NP ( DT ) ( NN ) ) ( , ) ( PP ( IN ) … ) → parse encoder (fine-tuned BERT?)
decoder (e.g., Transformer) with attention on the parse encoder and a copy mechanism
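if you just want to experiment with parse-conditioned generation, one common simplification (an assumption for illustration, not SCPN's actual two-encoder architecture) is to serialize the target parse into the source sequence of an ordinary seq2seq model:

```python
def serialize_scpn_input(s1, target_parse, sep=" <sep> "):
    """Simplified parse conditioning: prepend the linearized target parse
    to the input sentence and train any encoder/decoder model on
    (source -> s2) pairs. SCPN itself uses a separate parse encoder with
    attention and a copy mechanism, as described above."""
    return target_parse + sep + s1

source = serialize_scpn_input(
    "The man is standing in the water at the base of a waterfall",
    "( S ( NP ) ( , ) ( PP ) ( , ) ( VP ) )",
)
# train: encoder/decoder maps `source` -> "The man, at the base of the waterfall, ..."
```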
we use the top two levels of the linearized parse tree as a parse template:
She drove home. → (S (NP (PRP)) (VP (VBD) (NP (NN))) (.)) → template: S → NP VP .
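a small sketch of template extraction with nltk, assuming the parse is a well-formed bracketed string:

```python
from nltk import Tree

def parse_template(parse_str):
    """Keep only the top two levels of a linearized parse:
    '(S (NP (PRP)) (VP (VBD) (NP (NN))) (.))' -> 'S -> NP VP .'"""
    tree = Tree.fromstring(parse_str)
    return tree.label() + " -> " + " ".join(
        child.label() if isinstance(child, Tree) else child
        for child in tree
    )

print(parse_template("(S (NP (PRP)) (VP (VBD) (NP (NN))) (.))"))  # S -> NP VP .
```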
human judgments of paraphrase quality: 0 = no paraphrase, 1 = ungrammatical paraphrase, 2 = grammatical paraphrase
no significant quality loss despite adding syntactic control
a paraphrase x* “breaks” a model if the prediction y_x on the original example x is correct, but the prediction y_x* for at least one paraphrase x* is incorrect.
a break is valid only if x* actually has the same label as x, i.e., the adversarial examples actually preserve the original label
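in code, the "% dev broken" statistic reported below looks roughly like this, with `predict` and `paraphrases` as hypothetical stand-ins for the classifier and the paraphrase generator:

```python
def fraction_broken(predict, paraphrases, dev_set):
    """Fraction of correctly-classified dev examples for which at least
    one paraphrase flips the model's prediction."""
    broken = correct = 0
    for x, y in dev_set:
        if predict(x) != y:
            continue  # only correctly-classified originals can be "broken"
        correct += 1
        if any(predict(x_star) != y for x_star in paraphrases(x)):
            broken += 1
    return broken / max(correct, 1)
```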
(validity is re-annotated by crowd workers; for entailment: 0 = contradiction, 1 = neutral, 2 = entailment)
sentiment:
input: I’d have to say the star and director are the big problems here → negative
SCPN paraphrase (template: S PP PRN NP VP): By the way, you know, the star and director are the big problems → positive

entailment (hypothesis: A man is standing in the water at the base of a waterfall):
premise: The man is standing in the water at the base of a waterfall → entailment
SCPN paraphrase of the premise (template: S NP , PP , VP): The man, at the base of the waterfall, is standing in the water → neutral
sentiment analysis:
Model     validity %   % dev broken
SCPN      77.1         41.8
NMT-BT    68.1         20.2

textual entailment:
Model     validity %   % dev broken
SCPN      77.7         33.8
NMT-BT    81.0         20.4
when we augment the training data with SCPN paraphrases, we are able to decrease the proportion of “broken” development examples without decreasing performance on original test data

sentiment analysis:
          no augmentation           with augmentation
Model     test acc   % dev broken   test acc   % dev broken
SCPN      83.1       41.8           83.0       31.4
NMT-BT    83.1       20.2           82.3       20.0

textual entailment:
          no augmentation           with augmentation
Model     test acc   % dev broken   test acc   % dev broken
SCPN      82.1       33.8           82.7       19.8
NMT-BT    82.1       20.4           82.0       11.2
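the augmentation itself is just label-preserving dataset expansion; a sketch with the same hypothetical helpers as before:

```python
def augment_with_paraphrases(train_set, paraphrases, per_example=1):
    """Append paraphrases of each training example, copying the original
    label (valid only to the extent the paraphrases preserve meaning)."""
    augmented = list(train_set)
    for x, y in train_set:
        for x_star in list(paraphrases(x))[:per_example]:
            augmented.append((x_star, y))
    return augmented
```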
Template → Paraphrase
GOLD → you seem to be an excellent burglar when the time comes.
(S (SBAR) (,) (NP) (VP) ) → when the time comes, you’ll be a great thief.
(S (‘’) (UCP) (’’) (NP) (VP) ) → “you seem to be a great burglar, when the time comes”, you said.
(SQ (MD) (SBARQ) ) → can i get a good burglar when the time comes?
(S (NP) (IN) (NP) (NP) (VP) ) → look at the time the thief comes.
Template → Paraphrase
GOLD → with the help of captain picard, the borg will be prepared for everything.
(SBARQ (ADVP) (,) (S) (,) (SQ) ) → now, the borg will be prepared by picard, will it?
(S (NP) (ADVP) (VP) ) → the borg here will be prepared for everything.
(S (S) (,) (CC) (S) (:) (FRAG) ) → with the help of captain picard, the borg will be prepared, and the borg will be prepared for everything... for everything.
(FRAG (INTJ) (,) (S) (,) (NP) ) → … everything.
SCPN adversarial sentiment examples (template, original → paraphrase):

(S (ADVP) (NP) (VP) )
original: moody, heartbreaking, and filmed in a natural, unforced style that makes its characters seem entirely convincing even when its script is not.
paraphrase: so he’s filmed in a natural, unforced style that makes his characters seem convincing when his script is not.

(S (PP) (,) (NP) (VP) )
original: there is no pleasure in watching a child suffer.
paraphrase: in watching the child suffer, there is no pleasure.

(S (S) (,) (CC) (S) )
original: the characters are interesting and often very creatively constructed from figure to backstory.
paraphrase: the characters are interesting, and they are often built from memory to backstory.
NMT-BT adversarial sentiment examples (original → paraphrase):

original: every nanosecond of the new guy reminds you that you could be doing something else far more pleasurable.
paraphrase: each nanosecond from the new guy reminds you that you could do something else much more enjoyable.

original: harris commands the screen, using his frailty to suggest the ravages of a life of corruption and ruthlessness.
paraphrase: harris commands the screen, using his weakness to suggest the ravages of life of corruption and recklessness.
Style transfer: given an input sentence, modify its stylistic properties while preserving its semantics
“Style” is impossible to precisely define, and in some fields (e.g., sociolinguistics) it’s considered inseparable from semantics. Here, we’ll consider “style” to loosely represent lexical and syntactic choice.
"To be, or not to be: that is the question: Whether ’tis nobler in the mind to suffer The slings and arrows of outrageous fortune, Or to take arms against a sea of troubles, And by opposing end them. To die: to sleep...” Are yall okay? Like do you need my help?? I dont wanna talk to him abt that Bron haters in shambles they want him to retire so bad lmfaooo Shakespeare Twitter
how do we build style transfer generation systems without parallel data?
1. collect unlabeled corpora in each style (e.g., crawl Twitter, Project Gutenberg, etc)
2. create pseudo-parallel datasets by leveraging neural backtranslation
3. perform the task of inverse paraphrasing for each style
Training time

Step 1: diverse paraphrasing. Use an uncontrolled paraphraser trained on backtranslated data (fine-tuned LM #1) to strip away stylistic markers:
Why, uncle, ’tis a shame → it’s a shame, uncle
No lie… I would jump in → I’d jump in there, no doubt

Step 2: inverse paraphrasing (Shakespeare, Twitter). Train an inverse paraphraser (fine-tuned LM #2) for each style to reconstruct the original stylized sentence from its paraphrase:
it’s a shame, uncle → Why, uncle, ’tis a shame
I’d jump in there, no doubt → No lie… I would jump in
Test time

O, wilt thou leave me so unsatisfied? (Shakespeare input)
→ Step 1, diverse paraphrasing: Oh, you’re gonna leave me unsatisfied, right?
→ Step 2, Twitter inverse paraphraser: Ooh yall will leave me unhappy lol
At test-time, switch out a different style’s inverse paraphraser to perform style transfer
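putting the two steps together, a conceptual sketch of the pipeline (all helper names are placeholders, not the paper's code):

```python
def train_style_transfer(style_corpora, paraphrase, finetune_lm):
    """Train one inverse paraphraser per style. `paraphrase` is the diverse
    paraphraser (fine-tuned LM #1); `finetune_lm` fine-tunes LM #2 on
    (paraphrase -> original) pairs. Both are hypothetical stand-ins."""
    inverse = {}
    for style, sentences in style_corpora.items():
        # Step 1: strip style via diverse paraphrasing (pseudo-parallel data)
        pairs = [(paraphrase(s), s) for s in sentences]
        # Step 2: learn to reconstruct the stylized original
        inverse[style] = finetune_lm(pairs)
    return inverse

def style_transfer(sentence, paraphrase, inverse, target_style):
    # at test time, paraphrase the input, then apply the TARGET style's
    # inverse paraphraser (not the source style's)
    return inverse[target_style](paraphrase(sentence))
```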