

SLIDE 1


Social Media & Text Analysis

Lecture 7: Paraphrase Identification and Linear Regression

CSE 5539-0010, Ohio State University
Instructor: Alan Ritter
Website: socialmedia-class.org

SLIDE 2

(Recap)

what is a Paraphrase?

“sentences or phrases that convey approximately the same meaning using different words” — (Bhagat & Hovy, 2012)

SLIDE 3

(Recap)

what is a Paraphrase?

wealthy ↔ rich    (word)

“sentences or phrases that convey approximately the same meaning using different words” — (Bhagat & Hovy, 2012)

SLIDE 4

(Recap)

what is a Paraphrase?

wealthy ↔ rich    (word)

the king’s speech ↔ His Majesty’s address    (phrase)

“sentences or phrases that convey approximately the same meaning using different words” — (Bhagat & Hovy, 2012)

SLIDE 5

(Recap)

what is a Paraphrase?

wealthy ↔ rich    (word)

the king’s speech ↔ His Majesty’s address    (phrase)

… the forced resignation of the CEO of Boeing, Harry Stonecipher, for …
↔ … after Boeing Co. Chief Executive Harry Stonecipher was ousted from …    (sentence)

“sentences or phrases that convey approximately the same meaning using different words” — (Bhagat & Hovy, 2012)

SLIDE 6

The Ideal

SLIDE 7

(Recap)

Paraphrase Research

[timeline of paraphrase research by resource and year: WordNet (80s), Novels ('01), Web ('01), News ('04), Bi-Text ('05, '13), Video ('11), Style ('12*), Twitter ('13*, '14*), Simple ('15*, '16*); contributors include Xu, Ritter, Callison-Burch, Dolan, Ji, Grishman, Cherry, Napoles]

SLIDE 8

Distributional Similarity

Lin and Pantel (2001) operationalize the Distributional Hypothesis using dependency relationships to define similar environments. Duty and responsibility share a similar set of dependency contexts in large volumes of text:

  • modified by adjectives: additional, administrative, assigned, assumed, collective, congressional, constitutional ...
  • objects of verbs: assert, assign, assume, attend to, avoid, become, breach ...

Dekang Lin and Patrick Pantel. “DIRT - Discovery of Inference Rules from Text” In KDD (2001)
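
As a rough sketch of this idea (using hypothetical toy context counts, not Lin and Pantel’s actual data), two words can be compared by the cosine similarity of their dependency-context count vectors:

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical counts of dependency contexts (relation:word) for each noun
duty = Counter({"amod:additional": 3, "amod:administrative": 2,
                "dobj-of:assume": 5, "dobj-of:assign": 2})
responsibility = Counter({"amod:additional": 2, "amod:collective": 1,
                          "dobj-of:assume": 4, "dobj-of:avoid": 1})

print(cosine(duty, responsibility))  # high value -> similar environments
```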

SLIDE 9

Bilingual Pivoting

[word-alignment figure: a German sentence containing “fünf Landwirte … festgenommen” aligns with “… 5 farmers were thrown into jail in Ireland …”; elsewhere in the bitext “festgenommen” also aligns with “imprisoned” (and “gefoltert” with “tortured”); pivoting through the shared German word links the English paraphrases “thrown into jail” ↔ “imprisoned”]

Source: Chris Callison-Burch
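
The pivoting intuition can be made probabilistic (Bannard & Callison-Burch, 2005): p(e2 | e1) ≈ Σ_f p(e2 | f) · p(f | e1), summing over foreign phrases f that e1 translates to. A minimal sketch with hypothetical translation probabilities, not real alignment counts:

```python
# Hypothetical phrase translation tables estimated from word-aligned bitext
p_f_given_e = {"thrown into jail": {"festgenommen": 0.4}}
p_e_given_f = {"festgenommen": {"imprisoned": 0.5, "arrested": 0.3,
                                "thrown into jail": 0.2}}

def paraphrase_prob(e1: str, e2: str) -> float:
    """p(e2 | e1) ~ sum over pivot phrases f of p(e2 | f) * p(f | e1)."""
    return sum(p_e_given_f.get(f, {}).get(e2, 0.0) * p_f
               for f, p_f in p_f_given_e.get(e1, {}).items())

print(paraphrase_prob("thrown into jail", "imprisoned"))  # 0.4 * 0.5 = 0.2
```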

SLIDE 14

Key Limitations of PPDB?

SLIDE 15

Key Limitations of PPDB?

bug:

  • insect, beetle, pest, mosquito, fly
  • squealer, snitch, rat, mole
  • microphone, tracker, mic, wire, earpiece, cookie
  • glitch, error, malfunction, fault, failure
  • bother, annoy, pester
  • microbe, virus, bacterium, germ, parasite

Source: Chris Callison-Burch

word sense

SLIDE 16

Another Key Limitation

[timeline of paraphrase corpora (as on slide 7)]

  • only paraphrases, no non-paraphrases

SLIDE 17

Paraphrase Identification

  • obtain sentential paraphrases automatically

“Mancini has been sacked by Manchester City”
“Mancini gets the boot from Man City”
→ Yes!

“WORLD OF JENKS IS ON AT 11”
“World of Jenks is my favorite show on tv”
→ No!

Wei Xu, Alan Ritter, Chris Callison-Burch, Bill Dolan, Yangfeng Ji. “Extracting Lexically Divergent Paraphrases from Twitter” In TACL (2014)

(meaningful) non-paraphrases are needed to train classifiers!

SLIDE 18

Also Non-Paraphrases

[timeline of paraphrase corpora (as on slide 7), with Twitter '13*, '14*, '17*]

(meaningful) non-paraphrases are needed to train classifiers!

SLIDE 19

News Paraphrase Corpus

Microsoft Research Paraphrase Corpus

(Dolan, Quirk and Brockett, 2004; Dolan and Brockett, 2005; Brockett and Dolan, 2005)

also contains some non-paraphrases

SLIDE 20

Twitter Paraphrase Corpus

Wei Xu, Alan Ritter, Chris Callison-Burch, Bill Dolan, Yangfeng Ji. “Extracting Lexically Divergent Paraphrases from Twitter” In TACL (2014)

also contains a lot of non-paraphrases

SLIDE 21

Paraphrase Identification:

A Binary Classification Problem

  • Input:
  • a sentence pair x
  • a fixed set of binary classes Y = {0, 1}
  • Output:
  • a predicted class y ∈ Y (y = 0 or y = 1)
SLIDE 22

Paraphrase Identification:

A Binary Classification Problem

  • Input:
  • a sentence pair x
  • a fixed set of binary classes Y = {0, 1}
  • Output:
  • a predicted class y ∈ Y (y = 0: negative, non-paraphrases; y = 1: positive, paraphrases)
SLIDE 25

Classification Method:

Supervised Machine Learning

  • Input:
  • a sentence pair x (represented by features)
  • a fixed set of binary classes Y = {0, 1}
  • a training set of m hand-labeled sentence pairs (x(1), y(1)), … , (x(m), y(m))
  • Output:
  • a learned classifier 𝜹: x → y ∈ Y (y = 0 or y = 1)
SLIDE 27

(Recap) Classification Method:

Supervised Machine Learning

  • Naïve Bayes
  • Logistic Regression
  • Support Vector Machines (SVM)
SLIDE 28

(Recap)

Naïve Bayes

  • Cons: features t1, …, tn are assumed independent given the class y:

P(t1,t2,...,tn | y) = P(t1 | y)⋅P(t2 | y)⋅...⋅P(tn | y)

  • This will cause problems:
  • correlated features ➞ double-counted evidence
  • while parameters are estimated independently
  • hurts the classifier’s accuracy
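
To see the double-counting concretely, here is a small sketch with hypothetical probabilities: adding a second feature that is a perfect copy of the first makes the posterior more extreme, even though no new evidence arrived.

```python
def nb_posterior(prior_pos, likelihoods_pos, likelihoods_neg):
    """P(y=1 | features) under the Naive Bayes independence assumption."""
    pos, neg = prior_pos, 1.0 - prior_pos
    for lp, ln in zip(likelihoods_pos, likelihoods_neg):
        pos *= lp   # multiply in P(t_i | y=1)
        neg *= ln   # multiply in P(t_i | y=0)
    return pos / (pos + neg)

# one feature t1 with P(t1|y=1)=0.8, P(t1|y=0)=0.2 (hypothetical values)
print(nb_posterior(0.5, [0.8], [0.2]))            # 0.8
# t2 is a perfect copy of t1: its evidence is counted a second time
print(nb_posterior(0.5, [0.8, 0.8], [0.2, 0.2]))  # ~0.94, overconfident
```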

SLIDE 29

Classification Method:

Supervised Machine Learning

  • Naïve Bayes
  • Logistic Regression
  • Support Vector Machines (SVM)
SLIDE 30

Logistic Regression

  • One of the most useful supervised machine learning algorithms for classification!
  • Generally high performance for a lot of problems.
  • Much more robust than Naïve Bayes (better performance on various datasets).

SLIDE 31

Let’s start with something simpler!

Before Logistic Regression

SLIDE 32

Paraphrase Identification:

Simplified Features

  • We use only one feature:
  • the number of words that the two sentences share in common
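
A minimal sketch of this feature, assuming simple lowercasing and whitespace tokenization:

```python
def words_in_common(s1: str, s2: str) -> int:
    """Count distinct words appearing in both sentences."""
    return len(set(s1.lower().split()) & set(s2.lower().split()))

print(words_in_common("Mancini has been sacked by Manchester City",
                      "Mancini gets the boot from Man City"))  # 2: mancini, city
```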

SLIDE 33

A closely related problem to Paraphrase Identification:

Semantic Textual Similarity

  • How similar (close in meaning) are two sentences?

5: completely equivalent in meaning
4: mostly equivalent, but some unimportant details differ
3: roughly equivalent, some important information differs/missing
2: not equivalent, but share some details
1: not equivalent, but are on the same topic
0: completely dissimilar

SLIDE 34

A Simpler Model:

Linear Regression

[plot: #words in common (feature, x: 5–20) vs. Sentence Similarity rated by human (y: 1–5)]

SLIDE 35

A Simpler Model:

Linear Regression

  • also supervised learning (learn from annotated data)
  • but for Regression: predict real-valued output (Classification: predict discrete-valued output)

[plot: #words in common (feature, 5–20) vs. Sentence Similarity (1–5)]

SLIDE 36

A Simpler Model:

Linear Regression

  • also supervised learning (learn from labeled data)
  • but for Regression: predict real-valued output (Classification: predict discrete-valued output)
  • threshold ➞ Classification: scores above the threshold are labeled paraphrase, scores below non-paraphrase

[plot: fitted line over #words in common (feature, 5–20) vs. Sentence Similarity (1–5), with the threshold splitting paraphrase from non-paraphrase]
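
A minimal sketch of the thresholding step; the cutoff value 3.0 is an assumption for illustration, not a tuned choice:

```python
def classify(predicted_similarity: float, threshold: float = 3.0) -> int:
    """Map a real-valued similarity score to a binary paraphrase label."""
    return 1 if predicted_similarity >= threshold else 0

print(classify(4.2))  # 1 -> paraphrase
print(classify(1.5))  # 0 -> non-paraphrase
```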

SLIDE 39

Training Set

  • m hand-labeled sentence pairs (x(1), y(1)), … , (x(m), y(m))
  • x’s: “input” variable / features
  • y’s: “output”/“target” variable

[table: example training pairs, columns #words in common (x) and Sentence Similarity (y)]

SLIDE 40

Source: NLTK Book

(Recap)

Supervised Machine Learning

[diagram (NLTK book): training set with features and labels → machine learning algorithm → classifier model]

SLIDE 41

Supervised Machine Learning

[diagram: training set → learning algorithm → hypothesis h (the learned function); input x (#words in common) → h → estimated output y (sentence similarity)]


SLIDE 44

Linear Regression:

Model Representation

[diagram: training set → learning algorithm → hypothesis h; input x → h → estimated y]

  • How to represent h ?

hθ(x) = θ0 + θ1x     (Linear Regression w/ one variable)

SLIDE 45

Linear Regression w/ one variable:

Model Representation

hθ(x) = θ0 + θ1x

  • m hand-labeled sentence pairs (x(1), y(1)), … , (x(m), y(m))
  • θ’s: parameters

[table: example training pairs, columns #words in common (x) and Sentence Similarity (y)]

Source: many following slides are adapted from Andrew Ng
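
As code, the hypothesis is a one-line function (the parameter values below are toy assumptions):

```python
def h(theta0: float, theta1: float, x: float) -> float:
    """h_theta(x) = theta0 + theta1 * x"""
    return theta0 + theta1 * x

print(h(0.5, 0.2, 10))  # predicted similarity when 10 words are in common
```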

SLIDE 47

Linear Regression w/ one variable:

Model Representation

[figure: example lines hθ(x) = θ0 + θ1x plotted for different settings of θ0 and θ1]

SLIDE 48

Linear Regression w/ one variable:

Cost Function

  • Idea: choose θ0, θ1 so that hθ(x) is close to y for training examples (x, y)

[plot: #words in common (feature, 5–20) vs. human-rated sentence similarity (1–5)]

SLIDE 50

Linear Regression w/ one variable:

Cost Function

  • Idea: choose θ0, θ1 so that hθ(x) is close to y for training examples (x, y)

[plot: #words in common (feature) vs. human-rated sentence similarity]

squared error function:

J(θ0,θ1) = (1/2m) Σ_{i=1..m} (hθ(x(i)) − y(i))²

Goal:  minimize over θ0, θ1:  min J(θ0,θ1)
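
A minimal sketch of this cost function in code; the training points are toy assumptions, not values from an actual corpus:

```python
def cost(theta0, theta1, xs, ys):
    """J(theta0, theta1) = (1 / 2m) * sum_i (h(x_i) - y_i)^2"""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

xs = [1, 4, 13, 18]        # #words in common (toy data)
ys = [1.0, 1.5, 4.0, 5.0]  # human similarity ratings (toy data)
print(cost(0.0, 0.3, xs, ys))
```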

SLIDE 51

Linear Regression

  • Hypothesis:  hθ(x) = θ0 + θ1x
  • Parameters:  θ0, θ1
  • Cost Function:  J(θ0,θ1) = (1/2m) Σ_{i=1..m} (hθ(x(i)) − y(i))²
  • Goal:  minimize over θ0, θ1:  min J(θ0,θ1)

SLIDE 52

Linear Regression

  • Hypothesis:  hθ(x) = θ0 + θ1x
  • Parameters:  θ0, θ1
  • Cost Function:  J(θ0,θ1) = (1/2m) Σ_{i=1..m} (hθ(x(i)) − y(i))²
  • Goal:  minimize over θ0, θ1:  min J(θ0,θ1)

Simplified (set θ0 = 0):

  • Hypothesis:  hθ(x) = θ1x
  • Parameter:  θ1
  • Cost Function:  J(θ1) = (1/2m) Σ_{i=1..m} (hθ(x(i)) − y(i))²
  • Goal:  minimize over θ1:  min J(θ1)

SLIDE 53

Hypothesis and Cost Function (θ1 = 1)

[left plot: hθ(x) as a function of x, for fixed θ1 = 1; right plot: J(θ1) as a function of the parameter θ1]

SLIDE 56

Hypothesis and Cost Function (θ1 = 0.5)

[left plot: hθ(x) as a function of x, for fixed θ1 = 0.5; right plot: J(θ1) as a function of the parameter θ1]

SLIDE 58

[plots: hθ(x) for several values of θ1 and the resulting cost curve J(θ1)]

minimize over θ1:  min J(θ1)

SLIDE 60

[left plot: hθ(x) as a function of x, for fixed θ0, θ1; right plot: 3D surface of J(θ0,θ1) over the parameters θ0 and θ1]

SLIDE 61

[left plot: hθ(x) as a function of x, for fixed θ0, θ1; right: contour plot of J(θ0,θ1)]

SLIDE 64

Parameter Learning

  • Have some function J(θ0,θ1)
  • Want min over θ0, θ1 of J(θ0,θ1)
  • Outline:
  • Start with some θ0, θ1
  • Keep changing θ0, θ1 to reduce J(θ0,θ1) until we hopefully end up at a minimum

SLIDE 65

Gradient Descent (Simplified)

[plot: cost curve J(θ1) over θ1]

minimize over θ1:  min J(θ1)

SLIDE 66

Gradient Descent (Simplified)

[plot: cost curve J(θ1) over θ1]

minimize over θ1:  min J(θ1)

θ1 := θ1 − α (∂/∂θ1) J(θ1)

α: learning rate

SLIDE 67

Gradient Descent

minimize over θ0, θ1:  min J(θ0,θ1)

[3D surface plot of J(θ0,θ1) over θ0 and θ1]

SLIDE 69

Gradient Descent

repeat until convergence {
    θj := θj − α (∂/∂θj) J(θ0,θ1)     (simultaneous update for j = 0 and j = 1)
}

α: learning rate

SLIDE 70

Linear Regression w/ one variable:

Gradient Descent

(∂/∂θ0) J(θ0,θ1) = ?     (∂/∂θ1) J(θ0,θ1) = ?

Cost Function:

hθ(x) = θ0 + θ1x

J(θ0,θ1) = (1/2m) Σ_{i=1..m} (hθ(x(i)) − y(i))²

SLIDE 71

Linear Regression w/ one variable:

Gradient Descent

repeat until convergence {
    θ0 := θ0 − α (1/m) Σ_{i=1..m} (hθ(x(i)) − y(i))
    θ1 := θ1 − α (1/m) Σ_{i=1..m} (hθ(x(i)) − y(i)) ⋅ x(i)
}     (simultaneous update of θ0, θ1)
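
A minimal sketch of these two update rules as a training loop; the learning rate and the data are toy assumptions:

```python
def gradient_descent(xs, ys, alpha=0.01, iters=1000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iters):
        preds = [theta0 + theta1 * x for x in xs]
        grad0 = sum(p - y for p, y in zip(preds, ys)) / m
        grad1 = sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / m
        # simultaneous update of both parameters
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

xs = [1, 4, 13, 18]        # #words in common (toy data)
ys = [1.0, 1.5, 4.0, 5.0]  # similarity ratings (toy data)
print(gradient_descent(xs, ys))
```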

SLIDE 72

Linear Regression

[surface plot of J(θ0,θ1): a single global minimum]

cost function is convex

SLIDE 73

[figure sequence: left, the line hθ(x) for the current θ0, θ1; right, contour plot of J(θ0,θ1); each gradient descent step moves the parameters closer to the minimum and the fitted line closer to the data]

SLIDE 82

Batch Gradient Descent

  • Each step of gradient descent uses all the training examples

Cost Function:

J(θ0,θ1) = (1/2m) Σ_{i=1..m} (hθ(x(i)) − y(i))²
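
A vectorized sketch of one batch step (numpy; the design matrix and labels are toy assumptions): the gradient is computed from all m examples at once.

```python
import numpy as np

def batch_step(theta, X, y, alpha=0.01):
    """One gradient-descent step; X has a leading bias column of ones."""
    m = len(y)
    grad = X.T @ (X @ theta - y) / m   # gradient of J over the full batch
    return theta - alpha * grad

X = np.array([[1.0, 4.0], [1.0, 13.0], [1.0, 18.0]])  # rows: [1, x] (toy)
y = np.array([1.0, 4.0, 5.0])
print(batch_step(np.zeros(2), X, y))
```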

SLIDE 83

(Recap)

Linear Regression

  • also supervised learning (learn from annotated data)
  • but for Regression: predict real-valued output (Classification: predict discrete-valued output)
  • threshold ➞ Classification

[plot: fitted line over #words in common (feature) vs. sentence similarity; predictions above the threshold are labeled paraphrase, below it non-paraphrase]

SLIDE 86

(Recap)

Linear Regression

  • Hypothesis:  hθ(x) = θ0 + θ1x
  • Parameters:  θ0, θ1
  • Cost Function:  J(θ0,θ1) = (1/2m) Σ_{i=1..m} (hθ(x(i)) − y(i))²
  • Goal:  minimize over θ0, θ1:  min J(θ0,θ1)

[surface plot of J(θ0,θ1)]

SLIDE 87

(Recap)

Gradient Descent

repeat until convergence {
    θj := θj − α (∂/∂θj) J(θ0,θ1)     (simultaneous update for j = 0 and j = 1)
}

α: learning rate

SLIDE 88

socialmedia-class.org

Next Class:

  • Logistic Regression