A Study In Hebrew Paraphrase Identification - Thesis Presentation (PowerPoint PPT Presentation)



SLIDE 1

Definitions and Motivation Previous Work Contribution of this Work

A Study In Hebrew Paraphrase Identification

Thesis Presentation

Submitted by Gabriel Stanovsky Advised by Prof. Michael Elhadad

SLIDE 2

Outline

1

Definitions and Motivation What is a Paraphrase? Linguistic Background

2

Previous Work Overview Deep Learning Method Recursive Auto Encoding State of the Art English Paraphrasing Identification

3

Contribution of this Work Algorithms Developed Generated Resources Results

SLIDE 3

Outline

1

Definitions and Motivation What is a Paraphrase? Linguistic Background

2

Previous Work Overview Deep Learning Method Recursive Auto Encoding State of the Art English Paraphrasing Identification

3

Contribution of this Work Algorithms Developed Generated Resources Results

SLIDE 4

Textual Entailment

Text fragment A is said to Textually Entail text fragment B if a human being who trusts A, in all its parts, would consequently have to infer that B is also true.

Example:
A - דחיהכזוסוקיינתאנפתצובקלייפלאתנשבעיגהשטקורוד הנשהתואבהפוריאתופילאבהמיע
B - ייפלאתנשבהפוריאתופילאבהתכזסוקיינתאנפתצובק

SLIDE 5

Paraphrase

Text fragments (A, B) are said to stand in a Paraphrase Relationship if A entails B and vice versa. Paraphrase Identification is the task of determining whether two given texts stand in a relation of paraphrasing.

Simple Example:
A - רצואהרשתיבלומהנגפהבדאמלקהעצפנריפשויתס
B - רצואהרשתיבדילהנגפהבדאמלקהעגפנריפשויתס

SLIDE 6

Paraphrase?

The first example was a very simple one. What about the following pair?

Paraphrase?
היסורעטפנרחסכסהלהעיגהיס
היסורעטפנרחסמכסהלעהמתחלועבתסלכואמההנידמה

SLIDE 7

Paraphrase?

The first example was a very simple one. What about the following pair?

Paraphrase?
הירוסבתורירגשההרגסבהרא
הירוסמהרירגשתאהריזחהבהרא

SLIDE 8

Paraphrase?

The first example was a very simple one. What about the following pair?

Paraphrase?
הירוסבינורחאהיעוראהתאהרקיסתשרה
הירוסבתונורחאהתויושחרתההלעהחווידהריזגלאתשר

SLIDE 9

Paraphrase?

The first example was a very simple one. What about the following pair?

Paraphrase?
ראלרזחכמרחאלויסורהותימעעשגפנוהינתנימינב
ישדחהילועהתייגוסתאוינפבייצויסורהוליבקמעדעונוהינתנימינב

We are in need of rigorous definitions! These were produced for Hebrew in the course of this work, following similar English definitions.

SLIDE 10

Why Paraphrase?

1

Automatic Summarization: while scanning a document, paraphrases found in the text body can be detected and omitted, producing a shorter version of the document.

SLIDE 11

Why Paraphrase?

2

Automatic Construction of a Thesaurus: identifying paraphrases in freely occurring text, in conjunction with knowledge of the sentence structure, can be used to yield a bank of Hebrew words which are, with high probability, synonyms.

SLIDE 12

Why Paraphrase?

3

Automatic Filtering of a News Stream: paraphrase identification can be applied to a parallel news stream to detect the first occurrence of a news item (a task known as "first story detection").

SLIDE 13

Why Paraphrase?

4

In addition, it is a challenging task: automating a process which humans carry out naturally and with no apparent effort.

SLIDE 14

What’s Interesting in Hebrew Paraphrasing?

1

Word Agglutination: function words (prepositions, conjunctions and articles) in Hebrew can be agglutinated onto other words, giving speakers more ways to articulate the same meaning.

Example:
יצעהלשיענהלצבבשיאוה
השרוחבעיגרמהלצהתחתחנאוה

SLIDE 15

What’s Interesting in Hebrew Paraphrasing?

2

Syntactic Variation Exploiting Free Word Order in Hebrew: sentences in Hebrew may be expressed in different word orderings, as a tool to emphasize different notions within the same utterance.

Example:
הגועהתאיתנכהינא
יתנכההגועהתא

SLIDE 16

What’s Interesting in Hebrew Paraphrasing?

3

Lexical Replacement: replacing a Hebrew word with one derived from another language via transliteration, possibly changing its part of speech.

Example:
הנואתביטולחלהסרהנתינוכמה
הנואתהתובקעבסוללאטוטהרבעתינוכמה

SLIDE 17

Outline

1

Definitions and Motivation What is a Paraphrase? Linguistic Background

2

Previous Work Overview Deep Learning Method Recursive Auto Encoding State of the Art English Paraphrasing Identification

3

Contribution of this Work Algorithms Developed Generated Resources Results

SLIDE 18

Parsing

Parsing (also referred to as syntax analysis) is the process that maps an input sentence to the more abstract representation of a syntactic tree. This tree represents the relations among the words of the sentence. The parse tree of a sentence is not naturally embedded in the text itself; language-specific information (such as the language's grammar and knowledge of specific relations between words) is often needed.

SLIDE 19

Parsing Conventions

Parsing is commonly a basic component of NLP systems, and it will play a prominent role in the systems described henceforth. Several conventions exist for the construction of parse trees; the dominant ones are constituency parsing and dependency parsing. We will use pre-trained Hebrew parsing systems for both of these conventions.

SLIDE 20

Phrase Structure Grammar

Phrase structure grammar (constituency grammar) was originally defined by Chomsky (1956) as part of the generative school. A phrase structure grammar is formally defined as a 4-tuple G = (N, T, S, P):

1

N ∩ T = ∅, where N is the non-terminal set and T the terminal set

2

S ∈ N, S being the start symbol

3

P = {(u, v) : u, v ∈ (N ∪ T)∗}; P is finite and is called the set of production rules
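The 4-tuple definition above can be sketched as a toy data structure. The tiny grammar below is purely illustrative (it is not from the thesis); the assertions mirror the three formal conditions.

```python
# A toy phrase structure grammar G = (N, T, S, P), following the
# 4-tuple definition above. The grammar itself is illustrative only.
N = {"S", "NP", "VP"}          # non-terminal set
T = {"the", "dog", "barks"}    # terminal set, disjoint from N
S = "S"                        # start symbol
P = [                          # finite set of production rules (u, v)
    (("S",), ("NP", "VP")),
    (("NP",), ("the", "dog")),
    (("VP",), ("barks",)),
]

# Sanity checks matching the formal definition:
assert N & T == set()                               # N ∩ T = ∅
assert S in N                                       # S ∈ N
assert all(set(u) | set(v) <= N | T for u, v in P)  # u, v ∈ (N ∪ T)*
```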

SLIDE 21

Phrase Structure Grammar

According to these grammars, the leaves (terminals) of the parse tree are the words of the original sentence, appearing in their original order. The rules by which an input sentence is mapped onto a parse tree are thus a model of a human language.

SLIDE 22

Phrase Structure Example

[Constituency parse tree figure for the Hebrew sentence ". אל רעי ו יבוד אל", with node labels NN, NP, CC, MOD, FRAG and yyDOT.]

SLIDE 23

Dependency Grammar

Dependency grammar dates to the work of Tesnière (1959). Dependency parsing views the syntactic analysis of a sentence as consisting of binary asymmetric relations between words. According to this linguistic theory, a speaker of a language analyzes syntax by perceiving connections between words; the dependency relation aims at modeling this connection.

SLIDE 24

Dependency Rules

Various definitions exist for determining when two words appear in a dependency relation. The following are a few of these criteria (where H marks the head and D the dependent):

H is obligatory; D may be optional. The form of D depends on H. The linear position of D is specified with reference to H.

SLIDE 25

Dependency Grammar Example

SLIDE 26

Parsing as a Step Towards Detecting Paraphrases

Parsing seems a necessary step in assessing whether two sentences are syntactic variants of one another. With parsing, one can align paraphrase candidates so that each part of the sentence can be further analyzed in terms of lexical similarity. This is exemplified in the next slide.

SLIDE 27

Dependency Transliteration Paraphrasing Example

SLIDE 28

Outline

1

Definitions and Motivation What is a Paraphrase? Linguistic Background

2

Previous Work Overview Deep Learning Method Recursive Auto Encoding State of the Art English Paraphrasing Identification

3

Contribution of this Work Algorithms Developed Generated Resources Results

SLIDE 29

Overview

Recent years have seen great research interest in paraphrase-related tasks, mainly in the English language. We will survey the efforts on which this work builds, beginning with the Hebrew research carried out in this field.

SLIDE 30

Hebrew Research

In comparison with the vast research efforts invested in English paraphrasing, very little work has been done on Hebrew paraphrase. Ordan (2007) developed a medium-scale WordNet for Hebrew by aligning English and Hebrew expressions and inferring relations from the available English WordNet onto the created Hebrew WordNet.

SLIDE 31

Foreign Languages Research

The research in the field can be divided into three main categories. We will mention some of the interesting projects in each category:

1

Generation of paraphrases: the Microsoft NLP team (2004) created a statistical-machine-translation-based technique for generating paraphrases of a given sentence.

2

Extraction of paraphrases from large texts: Hashimoto (2011) created a system which scans large unannotated texts looking for "definition sentences", deeming definitions of the same term to be paraphrases.

SLIDE 32

Foreign Languages Research

3

Identification of paraphrases: Socher et al. (2011) created a paraphrase identification algorithm which is considered the current state of the art. We will elaborate on the components of this algorithm in the following sections.

SLIDE 33

Outline

1

Definitions and Motivation What is a Paraphrase? Linguistic Background

2

Previous Work Overview Deep Learning Method Recursive Auto Encoding State of the Art English Paraphrasing Identification

3

Contribution of this Work Algorithms Developed Generated Resources Results

SLIDE 34

Introduction to Deep Learning

Deep Learning is a recent approach which aims at modeling the human perception of complicated notions at several levels of representation. It is implemented using several neural networks connected in such a way that one network's output is transferred to another's input. Backpropagation is carried out across the networks, from the topmost network down to the bottom one.

SLIDE 35

Deep Learning Illustration

SLIDE 36

Multi Task Learning

As can be seen in the prior illustration, a word representation (also known as word embeddings) is trained during the process, and it serves several NLP tasks. Collobert & Weston (2011) trained such a system on four NLP tasks (POS tagging, semantic role labeling, chunking, named entity recognition) and achieved near state-of-the-art results on all of them.

SLIDE 37

Word Embeddings

As a by-product of this process, the word embeddings were published for English dictionaries. Turian et al. (2010) showed that they enhance performance in systems which otherwise treat words simply as indices into a finite dictionary.

SLIDE 38

Outline

1

Definitions and Motivation What is a Paraphrase? Linguistic Background

2

Previous Work Overview Deep Learning Method Recursive Auto Encoding State of the Art English Paraphrasing Identification

3

Contribution of this Work Algorithms Developed Generated Resources Results

SLIDE 39

The Problem with the Connectionist Approach

A major drawback of the neural network model is the constant size of the network, in all its parts. This constant size of input and inner representations (known as the connectionist approach) does not fit the nature of most AI tasks. In the field of this work, it is easy to observe that paraphrases need not be of the same length, although by definition they convey the same information.

SLIDE 40

RAAM

This property seems to limit, or even preclude, the use of neural networks on arbitrary-length instances. To attack this problem, several connectionist systems were devised to cope with unbounded input size. One of these is the Recursive Auto-Associative Memory (RAAM), due to Pollack (1990). RAAM was later extended and became known as auto encoders.

SLIDE 41

Auto Encoders

The system Pollack devised was composed of two concatenated neural networks:

Encoder: a neural network composed of an input layer of 2K elements fully connected to an output layer of K elements.

Decoder: a neural network composed of an input layer of K elements fully connected to an output layer of 2K elements.

These two networks are trained concurrently as a single network: a network with 2K input elements, K hidden units in one hidden layer, and 2K elements in the output layer.
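The 2K-K-2K architecture above can be sketched with toy numpy matrices. The random weights below stand in for parameters that a real system would learn by backpropagation; sizes and activations are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of a Pollack-style autoencoder: 2K inputs -> K hidden -> 2K outputs.
K = 4
rng = np.random.default_rng(0)
W_enc = rng.normal(scale=0.1, size=(K, 2 * K))   # encoder: 2K -> K
W_dec = rng.normal(scale=0.1, size=(2 * K, K))   # decoder: K -> 2K

def encode(left, right):
    """Compress two K-dim children into one K-dim parent representation."""
    return np.tanh(W_enc @ np.concatenate([left, right]))

def decode(parent):
    """Reconstruct the two K-dim children from the parent code."""
    out = np.tanh(W_dec @ parent)
    return out[:K], out[K:]

left, right = rng.normal(size=K), rng.normal(size=K)
parent = encode(left, right)                     # fixed-size code
rec_left, rec_right = decode(parent)
# Training would minimize this reconstruction error:
loss = float(np.sum((rec_left - left) ** 2 + (rec_right - right) ** 2))
```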
SLIDE 42

Auto Encoders Illustration

SLIDE 43

Auto Encoders

During training on an input element, the element itself is presented as the network's desired output, thus training the network to reproduce the input it received after passing through the hidden layer. The hidden layer (recall that this is actually the output layer of the encoder) now holds a compact representation of the input.

SLIDE 44

How to Use Auto Encoders?

The input is assumed to be a variable-length list of fixed-size elements (a character string, for example); mark the representation size of each element as K. Go over the input instance of size K · n and "encode" every pair of consecutive input elements into one compressed representation, thus obtaining a second level of representation of size K · n/2. This process is repeated until the last level contains one element of fixed size, which represents the entire variable-size input in a fixed-size representation.
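The level-by-level reduction just described can be sketched as follows. The `encode` placeholder (a simple mean) stands in for a trained autoencoder's encoder, and carrying an odd leftover element up to the next level is an assumption about how non-power-of-two inputs are handled.

```python
import numpy as np

def encode(left, right):
    """Placeholder for a trained encoder: merges two K-dim vectors into one."""
    return (left + right) / 2.0

def fold(vectors):
    """Fold a list of K-dim vectors pairwise, level by level, to one vector."""
    while len(vectors) > 1:
        nxt = [encode(vectors[i], vectors[i + 1])
               for i in range(0, len(vectors) - 1, 2)]
        if len(vectors) % 2:       # odd element carries over to the next level
            nxt.append(vectors[-1])
        vectors = nxt
    return vectors[0]              # fixed-size code for the whole input

K = 3
seq = [np.ones(K) * i for i in range(5)]   # a length-5 input, K dims each
root = fold(seq)                           # single K-dim representation
```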

SLIDE 45

Encoding of a Parse Tree

SLIDE 46

Outline

1

Definitions and Motivation What is a Paraphrase? Linguistic Background

2

Previous Work Overview Deep Learning Method Recursive Auto Encoding State of the Art English Paraphrasing Identification

3

Contribution of this Work Algorithms Developed Generated Resources Results

SLIDE 47

Framework Overview

Socher et al. (2011) devised a framework for identifying paraphrases:

1

Replace each of the words in the two texts with its word embedding.

2

Autoencode both texts to receive embeddings of sentences.

3

On the resulting representations they performed a method they called "dynamic pooling", which discriminates between paraphrases and non-paraphrases.

SLIDE 48

Socher’s Results

They tested against the English reference corpus, the Microsoft Research Paraphrase Corpus (MSRP). Results are given in the following table (along with other systems' performance for comparison):

Model                  ACC    F1
Wan et al. (2006)      75.6   83.0
Das and Smith (2009)   76.1   82.7
Socher et al. (2011)   76.8   83.6

SLIDE 49

Outline

1

Definitions and Motivation What is a Paraphrase? Linguistic Background

2

Previous Work Overview Deep Learning Method Recursive Auto Encoding State of the Art English Paraphrasing Identification

3

Contribution of this Work Algorithms Developed Generated Resources Results

SLIDE 50

Paraphrase Identification System

The main body of this work aims at constructing a Hebrew paraphrase identification system. The chosen path begins similarly to the English state of the art: exploiting auto encoding over feature embeddings of words in order to create compact representations of the embedded parse trees. The use of these autoencoded parse trees, however, differs in this work: a new concept was defined - Tree Matching.

SLIDE 51

Framework Overview

SLIDE 52

Embedding Computation

Each word w ∈ D (the dictionary) is mapped onto a vector rw ∈ Rd. The structure of this vector space should reflect word similarity, so that if two words are "similar" (along multiple linguistic dimensions: meaning, spelling, morphology, part of speech, etc.), their vector encodings will be "close" in Rd. These embeddings were computed for a Hebrew dictionary of 5K words, by training a language model in a deep learning architecture.
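A minimal sketch of what "close" in Rd means in practice, using cosine similarity over toy vectors (these are illustrative values, not the thesis's 5K-word Hebrew embeddings):

```python
import numpy as np

# Toy embedding table; real embeddings would come from the trained model.
emb = {
    "dog":   np.array([0.9, 0.1, 0.0]),
    "puppy": np.array([0.8, 0.2, 0.1]),
    "piano": np.array([0.0, 0.1, 0.9]),
}

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, ~0 for unrelated."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Similar words should score higher than unrelated ones:
assert cosine(emb["dog"], emb["puppy"]) > cosine(emb["dog"], emb["piano"])
```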

SLIDE 53

Language Model

The approach taken for defining a language model is that it should be able to separate valid occurrences of text from non-valid ones. Thus, a large segmented corpus of 131M tokens was sampled for overlapping 5-grams to obtain valid examples, and non-valid (corrupt) 5-grams were obtained by replacing one word in a valid 5-gram. The backpropagated error was max(0, 1 − f(s) + f(sw)), where sw is the corrupt 5-gram and w is a word chosen uniformly at random from the dictionary.
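The ranking loss above can be sketched as follows. The scorer `f` is a toy stand-in (in the thesis it is a neural network over concatenated word embeddings), and corrupting the middle word in particular is an assumption, since the slide only says one word is replaced.

```python
import numpy as np

def hinge_loss(f, valid, corrupt):
    """max(0, 1 - f(s) + f(s_w)): push valid n-grams above corrupt ones."""
    return max(0.0, 1.0 - f(valid) + f(corrupt))

def corrupt_ngram(ngram, dictionary, rng):
    """Replace one word (here: the middle one) with a uniformly random word."""
    out = list(ngram)
    out[len(out) // 2] = dictionary[rng.integers(len(dictionary))]
    return tuple(out)

rng = np.random.default_rng(0)
dictionary = ["a", "b", "c", "d"]
s = ("the", "cat", "sat", "on", "mat")          # a "valid" 5-gram
s_w = corrupt_ngram(s, dictionary, rng)          # a corrupt copy
f = lambda ngram: float(len(set(ngram)))         # toy scorer, not a real model
loss = hinge_loss(f, s, s_w)
```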

SLIDE 54

Language Model Overview

SLIDE 55

Auto Encoding

As described in the previous slides, an autoencoder was developed and trained, on both dependency and constituency parse trees. The 131M-token corpus was sampled for 150K sentences, which were used for training.

SLIDE 56

Binarization

As can be seen, auto encoding requires the tree to have a fixed number of child nodes, which is not the case in parse trees. Although binarization of constituency parse trees is quite common, no such algorithm was found for dependency trees. An algorithm for dependency parse tree binarization was therefore developed.
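The thesis's binarization algorithm is not spelled out on this slide; the sketch below shows one generic way to binarize an n-ary node by folding children left to right under intermediate nodes. This is an illustrative assumption, not necessarily the thesis's method.

```python
def binarize(label, children):
    """Binarize a node with an arbitrary child list.

    children: list of already-binarized subtrees (tuples) or word strings.
    Intermediate nodes get a hypothetical "@"-prefixed label.
    """
    if len(children) <= 2:
        return (label, children)
    # Fold the first two children under an intermediate @label node, recurse.
    merged = ("@" + label, children[:2])
    return binarize(label, [merged] + children[2:])

tree = binarize("ROOT", ["a", "b", "c", "d"])
# Every internal node of `tree` now has at most two children.
```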

SLIDE 57

Paraphrase Pair After Binarization

SLIDE 58

Tree Matching

Definition: Consider two binary trees t1, t2 (corresponding to sentences s1, s2, and obtained from them by auto encoding). Define a "Tree Match" M to be any set of tuples (n1, n2), where n1, n2 are nodes of t1, t2 respectively, such that for every word w in s1 (s2), M contains exactly one tuple containing a node on the path from w to the root of t1 (t2).

SLIDE 59

Motivation for Tree Matching

This definition captures the idea that a paraphrase pair consists of sentences whose parts are interchangeable, from the sentence level down to the word level (including word reordering).

SLIDE 60

Tree Match Score

Following this definition, the score of a match can be defined:

Definition: S(M) = Σ_{(n1,n2)∈M} ‖n1 − n2‖₂ · (number of leaves spanned by n1 and n2)

A Minimal Match for a sentence pair (s1, s2) is the match which yields the minimal score for these sentences.
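The match score can be computed directly from a candidate match. The node encodings and leaf counts below are toy values; a real match would use the autoencoded node vectors and the actual spans.

```python
import numpy as np

def match_score(match):
    """Sum over matched pairs of ||n1 - n2||_2 weighted by spanned leaves.

    match: list of (vec1, vec2, spanned_leaves) tuples.
    """
    return sum(float(np.linalg.norm(v1 - v2)) * leaves
               for v1, v2, leaves in match)

match = [
    (np.array([1.0, 0.0]), np.array([1.0, 0.0]), 2),  # identical nodes: 0
    (np.array([0.0, 1.0]), np.array([0.0, 0.5]), 1),  # distance 0.5
]
score = match_score(match)   # 0.0 * 2 + 0.5 * 1
```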

SLIDE 61

Training

During training, a simple classifier can learn the threshold below which a minimal match represents a pair which is a paraphrase. After training, the classifier yields not only a binary decision, but also a meaningful matching between the pair, "explaining" why they are a paraphrase.
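A minimal sketch of such a threshold classifier, assuming labeled minimal-match scores are available; the error-minimizing search below is an illustration, not the thesis's exact classifier.

```python
def learn_threshold(scores, labels):
    """Pick the threshold minimizing training errors.

    scores: minimal-match scores; labels: True = paraphrase pair.
    Pairs scoring at or below the threshold are called paraphrases.
    """
    best_t, best_err = 0.0, len(labels) + 1
    for t in sorted(scores):
        err = sum((s <= t) != y for s, y in zip(scores, labels))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

scores = [0.2, 0.3, 0.9, 1.5]          # toy minimal-match scores
labels = [True, True, False, False]    # toy paraphrase annotations
t = learn_threshold(scores, labels)
```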

SLIDE 62

Tree Matching Example

SLIDE 63

Tree Matching is NP-Complete

It can be shown that Tree Matching, as defined above, is an NP-complete problem, via the reduction:

Positive SubsetSum ≤p Tree Matching

SLIDE 64

Paraphrase Corpus for Algorithm Evaluation

In order to test the framework, a Hebrew paraphrase corpus had to be collected. An algorithm was developed to acquire news articles from leading news sites and align them based on publication time and syntactic similarity. A very large unannotated corpus (about 1.4M headlines) of possible paraphrase pairs was collected.
SLIDE 65

Outline

1

Definitions and Motivation What is a Paraphrase? Linguistic Background

2

Previous Work Overview Deep Learning Method Recursive Auto Encoding State of the Art English Paraphrasing Identification

3

Contribution of this Work Algorithms Developed Generated Resources Results

SLIDE 66

As a by-product of developing the framework, several resources were compiled which can be re-used in future research:

Annotated Paraphrase Corpus: 1K of the possible pairs were tagged by human judges to obtain an annotated reference corpus for future research comparison.

Word Embeddings: an embedding dictionary of 5K common Hebrew words was computed, and shown to be useful as a plug-in enhancer for supervised NLP tasks.

SLIDE 67

Outline

1

Definitions and Motivation What is a Paraphrase? Linguistic Background

2

Previous Work Overview Deep Learning Method Recursive Auto Encoding State of the Art English Paraphrasing Identification

3

Contribution of this Work Algorithms Developed Generated Resources Results

SLIDE 68

Embeddings as plugin enhancer

The produced embeddings show an improvement when added to a CRF POS tagger (values are ACC / F1):

       without embeddings   with embeddings
TB1    0.879 / 0.735        0.900 / 0.804
A7     0.910 / 0.701        0.940 / 0.821
All    0.866 / 0.662        0.880 / 0.723

SLIDE 69

Paraphrasing Framework Evaluation

The proposed system was shown to achieve results comparable to the state-of-the-art results obtained for the English task (about 2% lower):

Parse Type     ACC / F1
Dependency     74.38 / 80.35
Constituency   69.20 / 74.83

SLIDE 70

Interesting Results

לארשירבעלתוטקרירילכיסריוואהליח העוצרהמתוטקררוגישלכיסריוואהליח

SLIDE 71

Interesting Results

הירוסליסוטמדגנתוכרעמוברקיסוטמקפסתהיסור הירוסלתיריוואהנגהתוכרעמוברקיסוטמרוכמתהיסור

SLIDE 72

Interesting Results

צעחמרגאמלארשיבקויקוחהרשוא לארשיבקויצעחמרגאמהרשיאתסנכה

SLIDE 73

The End

THANK YOU!