Language Processing with Perl and Prolog, Chapter 8: Part-of-Speech Tagging Using Stochastic Techniques

Language Technology
Language Processing with Perl and Prolog
Chapter 8: Part-of-Speech Tagging Using Stochastic Techniques
Pierre Nugues
Lund University
Pierre.Nugues@cs.lth.se
http://cs.lth.se/pierre_nugues/

POS Annotation with Statistical Methods

Modeling the problem: $t_1, t_2, t_3, \ldots, t_n \rightarrow$ noisy channel $\rightarrow w_1, w_2, w_3, \ldots, w_n$.

The optimal part-of-speech sequence is

$$\hat{T} = \arg\max_{t_1, t_2, t_3, \ldots, t_n} P(t_1, t_2, t_3, \ldots, t_n \mid w_1, w_2, w_3, \ldots, w_n).$$

Bayes' rule on conditional probabilities, $P(A \mid B) P(B) = P(B \mid A) P(A)$, gives

$$\hat{T} = \arg\max_{T} P(T) P(W \mid T).$$

$P(T)$ and $P(W \mid T)$ are simplified and estimated on hand-annotated corpora, the "gold standard".
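The step from the posterior $P(T \mid W)$ to the product $P(T) P(W \mid T)$ can be spelled out as follows, writing $T$ for the tag sequence and $W$ for the word sequence. The only ingredient beyond Bayes' rule is that $P(W)$ is the same for every candidate $T$, so dividing by it does not change which $T$ wins:

$$
\begin{aligned}
\hat{T} &= \arg\max_{T} P(T \mid W) \\
        &= \arg\max_{T} \frac{P(W \mid T) P(T)}{P(W)} \\
        &= \arg\max_{T} P(T) P(W \mid T).
\end{aligned}
$$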

The First Term: N-Gram Approximation

$$P(T) = P(t_1, t_2, t_3, \ldots, t_n) \approx P(t_1) P(t_2 \mid t_1) \prod_{i=3}^{n} P(t_i \mid t_{i-2}, t_{i-1}).$$

If we use a start-of-sentence delimiter <s>, the first two terms of the product, $P(t_1) P(t_2 \mid t_1)$, are rewritten as $P(\texttt{<s>}) P(t_1 \mid \texttt{<s>}) P(t_2 \mid \texttt{<s>}, t_1)$, where $P(\texttt{<s>}) = 1$.

We estimate the probabilities with the maximum likelihood, $P_{MLE}$:

$$P_{MLE}(t_i \mid t_{i-2}, t_{i-1}) = \frac{C(t_{i-2}, t_{i-1}, t_i)}{C(t_{i-2}, t_{i-1})}.$$
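The maximum likelihood estimate above amounts to counting tag trigrams and tag bigrams in an annotated corpus and dividing. Below is a minimal Python sketch of that idea. The three-sentence corpus, the function names, and the choice to return 0.0 for an unseen context are illustrative assumptions, not part of the slides.

```python
from collections import Counter

START = "<s>"  # start-of-sentence delimiter, as in the slide

def count_tag_ngrams(tagged_sentences):
    """Count tag bigrams and trigrams, prefixing each sentence with <s>."""
    bigrams, trigrams = Counter(), Counter()
    for tags in tagged_sentences:
        seq = [START] + list(tags)
        bigrams.update(zip(seq, seq[1:]))
        trigrams.update(zip(seq, seq[1:], seq[2:]))
    return bigrams, trigrams

def p_mle_trigram(tag, prev2, prev1, bigrams, trigrams):
    """P_MLE(tag | prev2, prev1) = C(prev2, prev1, tag) / C(prev2, prev1)."""
    context = bigrams[(prev2, prev1)]
    if context == 0:
        return 0.0  # assumption: an unseen context gets probability 0
    return trigrams[(prev2, prev1, tag)] / context

# Hypothetical toy corpus: one list of tags per sentence.
corpus = [
    ["PRO", "VERB", "ADV"],
    ["PRO", "VERB", "NOUN"],
    ["ART", "NOUN", "VERB"],
]
bigrams, trigrams = count_tag_ngrams(corpus)
p_adv = p_mle_trigram("ADV", "PRO", "VERB", bigrams, trigrams)
```

Here the context (PRO, VERB) occurs twice and is followed once by ADV, so `p_adv` is 1/2. A context that never occurs in the corpus yields 0.0 under the assumption above.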

Sparse Data

If $N_p$ is the number of different part-of-speech tags, there are $N_p \times N_p \times N_p$ values to estimate. If data is missing, we can back off to bigrams:

$$P(T) = P(t_1, t_2, t_3, \ldots, t_n) \approx P(t_1) \prod_{i=2}^{n} P(t_i \mid t_{i-1}).$$

Or to unigrams:

$$P(T) = P(t_1, t_2, t_3, \ldots, t_n) \approx \prod_{i=1}^{n} P(t_i).$$

And finally, we can linearly combine these approximations:

$$P_{LinearInter}(t_i \mid t_{i-2}, t_{i-1}) = \lambda_1 P(t_i \mid t_{i-2}, t_{i-1}) + \lambda_2 P(t_i \mid t_{i-1}) + \lambda_3 P(t_i),$$

with $\lambda_1 + \lambda_2 + \lambda_3 = 1$, for example, $\lambda_1 = 0.6$, $\lambda_2 = 0.3$, $\lambda_3 = 0.1$.
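The linear interpolation can be sketched in a few lines of Python. The function name and the probability values in the example are hypothetical. The default weights are the example values from the slide.

```python
def interpolate(p_trigram, p_bigram, p_unigram, lambdas=(0.6, 0.3, 0.1)):
    """Linearly combine trigram, bigram, and unigram estimates of P(t_i | ...)."""
    l1, l2, l3 = lambdas
    if abs(l1 + l2 + l3 - 1.0) > 1e-9:
        raise ValueError("the lambda weights must sum to 1")
    return l1 * p_trigram + l2 * p_bigram + l3 * p_unigram

# Hypothetical values: the trigram was never seen in the corpus (estimate 0),
# but the bigram and unigram estimates are nonzero.
p = interpolate(p_trigram=0.0, p_bigram=0.2, p_unigram=0.05)
```

With these inputs the result is 0.3 × 0.2 + 0.1 × 0.05, roughly 0.065: the unseen trigram no longer forces the whole product $P(T)$ to zero.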

The Second Term

The complete word sequence knowing the part-of-speech sequence is usually approximated as:

$$P(W \mid T) = P(w_1, w_2, w_3, \ldots, w_n \mid t_1, t_2, t_3, \ldots, t_n) \approx \prod_{i=1}^{n} P(w_i \mid t_i).$$

Like the previous probabilities, $P(w_i \mid t_i)$ is estimated from hand-annotated corpora using the maximum likelihood:

$$P_{MLE}(w_i \mid t_i) = \frac{C(w_i, t_i)}{C(t_i)}.$$

For $N_w$ different words, there are $N_p \times N_w$ values to obtain. But in this case, many of the estimates will be 0.
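As a rough illustration of this estimate, the following Python sketch counts (word, tag) pairs and tags in a tiny annotated corpus. The corpus and the function name are made up for the example, and returning 0.0 for an unseen tag is an assumption.

```python
from collections import Counter

def emission_mle(tagged_corpus):
    """Return p(word, tag) = C(word, tag) / C(tag) from (word, tag) pairs."""
    pair_counts = Counter(tagged_corpus)
    tag_counts = Counter(tag for _, tag in tagged_corpus)

    def p(word, tag):
        if tag_counts[tag] == 0:
            return 0.0  # assumption: an unseen tag gets probability 0
        return pair_counts[(word, tag)] / tag_counts[tag]

    return p

# Hypothetical hand-annotated corpus as a list of (word, tag) pairs.
toy_corpus = [
    ("le", "ART"), ("chat", "NOUN"), ("le", "PRO"),
    ("donne", "VERB"), ("la", "ART"), ("donne", "NOUN"),
]
p_emit = emission_mle(toy_corpus)
p_le_art = p_emit("le", "ART")        # C(le, ART) = 1, C(ART) = 2
p_chat_verb = p_emit("chat", "VERB")  # pair never observed
```

The second lookup shows why many of the $N_p \times N_w$ estimates are 0: most words are simply never observed with most tags.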

Viterbi (Informal)

Je le donne demain dans la matinée
'I give it tomorrow in the morning'

Candidate tags for each word:

je       PRO
le       ART, PRO
donne    VERB, NOUN
demain   ADV
dans     PREP
la       ART, PRO
matinée  NOUN

Viterbi (Informal)

The term brought by the word demain still carries the memory of the ambiguity of donne: $P(\text{adv} \mid \text{verb}) \times P(\text{demain} \mid \text{adv})$ and $P(\text{adv} \mid \text{noun}) \times P(\text{demain} \mid \text{adv})$.

This is no longer the case with dans. According to the noisy channel model and the bigram assumption, the term brought by the word dans is $P(\text{dans} \mid \text{prep}) \times P(\text{prep} \mid \text{adv})$. It does not show the ambiguity of le and donne. The subsequent terms will ignore it as well.

We can therefore discard the corresponding paths: once every path has passed through the single tag of demain, only the best path reaching that tag can be part of the overall best sequence. The optimal path does not contain nonoptimal subpaths.
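The pruning argument can be made concrete with a small Python sketch of a bigram Viterbi search over the example sentence. For each word, it keeps only the best path ending in each candidate tag. All probability values below are invented for illustration and are not estimated from any corpus. The function and variable names are likewise assumptions.

```python
def viterbi(words, candidates, p_trans, p_emit, start="<s>"):
    """Return (probability, tag path) of the best sequence under a bigram model."""
    # best[tag] = (probability of the best path ending in tag, that path)
    best = {start: (1.0, [])}
    for word in words:
        new_best = {}
        for tag in candidates[word]:
            emit = p_emit.get((word, tag), 0.0)
            # Keep only the best predecessor; other paths into this tag are discarded.
            score, path = max(
                ((prob * p_trans.get((prev, tag), 0.0) * emit, prev_path)
                 for prev, (prob, prev_path) in best.items()),
                key=lambda item: item[0],
            )
            new_best[tag] = (score, path + [tag])
        best = new_best
    return max(best.values(), key=lambda item: item[0])

words = ["je", "le", "donne", "demain", "dans", "la", "matinée"]
candidates = {
    "je": ["PRO"], "le": ["ART", "PRO"], "donne": ["VERB", "NOUN"],
    "demain": ["ADV"], "dans": ["PREP"], "la": ["ART", "PRO"],
    "matinée": ["NOUN"],
}
# Invented P(tag | previous tag) values, keyed as (previous tag, tag).
p_trans = {
    ("<s>", "PRO"): 0.5, ("PRO", "ART"): 0.1, ("PRO", "PRO"): 0.4,
    ("ART", "VERB"): 0.05, ("ART", "NOUN"): 0.8, ("PRO", "VERB"): 0.6,
    ("PRO", "NOUN"): 0.05, ("VERB", "ADV"): 0.3, ("NOUN", "ADV"): 0.1,
    ("ADV", "PREP"): 0.4, ("PREP", "ART"): 0.6, ("PREP", "PRO"): 0.1,
}
# Invented P(word | tag) values, keyed as (word, tag).
p_emit = {
    ("je", "PRO"): 0.2, ("le", "ART"): 0.3, ("le", "PRO"): 0.1,
    ("donne", "VERB"): 0.01, ("donne", "NOUN"): 0.001,
    ("demain", "ADV"): 0.05, ("dans", "PREP"): 0.1,
    ("la", "ART"): 0.3, ("la", "PRO"): 0.05, ("matinée", "NOUN"): 0.002,
}
best_prob, best_tags = viterbi(words, candidates, p_trans, p_emit)
```

With these made-up numbers, `best_tags` is PRO PRO VERB ADV PREP ART NOUN. After demain, the dictionary `best` holds a single entry for ADV, so the ambiguity of le and donne no longer multiplies the number of paths to examine.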

Supervised Learning: A Summary

Needs a manually annotated corpus called the Gold Standard
The Gold Standard may contain errors (errare humanum est) that we ignore
A classifier is trained on a part of the corpus, the training set, and evaluated on another part, the test set, where automatic annotation is compared with the "Gold Standard"
N-fold cross validation is used to avoid the influence of a particular division
Some algorithms may require additional optimization on a development set
Classifiers can use statistical or symbolic methods
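The N-fold cross validation mentioned above can be sketched as follows. This is one simple way to partition a corpus, assuming it fits in a Python list. The interleaved assignment of items to folds and the function name are choices made for the example.

```python
def n_fold_splits(items, n_folds):
    """Yield (training_set, test_set) pairs, each fold serving once as test set."""
    # Assign items to folds in round-robin order.
    folds = [items[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test_set = folds[i]
        training_set = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training_set, test_set

# Hypothetical corpus of 10 annotated sentences, represented by their indices.
sentences = list(range(10))
splits = list(n_fold_splits(sentences, n_folds=5))
```

Each of the five splits trains on eight sentences and tests on the remaining two. Averaging the evaluation scores over the splits reduces the influence of any single division.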
