
POS Tagging with HMMs (L645 / B659, Dept. of Linguistics, Indiana)



  1. POS Tagging with HMMs
     Outline: Definition, Tagsets, Automatic POS Tagging, Bigram tagging, MLE, HMMs
     L645 / B659, Dept. of Linguistics, Indiana University, Fall 2015

  2. Part of Speech Tagging: Definition
     POS tagging = assigning word class information to words
     ex:  the         man   bought  a           book
          determiner  noun  verb    determiner  noun

  3. POS Tagging: Linguistic Questions
     ◮ How do we divide the text into individual word tokens?
     ◮ How do we choose a tagset to represent all words?
     ◮ How do we select appropriate tags for individual words?

  4. Tagsets: Size of tagsets
     ◮ English:
         TOSCA                 32
         Penn treebank         36
         BNC C5                61
         Brown                 77
         LOB                  132
         London-Lund Corpus   197
         TOSCA-ICE            270
     ◮ Romanian: 614
     ◮ Hungarian: ca. 2,100

  5. Penn Treebank Tagset
         CC    Coord. conjunction        RB    Adverb
         CD    Cardinal number           RBR   Adverb, comparative
         DT    Determiner                RBS   Adverb, superlative
         EX    Existential there         RP    Particle
         FW    Foreign word              SYM   Symbol
         IN    Prep. / subord. conj.     TO    to
         JJ    Adjective                 UH    Interjection
         JJR   Adjective, comparative    VB    Verb, base form
         JJS   Adjective, superlative    VBD   Verb, past tense
         LS    List item marker          VBG   Verb, gerund / present part.
         MD    Modal                     VBN   Verb, past part.
         NN    Noun, singular or mass    VBP   Verb, non-3rd p. sing. pres.
         NNS   Noun, plural              VBZ   Verb, 3rd p. sing. pres.
         NP    Proper noun, singular     WDT   Wh-determiner
         NPS   Proper noun, plural       WP    Wh-pronoun
         PDT   Predeterminer             WP$   Possessive wh-pronoun
         POS   Possessive ending         WRB   Wh-adverb
         PRP   Personal pronoun          ,     Comma
         PRP$  Possessive pronoun        .     Sentence-final punctuation

  6. Annotating POS Tags
     Two fundamentally different approaches:
     ◮ Start from scratch: find characteristics in words or context (= rules) which give an indication of word class
         ◮ e.g., if a word ends in ‘‘ion’’, tag it as a noun
     ◮ Accumulate a lexicon, then disambiguate words with more than one tag
         ◮ e.g., possible categories for ‘‘about’’: preposition, adverb, particle
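The first, rule-based approach can be sketched as a small function of hand-written suffix rules. The specific rules below (beyond the slide's ‘‘ion’’ example) are illustrative assumptions, not rules from the lecture:

```python
# Minimal sketch of rule-based tagging: suffix heuristics that suggest
# a word class. Only the "ion" rule comes from the slide; the others
# are made-up illustrations.
def rule_tag(word):
    if word.endswith("ion"):   # e.g. "station" -> noun
        return "NN"
    if word.endswith("ly"):    # e.g. "quickly" -> adverb (assumed rule)
        return "RB"
    if word.endswith("ing"):   # e.g. "running" -> gerund (assumed rule)
        return "VBG"
    return "UNK"               # no rule fired

print(rule_tag("station"))  # NN
```

Such rules are cheap but brittle: ‘‘lion’’ also ends in ‘‘ion’’, which is why the lexicon-based approach below is the usual alternative.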

  7. Automatic POS Tagging
     Assumption: local context is sufficient.
     Examples:
     ◮ for the man: noun or verb?
     ◮ we will man: noun or verb?
     ◮ I can put: verb base form or past?
     ◮ re-cap real quick: adjective or adverb?

  8. Bigram Tagging
     ◮ Basic assumption: a POS tag depends only on the word itself and on the POS tag of the previous word
     ◮ Use a lexicon to retrieve the ambiguity class for each word
         ◮ e.g., word: beginning, ambiguity class: [JJ, NN, VBG]
     ◮ For unknown words: use heuristics, e.g. all open-class POS tags
     ◮ Disambiguation: look for the most likely path through the possibilities
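The lexicon-lookup step with the open-class fallback can be sketched as follows; the lexicon entries other than ‘‘beginning’’ are illustrative assumptions:

```python
# Sketch of the lexicon lookup: known words map to their ambiguity
# class (set of possible tags); unknown words fall back to all
# open-class tags, as the slide suggests. Entries besides "beginning"
# are made-up.
OPEN_CLASS = ["JJ", "NN", "NNS", "RB", "VB", "VBD", "VBG", "VBN"]

LEXICON = {
    "beginning": ["JJ", "NN", "VBG"],   # ambiguity class from the slide
    "about": ["IN", "RB", "RP"],
    "the": ["DT"],
}

def ambiguity_class(word):
    # Known word: stored ambiguity class; unknown word: open-class heuristic.
    return LEXICON.get(word, OPEN_CLASS)

print(ambiguity_class("beginning"))  # ['JJ', 'NN', 'VBG']
```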

  9. Bigram Tagging – Example
     [Lattice for the sentence ‘‘time flies like an arrow’’: each word's ambiguity class forms one column, and tagging searches for a path from the start state S to the end state E.]
         time:   NN, VB, JJ
         flies:  VBZ, NNS
         like:   IN, VB, RB
         an:     DT
         arrow:  NN

  10. Bigram Tagging – Probabilities
      P(t1 ... t5) = P(t1 | S) · P(w1 | t1) · P(t2 | t1) · P(w2 | t2) · ...
      (Note: this is actually P(t1 ... t5 | w1 ... w5))
      Transition probabilities: the P(t_i | t_{i-1}) terms; lexical probabilities: the P(w_i | t_i) terms (shown in green and blue on the slide, respectively)

  11. Bigram Tagging – Probability Table
      Probabilities (in %, from 0 to 100):

      P(time | NN)   =  7.0727    P(NN | S)    =  0.6823    P(IN | NNS)  = 21.8302
      P(time | VB)   =  0.0005    P(VB | S)    =  0.5294    P(VB | VBZ)  =  0.7002
      P(time | JJ)   =  0         P(JJ | S)    =  0.8033    P(VB | NNS)  = 11.1406
      P(flies | VBZ) =  0.4754    P(VBZ | NN)  =  3.9005    P(RB | VBZ)  = 15.0350
      P(flies | NNS) =  0.1610    P(VBZ | VB)  =  0.0566    P(RB | NNS)  =  6.4721
      P(like | IN)   =  2.6512    P(VBZ | JJ)  =  2.0934    P(DT | IN)   = 31.4263
      P(like | VB)   =  2.8413    P(NNS | NN)  =  1.6076    P(DT | VB)   = 15.2649
      P(like | RB)   =  0.5086    P(NNS | VB)  =  0.6566    P(DT | RB)   =  5.3113
      P(an | DT)     =  1.4192    P(NNS | JJ)  =  2.4383    P(NN | DT)   = 38.0170
      P(arrow | NN)  =  0.0215    P(IN | VBZ)  =  8.5862    P(E | NN)    =  0.2069
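These numbers are enough to score every path through the lattice for ‘‘time flies like an arrow’’. A brute-force sketch (probabilities converted from percent to [0, 1]; the data structures are this example's, not a general tagger):

```python
# Brute-force scoring of all tag paths through the example lattice,
# using the probabilities from the table (percent values / 100).
from itertools import product

lex = {("time", "NN"): 7.0727, ("time", "VB"): 0.0005, ("time", "JJ"): 0.0,
       ("flies", "VBZ"): 0.4754, ("flies", "NNS"): 0.1610,
       ("like", "IN"): 2.6512, ("like", "VB"): 2.8413, ("like", "RB"): 0.5086,
       ("an", "DT"): 1.4192, ("arrow", "NN"): 0.0215}

trans = {("S", "NN"): 0.6823, ("S", "VB"): 0.5294, ("S", "JJ"): 0.8033,
         ("NN", "VBZ"): 3.9005, ("VB", "VBZ"): 0.0566, ("JJ", "VBZ"): 2.0934,
         ("NN", "NNS"): 1.6076, ("VB", "NNS"): 0.6566, ("JJ", "NNS"): 2.4383,
         ("VBZ", "IN"): 8.5862, ("NNS", "IN"): 21.8302,
         ("VBZ", "VB"): 0.7002, ("NNS", "VB"): 11.1406,
         ("VBZ", "RB"): 15.0350, ("NNS", "RB"): 6.4721,
         ("IN", "DT"): 31.4263, ("VB", "DT"): 15.2649, ("RB", "DT"): 5.3113,
         ("DT", "NN"): 38.0170, ("NN", "E"): 0.2069}

words = ["time", "flies", "like", "an", "arrow"]
lattice = [["NN", "VB", "JJ"], ["VBZ", "NNS"], ["IN", "VB", "RB"], ["DT"], ["NN"]]

def score(tags):
    p, prev = 1.0, "S"
    for w, t in zip(words, tags):
        p *= trans.get((prev, t), 0.0) / 100 * lex.get((w, t), 0.0) / 100
        prev = t
    return p * trans.get((prev, "E"), 0.0) / 100  # transition into end state

best = max(product(*lattice), key=score)
print(best)  # ('NN', 'VBZ', 'IN', 'DT', 'NN')
```

With only 3 · 2 · 3 · 1 · 1 = 18 paths, enumeration is trivial here; the Viterbi algorithm (motivated on the final slide) does the same search without enumerating.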

  12. Bigram Tagging – Counter-Examples
      ◮ start before
          ◮ start before the course or start before he is done
      ◮ real quick
          ◮ re-cap real quick or a real quick lunch
      ◮ barely changed
          ◮ he was barely changed or he barely changed his contents
      ◮ that beginning
          ◮ that beginning part or that beginning frightened the students or with that beginning early, he was forced ...

  13. Maximum Likelihood Estimation
      Simplest way to calculate such probabilities from a corpus:

          P_MLE(t_n | t_{n-1}) = C(t_{n-1} t_n) / C(t_{n-1})
          P_MLE(w_n | t_n) = C(w_n, t_n) / C(t_n)

      ◮ Uses relative frequency
      ◮ Maximizes the probability of the corpus
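The two MLE formulas reduce to simple counting over a tagged corpus. A sketch over a made-up two-sentence corpus (the sentences and tags are illustrative, not from the lecture):

```python
# MLE estimates following the formulas above:
#   P(t_n | t_{n-1}) = C(t_{n-1} t_n) / C(t_{n-1})
#   P(w_n | t_n)     = C(w_n, t_n) / C(t_n)
# The toy corpus is made-up.
from collections import Counter

corpus = [[("the", "DT"), ("man", "NN"), ("bought", "VBD"),
           ("a", "DT"), ("book", "NN")],
          [("the", "DT"), ("book", "NN"), ("sold", "VBD")]]

tag_count, bigram_count, word_tag_count = Counter(), Counter(), Counter()

for sentence in corpus:
    prev = "S"                 # sentence-start pseudo-tag
    tag_count["S"] += 1
    for word, tag in sentence:
        tag_count[tag] += 1
        bigram_count[(prev, tag)] += 1
        word_tag_count[(word, tag)] += 1
        prev = tag

def p_trans(tag, prev):        # P_MLE(tag | prev)
    return bigram_count[(prev, tag)] / tag_count[prev]

def p_lex(word, tag):          # P_MLE(word | tag)
    return word_tag_count[(word, tag)] / tag_count[tag]

print(p_trans("NN", "DT"))  # C(DT NN) / C(DT) = 3/3 = 1.0
print(p_lex("book", "NN"))  # C(book, NN) / C(NN) = 2/3
```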

  14. Maximum Likelihood Estimation (2)
      ◮ Not a great estimator: zero probabilities for unseen events make them impossible
      ◮ Need a smoothing or discounting method to give minimal probabilities to unseen events
      ◮ Simplest possibility: learn from hapax legomena (words that appear only once)
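The slide calls for smoothing without fixing a particular method; as one illustration (my choice, not the lecture's), add-one (Laplace) smoothing of P(word | tag) gives every unseen (word, tag) pair a small non-zero probability. The counts and vocabulary size below are made-up:

```python
# Add-one (Laplace) smoothing of the lexical probability, an assumed
# example method: add 1 to every count and grow the denominator by the
# vocabulary size so the distribution still sums to one.
def p_lex_smoothed(word, tag, word_tag_count, tag_count, vocab_size):
    return (word_tag_count.get((word, tag), 0) + 1) / (tag_count[tag] + vocab_size)

word_tag_count = {("book", "NN"): 2}   # toy counts
tag_count = {"NN": 3}
vocab_size = 10

print(p_lex_smoothed("book", "NN", word_tag_count, tag_count, vocab_size))   # (2+1)/(3+10)
print(p_lex_smoothed("arrow", "NN", word_tag_count, tag_count, vocab_size))  # (0+1)/(3+10), non-zero
```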

  15. Motivating Hidden Markov Models
      Thinking back to Markov models: we are now given a sequence of words and want to find the POS tags.
      ◮ The underlying sequence of POS tags can be thought of as generating the words in the sentence
      ◮ Each state in the Markov model can be a POS tag
      ◮ We don't know the correct state sequence (hence: Hidden Markov Model (HMM))
      This requires an additional emission matrix linking words with POS tags (cf. P(arrow | NN)).

  16. Example HMM
      Assume DET, N, and VB as hidden states, with this transition matrix (A):

                DET    N     VB
          DET   0.01  0.89  0.10
          N     0.30  0.20  0.50
          VB    0.67  0.23  0.10

      ... emission matrix (B):

                dogs  bit   the   chased  a     these  cats  ...
          DET   0.0   0.0   0.33  0.0     0.33  0.33   0.0   ...
          N     0.2   0.1   0.0   0.0     0.0   0.0    0.15  ...
          VB    0.1   0.6   0.0   0.3     0.0   0.0    0.0   ...

      ... and initial probability matrix (π):

          DET   0.7
          N     0.2
          VB    0.1

  17. Using the Example HMM
      In order to generate words, we:
      1. Choose a tag/state from π
      2. Choose an emitted word from the relevant row of B
      3. Choose a transition from the relevant row of A
      4. Repeat #2 & #3 until we hit a stopping point
          ◮ keeping track of probabilities as we go along
      We could generate all possibilities this way and find the most probable sequence
      ◮ Want a more efficient way of finding the most probable sequence
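The four generation steps can be sketched directly with the example HMM from the previous slide. Since the slide's B matrix has trailing ‘‘...’’ columns, the emission rows here cover only the words shown (weights are used proportionally), and a fixed output length stands in for the unspecified stopping point:

```python
# Sampling from the example HMM: choose a start state from pi, then
# alternately emit a word from B and transition via A. Rows of B list
# only the words shown on the slide; the fixed length is an assumed
# stand-in for a stopping criterion.
import random

pi = {"DET": 0.7, "N": 0.2, "VB": 0.1}
A = {"DET": {"DET": 0.01, "N": 0.89, "VB": 0.10},
     "N":   {"DET": 0.30, "N": 0.20, "VB": 0.50},
     "VB":  {"DET": 0.67, "N": 0.23, "VB": 0.10}}
B = {"DET": {"the": 0.33, "a": 0.33, "these": 0.33},
     "N":   {"dogs": 0.2, "bit": 0.1, "cats": 0.15},
     "VB":  {"dogs": 0.1, "bit": 0.6, "chased": 0.3}}

def pick(dist, rng):
    # Sample a key proportionally to its weight (choices normalizes).
    return rng.choices(list(dist), weights=list(dist.values()))[0]

def generate(length, seed=0):
    rng = random.Random(seed)
    state = pick(pi, rng)                         # 1. initial state from pi
    out = []
    for _ in range(length):
        out.append((state, pick(B[state], rng)))  # 2. emit word from B
        state = pick(A[state], rng)               # 3. transition via A
    return out                                    # 4. repeat until length

print(generate(4, seed=0))  # list of (tag, word) pairs
```

Enumerating all state sequences this way to find the most probable one is exponential in sentence length; the efficient alternative the slide points toward is dynamic programming (the Viterbi algorithm).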
