

SLIDE 1

HMM Can Find Pretty Good POS Taggers (When Given a Good Start)

Yoav Goldberg, Meni Adler, Michael Elhadad
ACL 2008, Columbus, Ohio


SLIDE 2

Unsupervised POS Tagging

(If you don’t know what POS Tagging is, please leave the room)

Input:
- Lots of (unannotated) text
- A lexicon mapping words to their possible POS tags
  - Some words may be missing
  - Analyses for a word are not ordered

Output: A POS tagger

Example text: "fruit flies like a banana", "time flies like an arrow"

Example lexicon:
  a: DET    an: DT    arrow: NN    banana: NN    flies: NNS, VB
  fruit: NN, ADJ    like: VB, IN, RB, JJ    time: VB, NN


SLIDE 3

Previous Work – 10-15 years ago

Early unsupervised POS tagging:
- HMM: early works trained HMM models with EM, with pretty decent results (Merialdo 1994; Elworthy 1994; ...)
- Transformation Based Learning: unsupervised TBL (Brill, 1995) also seemed to work well

Alas, it turns out they were "cheating":
- The HMM works used "pruned" dictionaries: only probable POS tags are suggested
- Brill assumed knowledge of the most probable tag per word

This kind of information is based on corpus counts!


SLIDE 4

Previous Work – 10-15 years ago

Initial conditions: Elworthy (1994) shows that a good initialization of the parameters prior to EM boosts results... but doesn't say how it can be obtained automatically.

Context-free approximation from raw data: Moshe Levinger proposes a way to estimate p(tag|word) from raw data, and applies it to Hebrew (Levinger et al., CL, 1995).


SLIDE 5

Previous Work – Right About Now

EM/HMMs are out: "Why doesn't EM find good HMM POS-taggers?" (Mark Johnson, EMNLP-2007)

New and complicated methods are in:
- "Contrastive estimation: training log-linear models on unlabeled data" (Smith and Eisner, ACL-2005)
- "A Fully Bayesian Approach to Unsupervised Part-of-Speech Tagging" (Goldwater and Griffiths, ACL-2007)
- "A Bayesian LDA-based model for semi-supervised part-of-speech tagging" (Toutanova and Johnson, NIPS-2007)


SLIDE 6

Objective: Build a Hebrew POS-Tagger

Hebrew has rich morphology and a huge tagset (~3k tags).

Building a Hebrew tagger:
- No large annotated corpora
- A fairly comprehensive lexicon
- An unsupervised approach is called for...
- ...but current work on English is unrealistic for us


SLIDE 7

Our Take on Unsupervised POS Tagging

Grandma knows best! ...back to EM-trained HMMs: we just need to find the right initial parameters!

Finding initial parameters:
- An improved version of the Levinger algorithm
- A novel iterative context-based estimation method
- Much simpler (computationally) than recent methods


SLIDE 8

Pipeline:

Raw Text + Lexicon (+ a possible-tags guesser for unknown words)
  → Initial Parameters Estimation: Pinit(t|w)   [this work]
  → EM-trained 2nd-order HMM: P(w|t), P(ti|ti−1, ti−2)

For Hebrew: earlier today
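One plausible way to hand the learned Pinit(t|w) to the HMM's emission parameters P(w|t) is Bayes inversion against raw-corpus word frequencies; the sketch below is our own illustration, not the paper's exact procedure (all names are ours):

```python
from collections import defaultdict

def emission_from_pinit(p_init, word_counts):
    """Invert context-free estimates p_init(t|w) into emission
    probabilities p(w|t) via p(w|t) ∝ p(t|w) * p(w)."""
    n = sum(word_counts.values())
    joint = defaultdict(dict)       # unnormalized p(w, t) per tag
    totals = defaultdict(float)     # total mass assigned to each tag
    for w, tags in p_init.items():
        pw = word_counts.get(w, 0) / n
        for t, p in tags.items():
            joint[t][w] = p * pw
            totals[t] += p * pw
    # normalize within each tag so that sum_w p(w|t) = 1
    return {t: {w: v / totals[t] for w, v in ws.items()}
            for t, ws in joint.items()}
```

These p(w|t) values, together with initial transition estimates p(ti|ti−1, ti−2), would seed the EM (Baum-Welch) training loop.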

SLIDE 9

Outline

- We can build a good tagger using EM-HMM if we supply good initial conditions
- It works in Hebrew and in English
- Finding initial conditions: morphology based, context based
- Experiments: Hebrew, English


SLIDE 10

Morphology based p(t|w)

Levinger's "Similar Words" Algorithm: a language-specific algorithm for context-free estimation of p(t|w).

Main intuitions:
- Morphological variations of a word have similar distributions
- While a form may be ambiguous, some of its inflections aren't
⇒ Estimate based on inflected forms

Example: the Hebrew ילדה is ambiguous between a noun (girl) and a verb (gave birth). Estimate p(Noun|ילדה) by counting הילדה (the girl) and הילדות (the girls); estimate p(Verb|ילדה) by counting תלד (she will give birth) and ילדו (they gave birth).

(Would probably not work that well for English.)
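The counting scheme can be illustrated with a toy sketch (transliterated forms; the hand-built variant map and add-one smoothing are our simplifications, not part of the original algorithm):

```python
def levinger_estimate(variants_by_tag, corpus_counts):
    """Toy sketch of the 'similar words' intuition: score each possible
    tag of an ambiguous word by the corpus counts of inflected variants
    that are unambiguous for that tag, then normalize.
    variants_by_tag: {tag: [unambiguous variant forms]} (hand-built here).
    """
    scores = {t: 1 + sum(corpus_counts.get(v, 0) for v in vs)  # add-1 smoothing
              for t, vs in variants_by_tag.items()}
    z = sum(scores.values())
    return {t: s / z for t, s in scores.items()}
```

For the ילדה example above, the Noun variants would be the "the girl"/"the girls" forms and the Verb variants the "she will give birth"/"they gave birth" forms.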


SLIDE 11

Context Based p(t|w)

The intuition: distributional similarity. Words in similar contexts have similar POS distributions (cf. Harris' distributional hypothesis, Schütze's POS induction, etc.).

Previous work asked: what are the possible tags for a given word?
This work: the possible tags are known; let's rank them. In other words, we have a guess at p(t|w), and use context to improve it.


SLIDE 12

Context Based p(t|w)

The Algorithm:
Start with an initial p(t|w).
(1) Using p(t|w), estimate p(t|c):  p̂(t|c) = (1/Z) Σ_{w∈W} p(t|w) p(w|c)
(2) Using p(t|c), estimate p(t|w):  p̂(t|w) = (1/Z) Σ_{c∈RELC} p(t|c) p(c|w) allow(t,w)
(3) Repeat.
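A direct rendering of the three steps might look like this (a sketch under our own representation assumptions: tokens as (word, context) pairs, with p(w|c) and p(c|w) estimated from raw counts; names are ours):

```python
from collections import defaultdict

def iterate_ptw(p_tw, tokens, allow, n_iter=10):
    """Alternate the two estimation steps from the slide.
    p_tw: {word: {tag: prob}} initial guess; tokens: [(word, context)];
    allow(t, w): lexicon filter for word/tag pairs."""
    cw = defaultdict(lambda: defaultdict(int))
    wc = defaultdict(lambda: defaultdict(int))
    for w, c in tokens:
        cw[c][w] += 1
        wc[w][c] += 1

    def norm(d):
        z = sum(d.values())
        return {k: v / z for k, v in d.items()} if z else d

    p_wc = {c: norm(dict(ws)) for c, ws in cw.items()}   # p(w|c)
    p_cw = {w: norm(dict(cs)) for w, cs in wc.items()}   # p(c|w)

    for _ in range(n_iter):
        # (1) p(t|c) = (1/Z) sum_w p(t|w) p(w|c)
        p_tc = {}
        for c, ws in p_wc.items():
            acc = defaultdict(float)
            for w, pwc in ws.items():
                for t, ptw in p_tw.get(w, {}).items():
                    acc[t] += ptw * pwc
            p_tc[c] = norm(acc)
        # (2) p(t|w) = (1/Z) sum_c p(t|c) p(c|w) allow(t, w)
        new = {}
        for w, cs in p_cw.items():
            acc = defaultdict(float)
            for c, pcw in cs.items():
                for t, ptc in p_tc.get(c, {}).items():
                    if allow(t, w):
                        acc[t] += ptc * pcw
            new[w] = norm(acc)
        p_tw = new
    return p_tw
```

On a toy corpus where "kid" shares the context "the ___ run" with unambiguous nouns, the iteration shifts p(NN|kid) upward, which is exactly the ranking effect the slide describes.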


SLIDE 13

Context Based p(t|w)

The Algorithm:
Start with an initial p(t|w).
(1) Using p(t|w), estimate p(t|c):  p̂(t|c) = (1/Z) Σ_{w∈W} p(t|w) p(w|c)
(2) Using p(t|c), estimate p(t|w):  p̂(t|w) = (1/Z) Σ_{c∈RELC} p(t|c) p(c|w) allow(t,w)
(3) Repeat.

Example for step (2), p(VB | kid):
p(VB|kid) ∝ p(VB|the,___,run) p(the,___,run|kid)
          + p(VB|n't,___,me) p(n't,___,me|kid)
          + p(VB|I,___,you) p(I,___,you|kid)
          + ...

Follow the lexicon; ignore contexts with too many possible tags.


SLIDE 14

Context Based p(t|w)

The Algorithm:
Start with an initial p(t|w).
(1) Using p(t|w), estimate p(t|c):  p̂(t|c) = (1/Z) Σ_{w∈W} p(t|w) p(w|c)
(2) Using p(t|c), estimate p(t|w):  p̂(t|w) = (1/Z) Σ_{c∈RELC} p(t|c) p(c|w) allow(t,w)
(3) Repeat.

Example for step (1), p(NN | the,___,run):
p(NN|the,___,run) ∝ p(NN|boy) p(boy|the,___,run)
                  + p(NN|fox) p(fox|the,___,run)
                  + p(NN|nice) p(nice|the,___,run)
                  + ...



SLIDE 16

Evaluation

Evaluating the learned p(t|w): how well does p(t|w) perform as a context-free tagger?
  ContextFreeTagger: tag(w) = argmax_t p(t|w)

The REAL evaluation: how well does an EM-HMM tagger initialized with the learned p(t|w) perform?
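The context-free tagger is a one-liner over the learned estimates; a sketch (the fallback tag for out-of-lexicon words is our own arbitrary choice):

```python
def context_free_tag(p_tw, words, fallback="NN"):
    """tag(w) = argmax_t p(t|w), with a fallback for unseen words."""
    return [max(p_tw[w], key=p_tw[w].get) if w in p_tw else fallback
            for w in words]
```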


SLIDE 17

Hebrew Experiments

How good are the learned p(t|w)?

Setup: each candidate estimate (the uniform-lexicon baseline PUnif(t|w), Levinger's morphology-based algorithm, and the context-based p(t|w)/p(t|c) iteration) feeds a context-free tagger, tag(w) = argmax_t p(t|w). Accuracy is reported for full morphological analysis (FullMorph) and for POS + segmentation (POS+Seg).


SLIDE 18

Hebrew Experiments

How good are the learned p(t|w)?

Context-Free Tagger    FullMorph    POS+Seg
Baseline                  63.8        71.9
Context                   75.4        82.6
Morphology                76.4        83.1
Morph+Cont                79.0        85.5



SLIDE 22

Hebrew Experiments

EM-HMM Tagger

                       FullMorph    POS+Seg
Context-Free Tagger
  Baseline                63.8        71.9
  Context                 75.4        82.6
  Morphology              76.4        83.1
  Morph+Cont              79.0        85.5
EM-HMM Tagger
  Baseline                85.5        89.8
  Context                 85.3        89.6
  Morphology              87.7        91.6
  Morph+Cont              88.0        92.0



SLIDE 27

EM-HMM Produced a Pretty Good POS Tagger for Hebrew


SLIDE 28

How about English?


SLIDE 29

Unsupervised English POS Tagging

English is different from Hebrew:
- A much smaller tagset: recent supervised work uses 46 tags (WSJ); recent unsupervised work uses 17 tags (a subset)
- The lexicon is derived from the corpus
- We don't have as rich a morphology to rely on

So we rely more on linear context... but we learned from Hebrew that morphology is important to EM-HMM.


SLIDE 30

Unsupervised English POS Tagging

Morphology p(t|w): data driven, based on suffixation and function words.

Morphological "context" templates:
- suff=S: the word has suffix S (e.g. suff=ing)
- L+suff=W,S: the word appears after word W, with suffix S (e.g. L+suff=have,ed)
- R+suff=S,W: the word appears before word W, with suffix S (e.g. R+suff=ing,to)
- wsuff=S1,S2: the word has suffix S1, and the same stem is seen with S2 (e.g. wsuff=ε,s)
- suffs=SG: the word's stem appears with the group SG of suffixes (e.g. suffs=ed,ing,s)

Context p(t|w) templates:
- LL=w−2,w−1: the 2 preceding words
- RR=w+1,w+2: the 2 following words
- LR=w−1,w+1: the 2 surrounding words

Morph+Cont p(t|w): the union of the two groups.

All p(t|w) estimates are obtained from the Context algorithm, using different context templates.
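To make the templates concrete, here is a toy extractor for three of them (the suffix list and exact feature spellings are illustrative guesses, not the paper's actual inventory):

```python
def morph_contexts(tokens, i, suffixes=("ing", "ed", "s")):
    """Emit suff=S, L+suff=W,S and R+suff=S,W features for token i,
    using a tiny illustrative suffix list."""
    w = tokens[i]
    feats = []
    for s in suffixes:
        if w.endswith(s) and len(w) > len(s):
            feats.append(f"suff={s}")
            if i > 0:                       # word to the left exists
                feats.append(f"L+suff={tokens[i-1]},{s}")
            if i + 1 < len(tokens):         # word to the right exists
                feats.append(f"R+suff={s},{tokens[i+1]}")
    return feats
```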


SLIDE 31

Unsupervised English POS Tagging

Following Smith and Eisner (2005), recent works use a 17-tag tagset: ADJ ADV CONJ DET ENDPUNC INPUNC LPUNC RPUNC N POS PRT PREP TO V VBG VBN WH

In general, English does not allow V-V transitions, but this tagset does, as it includes modals among the verbs.


SLIDE 32

Unsupervised English POS Tagging

Following Smith and Eisner (2005), recent works use a 17-tag tagset. We help the p(t|t−1, t−2) estimation by introducing a 19-tag tagset: the same tags, plus MD and BE.


SLIDE 33

Unsupervised English POS Tagging

We also test on the complete WSJ (+BE) tagset.


SLIDE 34

Unsupervised English POS Tagging

Results – Full Lexicon (49206 words)

(Setting as in Toutanova and Johnson 2007)

17 tags:
             CF-Tag   EM-HMM
Baseline      81.7     88.7
Context       90.1     92.9
Morphology    82.2     88.6
Morph+Cont    89.9     93.3

Initializations improve over the baseline. Morphology is much weaker than context, but their combination is superior.


SLIDE 35

Unsupervised English POS Tagging

Results – Full Lexicon (49206 words)

(Setting as in Toutanova and Johnson 2007)

19 tags:
             CF-Tag   EM-HMM
Baseline      79.9     91.0
Context       88.4     93.7
Morphology    80.5     89.2
Morph+Cont    88.0     93.8

Compared with 17 tags, context-free tagging decreases a little, while EM-HMM tagging improves considerably.


SLIDE 36

Unsupervised English POS Tagging

Results – Full Lexicon (49206 words)

(Setting as in Toutanova and Johnson 2007)

WSJ tags:
             CF-Tag   EM-HMM
Baseline      76.7     88.3
Context       85.5     91.2
Morphology    74.8     88.8
Morph+Cont    85.9     91.4

Naturally, not as good as with the smaller tagsets, but a pretty decent result.


SLIDE 37

Unsupervised English POS Tagging

Results – Small Lexicon

Following recent work, we also experimented with smaller lexicons (2141 and 1249 words).

Unknown-word guessing:
- During initial p(t|w) estimation: allow all open-class tags for unknown words.
- During EM-HMM estimation: use a simple ambiguity-class guesser. The word gets all open-class tags that appear with its suffix in the lexicon, where the suffix is the longest one (up to 3 characters) that also appears among the top-100 suffixes in the lexicon.
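The suffix-based ambiguity-class guesser described above might be sketched like this (our own rendering; the data structures and tie-breaking details are assumptions):

```python
from collections import Counter

def build_suffix_guesser(lexicon, open_class_tags, top_n=100, max_len=3):
    """Sketch of the slide's ambiguity-class guesser: an unknown word is
    assigned all open-class tags that occur in the lexicon with its
    longest suffix (up to max_len chars) among the top_n suffixes."""
    counts = Counter()
    for word in lexicon:
        for k in range(1, max_len + 1):
            if len(word) > k:
                counts[word[-k:]] += 1
    top = {s for s, _ in counts.most_common(top_n)}
    suffix_tags = {}
    for word, tags in lexicon.items():
        for k in range(1, max_len + 1):
            if len(word) > k and word[-k:] in top:
                suffix_tags.setdefault(word[-k:], set()).update(
                    t for t in tags if t in open_class_tags)

    def guess(word):
        for k in range(max_len, 0, -1):   # longest matching suffix first
            tags = suffix_tags.get(word[-k:])
            if tags:
                return set(tags)
        return set(open_class_tags)       # no known suffix: all open-class
    return guess
```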


SLIDE 38

Unsupervised English POS Tagging

Results – Small Lexicon (1249 words)

(Setting as in Toutanova and Johnson 2007)

17 tags:
             CF-Tag   EM-HMM
Baseline      62.5     79.6
Context       78.3     85.8
Morphology    69.1     81.7
Morph+Cont    81.1     86.4

19 tags:
             CF-Tag   EM-HMM
Baseline      60.7     84.7
Context       76.3     86.9
Morphology    67.5     87.1
Morph+Cont    79.2     87.4

WSJ tags:
             CF-Tag   EM-HMM
Baseline      55.7     *
Context       70.1     82.2
Morphology    61.9     80.3
Morph+Cont    72.4     83.3


SLIDE 39

Unsupervised English POS Tagging

Results – Small Lexicon (1249 words)

(Setting as in Toutanova and Johnson 2007)

Overall, consistent trends. As expected, results are much lower than with the full lexicon. Morphology estimation is much more important in this setting.


SLIDE 40

Unsupervised English POS Tagging

Results – Comparison

(Setting as in Toutanova and Johnson 2007)

Systems:
- InitEM-HMM: this work (19 tags, Morph+Cont)
- LDA(+AC), PLSA+AC: Toutanova and Johnson 2007 (AC: ambiguity-class model)
- CE+spl: Smith and Eisner 2005
- BHMM: Goldwater and Griffiths 2007

Lexicon   InitEM-HMM   LDA    LDA+AC   PLSA+AC   CE+spl   BHMM
Full         93.8      93.4    93.4     89.7      88.7    87.3
2141         89.4      87.4    91.2     87.8      79.5    79.6
1249         87.4      85.0    89.7     85.9      78.4    71.0


SLIDE 41

Unsupervised English POS Tagging

Results – Comparison

(Setting as in Toutanova and Johnson 2007)

Best results for the full-lexicon case; second best for the small lexicons (the better model there has a much stronger unknown-word guesser).


SLIDE 42

Unsupervised English POS Tagging

Results – “Realistic” Lexicon

Model: Init-HMM, Morph+Cont
Lexicon: from sections 0-18 of the WSJ
Training data: the complete, unannotated WSJ
Test: sections 22-24

19 tags: 92.85%
46 tags: 91.30%

(Highest that we know of)


SLIDE 43

To Conclude

Take-home message:
- EM-HMM can produce pretty good unsupervised POS taggers...
- ...but it needs a good starting point...
- ...which we show how to estimate.

Results:
- A state-of-the-art tagger for Hebrew
- A state-of-the-art unsupervised tagger for English
- Considerably raising the EM-HMM baseline


SLIDE 44

Now what?

Future work:
- A better unknown-word guesser for English
- Different learning approaches on top of our initial parameters: Bayesian, prototype-based learning
- Applying the Context algorithm to other problems


SLIDE 45

Questions?

(Prague, 2007)