SLIDE 1

Intro NLP Tools

Sporleder & Rehbein

WS 09/10

PS Domain Adaptation, October 2009

SLIDE 2

Approaches to POS tagging

rule-based

◮ look up words in the lexicon to get a list of potential POS tags
◮ apply hand-written rules to select the best candidate tag

probabilistic models

◮ for a string of words W = w1, w2, w3, ..., wn, find the string of POS tags T = t1, t2, t3, ..., tn which maximises P(T|W) (⇒ the probability of the tag sequence T given the word sequence W)
◮ mostly based on (first- or second-order) Markov models: estimate transition probabilities ⇒ how probable is it to see POS tag Z after having seen tag Y at position x−1 and tag X at position x−2?

Basic idea of an n-gram tagger: the current tag depends only on a fixed number of preceding tags, e.g. p(tn | tn−2, tn−1) for a trigram model.
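To make the n-gram idea concrete, here is a minimal sketch of how a second-order Markov tagger could score one candidate tag sequence. The emission term p(w|t) and all probability values below are illustrative assumptions, not part of the slides.

```python
import math

# Toy probability tables (illustrative values, not estimated from any real corpus).
transition = {("DET", "ADJ", "N"): 0.6, ("DET", "ADJ", "ADJ"): 0.2}  # p(t_n | t_{n-2}, t_{n-1})
emission = {("the", "DET"): 0.9, ("white", "ADJ"): 0.4, ("house", "N"): 0.3}  # p(w_n | t_n)

def sequence_log_prob(words, tags):
    """Score one candidate tag sequence T for the word sequence W:
    log P(W, T) ~= sum_n [ log p(t_n | t_{n-2}, t_{n-1}) + log p(w_n | t_n) ]."""
    logp = 0.0
    padded = ["<s>", "<s>"] + list(tags)      # two dummy start tags for the trigram context
    for n, (word, tag) in enumerate(zip(words, tags)):
        trans = transition.get((padded[n], padded[n + 1], tag), 1e-6)  # tiny floor instead of zero
        emit = emission.get((word, tag), 1e-6)
        logp += math.log(trans) + math.log(emit)
    return logp

# A real tagger would search over all candidate tag sequences (e.g. with Viterbi)
# and return the argmax; here we only score a single candidate.
print(sequence_log_prob(["the", "white", "house"], ["DET", "ADJ", "N"]))
```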

SLIDE 3

How to compute transition probabilities?

How do we get p(tn | tn−2, tn−1)? There are many ways to do it, e.g. Maximum Likelihood Estimation (MLE):

◮ p(tn | tn−2, tn−1) = F(tn−2 tn−1 tn) / F(tn−2 tn−1)
◮ e.g. F(the/DET white/ADJ house/N) / F(the/DET white/ADJ)

Problems:

◮ zero probabilities (might be ungrammatical or just rare)
◮ unreliable counts for rare events
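A minimal sketch of the MLE estimate above, computing F(tn−2 tn−1 tn) / F(tn−2 tn−1) from a toy list of tag sequences; the corpus and the start-padding convention are assumptions made purely for illustration.

```python
from collections import Counter

def mle_transitions(tagged_sents):
    """Estimate p(t_n | t_{n-2}, t_{n-1}) = F(t_{n-2} t_{n-1} t_n) / F(t_{n-2} t_{n-1})
    from a list of tag sequences, e.g. [["DET", "ADJ", "N"], ...]."""
    trigrams, bigrams = Counter(), Counter()
    for tags in tagged_sents:
        padded = ["<s>", "<s>"] + tags
        for i in range(2, len(padded)):
            trigrams[tuple(padded[i - 2:i + 1])] += 1
            bigrams[tuple(padded[i - 2:i])] += 1
    return {tri: count / bigrams[tri[:2]] for tri, count in trigrams.items()}

# Toy corpus: unseen trigrams simply get probability 0 -- exactly the sparse-data
# problem mentioned above, which smoothing (or TreeTagger's decision trees) addresses.
probs = mle_transitions([["DET", "ADJ", "N", "V"], ["DET", "N", "V"]])
print(probs[("DET", "ADJ", "N")])   # 1.0 in this tiny sample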

SLIDE 4

Treetagger

probabilistic
uses decision trees to estimate transition probabilities ⇒ avoids sparse-data problems
How does it work?

◮ a decision tree automatically determines the context size used for estimating transition probabilities
◮ context: unigrams, bigrams, trigrams, as well as negations of them (e.g. tn−1 = ADJ and tn−2 = ADJ and tn−3 = DET)
◮ the probability of an n-gram is determined by following the corresponding path through the tree until a leaf is reached
◮ improves on sparse data, avoids zero frequencies
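A minimal sketch of the path-following idea: internal nodes test properties of the tag context and leaves store a distribution over the next tag. The hand-built toy tree below is an illustration, not TreeTagger's actual data structures or learned tree.

```python
# Toy decision tree: each internal node tests the tag context, each leaf stores
# a probability distribution over the next tag (values invented for illustration).
tree = {
    "test": lambda ctx: ctx[-1] == "ADJ",          # is the previous tag ADJ?
    "yes": {
        "test": lambda ctx: ctx[-2] == "DET",      # and the tag before that a DET?
        "yes": {"leaf": {"N": 0.7, "ADJ": 0.2, "V": 0.1}},
        "no":  {"leaf": {"N": 0.5, "ADJ": 0.4, "V": 0.1}},
    },
    "no": {"leaf": {"N": 0.3, "V": 0.4, "ADJ": 0.3}},
}

def transition_prob(node, context, tag):
    """Follow the path determined by the context until a leaf is reached,
    then read off p(tag | context) from the leaf distribution."""
    while "leaf" not in node:
        node = node["yes"] if node["test"](context) else node["no"]
    return node["leaf"].get(tag, 0.0)

print(transition_prob(tree, ["DET", "ADJ"], "N"))   # 0.7
```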

SLIDE 5

Treetagger


SLIDE 6

Stanford log-linear POS tagger

ML-based approach based on maximum entropy models
Idea: improve the tagger by extending the knowledge sources, with a focus on unknown words
Include linguistically motivated, non-local features:

◮ more extensive treatment of capitalization for unknown words
◮ features for disambiguation of the tense form of verbs
◮ features for disambiguating particles from prepositions and adverbs

Advantage of Maxent: does not assume independence between predictors
Choose the probability distribution p that has the highest entropy out of those distributions that satisfy a certain set of constraints
Constraints ⇒ statistics from the training data (not restricted to n-gram sequences)

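A minimal sketch of the log-linear idea behind such a tagger: the probability of a tag is a normalised exponential of a weighted feature sum, so overlapping, non-independent features are unproblematic. The feature templates and weights below are invented for illustration and are not the Stanford tagger's actual feature set.

```python
import math

TAGS = ["N", "V", "ADJ"]

def features(word, prev_tag, tag):
    """Illustrative overlapping features; a real tagger uses many more."""
    return {
        f"word={word},tag={tag}": 1.0,
        f"prev={prev_tag},tag={tag}": 1.0,
        f"capitalised,tag={tag}": 1.0 if word[0].isupper() else 0.0,
        f"suffix=ing,tag={tag}": 1.0 if word.endswith("ing") else 0.0,
    }

def tag_distribution(word, prev_tag, weights):
    """p(tag | context) = exp(w . f(context, tag)) / Z  -- a conditional MaxEnt model."""
    scores = {t: sum(weights.get(name, 0.0) * value
                     for name, value in features(word, prev_tag, t).items())
              for t in TAGS}
    z = sum(math.exp(s) for s in scores.values())
    return {t: math.exp(s) / z for t, s in scores.items()}

# Hypothetical weights (in practice learned by maximising conditional likelihood).
weights = {"suffix=ing,tag=V": 2.0, "prev=DET,tag=N": 1.5}
print(tag_distribution("running", "DET", weights))
```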

SLIDE 7

C&C Taggers

Based on maximum entropy models
Highly efficient! State-of-the-art results:

◮ deleting the correction feature for GIS (Generalised Iterative Scaling)
◮ smoothing of the parameters of the ME model: replacing the simple frequency cutoff by a Gaussian prior (a form of maximum a posteriori estimation rather than maximum likelihood estimation)
  ⋆ penalises models that have very large positive or negative weights
  ⋆ allows low-frequency features to be used without overfitting
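A small sketch of what the Gaussian prior does to the training objective: instead of maximising the log-likelihood alone, one maximises the log-likelihood plus a log-prior that penalises large weights (MAP rather than MLE). The numbers and the single shared sigma below are assumptions for illustration, not C&C's implementation.

```python
# Sketch of the effect of a Gaussian prior on MaxEnt training (not C&C's actual code).
def penalised_objective(log_likelihood, weights, sigma=1.0):
    """log P(data | w) + log P(w), with P(w) a zero-mean Gaussian over each weight."""
    gaussian_log_prior = -sum(w * w for w in weights) / (2 * sigma ** 2)
    return log_likelihood + gaussian_log_prior

# Large weights are penalised, so rare features can be kept without overfitting.
print(penalised_objective(-120.5, [0.3, -4.0, 1.2], sigma=1.0))
```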

SLIDE 8

The Stanford Parser

Factored model: compute semantic (lexical dependency) and syntactic (PCFG) structures using separate models, then combine the results in a new, generative model:

P(T, D) = P(T) P(D)    (1)

Advantages:

◮ conceptual simplicity
◮ each model can be improved separately
◮ an effective A* parsing algorithm (enables efficient, exact inference)
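Equation (1) implies that, in log space, a candidate analysis is scored by simply adding the two sub-model scores. A minimal sketch with made-up scores for three hypothetical parses of the same sentence:

```python
import math

# Hypothetical sub-model log-scores for three candidate parses (values invented).
candidates = {
    "parse_a": {"log_p_T": math.log(2e-8), "log_p_D": math.log(5e-7)},
    "parse_b": {"log_p_T": math.log(6e-8), "log_p_D": math.log(9e-8)},
    "parse_c": {"log_p_T": math.log(1e-8), "log_p_D": math.log(4e-6)},
}

# P(T, D) = P(T) * P(D): in log space the factored score is just the sum,
# and the parser returns the candidate that maximises it.
best = max(candidates, key=lambda k: candidates[k]["log_p_T"] + candidates[k]["log_p_D"])
print(best)
```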

SLIDE 9

The Stanford Parser


SLIDE 10

The Stanford Parser

P(T): use more accurate PCFGs
Annotate tree nodes with contextual markers (weakening the PCFG independence assumptions)

◮ PCFG-PA: Parent encoding
  (S (NP (N Man)) (VP (V bites) (NP (N dog))))
  becomes (S (NP^S (N Man)) (VP^S (V bites) (NP^VP (N dog))))

◮ PCFG-LING: selective parent splitting, order-2 rule markovisation, and linguistically-derived feature splits

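A minimal sketch of parent encoding, with trees represented as nested Python tuples; the representation is an assumption made for illustration, not the Stanford Parser's data structures.

```python
def parent_annotate(tree, parent=None):
    """tree = (label, child, ...) for phrases, a plain string for words.
    Phrasal labels get their parent's label appended (NP -> NP^S); preterminals
    (nodes whose only child is a word) are left unannotated, as in the slide example."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    is_preterminal = len(children) == 1 and isinstance(children[0], str)
    new_label = label if (parent is None or is_preterminal) else f"{label}^{parent}"
    return (new_label, *(parent_annotate(child, label) for child in children))

tree = ("S", ("NP", ("N", "Man")), ("VP", ("V", "bites"), ("NP", ("N", "dog"))))
print(parent_annotate(tree))
# ('S', ('NP^S', ('N', 'Man')), ('VP^S', ('V', 'bites'), ('NP^VP', ('N', 'dog'))))
```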

SLIDE 11

The Stanford Parser

P(D): lexical dependency models over tagged words

1. generate head of constituent
2. generate right dependents until a STOP token is generated
3. generate left dependents until a STOP token is generated

word-word dependency models are sparse ⇒ smoothing needed

◮ DEP-BASIC: generate a dependent conditioned on the head and direction → can capture bilexical selectional preferences, such as the affinity between payrolls and fell
◮ DEP-VAL: condition not only on direction, but also on distance and valence

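A minimal sketch of the head-outward generative process described above. The toy probability tables are invented for illustration; in the real model they are estimated from a treebank.

```python
import random

# Toy conditional distributions p(dependent | head, direction); "STOP" ends generation.
p_dep = {
    ("fell", "left"):  {"payrolls": 0.6, "STOP": 0.4},
    ("fell", "right"): {"sharply": 0.3, "STOP": 0.7},
    ("payrolls", "left"):  {"Factory": 0.5, "STOP": 0.5},
    ("payrolls", "right"): {"STOP": 1.0},
    ("sharply", "left"): {"STOP": 1.0}, ("sharply", "right"): {"STOP": 1.0},
    ("Factory", "left"): {"STOP": 1.0}, ("Factory", "right"): {"STOP": 1.0},
}

def generate_dependents(head, direction):
    """Head-outward generation: keep sampling dependents in one direction until STOP."""
    deps = []
    while True:
        dist = p_dep[(head, direction)]
        dep = random.choices(list(dist), weights=dist.values())[0]
        if dep == "STOP":
            return deps
        deps.append(dep)

print(generate_dependents("fell", "left"))   # e.g. ['payrolls'] or []
```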

SLIDE 12

Dependency Tree

Example sentence with its dependency labels:

Namhafte (ATTR) Verstärkungen (OBJA) hingegen (ADV) wird es (SUBJ) für (PP) die (DET) nächste (ATTR) Spielzeit (PN) nicht (ADV) geben (AUX) .

“However, there won’t be considerable reinforcements for the next playing time”


SLIDE 13

The Stanford Parser

1. Extract the PCFG sub-model and set up the PCFG parser
2. Use the PCFG parser to find outside scores αPCFG(e) for each edge
3. Extract the dependency sub-model and set up the dependency parser
4. Use the dependency parser to find outside scores αDEP(e) for each edge
5. Combine PCFG and dependency sub-models into the lexicalized model
6. Form the combined outside estimate a(e) = αPCFG(e) + αDEP(e)
7. Use the lexicalized A* parser, with a(e) as an A* estimate of α(e)

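A minimal sketch of step 7: the combined outside estimate a(e) is used to order edges on an agenda, so the most promising edges are expanded first. The edge representation and all scores below are invented for illustration, not the parser's internals.

```python
import heapq

def a_star_priority(inside_log_score, outside_estimate):
    """A* orders edges by their inside score plus an estimate of the outside score;
    here the estimate is the sum of the PCFG and dependency outside scores."""
    return inside_log_score + outside_estimate

# Illustrative agenda of edges: (edge id, inside score, alpha_PCFG(e), alpha_DEP(e)).
edges = [("NP[2,5]", -12.0, -4.0, -3.5), ("VP[2,5]", -10.0, -8.0, -6.0)]

agenda = []
for edge, inside, a_pcfg, a_dep in edges:
    # heapq is a min-heap, so push the negated priority to pop the best edge first.
    heapq.heappush(agenda, (-a_star_priority(inside, a_pcfg + a_dep), edge))

print(heapq.heappop(agenda)[1])   # the most promising edge is expanded first
```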

SLIDE 14

The Berkeley Parser

Observed treebank categories are too coarse-grained
Idea: treebank refinement using latent variables

◮ learn an optimally refined grammar for parsing
◮ refine the observed trees with latent variables and learn subcategories
◮ basic nonterminal symbols are alternately split and merged to maximize the likelihood of the training treebank

SLIDE 15

The Berkeley Parser

Start with a minimal X-Bar grammar and learn increasingly refined grammars in a hierarchical split-and-merge fashion

1. start with a simple X-bar grammar
2. binarise the trees
3. split-and-merge technique:
   ◮ repeatedly split and re-train the grammar
   ◮ use Expectation Maximisation (EM) to learn a new grammar whose nonterminals are subsymbols of the original nonterminals
4. in each iteration, initialise EM with the results of the previous round's grammar
5. split every previous symbol in two
6. after training all splits, measure for each one the loss in likelihood incurred by removing (merging) it ⇒ keep the ones whose removal causes a considerable loss

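A minimal sketch of the merge decision in step 6: each candidate split is scored by the likelihood loss incurred if it is merged back, and only the splits whose removal would hurt most are kept. The per-split losses and the fraction kept are invented for illustration, not the Berkeley Parser's actual numbers.

```python
# Hypothetical per-split losses in training-treebank log-likelihood (made-up values).
likelihood_loss = {
    "NP -> NP_0 / NP_1": 842.0,
    "VP -> VP_0 / VP_1": 617.5,
    "ADVP -> ADVP_0 / ADVP_1": 3.2,
    "X -> X_0 / X_1": 0.4,
}

def splits_to_keep(losses, keep_fraction=0.5):
    """Keep the keep_fraction of splits with the largest likelihood loss; the rest
    are merged back, which controls grammar size without hurting the fit much."""
    ranked = sorted(losses, key=losses.get, reverse=True)
    return ranked[: max(1, int(len(ranked) * keep_fraction))]

print(splits_to_keep(likelihood_loss))   # ['NP -> NP_0 / NP_1', 'VP -> VP_0 / VP_1']
```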

SLIDE 16

The Berkeley Parser

split-and-merge

Splitting provides an increasingly tight fit to the training data, while merging improves generalization and controls grammar size

