
Parameter Estimation and Lexicalization for PCFGs



  1. Parameter Estimation and Lexicalization for PCFGs
     Informatics 2A: Lecture 20
     Mirella Lapata, School of Informatics, University of Edinburgh
     04 November 2011

     Outline
       1 Standard PCFGs: Parameter Estimation; Problem 1: Assuming Independence; Problem 2: Ignoring Lexical Information
       2 Lexicalized PCFGs: Lexicalization; Head Lexicalization; The Collins Parser

     Reading: J&M 2nd edition, ch. 14.2–14.6.1; NLTK Book, Chapter 8, final section on Weighted Grammar.

     Parameter Estimation

     In a PCFG every rule is associated with a probability. But where do these rule probabilities come from? Use a large parsed corpus such as the Penn Treebank.

       ( (S (NP-SBJ (DT That) (JJ cold) (, ,) (JJ empty) (NN sky) )
            (VP (VBD was)
                (ADJP-PRD (JJ full)
                          (PP (IN of)
                              (NP (NN fire) (CC and) (NN light) ))))
            (. .) ))

     Rules read off this tree include:

       S → NP-SBJ VP
       VP → VBD ADJP-PRD
       PP → IN NP
       NP → NN CC NN

     Obtain grammar rules by reading them off the trees. The probability of a rule is the number of times LHS → RHS occurs in the corpus divided by the number of times LHS occurs:

       $$P(\alpha \to \beta \mid \alpha) = \frac{\mathrm{Count}(\alpha \to \beta)}{\sum_{\gamma} \mathrm{Count}(\alpha \to \gamma)} = \frac{\mathrm{Count}(\alpha \to \beta)}{\mathrm{Count}(\alpha)}$$
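The relative-frequency estimate can be tried out on the single treebank tree shown above. The following is a minimal sketch, not the lecture's own code: the helper names (`tokenize`, `parse`, `rules`, `estimate`) are hypothetical, the outer empty bracket of the Penn Treebank notation is left off for simplicity, and punctuation is kept as an ordinary child, so the S rule comes out as S → NP-SBJ VP . rather than the simplified S → NP-SBJ VP.

```python
from collections import Counter

# The example tree, without the outer empty bracket (an assumption made
# here to keep the reader small).
TREE_TEXT = """
(S (NP-SBJ (DT That) (JJ cold) (, ,) (JJ empty) (NN sky))
   (VP (VBD was)
       (ADJP-PRD (JJ full)
                 (PP (IN of)
                     (NP (NN fire) (CC and) (NN light)))))
   (. .))
"""

def tokenize(text):
    """Split a bracketed tree into '(', ')' and symbol tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens, i=0):
    """Parse the constituent starting at tokens[i] == '('.

    Returns ((label, children), next_index); children are sub-trees
    (tuples) or words (strings).
    """
    label = tokens[i + 1]
    i += 2
    children = []
    while tokens[i] != ")":
        if tokens[i] == "(":
            child, i = parse(tokens, i)
            children.append(child)
        else:
            children.append(tokens[i])
            i += 1
    return (label, children), i + 1

def rules(tree):
    """Yield every rule (LHS, RHS tuple) used in the tree, lexical ones included."""
    label, children = tree
    yield label, tuple(c[0] if isinstance(c, tuple) else c for c in children)
    for child in children:
        if isinstance(child, tuple):
            yield from rules(child)

def estimate(trees):
    """Relative-frequency estimate: Count(LHS -> RHS) / Count(LHS)."""
    rule_counts = Counter(r for t in trees for r in rules(t))
    lhs_counts = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_counts[lhs] += n
    return {r: n / lhs_counts[r[0]] for r, n in rule_counts.items()}

tree, _ = parse(tokenize(TREE_TEXT))
probs = estimate([tree])
for (lhs, rhs), p in sorted(probs.items()):
    print(f"{lhs} -> {' '.join(rhs)}: {p:.3f}")
```

With only one tree, every phrasal rule gets probability 1, while JJ and NN each split their mass over three words; a real estimate needs the whole treebank.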

  2. Parameter Estimation

     Corpus of parsed sentences:

       S1: [S [NP grass] [VP grows]]
       S2: [S [NP grass] [VP grows] [AP slowly]]
       S3: [S [NP grass] [VP grows] [AP fast]]
       S4: [S [NP bananas] [VP grow]]

     Compute PCFG probabilities:

       r    Rule            α    P(r | α)
       r1   S → NP VP       S    2/4
       r2   S → NP VP AP    S    2/4
       r3   NP → grass      NP   3/4
       r4   NP → bananas    NP   1/4
       r5   VP → grows      VP   3/4
       r6   VP → grow       VP   1/4
       r7   AP → fast       AP   1/2
       r8   AP → slowly     AP   1/2

     With these parameters (rule probabilities), we can now compute the probabilities of the four sentences S1–S4:

       P(S1) = P(r1 | S) P(r3 | NP) P(r5 | VP) = 2/4 · 3/4 · 3/4 = 0.28125
       P(S2) = P(r2 | S) P(r3 | NP) P(r5 | VP) P(r8 | AP) = 2/4 · 3/4 · 3/4 · 1/2 = 0.140625
       P(S3) = P(r2 | S) P(r3 | NP) P(r5 | VP) P(r7 | AP) = 2/4 · 3/4 · 3/4 · 1/2 = 0.140625
       P(S4) = P(r1 | S) P(r4 | NP) P(r6 | VP) = 2/4 · 1/4 · 1/4 = 0.03125

     Parameter Estimation

     What if we don't have a treebank but we do have a (non-probabilistic) parser?

       1 Take a CFG and set all rules to have equal probability
       2 Parse the corpus with the CFG
       3 Adjust the probabilities
       4 Repeat steps two and three until probabilities converge

     This is the Inside-Outside algorithm (Baker, 1979), a type of Expectation Maximisation algorithm. It can also be used to induce a grammar, but only with limited success.

     Problems with Standard PCFGs

     While standard PCFGs are useful for a number of applications, they can produce a wrong result when used to choose the correct parse for an ambiguous sentence.

     How can that be?

       1 The independence of the rules in a PCFG.
       2 They ignore lexical information until the very end of the analysis, when word classes are rewritten to word tokens.

     How can this lead to the wrong choice among possible parses?
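The rule probabilities and sentence probabilities for the four-sentence corpus S1–S4 can be checked mechanically. This is a small sketch under the assumption that each parsed sentence is represented simply as the list of rules used in its tree; the names `CORPUS`, `prob` and `tree_probability` are illustrative only. Exact fractions are used so the results can be compared directly with the hand calculation.

```python
from collections import Counter
from fractions import Fraction

# Each parsed sentence, written as the (LHS, RHS) rules used in its tree.
CORPUS = {
    "S1": [("S", "NP VP"), ("NP", "grass"), ("VP", "grows")],
    "S2": [("S", "NP VP AP"), ("NP", "grass"), ("VP", "grows"), ("AP", "slowly")],
    "S3": [("S", "NP VP AP"), ("NP", "grass"), ("VP", "grows"), ("AP", "fast")],
    "S4": [("S", "NP VP"), ("NP", "bananas"), ("VP", "grow")],
}

# Count(LHS -> RHS) and Count(LHS) over the whole corpus.
rule_counts = Counter(rule for used in CORPUS.values() for rule in used)
lhs_counts = Counter()
for (lhs, _), n in rule_counts.items():
    lhs_counts[lhs] += n

# Relative-frequency estimate of P(rule | LHS).
prob = {rule: Fraction(n, lhs_counts[rule[0]]) for rule, n in rule_counts.items()}

def tree_probability(used_rules):
    """Probability of a tree = product of the probabilities of its rules."""
    p = Fraction(1)
    for rule in used_rules:
        p *= prob[rule]
    return p

sentence_prob = {name: tree_probability(used) for name, used in CORPUS.items()}
for name, p in sentence_prob.items():
    print(name, p, float(p))
```

Note that the four probabilities do not sum to 1: the grammar also licenses sentences that never occurred in the corpus, such as "bananas grows fast", and they receive the remaining mass.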

  3. Problem 1: Assuming Independence

     By definition, a CFG assumes that the expansion of non-terminals is completely independent. It doesn't matter:

       where a non-terminal is in the analysis;
       what else is (or isn't) in the analysis.

     The same assumption holds for standard PCFGs. The probability of a rule is the same, no matter:

       where it is applied in the analysis;
       what else is (or isn't) in the analysis.

     But this assumption is too simple!

     Problem 1: Assuming Independence

       S → NP VP        NP → PRO
       VP → VBD NP      NP → DT NOM

     [Figure: two parse trees built from these rules; the one that survives extraction is [S [NP [PRO They]] [VP [VBD wrote] [NP them]]].]

     The above rules assign the same probability to both these trees, because they use the same re-write rules, and probability calculations do not depend on where rules are used.

     Problem 1: Assuming Independence

     But in a speech corpus, 91% of 31021 subject NPs are pronouns:

       (1) a. She's able to take her baby to work with her.
           b. My wife worked until we had a family.

     while only 34% of 7489 object NPs are pronouns:

       (2) a. Some laws absolutely prohibit it.
           b. It wasn't clear how NL and Mr. Simmons would respond if Georgia Gulf spurns them again.

     So the probability of NP → PRO should depend on where in the analysis it applies (e.g., subject or object position).

     Problem 2: Ignoring Lexical Information

       S → NP VP                    N → queen | bin
       NP → NNS | NN                NNS → workers | sacks | cars
       VP → VBD NP | VBD NP PP      V → dumped | repaired
       PP → P NP                    DT → a | the
       NP → DT NN                   P → into | of

     Consider the sentences:

       (3) a. Workers dumped sacks into a bin.
           b. Workers repaired cars of the queen.

     Because rules for rewriting non-terminals ignore word tokens until the very end, let's consider these simply as strings of POS tags. Under the lexical rules above, both sentences reduce to the same string:

       (4) a. NNS V NNS P DT N
           b. NNS V NNS P DT N
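To see why ignoring word tokens is a problem, the two sentences in (3) can be mapped to tag strings using only the lexical rules of the grammar above. This is a minimal sketch; `LEXICON`, `WORD_TO_TAG` and `tag_string` are names made up for the illustration, and the lookup assumes every word has exactly one tag, which holds for this toy lexicon but not in general.

```python
# Lexical rules of the toy grammar: tag -> words it can rewrite to.
LEXICON = {
    "NNS": {"workers", "sacks", "cars"},
    "N": {"queen", "bin"},
    "V": {"dumped", "repaired"},
    "DT": {"a", "the"},
    "P": {"into", "of"},
}

# Invert the lexicon (assumes each word belongs to a single tag).
WORD_TO_TAG = {word: tag for tag, words in LEXICON.items() for word in words}

def tag_string(sentence):
    """Replace each word of the sentence by its POS tag."""
    words = sentence.lower().rstrip(".").split()
    return [WORD_TO_TAG[w] for w in words]

tags_a = tag_string("Workers dumped sacks into a bin.")
tags_b = tag_string("Workers repaired cars of the queen.")

print(" ".join(tags_a))
print(" ".join(tags_b))
print("identical tag strings:", tags_a == tags_b)
```

Since the two tag strings are identical, any model that chooses an attachment from tags alone has to pick the same structure for both sentences.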

  4. Problem 2: Ignoring Lexical Information

     [Figure: two parse trees for the same tag string. In one the PP attaches to the VP: [S [NP NNS] [VP VBD [NP NNS] [PP P [NP DT NN]]]]. In the other the PP attaches to the object NP: [S [NP NNS] [VP VBD [NP [NP NNS] [PP P [NP DT NN]]]]].]

     Which do we want for "Workers dumped sacks into a bin"? Which for "Workers repaired cars of the queen"?

     The most appropriate analysis depends, in part, on the actual words.

     Lexicalized PCFGs

     A PCFG can be lexicalised by associating a word and part-of-speech tag with every non-terminal in the grammar. It is head-lexicalised if the word is the head of the constituent described by the non-terminal.

     Each non-terminal has a head that determines the syntactic properties of the phrase (e.g., which other phrases it can combine with).

     Example

       Noun Phrase (NP): Noun
       Adjective Phrase (AP): Adjective
       Verb Phrase (VP): Verb
       Prepositional Phrase (PP): Preposition

     Lexicalization

     We can lexicalize a PCFG by annotating each non-terminal with its head word, starting with the terminals, replacing

       VP → VBD NP PP
       VP → VBD NP
       NP → DT NN
       NP → NNS
       PP → P NP

     with rules of the form

       VP(dumped) → V(dumped) NP(sacks) PP(into)
       VP(repaired) → V(repaired) NP(cars) PP(of)
       VP(dumped) → V(dumped) NP(sacks)
       VP(repaired) → V(repaired) NP(cars)
       NP(queen) → DT(the) NN(queen)
       PP(into) → P(into) NP(bin)

     Lexicalization Example

       [TOP [S [NP [NNS workers]]
               [VP [VBD dumped]
                   [NP [NNS sacks]]
                   [PP [P into]
                       [NP [DT a] [NN bin]]]]]]
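Head lexicalization of the example tree can be sketched as a bottom-up pass that hands each phrase the head word of one designated child. The head table `HEAD_CHILD` below is a hypothetical simplification (noun for NP, verb for VP, preposition for PP, and the VP for S), not the head rules of any particular parser, and the sketch keeps the tree's own tag VBD where the rules above write V.

```python
# Hypothetical head table: for each phrase label, the child categories
# that may supply its head word.
HEAD_CHILD = {
    "S": ["VP"],
    "VP": ["VBD"],
    "NP": ["NN", "NNS"],
    "PP": ["P"],
}

# The example tree as (label, children); a preterminal has one word child.
TREE = ("S", [
    ("NP", [("NNS", ["workers"])]),
    ("VP", [
        ("VBD", ["dumped"]),
        ("NP", [("NNS", ["sacks"])]),
        ("PP", [
            ("P", ["into"]),
            ("NP", [("DT", ["a"]), ("NN", ["bin"])]),
        ]),
    ]),
])

def lexicalize(tree, rules):
    """Return the head word of `tree`, appending its lexicalized rules to `rules`."""
    label, children = tree
    if isinstance(children[0], str):
        # Preterminal: the word itself is the head.
        return children[0]
    heads = [lexicalize(child, rules) for child in children]
    # The phrase inherits the head of the first child listed in the head table.
    head = next(h for child, h in zip(children, heads)
                if child[0] in HEAD_CHILD[label])
    rhs = " ".join(f"{child[0]}({h})" for child, h in zip(children, heads))
    rules.append(f"{label}({head}) -> {rhs}")
    return head

lexicalized_rules = []
top_head = lexicalize(TREE, lexicalized_rules)
for rule in lexicalized_rules:
    print(rule)
print("head of the sentence:", top_head)
```

Because every distinct head word creates its own copy of a rule, the lexicalized grammar is far larger than the original one, so the counts behind each rule probability become much sparser.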
