Statistical Parsing


  1. Statistical Parsing
     Çağrı Çöltekin, University of Tübingen, Seminar für Sprachwissenschaft
     November 8, 2016
     Outline: Recap, Parsing basics, CKY, Earley, Closing

     Ingredients of a (natural language) parser
     • A grammar
     • An algorithm for parsing
     • A method for ambiguity resolution

     Grammars
     • A formal grammar is a finite specification of a (possibly infinite) language
     • In this course, we are interested in two broad classes of grammars:
       – Constituency (or phrase structure) grammars
       – Dependency grammars
     • Various theories of ‘grammar’ (e.g., HPSG, LFG, CCG) often use ideas/notions from both constituency and dependency grammars
     • We will study these grammars in their relation to parsing; we do not study or focus on any specific theory

     Dependency vs. constituency
     • Constituency grammars are based on units formed by a group of lexical items (constituents or phrases)
     • Dependency grammars model binary head–dependent relations between words
     • Most of the theory of parsing is developed with constituency grammars
     • Dependency grammars have recently become popular in CL
     [Figure: constituency tree (S (NP John) (VP (V saw) (NP Mary))) next to the dependency analysis of the same sentence, with root ‘saw’, subject ‘John’, and object ‘Mary’]

     Phrase structure grammars
     A phrase structure grammar is specified by (Σ, N, S, R), where
     • Σ is a set of terminal symbols
     • N is a set of non-terminal symbols
     • S ∈ N is a distinguished start symbol
     • R is a set of rules of the form αAβ → γ for A ∈ N and α, β, γ ∈ (Σ ∪ N)*
     • The grammar accepts a sentence if it can be derived from S with the rewrite rules R

     Chomsky hierarchy and natural languages
     [Figure: nested regions labeled Regular, Context Free, Context Sensitive, and Recursively Enumerable, with a dashed ellipse and a shaded region]
     • The Chomsky hierarchy of languages forms a set-inclusion hierarchy
     • It is often claimed that mildly context sensitive grammars (dashed ellipse) are adequate for representing natural languages
     • Note, however, that the possible natural languages probably cross-cut this hierarchy (shaded region)

     Context free grammars
     • Context free grammars are sufficient for expressing most phenomena in natural language syntax
     • Most of the parsing theory (and practice) is built on parsing CF languages
     • The context-free rules have the form A → α, where A is a single non-terminal symbol and α is a (possibly empty) sequence of terminal or non-terminal symbols
     • We will mainly focus on parsing with context-free grammars for the rest of this lecture

     An example grammar
     S → NP VP
     S → Aux NP VP
     NP → Det N
     NP → Prn
     NP → NP PP
     VP → V NP
     VP → V
     VP → VP PP
     PP → Prp NP
     N → duck
     N → park
     N → parks
     V → duck
     V → ducks
     V → saw
     Prn → she | her
     Prp → in | with
     Det → a | the

     Derivation of sentence ‘she saw a duck’
     S ⇒ NP VP
     NP ⇒ Prn
     Prn ⇒ she
     VP ⇒ V NP
     V ⇒ saw
     NP ⇒ Det N
     Det ⇒ a
     N ⇒ duck
     Resulting tree: (S (NP (Prn she)) (VP (V saw) (NP (Det a) (N duck))))
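     The derivation of ‘she saw a duck’ can be replayed mechanically. The following is a minimal sketch, not part of the original slides: the dictionary layout and the names GRAMMAR, STEPS, rewrite_leftmost, and derive are illustrative assumptions. It stores the example grammar as a mapping from each non-terminal to its right-hand sides and applies each derivation step to the leftmost occurrence of the rewritten symbol.

```python
# Example grammar from the slides: non-terminal -> list of right-hand sides.
# (The data layout and all names here are assumptions made for illustration.)
GRAMMAR = {
    "S":   [("NP", "VP"), ("Aux", "NP", "VP")],
    "NP":  [("Det", "N"), ("Prn",), ("NP", "PP")],
    "VP":  [("V", "NP"), ("V",), ("VP", "PP")],
    "PP":  [("Prp", "NP")],
    "N":   [("duck",), ("park",), ("parks",)],
    "V":   [("duck",), ("ducks",), ("saw",)],
    "Prn": [("she",), ("her",)],
    "Prp": [("in",), ("with",)],
    "Det": [("a",), ("the",)],
}

# The derivation steps for 'she saw a duck', in the order listed on the slide.
STEPS = [
    ("S", ("NP", "VP")),
    ("NP", ("Prn",)),
    ("Prn", ("she",)),
    ("VP", ("V", "NP")),
    ("V", ("saw",)),
    ("NP", ("Det", "N")),
    ("Det", ("a",)),
    ("N", ("duck",)),
]


def rewrite_leftmost(form, lhs, rhs):
    """Apply the rule lhs -> rhs to the leftmost occurrence of lhs in form."""
    if rhs not in GRAMMAR.get(lhs, []):
        raise ValueError(f"{lhs} -> {' '.join(rhs)} is not a rule of the grammar")
    i = form.index(lhs)  # raises ValueError if lhs does not occur in form
    return form[:i] + rhs + form[i + 1:]


def derive(steps, start="S"):
    """Return every sentential form produced by applying steps in order."""
    form = (start,)
    history = [form]
    for lhs, rhs in steps:
        form = rewrite_leftmost(form, lhs, rhs)
        history.append(form)
    return history


if __name__ == "__main__":
    for form in derive(STEPS):
        print(" ".join(form))
```

     Each printed line is one sentential form; the last one contains only terminals, which is what it means for the grammar to accept the sentence.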

  2. Parsing as search
     • Parsing can be seen as search constrained by the grammar and the input
     • Top down: start from S, find the derivations that lead to the sentence
     • Bottom up: start from the sentence, find a series of derivations (in reverse) that leads to S
     • One can search depth first or breadth first in both directions
     [Figure: parse tree (S (NP (Prn she)) (VP (V saw) (NP (Det a) (N duck)))) shown next to the example grammar]

     Parsing as search: top down
     [Figure: step-by-step top-down search for ‘she saw a duck’ with the example grammar, including a ‘Backtrack!’ step]

     Parsing as search: bottom up
     [Figure: step-by-step bottom-up search for ‘she saw a duck’ with the example grammar]

     Problems with search procedures
     • Top-down search considers productions incompatible with the input, and cannot handle left recursion
     • Bottom-up search considers non-terminals that would never lead to S
     • Repeated work because of backtracking
     → The result is exponential time complexity in the length of the sentence
     Some of these problems can be solved using dynamic programming techniques.

     CKY algorithm
     • The CKY (Cocke–Younger–Kasami), or CYK, parsing algorithm is a dynamic programming algorithm (Kasami 1965; Younger 1967; Cocke and Schwartz 1970)
     • It processes the input bottom up, and saves the intermediate results on a chart
     • Time complexity for recognition is O(n³) (with a space complexity of O(n²))
     • It requires the CFG to be in Chomsky normal form (CNF)

     Chomsky normal form (CNF)
     • A CFG is in CNF if the rewrite rules are in one of the following forms
       – A → B C
       – A → a
       where A, B, C are non-terminals and a is a terminal
     • Any CFG can be converted to CNF
     • The resulting grammar is weakly equivalent to the original grammar: it generates/accepts the same language, but the derivations are different

     Converting to CNF: example
     • For rules with > 2 RHS symbols
       S → Aux NP VP ⇒ S → Aux X, X → NP VP
     • For rules with < 2 RHS symbols
       NP → Det ⇒ NP → a | the
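     The chart-filling idea behind CKY can be sketched in a few lines. This is a minimal recognizer under stated assumptions, not the lecture's own implementation: the CNF grammar below is a hand conversion of the example grammar (the ternary rule is binarized with the new symbol X, and the unit rules NP → Prn and VP → V are replaced by the corresponding lexical rules), and the names BINARY, LEXICAL, and cky_recognize are illustrative.

```python
from itertools import product

# Hand-converted CNF version of the example grammar (an assumption of this
# sketch). BINARY maps a pair (B, C) to every A with a rule A -> B C.
BINARY = {
    ("NP", "VP"): {"S", "X"},   # S -> NP VP, and X -> NP VP from binarization
    ("Aux", "X"): {"S"},        # S -> Aux X replaces S -> Aux NP VP
    ("Det", "N"): {"NP"},
    ("NP", "PP"): {"NP"},
    ("V", "NP"): {"VP"},
    ("VP", "PP"): {"VP"},
    ("Prp", "NP"): {"PP"},
}

# LEXICAL maps a word to every A with a rule A -> word. NP and VP appear here
# because the unit rules NP -> Prn and VP -> V were eliminated.
LEXICAL = {
    "duck": {"N", "V", "VP"},
    "ducks": {"V", "VP"},
    "park": {"N"},
    "parks": {"N"},
    "saw": {"V", "VP"},
    "she": {"Prn", "NP"},
    "her": {"Prn", "NP"},
    "in": {"Prp"},
    "with": {"Prp"},
    "a": {"Det"},
    "the": {"Det"},
}


def cky_recognize(words, start="S"):
    """Return True if the CNF grammar above derives words from start."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the non-terminals that span words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(LEXICAL.get(word, ()))
    # Fill longer spans from shorter ones: three nested loops give O(n^3).
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for b, c in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY.get((b, c), set())
    return start in chart[0][n]


if __name__ == "__main__":
    print(cky_recognize("she saw a duck".split()))
    print(cky_recognize("saw she a duck".split()))
```

     The sketch only recognizes; recovering parse trees would additionally require storing back-pointers in each chart cell.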
