
Taaltheorie en Taalverwerking, BSc Artificial Intelligence



  1. Taaltheorie en Taalverwerking BSc Artificial Intelligence Raquel Fernández Institute for Logic, Language, and Computation Winter 2012, lecture 3a Raquel Fernández TtTv 2012 - lecture 3a 1 / 19

  2. Plan for Today
     Theoretical session:
     • Parsing (part of ch. 13 of J&M)
     • Guest lecture on Machine Translation
     Practical session:
     • Questions/problems regarding HW#2 (I’ll be there 17-18h).
     • Time to finish HW#3.
     • Review project groups.

  3. Parsing
     Syntactic parsing is the task of computing a parse tree for a sentence given a grammar.
     • When we use grammars as recognizers, the recognizer also parses the sentence (goes through a derivation), but we are not interested in the resulting structure.
     • When we use grammars as parsers, we are interested in the tree structure assigned to a particular sentence.
     Parsing can be viewed as a search problem:
     • the parser searches through the space of possible parse trees allowed by a grammar to find the right tree for a given sentence.
       ◦ Note: recognition/parsing of regular languages can also be viewed as a search problem, but since any non-deterministic FSA is equivalent to a deterministic FSA, the search ‘problem’ is not a problem in theory.

  4. Parsing as a Search Problem
     A grammar defines a search space of possible trees – each state in this space corresponds to a tree. The space includes:
     • all the complete trees a grammar can generate (trees whose leaves correspond to words and cannot be further expanded), and
     • all the partial trees (where some node can still be expanded by a rule), which can be seen as intermediate steps towards the generation of complete trees.
     ⇒ the search space of natural language grammars can be huge!

  5. Parsing as a Search Problem
     How does a parser assign a parse tree to a sentence? Given a sentence and a grammar, the parser navigates the search space following two constraints:
     • the complete parse tree of a given sentence must have leaves that correspond to the words in it.
     • the root of the complete parse tree must be the start symbol S of the grammar.
     These two constraints give rise to the two search strategies underlying most parsers: bottom-up and top-down.

  6. Bottom-Up Parsing
     • The starting point of a bottom-up parser is the words of the input sentence.
     • The parser proceeds by building up structure from the bottom to the top of the tree.
     • It does so by looking at the grammar rules right-to-left (from the right-hand side of a rule to its left-hand side).
     • At each stage, it considers as many (partial) trees as can be built by matching the right-hand side of a rule with the current input.
     • The parser succeeds if it is able to build a tree that covers all the input and whose root is the start symbol of the grammar.
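The bottom-up strategy described above can be sketched as a small recognizer. This is only an illustrative sketch, not an algorithm from the lecture: it assumes the toy grammar of slides 12 and 13 (writing D for the determiner category), and the names `GRAMMAR` and `bottom_up_recognize` are hypothetical. It exhaustively tries every reduction (replacing a matched right-hand side by its left-hand side), which is feasible only for tiny grammars.

```python
# Toy grammar from slides 12-13, stored as (left-hand side, right-hand side) pairs.
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("D", "N")),
    ("VP", ("V",)),
    ("VP", ("V", "A")),
    ("D", ("the",)),
    ("N", ("dog",)),
    ("N", ("cat",)),
    ("V", ("runs",)),
    ("A", ("fast",)),
]


def bottom_up_recognize(words, grammar=GRAMMAR, start="S"):
    """Return True if the words can be reduced to the start symbol."""
    agenda = [tuple(words)]   # states still to explore, starting from the input words
    seen = set(agenda)        # avoid exploring the same state twice
    while agenda:
        form = agenda.pop()
        if form == (start,):
            return True       # one tree covers all the input and is rooted in S
        for lhs, rhs in grammar:
            n = len(rhs)
            for i in range(len(form) - n + 1):
                # Read the rule right-to-left: match its right-hand side ...
                if form[i:i + n] == rhs:
                    # ... and replace the match by its left-hand side.
                    reduced = form[:i] + (lhs,) + form[i + n:]
                    if reduced not in seen:
                        seen.add(reduced)
                        agenda.append(reduced)
    return False


print(bottom_up_recognize("the cat runs fast".split()))  # True
print(bottom_up_recognize("the dog fast".split()))       # False
```

Along the way the search also builds states such as `NP VP A` (reducing `V` to `VP` before looking at `fast`), which can never be reduced to `S`.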

  7. Bottom-Up Parsing [figure]

  8. Top-Down Parsing
     • The starting point of a top-down parser is the start symbol of the grammar.
     • The parser starts by assuming that the input is indeed a well-formed sentence and it tries to prove this by building up structure from the top of the tree down to the leaves.
     • It does so by looking at the grammar rules left-to-right (from the left-hand side of a rule to its right-hand side).
     • At each stage, it considers as many (partial) trees as can be built by matching the left-hand side of a rule with the currently available non-terminal nodes.
     • The parser succeeds if the leaves of at least one of the trees it has constructed match the words of the input sentence.
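A minimal sketch of the top-down strategy, assuming the toy grammar of slides 12 and 13 (with D for the determiner category). The names `GRAMMAR` and `top_down_recognize` are hypothetical, and the depth-first, leftmost expansion order is one possible choice rather than the lecture's prescribed one. Note that this naive scheme would loop forever on a left-recursive grammar; the toy grammar has no recursion.

```python
# Toy grammar from slides 12-13: each non-terminal maps to its alternative right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V"], ["V", "A"]],
    "D": [["the"]],
    "N": [["dog"], ["cat"]],
    "V": [["runs"]],
    "A": [["fast"]],
}


def top_down_recognize(words, grammar=GRAMMAR, start="S"):
    """Return True if some top-down derivation from `start` yields exactly `words`."""

    def expand(symbols, position):
        # `symbols` is what is still to be derived; `position` counts words already matched.
        if not symbols:
            return position == len(words)
        first, rest = symbols[0], symbols[1:]
        if first in grammar:
            # Non-terminal: try each rule for it, in grammar order (depth-first).
            return any(expand(rhs + rest, position) for rhs in grammar[first])
        # Terminal: it must match the next input word.
        return (position < len(words)
                and words[position] == first
                and expand(rest, position + 1))

    return expand([start], 0)


print(top_down_recognize("the dog runs fast".split()))  # True
print(top_down_recognize("the dog".split()))            # False
```

For `the dog runs fast` the recognizer first tries `VP → V`, fails because one input word is left over, and only then tries `VP → V A`.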

  9. Top-Down Parsing [figure]

  10. Bottom-Up vs. Top-Down
      These two basic strategies have advantages and disadvantages:
      • Top-Down parsers never explore illegal parse trees that cannot form an S – but waste time on trees that can never match the input words.
      • Bottom-Up parsers never explore trees that are inconsistent with the input sentence – but waste time exploring illegal parse trees that will never lead to an S root.
      Actual parsing algorithms may combine these two strategies.

  11. A Grammar’s Search Space
      How can we define the search space of a given grammar? For simplicity, let us focus on the top-down approach (the same considerations apply to the bottom-up approach). Let’s assume that the states in the search space are created by:
      • applying the grammar rules in the order in which they appear in the grammar, and
      • expanding the nodes at a given level in a tree from left to right.
      We can define the search space of a given grammar following one of two strategies: depth-first or breadth-first.
      • depth-first: we work vertically – priority is given to nodes that are lower or deeper in the tree
      • breadth-first: we work horizontally – priority is given to nodes that are higher up in the tree

  12. Search Space: Depth-first
      1. S → NP VP     5. D → the     8. V → runs
      2. NP → D N      6. N → dog     9. A → fast
      3. VP → V        7. N → cat
      4. VP → V A
      For simplicity, sequences of states where there is no branching are collapsed into one single state.
      States of the search space, level by level (trees in bracket notation):
      • [S [NP [D the] N] VP]
      • [S [NP [D the] [N dog]] VP]
        [S [NP [D the] [N cat]] VP]
      • [S [NP [D the] [N dog]] [VP [V runs]]]
        [S [NP [D the] [N dog]] [VP [V runs] [A fast]]]
        [S [NP [D the] [N cat]] [VP [V runs]]]
        [S [NP [D the] [N cat]] [VP [V runs] [A fast]]]

  13. Search Space: Breadth-first
      1. S → NP VP     5. D → the     8. V → runs
      2. NP → D N      6. N → dog     9. A → fast
      3. VP → V        7. N → cat
      4. VP → V A
      For simplicity, sequences of states where there is no branching are collapsed into one single state.
      States of the search space, level by level (trees in bracket notation):
      • [S [NP D N] VP]
      • [S [NP D N] [VP V]]
        [S [NP D N] [VP V A]]
      • [S [NP [D the] [N dog]] [VP [V runs]]]
        [S [NP [D the] [N cat]] [VP [V runs]]]
        [S [NP [D the] [N dog]] [VP [V runs] [A fast]]]
        [S [NP [D the] [N cat]] [VP [V runs] [A fast]]]

  14. Realistic Search
      • Since the search space of a realistic grammar can be huge, parsing algorithms do not actually build the full space of parse trees that a grammar allows and then search for the tree that corresponds to a given sentence.
      • Instead, they expand the search space incrementally by systematically exploring one state at a time.
      • When parsing a given sentence, parsers explore paths in a theoretical search space.
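Incremental exploration can be sketched with a generator that produces one state at a time and stops as soon as the target sentence is reached. This is a sketch under assumptions: it uses the toy grammar of slides 12 and 13 (with D for the determiner category), represents a state as a sequence of symbols rather than a tree, always expands the leftmost non-terminal, and does not collapse non-branching steps. The names `RULES`, `explore` and `find_sentence` are hypothetical.

```python
from collections import deque

# Toy grammar from slides 12-13, in the order in which the rules are listed.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("D", "N")),
    ("VP", ("V",)),
    ("VP", ("V", "A")),
    ("D", ("the",)),
    ("N", ("dog",)),
    ("N", ("cat",)),
    ("V", ("runs",)),
    ("A", ("fast",)),
]
NONTERMINALS = {lhs for lhs, _ in RULES}


def explore(strategy="depth", start="S"):
    """Yield states one at a time; successors are created only when a state is visited."""
    frontier = deque([(start,)])
    while frontier:
        # A stack gives depth-first search, a queue gives breadth-first search.
        form = frontier.pop() if strategy == "depth" else frontier.popleft()
        yield form
        idx = next((i for i, s in enumerate(form) if s in NONTERMINALS), None)
        if idx is None:
            continue  # only words left: a complete state
        successors = [form[:idx] + rhs + form[idx + 1:]
                      for lhs, rhs in RULES if lhs == form[idx]]
        if strategy == "depth":
            successors.reverse()  # so that the first rule in the grammar is tried first
        frontier.extend(successors)


def find_sentence(words, strategy="depth"):
    """Return how many states were visited before reaching the sentence, or None."""
    target = tuple(words)
    for visited, form in enumerate(explore(strategy), start=1):
        if form == target:
            return visited
    return None


print(find_sentence("the dog runs".split(), "depth"))    # 7
print(find_sentence("the dog runs".split(), "breadth"))  # 11
```

Because `explore` is lazy, states beyond the one that matches the input are never constructed. The search terminates on non-sentences only because this toy grammar generates finitely many states.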

  15. Exploring Paths
      Order in which the nodes of a search tree are visited (node numbers listed level by level, from the root down):
      • breadth-first: 1 | 2 3 | 4 5 6 7
      • depth-first: 1 | 2 5 | 3 4 6 7
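The two numberings can be reproduced with a short sketch, assuming the figure shows a complete binary tree with seven nodes (an assumption: the original image is not part of this transcript). The node labels in `TREE` and the function `visit_order` are hypothetical.

```python
from collections import deque

# Assumed shape of the search tree: a root, two children, four grandchildren.
TREE = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
}
LEVEL_ORDER = ["root", "a", "b", "a1", "a2", "b1", "b2"]


def visit_order(strategy):
    """Return the visit numbers of the nodes, listed level by level."""
    frontier = deque(["root"])
    number = {}
    while frontier:
        # Stack (pop from the right) for depth-first, queue (pop from the left) for breadth-first.
        node = frontier.pop() if strategy == "depth" else frontier.popleft()
        number[node] = len(number) + 1
        children = TREE.get(node, [])
        # For depth-first, push children reversed so the leftmost child is visited first.
        frontier.extend(reversed(children) if strategy == "depth" else children)
    return [number[node] for node in LEVEL_ORDER]


print(visit_order("breadth"))  # [1, 2, 3, 4, 5, 6, 7]
print(visit_order("depth"))    # [1, 2, 5, 3, 4, 6, 7]
```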

  16. Top-Down depth-first with bottom-up filtering
      We can combine top-down and bottom-up parsing by adding the following constraint: the parser should not consider any grammar rule that leads to words which are not part of the input sentence.
      Input sentence: The cat runs fast
      States of the search space, level by level (trees in bracket notation):
      • [S [NP [D the] N] VP]
      • [S [NP [D the] [N cat]] VP]
        [S [NP [D the] [N dog]] VP]
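A minimal sketch of this filter, assuming the toy grammar of slides 12 and 13 (with D for the determiner category). It implements only the simplest reading of the constraint: rules that directly introduce a word are dropped when that word does not occur in the input. The names `RULES`, `filter_rules` and `count_states` are hypothetical, and states are symbol sequences with every derivation step counted separately.

```python
# Toy grammar from slides 12-13, in the order in which the rules are listed.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("D", "N")),
    ("VP", ("V",)),
    ("VP", ("V", "A")),
    ("D", ("the",)),
    ("N", ("dog",)),
    ("N", ("cat",)),
    ("V", ("runs",)),
    ("A", ("fast",)),
]
NONTERMINALS = {lhs for lhs, _ in RULES}


def filter_rules(rules, words):
    """Bottom-up filter: keep a rule only if every word it introduces occurs in the input."""
    vocabulary = set(words)
    return [(lhs, rhs) for lhs, rhs in rules
            if all(sym in NONTERMINALS or sym in vocabulary for sym in rhs)]


def count_states(rules, start="S"):
    """Count the states a top-down depth-first search visits when it explores everything."""
    stack, visited = [(start,)], 0
    while stack:
        form = stack.pop()
        visited += 1
        idx = next((i for i, s in enumerate(form) if s in NONTERMINALS), None)
        if idx is None:
            continue  # only words left: a complete state
        # Push in reverse so that the first rule in the grammar is expanded first.
        for lhs, rhs in reversed(rules):
            if lhs == form[idx]:
                stack.append(form[:idx] + rhs + form[idx + 1:])
    return visited


sentence = "the cat runs fast".split()
print(count_states(RULES))                          # 16 states without filtering
print(count_states(filter_rules(RULES, sentence)))  # 10 states with filtering
```

With the filter, the rule `N → dog` is never applied for this input, so the whole branch below the `the dog` state disappears from the search.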

  17. Structural Ambiguity
      There are several types of structural or syntactic ambiguity:
      • attachment ambiguity: one constituent can appear in more than one location in the parse tree (we have already seen this kind of ambiguity).
          The tourist saw the astronomer with the telescope
          I shot an elephant in my pajamas
          We saw the Eiffel Tower flying to Paris
      • coordination ambiguity: different sets of phrases can be conjoined together (a variant of attachment ambiguity)
          old men and women → old [men & women] / [old men] & women
          nationwide television and radio → nationwide [t & r] / [nationwide t] & r
          the light blue chair → the [light [blue chair]] / the [[light blue] chair]
      • local ambiguity: a part of a sentence is ambiguous (has more than one parse tree) even though the whole sentence may not be so.
          book that flight → POS ambiguity of ‘book’ (V or N)
          the robber knew Vincent shot Marsellus → the grammar may be able to assign a sentential structure to the sub-string ‘the robber knew Vincent’.

  18. Structural Ambiguity
      • We have been referring to ambiguous sentences.
      • We say that a grammar is ambiguous if it can generate more than one parse tree for a given sentence.
        ∗ Note that local ambiguity is possible with grammars that are not ambiguous. For instance, this grammar is not ambiguous even though it gives rise to local ambiguity:
          S → NP VP      Det → the | that
          S → VP         N → book | flight
          NP → Det N     V → book
          VP → V NP
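One way to check such a claim on individual sentences is to count parse trees. The sketch below is an illustration, not a method from the lecture: it counts, for the grammar on this slide, how many trees a given symbol can build over a span of words. The names `GRAMMAR` and `count_parses` are hypothetical, and the recursion assumes a grammar without empty rules and without cycles of unary rules.

```python
from functools import lru_cache

# Grammar from this slide: each non-terminal maps to its alternative right-hand sides.
GRAMMAR = {
    "S": [("NP", "VP"), ("VP",)],
    "NP": [("Det", "N")],
    "VP": [("V", "NP")],
    "Det": [("the",), ("that",)],
    "N": [("book",), ("flight",)],
    "V": [("book",)],
}


def count_parses(words, symbol="S"):
    """Count the parse trees rooted in `symbol` that cover all of `words`."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def count(sym, i, j):
        # Number of trees rooted in `sym` spanning words[i:j].
        if sym not in GRAMMAR:  # a word: it must be exactly the single word in the span
            return int(j == i + 1 and words[i] == sym)
        return sum(count_seq(rhs, i, j) for rhs in GRAMMAR[sym])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        # Number of ways the symbol sequence `rhs` can span words[i:j].
        if len(rhs) == 1:
            return count(rhs[0], i, j)
        # The first symbol covers words[i:k]; every remaining symbol needs at least one word.
        return sum(count(rhs[0], i, k) * count_seq(rhs[1:], k, j)
                   for k in range(i + 1, j - len(rhs) + 2))

    return count(symbol, 0, len(words))


print(count_parses("book that flight".split()))  # 1: the sentence has a single parse
print(count_parses(["book"], "N"), count_parses(["book"], "V"))  # 1 1: 'book' is locally ambiguous
```

A count of 1 for one sentence does not prove that the grammar is unambiguous in general; it only shows that this sentence receives a single tree even though `book` can be analysed as either N or V.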

  19. Syntactic Disambiguation
      Ambiguity is perhaps the worst enemy of parsers.
      • Syntactic disambiguation is the task of choosing one parse tree among the possible parses of an ambiguous sentence.
      • This task is critical because structure guides how we assign meaning to a given sentence.
      • Parsing by itself does not offer tools for syntactic disambiguation – a parser can at most return all possible parse trees.
      • On Friday we’ll look into basic probabilistic techniques for syntactic disambiguation (PCFGs).
