  1. TU Graz - Signal Processing and Speech Communication Laboratory
Probabilistic Models of Language Processing and Acquisition
Fuchs Anna & Läßer Andreas
Signal Processing and Speech Communication Laboratory
Advanced Signal Processing 2
Fuchs Anna & Läßer Andreas, Advanced Signal Processing 2, page 1/40

  2. Outline
- Introduction
- Syntactic Parsing
  - Formal Grammar
  - Context-Free Grammar
  - Parsing as Search - Two Strategies
  - Ambiguity
  - Dynamic Programming Parsing Method - CKY Algorithm
- Statistical Parsing
  - Probabilistic Context-Free Grammar (PCFG)
  - Where do the probabilities come from? - Tree Banks
  - Probabilistic CKY
  - PCFG - Solving Ambiguity
  - Problems with PCFG
- Conclusion

  3. Introduction - General Comments
- Language can be represented by a probabilistic model
- Language processing involves generating or interpreting this model
- Language acquisition involves learning probabilistic models
- Main focus: parsing and learning grammar
- Chomskyan linguistics: language is internally represented as a grammar
- Grammar: a system of rules that specifies all and only the allowable sentences

  4. Probability in Language
- The cognitive science of language can be described WITH and WITHOUT probability
- Structural linguists looked for regularities in language corpora and focused on finding abstract rules
- Sophisticated probabilistic models have since been developed, specified in terms of symbolic rules and representations
- Grammatical rules are associated with probabilities: what is linguistically likely, not just what is linguistically possible

  5. Syntactic Parsing - Formal Grammar
- A grammar is a powerful tool for describing and analyzing languages
- A grammar is a structured set of production rules by which valid sentences in a language are constructed
- Most commonly used for syntactic description, but also useful elsewhere (semantics, phonology, ...)
- Defines syntactically legal sentences:
  - Sandra ate an apple. (syntactically legal) ✓
  - Sandra ate apple. (not syntactically legal) ✗
  - Sandra ate a building. (syntactically legal) ✓
- Sentences may be grammatically OK but still not acceptable

  6. Definition I
A grammar consists of:
- N: a set of non-terminal symbols (or variables)
- Σ: a set of terminal symbols (disjoint from N); the actual words of a language
- R: a set of rules or productions, each of the form A → β, where A is a non-terminal and β is any string of terminals and non-terminals
- S: a designated start symbol
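The 4-tuple above maps directly onto plain data structures. A minimal sketch, using a hypothetical toy grammar built around the slides' "Sandra ate an apple" example:

```python
# A grammar as a 4-tuple (N, Sigma, R, S), sketched as plain Python data.
# The toy rules below are hypothetical, not taken from the slides' grammar.
N = {"S", "NP", "VP"}                     # non-terminal symbols
Sigma = {"Sandra", "ate", "an", "apple"}  # terminal symbols (actual words)
R = {                                     # rules A -> beta
    "S": [["NP", "VP"]],
    "NP": [["Sandra"], ["an", "apple"]],
    "VP": [["ate", "NP"]],
}
S = "S"                                   # designated start symbol

# Every left-hand side must be a non-terminal, and N and Sigma are disjoint.
assert all(lhs in N for lhs in R)
assert N.isdisjoint(Sigma)
```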

  7. Definition II
- Production: A can be replaced by β
- A string containing nothing that can be expanded further consists only of terminals
- Such a string is called a sentence
- In the context of programming languages, a sentence is a syntactically correct and complete program
- Derivation: a sequence of applications of the rules of a grammar that produces a finished string of terminals
- Also called a parse
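A derivation can be sketched as repeated replacement of the leftmost non-terminal until only terminals remain. The toy grammar below is hypothetical (the `OBJ` symbol is invented for illustration), again using the slides' "Sandra ate an apple" sentence:

```python
# A leftmost derivation: repeatedly expand the leftmost non-terminal.
# Toy single-rule-per-symbol grammar (hypothetical), so the derivation
# is deterministic and ends in a string of terminals -- a sentence.
RULES = {
    "S": ["NP", "VP"],
    "NP": ["Sandra"],
    "VP": ["ate", "OBJ"],
    "OBJ": ["an", "apple"],
}

def derive(symbols):
    """Return the list of derivation steps, starting from `symbols`."""
    steps = [list(symbols)]
    while any(s in RULES for s in symbols):
        i = next(i for i, s in enumerate(symbols) if s in RULES)
        symbols = symbols[:i] + RULES[symbols[i]] + symbols[i + 1:]
        steps.append(list(symbols))
    return steps

steps = derive(["S"])
print(" ".join(steps[-1]))  # Sandra ate an apple
```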

  8. Chomsky Hierarchy
- Type 0: unrestricted grammars, no constraints
- Type 1: context-sensitive grammars
- Type 2: context-free grammars (CFGs)
- Type 3: regular grammars

Context-Free Grammar (CFG)
- A CFG is declarative: it does not specify how parse trees are constructed
- The non-terminal on the left-hand side of a rule stands all by itself
- Context-free: each node is expanded independently
- E.g. A → B C means that A is replaced by B followed by C, regardless of the context in which A is found

  9. Parsing as Search - Two Strategies
- Goal: the best possible analysis of a sentence
- Parsing is the process of taking a string and a grammar and returning one (or many) parse tree(s) for that string
- It assigns correct trees to input strings
- "Correct" means a tree that covers all and only the elements of the input and has an S at the top
- It does not mean that the system can select the correct tree from among all possible trees
- Parsing is a search that involves making choices

  10. Derivation as Trees
- Syntactic parsing: searching through the space of possible parse trees to find the correct parse tree for a given sentence
- E.g. "Book that flight." yields the tree
  [S [VP [Verb Book] [NP [Det that] [Nominal [Noun flight]]]]]
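The slide's tree can be written as nested tuples of the form `(label, children...)`; reading the leaves left to right recovers the input string. A small sketch:

```python
# The "Book that flight" tree from the slide, as nested (label, children) tuples.
tree = ("S",
        ("VP",
         ("Verb", "Book"),
         ("NP",
          ("Det", "that"),
          ("Nominal", ("Noun", "flight")))))

def yield_of(t):
    """Collect the leaves left to right -- the words the tree covers."""
    if isinstance(t, str):
        return [t]
    label, *children = t
    return [w for c in children for w in yield_of(c)]

print(" ".join(yield_of(tree)))  # Book that flight
```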

  11. Example Rules
Sentence → <Subject> <Verb-Phrase> <Object>
Subject → This | Computers | I
Verb-Phrase → <Adverb> <Verb> | <Verb>
Adverb → never
Verb → is | run | am | tell
Object → the <Noun> | a <Noun> | <Noun>
Noun → university | world | cheese | lies

  12. Example cont.
- Simple sentences can be derived, with and without sense:
  - This is a university.
  - Computers run the world.
  - I never tell lies.
  - I am the cheese.
  - Computers run cheese.
- Some make no semantic sense, but all are syntactically correct
- Formal grammars are a tool for SYNTAX, not SEMANTICS
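Because the example grammar from slide 11 has no recursion, its language is finite and can be enumerated exhaustively; this makes the syntax-vs-semantics point concrete, since nonsensical strings like "Computers run cheese" are licensed alongside sensible ones. A sketch of such an enumerator:

```python
from itertools import product

# Slide 11's example grammar, encoded as lists of right-hand-side alternatives.
GRAMMAR = {
    "Sentence": [["Subject", "Verb-Phrase", "Object"]],
    "Subject": [["This"], ["Computers"], ["I"]],
    "Verb-Phrase": [["Adverb", "Verb"], ["Verb"]],
    "Adverb": [["never"]],
    "Verb": [["is"], ["run"], ["am"], ["tell"]],
    "Object": [["the", "Noun"], ["a", "Noun"], ["Noun"]],
    "Noun": [["university"], ["world"], ["cheese"], ["lies"]],
}

def expand_all(symbol):
    """Return every word list derivable from `symbol` (finite: no recursion)."""
    if symbol not in GRAMMAR:          # terminal: a single word
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Cartesian product of the expansions of each RHS symbol.
        for parts in product(*(expand_all(s) for s in rhs)):
            results.append([w for p in parts for w in p])
    return results

sentences = {" ".join(s) for s in expand_all("Sentence")}
print(len(sentences))                       # 288 distinct sentences
print("I never tell lies" in sentences)     # True
print("Computers run cheese" in sentences)  # True: legal but nonsensical
```

The count 288 is just 3 subjects x 8 verb phrases x 12 objects: syntax alone places no semantic restriction on the combinations.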

  13. Two Strategies
- Find all trees whose root is the start symbol S and which cover the input words
- Two constraints give two search strategies:
  1. Grammar-driven: goal-directed search (Top-Down)
  2. Data-driven: data-directed search (Bottom-Up)

  14. Top-Down Parsing
- To find trees rooted with an S, start with the rules that give us an S
- Work the way down from there to the words
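A minimal sketch of the top-down idea as a recursive-descent recognizer: start from S, expand each rule downward, and check the expansion against the input. The grammar and sentence are hypothetical toys (not the slides' example grammar), chosen so the recursion terminates:

```python
# Top-down (recursive-descent) recognition sketch on a hypothetical toy grammar.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb", "NP"], ["Verb"]],
    "Det": [["that"], ["the"]],
    "Noun": [["flight"], ["pilot"]],
    "Verb": [["booked"]],
}

def parse(symbol, words, i):
    """Return all positions j such that `symbol` derives words[i:j]."""
    if symbol not in GRAMMAR:                     # terminal: match one word
        return [i + 1] if i < len(words) and words[i] == symbol else []
    ends = []
    for rhs in GRAMMAR[symbol]:                   # top-down: try each rule for S first
        positions = [i]
        for sym in rhs:
            positions = [j for p in positions for j in parse(sym, words, p)]
        ends.extend(positions)
    return ends

words = "the pilot booked that flight".split()
print(len(words) in parse("S", words, 0))  # True: S covers the whole input
```

Note that this naive version re-parses the same spans repeatedly and cannot handle left-recursive rules; those weaknesses are part of what motivates the dynamic-programming (CKY) methods listed in the outline.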

  15. [Figure: successive plies of the top-down search space - S is expanded to NP VP, Aux NP VP, and VP; NP is then expanded to PropN or Det Nom, and VP to V NP or V]

  16. Bottom-Up Parsing
- To find trees that cover the input words, start with trees that link up with the words in the right way
- Work the way up from there
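One classic bottom-up formulation is shift-reduce: shift words onto a stack and reduce any stack suffix that matches a rule's right-hand side. The sketch below uses the same hypothetical toy grammar as before and reduces greedily, which is a simplification (a real parser must handle choice points between shifting and reducing):

```python
# Bottom-up (shift-reduce) sketch on a hypothetical toy grammar.
# Greedy reduction is a simplification; real parsers must backtrack or
# otherwise resolve shift/reduce choices.
RULES = [
    ("NP", ["Det", "Noun"]),
    ("VP", ["Verb", "NP"]),
    ("S", ["NP", "VP"]),
    ("Det", ["the"]), ("Det", ["that"]),
    ("Noun", ["pilot"]), ("Noun", ["flight"]),
    ("Verb", ["booked"]),
]

def shift_reduce(words):
    stack = []
    for w in words:
        stack.append(w)                          # shift the next word
        reduced = True
        while reduced:                           # reduce as long as possible
            reduced = False
            for lhs, rhs in RULES:
                if stack[-len(rhs):] == rhs:
                    stack[-len(rhs):] = [lhs]    # replace RHS suffix by LHS
                    reduced = True
                    break
    return stack

print(shift_reduce("the pilot booked that flight".split()))  # ['S']
```

A final stack of exactly `['S']` means the input was reduced all the way up to the start symbol, i.e. the sentence is accepted.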

  17. Let's do an example!

  18. Grammar and Lexicon

Grammar:
S → NP VP
S → Aux NP VP
S → VP
NP → Pronoun
NP → Proper-Noun
NP → Det Nominal
Nominal → Noun
Nominal → Nominal Noun
Nominal → Nominal PP
VP → Verb
VP → Verb NP
VP → Verb NP PP
VP → Verb PP
VP → VP PP
PP → Preposition NP

Lexicon:
Det → that | this | the | a
Noun → book | flight | meal | money
Verb → book | include | prefer
Pronoun → I | she | me
Proper-Noun → Houston | NWA
Aux → does
Preposition → from | to | on | near | through

  19. Top-Down vs Bottom-Up Search

Top-Down:
- Never considers derivations that do not end up at the root S
- Wastes a lot of time on trees that are inconsistent with the input

Bottom-Up:
- Generates many subtrees that will never lead to an S
- Only considers trees that cover some part of the input

Combining the two:
- Combine top-down expectations with bottom-up data to get more efficient searches
- Use one kind as the control and the other as a filter
- For both: how to explore the search space? Pursue all parses in parallel? Which rule to apply next? Which node to expand next?

  20. Ambiguity I
- A grammar is ambiguous if at least one string has multiple parse trees
- E.g. 1: ...old men and women...
- E.g. 2: I shot an elephant in my pajamas.
- Syntactic disambiguation: choosing the correct parse from the multitude of possible parses
- Such algorithms require statistical, semantic, and pragmatic knowledge
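The first example can be made concrete by writing its two readings as bracketed trees: does "old" modify only "men", or the whole coordination? One way to sketch this (the tree labels are a simplified, hypothetical analysis):

```python
# Two readings of "old men and women" as nested (label, children) tuples.
# Reading 1: old [men and women] -- "old" scopes over the coordination.
# Reading 2: [old men] and women -- "old" modifies only "men".
reading_1 = ("NP", ("Adj", "old"),
             ("NP", ("N", "men"), ("Conj", "and"), ("N", "women")))
reading_2 = ("NP", ("NP", ("Adj", "old"), ("N", "men")),
             ("Conj", "and"), ("N", "women"))

def leaves(t):
    """Collect a tree's words left to right."""
    if isinstance(t, str):
        return [t]
    return [w for c in t[1:] for w in leaves(c)]

# Same word string, two distinct trees: structural ambiguity.
print(leaves(reading_1) == leaves(reading_2))  # True
print(reading_1 == reading_2)                  # False
```

This is exactly the situation the statistical-parsing part of the outline addresses: a PCFG scores the competing trees and prefers the more probable one.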
