Natural Language Parsing Technology: Foundations of Language Science and Technology

Natural Language Parsing Technology. Foundations of Language Science and Technology (WS 2014/2015). Bernd Kiefer, Language Technology Lab, DFKI GmbH, Department of Computational Linguistics, Saarland University. November 2014.


  1. Natural Language Parsing Technology
     Foundations of Language Science and Technology (WS 2014/2015)
     Bernd Kiefer
     Language Technology Lab, DFKI GmbH
     Department of Computational Linguistics, Saarland University
     November 2014

  2. Outline
     - Overview
     - Basic Parsing Algorithms
       - Parsing Strategies
       - CYK Algorithm
       - Earley’s Algorithm
     - Parsing with Probabilistic Context-Free Grammar
       - PCFG
       - Inside-Outside Algorithm
     - Recent Advances in Parsing Technology

  4. Language & Grammar
     - Language
       - Structural
       - Productive
       - Ambiguous, yet efficient in human-human communication
     - Grammar
       - Generalization of regularities in language structures
       - Morphology & syntax, often complemented by phonetics, phonology, semantics, and pragmatics

  5. Ambiguity
     - Human languages are ambiguous on almost every layer
     - Grammar frameworks are designed to represent necessary ambiguities and eliminate unnecessary ones
     - Parsing models are responsible for retrieving valid analyses according to the grammar

  6. Syntactic Parser as NLP Component
     [Figure: NLP processing components: PoS Tagging, Chunking, Morph. Analysis, NER, Syntactic Parsing, Semantic Analysis, ...]

  7. Trees (or not)
     [Figure: three analyses of "Sue gave Paul an old penny": a phrase-structure tree (S over NP and VP; VP over V, NP, NP), a typed feature structure for "gave" (PHON | ORTH "GAVE"; SYNSEM | LOC with CAT HEAD verb, VAL SUBJ and COMPS lists of NPs, and CONT | RELS give_rel with ARG1 to ARG3), and a dependency graph with the relations SBJ, IOBJ, DOBJ, DET, ADJ]

  8. Chomsky Hierarchy
     - Type 0 (unrestricted rewriting system): $\alpha \to \beta$, with $\alpha, \beta \in (V_N \cup V_T)^*$
     - Type 1 (context-sensitive grammars): $\beta A \omega \to \beta \gamma \omega$, with $A \in V_N$ and $\beta, \gamma, \omega \in (V_N \cup V_T)^*$
     - Type 2 (context-free grammars): $A \to \beta$, with $A \in V_N$ and $\beta \in (V_N \cup V_T)^*$
     - Type 3 (regular grammars): $A \to xB \lor A \to x$, with $A, B \in V_N$ and $x \in V_T$

  9. Context-Free Grammar
     A CFG is a quadruple $\langle V_T, V_N, P, S \rangle$:
     - $V_T$: terminal symbols
     - $V_N$: non-terminal symbols
     - $P$: context-free productions $A \to \beta$, with $A \in V_N$ and $\beta \in (V_N \cup V_T)^*$
     - $S$: start symbol

  10. Context-Free Phrase Structure Grammar
      - S → NP VP
      - NP → Det N
      - N → Adj N
      - VP → V
      - VP → V NP
      - VP → Adv VP
      - N → dog | cat
      - Det → the | a
      - V → chases | sleeps
      - Adj → gray | lazy
      - Adv → fiercely
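
One possible way to hold the grammar from this slide in a program is a mapping from each left-hand side to its right-hand sides. The sketch below is only an illustration: the names `GRAMMAR`, `nonterminals` and `terminals` are hypothetical, and the convention that any symbol without a production counts as a terminal is an assumption of this encoding, not something the slides state.

```python
# Hypothetical encoding of the slide's grammar: each left-hand side maps to
# a list of right-hand sides; each right-hand side is a tuple of symbols.
GRAMMAR = {
    "S":   [("NP", "VP")],
    "NP":  [("Det", "N")],
    "N":   [("Adj", "N"), ("dog",), ("cat",)],
    "VP":  [("V",), ("V", "NP"), ("Adv", "VP")],
    "Det": [("the",), ("a",)],
    "V":   [("chases",), ("sleeps",)],
    "Adj": [("gray",), ("lazy",)],
    "Adv": [("fiercely",)],
}
START = "S"


def nonterminals(grammar):
    """V_N: every symbol that has at least one production."""
    return set(grammar)


def terminals(grammar):
    """V_T: every right-hand-side symbol without a production (assumed convention)."""
    return {
        symbol
        for alternatives in grammar.values()
        for rhs in alternatives
        for symbol in rhs
        if symbol not in grammar
    }


if __name__ == "__main__":
    print(sorted(nonterminals(GRAMMAR)))
    print(sorted(terminals(GRAMMAR)))
```

Together with `START`, the two sets and the dictionary correspond to the four components of a CFG: terminals, non-terminals, productions and start symbol.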

  11. CFG Derivation
      - If $\beta = \gamma A \delta$, $\omega = \gamma \alpha \delta$ and $A \to \alpha \in P$, then $\omega$ follows $\beta$: $\beta \Rightarrow \omega$
      - If $\beta_1, \beta_2, \ldots, \beta_m$ is a sequence of strings where for all $i$ ($1 \le i \le m-1$), $\beta_i \Rightarrow \beta_{i+1}$, then $\beta_1, \beta_2, \ldots, \beta_m$ is a derivation from $\beta_1$ to $\beta_m$
      - "Derivable" relation (transitive, reflexive): $\beta_1 \Rightarrow^* \beta_m$
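
The derivation step can be made concrete with a small sketch. It assumes strings are represented as tuples of symbols and uses a reduced, hypothetical grammar dictionary; `one_step` and `is_derivation` are illustrative names, not part of the slides.

```python
# Assumed toy grammar (a subset of a phrase structure grammar, for illustration).
GRAMMAR = {
    "S":   [("NP", "VP")],
    "NP":  [("Det", "N")],
    "VP":  [("V",), ("V", "NP")],
    "Det": [("the",)],
    "N":   [("dog",), ("cat",)],
    "V":   [("sleeps",), ("chases",)],
}


def one_step(beta, grammar):
    """Yield every string omega with beta => omega.

    beta is split as gamma A delta; A is replaced by alpha for each A -> alpha.
    """
    for i, symbol in enumerate(beta):
        for alpha in grammar.get(symbol, []):
            yield beta[:i] + alpha + beta[i + 1:]


def is_derivation(strings, grammar):
    """True if each string follows from its predecessor in one step.

    A single string is accepted as well, which mirrors reflexivity.
    """
    return all(
        strings[i + 1] in set(one_step(strings[i], grammar))
        for i in range(len(strings) - 1)
    )


if __name__ == "__main__":
    derivation = [
        ("S",),
        ("NP", "VP"),
        ("Det", "N", "VP"),
        ("the", "N", "VP"),
        ("the", "dog", "VP"),
        ("the", "dog", "V"),
        ("the", "dog", "sleeps"),
    ]
    print(is_derivation(derivation, GRAMMAR))
```

The example sequence rewrites the leftmost non-terminal at every step, but `one_step` itself places no restriction on which occurrence is rewritten.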

  13. Parsing Strategies
      - Top-down: start from the start symbol and expand the tree with grammar rules (e.g. replace the LHS symbol of a CFG production with its RHS sequence)
      - Bottom-up: start from the input sequence and apply grammar rules to build trees upwards (e.g. reduce the RHS sequence of a production to its LHS symbol)

  14. Top-Down Parsing
      - Goal-directed search
      - Wastes time on trees that do not match the input sentence
      - A pure top-down (left-first) approach cannot parse (left-)recursive grammars
      Example grammar:
        1. S → NP VP
        2. NP → NP PP
        3. ...
      [Figure: tree in which S expands to NP VP and the leftmost NP keeps expanding to NP PP, NP PP, NP PP, ...]
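
The left-recursion problem can be reproduced with a naive top-down recognizer. The following is a minimal sketch under stated assumptions: the function `recognize`, the depth guard `max_depth`, and both toy grammars are hypothetical and serve only to show that expanding NP → NP PP left-first never consumes input.

```python
def recognize(symbol, words, pos, grammar, depth=0, max_depth=50):
    """Yield every position reachable by expanding `symbol` starting at `pos`.

    Pure top-down, left-first expansion. `max_depth` is an artificial guard
    (an assumption of this sketch) that makes runaway recursion visible.
    """
    if depth > max_depth:
        raise RecursionError(f"expansion of {symbol!r} did not terminate")
    if symbol not in grammar:  # terminal symbol: must match the next word
        if pos < len(words) and words[pos] == symbol:
            yield pos + 1
        return
    for rhs in grammar[symbol]:
        ends = [pos]
        for sym in rhs:  # expand the right-hand side from left to right
            ends = [
                end2
                for end in ends
                for end2 in recognize(sym, words, end, grammar, depth + 1, max_depth)
            ]
        yield from ends


# No left recursion: the search terminates.
PLAIN = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N")],
    "VP": [("V",), ("V", "NP")],
    "Det": [("the",)],
    "N": [("dog",), ("cat",)],
    "V": [("sleeps",), ("chases",)],
}

# Adds the left-recursive rule NP -> NP PP from the slide.
LEFT_RECURSIVE = dict(PLAIN, NP=[("Det", "N"), ("NP", "PP")], PP=[("P", "NP")], P=[("near",)])

if __name__ == "__main__":
    words = ["the", "dog", "sleeps"]
    print(len(words) in recognize("S", words, 0, PLAIN))
    try:
        list(recognize("S", words, 0, LEFT_RECURSIVE))
    except RecursionError as error:
        print("left recursion:", error)
```

With the plain grammar the input is accepted; with the left-recursive grammar the recognizer re-enters NP at the same input position until the guard fires.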

  15. Bottom-Up Parsing
      - Uses the input to guide the search (data-driven)
      - Wastes time on trees that don’t result in S
      - Recursive unary rules still create an infinite parse forest for a finite-length sentence
      Example grammar:
        1. A → B | a
        2. B → A
        3. ...
      [Figure: unbounded unary chain, top to bottom: ..., B, A, B, A, a]

  16. Problems
      - Left-recursion: NP → NP PP
      - Ambiguity
      - Repeated parsing of subtrees

  17. Dynamic Programming (DP)
      - Divisibility: the optimal solution of a subproblem is part of the optimal solution of the whole problem
      - Memoization: solve small problems only once and remember the answers
      Examples:
      - Calculating Fibonacci numbers: $F_n = F_{n-1} + F_{n-2}$ ($F_0 = 0$, $F_1 = 1$)
      - Pascal's triangle (binomial coefficients): $\binom{n+1}{k+1} = \binom{n}{k} + \binom{n}{k+1}$
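
Both examples from this slide can be sketched in a few lines. The function names `fib` and `binomial` are illustrative; `functools.lru_cache` is used here as one convenient way to memoize, not as the method the lecture prescribes.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """F_n = F_{n-1} + F_{n-2}, with F_0 = 0 and F_1 = 1.

    The cache ensures each F_k is computed only once (memoization).
    """
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def binomial(n, k):
    """Binomial coefficient computed row by row with Pascal's rule.

    Each entry of the next row is the sum of two neighbouring entries of
    the current row: C(n+1, k+1) = C(n, k) + C(n, k+1).
    """
    row = [1]  # row 0 of Pascal's triangle
    for _ in range(n):
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
    return row[k] if 0 <= k <= n else 0


if __name__ == "__main__":
    print(fib(30))         # 832040
    print(binomial(5, 2))  # 10
```

Without the cache, the recursive Fibonacci definition recomputes the same subproblems exponentially often, which is the same kind of repeated work a parser does when it re-parses subtrees.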

  18. CYK Algorithm
      - Cocke-Younger-Kasami, also known as the CKY algorithm
      - Essentially a bottom-up chart parsing algorithm using dynamic programming
      - The CFG is in Chomsky Normal Form (CNF):
        - $A \to B\,C$
        - $A \to a$
        - $S \to \epsilon$
        - $A, B, C \in V_N$, $a \in V_T$, $B, C \neq S$
      - Fill in a two-dimensional array: $C[i][j]$ contains all possible syntactic interpretations of the substring $w_{i+1} \ldots w_j$
      - Complexity: $O(n^3)$

  19. CYK Algorithm
      for all i, j with 0 ≤ i < j ≤ n do
          C[i][j] ← ∅
      end for
      for all A → w_i ∈ P do
          C[i−1][i] ← {A} ∪ C[i−1][i]
      end for
      for s = ⟨2 … n⟩ do
          for all A → B C ∈ P, i, k : 0 ≤ i < k < i + s ≤ n do
              if B ∈ C[i][k] ∧ C ∈ C[k][i+s] then
                  C[i][i+s] ← {A} ∪ C[i][i+s]
              end if
          end for
      end for
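
The pseudocode translates fairly directly into Python. The version below is a sketch under assumptions: the grammar is already in CNF and split into a hypothetical `lexical` dictionary (word to set of categories) and a list `binary` of rules `(A, (B, C))`; the chart is a dictionary keyed by `(i, j)` rather than a two-dimensional array; the tiny grammar is made up for illustration.

```python
from collections import defaultdict


def cyk(words, lexical, binary):
    """Fill the CYK chart; chart[i, j] holds the categories spanning w_{i+1} ... w_j."""
    n = len(words)
    chart = defaultdict(set)  # missing cells behave like the empty set
    # Lexical rules A -> w_i fill the cells of span length 1.
    for i, word in enumerate(words, start=1):
        chart[i - 1, i] |= lexical.get(word, set())
    # Longer spans are built from two shorter, adjacent spans.
    for s in range(2, n + 1):              # span length
        for i in range(0, n - s + 1):      # left boundary, so that i + s <= n
            for k in range(i + 1, i + s):  # split point
                for a, (b, c) in binary:
                    if b in chart[i, k] and c in chart[k, i + s]:
                        chart[i, i + s].add(a)
    return chart


# Assumed toy grammar in CNF, for illustration only.
LEXICAL = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "chases": {"V"}}
BINARY = [("S", ("NP", "VP")), ("NP", ("Det", "N")), ("VP", ("V", "NP"))]

if __name__ == "__main__":
    sentence = "the dog chases the cat".split()
    chart = cyk(sentence, LEXICAL, BINARY)
    print("S" in chart[0, len(sentence)])
```

The three nested loops over span length, left boundary and split point each run at most n times, which is where the O(n^3) bound comes from; the loop over rules adds a factor that depends only on the grammar.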

  20. CYK Chart Example
      Grammar:
        S → NP VP | N VP | N V | NP V
        VP → V NP | V N | VP PP
        NP → D N | NP PP | N PP
        PP → P NP | P N
        N → john, girl, car
        V → saw, walks
        P → in
        D → the, a
      Input with string positions:
        0 john 1 saw 2 the 3 girl 4 in 5 a 6 car 7

  21. CYK Chart Example (same grammar and input; spans of length 1)
      C[0][1] = {N}, C[1][2] = {V}, C[2][3] = {D}, C[3][4] = {N}, C[4][5] = {P}, C[5][6] = {D}, C[6][7] = {N}

  22. CYK Chart Example (spans of length 2)
      C[0][2] = {S}, C[2][4] = {NP}, C[5][7] = {NP}

  23. CYK Chart Example (spans of length 3)
      C[1][4] = {VP}, C[4][7] = {PP}

  24. CYK Chart Example (spans of length 4)
      C[0][4] = {S}, C[3][7] = {NP}

  25. CYK Chart Example (spans of length 5)
      C[2][7] = {NP}

  26. CYK Chart Example (spans of length 6)
      C[1][7] = {VP}

  27. CYK Chart Example (span of length 7, the whole input)
      C[0][7] = {S}
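
The chart for "john saw the girl in a car" can be recomputed with a short script. This is a sketch rather than the lecture's implementation: the names `BINARY`, `LEXICON` and `count_chart` are hypothetical, and instead of storing sets it counts how many distinct trees each category has over each span, which makes the attachment ambiguity of "in a car" visible.

```python
from collections import defaultdict

# Binary rules (A, B, C) for A -> B C, taken from the example grammar.
BINARY = [
    ("S", "NP", "VP"), ("S", "N", "VP"), ("S", "N", "V"), ("S", "NP", "V"),
    ("VP", "V", "NP"), ("VP", "V", "N"), ("VP", "VP", "PP"),
    ("NP", "D", "N"), ("NP", "NP", "PP"), ("NP", "N", "PP"),
    ("PP", "P", "NP"), ("PP", "P", "N"),
]
# Each word of the example has exactly one category.
LEXICON = {
    "john": "N", "girl": "N", "car": "N",
    "saw": "V", "walks": "V", "in": "P", "the": "D", "a": "D",
}


def count_chart(words):
    """Return {(i, j, A): number of distinct trees for A over words[i:j]}."""
    n = len(words)
    count = defaultdict(int)
    for i, word in enumerate(words):
        count[i, i + 1, LEXICON[word]] += 1
    for s in range(2, n + 1):          # span length
        for i in range(n - s + 1):     # left boundary
            j = i + s
            for k in range(i + 1, j):  # split point
                for a, b, c in BINARY:
                    count[i, j, a] += count[i, k, b] * count[k, j, c]
    # Drop the zero entries that the lookups above created.
    return {key: value for key, value in count.items() if value}


if __name__ == "__main__":
    words = "john saw the girl in a car".split()
    for (i, j, category), trees in sorted(count_chart(words).items()):
        print(f"C[{i}][{j}] contains {category} ({trees} tree(s))")
```

Under this grammar the VP over positions 1 to 7 can be built in two ways (V NP with the PP inside the NP, or VP PP), so the S over the whole input has two trees even though the chart stores each category only once per cell.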
