

  1. Natural Language Processing Lecture 13—2/26/2015 Martha Palmer

  2. Today • Start on Parsing • Top-down vs. Bottom-up [Speech and Language Processing - Jurafsky and Martin, 2/26/15]

  3. Summary • Context-free grammars can be used to model various facts about the syntax of a language. • When paired with parsers, such grammars constitute a critical component in many applications. • Constituency is a key phenomenon easily captured with CFG rules. • But agreement and subcategorization do pose significant problems. • Treebanks pair sentences in a corpus with their corresponding trees.

  4. Parsing • Parsing with CFGs refers to the task of assigning proper trees to input strings. • Proper here means a tree that covers all and only the elements of the input and has an S at the top. • It doesn't actually mean that the system can select the correct tree from among all the possible trees.

  5. Automatic Syntactic Parse

  6. For Now • Assume… • You have all the words already in some buffer • The input is not POS tagged prior to parsing • We won't worry about morphological analysis • All the words are known • These are all problematic in various ways, and would have to be addressed in real applications.

  7. Top-Down Search • Since we're trying to find trees rooted with an S (Sentence), why not start with the rules that give us an S? • Then we can work our way down from there to the words.

  8. Top-Down Space

  9. Bottom-Up Parsing • Of course, we also want trees that cover the input words. So we might instead start with trees that link up with the words in the right way. • Then we work our way up from there to larger and larger trees.
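A minimal illustration of the bottom-up idea is a greedy shift-reduce loop: shift one POS tag at a time, and reduce whenever some rule's right-hand side matches the top of the stack. The tiny rule set here is illustrative, not the lecture's full grammar.

```python
def shift_reduce(tags, rules):
    """Greedy bottom-up parse: shift a tag, then reduce while any rule fits."""
    stack = []
    for tag in tags:
        stack.append(tag)                      # shift
        reduced = True
        while reduced:                         # reduce as long as possible
            reduced = False
            for lhs, rhs in rules:
                if stack[-len(rhs):] == rhs:   # RHS matches top of stack
                    stack[len(stack) - len(rhs):] = [lhs]
                    reduced = True
                    break
    return stack

# "The cat sat" as POS tags; a full parser would also consult the lexicon.
RULES = [("NP", ["Det", "Noun"]), ("VP", ["Verb"]), ("S", ["NP", "VP"])]
print(shift_reduce(["Det", "Noun", "Verb"], RULES))  # ['S']
```

Note the caveat: a greedy reducer commits to the first reduction it finds, so on ambiguous tags (slide 40's "can") it can reduce too early; real bottom-up parsers keep alternatives around instead.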

  10. Bottom-Up Search

  11. Bottom-Up Search

  12. Bottom-Up Search

  13. Bottom-Up Search

  14. Bottom-Up Search

  15. Control • Of course, in both cases we left out how to keep track of the search space and how to make choices: which node to try to expand next, and which grammar rule to use to expand a node. • One approach is called backtracking: make a choice; if it works out, fine; if not, back up and make a different choice. • Same as with ND-Recognize.

  16. Problems • Even with the best filtering, backtracking methods are doomed because of two interrelated problems: • Ambiguity and search control (choice) • Shared subproblems

  17. Ambiguity

  18. Structural Ambiguities • It's very important to separate PPs that are part of the verb's subcategorization frame from PPs that modify the entire event. • The man saw the woman on the hill with the telescope. (The woman has the telescope.) • The man saw the woman on the hill with the telescope. (The man has the telescope.)
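The number of attachment choices grows quickly with each added PP. Under a small illustrative binary grammar (an assumption here, not the lecture's grammar), a CKY-style chart that counts analyses per span makes the ambiguity concrete:

```python
from collections import defaultdict

# Illustrative binary rules (parent, left, right); POS tags act as leaves.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "Noun"),
    ("NP", "NP", "PP"),      # PP modifies a noun phrase
    ("PP", "Prep", "NP"),
    ("VP", "Verb", "NP"),
    ("VP", "VP", "PP"),      # PP modifies the event
]

def count_parses(tags):
    """CKY-style chart: chart[(i, j)][cat] = number of parses of tags[i:j]."""
    n = len(tags)
    chart = defaultdict(lambda: defaultdict(int))
    for i, tag in enumerate(tags):
        chart[(i, i + 1)][tag] = 1               # each tag spans one position
    for width in range(2, n + 1):                # build longer spans from shorter
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += (chart[(i, k)][left]
                                              * chart[(k, j)][right])
    return chart[(0, n)]["S"]

# "The man saw the woman on the hill with the telescope" as POS tags:
TAGS = ["Det", "Noun", "Verb", "Det", "Noun",
        "Prep", "Det", "Noun", "Prep", "Det", "Noun"]
print(count_parses(TAGS))        # 5 analyses with two PPs
print(count_parses(TAGS[:8]))    # 2 analyses with one PP
```

The counts follow the Catalan pattern: one PP gives 2 attachments, two give 5, and each further PP multiplies the ambiguity.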

  19. Shared Sub-Problems • No matter what kind of search (top-down, bottom-up, or mixed) we choose... • We can't afford to redo work we've already done. • Without some help, naïve backtracking will lead to such duplicated work.
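One standard remedy is to memoize subresults so that each (category, position) subproblem is solved only once; this is the idea behind chart parsing. A sketch under a small illustrative, non-left-recursive grammar (left-recursive rules like NP → NP PP would need a bottom-up chart instead):

```python
import functools

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "Noun"], ["Det", "Noun", "PP"]],
    "PP": [["Prep", "NP"]],
    "VP": [["Verb", "NP"], ["Verb", "NP", "PP"]],
}
TAGS = ["Det", "Noun", "Verb", "Det", "Noun", "Prep", "Det", "Noun"]

@functools.lru_cache(maxsize=None)
def spans(cat, i):
    """All positions j such that cat derives TAGS[i:j]; cached per (cat, i)."""
    if cat not in GRAMMAR:                        # POS tag: match the input
        ok = i < len(TAGS) and TAGS[i] == cat
        return frozenset({i + 1}) if ok else frozenset()
    ends = set()
    for rhs in GRAMMAR[cat]:
        frontier = {i}
        for sym in rhs:                           # thread positions through RHS
            frontier = {j for s in frontier for j in spans(sym, s)}
        ends |= frontier
    return frozenset(ends)

print(len(TAGS) in spans("S", 0))    # True: the tag string is a sentence
print(spans.cache_info().hits > 0)   # True: subproblems really were shared
```

Here the NP starting at position 3 is needed by both VP alternatives, but the cache means it is parsed once.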

  20. Sample L1 Grammar

  21. State space representations: recursive transition nets [RTN diagrams: an S network (states S1-S3) with NP and VP arcs; an NP network (states S4-S6) with det/adj/noun, noun, and pronoun arcs and a PP loop] • s :- np, vp. • np :- pronoun ; noun ; det, adj, noun ; np, pp. [CSE391 - NLP, 2005]

  22. State space representations: recursive transition nets, cont. [RTN diagrams for the VP network (states S7-S14) with V, NP, PP, and aux arcs] • VP :- VP, PP. • VP :- V ; V, NP ; V, NP, NP ; V, NP, PP.
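These networks can be written directly as data: each arc is (state, label, next state), and an arc label that names another network triggers a recursive traversal, which is what makes the nets "recursive". The state names echo the slides, but the arc sets below are a simplified guess at the diagrams, not a faithful copy:

```python
# Simplified RTNs; arc labels that name another network recurse into it.
NETS = {
    "s":  {"start": "S1", "final": {"S3"},
           "arcs": [("S1", "np", "S2"), ("S2", "vp", "S3")]},
    "np": {"start": "S4", "final": {"S6"},
           "arcs": [("S4", "det", "S5"), ("S5", "noun", "S6"),
                    ("S4", "noun", "S6"), ("S4", "pronoun", "S6")]},
    "vp": {"start": "S7", "final": {"S8", "S9"},
           "arcs": [("S7", "v", "S8"), ("S8", "np", "S9")]},
}

def traverse(name, tags, i):
    """Every input position reachable by driving tags[i:] through the net."""
    net = NETS[name]
    def walk(state, i):
        ends = set()
        if state in net["final"]:                      # may stop here
            ends.add(i)
        for src, label, dst in net["arcs"]:
            if src != state:
                continue
            if label in NETS:                          # subnet call
                ends |= {k for j in traverse(label, tags, i)
                           for k in walk(dst, j)}
            elif i < len(tags) and tags[i] == label:   # consume one tag
                ends |= walk(dst, i + 1)
        return ends
    return walk(net["start"], i)

def recognize(tags):
    return len(tags) in traverse("s", tags, 0)

print(recognize(["det", "noun", "v"]))  # True, e.g. "the cat sat"
```

An intransitive and a transitive reading both work because the VP net has two final states.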

  23. Parses The cat sat on the mat. S1: S → NP, VP [tree: S with NP and VP children]

  24. Parses The cat sat on the mat. S1: S → NP, VP S2: NP → Det, N [tree: NP expanded to Det "the", N "cat"]

  25. Parses The cat sat on the mat. S1: S → NP, VP S2: NP → Det, N S3: VP → V [tree: VP expanded to V "sat"]

  26. Parses The cat sat on the mat. S1: S → NP, VP S2: NP → Det, N S3: VP → V S4: VP → VP, PP [tree: PP expanded to Prep "on" and NP "the mat"]

  27. Multiple parses for a single sentence Time flies like an arrow. [tree: S → NP (N "time") VP (V "flies", PP (Prep "like", NP (Det "an", N "arrow")))]

  28. Multiple parses for a single sentence Time flies like an arrow. [tree: S → NP (N "time", N "flies") VP (V "like", NP (Det "an", N "arrow"))]

  29. Lexicon noun(cat). noun(flies). noun(mat). noun(time). noun(arrow). det(the). det(a). det(an). verb(sat). verb(flies). verb(time). prep(on). prep(like).

  30. Lexicon with Roots noun(cat,cat). noun(flies,fly). noun(mat,mat). noun(time,time). noun(arrow,arrow). det(the,the). det(a,a). det(an,an). verb(sat,sit). verb(flies,fly). verb(time,time). prep(on,on). prep(like,like).
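The same lexicon is easy to render as a Python dict from surface form to (part of speech, root) pairs; ambiguous forms like "flies" and "time" simply get multiple entries:

```python
# The slide's Prolog facts as a Python lexicon: word -> [(POS, root), ...]
LEXICON = {
    "cat":   [("noun", "cat")],
    "flies": [("noun", "fly"), ("verb", "fly")],
    "mat":   [("noun", "mat")],
    "time":  [("noun", "time"), ("verb", "time")],
    "arrow": [("noun", "arrow")],
    "the":   [("det", "the")],
    "a":     [("det", "a")],
    "an":    [("det", "an")],
    "sat":   [("verb", "sit")],
    "on":    [("prep", "on")],
    "like":  [("prep", "like")],
}

def analyses(word):
    """All (POS, root) readings of a surface form; empty list if unknown."""
    return LEXICON.get(word, [])

print(analyses("flies"))  # [('noun', 'fly'), ('verb', 'fly')]
```

The two readings of "flies" are exactly what licenses the two parses of "Time flies like an arrow" above.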

  31. Parses The old can can hold the water. [tree: S → NP (det "the", adj "old", N "can") VP (aux "can", V "hold", NP (det "the", N "water"))]

  32. Structural ambiguities • That factory can can tuna. • That factory cans cans of tuna and salmon.

  33. Lexicon The old can can hold the water. Noun(can,can) Noun(cans,can) Noun(water,water) Noun(hold,hold) Noun(holds,hold) Noun(old,old) Verb(hold,hold) Verb(holds,hold) Verb(can,can) Aux(can,can) Adj(old,old) Det(the,the)

  34. Simple Context-Free Grammar in BNF S → NP VP NP → Pronoun | Noun | Det Adj Noun | NP PP PP → Prep NP V → Verb | Aux Verb VP → V | V NP | V NP NP | V NP PP | VP PP
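Encoded as data, this grammar also exposes a pitfall for the top-down strategy: NP → NP PP and VP → VP PP are directly left-recursive, so a naive depth-first top-down parser would expand them forever. A small check (helper name is illustrative):

```python
# The BNF grammar above as a dict from nonterminal to alternative RHSs.
BNF = {
    "S":  [["NP", "VP"]],
    "NP": [["Pronoun"], ["Noun"], ["Det", "Adj", "Noun"], ["NP", "PP"]],
    "PP": [["Prep", "NP"]],
    "V":  [["Verb"], ["Aux", "Verb"]],
    "VP": [["V"], ["V", "NP"], ["V", "NP", "NP"],
           ["V", "NP", "PP"], ["VP", "PP"]],
}

def directly_left_recursive(grammar):
    """Nonterminals with some rule whose RHS begins with the LHS itself."""
    return {lhs for lhs, alts in grammar.items()
                for rhs in alts if rhs and rhs[0] == lhs}

print(sorted(directly_left_recursive(BNF)))  # ['NP', 'VP']
```

Bottom-up and chart-based parsers handle such rules without trouble; a pure top-down parser needs the left recursion rewritten away first.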

  35. Top-down parse in progress [The, old, can, can, hold, the, water] S → NP VP NP → NP? NP → Pronoun? Pronoun? fail NP → Noun? Noun? fail NP → Det Adj Noun? Det? the Adj? old Noun? can Succeed. Succeed. VP?

  36. Top-down parse in progress [can, hold, the, water] VP → V? V → Verb? Verb? fail V → Aux Verb? Aux? can Verb? hold succeed succeed fail [the, water]

  37. Top-down parse in progress [can, hold, the, water] VP → V NP? V → Verb? Verb? fail V → Aux Verb? Aux? can Verb? hold NP → Pronoun? Pronoun? fail NP → Noun? Noun? fail NP → Det Adj Noun? Det? the Noun? water SUCCEED SUCCEED
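The trace in slides 35-37 can be sketched with a pair of generators: laziness gives backtracking for free, since a failed continuation simply asks the previous choice point for its next result. The grammar below is an illustrative, non-left-recursive subset of slide 34's, with NP → Det Noun added so that "the water" can parse (the slide's own trace assumes something similar):

```python
TOY_GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Pronoun"], ["Noun"], ["Det", "Adj", "Noun"], ["Det", "Noun"]],
    "V":  [["Verb"], ["Aux", "Verb"]],
    "VP": [["V"], ["V", "NP"]],
}
TOY_LEXICON = {
    "the": {"Det"}, "old": {"Adj", "Noun"}, "can": {"Noun", "Verb", "Aux"},
    "hold": {"Verb", "Noun"}, "water": {"Noun"},
}

def parse(cat, words, i):
    """Yield every j such that cat derives words[i:j], trying rules in order."""
    if cat not in TOY_GRAMMAR:                  # preterminal: check the lexicon
        if i < len(words) and cat in TOY_LEXICON.get(words[i], ()):
            yield i + 1
        return
    for rhs in TOY_GRAMMAR[cat]:                # choice point over alternatives
        yield from parse_seq(rhs, words, i)

def parse_seq(syms, words, i):
    """Thread the input position through a rule's right-hand side."""
    if not syms:
        yield i
        return
    for j in parse(syms[0], words, i):          # backtrack over each split
        yield from parse_seq(syms[1:], words, j)

WORDS = "the old can can hold the water".split()
full = [j for j in parse("S", WORDS, 0) if j == len(WORDS)]
print(len(full))  # 1: only Aux "can" + Verb "hold" covers the whole input
```

The parser tries and abandons Pronoun, Noun, and the bare-Verb VP just as the slides do before the Aux-Verb analysis succeeds.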

  38. Top-down approach • Start with the goal of a sentence: S → NP VP, S → Wh-word Aux NP VP • Will try to find an NP 4 different ways before trying a parse where the verb comes first. • What would be better?

  39. Bottom-up approach • Start with the words in the sentence. • What structures do they correspond to? • Once a structure is built, it is kept on a CHART.

  40. Bottom-up parse in progress The old can can hold the water. [two candidate tag rows: det adj noun aux verb det noun / det noun aux/verb noun/verb noun det noun]

  41. Bottom-up parse in progress The old can can hold the water. [same tag rows, with partial structures NP, V, VP, and S built over them]

  42. Bottom-up parse in progress – what is wrong with the bottom parse? The old can can hold the water. [tag rows: det adj noun aux verb det noun / det noun aux/verb noun/verb noun det noun/verb]

  43. Bottom-up parse, corrected The old can can hold the water. [tag row: det noun verb noun noun det noun/verb, with structures NP, NP, V, NP, VP, S built over it]

  44. Headlines • Police Begin Campaign To Run Down Jaywalkers • Iraqi Head Seeks Arms • Teacher Strikes Idle Kids • Miners Refuse To Work After Death • Juvenile Court To Try Shooting Defendant

  45. Headlines • Drunk Gets Nine Months in Violin Case • Enraged Cow Injures Farmer with Ax • Hospitals are Sued by 7 Foot Doctors • Milk Drinkers Turn to Powder • Lung Cancer in Women Mushrooms

  46. Top-down vs. Bottom-up • Top-down: helps with POS ambiguities – only considers the relevant POS; rebuilds the same structure repeatedly; spends a lot of time on useless structures (trees that are not consistent with any of the words). • Bottom-up: has to consider every POS; builds each structure once; spends a lot of time on impossible parses (trees that make no sense globally). What would be better?
