
Computational Linguistics II: Parsing. Unger’s Parsing Method



  1. Computational Linguistics II: Parsing
     Unger’s Parsing Method
     Frank Richter & Jan-Philipp Söhn
     fr@sfs.uni-tuebingen.de, jp.soehn@uni-tuebingen.de
     November 29th, 2006
     Richter/Söhn (WS 2006/07), Computational Linguistics II: Parsing

  2. Unger’s Parser
     - top-down processing
     - guesses how to split the input string into partitions, each of which can be derived from a particular daughter
     - all possible splits are tried
     - assume: ε-free grammar
     - example:
       rule: S → PP NP VP | NP VP | VP
       sentence: In the Olympic Games, Greeks ran races, jumped, hurled the biscuits, and threw the java.

  3. Unger’s Parser – Example
     S → VP: easy ⇒ VP → In the Olympic Games, Greeks ran races, jumped, hurled the biscuits, and threw the java.
     S → NP VP:
       NP                       VP
       In                       the Olympic Games, Greeks...
       In the                   Olympic Games, Greeks ran...
       In the Olympic           Games, Greeks ran races...
       In the Olympic Games,    Greeks ran races, jumped...
       ...
       In the Olympic...        java.

  4. Unger’s Parser – Example II
     S → PP NP VP:
       PP                   NP                VP
       In                   the               Olympic Games,...
       In                   the Olympic       Games, Greeks...
       ...
       In the               Olympic           Games, Greeks ran...
       In the               Olympic Games,    Greeks ran...
       ...
       In the Olympic...    the               java.
     - then try all rules and all partitions for PP, NP, VP
     - each symbol needs to cover at least one word ⇒ the strings will always become shorter
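
The enumeration of splits on the two example slides can be sketched in a few lines of Python. This is only an illustration, not code from the lecture: the function name `splits` is made up here, and tokenising by whitespace is an assumption. Cutting m words into n non-empty parts amounts to choosing n - 1 cut positions out of the m - 1 gaps between words.

```python
from itertools import combinations


def splits(words, n):
    """Yield every way to cut `words` into n contiguous, non-empty parts."""
    # Pick n-1 cut positions among the gaps between neighbouring words.
    for cuts in combinations(range(1, len(words)), n - 1):
        bounds = (0, *cuts, len(words))
        yield [words[i:j] for i, j in zip(bounds, bounds[1:])]


sentence = ("In the Olympic Games, Greeks ran races, jumped, "
            "hurled the biscuits, and threw the java.").split()

print(len(list(splits(sentence, 2))))  # 14 candidate splits for S → NP VP
print(len(list(splits(sentence, 3))))  # 91 candidate splits for S → PP NP VP
```

Each of these candidate splits then has to be checked recursively for every daughter, which is where the cost of the method comes from.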

  5. Unger’s Parser – Details
     - can be executed depth-first or breadth-first
     - immense number of comparisons: exponential time complexity
     - possible optimization: discard splits for which terminals do not match
       rule: NPK → NP and NP
       impossible split: {many poems and}_NP {verse}_and {and also literature}_NP
     - more optimizations: e.g. compute the minimum number of terminals that can be derived from each non-terminal
       example: if the minimal length for the non-terminal VP is 3, discard all partitions that give VP fewer than 3 words
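
Both optimizations can be sketched as a cheap filter that runs before any recursive parsing. The code below is an illustration under assumptions, not the lecture's implementation: the grammar is a dict from non-terminal to a list of right-hand sides, every symbol that is not a key counts as a terminal, and the names `min_lengths` and `viable` as well as the small lexicon are hypothetical. Only the rule NPK → NP and NP and the impossible split come from the slide.

```python
def min_lengths(grammar):
    """Least number of words each non-terminal can cover (fixed-point iteration)."""
    best = {nt: float("inf") for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                # Symbols that are not grammar keys are terminals: one word each.
                total = sum(best.get(sym, 1) for sym in rhs)
                if total < best[nt]:
                    best[nt] = total
                    changed = True
    return best


def viable(rhs, parts, grammar, best):
    """Cheap test applied to a split before any recursive parsing."""
    for sym, part in zip(rhs, parts):
        if sym in grammar:
            if len(part) < best[sym]:   # part too short for this non-terminal
                return False
        elif part != [sym]:             # a terminal must match its word exactly
            return False
    return True


# Hypothetical lexicon; only the NPK rule is taken from the slide.
grammar = {
    "NPK": [["NP", "and", "NP"]],
    "NP": [["N"], ["ADJ", "N"]],
    "N": [["poems"], ["verse"], ["literature"]],
    "ADJ": [["many"]],
}
best = min_lengths(grammar)
rhs = grammar["NPK"][0]
bad = [["many", "poems", "and"], ["verse"], ["and", "also", "literature"]]
good = [["many", "poems"], ["and"], ["verse", "and", "also", "literature"]]

print(best["NPK"])                       # 3
print(viable(rhs, bad, grammar, best))   # False: "verse" is not "and"
print(viable(rhs, good, grammar, best))  # True: the split survives the filter
```

A split that survives the filter is not necessarily parsable; it only has not been ruled out yet.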

  6. Unger Algorithm – parallel
     1. if Z ∈ T and Z = w_k, finish
     2. select rule Z → X_1 ... X_n
     3. split up the sentence into n parts w_1 ... w_n in all different ways
     4. for all k = 1 to n: if X_k ∈ T and X_k ≠ w_k, discard the split; otherwise store the split
     5. select one split; for all parts Z repeat steps 1–4
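
The five steps can be sketched as a small recognizer. The version below is a depth-first variant (it tries one split at a time instead of storing all of them) and rests on assumptions not stated on the slide: the grammar is ε-free and has no cyclic unit rules such as A → B, B → A, symbols that are not dict keys are terminals, and the toy grammar and the names `splits` and `derives` are hypothetical.

```python
from itertools import combinations


def splits(words, n):
    """Step 3: all ways to cut `words` into n contiguous, non-empty parts."""
    for cuts in combinations(range(1, len(words)), n - 1):
        bounds = (0, *cuts, len(words))
        yield [words[i:j] for i, j in zip(bounds, bounds[1:])]


def derives(grammar, symbol, words):
    """True if `symbol` derives exactly the tuple `words`."""
    if symbol not in grammar:                       # step 1: Z is a terminal
        return words == (symbol,)
    for rhs in grammar[symbol]:                     # step 2: select a rule
        for parts in splits(words, len(rhs)):       # step 3: try every split
            # Step 4: discard the split if a terminal does not match its part.
            if any(x not in grammar and p != (x,) for x, p in zip(rhs, parts)):
                continue
            # Step 5: repeat the procedure for every part of the split.
            if all(derives(grammar, x, p) for x, p in zip(rhs, parts)):
                return True
    return False


# Hypothetical toy grammar, for illustration only.
grammar = {
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["Greeks"], ["races"]],
    "VP": [["V", "NP"], ["V"]],
    "V": [["ran"], ["jumped"]],
}

print(derives(grammar, "S", ("Greeks", "ran", "races")))  # True
print(derives(grammar, "S", ("Greeks", "races")))         # False
```

The sketch only answers yes or no; it keeps no record of which rules and splits succeeded.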

  7. Towards a Real Algorithm
     - What knowledge needs to be preserved during the parse?
     - What data structures do we need?
     - What happens if a possibility turns out to be wrong?

  8. Unger’s Parser with ε Rules
     allow the empty string as a partition:
     rule: S → NP VP:
       NP                       VP
       ε                        In the Olympic Games,...
       In                       the Olympic Games, Greeks...
       In the                   Olympic Games, Greeks ran...
       In the Olympic           Games, Greeks ran races...
       In the Olympic Games,    Greeks ran races, jumped...
       ...                      ...
       In the Olympic...        java.
       In the Olympic...        ε

  9. Unger’s Parser with ε Rules II
     problem: loops
     rules: S → NP VP, and VP → V S
     sentence: The Magna Carta provided that no free man should be hanged twice for the same offense.
     problematic partition:
       NP    VP
       ε     The Magna Carta provided that...
             V    S
             ε    The Magna Carta provided...

  10. Unger’s Parser with ε Rules III
      Solution: check in the decision history whether the same situation has occurred before
        S ⇒ The Magna ... same offense.
          NP ⇒ ε ; VP ⇒ The Magna ... same offense.
            V ⇒ ε ; S ⇒ The Magna ... same offense.   cut off!
          ...
          NP ⇒ The ; VP ⇒ Magna ... same offense
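
One way to realise the history check is to pass along the set of goals that are still open on the current path and to give up on a goal that is already in it. The sketch below is an illustration under assumptions: an ε rule is written as an empty right-hand side, symbols that are not dict keys are terminals, and the toy grammar (which reuses the rule shapes S → NP VP and VP → V S) and all function names are hypothetical.

```python
from itertools import combinations_with_replacement


def splits_with_empty(words, n):
    """All ways to cut `words` into n contiguous parts, empty parts allowed."""
    for cuts in combinations_with_replacement(range(len(words) + 1), n - 1):
        bounds = (0, *cuts, len(words))
        yield [words[i:j] for i, j in zip(bounds, bounds[1:])]


def derives(grammar, symbol, words, history=frozenset()):
    """True if `symbol` derives exactly the tuple `words`."""
    if symbol not in grammar:                 # terminal: must equal the one word
        return words == (symbol,)
    goal = (symbol, words)
    if goal in history:                       # same situation as before: cut off
        return False
    history = history | {goal}                # goals open on the current path
    for rhs in grammar[symbol]:
        if not rhs:                           # ε rule covers only the empty string
            if not words:
                return True
            continue
        for parts in splits_with_empty(words, len(rhs)):
            if all(derives(grammar, x, p, history) for x, p in zip(rhs, parts)):
                return True
    return False


# Hypothetical toy grammar in which NP and V may be empty.
grammar = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], []],
    "VP": [["V", "S"], ["V"]],
    "V": [["provided"], []],
    "N": [["Carta"]],
}

print(derives(grammar, "S", ("the", "Carta", "provided")))  # True
print(derives(grammar, "S", ("Carta", "the")))              # False
```

Without the `goal in history` test the first call would recurse forever through S ⇒ NP VP with NP ⇒ ε and VP ⇒ V S with V ⇒ ε. The history holds only the goals on the current path, so the same goal can still be tried again on a different branch.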

  11. Example
      Sentence: shit happens on the other side of the wormhole (Trekkism, DS9)
      Grammar:
        S → NP VP
        NP → N | DET N | DET ADJ N | NP PP
        VP → V PP
        PP → P NP
        ADJ → other
        DET → the
        N → shit | side | wormhole
        P → on | of
        V → happens
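
The example can be run through a small depth-first Unger-style parser that returns the first parse tree it finds. This is a sketch and not the course's reference solution: the function names and the dict encoding of the grammar and the nested-tuple encoding of trees are assumptions. The grammar is ε-free, so every part of a split must contain at least one word, which also keeps the left-recursive rule NP → NP PP from looping.

```python
from itertools import combinations


def splits(words, n):
    """All ways to cut `words` into n contiguous, non-empty parts."""
    for cuts in combinations(range(1, len(words)), n - 1):
        bounds = (0, *cuts, len(words))
        yield [words[i:j] for i, j in zip(bounds, bounds[1:])]


def parse(grammar, symbol, words):
    """Return one parse tree as nested tuples, or None if there is none."""
    if symbol not in grammar:                   # terminal symbol
        return symbol if words == (symbol,) else None
    for rhs in grammar[symbol]:
        for parts in splits(words, len(rhs)):
            children = []
            for x, p in zip(rhs, parts):
                child = parse(grammar, x, p)
                if child is None:               # this split fails, try the next
                    break
                children.append(child)
            else:                               # every daughter was parsed
                return (symbol, *children)
    return None


grammar = {
    "S": [["NP", "VP"]],
    "NP": [["N"], ["DET", "N"], ["DET", "ADJ", "N"], ["NP", "PP"]],
    "VP": [["V", "PP"]],
    "PP": [["P", "NP"]],
    "ADJ": [["other"]],
    "DET": [["the"]],
    "N": [["shit"], ["side"], ["wormhole"]],
    "P": [["on"], ["of"]],
    "V": [["happens"]],
}

sentence = tuple("shit happens on the other side of the wormhole".split())
print(parse(grammar, "S", sentence))
```

With this grammar, "of the wormhole" can only attach inside the NP via NP → NP PP, because VP → V PP allows exactly one PP.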
