  1. Weaknesses of Probabilistic Context-Free Grammars Michael Collins, Columbia University

  2. Weaknesses of PCFGs

     ◮ Lack of sensitivity to lexical information
     ◮ Lack of sensitivity to structural frequencies

  3. [S [NP [NNP IBM]] [VP [Vt bought] [NP [NNP Lotus]]]]

     p(t) = q(S → NP VP) × q(NP → NNP) × q(NNP → IBM)
            × q(VP → Vt NP) × q(Vt → bought)
            × q(NP → NNP) × q(NNP → Lotus)
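
     A minimal sketch of this computation in Python, using invented rule
     probabilities purely for illustration (a real PCFG would estimate q
     from a treebank):

         from functools import reduce

         # Hypothetical rule probabilities q(rule); the numbers are made up
         # for illustration and are not from any real treebank.
         q = {
             ("S",   ("NP", "VP")): 1.0,
             ("NP",  ("NNP",)):     0.3,
             ("NNP", ("IBM",)):     0.01,
             ("VP",  ("Vt", "NP")): 0.4,
             ("Vt",  ("bought",)):  0.05,
             ("NNP", ("Lotus",)):   0.002,
         }

         # Rules used in the tree [S [NP [NNP IBM]] [VP [Vt bought] [NP [NNP Lotus]]]]
         rules = [
             ("S",   ("NP", "VP")),
             ("NP",  ("NNP",)),
             ("NNP", ("IBM",)),
             ("VP",  ("Vt", "NP")),
             ("Vt",  ("bought",)),
             ("NP",  ("NNP",)),
             ("NNP", ("Lotus",)),
         ]

         # p(t) is the product of the probabilities of the rules in the tree
         p_t = reduce(lambda acc, rule: acc * q[rule], rules, 1.0)
         print(p_t)  # ≈ 3.6e-08 with the made-up numbers above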

  4. Another Case of PP Attachment Ambiguity

     (a) [S [NP [NNS workers]]
            [VP [VP [VBD dumped] [NP [NNS sacks]]]
                [PP [IN into] [NP [DT a] [NN bin]]]]]

     (b) [S [NP [NNS workers]]
            [VP [VBD dumped]
                [NP [NP [NNS sacks]]
                    [PP [IN into] [NP [DT a] [NN bin]]]]]]

  5. Rules

     Rules in (a)        Rules in (b)
     S → NP VP           S → NP VP
     NP → NNS            NP → NNS
     VP → VP PP          NP → NP PP
     VP → VBD NP         VP → VBD NP
     NP → NNS            NP → NNS
     PP → IN NP          PP → IN NP
     NP → DT NN          NP → DT NN
     NNS → workers       NNS → workers
     VBD → dumped        VBD → dumped
     NNS → sacks         NNS → sacks
     IN → into           IN → into
     DT → a              DT → a
     NN → bin            NN → bin

     If q(NP → NP PP) > q(VP → VP PP) then (b) is more probable, else (a) is
     more probable. The attachment decision is completely independent of the
     words.
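
     The cancellation can be made explicit in a short Python sketch (rules
     written here as plain strings; this is an illustration, not code from
     the lecture):

         from collections import Counter

         # Rule multisets for parses (a) and (b), taken from the table above
         shared = [
             "S -> NP VP", "NP -> NNS", "VP -> VBD NP", "NP -> NNS",
             "PP -> IN NP", "NP -> DT NN", "NNS -> workers",
             "VBD -> dumped", "NNS -> sacks", "IN -> into",
             "DT -> a", "NN -> bin",
         ]
         rules_a = Counter(shared + ["VP -> VP PP"])  # PP attaches to the VP
         rules_b = Counter(shared + ["NP -> NP PP"])  # PP attaches to the NP

         # Every shared rule cancels in the ratio p(a)/p(b), so the choice
         # between the parses reduces to one comparison of rule probabilities.
         print(rules_a - rules_b)  # Counter({'VP -> VP PP': 1})
         print(rules_b - rules_a)  # Counter({'NP -> NP PP': 1})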

  6. A Case of Coordination Ambiguity

     (a) [NP [NP [NP [NNS dogs]]
                 [PP [IN in] [NP [NNS houses]]]]
             [CC and]
             [NP [NNS cats]]]

     (b) [NP [NP [NNS dogs]]
             [PP [IN in]
                 [NP [NP [NNS houses]]
                     [CC and]
                     [NP [NNS cats]]]]]

  7. Rules

     Rules in (a)        Rules in (b)
     NP → NP CC NP       NP → NP CC NP
     NP → NP PP          NP → NP PP
     NP → NNS            NP → NNS
     PP → IN NP          PP → IN NP
     NP → NNS            NP → NNS
     NP → NNS            NP → NNS
     NNS → dogs          NNS → dogs
     IN → in             IN → in
     NNS → houses        NNS → houses
     CC → and            CC → and
     NNS → cats          NNS → cats

     Here the two parses have identical rules, and therefore have identical
     probability under any assignment of PCFG rule probabilities.
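
     This can be checked mechanically. The sketch below (trees encoded as
     nested tuples, an encoding chosen here purely for illustration)
     collects each parse's rule multiset and confirms the two multisets
     are identical, so the products of rule probabilities must be equal:

         from collections import Counter

         def tree_rules(tree):
             """Collect the rules of a tree given as nested (label, children...)
             tuples; a leaf word is a plain string."""
             label, *children = tree
             rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
             rules = [(label, rhs)]
             for child in children:
                 if not isinstance(child, str):
                     rules.extend(tree_rules(child))
             return rules

         # Parse (a): (dogs in houses) and cats
         a = ("NP",
              ("NP", ("NP", ("NNS", "dogs")),
                     ("PP", ("IN", "in"), ("NP", ("NNS", "houses")))),
              ("CC", "and"),
              ("NP", ("NNS", "cats")))

         # Parse (b): dogs in (houses and cats)
         b = ("NP",
              ("NP", ("NNS", "dogs")),
              ("PP", ("IN", "in"),
                     ("NP", ("NP", ("NNS", "houses")),
                            ("CC", "and"),
                            ("NP", ("NNS", "cats")))))

         # Identical rule multisets => identical probability under any PCFG
         print(Counter(tree_rules(a)) == Counter(tree_rules(b)))  # True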

  8. Structural Preferences: Close Attachment

     (a) [NP [NP [NN president]]
             [PP [IN of]
                 [NP [NP [NN company]]
                     [PP [IN in] [NP [NN Africa]]]]]]

     (b) [NP [NP [NP [NN president]]
                 [PP [IN of] [NP [NN company]]]]
             [PP [IN in] [NP [NN Africa]]]]

     ◮ Example: president of a company in Africa
     ◮ Both parses have the same rules, and therefore receive the same
       probability under a PCFG
     ◮ “Close attachment” (structure (a)) is twice as likely in Wall Street
       Journal text.

  9. Structural Preferences: Close Attachment

     Example: John was believed to have been shot by Bill

     Here the low-attachment analysis (Bill does the shooting) contains the
     same rules as the high-attachment analysis (Bill does the believing),
     so the two analyses receive the same probability.
