

  1. Empirical Methods in Natural Language Processing
     Lecture 10, Parsing (II): Probabilistic parsing models
     Philipp Koehn, 7 February 2008

  2. Parsing
     • Task: build the syntactic tree for a sentence
     • Grammar formalism
       – phrase structure grammar
       – context-free grammar
     • Parsing algorithm: CYK (chart) parsing
     • Open problems
       – where do we get the grammar from?
       – how do we resolve ambiguities?

  3. Penn treebank
     • Penn treebank: English sentences annotated with syntax trees
       – built at the University of Pennsylvania
       – 40,000 sentences, about a million words
       – real text from the Wall Street Journal
     • Similar treebanks exist for other languages
       – German, French, Spanish, Arabic, Chinese

  4. Sample syntax tree
     (S (NP-SBJ Mr Vinken)
        (VP is
            (NP-PRD (NP chairman)
                    (PP of
                        (NP (NP Elsevier N.V.) , (NP the Dutch publishing group)))))
        .)

  5. Sample tree with part-of-speech
     (S (NP-SBJ (NNP Mr) (NNP Vinken))
        (VP (VBZ is)
            (NP-PRD (NP (NN chairman))
                    (PP (IN of)
                        (NP (NP (NNP Elsevier) (NNP N.V.)) (, ,)
                            (NP (DT the) (NNP Dutch) (VBG publishing) (NN group))))))
        (. .))

  6. Learning a grammar from the treebank
     • Context-free grammar: we have rules of the form S → NP-SBJ VP
     • We can collect these rules from the treebank
     • We can even estimate probabilities for rules:
       p(S → NP-SBJ VP | S) = count(S → NP-SBJ VP) / count(S)
     ⇒ Probabilistic context-free grammar (PCFG)
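A minimal Python sketch of this relative-frequency estimation, assuming trees are already available as nested (label, children) tuples with words as plain strings; the function names and tree encoding are illustrative, not part of the lecture:

```python
from collections import defaultdict

def count_rules(tree, rule_counts, lhs_counts):
    """Collect CFG rule counts from one tree.

    A tree node is a (label, children) tuple; a leaf is a plain word string.
    """
    label, children = tree
    # Right-hand side: child labels for internal nodes, the word itself for preterminals
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rule_counts[(label, rhs)] += 1
    lhs_counts[label] += 1
    for child in children:
        if not isinstance(child, str):
            count_rules(child, rule_counts, lhs_counts)

def estimate_pcfg(trees):
    """Relative-frequency estimates: p(A -> beta | A) = count(A -> beta) / count(A)."""
    rule_counts, lhs_counts = defaultdict(int), defaultdict(int)
    for tree in trees:
        count_rules(tree, rule_counts, lhs_counts)
    return {rule: c / lhs_counts[rule[0]] for rule, c in rule_counts.items()}
```

Under this encoding the tree from the previous slide would start as ("S", [("NP-SBJ", [...]), ("VP", [...]), (".", ["."])]).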

  7. Rule applications to build the tree
     S → NP-SBJ VP
     NP-SBJ → NNP NNP
     NNP → Mr
     NNP → Vinken
     VP → VBZ NP-PRD
     VBZ → is
     NP-PRD → NP PP
     NP → NN
     NN → chairman
     PP → IN NP
     IN → of
     NP → NNP
     NNP → Elsevier

  8. Compute probability of tree
     • The probability of a tree is the product of the probabilities of the rule applications:
       p(tree) = ∏_i p(rule_i)
     • We assume that all rule applications are independent of each other:
       p(tree) = p(S → NP-SBJ VP | S)
               × p(NP-SBJ → NNP NNP | NP-SBJ)
               × ...
               × p(NNP → Elsevier | NNP)
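Continuing the same illustrative tree encoding, a sketch of scoring a tree by multiplying its rule probabilities (unseen rules simply get probability zero here):

```python
def tree_probability(tree, pcfg):
    """Probability of a tree under a PCFG: the product of all rule-application probabilities.

    Assumes rule applications are independent; `pcfg` maps (lhs, rhs) tuples to
    probabilities as produced by estimate_pcfg() above.
    """
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    prob = pcfg.get((label, rhs), 0.0)
    for child in children:
        if not isinstance(child, str):
            prob *= tree_probability(child, pcfg)
    return prob
```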

  9. Prepositional phrase attachment ambiguity
     PP attached to NP-PRD:
       (S (NP-SBJ Mr Vinken)
          (VP is (NP-PRD (NP chairman) (PP of (NP Elsevier)))))
     PP attached to VP:
       (S (NP-SBJ Mr Vinken)
          (VP is (NP-PRD (NP chairman)) (PP of (NP Elsevier))))

  10. PP attachment ambiguity: rule applications
      PP attached to NP-PRD:        PP attached to VP:
      S → NP-SBJ VP                 S → NP-SBJ VP
      NP-SBJ → NNP NNP              NP-SBJ → NNP NNP
      NNP → Mr                      NNP → Mr
      NNP → Vinken                  NNP → Vinken
      VP → VBZ NP-PRD               VP → VBZ NP-PRD PP
      VBZ → is                      VBZ → is
      NP-PRD → NP PP                NP-PRD → NP
      NP → NN                       NP → NN
      NN → chairman                 NN → chairman
      PP → IN NP                    PP → IN NP
      IN → of                       IN → of
      NP → NNP                      NP → NNP
      NNP → Elsevier                NNP → Elsevier

  11. PP attachment ambiguity: difference in probability
      • PP attachment to NP-PRD is preferred if
        p(VP → VBZ NP-PRD | VP) × p(NP-PRD → NP PP | NP-PRD)
        is larger than
        p(VP → VBZ NP-PRD PP | VP) × p(NP-PRD → NP | NP-PRD)
      • Is this too general?
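As a sketch, this comparison can be read off a PCFG directly, reusing the illustrative (lhs, rhs) rule keys from the earlier sketch; the function name is hypothetical:

```python
def prefers_np_prd_attachment(pcfg):
    """True if attaching the PP under NP-PRD scores higher than attaching it under VP."""
    np_prd_attach = (pcfg.get(("VP", ("VBZ", "NP-PRD")), 0.0)
                     * pcfg.get(("NP-PRD", ("NP", "PP")), 0.0))
    vp_attach = (pcfg.get(("VP", ("VBZ", "NP-PRD", "PP")), 0.0)
                 * pcfg.get(("NP-PRD", ("NP",)), 0.0))
    return np_prd_attach > vp_attach
```

All other rule applications are shared between the two analyses, so only these two factors differ.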

  12. Scope ambiguity
      correct ("and" connects John and Jim):
        (NP (NP (NP John) (PP from (NP Hoboken))) (CC and) (NP Jim))
      false ("and" connects Hoboken and Jim):
        (NP (NP John) (PP from (NP (NP Hoboken) (CC and) (NP Jim))))
      However: the same rules are applied in both analyses, so a PCFG assigns them the same probability.

  13. Weakness of PCFG
      • Independence assumption too strong
      • Non-terminal rule applications do not use lexical information
      • Not sufficiently sensitive to structural differences beyond parent/child node relationships

  14. Head words
      • Recall the dependency structure for the example sentence (head → dependent):
        is → Vinken, is → chairman, Vinken → Mr, chairman → Elsevier, Elsevier → of
      • Direct relationships between words, some are the head of others
        (see also Head-Driven Phrase Structure Grammar)

  15. Adding head words to trees
      (S(is) (NP-SBJ(Vinken) (NNP(Mr) Mr) (NNP(Vinken) Vinken))
             (VP(is) (VBZ(is) is)
                     (NP-PRD(chairman) (NP(chairman) (NN(chairman) chairman))
                                       (PP(Elsevier) (IN(of) of)
                                                     (NP(Elsevier) (NNP(Elsevier) Elsevier))))))

  16. Head words in rules
      • Each context-free rule has one head child that is the head of the rule
        – S → NP VP
        – VP → VBZ NP
        – NP → DT NN NN
      • The parent receives its head word from the head child
      • Head children are not marked in the Penn treebank, but they are easy to recover using simple rules

  17. Recovering heads
      • Rule for recovering heads for NPs (sketched in code below):
        – if the rule contains an NN, NNS or NNP, choose the rightmost NN, NNS or NNP
        – else if the rule contains an NP, choose the leftmost NP
        – else if the rule contains a JJ, choose the rightmost JJ
        – else if the rule contains a CD, choose the rightmost CD
        – else choose the rightmost child
      • Examples
        – NP → DT NNP NN
        – NP → NP CC NP
        – NP → NP PP
        – NP → DT JJ
        – NP → DT
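A small sketch of that NP head rule as a Python function; the function name and the exact label sets are illustrative:

```python
def np_head_index(children):
    """Return the index of the head child of an NP, given the list of child labels.

    Implements: rightmost NN/NNS/NNP, else leftmost NP, else rightmost JJ,
    else rightmost CD, else the rightmost child.
    """
    def rightmost(labels):
        for i in range(len(children) - 1, -1, -1):
            if children[i] in labels:
                return i
        return None

    for choice in (rightmost({"NN", "NNS", "NNP"}),
                   next((i for i, c in enumerate(children) if c == "NP"), None),
                   rightmost({"JJ"}),
                   rightmost({"CD"})):
        if choice is not None:
            return choice
    return len(children) - 1  # default: rightmost child

# Heads for the example rules above:
# np_head_index(["DT", "NNP", "NN"]) -> 2   (the NN)
# np_head_index(["NP", "CC", "NP"])  -> 0   (leftmost NP)
# np_head_index(["NP", "PP"])        -> 0
# np_head_index(["DT", "JJ"])        -> 1
# np_head_index(["DT"])              -> 0
```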

  18. Using head nodes
      • PP attachment to NP-PRD is preferred if
        p(VP(is) → VBZ(is) NP-PRD(chairman) | VP(is)) × p(NP-PRD(chairman) → NP(chairman) PP(Elsevier) | NP-PRD(chairman))
        is larger than
        p(VP(is) → VBZ(is) NP-PRD(chairman) PP(Elsevier) | VP(is)) × p(NP-PRD(chairman) → NP(chairman) | NP-PRD(chairman))
      • Scope ambiguity: combining Hoboken and Jim should have low probability
        p(NP(Hoboken) → NP(Hoboken) CC(and) NP(John) | NP(Hoboken))

  19. Sparse data concerns
      • How often will we encounter
        NP(Hoboken) → NP(Hoboken) CC(and) NP(John)
      • ... or even
        NP(Jim) → NP(Jim) CC(and) NP(John)
      • If not seen in training, the probability will be zero

  20. Sparse data: Dependency relations
      • Instead of using a complex rule
        NP(Jim) → NP(Jim) CC(and) NP(John)
      • ... we collect statistics over dependency relations:
        head word | head tag | child word | child tag | direction
        Jim       | NP       | and        | CC        | left
        Jim       | NP       | John       | NP        | left
      • Generation in two steps:
        – first generate the child tag:  p(CC | NP, Jim, left)
        – then generate the child word:  p(and | NP, Jim, left, CC)
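A sketch of turning a lexicalized rule into such dependency events; the tuple layout mirrors the table above, and the function name and argument layout are assumptions made for illustration:

```python
def dependency_events(head_tag, head_word, left_children, right_children):
    """Decompose one lexicalized rule into
    (head tag, head word, direction, child tag, child word) events.

    left_children / right_children are lists of (tag, word) pairs for the
    non-head children on each side of the head child.
    """
    events = []
    for child_tag, child_word in left_children:
        events.append((head_tag, head_word, "left", child_tag, child_word))
    for child_tag, child_word in right_children:
        events.append((head_tag, head_word, "right", child_tag, child_word))
    return events

# The two rows of the table above, with head word Jim (tag NP):
# dependency_events("NP", "Jim", [("CC", "and"), ("NP", "John")], [])
# -> [("NP", "Jim", "left", "CC", "and"), ("NP", "Jim", "left", "NP", "John")]
```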

  21. Sparse data: Interpolation
      • Use of interpolation with back-off statistics (recall: language modeling)
      • Generate the child tag:
        p(CC | NP, Jim, left) = λ1 × count(CC, NP, Jim, left) / count(NP, Jim, left)
                              + λ2 × count(CC, NP, left) / count(NP, left)
      • With 0 ≤ λ1 ≤ 1, 0 ≤ λ2 ≤ 1, λ1 + λ2 = 1

  22. Sparse data: Interpolation (2)
      • Generate the child word:
        p(and | CC, NP, Jim, left) = λ1 × count(and, CC, NP, Jim, left) / count(CC, NP, Jim, left)
                                   + λ2 × count(and, CC, NP, left) / count(CC, NP, left)
                                   + λ3 × count(and, CC, left) / count(CC, left)
      • With 0 ≤ λ1 ≤ 1, 0 ≤ λ2 ≤ 1, 0 ≤ λ3 ≤ 1, λ1 + λ2 + λ3 = 1
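A sketch of the interpolated child-tag estimate from the previous slide; the count-dictionary keys and the concrete λ values are assumptions for illustration (in practice the λs would be tuned, for example on held-out data):

```python
def interp_child_tag_prob(child_tag, head_tag, head_word, direction,
                          counts, lam1=0.7, lam2=0.3):
    """lam1 * lexicalized estimate + lam2 * back-off estimate (head word dropped).

    `counts` maps tuples to integer counts:
      ("tag", child_tag, head_tag, head_word, direction) / ("ctx", head_tag, head_word, direction)
        for the lexicalized term,
      ("tag", child_tag, head_tag, direction) / ("ctx", head_tag, direction)
        for the back-off term.
    """
    def ratio(numerator_key, denominator_key):
        denom = counts.get(denominator_key, 0)
        return counts.get(numerator_key, 0) / denom if denom else 0.0

    lexical = ratio(("tag", child_tag, head_tag, head_word, direction),
                    ("ctx", head_tag, head_word, direction))
    backoff = ratio(("tag", child_tag, head_tag, direction),
                    ("ctx", head_tag, direction))
    return lam1 * lexical + lam2 * backoff
```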
