SLIDE 1

Parsing with PCFGs

Joakim Nivre

Uppsala University Department of Linguistics and Philology joakim.nivre@lingfil.uu.se

SLIDE 2

Probabilistic Context-Free Grammar (PCFG)

  • 1. Grammar Formalism
  • 2. Parsing Model
  • 3. Parsing Algorithms
  • 4. Learning with a Treebank
  • 5. Learning without a Treebank

SLIDE 3

Grammar Formalism

G = (N, Σ, R, S, Q)

  • N is a finite (non-terminal) alphabet
  • Σ is a finite (terminal) alphabet
  • R is a finite set of rules A → α (A ∈ N, α ∈ (Σ ∪ N)∗)
  • S ∈ N is the start symbol
  • Q is a function from R to the real numbers in the interval [0, 1]
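As a concrete illustration, here is one way the 5-tuple could be represented in Python. The representation (rules as (lhs, rhs) pairs, Q as a dictionary) and the tiny two-word grammar are assumptions for illustration, not something the slides prescribe; the later sketches in this document reuse this convention.

```python
# A PCFG G = (N, Sigma, R, S, Q): rules are (lhs, rhs) pairs with rhs a
# tuple over N ∪ Sigma, and Q maps each rule to a probability in [0, 1].
from typing import Dict, Set, Tuple

Rule = Tuple[str, Tuple[str, ...]]

N: Set[str] = {"S", "NP", "VP"}
Sigma: Set[str] = {"she", "runs"}
R: Set[Rule] = {("S", ("NP", "VP")), ("NP", ("she",)), ("VP", ("runs",))}
S: str = "S"
Q: Dict[Rule, float] = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 1.0,
    ("VP", ("runs",)): 1.0,
}
```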

SLIDE 4

Grammar Formalism

S → NP VP PU   1.00
VP → VP PP     0.33
VP → VBD NP    0.67
NP → NP PP     0.14
NP → JJ NN     0.57
NP → JJ NNS    0.29
PP → IN NP     1.00
PU → .         1.00
JJ → Economic  0.33
JJ → little    0.33
JJ → financial 0.33
NN → news      0.50
NN → effect    0.50
NNS → markets  1.00
VBD → had      1.00
IN → on        1.00

[Figure: two parse trees for "Economic news had little effect on financial markets ." licensed by this grammar. The first attaches the PP "on financial markets" to the object NP (using NP → NP PP); the second attaches it to the VP (using VP → VP PP).]
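For the later sketches it helps to have this grammar in machine-readable form; a transcription into the (lhs, rhs) → probability dictionary format assumed above:

```python
# The toy grammar from the slide, as a rule -> probability dictionary.
Q = {
    ("S",   ("NP", "VP", "PU")): 1.00,
    ("VP",  ("VP", "PP")):       0.33,
    ("VP",  ("VBD", "NP")):      0.67,
    ("NP",  ("NP", "PP")):       0.14,
    ("NP",  ("JJ", "NN")):       0.57,
    ("NP",  ("JJ", "NNS")):      0.29,
    ("PP",  ("IN", "NP")):       1.00,
    ("PU",  (".",)):             1.00,
    ("JJ",  ("Economic",)):      0.33,
    ("JJ",  ("little",)):        0.33,
    ("JJ",  ("financial",)):     0.33,
    ("NN",  ("news",)):          0.50,
    ("NN",  ("effect",)):        0.50,
    ("NNS", ("markets",)):       1.00,
    ("VBD", ("had",)):           1.00,
    ("IN",  ("on",)):            1.00,
}
```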

SLIDE 5

Grammar Formalism

L(G) = {x ∈ Σ∗ | S ⇒∗ x}
T(G) = set of parse trees for x ∈ L(G)

For a parse tree y ∈ T(G):

  • yield(y) = terminal string associated with y
  • count(i, y) = number of times rule r_i ∈ R is used to derive y
  • lhs(i) = nonterminal symbol on the left-hand side of r_i
  • Q(i) = q_i = probability of r_i
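A minimal sketch of yield(y) and count(i, y), under the assumption that a parse tree is encoded as a nested tuple (label, child, ...) with bare strings as terminal leaves; this encoding is my own convention, not the slides':

```python
# A parse tree is a nested tuple: (label, child, child, ...);
# a leaf is a bare string (a terminal symbol).
def tree_yield(y):
    """yield(y): the terminal string at the leaves, left to right."""
    if isinstance(y, str):
        return [y]
    label, *children = y
    return [w for c in children for w in tree_yield(c)]

def rule_counts(y, counts=None):
    """count(i, y) for every rule i: how often each rule derives a node of y."""
    if counts is None:
        counts = {}
    if isinstance(y, str):
        return counts
    label, *children = y
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[(label, rhs)] = counts.get((label, rhs), 0) + 1
    for c in children:
        rule_counts(c, counts)
    return counts
```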

SLIDE 6

Grammar Formalism

Probability P(y) of a parse tree y ∈ T(G):

  P(y) = ∏_{i=1}^{|R|} q_i^{count(i,y)}

Probability P(x, y) of a string x and a parse tree y:

  P(x, y) = P(y) if yield(y) = x, and 0 otherwise

Probability P(x) of a string x ∈ L(G):

  P(x) = Σ_{y ∈ T(G) : yield(y) = x} P(y)
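With the helpers above, P(y) is just the product of rule probabilities raised to their usage counts; a short sketch:

```python
def tree_prob(y, Q):
    """P(y) = product over rules i of q_i ** count(i, y)."""
    p = 1.0
    for rule, c in rule_counts(y).items():
        p *= Q[rule] ** c
    return p
```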

SLIDE 7

Grammar Formalism

A PCFG is proper iff for every nonterminal A ∈ N:

  Σ_{r_i ∈ R : lhs(i) = A} q_i = 1

A PCFG is consistent iff:

  Σ_{y ∈ T(G)} P(y) = 1
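Properness can be checked mechanically by summing rule probabilities per left-hand side; consistency cannot, since it quantifies over the infinite set T(G). A sketch over the dictionary representation assumed earlier:

```python
from collections import defaultdict

def is_proper(Q, tol=1e-9):
    """Check that rule probabilities sharing a lhs sum to 1 (within tol)."""
    totals = defaultdict(float)
    for (lhs, rhs), q in Q.items():
        totals[lhs] += q
    return all(abs(t - 1.0) <= tol for t in totals.values())
```

Note that the toy grammar above passes only with a tolerance of about 0.01, because its displayed probabilities (e.g. the three JJ rules at 0.33) are rounded to two decimals.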

SLIDE 8

Parsing Model

  • 1. X = Σ∗
  • 2. Y = R∗ [parse trees = leftmost derivations]
  • 3. GEN(x) = {y ∈ T(G) | yield(y) = x}
  • 4. EVAL(y) = P(y) = ∏_{i=1}^{|R|} q_i^{count(i,y)}

NB: The joint probability is proportional to the conditional probability:

  P(y | x) = P(x, y) / Σ_{y′ ∈ GEN(x)} P(y′)
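Concretely, the conditional distribution is obtained by renormalizing the joint over the candidate set; a sketch that assumes GEN(x) has already been enumerated as a list of trees:

```python
def conditional_prob(y, gen_x, Q):
    """P(y | x) = P(x, y) / sum of P(x, y') over y' in GEN(x)."""
    Z = sum(tree_prob(yp, Q) for yp in gen_x)
    return tree_prob(y, Q) / Z
```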

SLIDE 9

Parsing Model

S → NP VP PU   1.00
VP → VP PP     0.33
VP → VBD NP    0.67
NP → NP PP     0.14
NP → JJ NN     0.57
NP → JJ NNS    0.29
PP → IN NP     1.00
PU → .         1.00
JJ → Economic  0.33
JJ → little    0.33
JJ → financial 0.33
NN → news      0.50
NN → effect    0.50
NNS → markets  1.00
VBD → had      1.00
IN → on        1.00

[Figure: the same two parse trees for "Economic news had little effect on financial markets ." with their probabilities under this grammar: the tree attaching the PP to the object NP has P(y) = 0.0000794, and the tree attaching it to the VP has P(y) = 0.0001871.]
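The two numbers on this slide can be reproduced with the earlier sketches: the trees share every rule except the attachment of the PP, so their probabilities differ exactly by the factor NP → NP PP (0.14) versus VP → VP PP (0.33). Using the nested-tuple encoding and the Q dictionary from above:

```python
# Shared subtrees of the two analyses.
subj = ("NP", ("JJ", "Economic"), ("NN", "news"))
obj = ("NP", ("JJ", "little"), ("NN", "effect"))
pp = ("PP", ("IN", "on"), ("NP", ("JJ", "financial"), ("NNS", "markets")))

# First tree: PP attached to the object NP (uses NP -> NP PP).
t1 = ("S", subj, ("VP", ("VBD", "had"), ("NP", obj, pp)), ("PU", "."))
# Second tree: PP attached to the VP (uses VP -> VP PP).
t2 = ("S", subj, ("VP", ("VP", ("VBD", "had"), obj), pp), ("PU", "."))

print(tree_prob(t1, Q))  # ≈ 0.0000794
print(tree_prob(t2, Q))  # ≈ 0.0001871
```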

SLIDE 10

Parsing Algorithms

Parsing (decoding) problem for PCFG G and input x:

  • Compute GEN(x)
  • Compute EVAL(y) for y ∈ GEN(x)

Standard algorithms for CFG can be adapted to PCFG:

  • CKY
  • Earley

Viterbi parsing: argmax_{y ∈ GEN(x)} EVAL(y)

SLIDE 11

Parsing Algorithms

Fencepost positions: the parser indexes the positions between (and around) the words as 0, 1, …, n, so that a span (i, j) covers words i+1 through j; the chart entries C[i, j, A] on the next slide are indexed this way.

[Figure: an example sentence with fencepost positions 0 … n marked between the words.]

SLIDE 12

Parsing Algorithms

PARSE(G, x)
  for j from 1 to n do
    for all A : A → a ∈ R and a = x_j do
      C[j − 1, j, A] := Q(A → a)
  for j from 2 to n do
    for i from j − 2 downto 0 do
      for k from i + 1 to j − 1 do
        for all A : A → B C ∈ R and C[i, k, B] > 0 and C[k, j, C] > 0 do
          if C[i, j, A] < Q(A → B C) · C[i, k, B] · C[k, j, C] then
            C[i, j, A] := Q(A → B C) · C[i, k, B] · C[k, j, C]
            B[i, j, A] := {k, B, C}
  return BUILD-TREE(B[0, n, S]), C[0, n, S]
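A direct Python transcription of the pseudocode, under the representation assumptions of the earlier sketches. Note that it handles only terminal rules A → a and binary rules A → B C, i.e. it assumes a grammar in Chomsky normal form; the toy grammar's ternary rule S → NP VP PU would have to be binarized before this skeleton could parse the example sentence.

```python
def cky_parse(Q, words):
    """Viterbi CKY over fencepost positions 0..n. C[(i, j, A)] holds the
    best probability of A spanning words i+1..j; B holds backpointers."""
    n = len(words)
    C, B = {}, {}
    # Terminal rules A -> a fill the width-1 spans.
    for j in range(1, n + 1):
        for (A, rhs), q in Q.items():
            if rhs == (words[j - 1],):
                C[(j - 1, j, A)] = q
    # Binary rules A -> B C combine adjacent smaller spans.
    for j in range(2, n + 1):
        for i in range(j - 2, -1, -1):
            for k in range(i + 1, j):
                for (A, rhs), q in Q.items():
                    if len(rhs) != 2:
                        continue
                    lsym, rsym = rhs
                    p = q * C.get((i, k, lsym), 0.0) * C.get((k, j, rsym), 0.0)
                    if p > C.get((i, j, A), 0.0):
                        C[(i, j, A)] = p
                        B[(i, j, A)] = (k, lsym, rsym)
    return C, B

def build_tree(B, words, i, j, A):
    """BUILD-TREE: reconstruct the best tree for A over (i, j) from backpointers."""
    if j - i == 1:
        return (A, words[i])
    k, lsym, rsym = B[(i, j, A)]
    return (A, build_tree(B, words, i, k, lsym), build_tree(B, words, k, j, rsym))
```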

SLIDE 13

Learning with a Treebank

Training set:

  • Treebank Y = {y_1, …, y_m}

Extract grammar G = (N, Σ, R, S):

  • N = the set of all nonterminals occurring in some y_i ∈ Y
  • Σ = the set of all terminals occurring in some y_i ∈ Y
  • R = the set of all rules needed to derive some y_i ∈ Y
  • S = the nonterminal at the root of every y_i ∈ Y

Estimate Q using relative frequencies (MLE):

  q_i = Σ_{j=1}^{m} count(i, y_j) / Σ_{j=1}^{m} Σ_{r_k ∈ R : lhs(r_k) = lhs(r_i)} count(k, y_j)
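The estimator is two counting passes over the treebank; a sketch reusing the rule_counts helper from earlier, with the treebank given as a list of nested-tuple trees:

```python
from collections import defaultdict

def mle_estimate(treebank):
    """q_i = (total count of rule i) / (total count of rules sharing its lhs)."""
    rule_totals = defaultdict(int)
    lhs_totals = defaultdict(int)
    for y in treebank:
        for (lhs, rhs), c in rule_counts(y).items():
            rule_totals[(lhs, rhs)] += c
            lhs_totals[lhs] += c
    return {r: c / lhs_totals[r[0]] for r, c in rule_totals.items()}
```

Running this on the two example trees reproduces the rule table on the next slide: for instance, VP → VBD NP accounts for 2 of the 3 VP expansions in the two trees, giving 0.67.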

SLIDE 14

Learning with a Treebank

S → NP VP PU   1.00
VP → VP PP     0.33
VP → VBD NP    0.67
NP → NP PP     0.14
NP → JJ NN     0.57
NP → JJ NNS    0.29
PP → IN NP     1.00
PU → .         1.00
JJ → Economic  0.33
JJ → little    0.33
JJ → financial 0.33
NN → news      0.50
NN → effect    0.50
NNS → markets  1.00
VBD → had      1.00
IN → on        1.00

[Figure: the two parse trees for "Economic news had little effect on financial markets ." that make up the treebank; the rule probabilities above are the relative-frequency estimates obtained from these two trees.]

SLIDE 15

Learning without a Treebank

Training set:

  • Corpus X = {x_1, …, x_m}
  • Grammar G = (N, Σ, R, S)

Estimate Q using expectation-maximization (EM):

  • 1. Guess a probability q_i for each rule r_i ∈ R
  • 2. Repeat until convergence:

2.1 E-step: Compute the expected count f(r_i) of each rule r_i ∈ R:

      f(r_i) = Σ_{j=1}^{m} Σ_{y ∈ GEN(x_j)} P(y | x_j, Q) · count(i, y)

2.2 M-step: Reestimate the probability q_i of each rule r_i to maximize the marginal likelihood given the expected counts:

      q_i = f(r_i) / Σ_{r_j ∈ R : lhs(r_j) = lhs(r_i)} f(r_j)
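The M-step is the same renormalization as the treebank estimator, only over fractional expected counts; a sketch that takes the E-step's output as given (in practice f(r_i) is computed with the inside-outside algorithm rather than by enumerating GEN(x_j)):

```python
from collections import defaultdict

def m_step(f):
    """Reestimate q_i = f(r_i) / sum of f(r_j) over rules r_j with the same
    lhs. f maps rules (lhs, rhs) to their expected counts from the E-step."""
    lhs_totals = defaultdict(float)
    for (lhs, rhs), count in f.items():
        lhs_totals[lhs] += count
    return {r: c / lhs_totals[r[0]] for r, c in f.items()}
```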
