1. PCFG: Probabilistic Context Free Grammars
Presenter: Ba Dat Nguyen
Advisor: Dr. Martin Theobald
Max-Planck-Institut für Informatik, Saarbrücken, Germany

2. Outline
• Introduction
• Probabilistic Context Free Grammars
 – Parsing
 – Context Free Grammars
 – Probabilistic Context Free Grammars
 – Inside-Outside Algorithm
• Extension
 – Distance
 – Complement/adjunct distinction
 – Traces and Wh-movement

3. The world is full of ambiguity

4. Solution
PCFGs are a good way to resolve ambiguity in syntactic structure.

5. Outline (repeated; next: Parsing)

6. Language and Grammar
• Language
 – Structural
 – Ambiguous
• Grammar
 – Generalization of regularities in language structures
 – Morphology and syntax

7. Parsing
• The process of working out the grammatical structure of sentences.
• Basic parsing algorithms
 – Parsing strategies
 – CYK algorithm
 – Earley algorithm

8. Example of parsing
• "She is a nice girl"
(S (NP (PRP She)) (VP (VBZ is) (NP (DT a) (JJ nice) (NN girl))))
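The bracketed tree above is easy to manipulate programmatically. A minimal sketch, assuming a nested-tuple encoding of our own choosing (not from the slides):

```python
# Parse tree for "She is a nice girl" as nested tuples: (label, children...),
# with plain strings as leaf words. This encoding is illustrative only.
tree = ("S",
        ("NP", ("PRP", "She")),
        ("VP", ("VBZ", "is"),
               ("NP", ("DT", "a"), ("JJ", "nice"), ("NN", "girl"))))

def bracketed(t):
    """Render a tuple tree in Penn Treebank-style bracket notation."""
    if isinstance(t, str):
        return t
    label, *children = t
    return "(%s %s)" % (label, " ".join(bracketed(c) for c in children))

print(bracketed(tree))
# (S (NP (PRP She)) (VP (VBZ is) (NP (DT a) (JJ nice) (NN girl))))
```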

9. Outline (repeated; next: Context Free Grammars)

10. Chomsky hierarchy
• Type 0 (unrestricted): α → β
• Type 1 (context-sensitive): αAβ → αγβ
• Type 2 (context-free): A → γ
• Type 3 (regular): A → a or A → aB
where A, B are nonterminals, a is a terminal, and α, β, γ are strings of terminals and nonterminals.

11. Context Free Grammars (CFG)
A context free grammar consists of:
• a set of terminals {w^k}, k = 1, …, V
• a set of nonterminals {N^i}, i = 1, …, n
• a designated start symbol N^1
• a set of rules {N^i → ζ^j}, where ζ^j is a sequence of terminals and nonterminals

12. Example of CFG
S → NP VP
NP → NP PP
PP → P NP
VP → V NP
VP → VP PP
P → with
V → saw
NP → astronomers | ears | saw | stars | telescopes

13. Ambiguous sentences
"astronomers saw stars with ears" — which parse is better?
Parse 1: (S (NP astronomers) (VP (V saw) (NP (NP stars) (PP (P with) (NP ears)))))
Parse 2: (S (NP astronomers) (VP (VP (V saw) (NP stars)) (PP (P with) (NP ears))))

14. Outline (repeated; next: Probabilistic Context Free Grammars)

15. Probabilistic CFG
A probabilistic context free grammar (PCFG) consists of:
• a CFG
• a corresponding set of probabilities on rules such that, for every nonterminal N^i:
  Σ_j P(N^i → ζ^j) = 1

16. Example of PCFG
S → NP VP 1.0
NP → NP PP 0.4
PP → P NP 1.0
VP → V NP 0.7
VP → VP PP 0.3
P → with 1.0
V → saw 1.0
NP → astronomers 0.1
NP → ears 0.18
NP → saw 0.04
NP → stars 0.18
NP → telescopes 0.1
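A quick way to validate such a grammar is to check that the probabilities for each left-hand side sum to 1, as the PCFG definition requires. A sketch using a dictionary encoding of our own (the rules and numbers are the slide's):

```python
from collections import defaultdict

# The slide's PCFG as (lhs, rhs) -> probability; rhs is a tuple of symbols.
rules = {
    ("S", ("NP", "VP")): 1.0,  ("NP", ("NP", "PP")): 0.4,
    ("PP", ("P", "NP")): 1.0,  ("NP", ("astronomers",)): 0.1,
    ("VP", ("V", "NP")): 0.7,  ("NP", ("ears",)): 0.18,
    ("VP", ("VP", "PP")): 0.3, ("NP", ("saw",)): 0.04,
    ("P", ("with",)): 1.0,     ("NP", ("stars",)): 0.18,
    ("V", ("saw",)): 1.0,      ("NP", ("telescopes",)): 0.1,
}

# Sum the probability mass assigned to each nonterminal.
totals = defaultdict(float)
for (lhs, _), prob in rules.items():
    totals[lhs] += prob

for lhs in sorted(totals):
    print(lhs, round(totals[lhs], 10))  # every total is 1.0
```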

17. Probability of a tree
Consider the subtree (NP (NP stars) (PP (P with) (NP ears))). By the chain rule:
P(NP → NP PP, NP → stars, PP → P NP, P → with, NP → ears)
= P(NP → NP PP)
× P(NP → stars | NP → NP PP)
× P(PP → P NP | NP → NP PP, NP → stars)
× P(P → with | PP → P NP, NP → NP PP, NP → stars)
× P(NP → ears | P → with, PP → P NP, NP → NP PP, NP → stars)

18. Assumptions
• Place invariance: for a subtree spanning w_k … w_{k+c}, P(N^j_{k(k+c)} → ζ) is the same for all k.
• Context-free: P(N^j_{kl} → ζ | anything outside k through l) = P(N^j_{kl} → ζ)
• Ancestor-free: P(N^j_{kl} → ζ | any ancestor nodes outside N^j_{kl}) = P(N^j_{kl} → ζ)

19. Probability of a tree
Under these assumptions, for the subtree (NP (NP stars) (PP (P with) (NP ears))):
P(NP → NP PP, NP → stars, PP → P NP, P → with, NP → ears)
= P(NP → NP PP) × P(NP → stars) × P(PP → P NP) × P(P → with) × P(NP → ears)

20. Ambiguity
Parse 1 (PP attaches to the NP):
(S:1.0 (NP:0.1 astronomers) (VP:0.7 (V:1.0 saw) (NP:0.4 (NP:0.18 stars) (PP:1.0 (P:1.0 with) (NP:0.18 ears)))))
P = 1.0 × 0.1 × 0.7 × 1.0 × 0.4 × 0.18 × 1.0 × 1.0 × 0.18 = 0.0009072
Parse 2 (PP attaches to the VP):
(S:1.0 (NP:0.1 astronomers) (VP:0.3 (VP:0.7 (V:1.0 saw) (NP:0.18 stars)) (PP:1.0 (P:1.0 with) (NP:0.18 ears))))
P = 1.0 × 0.1 × 0.3 × 0.7 × 1.0 × 0.18 × 1.0 × 1.0 × 0.18 = 0.0006804
The first parse is more probable.
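The two products above can be reproduced mechanically by multiplying the probabilities of the rules each parse uses. A sketch (the "A->B C" rule names are our own shorthand):

```python
from math import prod

# Rule probabilities from the slide's PCFG (the subset used by the two parses).
p = {"S->NP VP": 1.0, "NP->NP PP": 0.4, "PP->P NP": 1.0,
     "VP->V NP": 0.7, "VP->VP PP": 0.3, "NP->astronomers": 0.1,
     "NP->ears": 0.18, "NP->stars": 0.18, "P->with": 1.0, "V->saw": 1.0}

# Parse 1: [saw [stars with ears]] -- the PP attaches to the NP.
t1 = ["S->NP VP", "NP->astronomers", "VP->V NP", "V->saw",
      "NP->NP PP", "NP->stars", "PP->P NP", "P->with", "NP->ears"]
# Parse 2: [[saw stars] with ears] -- the PP attaches to the VP.
t2 = ["S->NP VP", "NP->astronomers", "VP->VP PP", "VP->V NP",
      "V->saw", "NP->stars", "PP->P NP", "P->with", "NP->ears"]

p1 = prod(p[r] for r in t1)
p2 = prod(p[r] for r in t2)
print(round(p1, 7), round(p2, 7))  # 0.0009072 0.0006804
```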

21. Outline (repeated; next: Inside-Outside Algorithm)

22. Probability of a rule
• Given a training set of annotated sentences:
  P(N^j → ζ) = C(N^j → ζ) / Σ_γ C(N^j → γ)
where C(·) is the number of times a particular rule is used.
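With annotated trees, the estimator above is just relative-frequency counting. A sketch over a hypothetical two-tree treebank (the trees and rule names here are invented for illustration):

```python
from collections import Counter

# Hypothetical tiny treebank: each tree is listed as the rules it uses.
treebank = [
    ["S->NP VP", "NP->she", "VP->V NP", "V->saw", "NP->stars"],
    ["S->NP VP", "NP->she", "VP->V", "V->slept"],
]

counts = Counter(rule for tree in treebank for rule in tree)  # C(N^j -> zeta)
lhs_totals = Counter()                                        # sum_gamma C(N^j -> gamma)
for rule, c in counts.items():
    lhs_totals[rule.split("->")[0]] += c

def rule_prob(rule):
    """P(N^j -> zeta) = C(N^j -> zeta) / sum_gamma C(N^j -> gamma)."""
    return counts[rule] / lhs_totals[rule.split("->")[0]]

print(rule_prob("VP->V NP"))  # 1 use out of 2 VP expansions -> 0.5
```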

23. Probability of a rule
How can we estimate rule probabilities when there is no annotated data?

24. Maximum Likelihood Estimation
• Maximum likelihood estimation: choose μ̂ = argmax_μ P(O_training | μ), where μ is the set of parameters of the current grammar.
• There is no known analytic method to choose μ to maximize P(O | μ).
• Instead, P(O | μ) is locally maximized by iterative hill-climbing, a special case of the Expectation Maximization (EM) method.
• The Inside-Outside algorithm is a form of EM, using the inside and outside probabilities estimated from the training set.

25. Training a PCFG
• We are given:
 – a set of training sentences
 – a set of terminals
 – a set of nonterminals
• Initial rule probabilities are estimated (perhaps chosen randomly).
• The inside-outside algorithm is then used to train the grammar.

26. Inside-Outside probabilities
(Figure: a sentence w_1 … w_m with a nonterminal N^j dominating the span w_p … w_q under the root N^1.)
• The inside probability β_j(p, q) covers the words w_p … w_q inside N^j.
• The outside probability α_j(p, q) covers the words w_1 … w_{p−1} and w_{q+1} … w_m outside N^j.

27. Inside probabilities
• The inside probability β_j(p, q) is the probability that the word sequence w_p … w_q is generated by a subtree rooted at node N^j:
  β_j(p, q) = P(w_pq | N^j_pq)
• The calculation can be carried out bottom-up:
  β_j(k, k) = P(N^j → w_k)
  β_j(p, q) = Σ_{r,s} Σ_{d=p}^{q−1} P(N^j → N^r N^s) β_r(p, d) β_s(d+1, q)
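Using the example PCFG from slide 16, the bottom-up recursion can be sketched as follows (0-indexed spans and the dictionary encoding are our own choices). For "astronomers saw stars with ears", the inside probability of S over the whole sentence should equal the sum of the two parse probabilities from slide 20, 0.0009072 + 0.0006804 = 0.0015876:

```python
from collections import defaultdict

# The slide's toy PCFG in Chomsky normal form (dictionary encoding is ours).
binary = {("S", "NP", "VP"): 1.0, ("NP", "NP", "PP"): 0.4,
          ("PP", "P", "NP"): 1.0, ("VP", "V", "NP"): 0.7,
          ("VP", "VP", "PP"): 0.3}
lexical = {("NP", "astronomers"): 0.1, ("NP", "ears"): 0.18,
           ("NP", "saw"): 0.04, ("NP", "stars"): 0.18,
           ("NP", "telescopes"): 0.1, ("P", "with"): 1.0, ("V", "saw"): 1.0}

def inside(words):
    """Bottom-up inside probabilities beta[(A, p, q)], 0-indexed spans."""
    n = len(words)
    beta = defaultdict(float)
    for k, w in enumerate(words):                      # base: beta_j(k, k)
        for (A, word), prob in lexical.items():
            if word == w:
                beta[(A, k, k)] += prob
    for span in range(2, n + 1):                       # recursion over spans
        for p in range(n - span + 1):
            q = p + span - 1
            for (A, B, C), prob in binary.items():
                for d in range(p, q):                  # split point
                    beta[(A, p, q)] += prob * beta[(B, p, d)] * beta[(C, d + 1, q)]
    return beta

words = "astronomers saw stars with ears".split()
beta = inside(words)
print(round(beta[("S", 0, 4)], 7))  # 0.0015876 = 0.0009072 + 0.0006804
```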

28. Outside probabilities
• The outside probability α_j(p, q) is the total probability of beginning with the start symbol N^1 and generating all the words outside N^j_pq:
  α_j(p, q) = P(w_{1,p−1}, N^j_pq, w_{q+1,m})
• Base case: α_1(1, m) = 1 and α_j(1, m) = 0 for j ≠ 1.
• The calculation proceeds top-down:
  α_j(p, q) = Σ_{f,g} Σ_{e=q+1}^{m} α_f(p, e) P(N^f → N^j N^g) β_g(q+1, e)
            + Σ_{f,g} Σ_{e=1}^{p−1} α_f(e, q) P(N^f → N^g N^j) β_g(e, p−1)
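A matching sketch of the top-down outside recursion (again our own encoding; the inside routine from the previous slide's example is repeated so the snippet runs standalone). A useful sanity check: for every word position k, Σ_j α_j(k, k) β_j(k, k) must equal the total sentence probability:

```python
from collections import defaultdict

# The slide's toy PCFG (dictionary encoding is ours).
binary = {("S", "NP", "VP"): 1.0, ("NP", "NP", "PP"): 0.4,
          ("PP", "P", "NP"): 1.0, ("VP", "V", "NP"): 0.7,
          ("VP", "VP", "PP"): 0.3}
lexical = {("NP", "astronomers"): 0.1, ("NP", "ears"): 0.18,
           ("NP", "saw"): 0.04, ("NP", "stars"): 0.18,
           ("NP", "telescopes"): 0.1, ("P", "with"): 1.0, ("V", "saw"): 1.0}

def inside(words):
    """Bottom-up inside probabilities beta[(A, p, q)], 0-indexed spans."""
    n = len(words)
    beta = defaultdict(float)
    for k, w in enumerate(words):
        for (A, word), prob in lexical.items():
            if word == w:
                beta[(A, k, k)] += prob
    for span in range(2, n + 1):
        for p in range(n - span + 1):
            q = p + span - 1
            for (A, B, C), prob in binary.items():
                for d in range(p, q):
                    beta[(A, p, q)] += prob * beta[(B, p, d)] * beta[(C, d + 1, q)]
    return beta

def outside(words, beta):
    """Top-down outside probabilities alpha[(A, p, q)], 0-indexed spans."""
    n = len(words)
    alpha = defaultdict(float)
    alpha[("S", 0, n - 1)] = 1.0              # alpha_1(1, m) = 1
    for span in range(n - 1, 0, -1):          # parents have larger spans
        for p in range(n - span + 1):
            q = p + span - 1
            for (f, b, c), prob in binary.items():
                for e in range(q + 1, n):     # current node as left child
                    alpha[(b, p, q)] += alpha[(f, p, e)] * prob * beta[(c, q + 1, e)]
                for e in range(p):            # current node as right child
                    alpha[(c, p, q)] += alpha[(f, e, q)] * prob * beta[(b, e, p - 1)]
    return alpha

words = "astronomers saw stars with ears".split()
beta = inside(words)
alpha = outside(words, beta)
pi = beta[("S", 0, 4)]
cats = ["S", "NP", "VP", "PP", "P", "V"]
checks = [sum(alpha[(A, k, k)] * beta[(A, k, k)] for A in cats) for k in range(5)]
print(all(abs(c - pi) < 1e-12 for c in checks))  # True
```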

29. Inside-Outside Algorithm
We have:
  α_j(p, q) β_j(p, q) = P(N^1 ⇒* w_{1,m}, N^j ⇒* w_{p,q})
Call π = P(N^1 ⇒* w_{1,m}) = β_1(1, m). Then:
  P(N^j ⇒* w_{p,q} | N^1 ⇒* w_{1,m}) = α_j(p, q) β_j(p, q) / π

30. Inside-Outside Algorithm
Expected counts over a training sentence of length m:
  E(N^j is used) = Σ_{p=1}^{m} Σ_{q=p}^{m} α_j(p, q) β_j(p, q) / π
  E(N^j → N^r N^s is used) = Σ_{p=1}^{m−1} Σ_{q=p+1}^{m} Σ_{d=p}^{q−1} α_j(p, q) P(N^j → N^r N^s) β_r(p, d) β_s(d+1, q) / π
The re-estimated rule probability is the ratio of these two expectations.
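Putting the pieces together, one EM re-estimation step divides the expected rule count by the expected use count. A compact standalone sketch (encoding ours; it repeats the inside and outside tables). For S, which has only one rule, the re-estimated probability must stay 1.0:

```python
from collections import defaultdict

# The slide's toy PCFG (dictionary encoding is ours).
binary = {("S", "NP", "VP"): 1.0, ("NP", "NP", "PP"): 0.4,
          ("PP", "P", "NP"): 1.0, ("VP", "V", "NP"): 0.7,
          ("VP", "VP", "PP"): 0.3}
lexical = {("NP", "astronomers"): 0.1, ("NP", "ears"): 0.18,
           ("NP", "saw"): 0.04, ("NP", "stars"): 0.18,
           ("NP", "telescopes"): 0.1, ("P", "with"): 1.0, ("V", "saw"): 1.0}
words = "astronomers saw stars with ears".split()
n = len(words)

# Inside table (bottom-up).
beta = defaultdict(float)
for k, w in enumerate(words):
    for (A, word), prob in lexical.items():
        if word == w:
            beta[(A, k, k)] += prob
for span in range(2, n + 1):
    for p in range(n - span + 1):
        q = p + span - 1
        for (A, B, C), prob in binary.items():
            for d in range(p, q):
                beta[(A, p, q)] += prob * beta[(B, p, d)] * beta[(C, d + 1, q)]

# Outside table (top-down).
alpha = defaultdict(float)
alpha[("S", 0, n - 1)] = 1.0
for span in range(n - 1, 0, -1):
    for p in range(n - span + 1):
        q = p + span - 1
        for (f, b, c), prob in binary.items():
            for e in range(q + 1, n):
                alpha[(b, p, q)] += alpha[(f, p, e)] * prob * beta[(c, q + 1, e)]
            for e in range(p):
                alpha[(c, p, q)] += alpha[(f, e, q)] * prob * beta[(b, e, p - 1)]

pi = beta[("S", 0, n - 1)]  # total sentence probability

def e_used(j):
    """E(N^j is used) = sum_{p<=q} alpha_j(p,q) beta_j(p,q) / pi."""
    return sum(alpha[(j, p, q)] * beta[(j, p, q)]
               for p in range(n) for q in range(p, n)) / pi

def e_rule(j, r, s):
    """E(N^j -> N^r N^s is used), normalized by pi."""
    total = 0.0
    for p in range(n):
        for q in range(p + 1, n):
            for d in range(p, q):
                total += (alpha[(j, p, q)] * binary[(j, r, s)]
                          * beta[(r, p, d)] * beta[(s, d + 1, q)])
    return total / pi

# One EM step: re-estimated P(N^j -> N^r N^s) = E(rule used) / E(N^j used).
print(round(e_rule("S", "NP", "VP") / e_used("S"), 6))  # 1.0
```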
