Presenter: Ba Dat Nguyen Advisor: Dr. Martin Theobald
Max-Planck-Institut für Informatik Saarbrücken, Germany
PCFG: Probabilistic Context Free Grammars
Outline
The world is full of ambiguity
PCFGs are a good way to resolve ambiguity in syntactic structure.
Solution
Language and Grammar
A grammar is a formal device for generating the sentences of a language.
Parsing
[S [NP [PRP She]] [VP [VBZ is] [NP [DT a] [JJ nice] [NN girl]]]]
Example of parsing
Probabilistic Context Free Grammars
Chomsky hierarchy
The Chomsky hierarchy, where A, B are nonterminals, a is a terminal, and α, β, γ are strings of terminals and nonterminals:

- Type 0 (unrestricted): α → β
- Type 1 (context-sensitive): αAβ → αγβ
- Type 2 (context-free): A → γ
- Type 3 (regular): A → aB or A → a
Context Free Grammars (CFG)
A CFG G consists of:

- a set of terminals {w^k}, k = 1, …, V
- a set of nonterminals {N^i}, i = 1, …, n
- a designated start symbol N^1
- a set of rules {N^i → ζ^j}, where ζ^j is a sequence of terminals and nonterminals
S -> NP VP
NP -> NP PP
PP -> P NP
VP -> V NP
VP -> VP PP
NP -> astronomers
NP -> ears
NP -> saw
NP -> stars
NP -> telescopes
P -> with
V -> saw
Example of CFG
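As a quick sketch (not from the slides), this grammar can be run through a small CKY-style chart that counts how many distinct parse trees each span admits; the function name and data layout here are our own:

```python
from collections import defaultdict

# Binary rules A -> B C and lexical rules A -> w, as listed on the slide.
binary = [("S", "NP", "VP"), ("NP", "NP", "PP"), ("PP", "P", "NP"),
          ("VP", "V", "NP"), ("VP", "VP", "PP")]
lexical = {"astronomers": {"NP"}, "ears": {"NP"}, "saw": {"NP", "V"},
           "with": {"P"}, "stars": {"NP"}, "telescopes": {"NP"}}

def count_parses(words):
    """CKY chart: chart[i][j][A] = number of ways A derives words[i:j]."""
    n = len(words)
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for cat in lexical[w]:
            chart[i][i + 1][cat] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):      # split point
                for a, b, c in binary:
                    chart[i][j][a] += chart[i][k][b] * chart[k][j][c]
    return chart[0][n]["S"]

print(count_parses("astronomers saw stars with ears".split()))  # 2
```

The two parses it finds are exactly the two PP attachments discussed on the next slides.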
Parse 1 (PP attached to the object NP):
[S [NP astronomers] [VP [V saw] [NP [NP stars] [PP [P with] [NP ears]]]]]
Which one is better?
Parse 2 (PP attached to the VP):
[S [NP astronomers] [VP [VP [V saw] [NP stars]] [PP [P with] [NP ears]]]]
Ambiguous sentences
Probabilistic CFG

A PCFG consists of a CFG together with a probability for each rule, P(N^i → ζ^j), such that for every nonterminal N^i:

Σ_j P(N^i → ζ^j) = 1
S -> NP VP (1.0)
NP -> NP PP (0.4)
PP -> P NP (1.0)
VP -> V NP (0.7)
VP -> VP PP (0.3)
NP -> astronomers (0.1)
NP -> ears (0.18)
NP -> saw (0.04)
NP -> stars (0.18)
NP -> telescopes (0.1)
P -> with (1.0)
V -> saw (1.0)
Example of PCFG
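A minimal Viterbi-CKY sketch (our own code, assuming the rule probabilities above) finds the probability of the single best parse:

```python
from collections import defaultdict

# PCFG from the slide: rule -> probability.
binary = {("S", "NP", "VP"): 1.0, ("NP", "NP", "PP"): 0.4,
          ("PP", "P", "NP"): 1.0, ("VP", "V", "NP"): 0.7,
          ("VP", "VP", "PP"): 0.3}
lexical = {"astronomers": {"NP": 0.1}, "ears": {"NP": 0.18},
           "saw": {"NP": 0.04, "V": 1.0}, "with": {"P": 1.0},
           "stars": {"NP": 0.18}, "telescopes": {"NP": 0.1}}

def viterbi_prob(words):
    """best[i][j][A] = probability of the best A-rooted tree over words[i:j]."""
    n = len(words)
    best = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for cat, pr in lexical[w].items():
            best[i][i + 1][cat] = pr
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), pr in binary.items():
                    cand = pr * best[i][k][b] * best[k][j][c]
                    if cand > best[i][j][a]:
                        best[i][j][a] = cand
    return best[0][n]["S"]

print(viterbi_prob("astronomers saw stars with ears".split()))  # ≈ 0.0009072
```

The winner is the NP-attachment parse, matching the tree probabilities computed on the following slides.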
Probability of a tree

For the subtree [NP [NP stars] [PP [P with] [NP ears]]], the chain rule gives:

P(NP → NP PP, NP → stars, PP → P NP, P → with, NP → ears)
= P(NP → NP PP)
× P(NP → stars | NP → NP PP)
× P(PP → P NP | NP → NP PP, NP → stars)
× P(P → with | NP → NP PP, NP → stars, PP → P NP)
× P(NP → ears | NP → NP PP, NP → stars, PP → P NP, P → with)
Assumptions
- Place invariance: P(N^j_k(k+c) → ζ) is the same for all k.
- Context-free: P(N^j_kl → ζ | anything outside words k through l) = P(N^j_kl → ζ).
- Ancestor-free: P(N^j_kl → ζ | any ancestor nodes outside N^j_kl) = P(N^j_kl → ζ).
With these assumptions, the chain-rule product for the subtree [NP [NP stars] [PP [P with] [NP ears]]] factorizes into independent rule probabilities:

P(NP → NP PP, NP → stars, PP → P NP, P → with, NP → ears)
= P(NP → NP PP) × P(NP → stars) × P(PP → P NP) × P(P → with) × P(NP → ears)
t1: [S/1.0 [NP/0.1 astronomers] [VP/0.7 [V/1.0 saw] [NP/0.4 [NP/0.18 stars] [PP/1.0 [P/1.0 with] [NP/0.18 ears]]]]]
t2: [S/1.0 [NP/0.1 astronomers] [VP/0.3 [VP/0.7 [V/1.0 saw] [NP/0.18 stars]] [PP/1.0 [P/1.0 with] [NP/0.18 ears]]]]
P(t1) = 1.0 × 0.1 × 0.7 × 1.0 × 0.4 × 0.18 × 1.0 × 1.0 × 0.18 = 0.0009072
P(t2) = 1.0 × 0.1 × 0.3 × 0.7 × 1.0 × 0.18 × 1.0 × 1.0 × 0.18 = 0.0006804
Ambiguity
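The two tree probabilities can be checked by multiplying the rule probabilities directly (a trivial sketch; the lists below just transcribe the numbers from the two parse trees above):

```python
from functools import reduce
from operator import mul

# Rule probabilities read off the two trees (transcribed, not derived).
tree1 = [1.0, 0.1, 0.7, 1.0, 0.4, 0.18, 1.0, 1.0, 0.18]  # PP attached to the NP
tree2 = [1.0, 0.1, 0.3, 0.7, 1.0, 0.18, 1.0, 1.0, 0.18]  # PP attached to the VP

p1 = reduce(mul, tree1)  # probability of tree 1
p2 = reduce(mul, tree2)  # probability of tree 2
print(p1, p2)  # ≈ 0.0009072 and ≈ 0.0006804
```

Since P(t1) > P(t2), the PCFG prefers the NP-attachment reading.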
C(.) - number of times that a particular rule is used.
Probability of a rule
P(N^j → ζ) = C(N^j → ζ) / Σ_γ C(N^j → γ)
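This maximum-likelihood estimate can be sketched over a hypothetical toy rule list (the rule inventory below is invented for illustration):

```python
from collections import Counter

# Hypothetical toy treebank: each tree flattened to the rules it uses.
rules = [("NP", ("DT", "NN")), ("NP", ("DT", "NN")), ("NP", ("NNP",)),
         ("VP", ("V", "NP")), ("VP", ("V",))]

counts = Counter(rules)                        # C(N -> zeta)
lhs_totals = Counter(lhs for lhs, _ in rules)  # sum_gamma C(N -> gamma)

def rule_prob(lhs, rhs):
    """MLE: C(N -> zeta) / sum_gamma C(N -> gamma)."""
    return counts[(lhs, rhs)] / lhs_totals[lhs]

print(rule_prob("NP", ("DT", "NN")))  # 2/3
print(rule_prob("VP", ("V",)))        # 1/2
```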
How can we estimate rule probabilities when there is no annotated data?
- Maximize P(O | µ), the likelihood of the observed training data O given the grammar parameters µ.
- Use the Inside-Outside algorithm, a special case of the Expectation-Maximization method.
- It works with inside-outside probabilities estimated from the training set.
Maximum Likelihood Estimation
µ̂ = argmax_µ P(O_training | µ)

where µ is the current grammar parameter set (initially chosen at random).
Training a PCFG
Inside probability
Inside-Outside probabilities
For a sentence w_1 … w_m with start symbol N^1, let N^j_pq denote the nonterminal N^j spanning the words w_p … w_q. Define:

Outside probability: α_j(p,q) = P(w_1(p-1), N^j_pq, w_(q+1)m | G)

Inside probability: β_j(p,q) = P(w_pq | N^j_pq, G), the probability of the words w_p … w_q being generated with a tree rooted by node N^j_pq.
Inside probabilities
β_j(p,q) = P(w_pq | N^j_pq, G)

Base case (a single word w_k):

β_j(k,k) = P(N^j → w_k)

Induction (q > p):

β_j(p,q) = Σ_{r,s} Σ_{d=p}^{q-1} P(N^j → N^r N^s) β_r(p,d) β_s(d+1,q)
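The base case and induction above can be sketched as a bottom-up chart computation (our own code, reusing the example grammar from the earlier slides; `beta[p][q][A]` stands for β_A(p,q) with 0-based, end-exclusive spans):

```python
from collections import defaultdict

# Example PCFG from the earlier slide (Chomsky normal form).
binary = {("S", "NP", "VP"): 1.0, ("NP", "NP", "PP"): 0.4,
          ("PP", "P", "NP"): 1.0, ("VP", "V", "NP"): 0.7,
          ("VP", "VP", "PP"): 0.3}
lexical = {"astronomers": {"NP": 0.1}, "ears": {"NP": 0.18},
           "saw": {"NP": 0.04, "V": 1.0}, "with": {"P": 1.0},
           "stars": {"NP": 0.18}, "telescopes": {"NP": 0.1}}

def inside(words):
    """beta[p][q][A] = P(words[p:q] | A), computed bottom-up."""
    n = len(words)
    beta = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for k, w in enumerate(words):
        for a, pr in lexical[w].items():
            beta[k][k + 1][a] = pr        # base case: beta_j(k,k) = P(N^j -> w_k)
    for span in range(2, n + 1):
        for p in range(n - span + 1):
            q = p + span
            for d in range(p + 1, q):     # split point
                for (a, b, c), pr in binary.items():
                    beta[p][q][a] += pr * beta[p][d][b] * beta[d][q][c]
    return beta

words = "astronomers saw stars with ears".split()
beta = inside(words)
print(beta[0][len(words)]["S"])  # ≈ 0.0015876 = 0.0009072 + 0.0006804
```

The root inside probability is the total sentence probability: the sum over both parse trees.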
Outside probabilities

α_j(p,q) is the probability of beginning with the start symbol N^1 and generating the nonterminal N^j_pq and all the words outside w_p … w_q.

Base case:

α_1(1,m) = 1,  α_j(1,m) = 0 for j ≠ 1

Induction:

α_j(p,q) = Σ_{f,g} Σ_{e=q+1}^{m} α_f(p,e) P(N^f → N^j N^g) β_g(q+1,e)
         + Σ_{f,g} Σ_{e=1}^{p-1} α_f(e,q) P(N^f → N^g N^j) β_g(e,p-1)
Inside-Outside Algorithm

We have:
α_j(p,q) β_j(p,q) = P(N^1 ⇒* w_1m, N^j ⇒* w_pq | G)

In particular, the total probability of the sentence is P(N^1 ⇒* w_1m | G) = β_1(1,m).
E(N^j is used) = Σ_{p=1}^{m} Σ_{q=p}^{m} α_j(p,q) β_j(p,q) / P(N^1 ⇒* w_1m | G)

E(N^j → N^r N^s is used) = Σ_{p=1}^{m-1} Σ_{q=p+1}^{m} Σ_{d=p}^{q-1} α_j(p,q) P(N^j → N^r N^s) β_r(p,d) β_s(d+1,q) / P(N^1 ⇒* w_1m | G)
Therefore:

P̂(N^j → N^r N^s) = E(N^j → N^r N^s is used) / E(N^j is used)
= [Σ_{p=1}^{m-1} Σ_{q=p+1}^{m} Σ_{d=p}^{q-1} α_j(p,q) P(N^j → N^r N^s) β_r(p,d) β_s(d+1,q)] / [Σ_{p=1}^{m} Σ_{q=p}^{m} α_j(p,q) β_j(p,q)]

P̂(N^j → w^k) = [Σ_{h=1}^{m} α_j(h,h) P(N^j → w^k, w_h = w^k)] / [Σ_{p=1}^{m} Σ_{q=p}^{m} α_j(p,q) β_j(p,q)]
For each sentence W_i in the corpus, define:

f_i(p,q,j,r,s) = Σ_{d=p}^{q-1} α_j(p,q) P(N^j → N^r N^s) β_r(p,d) β_s(d+1,q) / P(N^1 ⇒* W_i | G)

g_i(h,j,k) = α_j(h,h) P(N^j → w^k, w_h = w^k) / P(N^1 ⇒* W_i | G)

h_i(p,q,j) = α_j(p,q) β_j(p,q) / P(N^1 ⇒* W_i | G)
Summing over all sentences, we have:

P̂(N^j → N^r N^s) = [Σ_i Σ_{p=1}^{m_i-1} Σ_{q=p+1}^{m_i} f_i(p,q,j,r,s)] / [Σ_i Σ_{p=1}^{m_i} Σ_{q=p}^{m_i} h_i(p,q,j)]

P̂(N^j → w^k) = [Σ_i Σ_{h=1}^{m_i} g_i(h,j,k)] / [Σ_i Σ_{p=1}^{m_i} Σ_{q=p}^{m_i} h_i(p,q,j)]
Discussion

- Training is expensive: each iteration of the Inside-Outside algorithm takes O(m³n³) time for each sentence (m = sentence length, n = number of nonterminals) to re-estimate the parameters.
- As a language model, a plain PCFG is a worse model of English than an n-gram model (n > 1).
TOP -> S(bought)
S(bought) -> NP(week) NP(Marks) VP(bought)
NP(week) -> JJ(Last) NN(week)
NP(Marks) -> NNP(Marks)
VP(bought) -> VB(bought) NP(Brooks)
NP(Brooks) -> NNP(Brooks)
More features

- Lexicalization: a head word is attached to each of the non-terminals.
- The head child of a rule receives its head-word from its parent.
[TOP [S(bought) [NP(week) [JJ Last] [NN week]] [NP(Marks) [NNP Marks]] [VP(bought) [VB bought] [NP(Brooks) [NNP Brooks]]]]]
Example
– Head constituent label of the phrase
– Modifiers to the right of the head
– Modifiers to the left of the head
Using Distance
The right-hand side of a rule P(h) → L_n(l_n) … L_1(l_1) H(h) R_1(r_1) … R_m(r_m) is generated head-first:

P_H(H | P, h)
× Π_{i=1}^{m} P_R(R_i(r_i) | P, h, H, R_1(r_1) … R_{i-1}(r_{i-1}))
× Π_{i=1}^{n} P_L(L_i(l_i) | P, h, H, L_1(l_1) … L_{i-1}(l_{i-1}))
For example:

P(S(bought) → NP(week) NP(Marks) VP(bought))
= P_H(VP | S, bought)
× P_L(NP(Marks) | S, VP, bought)
× P_L(NP(week) | S, VP, bought, NP(Marks))

With the distance-based independence assumption:

P(S(bought) → NP(week) NP(Marks) VP(bought))
= P_H(VP | S, bought)
× P_L(NP(Marks) | S, VP, bought)
× P_L(NP(week) | S, VP, bought)
[TOP [S(bought) [NP(week) [JJ Last] [NN week]] [NP-C(Marks) [NNP Marks]] [VP(bought) [VB bought] [NP-C(Brooks) [NNP Brooks]]]]]

It would be useful to identify "Marks" as a subject and "Last week" as an adjunct!
Adding the complement / adjunct distinction
A non-terminal is marked as a complement (-C) if it satisfies both of the following:

1. It is an NP, SBAR, or S whose parent is an S; an NP, SBAR, S, or VP whose parent is a VP; or an S whose parent is an SBAR.
2. It does not carry one of the semantic tags ADV, VOC, BNF, DIR, EXT, LOC, MNR, TMP, CLR or PRP.
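These two conditions can be sketched as a predicate (the function name and argument layout are hypothetical; the label sets are the ones listed above):

```python
# Parent label -> child labels eligible for the -C (complement) mark.
COMPLEMENT_PARENTS = {
    "S":    {"NP", "SBAR", "S"},
    "VP":   {"NP", "SBAR", "S", "VP"},
    "SBAR": {"S"},
}
# Semantic tags that block the complement mark (adjunct-like roles).
NON_COMPLEMENT_TAGS = {"ADV", "VOC", "BNF", "DIR", "EXT",
                       "LOC", "MNR", "TMP", "CLR", "PRP"}

def is_complement(label, parent, semantic_tags=()):
    """True if the constituent should be relabelled with a -C suffix."""
    if any(tag in NON_COMPLEMENT_TAGS for tag in semantic_tags):
        return False
    return label in COMPLEMENT_PARENTS.get(parent, set())

print(is_complement("NP", "VP"))            # True: object position
print(is_complement("NP", "VP", ("TMP",)))  # False: temporal adjunct
```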
The head is generated first, then left and right subcategorization frames LC and RC (multisets of complements still to be generated), then the modifiers:

P_H(H | P, h)
P_LC(LC | P, H, h) and P_RC(RC | P, H, h)
P_L(L_i(l_i) | P, h, H, distance_L(i-1), LC)
P_R(R_i(r_i) | P, h, H, distance_R(i-1), RC)
P(S(bought) → NP(week) NP-C(Marks) VP(bought))
= P_H(VP | S, bought)
× P_LC({NP-C} | S, VP, bought)
× P_L(NP-C(Marks) | S, VP, bought, {NP-C})
× P_L(NP(week) | S, VP, bought, {})
propagating gaps through the tree until they are finally discharged as a trace complement.
(1) NP -> NP SBAR(+gap)
(2) SBAR(+gap) -> WHNP S-C(+gap)
(3) S(+gap) -> NP-C VP(+gap)
(4) VP(+gap) -> VB TRACE NP
Traces and Wh-movement
[NP(store) [NP(store) The store] [SBAR(that)(+gap) [WHNP(that) [WDT that]] [S(bought)(+gap) [NP-C(Marks) Marks] [VP(bought)(+gap) [VBD bought] TRACE [NP(week) last week]]]]]
P(VP(bought)(+gap) → VB TRACE NP(week))
= P_H(VB | VP, bought)
× P_G(Right | VP, VB, bought)
× P_RC({NP-C} | VP, VB, bought)
× P_R(TRACE | VP, VB, bought, {NP-C, gap})
× P_R(NP(week) | VP, VB, bought, {})
Experiment
Labelled Precision = number of correct constituents in the proposed parse / number of constituents in the proposed parse

Labelled Recall = number of correct constituents in the proposed parse / number of constituents in the treebank parse

Crossing Brackets = number of constituents in the proposed parse which violate constituent boundaries of a constituent in the treebank parse

Models are trained on sections 2–21 of the Wall Street Journal portion of the Penn Treebank and tested on section 23.
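A sketch of the two labelled metrics, treating a parse as a set of (label, start, end) constituents (the representation and the example spans below are invented for illustration):

```python
def labelled_prf(proposed, gold):
    """Labelled precision and recall over (label, start, end) constituents."""
    proposed, gold = set(proposed), set(gold)
    correct = len(proposed & gold)           # constituents matching label + span
    precision = correct / len(proposed)
    recall = correct / len(gold)
    return precision, recall

gold     = {("S", 0, 5), ("NP", 0, 1), ("VP", 1, 5), ("NP", 2, 5)}
proposed = {("S", 0, 5), ("NP", 0, 1), ("VP", 1, 5), ("NP", 2, 3)}
print(labelled_prf(proposed, gold))  # (0.75, 0.75)
```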
References

- C. Manning and H. Schütze. Foundations of Statistical Natural Language Processing.
- M. Collins. Three Generative, Lexicalised Models for Statistical Parsing. ACL 1997.
- Stanford parser online demo: http://nlp.stanford.edu:8080/parser/