Statistical NLP, Spring 2010, Lecture 14: PCFGs. Dan Klein, UC Berkeley


SLIDE 1

Treebank PCFGs [Charniak 96]

  • Use PCFGs for broad coverage parsing
  • Can take a grammar right off the trees (doesn't work well):
  • Baseline: 72.0
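
Taking a grammar "right off the trees" can be sketched as counting local trees and normalizing by the parent symbol. This is a minimal illustration, not the setup from [Charniak 96]: the tuple tree encoding, the function names, and the two toy trees are all assumptions made for the example.

```python
from collections import Counter

# Assumed encoding: a tree is (label, child, child, ...); a leaf child is a word string.
def productions(tree):
    """Yield (lhs, rhs) for every local tree, including tag -> word rewrites."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield label, rhs
    for c in children:
        if not isinstance(c, str):
            yield from productions(c)

def estimate_pcfg(trees):
    """Relative frequency: P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter(p for t in trees for p in productions(t))
    lhs_counts = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_counts[lhs] += n
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}

# Toy treebank (hypothetical sentences, for illustration only).
toy_treebank = [
    ("S", ("NP", ("PRP", "She")), ("VP", ("VBD", "slept"))),
    ("S", ("NP", ("DT", "the"), ("NN", "dog")), ("VP", ("VBD", "barked"))),
]
grammar = estimate_pcfg(toy_treebank)
```

Here `grammar[("NP", ("PRP",))]` and `grammar[("NP", ("DT", "NN"))]` split the NP mass evenly, regardless of where the NP sits in the tree, which is exactly the independence assumption the next slides question.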

SLIDE 2

Conditional Independence?

  • Not every NP expansion can fill every NP slot
  • A grammar with symbols like "NP" won't be context-free
  • Statistically, conditional independence too strong

Non-Independence

  • Independence assumptions are often too strong.
  • Example: the expansion of an NP is highly dependent on the parent of the NP (i.e., subjects vs. objects).
  • Also: the subject and object expansions are correlated!

SLIDE 3

Grammar Refinement

  • Example: PP attachment

Grammar Refinement

  • Structure Annotation [Johnson '98, Klein & Manning '03]
  • Lexicalization [Collins '99, Charniak '00]
  • Latent Variables [Matsuzaki et al. 05, Petrov et al. '06]

SLIDE 4

The Game of Designing a Grammar

  • Annotation refines base treebank symbols to improve statistical fit of the grammar
  • Structural annotation

Typical Experimental Setup

  • Corpus: Penn Treebank, WSJ
      • Training: sections 02-21
      • Development: section 22 (here, first 20 files)
      • Test: section 23
  • Accuracy (F1): harmonic mean of per-node labeled precision and recall.
  • Here: also size (number of symbols in grammar).
      • Passive / complete symbols: NP, NP^S
      • Active / incomplete symbols: NP → NP CC •
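
The accuracy measure in this setup, F1 over labeled nodes, can be sketched as follows. Brackets are assumed to be (label, start, end) triples, and the gold and guess lists are made-up toy data; evaluation conventions such as ignoring punctuation are not modeled here.

```python
from collections import Counter

def labeled_prf(gold, guess):
    """gold, guess: non-empty lists of (label, start, end) brackets."""
    gold_c, guess_c = Counter(gold), Counter(guess)
    matched = sum((gold_c & guess_c).values())   # brackets present in both
    precision = matched / sum(guess_c.values())
    recall = matched / sum(gold_c.values())
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1

# Hypothetical brackets for one sentence.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
guess = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5), ("NP", 4, 5)]
p, r, f1 = labeled_prf(gold, guess)
```

Three of the five guessed brackets are correct and three of the four gold brackets are found, so precision is 0.6, recall is 0.75, and F1 is their harmonic mean.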

SLIDE 5

Vertical Markovization

  • Vertical Markov order: rewrites depend on past ancestor nodes. (cf. parent annotation)

[Figure: accuracy (72% to 79%) by vertical Markov order (1, 2v, 2, 3v, 3)]
[Figure: grammar size (5000 to 25000 symbols) by vertical Markov order (1, 2v, 2, 3v, 3)]
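
Parent annotation, the order-2 case of vertical Markovization, can be sketched as a tree relabeling. The tuple tree encoding and the choice to leave part-of-speech tags unannotated are assumptions of this sketch, not details taken from the slides.

```python
def parent_annotate(tree, parent=None):
    """Relabel each phrasal node X under parent P as X^P (vertical order 2)."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return tree  # preterminal (tag, word): left as-is in this sketch
    new_label = label if parent is None else f"{label}^{parent}"
    # Children see the original (unannotated) label of their parent.
    return (new_label, *(parent_annotate(c, label) for c in children))

tree = ("S",
        ("NP", ("PRP", "She")),
        ("VP", ("VBD", "saw"), ("NP", ("NN", "dogs"))))
annotated = parent_annotate(tree)
```

After annotation the subject is NP^S and the object is NP^VP, so a grammar read off the annotated trees can give them different expansion distributions, at the cost of more symbols.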

Horizontal Markovization

[Figure: accuracy (70% to 74%) by horizontal Markov order (1, 2v, 2, inf)]
[Figure: grammar size (3000 to 12000 symbols) by horizontal Markov order (1, 2v, 2, inf)]
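
One way to picture horizontal Markovization is as binarization where the intermediate symbols remember only the last h sisters already generated. The sketch below assumes left-to-right binarization and an invented `@X->...` naming scheme for intermediate symbols; head-outward generation and the variable-order settings (2v) from the plots are not modeled.

```python
def markovize_rule(lhs, rhs, h=1):
    """Binarize lhs -> rhs left to right. Intermediate symbols record only the
    last h children generated so far (h=None keeps the full history)."""
    if len(rhs) <= 2:
        return [(lhs, tuple(rhs))]
    rules, current = [], lhs
    for i in range(len(rhs) - 2):
        seen = rhs[: i + 1]
        memory = seen if h is None else seen[max(0, len(seen) - h):]
        inter = f"@{lhs}->...{'_'.join(memory)}"
        rules.append((current, (rhs[i], inter)))
        current = inter
    rules.append((current, tuple(rhs[-2:])))
    return rules

rules_h1 = markovize_rule("NP", ["DT", "JJ", "JJ", "NN"], h=1)
rules_h0 = markovize_rule("NP", ["DT", "JJ", "JJ", "NN"], h=0)
```

With h=0 every intermediate symbol collapses to a single `@NP->...`, which gives the smallest grammar; larger h keeps more sister context and more symbols.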

SLIDE 6

Unary Splits

  • Problem: unary rewrites used to transmute categories so a high-probability rule can be used.

Annotation   F1     Size
Base         77.8   7.5K
UNARY        78.3   8.0K

Tag Splits

  • Problem: Treebank tags are too coarse.
  • Example: Sentential, PP, and other prepositions are all marked IN.
  • Partial Solution: Subdivide the IN tag.

Annotation   F1     Size
Previous     78.3   8.0K
SPLIT-IN     80.3   8.1K

SLIDE 7

Other Tag Splits

Annotation   F1     Size   Description
UNARY-DT     80.4   8.1K   mark demonstratives as DT^U ("the X" vs. "those")
UNARY-RB     80.5   8.1K   mark phrasal adverbs as RB^U ("quickly" vs. "very")
TAG-PA       81.2   8.5K   mark tags with non-canonical parents ("not" is an RB^VP)
SPLIT-AUX    81.6   9.0K   mark auxiliary verbs with -AUX [cf. Charniak 97]
SPLIT-CC     81.7   9.1K   separate "but" and "&" from other conjunctions
SPLIT-%      81.8   9.3K   "%" gets its own tag
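
A few of these splits can be sketched as a function from (tag, word, parent) to a refined tag. The table of canonical parents and the `CC-but` style names below are hypothetical, chosen only so the example runs; the slides do not spell out either.

```python
# Hypothetical table: the parent under which each tag is considered "canonical".
CANONICAL_PARENT = {"RB": "ADVP", "DT": "NP", "IN": "PP"}

def split_tag(tag, word, parent):
    """Return a refined tag for one (tag, word) pair under the given parent label."""
    if word == "%":
        return "%"                          # SPLIT-%: "%" gets its own tag
    if tag == "CC" and word.lower() in {"but", "&"}:
        return f"CC-{word.lower()}"         # SPLIT-CC (naming is an assumption)
    canonical = CANONICAL_PARENT.get(tag)
    if canonical is not None and parent != canonical:
        return f"{tag}^{parent}"            # TAG-PA: mark the non-canonical parent
    return tag

examples = [split_tag("RB", "not", "VP"), split_tag("RB", "quickly", "ADVP")]
```

Under these assumptions "not" becomes RB^VP while "quickly" inside an ADVP stays RB, mirroring the TAG-PA example above.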

A Fully Annotated (Unlex) Tree

SLIDE 8

Some Test Set Results

Parser          LP     LR     CB     0 CB
Magerman 95     84.9   84.6   1.26   56.6
Collins 96      86.3   85.8   1.14   59.9
Unlexicalized   86.9   85.7   1.10   60.3
Charniak 97     87.4   87.5   1.00   62.1
Collins 99      88.7   88.6   0.90   67.1

  • Beats "first generation" lexicalized parsers.
  • Lots of room to improve: more complex models next.

The Game of Designing a Grammar

  • Annotation refines base treebank symbols to improve statistical fit of the grammar
  • Structural annotation [Johnson '98, Klein and Manning 03]
  • Head lexicalization [Collins '99, Charniak '00]

SLIDE 9

Problems with PCFGs

  • What's different between basic PCFG scores here?
  • What (lexical) correlations need to be scored?

Lexicalized Trees

  • Add "headwords" to each phrasal node
      • Syntactic vs. semantic heads
      • Headship not in (most) treebanks
      • Usually, e.g.:
  • NP:
      • Take leftmost NP
      • Take rightmost N*
      • Take rightmost JJ
      • Take right child
  • VP:
      • Take leftmost VB*
      • Take leftmost VP
      • Take left child
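
The NP and VP head rules listed above can be written as an ordered list of (direction, test) pairs that are tried in turn. Only the rules on the slide are encoded; the data layout, the function name, and the fallback of taking the left child for any other category are assumptions of this sketch.

```python
# Each rule: (search direction, predicate on a child label), tried in order.
HEAD_RULES = {
    "NP": [("left", lambda l: l == "NP"),
           ("right", lambda l: l.startswith("N")),   # N*: NN, NNS, NNP, ...
           ("right", lambda l: l == "JJ")],
    "VP": [("left", lambda l: l.startswith("VB")),   # VB*: VB, VBD, VBZ, ...
           ("left", lambda l: l == "VP")],
}
DEFAULT_SIDE = {"NP": "right", "VP": "left"}  # last-resort child from the slide

def head_index(parent, child_labels):
    """Return the index of the head child among child_labels."""
    for direction, matches in HEAD_RULES.get(parent, []):
        indices = range(len(child_labels))
        if direction == "right":
            indices = reversed(indices)
        for i in indices:
            if matches(child_labels[i]):
                return i
    # Assumption: categories without rules fall back to the left child.
    return len(child_labels) - 1 if DEFAULT_SIDE.get(parent) == "right" else 0

np_head = head_index("NP", ["DT", "JJ", "NN"])
vp_head = head_index("VP", ["MD", "VP"])
```

Headwords are then propagated bottom-up: each phrasal node copies the headword of the child that `head_index` selects.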

SLIDE 10

Lexicalized PCFGs?

  • Problem: we now have to estimate probabilities like

[Figure: example lexicalized rule]

  • Never going to get these atomically off of a treebank
  • Solution: break up derivation into smaller steps

Lexical Derivation Steps

  • A derivation of a local tree [Collins 99]
      • Choose a head tag and word
      • Choose a complement bag
      • Generate children (incl. adjuncts)
      • Recursively derive children

SLIDE 11

Lexicalized CKY

[Figure: X[h] over span (i, j) built from Y[h] and Z[h'], with positions i, h, k, h', j]
Pruning with Beams

  • The Collins parser prunes with per-cell beams [Collins 99]
      • Essentially, run the O(n^5) CKY
      • Remember only a few hypotheses for each span <i,j>.
      • If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)
      • Keeps things more or less cubic
  • Also: certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

[Figure: X[h] over span (i, j) built from Y[h] and Z[h'], with positions i, h, k, h', j]
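
Per-cell beams and the O(nK^2) bound can be sketched in a few lines. The cell representation (a dict from hypothesis to log score) and both function names are assumptions for illustration, not the Collins parser's actual data structures.

```python
import heapq

def prune_cell(cell, K):
    """Keep the K best hypotheses in one chart cell.
    cell maps a hypothesis, e.g. a (symbol, head) pair, to its log score."""
    return dict(heapq.nlargest(K, cell.items(), key=lambda kv: kv[1]))

def pairings_for_span(i, j, K):
    """Upper bound on child pairings for span <i,j> when every cell holds at most
    K hypotheses: each of the (j - i - 1) split points pairs at most K left
    hypotheses with at most K right hypotheses."""
    return (j - i - 1) * K * K

cell = {("NP", "dog"): -1.0, ("NP", "the"): -3.0, ("S", "dog"): -2.0}
kept = prune_cell(cell, 2)
```

Since a span has fewer than n split points, the work per span is at most n * K^2, and with O(n^2) spans the total stays roughly cubic for fixed K.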

SLIDE 12

Pruning with a PCFG

  • The Charniak parser prunes using a two-pass approach [Charniak 97+]
      • First, parse with the base grammar
      • For each X:[i,j] calculate P(X|i,j,s)
          • This isn't trivial, and there are clever speedups
      • Second, do the full O(n^5) CKY
          • Skip any X:[i,j] which had low (say, < 0.0001) posterior
      • Avoids almost all work in the second phase!
  • Charniak et al 06: can use more passes
  • Petrov et al 07: can use many more passes
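
The pruning decision in the two-pass scheme can be sketched given inside and outside scores from the first pass. Computing those scores is the non-trivial part and is assumed done; the dictionaries and numbers below are made up, and the posterior is taken as inside times outside divided by the sentence probability.

```python
def bracket_posteriors(inside, outside, sentence_prob):
    """Posterior of each item X:[i,j] from first-pass inside/outside scores."""
    return {item: inside[item] * outside.get(item, 0.0) / sentence_prob
            for item in inside}

def allowed_items(posteriors, threshold=1e-4):
    """Items the second (expensive) pass is allowed to build."""
    return {item for item, p in posteriors.items() if p >= threshold}

# Hypothetical first-pass scores for two items over the same span.
inside = {("NP", 0, 2): 0.02, ("VP", 0, 2): 1e-7}
outside = {("NP", 0, 2): 0.005, ("VP", 0, 2): 0.001}
posteriors = bracket_posteriors(inside, outside, sentence_prob=1e-4)
allowed = allowed_items(posteriors)
```

In this toy case the VP item has posterior around 1e-6, below the 0.0001 threshold, so the full parser never considers any lexicalized VP over that span.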

Pruning with A*

  • You can also speed up the search without sacrificing optimality
  • For agenda-based parsers:
      • Can select which items to process first
      • Can do with any "figure of merit" [Charniak 98]
      • If your figure-of-merit is a valid A* heuristic, no loss of optimality [Klein and Manning 03]

SLIDE 13

Projection-Based A*

[Figure: two projections, each labeled π]

A* Speedup

  • Total time dominated by calculation of A* tables in each projection… O(n^3)

[Figure: time (sec, 10 to 60) by sentence length (5 to 40) for the combined phase, dependency phase, and PCFG phase]

SLIDE 14

Results

  • Some results
      • Collins 99: 88.6 F1 (generative lexical)
      • Charniak and Johnson 05: 89.7 / 91.3 F1 (generative lexical / reranked)
      • Petrov et al 06: 90.7 F1 (generative unlexical)
      • McClosky et al 06: 92.1 F1 (gen + rerank + self-train)
  • However
      • Bilexical counts rarely make a difference (why?)
      • Gildea 01: Removing bilexical counts costs < 0.5 F1

The Game of Designing a Grammar

  • Annotation refines base treebank symbols to improve statistical fit of the grammar
  • Structural annotation
  • Head lexicalization
  • Automatic clustering?

SLIDE 15

Latent Variable Grammars

Learning Latent Annotations

  • EM algorithm:
      • Brackets are known
      • Base categories are known
      • Only induce subcategories

SLIDE 16

Refinement of the DT tag

[Figure: DT split into DT-1, DT-2, DT-3, DT-4]

Hierarchical refinement

SLIDE 17

Hierarchical Estimation Results

[Figure: parsing accuracy (F1, 74 to 90) by total number of grammar symbols (100 to 1700)]

Model                   F1
Flat Training           87.3
Hierarchical Training   88.4

Refinement of the , tag

  • Splitting all categories equally is wasteful:

SLIDE 18

Adaptive Splitting

  • Want to split complex categories more
  • Idea: split everything, roll back splits which were least useful

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5
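
The "split everything, roll back the least useful" step can be sketched as ranking splits by how much likelihood is lost when each one is undone, then merging the bottom half. How that loss is estimated is not shown on the slides; the loss values and the function name here are hypothetical.

```python
def choose_merges(split_loss, fraction=0.5):
    """split_loss maps a category to the estimated likelihood loss from undoing
    its latest split. Returns the categories whose splits are rolled back:
    the `fraction` with the smallest loss."""
    ranked = sorted(split_loss, key=split_loss.get)   # least useful first
    return set(ranked[: int(len(ranked) * fraction)])

# Hypothetical losses: splitting punctuation tags buys almost nothing.
split_loss = {"NP": 12.0, ",": 0.01, "DT": 3.5, ".": 0.02}
merged_back = choose_merges(split_loss)
```

With these made-up numbers the comma and period splits are merged back, while NP and DT keep their subcategories, which matches the intuition that complex categories deserve more splits.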

SLIDE 19

Number of Phrasal Subcategories

[Figure: number of subcategories (5 to 40) per phrasal category: NP VP PP ADVP S ADJP SBAR QP WHNP PRN NX SINV PRT WHPP SQ CONJP FRAG NAC UCP WHADVP INTJ SBARQ RRC WHADJP X ROOT LST]

Number of Lexical Subcategories

[Figure: number of subcategories (10 to 70) per tag: NNP JJ NNS NN VBN RB VBG VB VBD CD IN VBZ VBP DT NNPS CC JJR JJS : PRP PRP$ MD RBR WP POS PDT WRB -LRB- . EX WP$ WDT -RRB- '' FW RBS TO $ UH , `` SYM RP LS #]

SLIDE 20

Learned Splits

  • Proper Nouns (NNP):

NNP-14   Oct.   Nov.        Sept.
NNP-12   John   Robert      James
NNP-2    J.     E.          L.
NNP-1    Bush   Noriega     Peters
NNP-15   New    San         Wall
NNP-3    York   Francisco   Street

  • Personal pronouns (PRP):

PRP-0   It   He     I
PRP-1   it   he     they
PRP-2   it   them   him

Learned Splits

  • Comparative adverbs (RBR):

RBR-0   further   lower     higher
RBR-1   more      less      More
RBR-2   earlier   Earlier   later

  • Cardinal Numbers (CD):

CD-7    one       two       Three
CD-4    1989      1990      1988
CD-11   million   billion   trillion
CD-0    1         50        100
CD-3    1         30        31
CD-9    78        58        34

SLIDE 21

Coarse-to-Fine Inference

  • Example: PP attachment

Prune?

  • For each chart item, compute posterior probability:
  • E.g. consider the span 5 to 12:

coarse:    … QP NP VP …
refined:

SLIDE 22

Bracket Posteriors

Hierarchical Pruning

coarse:           … QP NP VP …
split in two:     … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:    … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:   … … … … … … … … … … … … … … … … …
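
Hierarchical pruning can be sketched as expanding only the items that survived at the coarser level. The numbering scheme below, where the unsplit X refines to X1 and X2 and Xk refines to X(2k-1) and X(2k), is an assumption chosen to match the QP1 … QP4 pattern above; the thresholding of posteriors at each level is assumed to have happened already.

```python
import re

def children(symbol):
    """Refinements of a symbol one level down (numbering scheme is an assumption)."""
    m = re.fullmatch(r"(\D+)(\d*)", symbol)
    base, k = m.group(1), int(m.group(2) or 1)
    return [f"{base}{2 * k - 1}", f"{base}{2 * k}"]

def next_level_items(surviving):
    """surviving: set of (symbol, i, j) items kept after pruning at the current
    level. Only their refinements are considered at the next, finer level."""
    return {(child, i, j) for (sym, i, j) in surviving for child in children(sym)}

# Span 5 to 12: suppose only NP survived the coarse pass, then only NP2 at level two.
level_two = next_level_items({("NP", 5, 12)})
level_four = next_level_items({("NP2", 5, 12)})
```

Because QP and VP were pruned at the coarse level, none of their subcategories are ever scored over this span, and the same saving compounds at each finer level.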

SLIDE 23

Final Results (Accuracy)

Parser                                ≤ 40 words F1   all F1
Charniak & Johnson '05 (generative)   90.1            89.6
Dubey '05                             76.3            -
Chiang et al. '02                     80.0            76.6

  • Still higher numbers from reranking / self-training methods