SLIDE 1

Assignment 2: Parsing

PCFG and CKY with C2FP

Chan Young Park

SLIDE 2

Background: PCFG Recap

SLIDE 3

Background: PCFG Recap

S → NP VP
NP → N
NP → NP PP
NP → DT NN
VP → ADV V NP
PP → P NP
N → I
N → students
N → telescope
ADV → recently
V → saw
P → with
DT → a

SLIDE 4

Background: PCFG Recap

S → NP VP         1.0
NP → N            0.5
NP → NP PP        0.25
NP → DT NN        0.25
VP → ADV V NP     1.0
PP → P NP         1.0
N → I             0.33
N → students      0.33
N → telescope     0.33
ADV → recently    1.0
V → saw           1.0
P → with          1.0
DT → a            1.0
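In a PCFG, the probabilities of all rules sharing a left-hand side sum to 1, and a common way to obtain them is relative-frequency estimation: count(A → β) / count(A). The sketch below illustrates that idea only; `RuleProbabilityEstimator`, its `estimate` method, and the list of observed rules are hypothetical and are not part of the assignment code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: P(A -> beta) = count(A -> beta) / count(A). */
final class RuleProbabilityEstimator {

    /** Each observed rule is written "LHS -> RHS"; repeated entries are repeated observations. */
    static Map<String, Double> estimate(List<String> observedRules) {
        Map<String, Integer> ruleCounts = new HashMap<>();
        Map<String, Integer> lhsCounts = new HashMap<>();
        for (String rule : observedRules) {
            String lhs = rule.split("->")[0].trim();
            ruleCounts.merge(rule, 1, Integer::sum);
            lhsCounts.merge(lhs, 1, Integer::sum);
        }
        Map<String, Double> probabilities = new HashMap<>();
        for (Map.Entry<String, Integer> e : ruleCounts.entrySet()) {
            String lhs = e.getKey().split("->")[0].trim();
            double total = lhsCounts.get(lhs);
            probabilities.put(e.getKey(), e.getValue() / total);
        }
        return probabilities;
    }

    public static void main(String[] args) {
        // Illustrative observations only: NP -> N is seen twice, the other NP rules once.
        List<String> observed = Arrays.asList(
            "S -> NP VP", "NP -> N", "NP -> N", "NP -> NP PP", "NP -> DT NN");
        estimate(observed).forEach((rule, p) -> System.out.println(rule + "  " + p));
    }
}
```

In a real parser the counts would come from the annotated training trees rather than from a hand-written list.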

SLIDE 5

Main Implementation

  • 1. Main entry point: PCFGParserTester (+ baseline: BaselineParser)
  • 2. Two classes you need to implement

    ○ GenerativeParserFactory
    ○ CoarseToFineParserFactory (optional)

  • 3. Methods you need to implement

    ○ getParser(List<Tree<String>> trainTrees)
    ○ getBestParse(List<String> sentence)

SLIDE 6

Two methods to implement

  • 1. getParser(List<Tree<String>> trainTrees)

    ○ Annotate the training trees: annotateTrees(trainTrees)
    ○ Use the given classes to build a parser: Grammar, SimpleLexicon, UnaryClosure

SLIDE 8

Two methods to implement

  • 2. getBestParse(List<String> sentence)

    ○ CKY algorithm (buildChart)
    ○ Backtracking (getBestTree)

SLIDE 9

The Assignment: Parsing

In summary, what do you need to implement?

  • [Pre-processing] Tree Annotator
  • [Main stuff] CKY algorithm that can handle unaries
  • [Post-processing] Extracting the best tree from the backpointers
  • [Extra credit] Implementing coarse-to-fine pruning

SLIDE 10
  • 1. Tree Annotation

Binarization + Markovization + Parent Annotation

SLIDE 11

Binarization + Markovization + Parent Annotation

  • Tree binarization:
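Binarization rewrites every node with more than two children as a chain of nodes with at most two children, introducing intermediate symbols along the way. The following is a minimal sketch of a right-branching variant. `BinNode` is a stand-in for the assignment's `Tree<String>`, and the `@Parent->_history` label format is an assumption chosen for illustration, not necessarily what `TreeAnnotations` produces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Minimal stand-in for a labeled tree; not the assignment's Tree<String> class. */
final class BinNode {
    final String label;
    final List<BinNode> children;

    BinNode(String label, List<BinNode> children) {
        this.label = label;
        this.children = children;
    }

    static BinNode of(String label, BinNode... kids) {
        return new BinNode(label, new ArrayList<>(Arrays.asList(kids)));
    }

    @Override
    public String toString() {
        if (children.isEmpty()) return label;
        StringBuilder sb = new StringBuilder("(" + label);
        for (BinNode c : children) sb.append(' ').append(c);
        return sb.append(')').toString();
    }
}

final class RightBinarizer {
    /** Nodes with more than two children get a right-branching chain of "@" nodes. */
    static BinNode binarize(BinNode node) {
        List<BinNode> kids = new ArrayList<>();
        for (BinNode child : node.children) kids.add(binarize(child));
        if (kids.size() <= 2) return new BinNode(node.label, kids);
        BinNode rest = chain(node.label, kids, 1, kids.get(0).label);
        return new BinNode(node.label, Arrays.asList(kids.get(0), rest));
    }

    /** Intermediate node covering kids[from..]; its label records the siblings generated so far. */
    private static BinNode chain(String parent, List<BinNode> kids, int from, String history) {
        String label = "@" + parent + "->_" + history;
        if (kids.size() - from == 2) {
            return new BinNode(label, Arrays.asList(kids.get(from), kids.get(from + 1)));
        }
        BinNode rest = chain(parent, kids, from + 1, history + "_" + kids.get(from).label);
        return new BinNode(label, Arrays.asList(kids.get(from), rest));
    }
}
```

With this scheme, `VP → ADV V NP` becomes `VP → ADV @VP->_ADV` and `@VP->_ADV → V NP`. A left-branching variant would chain from the last child instead.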

SLIDE 13

Binarization + Markovization + Parent Annotation

  • Horizontal Markovization:
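Horizontal Markovization limits how many already-generated siblings an intermediate symbol remembers, which merges intermediate symbols and shrinks the grammar. A minimal sketch of the labeling step follows; the class `MarkovLabels`, the `...` truncation marker, and the label format are hypothetical choices made for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class MarkovLabels {
    /**
     * Label for the intermediate node created after `generated` siblings under `parent`,
     * keeping only the last `order` of them. A negative order keeps the full history.
     */
    static String intermediateLabel(String parent, List<String> generated, int order) {
        int keep = order < 0 ? generated.size() : Math.min(order, generated.size());
        List<String> history = generated.subList(generated.size() - keep, generated.size());
        // Mark truncated histories so they differ from short complete ones.
        String prefix = keep < generated.size() ? "..." : "";
        StringBuilder sb = new StringBuilder("@" + parent + "->" + prefix);
        for (String sibling : history) sb.append('_').append(sibling);
        return sb.toString();
    }

    public static void main(String[] args) {
        // With order 1, two different histories collapse into the same symbol.
        System.out.println(intermediateLabel("NP", Arrays.asList("DT", "JJ"), 1));
        System.out.println(intermediateLabel("NP", Arrays.asList("RB", "JJ"), 1));
    }
}
```

Lower orders give fewer symbols and faster parsing at the cost of weaker context; the best trade-off is something to measure on the development set.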

SLIDE 14

Binarization + Markovization + Parent Annotation

  • Parent annotation:
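Parent annotation (vertical Markovization of order 2) appends each node's parent label to its own label, so that, for example, an NP under S and an NP under VP become different symbols. The sketch below assumes a stand-in `PNode` class and a `^` separator, and it annotates every non-leaf node including pre-terminals; whether tags should be annotated is a design decision left to the implementer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Minimal stand-in for a labeled tree; not the assignment's Tree<String> class. */
final class PNode {
    final String label;
    final List<PNode> children;

    PNode(String label, List<PNode> children) {
        this.label = label;
        this.children = children;
    }

    static PNode of(String label, PNode... kids) {
        return new PNode(label, new ArrayList<>(Arrays.asList(kids)));
    }

    @Override
    public String toString() {
        if (children.isEmpty()) return label;
        StringBuilder sb = new StringBuilder("(" + label);
        for (PNode c : children) sb.append(' ').append(c);
        return sb.append(')').toString();
    }
}

final class ParentAnnotator {
    /** Returns a copy of the tree in which every non-leaf, non-root label is "label^parent". */
    static PNode annotate(PNode node, String parentLabel) {
        if (node.children.isEmpty()) return node;  // words are left unchanged
        String label = parentLabel == null ? node.label : node.label + "^" + parentLabel;
        List<PNode> kids = new ArrayList<>();
        // Children see the original (unannotated) label of this node as their parent.
        for (PNode child : node.children) kids.add(annotate(child, node.label));
        return new PNode(label, kids);
    }
}
```

Calling `ParentAnnotator.annotate(root, null)` leaves the root label unchanged and annotates everything below it.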

SLIDE 15

TreeAnnotations.class

  • edu.berkeley.nlp.assignments.parsing.TreeAnnotations
  • Choose which Markovization order you want to use first

SLIDE 16
  • 2. CKY algorithm

Filling in Unary & Binary Charts

Slide credit: Lecture Slides for Stanford Coursera course -- Probabilistic Parsing

SLIDE 17

CKY Main Algorithm

  • Standard CKY works with binary rules only
  • One possible way to handle unary rules:

    ○ Use two charts: a binary chart and a unary chart
    ○ Binary chart: stores the scores of non-terminals after applying binary rules
    ○ Unary chart: stores the scores of non-terminals after applying unary rules
    ○ Alternate between filling the unary and binary charts

SLIDE 18

Possible Ways to Fill the Charts

[Main Stuff] CKY Algorithm

[Figure: two charts with cells indexed by min (0 to 2) and max (1 to 3), one per fill order]

  • Method 1: based on length, and then min
  • Method 2: based on max, and then min
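Both fill orders visit every span [min, max) only after all of its sub-spans, which is the property CKY needs. The sketch below simply enumerates the spans in each order for a sentence of length n; `SpanOrders` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SpanOrders {
    /** Method 1: by span length, then by min. */
    static List<int[]> byLengthThenMin(int n) {
        List<int[]> spans = new ArrayList<>();
        for (int length = 1; length <= n; length++) {
            for (int min = 0; min + length <= n; min++) {
                spans.add(new int[] {min, min + length});
            }
        }
        return spans;
    }

    /** Method 2: by max, then by min counting down from max - 1 to 0. */
    static List<int[]> byMaxThenMin(int n) {
        List<int[]> spans = new ArrayList<>();
        for (int max = 1; max <= n; max++) {
            for (int min = max - 1; min >= 0; min--) {
                spans.add(new int[] {min, max});
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        for (int[] span : byMaxThenMin(3)) System.out.println(span[0] + " " + span[1]);
    }
}
```

In Method 2, the left sub-span [min, mid) has a smaller max and the right sub-span [mid, max) has a larger min, so both were filled earlier in the iteration.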

SLIDE 19

CKY Main Algorithm

<fill in possible pre-terminals and unary closures of those>
for each max from 2 to n
    for each min from max-2 to 0
        for each non-terminal C
            for each binary rule C -> C1 C2
                for each mid from min+1 to max-1
                    if unary_chart[min][mid][C1] and unary_chart[mid][max][C2] then
                        binary_chart[min][max][C] = max(binary_chart[min][max][C], score(min, mid, max, C, C1, C2))
        <fill in unary_chart based on binary_chart, but with unary rules>

  • We can improve the speed by avoiding testing unnecessary rules

[Figure: chart with cells indexed by min (0 to 3) and max (1 to 4)]

SLIDE 20

CKY Main Algorithm

<fill in possible pre-terminals and unary closures of those>
for each max from 2 to n
    for each min from max-2 to 0
        for each mid from min+1 to max-1
            for each non-terminal C1 present at [min][mid]
                for each binary rule C -> C1 C2
                    if unary_chart[min][mid][C1] and unary_chart[mid][max][C2] then
                        binary_chart[min][max][C] = max(binary_chart[min][max][C], score(min, mid, max, C, C1, C2))
        <fill in unary_chart based on binary_chart, but with unary rules>

  • You can experiment with other ways to prune the rules to be tested

[Figure: chart with cells indexed by min (0 to 2) and max (1 to 3)]
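The pseudocode can be turned into a small runnable sketch. The version below uses log scores, integer symbol ids, and two dense charts, and it loops over max and then min as in the slides. `TwoChartCky`, `BinaryRule`, and `UnaryRule` are hypothetical stand-ins for the assignment's `Grammar` and `UnaryClosure` classes; the unary rule list is assumed to be already closed, and back pointers and rule indexing by child are omitted to keep it short.

```java
import java.util.Arrays;

final class TwoChartCky {
    static final double NEG_INF = Double.NEGATIVE_INFINITY;

    /** Hypothetical rule parent -> left right with a log-probability score. */
    static final class BinaryRule {
        final int parent, left, right;
        final double score;
        BinaryRule(int parent, int left, int right, double score) {
            this.parent = parent; this.left = left; this.right = right; this.score = score;
        }
    }

    /** Hypothetical closed unary rule parent => child with a log-probability score. */
    static final class UnaryRule {
        final int parent, child;
        final double score;
        UnaryRule(int parent, int child, double score) {
            this.parent = parent; this.child = child; this.score = score;
        }
    }

    /**
     * Returns unaryChart[min][max][symbol]: the best log score of symbol over words min..max-1.
     * tagScores[i][symbol] is the log score of tagging word i with symbol (NEG_INF if impossible).
     */
    static double[][][] parse(double[][] tagScores, int numSymbols,
                              BinaryRule[] binaries, UnaryRule[] closedUnaries) {
        int n = tagScores.length;
        double[][][] binaryChart = newChart(n, numSymbols);
        double[][][] unaryChart = newChart(n, numSymbols);
        // Width-1 spans: pre-terminals from the lexicon, then unary rules on top of them.
        for (int i = 0; i < n; i++) {
            System.arraycopy(tagScores[i], 0, binaryChart[i][i + 1], 0, numSymbols);
            applyUnaries(binaryChart[i][i + 1], unaryChart[i][i + 1], closedUnaries);
        }
        // Wider spans: both sub-spans of [min, max) are complete when the cell is filled.
        for (int max = 2; max <= n; max++) {
            for (int min = max - 2; min >= 0; min--) {
                double[] cell = binaryChart[min][max];
                for (int mid = min + 1; mid < max; mid++) {
                    double[] left = unaryChart[min][mid];
                    double[] right = unaryChart[mid][max];
                    for (BinaryRule r : binaries) {
                        if (left[r.left] == NEG_INF || right[r.right] == NEG_INF) continue;
                        double s = r.score + left[r.left] + right[r.right];
                        if (s > cell[r.parent]) cell[r.parent] = s;
                    }
                }
                applyUnaries(cell, unaryChart[min][max], closedUnaries);
            }
        }
        return unaryChart;
    }

    /** Copies the binary cell (the implicit A => A step), then applies each closed unary rule once. */
    private static void applyUnaries(double[] binaryCell, double[] unaryCell, UnaryRule[] rules) {
        System.arraycopy(binaryCell, 0, unaryCell, 0, binaryCell.length);
        for (UnaryRule r : rules) {
            if (binaryCell[r.child] == NEG_INF) continue;
            double s = r.score + binaryCell[r.child];
            if (s > unaryCell[r.parent]) unaryCell[r.parent] = s;
        }
    }

    private static double[][][] newChart(int n, int numSymbols) {
        double[][][] chart = new double[n][n + 1][numSymbols];
        for (double[][] row : chart) for (double[] cell : row) Arrays.fill(cell, NEG_INF);
        return chart;
    }
}
```

This sketch scans every binary rule for every split point, which is the slow baseline; looking rules up by the left or right child symbols actually present in the chart is the kind of pruning the second pseudocode suggests.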

SLIDE 21

The grammar: Binary

SLIDE 32

CKY Implementation Details

  • Utilize these functions well:

    ○ Grammar.binaryRulesBy{LeftChild,RightChild,Parent}
    ○ UnaryClosure.closedUnaryRulesBy{Child,Parent}
    ○ Lexicon.scoreTagging

  • How to store the possible non-terminals?

    ○ A fixed list of all non-terminals? How many of them? Is it small enough?
    ○ A dynamic-size array for non-terminals? Is it fast enough?
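One way to weigh the "fixed list" option is to map every non-terminal label to a dense integer id and store scores in a primitive array, then estimate the memory this needs. The sketch below is an illustration under those assumptions; `SymbolIndexer` and `DenseChart` are hypothetical names, not classes from the assignment code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical symbol table mapping each non-terminal label to a dense int id. */
final class SymbolIndexer {
    private final Map<String, Integer> ids = new HashMap<>();
    private final List<String> labels = new ArrayList<>();

    /** Returns the id of the label, assigning the next free id on first sight. */
    int idOf(String label) {
        Integer id = ids.get(label);
        if (id == null) {
            id = labels.size();
            ids.put(label, id);
            labels.add(label);
        }
        return id;
    }

    String labelOf(int id) { return labels.get(id); }

    int size() { return labels.size(); }
}

/** Fixed-size chart of primitives: score[min][max][symbolId], in the log domain. */
final class DenseChart {
    final double[][][] score;

    DenseChart(int sentenceLength, int numSymbols) {
        score = new double[sentenceLength][sentenceLength + 1][numSymbols];
        for (double[][] row : score) {
            for (double[] cell : row) Arrays.fill(cell, Double.NEGATIVE_INFINITY);
        }
    }

    /** Rough footprint in bytes: 8 bytes per double, ignoring array headers. */
    static long approxBytes(int sentenceLength, int numSymbols) {
        return 8L * sentenceLength * (sentenceLength + 1) * numSymbols;
    }
}
```

For a 40-word sentence and 1,000 symbols, `approxBytes` gives about 13 MB per chart, so whether a dense layout is "small enough" depends mainly on how many symbols the annotated grammar has.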

SLIDE 33
  • 3. Extracting the Best Tree

SLIDE 34

Extracting Tree: Following Backpointers

  • What information do you need to store?

    ○ The information should uniquely identify the previous step of CKY's bottom-up process

  • What happens if we cannot find any valid parse?

    ○ For the purpose of this assignment, you still need to return a Tree object

  • Recursive or iterative method?
  • Don’t forget to debinarize the resulting tree: TreeAnnotations.unAnnotateTree()
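To make the "what to store" question concrete, here is a minimal recursive sketch. It assumes one back pointer per (min, max, symbol) entry that records either the word, the child of a unary rule, or the split point and two children of a binary rule. `BackpointerSketch`, `Back`, and the string-keyed map are hypothetical simplifications: a real implementation would likely keep separate back pointers for the unary and binary charts, use arrays instead of a map, and guard against unary cycles.

```java
import java.util.HashMap;
import java.util.Map;

final class BackpointerSketch {
    /** How the best score for one (min, max, symbol) entry was obtained. */
    static final class Back {
        final String word;   // non-null for a pre-terminal over a single word
        final int mid;       // split point for a binary rule, -1 otherwise
        final String left;   // left child symbol, or the only child of a unary rule
        final String right;  // right child symbol; null for unary rules and words

        private Back(String word, int mid, String left, String right) {
            this.word = word; this.mid = mid; this.left = left; this.right = right;
        }

        static Back word(String w) { return new Back(w, -1, null, null); }
        static Back unary(String child) { return new Back(null, -1, child, null); }
        static Back binary(int mid, String l, String r) { return new Back(null, mid, l, r); }
    }

    static String key(int min, int max, String symbol) {
        return min + ":" + max + ":" + symbol;
    }

    /** Recursively follows back pointers and returns the tree as a bracketed string. */
    static String buildTree(Map<String, Back> table, int min, int max, String symbol) {
        Back b = table.get(key(min, max, symbol));
        if (b == null) {
            throw new IllegalStateException("no back pointer for " + key(min, max, symbol));
        }
        if (b.word != null) return "(" + symbol + " " + b.word + ")";
        if (b.right == null) {
            // Unary rule: the child covers the same span.
            return "(" + symbol + " " + buildTree(table, min, max, b.left) + ")";
        }
        return "(" + symbol + " " + buildTree(table, min, b.mid, b.left)
                + " " + buildTree(table, b.mid, max, b.right) + ")";
    }

    public static void main(String[] args) {
        Map<String, Back> table = new HashMap<>();
        table.put(key(0, 1, "V"), Back.word("saw"));
        table.put(key(0, 1, "VP"), Back.unary("V"));
        System.out.println(buildTree(table, 0, 1, "VP"));
    }
}
```

The recursion depth is bounded by the tree height, so a recursive version is usually fine; an explicit stack is an alternative if very long unary chains are a concern.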

SLIDE 35

buildTree()

SLIDE 37
  • 4. Coarse-to-fine Pruning

SLIDE 38

Extra Credit: Coarse-to-fine Pruning

  • Coarse-to-fine pruning: a more advanced pruning method
  • Idea: prune non-terminals that are not plausible as part of the full tree
  • Define non-plausible as “having small enough posterior probability”
  • This is where the inside-outside algorithm comes into play

[Figure: lattice from START to STOP with the labels PER, ORG, LOC, and O at each position]

SLIDE 39

Extra Credit: Coarse-to-fine Pruning

  • Calculate inside:
  • Calculate outside:
  • Calculate posterior probability: (then prune if this is below a certain threshold)
  • A detailed explanation of these equations is available in the additional note we provide
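The equations from this slide did not survive extraction. For orientation, the standard textbook definitions for a binarized PCFG are given below; the notation is an assumption and may differ from the course's additional note. Here $\beta$ is the inside score, $\alpha$ the outside score, spans are $[i, j)$ over a sentence $w_0 \dots w_{n-1}$, $S$ is the root symbol, and unary rules are left out for brevity.

```latex
\begin{align*}
% Inside: total probability that A yields words i .. j-1
\beta(A, i, i+1) &= P(A \to w_i) \\
\beta(A, i, j)   &= \sum_{A \to B\,C} \; \sum_{k=i+1}^{j-1}
                    P(A \to B\,C)\, \beta(B, i, k)\, \beta(C, k, j) \\[4pt]
% Outside: total probability of everything outside span [i, j) with A at that span
\alpha(S, 0, n)  &= 1 \\
\alpha(A, i, j)  &= \sum_{B \to A\,C} \; \sum_{k=j+1}^{n}
                    P(B \to A\,C)\, \alpha(B, i, k)\, \beta(C, j, k) \\
                 &\quad + \sum_{B \to C\,A} \; \sum_{k=0}^{i-1}
                    P(B \to C\,A)\, \alpha(B, k, j)\, \beta(C, k, i) \\[4pt]
% Posterior of symbol A over span [i, j) given the sentence
P(A, i, j \mid w) &= \frac{\alpha(A, i, j)\, \beta(A, i, j)}{\beta(S, 0, n)}
\end{align*}
```

In coarse-to-fine pruning these quantities are computed with the coarse grammar, and a fine symbol is skipped for a span when the posterior of its coarse projection falls below the threshold.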

SLIDE 40

Some Tips

  • Print the rules after your Markovization (binary rules, unary rules, expanded unary chains, label set)
  • As always, have a small set of sentences and trees that you can process manually, and test on them
  • Getting a very high* F1 or a very fast* decoding time might give extra points

*Thresholds will be decided later

SLIDE 41

Some Tips

  • Use UnaryClosure.getClosedUnaryRulesBy instead of Grammar.getUnaryRulesBy (+ UnaryClosure.getPath to unroll the closed unary rules)
  • Loop order? {max, min, grammar}?
  • How can we reduce the number of grammar accesses?
  • How can we reduce the size of the grammar? (without reducing the order of Markovization)
  • Chart of Objects? Primitives?
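To show what a unary closure and its path lookup provide conceptually, the sketch below computes the best-scoring unary chain between every pair of symbols and can unroll a closed rule back into the chain of symbols it stands for. It is a from-scratch illustration using a Floyd-Warshall style update on log scores; `UnaryClosureSketch` is hypothetical and does not reproduce the API or internals of the provided `UnaryClosure` class.

```java
import java.util.ArrayList;
import java.util.List;

final class UnaryClosureSketch {
    final double[][] best;  // best[p][c]: best log score of any unary chain from p down to c
    final int[][] via;      // via[p][c]: an intermediate symbol on that chain, or -1 if none

    /** ruleScore[p][c] is the log probability of the unary rule p -> c, or -infinity if absent. */
    UnaryClosureSketch(double[][] ruleScore) {
        int n = ruleScore.length;
        best = new double[n][n];
        via = new int[n][n];
        for (int p = 0; p < n; p++) {
            for (int c = 0; c < n; c++) {
                best[p][c] = (p == c) ? 0.0 : ruleScore[p][c];  // empty chain p => p scores log 1
                via[p][c] = -1;
            }
        }
        // Floyd-Warshall with (max, +): allow symbol k as an intermediate step.
        for (int k = 0; k < n; k++) {
            for (int p = 0; p < n; p++) {
                for (int c = 0; c < n; c++) {
                    double throughK = best[p][k] + best[k][c];
                    if (throughK > best[p][c]) {
                        best[p][c] = throughK;
                        via[p][c] = k;
                    }
                }
            }
        }
    }

    /** Unrolls the closed rule p => c into its symbol chain; empty if no chain exists. */
    List<Integer> path(int p, int c) {
        List<Integer> out = new ArrayList<>();
        if (best[p][c] == Double.NEGATIVE_INFINITY) return out;
        out.add(p);
        expand(p, c, out);
        return out;
    }

    private void expand(int p, int c, List<Integer> out) {
        if (p == c) return;
        int k = via[p][c];
        if (k < 0) {
            out.add(c);  // direct rule p -> c
            return;
        }
        expand(p, k, out);
        expand(k, c, out);
    }
}
```

Using closed rules lets the parser apply unaries once per cell instead of iterating to a fixed point, and the path lookup restores the intermediate nodes when the best tree is built.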

SLIDE 42

Some Questions to be Explored

  • Right-branching vs left-branching (vs other?) during binarization?
  • What is the use of parent annotation? What is the use of horizontal Markovization? Any examples showing the benefits/drawbacks?
  • Which grammar should be used as the coarse grammar in coarse-to-fine pruning?

SLIDE 43

Read the References!

  • References:

    ○ Dan Klein and Christopher D. Manning. 2003. Accurate Unlexicalized Parsing. https://nlp.stanford.edu/~manning/papers/unlexicalized-parsing.pdf