SLIDE 1

Collins Parsing

Victor, Yùdōng Zhōu

SLIDE 2

Outline

  • Introduction
  • Basic Model
  • Representation
  • Calculation
  • Three generative models
  • Models
  • Practice issues
  • Evaluation

SLIDE 3

Introduction

  • Michael Collins's PhD thesis, 1999
  • Head-driven (uses lexical information)
  • Statistical, supervised
  • Input: a tagged sentence
  • Output: a phrase-structure tree

SLIDE 4

Basic Model

  • Task: given a sentence, candidate trees, and their probabilities, find the best parse tree
  • In this model T = (B, D), where
  • B = the set of baseNPs
  • D = the set of dependencies

SLIDE 5

Basic Model

An Example

SLIDE 6

Basic Model

  • Dependency set
  • Step one: find the head child
  • E.g. S → <NP VP>
  • Step two: extract the head-modifier relations
  • E.g. NP modifies VP, under the rule S → <NP VP>

SLIDE 7

Basic Model

Notation: AF(j) = (hj, Rj)

  • E.g. AF(1) = (5, <NP, S, VP>),

where w1 = Smith and w5 = announced.

D = {AF(1), AF(2), ..., AF(m)}

P(T|S) = P(B, D|S) = P(B|S) · P(D|S, B)
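The decomposition above can be sketched in Python. Treating the dependencies as conditionally independent given the sentence and the baseNPs, the second factor is a plain product; the function and argument names are mine, for illustration only:

```python
from math import prod

def parse_probability(p_base_nps, dep_probs):
    # P(T|S) = P(B|S) * P(D|S,B). Assuming the dependencies in D are
    # conditionally independent, P(D|S,B) is the product of the
    # individual dependency probabilities P(AF(j)).
    return p_base_nps * prod(dep_probs)
```
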

SLIDE 8

Basic Model

  • Calculation
  • Dependency Probability: Training data

SLIDE 9

Basic Model

  • Calculation
  • Dependency Probability: Training data
  • Distance Measure


  • E.g. the distance measure reduces the probability of the second “of” attachment:

Shaw, based in Dalton, Ga., has annual sales of about $1.18 billion, and has economies of scale and lower raw-material costs that are expected to boost the profitability of Armstrong's brands, sold under the Armstrong and Evans-Black names .
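A minimal sketch of two of the distance features used here (adjacency, and whether a verb intervenes between head and modifier); the function name and index convention are assumptions:

```python
def distance_features(head_idx, mod_idx, verb_positions):
    # Two distance features: is the modifier adjacent to the head, and
    # does a verb fall between them? Indices are word positions.
    lo, hi = sorted((head_idx, mod_idx))
    adjacent = (hi - lo == 1)
    verb_between = any(lo < v < hi for v in verb_positions)
    return adjacent, verb_between
```
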

SLIDE 10

Basic Model

  • Calculation
  • Dependency Probability: Training data
  • Distance Measure
  • Sparse Data
  • Solved by smoothing

SLIDE 11

Three Generative Models


  • Generative model: models the joint probability distribution P(T, S)
  • Discriminative model: models the conditional distribution P(T|S)
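Since P(S) is fixed for a given sentence, ranking trees by the joint P(T,S) gives the same argmax as ranking by P(T|S). A toy check (the numbers are made up):

```python
# Joint scores P(T,S) for two candidate trees of the same sentence.
joint = {"tree1": 0.030, "tree2": 0.012}
p_sentence = sum(joint.values())            # P(S) = sum over trees of P(T,S)
conditional = {t: p / p_sentence for t, p in joint.items()}

best_joint = max(joint, key=joint.get)
best_conditional = max(conditional, key=conditional.get)
assert best_joint == best_conditional        # same argmax either way
```
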
SLIDE 12

Model 1

Representation:

SLIDE 13

Model 1

  • Calculation: PCFG-based
  • Generate the head, Ph(H|P,h)
  • Generate right modifiers, Pr(Ri(ri)|P,h,H)
  • Until the STOP symbol: Rm+1(rm+1) = STOP
  • Generate left modifiers, Pl(Li(li)|P,h,H)
  • Example:
  • S(bought) → NP(week) NP(Marks) VP(bought)
  • Ph(VP|S,bought)
    × Pl(NP(Marks)|S,VP,bought)
    × Pl(NP(week)|S,VP,bought)
    × Pl(STOP|S,VP,bought)
    × Pr(STOP|S,VP,bought)
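The Model 1 example above as a product of parameters. The numeric values are illustrative placeholders, not trained estimates:

```python
from math import prod

# Head-outward generation for S(bought) -> NP(week) NP(Marks) VP(bought).
# Each factor mirrors one step of the derivation; the values are made up.
factors = [
    0.70,  # Ph(VP | S, bought): generate the head child
    0.20,  # Pl(NP(Marks) | S, VP, bought): first left modifier
    0.10,  # Pl(NP(week)  | S, VP, bought): second left modifier
    0.90,  # Pl(STOP | S, VP, bought): stop generating on the left
    0.95,  # Pr(STOP | S, VP, bought): no right modifiers at all
]
rule_prob = prod(factors)
```
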

SLIDE 14

Model 1

  • Calculation
  • Distance measure
  • In the previous formula: Pr(Ri(ri)|P,h,H,R1(r1),...,Ri-1(ri-1)) = Pr(Ri(ri)|P,h,H)
  • With the distance measure: Pr(Ri(ri)|P,h,H,distancer(i-1))

SLIDE 15

Model 2

  • Complement/adjunct distinction
  • Reasons for doing this while parsing:
  • Lexical information/additional knowledge is needed
  • It helps parsing accuracy

SLIDE 16

Model 2

  • Identifying complements in the Penn Treebank
  • Rule-based
  • An incorrect example:
  • How do we get the correct parse?

SLIDE 17

Model 2

  • Subcategorisation frames
  • Generate the head, Ph(H|P,h)
  • Generate left and right subcat frames LC and RC: Plc(LC|P,H,h) and Prc(RC|P,H,h)
  • Generate right modifiers (and then left modifiers): Pr(Ri(ri)|P,h,H,distancer(i-1),RC) ...

SLIDE 18

Model 2

  • Subcat frames: example
  • S(bought) → NP(week) NP-C(Marks) VP(bought)
  • Ph(VP|S,bought)
    × Plc({NP-C}|S,VP,bought)
    × Prc({}|S,VP,bought)
    × Pl(NP-C(Marks)|S,VP,bought,{NP-C})
    × Pl(NP(week)|S,VP,bought,{})
    × Pl(STOP|S,VP,bought,{})
    × Pr(STOP|S,VP,bought,{})
  • Plc({NP-C,NP-C}|S,VP,bought) would be quite small, so the correct parse is achieved
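A sketch of how the subcat frame constrains generation in Model 2: complements are removed from the frame as they are generated, and STOP is only acceptable once the frame is empty. Function and label names are mine:

```python
def step(subcat, label):
    # subcat is the multiset of complements still required (e.g. ["NP-C"]).
    # Returns the updated frame and whether this generation step is
    # consistent with it (in the real model, inconsistent steps simply
    # get very low probability rather than being forbidden outright).
    subcat = list(subcat)
    if label == "STOP":
        return subcat, len(subcat) == 0   # STOP only once the frame is empty
    if label.endswith("-C"):
        if label not in subcat:
            return subcat, False          # complement the frame didn't ask for
        subcat.remove(label)
    return subcat, True                    # adjuncts don't touch the frame
```
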

SLIDE 19

Model 3

  • Traces and wh-movement
  • Example 1: The store (SBAR which TRACE bought Brooks Brothers)
  • Example 2: The store (SBAR which Marks bought TRACE)
  • Example 3: The store (SBAR which Marks bought Brooks Brothers from TRACE)

SLIDE 20

Model 3

  • +gap feature added
  • Introduce the parameter Pg(G|P,h,H), where G is Head, Left, or Right
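A sketch of the gap-passing choice in Model 3: a +gap parent hands its gap to exactly one child, chosen as Head, Left, or Right with probability Pg(G|P,h,H). Here the gap goes to the first modifier on the chosen side purely for simplicity; the names and marking scheme are mine:

```python
def propagate_gap(choice, head, left_mods, right_mods):
    # choice is "Head", "Left", or "Right". The gap marker is threaded
    # down to one child; eventually it is discharged as a TRACE.
    if choice == "Head":
        return head + "+gap", left_mods, right_mods
    if choice == "Left":
        return head, [left_mods[0] + "+gap"] + left_mods[1:], right_mods
    return head, left_mods, [right_mods[0] + "+gap"] + right_mods[1:]
```
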

SLIDE 21

Practice Issues

  • Smoothing
  • E.g. for Ph, back off from e1 = Ph(H|P,h) through e2 = Ph(H|P,t) to e3 = Ph(H|P)
  • Final estimation: an interpolation of these backoff estimates
  • Unknown words
SLIDE 22

Evaluation

  • Training data:
  • Sections 02-21 of the Wall Street Journal portion of the Penn Treebank
  • (approximately 40,000 sentences)
  • Testing data:
  • Section 23 (2,416 sentences)

SLIDE 23

Evaluation

  • PARSEVAL measures
  • Labelled precision = (number of correct constituents in the proposed parse) / (number of constituents in the proposed parse)
  • Labelled recall = (number of correct constituents in the proposed parse) / (number of constituents in the treebank parse)
  • Crossing brackets = number of constituents that violate constituent boundaries with a constituent in the treebank parse
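The three PARSEVAL measures computed over labelled spans; representing constituents as (label, start, end) tuples is my choice of encoding:

```python
def parseval(proposed, gold):
    # Constituents are (label, start, end) spans over word positions.
    correct = len(set(proposed) & set(gold))
    precision = correct / len(proposed)
    recall = correct / len(gold)

    def crosses(a, b):
        # Spans overlap without either containing the other.
        return a[1] < b[1] < a[2] < b[2] or b[1] < a[1] < b[2] < a[2]

    crossing = sum(any(crosses(p, g) for g in gold) for p in proposed)
    return precision, recall, crossing
```
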

SLIDE 24

Evaluation

  • Collins 96 vs. Model 1
  • Model 1 is better on unary rules and distance measures
  • Model 2 vs. Model 3
  • For the 436 trace cases in the testing data, Model 3 achieves precision/recall of 93.3%/90.1%

SLIDE 25

Q&A
