Basic Parsing Algorithms Chart Parsing Seminar Recent Advances in - - PowerPoint PPT Presentation


Basic Parsing Algorithms Chart Parsing Seminar Recent Advances in Parsing Technology WS 2011/2012 Anna Schmidt Talk Outline Chart Parsing Basics Chart Parsing Algorithms Earley Algorithm CKY Algorithm Basics


slide-1
SLIDE 1

Basic Parsing Algorithms – Chart Parsing

Seminar Recent Advances in Parsing Technology WS 2011/2012

Anna Schmidt

slide-2
SLIDE 2

Talk Outline

 Chart Parsing – Basics  Chart Parsing – Algorithms

– Earley Algorithm – CKY Algorithm → Basics → BitPar: Efficient Implementation of CKY

slide-3
SLIDE 3

Chart Parsing – Basics

slide-4
SLIDE 4

Chart Parsing – Basics

 First proposed by Martin Kay  Dynamic programming approach

– Partial results of the computation are stored and (re)used later if needed → Same problem is not solved more than once

 Operates on a CFG  Functionality: Recogniser / Parser

… in this talk focus on recogniser functionality

slide-5
SLIDE 5

Main Components

 Chart  Edges  Agenda

slide-6
SLIDE 6

Component: Chart

 Is a well-formed substring table (WFST)

– Stores partial and complete analyses of substrings – Information stored in one triangular half of a two-dimensional array of size (n+1)*(n+1) or n*n

 Can also be understood as a (directed) graph

– Vertices: positions between input words

0 Mary 1 feeds 2 the 3 otter 4

– Edges connecting vertices

 Allows no duplicate entries

slide-7
SLIDE 7

Component: Edge

 Data structure storing information about a

particular step in the parsing process

 Inhabit cells of the chart  Contain

– Start and end position in input string – A dotted rule – Can also contain edge probability

slide-8
SLIDE 8

Component: Edge

 A dotted rule consists of

– Left hand side (LHS) = non-terminal symbol – Right hand side (RHS) = sequence of non-terminal or terminal symbols – A dot between RHS symbols indicating which constituents have already been found

 Edges can be

– Active / incomplete: dot is not the last element of RHS – Inactive / complete: dot is the last element of RHS

 Example: S → NP • VP (0,1)
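The edge described above can be sketched as a small immutable record. This is only an illustration: the class name `Edge`, its field names and the helper methods are assumptions made for this example and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)  # frozen makes edges hashable, so a set can reject duplicates
class Edge:
    lhs: str                # left-hand side non-terminal
    rhs: Tuple[str, ...]    # right-hand side symbols
    dot: int                # number of RHS symbols already found
    start: int              # start position in the input string
    end: int                # end position in the input string

    def is_complete(self) -> bool:
        # Inactive / complete: the dot is the last element of the RHS
        return self.dot == len(self.rhs)

    def next_symbol(self) -> Optional[str]:
        # Symbol directly after the dot, or None for a complete edge
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self) -> str:
        parts = list(self.rhs[:self.dot]) + ["•"] + list(self.rhs[self.dot:])
        return f"{self.lhs} → {' '.join(parts)} ({self.start},{self.end})"


edge = Edge("S", ("NP", "VP"), dot=1, start=0, end=1)
print(edge)                # S → NP • VP (0,1)
print(edge.is_complete())  # False
```

An edge probability could be added as one more field if a probabilistic agenda ordering is wanted.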

slide-9
SLIDE 9

Component: Agenda

 Organises the order in which tasks are

executed

 Here all tasks (edges) are collected before

being put on the chart

 Ordering of agenda determines what is

processed first → Therefore also which parse is found first

– Queue, stack, ordering with respect to probabilities, …

slide-10
SLIDE 10

Parsing Strategies

 Kay differentiates parsing strategies along two dimensions:

– Bottom-up versus top-down – Directed versus undirected

 Directed bottom-up

– Only build edges for phrases that can actually be incorporated into a higher level structure → Left-Corner Parser

 Directed top-down

– Only build a new (active) edge if the next word of the input can be used to extend such an edge → Earley

 Undirected varieties: No such restrictions

→ Undirected Bottom-Up: CKY

slide-11
SLIDE 11

Parsing Strategies

Ways of achieving directedness:

 Reachability Table:

– Contains for each non-terminal N the set of all symbols that can be the first element of a string dominated by N – For example: NP can start with DET, N, ADJ, but not with V

 Rule selection table:

– M*N table where M = non-terminals excluding pre-terminals, N = all non-terminals – Contains all grammar rules applicable in a situation where M is the 'upper' and N is the 'lower' symbol
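A reachability table of the kind described above can be computed as the transitive closure of the "first RHS symbol" relation. The sketch below assumes an ε-free grammar stored as a dictionary; the name `reachability` and the toy grammar (the phrase-structure rules of the running example, without lexical rules) are illustrative choices, not part of the slides.

```python
# Toy grammar: LHS -> list of right-hand sides (pre-terminals left unexpanded)
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("N",), ("DET", "N")],
    "VP": [("V", "NP")],
}


def reachability(grammar):
    """For each non-terminal, collect all symbols that can start a string it dominates."""
    # Start with the direct left corners (first symbol of every RHS)
    table = {lhs: {rhs[0] for rhs in rules} for lhs, rules in grammar.items()}
    changed = True
    while changed:  # repeat until no set grows any more
        changed = False
        for firsts in table.values():
            for sym in list(firsts):
                new = table.get(sym, set()) - firsts
                if new:
                    firsts |= new
                    changed = True
    return table


table = reachability(GRAMMAR)
print(sorted(table["NP"]))   # ['DET', 'N']
print("V" in table["NP"])    # False
```

A directed parser would consult such a table before building an edge, for instance to skip NP predictions when the next word is a verb.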

slide-12
SLIDE 12

Chart Parsing: Advantages

 No repeated computation of same subproblem  Deals well with left-recursive grammars  Deals well with ambiguity  No backtracking necessary

slide-13
SLIDE 13

Earley Algorithm

slide-14
SLIDE 14

Earley Algorithm

 Proposed by Jay Earley  Top down search  Can handle all CFGs  Efficient:

– O(n³) in the general case – Faster for particular types of grammar

slide-15
SLIDE 15

Terminology

 In his paper, Earley does not use the notion of a

'chart'

 He represents the parsing process as

sets of states

– Index of each state set = end position of all states in the set – A state largely corresponds to an edge

  • Contains dotted rule
  • Pointer to start position
  • End position can be derived from state set
slide-16
SLIDE 16

Terminology

 Formalisms are very similar  Examples easier to follow when represented in

charts

 So we will stick with 'chart' representations

slide-17
SLIDE 17

Algorithm – Components

 Initialization  Predictor  Scanner  Completer  Algorithm operates on one half of an array of

size (n+1)*(n+1)

slide-18
SLIDE 18

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Initialise (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos

slide-19
SLIDE 19

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Predict (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the

slide-20
SLIDE 20

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Scan (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •

slide-21
SLIDE 21

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Complete (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP

slide-22
SLIDE 22

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Predict (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[1,1]: VP → • V NP; V → • feeds

slide-23
SLIDE 23

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Scan (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •

slide-24
SLIDE 24

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Complete (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •; VP → V • NP

slide-25
SLIDE 25

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Predict (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •; VP → V • NP
[2,2]: NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the

slide-26
SLIDE 26

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Scan (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •; VP → V • NP
[2,2]: NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[2,3]: DET → the •

slide-27
SLIDE 27

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Complete (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •; VP → V • NP
[2,2]: NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[2,3]: DET → the •; NP → DET • N

slide-28
SLIDE 28

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Predict (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •; VP → V • NP
[2,2]: NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[2,3]: DET → the •; NP → DET • N
[3,3]: N → • Mary; N → • otter

slide-29
SLIDE 29

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Scan (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •; VP → V • NP
[2,2]: NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[2,3]: DET → the •; NP → DET • N
[3,3]: N → • Mary; N → • otter
[3,4]: N → otter •

slide-30
SLIDE 30

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Complete (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •; VP → V • NP
[2,2]: NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[2,3]: DET → the •; NP → DET • N
[2,4]: NP → DET N •
[3,3]: N → • Mary; N → • otter
[3,4]: N → otter •

slide-31
SLIDE 31

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Complete (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •; VP → V • NP
[1,4]: VP → V NP •
[2,2]: NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[2,3]: DET → the •; NP → DET • N
[2,4]: NP → DET N •
[3,3]: N → • Mary; N → • otter
[3,4]: N → otter •

slide-32
SLIDE 32

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Complete (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[0,4]: S → NP VP •
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •; VP → V • NP
[1,4]: VP → V NP •
[2,2]: NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[2,3]: DET → the •; NP → DET • N
[2,4]: NP → DET N •
[3,3]: N → • Mary; N → • otter
[3,4]: N → otter •

slide-33
SLIDE 33

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Complete (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[0,4]: S → NP VP •; X → S • eos
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •; VP → V • NP
[1,4]: VP → V NP •
[2,2]: NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[2,3]: DET → the •; NP → DET • N
[2,4]: NP → DET N •
[3,3]: N → • Mary; N → • otter
[3,4]: N → otter •

slide-34
SLIDE 34

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Predict (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[0,4]: S → NP VP •; X → S • eos
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •; VP → V • NP
[1,4]: VP → V NP •
[2,2]: NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[2,3]: DET → the •; NP → DET • N
[2,4]: NP → DET N •
[3,3]: N → • Mary; N → • otter
[3,4]: N → otter •
[4,4]: eos → • eos

slide-35
SLIDE 35

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Scan (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[0,4]: S → NP VP •; X → S • eos
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •; VP → V • NP
[1,4]: VP → V NP •
[2,2]: NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[2,3]: DET → the •; NP → DET • N
[2,4]: NP → DET N •
[3,3]: N → • Mary; N → • otter
[3,4]: N → otter •
[4,4]: eos → • eos
[4,5]: eos → eos •

slide-36
SLIDE 36

0 Mary 1 feeds 2 the 3 otter 4 eos 5

Chart after Complete (cell [start,end]: edges; unlisted cells are empty):

[0,0]: X → • S eos; S → • NP VP; NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[0,1]: N → Mary •; NP → N •; S → NP • VP
[0,4]: S → NP VP •; X → S • eos
[0,5]: X → S eos •
[1,1]: VP → • V NP; V → • feeds
[1,2]: V → feeds •; VP → V • NP
[1,4]: VP → V NP •
[2,2]: NP → • N; NP → • DET N; N → • Mary; N → • otter; DET → • the
[2,3]: DET → the •; NP → DET • N
[2,4]: NP → DET N •
[3,3]: N → • Mary; N → • otter
[3,4]: N → otter •
[4,4]: eos → • eos
[4,5]: eos → eos •

slide-37
SLIDE 37

Lookahead Component

 In original paper, Earley proposes the use of a

lookahead string for each state which represents the allowed successor for LHS

 Prevents completer from processing a state if

lookahead string and next word of input do not match → Remember Kay's directed top-down strategy?

slide-38
SLIDE 38

CKY: Basics

slide-39
SLIDE 39

CKY Basics

 Proposed by John Cocke, Daniel H. Younger, and

Tadao Kasami (independently)

 Bottom-up search  Incremental  Grammar must be in Chomsky normal form (CNF)  Complexity O(n³)  Chart: (upper triangle of) array of size n*n

slide-40
SLIDE 40

CKY Algorithm: Idea

 Initialise upper triangle of a chart of size n*n  From upper left to lower right corner of chart:

Go to the next cell in the diagonal

– Fill in POS tag of next word in input string – Each time a POS tag has been filled in, go up cell by cell and build larger constituents that end at the current end position
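The procedure above can be sketched as a small recogniser. This is a minimal illustration under stated assumptions: the function names `close_unary`, `cky_chart` and `recognise` are invented for this example, cell [i][j] is taken to span words i to j, and the unit rule NP → N of the example grammar is handled by a simple closure step (a strict CNF grammar would not contain it).

```python
LEXICON = {"Mary": {"N"}, "otter": {"N"}, "feeds": {"V"}, "the": {"DET"}}
UNARY = {"N": {"NP"}}  # NP → N (unit rule of the example grammar)
BINARY = {("NP", "VP"): {"S"}, ("DET", "N"): {"NP"}, ("V", "NP"): {"VP"}}


def close_unary(cats):
    """Add every category reachable through unit rules such as NP → N."""
    result, todo = set(cats), list(cats)
    while todo:
        for parent in UNARY.get(todo.pop(), ()):
            if parent not in result:
                result.add(parent)
                todo.append(parent)
    return result


def cky_chart(words):
    n = len(words)
    # chart[i][j] holds the categories spanning words i..j (1-based, inclusive)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for j in range(1, n + 1):                  # next cell on the diagonal
        chart[j][j] = close_unary(LEXICON.get(words[j - 1], set()))
        for i in range(j - 1, 0, -1):          # go up cell by cell
            found = set()
            for k in range(i, j):              # split into words i..k and k+1..j
                for left in chart[i][k]:
                    for right in chart[k + 1][j]:
                        found |= BINARY.get((left, right), set())
            chart[i][j] = close_unary(found)
    return chart


def recognise(words, start="S"):
    return bool(words) and start in cky_chart(words)[1][len(words)]


print(recognise("Mary feeds the otter".split()))  # True
```

The three nested loops over end position, start position and split point are where the cubic complexity comes from.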

slide-41
SLIDE 41

(Empty chart with rows 1 to 4 and columns 1 to 4)

slide-42
SLIDE 42

0 Mary 1 feeds 2 the 3 otter 4

Chart (rows 1 to 4, columns 1 to 4): all cells empty

Grammar: S → NP VP; NP → N; NP → DET N; VP → V NP; N → Mary | otter; V → feeds; DET → the

slide-43
SLIDE 43

0 Mary 1 feeds 2 the 3 otter 4

Chart (cell [i,j] spans words i to j; unlisted cells are empty):

[1,1]: N, NP

Grammar: S → NP VP; NP → N; NP → DET N; VP → V NP; N → Mary | otter; V → feeds; DET → the

slide-44
SLIDE 44

0 Mary 1 feeds 2 the 3 otter 4

Chart (cell [i,j] spans words i to j; unlisted cells are empty):

[1,1]: N, NP
[2,2]: V

Grammar: S → NP VP; NP → N; NP → DET N; VP → V NP; N → Mary | otter; V → feeds; DET → the

slide-45
SLIDE 45

0 Mary 1 feeds 2 the 3 otter 4

Chart (cell [i,j] spans words i to j; unlisted cells are empty):

[1,1]: N, NP
[2,2]: V

Grammar: S → NP VP; NP → N; NP → DET N; VP → V NP; N → Mary | otter; V → feeds; DET → the

slide-46
SLIDE 46

0 Mary 1 feeds 2 the 3 otter 4

Chart (cell [i,j] spans words i to j; unlisted cells are empty):

[1,1]: N, NP
[2,2]: V
[3,3]: DET

Grammar: S → NP VP; NP → N; NP → DET N; VP → V NP; N → Mary | otter; V → feeds; DET → the

slide-47
SLIDE 47

0 Mary 1 feeds 2 the 3 otter 4

Chart (cell [i,j] spans words i to j; unlisted cells are empty):

[1,1]: N, NP
[2,2]: V
[3,3]: DET

Grammar: S → NP VP; NP → N; NP → DET N; VP → V NP; N → Mary | otter; V → feeds; DET → the

slide-48
SLIDE 48

0 Mary 1 feeds 2 the 3 otter 4

Chart (cell [i,j] spans words i to j; unlisted cells are empty):

[1,1]: N, NP
[2,2]: V
[3,3]: DET

Grammar: S → NP VP; NP → N; NP → DET N; VP → V NP; N → Mary | otter; V → feeds; DET → the

slide-49
SLIDE 49

0 Mary 1 feeds 2 the 3 otter 4

Chart (cell [i,j] spans words i to j; unlisted cells are empty):

[1,1]: N, NP
[2,2]: V
[3,3]: DET
[4,4]: N, NP

Grammar: S → NP VP; NP → N; NP → DET N; VP → V NP; N → Mary | otter; V → feeds; DET → the

slide-50
SLIDE 50

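0 Mary 1 feeds 2 the 3 otter 4

Chart (cell [i,j] spans words i to j; unlisted cells are empty):

[1,1]: N, NP
[2,2]: V
[3,3]: DET
[3,4]: NP
[4,4]: N, NP

Grammar: S → NP VP; NP → N; NP → DET N; VP → V NP; N → Mary | otter; V → feeds; DET → the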

slide-51
SLIDE 51

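0 Mary 1 feeds 2 the 3 otter 4

Chart (cell [i,j] spans words i to j; unlisted cells are empty):

[1,1]: N, NP
[2,2]: V
[2,4]: VP
[3,3]: DET
[3,4]: NP
[4,4]: N, NP

Grammar: S → NP VP; NP → N; NP → DET N; VP → V NP; N → Mary | otter; V → feeds; DET → the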

slide-52
SLIDE 52

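0 Mary 1 feeds 2 the 3 otter 4

Chart (cell [i,j] spans words i to j; unlisted cells are empty):

[1,1]: N, NP
[1,4]: S
[2,2]: V
[2,4]: VP
[3,3]: DET
[3,4]: NP
[4,4]: N, NP

Grammar: S → NP VP; NP → N; NP → DET N; VP → V NP; N → Mary | otter; V → feeds; DET → the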

slide-53
SLIDE 53

CKY: BitPar

slide-54
SLIDE 54

BitPar: Basics

 Proposed by Helmut Schmid  Bit-vector-based parser  Efficiently implements a CKY-style algorithm  Uses bit vector operations to parallelise parsing operations

 Idea:

Don't try to decrease number of edges that are built, instead minimise cost of building edges

 Especially useful if all analyses are needed

slide-55
SLIDE 55

BitPar: Requirements

 Restrictions on Context Free Grammar

– Must be in CNF – Must be ε-free – Chain rules allowed

 Precomputed for each non-terminal N:

– Set of non-terminals that are derivable from N via chain rules (read bottom-up: with the chain rule NP → N, the set for N contains NP) – Set is stored in the bit vector chainvec[N] – Set includes N itself

slide-56
SLIDE 56

Background: Bitwise AND and OR

  • AND

0101 & 0011 = 0001
Both corresponding bits must equal 1

  • OR

0101 | 0011 = 0111
At least one of the corresponding bits must equal 1
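The two operations can be checked directly with Python integers. The variable names are arbitrary and only serve this illustration.

```python
a, b = 0b0101, 0b0011

# format(..., "04b") prints the result as a 4-digit binary string
print(format(a & b, "04b"))  # 0001
print(format(a | b, "04b"))  # 0111
```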

slide-57
SLIDE 57

BitPar: Chart

 Chart = three-dimensional bit array

chart [start position b] [end position e] = [011000...]

 [b] [e] contains a bit vector with one bit for each

non-terminal

– Bit is set to 1 if non-terminal was inserted – 0 otherwise

 Chart initialised with all bits = 0

slide-58
SLIDE 58

Filling the Chart: POS Tags

Inserting POS tags into a cell of the diagonal:

 For each non-terminal N that can be rewritten

as the word at the current position Do a bitwise OR of

– Bits inhabiting the chart cell – chainvec[N]

→ N and all its chain derivations are inserted in just one operation
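A minimal sketch of this step, assuming Python integers as bit vectors and the bit order S, NP, N, VP, V, DET used in the chart examples. The names `build_chainvec` and `insert_pos_tags` and the data layout are assumptions made for illustration, not BitPar's actual implementation.

```python
SYMBOLS = ["S", "NP", "N", "VP", "V", "DET"]
# Leftmost bit of the printed vector is S, rightmost is DET
BIT = {s: 1 << (len(SYMBOLS) - 1 - i) for i, s in enumerate(SYMBOLS)}
CHAIN_RULES = [("NP", "N")]  # NP → N
LEXICON = {"Mary": ["N"], "otter": ["N"], "feeds": ["V"], "the": ["DET"]}


def build_chainvec():
    """Precompute, for each symbol, itself plus everything built from it by chain rules."""
    chainvec = {s: BIT[s] for s in SYMBOLS}   # each set includes the symbol itself
    changed = True
    while changed:
        changed = False
        for parent, child in CHAIN_RULES:
            # whatever can be built from the parent can also be built from the child
            updated = chainvec[child] | chainvec[parent]
            if updated != chainvec[child]:
                chainvec[child] = updated
                changed = True
    return chainvec


def insert_pos_tags(words):
    n = len(words)
    chart = [[0] * (n + 1) for _ in range(n + 1)]   # all bits initialised to 0
    chainvec = build_chainvec()
    for pos, word in enumerate(words, start=1):
        for tag in LEXICON.get(word, ()):
            chart[pos][pos] |= chainvec[tag]        # one OR inserts tag and chain derivations
    return chart


chart = insert_pos_tags("Mary feeds the otter".split())
print(format(chart[1][1], "06b"))  # 011000
```

For "Mary", a single OR with chainvec["N"] sets both the N bit and the NP bit.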

slide-59
SLIDE 59

Mary feeds the otter

Bit order in each vector: S, NP, N, VP, V, DET

     1       2       3       4
1  011000  000000  000000  000000
2  000000  000010  000000  000000
3  000000  000000  000000  000000
4  000000  000000  000000  000000

slide-60
SLIDE 60

Mary feeds the otter

Bit order in each vector: S, NP, N, VP, V, DET

     1       2       3       4
1  011000  000000  000000  000000
2  000000  000010  000000  000000
3  000000  000000  000000  000000
4  000000  000000  000000  000000

Cell in question (?): [1] [2]

slide-61
SLIDE 61

Filling the Chart: Larger Constituents

Conceptually:

 Determine if several cells can be combined to

form a higher level constituent labeled N

 For this:

Loop over grammar rules with LHS = N, extract RHS (consisting of RHS1, RHS2)

 Loop over all possible combinations of cells that

together could contain the substructure of N and determine whether they contain RHS1 and RHS2 respectively

slide-62
SLIDE 62

Filling the Chart: Larger Constituents

 This has to be done

– For each super-diagonal cell – For each non-terminal – For all corresponding grammar rules – For all possible cell combinations that could constitute a substructure of N

 This is a time-consuming process  BUT: The same functionality can be achieved

by a single AND operation on two bit vectors

slide-63
SLIDE 63

Filling the Chart: Larger Constituents

Internally:

 Can a given non-terminal LHS be inserted into a

given chart cell [b] [e]?

 Get RHS1, RHS2 from grammar  Vector 1

Contains the bits stored in

chart [ b ] [ k ] [ RHS1 ] for k = b, b+1, …, e-1

 Vector 2

Contains the bits stored in

chart [ k+1 ] [ e ] [ RHS2 ] for the same k, i.e. start positions b+1, b+2, …, e

slide-64
SLIDE 64

Filling the Chart: Larger Constituents

 If a bitwise AND operation on the two new

vectors produces at least one bit = 1

– A valid substructure for LHS has been found – LHS can be inserted into the chart cell

 Let's look at an example

slide-65
SLIDE 65

Mary feeds the otter

Example: Let's determine if NP should go into cell [3] [4].

Bit order in each vector: S, NP, N, VP, V, DET

     1       2       3       4
1  011000  000000  000000  000000
2  000000  000010  000000  000000
3  000000  000000  000001  000000
4  000000  000000  000000  011000

Cell in question (?): [3] [4]

slide-66
SLIDE 66

Should NP go into [3] [4]?

 First, we consult the grammar

We find a rule NP → DET N, so allowed right-hand sides for NP are RHS1 = DET RHS2 = N

 Reminder: Rules

v1 = chart [ b ] [ k ] [ RHS1 ] and v2 = chart [ k+1 ] [ e ] [ RHS2 ] for k = b, …, e-1 (here b = 3 and e = 4, so only k = 3)

 Vector1 = 1

chart [3] [3] = RHS1 = DET? → yes, so insert 1

 Vector2 = 1

chart [4] [4] = RHS2 = N? → yes, so insert 1

 Vector1 AND Vector2 = 1, so insert NP
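The worked example can be replayed in a few lines. This is only a sketch: `split_vector` and `can_insert` are hypothetical helpers that assemble the two vectors bit by bit, which shows the logic but not the memory layout that makes the single AND cheap in an efficient implementation.

```python
SYMBOLS = ["S", "NP", "N", "VP", "V", "DET"]
# Leftmost bit of the printed vector is S, rightmost is DET
BIT = {s: 1 << (len(SYMBOLS) - 1 - i) for i, s in enumerate(SYMBOLS)}


def split_vector(chart, b, e, symbol, left):
    """Collect one bit per split point k = b .. e-1 (hypothetical helper)."""
    vec = 0
    for k in range(b, e):
        cell = chart[b][k] if left else chart[k + 1][e]
        vec = (vec << 1) | (1 if cell & BIT[symbol] else 0)
    return vec


def can_insert(chart, b, e, rhs1, rhs2):
    """True if some split point has RHS1 on the left and RHS2 on the right."""
    v1 = split_vector(chart, b, e, rhs1, left=True)
    v2 = split_vector(chart, b, e, rhs2, left=False)
    return (v1 & v2) != 0


n = 4
chart = [[0] * (n + 1) for _ in range(n + 1)]
chart[1][1] = 0b011000   # Mary:  NP, N
chart[2][2] = 0b000010   # feeds: V
chart[3][3] = 0b000001   # the:   DET
chart[4][4] = 0b011000   # otter: NP, N

if can_insert(chart, 3, 4, "DET", "N"):   # rule NP → DET N
    chart[3][4] |= BIT["NP"]
print(format(chart[3][4], "06b"))         # 010000
```

With NP now in cell [3] [4], the same check for VP → V NP succeeds for cell [2] [4].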

slide-67
SLIDE 67

Mary feeds the otter

Example: Let's determine if NP should go into cell [3] [4]. → Yes!

Bit order in each vector: S, NP, N, VP, V, DET

     1       2       3       4
1  011000  000000  000000  000000
2  000000  000010  000000  000000
3  000000  000000  000001  010000
4  000000  000000  000000  011000

slide-68
SLIDE 68

Thank you for your attention!

slide-69
SLIDE 69

References

 Earley, Jay: An efficient context-free parsing algorithm. Communications of the ACM, 13(2):94–102, 1970.

 Jurafsky, Daniel and Martin, James H.: Speech and Language Processing: An Introduction to Natural Language Processing, Speech Recognition, and Computational Linguistics. 2nd edition. Prentice-Hall, 2009.

 Kay, Martin: Algorithm schemata and data structures in syntactic processing. In Readings in natural language processing, pages 35–70. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1986.

 Kay, Martin: Lecture Slides of the Course 'Basic Algorithms for Computational Linguistics'. http://www.coli.uni-saarland.de/courses/algorithms-11/

 Schmid, Helmut: Efficient Parsing of Highly Ambiguous Context-Free Grammars with Bit Vectors. In Proceedings of Coling 2004, pages 162–168, Geneva, Switzerland, 2004.

 Wirén, Mats: A Comparison of Rule-Invocation Strategies in Context-Free Chart Parsing

slide-70
SLIDE 70

Initialization

 introduces a new non-terminal start symbol X

and a new end symbol EOS

 adds EOS to the end of the input string  for each root symbol R of the grammar:

add to the chart[0,0] an edge of the form:

X → • R EOS

slide-71
SLIDE 71

Predictor

 for all non-terminals N directly following a dot

(in the current state set):

 and for each grammar rule with N as LHS:

add a new edge with

– LHS = N – RHS according to grammar, with the dot as first element of RHS – start and end position = end position of the original edge

slide-72
SLIDE 72

Scanner

for all terminal symbols immediately following a dot:

 compare terminal symbol with input string

starting at end position of current edge

 if they match: add new edge to the chart with

– dot moved over the terminal symbol – end position incremented by 1

slide-73
SLIDE 73

Completer

If the dot is last element of a production with LHS of type T

 find edges that

– are still waiting for a constituent of the type T – end where the complete edge is starting

 Add to the chart an edge with

– dot moved over T – start position = start position of the waiting edge – end position = end position of completed edge
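Initialization, Predictor, Scanner and Completer can be combined into a compact recogniser. The sketch below makes several assumptions that are not in the slides: edges are plain tuples (lhs, rhs, dot, start) kept in state sets indexed by end position, each state set doubles as a queue agenda, the grammar is ε-free, and eos is treated as an ordinary terminal that is scanned directly instead of being predicted as eos → • eos. The function name `earley_recognise` is invented for this example.

```python
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("N",), ("DET", "N")],
    "VP": [("V", "NP")],
    "N": [("Mary",), ("otter",)],
    "V": [("feeds",)],
    "DET": [("the",)],
}


def earley_recognise(words, grammar=GRAMMAR, root="S"):
    tokens = list(words) + ["eos"]          # add EOS to the end of the input
    n = len(tokens)
    # chart[end] is the state set of edges (lhs, rhs, dot, start) ending at `end`
    chart = [[] for _ in range(n + 1)]

    def add(end, edge):
        if edge not in chart[end]:          # the chart allows no duplicate entries
            chart[end].append(edge)

    add(0, ("X", (root, "eos"), 0, 0))      # Initialization: X → • R EOS
    for end in range(n + 1):
        i = 0
        while i < len(chart[end]):          # process edges in the order they were added
            lhs, rhs, dot, start = chart[end][i]
            i += 1
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in grammar:          # Predictor: non-terminal after the dot
                    for new_rhs in grammar[nxt]:
                        add(end, (nxt, new_rhs, 0, end))
                elif end < n and tokens[end] == nxt:   # Scanner: terminal matches input
                    add(end + 1, (lhs, rhs, dot + 1, start))
            else:                           # Completer: dot is the last element
                for w_lhs, w_rhs, w_dot, w_start in chart[start]:
                    if w_dot < len(w_rhs) and w_rhs[w_dot] == lhs:
                        add(end, (w_lhs, w_rhs, w_dot + 1, w_start))
    # The input is accepted if X → R EOS • spans the whole input
    return ("X", (root, "eos"), 2, 0) in chart[n]


print(earley_recognise("Mary feeds the otter".split()))  # True
```

Replacing the in-order processing of each state set with a stack or a probability-ordered queue would change which edges are explored first, but not which inputs are accepted.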