Basic Parsing Algorithms: Chart Parsing
Seminar 'Recent Advances in Parsing Technology', WS 2011/2012
Anna Schmidt
Talk Outline
Chart Parsing – Basics
Chart Parsing – Algorithms
– Earley Algorithm
– CKY Algorithm
→ Basics
→ BitPar: Efficient Implementation of CKY
Chart Parsing – Basics
First proposed by Martin Kay
Dynamic programming approach
– Partial results of the computation are stored and (re)used later if needed
→ The same problem is never solved more than once
Operates on a CFG
Functionality: recogniser / parser
… in this talk the focus is on the recogniser functionality
Main Components
– Chart
– Edges
– Agenda
Component: Chart
A well-formed substring table (WFST)
– Stores partial and complete analyses of substrings
– Information is stored in one triangular half of a two-dimensional array of size (n+1)*(n+1) or n*n
Can also be understood as a (directed) graph
– Vertices: positions between input words
0 Mary 1 feeds 2 the 3 otter 4
– Edges connecting the vertices
Allows no duplicate entries
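The chart with its no-duplicates property can be sketched in Python; all names here are illustrative, not from the slides:

```python
# Sketch: a chart as sets of edges indexed by (start, end).
# Storing edges in sets makes the "no duplicate entries" property automatic.

from collections import defaultdict

class Chart:
    def __init__(self, n):
        self.n = n                      # number of input words
        self.cells = defaultdict(set)   # (start, end) -> set of edges

    def add(self, start, end, edge):
        """Insert an edge; return True only if it was new."""
        if edge in self.cells[(start, end)]:
            return False
        self.cells[(start, end)].add(edge)
        return True

chart = Chart(4)
edge = ("S", ("NP", "VP"), 1)           # dotted rule S -> NP . VP
print(chart.add(0, 1, edge))            # True  (new entry)
print(chart.add(0, 1, edge))            # False (duplicate rejected)
```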
Component: Edge
A data structure storing information about a particular step in the parsing process
Edges inhabit cells of the chart
They contain
– the start and end position in the input string
– a dotted rule
– optionally, an edge probability
Component: Edge
A dotted rule consists of
– a left-hand side (LHS): a non-terminal symbol
– a right-hand side (RHS): non-terminal or terminal symbols
– a dot between RHS symbols indicating which constituents have already been found
Edges can be
– active / incomplete: the dot is not after the last element of the RHS
– inactive / complete: the dot is after the last element of the RHS
Example: S → NP • VP (0,1)
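Such an edge can be sketched as an immutable record; the field names are assumptions for illustration:

```python
# Sketch: an edge = a dotted rule plus its span in the input.
# `dot` counts how many RHS symbols have already been found.

from typing import NamedTuple, Tuple

class Edge(NamedTuple):
    lhs: str                 # left-hand side, e.g. "S"
    rhs: Tuple[str, ...]     # right-hand side, e.g. ("NP", "VP")
    dot: int                 # position of the dot within the RHS
    start: int               # start position in the input
    end: int                 # end position in the input

    def is_complete(self):
        """Inactive/complete: the dot is past the last RHS element."""
        return self.dot == len(self.rhs)

    def next_symbol(self):
        """Symbol right after the dot, or None if complete."""
        return None if self.is_complete() else self.rhs[self.dot]

# S -> NP . VP (0,1): NP already found, VP still expected
e = Edge("S", ("NP", "VP"), 1, 0, 1)
print(e.is_complete(), e.next_symbol())   # False VP
```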
Component: Agenda
Organises the order in which tasks are executed
Here, all tasks (edges) are collected before being put on the chart
The ordering of the agenda determines what is processed first
→ and therefore also which parse is found first
– queue, stack, ordering with respect to probabilities, …
Parsing Strategies
Kay differentiates parsing strategies along two dimensions:
– bottom-up versus top-down
– directed versus undirected
Directed bottom-up
– Only build edges for phrases that can actually be incorporated into a higher-level structure → Left-Corner Parser
Directed top-down
– Only build a new (active) edge if the next word of the input can be used to extend such an edge → Earley
Undirected varieties: no such restrictions
→ undirected bottom-up: CKY
Parsing Strategies
Ways of achieving directedness:
Reachability table
– Contains, for each non-terminal N, the set of all symbols that can be the first element of a string dominated by N
– For example: an NP can start with DET, N, or ADJ, but not with V
Rule selection table
– An M*N table, where M = non-terminals excluding pre-terminals and N = all non-terminals
– Contains all grammar rules applicable in a situation where M is the 'upper' and N is the 'lower' symbol
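A reachability table can be precomputed with a simple fixpoint loop; the sketch below uses the talk's running-example grammar and is an assumption about the construction, not Kay's formulation:

```python
# Sketch: for each non-terminal N, compute the set of symbols that can be
# the first element of a string dominated by N.

rules = {
    "S":  [("NP", "VP")],
    "NP": [("N",), ("DET", "N")],
    "VP": [("V", "NP")],
}

def reachability(rules):
    reach = {n: set() for n in rules}
    changed = True
    while changed:                      # iterate to a fixpoint
        changed = False
        for lhs, rhss in rules.items():
            for rhs in rhss:
                first = rhs[0]
                new = {first} | reach.get(first, set())
                if not new <= reach[lhs]:
                    reach[lhs] |= new
                    changed = True
    return reach

table = reachability(rules)
print(sorted(table["NP"]))   # ['DET', 'N'] -- an NP can start with DET or N, never V
```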
Chart Parsing: Advantages
– No repeated computation of the same subproblem
– Deals well with left-recursive grammars
– Deals well with ambiguity
– No backtracking necessary
Earley Algorithm
Proposed by Jay Earley
Top-down search
Can handle all CFGs
Efficient:
– O(n³) in the general case
– faster for particular types of grammar
Terminology
In his paper, Earley does not use the notion of a 'chart'
He represents the parsing process as sets of states
– The index of each state set = the end position of all states in the set
– A state largely corresponds to an edge
- contains a dotted rule
- contains a pointer to the start position
- the end position can be derived from the state set
The two formalisms are very similar, and the examples are easier to follow when represented in charts
So we will stick with 'chart' representations
Algorithm – Components
– Initialization
– Predictor
– Scanner
– Completer
The algorithm operates on one half of an array of size (n+1)*(n+1)
Worked example
Input: 0 Mary 1 feeds 2 the 3 otter 4 eos 5
Each step lists the edges added, with their [start, end] cells:

Initialise: [0,0] X → • S eos
Predict: [0,0] S → • NP VP, NP → • N, NP → • DET N, N → • Mary, N → • otter, DET → • the
Scan: [0,1] N → Mary •
Complete: [0,1] NP → N •; [0,1] S → NP • VP
Predict: [1,1] VP → • V NP, V → • feeds
Scan: [1,2] V → feeds •
Complete: [1,2] VP → V • NP
Predict: [2,2] NP → • N, NP → • DET N, N → • Mary, N → • otter, DET → • the
Scan: [2,3] DET → the •
Complete: [2,3] NP → DET • N
Predict: [3,3] N → • Mary, N → • otter
Scan: [3,4] N → otter •
Complete: [3,4] NP → DET N •
Complete: [1,4] VP → V NP •
Complete: [0,4] S → NP VP •; [0,4] X → S • eos
Predict: [4,4] eos → • eos
Scan: [4,5] eos → eos •
Complete: [0,5] X → S eos • → the input is accepted
Lookahead Component
In the original paper, Earley proposes the use of a lookahead string for each state, representing the allowed successors of the LHS
This prevents the completer from processing a state if the lookahead string and the next word of the input do not match
→ Remember Kay's directed top-down strategy?
CKY: Basics
Proposed by John Cocke, Daniel H. Younger, and Tadao Kasami (independently)
Bottom-up search
Incremental
Grammar must be in Chomsky normal form (CNF)
Complexity: O(n³)
Chart: (upper triangle of) an array of size n*n
CKY Algorithm: Idea
Initialise the upper triangle of a chart of size n*n
From the upper left to the lower right corner of the chart:
Go to the next cell in the diagonal
– Fill in the POS tag of the next word in the input string
– Each time a POS tag has been filled in, go up cell by cell and build larger constituents that end at the current end position
Worked example
Input: 0 Mary 1 feeds 2 the 3 otter 4
Grammar: S → NP VP, NP → N, NP → DET N, VP → V NP, N → Mary | otter, V → feeds, DET → the

Filling order (cell [row, column] holds the constituents covering that span):
– [1,1] = N, NP (Mary)
– [2,2] = V (feeds)
– [3,3] = DET (the)
– [4,4] = N, NP (otter)
– [3,4] = NP (DET N)
– [2,4] = VP (V NP)
– [1,4] = S (NP VP)
→ S covers the whole input: the sentence is accepted
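The fill order above can be reproduced by a minimal CKY recogniser. This is a sketch: the chain rule NP → N is handled by an explicit closure step (one of several possible treatments), and the indices are 0-based, unlike the 1-based cells on the slides:

```python
# Sketch: CKY recognition over the example grammar.
# chart[i][j] holds the set of non-terminals covering words i..j-1.

binary = {("NP", "VP"): {"S"}, ("DET", "N"): {"NP"}, ("V", "NP"): {"VP"}}
lexical = {"Mary": {"N"}, "otter": {"N"}, "feeds": {"V"}, "the": {"DET"}}
chain = {"N": {"NP"}}                  # unary chain rules

def close(symbols):
    """Add everything derivable via chain rules."""
    out = set(symbols)
    for s in symbols:
        out |= chain.get(s, set())
    return out

def cky(words):
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = close(lexical[w])      # POS tags on the diagonal
    for span in range(2, n + 1):                 # larger constituents
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):            # split point
                for a in chart[i][k]:
                    for b in chart[k][j]:
                        chart[i][j] |= close(binary.get((a, b), set()))
    return chart

chart = cky(["Mary", "feeds", "the", "otter"])
print("S" in chart[0][4])   # True -- the sentence is accepted
```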
CKY: BitPar
BitPar: Basics
Proposed by Helmut Schmid
A bit-vector-based parser
Efficiently implements a CKY-style algorithm
Uses bit-vector operations to parallelise parsing operations
Idea:
Don't try to decrease the number of edges that are built; instead, minimise the cost of building an edge
Especially useful if all analyses are needed
BitPar: Requirements
Restrictions on the context-free grammar:
– must be in CNF
– must be ε-free
– chain rules are allowed
Precomputed for each non-terminal N:
– the set of non-terminals that are derivable from N via chain rules
– the set is stored in the bit vector chainvec[N]
– the set includes N itself
Background: Bitwise AND and OR
AND:
  0101
& 0011
= 0001
Both corresponding bits must equal 1

OR:
  0101
| 0011
= 0111
At least one of the corresponding bits must equal 1
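In a language whose integers double as bit vectors, the same two operations look like this (Python sketch):

```python
# Sketch: bitwise AND/OR on Python ints used as bit vectors.
a = 0b0101
b = 0b0011
print(format(a & b, "04b"))   # 0001 -- bits set in both
print(format(a | b, "04b"))   # 0111 -- bits set in at least one
```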
BitPar: Chart
The chart is a three-dimensional bit array
chart [start position b] [end position e] = [011000...]
[b] [e] contains a bit vector with one bit for each non-terminal
– the bit is set to 1 if the non-terminal was inserted
– 0 otherwise
The chart is initialised with all bits = 0
Filling the Chart: POS Tags
Inserting POS tags into a cell of the diagonal:
For each non-terminal N that can be rewritten as the word at the current position, do a bitwise OR of
– the bits inhabiting the chart cell
– chainvec[N]
→ N and all its chain derivations are inserted in just one operation
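A sketch of this insertion, using the bit order from the example's legend (S, NP, N, VP, V, DET); the chainvec contents are illustrative:

```python
# Sketch: inserting a POS tag plus its chain closure with a single OR.

BIT = {n: 1 << (5 - i) for i, n in enumerate(["S", "NP", "N", "VP", "V", "DET"])}

# Precomputed chainvec[N]: N itself plus everything derivable via chain rules.
chainvec = {"N": BIT["N"] | BIT["NP"], "V": BIT["V"], "DET": BIT["DET"]}

cell = 0                       # a diagonal cell, all bits initially 0
cell |= chainvec["N"]          # insert N for "Mary": N and NP in one OR
print(format(cell, "06b"))     # 011000 -- NP and N are set
```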
Example: after inserting the POS tags for 'Mary' (N) and 'feeds' (V), with bit order S, NP, N, VP, V, DET:
– chart[1][1] = 011000 (N and NP, via chainvec[N])
– chart[2][2] = 000010 (V)
– all other cells = 000000
Filling the Chart: Larger Constituents
Conceptually:
Determine whether several cells can be combined to form a higher-level constituent labeled N
For this:
– Loop over the grammar rules with LHS = N and extract the RHS (consisting of RHS1, RHS2)
– Loop over all possible combinations of cells that together could contain the substructure of N, and determine whether they contain RHS1 and RHS2 respectively
Filling the Chart: Larger Constituents
This has to be done
– for each super-diagonal cell
– for each non-terminal
– for all corresponding grammar rules
– for all possible cell combinations that could constitute a substructure of N
This is a time-consuming process
BUT: the same functionality can be achieved by a single AND operation on two bit vectors
Filling the Chart: Larger Constituents
Internally:
Can a given non-terminal LHS be inserted into a given chart cell [b] [e]?
Get RHS1, RHS2 from the grammar
Vector 1 contains the bits stored in chart [ b ] [ b .. b+1 .. e-1 ] [ RHS1 ]
Vector 2 contains the bits stored in chart [ b+1 .. b+2 .. e ] [ e ] [ RHS2 ]
Filling the Chart: Larger Constituents
If a bitwise AND operation on the two new vectors produces at least one bit = 1
– a valid substructure for LHS has been found
– LHS can be inserted into the chart cell
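The check can be sketched as follows; the dict-based chart and explicit split-point loop are simplifications of BitPar's actual bit-slice extraction:

```python
# Sketch: the constituent check as one AND over split positions.
# v1 collects, for each split point m, the RHS1 bit of chart[b][m];
# v2 collects the RHS2 bit of chart[m+1][e]. A nonzero AND means some
# split yields RHS1 RHS2.

def can_insert(chart, b, e, rhs1, rhs2):
    v1 = v2 = 0
    for i, m in enumerate(range(b, e)):          # split points b .. e-1
        if rhs1 in chart.get((b, m), set()):
            v1 |= 1 << i
        if rhs2 in chart.get((m + 1, e), set()):
            v2 |= 1 << i
    return (v1 & v2) != 0

# Should NP (-> DET N) go into cell [3] [4]?
chart = {(3, 3): {"DET"}, (4, 4): {"N"}}
print(can_insert(chart, 3, 4, "DET", "N"))   # True
```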
Let's look at an example
Example: should NP go into cell [3] [4]?
Chart so far (bit order S, NP, N, VP, V, DET):
– chart[1][1] = 011000 (N, NP)
– chart[2][2] = 000010 (V)
– chart[3][3] = 000001 (DET)
– chart[4][4] = 011000 (N, NP)

First, we consult the grammar
We find a rule NP → DET N, so the allowed right-hand sides for NP are RHS1 = DET, RHS2 = N
Reminder:
v1 = chart [ b ] [ b .. b+1 .. e-1 ] [ RHS1 ]
v2 = chart [ b+1 .. b+2 .. e ] [ e ] [ RHS2 ]
Vector 1: chart[3][3] contains RHS1 = DET? → yes, so insert 1
Vector 2: chart[4][4] contains RHS2 = N? → yes, so insert 1
Vector 1 AND Vector 2 = 1, so insert NP
→ Yes: chart[3][4] = 010000 (NP)
Thank you for your attention!
References
Earley, Jay: An efficient context-free parsing algorithm. Communications of the ACM, 13(2):94–102, 1970.
Jurafsky, Daniel and Martin, James H.: Speech and Language Processing: An Introduction to Natural Language Processing, Speech Recognition, and Computational Linguistics. 2nd edition. Prentice-Hall, 2009.
Kay, Martin: Algorithm schemata and data structures in syntactic processing. In Readings in Natural Language Processing, pages 35–70. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1986.
Kay, Martin: Lecture slides of the course 'Basic Algorithms for Computational Linguistics'. http://www.coli.uni-saarland.de/courses/algorithms-11/
Schmid, Helmut: Efficient Parsing of Highly Ambiguous Context-Free Grammars with Bit Vectors. In Proceedings of Coling 2004, pages 162–168, Geneva, Switzerland, 2004.
Wirén, Mats: A Comparison of Rule-Invocation Strategies in Context-Free Chart Parsing.
Initialization
– introduces a new non-terminal start symbol X and a new end symbol EOS
– adds EOS to the end of the input string
– for each root symbol R of the grammar: add to chart[0,0] an edge of the form X → • R EOS
Predictor
For all non-terminals N directly following a dot (in the current state set), and for each grammar rule with N as LHS, add a new edge with
– LHS = N
– RHS according to the grammar, but
– the dot before the first element of the RHS
– start and end = end of the original state
Scanner
For all terminal symbols immediately following a dot:
compare the terminal symbol with the input string, starting at the end position of the current edge
If they match, add a new edge to the chart with
– the dot moved over the terminal symbol
– the end position incremented by 1
Completer
If the dot is after the last element of a production with LHS of type T,
find edges that
– are still waiting for a constituent of type T
– end where the complete edge starts
Add to the chart an edge with
– the dot moved over T
– end position = end position of the complete edge
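The four operations can be assembled into a minimal Earley recogniser. This is a sketch over the running example's grammar, not Earley's original formulation (no lookahead, and states are kept in sets indexed by end position):

```python
# Sketch: Earley recognition with Initialization, Predictor, Scanner, Completer.
# A state is (lhs, rhs, dot, start); sets[k] is the state set ending at k.

GRAMMAR = {
    "X":   [("S", "eos")],
    "S":   [("NP", "VP")],
    "NP":  [("N",), ("DET", "N")],
    "VP":  [("V", "NP")],
    "N":   [("Mary",), ("otter",)],
    "V":   [("feeds",)],
    "DET": [("the",)],
}

def earley(words):
    words = words + ["eos"]                           # append end symbol
    sets = [set() for _ in range(len(words) + 1)]
    sets[0].add(("X", ("S", "eos"), 0, 0))            # initialise
    for k in range(len(words) + 1):
        agenda = list(sets[k])
        while agenda:
            lhs, rhs, dot, start = agenda.pop()
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in GRAMMAR:                    # predict
                    for prod in GRAMMAR[nxt]:
                        new = (nxt, prod, 0, k)
                        if new not in sets[k]:
                            sets[k].add(new); agenda.append(new)
                elif k < len(words) and words[k] == nxt:   # scan
                    sets[k + 1].add((lhs, rhs, dot + 1, start))
            else:                                     # complete
                for l2, r2, d2, s2 in list(sets[start]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        new = (l2, r2, d2 + 1, s2)
                        if new not in sets[k]:
                            sets[k].add(new); agenda.append(new)
    return ("X", ("S", "eos"), 2, 0) in sets[len(words)]

print(earley(["Mary", "feeds", "the", "otter"]))   # True
```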