

Slide 1

Introduction to Natural Language Processing

PARSING: Earley, Bottom-Up Chart Parsing

Jean-Cédric Chappelier

Jean-Cedric.Chappelier@epfl.ch

Artificial Intelligence Laboratory

LIA I&C · Introduction to Natural Language Processing (CS-431) · M. Rajman · J.-C. Chappelier (1/20)

Slide 2

Objectives of this lecture

➥ After the CYK algorithm, present two other algorithms used for syntactic parsing


Slide 3

Earley Parsing

Top-down (predictive) algorithm
(bottom-up = inference; top-down = search)

3 advantages:
✌ best known worst-case complexity (same as CYK)
✌ adaptive complexity for less complex languages (e.g. regular languages)
✌ no need for a special form of the CF grammar

2 drawbacks:
➷ no way to correct/reconstruct non-parsable sentences ("early error detection")
➷ not very intuitive


Slide 4

Earley Parsing (2)

Idea: on-line (i.e. during parsing) binarization of the grammar
☞ dotted rules and "Earley items"

Dotted rule: X → X1 ... Xk • Xk+1 ... Xm, where X → X1 ... Xm is a rule of the grammar
Earley item: one dotted rule together with one integer i (0 ≤ i ≤ n, where n is the size of the input string)
☞ the part before the dot (•) represents the subpart of the rule that derives a substring of the input string starting at position i + 1

Example: (VP → V • NP, 2) is an Earley item for the input string "the cat ate a mouse" (word positions 1 2 3 4 5)
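The dotted-rule notation is easy to mirror in code. A minimal sketch (the `Item` tuple and its field names are my own choices, not part of the lecture):

```python
from collections import namedtuple

# An Earley item: a rule lhs -> rhs, the position of the dot in rhs,
# and the integer i of the item (where the derived substring begins).
Item = namedtuple("Item", ["lhs", "rhs", "dot", "start"])

def show(item):
    """Render an item in the (X -> a . b, i) notation of the slides."""
    before = " ".join(item.rhs[:item.dot])
    after = " ".join(item.rhs[item.dot:])
    return f"({item.lhs} -> {before} . {after}, {item.start})"

# The slide's example item for "the cat ate a mouse":
item = Item("VP", ("V", "NP"), 1, 2)
print(show(item))  # (VP -> V . NP, 2)
```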


Slide 5

Earley Parsing (3)

Principle: starting from all possible (S → • X1 ... Xm, 0), parallel construction of all the dotted rules deriving (larger and larger) substrings of the input string, up to the point where the whole input sentence is derived

☞ construction of sets of items Ej such that:

(X → α • β, i) ∈ Ej ⇔ ∃γ, δ : S ⇒* γ X δ and γ ⇒* w1 ... wi and α ⇒* wi+1 ... wj

Example: in the former example, (VP → V • NP, 2) ∈ E3

The input string (of length n) is syntactically correct (accepted) iff at least one (S → X1 ... Xm •, 0) is in En


Slide 6

Earley Parsing (4)

➊ Initialization: construction of E0
1. For each rule S → X1 ... Xm in the grammar, add (S → • X1 ... Xm, 0) to E0
2. For each (X → • Y β, 0) in E0 and every rule Y → γ, add (Y → • γ, 0) to E0
3. Iterate (2) until convergence of E0
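The three initialization steps can be sketched as a fixpoint computation (a sketch only: the list-of-pairs grammar encoding and the (lhs, rhs, dot, start) item encoding are my own assumptions; the grammar is the "I think" grammar from the full example later in the lecture):

```python
def init_e0(rules, start_symbol="S"):
    """Build E0: seed with the start rules, then predict until convergence."""
    # step 1: (S -> . X1 ... Xm, 0) for every rule S -> X1 ... Xm
    e0 = {(lhs, rhs, 0, 0) for lhs, rhs in rules if lhs == start_symbol}
    changed = True
    while changed:                    # step 3: iterate until convergence
        changed = False
        for lhs, rhs, dot, _ in list(e0):
            if dot < len(rhs):       # a symbol Y right after the dot
                y = rhs[dot]
                for l2, r2 in rules:
                    if l2 == y and (l2, r2, 0, 0) not in e0:
                        e0.add((l2, r2, 0, 0))   # step 2: add (Y -> . gamma, 0)
                        changed = True
    return e0

rules = [("S", ("NP", "VP")), ("NP", ("Pron",)), ("NP", ("Det", "N")),
         ("Pron", ("I",)), ("V", ("think",)),
         ("VP", ("V",)), ("VP", ("V", "S")), ("VP", ("V", "NP"))]
print(len(init_e0(rules)))  # 4 items, matching the worked example's E0
```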


Slide 7

Earley Parsing: Interpretation

➋ Iterations: building derivations of w1 ... wj (the sets Ej)
1. Linking with words: introduce word wj whenever a derivation of w1 ... wj−1 can "eat" wj (i.e. there is a • right before wj)
2. Stepping in the derivation: whenever non-terminal X can derive a subsequence starting at wi+1, and there exists a subderivation ending at wi which can "eat" X, do it!
3. Prediction (of useful items): if at some place Y could be "eaten" by some rule, then introduce all the rules that might (later on) produce Y


Slide 8

Earley Parsing (next)

➋ Iterations: construction of the Ej sets (1 ≤ j ≤ n)
1. For all (X → α • wj β, i) in Ej−1, add (X → α wj • β, i) to Ej
2. For all (X → γ •, i) in Ej and all (Y → α • X β, k) in Ei, add (Y → α X • β, k) to Ej
3. For all (Y → α • X β, i) in Ej and each rule X → γ, add (X → • γ, j) to Ej
4. Repeat from (2) while Ej keeps changing
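The four steps above can be sketched as a compact recognizer (a sketch: the (lhs, rhs, dot, start) item encoding and function names are my own; the grammar is the "I think" grammar from the lecture's full example):

```python
def closure(E, j, rules):
    """Steps 2-4: complete and predict in E_j until it stops changing."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot, i in list(E[j]):
            if dot == len(rhs):          # step 2: (X -> gamma ., i) is complete
                for l2, r2, d2, k in list(E[i]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        item = (l2, r2, d2 + 1, k)
                        if item not in E[j]:
                            E[j].add(item)
                            changed = True
            else:                        # step 3: predict rules for rhs[dot]
                for l2, r2 in rules:
                    if l2 == rhs[dot]:
                        item = (l2, r2, 0, j)
                        if item not in E[j]:
                            E[j].add(item)
                            changed = True

def earley_recognize(rules, words, start="S"):
    n = len(words)
    E = [set() for _ in range(n + 1)]
    # initialization: E_0 from the start symbol, then prediction closure
    E[0] = {(l, r, 0, 0) for l, r in rules if l == start}
    closure(E, 0, rules)
    for j in range(1, n + 1):
        # step 1 (linking with words): move the dot over w_j
        for l, r, d, i in E[j - 1]:
            if d < len(r) and r[d] == words[j - 1]:
                E[j].add((l, r, d + 1, i))
        closure(E, j, rules)
    # accepted iff some (S -> X1 ... Xm ., 0) is in E_n
    return any(l == start and d == len(r) and i == 0 for l, r, d, i in E[n])

rules = [("S", ("NP", "VP")), ("NP", ("Pron",)), ("NP", ("Det", "N")),
         ("Pron", ("I",)), ("V", ("think",)),
         ("VP", ("V",)), ("VP", ("V", "S")), ("VP", ("V", "NP"))]
print(earley_recognize(rules, ["I", "think"]))  # True
```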


Slide 9

Earley Parsing: Full Example

Example for "I think" and the grammar:

S → NP VP
NP → Pron
NP → Det N
VP → V
VP → V S
VP → V NP
Pron → I
V → think


Slide 10

E0: (S → • NP VP, 0), (NP → • Pron, 0), (NP → • Det N, 0), (Pron → • I, 0)

E1: (Pron → I •, 0), (NP → Pron •, 0), (S → NP • VP, 0), (VP → • V, 1), (VP → • V S, 1), (VP → • V NP, 1), (V → • think, 1)

E2: (V → think •, 1), (VP → V •, 1), (VP → V • S, 1), (VP → V • NP, 1), (S → NP VP •, 0), (S → • NP VP, 2), (NP → • Pron, 2), (NP → • Det N, 2), (Pron → • I, 2)


Slide 11

Link between CYK and Earley

(X → α • β, i) ∈ Ej ⇔ (X → α • β) ∈ cell(j−i, i+1)

[Figure: the chart for the input "I think", populated with the Earley items of E0, E1 and E2, each item (X → α • β, i) ∈ Ej placed in cell(j−i, i+1): e.g. (S → NP VP •, 0) ∈ E2 sits in cell(2, 1), while (NP → Pron •, 0) and (V → think •, 1) sit in cells (1, 1) and (1, 2).]


Slide 12

Bottom-up Chart Parsing

Idea: keep the best of both CYK and Earley
☞ on-line binarization "à la" Earley (and even better) within a bottom-up CYK-like algorithm

Mainly:
• no need for indices in items: the cell position is enough
• factorize (with respect to α) all the X → α • β into a single item α • ...
  ☞ possible when processing bottom-up
• replace all the X → α • simply by X
• suppress the X → • α items
  ☞ possible when processing bottom-up (and without lookahead)


Slide 13

Bottom-up Chart Parsing: Example

[Figure: a chart for the input "The crocodile ate the cat", with lexical items (Det, N, V) in the bottom cells and composed items (NP, VP, S, and partial items such as Det • ...) in the upper cells.]


Slide 14

Bottom-up Chart Parsing (3)

More formally, a CYK algorithm in which, if cell contents are denoted [α • ..., i, j] and [X, i, j] respectively, then the initialization is

wij ⇒ [X, i, j]   for X → wij ∈ R

and the completion phase (association of two cells) becomes

[α • ..., i, j] ⊕ [X, k, j + i] ⇒ [α X • ..., i + k, j]   if Y → α X β ∈ R
[α • ..., i, j] ⊕ [X, k, j + i] ⇒ [Y, i + k, j]           if Y → α X ∈ R

together with the "self-filling" step

[X, i, j] ⇒ [X • ..., i, j]   if Y → X β ∈ R
[X, i, j] ⇒ [Y, i, j]         if Y → X ∈ R
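The initialization, completion, and self-filling rules can be sketched as follows (a sketch under my own encoding, not the lecture's: a cell (length, start) holds either a plain category X or a tuple standing for a factorized α • ... item; the lexicon/rule split is an assumption):

```python
def chart_parse(lex, rules, words, start="S"):
    """lex: word -> set of categories (the lexical X -> w rules);
    rules: (lhs, rhs) non-lexical rules; returns True iff the input is accepted."""
    n = len(words)
    chart = {}  # (length, start position) -> set of items

    def add(cell, item):
        chart.setdefault(cell, set())
        if item in chart[cell]:
            return
        chart[cell].add(item)
        if isinstance(item, str):         # self-filling with category X
            for lhs, rhs in rules:
                if rhs[0] == item:
                    if len(rhs) == 1:
                        add(cell, lhs)    # Y -> X in R: add [Y, i, j]
                    else:
                        add(cell, (item,))  # Y -> X beta in R: add [X . ..., i, j]

    # initialization: w_j => [X, 1, j] for X -> w_j in R
    for j, w in enumerate(words, start=1):
        for cat in lex.get(w, ()):
            add((1, j), cat)

    # completion: [alpha . ..., i, j] (+) [X, k, j+i]
    for length in range(2, n + 1):
        for j in range(1, n - length + 2):
            for i in range(1, length):
                k = length - i
                for item in list(chart.get((i, j), set())):
                    if isinstance(item, str):
                        continue          # only factorized alpha-items combine
                    for x in list(chart.get((k, j + i), set())):
                        if not isinstance(x, str):
                            continue
                        new = item + (x,)
                        for lhs, rhs in rules:
                            if rhs == new:
                                add((length, j), lhs)   # [Y, i+k, j]
                            elif len(rhs) > len(new) and rhs[:len(new)] == new:
                                add((length, j), new)   # [alpha X . ..., i+k, j]
    return start in chart.get((n, 1), set())

lex = {"I": {"Pron"}, "think": {"V"}}
rules = [("S", ("NP", "VP")), ("NP", ("Pron",)), ("NP", ("Det", "N")),
         ("VP", ("V",)), ("VP", ("V", "S")), ("VP", ("V", "NP"))]
print(chart_parse(lex, rules, ["I", "think"]))  # True
```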


Slide 15

Bottom-up Chart Parsing: Example

[Figure: initialization of the chart: each bottom cell receives the lexical categories of its word (Det, N, V, ...) plus the items produced by self-filling; the completion step then combines a cell of length i with the adjacent cell of length k.]


Slide 16

[Figure: the completed chart for "The crocodile ate the cat", containing items such as Det • ..., V • ..., NP, VP and S; an S item covering the whole input shows the sentence is accepted.]


Slide 17

Dealing with compounds

Example of how to deal with compounds during the initialization phase: [Figure: initialization for the compound "credit card": besides the lexical categories of the individual words (N; N, V), the cell spanning both words receives the category N of the compound noun.]


Slide 18

Complexity

Still in O(n³). What coefficient for n³ (with respect to the grammar parameters)?

m(R′) · |NT| · n³

where
m(R′): the number of internal nodes of the trie of the right-hand sides of the non-lexical grammar rules
NT: the set of non-terminals
R′: the set of non-lexical grammar rules
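Under one reasonable reading of m(R′) (internal trie nodes = proper prefixes of the right-hand sides, root included), the quantity can be computed directly; the right-hand sides below are those of the "I think" grammar from the full example:

```python
def trie_internal_nodes(rhss):
    """m(R'): count internal nodes of the trie of right-hand sides,
    i.e. the distinct proper prefixes (the nodes that have children)."""
    prefixes = set()
    for rhs in rhss:
        for k in range(len(rhs)):   # every proper prefix is an internal node
            prefixes.add(rhs[:k])
    return len(prefixes)

rhss = [("NP", "VP"), ("Pron",), ("Det", "N"),
        ("V",), ("V", "S"), ("V", "NP")]
print(trie_internal_nodes(rhss))  # 4: the root, (NP), (Det) and (V)
```

Note how the shared prefix V of the three VP rules is counted once: this sharing is exactly what the trie factorization of the bottom-up chart parser exploits.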


Slide 19

Keypoints

➟ The way the algorithms work (Earley items, linking, stepping, prediction, the CYK-Earley link)
➟ Worst-case complexity: O(n³)
➟ Advantages and drawbacks of the algorithms


Slide 20

References

[1] D. Jurafsky & J. H. Martin, Speech and Language Processing, pp. 377-385, Prentice Hall, 2000.
[2] R. Dale, H. Moisl, H. Somers (eds.), Handbook of Natural Language Processing, pp. 69-73, Dekker, 2000.
