University of Oslo : Department of Informatics INF4820: Algorithms for Artificial Intelligence and Natural Language Processing Parsing and Parser Evaluation Stephan Oepen & Murhaf Fares Language Technology Group (LTG) November 10, 2016


slide-1
SLIDE 1

INF4820: Algorithms for Artificial Intelligence and Natural Language Processing Parsing and Parser Evaluation

Stephan Oepen & Murhaf Fares

Language Technology Group (LTG)

November 10, 2016 University of Oslo : Department of Informatics

slide-3
SLIDE 3

Last Time

◮ Grammatical Structure
◮ Context-Free Grammar
◮ Treebanks
◮ Probabilistic CFGs

Today

◮ Parser Evaluation
◮ Syntactic Parsing
  ◮ Naïve: Recursive-Descent
  ◮ Dynamic Programming: CKY
  ◮ Generalized Chart Parsing

Overview

slide-9
SLIDE 9

Formally, a CFG is a quadruple: G = ⟨C, Σ, P, S⟩

◮ C is the set of categories (aka non-terminals),
  ◮ {S, NP, VP, V}
◮ Σ is the vocabulary (aka terminals),
  ◮ {Kim, snow, adores, in}
◮ P is a set of category rewrite rules (aka productions)
  S → NP VP
  VP → V NP
  NP → Kim
  NP → snow
  V → adores
◮ S ∈ C is the start symbol, a filter on complete results;
◮ for each rule α → β1, β2, ..., βn ∈ P: α ∈ C and βi ∈ C ∪ Σ

Recall: CFGs (Formally, this Time)
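
The quadruple can be written down directly as a data structure. The following is a minimal Python sketch, not part of the original slides: the class name CFG, its field names, and the method is_well_formed are hypothetical choices made for illustration. It encodes the toy grammar from the slide and checks the two formal conditions (S ∈ C, and for every rule α ∈ C and βi ∈ C ∪ Σ).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """A context-free grammar as the quadruple <C, Sigma, P, S>."""
    categories: frozenset   # C: non-terminals
    vocabulary: frozenset   # Sigma: terminals
    productions: tuple      # P: pairs (lhs, rhs), rhs being a tuple of symbols
    start: str              # S: start symbol

    def is_well_formed(self) -> bool:
        """Check that S is in C and every rule rewrites a category into C or Sigma symbols."""
        symbols = self.categories | self.vocabulary
        return self.start in self.categories and all(
            lhs in self.categories and all(sym in symbols for sym in rhs)
            for lhs, rhs in self.productions
        )


# The toy grammar from the slide.
TOY = CFG(
    categories=frozenset({"S", "NP", "VP", "V"}),
    vocabulary=frozenset({"Kim", "snow", "adores", "in"}),
    productions=(
        ("S", ("NP", "VP")),
        ("VP", ("V", "NP")),
        ("NP", ("Kim",)),
        ("NP", ("snow",)),
        ("V", ("adores",)),
    ),
    start="S",
)

print(TOY.is_well_formed())  # True
```

A rule whose left-hand side is not in C (say, PP → in NP added to this grammar) makes the check fail, which is the kind of error the formal definition rules out.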

slide-10
SLIDE 10

◮ The ParsEval metric (Black, et al., 1991) measures constituent overlap.
◮ The original formulation only considered the shape of the (unlabeled) bracketing.
◮ The modern ‘standard’ uses a tool called evalb, which reports precision, recall and F1 score for labeled brackets, as well as the number of crossing brackets.

ParsEval

slide-24
SLIDE 24

Gold Standard

(NP (DT a) (ADVP (RB pretty) (JJ big)) (NOM (NN dog) (POS ’s) (NN house)))

0,6 np      1,2 rb      3,4 nn
0,1 dt      2,3 jj      4,5 pos
1,3 advp    3,6 nom     5,6 nn

System Output

(NP (DT a) (JJ pretty) (NOM (JJ big) (NOM (NN dog) (POS ’s) (NN house))))

0,6 np      2,6 nom     3,4 nn
0,1 dt      2,3 jj      4,5 pos
1,2 jj      3,6 nom     5,6 nn

Counting all nine labeled spans on each side, pre-terminals included (Correct: 7):

Recall: Correct / Gold = 7/9
Precision: Correct / System = 7/9
F1 score: 7/9

ParsEval

slide-26
SLIDE 26

Gold Standard

(NP (DT a) (ADVP (RB pretty) (JJ big)) (NOM (NN dog) (POS ’s) (NN house)))

0,6 np      1,2 rb      3,4 nn
0,1 dt      2,3 jj      4,5 pos
1,3 advp    3,6 nom     5,6 nn

System Output

(NP (DT a) (JJ pretty) (NOM (JJ big) (NOM (NN dog) (POS ’s) (NN house))))

0,6 np      2,6 nom     3,4 nn
0,1 dt      2,3 jj      4,5 pos
1,2 jj      3,6 nom     5,6 nn

Leaving out the pre-terminal (part-of-speech) spans, three phrasal constituents remain on each side, two of them correct (0,6 np and 3,6 nom):

Recall: Correct / Gold = 2/3
Precision: Correct / System = 2/3
F1 score: 2/3
Crossing Brackets: 1 (system 2,6 nom crosses gold 1,3 advp)

ParsEval
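
The scores on the slide can be recomputed from the two span lists. The following Python sketch is an illustration only, not the evalb implementation: the function names parseval, phrasal and crossing_brackets are hypothetical, the set of pre-terminal labels is supplied by hand, and spans are stored in sets, so a labeled span that occurs twice in one tree would be counted once. It assumes both span sets are non-empty.

```python
from fractions import Fraction

# Labels treated as pre-terminals (part-of-speech tags) in this example.
PRETERMINALS = {"DT", "RB", "JJ", "NN", "POS"}

# Labeled spans (start, end, label) read off the two trees on the slide.
GOLD = {(0, 6, "NP"), (1, 2, "RB"), (3, 4, "NN"), (0, 1, "DT"), (2, 3, "JJ"),
        (4, 5, "POS"), (1, 3, "ADVP"), (3, 6, "NOM"), (5, 6, "NN")}
SYSTEM = {(0, 6, "NP"), (2, 6, "NOM"), (3, 4, "NN"), (0, 1, "DT"), (2, 3, "JJ"),
          (4, 5, "POS"), (1, 2, "JJ"), (3, 6, "NOM"), (5, 6, "NN")}


def phrasal(spans):
    """Drop pre-terminal spans, keeping only phrasal constituents."""
    return {span for span in spans if span[2] not in PRETERMINALS}


def parseval(gold, system):
    """Return (precision, recall, F1) over labeled spans as exact fractions."""
    correct = len(gold & system)
    precision = Fraction(correct, len(system))
    recall = Fraction(correct, len(gold))
    total = precision + recall
    f1 = 2 * precision * recall / total if total else Fraction(0)
    return precision, recall, f1


def crossing_brackets(gold, system):
    """Count system spans that overlap some gold span without nesting."""
    gold_spans = {(i, j) for i, j, _ in gold}

    def crosses(a, b):
        return a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]

    return sum(any(crosses((i, j), g) for g in gold_spans) for i, j, _ in system)


print(parseval(GOLD, SYSTEM))                              # all spans: 7/9 each
print(parseval(phrasal(GOLD), phrasal(SYSTEM)))            # phrasal only: 2/3 each
print(crossing_brackets(phrasal(GOLD), phrasal(SYSTEM)))   # 1
```

Using exact fractions rather than floats keeps the output directly comparable with the 7/9 and 2/3 shown on the slides.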

slide-27
SLIDE 27

Parsing with CFGs: Moving to a Procedural View

S → NP VP
VP → V | V NP | VP PP
NP → NP PP
PP → P NP
NP → Kim | snow | Oslo
V → adores
P → in

All Complete Derivations

  • are rooted in the start symbol S;
  • label internal nodes with categories ∈ C, leaves with words ∈ Σ;
  • instantiate a grammar rule ∈ P at each local subtree of depth one.

(S (NP Kim) (VP (VP (V adores) (NP snow)) (PP (P in) (NP Oslo))))

(S (NP Kim) (VP (V adores) (NP (NP snow) (PP (P in) (NP Oslo)))))

inf4820 — -nov- (oe@ifi.uio.no)

Chart Parsing for Context-Free Grammars (3)

slide-30
SLIDE 30

Recursive Descent: A Naïve Parsing Algorithm

Control Structure

  • top-down: given a parsing goal α, use all grammar rules that rewrite α;
  • successively instantiate (extend) the right-hand sides of each rule;
  • for each βi in the RHS of each rule, recursively attempt to parse βi;
  • termination: when α is a prefix of the input string, parsing succeeds.

(Intermediate) Results

  • Each result records a (partial) tree and remaining input to be parsed;
  • complete results consume the full input string and are rooted in S;
  • whenever a RHS is fully instantiated, a new tree is built and returned;
  • all results at each level are combined and successively accumulated.

slide-31
SLIDE 31

The Recursive Descent Parser

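;; Assumed supporting definitions (not shown on the slide); the function
;; rules-deriving, returning all rules whose left-hand side is the goal,
;; is likewise assumed to exist.
(defstruct rule lhs rhs)
(defstruct parse tree input)

(defun parse (input goal)
  (if (equal (first input) goal)
      ;; the goal is the next input word: consume it as a leaf; the slide
      ;; stored an edge here, but extend-parse reads the tree slot
      (list (make-parse :tree (first input) :input (rest input)))
      ;; otherwise, try every rule that rewrites the goal
      (loop
          for rule in (rules-deriving goal)
          append (extend-parse (rule-lhs rule) nil (rule-rhs rule) input))))

(defun extend-parse (goal analyzed unanalyzed input)
  (if (null unanalyzed)
      ;; the right-hand side is fully instantiated: build and return a tree
      (let ((tree (cons goal analyzed)))
        (list (make-parse :tree tree :input input)))
      ;; parse the next RHS element, then continue from each partial result
      (loop
          for parse in (parse input (first unanalyzed))
          append (extend-parse
                  goal
                  (append analyzed (list (parse-tree parse)))
                  (rest unanalyzed)
                  (parse-input parse)))))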
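
For readers less familiar with Lisp, the same control structure can be sketched in Python. This is a transliteration under stated assumptions, not the course code: the rule table RULES, the helper rules_deriving, and the wrapper parse_sentence are hypothetical, trees are plain nested tuples, and the grammar is the small one from the CFG recap, chosen because it has no left-recursive rule (a rule like VP → VP PP would send this parser into infinite recursion).

```python
# Toy grammar without left recursion: pairs of (lhs, rhs).
RULES = [
    ("S", ["NP", "VP"]),
    ("VP", ["V", "NP"]),
    ("NP", ["Kim"]),
    ("NP", ["snow"]),
    ("V", ["adores"]),
]


def rules_deriving(goal):
    """All rules whose left-hand side is the goal category."""
    return [(lhs, rhs) for lhs, rhs in RULES if lhs == goal]


def parse(tokens, goal):
    """Return (tree, remaining_tokens) pairs for every way of parsing goal."""
    if tokens and tokens[0] == goal:
        # The goal is the next input word: consume it as a leaf.
        return [(goal, tokens[1:])]
    results = []
    for lhs, rhs in rules_deriving(goal):
        results.extend(extend_parse(lhs, [], rhs, tokens))
    return results


def extend_parse(goal, analyzed, unanalyzed, tokens):
    """Instantiate the right-hand side of one rule, left to right."""
    if not unanalyzed:
        # RHS fully instantiated: build a tree rooted in the goal.
        return [((goal, *analyzed), tokens)]
    results = []
    for tree, rest in parse(tokens, unanalyzed[0]):
        results.extend(extend_parse(goal, analyzed + [tree], unanalyzed[1:], rest))
    return results


def parse_sentence(sentence):
    """Keep only complete results: rooted in S and consuming all input."""
    return [tree for tree, rest in parse(sentence.split(), "S") if not rest]


print(parse_sentence("Kim adores snow"))
```

Each result pairs a (partial) tree with the remaining input, mirroring the tree and input slots of the Lisp parse structure; the final filter implements the requirement that complete results consume the full input string.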

slide-32
SLIDE 32

Quantifying the Complexity of the Parsing Task

[Figure: recursive function calls (y-axis, up to 1,500,000) plotted against the number of prepositional phrases n (x-axis, 1 to 8)]

  • Kim adores snow (in Oslo)^n

n    trees        calls
0        1           46
1        2          170
2        5          593
3       14        2,093
4       42        7,539
5      132       27,627
6      429      102,570
7    1,430      384,566
8    4,862    1,452,776
. . .
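
The tree counts in this table coincide with the Catalan numbers (1, 2, 5, 14, 42, ...), shifted by one position. That identification is an observation about the listed values rather than something the slide states. A short Python sketch, with the hypothetical helper names catalan and tree_count, reproduces the column:

```python
from math import comb


def catalan(k):
    """The k-th Catalan number, C_k = binomial(2k, k) / (k + 1)."""
    return comb(2 * k, k) // (k + 1)


def tree_count(n):
    """Number of trees listed for 'Kim adores snow (in Oslo)^n', i.e. C_(n+1)."""
    return catalan(n + 1)


print([tree_count(n) for n in range(9)])
# [1, 2, 5, 14, 42, 132, 429, 1430, 4862]
```

The number of analyses thus grows exponentially in n, which is why a parser that enumerates every tree separately cannot scale.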

slide-33
SLIDE 33

Top-Down vs. Bottom-Up Parsing

Top-Down (Goal-Oriented)

  • Left recursion (e.g. a rule like ‘VP → VP PP’) causes infinite recursion;
  • search is uninformed by the (observable) input: can hypothesize many unmotivated sub-trees, assuming terminals (words) that are not present;

→ assume bottom-up as basic search strategy for remainder of the course.

Bottom-Up (Data-Oriented)

  • unary (left-recursive) rules (e.g. ‘NP → NP’) would still be problematic;
  • lack of parsing goal: compute all possible derivations for, say, the input adores snow; however, it is ultimately rejected since it is not sentential;
  • availability of partial analyses desirable for, at least, some applications.
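
Whether a grammar is affected by the left-recursion problem can be checked mechanically. The sketch below is an illustration in Python, not part of the course material: the dictionary GRAMMAR encodes the example grammar used earlier in the deck, and left_recursive_categories is a hypothetical helper that follows the first symbol of each right-hand side. It assumes every right-hand side is non-empty (no ε-rules).

```python
# Example grammar from the procedural-view slide: category -> list of RHSs.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["V"], ["V", "NP"], ["VP", "PP"]],
    "NP": [["NP", "PP"], ["Kim"], ["snow"], ["Oslo"]],
    "PP": [["P", "NP"]],
    "V": [["adores"]],
    "P": [["in"]],
}


def left_recursive_categories(grammar):
    """Categories that can reach themselves via leftmost RHS symbols."""
    found = set()
    for start in grammar:
        seen = set()
        agenda = [start]
        while agenda:
            category = agenda.pop()
            for rhs in grammar.get(category, []):
                first = rhs[0]
                if first == start:
                    # start derives a string beginning with start itself
                    found.add(start)
                elif first in grammar and first not in seen:
                    seen.add(first)
                    agenda.append(first)
    return found


print(sorted(left_recursive_categories(GRAMMAR)))  # ['NP', 'VP']
```

Both VP (via VP → VP PP) and NP (via NP → NP PP) are reported, so a plain top-down parser would not terminate on this grammar.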

slide-35
SLIDE 35

A Key Insight: Local Ambiguity

  • For many substrings, more than one way of deriving the same category;
  • NPs: 1 | 2 | 3 | 6 | 7 | 9 ; PPs: 4 | 5 | 8 ; 9 ≡ 1 + 8 | 6 + 5 ;
  • parse forest: a single item represents multiple trees [Billot & Lang, 89].

[Figure: chart for ‘boys with hats from France’ (vertices 2 to 7), with edges numbered 1 to 9]

slide-36
SLIDE 36

The CKY (Cocke, Kasami, & Younger) Algorithm

for (0 ≤ i < |input|) do
  chart[i,i+1] ← {α | α → input_i ∈ P};
for (1 ≤ l < |input|) do
  for (0 ≤ i < |input| − l) do
    for (1 ≤ j ≤ l) do
      if (α → β1 β2 ∈ P ∧ β1 ∈ chart[i,i+j] ∧ β2 ∈ chart[i+j,i+l+1]) then
        chart[i,i+l+1] ← chart[i,i+l+1] ∪ {α};

Kim adored snow in Oslo

      1     2     3     4     5
0     NP          S           S
1           V     VP          VP
2                 NP          NP
3                       P     PP
4                             NP
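
The pseudocode above translates almost line by line into Python. The following is a minimal sketch under stated assumptions: the names BINARY, LEXICON and cky are hypothetical, the grammar is the example grammar with the unary rule VP → V left out (CKY expects Chomsky Normal Form), and the chart is a dictionary keyed by (start, end) vertex pairs.

```python
from collections import defaultdict

# Binary rules (alpha, beta1, beta2) and a lexicon mapping words to categories.
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "NP", "PP"),
    ("PP", "P", "NP"),
]
LEXICON = {"Kim": {"NP"}, "snow": {"NP"}, "Oslo": {"NP"}, "adored": {"V"}, "in": {"P"}}


def cky(words):
    """Fill a CKY chart: chart[i, k] is the set of categories spanning i..k."""
    n = len(words)
    chart = defaultdict(set)
    # Initialise the diagonal with the lexical categories of each word.
    for i, word in enumerate(words):
        chart[i, i + 1] = set(LEXICON.get(word, ()))
    # l is the span length minus one, as in the pseudocode.
    for l in range(1, n):
        for i in range(0, n - l):
            for j in range(1, l + 1):
                for alpha, beta1, beta2 in BINARY:
                    if beta1 in chart[i, i + j] and beta2 in chart[i + j, i + l + 1]:
                        chart[i, i + l + 1].add(alpha)
    return chart


chart = cky("Kim adored snow in Oslo".split())
print(chart[0, 5])  # {'S'}
```

The filled cells agree with the chart shown on the slide, for example VP in [1,5] and NP in [2,5]. Note that each cell stores a category only once, however many ways there are to derive it.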

slide-37
SLIDE 37

Limitations of the CKY Algorithm

Built-In Assumptions

  • Chomsky Normal Form grammars: α → β1β2 or α → γ (βi ∈ C, γ ∈ Σ);
  • breadth-first (aka exhaustive): always compute all values for each cell;
  • rigid control structure: bottom-up, left-to-right (one diagonal at a time).

Generalized Chart Parsing

  • Liberate order of computation: no assumptions about earlier results;
  • active edges encode partial rule instantiations, ‘waiting’ for additional (adjacent and passive) constituents to complete: [1, 2, VP → V • NP];

  • parser can fill in chart cells in any order and guarantee completeness.
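
The interplay of active and passive edges can be made concrete with a small data structure. The Python sketch below is one possible encoding, not the course implementation: the class Edge, its fields, and the function combine are hypothetical names. An edge records its span, a rule, and a dot position; it is passive once the dot has reached the end of the right-hand side.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Edge:
    start: int
    end: int
    lhs: str
    rhs: Tuple[str, ...]
    dot: int  # number of RHS symbols already found

    @property
    def is_passive(self) -> bool:
        return self.dot == len(self.rhs)

    @property
    def next_symbol(self) -> Optional[str]:
        """The category an active edge is waiting for (None if passive)."""
        return None if self.is_passive else self.rhs[self.dot]


def combine(active: Edge, passive: Edge) -> Optional[Edge]:
    """Extend an active edge with an adjacent passive edge of the awaited category."""
    if active.is_passive or not passive.is_passive:
        return None
    if active.end != passive.start or active.next_symbol != passive.lhs:
        return None
    return Edge(active.start, passive.end, active.lhs, active.rhs, active.dot + 1)


# [1, 2, VP -> V . NP] combined with a passive NP over [2, 3]
active = Edge(1, 2, "VP", ("V", "NP"), 1)
passive = Edge(2, 3, "NP", ("snow",), 1)
print(combine(active, passive))
```

The result is the passive edge [1, 3, VP → V NP •]. Because combine only inspects the two edges it is given, the order in which edges are produced does not matter, which is what frees the parser from the fixed CKY control structure.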

slide-39
SLIDE 39

Chart Parsing — Specialized Dynamic Programming

Basic Notions

  • Use chart to record partial analyses, indexing them by string positions;
  • count inter-word vertices; CKY: chart row is start, column end vertex;
  • treat multiple ways of deriving the same category for some substring as equivalent; pursue only once when combining with other constituents.

Key Benefits

  • Dynamic programming (memoization): avoid recomputation of results;
  • efficient indexing of constituents: no search by start or end positions;
  • compute parse forest with exponential ‘extension’ in polynomial time.
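
The last benefit can be illustrated by counting trees without ever enumerating them. The following Python sketch is an assumption-laden illustration rather than the course code: BINARY, LEXICON and count_trees are hypothetical names, and the grammar is the example grammar restricted to binary and lexical rules. Each chart entry stores how many trees derive a category over a span, and larger spans multiply the counts of their two daughters.

```python
from collections import defaultdict

BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "NP", "PP"),
    ("PP", "P", "NP"),
]
LEXICON = {"Kim": {"NP"}, "snow": {"NP"}, "Oslo": {"NP"}, "adored": {"V"}, "in": {"P"}}


def count_trees(words, start="S"):
    """Number of distinct trees for the whole input, computed bottom-up."""
    n = len(words)
    counts = defaultdict(int)  # (i, k, category) -> number of trees
    for i, word in enumerate(words):
        for category in LEXICON.get(word, ()):
            counts[i, i + 1, category] += 1
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            k = i + length
            for j in range(i + 1, k):  # split point between the daughters
                for alpha, beta1, beta2 in BINARY:
                    left = counts.get((i, j, beta1), 0)
                    right = counts.get((j, k, beta2), 0)
                    if left and right:
                        counts[i, k, alpha] += left * right
    return counts.get((0, n, start), 0)


print(count_trees("Kim adored snow in Oslo".split()))          # 2
print(count_trees("Kim adored snow in Oslo in Oslo".split()))  # 5
```

The three nested loops over span length, start position and split point give cubic running time in the input length, even though the number of trees being counted grows exponentially.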
