Algorithms for Natural Language Processing, Lecture 12: Context-Free Recognition



slide-1
SLIDE 1

Algorithms for Natural Language Processing

Lecture 12: Context-Free Recognition

slide-2
SLIDE 2

Levels of Linguistic Representation

[Figure: levels of linguistic representation]

  • discourse
  • pragmatics
  • semantics
  • syntax
  • lexemes
  • morphology
  • phonology, orthography
  • phonetics
  • speech, text

Figure annotations: analysis, generation, "most of this class"

slide-3
SLIDE 3

Context-Free Grammars

  • Using grammars: recognition, parsing
  • Parsing algorithms: top down, bottom up
  • CNF
  • CKY Algorithm (Cocke-Younger-Kasami)
slide-4
SLIDE 4

Parsing vs Word Matching

  • Consider
  • The student who was taught by David won the prize
  • Who won the prize?
  • String matching
  • “David won the prize.”
  • Parsing based
  • ((The student (who was taught by David)) won the prize)
  • “The student won the prize”
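The contrast above can be sketched in a few lines. This is only an illustration: the naive matcher and the nested-tuple encoding of the bracketing are assumptions made for the example, not part of the lecture.

```python
sentence = "The student who was taught by David won the prize"

# String matching: take the word right before "won" as the winner.
words = sentence.split()
idx = words.index("won")
string_match_answer = " ".join(words[idx - 1:])

# Parsing based: the slide's bracketing, written by hand as nested tuples
# (subject noun phrase with its relative clause, then the verb phrase).
tree = (("The student", ("who was taught by David",)), "won the prize")
subject, predicate = tree
head_noun_phrase = subject[0]          # drop the relative clause
parse_answer = f"{head_noun_phrase} {predicate}"

print(string_match_answer)  # David won the prize
print(parse_answer)         # The student won the prize
```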
slide-5
SLIDE 5

Context-Free Grammars

  • Vocabulary of terminal symbols, Σ
  • Set of nonterminal symbols (a.k.a. variables), N
  • Special start symbol S ∈ N
  • Production rules of the form X → α, where X ∈ N and α ∈ (N ∪ Σ)*
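One possible way to hold the four components of a CFG in code is sketched below. The class name `CFG`, its field names, and the toy grammar are illustrative assumptions; the disjointness check on N and Σ is a standard convention that the slide does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    terminals: frozenset      # Σ
    nonterminals: frozenset   # N
    start: str                # S ∈ N
    rules: frozenset          # pairs (X, α), with α a tuple of symbols

    def is_well_formed(self) -> bool:
        """Check S ∈ N, and X ∈ N, α ∈ (N ∪ Σ)* for every rule X → α."""
        symbols = self.nonterminals | self.terminals
        return (
            self.start in self.nonterminals
            # Assumption beyond the slide: N and Σ do not overlap.
            and self.nonterminals.isdisjoint(self.terminals)
            and all(
                lhs in self.nonterminals and all(s in symbols for s in rhs)
                for lhs, rhs in self.rules
            )
        )


# Hypothetical toy grammar used only to exercise the class.
toy = CFG(
    terminals=frozenset({"John", "Delta", "flies"}),
    nonterminals=frozenset({"S", "NP", "VP", "V"}),
    start="S",
    rules=frozenset({
        ("S", ("NP", "VP")),
        ("VP", ("V", "NP")),
        ("NP", ("John",)),
        ("NP", ("Delta",)),
        ("V", ("flies",)),
    }),
)
print(toy.is_well_formed())  # True
```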

slide-6
SLIDE 6

Two Related Problems

  • Input: sentence w = (w1, ..., wn) and CFG G
  • Output (recognition): true iff w ∈ Language(G)
  • Output (parsing): one or more derivations for w, under G

slide-7
SLIDE 7

Parsing as Search

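[Figure: the search space between the start symbol S and the words w1 ... wn; top-down search starts from S, bottom-up search starts from the words]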

slide-8
SLIDE 8

Implementing Recognizers as Search

Agenda = { state0 }
while (Agenda not empty)
    s = pop a state from Agenda
    if s is a success-state
        return s    // valid parse tree
    else if s is not a failure-state:
        generate new states from s
        push new states onto Agenda
return nil    // no parse!
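The agenda loop above might look as follows in Python. The function name `agenda_search`, its callback parameters, and the `seen` set (which skips states that were already generated) are assumptions added for this sketch; the number-counting usage is a stand-in for real parser states.

```python
from collections import deque


def agenda_search(start, is_success, is_failure, successors, depth_first=True):
    """Generic agenda-driven search; returns a success state or None."""
    agenda = deque([start])
    seen = {start}  # assumption beyond the slide: never revisit a state
    while agenda:
        # A stack gives depth-first search, a queue gives breadth-first.
        state = agenda.pop() if depth_first else agenda.popleft()
        if is_success(state):
            return state
        if not is_failure(state):
            for nxt in successors(state):
                if nxt not in seen:
                    seen.add(nxt)
                    agenda.append(nxt)
    return None  # no parse


# Toy usage: reach 5 from 0 by steps of 1 or 2; overshooting is a failure.
found = agenda_search(0, lambda s: s == 5, lambda s: s > 5,
                      lambda s: [s + 1, s + 2])
print(found)  # 5
```

Switching `depth_first` changes only the order in which states are explored, not whether a success state is found when the state space is finite.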

slide-9
SLIDE 9

Example Grammar and Lexicon

slide-10
SLIDE 10

Recursive Descent (A Top-Down Parser)

  • Start state: (S, 0)
  • Scan: From (wj+1 β, j), you can get to (β, j + 1).
  • Predict: If Z → γ, then from (Z β, j), you can get to (γβ, j).
  • Final state: (ε, n)
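A minimal sketch of these transitions as a depth-first recognizer is given below. It assumes the grammar has no empty right-hand sides and no unary cycles, which is what makes the length-based pruning safe; the function name and the dictionary encoding of the rules are illustrative. The grammar is the deck's Simple Grammar.

```python
def recursive_descent_recognize(words, rules, start="S"):
    """rules: dict mapping a nonterminal to a list of right-hand sides."""
    n = len(words)
    agenda = [((start,), 0)]              # start state (S, 0)
    while agenda:
        symbols, j = agenda.pop()
        if not symbols and j == n:        # final state (ε, n)
            return True
        # Failure states: nothing left to expand, or more symbols than
        # remaining words (each symbol must cover at least one word).
        if not symbols or len(symbols) > n - j:
            continue
        first, rest = symbols[0], symbols[1:]
        if first in rules:                # Predict: (Z β, j) to (γ β, j)
            for gamma in rules[first]:
                agenda.append((tuple(gamma) + rest, j))
        elif first == words[j]:           # Scan: (w_{j+1} β, j) to (β, j + 1)
            agenda.append((rest, j + 1))
    return False


SIMPLE = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP")],
    "NP": [("John",), ("Delta",)],
    "V": [("flies",)],
}
print(recursive_descent_recognize("John flies Delta".split(), SIMPLE))  # True
```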

slide-11
SLIDE 11

Example Grammar and Lexicon

slide-12
SLIDE 12

Shift-Reduce (A Bottom-Up Parser)

  • Start state: (ε, 0)
  • Shift: From (α, j), you can get to (α wj+1, j + 1).
  • Reduce: If Z → γ, then from (αγ, j) you can get to (α Z, j).

  • Final state: (S, n)
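The shift and reduce transitions can be sketched as a search over (stack, position) states. The function name, the list-of-pairs rule encoding, and the `seen` set are assumptions of this sketch; it also assumes every right-hand side is non-empty. The grammar is the deck's Simple Grammar.

```python
def shift_reduce_recognize(words, rules, start="S"):
    """rules: list of (Z, gamma) pairs, gamma a non-empty tuple of symbols."""
    n = len(words)
    agenda = [((), 0)]                        # start state (ε, 0)
    seen = {((), 0)}
    while agenda:
        stack, j = agenda.pop()
        if stack == (start,) and j == n:      # final state (S, n)
            return True
        successors = []
        if j < n:                             # Shift: (α, j) to (α w_{j+1}, j + 1)
            successors.append((stack + (words[j],), j + 1))
        for lhs, gamma in rules:              # Reduce: (α γ, j) to (α Z, j)
            cut = len(stack) - len(gamma)
            if cut >= 0 and stack[cut:] == tuple(gamma):
                successors.append((stack[:cut] + (lhs,), j))
        for state in successors:
            if state not in seen:
                seen.add(state)
                agenda.append(state)
    return False


SIMPLE_RULES = [
    ("S", ("NP", "VP")),
    ("VP", ("V", "NP")),
    ("NP", ("John",)),
    ("NP", ("Delta",)),
    ("V", ("flies",)),
]
print(shift_reduce_recognize("John flies Delta".split(), SIMPLE_RULES))  # True
```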
slide-13
SLIDE 13

Simple Grammar

  • S -> NP VP
  • VP -> V NP
  • NP -> John
  • NP -> Delta
  • V -> flies
slide-14
SLIDE 14

Context-Free Grammars in Chomsky Normal Form

  • Vocabulary of terminal symbols, Σ
  • Set of nonterminal symbols (a.k.a. variables), N
  • Special start symbol S ∈ N
  • Production rules of the form X → α, where X ∈ N and α ∈ NN ∪ Σ (either two nonterminals or a single terminal)
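A small check for this restricted rule form, as a sketch; the function name `is_cnf` and the rule encoding as (X, α) pairs are illustrative assumptions.

```python
def is_cnf(rules, nonterminals, terminals):
    """True if every rule is X → Y Z (two nonterminals) or X → w (one terminal)."""
    for lhs, rhs in rules:
        if lhs not in nonterminals:
            return False
        binary = len(rhs) == 2 and all(s in nonterminals for s in rhs)
        lexical = len(rhs) == 1 and rhs[0] in terminals
        if not (binary or lexical):
            return False
    return True


N = {"S", "NP", "VP", "V"}
SIGMA = {"John", "Delta", "flies"}
simple = [
    ("S", ("NP", "VP")),
    ("VP", ("V", "NP")),
    ("NP", ("John",)),
    ("NP", ("Delta",)),
    ("V", ("flies",)),
]
print(is_cnf(simple, N, SIGMA))                            # True
print(is_cnf(simple + [("S", ("NP", "V", "NP"))], N, SIGMA))  # False
```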

slide-15
SLIDE 15

Convert CFGs to CNF

  • For each rule X → A B C
  • Rewrite as X → A X2 and X2 → B C
  • Introducing a new non-terminal X2
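The rewriting step can be sketched as below. It covers only the splitting of long right-hand sides shown on the slide; a complete CNF conversion needs further steps (for example for unary rules) that are not handled here. The `X2`, `X3`, ... naming scheme is an assumption and presumes those names do not clash with existing symbols.

```python
def binarize(rules):
    """Split every rule with more than two right-hand-side symbols."""
    out = []
    fresh = 1  # counter for new nonterminal names; the first one is X2
    for lhs, rhs in rules:
        rhs = tuple(rhs)
        while len(rhs) > 2:
            fresh += 1
            new_nt = f"X{fresh}"  # hypothetical name, assumed to be unused
            out.append((lhs, (rhs[0], new_nt)))
            lhs, rhs = new_nt, rhs[1:]
        out.append((lhs, rhs))
    return out


print(binarize([("X", ("A", "B", "C"))]))
# [('X', ('A', 'X2')), ('X2', ('B', 'C'))]
```

A global counter is used rather than one per left-hand side, so two long rules with the same left-hand side never share a new nonterminal.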
slide-16
SLIDE 16
slide-17
SLIDE 17

CKY Algorithm

for i = 1 ... n
    C[i-1, i] = { V | V → wi }
for ℓ = 2 ... n    // width
    for i = 0 ... n - ℓ    // left boundary
        k = i + ℓ    // right boundary
        for j = i + 1 ... k - 1    // midpoint
            C[i, k] = C[i, k] ∪ { V | V → YZ, Y ∈ C[i, j], Z ∈ C[j, k] }
return true if S ∈ C[0, n]
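The pseudocode translates fairly directly into Python. In this sketch the grammar is assumed to be in CNF and is passed as two lookup tables, `lexical` and `binary`; those names and encodings are illustrative choices. The grammar is the deck's Simple Grammar.

```python
from collections import defaultdict


def cky_recognize(words, lexical, binary, start="S"):
    """lexical: word -> set of V with V → word.
    binary: (Y, Z) -> set of V with V → Y Z."""
    n = len(words)
    chart = defaultdict(set)                   # chart[i, k] plays the role of C[i, k]
    for i in range(1, n + 1):
        chart[i - 1, i] = set(lexical.get(words[i - 1], ()))
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):      # left boundary
            k = i + width                      # right boundary
            for j in range(i + 1, k):          # midpoint
                for y in chart[i, j]:
                    for z in chart[j, k]:
                        chart[i, k] |= binary.get((y, z), set())
    return start in chart[0, n]


LEXICAL = {"John": {"NP"}, "Delta": {"NP"}, "flies": {"V"}}
BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}
print(cky_recognize("John flies Delta".split(), LEXICAL, BINARY))  # True
```

Indexing `binary` by the pair (Y, Z) avoids scanning every rule for each midpoint.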

slide-18
SLIDE 18

CKY Algorithm: Chart

[Empty chart for the sentence: book this flight through Houston]

slide-19
SLIDE 19

CKY Algorithm: Chart

book: Noun
this, flight, through, Houston: (not yet filled)

slide-20
SLIDE 20

CKY Algorithm: Chart

book: Noun, Verb
this, flight, through, Houston: (not yet filled)

slide-21
SLIDE 21

CKY Algorithm: Chart

book: Noun, Verb
this: Det
flight: Noun
through: Prep
Houston: PNoun

slide-22
SLIDE 22

CKY Algorithm: Chart

book: Noun, Verb
this: Det
flight: Noun
through: Prep
Houston: PNoun, NP

slide-23
SLIDE 23

CKY Algorithm: Chart

book: Noun, Verb | -
this: Det
flight: Noun
through: Prep
Houston: PNoun, NP

slide-24
SLIDE 24

CKY Algorithm: Chart

book: Noun, Verb | -
this: Det | NP
flight: Noun
through: Prep
Houston: PNoun, NP

slide-25
SLIDE 25

CKY Algorithm: Chart

book: Noun, Verb | -
this: Det | NP
flight: Noun
through: Prep
Houston: PNoun, NP

slide-26
SLIDE 26

CKY Algorithm: Chart

book: Noun, Verb | -
this: Det | NP
flight: Noun | -
through: Prep
Houston: PNoun, NP

slide-27
SLIDE 27

CKY Algorithm: Chart

book: Noun, Verb | -
this: Det | NP | -
flight: Noun | -
through: Prep
Houston: PNoun, NP

slide-28
SLIDE 28

CKY Algorithm: Chart

book: Noun, Verb | -
this: Det | NP | -
flight: Noun | -
through: Prep | PP
Houston: PNoun, NP

slide-29
SLIDE 29

CKY Algorithm: Chart

book: Noun, Verb | -
this: Det | NP | -
flight: Noun | -
through: Prep | PP
Houston: PNoun, NP

slide-30
SLIDE 30

CKY Algorithm: Chart

book: Noun, Verb | -
this: Det | NP | - | NP
flight: Noun | -
through: Prep | PP
Houston: PNoun, NP

slide-31
SLIDE 31

CKY Algorithm: Chart

book: Noun, Verb | - | VP
this: Det | NP | - | NP
flight: Noun | -
through: Prep | PP
Houston: PNoun, NP

slide-32
SLIDE 32

CKY Algorithm: Chart

book: Noun, Verb | - | VP, S
this: Det | NP | - | NP
flight: Noun | -
through: Prep | PP
Houston: PNoun, NP

slide-33
SLIDE 33

CKY Algorithm: Chart

book: Noun, Verb | - | VP, S | -
this: Det | NP | - | NP
flight: Noun | -
through: Prep | PP
Houston: PNoun, NP

slide-34
SLIDE 34

CKY Algorithm: Chart

book: Noun, Verb | - | VP, S | - | S
this: Det | NP | - | NP
flight: Noun | -
through: Prep | PP
Houston: PNoun, NP

slide-35
SLIDE 35

CKY Algorithm

for i = 1 ... n
    C[i-1, i] = { V | V → wi }
for ℓ = 2 ... n    // width
    for i = 0 ... n - ℓ    // left boundary
        k = i + ℓ    // right boundary
        for j = i + 1 ... k - 1    // midpoint
            C[i, k] = C[i, k] ∪ { V | V → YZ, Y ∈ C[i, j], Z ∈ C[j, k] }
return true if S ∈ C[0, n]

slide-36
SLIDE 36

CKY Equations

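\[ C[i-1, i, w_i] = \text{true} \]

\[ C[i-1, i, V] = \begin{cases} \text{true} & \text{if } V \to w_i \\ \text{false} & \text{otherwise} \end{cases} \]

\[ C[i, j, V] = \begin{cases} \text{true} & \text{if } \exists\, k, Y, Z \text{ such that } V \to Y\,Z \text{ and } C[i, k, Y] \text{ and } C[k, j, Z] \text{ and } i < k < j \\ \text{false} & \text{otherwise} \end{cases} \]

\[ \text{goal} = C[0, n, S] \]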
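Read as a recursive definition, the equations can also be evaluated top-down with memoization instead of filling a chart bottom-up. The sketch below assumes a CNF grammar given as sets of tuples; the function name and encodings are illustrative. Word positions are 0-based in the code, so the span (i, i + 1) covers `words[i]`.

```python
from functools import lru_cache


def cky_goal(words, lexical_rules, binary_rules, start="S"):
    """lexical_rules: set of (V, w); binary_rules: set of (V, Y, Z)."""
    n = len(words)

    @lru_cache(maxsize=None)
    def C(i, j, V):
        if j == i + 1:                        # width-one case: V → w
            return (V, words[i]) in lexical_rules
        return any(                           # ∃ k, Y, Z with i < k < j
            C(i, k, Y) and C(k, j, Z)
            for (head, Y, Z) in binary_rules if head == V
            for k in range(i + 1, j)
        )

    return n > 0 and C(0, n, start)           # goal = C[0, n, S]


LEX = {("NP", "John"), ("NP", "Delta"), ("V", "flies")}
BIN = {("S", "NP", "VP"), ("VP", "V", "NP")}
print(cky_goal("John flies Delta".split(), LEX, BIN))  # True
```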

slide-37
SLIDE 37

CKY Complexity

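  • CKY worst case is O(n³ · |G|), where n is the sentence length and |G| is the size of the grammar
  • The best case is the same as the worst case
  • (Other algorithms do better in the average case)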
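The cubic factor can be checked by counting how many (left boundary, midpoint, right boundary) combinations the three nested CKY loops visit. This is a sketch that ignores the per-combination grammar work, which is where the |G| factor comes from; the function name is illustrative.

```python
from math import comb


def cky_triples(n):
    """Count the (i, j, k) combinations visited by the nested CKY loops."""
    count = 0
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):    # left boundary
            k = i + width                    # right boundary
            for j in range(i + 1, k):        # midpoint
                count += 1
    return count


for n in (5, 10, 20, 40):
    # Choosing i < j < k among n + 1 positions: (n^3 - n) / 6 combinations.
    assert cky_triples(n) == comb(n + 1, 3)
    print(n, cky_triples(n))
```

The count does not depend on the sentence or on what the chart contains, which is why the best case matches the worst case.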
slide-38
SLIDE 38

CFG Grammars

  • Parsing and Recognition
  • Bottom up and Top down
  • CKY (for CNF)