


Natural Language Parsing Technology

Foundations of Language Science and Technology (WS 2014/2015)

Bernd Kiefer
Language Technology Lab, DFKI GmbH
Department of Computational Linguistics, Saarland University

November 2014



Outline

Overview
Basic Parsing Algorithms
    Parsing Strategies
    CYK Algorithm
    Earley’s Algorithm
Parsing with Probabilistic Context-Free Grammar
    PCFG
    Inside-Outside Algorithm
Recent Advances in Parsing Technology




Language & Grammar

• Language
    – Structural
    – Productive
    – Ambiguous, yet efficient in human-human communication
• Grammar
    – Generalization of regularities in language structures
    – Morphology & syntax, often complemented by phonetics, phonology, semantics, and pragmatics



Ambiguity

• Human languages are ambiguous on almost every layer
• Grammar frameworks are designed to represent necessary ambiguities and eliminate unnecessary ones
• Parsing models are responsible for retrieving the valid analyses according to the grammar



Syntactic Parser as NLP Component

Morph. Analysis → PoS Tagging → NER → Chunking → Syntactic Parsing → Semantic Analysis → …



Trees (or not)

[Figure: three analyses of “Sue gave Paul an old penny” —
• a phrase-structure tree, roughly (S (NP Sue) (VP (V gave) (NP Paul) (NP (Det an) (N (A old) (N penny)))))
• a dependency graph rooted in “gave”, with relations SBJ (Sue), IOBJ (Paul), DOBJ (penny), ADJ (old), DET (an)
• an HPSG-style feature structure for “gave”: PHON|ORTH ⟨"gave"⟩; SYNSEM|LOC|CAT [HEAD verb, VAL [SUBJ ⟨NP[1]⟩, COMPS ⟨NP[2], NP[3]⟩]]; CONT|RELS {give_rel [ARG1 [1], ARG2 [2], ARG3 [3]]}]



Chomsky Hierarchy

• Type 0 (unrestricted rewriting systems): α → β, with α, β ∈ (V_N ∪ V_T)∗
• Type 1 (context-sensitive grammars): β A γ → β ω γ, with A ∈ V_N and β, γ, ω ∈ (V_N ∪ V_T)∗
• Type 2 (context-free grammars): A → α, with A ∈ V_N and α ∈ (V_N ∪ V_T)∗
• Type 3 (regular grammars): A → xB ∨ A → x, with A, B ∈ V_N and x ∈ V_T



Context-Free Grammar

A CFG is a quadruple ⟨V_T, V_N, P, S⟩:

• V_T: terminal symbols
• V_N: non-terminal symbols
• P: context-free productions A → α, with A ∈ V_N and α ∈ (V_N ∪ V_T)∗
• S: start symbol



Context-Free Phrase Structure Grammar

• S → NP VP
• NP → Det N
• N → Adj N
• VP → V
• VP → V NP
• VP → Adv VP
• N → dog | cat
• Det → the | a
• V → chases | sleeps
• Adj → gray | lazy
• Adv → fiercely



CFG Derivation

• If φ = γAδ, ω = γαδ, and A → α ∈ P, then ω follows from φ: φ ⇒ ω
• If φ₁, φ₂, …, φ_m is a sequence of strings such that φ_i ⇒ φ_{i+1} for all i (1 ≤ i ≤ m−1), then φ₁, φ₂, …, φ_m is a derivation from φ₁ to φ_m
• The “derivable” relation φ₁ ⇒∗ φ_m is the transitive, reflexive closure of ⇒



Outline

Overview
Basic Parsing Algorithms
    Parsing Strategies
    CYK Algorithm
    Earley’s Algorithm
Parsing with Probabilistic Context-Free Grammar
    PCFG
    Inside-Outside Algorithm
Recent Advances in Parsing Technology



Parsing Strategies

• Top-down: start from the start symbol and expand the tree with grammar rules (e.g. replacing the LHS symbol with the RHS sequence of a CFG production)
• Bottom-up: start from the input sequence and apply grammar rules to build trees upwards (e.g. reducing an RHS sequence to its LHS symbol)



Top-Down Parsing

• Goal-directed search
• Wastes time on trees that do not match the input sentence
• A pure top-down (left-first) approach cannot parse (left-)recursive grammars, as the rules below show

• 1. S → NP VP
• 2. NP → NP PP
• 3. …

[Tree: the left-recursive NP expands without bound — S → NP VP, NP ⇒ NP PP ⇒ NP PP PP ⇒ …]



Bottom-Up Parsing

• Use the input to guide the search (data-driven)
• Wastes time on trees that do not result in S
• Recursive unary rules still create an infinite parse forest for a sentence of finite length

• 1. A → B | a
• 2. B → A
• 3. …

[Tree: the unary cycle grows without bound above the terminal a — A ⇒ B ⇒ A ⇒ …]



Problems

• Left-recursion: NP → NP PP
• Ambiguity
• Repeated parsing of subtrees



Dynamic Programming (DP)

• Divisibility: the optimal solution of a subproblem is part of the optimal solution of the whole problem
• Memoization: solve small problems only once and remember the answers

Example

Calculating Fibonacci numbers: F_n = F_{n−1} + F_{n−2} (F₀ = 0, F₁ = 1)
Pascal’s Triangle (binomial coefficients): C(n+1, k+1) = C(n, k) + C(n, k+1)
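To make the memoization idea concrete, here is a minimal Python sketch (illustrative, not from the original slides):

    from functools import lru_cache

    @lru_cache(maxsize=None)      # remember the answers to subproblems
    def fib(n: int) -> int:
        """Fibonacci with memoization: each F_i is computed exactly once."""
        if n < 2:
            return n              # base cases F_0 = 0, F_1 = 1
        return fib(n - 1) + fib(n - 2)

    print(fib(50))  # 12586269025, in linear instead of exponential time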



CYK Algorithm

• Cocke–Younger–Kasami, also known as the CKY algorithm
• Essentially a bottom-up chart-parsing algorithm using dynamic programming
• Requires the CFG to be in Chomsky Normal Form (CNF):
    – A → B C
    – A → a
    – S → ε
    – A, B, C ∈ V_N, a ∈ V_T, B, C ≠ S
• Fills a two-dimensional array: C[i][j] contains all the possible syntactic interpretations of the substring w_{i+1} … w_j
• Complexity: O(n³)



CYK Algorithm

for all i, j with 0 ≤ i < j ≤ n do
    C[i][j] ← ∅
end for
for all A → w_i ∈ P do
    C[i−1][i] ← {A} ∪ C[i−1][i]
end for
for s = 2 … n do
    for all A → B C ∈ P and all i, k with 0 ≤ i < k < i + s ≤ n do
        if B ∈ C[i][k] ∧ C ∈ C[k][i+s] then
            C[i][i+s] ← {A} ∪ C[i][i+s]
        end if
    end for
end for
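The same algorithm as a runnable Python sketch; the grammar encoding (a lexical word-to-nonterminals map and a binary rule list) is an assumption for illustration:

    from collections import defaultdict

    def cyk(words, lexical, binary):
        """CYK recognizer (a minimal sketch). lexical maps a word to the set
        of nonterminals A with A -> word; binary lists (A, B, C) for rules
        A -> B C. Returns the chart: chart[(i, j)] is the set of nonterminals
        deriving words[i:j]."""
        n = len(words)
        chart = defaultdict(set)
        for i, w in enumerate(words):          # lexical rules fill span-1 cells
            chart[(i, i + 1)] |= lexical.get(w, set())
        for s in range(2, n + 1):              # span length, short to long
            for i in range(0, n - s + 1):      # span start
                for k in range(i + 1, i + s):  # split point
                    for A, B, C in binary:
                        if B in chart[(i, k)] and C in chart[(k, i + s)]:
                            chart[(i, i + s)].add(A)
        return chart

The input is accepted iff S ends up in chart[(0, n)]; storing back pointers instead of bare symbols turns the recognizer into a parser.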



CYK Chart Example

Grammar:
• S → NP VP | N VP | N V | NP V
• VP → V NP | V N | VP PP
• NP → D N | NP PP | N PP
• PP → P NP | P N
• N → john, girl, car
• V → saw, walks
• P → in
• D → the, a

Input: 0 john 1 saw 2 the 3 girl 4 in 5 a 6 car 7

The chart cells C[i][j] fill bottom-up, from short spans to long ones:
• Span 1 (lexical): N(0,1), V(1,2), D(2,3), N(3,4), P(4,5), D(5,6), N(6,7)
• Span 2: S(0,2) from N V; NP(2,4) and NP(5,7) from D N
• Span 3: VP(1,4) from V NP; PP(4,7) from P NP
• Span 4: S(0,4) from N VP; NP(3,7) from N PP
• Span 5: NP(2,7) from NP PP
• Span 6: VP(1,7), derived twice — V(1,2) NP(2,7) and VP(1,4) PP(4,7) — the PP-attachment ambiguity
• Span 7: S(0,7) from N VP; the input is accepted
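Feeding this grammar to the cyk sketch from the previous slide reproduces the chart; the dictionary encoding below is illustrative:

    lexical = {"john": {"N"}, "girl": {"N"}, "car": {"N"},
               "saw": {"V"}, "walks": {"V"}, "in": {"P"},
               "the": {"D"}, "a": {"D"}}
    binary = [("S", "NP", "VP"), ("S", "N", "VP"), ("S", "N", "V"), ("S", "NP", "V"),
              ("VP", "V", "NP"), ("VP", "V", "N"), ("VP", "VP", "PP"),
              ("NP", "D", "N"), ("NP", "NP", "PP"), ("NP", "N", "PP"),
              ("PP", "P", "NP"), ("PP", "P", "N")]

    chart = cyk("john saw the girl in a car".split(), lexical, binary)
    print("S" in chart[(0, 7)])  # True: the sentence is in the language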


Earley’s Algorithm

• Use dynamic programming to do top-down search
• Chart: a set of items ⟨h, i, A → α · β⟩
    – h, i: positions in the input, 0 ≤ h ≤ i ≤ n
    – A → α · β: a dotted rule (A → αβ ∈ P)
    – α: the RHS prefix that has already been matched against the input from h to i
    – β: the RHS suffix yet to be found



Earley’s Algorithm

• Initialize:
    foreach S → α ∈ P: C ← C ∪ {⟨0, 0, S → · α⟩}
• Scan(i):
    if w_i = a ∧ ⟨h, i−1, A → α · a β⟩ ∈ C: C ← C ∪ {⟨h, i, A → α a · β⟩}
• Complete(i):
    foreach ⟨h, i, A → α ·⟩ ∈ C: foreach ⟨k, h, B → γ · A δ⟩ ∈ C: C ← C ∪ {⟨k, i, B → γ A · δ⟩}
• Predict(i):
    foreach ⟨h, i, A → α · B β⟩ ∈ C: foreach B → γ ∈ P: C ← C ∪ {⟨i, i, B → · γ⟩}
• Parse:
    Initialize
    for i = 1 … n: Predict(i−1); Scan(i); Complete(i)
    if ∃ ⟨0, n, S → α ·⟩ ∈ C return success else return failure
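A minimal Python sketch of an Earley recognizer; it follows the item representation above but collapses Predict and Complete into a fixpoint pass at each position, and the data-structure choices are assumptions for illustration:

    def earley(words, grammar, start="S"):
        """Earley recognizer (a minimal sketch). grammar maps each nonterminal
        to a list of right-hand sides (tuples of symbols); any symbol that is
        not a grammar key counts as a terminal. An item (h, A, rhs, dot) in
        chart[i] stands for <h, i, A -> alpha . beta>."""
        n = len(words)
        chart = [set() for _ in range(n + 1)]
        for rhs in grammar[start]:                        # Initialize
            chart[0].add((0, start, rhs, 0))
        for i in range(n + 1):
            added = True
            while added:                                  # Predict + Complete to fixpoint
                added = False
                for (h, A, rhs, dot) in list(chart[i]):
                    if dot < len(rhs) and rhs[dot] in grammar:        # Predict
                        for gamma in grammar[rhs[dot]]:
                            item = (i, rhs[dot], gamma, 0)
                            if item not in chart[i]:
                                chart[i].add(item)
                                added = True
                    elif dot == len(rhs):                             # Complete
                        for (k, B, rhs2, dot2) in list(chart[h]):
                            if dot2 < len(rhs2) and rhs2[dot2] == A:
                                item = (k, B, rhs2, dot2 + 1)
                                if item not in chart[i]:
                                    chart[i].add(item)
                                    added = True
            if i < n:                                                 # Scan
                for (h, A, rhs, dot) in chart[i]:
                    if dot < len(rhs) and rhs[dot] == words[i]:
                        chart[i + 1].add((h, A, rhs, dot + 1))
        return any(h == 0 and A == start and dot == len(rhs)
                   for (h, A, rhs, dot) in chart[n])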



Earley Chart: Example

Grammar:
• 1. S → NP VP
• 2. VP → v NP
• 3. VP → v
• 4. NP → det n

Input: 0 the/det 1 dog/n 2 chases/v 3 a/det 4 cat/n 5

Chart items, grouped by end position:
• 0: S → ·NP VP, NP → ·det n
• 1: NP → det · n
• 2: NP → det n·, S → NP · VP, VP → ·v, VP → ·v NP
• 3: S → NP VP·, VP → v·, VP → v · NP, NP → ·det n
• 4: NP → det · n
• 5: S → NP VP·, VP → v NP·, NP → det n·

The completed item S → NP VP· spanning positions 0 to 5 shows the input is accepted.
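Running the earley sketch above on the PoS-tag sequence of this example confirms acceptance (the dictionary encoding is illustrative):

    grammar = {"S": [("NP", "VP")],
               "VP": [("v", "NP"), ("v",)],
               "NP": [("det", "n")]}

    # PoS tags of "the dog chases a cat"
    print(earley(["det", "n", "v", "det", "n"], grammar))  # True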



Outline

Overview
Basic Parsing Algorithms
    Parsing Strategies
    CYK Algorithm
    Earley’s Algorithm
Parsing with Probabilistic Context-Free Grammar
    PCFG
    Inside-Outside Algorithm
Recent Advances in Parsing Technology



Probabilistic Context-Free Grammar

A PCFG is a quintuple ⟨V_T, V_N, P, S, Pr⟩

• Pr : P → [0, 1] such that ∀A ∈ V_N: Σ_{A→α ∈ P} Pr(A → α) = 1
• Pr(A → α) can be understood as the conditional probability of observing A → α in a derivation, given A: P(A → α | A)



Various Probabilities

Joint probability: P(x, y)

• Input sequence: x
• A parse y with corresponding derivation sequence S ⇒^{r₁} γ₁ ⇒^{r₂} γ₂ ⇒^{r₃} … ⇒^{r_k} x, where r_i is the production rule used in the i-th derivation step
• P(x, y) = Π_{i=1}^{k} Pr(r_i)
• Σ_{y ∈ T(G), x = yield(y)} P(x, y) = 1
• More generally, P(x, y | A) = Π_{i=1}^{k} Pr(r_i) is the probability of a sub-parse y rooted in A that generates the input x by the derivation sequence A ⇒^{r₁} γ₁ ⇒^{r₂} … ⇒^{r_k} x



Various Probabilities

Structural language model: P(x)

• P(x) = Σ_{y ∈ T(x)} P(x, y)
• T(x) is the set of parse trees for the input sequence x



PCFG Example

• 1. S → NP VP (1.0)
• 2. NP → Det N (0.8)
• 3. NP → NP PP (0.2)
• 4. VP → V NP (0.7)
• 5. VP → VP PP (0.3)
• 6. PP → P NP (1.0)
• 7. V → saw (1.0)
• 8. N → man (0.3)
• 9. N → girl (0.4)
• 10. N → telescope (0.3)
• 11. Det → a (0.4)
• 12. Det → the (0.6)
• 13. P → with (1.0)

For “the man saw a girl with a telescope”, the grammar licenses two trees. Both share the constituents NP(the man) = 0.8 · 0.6 · 0.3 = 0.144, NP(a girl) = 0.8 · 0.4 · 0.4 = 0.128, NP(a telescope) = 0.8 · 0.4 · 0.3 = 0.096, and PP(with a telescope) = 1.0 · 1.0 · 0.096 = 0.096.

• NP attachment, (S (NP the man) (VP (V saw) (NP (NP a girl) (PP with a telescope)))):
  NP(a girl with a telescope) = 0.2 · 0.128 · 0.096 = 0.0024576; VP = 0.7 · 1.0 · 0.0024576 = 0.00172032; S = 1.0 · 0.144 · 0.00172032 ≈ 0.000247726
• VP attachment, (S (NP the man) (VP (VP (V saw) (NP a girl)) (PP with a telescope))):
  VP(saw a girl) = 0.7 · 1.0 · 0.128 = 0.0896; VP(saw a girl with a telescope) = 0.3 · 0.0896 · 0.096 = 0.00258048; S = 1.0 · 0.144 · 0.00258048 ≈ 0.000371589

Under this PCFG, the VP attachment is the more probable analysis.
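The two numbers can be recomputed in a few lines of Python (a worked check of the slide’s figures, nothing more):

    # "the man saw a girl with a telescope": shared constituents
    np_the_man     = 0.8 * 0.6 * 0.3             # NP -> Det N; Det -> the; N -> man
    np_a_girl      = 0.8 * 0.4 * 0.4
    np_a_telescope = 0.8 * 0.4 * 0.3
    pp_with_tel    = 1.0 * 1.0 * np_a_telescope  # PP -> P NP; P -> with

    # NP attachment: [saw [[a girl] [with a telescope]]]
    s_np = 1.0 * np_the_man * (0.7 * 1.0 * (0.2 * np_a_girl * pp_with_tel))
    # VP attachment: [[saw [a girl]] [with a telescope]]
    s_vp = 1.0 * np_the_man * (0.3 * (0.7 * 1.0 * np_a_girl) * pp_with_tel)

    print(s_np, s_vp)  # ~0.000247726 vs. ~0.000371589: VP attachment wins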


Parsing with PCFG

• The Earley and CYK algorithms can be adapted to carry probabilities
• Best parse tree y∗ for a sentence x: y∗ = argmax_{y ∈ T(x)} P(x, y)
• The N-best parses can be recovered with a Viterbi-like algorithm (see the sketch below)
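To illustrate how CYK carries probabilities, here is a minimal Viterbi-CYK sketch; the best/back structures and the grammar encoding (word → {A: Pr(A → word)}, (A, B, C) → Pr(A → B C)) are assumptions for illustration:

    from collections import defaultdict

    def viterbi_cyk(words, lexical, binary):
        """Probabilistic CYK (a minimal sketch): best[(i, j)][A] is the highest
        probability of a subtree rooted in A over words[i:j]; back stores the
        split point and children that achieved it."""
        n = len(words)
        best = defaultdict(dict)
        back = {}
        for i, w in enumerate(words):
            for A, p in lexical.get(w, {}).items():      # Pr(A -> w)
                best[(i, i + 1)][A] = p
        for s in range(2, n + 1):
            for i in range(0, n - s + 1):
                j = i + s
                for k in range(i + 1, j):
                    for (A, B, C), p in binary.items():  # Pr(A -> B C)
                        if B in best[(i, k)] and C in best[(k, j)]:
                            q = p * best[(i, k)][B] * best[(k, j)][C]
                            if q > best[(i, j)].get(A, 0.0):
                                best[(i, j)][A] = q
                                back[(i, j, A)] = (k, B, C)
        return best, back

Following the back pointers down from (0, n, S) recovers y∗; keeping the k best entries per cell instead of a single one gives N-best parsing.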



Learning PCFG Probabilities

• Given a treebank, with Maximum-Likelihood Estimation (MLE): Pr(A → α) = #(A → α) / #(A) — see the counting sketch below
• When the grammar is large (e.g. through lexicalization), smoothing is necessary to overcome data sparseness
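A minimal counting sketch of the MLE estimate, assuming the treebank has already been flattened into (LHS, RHS) rule occurrences:

    from collections import Counter

    def mle_pcfg(rule_occurrences):
        """MLE rule probabilities: Pr(A -> alpha) = #(A -> alpha) / #(A).
        rule_occurrences is an iterable of (lhs, rhs) pairs, one pair per
        occurrence in the treebank, e.g. ("NP", ("Det", "N"))."""
        rule_count = Counter(rule_occurrences)                        # #(A -> alpha)
        lhs_count = Counter(lhs for lhs, _ in rule_count.elements())  # #(A)
        return {(lhs, rhs): count / lhs_count[lhs]
                for (lhs, rhs), count in rule_count.items()}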



Inside-Outside Algorithm

• When there is no labeled data (treebank), the probabilities of a PCFG can be updated to maximize the likelihood of a set of unlabeled sentences:
  Pr∗ = argmax_{Pr} Π_x P(x) = argmax_{Pr} Π_x Σ_{y ∈ T(x)} P(x, y)
• An Expectation-Maximization procedure can be used to iteratively find Pr∗



Inside Probability

Definition

The inside probability β_j(p, q) is the probability that the sequence w_{p+1} … w_q is generated by a tree rooted in node N^j:
β_j(p, q) = P(w_{p+1} … w_q | N^j_{pq})

• β₁(0, n) = P(w₁ w₂ … w_n), where N¹ = S
• The calculation can be carried out bottom-up:
  β_j(k−1, k) = Pr(N^j → w_k), N^j ∈ V_N   (1)
  β_j(p, q) = Σ_{r,s} Σ_{d=p+1}^{q−1} Pr(N^j → N^r N^s) · β_r(p, d) · β_s(d, q)   (2)
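A minimal Python sketch of the bottom-up computation (1)–(2), using the same probabilistic grammar encoding as the Viterbi-CYK sketch earlier:

    from collections import defaultdict

    def inside(words, lexical, binary):
        """Inside probabilities for a CNF PCFG (a minimal sketch):
        beta[(p, q)][A] = P(words[p:q] | A)."""
        n = len(words)
        beta = defaultdict(lambda: defaultdict(float))
        for k, w in enumerate(words):                    # (1): beta_j(k-1, k)
            for A, p in lexical.get(w, {}).items():
                beta[(k, k + 1)][A] = p
        for s in range(2, n + 1):                        # (2): longer spans
            for p in range(0, n - s + 1):
                q = p + s
                for (A, B, C), prob in binary.items():   # sum over rules and splits d
                    for d in range(p + 1, q):
                        beta[(p, q)][A] += prob * beta[(p, d)][B] * beta[(d, q)][C]
        return beta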



Outside Probability

Definition

The outside probability α_j(p, q) is the total probability of beginning with the start symbol and generating N^j_{pq} and all the words outside w_{p+1} … w_q:
α_j(p, q) = P(w₁ … w_p, N^j_{pq}, w_{q+1} … w_n)

• N^j_{pq} means N^j ⇒∗ w_{p+1} … w_q
• P(w₁ w₂ … w_n, N^j_{pq}) = α_j(p, q) · β_j(p, q)
• P(w₁ w₂ … w_n) = Σ_j α_j(k−1, k) · Pr(N^j → w_k), for any k



Outside Probability (cont.)

• The calculation is top-down:
  α_j(0, n) = 1 if N^j = S, and 0 otherwise   (3)
  α_j(p, q) = Σ_{f,g} Σ_{e=q+1}^{n} α_f(p, e) · Pr(N^f → N^j N^g) · β_g(q, e)
            + Σ_{f,g} Σ_{e=0}^{p−1} α_f(e, q) · Pr(N^f → N^g N^j) · β_g(e, p)   (4)



Calculating Expected Counts

The expected number of times N^j is used in a derivation of the sentence w₁ … w_n:

E[N^j | w₁ … w_n] = Σ_{p=0}^{n−1} Σ_{q=p+1}^{n} P(N^j_{pq} | w₁ … w_n)   (5)
                  = Σ_{p=0}^{n−1} Σ_{q=p+1}^{n} P(N^j_{pq}, w₁ … w_n) / P(w₁ … w_n)
                  = Σ_{p=0}^{n−1} Σ_{q=p+1}^{n} α_j(p, q) · β_j(p, q) / P(w₁ … w_n)



Calculating Expected Counts (cont.)

The expected number of times the rule N^j → N^r N^s is used in a derivation of the sentence w₁ … w_n:

E[N^j → N^r N^s | w₁ … w_n] = Σ_{p=0}^{n−1} Σ_{q=p+1}^{n} P(N^j_{pq}, N^j → N^r N^s | w₁ … w_n)   (6)
  = [Σ_{p=0}^{n−1} Σ_{q=p+1}^{n} Σ_{d=p+1}^{q−1} α_j(p, q) · Pr(N^j → N^r N^s) · β_r(p, d) · β_s(d, q)] / P(w₁ … w_n)



Update Formula

For a single sentence, rule probabilities can be re-estimated as

P̂r(N^j → N^r N^s) = E[N^j → N^r N^s, N^j | w₁ … w_n] / E[N^j | w₁ … w_n]   (7)
  = [Σ_{p=0}^{n−1} Σ_{q=p+1}^{n} Σ_{d=p+1}^{q−1} α_j(p, q) · Pr(N^j → N^r N^s) · β_r(p, d) · β_s(d, q)]
    / [Σ_{p=0}^{n−1} Σ_{q=p+1}^{n} α_j(p, q) · β_j(p, q)]

Similarly, for unary rules,

P̂r(N^j → w^k) = [Σ_{h=1}^{n} α_j(h−1, h) · P(w_h = w^k) · β_j(h−1, h)]
    / [Σ_{p=0}^{n−1} Σ_{q=p+1}^{n} α_j(p, q) · β_j(p, q)]   (8)



Multiple Training Sentences

For each sentence w⃗_i in the training corpus, define

f_i(p, q, j, r, s) = [Σ_{d=p+1}^{q−1} α_j(p, q) · Pr(N^j → N^r N^s) · β_r(p, d) · β_s(d, q)] / P(w₁ … w_n)   (9)
g_i(h, j, k) = α_j(h−1, h) · P(w_h = w^k) · β_j(h−1, h) / P(w₁ … w_n)   (10)
h_i(p, q, j) = α_j(p, q) · β_j(p, q) / P(w₁ … w_n)   (11)

then

P̂r(N^j → N^r N^s) = [Σ_{i=1}^{m} Σ_{p=0}^{n_i−1} Σ_{q=p+1}^{n_i} f_i(p, q, j, r, s)]
    / [Σ_{i=1}^{m} Σ_{p=0}^{n_i−1} Σ_{q=p+1}^{n_i} h_i(p, q, j)]   (12)

P̂r(N^j → w^k) = [Σ_{i=1}^{m} Σ_{h=1}^{n_i} g_i(h, j, k)]
    / [Σ_{i=1}^{m} Σ_{p=0}^{n_i−1} Σ_{q=p+1}^{n_i} h_i(p, q, j)]   (13)



Inside-Outside Algorithm

Initialize an arbitrary set of rule probabilities Pr₀
repeat
    F = G = H ← 0
    for each sentence w⃗^k = w^k₁ … w^k_n in the corpus do
        calculate the inside probabilities β_j(p, q)
        calculate the outside probabilities α_j(p, q)
        accumulate the counts F, G and H
    end for
    update the rule probabilities Pr_{i+1}(N^j → N^r N^s) and Pr_{i+1}(N^j → w^h)
until |P_{Pr_{i+1}}(W) − P_{Pr_i}(W)| ≤ ε
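A Python sketch of this loop, reusing inside from above and assuming a hypothetical outside(words, lexical, binary, beta) that returns α in the same nested-dictionary format; F, G and H accumulate the quantities (9)–(11), and the update step applies (12)–(13):

    from collections import defaultdict

    def inside_outside(corpus, lexical, binary, start="S", iterations=20):
        """EM re-estimation for a CNF PCFG (a minimal sketch; outside() is an
        assumed helper implementing (3)-(4), not shown). Updates the rule
        probabilities in place."""
        for _ in range(iterations):
            F = defaultdict(float)   # expected counts of N^j -> N^r N^s
            G = defaultdict(float)   # expected counts of N^j -> w^k
            H = defaultdict(float)   # expected counts of N^j
            for words in corpus:
                n = len(words)
                beta = inside(words, lexical, binary)
                alpha = outside(words, lexical, binary, beta)  # assumed, eqs. (3)-(4)
                Z = beta[(0, n)][start]                        # P(w_1 ... w_n)
                if Z == 0.0:
                    continue                                   # sentence not covered
                for (p, q), cells in list(beta.items()):       # eq. (11)
                    for A, b in cells.items():
                        H[A] += alpha[(p, q)][A] * b / Z
                for (A, B, C), prob in binary.items():         # eq. (9)
                    for p in range(n - 1):
                        for q in range(p + 2, n + 1):
                            for d in range(p + 1, q):
                                F[(A, B, C)] += (alpha[(p, q)][A] * prob *
                                                 beta[(p, d)][B] * beta[(d, q)][C]) / Z
                for k, w in enumerate(words):                  # eq. (10)
                    for A, p in lexical.get(w, {}).items():
                        G[(A, w)] += alpha[(k, k + 1)][A] * p / Z
            for (A, B, C) in binary:                           # update step, eq. (12)
                binary[(A, B, C)] = F[(A, B, C)] / H[A] if H[A] else 0.0
            for w in lexical:                                  # update step, eq. (13)
                for A in lexical[w]:
                    lexical[w][A] = G[(A, w)] / H[A] if H[A] else 0.0
        return lexical, binary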



Outline

Overview
Basic Parsing Algorithms
    Parsing Strategies
    CYK Algorithm
    Earley’s Algorithm
Parsing with Probabilistic Context-Free Grammar
    PCFG
    Inside-Outside Algorithm
Recent Advances in Parsing Technology



Statistical Constituent Parsing

• Collins parser [Collins, 1997]
• Reranking model [Charniak and Johnson, 2005]
• Self-training [McClosky et al., 2006]
• Latent-variable PCFG [Petrov et al., 2006]



Statistical Dependency Parsing

• Graph-based approach [Eisner, 1996, McDonald et al., 2005]
    – Edge-factored scoring model
    – Efficient algorithms to find the maximum spanning tree
    – Allows non-projective dependency structures
• Transition-based approach [Nivre et al., 2007, Sagae and Tsujii, 2008]
    – (Near-)deterministic parsing
    – Projective/pseudo-projective parsing



Parsing with Richer Formalisms

• TAG
• CCG
• LFG
• HPSG



Parser Evaluation

• Evaluation against a “gold standard”
    – e.g. PARSEVAL
• Application-based evaluation



Domain Adaptability and Multilinguality

• Statistical parsing models usually perform well in in-domain tests but suffer a significant accuracy drop when tested on out-of-domain data
• Differences between languages (morphology, word order, etc.) require different parsing models



Open Questions

• How relevant is linguistic study to the development of parsers?
• How do we evaluate a parser?
• How do we make trade-offs between adequacy, accuracy, and efficiency?



References I

Charniak, E. and Johnson, M. (2005). Coarse-to-fine n-best parsing and MaxEnt discriminative reranking. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL ’05), pages 173–180, Ann Arbor, Michigan.

Collins, M. (1997). Three generative, lexicalised models for statistical parsing. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, pages 16–23, Madrid, Spain.

Eisner, J. (1996). Three new probabilistic models for dependency parsing: An exploration. In Proceedings of the 16th International Conference on Computational Linguistics (COLING-96), pages 340–345, Copenhagen, Denmark.



References II

McClosky, D., Charniak, E., and Johnson, M. (2006). Effective self-training for parsing. In Proceedings of HLT-NAACL 2006, pages 152–159, New York, USA.

McDonald, R., Pereira, F., Ribarov, K., and Hajič, J. (2005). Non-projective dependency parsing using spanning tree algorithms. In Proceedings of HLT-EMNLP 2005, pages 523–530, Vancouver, Canada.

Nivre, J., Nilsson, J., Hall, J., Chanev, A., Eryiğit, G., Kübler, S., Marinov, S., and Marsi, E. (2007). MaltParser: A language-independent system for data-driven dependency parsing. Natural Language Engineering, 13(1):1–41.



References III

Petrov, S., Barrett, L., Thibaux, R., and Klein, D. (2006). Learning accurate, compact, and interpretable tree annotation. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pages 433–440, Sydney, Australia.

Sagae, K. and Tsujii, J. (2008). Shift-reduce dependency DAG parsing. In COLING ’08: Proceedings of the 22nd International Conference on Computational Linguistics, pages 753–760, Manchester, UK.
