Weakly-Supervised Grammar-Informed Bayesian CCG Parser Learning



SLIDE 1

Weakly-Supervised Grammar-Informed Bayesian CCG Parser Learning

Dan Garrette (UT-Austin), Chris Dyer (CMU), Jason Baldridge (UT-Austin), Noah A. Smith (CMU)

SLIDE 2

Motivation

Annotating parse trees by hand is extremely difficult.

SLIDE 3

Motivation

Can we learn new parsers cheaply? (cheaper = less supervision)

SLIDE 4

Motivation

When supervision is scarce, we have to be smarter about data.

SLIDE 5

Type-Level Supervision

SLIDE 6

Type-Level Supervision

  • Unannotated text
  • Incomplete tag dictionary: word → {tags}
SLIDE 7

Type-Level Supervision

Used for part-of-speech tagging for 20+ years

[Kupiec, 1992] [Merialdo, 1994]

SLIDE 8

Type-Level Supervision

Good tagger performance even with low supervision

[Ravi & Knight, 2009] [Das & Petrov, 2011] [Garrette & Baldridge, 2013] [Garrette et al., 2013]

SLIDE 9

Combinatory Categorial Grammar (CCG)

SLIDE 10

CCG

  • Every word token is associated with a category
  • Categories combine to form categories of larger constituents

[Steedman, 2000] [Steedman and Baldridge, 2011]

SLIDE 11

CCG

  the   dog
  np/n  n
  np/n  n  ⇒  np       (forward application)

SLIDE 12

CCG

  dogs  sleep
  np    s\np
  np  s\np  ⇒  s       (backward application)
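The two combination rules shown on these slides, forward application (X/Y Y ⇒ X) and backward application (Y X\Y ⇒ X), can be sketched in a few lines of Python. This is an illustrative toy, not the authors' implementation:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    """Atomic category such as np, n, s."""
    name: str
    def __str__(self): return self.name

@dataclass(frozen=True)
class Complex:
    """Complex category: result + slash direction + argument."""
    res: object
    slash: str   # "/" seeks its argument to the right, "\\" to the left
    arg: object
    def __str__(self): return f"({self.res}{self.slash}{self.arg})"

def combine(left, right):
    """Forward application X/Y Y => X; backward application Y X\\Y => X."""
    if isinstance(left, Complex) and left.slash == "/" and left.arg == right:
        return left.res
    if isinstance(right, Complex) and right.slash == "\\" and right.arg == left:
        return right.res
    return None  # the two categories do not combine

np_, n, s = Atom("np"), Atom("n"), Atom("s")
print(combine(Complex(np_, "/", n), n))     # "the dog":   np/n n  => np
print(combine(np_, Complex(s, "\\", np_)))  # "dogs sleep": np s\np => s
```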

SLIDE 13

Type-Supervised CCG

  the   lazy   dogs   wander

Each word carries a set of candidate categories licensed by the tag dictionary (e.g. np/n, n/n, n, np, s\np, (s\np)/np, …); the learner must choose among the ambiguous assignments.
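To see why this supervision is weak, note that ambiguity multiplies across the sentence. A toy sketch, with invented dictionary entries:

```python
from math import prod

# Hypothetical incomplete tag dictionary: word -> candidate CCG categories
tag_dict = {
    "the":    {"np/n", "n/n"},
    "lazy":   {"n/n", "np/n", "n"},
    "dogs":   {"n", "np"},
    "wander": {"s\\np", "(s\\np)/np"},
}

sentence = ["the", "lazy", "dogs", "wander"]
# Number of joint category assignments the learner must disambiguate among
n_assignments = prod(len(tag_dict[w]) for w in sentence)
print(n_assignments)  # 2 * 3 * 2 * 2 = 24
```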

SLIDE 14

CCG Parsing

The derivation of "the lazy dogs wander" is built bottom-up (shown step by step across slides 14–23):

  the    lazy   dogs   wander
  np/n   n/n    n      s\np
  n/n  n     ⇒  n      ("lazy dogs")
  np/n n     ⇒  np     ("the lazy dogs")
  np   s\np  ⇒  s      ("the lazy dogs wander")

SLIDE 24

Why CCG?

  • Machine Translation [Weese, Callison-Burch, and Lopez, 2012]
  • Semantic Parsing [Zettlemoyer and Collins, 2005]

SLIDE 25

Type-Supervised CCG

Type-supervised learning for CCG is highly ambiguous:

  Penn Treebank parts-of-speech:  48 tags
  CCGBank categories:             1,300+ categories

SLIDE 26

Our Strategy

The grammar formalism itself can be used to guide learning.

SLIDE 27

Our Strategy

Incorporate universal knowledge about grammar into learning

SLIDE 28

Universal Knowledge

SLIDE 29

Prefer Simpler Categories

Two parses of "the lazy dog", both yielding np:

  the := np/n   lazy := (np\(np/n))/n   dog := n     (complex category for "lazy")
  the := np/n   lazy := n/n             dog := n     (simple modifier category)


SLIDE 31

Prefer Simpler Categories

  buy := (sb\np)/np            appears 342 times in CCGbank
  buy := (((sb\np)/pp)/pp)/np  appears once

e.g. "Opponents don't buy such arguments."
     "Tele-Communications agreed to buy half of Showtime Networks from Viacom for $225 million." (with two pp arguments)

SLIDE 32

Prefer Modifier Categories

  transitive verb:  (he) hides (the money)          (sb\np)/np
  adverb:           (he) quickly (hides the money)  ((sb\np)/np)/((sb\np)/np)

The adverb's category has the modifier shape X/X.

SLIDE 33

Weighted Category Grammar

A category is generated recursively from parameters p_term, p_fwd, and p_mod:

  atomic a ∈ {s, np, n, …}:      p_term × p_atom(a)
  modifier B/B:                  (1 − p_term) × p_fwd × p_mod × P(B)
  modifier B\B:                  (1 − p_term) × (1 − p_fwd) × p_mod × P(B)
  non-modifier B/C (B ≠ C):      (1 − p_term) × p_fwd × (1 − p_mod) × P(B) × P(C)
  non-modifier B\C (B ≠ C):      (1 − p_term) × (1 − p_fwd) × (1 − p_mod) × P(B) × P(C)
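The recursive generator on this slide can be sketched as follows. The parameter values are illustrative, not the paper's, and categories are encoded as strings (atoms) or (result, slash, argument) tuples:

```python
P_TERM, P_FWD, P_MOD = 0.7, 0.5, 0.6          # illustrative values
P_ATOM = {"s": 0.3, "np": 0.4, "n": 0.3}

def cat_prior(cat):
    """Prior probability of a category under the weighted category grammar."""
    if isinstance(cat, str):                   # atomic category
        return P_TERM * P_ATOM[cat]
    res, slash, arg = cat
    p_dir = P_FWD if slash == "/" else 1.0 - P_FWD
    if res == arg:                             # modifier shape B/B or B\B
        p_shape = P_MOD
    else:                                      # non-modifier: pay for both sides
        p_shape = (1.0 - P_MOD) * cat_prior(res)
    return (1.0 - P_TERM) * p_dir * p_shape * cat_prior(arg)

simple = ("n", "/", "n")                                  # n/n
complex_ = (("np", "\\", ("np", "/", "n")), "/", "n")     # (np\(np/n))/n
print(cat_prior(simple) > cat_prior(complex_))            # True: simpler wins
```

The recursion makes the prior of a category shrink with its size, which is exactly the "prefer simpler categories" bias of the previous slides.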

SLIDE 35

Prefer Likely Categories

  the    lazy   dogs   wander
  np/n   n/n    n      s\np    ⇒ … ⇒ s

Among the assignments the tag dictionary allows (e.g. n vs. np for "dogs", s vs. np at the root), prefer derivations whose categories are likely under the weighted grammar.


SLIDE 37

Type-Supervised Learning

  • unlabeled corpus  (same as POS tagging)
  • tag dictionary  (same as POS tagging)
  • universal properties of the CCG formalism

SLIDE 38

Posterior Inference

[Johnson, Griffiths, and Goldwater, 2007]

SLIDE 39

Posterior Inference

For each sentence (e.g. "the lazy dogs wander"), every word carries its candidate categories from the tag dictionary:

  the: np/n, …   lazy: n/n, …   dogs: n, np, …   wander: s\np, (s\np)/np, …

The sampler alternates two steps (animated across slides 39–45):

  Inside: compute inside probabilities over the chart under the current PCFG, whose rule weights incorporate the priors (simple is good)
  Sample: sample a complete parse tree for the sentence in proportion to those inside probabilities
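A minimal sketch of this Inside/Sample loop for one sentence, assuming a toy binary grammar with invented weights (in the real model the weights come from the priors and the sampler's counts):

```python
import random
from collections import defaultdict

# Toy binary rules: parent -> (left child, right child), with weight.
RULES = {
    ("n",  ("n/n", "n")):    1.0,   # forward application
    ("np", ("np/n", "n")):   1.0,   # forward application
    ("s",  ("np", "s\\np")): 1.0,   # backward application
}
# Lexical weights; here these stand in for the tag dictionary + priors.
LEX = {
    "the": {"np/n": 1.0}, "lazy": {"n/n": 1.0},
    "dogs": {"n": 0.7, "np": 0.3}, "wander": {"s\\np": 1.0},
}

def inside(words):
    """Inside pass: chart[i, j][cat] = total weight of parses of words[i:j]."""
    chart = defaultdict(dict)
    for i, w in enumerate(words):
        chart[i, i + 1] = dict(LEX[w])
    for span in range(2, len(words) + 1):
        for i in range(len(words) - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (parent, (l, r)), w in RULES.items():
                    if l in chart[i, k] and r in chart[k, j]:
                        p = w * chart[i, k][l] * chart[k, j][r]
                        chart[i, j][parent] = chart[i, j].get(parent, 0.0) + p
    return chart

def sample_tree(chart, i, j, cat):
    """Sample pass: draw one tree rooted in cat, proportional to inside scores."""
    if j == i + 1:
        return cat                       # leaf: the word's sampled category
    options, weights = [], []
    for k in range(i + 1, j):
        for (parent, (l, r)), w in RULES.items():
            if parent == cat and l in chart[i, k] and r in chart[k, j]:
                options.append((k, l, r))
                weights.append(w * chart[i, k][l] * chart[k, j][r])
    k, l, r = random.choices(options, weights)[0]
    return (cat, sample_tree(chart, i, k, l), sample_tree(chart, k, j, r))

words = ["the", "lazy", "dogs", "wander"]
chart = inside(words)
print(chart[0, len(words)]["s"])              # total inside weight of s parses
print(sample_tree(chart, 0, len(words), "s"))  # one sampled derivation
```

A full sampler would repeat this over the corpus and re-estimate the PCFG from the sampled trees, following the cited blocked Gibbs approach.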

SLIDE 46

Results

SLIDE 47

CCG Parsing Results

Parsing accuracy (%):

               English   Chinese   Italian
  Uniform        53.4      35.9      58.2
  With Prior     55.7      42.0      60.0

SLIDE 48

Conclusion

Incorporating universal grammatical knowledge lets us make better use of weak supervision.