

SLIDE 1

An Extended GHKM Algorithm for Inducing λ-SCFG

Peng Li, Yang Liu and Maosong Sun

THUNLP&CSS Tsinghua University, China

slide-2
SLIDE 2

Outline

l Background l Rule extraction algorithm l Modeling l Experiments

l Conclusion


SLIDE 4

Semantic Parsing

l Semantic parsing: mapping a natural

language sentence into its computer executable meaning representation

4

NL: Every boy likes a star
MR: ∀x.(boy(x) → ∃y(human(y) ⋀ pop(y) ⋀ like(x, y)))

SLIDE 5

Related Work

l Hand-build systems (e.g., Woods et al., 1972; Warren &

Pereira, 1982)

l Learning for semantic parsing

l Supervised methods (e.g., Wong & Mooney, 2007; Lu et

al., 2008)

l Semi-supervised methods (e.g., Kate & Mooney, 2007) l Unsupervised methods (e.g., Poon & Domingos, 2009 &

2010; Goldwasser et al., 2011)

5

SLIDE 7

Supervised Methods

l Inductive logic programming based methods (e.g., Zelle & Mooney, 1996; Tang & Mooney, 2001) l String kernel based methods (e.g., Kate & Mooney, 2006) l Grammar based methods

l PCFG (e.g., Ge & Mooney, 2005) l SCFG (e.g., Wong & Mooney, 2006 & 2007) l CCG (e.g., Zettlemoyer & Collins, 2005 & 2007; Kwiatkowski et

al., 2010 & 2011)

l Hybrid tree (e.g., Lu et al., 2008) l Tree transducer (Jones et al., 2012)

7

SLIDE 9

Context Free Grammar (CFG)

l A formal grammar in which every production

rule is of the following form

9

X à Every X

Nonterminal Terminal Left hand side Right hand side

SLIDE 10

Context Free Grammar (CFG)

l Derivation example

10

S à X X à Every X X à Every X1 X2 X à Every boy X2 X à Every boy X2 a star X à Every boy likes a star r1: S à X r2: X à Every X r3: X à X1 X2 r4: X à boy r5: X à X a star r6: X à likes

CFG Rules Derivation
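As a rough illustration (not from the slides; the rule encoding and leftmost-rewrite strategy are assumptions, and the subscripts on X1 X2 merely distinguish occurrences of the same nonterminal), the derivation can be replayed in a few lines of Python:

```python
# A minimal sketch: replaying the toy CFG derivation above.
# Rules map a name to (left-hand side, right-hand side); nonterminals
# are the uppercase symbols "S" and "X".
RULES = {
    "r1": ("S", ["X"]),
    "r2": ("X", ["Every", "X"]),
    "r3": ("X", ["X", "X"]),
    "r4": ("X", ["boy"]),
    "r5": ("X", ["X", "a", "star"]),
    "r6": ("X", ["likes"]),
}

def apply_rule(sentential_form, rule_name):
    """Rewrite the leftmost occurrence of the rule's left-hand side."""
    lhs, rhs = RULES[rule_name]
    i = sentential_form.index(lhs)  # leftmost matching nonterminal
    return sentential_form[:i] + rhs + sentential_form[i + 1:]

form = ["S"]
for r in ["r1", "r2", "r3", "r4", "r5", "r6"]:
    form = apply_rule(form, r)
    print(" ".join(form))
# Final line printed: Every boy likes a star
```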

SLIDE 11

Synchronous Context Free Grammar (SCFG)

X → <Every X1, 每个 X1>

The left-hand side is one nonterminal; the rule has two right-hand sides (one per language), and nonterminals sharing the same index (X1) are rewritten synchronously.

SLIDE 12

Synchronous Context Free Grammar (SCFG)

l Two strings can be generated synchronously

12

S à < X, X > X à < Every X, 每个 X > X à < Every X1 X2, 每个 X1 X2>

..........

X à < Every boy likes a star, 每个 男孩 都 喜欢 一个 明星> How to use SCFG to handle logical forms?

SLIDE 13

λ-calculus

l A formal system in mathematical logic for

expressing computation by way of variable binding and substitution

l λ-expression: λx.λy.borders(y, x) l β-conversion: bound variable substitution

λx.λy.borders(y, x)(texas) = λy.borders(y, texas)

l α-conversion: bound variable renaming

λx.λy.borders(y, x) = λz.λy.borders(y, z)

13
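A sketch of β-conversion on the example above (the tuple encoding and helper names are mine, for illustration):

```python
# Lambda terms as nested tuples:
#   ("lam", var, body)      a lambda abstraction
#   ("app", func, arg)      an application
#   ("pred", name, [args])  an atomic predicate such as borders(y, x)
#   "x"                     a variable, as a bare string
def substitute(term, var, value):
    """Replace free occurrences of `var` in `term` with `value`.
    Assumes bound variable names are all distinct (otherwise an
    alpha-conversion pass would be needed to avoid capture)."""
    if isinstance(term, str):
        return value if term == var else term
    tag = term[0]
    if tag == "lam":
        _, v, body = term
        return term if v == var else ("lam", v, substitute(body, var, value))
    if tag == "app":
        return ("app", substitute(term[1], var, value),
                substitute(term[2], var, value))
    if tag == "pred":
        return ("pred", term[1], [substitute(a, var, value) for a in term[2]])

def beta(term):
    """One beta-conversion step: (lam x. body)(arg) => body[x := arg]."""
    assert term[0] == "app" and term[1][0] == "lam"
    _, (_, var, body), arg = term
    return substitute(body, var, arg)

# lam x. lam y. borders(y, x) applied to texas  =>  lam y. borders(y, texas)
expr = ("app", ("lam", "x", ("lam", "y", ("pred", "borders", ["y", "x"]))),
        "texas")
print(beta(expr))  # ('lam', 'y', ('pred', 'borders', ['y', 'texas']))
```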

SLIDE 14

λ-SCFG: SCFG+λ-calculus

l Reducing semantic parsing problem to SCFG

parsing problem

l Using λ-calculus to handle semantic

specific phenomenon

l Rule example

l X à < Every X1 , λ f.∀x ( f (x))⊲X1 >

14

(Wong & Mooney, 2007)

SLIDE 15

λ-SCFG: SCFG+λ-calculus

NL: Every boy likes a star

Rules:
  r1: S → <X1, X1>
  r2: X → <Every X1, λf.∀x(f(x)) ⊲ X1>
  r3: X → <X1 X2, λf.λg.λx.f(x) → g(x) ⊲ X1 ⊲ X2>
  r4: X → <boy, λx.boy(x)>
  r5: X → <X1, λf.λx.∃y(f(x, y)) ⊲ X1>
  r6: X → <X1 a star, λf.λx.λy.human(y) ⋀ pop(y) ⋀ f(x, y) ⊲ X1>
  r7: X → <like, λx.λy.like(x, y)>

Derivation:
  <S1, S1>
  ⇒ <X2, X2>  (r1)
  ⇒ <Every X3, λf.∀x.(f(x)) ⊲ X3>  (r2)
  ⇒ <Every X4 X5, λf.λg.∀x.(f(x) → g(x)) ⊲ X4 ⊲ X5>  (r3)
  ⇒ <Every boy X5, λg.∀x.(boy(x) → g(x)) ⊲ X5>  (r4)
  ⇒ <Every boy X6, λf.∀x.(boy(x) → ∃y(f(x, y))) ⊲ X6>  (r5)
  ⇒ <Every boy X7 a star, λf.∀x.(boy(x) → ∃y(human(y) ⋀ pop(y) ⋀ f(x, y))) ⊲ X7>  (r6)
  ⇒ <Every boy likes a star, ∀x.(boy(x) → ∃y(human(y) ⋀ pop(y) ⋀ like(x, y)))>  (r7)

SLIDE 16

GHKM

l The GHKM algorithm extracts STSG rules

from aligned tree-string pairs

16

(Galley et al., 2004)

SLIDE 17

GHKM

• The GHKM algorithm extracts STSG rules from aligned tree-string pairs
• Our work: extending this algorithm to induce λ-SCFG rules

SLIDE 18

Outline

l Background l Rule extraction algorithm l Modeling l Experiments

l Conclusion

18

SLIDE 19

Overview

NL: Every boy likes a star
MR: ∀x.(boy(x) → ∃y(human(y) ⋀ pop(y) ⋀ like(x, y)))
      ↓
GHKM Rule Extractor
      ↓   e.g., X → <Every X1, λf.∀x(f(x)) ⊲ X1>,  X → <boy, λx.boy(x)>
Parameter estimation
      ↓
Semantic Parser

SLIDE 20

Rule Extraction Algorithm

l Outline

  • 1. Building training examples

1.

Transforming logical forms to trees

2.

Aligning trees with sentences

  • 2. Identifying frontier nodes
  • 3. Extracting minimal rules
  • 4. Extracting composed rules

20
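A rough skeleton of the four steps, with hypothetical helper names (a sketch of the intended control flow under these assumptions, not the paper's implementation):

```python
# Each stub marks where the corresponding step would be implemented.
def logical_form_to_tree(mr):
    """Step 1.1: transform a logical form into a tree of atomic tokens."""
    raise NotImplementedError

def align_tree_with_sentence(tree, sentence):
    """Step 1.2: link tree nodes to the words they correspond to."""
    raise NotImplementedError

def identify_frontier_nodes(tree, alignment):
    """Step 2: find the nodes at which rules may be cut out."""
    raise NotImplementedError

def extract_minimal_rules(tree, sentence, frontier):
    """Step 3: extract the smallest rule rooted at each frontier node."""
    raise NotImplementedError

def extract_composed_rules(minimal_rules):
    """Step 4: combine adjacent minimal rules into larger composed rules."""
    raise NotImplementedError

def extract_rules(sentence, mr):
    tree = logical_form_to_tree(mr)
    alignment = align_tree_with_sentence(tree, sentence)
    frontier = identify_frontier_nodes(tree, alignment)
    minimal = extract_minimal_rules(tree, sentence, frontier)
    return minimal + extract_composed_rules(minimal)
```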

SLIDE 21

Building Training Examples

NL: Every boy likes a star
MR: ∀x.(boy(x) → ∃y(human(y) ⋀ pop(y) ⋀ like(x, y)))

SLIDE 24

Building Training Examples


∀x.(boy(x) → ∃y(human(y) ⋀ pop(y) ⋀ like(x, y)))
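The resulting tree is an image on the original slide; a plausible encoding (my assumption, guided by slide 28, where the tree nodes are labeled ∀x, →, boy, ∃y, ⋀, like) treats each atomic logical-form token as a node:

```python
# An assumed nested-tuple encoding (label, children) of the tree built
# from the logical form; the exact tree shape in the paper may differ.
TREE = ("∀x", [("→", [
    ("boy", []),
    ("∃y", [("⋀", [("human", []), ("pop", []), ("like", [])])]),
])])

def show(node, depth=0):
    """Print the tree with indentation reflecting depth."""
    label, children = node
    print("  " * depth + label)
    for child in children:
        show(child, depth + 1)

show(TREE)
```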

SLIDE 26

Identifying Frontier Nodes

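A sketch of the frontier-node test from GHKM (the standard definition of Galley et al., 2004, applied here to a logical-form tree aligned to a sentence; the toy tree, alignment, and unique-label assumption are mine, for illustration). span(n) collects the sentence positions aligned to nodes in n's subtree; n is a frontier node iff the contiguous closure of span(n) does not overlap positions aligned to nodes that are neither in n's subtree nor ancestors of n.

```python
SENT = ["Every", "boy", "likes", "a", "star"]
TREE = ("∀x", [("→", [
    ("boy", []),
    ("∃y", [("⋀", [("human", []), ("pop", []), ("like", [])])]),
])])
# "star" aligns to both human and pop, so neither can be cut out alone.
ALIGN = {"∀x": {0}, "boy": {1}, "like": {2}, "human": {4}, "pop": {4}}

def walk(node, ancestors, out):
    """Preorder traversal collecting (node, ancestor-label-set) pairs."""
    out.append((node, ancestors))
    for child in node[1]:
        walk(child, ancestors | {node[0]}, out)

def subtree_labels(node):
    return {node[0]} | {l for c in node[1] for l in subtree_labels(c)}

def frontier_labels(root):
    pairs = []
    walk(root, set(), pairs)
    result = []
    for node, ancestors in pairs:
        inside = subtree_labels(node)
        span = {p for lbl in inside for p in ALIGN.get(lbl, ())}
        complement = {p for other, _ in pairs
                      if other[0] not in inside and other[0] not in ancestors
                      for p in ALIGN.get(other[0], ())}
        closure = set(range(min(span), max(span) + 1)) if span else set()
        if span and not closure & complement:
            result.append(node[0])
    return result

print(frontier_labels(TREE))  # ['∀x', '→', 'boy', '∃y', '⋀', 'like']
```

Note how human and pop fail the test: the word "star" is aligned to both, so neither node's span can be separated from the other's.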

SLIDE 28

Extracting Minimal Rules

A minimal rule is extracted for each frontier node:

∀x:   X → <Every X1, λf.∀x(f(x)) ⊲ X1>
→:    X → <X1 X2, λf.λg.λx.f(x) → g(x) ⊲ X1 ⊲ X2>
boy:  X → <boy, λx.boy(x)>
∃y:   X → <X1, λf.λx.∃y(f(x, y)) ⊲ X1>
⋀:    X → <X1 a star, λf.λx.λy.human(y) ⋀ pop(y) ⋀ f(x, y) ⊲ X1>
like: X → <like, λx.λy.like(x, y)>

SLIDE 29

Composed Rule Extraction

  X → <X1 X2, λf.λg.λx.f(x) → g(x) ⊲ X1 ⊲ X2>
+ X → <boy, λx.boy(x)>
+ X → <X1, λf.λx.∃y(f(x, y)) ⊲ X1>
= X → <boy X1, λf.λx.boy(x) → ∃y(f(x, y)) ⊲ X1>

SLIDE 30

Outline

l Background l Rule extraction algorithm l Modeling l Experiments

l Conclusion

30

SLIDE 31

l Log-linear model + MERT training l Target

Modeling

31

ˆ e = e argmax

D s.t. s(D)≡sw D

( )

" # $ % & '

w D

( ) =

hi r

( )

r∈D

λi ×h4 D

( )

λ4 ×h5 D

( )

λ5 i=1 3

h

1 X → s,e

( ) = p e | s

( )

h2 X → s,e

( ) = plex s | e

( )

h3 X → s,e

( ) = plex e | s

( )

h4 X → s,e

( ) = ps e D

( )

( )

h5 X → s,e

( ) = exp D ( )
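To make the structure of w(D) concrete, a sketch with made-up feature values and weights (purely illustrative; none of these numbers come from the paper):

```python
import math

# rule_feats: per-rule (h1, h2, h3) = (p(e|s), plex(s|e), plex(e|s)).
# lambdas: (λ1, ..., λ5), the MERT-tuned weights.
# h4, h5 score the whole derivation rather than individual rules.
def w(rule_feats, lambdas, h4, h5):
    score = 1.0
    for i in range(3):                                  # rule-local features
        score *= math.prod(h[i] for h in rule_feats) ** lambdas[i]
    return score * h4 ** lambdas[3] * h5 ** lambdas[4]  # derivation-level

rule_feats = [(0.5, 0.4, 0.3), (0.2, 0.6, 0.7)]  # two rules, made-up values
print(w(rule_feats, (1.0, 0.5, 0.5, 0.3, -0.2),
        h4=0.05, h5=math.exp(2)))                 # h5 = exp(|D|), |D| = 2
```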

SLIDE 32

Outline

l Background l Rule extraction algorithm l Modeling l Experiments

l Conclusion

32

SLIDE 33

Experiments

l Dataset: GEOQUERY

l 880 English questions with corresponding Prolog

logical forms

33

answer(traverse(next_to(stateid(‘texas’))))

Semantic Parsing

Which rivers run through the states bordering Texas?

Query

Arkansas, Canadian, Cimarron, Gila, Mississippi, Rio Grande …

Answer

(Kate & Wong, ACL 2010 Tutotial)

SLIDE 34

Experiments

l Dataset: GEOQUERY

l 880 English questions with corresponding Prolog

logical forms

l Evaluation metrics

34

precision = |C | |G |,recall = |C | |T |, F − measure = 2⋅ precision⋅recall precision +recall
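A quick worked example, assuming the standard GEOQUERY convention that C is the set of correctly answered questions, G the questions for which the parser produced an output, and T the full test set (the counts below are hypothetical):

```python
# Hypothetical counts: |C| correct, |G| produced outputs, |T| test questions.
C, G, T = 250, 270, 280
precision = C / G
recall = C / T
f_measure = 2 * precision * recall / (precision + recall)
print(f"P = {precision:.3f}, R = {recall:.3f}, F = {f_measure:.3f}")
# P = 0.926, R = 0.893, F = 0.909
```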

SLIDE 35

Experiments

System                       P      R      F
Independent Test Set
  Z&C 2005                   96.3   79.3   87.0
  Z&C 2007                   95.5   83.2   88.9
  Kwiatkowski et al. (2010)  94.1   85.0   89.3
Cross Validation Results
  Kate et al. (2005)         89.0   54.1   67.3
  Wong and Mooney (2006)     87.2   74.8   80.5
  Kate and Mooney (2006)     93.3   71.7   81.1
  Lu et al. (2008)           89.3   81.5   85.2
  Ge and Mooney (2005)       95.5   77.2   85.4
  Wong and Mooney (2007)     92.0   86.6   89.2
  this work                  93.0   87.6   90.2

SLIDE 36

Experiments

• F-measure for different languages

  System                       en     ge     el     th
  Wong and Mooney (2006)       77.7   74.9   78.6   75.0
  Lu et al. (2008)             81.0   68.5   74.6   76.7
  Kwiatkowski et al. (2010)    82.1   75.0   73.7   66.4
  Jones et al. (2012)          79.3   74.6   75.4   78.2
  this work                    84.2   74.6   79.4   76.7

  * en - English, ge - German, el - Greek, th - Thai

SLIDE 38

Advantages

l Feasible to extract rules with varying

granularities in a principled way

l The widely used dataset only has 880 training

examples

l Alleviating the data sparseness problem

l Treating atomic logical form tokens as tree nodes

instead of context free grammar (CFG) production

l Robust to the nonisomorphism between NL

sentences and logical forms

38

SLIDE 39

Outline

l Background l Rule extraction algorithm l Modeling l Experiments

l Conclusion

39

SLIDE 40

Conclusion

l We have presented an extended GHKM

algorithm for inducing λ-SCFG and achieved state-of-the-art performance

l Future work

l Better alignment model l Investigate tree binarization to further improve

rule coverage

l Use EM or Monte Carlo methods to better

estimate λ-SCFG rule probabilities

40