SLIDE 1


Language Generation via DAG Transduction

Yajie Ye, Weiwei Sun and Xiaojun Wan {yeyajie,ws,wanxiaojun}@pku.edu.cn

Institute of Computer Science and Technology, Peking University

July 17, 2018

SLIDES 2–3

Overview

1. Background
2. Formal Models
3. Our DAG Transducer
4. Evaluation

SLIDES 4–5

An NLG System Architecture

Communicative Goal → Document Planning → Document Plans → Microplanning → Sentence Plans → Linguistic Realisation → Surface Text

In this paper, we study surface realization, i.e. mapping meaning representations to natural language sentences.

Reference

Ehud Reiter and Robert Dale. 2000. Building Natural Language Generation Systems. Cambridge University Press.

SLIDES 6–8

Meaning Representation

  • Logic forms, e.g. lambda calculus
  • Feature structures
  • This paper: Graphs!
SLIDES 9–12

Graph-Structured Meaning Representation

Different kinds of graph-structured semantic representations:

  • Semantic Dependency Graphs (SDP)
  • Abstract Meaning Representations (AMR)
  • Dependency-based Minimal Recursion Semantics (DMRS)
  • Elementary Dependency Structures (EDS)

[Figure: an example EDS graph with nodes _want_v_1, _the_q, _boy_n_1, _the_q, _believe_v_1, pronoun_q, _girl_n_1 and pron, connected by edges labeled BV, ARG1 and ARG2.]

SLIDE 13

Type-Logical Semantic Graph

EDS graphs are grounded in type-logical semantics. They are usually very flat and multi-rooted graphs.

[Figure: the EDS graph for "The boy wants the girl to believe him.", with nodes _want_v_1, _the_q, _boy_n_1, _the_q, _believe_v_1, pronoun_q, _girl_n_1 and pron, connected by edges labeled BV, ARG1 and ARG2.]

SLIDES 14–16

Previous Work

1. Sequence-to-sequence models (AMR-to-text)
2. Synchronous Node Replacement Grammar (AMR-to-text)
3. Other unification grammar-based methods

References

Ioannis Konstas, Srinivasan Iyer, Mark Yatskar, Yejin Choi, and Luke Zettlemoyer. 2017. Neural AMR: Sequence-to-sequence models for parsing and generation.

Linfeng Song, Xiaochang Peng, Yue Zhang, Zhiguo Wang, and Daniel Gildea. 2017. AMR-to-text generation with synchronous node replacement grammar.

John Carroll and Stephan Oepen. 2005. High efficiency realization for a wide-coverage unification grammar.

SLIDE 17

Outline

1. Background
2. Formal Models
3. Our DAG Transducer
4. Evaluation

SLIDE 18

Formalisms for Strings, Trees and Graphs

The Chomsky hierarchy, with each grammar class and its abstract machine:

  • Type-0: unrestricted grammars (Turing machine)
  • Type-1: context-sensitive grammars (linear-bounded automaton)
  • Tree-adjoining grammars (embedded pushdown automaton)
  • Type-2: context-free grammars (nondeterministic pushdown automaton)
  • Type-3: regular grammars (finite automaton)

Manipulating graphs: graph grammars and DAG automata.

SLIDE 19

Existing System

David Chiang, Frank Drewes, Daniel Gildea, Adam Lopez and Giorgio Satta. Weighted DAG Automata for Semantic Graphs. (The longest NLP paper that I’ve ever read.)

SLIDE 20

DAG Automata

A weighted DAG automaton is a tuple M = ⟨Σ, Q, δ, K⟩.

A transition rewrites the states q1, · · · , qm on the incoming edges of a node labeled σ into the states r1, · · · , rn on its outgoing edges, with weight ω:

{q1, · · · , qm} −σ/ω→ {r1, · · · , rn}

SLIDE 21

DAG Automata

  • A run of M on a DAG D = ⟨V, E, ℓ⟩ is an edge labeling function ρ : E → Q.
  • The weight of ρ is the product of the weights of all local transitions:

δ(ρ) = ⊗_{v∈V} δ[ ρ(in(v)) −ℓ(v)→ ρ(out(v)) ]
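To make the definition concrete, here is a minimal sketch of scoring a run against an explicit transition table. The data layout (sorted tuples standing in for multisets of states) and the use of ordinary multiplication for ⊗ are assumptions made for this sketch, not part of the paper.

```python
# Minimal sketch: computing the weight of a run of a weighted DAG automaton.
# Assumptions: states are strings, a sorted tuple stands in for a multiset of
# states, and the semiring product ⊗ is ordinary float multiplication.

def run_weight(nodes, edges, labels, rho, transitions):
    """nodes: iterable of node ids
    edges: dict edge_id -> (src, dst)
    labels: dict node_id -> node label, e.g. '_want_v_1'
    rho: dict edge_id -> state (the run, i.e. the edge labeling function)
    transitions: dict (in_states, label, out_states) -> weight
    Returns the weight of the run; 0.0 means some node has no licensed transition."""
    weight = 1.0
    for v in nodes:
        in_states = tuple(sorted(rho[e] for e, (src, dst) in edges.items() if dst == v))
        out_states = tuple(sorted(rho[e] for e, (src, dst) in edges.items() if src == v))
        w = transitions.get((in_states, labels[v], out_states), 0.0)
        if w == 0.0:
            return 0.0   # rejected: no transition matches this node's states
        weight *= w      # ⊗ instantiated as multiplication
    return weight
```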

SLIDES 22–29

DAG Automata: Toy Example

Input: the graph for "John wants to go.", with nodes _want_v_1, _go_v_1, proper_q and named(John).

Recognition rules (the states themselves are drawn graphically on the slides):

  • {} −_want_v_1→ { , }
  • {} −proper_q→ { }
  • { } −_go_v_1→ { }
  • { } −_go_v_1→ { }
  • { , , } −named(John)→ {}

The slides step through candidate edge labelings: one labeling gets stuck (Failed!), another labels every edge consistently and is accepted (Accept!).

SLIDE 30

Existing System

Daniel Quernheim and Kevin Knight. 2012. Towards probabilistic acceptors and transducers for feature structures.

SLIDES 31–35

DAG-to-Tree Transducer

A derivation on the graph with nodes WANT, BELIEVE, BOY and GIRL:

q(WANT)
⇒ S( qnom(BOY) wants qinf(BELIEVE) )
⇒ S( qnom(BOY) wants INF( qacc(GIRL) to believe qacc(BOY) ) )
⇒ S( NP(the boy) wants INF( NP(the girl) to believe NP(him) ) )

i.e. "the boy wants the girl to believe him".

Challenges for DAG-to-tree transduction on EDS graphs:

  • Cannot easily reverse the directions of edges
  • Cannot easily handle multiple roots
SLIDE 36

Outline

1. Background
2. Formal Models
3. Our DAG Transducer
4. Evaluation

SLIDES 37–43

Our DAG-to-Program Transducer

The basic idea:

  • Rewriting: directly generate a new data structure piece by piece while recognizing an input DAG.
  • Obtain target structures as side effects of the DAG recognition.

Input: the graph for "John wants to go." (nodes _want_v_1, _go_v_1, proper_q, named(John)).

The output of our transducer is a program:

S = x21 + want + x11
x11 = to + go
x41 = ϵ
x21 = x41 + John

⇒ S = John want to go
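To show how such a program yields the lemma sequence, here is a minimal sketch of evaluating it by substitution. The dictionary layout and the evaluate helper are assumptions made for illustration; the paper's decoder is not claimed to work this way.

```python
# Minimal sketch: evaluating the transducer's output program into a lemma sequence.
# A program maps each variable to a list of tokens; a token is either another
# variable, a lemma, or "" standing for the empty string ϵ.

def evaluate(program, start="S"):
    def expand(var):
        tokens = []
        for t in program[var]:
            if t in program:        # another variable: expand it recursively
                tokens.extend(expand(t))
            elif t:                 # a lemma; "" (i.e. ϵ) contributes nothing
                tokens.append(t)
        return tokens
    return " ".join(expand(start))

# The toy program from the slide:
toy_program = {
    "S":   ["x21", "want", "x11"],
    "x11": ["to", "go"],
    "x41": [""],                    # x41 = ϵ
    "x21": ["x41", "John"],
}

print(evaluate(toy_program))        # -> "John want to go"
```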

SLIDES 44–46

Transduction Rules

A transduction rule pairs a recognition part (a valid DAG automaton transition) with a generation part (one or more statement templates):

Recognition part: {} −_want_v_1→ { , }
Generation part:  S = v + L + v

We use parameterized states of the form label(number, direction), where direction ranges over unchanged, empty and reversed:

Recognition part: {} −_want_v_1→ {VP(1,u), NP(1,u)}
Generation part:  S = vNP(1,u) + L + vVP(1,u)

SLIDES 47–52

Toy Example

Q = {DET(1,r), Empty(0,e), VP(1,u), NP(1,u)}

Rule 1: {} −proper_q→ {DET(1,r)}               vDET(1,r) = ϵ
Rule 2: {} −_want_v_1→ {VP(1,u), NP(1,u)}      S = vNP(1,u) + L + vVP(1,u)
Rule 3: {VP(1,u)} −_go_v_1→ {Empty(0,e)}       vVP(1,u) = to + L
Rule 4: {NP(1,u), DET(1,r)} −named→ {}         vNP(1,u) = vDET(1,r) + L

Input graph: _want_v_1, _go_v_1, proper_q, named(John), with edges e1–e4.

Recognition: find an edge labeling function ρ. The slides label the edges one by one with VP(1,u), NP(1,u), Empty(0,e) and DET(1,r) until every node matches a rule (Accept!). The red dashed edges make up an intermediate graph T(ρ).

Instantiation: in the statement templates, replace vl(j,d) of edge ei with the variable xij, and replace L with the node's output string:

S = vNP(1,u) + L + vVP(1,u)
⇓
S = x21 + want + x11
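A minimal sketch of this instantiation step in code. The token encoding ('v:STATE' markers) and the edge-to-state mapping are assumptions made for illustration, with edge indices following the toy example (e2 carries NP(1,u), e1 carries VP(1,u)):

```python
# Illustrative sketch of instantiating a statement template: replace v_{l(j,d)}
# of edge e_i with the variable x_ij, and L with the node's own output string.
# The 'v:STATE' token encoding is an assumption made for this sketch.

def instantiate(template, edge_of_state, lemma):
    """template: list of tokens; 'v:STATE' refers to the edge carrying STATE,
    'L' refers to the node's output string, anything else is a literal.
    edge_of_state: dict STATE -> (edge index i, state number j)."""
    out = []
    for tok in template:
        if tok == "L":
            out.append(lemma)
        elif tok.startswith("v:"):
            i, j = edge_of_state[tok[2:]]
            out.append(f"x{i}{j}")
        else:
            out.append(tok)
    return " ".join(out)

# Rule 2 applied at the _want_v_1 node: edge e2 carries NP(1,u), edge e1 carries VP(1,u).
template = ["S", "=", "v:NP(1,u)", "+", "L", "+", "v:VP(1,u)"]
edges = {"NP(1,u)": (2, 1), "VP(1,u)": (1, 1)}
print(instantiate(template, edges, "want"))   # -> "S = x21 + want + x11"
```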

SLIDE 53

DAG Transduction-Based NLG

A general framework for DAG transduction-based NLG:

Semantic Graph → DAG Transducer → Sequential Lemmas → Seq2seq Model → Surface String

SLIDE 54

Outline

1. Background
2. Formal Models
3. Our DAG Transducer
4. Evaluation

SLIDES 55–59

Inducing Transduction Rules

Example: "the decline is even steeper than in September", he said.

[Figure: the EDS graph of this sentence, with nodes _the_q<0:4>, _decline_n_1<5:12>, _even_x_deg<16:20>, _steep_a_1<21:28>, comp, _in_p_temp<34:36>, mofy<37:48>, focus_d, pronoun_q, pron<49:51>, _say_v_to<52:57> and proper_q, connected by edges e1–e12. The slides first annotate the edges with character spans (e.g. e4 {<34:48>}, e11 {<0:12>}) and then with syntactic labels over those spans (e.g. e4 {PP<34:48>}, e11 {NP<0:12>}, e3 {S<0:48>}).]

Rule induction proceeds in four steps:

1. Finding an intermediate tree
2. Assigning spans
3. Assigning labels
4. Generating statement templates

SLIDES 60–61

Inducing Transduction Rules

Two rules induced from this example:

{ADV(1,r)} −comp→ {PP(1,u), ADV_PP(2,r)}
  vADV_PP(1,r) = vADV(1,r)
  vADV_PP(2,r) = than + vPP(1,u)

{PP(1,u)} −_in_p_temp→ {NP(1,u)}
  vPP(1,u) = in + vNP(1,u)
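The slides do not spell out step 4 in code; a plausible, simplified reading, sketched below under the assumption that a template's right-hand side is obtained by ordering the node's own string and the variables of its labeled edges by the start of their character spans, is:

```python
# Illustrative sketch of step 4 (generating a statement template), under the
# assumption that the right-hand side is the node's lemma plus the variables of
# its labeled edges, ordered by span start. Names and data layout are hypothetical.

def make_template(lhs_state, lemma, lemma_span, edge_items):
    """lhs_state: state whose variable the template defines, e.g. 'PP(1,u)'
    lemma: the node's own output string, lemma_span: (start, end)
    edge_items: list of (state, (start, end)) for the right-hand-side edges."""
    parts = [("L", lemma_span)] + [(f"v{state}", span) for state, span in edge_items]
    parts.sort(key=lambda p: p[1][0])          # order by span start
    rhs = " + ".join(lemma if tok == "L" else tok for tok, _ in parts)
    return f"v{lhs_state} = {rhs}"

# The _in_p_temp<34:36> node with an outgoing edge labeled NP<37:48>:
print(make_template("PP(1,u)", "in", (34, 36), [("NP(1,u)", (37, 48))]))
# -> "vPP(1,u) = in + vNP(1,u)"
```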

SLIDE 62

NLG via DAG Transduction

Experimental set-up

  • Data: DeepBank + WikiWoods
  • Decoder: beam search (beam size = 128)
  • About 37,000 induced rules are obtained directly from the DeepBank training data by a group of heuristic rules.
  • Disambiguation: global linear model

Transducer       Lemmas   Sentences   Coverage
induced rules    89.44    74.94       67%

SLIDES 63–64

Fine-to-Coarse Transduction

To deal with the data sparsity problem, we use heuristic rules to generate extended rules by slightly changing an induced rule.

Given an induced rule:  {NP, ADJ} −X→ {}  with  vNP = vADJ + L

  • New rule generated by deleting:  {NP} −X→ {}  with  vNP = L
  • New rule generated by copying:   {NP, ADJ1, ADJ2} −X→ {}  with  vNP = vADJ1 + vADJ2 + L
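A minimal sketch of the two extension operations. The rule encoding (a tuple of incoming states, node label, outgoing states and template right-hand side) and the function names are assumptions made for illustration; the paper's set of heuristics is richer than this.

```python
# Illustrative sketch of generating extended rules from an induced rule by
# deleting or copying one of its incoming states.

def extend_by_deleting(rule, state):
    """Drop one incoming state and its variable from the template."""
    in_states, label, out_states, rhs = rule
    new_in = tuple(s for s in in_states if s != state)
    new_rhs = [t for t in rhs if t != f"v{state}"]
    return (new_in, label, out_states, new_rhs)

def extend_by_copying(rule, state, copies=2):
    """Duplicate one incoming state (e.g. ADJ -> ADJ1, ADJ2) and its variable."""
    in_states, label, out_states, rhs = rule
    new_states = tuple(f"{state}{k}" for k in range(1, copies + 1))
    new_in = tuple(s for s in in_states if s != state) + new_states
    new_rhs = []
    for t in rhs:
        if t == f"v{state}":
            new_rhs.extend(f"v{s}" for s in new_states)
        else:
            new_rhs.append(t)
    return (new_in, label, out_states, new_rhs)

induced = (("NP", "ADJ"), "X", (), ["vADJ", "L"])   # vNP = vADJ + L
print(extend_by_deleting(induced, "ADJ"))           # rhs becomes ["L"]           (vNP = L)
print(extend_by_copying(induced, "ADJ"))            # rhs becomes ["vADJ1", "vADJ2", "L"]
```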

SLIDE 65

NLG via DAG Transduction

Experimental set-up

  • Data: DeepBank + WikiWoods
  • Decoder: beam search (beam size = 128)
  • About 37,000 induced rules and 440,000 extended rules
  • Disambiguation: global linear model

Transducer                   Lemmas   Sentences   Coverage
induced rules                89.44    74.94       67%
induced and extended rules   88.41    74.03       77%

SLIDE 66

Fine-to-Coarse Transduction

During decoding, when neither an induced nor an extended rule is applicable, we use a Markov model to create a dynamic rule on the fly:

P({r1, · · · , rn} | C) = P(r1 | C) ∏_{i=2}^{n} P(ri | ri−1, C)

  • C = ⟨{q1, · · · , qm}, D⟩ represents the context.
  • r1, · · · , rn denotes the outgoing states.
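A minimal sketch of scoring one candidate sequence of outgoing states with such a first-order chain. The probability tables and the small back-off constant are assumptions made for this sketch, not the paper's estimator.

```python
# Illustrative sketch: P({r1,...,rn} | C) = P(r1 | C) * prod_i P(ri | r_{i-1}, C),
# with the context C encoding the incoming states and the node.

def dynamic_rule_prob(out_states, context, p_first, p_next, backoff=1e-6):
    """out_states: candidate outgoing states [r1, ..., rn]
    p_first[(context, r)]       ~ P(r | C)
    p_next[(context, prev, r)]  ~ P(r | prev, C)"""
    if not out_states:
        return 1.0
    prob = p_first.get((context, out_states[0]), backoff)
    for prev, cur in zip(out_states, out_states[1:]):
        prob *= p_next.get((context, prev, cur), backoff)
    return prob
```
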
SLIDE 67

NLG via DAG Transduction

Experimental set-up

  • Data: DeepBank + WikiWoods
  • Decoder: beam search (beam size = 128)
  • Other tool: OpenNMT

Transducer                             Lemmas   Sentences   Coverage
induced rules                          89.44    74.94       67%
induced and extended rules             88.41    74.03       77%
induced, extended and dynamic rules    82.04    68.07       100%
DFS-NN                                          50.45       100%
AMR-NN                                          33.8        100%
AMR-NRG                                         25.62       100%

SLIDE 68

Conclusion and Future Work

English Resource Semantics is fantastic!

Conclusion

  • The formalism works for graph-to-string mapping, not surprisingly
  • …, surprisingly

Future work

  • Is the decoder perfect? No, not even close.
  • Is the disambiguation model a neural one? No, graph embedding is non-trivial.

SLIDE 69


QUESTIONS? COMMENTS?