Unsupervised Concept-to-text Generation with Hypergraphs


Unsupervised Concept-to-text Generation with Hypergraphs. Ioannis Konstas, Mirella Lapata. Institute for Language, Cognition and Computation, University of Edinburgh. NAACL 2012, Montréal.


SLIDE 1

Unsupervised Concept-to-text Generation with Hypergraphs

Ioannis Konstas, Mirella Lapata

Institute for Language, Cognition and Computation University of Edinburgh

NAACL 2012, Montréal

Konstas, Lapata (ILCC) Unsupervised Concept-to-text Generation NAACL 2012, Montréal 1 / 25

SLIDE 2

Introduction

Introduction

Concept-to-text generation refers to the task of automatically producing textual output from nonlinguistic input (Reiter and Dale, 2000)

SLIDE 3

Introduction

Introduction

Temperature
  Time          Min   Mean   Max
  06:00-21:00   9     15     21

Wind Speed
  Time          Min   Mean   Max
  06:00-21:00   15    20     30

Cloud Sky Cover
  Time          Percent (%)
  06:00-09:00   25-50
  09:00-12:00   50-75

Wind Direction
  Time          Mode
  06:00-21:00   S

Cloudy, with a low around 10. South wind between 15 and 30 mph.

SLIDE 4

Introduction

Introduction

Flight
  direction   from        to
  oneway      edinburgh   montreal

Day
  day        dep/ar/ret
  saturday   departure

Search
  of     type     what
  fare   argmin   flight

Show me the cheapest one way flights from Edinburgh to Montreal leaving on Saturday

SLIDE 5

Introduction

Traditional NLG Pipeline

[Figure: pipeline diagram with the stages Content Selection and Surface Realisation; labels Input Data, Communicative Goal and Text]

SLIDE 13

Introduction

Our Approach

Temperature
  Time          Min   Mean   Max
  06:00-21:00   9     15     21

Wind Speed
  Time          Min   Mean   Max
  06:00-21:00   15    20     30

Cloud Sky Cover
  Time          Percent (%)
  06:00-09:00   25-50
  09:00-12:00   50-75

Wind Direction
  Time          Mode
  06:00-21:00   S

Cloudy, with a low around 10. South wind between 15 and 30 mph.

Partly cloudy, with a low around 9. Breezy, with a south wind between 15 and 30 mph.

Key idea: recast generation as a parsing problem

  1. Describe the structure of the input with a PCFG
  2. Convert PCFG to a hypergraph
  3. Goal: Find the most fluent and grammatical derivation

SLIDE 15

Introduction

Related Work

Angeli et al., 2010
  • Unified content selection and surface realisation
  • Obtain alignments from Liang et al. (2009)
  • Sequence of discriminative (log-linear) local decisions (records - fields - templates)

Our Approach
  • Unsupervised generative model
  • Joint content selection and surface realisation, breaks the traditional NLG pipeline
  • Domain independent, trainable end-to-end system

SLIDE 20

Introduction

Input

Input: database records d
Output: words w corresponding to some records of d
Each record r ∈ d has a type r.t and fields f
Fields have values f.v and types f.t (integer, categorical)

Cloud Sky Cover
  Time          Percent (%)
  06:00-09:00   25-50
  09:00-12:00   50-75

mostly cloudy,
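The input notation above can be sketched as a small data structure. This is only an illustration: the class names `Field` and `Record` are hypothetical, and treating the range "25-50" as a categorical value is an assumption of the sketch, not something the slides state.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Field:
    """A field f with a value f.v and a type f.t ("integer" or "categorical")."""
    v: Union[int, str]
    t: str


@dataclass
class Record:
    """A record r with a type r.t and its named fields."""
    t: str
    fields: Dict[str, Field]


# Database d with values copied from the weather example on the slides.
# Assumption: percentage ranges such as "25-50" are stored as categorical values.
d: List[Record] = [
    Record("temperature", {
        "time": Field("06:00-21:00", "categorical"),
        "min": Field(9, "integer"),
        "mean": Field(15, "integer"),
        "max": Field(21, "integer"),
    }),
    Record("skyCover", {
        "time": Field("06:00-09:00", "categorical"),
        "percent": Field("25-50", "categorical"),
    }),
    Record("skyCover", {
        "time": Field("09:00-12:00", "categorical"),
        "percent": Field("50-75", "categorical"),
    }),
]


def record_types(db: List[Record]) -> List[str]:
    """Return the distinct record types r.t that occur in the database."""
    return sorted({r.t for r in db})
```

Several records may share one type (here two skyCover records), which is why the record type r.t is kept separate from the record itself.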

SLIDE 21

Introduction

Grammar Definition

  1. S → R(start)
  2. R(r_i.t) → FS(r_j, start) R(r_j.t)
  3. R(r_i.t) → FS(r_j, start)
  4. FS(r, r.f_i) → F(r, r.f_j) FS(r, r.f_j)
  5. FS(r, r.f_i) → F(r, r.f_j)
  6. F(r, r.f) → W(r, r.f) F(r, r.f)
  7. F(r, r.f) → W(r, r.f)
  8. W(r, r.f) → α
  9. W(r, r.f) → g(f.v)

SLIDE 22

Introduction

Grammar Definition

Example instantiations of the rules:

  Rule 2: R(skyCover1.t) → FS(temperature1, start) R(temperature1.t)
  Rule 4: FS(wSpeed1, min) → F(wSpeed1, max) FS(wSpeed1, max)
  Rule 6: F(gust1, mean) → W(gust1, mean) F(gust1, mean)
  Rule 8: W(windDir1, mode) → southeast
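To make the rule schemas concrete, the following minimal sketch instantiates rules 1 to 7 for a toy database. The function name `instantiate_rules`, the string encoding of nonterminals and the toy records are all hypothetical; the lexical rules 8 and 9 and all rule probabilities are left out, so this is not the authors' implementation.

```python
from itertools import product


def instantiate_rules(records):
    """Instantiate grammar rules 1-7 for a database.

    records maps a record name to (record type, list of field names).
    Each rule is returned as (left-hand side, tuple of right-hand-side symbols).
    """
    rules = [("S", ("R(start)",))]  # rule 1
    prev_types = sorted({t for t, _ in records.values()}) + ["start"]

    # Rules 2 and 3: after a record of type r_i.t, choose the next record r_j.
    for prev_t, (name, (t, _)) in product(prev_types, sorted(records.items())):
        rules.append((f"R({prev_t})", (f"FS({name},start)", f"R({t})")))
        rules.append((f"R({prev_t})", (f"FS({name},start)",)))

    for name, (_, fields) in sorted(records.items()):
        # Rules 4 and 5: after field f_i of record r, choose the next field f_j.
        for prev_f, f in product(["start"] + fields, fields):
            rules.append((f"FS({name},{prev_f})", (f"F({name},{f})", f"FS({name},{f})")))
            rules.append((f"FS({name},{prev_f})", (f"F({name},{f})",)))
        # Rules 6 and 7: a field emits one or more words.
        for f in fields:
            rules.append((f"F({name},{f})", (f"W({name},{f})", f"F({name},{f})")))
            rules.append((f"F({name},{f})", (f"W({name},{f})",)))
    return rules


# Toy database (hypothetical): two records with two fields each.
rules = instantiate_rules({
    "skyCover1": ("skyCover", ["time", "%"]),
    "temp1": ("temperature", ["min", "max"]),
})
```

Even this toy database yields 45 rule instances, which suggests why the grammar is compiled into a packed hypergraph rather than enumerated derivation by derivation.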

SLIDE 29

Introduction

Hypergraphs

Definition

An ordered hypergraph H is a tuple ⟨N, E, t, R⟩, where N is a finite set of nodes, E is a finite set of hyperarcs, t ∈ N is a target node and R is the set of weights. Each hyperarc e ∈ E is a triple e = ⟨T(e), h(e), f(e)⟩, where h(e) ∈ N is its head node, T(e) ∈ N* is a set of tail nodes and f(e) is a monotonic weight function from R^|T(e)| to R.

[Figure: a hyperarc labelled f(e) connecting the nodes a, b and t]
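The definition can be sketched in code as follows. The names `Hyperarc` and `viterbi` are hypothetical, the weights are made up, and the sketch assumes the caller supplies the nodes in an order where every tail precedes its head. Leaf nodes are modelled as hyperarcs with no tails.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Hyperarc:
    head: str                 # h(e)
    tails: Tuple[str, ...]    # T(e)
    f: Callable[..., float]   # f(e): monotonic function of the tail weights


def viterbi(nodes: List[str], arcs: List[Hyperarc]) -> Dict[str, float]:
    """Best weight of every node; nodes must list tails before heads."""
    incoming: Dict[str, List[Hyperarc]] = {}
    for e in arcs:
        incoming.setdefault(e.head, []).append(e)

    best: Dict[str, float] = {}
    for n in nodes:
        candidates = [e.f(*(best[t] for t in e.tails)) for e in incoming.get(n, [])]
        if not candidates:
            raise ValueError(f"node {n!r} has no incoming hyperarc")
        best[n] = max(candidates)
    return best


# Toy hypergraph with target node t and tail nodes a, b (weights are made up).
arcs = [
    Hyperarc("a", (), lambda: 0.5),
    Hyperarc("b", (), lambda: 0.4),
    Hyperarc("t", ("a", "b"), lambda x, y: x * y * 0.9),
    Hyperarc("t", ("a",), lambda x: x * 0.3),
]
best = viterbi(["a", "b", "t"], arcs)
```

Monotonicity of f(e) is what lets the best weight of a head be computed from the best weights of its tails alone.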

SLIDE 32

Introduction

Hypergraph Construction

Map standard weighted CYK algorithm to hypergraph

Rule: R(r_i.t) → FS(r_j, start) R(r_j.t)

Hyperarc: head R(skyCover1.t), tails FS(temp1.t, start) and R(temp1.t)

f(e) = f(FS_{1,2}(temp1.t, start)) ⊗ f(R_{2,3}(temp1.t)) ⊗ w(R(skyCover1.t) → FS(temp1, start) R(temp1.t))
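As a rough illustration of the CYK mapping, the sketch below enumerates one hyperarc per (start, split, end) triple for a single binary rule and computes a hyperarc weight. The function names are hypothetical, and reading ⊗ as ordinary multiplication of probabilities is an assumption of this sketch.

```python
def binary_hyperarcs(n_words, head="R", left="FS", right="R"):
    """Enumerate hyperarcs for a binary rule head -> left right over all spans.

    Each hyperarc is (head node, (left tail, right tail)); a node is
    (symbol, start, end), matching the span subscripts on the slide.
    """
    arcs = []
    for width in range(2, n_words + 1):
        for i in range(0, n_words - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two tails
                arcs.append(((head, i, j), ((left, i, k), (right, k, j))))
    return arcs


def arc_weight(tail_weights, rule_weight):
    """f(e): product of the tail weights and the rule weight w(rule)."""
    weight = rule_weight
    for w in tail_weights:
        weight *= w
    return weight


arcs = binary_hyperarcs(3)
```

For a three-word text this gives four hyperarcs, one of which is head (R, 0, 3) with tails (FS, 0, 1) and (R, 1, 3).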

SLIDE 35

Introduction

Hypergraph Example

[Figure: fragment of the hypergraph. Node labels shown:]

  S_{0,7}
  R_{0,7}(start)
  FS_{0,1}(skyCover1, start)   R_{1,7}(skyCover1.t)
  FS_{0,2}(skyCover1, start)   R_{1,7}(temp1.t)
  FS_{0,1}(temp1, start)   ···
  F_{0,1}(skyCover1, %)   F_{0,1}(skyCover1, time)
  F_{0,2}(skyCover1, %)   F_{0,2}(skyCover1, time)
  W_{0,1}(skyCover1, %)   W_{0,1}(skyCover1, time)   sunny
  FS_{1,2}(skyCover1, start)   R_{2,7}(skyCover1.t)
  FS_{1,2}(temp1, start)   R_{2,7}(temp1.t)
  F_{1,2}(temp1, max)   F_{1,2}(temp1, min)
  W_{1,2}(temp1, min)   g_{1,2}(min, v=10)
  W_{1,2}(temp1, max)   g_{1,2}(max, v=20)
  F_{1,2}(skyCover1, %)   F_{1,2}(skyCover1, time)
  W_{1,2}(skyCover1, %)   W_{1,2}(skyCover1, time)   with

EM Training: dynamic program similar to the inside-outside algorithm

SLIDE 37

Introduction

k-best Decoding

arg max_w P(w | d) = arg max_w P(w) · P(d | w)

Motivation: fluency and grammaticality
Nodes in hypergraph → +LM items (Huang and Chiang, 2007)
  e.g. R_{2,8}(temp1.t)^{a low ⋆ 15 degrees}
k-best Viterbi, cube pruning (Chiang, 2007)
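The decoding objective follows from Bayes' rule. The intermediate step below is not on the slide; it is the standard identity, written out here as a reading aid. The denominator P(d) does not depend on w, so it can be dropped from the maximisation:

```latex
\begin{align*}
\arg\max_{w} P(w \mid d)
  &= \arg\max_{w} \frac{P(w)\, P(d \mid w)}{P(d)} \\
  &= \arg\max_{w} P(w) \cdot P(d \mid w)
\end{align*}
```

Given the stated motivation (fluency and grammaticality) and the +LM items, P(w) is presumably where the n-gram language model enters.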

SLIDE 38

Introduction

k-best Decoding

Leaf nodes ε emit a k-best list of words

W_{0,1}(skyCover1.t, %)
  ε
  k-best list: mostly, cloudy, sunny, ···

SLIDE 41

Introduction

k-best Decoding

FS_{0,5}(skyCover1.t, start)
  mostly cloudy ⋆ the morning
  mostly cloudy ⋆ after 11am
  mostly cloudy ⋆ then becoming
  ···

F_{0,2}(skyCover1.t, %)
  mostly cloudy
  mostly clouds
  cloudy ,
  ···

W_{4,5}(skyCover1.t, time)
  morning
  11am
  after
  ···

W_{0,1}(skyCover1.t, %)
  mostly
  cloudy
  sunny
  ···

W_{1,2}(skyCover1.t, %)
  mostly
  cloudy
  sunny
  ···
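How two k-best lists are combined into the k-best list of their head node can be sketched with a lazy best-first search over the grid of pairs, which is the core idea behind cube pruning. The function name and all scores are hypothetical. The sketch also omits the language model score; with it the combined scores are no longer monotonic in the grid, which is why cube pruning is an approximate search in practice.

```python
import heapq


def k_best_combine(left, right, k, rule_weight=1.0):
    """Combine two k-best lists into the k best concatenations.

    left and right are lists of (score, string), sorted by descending score.
    Scores are multiplied; no language model is applied in this sketch.
    """
    if not left or not right:
        return []

    def score(i, j):
        return left[i][0] * right[j][0] * rule_weight

    heap = [(-score(0, 0), 0, 0)]  # max-heap via negated scores
    seen = {(0, 0)}
    out = []
    while heap and len(out) < k:
        neg, i, j = heapq.heappop(heap)
        out.append((-neg, left[i][1] + " " + right[j][1]))
        # Push the two neighbours of the popped grid cell.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-score(ni, nj), ni, nj))
    return out


# Made-up scores for two one-word lists, combined into a two-word list.
left = [(0.6, "mostly"), (0.3, "sunny")]
right = [(0.5, "cloudy"), (0.4, "clouds")]
top3 = k_best_combine(left, right, k=3)
```

Only the grid cells next to already popped cells are ever scored, so the full cross product of the two lists is not enumerated.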

SLIDE 44

Results

Experimental Setup

  • RoboCup: simulated sportscasting [214 words] (Chen and Mooney, 2008)
  • WeatherGov: weather reports [345 words] (Liang et al., 2009)
  • Atis: mapping from λ-version [927 words] (Zettlemoyer and Collins, 2007)

Model parameters estimated on dev set (k-best, n-grams)
Automatic evaluation: BLEU-4
Human evaluation (MTurk): fluency, semantic correctness

SLIDE 46

Results

Results: Automatic Evaluation

  System   RoboCup   WeatherGov   Atis
  1-Best   10.79     8.64         11.85
  k-Best   30.90     33.70        29.30
  Angeli   28.70     38.40        26.77

RoboCup results with fixed content selection; WeatherGov and Atis results with content selection and surface realization.

SLIDE 50

Results

Results: Human Evaluation

RoboCup
  System   Fluency   SemCor
  1-Best   2.47      2.33
  k-Best   4.31      3.96
  Angeli   4.03      3.70
  Human    4.47      4.37

WeatherGov
  System   Fluency   SemCor
  1-Best   1.82      2.05
  k-Best   3.92      3.30
  Angeli   4.26      3.60
  Human    4.61      4.03

Atis
  System   Fluency   SemCor
  1-Best   2.40      2.46
  k-Best   4.01      3.87
  Angeli   3.56      3.33
  Human    4.10      4.01

SLIDE 51

Results

Output

WeatherGov

Temperature
  Time          Min   Mean   Max
  06:00-21:00   30    38     44

Wind Speed
  Time          Min   Mean   Max
  06:00-21:00   6     6      7

Cloud Sky Cover
  Time          Percent (%)
  06:00-21:00   75-100

Wind Direction
  Time          Mode
  06:00-21:00   ENE

Precipitation Potential (%)
  Time          Min   Mean   Max
  06:00-21:00   9     20     35

Chance of Rain
  Time          Mode
  06:00-21:00   Slight Chance

k-Best: A chance of rain showers before 11am. Mostly cloudy, with a high near 44. East wind between 6 and 7 mph.

Angeli: A chance of showers. Patchy fog before noon. Mostly cloudy, with a high near 44. East wind between 6 and 7 mph. Chance of precipitation is 35%

H: A 40 percent chance of showers before 10am. Mostly cloudy, with a high near 44. East northeast wind around 7 mph.

SLIDE 52

Results

Output

Atis

Input:

Flight
  from        to
  milwaukee   phoenix

Day
  day        dep/ar/ret
  saturday   departure

Search
  type    what
  query   flight

k-Best: What are the flights from Milwaukee to Phoenix on Saturday

Angeli: Show me the flights between Milwaukee and Phoenix on Saturday

H: Milwaukee to Phoenix on Saturday

SLIDE 53

Results

Conclusions

  • Generation as parsing problem using the hypergraph framework
  • Unsupervised end-to-end generation system
  • Performance comparable to state-of-the-art
  • Future work: discriminative reranking (Konstas and Lapata, 2012b)

SLIDE 58

Results

Demo

Live Weather Forecast Generator

Cross domain
  • Model trained on: weather.gov
  • Demo runs on: wunderground.com
  • Discrepancies (e.g. no gust information, inferred fields)

[Figure: Live Data timeline marked 6:00, 13:00, 17:00, 21:00 and 6:00 (+1), with the labels Day and Night]

SLIDE 59

Results

Thank you

Questions?

SLIDE 60

Results

Human Evaluation (MTurk)

SLIDE 61

Results

Determining Text Length

Train a linear regression model
  • Idea: The more records and fields that have values in the database → the more facts need to be uttered
  • Input to the model: Flattened version of the database input, i.e. each feature is a record-field pair
  • Feature values: Values vs Counts of Fields
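A much simplified sketch of the length model follows. It flattens the database into record-field pairs as described, but then collapses them into a single feature (the count of pairs that carry a value) and fits a one-variable least-squares line. That collapse, the function names and the toy training data are all assumptions of this sketch; the slides describe one feature per record-field pair.

```python
def flatten(db):
    """Map each record-field pair to 1 if the field has a value, else 0."""
    return {(r, f): int(v is not None)
            for r, fields in db.items()
            for f, v in fields.items()}


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up training pairs: (number of valued record-field pairs, text length).
slope, intercept = fit_line([2, 4, 6], [5, 9, 13])

# Hypothetical database: the gust record has no value for its mean field.
db = {
    "temperature": {"min": 30, "max": 44},
    "gust": {"mean": None},
    "windDir": {"mode": "ENE"},
}
n_valued = sum(flatten(db).values())
predicted_length = slope * n_valued + intercept
```

A fuller version would keep one weight per record-field pair and use either the field values or their counts as feature values, as the last bullet suggests.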

SLIDE 62

Results

Output

RoboCup

Input:

Pass
  From       To
  purple10   purple11

k-Best: purple10 passes back to purple11

Angeli: purple10 passes to purple11

H: purple10 immediately passes to purple11
