slide-1
SLIDE 1

Learning to generate: Concept-to-text generation using machine learning

Ioannis Konstas

Institute for Language, Cognition and Computation University of Edinburgh

Aberdeen, NLG Summer School 21 July 2015

Konstas (ILCC) Concept-to-Text Generation 21 July 2015 1 / 56

slide-2
SLIDE 2

Introduction Motivation

Introduction

[Figure: domain ontology tree. Nodes: ROOT, exhibit, statue, kouros, exhibit7, exhibit12, complex-statue, portrait, imperial-portrait, coin, jewel, relief, ..., a-location, museum, museum1, archeological-site, ...]

[Figure: sensor data]

Action Records

Full Descriptor                               Time
SETTING;VENTIL;FiO2 (36%)                     10.30
MEDICATION;Morphine                           10.44
ACTION;CARE;TURN/ CHANGE POSITION;SUPINE      10.46-10.47
ACTION;RESP;HAND BABY                         10.47-10.51
SETTING;VENTIL;FiO2 (60%)                     10.47
ACTION;RESP;INTUBATE                          10.51-10.52

slide-5
SLIDE 5

Introduction Motivation

Introduction

Concept-to-text generation refers to the task of automatically producing textual output from nonlinguistic input (Reiter and Dale, 2000)

Example input records (record type (fields): values):

Wind Chill (Time, Min, Mean, Max): 06-21
Temperature (Time, Min, Mean, Max): 06-21, 52, 61, 70
Wind Speed (Time, Min, Mean, Max): 06-21, 11, 22, 29
Wind Direction (Time, Mode): 06-21, S
Gust (Time, Min, Mean, Max): 06-21, 20, 39
Precipitation Potential (Time, Min, Mean, Max): 06-21, 26, 81, 100
Sky Cover (Time, Percent (%)): 06-21, 75-100; 06-09, 75-100; 06-13, 50-75; 09-21, 75-100; 13-21, 75-100
Rain Chance (Time, Mode): 06-21, Def; 06-09, Lkly; 06-13, Def; 09-21, Def; 13-21, Def
Snow Chance (Time, Mode): 06-21, –; 06-09, –; 06-13, –; 09-21, –; 13-21, –
Sleet Chance (Time, Mode): 06-21, –; 06-09, –; 06-13, –; 09-21, –; 13-21, –
Freezing Rain Chance (Time, Mode): 06-21, –; 06-09, –; 06-13, –; 09-21, –; 13-21, –
Thunder Chance (Time, Mode): 06-21, Def; 06-09, Lkly; 06-13, Chc; 09-21, Def; 13-21, Def

Output text: Showers and thunderstorms. High near 70. Cloudy, with a south wind around 20 mph, with gusts as high as 40 mph. Chance of precipitation is 100%.

slide-7
SLIDE 7

Introduction Motivation

Introduction

Desktop Cmd Name Type left-click start button Start Cmd Name Type left-click settings button Location Name Type start menu button control panel window Start Target Cmd Name Type left-click control panel button Navigate Window Cmd Name Type left-click accounts and users window Context Menu Cmd Name Type left-click advanced tab Action Context Menu Cmd Name Type left-click advanced button Window Target Cmd Name Type double-click users and passwords item

Click start, point to settings, and then click control panel. Double-click users and passwords. On the advanced tab, click advanced.


slide-8
SLIDE 8

Introduction Motivation

Introduction

What has been done so far?

  • Expert knowledge deployed for the creation of hand-crafted rules (single domain)
  • Manually annotated corpora (discourse relations, alignments)
  • Breakdown of the process into a pipeline of modules

slide-12
SLIDE 12

Introduction Motivation

Introduction

What will we look into today?

  • Recast NLG into a generative model
  • Learn parameters from (un)annotated data (multiple domains)
  • Search for the best parameters that fit the input and decode into text

slide-16
SLIDE 16

Introduction Outline

Outline

  • Problem Formulation
  • Learning Alignments
  • Pipeline Approach
  • Joint Approaches

slide-18
SLIDE 18

Problem Formulation Key Idea

Input

Input: database records d
Output: words w corresponding to some records of d
Each record r ∈ d has a type r.t and fields f
Fields have values f.v and types f.t (integer, categorical, string)

Example: records of type Cloud Sky Cover with fields (Time, Percent (%)): 06:00-09:00, 25-50; 09:00-12:00, 50-75
Corresponding words: "mostly cloudy,"
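The input formulation can be sketched as a small data model. This is only an illustration, not code from the talk: the names `Field`, `Record`, `field_type` and `make_record` are hypothetical, and the heuristic that assigns integer, categorical or string types is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Field:
    name: str   # field name, e.g. "percent"
    value: str  # f.v, kept as raw text
    type: str   # f.t: "integer", "categorical" or "string"


@dataclass
class Record:
    type: str            # r.t, e.g. "skyCover"
    fields: List[Field]  # the fields f of the record


def field_type(value: str) -> str:
    """Guess f.t from the raw value (assumed heuristic, for illustration)."""
    if value.lstrip("-").isdigit():
        return "integer"
    if " " in value:
        return "string"
    return "categorical"


def make_record(rtype: str, **values: str) -> Record:
    """Build a record of type `rtype` from name=value pairs."""
    return Record(rtype, [Field(n, v, field_type(v)) for n, v in values.items()])


# The database d is simply a list of records.
d = [
    make_record("skyCover", time="06:00-09:00", percent="25-50"),
    make_record("skyCover", time="09:00-12:00", percent="50-75"),
    make_record("temperature", time="06:00-21:00", min="9", mean="15", max="21"),
]
```

Keeping values as raw strings defers any interpretation (for example rounding a temperature) to the model that generates words from them.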

slide-23
SLIDE 23

Problem Formulation Key Idea

Key Idea

Input records (record type (fields): values):

Temperature (Time, Min, Mean, Max): 06:00-21:00, 9, 15, 21
Wind Speed (Time, Min, Mean, Max): 06:00-21:00, 15, 20, 30
Cloud Sky Cover (Time, Percent (%)): 06:00-09:00, 25-50; 09:00-12:00, 50-75
Wind Direction (Time, Mode): 06:00-21:00, S

Two texts for the same records:

Cloudy, with a low around 10. South wind between 15 and 30 mph.
Partly cloudy, with a low around 9. Breezy, with a south wind between 15 and 30 mph.

slide-29
SLIDE 29

Problem Formulation Key Idea

Traditional NLG Pipeline

[Figure: pipeline diagram] Input Data, Communicative Goal → Content Planning (Content Selection, Document Planning) → Sentence Planning → Surface Realisation → Text

Work placed on the diagram: Liang et al. (2009)

slide-31
SLIDE 31

Learning Alignments Liang et al. 2009

Liang et al., ACL 2009

Learning Semantic Correspondences with Less Supervision


slide-32
SLIDE 32

Learning Alignments Liang et al. 2009

Alignment Task

Records (record type (fields): values):

Wind Chill (Time, Min, Mean, Max): 06-21
Temperature (Time, Min, Mean, Max): 06-21, 52, 61, 70
Wind Speed (Time, Min, Mean, Max): 06-21, 11, 22, 29
Wind Direction (Time, Mode): 06-21, S
Gust (Time, Min, Mean, Max): 06-21, 20, 39
Precipitation Potential (Time, Min, Mean, Max): 06-21, 26, 81, 100
Sky Cover (Time, Percent (%)): 06-21, 75-100; 06-09, 75-100; 06-13, 50-75; 09-21, 75-100; 13-21, 75-100
Rain Chance (Time, Mode): 06-21, Def; 06-09, Lkly; 06-13, Def; 09-21, Def; 13-21, Def
Snow Chance (Time, Mode): 06-21, –; 06-09, –; 06-13, –; 09-21, –; 13-21, –
Thunder Chance (Time, Mode): 06-21, Def; 06-09, Lkly; 06-13, Chc; 09-21, Def; 13-21, Def
Freezing Rain Chance (Time, Mode): 06-21, –; 06-09, –; 06-13, –; 09-21, –; 13-21, –
Sleet Chance (Time, Mode): 06-21, –; 06-09, –; 06-13, –; 09-21, –; 13-21, –

Text to align: Showers and thunderstorms. High near 70. Cloudy, with a south wind around 20 mph, with gusts as high as 40 mph. Chance of precipitation is 100%.

slide-33
SLIDE 33

Learning Alignments Liang et al. 2009

Generative Story

1. Record choice: choose a sequence of records $r = r_1, \ldots, r_{|r|}$

   $p(r \mid d) = \prod_{i=1}^{|r|} p(r_i.t \mid r_{i-1}.t) \, \frac{1}{|s(r_i.t)|}$

2. Field choice: for each chosen record $r_i$, select a sequence of fields $f_i = f_{i1}, \ldots, f_{i|f_i|}$

   $p(f \mid r_i.t) = \prod_{k=1}^{|r_i.f|} p(r_i.f_k \mid r_i.f_{k-1})$

3. Word choice: for each chosen field $f_{ik}$, choose a number $c_{ik} > 0$ uniformly, and generate a sequence of $c_{ik}$ words.

   $p(w \mid r_i, r_i.f_k, r_i.f_k.t, c_{ik}) = \prod_{j=1}^{|w|} p(w_j \mid r_i.t, r_i.f_k.v)$

Joint model:

   $p(r, f, c, w \mid d) = p(r \mid d) \, p(f \mid r) \, p(c, w \mid r, f, d)$
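As a worked example of the three factors, the sketch below scores one fixed alignment under toy parameters. It is not the authors' implementation: the function name, the parameter tables and their values are invented for illustration, s(t) is assumed to be the set of records of type t in d, and the uniform term for the segment lengths c is left out.

```python
import math
from collections import Counter


def joint_log_prob(alignment, d_types, rec_trans, field_trans, emit):
    """Log-probability of one alignment: record, field and word choices.

    alignment: list of (record type, [(field name, field value, words), ...])
    d_types:   record types present in the database d, one entry per record
    """
    size = Counter(d_types)  # |s(t)|: number of records of type t in d
    logp, prev_t = 0.0, "start"
    for rtype, fields in alignment:
        # Record choice: type transition times uniform choice among same-type records.
        logp += math.log(rec_trans[(prev_t, rtype)]) - math.log(size[rtype])
        prev_f = "start"
        for fname, fvalue, words in fields:
            # Field choice: Markov chain over the fields of this record type.
            logp += math.log(field_trans[(rtype, prev_f, fname)])
            # Word choice: each word depends on the record type and field value.
            for w in words:
                logp += math.log(emit[(rtype, fvalue, w)])
            prev_f = fname
        prev_t = rtype
    return logp


# Toy parameters (made up for the example).
d_types = ["temperature", "skyCover", "skyCover"]
rec_trans = {("start", "skyCover"): 0.5, ("skyCover", "temperature"): 0.4}
field_trans = {("skyCover", "start", "percent"): 0.8,
               ("temperature", "start", "min"): 0.5}
emit = {("skyCover", "25-50", "mostly"): 0.5,
        ("skyCover", "25-50", "cloudy"): 0.25,
        ("temperature", "9", "low"): 0.2}
alignment = [("skyCover", [("percent", "25-50", ["mostly", "cloudy"])]),
             ("temperature", [("min", "9", ["low"])])]

log_p = joint_log_prob(alignment, d_types, rec_trans, field_trans, emit)
```

With these numbers the record factor is 0.5 · (1/2) · 0.4 = 0.1, the field factor is 0.8 · 0.5 = 0.4, and the word factor is 0.5 · 0.25 · 0.2 = 0.025, so the joint probability is 0.001.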

slide-36
SLIDE 36

Learning Alignments Liang et al. 2009

Hierarchical Semi-Markov Model (HSMM)

[Figure: hierarchical semi-Markov model. The database d generates a chain of records r_1 ... r_i ... r_|r|; each record r_i generates a chain of fields r_i.f_1 ... r_i.f_|f|; each field emits a span of the words w_1 ... w_N]

EM Training: dynamic program similar to the inside-outside algorithm

slide-37
SLIDE 37

Learning Alignments Liang et al. 2009

Aligned Output

Record         Field              Text
temperature1   max=70             High near 70 .
skyCover1      percent=75-100     Cloudy ,
windDir1       mode=S             with a south wind
windSpeed1     mean=20            around 20 mph .

Within each segment, the words that do not express a field value are aligned to a null field (N on the slide).

slide-38
SLIDE 38

Pipeline Approaches Outline

Outline

  • Problem Formulation
  • Learning Alignments
  • Pipeline Approach
  • Joint Approaches

slide-39
SLIDE 39

Pipeline Approaches History-based Generation

Traditional NLG Pipeline

[Figure: pipeline diagram] Input Data, Communicative Goal → Content Planning (Content Selection, Document Planning) → Sentence Planning → Surface Realisation → Text

Work placed on the diagram: Kim and Mooney (2010); Angeli et al. (2010)

slide-40
SLIDE 40

Pipeline Approaches History-based Generation

Angeli et al., EMNLP 2010

A Simple Domain-Independent Probabilistic Approach to Generation


slide-41
SLIDE 41

Pipeline Approaches History-based Generation

Generative Story

for i = 1, 2, ...:

  1. choose a record $r_i \in d$
  2. if $r_i$ = stop: return
  3. choose a field $f_j \in r_i.t.f$
  4. choose a template $T_k \in r_i.t.f_j.T$

Each decision is governed by a set of feature templates
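The loop above can be sketched with pluggable decision functions. Everything here is a toy stand-in: the `TEMPLATES` table, the record dictionaries and the three `choose_*` functions are hypothetical, and in the approach described on the slides each choice would be made by a trained model rather than by the fixed rules used below.

```python
# Hypothetical templates, keyed by (record type, field name).
TEMPLATES = {
    ("skyCover", "percent"): "mostly cloudy ,",
    ("temperature", "min"): "with a low around {min} .",
}


def choose_record(d, history):
    """Toy rule: first record whose type has not been generated yet, else stop."""
    used = {rtype for rtype, _ in history}
    for r in d:
        if r["type"] not in used:
            return r
    return "stop"


def choose_field(r, history):
    """Toy rule: first field of the record that has a template."""
    return next(f for f in r["values"] if (r["type"], f) in TEMPLATES)


def choose_template(r, f, history):
    """Toy rule: the single template stored for this record type and field."""
    return TEMPLATES[(r["type"], f)]


def generate(d, choose_record, choose_field, choose_template):
    """Run the generative story until the stop record is chosen."""
    text, history = [], []
    while True:
        r = choose_record(d, history)
        if r == "stop":
            return " ".join(text)
        f = choose_field(r, history)
        template = choose_template(r, f, history)
        # Fill the template slots with the field values of the record.
        text.append(template.format(**r["values"]))
        history.append((r["type"], f))


d = [
    {"type": "skyCover", "values": {"percent": "50-75"}},
    {"type": "temperature", "values": {"min": "9", "max": "21"}},
]
output = generate(d, choose_record, choose_field, choose_template)
```

Passing the history to every decision function mirrors the history-based nature of the model: each choice may look at everything generated so far.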

slide-47
SLIDE 47

Pipeline Approaches History-based Generation

Feature Templates

Record
  R1   list of k = 1, 2 record types    r2.t = temp ∧ (r1.t, r0.t) = (skyCover, start)
  R2   set of prev record types         r2.t = temp ∧ {r1.t} = {skyCover}
  R3   record type already gen          r2.t = temp ∧ rj.t ≠ temp, ∀j < 2
  R4   field values                     r2.t = temp ∧ r2.v[min] = 10, r2.v[max] = 20
  R5   stop under LM                    r3.t = stop × p_LM(stop | degrees .)

Field
  F1   field set                        f2 = {time, min, mean, max}
  F2   field values                     f2 = {min, max} ∧ f2.v[min] = 10, ...

Template
  W1   base/coarse                      B(T2) = with a low around [min]; C(T2) = with a [time] around [min]
  W2   field values
  W3   1st word of T under LM           p_LM(with | cloudy ,)

$p(c \mid d; \theta) = \prod_{j=1}^{|c|} p(c_j \mid c_{<j}; \theta)$

L-BFGS learning: use Liang et al. (2009) alignments to compute features
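The chain-rule factorisation can be made concrete with a small sketch. It assumes each local decision is a log-linear (softmax) model over indicator features; the slides do not spell out the functional form, so treat that as an assumption consistent with feature templates and L-BFGS training. The feature function and weights below are invented and far simpler than templates R1 to W3.

```python
import math


def local_distribution(candidates, history, features, theta):
    """p(c_j | c_<j; theta) as a softmax over summed feature weights."""
    scores = {c: sum(theta.get(f, 0.0) for f in features(c, history))
              for c in candidates}
    z = sum(math.exp(s) for s in scores.values())
    return {c: math.exp(s) / z for c, s in scores.items()}


def sequence_log_prob(decisions, candidates, features, theta):
    """log p(c | d; theta) = sum over j of log p(c_j | c_<j; theta)."""
    logp = 0.0
    for j, c in enumerate(decisions):
        dist = local_distribution(candidates, decisions[:j], features, theta)
        logp += math.log(dist[c])
    return logp


def toy_features(c, history):
    """Hypothetical indicator features: the decision and its predecessor."""
    prev = history[-1] if history else "start"
    return [f"type={c}", f"prev={prev},type={c}"]


CANDIDATES = ["skyCover", "temperature", "stop"]
theta = {
    "prev=start,type=skyCover": 1.0,
    "prev=skyCover,type=temperature": 1.0,
    "prev=temperature,type=stop": 1.0,
}
logp = sequence_log_prob(["skyCover", "temperature", "stop"],
                         CANDIDATES, toy_features, theta)
```

In this toy setting every step has one candidate with score 1 and two with score 0, so each factor equals e / (e + 2). Training would adjust `theta` to maximise this log-probability on decisions read off the alignments.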

slide-52
SLIDE 52

Pipeline Approaches History-based Generation

Decoding

$\hat{c}_j = \arg\max_{c_j} \; p(c_j \mid c_{<j}; \theta)$

  • Greedy search: choose the best decision $\hat{c}_j$ until the stop record is drawn
  • Alternatively, sample from the distribution $p(c_j \mid c_{<j}; \theta)$
  • Viterbi search over $\arg\max_{c_j} \; p(c_j \mid d; \theta)$
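Greedy search and sampling differ only in how the next decision is picked from the local distribution. The sketch below contrasts the two on a hand-written distribution; `toy_step` and its probabilities are hypothetical, and the `max_steps` guard is an added safety limit rather than part of the method. Viterbi search is not shown.

```python
import random


def toy_step(history):
    """Hypothetical p(c_j | c_<j): a fixed table keyed on the last decision."""
    if not history:
        return {"skyCover": 0.6, "temperature": 0.3, "stop": 0.1}
    if history[-1] == "skyCover":
        return {"temperature": 0.7, "stop": 0.3}
    return {"stop": 1.0}


def greedy_decode(step_distribution, max_steps=20):
    """Take the most probable decision at each step until stop is chosen."""
    history = []
    for _ in range(max_steps):
        dist = step_distribution(history)
        best = max(dist, key=dist.get)
        if best == "stop":
            break
        history.append(best)
    return history


def sample_decode(step_distribution, rng, max_steps=20):
    """Draw each decision from the local distribution until stop is drawn."""
    history = []
    for _ in range(max_steps):
        dist = step_distribution(history)
        choice = rng.choices(list(dist), weights=list(dist.values()))[0]
        if choice == "stop":
            break
        history.append(choice)
    return history


greedy = greedy_decode(toy_step)
sampled = sample_decode(toy_step, random.Random(0))
```

Greedy decoding is deterministic, while sampling can produce different record sequences on different runs, which is one way to obtain varied output from the same input.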

slide-55
SLIDE 55

Pipeline Approaches History-based Generation

Conclusions

  • Generation recast into a generative story
  • Ensemble of local decisions
  • Discriminatively trained end-to-end generation system

How about we model generation jointly and learn without supervision?

slide-57
SLIDE 57

Pipeline Approaches Outline

Outline

  • Problem Formulation
  • Learning Alignments
  • Pipeline Approach
  • Joint Approaches

slide-58
SLIDE 58

Joint Approaches Grammar-based Generation

Traditional NLG Pipeline

[Figure: pipeline diagram] Input Data, Communicative Goal → Content Planning (Content Selection, Document Planning) → Sentence Planning → Surface Realisation → Text

Work placed on the diagram: Kim and Mooney (2010); Angeli et al. (2010); Konstas and Lapata (2012a, 2012b, 2013b)

slide-59
SLIDE 59

Joint Approaches Grammar-based Generation

Konstas and Lapata, NAACL 2012

Unsupervised Concept-to-text Generation with Hypergraphs

Konstas and Lapata, JAIR 2013

A Global Model for Concept-to-Text Generation


slide-60
SLIDE 60

Joint Approaches Grammar-based Generation

Grammar

Example input records (record type (fields): values):

Rain Chance (Time, Mode): 06-21, Def; 06-09, Lkly; 06-13, Def; 09-21, Def; 13-21, Def
Thunder Chance (Time, Mode): 06-21, Def; 06-09, Lkly; 06-13, Chc; 09-21, Def; 13-21, Def
Temperature (Time, Min, Mean, Max): 06-21, 52, 61, 70
Sky Cover (Time, Percent (%)): 06-21, 75-100; 06-09, 75-100; 06-13, 50-75; 09-21, 75-100; 13-21, 75-100
Wind Direction (Time, Mode): 06-21, S
Wind Speed (Time, Min, Mean, Max): 06-21, 11, 22, 29
Gust (Time, Min, Mean, Max): 06-21, 20, 39
Precipitation Potential (Time, Min, Mean, Max): 06-21, 26, 81, 100

Grammar rules, each with the example instance shown on the slides:

1. S → R(start)
2. R(r_i.t) → FS(r_j, start) R(r_j.t) | FS(r_j, start)
   Example: R(skyCover1.t) → FS(temperature1, start) R(temperature1.t)
3. FS(r, r.f_i) → F(r, r.f_j) FS(r, r.f_j) | F(r, r.f_j)
   Example: FS(wSpeed1, min) → F(wSpeed1, max) FS(wSpeed1, max)
4. F(r, r.f) → W(r, r.f) F(r, r.f) | W(r, r.f)
   Example: F(gust1, min) → W(gust1, mean) F(gust1, mean)
5. W(r, r.f) → α | g(f.v)
   Example: W(skyCover1, %) → cloudy [%.v = ‘75-100’]

EM Training: dynamic program similar to the inside-outside algorithm
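To show how rule schemas 1 to 4 are tied to a particular input, the sketch below instantiates them for a two-record database. It is an illustration under assumptions: `instantiate_rules` is a hypothetical helper, rules are kept as plain strings without weights, a field is allowed to follow itself in rule 3, and the lexical rule 5 is omitted because it depends on the vocabulary.

```python
from itertools import product


def instantiate_rules(d):
    """d maps a record name to (record type, list of field names)."""
    rules = ["S -> R(start)"]  # rule 1
    types = ["start"] + sorted({t for t, _ in d.values()})
    # Rule 2: after any record type, generate the field sequence of some
    # record r_j, optionally followed by further records.
    for prev_t, (rj, (tj, _)) in product(types, d.items()):
        rules.append(f"R({prev_t}) -> FS({rj}, start) R({tj})")
        rules.append(f"R({prev_t}) -> FS({rj}, start)")
    for r, (_, fields) in d.items():
        # Rule 3: after field f_i (or start), emit field f_j of the same record.
        for fi, fj in product(["start"] + fields, fields):
            rules.append(f"FS({r}, {fi}) -> F({r}, {fj}) FS({r}, {fj})")
            rules.append(f"FS({r}, {fi}) -> F({r}, {fj})")
        # Rule 4: a field expands into one or more words.
        for f in fields:
            rules.append(f"F({r}, {f}) -> W({r}, {f}) F({r}, {f})")
            rules.append(f"F({r}, {f}) -> W({r}, {f})")
    return rules


d = {
    "skyCover1": ("skyCover", ["time", "percent"]),
    "temperature1": ("temperature", ["time", "min", "max"]),
}
rules = instantiate_rules(d)
```

Even this tiny database yields 59 rule instances, which gives a sense of why decoding relies on a packed representation such as a hypergraph rather than on enumerating derivations.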

slide-68
SLIDE 68

Joint Approaches Grammar-based Generation

Decoding

$\hat{g} = f\left(\arg\max_{g,h} \; p(g) \cdot p(g, h \mid d)\right)$

Bottom-up Viterbi search

  • Keep k-best derivations at each node, cube pruning (Chiang, 2007)
  • p(g) rescores derivations by linearly interpolating:
      • n-gram language model
      • dependency model (DMV; Klein and Manning, 2004)
  • Implement using hypergraphs (Klein and Manning, 2001)
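The k-best step at a single node can be sketched as follows. This is a simplified stand-in, not cube pruning itself: it scores every pairing of the two child lists and keeps the top k, whereas cube pruning explores the pairings lazily. The bigram table, the constant dependency score and the interpolation weight `lam` are invented for the example.

```python
import heapq
import math
from itertools import product

# Hypothetical rescoring models: a tiny bigram table and a constant
# dependency-model probability.
BIGRAMS = {("mostly", "cloudy"): 0.5}


def lm_prob(words):
    return BIGRAMS.get(tuple(words), 0.1)


def dep_prob(words):
    return 0.2


def rescore(words, lam=0.5):
    """log p(g): linear interpolation of the two model probabilities."""
    return math.log(lam * lm_prob(words) + (1 - lam) * dep_prob(words))


def combine_k_best(left, right, k):
    """Combine two child k-best lists of (log-prob, words) into a parent list."""
    candidates = []
    for (lp, lw), (rp, rw) in product(left, right):
        words = lw + rw
        # Derivation score of the children plus the rescoring term p(g).
        candidates.append((lp + rp + rescore(words), words))
    return heapq.nlargest(k, candidates)


left = [(math.log(0.6), ["mostly"]), (math.log(0.4), ["partly"])]
right = [(math.log(0.7), ["cloudy"]), (math.log(0.3), ["clouds"])]
best = combine_k_best(left, right, k=2)
```

Here "mostly cloudy" scores 0.6 · 0.7 · 0.35 = 0.147 and "partly cloudy" scores 0.4 · 0.7 · 0.15 = 0.042, so these two survive at the parent node.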

slide-70
SLIDE 70

Joint Approaches Grammar-based Generation

Decoding

Leaf nodes ε emit a k-best list of words

W_{0,1}(skyCover1.t, %) ← ε
  k-best list (word ; tag): mostly ; RB | cloudy ; JJ | sunny ; JJ | ...

slide-71
SLIDE 71

Joint Approaches Grammar-based Generation

Decoding

[Figure: hypergraph fragment; each node carries a k-best list of (string ; tag) entries]

FS_{0,5}(skyCover1.t, start): mostly cloudy ⋆ the morning ; JJ | mostly cloudy ⋆ after 11am ; JJ | mostly cloudy ⋆ then becoming ; JJ | ...
F_{0,2}(skyCover1.t, %): mostly cloudy ; RB | mostly clouds ; NNS | cloudy , ; JJ | ...
W_{4,5}(skyCover1.t, time): morning ; NN | 11am ; NN | after ; PREP | ...
W_{0,1}(skyCover1.t, %): mostly ; RB | cloudy ; JJ | sunny ; JJ | ...
W_{1,2}(skyCover1.t, %): mostly ; RB | cloudy ; JJ | sunny ; JJ | ...

slide-74
SLIDE 74

Joint Approaches Results

Experimental Setup

Data

  • RoboCup: simulated sportscasting [214 words] (Chen and Mooney, 2008)
  • WeatherGov: weather reports [4 sents, 345 words] (Liang et al., 2009)
  • Atis: flight booking [1 sent, 927 words] (Zettlemoyer and Collins, 2007)
  • WinHelp: troubleshooting guides [4.3 sents, 629 words] (Branavan et al., 2009)

Evaluation

  • Automatic evaluation: BLEU-4
  • Human evaluation: Fluency, Semantic Correctness

System Comparison

  • 1-best, k-Best-lm, k-Best-lm-dmv
  • Angeli et al. (2010)
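For readers unfamiliar with BLEU-4, the sketch below computes a sentence-level score against a single reference: the geometric mean of modified n-gram precisions for n = 1 to 4, multiplied by a brevity penalty. It is a simplified illustration with no smoothing; reported BLEU scores are normally computed at corpus level with standard tooling, and nothing here is taken from the evaluation scripts used in the cited work.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Unsmoothed sentence-level BLEU-4 against one reference."""
    log_precisions = []
    for n in range(1, 5):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        # Modified precision: clip each n-gram count by its reference count.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if total == 0 or overlap == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalise candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / 4)


reference = "mostly cloudy with a high near 44".split()
candidate = "mostly cloudy with a high near 70".split()
score = bleu4(candidate, reference)
```

In this example the four precisions are 6/7, 5/6, 4/5 and 3/4, whose product is 3/7, so the score is (3/7)^(1/4), roughly 0.81.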

slide-77
SLIDE 77

Joint Approaches Results

Results: Automatic Evaluation

BLEU-4 (values read from the bar charts, one chart per dataset)

Dataset       Base    Angeli   k-lm    k-lm-dmv
RoboCup       10.79   28.7     30.9    29.73
WeatherGov     8.64   38.4     33.7    34.18
Atis          11.85   26.77    29.3    30.37
WinHelp       16.02   32.21    38.26   39.03

slide-78
SLIDE 78

Joint Approaches Results

Results: Human Evaluation (Fluency)

Fluency (axis from 1 to 5; values read from the bar charts, one chart per dataset)

Dataset       Base   k-lm-dmv   Angeli   Human
RoboCup       2.47   4.31       4.03     4.47
WeatherGov    1.82   3.92       4.26     4.61
Atis          2.4    4.01       3.56     4.1
WinHelp       2.57   3.41       3.57     4.15

slide-79
SLIDE 79

Joint Approaches Results

Output

WeatherGov

Input records (record type (fields): values):

Temperature (Time, Min, Mean, Max): 06:00-21:00, 30, 38, 44
Wind Speed (Time, Min, Mean, Max): 06:00-21:00, 6, 6, 7
Cloud Sky Cover (Time, Percent (%)): 06:00-21:00, 75-100
Wind Direction (Time, Mode): 06:00-21:00, ENE
Precipitation Potential (%) (Time, Min, Mean, Max): 06:00-21:00, 9, 20, 35
Chance of Rain (Time, Mode): 06:00-11:00, Slight Chance

k-Best: A chance of rain showers before 11am. Mostly cloudy, with a high near 44. East wind between 6 and 7 mph.

Angeli: A chance of showers. Patchy fog before noon. Mostly cloudy, with a high near 44. East wind between 6 and 7 mph. Chance of precipitation is 35%

H: A 40 percent chance of showers before 10am. Mostly cloudy, with a high near 44. East northeast wind around 7 mph.

slide-80
SLIDE 80

Joint Approaches Results

Output

Atis

Input:

Record   Fields
Flight   from = milwaukee, to = phoenix
Day      day = saturday, dep/ar/ret = departure
Search   type = query, what = flight

k-Best: What are the flights from Milwaukee to Phoenix on Saturday

Angeli: Show me the flights between Milwaukee and Phoenix on Saturday

H: Milwaukee to Phoenix on Saturday

slide-81
SLIDE 81

Joint Approaches Results

Dependency Output

Atis

[Figure: dependency tree rooted at ROOT over the words show, me, the, flights, from, Phoenix, to, Milwaukee, on, Saturday]

slide-83
SLIDE 83

Joint Approaches Results

Conclusions

  • Generation as a parsing problem
  • Unsupervised end-to-end generation system
  • Performance comparable to state-of-the-art

What about document planning?

slide-84
SLIDE 84

Joint Approaches Inducing Document Planning

Traditional NLG Pipeline

[Figure: traditional NLG pipeline. Inputs: Input Data, Communicative Goal. Stages: Content Planning, Content Selection, Document Planning, Sentence Planning, Surface Realisation. Output: Text. The stages are annotated with the work that covers them: Kim and Mooney (2010); Angeli et al. (2010); Konstas and Lapata (2012a, 2012b, 2013a); Konstas and Lapata (2013a).]

slide-86
SLIDE 86

Joint Approaches Inducing Document Planning

Konstas and Lapata, EMNLP 2013

Inducing Document Plans for Concept-to-text Generation, EMNLP 2013


slide-87
SLIDE 87

Joint Approaches Key Idea

Key Idea

Database records:

  • Desktop: Cmd = left-click, Name = start, Type = button
  • Start: Cmd = left-click, Name = settings, Type = button
  • Location (Name, Type): start menu button control panel window
  • Start Target: Cmd = left-click, Name = control panel, Type = button
  • Navigate Window: Cmd = left-click, Name = accounts and users, Type = window
  • Context Menu: Cmd = left-click, Name = advanced, Type = tab
  • Action Context Menu: Cmd = left-click, Name = advanced, Type = button
  • Window Target: Cmd = double-click, Name = users and passwords, Type = item

Text: Click start, point to settings, and then click control panel. Double-click users and passwords. On the advanced tab, click advanced.

slide-89
SLIDE 89

Joint Approaches Key Idea

Key Idea

Selected records:

  • Desktop: Cmd = left-click, Name = start, Type = button
  • Start: Cmd = left-click, Name = settings, Type = button
  • Start Target: Cmd = left-click, Name = control panel, Type = button
  • Window Target: Cmd = double-click, Name = users and passwords, Type = item
  • Context Menu: Cmd = left-click, Name = advanced, Type = tab
  • Action Context Menu: Cmd = left-click, Name = advanced, Type = button

Text: Click start, point to settings, and then click control panel. Double-click users and passwords. On the advanced tab, click advanced.

slide-96
SLIDE 96

Joint Approaches Key Idea

Key Idea

Key Idea: Grammar-based document plans

  • Re-use the generation model based on a PCFG grammar of the input
  • Replace the existing locally coherent Content Selection model and incorporate global Document Planning (explore two solutions):
      • Patterns of record sequences within a sentence and among sentences
      • Rhetorical Structure Theory (Mann and Thompson, 1988) inspired plans

slide-100
SLIDE 100

Joint Approaches Planning with Record Sequences

Planning with Record Sequences

Key idea: Grammar on sequences of record types

1. Split a document into sentences, each terminated by a full stop.

   Click start, point to settings, and then click control panel. Double-click users and passwords. On the advanced tab, click advanced.

2. Then split each sentence further into a sequence of record types.

   desktop | start | start-target: Click start, point to settings, and then click control panel.
   window-target: Double-click users and passwords.
   contextMenu | action-contextMenu: On the advanced tab, click advanced.

3. Goal: Learn patterns of record type sequences within and among sentences.
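The two splitting steps can be sketched in a few lines, assuming each word has already been aligned to a record type. The input format (word, record type) and the function name `sentence_plans` are our own choices for illustration, not part of the original system.

```python
from itertools import groupby

def sentence_plans(aligned_tokens):
    """Turn (word, record_type) pairs into one record-type sequence per sentence."""
    plans, current = [], []
    for word, rtype in aligned_tokens:
        current.append(rtype)
        if word.endswith("."):  # step 1: a full stop closes the sentence
            # Step 2: collapse runs of the same record type into one symbol.
            plans.append([t for t, _ in groupby(current)])
            current = []
    if current:  # trailing words that lack a final full stop
        plans.append([t for t, _ in groupby(current)])
    return plans

# Running example from the slides, written as (record type, text span) pairs.
spans = [
    ("desktop", "Click start,"),
    ("start", "point to settings,"),
    ("start-target", "and then click control panel."),
    ("window-target", "Double-click users and passwords."),
    ("contextMenu", "On the advanced tab,"),
    ("action-contextMenu", "click advanced."),
]
aligned = [(word, rtype) for rtype, text in spans for word in text.split()]
plans = sentence_plans(aligned)
```

For the running example, `plans` holds three sequences, one per sentence, matching step 2.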

slide-101
SLIDE 101

Joint Approaches Planning with Record Sequences

Extended Grammar

Database records:

  • Desktop: Cmd = left-click, Name = start, Type = button
  • Start: Cmd = left-click, Name = settings, Type = button
  • Start Target: Cmd = left-click, Name = control panel, Type = button
  • Window Target: Cmd = double-click, Name = users and passwords, Type = item
  • Context Menu: Cmd = left-click, Name = advanced, Type = tab
  • Action Context Menu: Cmd = left-click, Name = advanced, Type = button

Grammar:

1. $S \rightarrow R(start)$
2. $R(r_i.t) \rightarrow FS(r_j, start)\, R(r_j.t) \mid FS(r_j, start)$
3. $FS(r, r.f_i) \rightarrow F(r, r.f_j)\, FS(r, r.f_j) \mid F(r, r.f_j)$
4. $F(r, r.f) \rightarrow W(r, r.f)\, F(r, r.f) \mid W(r, r.f)$
5. $W(r, r.f) \rightarrow \alpha \mid g(f.v)$
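Rules 1-3 are schemata: they are instantiated once per record type, record and field of a given database. A rough sketch of that grounding step follows, assuming records are given as (type, field names) pairs. The function name `ground_rules`, the record identifiers `r0`, `r1` and the Atis-style example are ours, and the sketch ignores rule weights and the word-level rules 4-5.

```python
def ground_rules(records):
    """Instantiate rule schemata 1-3 for a list of (record_type, field_names)."""
    rules = [("S", ["R(start)"])]                               # rule 1
    prev_types = ["start"] + sorted({t for t, _ in records})
    for prev in prev_types:                                     # rule 2
        for j, (rtype, _) in enumerate(records):
            fs = f"FS(r{j}, start)"
            rules.append((f"R({prev})", [fs, f"R({rtype})"]))   # emit r_j, go on
            rules.append((f"R({prev})", [fs]))                  # emit r_j, stop
    for j, (_, fields) in enumerate(records):                   # rule 3
        for prev_field in ["start"] + fields:
            for field in fields:
                f = f"F(r{j}, {field})"
                rules.append((f"FS(r{j}, {prev_field})", [f, f"FS(r{j}, {field})"]))
                rules.append((f"FS(r{j}, {prev_field})", [f]))
    return rules

# Hypothetical two-record database in the style of the Atis example.
records = [("search", ["type", "what"]), ("flight", ["from", "to"])]
rules = ground_rules(records)
```

Rules 2 and 3 act as Markov chains over record types and over the fields of a record, which is why the number of grounded rules grows with the product of the possible predecessors and successors.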

slide-104
SLIDE 104

Joint Approaches Planning with Record Sequences

Extended Grammar

Document plan tree:

D
  SENT(desk, start, start-target)
    R(desk)
    R(start)
    R(start-target)
  SENT(win-target)
    R(win-target)
  SENT(contMenu, action-contMenu)
    R(contMenu)
    R(action-contMenu)

Grammar:

1. $D \rightarrow SENT(t_i, \ldots, t_j) \ldots SENT(t_l, \ldots, t_m)$
2. $SENT(t_i, \ldots, t_j) \rightarrow R(r_a.t_i) \ldots R(r_k.t_j)\; \cdot$
3. $R(r_i.t) \rightarrow FS(r_j, start)$
4. $FS(r, r.f_i) \rightarrow F(r, r.f_j)\, FS(r, r.f_j) \mid F(r, r.f_j)$
5. $F(r, r.f) \rightarrow W(r, r.f)\, F(r, r.f) \mid W(r, r.f)$
6. $W(r, r.f) \rightarrow \alpha \mid g(f.v) \mid gen\_str(f.v, i)$

  • Straightforward solution: Embed the parameters with the original grammar and train using EM
  • Plan B: Extract grammar rules from training data

slide-105
SLIDE 105

Joint Approaches Planning with Record Sequences

Grammar Extraction

Word-to-record alignment (Liang et al., 2009):

  • desktop: Click start,
  • start: point to settings,
  • start-target: and then click control panel.
  • window-target: Double-click users and passwords.
  • contextMenu: On the advanced tab,
  • action-contextMenu: click advanced.

Record type sequence: desktop start start-target window-target contextMenu action-contMenu

Document plan tree:

D
  SENT(desk, start, start-target)
    R(desk)
    R(start)
    R(start-target)
  SENT(win-target)
    R(win-target)
  SENT(contMenu, action-contMenu)
    R(contMenu)
    R(action-contMenu)

Binarised tree:

D
  SENT(desk, start, start-target)
    R(desk)
    SENT(start, start-target)
      R(start)
      R(start-target)
  [SENT(win-target)-SENT(contMenu, action-contMenu)]
    SENT(win-target)
      R(win-target)
    SENT(contMenu, action-contMenu)
      R(contMenu)
      R(action-contMenu)
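Reading rules off such trees amounts to counting which sentence plans and record type sequences occur in the training documents. The sketch below assumes each document is already a list of record-type tuples, one per sentence; the relative-frequency weights are our simplification for illustration and are not claimed to be the estimation used in the original work.

```python
from collections import Counter

def extract_rules(documents):
    """Count D -> SENT(...) ... and SENT(...) -> R(...) ... rules."""
    counts = Counter()
    for plans in documents:  # plans: list of record-type tuples, one per sentence
        sents = ["SENT(" + ", ".join(p) + ")" for p in plans]
        counts[("D", tuple(sents))] += 1
        for sent, plan in zip(sents, plans):
            counts[(sent, tuple(f"R({t})" for t in plan))] += 1
    return counts

def rule_probs(counts):
    """Relative-frequency estimate: count(lhs -> rhs) / count(lhs)."""
    lhs_totals = Counter()
    for (lhs, _), c in counts.items():
        lhs_totals[lhs] += c
    return {rule: c / lhs_totals[rule[0]] for rule, c in counts.items()}

# Two toy documents; the first is the running example.
docs = [
    [("desk", "start", "start-target"), ("win-target",), ("contMenu", "action-contMenu")],
    [("desk", "start", "start-target")],
]
probs = rule_probs(extract_rules(docs))
```

Because every distinct sentence plan becomes its own nonterminal, the extracted grammar is sparse, which is one motivation for binarising the trees before extraction.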

slide-109
SLIDE 109

Joint Approaches Planning with RST

Planning with Rhetorical Structure Theory

RST (Mann and Thompson, 1988)

D
  Background[N][S]
    Elaboration[N][S]
      Open the control panel,
      and click on the sound settings.
    The sound settings window allows you to control your sound devices.

slide-114
SLIDE 114

Joint Approaches Planning with RST

Planning with Rhetorical Structure Theory

Key idea: Grammar using RST relations (GRST)

Assumption

Each record in the database input corresponds to a unique non-overlapping span in the collocated text, and can therefore be mapped to an elementary discourse unit (EDU).

slide-115
SLIDE 115

Joint Approaches Planning with RST

Grammar Extraction

Word-to-record alignment (Liang et al., 2009):

  • desktop: Click start,
  • start: point to settings,
  • start-target: and then click control panel.
  • window-target: Double-click users and passwords.
  • contextMenu: On the advanced tab,
  • action-contextMenu: click advanced.

EDUs labelled with their record types:

[Click start,]_desktop [point to settings,]_start [and then click control panel.]_start-target [Double-click users and passwords.]_window-target [On the advanced tab,]_contextMenu [click advanced.]_action-contextMenu

slide-117
SLIDE 117

Joint Approaches Planning with RST

Grammar Extraction

Discourse tree over EDUs (Feng and Hirst, 2012):

D
  Elaboration[N][S]
    Elaboration[N][S]
      Elaboration[N][S]
        Click start,
        Joint[N][N]
          point to settings,
          and then click control panel.
      Double-click users and passwords.
    Elaboration[N][S]
      On the advanced tab,
      click advanced.

The same tree with EDUs replaced by their record types:

D
  Elaboration[N][S]
    Elaboration[N][S]
      Elaboration[N][S]
        R(desktop)
        Joint[N][N]
          R(start)
          R(start-target)
      R(window-target)
    Elaboration[N][S]
      R(contextMenu)
      R(action-contextMenu)
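The step from the first tree to the second, and from there to grammar rules, can be sketched as two small recursive functions. The tree encoding (a leaf is an EDU string, an internal node is a `(label, left, right)` tuple) and the function names are our assumptions for illustration.

```python
def relabel(tree, edu_to_type):
    """Replace each EDU leaf with R(record_type); keep internal labels."""
    if isinstance(tree, str):
        return f"R({edu_to_type[tree]})"
    label, left, right = tree
    return (label, relabel(left, edu_to_type), relabel(right, edu_to_type))

def node_label(tree):
    """Label of a node: the leaf itself, or the relation of an internal node."""
    return tree if isinstance(tree, str) else tree[0]

def extract_rules(tree):
    """List the binary rules (label, left, right) of a tree, top-down."""
    if isinstance(tree, str):
        return []
    label, left, right = tree
    return ([(label, node_label(left), node_label(right))]
            + extract_rules(left) + extract_rules(right))

E, J = "Elaboration[N][S]", "Joint[N][N]"
tree = (E,
        (E,
         (E, "Click start,", (J, "point to settings,", "and then click control panel.")),
         "Double-click users and passwords."),
        (E, "On the advanced tab,", "click advanced."))
edu_to_type = {
    "Click start,": "desktop",
    "point to settings,": "start",
    "and then click control panel.": "start-target",
    "Double-click users and passwords.": "window-target",
    "On the advanced tab,": "contextMenu",
    "click advanced.": "action-contextMenu",
}
grst_rules = extract_rules(relabel(tree, edu_to_type))
```

The five resulting rules use RST relations as nonterminals and `R(...)` symbols as pre-terminals, so they can sit on top of the record-level rules of the generation grammar.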

slide-119
SLIDE 119

Joint Approaches Planning with RST

Extended Grammar

1. GRST
2. $R(r_i.t) \rightarrow FS(r_j, start)$
3. $FS(r, r.f_i) \rightarrow F(r, r.f_j)\, FS(r, r.f_j) \mid F(r, r.f_j)$
4. $F(r, r.f) \rightarrow W(r, r.f)\, F(r, r.f) \mid W(r, r.f)$
5. $W(r, r.f) \rightarrow \alpha \mid g(f.v) \mid gen\_str(f.v, i)$

slide-122
SLIDE 122

Joint Approaches Results

Experimental Setup

Data

  • WeatherGov: weather reports [4 sents, 345 words] (Liang et al., 2009)
  • WinHelp: troubleshooting guides [4.3 sents, 629 words] (Branavan et al., 2009)

Evaluation

  • Automatic evaluation: BLEU-4
  • Human evaluation: Fluency, Semantic Correctness, Coherence

System Comparison

  • GRSE, GRST
  • Konstas and Lapata (2012a)
  • Angeli et al. (2010)

slide-123
SLIDE 123

Joint Approaches Results

Results: Automatic Evaluation (BLEU-4)

BLEU-4 scores (values read from the bar charts):

System   WeatherGov   WinHelp
Angeli   38.4         32.21
K&L      33.7         38.26
GRSE     35.6         40.92
GRST     36.54        40.65

slide-124
SLIDE 124

Joint Approaches Results

Results: Human Evaluation (Coherence)

Coherence ratings (values read from the bar charts):

System   WeatherGov   WinHelp
Angeli   3.82         2.97
K&L      3.59         2.93
GRSE     4.18         3.35
GRST     4.1          3.22
Human    4.11         4.25

slide-125
SLIDE 125

Joint Approaches Results

Output

GRSE: Click start, point to settings, and then click control panel. Double-click network and dial-up connections. Right-click local area connection, and then click properties. Click install, and then click add. Click network monitor driver, and then click ok.

K&L: Click start, point to settings, and then click control panel. Double-click network and dial-up connections. Double-click network and dial-up connections. Right-click local area connection, and then click ok.

H: Click start, point to settings, click control panel, and then double-click network and dial-up connections. Right-click local area connection, and then click properties. Click install, click protocol, and then click add. Click network monitor driver, and then click ok.

slide-126
SLIDE 126

Joint Approaches Conclusions

Conclusions

  • End-to-end generation system that incorporates document planning
  • Grammar-based approach allows for document planning naturally: all we need is a discourse grammar
  • Provide two solutions for document plans:
      • Linguistically naive record sequence grammar (GRSE)
      • RST-inspired grammar (GRST)

slide-127
SLIDE 127

Recap

Recap

  • Recast NLG into a generative model
      • History-based local decisions: add more features
      • Hierarchical joint model: add more layers
  • Learn parameters from (un)annotated data, across multiple domains
  • Decoding: greedy search, k-best Viterbi search

slide-128
SLIDE 128

Future Work

Where do we go from here?

  • Generate from more open-ended formalisms: AMR
  • More challenging factual domains: biographies from Wikipedia
  • More sophisticated sentence planning: aggregation, referring expressions
  • More engineering: address sparsity with Deep Learning
  • Apply document planning grammars to summarisation

slide-129
SLIDE 129

Future Work

Thank you

Questions?

slide-130
SLIDE 130

Hypergraphs

Definition

An ordered hypergraph $H$ is a tuple $\langle N, E, t, \mathbf{R} \rangle$, where $N$ is a finite set of nodes, $E$ is a finite set of hyperarcs, $t \in N$ is a target node and $\mathbf{R}$ is the set of weights. Each hyperarc $e \in E$ is a triple $e = \langle T(e), h(e), f(e) \rangle$, where $h(e) \in N$ is its head node, $T(e) \in N^*$ is an ordered sequence of tail nodes and $f(e)$ is a monotonic weight function from $\mathbf{R}^{|T(e)|}$ to $\mathbf{R}$.

[Figure: example hyperarc over the nodes t, a and b with weight function f(e)]
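As a rough illustration of the definition, the sketch below stores hyperarcs with an ordered tuple of tails, a head and a weight function, and computes the best derivation weight of a node by taking the maximum over its incoming hyperarcs. The class and function names, the max-product setting and the toy weights are our assumptions; the hypergraph is assumed to be acyclic.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass
class Hyperarc:
    tails: Tuple[str, ...]        # T(e): ordered tail nodes
    head: str                     # h(e): head node
    weight: Callable[..., float]  # f(e): monotonic in each tail weight

def best_weight(node, incoming, leaf_weights, memo=None):
    """Best (maximum) derivation weight of `node` in an acyclic hypergraph."""
    memo = {} if memo is None else memo
    if node in memo:
        return memo[node]
    if node in leaf_weights:
        memo[node] = leaf_weights[node]
    else:
        memo[node] = max(
            arc.weight(*(best_weight(t, incoming, leaf_weights, memo)
                         for t in arc.tails))
            for arc in incoming[node]
        )
    return memo[node]

# Toy hypergraph: target t can be built from (a, b) or from a alone.
incoming = {
    "t": [
        Hyperarc(("a", "b"), "t", lambda x, y: x * y * 0.5),
        Hyperarc(("a",), "t", lambda x: x * 0.1),
    ]
}
leaf_weights = {"a": 0.8, "b": 0.5}
best = best_weight("t", incoming, leaf_weights)
```

Monotonicity of each `weight` function is what makes this dynamic programme valid: improving the weight of a tail can never make the head worse.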

slide-134
SLIDE 134

Hypergraph Construction

Map the standard weighted CYK algorithm to a hypergraph $H = \langle N, E, t, \mathbf{R} \rangle$.

[Figure: hyperarc with head node R(search_1.t) and tail nodes FS(flight_1.t, start) and R(flight_1.t)]

Grammar rule: $R(r_i.t) \rightarrow FS(r_j, start)\, R(r_j.t)$

$$f(e) = f(FS_{5,7}(flight_1.t, start)) \otimes f(R_{7,9}(flight_1.t)) \otimes w(R(search_1.t) \rightarrow FS(flight_1, start)\, R(flight_1.t))$$
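In CYK terms, a binary rule applied over a span yields one hyperarc per split point, and the weight of that hyperarc combines the weights of its two tails with the rule weight. A small sketch, assuming nodes are (symbol, start, end) triples and that ⊗ is ordinary multiplication; the function names and the rule weight 0.5 are made up for illustration.

```python
def binary_hyperarcs(rule, i, j):
    """Hyperarcs for the binary rule A -> B C over span (i, j), one per split k.

    rule is (A, B, C, w); each hyperarc is (tails, head, rule_weight).
    """
    a, b, c, w = rule
    return [(((b, i, k), (c, k, j)), (a, i, j), w) for k in range(i + 1, j)]

def arc_weight(tail_weights, rule_weight):
    """f(e): combine the tail weights and the rule weight (here by product)."""
    result = rule_weight
    for tw in tail_weights:
        result *= tw
    return result

# The rule from the slide, with a hypothetical weight of 0.5.
rule = ("R(search1.t)", "FS(flight1, start)", "R(flight1.t)", 0.5)
arcs = binary_hyperarcs(rule, 5, 9)
```

The split at position 7 gives the hyperarc shown on the slide, with tails spanning 5-7 and 7-9 and a head spanning 5-9.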

slide-140
SLIDE 140

Hypergraph Example

[Figure: fragment of the hypergraph for a WeatherGov example. The root S_{0,7} expands to R_{0,7}(start), which branches into field-sequence and record nodes such as FS_{0,1}(skyCover_1, start), R_{1,7}(skyCover_1.t), FS_{0,1}(temp_1, start) and R_{1,7}(temp_1.t). These expand into F and W nodes over spans 0-1, 0-2 and 1-2, e.g. F_{0,1}(skyCover_1, %), W_{0,1}(skyCover_1, time), F_{1,2}(temp_1, min) and W_{1,2}(temp_1, max). Leaves include the words "sunny" and "with" and the generated values g_{1,2}(min, v=10) and g_{1,2}(max, v=20).]

slide-143
SLIDE 143

Determining Text Length

  • Train a linear regression model
  • Idea: the more records and fields that have values in the database, the more facts need to be uttered
  • Input to the model: flattened version of the database input, i.e. each feature is a record-field pair
  • Feature values: values vs counts of fields
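The input representation and the prediction step can be sketched as follows, assuming a database is a list of (record type, field dictionary) pairs with `None` for missing values. The function names and the example weights are hypothetical; in practice the weights would come from fitting a least-squares regression on training pairs of databases and text lengths.

```python
def flatten(records, use_values=False):
    """Map a database to features keyed by record-field pair.

    use_values=False counts filled fields; use_values=True sums numeric
    values instead (non-numeric values still count as 1).
    """
    features = {}
    for rtype, fields in records:
        for field, value in fields.items():
            if value is None:  # empty field: contributes nothing
                continue
            key = f"{rtype}.{field}"
            if use_values and isinstance(value, (int, float)):
                features[key] = features.get(key, 0.0) + float(value)
            else:
                features[key] = features.get(key, 0.0) + 1.0
    return features

def predict_length(features, weights, bias=0.0):
    """Linear model: bias + sum of weight * feature value (unseen weight = 0)."""
    return bias + sum(weights.get(k, 0.0) * v for k, v in features.items())

records = [
    ("temperature", {"min": 30, "max": 44}),
    ("windDir", {"mode": "ENE"}),
    ("rainChance", {"mode": None}),
]
weights = {"temperature.min": 4.0, "windDir.mode": 5.0}  # made-up weights
length = predict_length(flatten(records), weights, bias=2.0)
```

The `use_values` switch mirrors the two feature-value options on the slide: raw field values versus counts of filled fields.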