Neural Text Generation from Structured Data - PowerPoint PPT Presentation



SLIDE 1

Neural Text Generation from Structured Data with Application to the Biography Domain

Rémi Lebret, David Grangier, Michael Auli

SLIDE 2

From Structured Data to Sentences

  • Why?

User-friendly access to structured data:
Ø Question answering
Ø Virtual assistant
Ø Profile summary
Machines like to read structured data; people don’t.

SLIDE 3

Concept-to-Text Generation

  • Weather forecast:

Cloudy, with temperatures between 10 and 20 degrees. South wind around 20 mph.

SLIDE 4

Concept-to-Text Generation

  • Flight query:

Give me the flights leaving Denver August ninth coming back to Boston before 4pm.

SLIDE 5

Motivations for Going Large Scale

  • Template-based approaches:

PROS: natural language output, no training
CONS: repetitive, scales poorly

Small datasets with limited vocabularies

  • Generating natural language from Wikipedia infoboxes

Ø 700K biographies
Ø 400K-word vocabulary

SLIDE 6

Generating Biography from Wikipedia Infobox

Conditioning on tables (fields + values)


Copy actions

SLIDE 7

Proposed Approach

SLIDE 8

From Structured Data to Sentences

  • How?

Neural language model for constrained sentence generation. Success in:
Ø Caption generation (Vinyals et al., 2015)
Ø Machine translation (Bahdanau et al., 2014)
Ø Modeling conversations and dialogues (Shang et al., 2015)

SLIDE 9

Language Model with Conditioning Q(x_t | c_t, z_{c_t}, g_f, g_w)

SLIDE 10

Language Model with Conditioning Q(x_t | c_t, z_{c_t}, g_f, g_w)

SLIDE 11

Language Model with Conditioning Q(x_t | c_t, z_{c_t}, g_f, g_w)

Table descriptors:
Ø Name of the field
Ø Position from the start
Ø Position from the end

copy actions
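The three table descriptors above can be sketched as a small preprocessing step over an infobox. This is an illustrative sketch, not the authors' code; the `table_descriptors` helper and its dict-based infobox format are assumptions.

```python
def table_descriptors(infobox):
    """Map each value token to (field name, position from start, position from end).

    infobox: dict mapping a field name to its list of value tokens.
    """
    desc = {}
    for field, tokens in infobox.items():
        n = len(tokens)
        for i, tok in enumerate(tokens):
            # 1-based position from the start, and position from the end
            desc.setdefault(tok, []).append((field, i + 1, n - i))
    return desc

box = {"name": ["john", "doe"], "birth_date": ["18", "april", "1352"]}
print(table_descriptors(box)["april"])  # [('birth_date', 2, 2)]
```

A token occurring in several fields simply collects one descriptor triple per occurrence, which is what lets the model disambiguate it by field and position.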

SLIDE 12

Language Model with Conditioning Q(x_t | c_t, z_{c_t}, g_f, g_w)

Table descriptors:
Ø Name of the field
Ø Position from the start
Ø Position from the end

copy actions

SLIDE 13

Language Model with Conditioning Q(x_t | c_t, z_{c_t}, g_f, g_w)

Table descriptors:
Ø Name of the field
Ø Position from the start
Ø Position from the end

copy actions

SLIDE 14

Language Model with Conditioning Q(x_t | c_t, z_{c_t}, g_f, g_w)

Table descriptors:
Ø Name of the field
Ø Position from the start
Ø Position from the end

Local conditioning → already generated fields
copy actions

SLIDE 15

Language Model with Conditioning Q(x_t | c_t, z_{c_t}, g_f, g_w)

Table descriptors:
Ø Name of the field
Ø Position from the start
Ø Position from the end

Local conditioning → already generated fields
Global conditioning → fields and values
copy actions

SLIDE 16

Neural Language Model with Conditioning Q(x_t | c_t, z_{c_t}, g_f, g_w)

john doe ( 18 april 1352 ) is a

Embeddings-based model

SLIDE 17

Neural Language Model with Conditioning Q(x_t | c_t, z_{c_t}, g_f, g_w)

john doe ( 18 april 1352 ) is a

Aggregating embeddings → component-wise max
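The component-wise max over a variable-size set of embeddings can be sketched in a few lines of NumPy; `aggregate` is a hypothetical helper, not the authors' implementation.

```python
import numpy as np

def aggregate(embeddings):
    """Component-wise max over a variable-size set of embedding vectors."""
    return np.max(np.stack([np.asarray(e) for e in embeddings]), axis=0)

# e.g. pooling the embeddings of all field (or value) tokens into one vector
g_f = aggregate([[1.0, 5.0], [3.0, 2.0], [0.0, 4.0]])
print(g_f)  # [3. 5.]
```

Max pooling makes the global conditioning vector independent of how many fields or values the infobox contains.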

SLIDE 18

Neural Language Model with Conditioning Q(x_t | c_t, z_{c_t}, g_f, g_w)

john doe ( 18 april 1352 ) is a

ω(g_f)  ω(g_w)  ω(z_{c_t})  ω(c_t)
Input y = (c_t, z_{c_t}, g_f, g_w): ω(y) = [ω(c_t); ω(z_{c_t}); ω(g_f); ω(g_w)]

SLIDE 19

Neural Language Model with Conditioning Q(x_t | c_t, z_{c_t}, g_f, g_w)

Input: ω(y) = [ω(c_t); ω(z_{c_t}); ω(g_f); ω(g_w)]
h(y): non-linear transformation
Final score: φ(y, x) = φ_W(y, x) + φ_F(y, x)
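The forward pass on this slide (concatenate the four embedding blocks, apply a non-linear transformation, then score candidates) can be sketched as follows. This is a minimal sketch with toy sizes and random parameters; the names `omega`, `W1`, and `W_out` are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # per-block embedding size (toy)

# one embedding block per conditioning input: context, local, global-field, global-word
omega = {k: rng.normal(size=d) for k in ("c_t", "z_ct", "g_f", "g_w")}

# input: concatenation of the four blocks
y = np.concatenate([omega["c_t"], omega["z_ct"], omega["g_f"], omega["g_w"]])

W1 = rng.normal(size=(8, 4 * d))
h = np.tanh(W1 @ y)                 # h(y): non-linear transformation

W_out = rng.normal(size=(10, 8))    # one score row per candidate (toy: 10 candidates)
scores = W_out @ h                  # phi(y, x) for every candidate x
```

In the real model the candidate set would cover the full vocabulary plus the copy actions, and the scores would feed the softmax on the next slide.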

SLIDE 20

Neural Language Model with Conditioning Q(x_t | c_t, z_{c_t}, g_f, g_w)

Input: ω(y) = [ω(c_t); ω(z_{c_t}); ω(g_f); ω(g_w)]
Final score: φ(y, x) = φ_W(y, x) + φ_F(y, x)

Softmax function: log Q(x|y) = φ(y, x) − log Σ_{x′ ∈ W ∪ F} exp φ(y, x′)

Training: maximize the likelihood of the training text, L(θ) = Σ_{t=1}^{T} log Q(x_t | c_t, z_{c_t}, g_f, g_w)
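The softmax normalization over the joint candidate set (vocabulary words plus copy actions) and the resulting training term can be sketched numerically. The scores below are toy values, not model outputs; only the log-softmax arithmetic mirrors the slide.

```python
import numpy as np

def log_softmax(scores):
    """Numerically stable log Q(x|y) = phi(y,x) - log sum_x' exp phi(y,x')."""
    m = scores.max()
    return scores - (m + np.log(np.exp(scores - m).sum()))

# toy scores phi(y, x) over W ∪ F: say two vocabulary words and two copy actions
phi = np.array([2.0, 0.5, -1.0, 1.0])
logq = log_softmax(phi)

# one training term: negative log-likelihood of the gold candidate (index 0 here)
nll = -logq[0]
```

Summing this negative log-likelihood over all positions t of the training text and minimizing it is exactly maximizing L(θ).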

SLIDE 21

Experiments

SLIDE 22

WikiBio dataset

  • 728,321 Wikipedia biographies (80% train / 10% valid / 10% test)

Ø Infobox
Ø Introduction section (only the 1st sentence is used for generation)

Available at

https://rlebret.github.io/wikipedia-biography-dataset/

SLIDE 23

Quantitative Results

  • KN = Kneser-Ney language model (5-gram)
  • NLM = Neural Language Model (11-gram)

[Table: results without copy actions vs. with copy actions]

SLIDE 24

Attention Mechanism

  • Adding a bias φ_F to φ_W

Ø Continuing an incomplete field
Ø Handling transitions between fields

SLIDE 25

Beam Size Impact

[Plot: BLEU vs. generation time in ms for Template KN and Table NLM, beam sizes 1 to 25]
  • Much faster than Kneser-Ney, thanks to the GPU
  • Best BLEU with beam size L = 5 (≈ 200 ms)
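The beam-size trade-off above comes from standard beam search over the language model: widen the beam and BLEU improves up to a point while decoding time grows. A minimal sketch, assuming a hypothetical `step(prefix)` scorer standing in for the real conditioned model.

```python
import math

def beam_search(step, beam_size, max_len, eos="</s>"):
    """Keep the beam_size highest-scoring hypotheses at each step."""
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(max_len):
        cand = []
        for prefix, lp in beams:
            if prefix and prefix[-1] == eos:
                cand.append((prefix, lp))  # finished hypotheses carry over
                continue
            for tok, tok_lp in step(prefix):
                cand.append((prefix + [tok], lp + tok_lp))
        beams = sorted(cand, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams[0]

# toy scorer: two continuations for short prefixes, then end-of-sentence
def step(prefix):
    if len(prefix) >= 2:
        return [("</s>", math.log(0.9))]
    return [("john", math.log(0.6)), ("doe", math.log(0.4))]

best, score = beam_search(step, beam_size=5, max_len=4)
print(best)  # ['john', 'john', '</s>']
```

With L = 1 this degenerates to greedy decoding; the slide's L = 5 is the sweet spot between search quality and the roughly 200 ms budget.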

SLIDE 26

Qualitative Results

Ø Template KN: frederick parker-rhodes ( born november 21 , 1914 – march 2 , 1987 ) was an english cricketer .
Ø Table NLM + Local (field, start): frederick parker-rhodes ( 21 november 1914 – 2 march 1987 ) was an australian rules footballer who played with carlton in the victorian football league ( vfl ) during the XXXXs and XXXXs .
Ø + Global (field): frederick parker-rhodes ( 21 november 1914 – 2 march 1987 ) was an english mycology and plant pathology , mathematics at the university of uk .
Ø + Global (field, word): frederick parker-rhodes ( 21 november 1914 – 2 march 1987 ) was a british computer scientist , best known for his contributions to computational linguistics .

SLIDE 27

Conclusion

  • Generating sentences with:

Ø copying facts from the table
Ø understanding the type of fields
Ø understanding the relation between record tokens and table tokens
Ø network with low capacity → fast generation

  • WikiBio dataset available to download

Ø https://rlebret.github.io/wikipedia-biography-dataset/

SLIDE 28

Future Work

  • Generating multiple sentences
  • Loss / evaluation that assesses factual accuracy (≠ BLEU)

Thank you!