A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss
Wan-Ting Hsu
National Tsing Hua University
Chieh-Kai Lin
National Tsing Hua University
Project page
Outline
Motivation
Our Method
Training
Overview
People spend 12 hours every day consuming media in 2018. – eMarketer
https://www.emarketer.com/topics/topic/time-spent-with-media
Overview
Summarization: condense an article into its important points.
Overview
Summarization is useful in many domains (e.g., market reports).
Two approaches to capturing the important points:
Extractive Summarization: select text from the article.
Abstractive Summarization: generate the summary word-by-word.
Overview
Extractive model: Ramesh Nallapati, Feifei Zhai, and Bowen Zhou. SummaRuNNer: A recurrent neural network based sequence model for extractive summarization of documents. AAAI 2017.
[Figure: sentence representations with a salience score per sentence (sentence 1, sentence 2, sentence 3, …)]
Overview
tosequence rnns and beyond. CoNLL 2016.
Encoder Article Representations Decoder
(select sentences):
(generate word-by-word):
17
Overview
Extractive (select sentences): "Italian artist Johannes Stoetter has painted two naked women to look like a chameleon. The 37-year-old has previously transformed his models into frogs and parrots but this may be his most intricate and impressive artwork to date." → not concise.
Abstractive (generate word-by-word): "Johannes Stoetter has previously transformed his models into frogs and parrots but this chameleon may be his most impressive artwork to date." → concise, but abstractive models can generate wrong details (e.g., producing "Justin Bieber" in place of the artist's name).
Method
Extractor (static sentence attention): SummaRuNNer. Ramesh Nallapati, Feifei Zhai, and Bowen Zhou. SummaRuNNer: A recurrent neural network based sequence model for extractive summarization of documents. AAAI 2017.
Abstracter (dynamic word attention): pointer-generator network. Abigail See, Peter J. Liu, and Christopher D. Manning. Get to the point: Summarization with pointer-generator networks. ACL 2017.
Method
The extractor produces a static sentence attention (one score per sentence); the abstracter produces a dynamic word attention (recomputed at every generated-word time step).
[Figure: for the article "Cindy is lucky. She won $1000. She is going to …", sentence attentions γ1, γ2, γ3 score the three sentences, while word attentions β1 … β9 score their individual words; indices denote word position, sentence position, and generated-word time step.]
Method
A unified model takes advantage of both extractive and abstractive summarization approaches: the extractor's sentence attention modulates the abstracter's word attention to form the final word distribution.
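The modulation of word attention by sentence attention can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function name, the toy shapes, and the simple multiply-and-renormalize form are assumptions.

```python
import numpy as np

def combined_attention(word_attn, sent_attn, word_to_sent):
    # Scale each word's attention by the attention of its enclosing
    # sentence, then renormalize into a probability distribution.
    scaled = word_attn * sent_attn[word_to_sent]
    return scaled / scaled.sum()

# Toy example: 6 words in 2 sentences; the first sentence is highly attended.
word_attn = np.array([0.1, 0.2, 0.1, 0.3, 0.2, 0.1])
sent_attn = np.array([0.8, 0.2])
word_to_sent = np.array([0, 0, 0, 1, 1, 1])
final = combined_attention(word_attn, sent_attn, word_to_sent)
```

Words inside the highly-attended first sentence are boosted (the second word becomes the most attended, even though a word of sentence 2 originally scored highest), which is how the extractor can steer the abstracter.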
Method
An inconsistency loss makes the model mutually beneficial to both extractive and abstractive summarization: at each decoder time step, maximize the multiplied attention (word attention × sentence attention) of the top-K attended words.
Method
[Figure: three sentences with word attentions, K = 2; sentence attentions such as 1.0 vs. 0.5 shown.] The attentions are consistent when the top-K attended words fall inside highly-attended sentences, and inconsistent when a highly-attended word lies in a sentence with low attention. The inconsistency loss is therefore lower for consistent attention than for inconsistent attention.
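A minimal sketch of the top-K inconsistency loss just described: take the K most-attended words at each decoder step, multiply their word attention by their sentence's attention, and penalize a small product. The shapes, K, and variable names here are illustrative assumptions, not the paper's exact notation.

```python
import numpy as np

def inconsistency_loss(word_attn, sent_attn, word_to_sent, K=2):
    # For each decoder step: -log of the mean (word attn x sentence attn)
    # over the top-K attended words; average over steps.
    losses = []
    for alpha in word_attn:
        idx = np.argsort(alpha)[-K:]            # indices of the top-K words
        product = alpha[idx] * sent_attn[word_to_sent[idx]]
        losses.append(-np.log(product.mean() + 1e-12))
    return float(np.mean(losses))

# Two single-step examples over 4 words in 2 sentences.
sent_attn = np.array([0.9, 0.1])
word_to_sent = np.array([0, 0, 1, 1])
consistent = inconsistency_loss(np.array([[0.6, 0.3, 0.05, 0.05]]),
                                sent_attn, word_to_sent)
inconsistent = inconsistency_loss(np.array([[0.05, 0.05, 0.6, 0.3]]),
                                  sent_attn, word_to_sent)
```

The loss comes out lower in the consistent case, matching "consistent < inconsistent" on the slide.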
Training Procedures
Extractive Summarization: select sentences from the article. Abstractive Summarization: generate the summary word-by-word.
Training Procedures
Pre-train each module separately: the extractor is trained against ground-truth sentence labels (e.g., 1 0 1, marking which sentences to extract), and the abstracter is trained with its generation loss plus a coverage loss.
Training Procedures
Building the ground-truth extraction labels: the extracted sentences should contain as much as possible of the information needed to generate the abstractive summary. Each sentence is scored by its ROUGE-L recall against the reference abstractive summary, and sentences are selected greedily, one at a time, whenever a new sentence increases the informativity of the selected set.
Ramesh Nallapati, Feifei Zhai, and Bowen Zhou. SummaRuNNer: A recurrent neural network based sequence model for extractive summarization of documents. AAAI 2017.
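The greedy label construction can be sketched like this. The ROUGE-L recall helper below is a simplified, self-contained stand-in for a real ROUGE implementation, and the tokenized toy data is invented for illustration.

```python
def lcs_len(a, b):
    # Longest common subsequence length via dynamic programming.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def rouge_l_recall(candidate, reference):
    # Fraction of the reference recovered as a common subsequence.
    return lcs_len(candidate, reference) / max(len(reference), 1)

def greedy_labels(sentences, reference):
    # Add one sentence at a time (keeping article order) while the
    # ROUGE-L recall of the selected set keeps increasing.
    labels = [0] * len(sentences)
    best = 0.0
    while True:
        best_i, best_score = None, best
        for i in range(len(sentences)):
            if labels[i]:
                continue
            candidate = [w for j, s in enumerate(sentences)
                         if labels[j] or j == i for w in s]
            score = rouge_l_recall(candidate, reference)
            if score > best_score:
                best_i, best_score = i, score
        if best_i is None:
            return labels
        labels[best_i], best = 1, best_score

sentences = [["the", "cat", "sat"], ["dogs", "bark", "loudly"],
             ["the", "mat", "was", "red"]]
reference = ["the", "cat", "sat", "on", "the", "mat"]
labels = greedy_labels(sentences, reference)
```

On this toy article the first and third sentences are selected, yielding a label pattern like the 1 0 1 shown on the slide.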
Training Procedures
[Figure: the Extractor (static sentence attention) feeding the Abstracter (dynamic word attention), with sentence attentions shown as 0.5 0.5 0.5; the abstracter keeps its coverage loss.]
Training Procedures
Abstracter pre-training follows the pointer-generator network with coverage loss: Abigail See, Peter J. Liu, and Christopher D. Manning. Get to the point: Summarization with pointer-generator networks. ACL 2017.
Training Procedures
Two-stage training: the extractor selects sentences and outputs only those sentences (equivalent to hard attention on the original article), then feeds the extracted sentences to the abstracter, which generates the summary.
Training Procedures
End-to-end training: the full article is given to both modules, and the extractor's sentence attention softly modulates the abstracter's word-level attention while the summary is generated.
Experiment
Dataset: CNN/Daily Mail. Article-summary pairs: 287,113 (train), 13,368 (validation), 11,490 (test). Articles average ≈ 766 words; reference summaries average ≈ 53 words.
[Figure: an example article (≈ 700 words) with its highlight (≈ 50 words).]
Experiment
Attention consistency: at each generated-word time step, compare the sentence attention with the word attention. A step where the two disagree is an inconsistency step, and the inconsistency rate is the fraction of inconsistency steps among all generated words.
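One way to sketch the inconsistency-rate measurement. The criterion used here (any top-K attended word falling in a sentence with below-mean sentence attention) is an illustrative simplification; the paper's exact definition of an inconsistency step may differ in detail.

```python
import numpy as np

def inconsistency_rate(word_attn, sent_attn, word_to_sent, K=2):
    # Count a decoder step as an inconsistency step when any of its top-K
    # attended words lies in a sentence whose attention is below the mean.
    threshold = sent_attn.mean()
    bad = 0
    for alpha in word_attn:
        idx = np.argsort(alpha)[-K:]
        if np.any(sent_attn[word_to_sent[idx]] < threshold):
            bad += 1
    return bad / len(word_attn)

# Two decoder steps: the first attends inside the dominant sentence,
# the second attends inside the weakly-attended sentence.
word_attn = np.array([[0.6, 0.3, 0.05, 0.05],
                      [0.05, 0.05, 0.6, 0.3]])
sent_attn = np.array([0.9, 0.1])
word_to_sent = np.array([0, 0, 1, 1])
rate = inconsistency_rate(word_attn, sent_attn, word_to_sent)
```

Here exactly one of the two steps is inconsistent, so the rate is 0.5.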
Experiment
Human evaluation criteria:
Informativity: how well does the summary capture the important parts of the article?
Conciseness: is the summary clear enough to explain everything without being redundant?
Readability: how well-written (fluent and grammatical) is the summary?
A trap question was included as a quality check on the raters.
Conclusion and Future Work
We propose a unified model for extractive and abstractive summarization. An inconsistency loss links the two levels of attention and enables extractive and abstractive summarization to be mutually beneficial. The model produces the most informative and readable summaries on the CNN/Daily Mail dataset in a solid human evaluation.
Min Sun, Wan-Ting Hsu, Chieh-Kai Lin, Ming-Ying Lee, Kerui Min, Jing Tang
https://hsuwanting.github.io/unified_summ/