SLIDE 1

KitAi-PI: Summarization System for NTCIR-14 QA Lab-PoliInfo

Satoshi Hiai, Yuka Otani, Takashi Yamamura and Kazutaka Shimada Department of Artificial Intelligence, Kyushu Institute of Technology

SLIDE 2

Contents

  • Introduction and Objective
  • Outline of Our System
  • Training Data Construction
  • Formal Run
  • Summary


SLIDE 4

Introduction – Assembly Minutes Summarization

  • Two types of summarization methods
    • Abstractive: uses expressions not contained in the source text
    • Extractive: uses expressions in the source text
  • Assembly minutes corpus
    • A summary consists of expressions contained in a speech

Example: the summary and the speech share the same expressions, which motivates summary generation with an extractive approach.

Summary: 被災地そして日本の未来のため 東京は先頭に立つべき。知事の所見は。 (For the disaster-stricken areas and for Japan's future, Tokyo should take the lead. What are the Governor's views?)

Assembly member speech:
U.1 我々が生きている日本列島は、数限りない天変地異に見舞われてきました。 (The Japanese archipelago we live on has been struck by countless natural disasters.)
U.2 被災地のため、そして、日本の未来のために、東京は先頭に立つべきと考えますが、知事の所見を伺います。 (For the disaster-stricken areas and for Japan's future, I believe Tokyo should take the lead; I ask for the Governor's views.)
…

SLIDE 5

Introduction – Extractive Summarization

  • Extraction of a set of important utterances
  • Supervised methods usually show better performance than unsupervised methods
    • Use of a machine learning method
    • Construction of an importance prediction model
  • Problem
    • The given assembly minutes data do not contain importance information for each utterance

SLIDE 6

Objective

  • Automatic training data construction
  • Hypothesis
    • An utterance with high similarity to a sentence in a summary is more important

Problem: there is no importance information for the utterances.

Summary: 被災地そして日本の未来のため 東京は先頭に立つべき。知事の所見は。 (For the disaster-stricken areas and for Japan's future, Tokyo should take the lead. What are the Governor's views?)

Assembly member speech:
U.1 我々が生きている日本列島は、数限りない天変地異に見舞われてきました。 (The Japanese archipelago we live on has been struck by countless natural disasters.)
U.2 被災地のため、そして、日本の未来のために、東京は先頭に立つべきと考えますが、知事の所見を伺います。 (For the disaster-stricken areas and for Japan's future, I believe Tokyo should take the lead; I ask for the Governor's views.)
…

SLIDE 7

Objective

  • Automatic training data construction
  • Hypothesis
    • An utterance with high similarity to a sentence in a summary is more important

Importance is assigned to each utterance using a word similarity to the summary: U.1 below receives a low importance score and U.2 a high one. With the scored utterances, we can apply a machine learning method.

Summary: 被災地そして日本の未来のため 東京は先頭に立つべき。知事の所見は。 (For the disaster-stricken areas and for Japan's future, Tokyo should take the lead. What are the Governor's views?)

Assembly member speech:
U.1 我々が生きている日本列島は、数限りない天変地異に見舞われてきました。 (The Japanese archipelago we live on has been struck by countless natural disasters.)
U.2 被災地のため、そして、日本の未来のために、東京は先頭に立つべきと考えますが、知事の所見を伺います。 (For the disaster-stricken areas and for Japan's future, I believe Tokyo should take the lead; I ask for the Governor's views.)
…

SLIDE 8

Contents

  • Introduction and Objective
  • Outline of Our System
  • Training Data Construction
  • Formal Run
  • Summary

SLIDE 9
Outline of Our System

  • Training data construction
  • Training the utterance importance prediction model
  • Utterance extraction with the trained model

Pipeline:
  • Speeches + reference summaries → training data (speeches with importance scores for the utterances)
  • Training data → importance prediction model
  • Speech (Utterance 1, Utterance 2, …) → importance prediction model → scores (Utterance 1: 0.4, Utterance 2: 0.8, …) → generated summary (Utterance 2: 0.8)
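A minimal sketch of the final extraction step of this pipeline, assuming the summary keeps utterances whose predicted score clears a threshold; the threshold value is an illustrative assumption (the figure above selects the single top-scored utterance):

```python
def extract_summary(utterances, scores, threshold=0.5):
    """Keep the utterances whose predicted importance exceeds the threshold.

    threshold=0.5 is an illustrative assumption, not a value from the deck.
    """
    return [u for u, s in zip(utterances, scores) if s > threshold]

# With the scores shown in the figure (0.4 and 0.8), only the second
# utterance is selected as the generated summary.
summary = extract_summary(["Utterance 1", "Utterance 2"], [0.4, 0.8])
```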

SLIDE 10

Contents

  • Introduction and Objective
  • Outline of Our System
  • Training Data Construction
  • Formal Run
  • Summary

SLIDE 11

Training Data Construction – Assignment of Importance Scores

  • Automatic assignment of an importance score to each utterance using a word similarity
    • We regard the word similarity as the importance score
  • Evaluation of similarity measures
    • e.g., cosine similarity, edit distance, …

Each utterance is compared with every sentence of the summary (e.g., similarities 0.123 and 0.900 for Utterance 1, 0.201 and 0.820 for Utterance 2, …), and the maximum score is assigned as the utterance's importance.
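The assignment step above can be sketched as follows: each utterance receives, as its importance score, its maximum similarity over the summary sentences. Bag-of-words cosine similarity stands in for the similarity measure, and whitespace tokenization is a simplification (the actual system processes Japanese text, which requires morphological analysis):

```python
from collections import Counter
from math import sqrt

def cosine_bow(a: str, b: str) -> float:
    """Cosine similarity between the bag-of-words vectors of two strings."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[w] * vb[w] for w in va)
    na = sqrt(sum(c * c for c in va.values()))
    nb = sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def assign_importance(utterances, summary_sentences):
    """Score every utterance by its best match among the summary sentences."""
    return [max(cosine_bow(u, s) for s in summary_sentences)
            for u in utterances]
```

An utterance that shares many words with some summary sentence thus gets a high score, matching the hypothesis on the previous slides.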

SLIDE 12
Training Data Construction – Evaluation of Similarity Measures

  • Given corpus: 529 speeches (7,226 utterances)
    • Training data: 477 speeches (6,551 utterances)
    • Development data: 52 speeches (675 utterances)
  • For each similarity measure (cosine similarity between BoWs, edit distance, …), we construct training data from the speeches and reference summaries, train an importance prediction model, and evaluate the summaries it generates.

SLIDE 13

Training Data Construction – Similarity Measures

  • Cosine similarity between bags-of-words
  • Edit distance
    • We adopt 1 − (the distance) as the similarity measure
  • ROUGE-1 similarity score
    • We use word unigram overlap
  • Cosine similarity between sentence embeddings
    • Two methods to generate sentence embeddings
      • Average of word embeddings generated with word2vec
      • Sentence embedding generated with doc2vec
  • Average of all the similarity measures

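Two of the measures listed above can be sketched as follows. The slide does not give the exact normalization of the edit distance; this sketch assumes the character-level Levenshtein distance divided by the longer string's length before taking 1 − distance, and implements ROUGE-1 as a unigram-overlap F-score:

```python
from collections import Counter

def edit_distance_sim(a: str, b: str) -> float:
    """1 - (character-level Levenshtein distance / max length); the
    normalization by max length is an assumption for illustration."""
    m, n = len(a), len(b)
    if max(m, n) == 0:
        return 1.0
    prev = list(range(n + 1))          # DP row for the empty prefix of a
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cur[j] = min(prev[j] + 1,                       # deletion
                         cur[j - 1] + 1,                    # insertion
                         prev[j - 1] + (a[i - 1] != b[j - 1]))  # substitution
        prev = cur
    return 1.0 - prev[n] / max(m, n)

def rouge1_sim(a, b) -> float:
    """Unigram-overlap F-score between two token lists."""
    if not a or not b:
        return 0.0
    overlap = sum((Counter(a) & Counter(b)).values())
    p, r = overlap / len(a), overlap / len(b)
    return 2 * p * r / (p + r) if p + r else 0.0
```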

SLIDE 14

Training Data Construction – Result of Similarity Measures Evaluation

  • Evaluation of generated summaries

  Similarity measure                                         ROUGE-1
  Cosine similarity between bags-of-words                    0.333
  Edit distance                                              0.338
  ROUGE-1 similarity score                                   0.341
  Cosine similarity between sentence embeddings (word2vec)   0.306
  Cosine similarity between sentence embeddings (doc2vec)    0.316
  Average of all the similarity measures                     0.349

The average of all the similarity measures is adopted for the formal run.

SLIDE 15

Contents

  • Introduction and Objective
  • Outline of Our System
  • Training Data Construction
  • Formal Run
  • Summary

SLIDE 16

Settings for Formal Run

  • Importance prediction model
    • Features: BoW, sentence position in the speech, speaker of the speech
    • Support vector regression (SVR)
  • Our methods for the formal run
    • w/ sentence compression: we applied sentence compression on the basis of simple rules
    • w/o sentence compression
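One plausible reading of the feature set is sketched below: bag-of-words counts, the utterance's relative position in the speech, and a one-hot speaker indicator. The vocabulary, speaker list, and exact encodings are assumptions for illustration, not the paper's specification:

```python
from collections import Counter

def features(utterance_tokens, position, n_utterances, speaker, vocab, speakers):
    """Build one feature vector: BoW counts + relative position + speaker one-hot."""
    bow = Counter(utterance_tokens)
    vec = [bow[w] for w in vocab]                     # bag-of-words counts
    vec.append(position / max(n_utterances - 1, 1))   # relative position in speech
    vec.extend(1.0 if s == speaker else 0.0 for s in speakers)  # speaker one-hot
    return vec
```

Vectors built this way, paired with the automatically assigned importance scores, would then be fed to a support vector regressor (e.g., `sklearn.svm.SVR().fit(X, y)`).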

Compression example: このため、関係機関と連携し、狭隘道路における消火栓等の整備を促進してまいります。 (Therefore, in cooperation with related organizations, we will promote the installation of fire hydrants and other facilities on narrow roads.) The rules keep the span from the first content word to the last verbal noun.
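A hedged sketch of one reading of the compression rule: keep the span from the first content word to the last verbal noun. The POS labels are illustrative placeholders; the real system operates on the output of a Japanese morphological analyzer:

```python
def compress(tagged_tokens):
    """tagged_tokens: list of (surface, pos) pairs; returns the kept surfaces.

    The POS tag set here ("noun", "verbal_noun", ...) is an assumption for
    illustration; it keeps the span from the first content word to the last
    verbal noun, dropping leading connectives and trailing function words.
    """
    content = {"noun", "verb", "adjective", "verbal_noun"}
    start = next((i for i, (_, p) in enumerate(tagged_tokens) if p in content), 0)
    ends = [i for i, (_, p) in enumerate(tagged_tokens) if p == "verbal_noun"]
    end = ends[-1] if ends else len(tagged_tokens) - 1
    return [w for w, _ in tagged_tokens[start:end + 1]]
```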

SLIDE 17
Result on Formal Run – ROUGE Scores

  • Our methods outperformed OtherSysAve on all the scores
  • The F-measure on ROUGE N4 of the method with sentence compression was the best score
    • It can generate summaries containing important phrases

  ROUGE scores (surface form):

                            Recall                     F-measure
                            N1    N2    N3    N4       N1    N2    N3    N4
  w/o sentence compression  0.440 0.185 0.121 0.085    0.357 0.147 0.096 0.067
  w/ sentence compression   0.390 0.174 0.113 0.078    0.343 0.154 0.101 0.069
  OtherSysAve               0.282 0.096 0.058 0.038    0.272 0.088 0.051 0.033

  OtherSysAve: the average scores of all the submitted runs of all the participants
SLIDE 18

Result on Formal Run – Participants Assessment

  • Quality question scores
    • The method w/o the sentence compression step outperformed OtherSysAve on all the scores
    • The well-formedness score of the method w/ sentence compression was lower than that of OtherSysAve

                            Content  Formed  Total (X=0)  Total (X=2)
  w/o sentence compression  0.856    1.134   1.732        0.912
  w/ sentence compression   0.788    1.035   1.308        0.667
  OtherSysAve               0.423    0.603   1.655        0.435

The improvement of the sentence compression step is important future work.

SLIDE 19

Summary

  • KitAi-PI: an extractive summarization system
    • Automatic training data construction
    • Application of a supervised machine learning method
  • The formal run result showed the effectiveness of our method
    • Summaries contained important phrases but included ill-formed ones
  • The improvement of the sentence compression step is important future work