
Deep Keyphrase Generation
Rui Meng, Sanqiang Zhao, Shuguang Han, Daqing He, Peter Brusilovsky, Yu Chi
School of Computing and Information, University of Pittsburgh

Introduction
Keyphrase
• Keyphrase
  o A short text that summarizes the significant content of a document
  o Applications
    o Knowledge mining (concept)
    o Information retrieval (indexing term)
    o Summarization
  o Provided by authors/editors
• This work aims to obtain keyphrases from scientific papers (title + abstract) automatically

Background
Previous Approaches
• 3-step process
  Source text: "Recommender systems play an important role in reducing the negative impact of information overload on those websites where users have the possibility of voting for their preferences on items…"
  1. Find candidates (noun phrases etc.)
     recommender systems, important role, negative impact, information overload, websites, users, possibility of voting, preferences, items…
  2. Scoring
     recommender systems (0.733), important role (0.019), negative impact (0.057), information overload (0.524), websites (0.132), users (0.014), possibility of voting (0.104), preferences (0.197), items (0.027)…
  3. Rank and return Top K
     1. recommender systems (0.733)
     2. information overload (0.524)
     3. preferences (0.197)
     4. websites (0.132)
     5. negative impact (0.057)
• Issues
  1. Candidates must be acquired from the source text.
     • Only able to predict phrases that appear in the text
     • Performance upper bound (share of ground-truth keyphrases present in or absent from the text):
       Dataset    % Present   % Absent
       Inspec     73.62%      26.38%
       Krapivin   54.33%      45.67%
       NUS        45.63%      54.37%
       SemEval    55.66%      44.34%
  2. Heavy reliance on manual feature design
     • Simple features can hardly represent deep semantics
     • Neither flexible nor scalable
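
Step 3 of the extraction pipeline (rank and return Top K) can be sketched as follows. The scores are copied from the slide's worked example; the function name `rank_top_k` is an illustrative assumption, not part of any baseline system.

```python
def rank_top_k(scored_candidates, k):
    """Return the k highest-scoring (phrase, score) pairs, best first."""
    ranked = sorted(scored_candidates.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Candidate scores taken from the slide's example (step 2, "Scoring").
scored = {
    "recommender systems": 0.733,
    "important role": 0.019,
    "negative impact": 0.057,
    "information overload": 0.524,
    "websites": 0.132,
    "users": 0.014,
    "possibility of voting": 0.104,
    "preferences": 0.197,
    "items": 0.027,
}

top3 = rank_top_k(scored, 3)
for rank, (phrase, score) in enumerate(top3, start=1):
    print(f"{rank}. {phrase} ({score:.3f})")
```

Because the ranking only reorders candidates found in step 1, a keyphrase that never occurs in the source text cannot be returned, which is the first issue listed on the slide.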

Motivation
Revisit Keyphrase Generation
• How do humans assign keyphrases?
  1. Read the text
  2. Understand it and get contextual information
  3. Summarize and write down the most meaningful phrases
  4. Get hints from the text and copy certain phrases
• Can a machine simulate this process?
  • Recurrent Neural Networks [Steps 1-3]
  • Copy Mechanism [Step 4]
Diagram: Read & Understand -> Memory -> Write -> Keyphrase (topic tracking, multilingual, text mining, native language hypothesis, ……)

Methodology
Recurrent Neural Networks
Diagram: Input text -> Encoder RNN (read) -> context vector (memory) -> Decoder RNN (write) -> output text
Decoding example, first word: topic (Prob=0.257), multilingual (Prob=0.122), multiple (Prob=0.119), latent (Prob=0.101), text (Prob=0.093)
Decoding example, phrases: topic tracking (Prob=0.027), topic model (Prob=0.022), multiple language (Prob=0.013), latent dirichlet allocation (Prob=0.010), text mining (Prob=0.014), text analysis (Prob=0.003), …
o Encoder-decoder model (Seq2seq)
o One encoder RNN and one decoder RNN
o Gated recurrent units (GRU) cell
o Decoder generates multiple short sequences by beam search

Methodology
Recurrent Neural Networks
Diagram: Input text -> Encoder RNN (read) -> context vector (memory) -> Decoder RNN (write) -> output text
Output phrases: topic tracking (Prob=0.027), topic model (Prob=0.022), multilingual (Prob=0.122), multiple language (Prob=0.013), latent dirichlet allocation (Prob=0.010), text mining (Prob=0.014), text analysis (Prob=0.003), …
o Encoder-decoder model (Seq2seq)
o One encoder RNN and one decoder RNN
o Gated recurrent units (GRU) cell
o Decoder generates multiple short sequences by beam search
o Rank them and return the top K results
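
A minimal sketch of how beam search can produce several short sequences and rank them. The next-word table `TOY_MODEL` is a hypothetical stand-in for the GRU decoder, its probabilities are invented for illustration, and `beam_search` is an assumed name; none of this is the authors' implementation.

```python
import math

# Hypothetical toy "decoder": maps a prefix to next-token probabilities.
TOY_MODEL = {
    (): {"topic": 0.6, "text": 0.4},
    ("topic",): {"tracking": 0.5, "model": 0.4, "<eos>": 0.1},
    ("text",): {"mining": 0.7, "<eos>": 0.3},
    ("topic", "tracking"): {"<eos>": 1.0},
    ("topic", "model"): {"<eos>": 1.0},
    ("text", "mining"): {"<eos>": 1.0},
}


def beam_search(model, beam_size=3, max_len=4):
    """Return finished phrases with their probabilities, most probable first."""
    beams = [((), 0.0)]  # (prefix, log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, logp in beams:
            for token, p in model.get(prefix, {"<eos>": 1.0}).items():
                if token == "<eos>":
                    if prefix:  # a non-empty phrase is complete
                        finished.append((" ".join(prefix), logp + math.log(p)))
                else:
                    candidates.append((prefix + (token,), logp + math.log(p)))
        if not candidates:
            break
        # Keep only the beam_size most probable partial phrases.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    finished.sort(key=lambda f: f[1], reverse=True)
    return [(phrase, math.exp(logp)) for phrase, logp in finished]


results = beam_search(TOY_MODEL, beam_size=3)
for phrase, prob in results:
    print(f"{phrase}  Prob={prob:.3f}")
```

Every sequence that reaches the end-of-sequence token is kept as a candidate keyphrase, so a single decoding pass yields many phrases that can then be ranked and cut to the top K.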

Methodology
Recurrent Neural Networks
Diagram: Input text -> Encoder RNN (read) -> context vector -> Decoder RNN (write) -> output text; target phrases such as "topic tracking", "multiple language" and "multilingual" are emitted as runs of unk
Problem of RNN model
• Keep everything in memory
• Only train vectors for top 50k high-frequency words
• Long-tail words are replaced with an "unknown" symbol <unk>
  o Unable to predict long-tail words
  o Many keyphrases contain long-tail words (2%)
RNN Dictionary: 50k words ("topic", "text", "multiple", "language", "multilingual")
50k short-tail words vs. 250k long-tail words
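
The vocabulary cut-off described on this slide can be sketched as below. The tiny corpus and the limit of 2 words are assumptions chosen so the effect is visible; the slide's model keeps the top 50k words. The helper names `build_vocab` and `replace_oov` are illustrative.

```python
from collections import Counter


def build_vocab(tokenized_docs, max_size):
    """Keep only the max_size most frequent words."""
    counts = Counter(token for doc in tokenized_docs for token in doc)
    return {token for token, _ in counts.most_common(max_size)}


def replace_oov(tokens, vocab, unk="<unk>"):
    """Replace every out-of-vocabulary word with the unknown symbol."""
    return [token if token in vocab else unk for token in tokens]


# Toy corpus (assumption): "topic" occurs 3 times, "text" twice, the rest once.
docs = [
    ["topic", "model", "text"],
    ["topic", "text", "mining"],
    ["topic", "multilingual"],
]
vocab = build_vocab(docs, max_size=2)
encoded = replace_oov(["multilingual", "topic", "tracking"], vocab)
print(encoded)
```

Once a long-tail word has been mapped to `<unk>`, the decoder has no way to emit the original word again, which is why a plain RNN cannot predict keyphrases that contain such words.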

Methodology
Copy Mechanism
Diagram: Input text -> Encoder RNN (read) -> context vector -> Decoder RNN (write) -> output text; phrases that a plain RNN emits as unk (topic tracking, multiple language, multilingual, native language hypothesis) can be copied from the input
CopyRNN Model
o Copy words from input text
o Locate the words of interest by contextual features
o Copy corresponding part to output
o Enhance the RNN with extractive ability
RNN Dictionary: 50k words ("topic", "text", "multiple", "language", "multilingual")
50k short-tail words vs. 250k long-tail words
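
One common way to combine generating and copying is to normalize the vocabulary scores and the per-position source scores in a single softmax, then add up the probability mass of identical words. The slide does not give the exact formulation, so the sketch below is an assumption; the scores are invented and `copy_augmented_distribution` is a hypothetical name.

```python
import math


def copy_augmented_distribution(gen_scores, copy_scores, source_tokens):
    """Merge generate scores (per vocabulary word) and copy scores
    (per source position) into one probability distribution over words."""
    all_scores = list(gen_scores.values()) + list(copy_scores)
    shift = max(all_scores)  # subtract the max for numerical stability
    norm = sum(math.exp(s - shift) for s in all_scores)
    probs = {word: math.exp(s - shift) / norm for word, s in gen_scores.items()}
    for token, s in zip(source_tokens, copy_scores):
        # A source word gains mass even if it is outside the vocabulary.
        probs[token] = probs.get(token, 0.0) + math.exp(s - shift) / norm
    return probs


# Invented scores: "multilingual" is not in the vocabulary but is in the input.
gen_scores = {"topic": 1.0, "text": 0.5, "<unk>": 0.0}
source_tokens = ["multilingual", "topic", "tracking"]
copy_scores = [2.0, 1.0, 0.0]

probs = copy_augmented_distribution(gen_scores, copy_scores, source_tokens)
best_word = max(probs, key=probs.get)
print(best_word, round(probs[best_word], 3))
```

In this toy case the out-of-vocabulary word receives the largest probability purely through the copy path, which is the extractive ability the slide refers to.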

Experiment
Dataset
• All data are scientific papers in the Computer Science domain
• Training Data
  • Collected from Elsevier, ACM Digital Library, Web of Science etc.
    o # (Paper) = 571,267
    o # (Phrase) = 3,011,651
    o # (Unique word) = 324,163
• Testing Data
  • Four commonly used datasets, only use abstract text
  • Overlapping papers are removed from training dataset

Dataset    # Paper   # All (Avg)       # Present   # Absent   % Absent
Inspec     500       4,913 (9.82)      3,617       1,296      26.38%
Krapivin   400       2,461 (6.15)      1,337       1,124      45.67%
NUS        211       1,466 (6.94)      669         797        54.37%
SemEval    100       2,339 (23.39)     1,302       1,037      44.34%
KP20k      20,000    105,471 (5.27)    66,221      39,250     37.21%
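
The "% Absent" column follows directly from the present and absent counts in the table. A short check, using only the numbers from the slide (the variable and function names are illustrative):

```python
# (present, absent) keyphrase counts per test set, copied from the table.
counts = {
    "Inspec": (3617, 1296),
    "Krapivin": (1337, 1124),
    "NUS": (669, 797),
    "SemEval": (1302, 1037),
    "KP20k": (66221, 39250),
}


def absent_share(present, absent):
    """Percentage of ground-truth keyphrases that do not occur in the text."""
    return 100.0 * absent / (present + absent)


shares = {name: absent_share(p, a) for name, (p, a) in counts.items()}
for name, share in shares.items():
    print(f"{name}: {share:.2f}% absent")
```

For a purely extractive method, 100% minus this share is the highest recall it could reach on that dataset.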

Experiment
Dataset
#(Unique Keyphrase) = 324,163
Figure: bar chart of the number of keyphrases (y-axis) by length of keyphrase in terms (x-axis, 1 to 10).

Length of keyphrase   Number of keyphrases   Percentage
1                     944,840                30.88%
2                     1,320,695              43.16%
3                     567,462                18.55%
4                     160,002                5.23%
5                     44,348                 1.45%
>5                                           0.73%

Bar values for lengths 6 to 10: 12,873; 4,222; 2,240; 1,140; 592

Experiment
Experiment Setup
• Evaluation Methods
  • Process ground-truth and predicted phrases with Porter stemmer
  • Macro-average of precision, recall and F-measure @5, @10
• Tasks
  1. Present phrases prediction
     o Compare to previous studies: Tf-Idf, TextRank, SingleRank, ExpandRank, KEA, Maui
  2. Absent phrases prediction
     o No baseline comparison
  3. Transfer to news dataset
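
A minimal sketch of macro-averaged precision, recall and F-measure at a cut-off k. Several details are assumptions, since the slide does not spell them out: duplicates are removed before truncating to k, precision is divided by the number of phrases actually returned, and `normalize` defaults to lowercasing as a stand-in for Porter stemming (a stemmer such as NLTK's `PorterStemmer` could be plugged in instead). The example phrases are invented.

```python
def prf_at_k(predicted, gold, k, normalize=str.lower):
    """Precision, recall and F1 of the top-k predictions for one document."""
    top = []
    for phrase in predicted:
        norm = normalize(phrase)
        if norm not in top:  # drop duplicates that appear after normalization
            top.append(norm)
    top = top[:k]
    gold_set = {normalize(phrase) for phrase in gold}
    correct = sum(1 for phrase in top if phrase in gold_set)
    precision = correct / len(top) if top else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_average(per_document_scores):
    """Average each metric over documents, weighting every document equally."""
    n = len(per_document_scores)
    return tuple(sum(scores[i] for scores in per_document_scores) / n for i in range(3))


doc1 = prf_at_k(["Topic Tracking", "text mining", "language model"],
                ["topic tracking", "multilingual"], k=2)
doc2 = prf_at_k(["text mining"], ["text mining"], k=5)
macro = macro_average([doc1, doc2])
print(doc1, doc2, macro)
```

Macro-averaging computes the metrics per document first, so a paper with 20 keyphrases counts no more than a paper with 3.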

Result
Task 1 - Predict Present Keyphrase

             Inspec           Krapivin         NUS              SemEval          KP20k
Method       F@5     F@10     F@5     F@10     F@5     F@10     F@5     F@10     F@5     F@10
Tf-Idf       0.221   0.313    0.129   0.160    0.136   0.184    0.128   0.194    0.102   0.126
TextRank     0.223   0.281    0.189   0.162    0.195   0.196    0.176   0.187    0.175   0.147
SingleRank   0.214   0.306    0.110   0.153    0.140   0.173    0.135   0.176    0.096   0.119
ExpandRank   0.210   0.304    0.110   0.152    0.132   0.164    0.139   0.170    -       -
KEA          0.098   0.126    0.123   0.134    0.069   0.084    0.025   0.026    0.171   0.154
Maui         0.040   0.042    0.249   0.216    0.249   0.268    0.044   0.039    0.270   0.230
RNN          0.085   0.064    0.135   0.088    0.169   0.127    0.157   0.124    0.179   0.189
CopyRNN      0.278   0.342    0.311   0.266    0.334   0.326    0.293   0.304    0.333   0.262
             (24.7%) (9.3%)   (24.9%) (23.1%)  (34.1%) (21.6%)  (66.5%) (56.7%)  (23.3%) (13.9%)

Take-away
1. The naïve RNN model fails to compete with the baseline models.
2. The CopyRNN model outperforms the baseline models and the RNN significantly. The copy mechanism can capture key information in the source text.
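
The percentages in parentheses under the CopyRNN row are not defined on the slide, but they match the relative gain of CopyRNN over the best baseline in each column. The check below uses only the table values; the interpretation and the function name `relative_gain` are assumptions.

```python
# Baseline rows from the table; None marks the cells shown as "-".
BASELINES = {
    "Tf-Idf":     [0.221, 0.313, 0.129, 0.160, 0.136, 0.184, 0.128, 0.194, 0.102, 0.126],
    "TextRank":   [0.223, 0.281, 0.189, 0.162, 0.195, 0.196, 0.176, 0.187, 0.175, 0.147],
    "SingleRank": [0.214, 0.306, 0.110, 0.153, 0.140, 0.173, 0.135, 0.176, 0.096, 0.119],
    "ExpandRank": [0.210, 0.304, 0.110, 0.152, 0.132, 0.164, 0.139, 0.170, None, None],
    "KEA":        [0.098, 0.126, 0.123, 0.134, 0.069, 0.084, 0.025, 0.026, 0.171, 0.154],
    "Maui":       [0.040, 0.042, 0.249, 0.216, 0.249, 0.268, 0.044, 0.039, 0.270, 0.230],
}
COPYRNN = [0.278, 0.342, 0.311, 0.266, 0.334, 0.326, 0.293, 0.304, 0.333, 0.262]


def relative_gain(ours, baselines):
    """Percentage gain over the best baseline, column by column."""
    gains = []
    for col, score in enumerate(ours):
        best = max(row[col] for row in baselines.values() if row[col] is not None)
        gains.append(round(100 * (score - best) / best, 1))
    return gains


gains = relative_gain(COPYRNN, BASELINES)
print(gains)
```

For example, on Inspec F@5 the best baseline is TextRank at 0.223, and (0.278 - 0.223) / 0.223 is about 24.7%.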

Result
Example - Phraseness
[Title] Nonlinear Extrapolation Algorithm for Realization of a Scalar Random Process
[Abstract] A method of construction of a nonlinear extrapolation algorithm is proposed. This method makes it possible to take into account any nonlinear random dependences that exist in an investigated process and are described by mixed central moment functions. The method is based on the V. S. Pugachev canonical decomposition apparatus. As an example, the problem of nonlinear extrapolation is solved for a moment function of third order.
[Ground-truth] 6 ground-truth phrases
• moment function
• nonlinear extrapolation algorithm
• canonical decomposition apparatus
• scalar random process
• nonlinear random dependences
• mixed central moment functions
[Prediction] nonlinear extrapol account CopyRNN Tf-Idf moment function example canon decomposit method extrapol algorithm mixed central moment functions scalar random process moment function random process nonlinear extrapolation central moment function nonlinear extrapolation algorithm nonlinear extrapol algorithm nonlinear random dependences mix central moment function problem central moment process mix central moment pugachev canonical decomposition apparatus random depend realization investig process s nonlinear random depend scalar random process scalar random third order
