
slide-1
SLIDE 1

SEQ3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression

Christos Baziotis, Ion Androutsopoulos, Ioannis Konstas, Alexandros Potamianos

EdinburghNLP

University of Edinburgh Natural Language Processing

NAACL-HLT 2019, Minneapolis, USA

Baziotis et al. SEQ3 Autoencoder 1 / 12

slide-2
SLIDE 2

Introduction

Examples of sequence-to-sequence tasks:

Machine Translation

the big black cat … → η μεγάλη μαύρη γάτα …

Dialogue

A: What do you want to do tonight? B: Let’s go for a movie!

Text to Code

sort a list of numbers →

    for i in range(len(A)):
        min_idx = i
        for j in range(i + 1, len(A)):
            if A[min_idx] > A[j]:
                min_idx = j
        A[i], A[min_idx] = A[min_idx], A[i]

Other examples: Sentence Compression, Text to Tree

Baziotis et al. SEQ3 Autoencoder 2 / 12

slide-3
SLIDE 3


SEQ3: Sequence-to-Sequence-to-Sequence Autoencoder

Input Sentence → Compression → Reconstruction

Baziotis et al. SEQ3 Autoencoder 2 / 12

slide-4
SLIDE 4

Unsupervised Models for Language

Vanilla Autoencoders

[Diagram: the input tokens x1, x2, …, xN are encoded to a continuous vector and decoded back to the reconstruction x̂1, x̂2, …, x̂N.]

Baziotis et al. SEQ3 Autoencoder 3 / 12

slide-5
SLIDE 5

Unsupervised Models for Language



Discrete Latent Variable Autoencoders

[Diagram: the input x1, x2, …, xN is encoded, a discrete latent sequence is sampled, and the input is reconstructed as x̂1, x̂2, …, x̂N.]

+ Model the discreteness of language
− Sampling is not differentiable
− REINFORCE: sample-inefficient and unstable

Baziotis et al. SEQ3 Autoencoder 3 / 12

slide-6
SLIDE 6

Contributions

Model                     Supervision
Miao & Blunsom (2016)     semi
Wang & Lee (2018)         weak
Fevry & Phang (2018)      none
seq3                      none

(Models are also compared on: Abstractive, Differentiable, Latent.)

seq3 Features (+ contributions)

+ Fully unsupervised and abstractive
+ Fully differentiable (continuous approximations)
+ Topic-grounded compressions
+ Human-readable compressions via an LM prior
+ User-defined, flexible compression ratio
+ SOTA in unsupervised sentence compression

Baziotis et al. SEQ3 Autoencoder 4 / 12

slide-7
SLIDE 7

seq3 Overview

[Figure: the input sentence x1, x2, …, xN is read by the Compressor (an encoder–decoder), which generates a shorter latent sequence y1, y2, …, yM (the compression); the Reconstructor (a second encoder–decoder) reads the compression and produces the reconstruction x̂1, x̂2, …, x̂N.]

Baziotis et al. SEQ3 Autoencoder 5 / 12


slide-19
SLIDE 19

seq3 Overview

Reconstruction loss: distill input into the latent sequence


Baziotis et al. SEQ3 Autoencoder 5 / 12

Reconstruction Loss

Minimize the input reconstruction error:

L_R(x, x̂) = − Σ_{i=1}^{N} log p_R(x̂_i = x_i)
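The reconstruction term can be sketched in plain Python. `reconstruction_loss`, its list-of-distributions input, and the toy numbers below are illustrative stand-ins, not the paper's implementation (which operates on batched decoder logits):

```python
import math

def reconstruction_loss(step_probs, target_ids):
    # L_R(x, x_hat) = -sum_{i=1}^{N} log p_R(x_hat_i = x_i)
    # step_probs[i]: the reconstructor's distribution over the vocabulary
    # at step i; target_ids[i]: the id of the gold input token x_i
    return -sum(math.log(p[t]) for p, t in zip(step_probs, target_ids))

# toy example: vocabulary of size 3, input of length 2
loss = reconstruction_loss([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]], [0, 1])
```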


slide-22
SLIDE 22

seq3 Overview

Reconstruction loss: distill input into the latent sequence LM Prior loss: human-readable compressions


Baziotis et al. SEQ3 Autoencoder 5 / 12

LM Prior Loss

Minimize the D_KL between the Compressor and the LM:

L_P = (1/M) Σ_{t=1}^{M} D_KL( p_C(y_t | y_<t, x) ‖ p_LM(y_t | y_<t) )
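As a sketch, the prior term is a per-step KL divergence averaged over the compression length. `lm_prior_loss` and the explicit probability lists are illustrative; the actual model computes this over the Compressor's and LM's softmax outputs:

```python
import math

def kl(p, q):
    # D_KL(p || q) over a shared vocabulary (zero-probability terms skipped)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def lm_prior_loss(p_compressor, p_lm):
    # L_P = (1/M) * sum_{t=1}^{M} D_KL(p_C(y_t | y_<t, x) || p_LM(y_t | y_<t))
    M = len(p_compressor)
    return sum(kl(pt, qt) for pt, qt in zip(p_compressor, p_lm)) / M
```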


slide-25
SLIDE 25

seq3 Overview

Reconstruction loss: distill input into the latent sequence LM Prior loss: human-readable compressions Topic loss: similar topic as input


Baziotis et al. SEQ3 Autoencoder 5 / 12

Topic Loss

v_x: IDF-weighted average of the input (source) word embeddings e^s_i

v_y: average of the compression word embeddings e^c_i

L_T = 1 − cos(v_x, v_y)
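The topic loss can be sketched with plain lists standing in for embeddings. `topic_loss` and its arguments are hypothetical names; the model uses pretrained word embeddings and IDF weights computed on the training corpus:

```python
import math

def topic_loss(src_embs, src_idf, comp_embs):
    # v_x: IDF-weighted average of the source embeddings
    # v_y: plain average of the compression embeddings
    # L_T = 1 - cos(v_x, v_y)
    dim = len(src_embs[0])
    z = sum(src_idf)
    vx = [sum(w * e[d] for w, e in zip(src_idf, src_embs)) / z for d in range(dim)]
    vy = [sum(e[d] for e in comp_embs) / len(comp_embs) for d in range(dim)]
    dot = sum(a * b for a, b in zip(vx, vy))
    nx = math.sqrt(sum(a * a for a in vx))
    ny = math.sqrt(sum(b * b for b in vy))
    return 1.0 - dot / (nx * ny)
```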

slide-26
SLIDE 26

seq3 Overview

Reconstruction loss: distill the input into the latent sequence
LM Prior loss: human-readable compressions
Topic loss: similar topic as the input
Length constraints: user-defined shorter length


Baziotis et al. SEQ3 Autoencoder 5 / 12

Length Constraints

1. Length-aware decoder initialization
2. Countdown inputs
3. Explicit length penalty

slide-27
SLIDE 27

Differentiable Sampling

Straight-Through + Gumbel-softmax (Bengio et al., 2013; Maddison et al., 2017; Jang et al., 2017)

Forward pass: discrete embedding via the Gumbel-max trick,
  z = argmax_i (a_i + ξ_i),  ξ_i ∼ Gumbel(0, 1)

Backward pass: mixture of embeddings via the Gumbel-softmax approximation,
  ẑ = softmax((a + ξ) / τ), through which the gradient flows

Baziotis et al. SEQ3 Autoencoder 6 / 12

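A minimal sketch of Straight-Through Gumbel-softmax sampling, assuming plain Python lists and no autodiff; in the actual model gradients flow through the soft distribution, which an autodiff framework would handle:

```python
import math
import random

def gumbel_softmax_st(logits, tau=0.5):
    # forward pass: one-hot sample via the Gumbel-max trick
    # backward pass (in a real autodiff framework): gradients would flow
    # through the soft Gumbel-softmax distribution returned alongside it
    g = [-math.log(-math.log(random.random() + 1e-20)) for _ in logits]
    y = [(l + gi) / tau for l, gi in zip(logits, g)]
    m = max(y)
    exps = [math.exp(v - m) for v in y]
    total = sum(exps)
    soft = [e / total for e in exps]          # relaxed distribution
    k = y.index(m)                            # Gumbel-max sample
    hard = [1.0 if i == k else 0.0 for i in range(len(y))]
    return hard, soft
```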

slide-30
SLIDE 30

Experimental Setup

Dataset
  Training: Gigaword (English), source sentences only
  Evaluation: DUC-2003, DUC-2004

Training
  Train the LM (for the LM prior) → train seq3
  Never exposed to target sentences (compressions)
  Vocabulary: 15K most frequent words in the source sentences

Metrics
  Average F1 of ROUGE-1, ROUGE-2, ROUGE-L

Baziotis et al. SEQ3 Autoencoder 7 / 12
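For intuition, ROUGE-N F1 over whitespace tokens can be sketched as n-gram overlap. `rouge_n_f1` is a simplified stand-in for the official scorer (no stemming, stopword handling, or bootstrapping):

```python
from collections import Counter

def rouge_n_f1(candidate, reference, n=1):
    # minimal ROUGE-N F1: n-gram overlap between candidate and reference
    def ngrams(text):
        toks = text.split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    c, r = ngrams(candidate), ngrams(reference)
    overlap = sum((c & r).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    prec = overlap / sum(c.values())
    rec = overlap / sum(r.values())
    return 2 * prec * rec / (prec + rec)
```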


slide-32
SLIDE 32

Results on Gigaword

Supervision    Model                                        R-1    R-2    R-L
Unsupervised   Lead-8 (Rush et al., 2015)                  21.86   7.66  20.45
               Pretrained Generator (Wang & Lee, 2018)     21.26   5.60  18.89
               seq3                                        25.39   8.21  22.68
Weak           Adv. REINFORCE (Wang & Lee, 2018)           28.11   9.97  25.41
Supervised     ABS (Rush et al., 2015)                     29.55  11.32  26.42
               SEASS (Zhou et al., 2017)                   36.15  17.54  33.63
               words-lvt5k-1sent (Nallapati et al., 2016)  36.40  17.70  33.71

Table: Results on (English) Gigaword for sentence compression.

Baziotis et al. SEQ3 Autoencoder 8 / 12

slide-33
SLIDE 33

Ablation

Model            R-1            R-2            R-L
seq3 (Full)      25.39          8.21           22.68
seq3 w/o LM      24.48 (−0.91)  6.68 (−1.53)   21.79 (−0.89)
seq3 w/o topic    3.89          0.10            3.75

Table: Ablation results on Gigaword.

Both the topic and LM losses work in synergy:
  LM prior loss: how words should be included
  Topic loss: which words to include

Baziotis et al. SEQ3 Autoencoder 9 / 12

slide-34
SLIDE 34

Model Outputs

INPUT

the central election commission ( cec ) on monday decided that taiwan will hold another election of national assembly members in may # .

GOLD

national <unk> election scheduled for may

SEQ3

the central election commission ( cec ) announced elections .

INPUT

dave bassett resigned as manager of struggling english premier league side nottingham forest on saturday after they were knocked out of the f.a. cup in the third round , according to local reports on saturday .

GOLD

forest manager bassett quits

SEQ3

dave bassett resigned as manager of struggling english premier league side UNK forest on knocked round press

Baziotis et al. SEQ3 Autoencoder 10 / 12

slide-35
SLIDE 35

Conclusions and Future Work

Conclusions
  Fully differentiable seq2seq2seq (seq3) autoencoder
  SOTA in unsupervised abstractive sentence compression
  Topic loss is essential for convergence
  LM prior improves readability


Baziotis et al. SEQ3 Autoencoder 11 / 12

slide-36
SLIDE 36

Conclusions and Future Work

Conclusions
  Fully differentiable seq2seq2seq (seq3) autoencoder
  SOTA in unsupervised abstractive sentence compression
  Topic loss is essential for convergence
  LM prior improves readability

Next step: unsupervised machine translation


Baziotis et al. SEQ3 Autoencoder 11 / 12

slide-37
SLIDE 37

Questions?

Source code

  • https://github.com/cbaziotis/seq3

Contact me

  • christos.baziotis@gmail.com
  • @cbaziotis

Baziotis et al. SEQ3 Autoencoder 12 / 12

slide-38
SLIDE 38

Appendix Bonus Slides

Baziotis et al. SEQ3 Autoencoder 1 / 8


slide-40
SLIDE 40

Differentiable Sampling (Extended)

Soft-argmax: weighted sum of embeddings from a peaked softmax, softmax(a / τ) with a small temperature τ (Goyal et al., 2017)

Baziotis et al. SEQ3 Autoencoder 2 / 8

Gumbel-Softmax

Gumbel-max trick: sampling y ∼ softmax(a) is equivalent to y = argmax_i (a_i + ξ_i), ξ_i ∼ Gumbel
Gumbel-softmax relaxation: ŷ = softmax((a_i + ξ_i) / τ)


slide-42
SLIDE 42

Differentiable Sampling (Extended)

Soft-argmax: weighted sum of embeddings from a peaked softmax (Goyal et al., 2017)

Gumbel-softmax: differentiable approximation to sampling (Maddison et al., 2017; Jang et al., 2017)

Straight-Through: forward pass one-hot, backward pass soft (Bengio et al., 2013)

Baziotis et al. SEQ3 Autoencoder 2 / 8

slide-43
SLIDE 43

Out of Vocabulary (OOV) Words

We copy OOV words using the approach of Fevry and Phang (2018). Simpler alternative to pointer networks (See et al., 2017).

1. We use a set of special OOV tokens: oov1, oov2, …, oovN.
2. We replace the i-th unknown word in the input with the oovi token.
3. If all the OOV tokens are used, we use the generic UNK token.
4. At inference, we replace the special tokens with the original words.

OOV Handling Example

RAW

“John arrived in Rome yesterday. While in Rome, John had fun.”

INPUT

“oov1 arrived in oov2 yesterday. While in oov2, oov1 had fun.”

OOVs

John, Rome
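The four steps above can be sketched as follows; `replace_oov` and its arguments are illustrative names, not the released implementation:

```python
def replace_oov(tokens, vocab, n_oov=10):
    # map each distinct unknown word to a reusable oov-i placeholder;
    # fall back to the generic UNK once the placeholders run out
    mapping, out = {}, []
    for tok in tokens:
        if tok in vocab:
            out.append(tok)
            continue
        if tok not in mapping:
            mapping[tok] = f"oov{len(mapping) + 1}" if len(mapping) < n_oov else "UNK"
        out.append(mapping[tok])
    return out, mapping

# at inference, invert the mapping to restore the original words
tokens = "John arrived in Rome yesterday . While in Rome , John had fun .".split()
vocab = {"arrived", "in", "yesterday", ".", "While", ",", "had", "fun"}
out, mapping = replace_oov(tokens, vocab)
```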

Baziotis et al. SEQ3 Autoencoder 3 / 8

slide-44
SLIDE 44

Temperature for Gumbel-Softmax

Temperature τ does not affect the forward pass, but it affects gradients.

1. Jang et al. (2017) anneal τ → 0.
2. Gulcehre et al. (2017) learn τ:
   τ(h^c_t) = 1 / log(1 + exp(w_τ^⊺ h^c_t)) + 1
3. Havrylov & Titov (2017) learn τ with a tuned lower bound τ0:
   τ(h^c_t) = 1 / log(1 + exp(w_τ^⊺ h^c_t)) + τ0

[Figure: τ as a function of the activation for τ0 = 0.5, 1, 2; τ0 acts as a lower bound on the temperature.]

In our experiments the learned temperature led to instability; we fix τ = 0.5, following Gu et al. (2018).

Baziotis et al. SEQ3 Autoencoder 4 / 8
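The learned-temperature formula of Havrylov & Titov (2017) above can be sketched directly; `learned_tau` is an illustrative name and the weight vector below is a toy stand-in for w_τ:

```python
import math

def learned_tau(w, h, tau0=0.5):
    # tau(h_t) = 1 / log(1 + exp(w . h_t)) + tau0
    # tau0 is a lower bound: as the activation grows, tau approaches tau0
    s = sum(wi * hi for wi, hi in zip(w, h))
    return 1.0 / math.log(1.0 + math.exp(s)) + tau0

# a large activation drives tau close to (but above) the bound tau0
tau = learned_tau([1.0], [100.0], tau0=0.5)
```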

slide-45
SLIDE 45

Implementation Details

Hyper-Parameters
  Encoders: 2-layer bidirectional LSTMs of size 300
  Decoders: 2-layer unidirectional LSTMs of size 300
  Embeddings: initialized with 100d GloVe (Pennington et al., 2014)

Parameter Sharing
  Tied encoders of the compressor and reconstructor
  Shared embedding layer for all encoders and decoders
  Tied embedding–output layers of both decoders

Baziotis et al. SEQ3 Autoencoder 5 / 8


slide-49
SLIDE 49

Length Control

1. Sample target length M.
2. Length-aware initialization of the decoder's state.
3. Countdown input.
4. Explicit length penalty.

[Figure: the decoder is initialized via g(·) from the target length M and the encoder states over x1 … xN; a countdown value is fed as an extra input at each step; the length penalty compares the output at step M against <EOS>.]

Baziotis et al. SEQ3 Autoencoder 6 / 8
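Step 3, the countdown input, can be sketched as a precomputed schedule; `countdown_inputs` and the clamping at zero past the target length are illustrative assumptions, not the paper's exact formulation:

```python
def countdown_inputs(M, steps):
    # scalar countdown fed to the decoder at each step:
    # M-1, M-2, ..., 1, 0, then clamped at 0 past the target length
    return [max(M - 1 - t, 0) for t in range(steps)]
```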

slide-50
SLIDE 50

Results on DUC Shared Tasks

Model                         R-1    R-2    R-L
Topiary (Zajic et al., 2007)  25.12   6.46  20.12
Woodsend et al. (2010)        22.00   6.00  17.00
abs (Rush et al., 2015)       28.18   8.49  23.81
Prefix                        20.91   5.52  18.20
seq3 (Full)                   22.13   6.18  19.30

Table: Results on DUC-2004.

Model                         R-1    R-2    R-L
abs (Rush et al., 2015)       28.48   8.91  23.97
Prefix                        21.30   6.38  18.82
seq3 (Full)                   20.90   6.08  18.55

Table: Results on DUC-2003.

Baziotis et al. SEQ3 Autoencoder 7 / 8

slide-51
SLIDE 51

Model Output (Extra)

INPUT

the american sailors who thwarted somali pirates flew home to the u.s. on wednesday but without their captain , who was still aboard a navy destroyer after being rescued from the hijackers .

GOLD

us sailors who thwarted pirate hijackers fly home

SEQ3

the american sailors who foiled somali pirates flew home after crew hijacked .

Baziotis et al. SEQ3 Autoencoder 8 / 8