SEQ³: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder

SEQ³: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression. Christos Baziotis, Ion Androutsopoulos, Ioannis Konstas, Alexandros Potamianos. EdinburghNLP (University of Edinburgh Natural Language Processing).


  1. SEQ³: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression. Christos Baziotis, Ion Androutsopoulos, Ioannis Konstas, Alexandros Potamianos. EdinburghNLP (University of Edinburgh Natural Language Processing). NAACL-HLT 2019, Minneapolis, USA.

  2. Introduction
     * Dialogue: "A: What do you want to do tonight?" / "B: Let’s go for a movie!"
     * Machine Translation: "the big black cat …" → "η μεγάλη μαύρη γάτα…"
     * Text to Code: "sort a list of numbers" →
           for i in range(len(A)):
               min_idx = i
               for j in range(i+1, len(A)):
                   if A[min_idx] > A[j]:
                       min_idx = j
               A[i], A[min_idx] = A[min_idx], A[i]
     * Text to Tree: "the big black cat …"
     * Sentence Compression

  3. Introduction (same task examples as slide 2). SEQ³: Sequence-to-Sequence-to-Sequence Autoencoder: Input Sentence → Compression → Reconstruction.

  4. Unsupervised Models for Language. Vanilla Autoencoders: $x_1, x_2, \ldots, x_N$ → $\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_N$.

  5. Unsupervised Models for Language
     * Vanilla Autoencoders: $x_1, x_2, \ldots, x_N$ → $\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_N$
     * Discrete Latent Variable Autoencoders: $x_1, x_2, \ldots, x_N$ → … → $\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_N$
       + Model the discreteness of language
       − Sampling is not differentiable
       − REINFORCE: sample inefficient and unstable

  6. Contributions
     Table columns: Model, Supervision, Abstractive, Differentiable, Latent. Supervision by model:
     * Miao & Blunsom (2016): semi
     * Wang & Lee (2018): weak
     * Fevry & Phang (2018): none
     * seq³: none
     seq³ Features (+ contributions)
     + Fully unsupervised and abstractive
     + Fully differentiable (continuous approximations)
     + Topic-grounded compressions
     * Human-readable compressions via LM prior
     * User-defined flexible compression ratio
     * SOTA in unsupervised sentence compression
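
The "continuous approximations" named on the Contributions slide are not spelled out in this transcript. As an illustration only, the sketch below uses the Gumbel-softmax relaxation, a widely used way to draw an approximately one-hot sample through operations that stay differentiable in an autograd framework. Whether this matches the authors' exact choice is an assumption, and the names `gumbel_softmax` and `soft_embedding`, the temperature `tau`, and the toy embeddings are all hypothetical.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Return a relaxed one-hot sample: softmax((logits + Gumbel noise) / tau)."""
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) via inverse transform
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax over the perturbed logits.
    top = max(noisy)
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def soft_embedding(probs, embeddings):
    """Mix embedding vectors by probs instead of looking up one sampled word."""
    dim = len(embeddings[0])
    return [sum(p * emb[d] for p, emb in zip(probs, embeddings)) for d in range(dim)]


if __name__ == "__main__":
    random.seed(0)
    vocab_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy values (assumed)
    probs = gumbel_softmax([2.0, 0.5, -1.0], tau=0.5)
    print(probs, soft_embedding(probs, vocab_embeddings))
```

Lowering `tau` pushes the sample closer to a one-hot vector, while raising it flattens the sample toward a uniform mix. Feeding the mixed embedding to the next step, rather than a hard word lookup, is what lets gradients pass through the sampled sequence in a framework with automatic differentiation.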

  7-18. seq³ Overview (architecture figure, built up incrementally across these slides)
     * Compressor (Encoder, Decoder): the encoder reads the embeddings $e^s_1, e^s_2, \ldots, e^s_N$ of the input words $x_1, x_2, \ldots, x_N$; the decoder is fed $e_{BOS}, e^c_1, \ldots, e^c_{M-1}$ and emits the compression $y_1, y_2, \ldots, y_M$.
     * Reconstructor (Encoder, Decoder): the encoder reads the compression embeddings $e^c_1, e^c_2, \ldots, e^c_M$; the decoder is fed $e_{BOS}, e^r_1, \ldots, e^r_{N-1}$ and emits the reconstruction $\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_N$.

  19. seq³ Overview
     * Reconstruction loss: distill input into the latent sequence
     Reconstruction Loss. Minimize input reconstruction error: $\mathcal{L}_R(x, \hat{x}) = -\sum_{i=1}^{N} \log p_R(\hat{x}_i = x_i)$
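
To make the reconstruction loss concrete, here is a minimal sketch that evaluates the summed negative log-likelihood from slide 19 on toy numbers. The function name `reconstruction_loss`, the list-of-lists representation of the Reconstructor's per-step output distributions, and the example values are assumptions for illustration, not the authors' implementation.

```python
import math


def reconstruction_loss(step_distributions, target_ids):
    """L_R = -sum_i log p_R(x_hat_i = x_i) over the N input positions.

    step_distributions[i] is the predicted distribution over the vocabulary
    at position i; target_ids[i] is the index of the true input word x_i.
    """
    if len(step_distributions) != len(target_ids):
        raise ValueError("need one predicted distribution per input word")
    return -sum(math.log(dist[idx]) for dist, idx in zip(step_distributions, target_ids))


if __name__ == "__main__":
    # Toy vocabulary of size 2 and an input of N = 2 words (assumed values).
    predicted = [[0.5, 0.5], [0.25, 0.75]]
    print(reconstruction_loss(predicted, [0, 1]))
```

The loss is zero only when the Reconstructor puts probability 1 on every original word, and it grows as probability mass moves away from the true input.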

  20-22. seq³ Overview
     * LM Prior loss: human-readable compressions
     LM Prior Loss. Minimize $D_{KL}$ between Compressor and LM: $\mathcal{L}_P = \frac{1}{M} \sum_{t=1}^{M} D_{KL}\big(p_C(y_t \mid y_{<t}, x) \,\|\, p_{LM}(y_t \mid y_{<t})\big)$
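
The LM prior loss can be sketched as a per-step KL divergence averaged over the M compression steps. The code below is a hedged illustration of that formula. The names `kl_divergence` and `lm_prior_loss`, the `eps` floor that guards against division by zero, and the toy distributions are assumptions, not part of the slides.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete distributions over the same vocabulary."""
    # Terms with p_i = 0 contribute nothing; q_i is floored at eps (an assumption)
    # so the ratio stays finite when the LM assigns zero probability.
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0.0)


def lm_prior_loss(compressor_dists, lm_dists):
    """L_P = (1/M) * sum_t D_KL(p_C(y_t | y_<t, x) || p_LM(y_t | y_<t))."""
    steps = len(compressor_dists)
    if steps == 0 or steps != len(lm_dists):
        raise ValueError("need the same non-zero number of steps from both models")
    return sum(kl_divergence(p, q) for p, q in zip(compressor_dists, lm_dists)) / steps


if __name__ == "__main__":
    compressor = [[1.0, 0.0], [0.5, 0.5]]  # toy p_C at steps t = 1, 2
    language_model = [[0.5, 0.5], [0.5, 0.5]]  # toy p_LM at the same steps
    print(lm_prior_loss(compressor, language_model))
```

The loss is zero when the Compressor's next-word distribution agrees with the language model at every step, which is the sense in which the prior nudges compressions toward fluent text.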

  23-24. seq³ Overview
     * Topic loss: similar topic as input
     Topic Loss. $v^x$: IDF-weighted average of the input embeddings $e^s_i$; $v^y$: average of the compression embeddings $e^c_i$.
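
As a rough illustration of the two topic vectors, the sketch below computes an IDF-weighted average for the input and a plain average for the compression. The transcript stops before showing how $v^x$ and $v^y$ are compared, so the cosine distance used here is an assumption, as are the function names, the toy embeddings, and the IDF values.

```python
import math


def weighted_average(vectors, weights):
    """Return sum_i w_i * v_i / sum_i w_i for equal-length vectors."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) / total for d in range(dim)]


def cosine_distance(a, b):
    """1 - cos(a, b); assumed here as the way to compare the topic vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    source_embeddings = [[1.0, 0.0], [0.0, 1.0]]  # toy e^s_i
    idf = [3.0, 1.0]                              # toy IDF weights (assumed)
    compression_embeddings = [[1.0, 0.0]]         # toy e^c_i

    v_x = weighted_average(source_embeddings, idf)
    v_y = weighted_average(compression_embeddings, [1.0] * len(compression_embeddings))
    print(v_x, v_y, cosine_distance(v_x, v_y))
```

Weighting by IDF lets rare, content-bearing words dominate $v^x$, so a compression that keeps those words ends up with a $v^y$ pointing in nearly the same direction.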
