Episodic Memory in Lifelong Language Learning


  1. Episodic Memory in Lifelong Language Learning (NeurIPS 2019) • Cyprien de Masson d’Autume, Sebastian Ruder, Lingpeng Kong, Dani Yogatama (DeepMind) • Presenter: Xiachong Feng

  2. Outline • Author • Background • Task • Model • Experiment • Result

  3. Author • Cyprien de Masson d’Autume (DeepMind) • Sebastian Ruder (DeepMind) • Lingpeng Kong (孔令鹏, DeepMind) • Dani Yogatama (DeepMind)

  4. Background • Lifelong learning

  5. Background • Catastrophic Forgetting

  6. Task • Text classification • Question answering

  7. Model • Example encoder • Task decoder • Episodic memory module.

  8. Example encoder & Task decoder
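
A minimal sketch of the encoder/decoder split on this slide, assuming a BERT-like example encoder passed in as a black-box module. The class name, the `hidden_size`/`num_classes` arguments, and the single linear layer used as the text-classification decoder are illustrative choices (the span-prediction decoder used for QA is not shown).

```python
import torch
import torch.nn as nn

class LifelongTextClassifier(nn.Module):
    """Example encoder + task decoder (illustrative sketch, not the authors' code).

    `encoder` stands in for the BERT-like example encoder and is assumed to map
    token ids of shape (batch, seq) to hidden states of shape (batch, seq, hidden).
    """
    def __init__(self, encoder, hidden_size=768, num_classes=33):
        super().__init__()
        self.encoder = encoder                              # example encoder
        # 33 = 4 (AGNews) + 5 (Yelp/Amazon merged) + 14 (DBPedia) + 10 (Yahoo)
        self.decoder = nn.Linear(hidden_size, num_classes)  # task decoder

    def forward(self, input_ids):
        hidden = self.encoder(input_ids)    # (batch, seq, hidden)
        cls_vec = hidden[:, 0]              # first-token ([CLS]) representation
        return self.decoder(cls_vec)        # logits over the merged label set
```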

  9. Episodic Memory • Key-value memory block; keys are computed by a key network, a pretrained BERT model that is frozen (not updated during training) • Text classification: the input x_t is a document to be classified, and the key is its [CLS] representation • Question answering: the input x_t is a concatenation of a context paragraph and a question separated by [SEP], and the key is the representation of the first token of the question • The value stores the example together with its label, so it can be replayed later
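
The key-value memory described above could be sketched roughly as below. The `EpisodicMemory` class, its method names, and the `key_ids` argument (letting QA examples compute their key from the question tokens only) are assumptions for illustration; the key network is treated as a frozen BERT-like encoder whose first-token vector serves as the key, and the value holds the example and its label.

```python
import torch

class EpisodicMemory:
    """Key-value episodic memory (illustrative sketch, not the authors' code)."""

    def __init__(self, key_network):
        self.key_network = key_network     # frozen pretrained BERT-like encoder
        self.keys = []                     # list of 1-D key tensors
        self.values = []                   # list of (input_ids, label) pairs

    @torch.no_grad()
    def write(self, input_ids, label, key_ids=None):
        # key_ids lets QA use only the question tokens for the key, while
        # classification keys come from the whole document (its [CLS] vector).
        ids = key_ids if key_ids is not None else input_ids
        key = self.key_network(ids.unsqueeze(0))[:, 0].squeeze(0)   # (hidden,)
        self.keys.append(key)
        self.values.append((input_ids, label))
```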

  10. Episodic Memory • The memory is used in two ways: sparse experience replay (during training) and local adaptation (at inference)

  11. Model - Training • Write: based on random write • Read: sparse experience replay • Uniformly random sampling from memory • Perform gradient updates on the retrieved examples • Sparsity: randomly retrieve 100 examples from memory for every 10,000 new examples
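
A rough sketch of the training loop with sparse experience replay, using the 100-per-10,000 schedule from the slide. The function and variable names are hypothetical, the stream is assumed to yield (token-id tensor, scalar label tensor) pairs, and single-example updates stand in for whatever batching the authors actually use; it builds on the `EpisodicMemory` sketch above.

```python
import random
import torch

REPLAY_EVERY = 10_000   # replay after this many new examples (from the slide)
REPLAY_SIZE = 100       # number of memory examples retrieved per replay step

def train_stream(model, optimizer, stream, memory, loss_fn):
    """One pass over the task stream with sparse experience replay (sketch)."""
    seen = 0
    for input_ids, label in stream:          # label: scalar LongTensor
        # 1) ordinary update on the incoming example
        optimizer.zero_grad()
        loss_fn(model(input_ids.unsqueeze(0)), label.unsqueeze(0)).backward()
        optimizer.step()

        # 2) write the example into episodic memory
        memory.write(input_ids, label)
        seen += 1

        # 3) sparse replay: every 10,000 examples, revisit 100 stored ones
        if seen % REPLAY_EVERY == 0 and memory.values:
            replayed = random.sample(memory.values,
                                     min(REPLAY_SIZE, len(memory.values)))
            for x_mem, y_mem in replayed:
                optimizer.zero_grad()
                loss_fn(model(x_mem.unsqueeze(0)), y_mem.unsqueeze(0)).backward()
                optimizer.step()
```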

  12. Model - Inference • Read: local adaptation • The key network encodes the test example into a query vector • Retrieve the K nearest neighbors from memory using the Euclidean distance function • Perform L local gradient-update steps on the retrieved examples before predicting
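
Local adaptation at inference time might look like the sketch below: encode the test example with the key network, retrieve the K nearest stored examples by Euclidean distance, and take L gradient steps on a copy of the model before predicting. The hyperparameter values (`k`, `steps`, `lr`, `lam`) are placeholders, the L2 pull toward the original weights follows the paper's regularized local-adaptation objective, and the classification-style `argmax` at the end is illustrative (QA would predict a span instead).

```python
import copy
import torch

def local_adaptation(model, memory, query_ids, k=32, steps=30, lr=1e-3, lam=1e-3):
    """Predict for one test example with local adaptation (illustrative sketch)."""
    # Query vector from the frozen key network, then K nearest neighbors.
    with torch.no_grad():
        query = memory.key_network(query_ids.unsqueeze(0))[:, 0]     # (1, hidden)
        dists = torch.cdist(query, torch.stack(memory.keys))         # (1, M)
        nn_idx = dists.topk(min(k, len(memory.keys)), largest=False).indices[0]

    adapted = copy.deepcopy(model)          # leave the base weights untouched
    base_params = [p.detach().clone() for p in model.parameters()]
    optimizer = torch.optim.SGD(adapted.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()

    for _ in range(steps):                  # L local gradient steps
        optimizer.zero_grad()
        loss = 0.0
        for i in nn_idx.tolist():           # loss on the retrieved neighbors
            x_mem, y_mem = memory.values[i]
            loss = loss + loss_fn(adapted(x_mem.unsqueeze(0)), y_mem.unsqueeze(0))
        # Stay close to the original weights (L2 regularization toward them).
        for p, p0 in zip(adapted.parameters(), base_params):
            loss = loss + lam * (p - p0).pow(2).sum()
        loss.backward()
        optimizer.step()

    with torch.no_grad():
        return adapted(query_ids.unsqueeze(0)).argmax(dim=-1)
```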

  13. Experiments • Text classification • News classification (AGNews), sentiment analysis (Yelp, Amazon), Wikipedia article classification (DBPedia), and questions-and-answers categorization (Yahoo) • AGNews (4 classes), Yelp (5 classes), DBPedia (14 classes), Amazon (5 classes), and Yahoo (10 classes) • Since Yelp and Amazon have similar semantics (product ratings), their classes are merged • Question answering • SQuAD 1.1, TriviaQA, QuAC • A balanced version of all datasets is created
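
As a rough illustration of the dataset setup on this slide, the sketch below merges the dataset-specific label spaces (giving Yelp and Amazon a shared offset so their rating classes coincide) and downsamples every dataset to the same size. The function name, the `datasets` dict format, and the downsampling scheme are assumptions; the paper's exact balancing procedure may differ.

```python
import random

def merge_and_balance(datasets, per_dataset, seed=13):
    """Merge label spaces and balance dataset sizes (illustrative sketch).

    `datasets` is assumed to map a dataset name to a list of (text, label)
    pairs with dataset-local integer labels.
    """
    rng = random.Random(seed)

    # Assign each dataset a label offset; Yelp and Amazon share one offset
    # so their five rating classes are merged into the same global labels.
    offsets, next_id = {}, 0
    for name, examples in datasets.items():
        if name in ("yelp", "amazon") and ("yelp" in offsets or "amazon" in offsets):
            offsets[name] = offsets.get("yelp", offsets.get("amazon"))
        else:
            offsets[name] = next_id
            next_id += 1 + max(label for _, label in examples)

    # Balance: downsample every dataset to the same number of examples.
    balanced = {}
    for name, examples in datasets.items():
        sample = rng.sample(examples, min(per_dataset, len(examples)))
        balanced[name] = [(text, offsets[name] + label) for text, label in sample]
    return balanced
```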

  14. Results • Text classification and QA results, including a comparison against a variant that uses randomly retrieved examples for local adaptation and against a multitask model

  15. Result

  16. Result • Memory stores only 50% and 10% of the training examples

  17. Result

  18. Thanks!
