
Lifelong Learning (CS 330)



  1. Lifelong Learning CS 330

  2. Logistics
     - Project milestone due Wednesday.
     - Two guest lectures next week: Jeff Clune and Sergey Levine.

  3. Plan for Today
     - The lifelong learning problem statement
     - Basic approaches to lifelong learning
     - Can we do better than the basics?
     - Revisiting the problem statement from the meta-learning perspective

  4. A brief review of problem statements
     - Multi-task learning: learn to solve a set of tasks. (learn tasks -> perform tasks)
     - Meta-learning: given an i.i.d. task distribution, learn a new task efficiently. (learn to learn tasks -> quickly learn new task)

  5. In contrast, many real-world settings look like:
     [Figure: multi-task learning (learn tasks, perform tasks) and meta-learning (learn to learn tasks, quickly learn new task) laid out along a time axis]
     Our agents may not be given a large batch of data/tasks right off the bat!
     Some examples:
     - a student learning concepts in school
     - a deployed image classification system learning from a stream of images from users
     - a robot acquiring an increasingly large set of skills in different environments
     - a virtual assistant learning to help different users with different tasks at different points in time
     - a doctor's assistant aiding in medical decision-making

  6. Some Terminology
     Sequential learning settings: online learning, lifelong learning, continual learning, incremental learning, streaming data.
     These are distinct from sequence data and sequential decision-making.

  7. What is the lifelong learning problem statement?
     Exercise:
     1. Pick an example setting.
     2. Discuss the problem statement with your neighbor:
        (a) How would you set up an experiment to develop & test your algorithm?
        (b) What are desirable/required properties of the algorithm?
        (c) How do you evaluate such a system?
     Example settings:
     A. a student learning concepts in school
     B. a deployed image classification system learning from a stream of images from users
     C. a robot acquiring an increasingly large set of skills in different environments
     D. a virtual assistant learning to help different users with different tasks at different points in time
     E. a doctor's assistant aiding in medical decision-making

  8. What is the lifelong learning problem statement?
     Problem variations:
     - task/data order: i.i.d. vs. predictable vs. curriculum vs. adversarial
     - discrete task boundaries vs. continuous shifts (vs. both)
     - known task boundaries/shifts vs. unknown
     Some considerations:
     - model performance
     - data efficiency
     - computational resources
     - memory
     - others: privacy, interpretability, fairness, test-time compute & memory
     Substantial variety in problem statement!

  9. What is the lifelong learning problem statement?
     General [supervised] online learning problem:
       for t = 1, …, n
         observe x_t        (if task boundaries are observable: observe x_t, z_t)
         predict ŷ_t
         observe label y_t
     - i.i.d. setting: $x_t \sim p(x)$, $y_t \sim p(y \mid x)$, where $p$ is not a function of $t$
     - otherwise: $x_t \sim p_t(x)$, $y_t \sim p_t(y \mid x)$
     - streaming setting: cannot store $(x_t, y_t)$, because of
       - lack of memory
       - lack of computational resources
       - privacy considerations
       - wanting to study neural memory mechanisms
       True in some cases, but not in many cases! (Recall: replay buffers.)
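The observe / predict / observe-label protocol above can be sketched as a short loop. This is only an illustration under stated assumptions: a 1-D linear model, squared loss, and a synthetic stream whose true slope may drift over time. The names `make_stream` and `online_learning_loop` are hypothetical and do not come from the lecture.

```python
import random

def make_stream(n, drift=0.0, seed=0):
    """Yield (x_t, y_t) pairs from a synthetic stream (illustrative data model).

    drift == 0.0 gives the i.i.d. setting; drift != 0.0 makes p_t(y | x)
    change with t, i.e. the non-stationary setting.
    """
    rng = random.Random(seed)
    slope = 2.0
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = slope * x + rng.gauss(0.0, 0.01)
        yield x, y
        slope += drift

def online_learning_loop(stream, lr=0.1):
    """Run the online protocol with a plain SGD learner on a 1-D linear model."""
    w = 0.0
    losses = []
    for x_t, y_t in stream:
        y_hat = w * x_t                      # predict before the label is revealed
        losses.append((y_hat - y_t) ** 2)    # loss suffered at step t
        w -= lr * 2.0 * (y_hat - y_t) * x_t  # update, then discard (x_t, y_t)
    return w, losses

w, losses = online_learning_loop(make_stream(2000))
```

Because each pair is discarded after the update, this loop also respects the streaming constraint; passing a nonzero `drift` is one simple way to move from the i.i.d. case to the $p_t$ case.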

  10. What do you want from your lifelong learning algorithm?
      Minimal regret (that grows slowly with $t$).
      Regret: cumulative loss of the learner minus cumulative loss of the best learner in hindsight.
      $$\text{Regret}_T := \sum_{t=1}^{T} \mathcal{L}_t(\theta_t) - \min_{\theta} \sum_{t=1}^{T} \mathcal{L}_t(\theta)$$
      (Cannot be evaluated in practice; useful for analysis.)
      Regret that grows linearly in $t$ is trivial. Why?
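As a worked instance of the regret definition, the sketch below computes it for a toy problem. It assumes a 1-D linear model with squared loss, so the best fixed parameter in hindsight has a closed form; the function name `regret` and the three datapoints are illustrative only.

```python
def regret(thetas, data):
    """Cumulative loss of the learner minus that of the best fixed theta in hindsight.

    Assumes loss_t(theta) = (theta * x_t - y_t) ** 2, for which the best
    fixed theta is sum(x * y) / sum(x * x).
    """
    def loss(theta, x, y):
        return (theta * x - y) ** 2

    learner_loss = sum(loss(th, x, y) for th, (x, y) in zip(thetas, data))
    best_theta = sum(x * y for x, y in data) / sum(x * x for x, y in data)
    best_loss = sum(loss(best_theta, x, y) for x, y in data)
    return learner_loss - best_loss

# Toy data generated by y = 2x (illustrative).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
regret_never_updates = regret([0.0, 0.0, 0.0], data)  # learner stays at theta = 0
regret_at_optimum = regret([2.0, 2.0, 2.0], data)     # learner already at the optimum
```

The learner that never updates keeps paying an excess loss at every step, which is one way to see why regret that grows linearly in $t$ is the uninformative baseline: it can be reached without learning anything.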

  11. What do you want from your lifelong learning algorithm?
      Positive & negative transfer.
      - Positive forward transfer: previous tasks cause you to do better on future tasks, compared to learning future tasks from scratch.
      - Positive backward transfer: current tasks cause you to do better on previous tasks, compared to learning past tasks from scratch.
      - Positive -> negative: better -> worse.
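One simple way to turn these definitions into numbers is to compare a score obtained with sequential training against a from-scratch baseline on the same task. The sketch below assumes a higher-is-better score such as accuracy; the function names and the accuracy values are made up for illustration and are not results from any experiment.

```python
def forward_transfer(score_after_previous_tasks, score_from_scratch):
    """Positive if training on earlier tasks helped on a later task."""
    return score_after_previous_tasks - score_from_scratch

def backward_transfer(score_on_old_task_after_new_tasks, score_from_scratch):
    """Positive if later tasks helped on an earlier task; negative means it got worse."""
    return score_on_old_task_after_new_tasks - score_from_scratch

# Made-up accuracies, for illustration only.
fwd = forward_transfer(0.75, 0.5)    # later task benefits from earlier tasks
bwd = backward_transfer(0.25, 0.5)   # earlier task degraded after learning new tasks
```

Here `fwd` is positive (positive forward transfer) and `bwd` is negative (negative backward transfer).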

  12. Plan for Today
      - The lifelong learning problem statement
      - Basic approaches to lifelong learning
      - Can we do better than the basics?
      - Revisiting the problem statement from the meta-learning perspective

  13. Approaches
      Store all the data you've seen so far, and train on it. -> follow the leader algorithm
        + will achieve very strong performance
        - computation intensive -> continuous fine-tuning can help
        - can be memory intensive [depends on the application]
      Take a gradient step on the datapoint you observe. -> stochastic gradient descent
        + computationally cheap
        + requires 0 memory
        - subject to negative backward transfer, sometimes referred to as "forgetting" (catastrophic forgetting)
        - slow learning
      Can we do better?
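The trade-off between the two approaches can be seen on a toy stream. The sketch below is a minimal illustration, assuming a 2-D linear model with squared loss and two hand-picked tasks that share one feature. Follow the leader is implemented here as a least-squares refit on the full buffer, with a tiny ridge term added only for numerical stability. All names and numbers are illustrative, not the lecture's.

```python
def sgd_step(w, x, y, lr=0.1):
    """One gradient step on the squared loss of a single datapoint; stores nothing."""
    err = w[0] * x[0] + w[1] * x[1] - y
    return [w[0] - lr * 2 * err * x[0], w[1] - lr * 2 * err * x[1]]

def follow_the_leader(buffer, ridge=1e-6):
    """Refit on every datapoint seen so far (2-D normal equations, tiny ridge)."""
    a = sum(x[0] * x[0] for x, _ in buffer) + ridge
    b = sum(x[0] * x[1] for x, _ in buffer)
    d = sum(x[1] * x[1] for x, _ in buffer) + ridge
    p = sum(x[0] * y for x, y in buffer)
    q = sum(x[1] * y for x, y in buffer)
    det = a * d - b * b
    return [(d * p - b * q) / det, (a * q - b * p) / det]

def sq_err(w, example):
    x, y = example
    return (w[0] * x[0] + w[1] * x[1] - y) ** 2

# Toy stream (illustrative): 100 steps of task A, then 100 steps of task B.
task_a = ([1.0, 0.0], 1.0)
task_b = ([1.0, 1.0], 0.0)
stream = [task_a] * 100 + [task_b] * 100

w_sgd, buffer = [0.0, 0.0], []
for x, y in stream:
    w_sgd = sgd_step(w_sgd, x, y)      # constant memory
    buffer.append((x, y))              # only follow the leader needs this
    w_ftl = follow_the_leader(buffer)  # refit on everything at every step

sgd_err_on_a = sq_err(w_sgd, task_a)   # error on the old task after learning B
ftl_err_on_a = sq_err(w_ftl, task_a)
```

In this toy case a single weight vector fits both tasks, and the full refit finds it, while SGD on task B alone drifts away from the task A solution. That drift is the negative backward transfer listed above. The refit, in turn, pays for its accuracy with a buffer that grows at every step.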

  14. Plan for Today
      - The lifelong learning problem statement
      - Basic approaches to lifelong learning
      - Can we do better than the basics?
      - Revisiting the problem statement from the meta-learning perspective

  15. Case Study: Can we use meta-learning to accelerate online learning?

  16. Recall: model-based meta-RL
      [Figure: timeline showing gradual terrain change and motor malfunction over time]
      Online adaptation = few-shot learning; tasks are temporal slices of experience.

  17. Example online learning problem
      [Figure: timeline showing icy terrain, gradual terrain change, and motor malfunction over time]
      k time steps are not sufficient to learn an entirely new terrain.
      Continue to run SGD?
        + will be fast with MAML initialization
        - what if the ice goes away? (subject to forgetting)
      Nagabandi, Finn, Levine. Deep Online Learning via Meta-Learning. ICLR '19
