Lifelong Learning (CS 330) PowerPoint PPT Presentation



SLIDE 1

CS 330

Lifelong Learning

SLIDE 2

Logistics

Project milestone due Wednesday. Two guest lectures next week: Jeff Clune and Sergey Levine.

SLIDE 3

Plan for Today

  • The lifelong learning problem statement
  • Basic approaches to lifelong learning
  • Can we do better than the basics?
  • Revisiting the problem statement from the meta-learning perspective

SLIDE 4

A brief review of problem statements.

Meta-Learning: given an i.i.d. task distribution, learn a new task efficiently.
(learn to learn tasks; quickly learn new task)

Multi-Task Learning: learn to solve a set of tasks.
(learn tasks; perform tasks)

SLIDE 5

In contrast, many real-world settings look like:

(figure: the meta-learning and multi-task learning diagrams laid out along a time axis)

Some examples:

  • a student learning concepts in school
  • a deployed image classification system learning from a stream of images from users
  • a robot acquiring an increasingly large set of skills in different environments
  • a virtual assistant learning to help different users with different tasks at different points in time
  • a doctor’s assistant aiding in medical decision-making

Our agents may not be given a large batch of data/tasks right off the bat!

SLIDE 6

Some Terminology

Sequential learning settings:
  • online learning, lifelong learning, continual learning, incremental learning, streaming data

These are distinct from sequence data and sequential decision-making.

SLIDE 7

What is the lifelong learning problem statement?

Exercise:
  • 1. Pick an example setting.
  • 2. Discuss the problem statement with your neighbor:
    (a) how would you set up an experiment to develop & test your algorithm?
    (b) what are desirable/required properties of the algorithm?
    (c) how do you evaluate such a system?

Example settings:
  • A. a student learning concepts in school
  • B. a deployed image classification system learning from a stream of images from users
  • C. a robot acquiring an increasingly large set of skills in different environments
  • D. a virtual assistant learning to help different users with different tasks at different points in time
  • E. a doctor’s assistant aiding in medical decision-making

SLIDE 8

What is the lifelong learning problem statement?

Substantial variety in the problem statement!

Problem variations:
  • task/data order: i.i.d. vs. predictable vs. curriculum vs. adversarial
  • discrete task boundaries vs. continuous shifts (vs. both)
  • known task boundaries/shifts vs. unknown

Some considerations:
  • computational resources
  • memory
  • model performance
  • data efficiency
  • others: privacy, interpretability, fairness, test-time compute & memory

SLIDE 9

What is the lifelong learning problem statement?

General [supervised] online learning problem:

for t = 1, …, n
  • observe xt
  • predict ŷt
  • observe label yt

i.i.d. setting: xt ∼ p(x), yt ∼ p(y|x), where p is not a function of t (true in some cases, but not in many cases!)
otherwise: xt ∼ pt(x), yt ∼ pt(y|x)
if there are observable task boundaries: also observe zt

streaming setting: cannot store (xt, yt), e.g. due to:
  • lack of memory
  • lack of computational resources
  • privacy considerations
  • wanting to study neural memory mechanisms
(recall: replay buffers)
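The protocol above can be sketched concretely. This is a minimal illustrative sketch, not from the slides: a linear model, a simulated stream, and squared loss stand in for the general setting.

```python
# Minimal sketch of the general [supervised] online learning protocol.
# The linear model, data stream, and squared loss are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])   # unknown to the learner
w = np.zeros(2)                  # learner's parameters
losses = []

for t in range(500):
    x_t = rng.normal(size=2)              # observe x_t
    y_hat = w @ x_t                       # predict y_hat_t
    y_t = x_t @ true_w                    # observe label y_t
    losses.append((y_hat - y_t) ** 2)
    w -= 0.05 * 2 * (y_hat - y_t) * x_t   # one gradient step; in the streaming
                                          # setting (x_t, y_t) is then discarded
```

In the streaming setting the loop never stores past pairs; a replay buffer would instead append each `(x_t, y_t)` to a memory.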

SLIDE 10

What do you want from your lifelong learning algorithm?

Minimal regret (regret that grows slowly with T).

regret: cumulative loss of the learner minus the cumulative loss of the best learner in hindsight (cannot be evaluated in practice, but useful for analysis)

RegretT := Σ_{t=1}^{T} ℒt(θt) − min_θ Σ_{t=1}^{T} ℒt(θ)

Regret that grows linearly in T is trivial. Why?
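As a concrete illustration (a toy quadratic example, not from the slides): online gradient descent on losses ℒt(θ) = (θ − ct)² attains sublinear regret, so the average regret per round vanishes as T grows, whereas a learner that never updates typically suffers regret linear in T.

```python
# Toy regret computation for online gradient descent on quadratic losses.
import numpy as np

rng = np.random.default_rng(1)
c = rng.uniform(0, 1, size=1000)   # loss at round t is L_t(theta) = (theta - c_t)^2

theta, learner_losses = 0.0, []
for t, c_t in enumerate(c, start=1):
    learner_losses.append((theta - c_t) ** 2)
    theta -= (theta - c_t) / t     # OGD with step 1/(2t): theta is the running mean

best_theta = c.mean()              # best fixed parameter in hindsight
regret = sum(learner_losses) - np.sum((best_theta - c) ** 2)
avg_regret = regret / len(c)       # sublinear regret => this tends to 0 with T
```

Here the regret grows like log T (strongly convex losses), so `avg_regret` is tiny; a constant predictor would instead accumulate a fixed extra loss every round, giving linear regret.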

SLIDE 11

What do you want from your lifelong learning algorithm?

Positive & negative transfer:

positive forward transfer: previous tasks cause you to do better on future tasks, compared to learning future tasks from scratch
positive backward transfer: current tasks cause you to do better on previous tasks, compared to learning past tasks from scratch
(for negative transfer, replace "better" with "worse")
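These notions can be made quantitative. A sketch in the style of GEM's backward-transfer (BWT) metric; the accuracy matrix R below is made-up illustrative data, where R[i, j] is performance on task j after training through task i:

```python
# Backward transfer from an accuracy matrix (illustrative data, GEM-style metric).
import numpy as np

def backward_transfer(R):
    """Positive => later tasks helped earlier ones; negative => forgetting."""
    T = R.shape[0]
    return np.mean([R[T - 1, j] - R[j, j] for j in range(T - 1)])

R = np.array([[0.9, 0.0, 0.0],
              [0.8, 0.9, 0.0],
              [0.7, 0.8, 0.9]])
bwt = backward_transfer(R)   # accuracy on old tasks dropped, so bwt is negative
```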

SLIDE 12

Plan for Today

  • The lifelong learning problem statement
  • Basic approaches to lifelong learning
  • Can we do better than the basics?
  • Revisiting the problem statement from the meta-learning perspective

SLIDE 13

Approaches

Store all the data you’ve seen so far, and train on it. -> the follow the leader algorithm
+ will achieve very strong performance
  • computation intensive
  • can be memory intensive

Take a gradient step on the datapoint you observe. -> stochastic gradient descent
+ computationally cheap
+ requires 0 memory
  • subject to negative backward transfer (“forgetting”, sometimes referred to as catastrophic forgetting)
  • slow learning
-> Continuous fine-tuning can help. [depends on the application]

Can we do better?
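A hedged sketch contrasting the two basic approaches on a toy regression stream. Closed-form ridge regression is a stand-in for "store all the data and train on it"; all names and data here are illustrative, not from the slides.

```python
# Follow the leader (retrain on everything) vs. plain SGD (one step per point).
import numpy as np

rng = np.random.default_rng(2)
true_w = np.array([0.5, -1.0, 2.0])

def fit_ridge(X, y, lam=1e-6):
    """Stand-in for the 'train on all stored data' step of follow the leader."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

X_buf, y_buf = [], []          # FTL: growing memory of everything seen
w_ftl = np.zeros(3)
w_sgd = np.zeros(3)            # SGD: constant memory, one step per datapoint

for t in range(200):
    x = rng.normal(size=3)
    y = x @ true_w
    X_buf.append(x)
    y_buf.append(y)
    w_ftl = fit_ridge(np.array(X_buf), np.array(y_buf))  # compute/memory intensive
    w_sgd -= 0.02 * 2 * (w_sgd @ x - y) * x              # cheap, requires no memory
```

On a stationary stream both converge; the trade-off on the slide (FTL's cost vs. SGD's forgetting) shows up once the stream becomes non-stationary.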

SLIDE 14

Plan for Today

  • The lifelong learning problem statement
  • Basic approaches to lifelong learning
  • Can we do better than the basics?
  • Revisiting the problem statement from the meta-learning perspective

SLIDE 15

Case Study: Can we use meta-learning to accelerate online learning?

SLIDE 16

Recall: model-based meta-RL

(figure: motor malfunction, gradual terrain change, along a time axis)

  • online adaptation = few-shot learning
  • tasks are temporal slices of experience

SLIDE 17

Example online learning problem

(figure: motor malfunction, gradual terrain change, icy terrain, along a time axis; the icy terrain lasts only k time steps, which is not sufficient to learn an entirely new terrain)

Continue to run SGD?
+ will be fast with a MAML initialization
  • what if the ice goes away? (subject to forgetting)

Nagabandi, Finn, Levine. Deep Online Learning via Meta-Learning. ICLR ’19

SLIDE 18

Online inference problem: infer the latent “task” variable at each time step.

Mixture of neural networks over the task variable T, adapted continually. Alternate between:

E-step: estimate the latent “task” variable at each time step, given the data:

P(Tt = Ti | xt, yt) ∝ pθ(Ti)(yt | xt, Tt = Ti) P(Tt = Ti)

(here pθ(Ti)(yt | xt, Tt = Ti) is the likelihood of the data under task Ti, and P(Tt = Ti) is the prior)

M-step: update the mixture of network parameters: a gradient step on each mixture element, weighted by task probability.

Note: if the neural net is randomly initialized, this procedure would be too slow.

Nagabandi, Finn, Levine. Deep Online Learning via Meta-Learning. ICLR ‘19
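The E-step above amounts to a softmax over per-component log-likelihoods weighted by the prior. A minimal numerical sketch (the numbers are made up; this is not the paper's implementation):

```python
# E-step of the mixture-of-networks online inference, as a stable softmax.
import numpy as np

def e_step(log_lik, prior):
    """P(T_t = T_i | x_t, y_t) ∝ p_{θ(T_i)}(y_t | x_t) · P(T_t = T_i)."""
    log_post = log_lik + np.log(prior)
    log_post -= log_post.max()        # subtract the max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()

# Three mixture components; component 1 explains the current datapoint best.
task_post = e_step(np.array([-5.0, -1.0, -4.0]), np.full(3, 1 / 3))
# M-step (not shown): gradient step on each mixture element, weighted by task_post.
```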


SLIDE 19

Does it work?

Crawler with crippled legs

(plot) Comparisons:
  • online learning w. MAML initialization
  • SGD w. MAML initialization
  • MAML (always reset to prior + 1 grad step)
  • model-based, no adaptation (no meta-learning)
  • model-based, grad steps (no meta-learning)

Nagabandi, Finn, Levine. Deep Online Learning via Meta-Learning. ICLR ’19

SLIDE 20

Does it work?

Crawler with crippled legs

(plot: latent task distribution during online learning)

Nagabandi, Finn, Levine. Deep Online Learning via Meta-Learning. ICLR ’19

SLIDE 21

Case Study: Can we modify vanilla SGD to avoid negative backward transfer?

SLIDE 22

Idea:
(1) store a small amount of data per task in memory
(2) when making updates for new tasks, ensure that they don’t unlearn previous tasks

How do we accomplish (2)?

memory ℳk for task zk; learning the predictor yt = fθ(xt, zt)

For t = 0, …, T:
  minimize ℒ( fθ( ⋅ , zt), (xt, yt) )
  subject to ℒ( fθ, ℳk ) ≤ ℒ( fθ^{t−1}, ℳk ) for all zk < zt
  (i.e., such that the loss on previous tasks doesn’t get worse)

Assume local linearity:
  ⟨gt, gk⟩ := ⟨ ∂ℒ( fθ, (xt, yt) ) / ∂θ , ∂ℒ( fθ, ℳk ) / ∂θ ⟩ ≥ 0 for all zk < zt

Can formulate & solve as a QP.

Lopez-Paz & Ranzato. Gradient Episodic Memory for Continual Learning. NeurIPS ’17
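Under the local-linearity assumption, checking ⟨gt, gk⟩ ≥ 0 and correcting a violation is straightforward. A hedged sketch for one constraint at a time (the paper solves a joint QP over all memory gradients; the single-constraint projection below has a closed form):

```python
# GEM-style gradient correction sketch under the local-linearity assumption.
import numpy as np

def project_gem(g_t, memory_grads):
    """Make g_t have a non-negative inner product with each memory gradient g_k."""
    g = g_t.copy()
    for g_k in memory_grads:                    # one constraint at a time; the
        dot = g @ g_k                           # paper instead solves a joint QP
        if dot < 0:                             # update would unlearn task k,
            g = g - (dot / (g_k @ g_k)) * g_k   # so remove the conflicting component
    return g

# The proposed update conflicts with the stored task gradient [0, 1].
g_new = project_gem(np.array([1.0, -1.0]), [np.array([0.0, 1.0])])
```

After projection, a step along `g_new` no longer increases the (linearized) loss on the memory ℳk.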

SLIDE 23

Experiments

Problems:
  • MNIST permutations
  • MNIST rotations
  • CIFAR-100 (5 new classes/task)

Total memory size: 5012 examples
BWT: backward transfer, FWT: forward transfer

If we take a step back… do these experimental domains make sense?

Lopez-Paz & Ranzato. Gradient Episodic Memory for Continual Learning. NeurIPS ’17

SLIDE 24

Can we meta-learn how to avoid negative backward transfer?

Javed & White. Meta-Learning Representations for Continual Learning. NeurIPS ’19

SLIDE 25

Plan for Today

  • The lifelong learning problem statement
  • Basic approaches to lifelong learning
  • Can we do better than the basics?
  • Revisiting the problem statement from the meta-learning perspective

SLIDE 26

What might be wrong with the online learning formulation?

Online Learning (Hannan ’57, Zinkevich ’03): perform a sequence of tasks while minimizing static regret.
(diagram: perform, perform, perform, … along a time axis; evaluated on zero-shot performance)

More realistically:
(diagram: learn, learn, learn, … along a time axis, with slow learning early and rapid learning later)

SLIDE 27

What might be wrong with the online learning formulation?

Online Learning (Hannan ’57, Zinkevich ’03): perform a sequence of tasks while minimizing static regret.
(diagram: perform, perform, perform, … along a time axis; evaluated on zero-shot performance)

Online Meta-Learning (Finn*, Rajeswaran*, Kakade, Levine ICML ’18): efficiently learn a sequence of tasks from a non-stationary distribution.
(diagram: learn, learn, learn, … along a time axis; performance evaluated after seeing a small amount of data)

Primarily a difference in evaluation, rather than the data stream.

SLIDE 28

The Online Meta-Learning Setting

(Finn*, Rajeswaran*, Kakade, Levine ICML ’18)

for task t = 1, …, n
  • observe 𝒠tr_t
  • use the update procedure to produce parameters ϕt = Φ(θt, 𝒠tr_t)
  • observe xt
  • predict ŷt = fϕt(xt)
  • observe label yt

(the remaining steps match the standard online learning setting)

Goal: a learning algorithm with sub-linear regret:

RegretT := Σ_{t=1}^{T} ℓt(Φt(θt)) − min_{θ∈Θ} Σ_{t=1}^{T} ℓt(Φt(θ))

(loss of the algorithm, minus the loss of the best algorithm in hindsight)
SLIDE 29

Can we apply meta-learning in lifelong learning settings?

Recall the follow the leader (FTL) algorithm: store all the data you’ve seen so far, and train on it.

Follow the meta-leader (FTML) algorithm: store all the data you’ve seen so far, and meta-train on it. Run the update procedure on the current task. Deploy the model on the current task.

What meta-learning algorithms are well-suited for FTML? What if pt(𝒰) is non-stationary?
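The FTML loop can be written schematically. In this sketch, `meta_train`, `update_procedure`, and `evaluate` are hypothetical stand-ins (e.g., a MAML outer loop, a MAML inner loop, and task-specific evaluation); the toy lambdas below only exercise the control flow.

```python
# Schematic follow the meta-leader (FTML) loop with hypothetical stand-ins.
def ftml(task_stream, meta_train, update_procedure, evaluate, theta):
    """Follow the meta-leader: meta-train on everything seen, adapt, deploy."""
    buffer, results = [], []
    for train_data, test_data in task_stream:
        theta = meta_train(theta, buffer)            # meta-train on all data so far
        phi = update_procedure(theta, train_data)    # run update procedure on task t
        results.append(evaluate(phi, test_data))     # deploy model on current task
        buffer.append(train_data)                    # add the new task to the store
    return theta, results

# Toy stand-ins just to exercise the control flow:
theta, results = ftml(
    task_stream=[(1, 2), (3, 4)],
    meta_train=lambda th, buf: th + len(buf),
    update_procedure=lambda th, d: th + d,
    evaluate=lambda phi, td: phi * td,
    theta=0,
)
```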

SLIDE 30

Experiments

Experiment with sequences of tasks:
  • Colored, rotated, scaled MNIST
  • 3D object pose prediction
  • CIFAR-100 classification

(example pose prediction tasks: plane, car, chair)

SLIDE 31

Experiments

Comparisons:
  • TOE (train on everything): train on all data so far
  • FTL (follow the leader): train on all data so far, fine-tune on the current task
  • From Scratch: train from scratch on each task

(plots: learning efficiency (# datapoints) and learning proficiency (error) vs. task index, on Rainbow MNIST and Pose Prediction)

Follow The Meta-Leader learns each new task faster & with greater proficiency, and approaches the few-shot learning regime.

SLIDE 32

Takeaways

  • Many flavors of lifelong learning, all under the same name.
  • Defining the problem statement is often the hardest part.
  • Meta-learning can be viewed as a slice of the lifelong learning problem.
  • A very open area of research.

SLIDE 33

Reminders

Project milestone due Wednesday. Two guest lectures next week: Jeff Clune and Sergey Levine.