Online Meta-Learning

Chelsea Finn*, Aravind Rajeswaran*, Sham Kakade, Sergey Levine


  1. Online Meta-Learning. Chelsea Finn*, Aravind Rajeswaran*, Sham Kakade, Sergey Levine

  2. Deep networks + large datasets =

  3. In many practical situations: learn a new task with only a few datapoints.

  4. Meta-Learning (Schmidhuber et al. ’87, Bengio et al. ’92): given an i.i.d. task distribution, learn a new task efficiently.

  15. More realistically, tasks arrive one after another over time: learn, learn, learn, ... Slow learning across the task sequence enables rapid learning on each new task.
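The "slow learning, rapid learning" split on the slide is commonly instantiated as gradient-based adaptation: an initialization is learned slowly across tasks, and each new task is fit with one or a few gradient steps on its few datapoints. A minimal numpy sketch; the linear model, the `adapt` helper, and the learning rate are illustrative assumptions, not the paper's method:

```python
import numpy as np

def adapt(theta, X, y, inner_lr=0.05):
    """Rapid learning: one gradient step on a new task's few datapoints.
    Squared-error loss L(theta) = mean((X @ theta - y)^2); the step starts
    from `theta`, a stand-in for slowly meta-learned initial parameters."""
    grad = 2.0 / len(y) * X.T @ (X @ theta - y)
    return theta - inner_lr * grad

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])       # the new task's ground truth
X = rng.normal(size=(5, 2))          # only a few datapoints
y = X @ w_true
loss = lambda th: float(np.mean((X @ th - y) ** 2))

theta0 = np.zeros(2)                 # stand-in for a meta-learned init
theta1 = adapt(theta0, X, y)
assert loss(theta1) < loss(theta0)   # a single step already reduces task loss
```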

  16. Online Learning (Hannan ’57, Zinkevich ’03): perform a sequence of tasks over time while minimizing static regret; the agent is evaluated on its zero-shot performance at each round.

  19. Online Meta-Learning (this work): efficiently learn a sequence of tasks from a non-stationary distribution; the agent is evaluated on its performance after seeing a small amount of data for each task.
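Static regret, the objective in the online-learning column, compares the agent's cumulative loss against the best single fixed parameter chosen in hindsight. A toy sketch using online gradient descent (Zinkevich ’03) on scalar quadratic round losses; the loss sequence and step size are illustrative choices:

```python
import numpy as np

def ogd(grads, theta0, lr):
    """Online gradient descent (Zinkevich '03): play theta_t zero-shot,
    then take one gradient step on the revealed round-t loss."""
    played, theta = [], theta0
    for g in grads:
        played.append(theta)
        theta = theta - lr * g(theta)
    return played

# Round losses loss_t(theta) = (theta - c_t)^2, revealed one per round.
cs = [1.0, 2.0, 0.5, 1.5, 1.0]
losses = [lambda th, c=c: (th - c) ** 2 for c in cs]
grads = [lambda th, c=c: 2.0 * (th - c) for c in cs]

played = ogd(grads, theta0=0.0, lr=0.2)
agent_loss = sum(l(th) for l, th in zip(losses, played))

# Best fixed comparator in hindsight: for these quadratics, the mean of c_t.
best_fixed = float(np.mean(cs))
comparator_loss = sum(l(best_fixed) for l in losses)
static_regret = agent_loss - comparator_loss
```

Here `static_regret` is about 3.02 for this sequence; sublinear-regret algorithms such as OGD make the per-round average of this gap vanish as the number of rounds grows.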

  20. The Online Meta-Learning Setting

  21. The Online Meta-Learning Setting. Space of parameters θ ∈ Θ ⊆ R^d and loss functions ℓ : Θ → R. For round t ∈ {1, 2, ..., ∞}:

  22. For round t: (1) the world picks a loss function ℓ_t(·); (2) the agent must pick θ_t without knowledge of ℓ_t.
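The two-step round protocol on the slide can be written as a driver loop. The `agent` interface and the finite-difference learner below are hypothetical illustrations of an agent that improves across rounds, not the paper's actual algorithm:

```python
import numpy as np

def run_protocol(world_losses, agent):
    """Round t: (1) the world picks loss_t; (2) the agent commits to
    theta_t before seeing loss_t; it then incurs loss_t(theta_t) and may
    use the revealed loss to update for future rounds."""
    incurred = []
    for loss_t in world_losses:           # 1. world picks loss_t(.)
        theta_t = agent.propose()         # 2. agent picks theta_t blind
        incurred.append(loss_t(theta_t))  # agent suffers loss_t(theta_t)
        agent.update(loss_t)              # learn for future rounds
    return incurred

class GradientAgent:
    """Minimal learner: keeps theta in R^d and takes one gradient step per
    round, with the gradient estimated by central finite differences."""
    def __init__(self, d, lr=0.1, eps=1e-5):
        self.theta, self.lr, self.eps = np.zeros(d), lr, eps

    def propose(self):
        return self.theta.copy()

    def update(self, loss_fn):
        g = np.array([
            (loss_fn(self.theta + self.eps * e) -
             loss_fn(self.theta - self.eps * e)) / (2 * self.eps)
            for e in np.eye(len(self.theta))
        ])
        self.theta -= self.lr * g

# A slowly drifting (non-stationary) sequence of quadratic losses.
losses = [lambda th, c=c: float(np.sum((th - c) ** 2))
          for c in (1.0, 1.2, 0.9, 1.1)]
incurred = run_protocol(losses, GradientAgent(d=2))
assert incurred[-1] < incurred[0]  # later, similar rounds cost less
```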
