Online Meta-Learning
Chelsea Finn*, Aravind Rajeswaran*, Sham Kakade, Sergey Levine
Deep networks + large datasets = [image in the original slide]
In many practical situations: learn a new task with only a few datapoints.
Meta-Learning
(Schmidhuber et al. ’87, Bengio et al. ’92)
Given an i.i.d. task distribution, learn a new task efficiently.
More realistically:
[Timeline figure: a sequence of "learn" steps along a time axis, progressing from slow learning to rapid learning.]
Meta-Learning
(Schmidhuber et al. ’87, Bengio et al. ’92)
Given an i.i.d. task distribution, learn a new task efficiently.
[Figure: a single "learn" step on a time axis.]
Online Learning
(Hannan ’57, Zinkevich ’03)
Perform a sequence of tasks while minimizing static regret.
[Figure: repeated "perform" steps over time; measures zero-shot performance.]
Online Meta-Learning
(this work)
Efficiently learn a sequence of tasks from a non-stationary distribution.
[Figure: repeated "learn" steps over time; measures performance after seeing a small amount of data.]
The Online Meta-Learning Setting
Space of parameters $\theta \in \Theta \subseteq \mathbb{R}^d$ and loss functions $\ell : \Theta \to \mathbb{R}$. For round $t \in \{1, 2, \ldots, \infty\}$:
- 1. World picks a loss function $\ell_t(\cdot)$
- 2. Agent should pick $\theta_t$ without knowledge of $\ell_t$
- 3. Agent uses update procedure $\Phi_t : \Theta \to \Theta$, and obtains $\tilde{\theta}_t = \Phi_t(\theta_t)$, e.g. $\tilde{\theta}_t = \theta_t - \alpha \nabla \hat{\ell}_t(\theta_t)$
- 4. Agent suffers $\ell_t(\tilde{\theta}_t)$ for the round
Goal: a learning algorithm with sub-linear regret, where
$$\text{Regret}_T := \underbrace{\sum_{t=1}^{T} \ell_t(\Phi_t(\theta_t))}_{\text{loss of algorithm}} - \underbrace{\min_{\theta \in \Theta} \sum_{t=1}^{T} \ell_t(\Phi_t(\theta))}_{\text{loss of best algorithm in hindsight}}$$
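The round structure and regret definition above can be made concrete with a small sketch. It assumes toy one-dimensional quadratic losses $\ell_t(\theta) = \tfrac{1}{2}(\theta - c_t)^2$, uses the same loss for the update step and for evaluation, and takes $\Phi_t$ to be a single gradient step. The function names (`make_loss`, `make_update`, `run_protocol`, `regret`) are hypothetical and not taken from the original work.

```python
from typing import Callable, List, Sequence


def make_loss(center: float) -> Callable[[float], float]:
    """Toy round loss l_t(theta) = 0.5 * (theta - center)^2 (illustrative assumption)."""
    return lambda theta: 0.5 * (theta - center) ** 2


def make_update(center: float, alpha: float) -> Callable[[float], float]:
    """Update procedure Phi_t: one gradient step on the toy loss with step size alpha."""
    return lambda theta: theta - alpha * (theta - center)


def run_protocol(centers: Sequence[float],
                 choose_theta: Callable[[List[float]], float],
                 alpha: float = 0.5) -> List[float]:
    """Play the four-step round and return the losses l_t(Phi_t(theta_t))."""
    losses: List[float] = []
    seen: List[float] = []  # centers of rounds already revealed to the agent
    for center in centers:                                   # 1. world picks l_t
        theta_t = choose_theta(seen)                         # 2. agent picks theta_t from the past only
        tilde_theta_t = make_update(center, alpha)(theta_t)  # 3. agent adapts with Phi_t
        losses.append(make_loss(center)(tilde_theta_t))      # 4. agent suffers l_t(tilde_theta_t)
        seen.append(center)                                  # l_t is revealed after the round
    return losses


def regret(centers: Sequence[float], losses: Sequence[float], alpha: float = 0.5) -> float:
    """Total loss of the agent minus the loss of the best fixed theta in hindsight.

    For these quadratics with a shared alpha, the best fixed theta is the mean of the centers.
    """
    best_theta = sum(centers) / len(centers)
    best_total = sum(make_loss(c)(make_update(c, alpha)(best_theta)) for c in centers)
    return sum(losses) - best_total
```

For instance, an agent that always plays $\theta_t = 0$ can be scored with `regret(cs, run_protocol(cs, lambda seen: 0.0))` for any list of centers `cs`.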
Follow the Meta-Leader (FTML):
$$\theta_{t+1} = \arg\min_{\theta} \sum_{k=1}^{t} \ell_k(\Phi_k(\theta))$$
Can be implemented with MAML.
Theorem (Informal): If $\{\ell_t(\cdot), \hat{\ell}_t(\cdot)\}\ \forall t$ are $C^2$-smooth and strongly convex, the sequence of models $\{\theta_1, \theta_2, \ldots, \theta_T\}$ returned by FTML has the property:
$$\text{Regret}_T := \sum_{t=1}^{T} \ell_t(\Phi_t(\theta_t)) - \min_{\theta \in \Theta} \sum_{t=1}^{T} \ell_t(\Phi_t(\theta)) = O(\log T)$$
$$\implies \text{Avg. Regret} = \frac{\text{Regret}_T}{T} \to 0 \text{ as } T \to \infty$$
Learning in a sequential non-stationary setting, but still competitive with the best meta-learner in hindsight!
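As a rough illustration of the FTML rule and the regret statement, the sketch below runs follow-the-meta-leader on toy one-dimensional quadratics $\ell_k(\theta) = \tfrac{1}{2}(\theta - c_k)^2$ with a one-step gradient update $\Phi_k$. Under these assumptions (shared step size, identical curvature) the arg min over past rounds has a closed form, the mean of the centers seen so far. That shortcut is specific to this toy case and is not the general algorithm. The names `adapted_loss` and `ftml_toy` are hypothetical.

```python
from typing import List, Sequence


def adapted_loss(theta: float, center: float, alpha: float) -> float:
    """l_k(Phi_k(theta)) for the toy loss 0.5 * (x - center)^2 and one gradient step."""
    tilde_theta = theta - alpha * (theta - center)
    return 0.5 * (tilde_theta - center) ** 2


def ftml_toy(centers: Sequence[float], alpha: float = 0.5,
             theta_init: float = 0.0) -> List[float]:
    """Run FTML on toy quadratics and return the cumulative regret after each round."""
    seen_sum = 0.0
    total_loss = 0.0
    regrets: List[float] = []
    for t, center in enumerate(centers, start=1):
        # FTML step: minimizer of the summed adapted losses of rounds 1..t-1.
        # For these quadratics that minimizer is the mean of the past centers.
        theta_t = seen_sum / (t - 1) if t > 1 else theta_init
        total_loss += adapted_loss(theta_t, center, alpha)
        seen_sum += center
        # Comparator: best fixed theta in hindsight over rounds 1..t.
        best_theta = seen_sum / t
        best_total = sum(adapted_loss(best_theta, c, alpha) for c in centers[:t])
        regrets.append(total_loss - best_total)
    return regrets
```

Dividing `regrets[T - 1]` by `T` gives the average regret after `T` rounds. For bounded centers in this toy setting one would expect it to shrink as `T` grows, in line with the informal theorem.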
FTML: practical instantiation of our approach, extending MAML [1]
meta-train on all data so far, fine-tune on current task
Experiment with sequences of tasks:
- Colored, rotated, scaled MNIST
- 3D object pose prediction
- CIFAR-100 classification
[Figure: example pose prediction tasks (plane, car, chair).]
[1] Finn et al. ICML ’17
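The "meta-train on all data so far, fine-tune on current task" recipe can be sketched as a loop over a growing task buffer. This is only a toy under stated assumptions, not the authors' implementation: tasks are one-dimensional quadratics $\tfrac{1}{2}(\theta - c)^2$, fine-tuning is a single gradient step, and the MAML-style meta-gradient is written analytically for that case. All names (`fine_tune`, `meta_gradient`, `practical_ftml`) and hyperparameter values are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def fine_tune(theta: float, center: float, alpha: float) -> float:
    """One gradient step on the current task's toy loss 0.5 * (theta - center)^2."""
    return theta - alpha * (theta - center)


def meta_gradient(theta: float, center: float, alpha: float) -> float:
    """d/dtheta of loss(fine_tune(theta)) for the toy quadratic, derived by hand."""
    return (1.0 - alpha) ** 2 * (theta - center)


def practical_ftml(centers: Sequence[float], alpha: float = 0.5, meta_lr: float = 0.5,
                   meta_steps: int = 20, seed: int = 0) -> Tuple[float, List[float]]:
    """Return the final meta-parameter and the fine-tuned parameter for each task."""
    rng = random.Random(seed)
    buffer: List[float] = []   # all tasks seen so far
    theta = 0.0                # meta-learned initialization
    adapted: List[float] = []
    for center in centers:
        # Fine-tune the current initialization on the new task.
        adapted.append(fine_tune(theta, center, alpha))
        # Store the task, then meta-train on tasks sampled from the whole buffer.
        buffer.append(center)
        for _ in range(meta_steps):
            sampled = rng.choice(buffer)
            theta -= meta_lr * meta_gradient(theta, sampled, alpha)
    return theta, adapted
```

Sampling from the buffer replaces the exact arg min over all past tasks with a few stochastic meta-gradient steps per round, which is the kind of approximation a practical variant would need once the minimizer has no closed form.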
Experiments
[Figure: learning efficiency (# datapoints) and learning proficiency (error) plotted against task index, for Rainbow MNIST and Pose Prediction.]
FTML (ours) learns each new task faster and with greater proficiency, and approaches the few-shot learning regime.
Takeaways
- Introduced the online meta-learning problem formulation
- Meta-learning is effective in non-stationary settings
- Similar guarantees to online learning, but better empirical performance
For more, come see us at poster #5!