Collaborators Joint work with Sarah Dean, Aurelia Guy, Horia Mania, - - PowerPoint PPT Presentation



SLIDE 1

Collaborators

Joint work with Sarah Dean, Aurelia Guy, Horia Mania, Nikolai Matni, Max Simchowitz, and Stephen Tu.

SLIDE 2

trustable, scalable, predictable

SLIDE 3

Optimal control

[Diagram: system block labeled with u, x, e, and x_t]

\[
\begin{aligned}
\text{minimize} \quad & \mathbb{E}_e\left[ \sum_{t=1}^{T} C_t(x_t, u_t) \right] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, e_t) \\
& u_t = \pi_t(\tau_t)
\end{aligned}
\]

$C_t$ is the cost. If you maximize, it’s called a reward.
$f_t$ is the state-transition function.
$x_t$ is the state, $u_t$ is the input, $e_t$ is a noise process.
$\tau_t = (u_1, \ldots, u_{t-1}, x_0, \ldots, x_t)$ is an observed trajectory.
$\pi_t(\tau_t)$ is the policy. This is the optimization decision variable.
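The optimal control objective is an expectation over the noise, so for a fixed policy it can be estimated by averaging simulated rollouts. The following is a minimal sketch under stated assumptions: the scalar system `f`, the quadratic `cost`, both policies, and the convention that the sum starts at the initial state are all hypothetical choices made for illustration, not part of the slides.

```python
import numpy as np

def rollout_cost(f, cost, policy, x_init, T, rng):
    """Simulate one trajectory and return sum_{t=1}^{T} C_t(x_t, u_t)."""
    xs, us = [x_init], []              # trajectory observed so far (tau_t)
    total = 0.0
    for t in range(1, T + 1):
        u = policy(t, xs, us)          # u_t = pi_t(tau_t)
        total += cost(t, xs[-1], u)    # C_t(x_t, u_t)
        e = rng.standard_normal()      # e_t, assumed standard normal here
        xs.append(f(t, xs[-1], u, e))  # x_{t+1} = f_t(x_t, u_t, e_t)
        us.append(u)
    return total

def expected_cost(f, cost, policy, x_init, T, n_rollouts=1000, seed=0):
    """Monte Carlo estimate of the expected total cost of a policy."""
    rng = np.random.default_rng(seed)
    costs = [rollout_cost(f, cost, policy, x_init, T, rng) for _ in range(n_rollouts)]
    return float(np.mean(costs))

# Hypothetical scalar system, cost, and two candidate policies.
f = lambda t, x, u, e: 0.9 * x + u + 0.1 * e
cost = lambda t, x, u: x**2 + u**2
zero_policy = lambda t, xs, us: 0.0
feedback_policy = lambda t, xs, us: -0.5 * xs[-1]

J_zero = expected_cost(f, cost, zero_policy, 1.0, T=20)
J_feedback = expected_cost(f, cost, feedback_policy, 1.0, T=20)
```

Comparing `J_zero` and `J_feedback` shows how the policy, the decision variable, changes the objective while the system stays fixed.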

SLIDE 4

Learning to control

[Diagram: system block labeled with u, x, e, and x_t]

\[
\begin{aligned}
\text{minimize} \quad & \mathbb{E}_e\left[ \sum_{t=1}^{T} C_t(x_t, u_t) \right] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, e_t) \\
& u_t = \pi_t(\tau_t)
\end{aligned}
\]

$C_t$ is the cost. If you maximize, it’s called a reward.
$f_t$ is the state-transition function (unknown!).
$x_t$ is the state, $u_t$ is the input, $e_t$ is a noise process.
$\tau_t = (u_1, \ldots, u_{t-1}, x_0, \ldots, x_t)$ is an observed trajectory.
$\pi_t(\tau_t)$ is the policy. This is the optimization decision variable.

Perennial challenge: how to perform optimal control when the system is unknown?

How well must we understand a system in order to control it?
SLIDE 5

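\[
\frac{\partial}{\partial t}(\rho u) + \nabla \cdot (\rho u \otimes u + pI) = \nabla \cdot \tau + \rho g
\]

\[
M \dot{T} = \dot{Q} + \dot{m}_s c_p (T_s - T)
\]

[Diagram: HVAC and ROOM blocks, with labels sensor, state, action]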
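The second, coarse model is a single ordinary differential equation, so it can be simulated in a few lines. The sketch below uses forward Euler. The slides do not define the symbols, so the readings used here are assumptions: M as the room's thermal capacitance, Q-dot as the heat load, m-dot_s as the supply air mass flow, c_p as the specific heat of air, and T_s as the supply air temperature. All numeric values are made up for illustration.

```python
def simulate_room(T0, Q_dot, m_dot_s, T_s, M, c_p, dt, steps):
    """Forward-Euler simulation of M * dT/dt = Q_dot + m_dot_s * c_p * (T_s - T)."""
    T = T0
    history = [T]
    for _ in range(steps):
        dT_dt = (Q_dot + m_dot_s * c_p * (T_s - T)) / M
        T = T + dt * dT_dt
        history.append(T)
    return history

# Hypothetical values (SI units): not taken from the slides.
M = 5.0e4        # thermal capacitance, J/K
c_p = 1005.0     # specific heat of air, J/(kg K)
m_dot_s = 0.1    # supply air mass flow, kg/s
T_s = 15.0       # supply air temperature, deg C
Q_dot = 500.0    # heat load, W

history = simulate_room(T0=25.0, Q_dot=Q_dot, m_dot_s=m_dot_s, T_s=T_s,
                        M=M, c_p=c_p, dt=1.0, steps=10000)

# With constant inputs the temperature settles where dT/dt = 0.
T_steady = T_s + Q_dot / (m_dot_s * c_p)
```

Under these assumptions the room temperature relaxes toward `T_steady` with time constant `M / (m_dot_s * c_p)`, which is the kind of low-order behavior a coarse model is meant to capture.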

SLIDE 6

\[
\frac{\partial}{\partial t}(\rho u) + \nabla \cdot (\rho u \otimes u + pI) = \nabla \cdot \tau + \rho g
\]

\[
M \dot{T} = \dot{Q} + \dot{m}_s c_p (T_s - T)
\]

[Diagram: HVAC and ROOM blocks, with labels sensor, state, action]

Identify everything
Identify a coarse model
We don’t need no stinking models!

  • High performance aerodynamics
  • model predictive control
  • reinforcement learning
  • PID control?

We need robust fundamentals to distinguish these approaches

SLIDE 7

But PID control works…

[Figure: Bode Diagram. Magnitude (dB) versus Frequency (rad/sec). Annotations: one decade; gain crossover point; log-log slope = -1.5; 2 ≈ 6 dB; 0.5 ≈ -6 dB]

2 parameters suffice for 95% of all control applications.
How much needs to be modeled for more advanced control?
Can we learn to compensate for poor models, changing conditions?
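To make the two-parameter claim concrete, here is a small sketch of a controller with exactly two gains driving a first-order plant to a setpoint. The slide does not say which two parameters it means, so the choice of a proportional gain `kp` and an integral gain `ki`, the plant `tau * dy/dt = -y + u`, and all numeric values are assumptions for illustration.

```python
def simulate_pi(kp, ki, setpoint=1.0, tau=2.0, dt=0.01, steps=3000):
    """Forward-Euler simulation of a PI loop around the plant tau * dy/dt = -y + u."""
    y, integral = 0.0, 0.0
    ys = []
    for _ in range(steps):
        error = setpoint - y
        integral += error * dt            # accumulated error
        u = kp * error + ki * integral    # the controller: two gains only
        y += dt * (-y + u) / tau          # plant update
        ys.append(y)
    return ys

pi_response = simulate_pi(kp=2.0, ki=1.0)   # proportional plus integral action
p_response = simulate_pi(kp=2.0, ki=0.0)    # proportional action only
```

With the integral term the output settles at the setpoint. With proportional action alone it settles at `kp / (1 + kp)` of the setpoint for this plant, which is why the second gain matters.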

SLIDE 8

Learning to control

[Diagram: system block labeled with u, x, e, and x_t]

\[
\begin{aligned}
\text{minimize} \quad & \mathbb{E}_e\left[ \sum_{t=1}^{T} C_t(x_t, u_t) \right] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, e_t) \\
& u_t = \pi_t(\tau_t)
\end{aligned}
\]

What is the optimal estimation/design scheme?
How many samples are needed for near-optimal control?

Oracle: You can generate N trajectories of length T.
Challenge: Build a controller with the smallest error under a fixed sampling budget (N × T).
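The oracle can be pictured as a function that returns N trajectories of length T from a system the learner cannot inspect. This is a sketch under assumptions: the name `sample_trajectories`, the Gaussian random inputs, and the linear test system with matrices `A` and `B` are hypothetical and only stand in for the unknown system.

```python
import numpy as np

def sample_trajectories(step, x0, m, N, T, input_std=1.0, seed=0):
    """Query a black-box system for N trajectories of length T.

    step(x, u, rng) returns the next state. Returns states X with shape
    (N, T + 1, n) and inputs U with shape (N, T, m).
    """
    rng = np.random.default_rng(seed)
    n = len(x0)
    X = np.zeros((N, T + 1, n))
    U = input_std * rng.standard_normal((N, T, m))   # random exploratory inputs
    for i in range(N):
        X[i, 0] = x0
        for t in range(T):
            X[i, t + 1] = step(X[i, t], U[i, t], rng)
    return X, U

# Hypothetical system hidden behind the oracle.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
step = lambda x, u, rng: A @ x + B @ u + 0.01 * rng.standard_normal(2)

N, T = 5, 50
X, U = sample_trajectories(step, x0=np.zeros(2), m=1, N=N, T=T)
n_transitions = N * T   # the sampling budget
```

Every method is then charged the same budget of `N * T` observed transitions, which is what makes the comparison between approaches fair.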

SLIDE 9

RL Methods

[Diagram: block G with signals u, x, e, and x_t; labels: model-based, approximate dynamic programming, direct policy search]

  • Model-based: fit model from data
  • Model-free
  • Direct policy search: search for actions from data
  • Approximate dynamic programming: estimate cost from data (with or without model)

\[
\begin{aligned}
\text{minimize} \quad & \mathbb{E}_e\left[ \sum_{t=1}^{T} C_t(x_t, u_t) \right] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, e_t) \\
& u_t = \pi_t(\tau_t)
\end{aligned}
\]
How to solve optimal control when the model f is unknown?
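As one illustration of direct policy search, the sketch below tunes a single feedback gain using only rollout costs, never looking at the dynamics. The slides do not name an algorithm, so this accept-if-better random search is a swapped-in stand-in for the family. The scalar system, the policy form `u = -k * x`, the quadratic cost, and the fixed noise seed are all assumptions.

```python
import numpy as np

def rollout_cost(k, a=1.1, b=1.0, x0=1.0, T=20, noise_std=0.1, seed=0):
    """Cost of the policy u = -k * x on a hypothetical scalar system.

    The seed is fixed so every candidate gain sees the same noise sequence.
    """
    rng = np.random.default_rng(seed)
    x, total = x0, 0.0
    for _ in range(T):
        u = -k * x
        total += x**2 + u**2
        x = a * x + b * u + noise_std * rng.standard_normal()
    return total

def random_search(k_init=0.0, step=0.1, iters=200, seed=1):
    """Perturb the gain at random and keep a candidate only if its cost is lower."""
    rng = np.random.default_rng(seed)
    k_best, J_best = k_init, rollout_cost(k_init)
    for _ in range(iters):
        k_try = k_best + step * rng.standard_normal()
        J_try = rollout_cost(k_try)
        if J_try < J_best:
            k_best, J_best = k_try, J_try
    return k_best, J_best

J_init = rollout_cost(0.0)          # cost with no control on an unstable system
k_best, J_best = random_search()
```

The search only ever calls `rollout_cost`, so it treats the system as a black box. The price is that every candidate gain costs a full rollout from the sampling budget.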

SLIDE 10

Model-based RL

minimize Ee hPT

t=1 Ct(xt, ut)

i s.t. xt+1 = f(xt, ut, et) ut = πt(τt)

<latexit sha1_base64="wtsO4CVxqkm5sp9iKOgdMu8Yxs=">ADGHicbVJNb9QwEHXCV7t8beHIxWJFtRXVNkFIlEOlioLg0EMR3bSOkSOd7Jr1XEie4J2ifJHuPJHuCGu3Pg3ONtUYncZydL4vednz4yTQkmLQfDH82/cvHX7zsZm5+69+w8edrcendm8NAKGIle5uUi4BSU1DFGigovCAM8SBefJ5VHDn38BY2WuT3FeQJTxiZapFBwdFHe/swQmUlfcGD6vK6XqDsuSfFZlUstMfoWablOWcZwmSfWujoEpSHEbJnFR6E9edTehRjfxbjbhnjDjNyMsWIsdbGDnDQWMyc+nlY0wOaXmt3odGzjZ1G0ewQjojhrzx6TDQ4/ZVcbcXDIJF0PUkbJMeaeMk3vIiNs5FmYFGobi1ozAoMHJ2KIUCV2JpoeDik9g5FLNM7BRtehmTZ85ZEzT3LilkS7Qf09UPLN2niVO2fTFrnIN+D9uVGK6H1VSFyWCFlcXpaWimNmNHQsDQhUc5dwYaR7KxVTbrhAN8ClWxbeBYilSqpZqaXIx7CKpyh4Q60gBmXuqmqei+Vop+4tvS4Gdg162wbuv9WTiTa3WP3S/TOmtgNJFxt/3py9mIQBoPw48ve4Zt2NBvkCXlK+iQkr8gh+UBOyJAIb9Pb8/a91/43/4f/0/91JfW9sxjshT+7+2e/1Q</latexit><latexit sha1_base64="wtsO4CVxqkm5sp9iKOgdMu8Yxs=">ADGHicbVJNb9QwEHXCV7t8beHIxWJFtRXVNkFIlEOlioLg0EMR3bSOkSOd7Jr1XEie4J2ifJHuPJHuCGu3Pg3ONtUYncZydL4vednz4yTQkmLQfDH82/cvHX7zsZm5+69+w8edrcendm8NAKGIle5uUi4BSU1DFGigovCAM8SBefJ5VHDn38BY2WuT3FeQJTxiZapFBwdFHe/swQmUlfcGD6vK6XqDsuSfFZlUstMfoWablOWcZwmSfWujoEpSHEbJnFR6E9edTehRjfxbjbhnjDjNyMsWIsdbGDnDQWMyc+nlY0wOaXmt3odGzjZ1G0ewQjojhrzx6TDQ4/ZVcbcXDIJF0PUkbJMeaeMk3vIiNs5FmYFGobi1ozAoMHJ2KIUCV2JpoeDik9g5FLNM7BRtehmTZ85ZEzT3LilkS7Qf09UPLN2niVO2fTFrnIN+D9uVGK6H1VSFyWCFlcXpaWimNmNHQsDQhUc5dwYaR7KxVTbrhAN8ClWxbeBYilSqpZqaXIx7CKpyh4Q60gBmXuqmqei+Vop+4tvS4Gdg162wbuv9WTiTa3WP3S/TOmtgNJFxt/3py9mIQBoPw48ve4Zt2NBvkCXlK+iQkr8gh+UBOyJAIb9Pb8/a91/43/4f/0/91JfW9sxjshT+7+2e/1Q</latexit><latexit 
sha1_base64="wtsO4CVxqkm5sp9iKOgdMu8Yxs=">ADGHicbVJNb9QwEHXCV7t8beHIxWJFtRXVNkFIlEOlioLg0EMR3bSOkSOd7Jr1XEie4J2ifJHuPJHuCGu3Pg3ONtUYncZydL4vednz4yTQkmLQfDH82/cvHX7zsZm5+69+w8edrcendm8NAKGIle5uUi4BSU1DFGigovCAM8SBefJ5VHDn38BY2WuT3FeQJTxiZapFBwdFHe/swQmUlfcGD6vK6XqDsuSfFZlUstMfoWablOWcZwmSfWujoEpSHEbJnFR6E9edTehRjfxbjbhnjDjNyMsWIsdbGDnDQWMyc+nlY0wOaXmt3odGzjZ1G0ewQjojhrzx6TDQ4/ZVcbcXDIJF0PUkbJMeaeMk3vIiNs5FmYFGobi1ozAoMHJ2KIUCV2JpoeDik9g5FLNM7BRtehmTZ85ZEzT3LilkS7Qf09UPLN2niVO2fTFrnIN+D9uVGK6H1VSFyWCFlcXpaWimNmNHQsDQhUc5dwYaR7KxVTbrhAN8ClWxbeBYilSqpZqaXIx7CKpyh4Q60gBmXuqmqei+Vop+4tvS4Gdg162wbuv9WTiTa3WP3S/TOmtgNJFxt/3py9mIQBoPw48ve4Zt2NBvkCXlK+iQkr8gh+UBOyJAIb9Pb8/a91/43/4f/0/91JfW9sxjshT+7+2e/1Q</latexit><latexit sha1_base64="wtsO4CVxqkm5sp9iKOgdMu8Yxs=">ADGHicbVJNb9QwEHXCV7t8beHIxWJFtRXVNkFIlEOlioLg0EMR3bSOkSOd7Jr1XEie4J2ifJHuPJHuCGu3Pg3ONtUYncZydL4vednz4yTQkmLQfDH82/cvHX7zsZm5+69+w8edrcendm8NAKGIle5uUi4BSU1DFGigovCAM8SBefJ5VHDn38BY2WuT3FeQJTxiZapFBwdFHe/swQmUlfcGD6vK6XqDsuSfFZlUstMfoWablOWcZwmSfWujoEpSHEbJnFR6E9edTehRjfxbjbhnjDjNyMsWIsdbGDnDQWMyc+nlY0wOaXmt3odGzjZ1G0ewQjojhrzx6TDQ4/ZVcbcXDIJF0PUkbJMeaeMk3vIiNs5FmYFGobi1ozAoMHJ2KIUCV2JpoeDik9g5FLNM7BRtehmTZ85ZEzT3LilkS7Qf09UPLN2niVO2fTFrnIN+D9uVGK6H1VSFyWCFlcXpaWimNmNHQsDQhUc5dwYaR7KxVTbrhAN8ClWxbeBYilSqpZqaXIx7CKpyh4Q60gBmXuqmqei+Vop+4tvS4Gdg162wbuv9WTiTa3WP3S/TOmtgNJFxt/3py9mIQBoPw48ve4Zt2NBvkCXlK+iQkr8gh+UBOyJAIb9Pb8/a91/43/4f/0/91JfW9sxjshT+7+2e/1Q</latexit>

Collect some simulation data. Should have xt+1 ≈ ϕ(xt, ut) + νt

<latexit sha1_base64="qv2LanEkuNBubcf2z1eK6m/O+og=">ACnXicbVHbahsxEJW3tzS9Oe1jHypqCg4JZrcUksfQC+2DKSmt4B3WblsS2ilYQ0G2wW/0K/pq/tf/RvqnVcqO0OCA7nzEUzp7BKeorj363o1u07d+/t3d9/8PDR4yftg6cX3lRO4EAYZdxlAR6V1DgSQovrUMoC4XD4updow+v0Xlp9DdaWMxKmGo5kQIoUHm7O89rOkqWPAVrnZnz9BqcncnA03GV0yE/4qkOIG934l68Cr4LkjXosHWc5wetLB0bUZWoSjwfpTElrIaHEmhcLmfVh4tiCuY4ihADSX6rF6tOSvAjPmE+PC08RX7L8VNZTeL8oiZJZAM7+tNeT/tFk9OsltpWhFrcDJpUipPhzX34WDoUpBYBgHAy/JWLGTgQFK64MWXV26LY2KSeV1oKM8YtVtGcHATSI5UgdbNV/VEqxb+C9rwvpzP6q4a2jdx9L6eS/HE/WKUPd5KDIcn2+XfBxeteEveSL286Z2/X1uyx5+wl67KEnbAz9omdswET7Dv7wX6yX9GL6EPUjz7fpEatdc0zthHR8A8FI8/+</latexit><latexit sha1_base64="qv2LanEkuNBubcf2z1eK6m/O+og=">ACnXicbVHbahsxEJW3tzS9Oe1jHypqCg4JZrcUksfQC+2DKSmt4B3WblsS2ilYQ0G2wW/0K/pq/tf/RvqnVcqO0OCA7nzEUzp7BKeorj363o1u07d+/t3d9/8PDR4yftg6cX3lRO4EAYZdxlAR6V1DgSQovrUMoC4XD4updow+v0Xlp9DdaWMxKmGo5kQIoUHm7O89rOkqWPAVrnZnz9BqcncnA03GV0yE/4qkOIG934l68Cr4LkjXosHWc5wetLB0bUZWoSjwfpTElrIaHEmhcLmfVh4tiCuY4ihADSX6rF6tOSvAjPmE+PC08RX7L8VNZTeL8oiZJZAM7+tNeT/tFk9OsltpWhFrcDJpUipPhzX34WDoUpBYBgHAy/JWLGTgQFK64MWXV26LY2KSeV1oKM8YtVtGcHATSI5UgdbNV/VEqxb+C9rwvpzP6q4a2jdx9L6eS/HE/WKUPd5KDIcn2+XfBxeteEveSL286Z2/X1uyx5+wl67KEnbAz9omdswET7Dv7wX6yX9GL6EPUjz7fpEatdc0zthHR8A8FI8/+</latexit><latexit sha1_base64="qv2LanEkuNBubcf2z1eK6m/O+og=">ACnXicbVHbahsxEJW3tzS9Oe1jHypqCg4JZrcUksfQC+2DKSmt4B3WblsS2ilYQ0G2wW/0K/pq/tf/RvqnVcqO0OCA7nzEUzp7BKeorj363o1u07d+/t3d9/8PDR4yftg6cX3lRO4EAYZdxlAR6V1DgSQovrUMoC4XD4updow+v0Xlp9DdaWMxKmGo5kQIoUHm7O89rOkqWPAVrnZnz9BqcncnA03GV0yE/4qkOIG934l68Cr4LkjXosHWc5wetLB0bUZWoSjwfpTElrIaHEmhcLmfVh4tiCuY4ihADSX6rF6tOSvAjPmE+PC08RX7L8VNZTeL8oiZJZAM7+tNeT/tFk9OsltpWhFrcDJpUipPhzX34WDoUpBYBgHAy/JWLGTgQFK64MWXV26LY2KSeV1oKM8YtVtGcHATSI5UgdbNV/VEqxb+C9rwvpzP6q4a2jdx9L6eS/HE/WKUPd5KDIcn2+XfBxeteEveSL286Z2/X1uyx5+wl67KEnbAz9omdswET7Dv7wX6yX9GL6EPUjz7fpEatdc0zthHR8A8FI8/+</latexit><latexit 
sha1_base64="qv2LanEkuNBubcf2z1eK6m/O+og=">ACnXicbVHbahsxEJW3tzS9Oe1jHypqCg4JZrcUksfQC+2DKSmt4B3WblsS2ilYQ0G2wW/0K/pq/tf/RvqnVcqO0OCA7nzEUzp7BKeorj363o1u07d+/t3d9/8PDR4yftg6cX3lRO4EAYZdxlAR6V1DgSQovrUMoC4XD4updow+v0Xlp9DdaWMxKmGo5kQIoUHm7O89rOkqWPAVrnZnz9BqcncnA03GV0yE/4qkOIG934l68Cr4LkjXosHWc5wetLB0bUZWoSjwfpTElrIaHEmhcLmfVh4tiCuY4ihADSX6rF6tOSvAjPmE+PC08RX7L8VNZTeL8oiZJZAM7+tNeT/tFk9OsltpWhFrcDJpUipPhzX34WDoUpBYBgHAy/JWLGTgQFK64MWXV26LY2KSeV1oKM8YtVtGcHATSI5UgdbNV/VEqxb+C9rwvpzP6q4a2jdx9L6eS/HE/WKUPd5KDIcn2+XfBxeteEveSL286Z2/X1uyx5+wl67KEnbAz9omdswET7Dv7wX6yX9GL6EPUjz7fpEatdc0zthHR8A8FI8/+</latexit>

minimize Eω hPT

t=1 Ct(xt, ut)

i s.t. xt+1 = ϕ(xt, ut) + ωt ut = π(τt)

<latexit sha1_base64="7WwS5p4/hlK3102Z185pb4izZt8=">ADJnicbVJNbxMxEPUuXyV8pXDgwMUiokrVarWLkOASqaJAOfRQRNWipeV13E2Vm3vyp6tElb7f7jyR7ghxI2fgjdZEkYydLovZlnzynhRQWwvCn51+7fuPmra3bnTt3791/0N1+eGbz0jA+ZLnMzUVKLZdC8yEIkPyiMJyqVPLz9PKw4c+vuLEi16cwL3isaKbFRDAKDkq6X0nKM6Eragyd15WUdYeoNJ9VSmihxGde4x1MFIVpmlZv64TkimeUSD6BEbGlSioYRPWnU3yYQH+WwH6ZwC4xIptCTEirZQMIGp2Zq96LajzA5IqaYir+duA9vFROwHXtYIcNSCH6BGhDdwjX4/aNSbcXBuEi8GYStUkPtXGSbHsxGesVFwDk9TaURQWEDs5ExyN3BpeUHZJc34yKWaKm7jarHbGj9zyBhPcuOBrxA/+2oqLJ2rlJX2WzJrnMN+D9uVMLkVwJXZTANVteNCklhw3RuGxMJyBnLuEMiPcWzGbUkMZODtXbloF5ytTFLNSi1YPuZrqIQZGOpAy0FRoZupqiMhJf5ItcXHjXN/WCfb0P03IhNg94/dn9G7G8XOkGh9/ZvJ2fMgCoPow4vewevWmi30BD1FfRShl+gAvUcnaIiY9gbeO+8I/+L/83/7v9Ylvpe2/MIrYT/6zcbuwOX</latexit><latexit sha1_base64="7WwS5p4/hlK3102Z185pb4izZt8=">ADJnicbVJNbxMxEPUuXyV8pXDgwMUiokrVarWLkOASqaJAOfRQRNWipeV13E2Vm3vyp6tElb7f7jyR7ghxI2fgjdZEkYydLovZlnzynhRQWwvCn51+7fuPmra3bnTt3791/0N1+eGbz0jA+ZLnMzUVKLZdC8yEIkPyiMJyqVPLz9PKw4c+vuLEi16cwL3isaKbFRDAKDkq6X0nKM6Eragyd15WUdYeoNJ9VSmihxGde4x1MFIVpmlZv64TkimeUSD6BEbGlSioYRPWnU3yYQH+WwH6ZwC4xIptCTEirZQMIGp2Zq96LajzA5IqaYir+duA9vFROwHXtYIcNSCH6BGhDdwjX4/aNSbcXBuEi8GYStUkPtXGSbHsxGesVFwDk9TaURQWEDs5ExyN3BpeUHZJc34yKWaKm7jarHbGj9zyBhPcuOBrxA/+2oqLJ2rlJX2WzJrnMN+D9uVMLkVwJXZTANVteNCklhw3RuGxMJyBnLuEMiPcWzGbUkMZODtXbloF5ytTFLNSi1YPuZrqIQZGOpAy0FRoZupqiMhJf5ItcXHjXN/WCfb0P03IhNg94/dn9G7G8XOkGh9/ZvJ2fMgCoPow4vewevWmi30BD1FfRShl+gAvUcnaIiY9gbeO+8I/+L/83/7v9Ylvpe2/MIrYT/6zcbuwOX</latexit><latexit 
sha1_base64="7WwS5p4/hlK3102Z185pb4izZt8=">ADJnicbVJNbxMxEPUuXyV8pXDgwMUiokrVarWLkOASqaJAOfRQRNWipeV13E2Vm3vyp6tElb7f7jyR7ghxI2fgjdZEkYydLovZlnzynhRQWwvCn51+7fuPmra3bnTt3791/0N1+eGbz0jA+ZLnMzUVKLZdC8yEIkPyiMJyqVPLz9PKw4c+vuLEi16cwL3isaKbFRDAKDkq6X0nKM6Eragyd15WUdYeoNJ9VSmihxGde4x1MFIVpmlZv64TkimeUSD6BEbGlSioYRPWnU3yYQH+WwH6ZwC4xIptCTEirZQMIGp2Zq96LajzA5IqaYir+duA9vFROwHXtYIcNSCH6BGhDdwjX4/aNSbcXBuEi8GYStUkPtXGSbHsxGesVFwDk9TaURQWEDs5ExyN3BpeUHZJc34yKWaKm7jarHbGj9zyBhPcuOBrxA/+2oqLJ2rlJX2WzJrnMN+D9uVMLkVwJXZTANVteNCklhw3RuGxMJyBnLuEMiPcWzGbUkMZODtXbloF5ytTFLNSi1YPuZrqIQZGOpAy0FRoZupqiMhJf5ItcXHjXN/WCfb0P03IhNg94/dn9G7G8XOkGh9/ZvJ2fMgCoPow4vewevWmi30BD1FfRShl+gAvUcnaIiY9gbeO+8I/+L/83/7v9Ylvpe2/MIrYT/6zcbuwOX</latexit><latexit sha1_base64="7WwS5p4/hlK3102Z185pb4izZt8=">ADJnicbVJNbxMxEPUuXyV8pXDgwMUiokrVarWLkOASqaJAOfRQRNWipeV13E2Vm3vyp6tElb7f7jyR7ghxI2fgjdZEkYydLovZlnzynhRQWwvCn51+7fuPmra3bnTt3791/0N1+eGbz0jA+ZLnMzUVKLZdC8yEIkPyiMJyqVPLz9PKw4c+vuLEi16cwL3isaKbFRDAKDkq6X0nKM6Eragyd15WUdYeoNJ9VSmihxGde4x1MFIVpmlZv64TkimeUSD6BEbGlSioYRPWnU3yYQH+WwH6ZwC4xIptCTEirZQMIGp2Zq96LajzA5IqaYir+duA9vFROwHXtYIcNSCH6BGhDdwjX4/aNSbcXBuEi8GYStUkPtXGSbHsxGesVFwDk9TaURQWEDs5ExyN3BpeUHZJc34yKWaKm7jarHbGj9zyBhPcuOBrxA/+2oqLJ2rlJX2WzJrnMN+D9uVMLkVwJXZTANVteNCklhw3RuGxMJyBnLuEMiPcWzGbUkMZODtXbloF5ytTFLNSi1YPuZrqIQZGOpAy0FRoZupqiMhJf5ItcXHjXN/WCfb0P03IhNg94/dn9G7G8XOkGh9/ZvJ2fMgCoPow4vewevWmi30BD1FfRShl+gAvUcnaIiY9gbeO+8I/+L/83/7v9Ylvpe2/MIrYT/6zcbuwOX</latexit>

Solve approximate problem: Fit dynamics with supervised learning:

$$
\hat{\varphi} = \arg\min_{\varphi} \sum_{t=0}^{T-1} \left\| x_{t+1} - \varphi(x_t, u_t) \right\|^2
$$
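As a concrete instance of the fitting step, here is a minimal sketch that assumes the model class is linear, φ(x, u) = A x + B u, so the supervised-learning problem reduces to ordinary least squares. The function name `fit_linear_dynamics` and the small noiseless test system are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def fit_linear_dynamics(xs, us):
    """Least-squares fit of x_{t+1} ~ A x_t + B u_t from one trajectory.

    xs: array of shape (T + 1, d) holding x_0, ..., x_T
    us: array of shape (T, p) holding u_0, ..., u_{T-1}
    """
    d = xs.shape[1]
    Z = np.hstack([xs[:-1], us])   # regressors (x_t, u_t), shape (T, d + p)
    Y = xs[1:]                     # targets x_{t+1}, shape (T, d)
    # Solve min ||Z Theta - Y||^2; Theta stacks A^T on top of B^T.
    Theta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    A_hat, B_hat = Theta[:d].T, Theta[d:].T
    return A_hat, B_hat

# Hypothetical noiseless system, chosen only for illustration.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])
T = 50
us = rng.normal(size=(T, 1))       # random excitation inputs
xs = np.zeros((T + 1, 2))
for t in range(T):
    xs[t + 1] = A_true @ xs[t] + B_true @ us[t]

A_hat, B_hat = fit_linear_dynamics(xs, us)
```

With process noise added to the simulation, the estimates would only approximate the true matrices, which is where the sample-complexity question comes in.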
slide-11
SLIDE 11

RL Methods

[Figure: block diagram of the system G, with labels x_t, u, x, e]

$$
\begin{aligned}
\text{minimize} \quad & \mathbb{E}_e\left[\sum_{t=1}^{T} C_t(x_t, u_t)\right] \\
\text{s.t.} \quad & x_{t+1} = f_t(x_t, u_t, e_t) \\
& u_t = \pi_t(\tau_t)
\end{aligned}
$$

Slide annotations: approximate dynamic programming (the cost $C_t$), model-based (the dynamics $f_t$), direct policy search (the policy $\pi_t$).
  • Model-based: fit model from data
  • Model-free
  • Direct policy search: search for actions from data
  • Approximate dynamic programming: estimate cost from data (with or without model)

How to solve optimal control when the model f is unknown?

slide-12
SLIDE 12

“Simplest” Example: LQR

$$
\begin{aligned}
\text{minimize} \quad & \mathbb{E}\left[\frac{1}{T}\sum_{t=1}^{T} x_t^* Q x_t + u_t^* R u_t\right] \\
\text{s.t.} \quad & x_{t+1} = A x_t + B u_t + e_t
\end{aligned}
$$
  • Optimization simplicity
  • Elegant Dynamic Programming solutions
  • Exact solution for baseline
  • Static state feedback solution on infinite horizon
  • Natural robustness
  • Broadly applicable as is
  • Core of many MPC and nonlinear control methods
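To illustrate the "exact solution" and "static state feedback" bullets, the following sketch iterates the standard discrete-time Riccati recursion until it settles, then reads off a gain K for the policy u_t = K x_t. It assumes A, B, Q, R are known; the function name `lqr_gain` and the double-integrator example are illustrative assumptions.

```python
import numpy as np

def lqr_gain(A, B, Q, R, iters=500):
    """Fixed-point iteration on the discrete-time Riccati equation.

    Returns (K, P): the static gain for u_t = K x_t and the cost-to-go matrix.
    """
    P = Q.copy()
    for _ in range(iters):
        G = R + B.T @ P @ B
        K = -np.linalg.solve(G, B.T @ P @ A)   # minimizing gain for current P
        P = Q + A.T @ P @ (A + B @ K)          # Riccati update
    return K, P

# Hypothetical example: a double integrator with unit costs.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

K, P = lqr_gain(A, B, Q, R)
closed_loop_radius = max(abs(np.linalg.eigvals(A + B @ K)))
```

A spectral radius below 1 for A + B K indicates that the resulting static feedback stabilizes the system. This gain is the baseline against which a learned controller can be compared.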
slide-13
SLIDE 13

“Simplest” Example: LQR

What is the optimal estimation/design scheme? How many samples are needed for near-optimal control?

$$
\begin{aligned}
\text{minimize} \quad & \mathbb{E}\left[\frac{1}{T}\sum_{t=1}^{T} x_t^* Q x_t + u_t^* R u_t\right] \\
\text{s.t.} \quad & x_{t+1} = A x_t + B u_t + e_t
\end{aligned}
$$

Oracle: You can generate N trajectories of length T.
Challenge: Build a controller with the smallest error under a fixed sampling budget (N × T).
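The oracle can be sketched as a simulator that returns N rollouts of length T. The slide does not say how inputs are chosen or how the noise is distributed, so the sketch assumes Gaussian random inputs, Gaussian noise, and x_0 = 0. The function name `collect_rollouts` and its parameters are hypothetical.

```python
import numpy as np

def collect_rollouts(A, B, N, T, sigma_u=1.0, sigma_e=0.1, seed=0):
    """Simulate N trajectories of x_{t+1} = A x_t + B u_t + e_t, each of length T.

    Assumptions (not from the slides): x_0 = 0, u_t ~ N(0, sigma_u^2 I),
    e_t ~ N(0, sigma_e^2 I).
    """
    rng = np.random.default_rng(seed)
    d, p = B.shape
    xs = np.zeros((N, T + 1, d))
    us = sigma_u * rng.normal(size=(N, T, p))
    for i in range(N):
        for t in range(T):
            e = sigma_e * rng.normal(size=d)
            xs[i, t + 1] = A @ xs[i, t] + B @ us[i, t] + e
    return xs, us

# Hypothetical system; the sampling budget is N * T = 100 transitions.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
xs, us = collect_rollouts(A, B, N=5, T=20)
budget = us.shape[0] * us.shape[1]
```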

slide-14
SLIDE 14

| Algorithm | Samples per observation | LQR parameters | “Optimal” error after NT steps |
| Model-based | $d$ | $d^2 + dp$ | $C\sqrt{\frac{d+p}{NT}}$ |
| ADP | 1 | $\binom{d+p}{2}$ | $C\,\frac{d+p}{\sqrt{NT}}$ |
| Policy search | 1 | $dp$ | $C\sqrt{\frac{dp}{NT}}$ |

where $x \in \mathbb{R}^d$ and $u \in \mathbb{R}^p$.

Continuous Control: Sample Complexity?
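To get a feel for the three error expressions on this slide, the sketch below evaluates them for given d, p, N, T. It assumes the expressions pair with the methods in the order model-based, ADP, policy search, and it sets the unspecified constant C to 1. Both are assumptions, and the function name `error_scalings` is hypothetical.

```python
import math

def error_scalings(d, p, N, T, C=1.0):
    """Evaluate the slide's error scalings after N * T steps.

    C is an unspecified constant on the slide; C = 1 is an assumption.
    """
    n = N * T
    return {
        "model-based": C * math.sqrt((d + p) / n),
        "ADP": C * (d + p) / math.sqrt(n),
        "policy search": C * math.sqrt(d * p / n),
    }

# Illustrative dimensions only.
rates = error_scalings(d=4, p=2, N=10, T=100)
```

Under these assumptions the ADP expression exceeds the model-based one by a factor of sqrt(d + p), so the gap widens as the state and input dimensions grow.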

$$
\begin{aligned}
\text{minimize} \quad & \mathbb{E}\left[\sum_{t=1}^{T} x_t^* Q x_t + u_t^* R u_t\right] \\
\text{s.t.} \quad & x_{t+1} = A x_t + B u_t + e_t \\
& u_t = K x_t
\end{aligned}
$$
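For a fixed static gain K, the objective above can be estimated by simulating the closed loop and summing the stage costs. The sketch assumes Gaussian noise and x_0 = 0, neither of which the slide states. The function name `rollout_cost` and the scalar example are illustrative.

```python
import numpy as np

def rollout_cost(A, B, Q, R, K, T=200, sigma_e=1.0, seed=0):
    """Sum of x_t' Q x_t + u_t' R u_t over one rollout under u_t = K x_t.

    Assumptions (not from the slides): x_0 = 0 and e_t ~ N(0, sigma_e^2 I).
    """
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    x = np.zeros(d)
    total = 0.0
    for _ in range(T):
        u = K @ x
        total += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u + sigma_e * rng.normal(size=d)
    return total

# Hypothetical scalar system that is slightly unstable in open loop.
A = np.array([[1.01]])
B = np.array([[1.0]])
Q = np.eye(1)
R = np.eye(1)

cost_feedback = rollout_cost(A, B, Q, R, K=np.array([[-0.5]]))
cost_open_loop = rollout_cost(A, B, Q, R, K=np.zeros((1, 1)))
```

Both calls use the same seed, so the two gains are compared on the same noise sequence. Dividing the returned sum by T gives the time-averaged form of the cost.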
slide-15
SLIDE 15

$x \in \mathbb{R}^d$

$u \in \mathbb{R}^p$

[Tu, R. 2018]

ADP vs. model-based

\[
\begin{aligned}
\text{minimize} \quad & \lim_{T\to\infty} \mathbb{E}\left[ \frac{1}{T} \sum_{t=1}^{T} x_t^* Q x_t + u_t^* R u_t \right] \\
\text{s.t.} \quad & x_{t+1} = A x_t + B u_t + e_t, \quad u_t = K x_t
\end{aligned}
\]

Fix a state-feedback gain $K$. How do we estimate the value function $V(x) = x^* P_\star x$?
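A short derivation may help show why the value function is quadratic and which matrix $P_\star$ is. This is a sketch under an added assumption the slide does not state: the noise $e_t$ is zero-mean with covariance $\sigma^2 I$ and independent of the state. The shorthand $L$, $M$, and $\lambda$ is introduced here for illustration only, and $A + BK$ is assumed stable.

```latex
% Shorthand (not from the slides): L = A + BK, M = Q + K^* R K, \lambda = average cost.
\begin{align*}
V(x) + \lambda &= x^* M x + \mathbb{E}\,[\,V(Lx + e)\,] \\
x^* P_\star x + \lambda &= x^* M x + x^* L^* P_\star L\, x + \sigma^2 \operatorname{tr}(P_\star) \\
\Longrightarrow \quad P_\star &= L^* P_\star L + M, \qquad \lambda = \sigma^2 \operatorname{tr}(P_\star)
\end{align*}
```

Matching the quadratic terms gives a discrete Lyapunov equation for $P_\star$, and matching the constants gives the average cost.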

Use ADP (LSTD): $\hat{P}_{\mathrm{lstd}}$

Estimate the closed-loop model, then solve the discrete Lyapunov equation (DLE): $\hat{P}_{\mathrm{plug}}$
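The plug-in estimate can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it assumes real matrices (so $^*$ is a transpose), unit-covariance Gaussian noise, a single trajectory started at zero, and a hypothetical 2-state instance. The names `solve_dlyap`, `rollout`, and `plugin_estimate` are made up for this sketch. The closed-loop matrix $L = A + BK$ is fit by least squares, and the Lyapunov equation $P = L^\top P L + Q + K^\top R K$ is then solved with the estimate in place of $L$.

```python
import numpy as np


def solve_dlyap(L, M):
    """Solve P = L^T P L + M by vectorization (fine for small state dimension d)."""
    d = L.shape[0]
    lhs = np.eye(d * d) - np.kron(L.T, L.T)
    P = np.linalg.solve(lhs, M.reshape(-1)).reshape(d, d)
    return 0.5 * (P + P.T)  # symmetrize against round-off


def rollout(L, T, rng):
    """Simulate x_{t+1} = L x_t + e_t with e_t ~ N(0, I), starting from x_0 = 0."""
    d = L.shape[0]
    X = np.zeros((T + 1, d))
    for t in range(T):
        X[t + 1] = L @ X[t] + rng.standard_normal(d)
    return X


def plugin_estimate(X, M):
    """Least-squares fit of the closed-loop matrix, then solve the DLE with it."""
    # Rows satisfy X[t+1] ~= X[t] @ L^T, so lstsq returns an estimate of L^T.
    L_hat_T, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)
    L_hat = L_hat_T.T
    return solve_dlyap(L_hat, M), L_hat


# Hypothetical 2-state, 1-input instance chosen only for illustration.
A = np.array([[0.8, 0.2], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
K = np.array([[-0.1, -0.5]])
Q, R = np.eye(2), np.eye(1)

L = A + B @ K          # closed-loop matrix
M = Q + K.T @ R @ K    # closed-loop stage cost x^T M x
P_star = solve_dlyap(L, M)

X = rollout(L, T=20000, rng=np.random.default_rng(0))
P_plug, L_hat = plugin_estimate(X, M)
rel_err = np.linalg.norm(P_plug - P_star) / np.linalg.norm(P_star)
print(f"relative Frobenius error of P_plug: {rel_err:.4f}")
```

The Kronecker-product solve costs on the order of $d^6$ operations, so it only suits small examples. It also assumes the fitted `L_hat` is stable, which holds here but is not checked.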

\[
\lim_{T\to\infty} T\, \mathbb{E}\left\| \hat{P}_{\mathrm{plug}} - P_\star \right\|_F^2 \;\le\; \lim_{T\to\infty} T\, \mathbb{E}\left\| \hat{P}_{\mathrm{lstd}} - P_\star \right\|_F^2
\]

Thm:

\[
\lim_{T\to\infty} T\, \mathbb{E}\left\| \hat{P}_{\mathrm{plug}} - P_\star \right\|_F^2 \le O(d^2)
\]

There exist instances where

\[
\lim_{T\to\infty} T\, \mathbb{E}\left\| \hat{P}_{\mathrm{lstd}} - P_\star \right\|_F^2 \ge \Omega(d^3)
\]

and no algorithm can do better than $O(d^2)$.
slide-16
SLIDE 16

Direct policy search vs. model-based

$x \in \mathbb{R}^d$

$u \in \mathbb{R}^p$

[Tu, R. 2018]

\[
\begin{aligned}
\text{minimize} \quad & \mathbb{E}\left[ \sum_{t=1}^{T} x_t^* Q x_t + u_t^* R u_t \right] \\
\text{s.t.} \quad & x_{t+1} = A x_t + B u_t + e_t, \quad u_t = K x_t
\end{aligned}
\]

Find a state-feedback gain $K$. How does its cost compare to the optimal $J_\star$?

There exist instances where $d > p$ and

\[
\liminf_{N\to\infty} N\, \mathbb{E}\left[ J(\hat{K}_{\mathrm{pg}})/J_\star - 1 \right] \ge \Omega\!\left( T^2 \left( p^4/d + d^2 p \right) \right)
\]

(with a sophisticated uncomputable baseline), and no algorithm can do better than $O(p^2/d)$.

Estimate state-space model, solve DARE:

$$\lim_{N\to\infty} N \cdot \mathbb{E}\left[ \frac{J(\hat{K}_{\mathrm{plug}})}{J_\star} - 1 \right] = O(p)$$

Run policy gradient:

$$\liminf_{N\to\infty} N \cdot \mathbb{E}\left[ \frac{J(\hat{K}_{\mathrm{pg}})}{J_\star} - 1 \right] \ge \Omega(p^3/d + p^2)$$

slide-17
SLIDE 17

“Simplest” Example: LQR

$$\text{minimize} \quad \sum_{t=0}^{T} (x_t)_1^2 + r u_t^2$$

$$\text{subject to} \quad x_{t+1} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} x_t + \begin{bmatrix} 0 \\ 1/m \end{bmatrix} u_t, \qquad x_t = \begin{bmatrix} z_t \\ v_t \end{bmatrix}$$

[Plot: nominal control with 10 samples; x-axis: samples]

argmin.net/code/lqr_policy_comparisons.ipynb
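
As a rough illustration of the double-integrator example, the sketch below computes an LQR gain by iterating the Riccati recursion to a fixed point. It is not taken from the linked notebook. The values `m = 1` and `r = 1`, the infinite-horizon formulation (the slide uses a finite horizon T), and the zero entries of the system matrices are all assumptions made for this example.

```python
import numpy as np

def solve_dare_by_iteration(A, B, Q, R, tol=1e-10, max_iter=10_000):
    """Iterate the discrete Riccati recursion until it reaches a fixed point P."""
    P = Q.copy()
    for _ in range(max_iter):
        # G = (R + B'PB)^{-1} B'PA
        G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ A - A.T @ P @ B @ G
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    raise RuntimeError("Riccati iteration did not converge")

def lqr_gain(A, B, Q, R):
    """Return (K, P) where u_t = K x_t is the infinite-horizon LQR policy."""
    P = solve_dare_by_iteration(A, B, Q, R)
    K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

m, r = 1.0, 1.0                          # assumed mass and control weight
A = np.array([[1.0, 1.0], [0.0, 1.0]])   # position/velocity dynamics
B = np.array([[0.0], [1.0 / m]])
Q = np.array([[1.0, 0.0], [0.0, 0.0]])   # penalize position (x_t)_1 only
R = np.array([[r]])

K, P = lqr_gain(A, B, Q, R)
spectral_radius = np.max(np.abs(np.linalg.eigvals(A + B @ K)))
print("K =", K, "closed-loop spectral radius =", spectral_radius)
```

A closed-loop spectral radius below 1 indicates that the computed gain stabilizes the system.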

slide-18
SLIDE 18

Extraordinary Claims Require Extraordinary Evidence*

“How can we dismiss an entire field which claims such success?”

* only if your prior is correct

“Reinforcement learning results are tricky to reproduce: performance is very noisy, algorithms have many moving parts which allow for subtle bugs, and many papers don’t report all the required tricks.”

“RL algorithms are challenging to implement correctly; good results typically only come after fixing many seemingly-trivial bugs.”

[Figure: Average Return vs. Timesteps (×10^6), HalfCheetah-v1 (TRPO, Different Random Seeds). Two curves, each a random average of 5 runs.]

[Figure: Average Return vs. Timesteps (×10^6), HalfCheetah-v1 (TRPO, Codebase Comparison). Curves: Schulman 2015, Schulman 2017, Duan 2016.]

blog.openai.com/openai-baselines-dqn/

There has to be a better way!

arxiv:1709.06560

slide-19
SLIDE 19

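Coarse-ID control

[Block diagram: estimated plant Ĝ in feedback with controller K and model uncertainty Δ; signals u, y, w, v]

  • Coarse-grained model is trivial to fit
  • High-dimensional statistics bounds the error
  • Design robust control for the feedback loop

Robust certainty equivalence.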

slide-20
SLIDE 20

Coarse-ID Control for LQR

[Dean, Mania, Matni, R., Tu, 2017]

$$\text{minimize} \quad \lim_{T\to\infty} \mathbb{E}\left[ \frac{1}{T} \sum_{t=1}^{T} x_t^* Q x_t + u_t^* R u_t \right] \quad \text{s.t.} \quad x_{t+1} = A x_t + B u_t + e_t$$

where $e_t$ is Gaussian noise.

Run an experiment for T steps with random input. Then

$$\underset{(A,B)}{\text{minimize}} \quad \sum_{i=1}^{T} \| x_{i+1} - A x_i - B u_i \|^2$$

How many samples are needed to estimate (A,B)? (A stable) [Mania, Jordan, R., Simchowitz, Tu, 2018]

If

$$T \ge \tilde{O}\left( \frac{\sigma^2 (d + p)}{\epsilon^2} \right)$$

then $\|A - \hat{A}\| \le \epsilon$ and $\|B - \hat{B}\| \le \epsilon$ w.h.p.
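
The estimation step on this slide (excite the system with random inputs, then regress $x_{t+1}$ on $(x_t, u_t)$) can be sketched as follows. The system matrices, noise level, and trajectory length are hypothetical values chosen for illustration, and the helper names `simulate` and `estimate_ab` are not from the source.

```python
import numpy as np

def simulate(A, B, T, sigma_w, rng):
    """Roll out x_{t+1} = A x_t + B u_t + e_t with u_t ~ N(0, I), starting at x_0 = 0."""
    d, p = B.shape
    xs = np.zeros((T + 1, d))
    us = rng.standard_normal((T, p))
    for t in range(T):
        noise = sigma_w * rng.standard_normal(d)
        xs[t + 1] = A @ xs[t] + B @ us[t] + noise
    return xs, us

def estimate_ab(xs, us):
    """Least squares: minimize sum_t ||x_{t+1} - A x_t - B u_t||^2 over (A, B)."""
    d = xs.shape[1]
    Z = np.hstack([xs[:-1], us])   # regressors [x_t, u_t], shape (T, d + p)
    Y = xs[1:]                     # targets x_{t+1}, shape (T, d)
    # Y ≈ Z @ Theta with Theta = [A B]^T, shape (d + p, d)
    Theta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return Theta[:d].T, Theta[d:].T

rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])   # hypothetical stable system
B_true = np.array([[0.0], [1.0]])
xs, us = simulate(A_true, B_true, T=2000, sigma_w=0.1, rng=rng)
A_hat, B_hat = estimate_ab(xs, us)
err_A = np.linalg.norm(A_hat - A_true, 2)     # operator-norm errors
err_B = np.linalg.norm(B_hat - B_true, 2)
print("||A - A_hat|| =", err_A, " ||B - B_hat|| =", err_B)
```

Rerunning with a larger `T` should shrink both errors, in line with the $1/\epsilon^2$ dependence in the bound above.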

slide-21
SLIDE 21

Coarse-ID Control for LQR

[Dean, Mania, Matni, R., Tu 2017]

$$\underset{u}{\text{minimize}} \;\; \sup_{\|\Delta_A\|_2 \le \epsilon_A,\ \|\Delta_B\|_2 \le \epsilon_B} \;\; \lim_{T\to\infty} \frac{1}{T} \sum_{t=1}^{T} x_t^* Q x_t + u_t^* R u_t$$

$$\text{s.t.} \quad x_{t+1} = (\hat{A} + \Delta_A) x_t + (\hat{B} + \Delta_B) u_t$$

Robust LQR solution via robust system level synthesis.

Solving an SDP relaxation of this robust control problem yields, w.h.p.,

$$\frac{J(\hat{K}) - J_\star}{J_\star} \le C \sqrt{\frac{\sigma^2 (d + p)}{T}}$$

where $J(\hat{K})$ is the LQR cost of coarse-ID control and $J_\star$ is the optimal LQR cost when (A,B) is known.
This also tells you when your cost is finite!

slide-22
SLIDE 22

Why robust?

Slightly unstable system, system ID tends to think some nodes are stable

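$$x_{t+1} = \begin{bmatrix} 1.01 & 0.01 & 0 \\ 0.01 & 1.01 & 0.01 \\ 0 & 0.01 & 1.01 \end{bmatrix} x_t + \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} u_t + e_t$$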
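
A quick eigenvalue check shows why this system is only slightly unstable. The sketch assumes the state matrix is tridiagonal, with 1.01 on the diagonal, 0.01 on the off-diagonals, and zeros elsewhere.

```python
import numpy as np

# Assumed reading of the slide's state matrix (zeros outside the tridiagonal band).
A = np.array([[1.01, 0.01, 0.00],
              [0.01, 1.01, 0.01],
              [0.00, 0.01, 1.01]])

# A is symmetric, so eigvalsh returns real eigenvalues in ascending order.
eigenvalues = np.linalg.eigvalsh(A)
print(eigenvalues)
```

The eigenvalues are 1.01 - 0.01√2, 1.01, and 1.01 + 0.01√2, so one mode sits just inside the unit circle and two sit just outside it. With margins this small, an estimate with modest error can easily place a mode on the wrong side of 1.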

slide-23
SLIDE 23

Least-squares estimate may yield unstable controller

Robust synthesis yields stable controller

slide-24
SLIDE 24

Model-free performs worse than model-based

slide-25
SLIDE 25

Even LQR is not simple!!!

$$\text{minimize} \quad J := \sum_{t=1}^{\infty} x_t^T Q x_t + u_t^T R u_t \quad \text{s.t.} \quad x_{t+1} = A x_t + B u_t + e_t$$

where $e_t$ is Gaussian noise.

  • Coarse-ID control is the first non-asymptotic bound for LQR in this oracle model.
  • The estimation results disagree with 50 papers on Cosma Shalizi’s blog.
  • The guarantees for least-squares estimation required some heavy machinery and build on results from last year.
  • The SDP relaxation uses brand new techniques in controller parameterization (System Level Synthesis by Matni et al.)
  • Key insight: Robustness makes analysis tractable

Even the simplest RL problems are really hard.

slide-26
SLIDE 26

So far…

  • Model-based methods seem to perform better than model-free ones in theory and practice.
  • RL needs better baselines!
  • Simple algorithms seem to be surprisingly competitive.
  • Analysis of time series is annoyingly hard!

Extensions

  • Poly-time Direct Adaptive Control (NeurIPS 2018)
  • Limitations of policy search (NeurIPS 2018)
  • Safe exploration (submitted to ACC 2019)

slide-27
SLIDE 27

The Linearization Principle

If a machine learning algorithm does crazy things when restricted to linear models, it’s going to do crazy things on complex nonlinear models too.

What happens when we return to nonlinear models?

Simple, static, linear policies outperform deep RL and can be found with simple, standard optimization algorithms. [Mania, Guy, R. 2018]

Model Predictive Control outperforms any direct policy search method. Video from Todorov Lab, 2012
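
To give a flavor of how a static linear policy can be found with a simple derivative-free method, here is a simplified two-point random search sketch. It is not the algorithm of the cited paper. The test system (a double integrator), the horizon, the step sizes, and the best-so-far tracking are all assumptions made for this illustration.

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])   # double integrator (assumed test system)
B = np.array([[0.0], [1.0]])
X0 = np.array([1.0, 0.0])                # start at position 1, velocity 0
HORIZON, R_WEIGHT = 10, 1.0

def rollout_cost(K):
    """Finite-horizon cost of the static linear policy u_t = K x_t."""
    x, cost = X0.copy(), 0.0
    for _ in range(HORIZON):
        u = K @ x
        cost += x[0] ** 2 + R_WEIGHT * float(u @ u)
        x = A @ x + B @ u
    return cost

def random_search(K, iters=200, n_dirs=8, step=0.02, radius=0.05, seed=0):
    """Two-point random search; returns the best policy seen and its cost."""
    rng = np.random.default_rng(seed)
    best_K, best_cost = K.copy(), rollout_cost(K)
    for _ in range(iters):
        deltas = rng.standard_normal((n_dirs,) + K.shape)
        plus = np.array([rollout_cost(K + radius * d) for d in deltas])
        minus = np.array([rollout_cost(K - radius * d) for d in deltas])
        # Normalize by the spread of observed costs so the step is scale-free.
        scale = np.std(np.concatenate([plus, minus])) + 1e-8
        direction = np.tensordot(plus - minus, deltas, axes=1) / n_dirs
        K = K - step * direction / scale  # move against the cost increase
        cost = rollout_cost(K)
        if cost < best_cost:
            best_K, best_cost = K.copy(), cost
    return best_K, best_cost

K0 = np.zeros((1, 2))
initial_cost = rollout_cost(K0)
K_best, final_cost = random_search(K0)
print("initial cost:", initial_cost, " best cost found:", final_cost)
```

The search only uses rollout costs of perturbed policies, so nothing in it depends on the dynamics being linear; the linear system here just keeps the example small.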

slide-28
SLIDE 28

What is ML good for in control?

  • Fundamentally, almost all machine learning successes are in nonparametric prediction (mostly classification).

Perceptual sensors in the loop

Forecasting in MPC

How to incorporate uncertain predictive perception in trustable, scalable, predictable autonomy?

slide-29
SLIDE 29

Collaborators

Joint work with Sarah Dean, Aurelia Guy, Horia Mania, Nikolai Matni, Max Simchowitz, and Stephen Tu.

slide-30
SLIDE 30

References

  • “On the Sample Complexity of the Linear Quadratic Regulator.” S. Dean, H. Mania, N. Matni, B. Recht, and S. Tu. arXiv:1710.01688
  • “Least-squares Temporal Differencing for the Linear Quadratic Regulator.” S. Tu and B. Recht. In ICML 2018. arXiv:1712.08642
  • “Learning without Mixing.” H. Mania, M. I. Jordan, B. Recht, M. Simchowitz, and S. Tu. In COLT 2018. arXiv:1802.08334
  • “Simple random search provides a competitive approach to reinforcement learning.” H. Mania, A. Guy, and B. Recht. In NeurIPS 2018. arXiv:1803.07055
  • “Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator.” S. Dean, H. Mania, N. Matni, B. Recht, and S. Tu. In NeurIPS 2018. arXiv:1805.09388
  • “The Gap Between Model-Based and Model-Free Methods on the Linear Quadratic Regulator: An Asymptotic Viewpoint.” S. Tu and B. Recht. arXiv:1812.03565
  • “A Tour of Reinforcement Learning: The View from Continuous Control.” B. Recht. arXiv:1806.09460