overview of robot decision making

Overview of Robot Decision Making. Prof. Yuke Zhu, Fall 2020, CS391R: Robot Learning



  1. Overview of Robot Decision Making Prof. Yuke Zhu Fall 2020 CS391R: Robot Learning (Fall 2020) 1

  2. Today’s Agenda
     ● What is Robot Decision Making?
     ● Mathematical Framework of Sequential Decision Making
     ● Learning for Decision Making
       ○ reinforcement learning (model-free vs. model-based)
       ○ imitation learning (behavior cloning, DAgger, IRL, and adversarial learning)
     ● Research Frontiers
       ○ compositionality, learning to learn, …

  3. Robot learning aims to close the perception-action loop. [Figure: perceive/act loop examples from Sa et al. IROS 2014, Levine et al. JMLR 2016, and Bohg et al. ICRA 2018]

  4. What is Robot Decision Making? Choosing the action a robot should perform in the physical world…
     ● Assistive Robots (Companions)
     ● Outer Space (Explorers)
     ● Autonomous Driving (Transporters)

  5. What is Robot Decision Making? Choosing the action a robot should perform in the physical world…
     • Behaviors can’t be easily programmed
     • Imperfect sensing and actuation
     • Safety and robustness under uncertainty
     [Source: Boston Dynamics]

  6. Robot Decision Making vs. Playing Games. Robot decision making is embodied, active, and environmentally situated. [Source: Boston Dynamics] [Source: DeepMind’s AlphaGo]

  7. Before We Dive In…
     ● This lecture is intended to provide a high-level, bird's-eye view of (robot) decision making.
     ● The goal is not to go through all technical details:
       ○ We will revisit them through paper reading in the following weeks.
       ○ Study the parts that you are less familiar with using online resources.
     ● Take related courses and read textbooks to learn this subject in depth (see the last slide).

  8. Mathematical Framework: Markov Decision Processes
     A Markov Decision Process is defined by a tuple $\mathcal{M} = \langle \mathcal{S}, \mathcal{A}, \mathcal{P}, \mathcal{R}, \gamma \rangle$
     ● $\mathcal{S}$: state space ($s_t \in \mathcal{S}$)
     ● $\mathcal{A}$: action space ($a_t \in \mathcal{A}$)
     ● $\mathcal{P}$: transition probability, $\mathcal{P}^{a}_{ss'} = \Pr[s_{t+1} = s' \mid s_t = s, a_t = a]$
     ● $\mathcal{R}$: reward function, $r(s, a) = \mathbb{E}[r_{t+1} \mid s_t = s, a_t = a]$
     ● $\gamma$: a discount factor, $\gamma \in [0, 1]$
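The tuple above can be written down directly as a small data structure. What follows is only a minimal sketch, not code from the lecture: the class name `TabularMDP`, the helper `discounted_return`, the nested-list layout, and the two-state example numbers are all assumptions made for illustration. It assumes finite, integer-indexed state and action spaces.

```python
import random
from dataclasses import dataclass


@dataclass
class TabularMDP:
    """Finite MDP M = <S, A, P, R, gamma> (hypothetical container, for illustration)."""
    n_states: int    # |S|
    n_actions: int   # |A|
    P: list          # P[a][s][s2] = Pr[s_{t+1} = s2 | s_t = s, a_t = a]
    R: list          # R[s][a] = E[r_{t+1} | s_t = s, a_t = a]
    gamma: float     # discount factor in [0, 1]

    def __post_init__(self):
        if not 0.0 <= self.gamma <= 1.0:
            raise ValueError("gamma must lie in [0, 1]")
        # Each P[a][s] must be a probability distribution over next states.
        for a in range(self.n_actions):
            for s in range(self.n_states):
                if abs(sum(self.P[a][s]) - 1.0) > 1e-9:
                    raise ValueError(f"P[{a}][{s}] does not sum to 1")

    def step(self, s, a, rng):
        """Sample s_{t+1} from P and return it with the expected reward r(s, a)."""
        s_next = rng.choices(range(self.n_states), weights=self.P[a][s])[0]
        return s_next, self.R[s][a]


def discounted_return(rewards, gamma):
    """Compute sum_k gamma^k * rewards[k] for a finite reward sequence."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Made-up two-state, two-action MDP; the numbers carry no meaning.
mdp = TabularMDP(
    n_states=2,
    n_actions=2,
    P=[
        [[0.9, 0.1], [0.0, 1.0]],  # transitions under action 0
        [[0.2, 0.8], [0.0, 1.0]],  # transitions under action 1
    ],
    R=[[0.0, 1.0], [0.0, 0.0]],
    gamma=0.5,
)

rng = random.Random(0)
s, rewards = 0, []
for _ in range(5):
    a = rng.randrange(mdp.n_actions)  # uniformly random policy
    s, r = mdp.step(s, a, rng)
    rewards.append(r)

print(discounted_return(rewards, mdp.gamma))
```

The discount factor enters only through `discounted_return`: with γ close to 0 the agent effectively values immediate reward only, while γ close to 1 weights distant rewards almost as much as near ones.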
