
Learning Latent Dynamics for Planning from Pixels, by Danijar Hafner et al. - PowerPoint PPT Presentation



  1. Learning Latent Dynamics for Planning from Pixels. Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson. @danijarh, danijar.com/planet

  2. Planning with Learned Models. Watter et al., 2015; Banijamali et al., 2017; Zhang et al., 2017. Agrawal et al., 2016; Finn & Levine, 2016; Ebert et al., 2018.

  3. Visual Control Tasks: partially observable, many joints, sparse reward, contacts, balance. Some model-free methods can solve these tasks but need up to 100,000 episodes.

  5. We introduce PlaNet: (1) Recipe for scalable model-based reinforcement learning. (2) Efficient planning in latent space with large batch size. (3) Reaches top performance using 200x fewer episodes.

  6. Latent Dynamics Model: encode images

  7. Latent Dynamics Model: encode images, predict states

  8. Latent Dynamics Model: encode images, predict states, decode images

  9. Latent Dynamics Model: encode images, predict states, decode images, decode rewards

  10. Recurrent State Space Model. [Figure: three panels titled Recurrent Neural Network, State Space Model, and Recurrent State Space Model, with a legend distinguishing deterministic and stochastic variables; nodes are labeled h_1 to h_3, s_1 to s_3, and z_1 to z_3.]

  11. Unguided Video Predictions by Single Agent: 5 frames context and 45 frames predicted

  13. Planning in Latent Space

  19. Comparison to Model-Free Agents. Training time: 1 day on a single GPU.

  20. Enabling More Model-Based RL Research: explore dynamics without supervision; distill the planner to save computation; value function to extend planning horizon.

  21. Learning Latent Dynamics for Planning from Pixels. Website with code, videos, blog post, animated paper: danijar.com/planet
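
The deck names its ingredients (a state with deterministic and stochastic parts on slide 10, planning in latent space on slide 13) without showing them as code. The sketch below is one way they might fit together, under loud assumptions: the transition and reward functions use made-up scalar weights in place of learned networks, and the planner is the cross-entropy method (CEM), which the slides do not name but which the PlaNet paper uses. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def transition(h, s, action, rng):
    """One toy recurrent state-space step (weights are made up, not learned)."""
    # Deterministic path: carries information forward like an RNN state.
    h_next = math.tanh(0.9 * h + 0.5 * s + action)
    # Stochastic path: sampled from a Gaussian conditioned on h_next.
    s_next = rng.gauss(0.8 * h_next, 0.05)
    return h_next, s_next


def reward(h, s):
    """Toy reward decoder: largest when the stochastic state is near 0.5."""
    return -(s - 0.5) ** 2


def rollout_return(h, s, actions, rng):
    """Sum of predicted rewards along an imagined latent trajectory."""
    total = 0.0
    for a in actions:
        h, s = transition(h, s, a, rng)
        total += reward(h, s)
    return total


def cem_plan(h, s, horizon=5, candidates=200, elites=20, iterations=5, seed=0):
    """Cross-entropy method over action sequences, evaluated only in latent space."""
    rng = random.Random(seed)
    mean = [0.0] * horizon
    std = [1.0] * horizon
    for _ in range(iterations):
        # Sample a batch of candidate action sequences from the current belief.
        plans = [
            [rng.gauss(m, sd) for m, sd in zip(mean, std)]
            for _ in range(candidates)
        ]
        # Score each candidate by imagined return; no images are decoded.
        plans.sort(key=lambda p: rollout_return(h, s, p, rng), reverse=True)
        best = plans[:elites]
        # Refit the per-step Gaussian belief to the elite sequences.
        for t in range(horizon):
            column = [p[t] for p in best]
            mean[t] = sum(column) / elites
            std[t] = math.sqrt(sum((a - mean[t]) ** 2 for a in column) / elites)
    return mean
```

Calling `cem_plan(0.0, 0.0)` returns a list of five mean actions. In a receding-horizon setup an agent would typically execute only the first action and replan from the next state; a batch of several hundred candidates per iteration is cheap here because each rollout touches only the small latent state.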
