Dream to Control: Learning Behaviors by Latent Imagination

Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi (Google Brain, DeepMind). @danijarh, danijar.com/dreamer. We introduce Dreamer: scalable reinforcement learning from pixels.


  1. Dream to Control: Learning Behaviors by Latent Imagination. Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi (Google Brain, DeepMind). @danijarh, danijar.com/dreamer

  2-4. We introduce Dreamer: (1) scalable reinforcement learning from pixels using a world model; (2) learn actor and value in imagination for long-sighted behaviors; (3) efficiently update the actor by backpropagation through imagined sequences.

  5-7. Dreamer Agent Overview

  8-12. World Model with Latent States: given observations o_1, o_2, o_3 and actions a_1, a_2, the model (1) encodes images, (2) computes states, (3) predicts rewards r̂_1, r̂_2, r̂_3, and (4) predicts images ô_1, ô_2, ô_3.

  13-14. Long-Term Video Prediction

  15-22. Learning Behaviors by Latent Imagination: starting from a single observation o_1, the agent (1) encodes the image, (2) imagines ahead with actions a_1, a_2, (3) predicts rewards r̂_2, r̂_3, and (4) predicts values v̂_2, v̂_3.

  23. Behaviors Learned by Dreamer

  24-28. Large-Scale Evaluation for Control from Pixels. Model-based (28 hours of interaction): Dreamer (823), PlaNet (332). Model-free (23 days of interaction): D4PG (786), A3C (243).

  29. Introducing Dreamer: Scalable Reinforcement Learning Using World Models

  30. Dream to Control: Learning Behaviors by Latent Imagination. Blog post, code, videos, paper: danijar.com/dreamer
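
The "World Model with Latent States" slides list four steps: encode images, compute states, predict rewards, predict images. The following is a minimal sketch of that data flow, not the authors' implementation. All sizes, weight matrices, and function names are hypothetical, and the weights are random and untrained; in a real agent each function would be a learned network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
OBS_DIM, EMBED_DIM, STATE_DIM, ACTION_DIM = 12, 8, 4, 2

# Random, untrained weights standing in for learned networks (assumption).
W_enc = rng.normal(scale=0.1, size=(EMBED_DIM, OBS_DIM))
W_state = rng.normal(
    scale=0.1, size=(STATE_DIM, STATE_DIM + ACTION_DIM + EMBED_DIM)
)
w_reward = rng.normal(scale=0.1, size=STATE_DIM)
W_dec = rng.normal(scale=0.1, size=(OBS_DIM, STATE_DIM))


def encode(obs):
    """Encode a flattened image into an embedding."""
    return np.tanh(W_enc @ obs)


def compute_state(prev_state, prev_action, embed):
    """Compute the next latent state from the previous state, the
    previous action, and the current image embedding."""
    return np.tanh(W_state @ np.concatenate([prev_state, prev_action, embed]))


def predict_reward(state):
    """Predict a scalar reward from a latent state."""
    return float(w_reward @ state)


def predict_image(state):
    """Predict (reconstruct) a flattened image from a latent state."""
    return W_dec @ state


def run_world_model(observations, actions):
    """Step through a sequence o_1..o_T with actions a_1..a_{T-1}."""
    state = np.zeros(STATE_DIM)
    prev_action = np.zeros(ACTION_DIM)  # no action precedes o_1
    states, rewards, images = [], [], []
    for t, obs in enumerate(observations):
        state = compute_state(state, prev_action, encode(obs))
        states.append(state)
        rewards.append(predict_reward(state))
        images.append(predict_image(state))
        if t < len(actions):
            prev_action = actions[t]
    return states, rewards, images


observations = rng.normal(size=(3, OBS_DIM))  # o_1, o_2, o_3
actions = rng.normal(size=(2, ACTION_DIM))    # a_1, a_2
states, rewards, images = run_world_model(observations, actions)
print(len(states), len(rewards), images[0].shape)
```

The sketch is deterministic for brevity; it only illustrates which quantities feed into which, matching the three observations and two actions drawn on the slide.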

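The "Learning Behaviors by Latent Imagination" slides, together with the claim that the actor is updated by backpropagation through imagined sequences, can be illustrated on a toy problem. The sketch below assumes a scalar linear latent model, a linear policy u = k * s, a quadratic reward, and a fixed quadratic value estimate; none of these choices come from the slides, and every name and constant is hypothetical. It rolls the policy forward in imagination, sums discounted predicted rewards plus a bootstrapped value at the horizon, and computes the gradient with respect to the policy parameter by hand-written reverse-mode accumulation.

```python
def imagined_objective(k, s0, horizon=5, a=0.9, b=0.5, gamma=0.95, c=2.0):
    """Return (objective, d objective / d k) for the policy u = k * s.

    Toy assumptions (not from the slides):
      dynamics  s_{t+1} = a * s_t + b * u_t
      reward    r(s)    = -s**2
      value     v(s)    = -c * s**2   (stand-in for a learned value model)
    """
    # Forward pass: imagine ahead from the starting latent state.
    states = [s0]
    for _ in range(horizon):
        s = states[-1]
        u = k * s
        states.append(a * s + b * u)

    # Discounted imagined rewards plus a bootstrapped value at the horizon.
    total = 0.0
    for t in range(1, horizon + 1):
        total += gamma ** (t - 1) * (-states[t] ** 2)
    total += gamma ** horizon * (-c * states[horizon] ** 2)

    # Backward pass: propagate gradients back through the imagined sequence.
    grad_k = 0.0
    grad_s = 0.0  # gradient reaching states[t] from later time steps
    for t in range(horizon, 0, -1):
        grad_s += gamma ** (t - 1) * (-2.0 * states[t])  # reward term
        if t == horizon:
            grad_s += gamma ** horizon * (-2.0 * c * states[t])  # value term
        # states[t] = (a + b * k) * states[t - 1]
        grad_k += grad_s * b * states[t - 1]
        grad_s *= a + b * k
    return total, grad_k


k = -0.3
objective, grad = imagined_objective(k, s0=1.0)

# One gradient-ascent step on the policy parameter (learning rate assumed).
k_new = k + 0.01 * grad
new_objective, _ = imagined_objective(k_new, s0=1.0)
print(objective, new_objective)
```

Because the imagined rollout is differentiable end to end, the gradient is exact for this toy model and can be checked against a finite-difference estimate. An automatic differentiation library would replace the manual backward pass in practice.
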