
Making Deep Q-learning Approaches Robust to Time Discretization



  1. Making Deep Q-learning Approaches Robust to Time Discretization. Corentin Tallec, Léonard Blier, Yann Ollivier. Université Paris-Sud, Facebook AI Research. June 4, 2019. C. Tallec et al. (UPSUD, FAIR), Framerate robust DQ Learning, June 4, 2019, 1 / 4

  2. Reinforcement Learning in Near Continuous Time. What happens when standard RL methods are used with a small time discretization or a high framerate? Usual RL algorithm + high framerate → failure. Scalability is limited by the algorithms: better hardware, sensors, and actuators → worse performance. This contributes to the lack of robustness of deep RL: new environment → different framerate → new hyperparameters. [Figure: low FPS vs. high FPS]

  3. Why is near continuous Q-learning failing? There is no continuous-time Q-learning: as $\delta t \to 0$, $Q^\pi(s, a) \to V^\pi(s)$. $Q^\pi$ does not depend on actions when $\delta t \to 0$ ⇒ $Q^\pi$ cannot be used to select actions! There is no continuous-time $\varepsilon$-greedy exploration either. [Figure: $\varepsilon$-greedy exploration with $\varepsilon = 1$ on the pendulum, $\delta t = 0.05$ vs. $\delta t = 0.0001$]

  4. Can we solve this? YES. To know how: Poster #32 this evening. [Figure: low FPS vs. high FPS]
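
The two failure modes named on slide 3 can be checked numerically on a toy problem. The sketch below is not taken from the slides: it assumes a hypothetical 1-D system with dynamics s' = s + a·δt, a reward of -s²·δt per step, a per-step discount of γ^δt, and an evaluated policy that always plays a = 0 (so the value function has a closed form). Under those assumptions, `action_gap` measures how much Q depends on the first action, and `exploration_spread` measures how far a purely random policy (ε = 1) travels in a fixed amount of physical time. Both function names are illustrative.

```python
import math
import random


def action_gap(s, dt, gamma=0.9):
    """Return Q(s, +1) - Q(s, -1) on a toy 1-D system (illustrative assumption).

    Dynamics: s' = s + a * dt. Reward per step: -s**2 * dt. Discount per
    step: gamma ** dt. After the first action the policy always plays a = 0,
    so the state stays fixed and V(s) = -s**2 * dt / (1 - gamma ** dt).
    """
    g = gamma ** dt

    def v(state):
        # Closed-form value of the "do nothing" policy.
        return -state ** 2 * dt / (1.0 - g)

    def q(state, a):
        # One step with action a, then follow the "do nothing" policy.
        return -state ** 2 * dt + g * v(state + a * dt)

    return q(s, 1.0) - q(s, -1.0)


def exploration_spread(dt, horizon=1.0, episodes=500, seed=0):
    """Std of the final position when each step picks a = -1 or +1 at random.

    This mimics epsilon-greedy with epsilon = 1 over a fixed physical time
    `horizon`; the velocity is set directly by the action (an assumption,
    simpler than the pendulum shown on the slide).
    """
    rng = random.Random(seed)
    steps = int(round(horizon / dt))
    finals = []
    for _ in range(episodes):
        x = 0.0
        for _ in range(steps):
            x += rng.choice((-1.0, 1.0)) * dt
        finals.append(x)
    mean = sum(finals) / episodes
    return math.sqrt(sum((x - mean) ** 2 for x in finals) / episodes)


if __name__ == "__main__":
    for dt in (0.05, 0.001):
        print(dt, action_gap(1.0, dt), exploration_spread(dt))
```

In this toy setting the action gap shrinks roughly in proportion to δt, and the spread of the random policy shrinks roughly like the square root of δt, which is consistent with the slide's claims that Q stops distinguishing actions and that ε-greedy exploration stops exploring as the framerate grows.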
