
Sample-Optimal Parametric Q-Learning Using Linearly Additive Features - PowerPoint PPT Presentation

Sample-Optimal Parametric Q-Learning Using Linearly Additive Features. Lin F. Yang, Mengdi Wang.


  1. Sample-Optimal Parametric Q-Learning Using Linearly Additive Features. Lin F. Yang, Mengdi Wang

  2. A Basic RL Model: Markov Decision Process • States: $s \in S$; Actions: $a \in A$ • Reward: $r(s,a)$ • State transition: $P(\cdot \mid s,a)$ • Policy: random • Effective horizon: $1/(1-\gamma)$ • Optimal policy & value: • $\epsilon$-optimal policy:

  3. Curse of Dimensionality • Optimal sample complexity: $\tilde{\Theta}\big(|S||A| / (\epsilon^2 (1-\gamma)^3)\big)$ • Examples: $|S| = 3^{361}$; $|S| \ge 256^{256 \times 240}$ • Too many states for most cases … • How to optimally reduce dimensions? Exploiting structures!

  4. Parametric Q-Learning On Feature-Based MDP • Transition is decomposable: $P \in \mathbb{R}^{(S \times A) \times S}$ factors into $\Phi$ (known) and $\Psi$ (unknown)

  5. Parametric Q-Learning On Feature-Based MDP • Transition is decomposable

  6. Parametric Q-Learning On Feature-Based MDP • [Figure: numeric example with values 0.2, 0.11, 0.3, 0.5, 0.01]

  7. A Simple Regression Based Algorithm • Generative model: we are able to sample from any $(s,a)$ • Represent the Q-function with a parameter $w \in \mathbb{R}^K$: $Q_w(s,a) := r(s,a) + \gamma\,\phi(s,a)^\top w$, $V_w(s) := \max_{a \in A} Q_w(s,a)$, $\pi_w(s) := \arg\max_{a \in A} Q_w(s,a)$ • Learn $w$ with modified Q-learning • Sample complexity ($K$: feature dimension): $\tilde{O}\big(K / (\epsilon^2 (1-\gamma)^7)\big)$

  8. Sample Optimality? • Anchor condition: [Figure: $P(\cdot \mid s,a)$ shown among the anchor transitions $P(\cdot \mid s_1,a_1), \dots, P(\cdot \mid s_6,a_6)$] • Sample complexity: $\tilde{\Theta}\big(K / (\epsilon^2 (1-\gamma)^3)\big)$ • ArXiv: 1902.04779. Poster: 117
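
The parameterization on slide 7 and the decomposable transition on slide 4 can be illustrated with a small sketch. This is not the authors' algorithm: it is a simplified sample-based fixed-point iteration on a hypothetical toy model, assuming that the transition factors as P(s'|s,a) = sum_k phi_k(s,a) * psi_k(s') and that each anchor pair has a unit feature vector. All names (`PHI`, `PSI`, `REWARD`, `ANCHORS`, `fit_w`, `exact_w`) and all numbers are made up for illustration.

```python
import random

GAMMA = 0.5                      # discount factor (assumed value)
STATES, ACTIONS = range(3), range(2)

# Hypothetical toy model with K = 2 features.
# PSI[k] is a probability distribution over next states (unknown to the learner).
PSI = [[0.9, 0.1, 0.0],
       [0.0, 0.2, 0.8]]
# PHI[s][a] is the known feature vector of (s, a); entries are nonnegative and sum to 1.
PHI = [[[1.0, 0.0], [0.5, 0.5]],
       [[0.3, 0.7], [0.0, 1.0]],
       [[0.2, 0.8], [0.6, 0.4]]]
REWARD = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.5]]
ANCHORS = [(0, 0), (1, 1)]       # assumption: phi(anchor k) is the k-th unit vector


def transition(s, a):
    """P(. | s, a) = sum_k phi_k(s, a) * psi_k(.)"""
    return [sum(PHI[s][a][k] * PSI[k][t] for k in range(len(PSI))) for t in STATES]


def q_value(w, s, a):
    """Q_w(s, a) = r(s, a) + gamma * phi(s, a)^T w"""
    return REWARD[s][a] + GAMMA * sum(f * x for f, x in zip(PHI[s][a], w))


def v_value(w, s):
    """V_w(s) = max_a Q_w(s, a)"""
    return max(q_value(w, s, a) for a in ACTIONS)


def greedy_action(w, s):
    """pi_w(s) = argmax_a Q_w(s, a)"""
    return max(ACTIONS, key=lambda a: q_value(w, s, a))


def sample_next_state(rng, s, a):
    """Generative model: draw s' ~ P(. | s, a)."""
    return rng.choices(list(STATES), weights=transition(s, a))[0]


def fit_w(rng, iterations=20, samples=1000):
    """Set w_k to the sample mean of V_w(s') with s' drawn from anchor k, and repeat."""
    w = [0.0] * len(PSI)
    for _ in range(iterations):
        w = [sum(v_value(w, sample_next_state(rng, s, a)) for _ in range(samples)) / samples
             for (s, a) in ANCHORS]
    return w


def exact_w(iterations=200):
    """Reference: the same iteration using the true psi instead of samples."""
    w = [0.0] * len(PSI)
    for _ in range(iterations):
        w = [sum(p * v_value(w, t) for t, p in enumerate(PSI[k])) for k in range(len(PSI))]
    return w
```

Calling `fit_w(random.Random(0))` returns a two-dimensional parameter vector, and `greedy_action(w, s)` reads off a policy from it. The learner only ever estimates K numbers, regardless of how many states there are, which is the point of replacing |S||A| with K in the sample complexity.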
