Statistics and Samples in Distributional Reinforcement Learning


Mark Rowland, Robert Dadashi, Saurabh Kumar, Rémi Munos, Marc G. Bellemare, Will Dabney. ICML 2019. Google Research Brain team.


SLIDE 1

Google Research Brain team

Statistics and Samples in Distributional Reinforcement Learning

Mark Rowland, Robert Dadashi, Saurabh Kumar, Rémi Munos, Marc G. Bellemare, Will Dabney

ICML 2019

SLIDE 2

Statistics and Samples in Distributional Reinforcement Learning — MARK ROWLAND

Distributional Reinforcement Learning

Distributional RL aims to learn full return distributions.

Defined on this slide (formulas not preserved in the transcript): the return distribution, the distributional Bellman operator, and the distributional Bellman equation.

[Bellemare et al., 2017]
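
For reference, the three objects named on this slide are commonly written as below. This is a sketch in the random-variable notation usually associated with Bellemare et al. (2017); the slide's own formulas may have used different symbols. The names Z^\pi, \eta_\pi, R, \gamma, P, X' and A' are assumptions introduced here.

```latex
% Return from state-action pair (x, a) under policy \pi, and its distribution
Z^\pi(x, a) = \sum_{t=0}^{\infty} \gamma^t R_t, \qquad X_0 = x,\; A_0 = a,
\qquad \eta_\pi(x, a) = \mathrm{Law}\bigl(Z^\pi(x, a)\bigr)

% Distributional Bellman operator, with X' \sim P(\cdot \mid x, a) and A' \sim \pi(\cdot \mid X')
(\mathcal{T}^\pi Z)(x, a) \overset{D}{=} R(x, a) + \gamma \, Z(X', A')

% Distributional Bellman equation: Z^\pi is a fixed point of \mathcal{T}^\pi
Z^\pi(x, a) \overset{D}{=} R(x, a) + \gamma \, Z^\pi(X', A')
```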

SLIDE 3

Distributional Reinforcement Learning

In practice, we often work with parametric approximate distributions.

Non-parametric

SLIDE 4

Distributional Reinforcement Learning

In practice, we often work with parametric approximate distributions.

Non-parametric

Categorical [Bellemare et al., 2017]

SLIDE 5

Distributional Reinforcement Learning

In practice, we often work with parametric approximate distributions.

Non-parametric

Categorical [Bellemare et al., 2017]

Dirac deltas [Dabney et al., 2018]
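
To make the two parametric families on this slide concrete, here is a minimal Python sketch. It assumes the usual reading of the cited work: a categorical distribution keeps fixed atom locations and learns the probabilities, while a mixture of Dirac deltas keeps uniform weights and learns the locations. All function names and numbers are illustrative and do not come from the slides.

```python
def categorical_mean(atoms, probs):
    """Mean of a categorical distribution with fixed atoms and learnable probs."""
    return sum(z * p for z, p in zip(atoms, probs))


def categorical_cdf(atoms, probs, x):
    """P(Z <= x) under the categorical distribution."""
    return sum(p for z, p in zip(atoms, probs) if z <= x)


def dirac_mean(locations):
    """Mean of an equally weighted mixture of Dirac deltas at learnable locations."""
    return sum(locations) / len(locations)


def dirac_cdf(locations, x):
    """P(Z <= x) under the equally weighted Dirac mixture."""
    return sum(1 for z in locations if z <= x) / len(locations)


# Illustrative parameters (assumed, not from the talk).
ATOMS = [0.0, 1.0, 2.0, 3.0]      # fixed support of the categorical distribution
PROBS = [0.1, 0.2, 0.3, 0.4]      # learnable probabilities, summing to 1
LOCATIONS = [0.5, 1.5, 2.5, 3.5]  # learnable Dirac locations, weight 1/4 each

if __name__ == "__main__":
    print(categorical_mean(ATOMS, PROBS), dirac_mean(LOCATIONS))
```

In both cases only a finite parameter vector is stored per state-action pair; the two families differ in which part of the distribution (masses or locations) that vector controls.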

SLIDE 6

Main Contribution: An Alternative Perspective

Distributional RL algorithms learn statistical functionals of the return distribution.

  • Moments, tail probabilities, expectations, etc.

SLIDE 7

Main Contribution: An Alternative Perspective

Distributional RL algorithms learn statistical functionals of the return distribution.

  • Moments, tail probabilities, expectations, etc.

Theory: What properties of return distributions can be learnt through dynamic programming?

Algorithmic: A general framework for approximate learning of statistics.
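
As a small illustration of what "statistical functionals of the return distribution" means, the sketch below estimates a few of the examples named on the slide from a list of sampled returns. It is plain Python with hypothetical helper names, and it is not the learning algorithm from the talk.

```python
def expectation(returns, f=lambda z: z):
    """Empirical expectation of f(Z) over a list of sampled returns."""
    return sum(f(z) for z in returns) / len(returns)


def raw_moment(returns, order):
    """Empirical raw moment E[Z^order]."""
    return expectation(returns, lambda z: z ** order)


def tail_probability(returns, threshold):
    """Empirical tail probability P(Z > threshold)."""
    return sum(1 for z in returns if z > threshold) / len(returns)


# Illustrative sample of returns (assumed values).
RETURNS = [1.0, 2.0, 3.0, 4.0]

if __name__ == "__main__":
    print("mean:", raw_moment(RETURNS, 1))
    print("second moment:", raw_moment(RETURNS, 2))
    print("P(Z > 2.5):", tail_probability(RETURNS, 2.5))
```

Each function maps a distribution (here represented by samples) to a single number, which is the sense in which moments, tail probabilities and expectations are all functionals.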

SLIDE 9

A General Framework for Distributional RL Algorithms

[Diagram: current statistics -> (imputation strategy) -> imputed samples -> Bellman-updated distribution -> Bellman-updated statistics]
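
The stages named in this diagram can be sketched in a few lines of Python. The sketch assumes that the statistics are quantiles at the midpoints (2i+1)/(2K), that the imputation strategy treats each quantile value as one equally weighted sample, and that the Bellman update uses a single observed transition with a known reward and discount. These choices and all function names are assumptions made for illustration; the slides do not specify them.

```python
def impute_samples(statistics):
    """Imputation strategy: treat each quantile value as an equally weighted sample."""
    return list(statistics)


def bellman_update(samples, reward, discount):
    """Push imputed next-state samples through the update z -> reward + discount * z."""
    return [reward + discount * z for z in samples]


def compute_statistics(samples, num_statistics):
    """Quantiles of the empirical distribution at levels (2i+1)/(2K), i = 0..K-1."""
    ordered = sorted(samples)
    n = len(ordered)
    stats = []
    for i in range(num_statistics):
        # Smallest index j with (j + 1) / n >= (2i + 1) / (2K), in integer arithmetic.
        ceil_rank = -(-(2 * i + 1) * n // (2 * num_statistics))
        stats.append(ordered[ceil_rank - 1])
    return stats


def framework_update(next_statistics, reward, discount):
    """Current statistics -> imputed samples -> updated distribution -> updated statistics."""
    samples = impute_samples(next_statistics)
    updated = bellman_update(samples, reward, discount)
    return compute_statistics(updated, len(next_statistics))


if __name__ == "__main__":
    print(framework_update([0.0, 1.0, 2.0, 3.0], reward=1.0, discount=0.5))
```

For other statistics the first and last steps change while the middle step stays the same: the imputation strategy has to produce samples consistent with the stored statistics, and the final step recomputes those statistics on the updated samples.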

SLIDE 15

Application: Expectiles

We apply this framework to learn expectiles of return distributions. New deep RL agent: Expectile Regression DQN (ER-DQN), with improved mean performance on Atari-57 relative to QR-DQN.
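
For readers unfamiliar with expectiles, the sketch below computes the tau-expectile of a list of sampled returns. It uses the standard definition: the tau-expectile minimises an asymmetric squared loss, with weight tau on samples above the candidate value and 1 - tau on the rest, and tau = 0.5 recovers the mean. The fixed-point iteration, the function names and the sample values are assumptions chosen for illustration. This is not the ER-DQN agent.

```python
def expectile_loss(q, samples, tau):
    """Asymmetric squared loss whose minimiser over q is the tau-expectile."""
    total = 0.0
    for z in samples:
        weight = tau if z > q else 1.0 - tau
        total += weight * (z - q) ** 2
    return total / len(samples)


def expectile(samples, tau, iterations=100):
    """tau-expectile via iteratively reweighted averaging, for 0 < tau < 1."""
    q = sum(samples) / len(samples)  # start from the mean (the 0.5-expectile)
    for _ in range(iterations):
        weights = [tau if z > q else 1.0 - tau for z in samples]
        q = sum(w * z for w, z in zip(weights, samples)) / sum(weights)
    return q


# Illustrative sample of returns (assumed values).
SAMPLES = [0.0, 1.0, 2.0, 10.0]

if __name__ == "__main__":
    for tau in (0.1, 0.5, 0.9):
        print(tau, expectile(SAMPLES, tau))
```

Unlike quantiles, expectiles respond to how far samples lie from the candidate value and not only to which side they fall on, so a set of expectiles is a different family of statistics from the quantiles learnt by QR-DQN.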

SLIDE 16

Summary

  • A new perspective on distributional RL
  • Theoretical progress on what it is possible to learn
  • A general framework for distributional RL algorithms

SLIDE 17

THANK YOU

Poster #113