Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models


Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models. Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine. University of California, Berkeley.


SLIDE 1

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine
University of California, Berkeley

SLIDE 2

How Long Does Learning Take?

  • ~50 million frames [Mnih et al. 2015]
  • ~21 million games [Silver et al. 2017]
  • ~800,000 grasp attempts [Levine et al. 2017]

SLIDE 3

Can we speed this up?

SLIDE 4

Model-Based Reinforcement Learning

Loop: Optimize Policy → Execute Policy → Train Dynamics Model → back to Optimize Policy
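The three-step loop on this slide can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the 1-D environment `true_step`, the linear model fitted by `fit_linear_model`, and the one-step greedy `choose_action` are all hypothetical stand-ins chosen so the example runs with the standard library only.

```python
import random

def true_step(state, action):
    """Hypothetical 1-D environment (an assumption for this sketch)."""
    return 0.9 * state + 0.5 * action

def fit_linear_model(data):
    """'Train dynamics model': least-squares fit of next = a*state + b*action."""
    sxx = sum(s * s for s, u, n in data)
    sxu = sum(s * u for s, u, n in data)
    suu = sum(u * u for s, u, n in data)
    sxn = sum(s * n for s, u, n in data)
    sun = sum(u * n for s, u, n in data)
    det = sxx * suu - sxu * sxu
    if abs(det) < 1e-12:
        return 0.0, 0.0  # not enough variation in the data to fit yet
    a = (sxn * suu - sun * sxu) / det
    b = (sun * sxx - sxn * sxu) / det
    return a, b

def choose_action(model, state, candidates):
    """'Optimize policy': pick the action whose predicted next state is nearest 0."""
    a, b = model
    return min(candidates, key=lambda u: abs(a * state + b * u))

def run_mbrl(num_trials=5, steps=20, seed=0):
    rng = random.Random(seed)
    candidates = [x / 10.0 for x in range(-20, 21)]
    data, model, costs = [], (0.0, 0.0), []
    for trial in range(num_trials):
        state, cost = 1.0, 0.0
        for _ in range(steps):
            # 'Execute policy': random actions on the first trial (no model yet),
            # model-based actions afterwards.
            if trial == 0:
                action = rng.choice(candidates)
            else:
                action = choose_action(model, state, candidates)
            nxt = true_step(state, action)
            data.append((state, action, nxt))
            cost += abs(nxt)
            state = nxt
        model = fit_linear_model(data)  # retrain on all data collected so far
        costs.append(cost)
    return model, costs
```

Calling `run_mbrl()` returns the fitted coefficients and the per-trial cost. In this noise-free toy setting a single trial of random data is enough to recover the dynamics, which is the data-efficiency argument in miniature.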

SLIDE 5

Comparative Performance on HalfCheetah

SLIDE 6

Comparative Performance on HalfCheetah

SLIDE 7

Deterministic Neural Nets as Models

SLIDE 8

Deterministic Neural Nets as Models

SLIDE 9

Deterministic Neural Nets as Models

SLIDE 10

Deterministic Neural Nets as Models

SLIDE 11

Deterministic Neural Nets as Models

SLIDE 12

Probabilistic Neural Nets as Models
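The slide gives only the title, so the following is a hedged sketch of one common way to make a dynamics model probabilistic: have it output a Gaussian mean and log-variance for the next state and score it with the Gaussian negative log-likelihood. The function names and the scalar (1-D) state are assumptions made for illustration.

```python
import math
import random

def gaussian_nll(mean, log_var, target):
    """Negative log-likelihood of `target` under N(mean, exp(log_var))."""
    var = math.exp(log_var)
    return 0.5 * (math.log(2.0 * math.pi) + log_var + (target - mean) ** 2 / var)

def sample_next_state(mean, log_var, rng):
    """Draw a next state from the predicted Gaussian."""
    return rng.gauss(mean, math.exp(0.5 * log_var))
```

Unlike a squared-error loss, this loss penalizes a confident wrong prediction more heavily than an uncertain one: `gaussian_nll(0.0, -4.0, 1.0)` is larger than `gaussian_nll(0.0, 0.0, 1.0)`.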

SLIDE 13

Probabilistic Ensembles as Models

SLIDE 14

Probabilistic Ensembles as Models
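As a rough illustration of the ensemble idea, the sketch below fits several small probabilistic models on bootstrap resamples of the same data and reports two separate quantities: disagreement between members (uncertainty from limited data) and the average predicted noise. The linear model `next = a * state`, the helper names, and the use of bootstrap resampling are assumptions for this example; the slide itself does not specify them.

```python
import random

def fit_member(data):
    """Fit next = a * state by least squares; residual variance estimates the noise."""
    sxx = sum(s * s for s, n in data)
    a = sum(s * n for s, n in data) / sxx
    var = sum((n - a * s) ** 2 for s, n in data) / len(data)
    return a, var

def fit_bootstrap_ensemble(data, num_members=5, seed=0):
    """Train each member on its own resample (with replacement) of the data."""
    rng = random.Random(seed)
    members = []
    for _ in range(num_members):
        resample = [rng.choice(data) for _ in data]
        members.append(fit_member(resample))
    return members

def predict(members, state):
    """Return (mean prediction, disagreement between members, average noise variance)."""
    means = [a * state for a, _ in members]
    avg = sum(means) / len(means)
    disagreement = sum((m - avg) ** 2 for m in means) / len(means)
    noise = sum(v for _, v in members) / len(members)
    return avg, disagreement, noise
```

With training states between 0.1 and 2.0, the disagreement returned by `predict` grows as the query state moves away from the data, while the noise estimate stays the same.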

SLIDE 15

Trajectory Sampling for State Propagation

SLIDE 16

Trajectory Sampling for State Propagation

SLIDE 17

Trajectory Sampling for State Propagation

SLIDE 18

Trajectory Sampling for State Propagation

SLIDE 19

Trajectory Sampling for State Propagation

SLIDE 20

Trajectory Sampling for State Propagation

SLIDE 21

Trajectory Sampling for State Propagation

SLIDE 22

Trajectory Sampling for State Propagation

SLIDE 23

Trajectory Sampling for State Propagation

SLIDE 24

Trajectory Sampling for State Propagation

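One possible reading of trajectory sampling is sketched below: a set of particles starts from the current state, each particle is assigned to one ensemble member, and at every step it samples its next state from that member's predicted Gaussian. The member format `(a, b, std)` for a linear-Gaussian model, the fixed particle-to-member assignment, and the function name are assumptions for this example rather than details taken from the slides.

```python
import random

def propagate_particles(members, start_state, actions, num_particles=20, seed=0):
    """Propagate particles through an ensemble of hypothetical linear-Gaussian models.

    Each member is a tuple (a, b, std) meaning next ~ N(a*state + b*action, std**2).
    Returns a list of particle-state lists, one per time step (including step 0).
    """
    rng = random.Random(seed)
    # Assumption: each particle keeps the same ensemble member for the whole rollout.
    assigned = [members[i % len(members)] for i in range(num_particles)]
    states = [start_state] * num_particles
    trajectory = [list(states)]
    for action in actions:
        states = [
            rng.gauss(a * s + b * action, std)
            for (a, b, std), s in zip(assigned, states)
        ]
        trajectory.append(list(states))
    return trajectory
```

The spread of the particles at each step then reflects both the noise within each member and the disagreement between members, and a planner can average a cost over the particles to score a candidate action sequence.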
SLIDE 25

Experimental Results

SLIDE 26

Code: https://github.com/kchua/handful-of-trials
Website: https://sites.google.com/view/drl-in-a-handful-of-trials

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine

  • Data efficient
  • Competitive asymptotic performance
  • Easy to implement

Poster #165