Discovering Options for Exploration by Minimizing Cover Time

Yuu Jinnai, Jee Won Park, David Abel, George Konidaris. Brown University. Poster at Ballroom #117.


SLIDE 1

Discovering Options for Exploration by Minimizing Cover Time

Yuu Jinnai, Jee Won Park, David Abel, George Konidaris Brown University Poster at Ballroom #117

SLIDE 2

Options (Sutton et al. 1999)

[Figure: two panels, "Primitive Actions" and "Using Options", each with a marked goal state]

SLIDE 3

How can options help agents explore?

Explored states

SLIDE 4

Contributions

1. Introduced an objective function for exploration: cover time

SLIDE 5

Contributions

1. Introduced an objective function for exploration: cover time

Cover time: the expected number of steps a random walk takes to visit every state
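The cover-time definition can be made concrete with a small simulation. The sketch below is not from the slides: it assumes the state space is given as an undirected adjacency list, estimates the cover time by averaging over random walks, and compares a 10-state chain with the same chain plus one shortcut between its endpoints. The helper names (`cover_time_once`, `estimate_cover_time`, `path_graph`) are hypothetical.

```python
import random

def cover_time_once(adjacency, start, rng):
    """Count the steps of one random walk until every state has been visited."""
    visited = {start}
    state = start
    steps = 0
    while len(visited) < len(adjacency):
        state = rng.choice(adjacency[state])  # uniform step to a neighbour
        visited.add(state)
        steps += 1
    return steps

def estimate_cover_time(adjacency, start=0, trials=500, seed=0):
    """Monte Carlo estimate of the cover time from a fixed start state."""
    rng = random.Random(seed)
    total = sum(cover_time_once(adjacency, start, rng) for _ in range(trials))
    return total / trials

def path_graph(n):
    """Adjacency list of a chain 0 - 1 - ... - (n - 1)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

chain = path_graph(10)

# Same chain with one extra edge between the two endpoints (an assumed,
# simplified stand-in for an option that links distant states).
shortcut = {s: list(nbrs) for s, nbrs in chain.items()}
shortcut[0].append(9)
shortcut[9].append(0)

chain_estimate = estimate_cover_time(chain)
shortcut_estimate = estimate_cover_time(shortcut)
print(chain_estimate, shortcut_estimate)
```

In this toy setting the shortcut lowers the estimated cover time, which is the effect the objective is meant to capture.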

SLIDE 6

Contributions

1. Introduced an objective function for exploration: cover time
2. Proposed an option discovery algorithm which minimizes the upper bound on the cover time

Algorithm:

  1. Embed the states of the state-space graph as real values (i.e. the Fiedler vector)

SLIDE 7

Contributions

1. Introduced an objective function for exploration: cover time
2. Proposed an option discovery algorithm which minimizes the upper bound on the cover time

Algorithm:

  1. Embed the states of the state-space graph as real values (i.e. the Fiedler vector)

[Figure: two panels, "Fiedler vector" and "Euclidean"]

SLIDE 8

Contributions

1. Introduced an objective function for exploration: cover time
2. Proposed an option discovery algorithm which minimizes the upper bound on the cover time

Algorithm:

  1. Embed the states of the state-space graph as real values (i.e. the Fiedler vector)
  2. Generate options to connect the two most distant states in this embedding

[Figure: two panels, "Fiedler vector" and "Euclidean"]
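The two algorithm steps can be sketched on a toy graph. This is an illustration under stated assumptions, not the authors' implementation: it uses the unnormalized graph Laplacian L = D - A of a connected, undirected graph, takes the Fiedler vector as the eigenvector of the second-smallest eigenvalue, and models the new option as a single undirected edge between the states with the smallest and largest Fiedler values. The function names are hypothetical.

```python
import numpy as np

def fiedler_vector(adj):
    """Return (lambda_2, v_2) of the unnormalized Laplacian L = D - A.

    Assumes `adj` is a symmetric 0/1 adjacency matrix of a connected graph.
    """
    laplacian = np.diag(adj.sum(axis=1)) - adj
    eigvals, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    return eigvals[1], eigvecs[:, 1]

def add_option_edge(adj):
    """Connect the two states that are farthest apart in the Fiedler embedding."""
    _, v = fiedler_vector(adj)
    i, j = int(np.argmin(v)), int(np.argmax(v))
    new_adj = adj.copy()
    new_adj[i, j] = new_adj[j, i] = 1.0  # assumed: option modeled as an edge
    return new_adj, (i, j)

# Toy state space: a chain of 8 states.
n = 8
chain = np.zeros((n, n))
for s in range(n - 1):
    chain[s, s + 1] = chain[s + 1, s] = 1.0

augmented, (i, j) = add_option_edge(chain)
lam_before, _ = fiedler_vector(chain)
lam_after, _ = fiedler_vector(augmented)
print((i, j), lam_before, lam_after)
```

On the chain, the selected pair is the two endpoints, and the second-smallest eigenvalue (the algebraic connectivity) increases after the edge is added.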

SLIDE 9

Contributions

1. Introduced an objective function for exploration: cover time
2. Proposed an option discovery algorithm which minimizes the upper bound on the cover time

Algorithm:

  1. Embed the states of the state-space graph as real values (i.e. the Fiedler vector)
  2. Generate options to connect the two most distant states in this embedding

Theorem: The upper bound on the cover time is improved:

SLIDE 10

Contributions

1. Introduced an objective function for exploration: cover time
2. Proposed an option discovery algorithm which minimizes the upper bound on the cover time
3. Empirical comparison shows that it outperforms existing algorithms

Poster at Ballroom #117