Gated Path Planning Networks



slide-1
SLIDE 1

ICML 2018

Gated Path Planning Networks

Lisa Lee

Machine Learning Department, Carnegie Mellon University

Joint work with Emilio Parisotto, Devendra Chaplot, Eric Xing, & Ruslan Salakhutdinov

slide-2
SLIDE 2

Gated Path Planning Networks (Lee & Parisotto et al., 2018) Carnegie Mellon University

Path Planning


slide-3
SLIDE 3

Path Planning is a fundamental part of any application that requires navigation:

  • Autonomous vehicles
  • Drones
  • Factory robots
  • Household robots

https://giphy.com/gifs/battlefield-navigate-selfdriving-AmqDSvVwywm7m

slide-4
SLIDE 4

Path Planning

A* search (popular heuristic algorithm) ⇒ Not differentiable
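
One way to see why A* is not differentiable: every step is a discrete operation (a priority-queue pop, a comparison, a dictionary update), so there is nothing for a gradient to flow through. The sketch below is a minimal A* on a 4-connected grid, not taken from the talk; the grid encoding (1 = wall), the Manhattan heuristic, and the name `a_star` are assumptions made for illustration.

```python
import heapq


def a_star(grid, start, goal):
    """Return the length of a shortest 4-connected path, or None if unreachable.

    Assumed encoding: grid[r][c] == 1 marks a wall, 0 a free cell.
    The heuristic is the Manhattan distance to the goal.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best_cost = {start: 0}
    frontier = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    while frontier:
        # Discrete argmin over the frontier: no useful gradient here.
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        if g > best_cost[cell]:
            continue  # stale queue entry, a cheaper route was already found
        r, c = cell
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = g + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    heapq.heappush(
                        frontier, (new_cost + h((nr, nc)), new_cost, (nr, nc))
                    )
    return None


maze = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path_length = a_star(maze, (0, 0), (2, 0))
```

The output is an integer path length produced by branching and queue operations, which is why a planner like this cannot be trained end to end inside a neural network.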

slide-5
SLIDE 5

Path Planning

Value Iteration Networks (Tamar et al., 2016) ⇒ Fully differentiable!

  • Can be used as a path planner module in neural architectures while maintaining end-to-end differentiability.
  • VINs have become an important path planner component used in many recent works:
      • QMDP-Net: Deep learning for planning under partial observability (Karkus et al., 2017)
      • Cognitive mapping and planning for visual navigation (Gupta et al., 2017)
      • Unifying map and landmark based representations for visual navigation (Gupta et al., 2017)
      • Memory Augmented Control Networks (Khan et al., 2018)
      • Deep Transfer in RL by Language Grounding (Narasimhan et al., 2017)
slide-6
SLIDE 6

Outline of this talk

Problem: VINs are difficult to optimize.

1. Overview of VIN
2. We reframe VIN as a recurrent-convolutional network.
3. From this perspective, we propose architectural improvements to VIN. ⇒ Gated Path Planning Networks (GPPN)
4. We show that GPPN performs better & alleviates many optimization issues of VIN.

slide-7
SLIDE 7

Methods

slide-8
SLIDE 8

Overview of VIN

[Figure: VIN architecture. Map design & goal location (2 × M × M) → conv → reward \bar{R} (M × M). Reward and previous state value \bar{V}^{(k-1)} (M × M) → conv, step (1) → action-state value \bar{Q}^{(k)} (N × M × M) → max pooling, step (2) → state value \bar{V}^{(k)} (M × M), fed back through the recurrence. Output: state value (M × M).]

Value Iteration (Bellman, 1957):

Q^{(k)}(s, a) = \sum_{s'} T(s' \mid s, a) \left[ R(s, a, s') + \gamma V^{(k-1)}(s') \right]

V^{(k)}(s) = \max_{a} Q^{(k)}(s, a)

VIN:

(1)  \bar{Q}_{\bar{a}}^{(k)} = W_{\bar{a}}^{R} \bar{R}_{[3]} + W_{\bar{a}}^{V} \bar{V}_{[3]}^{(k-1)}

(2)  \bar{V}^{(k)} = \max_{\bar{a}} \bar{Q}_{\bar{a}}^{(k)}

Step (1) is a convolution with kernel size 3.
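
To make the value iteration recursion concrete, here is a minimal tabular sketch in NumPy. It is an illustration under stated assumptions, not code from the talk: the array layout `T[s, a, s']`, the three-state chain, and the names `value_iteration`, `LEFT`, and `RIGHT` are all made up for the example.

```python
import numpy as np


def value_iteration(T, R, gamma=0.9, num_iters=50):
    """Tabular value iteration (num_iters must be at least 1).

    Assumed layout: T[s, a, s2] is the probability of reaching s2 from s
    under action a, and R[s, a, s2] is the reward for that transition.
    """
    num_states = T.shape[0]
    V = np.zeros(num_states)
    for _ in range(num_iters):
        # Q(s, a) = sum over s2 of T(s2 | s, a) * (R(s, a, s2) + gamma * V(s2))
        Q = np.einsum("sap,sap->sa", T, R + gamma * V[None, None, :])
        # V(s) = max over a of Q(s, a)
        V = Q.max(axis=1)
    return V, Q


# Hypothetical 3-state chain: 0 - 1 - 2, where state 2 is an absorbing goal.
LEFT, RIGHT = 0, 1
T = np.zeros((3, 2, 3))
T[0, LEFT, 0] = 1.0
T[0, RIGHT, 1] = 1.0
T[1, LEFT, 0] = 1.0
T[1, RIGHT, 2] = 1.0
T[2, :, 2] = 1.0          # the goal state absorbs every action
R = np.zeros((3, 2, 3))
R[1, RIGHT, 2] = 1.0      # reward for stepping into the goal

V, Q = value_iteration(T, R, gamma=0.9)
```

In this toy chain the values settle at 0.9, 1.0, and 0.0 for states 0, 1, and 2, and the greedy action in states 0 and 1 is `RIGHT`.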

slide-9
SLIDE 9

Overview of VIN

[Figure: VIN architecture, repeated from the previous slide.]

VIN is a recurrent-convolutional network with:

  • An unconventional nonlinearity (max-pooling)
  • Restriction of kernel sizes to 3
  • A hidden dimension of 1

Non-gated RNNs are known to be difficult to optimize.

\bar{V}^{(k)} = \max_{\bar{a}} \left( W_{\bar{a}}^{R} \bar{R}_{[3]} + W_{\bar{a}}^{V} \bar{V}_{[3]}^{(k-1)} \right)

Annotations on the equation: the max is the nonlinearity, the W terms are the convolution (kernel size 3), and \bar{V}^{(k-1)} is the recurrence.
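
The recurrent-convolutional view on this slide can be sketched in a few lines of NumPy: a kernel-size-3 convolution over the reward map and the previous value map, followed by a max over the action channels. This is a hedged illustration, not the authors' implementation. The helper `conv2d_same`, the five hand-set kernels, and the toy 5 × 5 reward map are assumptions; in a real VIN the kernels are learned.

```python
import numpy as np


def conv2d_same(x, w):
    """Zero-padded cross-correlation of an (M, M) map x with (N, F, F) kernels w.

    Returns an (N, M, M) array, one output map per kernel.
    """
    n, f, _ = w.shape
    pad = f // 2
    m = x.shape[0]
    xp = np.pad(x, pad)
    out = np.zeros((n, m, m))
    for di in range(f):
        for dj in range(f):
            out += w[:, di, dj][:, None, None] * xp[di:di + m, dj:dj + m]
    return out


def vin_step(reward, value, w_r, w_v):
    # Convolution: one action-state value map per action channel, (N, M, M).
    q = conv2d_same(reward, w_r) + conv2d_same(value, w_v)
    # Nonlinearity: max over the action channels gives the next value map.
    return q.max(axis=0)


def vin(reward, w_r, w_v, num_iters):
    value = np.zeros_like(reward)
    for _ in range(num_iters):  # recurrence over K iterations
        value = vin_step(reward, value, w_r, w_v)
    return value


# Hand-set kernels (assumption): channel 0 reads the reward at the centre,
# channels 1-4 read the discounted value of the N, S, W, E neighbour.
gamma = 0.9
w_r = np.zeros((5, 3, 3))
w_v = np.zeros((5, 3, 3))
w_r[0, 1, 1] = 1.0
for channel, (di, dj) in enumerate([(0, 1), (2, 1), (1, 0), (1, 2)], start=1):
    w_v[channel, di, dj] = gamma

goal_reward = np.zeros((5, 5))
goal_reward[0, 0] = 1.0
values = vin(goal_reward, w_r, w_v, num_iters=10)
```

With these particular kernels the value at each cell converges to gamma raised to its Manhattan distance from the goal, which shows how the max-pooled recurrence propagates value outward one cell per iteration.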

slide-10
SLIDE 10

Gated Path Planning Networks (GPPN)

[Figure: GPPN architecture. Same pipeline as VIN: map design & goal location (2 × M × M) → conv → reward (M × M); reward and previous state value (M × M) → conv → action-state value (N × M × M). The max pooling step is replaced by an LSTM that produces the next state value (M × M) through the recurrence. Output: state value (M × M).]

GPPN:

  • Replace max-pooling activation with a well-established gated recurrent operator (e.g., LSTM).
  • Allow kernel size F > 3.

\bar{V}^{(k)} = \mathrm{LSTM}\left( \sum_{\bar{a}} \left( W_{\bar{a}}^{R} \bar{R}_{[F]} + W_{\bar{a}}^{V} \bar{V}_{[F]}^{(k-1)} \right) \right)

Annotations on the equation: the LSTM is the nonlinearity, the W terms are the convolution (kernel size F), and \bar{V}^{(k-1)} is the recurrence.
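
A rough NumPy sketch of the gated update follows: the convolution outputs are summed over action channels and passed, cell by cell, through a standard LSTM cell instead of a max. Everything beyond that outline is an assumption made for illustration and may differ from the released code: the hidden size `H`, the parameter names in `params`, treating the state value as an H-channel hidden map, and the random (untrained) weights. A readout from the hidden maps to a single state-value map is omitted.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def conv_sum(x, w):
    """Zero-padded cross-correlation of (C, M, M) maps x with (C, F, F) kernels w,
    summed over the C channels. Returns a single (M, M) map."""
    _, f, _ = w.shape
    pad = f // 2
    m = x.shape[-1]
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((m, m))
    for di in range(f):
        for dj in range(f):
            out += np.einsum("c,cij->ij", w[:, di, dj], xp[:, di:di + m, dj:dj + m])
    return out


def gppn_step(reward, h, c, params):
    num_actions = params["w_r"].shape[0]
    # Convolution with kernel size F, summed over the action channels.
    x = sum(
        conv_sum(reward[None], params["w_r"][a]) + conv_sum(h, params["w_v"][a])
        for a in range(num_actions)
    )
    # LSTM cell shared across positions (input size 1, hidden size H).
    gates = (
        params["w_x"][:, None, None] * x
        + np.einsum("gh,hij->gij", params["w_h"], h)
        + params["b"][:, None, None]
    )
    i, f, g, o = np.split(gates, 4, axis=0)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new


def gppn(reward, params, num_iters):
    hidden = params["w_h"].shape[1]
    h = np.zeros((hidden,) + reward.shape)
    c = np.zeros_like(h)
    for _ in range(num_iters):  # recurrence over K iterations
        h, c = gppn_step(reward, h, c, params)
    return h


# Hypothetical sizes and random weights, for shape checking only.
rng = np.random.default_rng(0)
N, H, F, M = 4, 8, 5, 7
params = {
    "w_r": rng.normal(scale=0.1, size=(N, 1, F, F)),
    "w_v": rng.normal(scale=0.1, size=(N, H, F, F)),
    "w_x": rng.normal(scale=0.1, size=4 * H),
    "w_h": rng.normal(scale=0.1, size=(4 * H, H)),
    "b": np.zeros(4 * H),
}
reward = np.zeros((M, M))
reward[3, 3] = 1.0
h_final = gppn(reward, params, num_iters=10)
```

The only structural differences from the max-pooling recurrence are the gated cell and the free kernel size F (5 here), which matches the two changes listed on the slide.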

slide-11
SLIDE 11

Gated Path Planning Networks (GPPN)

VIN update:

\bar{V}^{(k)} = \max_{\bar{a}} \left( W_{\bar{a}}^{R} \bar{R}_{[3]} + W_{\bar{a}}^{V} \bar{V}_{[3]}^{(k-1)} \right)

GPPN update:

\bar{V}^{(k)} = \mathrm{LSTM}\left( \sum_{\bar{a}} \left( W_{\bar{a}}^{R} \bar{R}_{[F]} + W_{\bar{a}}^{V} \bar{V}_{[F]}^{(k-1)} \right) \right)

The gated LSTM update is well known to alleviate many of the optimization problems with standard recurrent networks.

slide-12
SLIDE 12

Experimental Setup

slide-13
SLIDE 13

Maze environments

Test VIN & GPPN on a variety of settings such as:

  • Training dataset size
  • Maze size
  • Maze Transition Models

[Figure: example maze showing the goal and the agent.]

slide-16
SLIDE 16

Maze environments

Test VIN & GPPN on a variety of settings such as:

  • Training dataset size
  • Maze size
  • Maze Transition Models
      • NEWS
      • Moore
      • Differential Drive
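
The slides name the three transition models without defining them. The sketch below uses the common reading of the terms, which should be treated as an assumption: NEWS allows moves north, east, west, and south; Moore adds the four diagonals; Differential Drive tracks an orientation and allows moving forward or turning 90 degrees left or right. The maze encoding (1 = wall), the rule that a blocked move leaves the agent in place, and all function names are hypothetical.

```python
NEWS = [(-1, 0), (0, 1), (0, -1), (1, 0)]            # north, east, west, south
MOORE = NEWS + [(-1, 1), (-1, -1), (1, 1), (1, -1)]  # plus the four diagonals
HEADINGS = [(-1, 0), (0, 1), (1, 0), (0, -1)]        # north, east, south, west (clockwise)


def is_free(maze, r, c):
    """True if (r, c) is inside the maze and not a wall (1 = wall)."""
    return 0 <= r < len(maze) and 0 <= c < len(maze[0]) and maze[r][c] == 0


def step_grid(maze, pos, move):
    """NEWS / Moore: apply a (d_row, d_col) move; stay in place if blocked."""
    r, c = pos[0] + move[0], pos[1] + move[1]
    return (r, c) if is_free(maze, r, c) else pos


def step_diff_drive(maze, state, action):
    """Differential drive: state is (row, col, index into HEADINGS)."""
    r, c, heading = state
    if action == "turn_left":
        return (r, c, (heading - 1) % 4)
    if action == "turn_right":
        return (r, c, (heading + 1) % 4)
    if action == "forward":
        dr, dc = HEADINGS[heading]
        nr, nc = r + dr, c + dc
        return (nr, nc, heading) if is_free(maze, nr, nc) else state
    raise ValueError(f"unknown action: {action}")


maze = [
    [0, 0],
    [1, 0],
]
after_east = step_grid(maze, (0, 0), NEWS[1])
after_forward = step_diff_drive(maze, (0, 0, 1), "forward")
```

Under this reading the planner's state space is M × M for NEWS and Moore, and gains an orientation dimension for Differential Drive.
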
slide-17
SLIDE 17

Maze environments

3D ViZDoom Environment

First-person RGB images

slide-18
SLIDE 18

Experimental Results

slide-19
SLIDE 19

Our GPPN outperforms VIN in a variety of metrics:

  • Learning speed

GPPN learns faster.

[Figure: % Optimal vs. # Epochs for GPPN and VIN; annotation: "unstable".]

% Optimal: Percentage of states whose predicted paths have optimal length.

Test performance on 15 × 15 mazes with NEWS mechanism, dataset size 25k, and best (K, F) settings for each model.
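
The % Optimal metric defined above can be computed as in the following sketch: find true shortest distances with breadth-first search from the goal, roll out the predicted policy from each state, and count the states whose rollout reaches the goal in exactly the shortest number of steps. The details are assumptions made for illustration: NEWS moves, a policy stored as a dict from cell to move index, and counting only free cells that can reach the goal (excluding the goal itself).

```python
from collections import deque

MOVES = [(-1, 0), (0, 1), (0, -1), (1, 0)]  # north, east, west, south


def shortest_distances(maze, goal):
    """Breadth-first search from the goal; returns {cell: steps to goal}."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in MOVES:
            nxt = (r + dr, c + dc)
            inside = 0 <= nxt[0] < len(maze) and 0 <= nxt[1] < len(maze[0])
            if inside and maze[nxt[0]][nxt[1]] == 0 and nxt not in dist:
                dist[nxt] = dist[(r, c)] + 1
                queue.append(nxt)
    return dist


def percent_optimal(maze, goal, policy):
    """Percentage of states whose predicted path to the goal has optimal length."""
    dist = shortest_distances(maze, goal)
    states = [s for s in dist if s != goal]
    optimal = 0
    for start in states:
        pos, steps = start, 0
        # Follow the policy for at most the optimal number of steps.
        while pos != goal and steps < dist[start]:
            dr, dc = MOVES[policy[pos]]
            nxt = (pos[0] + dr, pos[1] + dc)
            if nxt not in dist:
                break  # the predicted move hits a wall or leaves the maze
            pos, steps = nxt, steps + 1
        if pos == goal:
            optimal += 1
    return 100.0 * optimal / len(states)


maze = [
    [0, 0, 0],
    [0, 0, 0],
]
goal = (0, 2)
EAST, NORTH, SOUTH = 1, 0, 3
good_policy = {(0, 0): EAST, (0, 1): EAST, (1, 0): EAST, (1, 1): EAST, (1, 2): NORTH}
bad_policy = dict(good_policy)
bad_policy[(0, 0)] = SOUTH  # detour: reaches the goal, but not by a shortest path

score_good = percent_optimal(maze, goal, good_policy)
score_bad = percent_optimal(maze, goal, bad_policy)
```

Note that a path can reach the goal and still count as non-optimal, as the detour from (0, 0) in `bad_policy` does.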

slide-20
SLIDE 20

Our GPPN outperforms VIN in a variety of metrics:

  • Learning speed
  • Performance

GPPN performs better.

[Figure: % Optimal by Maze Transition Model (NEWS, Moore, Diff. Drive) for GPPN and VIN; annotation: "Performance difference".]

Test performance on 15 × 15 mazes with dataset size 10k and best (K, F) settings for each model.

slide-21
SLIDE 21

Our GPPN outperforms VIN in a variety of metrics:

  • Learning speed
  • Performance
  • Generalization

GPPN generalizes better with less data.

[Figure: % Optimal vs. Training Dataset Size (10k, 25k, 100k) for GPPN and VIN; annotation: "Performance difference".]

Test performance on 15 × 15 mazes with NEWS mechanism and best (K, F) settings for each model.

slide-22
SLIDE 22

Our GPPN outperforms VIN in a variety of metrics:

  • Learning speed
  • Performance
  • Generalization
  • Hyperparameter sensitivity

GPPN is more stable to hyperparameter changes.

[Figure: % Optimal vs. Hyperparameter Setting Index (ordered by % Optimal) for GPPN and VIN; annotation: "flatter".]

Test performance on 15 × 15 mazes with Differential Drive mechanism, dataset size 100k, and best (K, F) settings for each model.

slide-23
SLIDE 23

Our GPPN outperforms VIN in a variety of metrics:

  • Learning speed
  • Performance
  • Generalization
  • Hyperparameter sensitivity
  • Random seed sensitivity

GPPN exhibits less variance.

[Figure: % Optimal for GPPN and VIN by transition model (NEWS, Diff. Drive); annotation: "Amount of variance".]

Test performance on 15 × 15 mazes with dataset size 100k and best (K, F) settings for each model.

slide-24
SLIDE 24

Conclusion

  • GPPN is a more general architecture that relaxes the architectural inductive bias of VIN.
  • Performs better & alleviates many optimization issues of VIN.
  • Our results suggest that path planning architectures need not strictly resemble path-finding algorithms like value iteration.
  • By looking at VIN as a recurrent-convolutional network, we can explore other RNN architectural improvements:
      • Gated recurrent operators (Our work)
      • Multiplicative Integration (Wu et al., 2016)
      • Orthogonality constraints (Vorontsov et al., 2017)

VIN:

\bar{V}^{(k)} = \max_{\bar{a}} \left( W_{\bar{a}}^{R} \bar{R}_{[3]} + W_{\bar{a}}^{V} \bar{V}_{[3]}^{(k-1)} \right)

GPPN:

\bar{V}^{(k)} = \mathrm{LSTM}\left( \sum_{\bar{a}} \left( W_{\bar{a}}^{R} \bar{R}_{[F]} + W_{\bar{a}}^{V} \bar{V}_{[F]}^{(k-1)} \right) \right)

slide-25
SLIDE 25

Conclusion

  • GPPN is a more general architecture that relaxes the architectural inductive bias of VIN.
  • Performs better & alleviates many optimization issues.
  • Our results suggest that path planning architectures need not strictly resemble path-finding algorithms like value iteration.
  • By looking at VIN as a recurrent-convolutional network, we can explore other RNN architectural improvements:
      • Gated recurrent operators (Our work)
      • Multiplicative Integration (Wu et al., 2016)
      • Orthogonality constraints (Vorontsov et al., 2017)

Future directions

slide-26
SLIDE 26

Lisa Lee, Emilio Parisotto, Devendra Chaplot, Russ Salakhutdinov, Eric Xing

Code available on GitHub:

https://github.com/lileee/gated-path-planning-networks

Check out our poster!

Today 18:15 - 21:00 @ Hall B (#134)