ICML 2018
Gated Path Planning Networks
Lisa Lee
Machine Learning Department Carnegie Mellon University Joint work with Emilio Parisotto, Devendra Chaplot, Eric Xing, & Ruslan Salakhutdinov
Gated Path Planning Networks (Lee & Parisotto et al., 2018) Carnegie Mellon University
https://giphy.com/gifs/battlefield-navigate-selfdriving-AmqDSvVwywm7m
A* search (popular heuristic algorithm) ⇒ Not differentiable
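For reference, a minimal sketch of A* on a grid, assuming 4-connected moves with unit cost and a Manhattan-distance heuristic. The function name `astar` and the obstacle encoding (1 = wall) are illustrative choices, not taken from the slides. The search proceeds by discrete priority-queue pops and comparisons, which is one way to see why no gradient can be propagated through it.

```python
import heapq

def astar(grid, start, goal):
    """Return the shortest path length from start to goal, or None if unreachable.

    grid[r][c] == 1 marks an obstacle (assumed encoding); moves are 4-connected.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance: admissible and consistent for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(heuristic(start), 0, start)]  # entries are (f = g + h, g, cell)
    best = {start: 0}                          # cheapest known cost to each cell
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(
                        frontier, (new_cost + heuristic((nr, nc)), new_cost, (nr, nc))
                    )
    return None

# Small example: a wall forces a detour around the right side.
example_grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
detour_length = astar(example_grid, (0, 0), (2, 0))
```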
Value Iteration Networks (Tamar et al., 2016) ⇒ Fully differentiable!
maintaining end-to-end differentiability.
many recent works:
Problem: VINs are difficult to optimize.
1. Overview of VIN
2. We reframe VIN as a recurrent-convolutional network.
3. From this perspective, we propose architectural improvements to VIN. ⇒ Gated Path Planning Networks (GPPN)
4. We show that GPPN performs better & alleviates many
[Figure: VIN architecture. Map design & goal location (2 × M × M) → conv → reward (M × M) → conv → action-state value (N × M × M) → max pooling → state value (M × M), with recurrence on the state value; output state value (M × M). Steps labeled (1) and (2).]
Value Iteration (Bellman, 1957)
$$Q^{(k)}(s, a) = \sum_{s'} T(s' \mid s, a) \left[ R(s, a, s') + \gamma V^{(k-1)}(s') \right]$$
$$V^{(k)}(s) = \max_a Q^{(k)}(s, a)$$
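A minimal sketch of tabular value iteration in its standard form, assuming the transition model is supplied as a function `T(s, a)` returning a dictionary of next-state probabilities and the reward as a function `R(s, a, s_next)`. These names, the 3-state chain, and the discount of 0.5 are illustrative assumptions, not details from the slides.

```python
def value_iteration(states, actions, T, R, gamma=0.9, num_iters=50):
    """Run num_iters Bellman backups and return the state values V as a dict."""
    V = {s: 0.0 for s in states}
    for _ in range(num_iters):
        # Action-state values: expected reward plus discounted value of the next state.
        Q = {
            (s, a): sum(p * (R(s, a, s2) + gamma * V[s2]) for s2, p in T(s, a).items())
            for s in states
            for a in actions
        }
        # State values: best action in each state.
        V = {s: max(Q[(s, a)] for a in actions) for s in states}
    return V

# Hypothetical 3-state chain 0 - 1 - 2, where state 2 is an absorbing goal.
chain_states = [0, 1, 2]
chain_actions = ["left", "right"]

def chain_T(s, a):
    if s == 2:
        return {2: 1.0}  # goal is absorbing
    return {max(s - 1, 0): 1.0} if a == "left" else {s + 1: 1.0}

def chain_R(s, a, s_next):
    return 1.0 if s != 2 and s_next == 2 else 0.0  # reward only on reaching the goal

chain_V = value_iteration(chain_states, chain_actions, chain_T, chain_R, gamma=0.5, num_iters=10)
```

In this example the state next to the goal gets value 1 and the state two steps away gets value 0.5, so acting greedily on the values moves toward the goal.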
$$Q_a^{(k)} = W_a^R R_{[3]} + W_a^V V_{[3]}^{(k-1)} \quad (1)$$
$$V^{(k)} = \max_a Q_a^{(k)} \quad (2)$$
Convolution with kernel size 3
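The convolution and max-pooling steps can be sketched in plain Python as follows. This is an illustration under stated assumptions: zero padding at the borders, "convolution" implemented as cross-correlation (as is common in deep learning libraries), and hand-set kernels that mimic value iteration on an empty grid. In a trained VIN the kernels are learned, and the names `conv3x3`, `vin_step`, and `shift_kernel` are hypothetical.

```python
def conv3x3(x, kernel):
    """Zero-padded 3 x 3 cross-correlation of an M x M grid (list of lists)."""
    m = len(x)
    out = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            for di in range(3):
                for dj in range(3):
                    r, c = i + di - 1, j + dj - 1
                    if 0 <= r < m and 0 <= c < m:
                        out[i][j] += kernel[di][dj] * x[r][c]
    return out

def vin_step(R, V, W_R, W_V):
    """One update: per-action convolutions of R and V, then a max over actions."""
    m = len(R)
    Q = []
    for k_r, k_v in zip(W_R, W_V):
        q_r, q_v = conv3x3(R, k_r), conv3x3(V, k_v)
        Q.append([[q_r[i][j] + q_v[i][j] for j in range(m)] for i in range(m)])
    return [[max(q[i][j] for q in Q) for j in range(m)] for i in range(m)]

def shift_kernel(di, dj, weight):
    """3 x 3 kernel with a single nonzero entry (illustrative helper)."""
    k = [[0.0] * 3 for _ in range(3)]
    k[di][dj] = weight
    return k

# Hand-set weights (assumption): each of 4 actions reads the local reward and
# the discounted value of one neighbor (north, south, west, east).
gamma = 0.5
W_R = [shift_kernel(1, 1, 1.0) for _ in range(4)]
W_V = [shift_kernel(0, 1, gamma), shift_kernel(2, 1, gamma),
       shift_kernel(1, 0, gamma), shift_kernel(1, 2, gamma)]

R = [[0.0] * 3 for _ in range(3)]
R[1][1] = 1.0                        # reward at the goal cell
V0 = [[0.0] * 3 for _ in range(3)]
V1 = vin_step(R, V0, W_R, W_V)       # first iteration copies the reward map
V2 = vin_step(R, V1, W_R, W_V)       # value spreads to the goal's neighbors
```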
Recurrent-Convolutional Network with:
Non-gated RNNs are known to be difficult to optimize.
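$$V^{(k)} = \max_a \left( W_a^R R_{[3]} + W_a^V V_{[3]}^{(k-1)} \right)$$
Labeled parts: nonlinearity ($\max_a$), kernel size 3 (the $[3]$ patches), convolution ($W_a^R$, $W_a^V$), recurrence ($V^{(k-1)}$).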
[Figure: GPPN architecture. Map design & goal location (2 × M × M) → conv → reward (M × M) → conv → action-state value (N × M × M) → LSTM → state value (M × M), with recurrence on the state value; output state value (M × M).]
GPPN:
well-established gated recurrent operator (e.g., LSTM).
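$$V^{(k)} = \mathrm{LSTM}\left( \sum_a \left( W_a^R R_{[F]} + W_a^V V_{[F]}^{(k-1)} \right) \right)$$
Labeled parts: nonlinearity (LSTM), kernel size ($F$), convolution ($W_a^R$, $W_a^V$), recurrence ($V^{(k-1)}$).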
VIN update:
$$V^{(k)} = \max_a \left( W_a^R R_{[3]} + W_a^V V_{[3]}^{(k-1)} \right)$$
GPPN update:
$$V^{(k)} = \mathrm{LSTM}\left( \sum_a \left( W_a^R R_{[F]} + W_a^V V_{[F]}^{(k-1)} \right) \right)$$
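A rough sketch of the GPPN-style update, to contrast it with the max-based VIN update. It is heavily simplified and rests on assumptions not stated in the slides: each grid cell carries a scalar hidden state and cell state, the LSTM parameters are scalars shared across cells, borders are zero padded, and all names (`conv2d`, `lstm_cell`, `gppn_step`, the parameter keys) are hypothetical. The actual model may use a larger hidden dimension and learned weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def conv2d(x, kernel):
    """Zero-padded cross-correlation of an M x M grid with an F x F kernel (F odd)."""
    m, f = len(x), len(kernel)
    half = f // 2
    out = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            for di in range(f):
                for dj in range(f):
                    r, c = i + di - half, j + dj - half
                    if 0 <= r < m and 0 <= c < m:
                        out[i][j] += kernel[di][dj] * x[r][c]
    return out

def lstm_cell(x, h, c, p):
    """Standard LSTM gating with scalar input, hidden state, and cell state."""
    i = sigmoid(p["w_i"] * x + p["u_i"] * h + p["b_i"])    # input gate
    f = sigmoid(p["w_f"] * x + p["u_f"] * h + p["b_f"])    # forget gate
    o = sigmoid(p["w_o"] * x + p["u_o"] * h + p["b_o"])    # output gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * h + p["b_g"])  # candidate update
    c_new = f * c + i * g
    return o * math.tanh(c_new), c_new

def gppn_step(R, H, C, W_R, W_V, p):
    """Sum the per-action convolutions, then apply the LSTM cell at every grid cell."""
    m = len(R)
    X = [[0.0] * m for _ in range(m)]
    for k_r, k_v in zip(W_R, W_V):
        q_r, q_v = conv2d(R, k_r), conv2d(H, k_v)
        for i in range(m):
            for j in range(m):
                X[i][j] += q_r[i][j] + q_v[i][j]
    H_new = [[0.0] * m for _ in range(m)]
    C_new = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            H_new[i][j], C_new[i][j] = lstm_cell(X[i][j], H[i][j], C[i][j], p)
    return H_new, C_new

# Toy run with made-up parameters: one action, kernel size F = 3.
M = 3
R = [[0.0] * M for _ in range(M)]
R[1][1] = 1.0
H0 = [[0.0] * M for _ in range(M)]
C0 = [[0.0] * M for _ in range(M)]
center = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
zeros = [[0.0] * 3 for _ in range(3)]
params = {k: 1.0 for k in ("w_i", "w_f", "w_o", "w_g")}
params.update({k: 0.0 for k in ("u_i", "u_f", "u_o", "u_g", "b_i", "b_f", "b_o", "b_g")})
H1, C1 = gppn_step(R, H0, C0, [center], [zeros], params)
```

The only structural change relative to the VIN sketch is that the max over actions becomes a sum followed by a gated recurrent cell, and the kernel size is no longer fixed at 3.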
Test VIN & GPPN on a variety of settings such as:
[Figure: maze with the goal and agent locations marked.]
First-person RGB images
3D ViZDoom Environment
GPPN learns faster.
[Figure: % Optimal (60 to 100) vs. # Epochs (1 to 28) for GPPN and VIN; annotation: "unstable".]
% Optimal: Percentage of states whose predicted paths have optimal length.
Test performance on 15 × 15 mazes with NEWS mechanism, dataset size 25k, and best (K, F) settings for each model.
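The % Optimal metric defined above can be computed as in this small sketch. It assumes the predicted and optimal path lengths are already available per state as dictionaries, with `None` marking a state where no path was found. The function name and data layout are illustrative, not taken from the slides.

```python
def percent_optimal(predicted_lengths, optimal_lengths):
    """Percentage of states whose predicted path length equals the optimal length."""
    if not optimal_lengths:
        raise ValueError("need at least one state")
    hits = sum(
        1 for state, best in optimal_lengths.items()
        if predicted_lengths.get(state) == best
    )
    return 100.0 * hits / len(optimal_lengths)

# Made-up lengths for four states; two predictions match the optimum.
optimal = {"a": 3, "b": 5, "c": 2, "d": 4}
predicted = {"a": 3, "b": 7, "c": 2, "d": None}
score = percent_optimal(predicted, optimal)
```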
GPPN performs better.
[Figure: % Optimal (80 to 100) by maze transition model (NEWS, Moore) for GPPN and VIN; annotation: "Performance difference".]
Test performance on 15 × 15 mazes with dataset size 10k and best (K, F) settings for each model.
GPPN generalizes better with less data.
[Figure: % Optimal (85 to 100) vs. training dataset size (10k, 25k, 100k) for GPPN and VIN; annotation: "Performance difference".]
Test performance on 15 × 15 mazes with NEWS mechanism and best (K, F) settings for each model.
GPPN is more stable to hyperparameter changes.
[Figure: % Optimal (20 to 100) vs. hyperparameter setting index (ordered by % Optimal) for GPPN and VIN; annotation: "flatter".]
Test performance on 15 × 15 mazes with Differential Drive mechanism, dataset size 100k, and best (K, F) settings for each model.
GPPN exhibits less variance.
[Figure: % Optimal (93 to 100) on NEWS for GPPN and VIN; annotation: "Amount of variance".]
Test performance on 15 × 15 mazes with dataset size 100k and best (K, F) settings for each model.
inductive bias of VIN.
resemble path-finding algorithms like value iteration.
explore other RNN architectural improvements:
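VIN:
$$V^{(k)} = \max_a \left( W_a^R R_{[3]} + W_a^V V_{[3]}^{(k-1)} \right)$$
GPPN:
$$V^{(k)} = \mathrm{LSTM}\left( \sum_a \left( W_a^R R_{[F]} + W_a^V V_{[F]}^{(k-1)} \right) \right)$$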
Future directions
Lisa Lee, Emilio Parisotto, Devendra Chaplot, Russ Salakhutdinov, Eric Xing