
slide-1
SLIDE 1

Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?

ICML 2020

Angelos Filos∗α Panos Tigkas∗α Rowan McAllisterβ Nick Rhinehartβ Sergey Levineβ Yarin Galα

github.com/OATML/carsuite

∗ equal contribution αUniversity of Oxford βUniversity of California, Berkeley

slide-2
SLIDE 2

Problem Setting

Learning from Demonstrations

  • no explicit reward function
  • plethora of expert demonstrations

Training Time: Ddemo = {(x_i, y_i)}_{i=1}^N

Evaluation Time: navigate safely to goals

Sensitivity to Distribution Shifts (OOD)

  • failure to generalise to out-of-distribution (OOD) scenes
  • inexhaustible space of possible scenes

[Diagram: in-distribution scenes vs. out-of-distribution (OOD) scenes]

Decision-making under distribution shift
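The training setup above can be sketched with a toy stand-in: fit a small ensemble by MLE on bootstrap resamples of the demonstrations Ddemo. The linear-Gaussian model, dataset shapes, and fixed noise below are illustrative assumptions, not the paper's deep imitative model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for D_demo = {(x_i, y_i)}: scene features x, expert plans y.
N, K = 500, 3
X = rng.normal(size=(N, 4))
W_true = rng.normal(size=(4, 2))
Y = X @ W_true + 0.1 * rng.normal(size=(N, 2))

# "Train": fit an ensemble {theta_k} by MLE on bootstrap resamples of the demos.
# With a Gaussian likelihood and fixed noise, MLE reduces to least squares.
ensemble = []
for k in range(K):
    idx = rng.integers(0, N, size=N)           # bootstrap resample
    Wk, *_ = np.linalg.lstsq(X[idx], Y[idx], rcond=None)
    ensemble.append(Wk)

def log_q(y, x, W, sigma=0.1):
    """Per-model log-likelihood log q(y | x; theta_k) under a Gaussian (up to a constant)."""
    mu = x @ W
    return -0.5 * np.sum((y - mu) ** 2) / sigma**2

# Score one demonstration under each ensemble member.
scores = [log_q(Y[0], X[0], Wk) for Wk in ensemble]
```

At evaluation time the same per-model scores are all RIP needs: it never sees a reward function, only the ensemble's likelihoods.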


slide-11
SLIDE 11

Main Result

CARLA @ Roundabout · nuScenes @ Boston (Rhinehart et al., 2020; Codevilla et al., 2018)

slide-12
SLIDE 12

Main Result

Uncertainty-Aware Online Planning in OOD


slide-13
SLIDE 13

Contributions

Identify: novel (OOD) scenes vs. in-distribution scenes, via epistemic (model) uncertainty

Recover From: Robust Imitative Planning (RIP)

Adapt To: Adaptive RIP (AdaRIP)

Autonomous-car novel-scene benchmark CARNOVEL (github.com/OATML/carsuite): AbnormalTurns, BusyTown, Hills, Roundabouts
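The "Identify" contribution relies on epistemic uncertainty: when ensemble members disagree about a scene, the scene is likely novel. A minimal sketch, using the variance of per-model log-likelihoods as the disagreement score (this particular score is an illustrative choice, not necessarily the paper's exact detector):

```python
import numpy as np

def epistemic_uncertainty(log_liks):
    """Disagreement between ensemble members' log-likelihoods.

    High variance across models suggests a novel (OOD) scene;
    low variance suggests an in-distribution scene.
    """
    return float(np.var(log_liks))

# In-distribution: the models agree.  OOD: the models disagree.
in_dist = [-1.1, -1.0, -1.2]
ood     = [-1.0, -9.0, -4.5]
assert epistemic_uncertainty(ood) > epistemic_uncertainty(in_dist)
```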


slide-19
SLIDE 19

Recover From OOD: Robust Imitative Planning (RIP)

CARLA @ Roundabout OOD driving scene

Worked example: scores q_k(y_i) of K = 3 models over 3 candidate trajectories:

  models q_k \ trajectories y_i    y1     y2     y3
  q1                               0.6    0.1    0.3
  q2                               0.3    0.4    0.3
  q3                               0.2    0.2    0.6
  min_k                            0.2    0.1    0.3    → arg max_i: y3
  (1/K)∑_k                         1.1/3  0.7/3  1.2/3  → arg max_i: y3

Robust Imitative Planning “plan” ← “aggregate” ← “evaluate”

Online Planning Under Epistemic Uncertainty
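The worked table above can be checked directly: aggregate the per-model scores with min_k (worst case) or the ensemble mean, then take the arg max over candidate trajectories. A small numpy sketch with the slide's numbers:

```python
import numpy as np

# Rows: ensemble models q_1..q_3; columns: candidate trajectories y_1..y_3.
# Values copied from the slide's worked example.
Q = np.array([[0.6, 0.1, 0.3],
              [0.3, 0.4, 0.3],
              [0.2, 0.2, 0.6]])

worst_case = Q.min(axis=0)    # worst-case aggregation: min over models
model_avg  = Q.mean(axis=0)   # model-averaging aggregation: mean over models

# Both aggregators pick the trajectory that is good under *all* models:
assert worst_case.argmax() == 2   # y3: min_k score 0.3
assert model_avg.argmax() == 2    # y3: mean score 1.2/3 = 0.4
```

Note how y1, which a single over-confident model (q1) scores highest, is rejected once the ensemble's disagreement is taken into account.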


slide-28
SLIDE 28

Recover From OOD: Robust Imitative Planning (RIP)


Online Planning Under Epistemic Uncertainty

y^{RIP-⊕} ≜ arg max_y ⊕_{θ ∈ supp p(θ|D)} log q(y|x; θ)

slide-29
SLIDE 29

Recover From OOD: Robust Imitative Planning (RIP)


Online Planning Under Epistemic Uncertainty

y^{RIP-WCM} ≜ arg max_y min_{θ ∈ supp p(θ|D)} log q(y|x; θ)

slide-30
SLIDE 30

Recover From OOD: Robust Imitative Planning (RIP)


Online Planning Under Epistemic Uncertainty

y^{RIP-MA} ≜ arg max_y ∫ p(θ|D) log q(y|x; θ) dθ

slide-31
SLIDE 31

Practical Implementation

Algorithm 1: Robust Imitative Planning (RIP)

    // “train”: {θ_k}_{k=1}^K
    learn from demonstrations with MLE
    for step in environment do
        // “init”: y
        sample random plan
        while not converged do
            // “evaluate”: {q(y|x, G; θ_k)}_{k=1}^K
            score plan under imitative model(s)
            // “aggregate”: ⊕, e.g. min_k or (1/K)∑_k
            consolidate evaluations of ensemble
            // “plan”: y ← y + η ∂U/∂y
            improve plan with online SGD
        // “act”: y∗
        submit plan to environment

where U = ⊕ log q(y|x, G; θ) and the submitted plan is y∗ = arg max_y U.
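The inner "evaluate / aggregate / plan" loop can be sketched as gradient ascent on the worst-case model's log-likelihood. The unit-variance Gaussian trajectory models, learning rate, and step count below are illustrative assumptions, not the paper's deep models:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in ensemble: each theta_k is the mean of a unit-variance
# Gaussian over plans y, so log q(y; theta_k) = -0.5 * ||y - theta_k||^2 + const.
thetas = [np.array([1.0, 0.0]), np.array([1.2, 0.1]), np.array([0.8, -0.1])]

def grad_log_q(y, theta):
    return -(y - theta)                     # d/dy of log q(y; theta)

def rip_wcm_step(y, thetas, eta=0.1):
    """One 'plan' step: gradient ascent on the worst-case model's score."""
    scores = [-0.5 * np.sum((y - t) ** 2) for t in thetas]
    worst = int(np.argmin(scores))          # "aggregate": min over models
    return y + eta * grad_log_q(y, thetas[worst])

y = rng.normal(size=2)                      # "init": sample random plan
for _ in range(200):                        # "while not converged"
    y = rip_wcm_step(y, thetas)             # "evaluate" / "aggregate" / "plan"

# "act": the plan has settled in a region all models find likely.
assert all(-0.5 * np.sum((y - t) ** 2) > -0.5 for t in thetas)
```

Swapping `np.argmin` aggregation for a mean over model gradients would give the model-averaging (RIP-MA) variant of the same loop.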


slide-38
SLIDE 38

Quantitative Results

ICRA 2020 nuScenes prediction challenge (50 samples, open-loop planning)

                                     Boston (2073 scenes)              Singapore (1189 scenes)
  Methods                            minADE1↓  minADE5↓  minFDE1↓      minADE1↓  minADE5↓  minFDE1↓
  MTP (Cui et al., 2019)             4.13      3.24      9.23          4.13      3.24      9.23
  MultiPath (Chai et al., 2019)      3.89      3.34      9.19          3.89      3.34      9.19
  CoverNet (Phan-Minh et al., 2019)  3.87      2.41      9.26          3.87      2.41      9.26
  DIM (Rhinehart et al., 2020)       3.64±0.05 2.48±0.02 8.22±0.13     3.82±0.04 2.95±0.01 8.91±0.08
  RIP-BCM (baseline)                 3.53±0.04 2.37±0.01 7.92±0.09     3.57±0.02 2.70±0.01 8.39±0.03
  RIP-MA (ours)                      3.39±0.03 2.33±0.01 7.62±0.07     3.48±0.01 2.69±0.02 8.19±0.02
  RIP-WCM (ours)                     3.29±0.03 2.28±0.00 7.45±0.05     3.43±0.01 2.66±0.01 8.09±0.04

CARNOVEL benchmark of novel driving scenarios (based on CARLA)

                                     AbnormalTurns (7×10 scenes)       BusyTown (11×10 scenes)
  Methods                            Success↑ (%)  Infra/km↓ (×1e−3)   Success↑ (%)  Infra/km↓ (×1e−3)
  CIL (Codevilla et al., 2018)       65.71±7.37    7.04±5.07           5.45±6.35     11.49±3.66
  LbC (Chen et al., 2019)            0.00±0.00     5.81±0.58           20.00±13.48   3.96±0.15
  DIM (Rhinehart et al., 2020)       74.28±11.26   5.56±4.06           47.13±14.54   8.47±5.22
  RIP-BCM (baseline)                 68.57±9.03    7.93±3.73           50.90±20.64   3.74±5.52
  RIP-MA (ours)                      84.28±14.20   7.86±5.70           64.54±23.25   5.86±3.99
  RIP-WCM (ours)                     87.14±14.20   4.91±3.60           62.72±5.16    3.17±2.04


slide-41
SLIDE 41

can we do better?

slide-42
SLIDE 42

Online Adaptation: Adaptive RIP (AdaRIP)

Algorithm 2: Adaptive RIP (AdaRIP)

    // “train”: {θ_k}_{k=1}^K
    learn from demonstrations with MLE
    for step in environment do
        // “init”: y
        sample random plan
        while not converged do
            // “evaluate”: {q(y|x, G; θ_k)}_{k=1}^K
            score plan under imitative model(s)
            // “aggregate”: ⊕, e.g. min_k or (1/K)∑_k
            consolidate evaluations of ensemble
            // “plan”: y ← y + η ∂U/∂y
            improve plan with online SGD
        // “adapt”: u(y∗) > τ
        if epistemically uncertain then
            query expert
            update model to reduce uncertainty
        // “act”: y∗
        submit plan to environment
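AdaRIP's "adapt" branch can be sketched as a threshold test on the plan's epistemic uncertainty: defer to an expert and update the model online only when the ensemble disagrees. The variance-based score u, the threshold τ, and the callback names below are illustrative assumptions:

```python
import numpy as np

def adarip_act(y_star, log_liks, tau, query_expert, update_model):
    """Sketch of AdaRIP's per-step 'adapt' then 'act' logic.

    log_liks: per-model log-likelihoods of the chosen plan y_star.
    When the disagreement u(y_star) exceeds tau, ask the expert for a
    demonstration and update the model online before acting.
    """
    u = float(np.var(log_liks))             # epistemic uncertainty proxy
    if u > tau:                             # "adapt": u(y*) > tau
        demo = query_expert()               # query human expert
        update_model(demo)                  # online update to reduce u
    return y_star                           # "act": submit plan

# On an OOD scene the models disagree, so the expert gets queried.
queried = []
plan = adarip_act(
    y_star=np.zeros(2),
    log_liks=[-1.0, -8.0, -3.0],            # large disagreement
    tau=1.0,
    query_expert=lambda: queried.append("demo") or "demo",
    update_model=lambda d: None,
)
assert queried == ["demo"]
```

On in-distribution scenes u stays below τ and the step reduces to plain RIP, so expert queries are only spent where the ensemble admits it is uncertain.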


slide-45
SLIDE 45

Contributions

Identify: novel (OOD) scenes vs. in-distribution scenes, via epistemic (model) uncertainty

Recover From: Robust Imitative Planning (RIP)

Adapt To: Adaptive RIP (AdaRIP)

Autonomous-car novel-scene benchmark CARNOVEL (github.com/OATML/carsuite): AbnormalTurns, BusyTown, Hills, Roundabouts


slide-51
SLIDE 51

Open Questions

  1. Real-time epistemic uncertainty estimation
  2. Real-time online planning
  3. Resistance to catastrophic forgetting in online adaptation

slide-56
SLIDE 56

Bibliography

Chai, Yuning et al. (2019). “MultiPath: Multiple probabilistic anchor trajectory hypotheses for behavior prediction”. arXiv preprint arXiv:1910.05449.
Chen, Dian et al. (2019). “Learning by cheating”. arXiv preprint arXiv:1912.12294.
Codevilla, Felipe et al. (2018). “End-to-end driving via conditional imitation learning”. International Conference on Robotics and Automation (ICRA). IEEE, pp. 1–9.
Cui, Henggang et al. (2019). “Multimodal trajectory predictions for autonomous driving using deep convolutional networks”. International Conference on Robotics and Automation (ICRA). IEEE, pp. 2090–2096.
Phan-Minh, Tung et al. (2019). “CoverNet: Multimodal Behavior Prediction using Trajectory Sets”. arXiv preprint arXiv:1911.10298.
Rhinehart, Nicholas, Rowan McAllister, and Sergey Levine (2020). “Deep Imitative Models for Flexible Inference, Planning, and Control”. International Conference on Learning Representations (ICLR).