SLIDE 1

Neural Topological SLAM for Visual Navigation

Devendra Singh Chaplot, Saurabh Gupta, Abhinav Gupta, Ruslan Salakhutdinov

CVPR-2020

Webpage: https://devendrachaplot.github.io/projects/Neural-Topological-SLAM

SLIDE 3

Semantic Priors and Common-Sense

  • Humans use semantic priors and common-sense to explore and navigate every day
  • Most navigation algorithms struggle to do so

[Figure: Target Image, with labels 1, 2, 3]

SLIDE 8

Image Goal Task

[Figure: Source Image (I_S), navigate, Goal Image (I_G)]

  • Agent observations are panoramic images
  • Take actions to navigate to the goal location
  • Take the 'stop' action at the goal location
  • Sequential goals
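The task loop in the bullets above can be sketched as follows. This is only an illustration: the `ToyCorridor` environment, the action names, the step budget, and the success radius are hypothetical stand-ins, not the simulator or thresholds used in this work.

```python
from dataclasses import dataclass


@dataclass
class ToyCorridor:
    """Hypothetical 1-D stand-in for a simulator; positions are integers."""
    position: int = 0

    def observe(self) -> int:
        # A real agent would receive a panoramic image; here the
        # "observation" is simply the current position.
        return self.position

    def step(self, action: str) -> None:
        if action == "forward":
            self.position += 1
        elif action == "backward":
            self.position -= 1


def greedy_policy(observation: int, goal_observation: int) -> str:
    """Move toward the goal and call 'stop' once the two views match."""
    if observation == goal_observation:
        return "stop"
    return "forward" if observation < goal_observation else "backward"


def run_episode(env, goal_positions, max_steps=50, success_radius=0):
    """Visit the goals one after another; return one success flag per goal."""
    results = []
    for goal in goal_positions:  # sequential goals
        success = False
        for _ in range(max_steps):
            action = greedy_policy(env.observe(), goal)
            if action == "stop":
                # Success is judged only when the agent itself stops.
                success = abs(env.position - goal) <= success_radius
                break
            env.step(action)
        results.append(success)
    return results
```

For example, `run_episode(ToyCorridor(), [3, -2, 5])` reaches each goal in turn, starting every new goal from wherever the previous one ended.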

SLIDE 11

Prior work

[Figure: End-to-end Reinforcement or Imitation Learning (Observations, Neural Network, Actions, Reward) alongside Modular Metric Maps]

End-to-end Learning

  • High sample complexity
  • Ineffective in large environments

Modular Metric Maps

  • Cannot learn semantic priors
  • Pose error accumulation

SLIDE 13

Topological Maps

[Figure: topological map of a home with nodes Entrance, Children's Room, Living Room, Stairway, Dining Room, Office, Kitchen, Master Bedroom, Hallway]

SLIDE 20

Topological Graph Representation

[Figure: topological graph; legend: Agent's Current Node, Regular Nodes, Ghost Nodes, Selected Ghost Node; edge label: Relative Position]

  • Nodes: areas
  • Regular nodes: Explored areas
  • Ghost nodes: Unexplored areas
  • Edges: Spatial relationship between nodes
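A minimal sketch of a graph with these node and edge types is given below. The class names, the integer node ids, and the choice to store each edge as an (angle, distance) pair are assumptions made for illustration, not the data structure used by the authors.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    node_id: int
    is_ghost: bool  # True: unexplored area; False: explored (regular) area


@dataclass
class TopologicalGraph:
    nodes: Dict[int, Node] = field(default_factory=dict)
    # edges[(a, b)] = (angle_deg, distance): position of b relative to a
    edges: Dict[Tuple[int, int], Tuple[float, float]] = field(default_factory=dict)

    def add_node(self, is_ghost: bool) -> int:
        """Create a node and return its id."""
        node_id = len(self.nodes)
        self.nodes[node_id] = Node(node_id, is_ghost)
        return node_id

    def add_edge(self, src: int, dst: int, angle_deg: float, distance: float) -> None:
        """Record where dst lies relative to src."""
        self.edges[(src, dst)] = (angle_deg, distance)

    def mark_explored(self, node_id: int) -> None:
        """Turn a ghost node into a regular node once the agent has visited it."""
        self.nodes[node_id].is_ghost = False

    def ghost_nodes(self) -> List[int]:
        """Ids of all nodes that are still unexplored."""
        return [n.node_id for n in self.nodes.values() if n.is_ghost]
```

Keeping ghost nodes in the same container as regular nodes means that exploring an area only flips a flag; the edges that point to the node stay valid.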

SLIDE 22

Four learnable functions

ℱ_G(I_1) = Geometric Prediction: Free directions
ℱ_S(I_1, I_2) = Semantic Prediction: Closeness to target
ℱ_L(I_1, I_2) = Localization
ℱ_R(I_1, I_2) = Relative Pose Prediction

SLIDE 24

Geometric Prediction

ℱ_G(I_1) = Geometric Prediction: Free directions
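As a rough illustration of what "free directions" could look like in code, the sketch below assumes the geometric prediction yields one free-space probability per direction bin around the panorama. The bin layout (bin i maps to i × 360° / number of bins) and the 0.5 threshold are assumptions, not details taken from the slides.

```python
from typing import List, Sequence


def free_directions(probabilities: Sequence[float], threshold: float = 0.5) -> List[float]:
    """Return the angle (in degrees) of every bin whose free-space
    probability exceeds the threshold. Bin i is assigned i * bin_width."""
    n_bins = len(probabilities)
    bin_width = 360.0 / n_bins
    return [i * bin_width for i, p in enumerate(probabilities) if p > threshold]
```

With four bins, `free_directions([0.9, 0.1, 0.2, 0.8])` keeps the bins at 0° and 270°.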

SLIDE 26

Semantic Prediction

ℱ_S(I_1, I_2) = Semantic Prediction: Closeness to target

SLIDE 28

Localization

ℱ_L(I_1, I_2) = Localization

[Figure: Localization (ℱ_L)]

SLIDE 30

Relative Pose Prediction

ℱ_R(I_1, I_2) = Relative Pose

[Figure: Relative Pose Prediction (ℱ_R); panels: Score predictions (values shown: 0.0 0.0 0.0 0.87 0.0 0.0 0.0 0.0 0.0 0.0 1), Direction label, Angle, Distance]
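One way to read the outputs named on this slide is sketched below: take the highest-scoring direction bin as the predicted direction, convert it to an angle, and combine it with a distance to obtain a planar offset. The bin count of 12, the convention that bin i maps to i × 30°, and the example distance are assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple


def decode_direction(scores: Sequence[float]) -> Tuple[int, float]:
    """Pick the highest-scoring direction bin.

    Returns (bin index, angle in degrees), with bin i mapped to
    i * 360 / number_of_bins (an assumed convention).
    """
    n_bins = len(scores)
    best = max(range(n_bins), key=lambda i: scores[i])
    return best, best * 360.0 / n_bins


def polar_to_offset(angle_deg: float, distance: float) -> Tuple[float, float]:
    """Convert (angle, distance) into a planar offset (dx, dy)."""
    theta = math.radians(angle_deg)
    return distance * math.cos(theta), distance * math.sin(theta)
```

With a 12-entry score vector whose only non-zero entry is 0.87 at index 3, `decode_direction` returns bin 3 at 90°. `polar_to_offset(90.0, 2.0)` then gives an offset of roughly (0, 2).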

SLIDE 36

Neural Topological SLAM

ℱ_G(I_1) = Geometric Prediction: Free directions
ℱ_S(I_1, I_2) = Semantic Prediction: Closeness to target
ℱ_L(I_1, I_2) = Localization
ℱ_R(I_1, I_2) = Relative Pose Prediction

[Figure: Neural Topological SLAM pipeline, annotated with ℱ_L, ℱ_S, ℱ_R, ℱ_G, and Δp]

SLIDE 42

Single supervised learning model

  • No reinforcement learning, no interaction needed
  • Can be trained completely with static data

SLIDE 44

Demo video

[Video frame: scores shown 0.73, 0.09, 0.08, 0.07, 0.17, 0.23; legend: Goal Location, Node Locations, Ghost nodes, Selected Ghost node]

SLIDE 47

Learning Semantic Priors

[Figure: scores shown 0.73, 0.09, 0.08, 0.07, 0.17, 0.23; legend: Goal Location, Node Locations, Ghost nodes, Selected Ghost node]
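If the numbers on this slide are read as one semantic score per ghost node, selecting a ghost node can be sketched as a plain argmax. That reading, the integer node ids, and the absence of any distance term are assumptions; the actual selection rule may combine further signals.

```python
from typing import Dict


def select_ghost_node(scores: Dict[int, float]) -> int:
    """Return the id of the ghost node with the highest score."""
    if not scores:
        raise ValueError("no ghost nodes to choose from")
    return max(scores, key=scores.get)


# Scores as they appear on the slide, keyed by hypothetical node ids.
slide_scores = {0: 0.73, 1: 0.09, 2: 0.08, 3: 0.07, 4: 0.17, 5: 0.23}
selected = select_ghost_node(slide_scores)
```

Here `selected` is node 0, the one scored 0.73.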

SLIDE 48

Learning Semantic Priors

[Figure: scores shown 0.76, 0.20, 0.56, 0.17, 0.27, 0.13; legend: Goal Location, Node Locations, Ghost nodes, Selected Ghost node]

SLIDE 52

Results

Category              Method                      RGB    RGBD   RGBD (No Noise)   RGBD (No Stop)
End-to-end Learning   LSTM + Imitation            0.10   0.14   0.15              0.18
End-to-end Learning   LSTM + RL                   0.10   0.13   0.14              0.17
Modular Metric Maps   Occupancy Maps + FBE + RL   N/A    0.26   0.31              0.24
Modular Metric Maps   Active Neural SLAM          0.23   0.29   0.35              0.39
Topological Maps      Neural Topological SLAM     0.38   0.43   0.45              0.60

  • Robustness to pose noise
  • NTS is better than occupancy map models; it captures and uses semantic priors.
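To make the gaps in the table concrete, the short sketch below computes the absolute difference between Neural Topological SLAM and Active Neural SLAM, the strongest baseline in every column, using only the numbers on the slide. The metric itself is not named on the slide, so the values are treated as unitless scores; the function name is illustrative.

```python
from typing import Dict

# Values copied from the results slide.
RESULTS: Dict[str, Dict[str, float]] = {
    "Active Neural SLAM": {
        "RGB": 0.23, "RGBD": 0.29, "RGBD (No Noise)": 0.35, "RGBD (No Stop)": 0.39,
    },
    "Neural Topological SLAM": {
        "RGB": 0.38, "RGBD": 0.43, "RGBD (No Noise)": 0.45, "RGBD (No Stop)": 0.60,
    },
}


def absolute_gain(results, method: str, baseline: str) -> Dict[str, float]:
    """Per-setting difference (method minus baseline), rounded to 2 decimals."""
    return {
        setting: round(results[method][setting] - results[baseline][setting], 2)
        for setting in results[method]
    }


gains = absolute_gain(RESULTS, "Neural Topological SLAM", "Active Neural SLAM")
```

The resulting gains are 0.15 (RGB), 0.14 (RGBD), 0.10 (RGBD, No Noise), and 0.21 (RGBD, No Stop).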

SLIDE 54

Sequential Goals and Ablations

  • Ablations as a function of the number of sequential goals.
  • The semantic score function improves efficiency when no prior experience with the environment is available.
  • As experience in the environment increases, the utility of the semantic function decreases.
  • But, at the same time, the importance of the topological representation increases.

SLIDE 55

Neural Topological SLAM for Visual Navigation

Devendra Singh Chaplot, Ruslan Salakhutdinov, Abhinav Gupta, Saurabh Gupta
CVPR 2020

Webpage: https://devendrachaplot.github.io/projects/Neural-Topological-SLAM

Devendra Singh Chaplot

Webpage: http://devendrachaplot.github.io/
Email: chaplot@cs.cmu.edu
Twitter: @dchaplot

Thank you