MapNet: An allocentric spatial memory for mapping environments
João F. Henriques, Andrea Vedaldi
Visual Geometry Group

Motivation
What we usually have:
• Object detections
• Segmentations
• 3D information (relative to camera)
• ...
⇒ Image-centric tasks
Henriques and Vedaldi, MapNet, CVPR 2018

Motivation
What we would like:
• Reason beyond the image, into the world
• Object permanence
• Eventually, long-term goals and planning
⇒ World-centric tasks

Simultaneous Localization And Mapping (SLAM)
[Diagram: over time (frames #1, #2, #3, ...), the agent estimates its location and updates the map]
Classic SLAM (no learning):
• Hard to adapt to new environments (hand-tuning)
• No semantic information
• No use of priors to compensate for missing data

Related work – deep learning for SLAM
[Diagram: over time (frames #1, #2, #3, ...), the agent estimates its location; there is no map]
Egomotion predictors (Costante ’15, Clark ’17, Zhu ’17, Wang ’17, ...):
• No map
• Cannot correct for inevitable drift

Related work – deep learning for SLAM
[Diagram: over time (frames #1, #2, #3, ...), the agent estimates its location against an offline map]
Offline-learned localization (Kendall ’15, Mirowski ’18, Brahmbhatt ’18, ...):
• Map is stored in the deep network’s parameters
• New environments require re-training

Related work – deep learning for SLAM
[Diagram: over time (frames #1, #2, #3, ...), the agent updates the map; location comes from egomotion input]
Online mapping, no localization (Kanitscheider ’16, Gupta ’17, Zhang ’17, Parisotto ’17, ...):
• Map is created on-the-fly as activations
• Perfect egomotion input is used for localization, not the map
• Tested on synthetic environments (so far)

Proposed method
[Diagram: over time (frames #1, #2, #3, ...), the agent estimates its location and updates the map]
Our method (MapNet):
• Performs both Mapping and Localization with a deep net
• No egomotion information
• Fully online (mapping as we go)

Allocentric map memory
Map model:
• Represent ground plane as 2D grid.
• Store one embedding per location.
• Allows associating semantics with world coordinates.
[Diagram labels: Image, Embedding, Localization, Mapping, Map tensor, Position/orientation heatmap]

Localization and mapping as dual operators
[Diagram: Image → Embedding; Embedding ⋆ Map memory at time t → Location; Location ∗ Embedding → Map memory at time t + 1]
Core insight:
• Localization ⇔ convolution
• Mapping ⇔ deconvolution
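
The duality can be written out explicitly. The notation below is chosen here for illustration and is not taken from the slides: m is the map memory (channel c), o is the bank of local-view embeddings (orientation r), p is the position/orientation heatmap, and o is treated as zero outside its support.

```latex
\begin{align}
s_{r,i,j} &= \sum_{c,u,v} m_{c,\,i+u,\,j+v}\; o_{r,c,u,v}
  && \text{(localization: cross-correlation, } s = m \star o\text{)} \\
p &= \operatorname{softmax}(s) \\
\hat{m}_{c,a,b} &= \sum_{r,i,j} p_{r,i,j}\; o_{r,c,\,a-i,\,b-j}
  && \text{(mapping: deconvolution, } \hat{m} = p \ast o\text{)} \\
\langle m \star o,\; p \rangle &= \langle m,\; p \ast o \rangle
  && \text{(the two linear operators are adjoint)}
\end{align}
```

The last line follows by substituting a = i + u and b = j + v in the first sum, which is the sense in which the same filter bank serves both operations.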

Ground projected CNN features
[Diagram: Image → CNN → Ground projection (using Depth) → Local view (CNN embeddings in the ground-plane)]
• Given depth and camera intrinsics, project CNN features to the ground-plane.
• Since camera pose is unknown, the output 2D grid is local (camera-space).
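
A minimal sketch of the ground projection, under assumptions that are not stated on the slide: a pinhole camera with horizontal focal length fx and principal point cx, a feature map at the same resolution as the depth map, and max-pooling of all features that land in the same ground cell. The names ground_project, cell_size and grid_size are hypothetical.

```python
import numpy as np

def ground_project(features, depth, fx, cx, cell_size, grid_size):
    """Project (C, H, W) features onto a (C, grid_size, grid_size) ground grid.

    Rows index forward distance from the camera, columns index lateral
    offset (camera centred). Height is discarded; features falling in the
    same cell are max-pooled (an assumption made for this sketch).
    """
    C, H, W = features.shape
    grid = np.zeros((C, grid_size, grid_size))
    filled = np.zeros((grid_size, grid_size), dtype=bool)
    for v in range(H):
        for u in range(W):
            z = depth[v, u]
            if z <= 0:                      # skip invalid depth readings
                continue
            x = (u - cx) * z / fx           # lateral position (pinhole model)
            col = int(np.floor(x / cell_size)) + grid_size // 2
            row = int(np.floor(z / cell_size))
            if 0 <= row < grid_size and 0 <= col < grid_size:
                if filled[row, col]:
                    grid[:, row, col] = np.maximum(grid[:, row, col], features[:, v, u])
                else:
                    grid[:, row, col] = features[:, v, u]
                    filled[row, col] = True
    return grid

# Toy input: 2 channels, a 2x3 image, constant depth of 2 units.
features = np.arange(12.0).reshape(2, 2, 3)
depth = np.full((2, 3), 2.0)
grid = ground_project(features, depth, fx=1.0, cx=1.0, cell_size=1.0, grid_size=5)
```

The result lives in camera space: without a pose estimate, the grid is centred on and aligned with the camera, which is why a separate localization step is needed.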

Localization
Localize by dense matching of the local view’s embeddings to the map.
[Diagram: Local view ⋆ Map (cross-correlation) → Softmax → Position heatmap]
• Requires only one cross-correlation (convolution).
• Can be interpreted as addressing a spatial associative memory.
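
The matching step can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' implementation: the function names are hypothetical, the map holds random embeddings, and the local view is cropped directly from the map so the correct offset is known.

```python
import numpy as np

def cross_correlate(map_tensor, local_view):
    """Slide a (C, h, w) local view over a (C, H, W) map ('valid' mode).

    Returns an (H-h+1, W-w+1) array of matching scores, one per offset.
    """
    _, H, W = map_tensor.shape
    _, h, w = local_view.shape
    scores = np.empty((H - h + 1, W - w + 1))
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            # Dot product between the local view and the map patch at (i, j).
            scores[i, j] = np.sum(map_tensor[:, i:i + h, j:j + w] * local_view)
    return scores

def softmax(scores):
    """Numerically stable softmax over all entries of an array."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

def localize(map_tensor, local_view):
    """Position heatmap: softmax of the cross-correlation scores."""
    return softmax(cross_correlate(map_tensor, local_view))

rng = np.random.default_rng(0)
map_tensor = rng.standard_normal((8, 12, 12))   # toy map with random embeddings
local_view = map_tensor[:, 3:8, 5:10].copy()    # view cropped at offset (3, 5)
heatmap = localize(map_tensor, local_view)
peak = np.unravel_index(heatmap.argmax(), heatmap.shape)
```

Read as an associative memory, the local view is the query, each map patch is a key, and the heatmap holds the normalized match strengths.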

Localization
Also consider camera orientation:
[Diagram: Local view → Resampler (rotation) → Rotated local views (orientations); Rotated local views ⋆ Map (cross-correlation) → Softmax → Position and orientation heatmap]
• Simply resample the local view at several rotations.
• Use as filter bank for cross-correlation.
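
A sketch of the rotated filter bank, simplified for illustration: it uses only the four multiples of 90 degrees (via np.rot90), so no interpolation is needed, whereas resampling at arbitrary angles would require it. The function names are hypothetical and the local view is assumed square.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rotated_bank(local_view, n_rot=4):
    """Stack of the (C, s, s) local view rotated by multiples of 90 degrees."""
    return np.stack([np.rot90(local_view, k=r, axes=(1, 2)) for r in range(n_rot)])

def localize_with_orientation(map_tensor, local_view):
    """Joint heatmap over (orientation, row, column)."""
    bank = rotated_bank(local_view)                      # (R, C, s, s)
    s = local_view.shape[-1]
    # All (s, s) patches of the map: shape (C, H-s+1, W-s+1, s, s).
    patches = sliding_window_view(map_tensor, (s, s), axis=(1, 2))
    # Cross-correlate every rotated view with every map patch.
    scores = np.einsum('cijuv,rcuv->rij', patches, bank)
    e = np.exp(scores - scores.max())                    # softmax over all bins
    return e / e.sum()

rng = np.random.default_rng(0)
map_tensor = rng.standard_normal((8, 10, 10))
patch = map_tensor[:, 2:7, 4:9]
# Camera-frame view: the world patch seen after a 180-degree turn.
local_view = np.rot90(patch, k=-2, axes=(1, 2)).copy()
heatmap = localize_with_orientation(map_tensor, local_view)
peak = np.unravel_index(heatmap.argmax(), heatmap.shape)
```

In this toy case the peak index is an (orientation, row, column) triple, so position and orientation are recovered together from one softmax.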

Localization
[Figure panels: Camera reference-frame; World reference-frame]

Mapping
The mapping step updates the map with the local view.
• The local view must be registered to world-space.
• Requires one deconvolution of the position/orientation heatmap, using the rotated local views (filter bank).
• After registration, the local view can be easily integrated into the map (e.g. by linear interpolation, or a convolutional LSTM).
[Diagram: Position and orientation heatmap ∗ Rotated local views (deconvolution) → Registered local view]
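
A sketch of registration as a transposed convolution, followed by the simpler of the two update rules mentioned above (linear interpolation). The function names and the constant blending weight alpha are assumptions made for illustration.

```python
import numpy as np

def register(heatmap, bank, map_shape):
    """Transposed convolution of an (R, H', W') heatmap with an (R, C, s, s) bank.

    Each rotated view is pasted at every offset, weighted by the heatmap.
    """
    registered = np.zeros(map_shape)
    R, _, s, _ = bank.shape
    for r in range(R):
        for i in range(heatmap.shape[1]):
            for j in range(heatmap.shape[2]):
                registered[:, i:i + s, j:j + s] += heatmap[r, i, j] * bank[r]
    return registered

def update_map(map_tensor, registered, alpha=0.5):
    """Linear interpolation between the old map and the registered view."""
    return (1.0 - alpha) * map_tensor + alpha * registered

rng = np.random.default_rng(0)
bank = rng.standard_normal((4, 3, 5, 5))    # 4 orientations, 3 channels, 5x5 views
map_shape = (3, 10, 10)
heatmap = np.zeros((4, 6, 6))
heatmap[1, 2, 3] = 1.0                      # fully confident: orientation 1, offset (2, 3)
registered = register(heatmap, bank, map_shape)
new_map = update_map(np.zeros(map_shape), registered, alpha=0.5)
```

With a one-hot heatmap the registered view is just the chosen rotated view pasted at the chosen offset; with a diffuse heatmap it becomes a weighted blend over all candidate poses.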

Full pipeline
[Diagram: Image → CNN → Ground projection → Local view → Resampler (rotation); rotated local views ⋆ Map → Softmax → Position and orientation heatmap; heatmap ∗ rotated local views → Registered local view; Registered local view and Map → LSTM → Updated map]

Full pipeline
[Same pipeline diagram, annotated: Localization ⇔ convolution (⋆); Mapping ⇔ deconvolution (∗)]

Experiments – 2D data
Toy problem setup:
• 100,000 mazes
• Agent moves at random
• Limited, local visibility (local view)
Training:
• Input sequences of 5 frames
• Position/orientation supervision
• Minimize logistic loss of predicted position (heatmap)
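
One possible reading of the training objective, offered as an assumption rather than the authors' exact loss: treat the heatmap as a softmax over all position/orientation bins and minimize the negative log-probability of the ground-truth bin. The function name position_loss is hypothetical.

```python
import numpy as np

def position_loss(scores, true_index):
    """Negative log-probability of the true bin under a softmax over all bins."""
    shifted = scores - scores.max()                     # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[true_index]

# Uninformative scores give a uniform heatmap over 4 * 6 * 6 = 144 bins.
uniform_scores = np.zeros((4, 6, 6))
uniform_loss = position_loss(uniform_scores, (1, 2, 3))

# Raising the score of the true bin lowers the loss.
peaked_scores = uniform_scores.copy()
peaked_scores[1, 2, 3] = 5.0
peaked_loss = position_loss(peaked_scores, (1, 2, 3))
```

Under this reading, the uniform case costs log(144) and any score mass moved toward the true bin reduces the loss.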

Experiments – 2D data
[Figure panels: Global view; Local view (always facing right); Predicted heatmap (blue – ground truth)]

Experiments – 2D data
[Figure: map tensor (one channel per column) for samples #1 to #4]
⇒ Several local views are integrated into a larger map.

Experiments – 2D data
Is this map semantic? → Yes!
• Assigned class labels to maze cells (corridors, turns, dead-ends...).
• Class label is correctly predicted from a cell’s embedding most of the time.
[Figure: map embedding and class labels (color-coded); balanced dataset prediction accuracy (chance: 50%)]

Experiments – 3D game data
ResearchDoom Dataset
• 4 recorded speed-runs through the whole game
• 6 hours of gameplay
• Challenging, large hand-crafted levels
https://www.youtube.com/watch?v=mInSO7YW1EU

Experiments – 3D real data
Active Vision Dataset
• Robot platform in 19 indoor scenes
• Images collected at all positions/orientations
• Can be composed into unlimited sequences
https://www.youtube.com/watch?v=-MUXfcrxGEM

Experiments – 3D data quantitative results
[Figure: quantitative results on the ResearchDoom Dataset and the Active Vision Dataset]

Conclusions
• We perform SLAM entirely online using an end-to-end learned architecture.
• Localization and Mapping are a dual pair of convolution/deconvolution.
• Semantic embeddings of the World arise from the self-localization objective.
• Next step: navigation and long-term goals.
Project page with code: www.robots.ox.ac.uk/~joao/mapnet
