Lifelong Visual Mapping. Linguang Zhang, Princeton Vision Group.

SLIDE 1

Lifelong Visual Mapping

Linguang Zhang Princeton Vision Group

SLIDE 2

Papers

  • Toward Lifelong Object Segmentation from Change Detection in Dense RGB-D Maps.
  • Towards Semantic KinectFusion.
  • SLAM++: Simultaneous Localisation and Mapping at the Level of Objects.
  • 3D Mapping, Localisation and Object Retrieval Using Low Cost Robotic Platforms: A Robotic Search Engine for the Real-World.

SLIDE 3

Current SLAM Systems

  • NO truly ‘pick up and play’ SLAM systems yet:
  • on low-cost devices.
  • usable by non-expert users.
  • Sparse feature-based SLAM:
  • the world is modeled as an unconnected point cloud.
  • can be improved by keyframe SLAM systems (bundle adjustment).
  • Dense SLAM:
  • requires GPGPU processing hardware.
  • reconstructs and tracks full surface models.
  • KinectFusion (extended with a sliding volume: Kintinuous).
  • A truly scalable, multi-resolution, loop-closure-capable dense non-parametric surface representation has not been developed yet.
  • Wasteful in environments with symmetry.
SLIDE 4

Tools

  • KinectFusion
  • Kintinuous: https://www.youtube.com/watch?v=D3yYjaLmiqU
  • Iterative Closest Point: https://www.youtube.com/watch?v=PCPfuZ7njmQ
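ICP underlies several of the systems in later slides. Below is a dependency-free toy sketch of a single ICP iteration, restricted to translation only so it stays short (a real implementation also solves for rotation, e.g. via an SVD of the cross-covariance); the function name and example values are illustrative, not from any of the papers.

```python
def icp_translation_step(source, target):
    """One ICP iteration (translation only): match each source point to its
    nearest target point, then shift the source by the mean residual."""
    def nearest(p):
        return min(target, key=lambda q: sum((a - b) ** 2 for a, b in zip(p, q)))
    residuals = [[b - a for a, b in zip(p, nearest(p))] for p in source]
    dim = len(source[0])
    shift = [sum(r[i] for r in residuals) / len(source) for i in range(dim)]
    return [[a + s for a, s in zip(p, shift)] for p in source]

# Toy 2D example: the source is a copy of the target shifted by 0.4 in x,
# so a single translation-only step aligns it exactly.
src = [(0.0, 0.0), (1.0, 0.0)]
tgt = [(0.4, 0.0), (1.4, 0.0)]
moved = icp_translation_step(src, tgt)
```

In practice the match-then-update loop repeats until the residual stops shrinking; the videos above show the full rigid-body version.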

SLIDE 5

Lifelong Object Segmentation

  • Input: two RGB-D maps with regions of overlap where the map has changed (obtained from the Kintinuous system).
  • Goals:
  • discover objects by differencing.
  • train segmentation algorithms.
  • if the same object is discovered again, refine the features and the segmentation method.

SLIDE 6

Object Discovery

  • RGB-D SLAM - Kintinuous.
  • Map alignment - manually initialized and refined by ICP.
  • Differencing (symmetric).
  • Filtering:
  • small scattered points - remove clusters smaller than a 3 x 3 x 3 cm cube (estimated from the Kintinuous volumetric resolution).
  • free-space filtering.
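The symmetric differencing step can be sketched as follows: a point counts as changed if the other map has no point within a small radius. The brute-force nearest-neighbor search, function names, and the `radius` default are illustrative assumptions (a real system would use a KD-tree and the volumetric resolution).

```python
import math

def _has_neighbor(p, cloud, radius):
    """True if any point of `cloud` lies within `radius` of point `p`."""
    r2 = radius * radius
    return any((p[0]-q[0])**2 + (p[1]-q[1])**2 + (p[2]-q[2])**2 <= r2
               for q in cloud)

def symmetric_diff(map_a, map_b, radius=0.03):
    """Return (removed, added): points of A missing from B, and vice versa."""
    removed = [p for p in map_a if not _has_neighbor(p, map_b, radius)]
    added = [p for p in map_b if not _has_neighbor(p, map_a, radius)]
    return removed, added

# Toy example: one point moved between the two captures.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0)]
removed, added = symmetric_diff(a, b)
```

The filtering stages then prune `removed`/`added` clusters that are too small or do not protrude into observed free space.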
SLIDE 7

Free-space Filtering

(Figure: the map before and after free-space filtering.)

SLIDE 8

Free-space Filtering

  • Each region of the volume is classified as unseen, occupied, or free space.
  • Only clusters that protrude into the free space are assumed to be objects.
SLIDE 9

Segmentation Methods

  • Graph structure:
  • treat every point in the map as a node.
  • neighboring nodes are defined by the points within a radius r’ (twice the volumetric resolution).
  • connect neighboring nodes by undirected edges with weights (initialized by T).
  • Kill edges with a weight below a dynamic threshold:
  • the threshold (controlled by k) grows after each joining.
  • k is correlated to the segment size.
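One common reading of this dynamic-threshold rule is Felzenszwalb-Huttenlocher-style graph segmentation: process edges in ascending weight order and join two components only while the edge weight stays under a size-dependent threshold tau(C) = k / |C|, which effectively grows after each join. A minimal sketch, with illustrative names, assuming that reading:

```python
def segment(num_nodes, edges, k=1.0):
    """edges: list of (weight, u, v). Returns a component label per node."""
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    thresh = [k] * num_nodes          # tau(C) = k / |C| starts at k / 1

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv and w <= min(thresh[ru], thresh[rv]):
            parent[rv] = ru                  # join the two components
            size[ru] += size[rv]
            thresh[ru] = w + k / size[ru]    # threshold grows after joining
    return [find(i) for i in range(num_nodes)]

# Toy example: two tight pairs joined by one expensive edge stay separate.
labels = segment(4, [(0.1, 0, 1), (0.1, 2, 3), (5.0, 1, 2)], k=1.0)
```

Larger k biases the result toward larger segments, matching the slide's note that k is correlated to segment size.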
SLIDE 10

Edge Weights

  • Color edge weights:
  • Normal edge weights:
  • convex parts more likely correspond to objects.
  • concave parts correspond to object boundaries.
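A hedged sketch of a normal-based edge weight: two oriented points are convexly connected when their normals open away from each other along the displacement between them, i.e. (n1 - n2) . (p2 - p1) <= 0. The exact weight formula below is an assumption for illustration, not the paper's; the point is only that concave connections (likely object boundaries) get penalized.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normal_edge_weight(p1, n1, p2, n2):
    """Low weight on convex connections, high weight on concave ones."""
    d = [b - a for a, b in zip(p1, p2)]
    convexity = dot(n1, d) - dot(n2, d)   # <= 0 when the surface is convex
    return max(0.0, convexity)            # concave edges are penalized

# Two points on the outside of a sphere (convex): zero weight.
w_convex = normal_edge_weight((1, 0, 0), (1, 0, 0), (0, 1, 0), (0, 1, 0))
# The same points on the inside of a bowl (concave): positive weight.
w_concave = normal_edge_weight((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0))
```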

SLIDE 11

Segmentation Fitting

  • Scoring.
  • Optimization.
SLIDE 12

Different Segmentation Methods

(Figure: segmentation results using color, convexity, and surface normal edge weights.)

SLIDE 13

Segmentation Fitting

  • Object representation:
  • based on Principal Component Analysis (PCA).
  • each feature is modeled as a normal distribution, with the mean being the measured value and the variance being …
  • Object matching.
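Under this representation, matching reduces to scoring a newly measured feature value under the stored normal distribution. A minimal sketch; the function name and the example values are illustrative, not from the paper:

```python
import math

def feature_likelihood(x, mean, var):
    """Likelihood of a measured feature value under the stored N(mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# A measurement near the stored mean scores much higher than a distant one.
lik_close = feature_likelihood(1.0, 1.0, 0.1)
lik_far = feature_likelihood(3.0, 1.0, 0.1)
```

An object-level match score would then combine the per-feature likelihoods, e.g. as a product over independent features.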
SLIDE 14

Lifelong Learning

  • Manually group discovered objects together to guarantee the correct convergence of the variance of the features.
  • Update scores and feature distributions:
  • the new score is the weighted average, using the number of times each object was observed as the weights.
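The weighted-average score update can be sketched as follows, assuming each past observation contributes with weight one (so the old score carries its observation count); names are illustrative:

```python
def update_score(old_score, old_count, new_score):
    """Observation-count-weighted average of the old score and a new one."""
    count = old_count + 1
    return (old_score * old_count + new_score) / count, count

# An object seen 4 times with score 0.8 gets a 5th observation scoring 0.3.
score, n = update_score(0.8, 4, 0.3)
# score == (0.8 * 4 + 0.3) / 5 == 0.7
```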

SLIDE 15

Results

(Figure: precision-recall curves for the trash bin and stuffed bunny objects.)

SLIDE 16

Results

SLIDE 17

Semantic KinectFusion

  • Contributions:
  • using a TSDF volume to build a keyframe representation of the environment.
  • Semantic Bundle Adjustment.
SLIDE 18

Keyframe Representation

  • Surface voxels: voxels having a non-truncated function value in the TSDF.
  • Add a new field to each TSDF voxel to keep track of the list of keyframes.
  • Append the keyframe index to the keyframe list when merging a keyframe into the volume.
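A sketch of the augmented voxel record described above: a standard TSDF voxel (signed distance plus integration weight) extended with the list of keyframes that touched it. Field names and the simple running-average integration rule are illustrative assumptions, not the paper's exact scheme.

```python
from dataclasses import dataclass, field

@dataclass
class TSDFVoxel:
    sdf: float = 1.0        # truncated signed distance, 1.0 = uninitialized
    weight: float = 0.0     # integration weight (number of observations here)
    keyframes: list = field(default_factory=list)  # contributing keyframe indices

    def integrate(self, sdf_obs, kf_index, trunc=1.0):
        """Fuse one depth observation and record the keyframe it came from."""
        self.sdf = (self.sdf * self.weight + sdf_obs) / (self.weight + 1)
        self.weight += 1
        if abs(self.sdf) < trunc and kf_index not in self.keyframes:
            self.keyframes.append(kf_index)  # surface voxel: remember keyframe

v = TSDFVoxel()
v.integrate(0.2, kf_index=0)
v.integrate(0.1, kf_index=3)
```

The per-voxel keyframe list is what later lets the matching scheme project a surface voxel back into every keyframe that observed it.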

SLIDE 19

Optimization

  • Optimize the pose graph (g2o optimizer).
  • cost function:
  • Novel matching scheme - for each keyframe:
  • find its corresponding surface voxels by back-projection.
  • get the keyframe list and project the voxel back to all corresponding keyframes.
  • 2D local search to find the nearest 3D point.
  • After all this, recreate the TSDF.
SLIDE 20

Semantic KinectFusion

  • Extract 3D features (in experiments: SIFT3D, Color-SHOT).
  • “A Representation for 3-D Surface Matching”
  • “A combined texture-shape descriptor for enhanced 3D feature matching”
  • Generate a set of candidate hypotheses on objects’ presence:
  • RANSAC-based 6DOF pose estimation.
  • Validation graph:
  • edges:
  • virtual edges:
  • Clear wrong hypotheses and include all the constraints in the global graph.
SLIDE 21

System Overview

SLIDE 22

Results

SLIDE 23

Results

SLIDE 24

Results

SLIDE 25

Object-oriented SLAM - SLAM++

  • Stronger assumptions:
  • World has intrinsic symmetry - repetitive objects.
  • Pre-define the objects.
  • Video: https://www.youtube.com/watch?v=tmrAh1CqCRo
SLIDE 26

Characteristics of Object SLAM

  • The map only contains a few discrete entities.
  • Enables jointly optimizing over all object positions to make a globally consistent map.
  • Tracking one object in 6DOF is enough to localize a camera.
  • Lost-camera recovery or loop closure detection can be performed based on a small number of object measurements.

SLIDE 27

SLAM Map Representation

  • Graph representation:
  • object - world pose
  • object - camera pose
  • camera - world pose
  • additional factors:
  • camera - camera pose (ICP).
  • structural priors: objects must be grounded on the same plane.

SLIDE 28

Real-Time Object Recognition

  • Point-Pair Features (PPFs):
  • 4D descriptors of the relative position and normals of pairs of oriented points.
  • randomly sample points and pair them up in all possible combinations to generate PPFs.
  • match against models, producing a vote for each match.
  • Active Object Search:
  • generate a mask in image space from view prediction (to avoid occlusion).
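The classic 4D point-pair feature for two oriented points is F = (|d|, angle(n1, d), angle(n2, d), angle(n1, n2)) with d = p2 - p1. A minimal pure-Python sketch of that descriptor (real systems quantize these four values into hash-table bins for the voting step):

```python
import math

def _angle(a, b):
    """Angle between two 3D vectors, clamped for numerical safety."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    c = sum(x * y for x, y in zip(a, b)) / (na * nb)
    return math.acos(max(-1.0, min(1.0, c)))

def ppf(p1, n1, p2, n2):
    """4D point-pair feature: (|d|, angle(n1,d), angle(n2,d), angle(n1,n2))."""
    d = [b - a for a, b in zip(p1, p2)]
    return (math.sqrt(sum(x * x for x in d)),
            _angle(n1, d), _angle(n2, d), _angle(n1, n2))

# Two points 1 unit apart with parallel normals perpendicular to the line
# between them: distance 1.0, two right angles, and a zero normal angle.
f = ppf((0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1))
```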
SLIDE 29

Active Object Search

SLIDE 30

Camera Tracking and Object Pose Estimation

  • Camera model tracking:
  • in KinectFusion, track against incomplete models at the early stage.
  • for SLAM++, track against high-quality models.
  • Tracking for model initialization (reject incorrect objects):
  • given a candidate object and a detected pose, run camera-model ICP estimation on the detected object pose.
  • Camera-object pose constraints:
  • run a dense ICP estimate between the live frame and each visible model object.

SLIDE 31

Relocalization

SLIDE 32

Loop Closure

  • Small loops:
  • standard ICP tracking mechanism.
  • Large loops:
  • matching fragments within the main long-term graph (same as relocalization).

SLIDE 33

Loop Closure

SLIDE 34

Real-world Example

Whelan, Thomas, et al. "3D mapping, localisation and object retrieval using low cost robotic platforms: A robotic search engine for the real-world." https://www.youtube.com/watch?v=XqDUniEY954

SLIDE 35

System Overview

SLIDE 36

Steps

  • Avoidance-based exploration.
  • Dense SLAM.
  • Planar simplification.
  • Object detection.
  • Path planning.
  • Onboard localization and control.
SLIDE 37

Problem: compounding of failure rates

  • Frame-drops in the wireless streaming of the raw RGB-D image sequence.
  • Failure of the planar segmentation algorithm.
  • Failure of object segmentation -> recognition due to noise.
  • Conclusion: the techniques involved are quite robust on their own, but their combined reliability is not enough when they are chained into a complete framework.