SLAM CAD&CG SLAM: - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Visual SLAM

Guofeng Zhang, State Key Laboratory of CAD&CG, Zhejiang University

slide-2
SLIDE 2

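SLAM: Simultaneous Localization and Mapping

  • A fundamental problem in robotics and computer vision
  • Localize the device in an unknown environment while simultaneously building a 3D map of that environment
  • A wide range of applications
      • Augmented reality, virtual reality
      • Robotics, autonomous driving, aerospace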

slide-3
SLIDE 3

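Common sensors for SLAM

Infrared sensors: short-range sensing, commonly used in robot vacuum cleaners.

Lidar and depth sensors.

Cameras: monocular, stereo, and multi-camera.

Inertial sensors (IMU, comprising a gyroscope and an accelerometer): standard on smartphones.

Figure labels: common monocular camera; lidar; an ordinary phone camera can also serve as a sensor; stereo camera; Microsoft Kinect color-depth (RGB-D) sensor; inertial sensor (IMU) in a phone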

slide-4
SLIDE 4

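What a SLAM system produces

  • Using the sensor data, the device
      • computes its own pose (position and orientation in space)
      • builds a map of the environment (a sparse or dense 3D point cloud)

Figure labels: sparse SLAM, dense SLAM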

slide-5
SLIDE 5

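Common framework of a SLAM system

Input

  • Sensor data

Foreground thread

  • Tracks from the sensor data, recovering the pose at every time instant in real time

Background thread

  • Local or global optimization to reduce accumulated error
  • Loop detection

Output

  • Real-time device pose
  • 3D point cloud

Figure labels: RGB image, depth image, IMU measurements; optimization to reduce accumulated error; loop detection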
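The foreground/background split on this slide can be sketched with a queue between two threads. This is only an illustration under stated assumptions: `track`, `optimize`, the scalar "pose", and the every-third-frame keyframe rule are hypothetical stand-ins, not part of any system named in the deck.

```python
import queue
import threading


def track(prev_pose, measurement):
    # Hypothetical tracker: the "pose" is a scalar advanced by the measured motion.
    return prev_pose + measurement


def optimize(keyframes):
    # Hypothetical back end: a real system would run local/global optimization
    # and loop detection here. This stub only reports the map size.
    return len(keyframes)


def run_slam(measurements, keyframe_every=3):
    keyframe_queue = queue.Queue()
    keyframes = []       # map state, written only by the background thread
    optimizations = []   # one entry per background optimization pass

    def background():
        while True:
            kf = keyframe_queue.get()
            if kf is None:   # sentinel: the foreground thread has finished
                break
            keyframes.append(kf)
            optimizations.append(optimize(keyframes))

    worker = threading.Thread(target=background)
    worker.start()

    poses = []
    pose = 0.0
    for i, m in enumerate(measurements):
        pose = track(pose, m)            # foreground: a pose for every frame
        poses.append(pose)
        if i % keyframe_every == 0:      # hand selected frames to the back end
            keyframe_queue.put((i, pose))

    keyframe_queue.put(None)
    worker.join()
    return poses, keyframes, optimizations
```

The foreground loop never waits for the optimizer, which is what lets tracking stay real-time while the slower optimization runs behind it.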

slide-6
SLIDE 6

Related Work

  • Filter-based SLAM
      • Davison et al. 2007 (MonoSLAM), Eade and Drummond 2006, Mourikis et al. 2007 (MSCKF), …
  • Keyframe-based SLAM
      • Klein and Murray 2007, 2008 (PTAM), Castle et al. 2008, Tan et al. 2013 (RDSLAM), Mur-Artal et al. 2015 (ORB-SLAM), Liu et al. 2016 (RKSLAM), …
  • Direct Tracking based SLAM
      • Engel et al. 2014 (LSD-SLAM), Forster et al. 2014 (SVO), Engel et al. 2018 (DSO)

slide-7
SLIDE 7

Extended Kalman Filter

  • State at time k, modeled as a multivariate Gaussian with mean $\hat{x}_k$ and covariance $P_k$

$$x_k \sim N(\hat{x}_k, P_k)$$

  • State transition model, with process noise $w_k$

$$x_k = f(x_{k-1}) + w_k, \qquad w_k \sim N(0, Q_k)$$

  • State observation model, with observation noise $v_k$

$$z_k = h(x_k) + v_k, \qquad v_k \sim N(0, R_k)$$

slide-8
SLIDE 8

Extended Kalman Filter

  • Predict

$$\hat{x}_{k|k-1} = f(\hat{x}_{k-1|k-1})$$

$$P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q_k, \qquad F_k = \left. \frac{\partial f}{\partial x} \right|_{\hat{x}_{k-1|k-1}}$$

  • Update

$$S_k = H_k P_{k|k-1} H_k^T + R_k \quad \text{(innovation covariance)}$$

$$K_k = P_{k|k-1} H_k^T S_k^{-1}$$

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \left( z_k - h(\hat{x}_{k|k-1}) \right)$$

$$P_{k|k} = (I - K_k H_k) P_{k|k-1}, \qquad H_k = \left. \frac{\partial h}{\partial x} \right|_{\hat{x}_{k|k-1}}$$
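The predict and update steps can be tried out on a one-dimensional state, where every matrix collapses to a scalar. This is a minimal sketch under that simplifying assumption; the function names and the toy models in the usage note are illustrative, not taken from the slides.

```python
def ekf_predict(x, P, f, F, Q):
    """Predict: propagate the mean through f and the variance through the Jacobian F."""
    Fk = F(x)                  # Jacobian of f at the previous estimate
    x_pred = f(x)
    P_pred = Fk * P * Fk + Q   # scalar form of F P F^T + Q
    return x_pred, P_pred


def ekf_update(x_pred, P_pred, z, h, H, R):
    """Update: correct the prediction with measurement z."""
    Hk = H(x_pred)                          # Jacobian of h at the predicted state
    S = Hk * P_pred * Hk + R                # innovation covariance
    K = P_pred * Hk / S                     # Kalman gain
    x_new = x_pred + K * (z - h(x_pred))    # corrected mean
    P_new = (1.0 - K * Hk) * P_pred         # corrected variance
    return x_new, P_new, S
```

For example, with a static state (`f(x) = x`) observed through `h(x) = x**2`, one predict/update cycle pulls the estimate toward the value implied by the measurement and shrinks the variance. A full SLAM filter uses the same two steps with vectors and matrices in place of the scalars.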

slide-9
SLIDE 9

MonoSLAM

  • A. J. Davison, N. D. Molton, I. Reid, and O. Stasse. MonoSLAM: Real-time single camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 29(6):1052-1067, 2007.

  • Map representation: the state stacks the camera state $C$ and the point states $X_i$

$$x = \begin{pmatrix} C \\ X \end{pmatrix} = \begin{pmatrix} C \\ X_1 \\ X_2 \\ \vdots \end{pmatrix}, \qquad P = \begin{pmatrix} P_{CC} & P_{CX_1} & P_{CX_2} & \cdots \\ P_{X_1 C} & P_{X_1 X_1} & P_{X_1 X_2} & \cdots \\ P_{X_2 C} & P_{X_2 X_1} & P_{X_2 X_2} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

slide-10
SLIDE 10

MonoSLAM

  • Camera state

$$C_k = \begin{pmatrix} p_k \\ q_k \\ v_k \\ \omega_k \end{pmatrix}$$

with camera position $p_k$, orientation quaternion $q_k$, linear velocity $v_k$, and angular velocity $\omega_k$.

slide-11
SLIDE 11

MonoSLAM

  • Predict

$$C_k = \begin{pmatrix} p_k \\ q_k \\ v_k \\ \omega_k \end{pmatrix} = \begin{pmatrix} p_{k-1} + (v_{k-1} + a_k)\,\Delta t \\ q((\omega_{k-1} + \alpha_k)\,\Delta t) \otimes q_{k-1} \\ v_{k-1} + a_k \\ \omega_{k-1} + \alpha_k \end{pmatrix}, \qquad X_k = X_{k-1}$$

$$w_k = \begin{pmatrix} a_k \\ \alpha_k \end{pmatrix} \sim N(0, Q_k), \qquad Q_k = \mathrm{diag}(Q_a, Q_\alpha)$$

with linear acceleration $a_k$ and angular acceleration $\alpha_k$.

slide-12
SLIDE 12

MonoSLAM

  • Predicted feature position

$$z_i = h(X_i, C) + v_i, \qquad v_i \sim N(0, R)$$

  • Innovation covariance

$$S_i = J_C P_{CC} J_C^T + J_C P_{C X_i} J_{X_i}^T + J_{X_i} P_{X_i C} J_C^T + J_{X_i} P_{X_i X_i} J_{X_i}^T + R, \qquad J_C = \frac{\partial z_i}{\partial C}, \quad J_{X_i} = \frac{\partial z_i}{\partial X_i}$$

  • Elliptical feature search region

slide-13
SLIDE 13

MonoSLAM

 Active search

Shi and Tomasi Feature Elliptical search region

slide-14
SLIDE 14

MonoSLAM

  • Complexity: $O(N^3)$ per frame
  • Scalability: hundreds of points
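A little arithmetic shows why a filter that keeps every point in one state vector stops at hundreds of points. The sketch below assumes each map point contributes 3 state entries (an assumption, since the slides do not give the point parameterization) and takes the camera state as position, quaternion, linear velocity, and angular velocity.

```python
# Camera state: position (3) + orientation quaternion (4)
# + linear velocity (3) + angular velocity (3).
CAMERA_STATE_DIM = 3 + 4 + 3 + 3
POINT_DIM = 3  # assumption: each map point is stored as a 3D position


def state_dim(num_points):
    """Length of the stacked state vector (camera plus all points)."""
    return CAMERA_STATE_DIM + POINT_DIM * num_points


def covariance_entries(num_points):
    """Number of entries in the full covariance matrix P."""
    n = state_dim(num_points)
    return n * n


def relative_cost(num_points, baseline_points):
    """Ratio of per-frame costs under the O(N^3) figure quoted on the slide."""
    return (num_points / baseline_points) ** 3
```

With 100 points the state has 313 entries and the covariance has 97,969; doubling the number of points multiplies the cubic per-frame cost by 8.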

slide-15
SLIDE 15

PTAM: Parallel Tracking and Mapping

 Map representation

  • G. Klein and D. W. Murray. Parallel Tracking and Mapping for Small AR Workspaces. In Proceedings of the International Symposium on Mixed and Augmented Reality (ISMAR), 2007.

slide-16
SLIDE 16

PTAM: Parallel Tracking and Mapping

 Overview

Flowchart (labels only):

  • Foreground Thread: Feature Extraction, Feature Tracking, Camera Pose Estimation, New Keyframe? (yes)
  • Map: 3D Points, Keyframes
  • Background Thread: Add New 3D Points, Bundle Adjustment

slide-17
SLIDE 17

Keyframe-based SLAM vs Filtering-based SLAM

  • Advantages
      • Accuracy
      • Efficiency
      • Scalability
  • Disadvantages
      • Sensitive to strong rotation
  • Challenges for both
      • Fast motion
      • Motion blur
      • Insufficient texture

  • H. Strasdat, J. Montiel, and A. J. Davison. Visual SLAM: Why filter? Image and Vision Computing, 30:65-77, 2012.

slide-18
SLIDE 18

ORB-SLAM

Raul Mur-Artal, J. M. M. Montiel, Juan D. Tardós: ORB-SLAM: A Versatile and Accurate Monocular SLAM System. IEEE Trans. Robotics 31(5): 1147-1163 (2015).

slide-19
SLIDE 19

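ORB-SLAM: A Versatile and Accurate Monocular SLAM System

  • Largely keeps the algorithmic framework of PTAM, but improves most of its components:
      • Uses ORB features, which give better matching and relocalization performance.
      • Adds loop detection and loop closing to eliminate accumulated error.
      • Automatically selects the two initialization frames by checking parallax.
      • Uses a more robust selection scheme for keyframes and 3D points.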

slide-20
SLIDE 20

Direct Tracking

Thomas Schöps, Jakob Engel, Daniel Cremers: Semi-dense visual odometry for AR on a smartphone. ISMAR 2014: 145-150.

slide-21
SLIDE 21

Direct Tracking

  • Goal: estimate the camera motion $\xi$ by aligning intensity images $I_1$ and $I_2$, with the depth map $Z_1$ of $I_1$
  • Assumption

$$I_1(x) = I_2(w(\xi, x, Z_1(x)))$$

where the warping function $w$ maps a pixel from $I_1$ to $I_2$.

slide-22
SLIDE 22

Direct Tracking

Christian Kerl, Jürgen Sturm, Daniel Cremers: Robust odometry estimation for RGB-D cameras. ICRA 2013: 3748-3754

  • Warping function $w(\xi, x, Z_1(x))$: back-project the pixel $x = (u, v)^T$ using its depth $Z_1(x)$

$$p = \pi^{-1}(x, Z_1(x)) = \pi^{-1}((u, v)^T, Z_1(x)) = Z_1(x) \left( \frac{u - c_x}{f_x}, \; \frac{v - c_y}{f_y}, \; 1 \right)^T$$

slide-23
SLIDE 23

Direct Tracking

Christian Kerl, Jürgen Sturm, Daniel Cremers: Robust odometry estimation for RGB-D cameras. ICRA 2013: 3748-3754

  • Warping function $w(\xi, x, Z_1(x))$: transform the 3D point $p$ and project it

$$T(\xi, p) = R p + t$$

$$\pi(T(\xi, p)) = \pi((X, Y, Z)^T) = \left( \frac{f_x X}{Z} + c_x, \; \frac{f_y Y}{Z} + c_y \right)^T$$

slide-24
SLIDE 24

Direct Tracking

Christian Kerl, Jürgen Sturm, Daniel Cremers: Robust odometry estimation for RGB-D cameras. ICRA 2013: 3748-3754

  • Warping function

$$w(\xi, x, Z_1(x)) = \pi(T(\xi, p)) = \pi(T(\xi, \pi^{-1}(x, Z_1(x))))$$
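The three stages of the warping function (back-project with the depth, apply the rigid motion, project with the pinhole model) can be chained in a few lines. This is a minimal sketch assuming a pinhole camera with intrinsics `fx, fy, cx, cy`; for simplicity the motion is passed directly as a rotation matrix `R` and a translation `t` rather than as a twist, and all function names are illustrative.

```python
def back_project(u, v, z, fx, fy, cx, cy):
    """Pixel (u, v) with depth z -> 3D point in the first camera frame."""
    return (z * (u - cx) / fx, z * (v - cy) / fy, z)


def rigid_transform(R, t, p):
    """Apply p' = R p + t, with R a 3x3 nested list and t a 3-vector."""
    return tuple(sum(R[r][c] * p[c] for c in range(3)) + t[r] for r in range(3))


def project(p, fx, fy, cx, cy):
    """3D point (X, Y, Z) -> pixel coordinates under the pinhole model."""
    X, Y, Z = p
    return (fx * X / Z + cx, fy * Y / Z + cy)


def warp(R, t, u, v, z, fx, fy, cx, cy):
    """Map a pixel of image 1 (with known depth) to its position in image 2."""
    p = back_project(u, v, z, fx, fy, cx, cy)
    return project(rigid_transform(R, t, p), fx, fy, cx, cy)
```

With the identity motion a pixel maps to itself, which is a quick sanity check; a direct tracker then searches for the motion that makes the intensities at the warped positions agree with the first image.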

slide-25
SLIDE 25

Direct Tracking

  • Residual of the k-th pixel

$$r_k(\xi) = I_2(w(\xi, x_k, Z_1(x_k))) - I_1(x_k)$$

  • A posteriori likelihood

$$p(\xi \mid r) = \frac{p(r \mid \xi)\, p(\xi)}{p(r)} = \frac{\prod_k p(r_k \mid \xi)\, p(\xi)}{p(r)}$$

slide-26
SLIDE 26

Semi-Dense Visual Odometry

Jakob Engel, Jürgen Sturm, Daniel Cremers: Semi-dense Visual Odometry for a Monocular Camera. ICCV 2013: 1449-1456

slide-27
SLIDE 27

Semi-Dense Visual Odometry

  • Keyframe representation

$$K_i = (I_i, D_i, V_i)$$

  • Image intensity: $i_i = I_i(x)$
  • Inverse depth: $d_i = D_i(x)$
  • Inverse depth variance: $\sigma_{d_i}^2 = V_i(x)$

slide-28
SLIDE 28

Semi-Dense Visual Odometry

 Overview

slide-29
SLIDE 29

LSD-SLAM

Jakob Engel, Thomas Schöps, Daniel Cremers: LSD-SLAM: Large-Scale Direct Monocular SLAM. ECCV (2) 2014: 834-849.

Figure labels: After loop closure, Before loop closure

slide-30
SLIDE 30

LSD-SLAM

  • Map representation
      • Pose graph of keyframes
      • Node: keyframe $K_i = (I_i, D_i, V_i)$
      • Edge: similarity transformation $\xi_{ji} \in \mathrm{sim}(3)$

slide-31
SLIDE 31

LSD-SLAM

 Overview

slide-32
SLIDE 32

LSD-SLAM

  • Direct sim(3) image alignment

$$\xi_{ji}^{*} = \arg\min_{\xi_{ji}} \sum_{p} \left[ \frac{r_p^2(p, \xi_{ji})}{\sigma^2_{r_p(p, \xi_{ji})}} + \frac{r_d^2(p, \xi_{ji})}{\sigma^2_{r_d(p, \xi_{ji})}} \right]$$

[The slide also defines the photometric residual $r_p$, the depth residual $r_d$, and their variances in terms of the inverse depth; those definitions are not recoverable from the extracted text.]

slide-33
SLIDE 33

LSD-SLAM

 Pose graph optimization

 Energy function:

  • Kümmerle, R., Grisetti, G., Strasdat, H., Konolige, K., Burgard, W.: g2o: A general framework for graph optimization. In: Intl. Conf. on Robotics and Automation (ICRA) (2011)
slide-36
SLIDE 36

Key Issues for SLAM in Dynamic Environments

  • Gradually changing
  • Object occlusion
      • Viewpoint change, dynamic objects
  • Very low inlier ratio

slide-37
SLIDE 37

RDSLAM Framework

slide-40
SLIDE 40

Online 3D Points and Keyframes Updating

  • Keyframe representation
  • 3D change detection
      • Select the 5 closest keyframes for the online image.
      • For each valid feature point x in each selected keyframe:
          • Compute its projection x' in the current frame.
          • If […], compute the appearance difference.
          • If […], then find a set of feature points y close to x'. Since dynamic points cannot be triangulated, the occlusion caused by dynamic objects can be excluded here.
          • If […] or their depths are very close, set V(X)=0. The occlusions caused by static objects are also excluded.

slide-41
SLIDE 41

Occlusion Handling

slide-42
SLIDE 42

Occlusion Handling

(a) The SLAM result without occlusion handling. (b) The SLAM result with occlusion handling.

slide-43
SLIDE 43

Random Sample Consensus (RANSAC) [Fischler and Bolles, 1981]

Objective: robust fit of a model to a data set S which contains outliers.

Step 1. Compute a set of potential matches.
Step 2. While T(#inliers, #samples) < 95%, do:
    Step 2.1 Select a minimal sample (6 matches).
    Step 2.2 Compute solutions for P.
    Step 2.3 Determine inliers.
Step 3. Refine P based on all inliers.
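The loop in Step 2 can be sketched on a simpler model than the camera matrix P: here a 2D line is fitted, so the minimal sample is 2 points instead of 6 matches. The stopping test T is assumed to be the usual confidence that at least one outlier-free sample has been drawn, 1 - (1 - w^s)^n for inlier ratio w, sample size s, and n samples; the slide does not spell T out. The final refinement (Step 3) is left out to keep the sketch short.

```python
import random


def fit_line(p, q):
    """Line a*x + b*y + c = 0 through two points, with (a, b) of unit length."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0.0:
        return None  # degenerate sample: both points coincide
    return a / norm, b / norm, -(a * x1 + b * y1) / norm


def inliers_of(line, points, tol):
    """Points whose distance to the line is at most tol."""
    a, b, c = line
    return [pt for pt in points if abs(a * pt[0] + b * pt[1] + c) <= tol]


def confidence(num_inliers, num_points, sample_size, num_samples):
    """Assumed form of T: probability that some sample so far was outlier-free."""
    w = num_inliers / num_points
    return 1.0 - (1.0 - w ** sample_size) ** num_samples


def ransac_line(points, tol=0.1, target=0.95, max_samples=1000, seed=0):
    rng = random.Random(seed)
    best_line, best_inliers, n = None, [], 0
    while n < max_samples and confidence(len(best_inliers), len(points), 2, n) < target:
        n += 1
        line = fit_line(*rng.sample(points, 2))   # step 2.1 and 2.2
        if line is None:
            continue
        found = inliers_of(line, points, tol)     # step 2.3
        if len(found) > len(best_inliers):
            best_line, best_inliers = line, found
    return best_line, best_inliers, n
```

Because the confidence depends on the best inlier ratio found so far, the loop stops after a handful of samples when inliers dominate and keeps sampling much longer when they are scarce, which is the situation the dynamic-scene slides are concerned with.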

slide-44
SLIDE 44

Prior-based Adaptive RANSAC

  • Sample generation
      • 10x10 bins
      • Prior probability of bin $i$: $p_i = \varepsilon_i^{*} / \sum_j \varepsilon_j^{*}$
  • Hypothesis evaluation
      • Inlier number $N_i$
      • Inlier distribution, i.e., distribution ellipse $C_i$
      • $s_i = \pi \sqrt{\det(C_i)} / A$
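The two ingredients named on this slide, prior-weighted sampling over image bins and an inlier-distribution measure, can be illustrated as follows. This is a sketch under explicit assumptions: the bin prior is taken to be each bin's score divided by the sum over bins, and the distribution measure is taken to be the area of the covariance ellipse of the inlier positions relative to the image area, pi * sqrt(det(C)) / A, using the sample covariance. All function names are hypothetical, and how the system combines the measure with the inlier count is not shown.

```python
import math
import random


def bin_priors(bin_scores):
    """Normalize per-bin scores (e.g. past inlier ratios) into probabilities."""
    total = sum(bin_scores)
    return [s / total for s in bin_scores]


def sample_bins(priors, k, rng):
    """Draw k bin indices, favouring bins with a higher prior."""
    return rng.choices(range(len(priors)), weights=priors, k=k)


def ellipse_coverage(points, image_area):
    """Assumed measure: pi * sqrt(det(C)) / A for the 2D covariance C of the points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    return math.pi * math.sqrt(max(det, 0.0)) / image_area
```

A set of inliers spread across the image gets a larger coverage value than the same number of inliers clustered on one object, which is the intuition behind evaluating a hypothesis by more than its inlier count.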

slide-46
SLIDE 46

Prior-based Adaptive RANSAC

 Hypothesis evaluation

200 green points on the static background, 300 cyan points on the rigidly moving object, and 500 red points moving randomly.

$$s_i = \pi \sqrt{\det(C_i)} / A$$

=24.94

S1 = 8.31 > S2 = 1.98 =21.77

slide-47
SLIDE 47

Result Comparison

(a) The SLAM result with standard RANSAC (b) The SLAM result with our PARSAC

slide-48
SLIDE 48

Results and Comparison

slide-49
SLIDE 49

Results and Comparison

slide-50
SLIDE 50

http://www.zjucvg.net/rdslam/rdslam.html

slide-51
SLIDE 51

Visual-Inertial SLAM

  • Use IMU data to improve robustness
      • Filtering-based methods: MSCKF, SLAM in Project Tango, …
      • Non-linear optimization based methods: OKVIS, …
  • Can it work without real IMU data?

slide-52
SLIDE 52

RKSLAM Framework

  • Multi-homography based tracking
      • Global homography
      • Specific homography
      • Local homographies
  • Sliding-window based pose optimization
      • Use global image alignment to estimate rotational velocity
      • Pose optimization with simulated IMU data

slide-53
SLIDE 53

Multi-Homography based Tracking

  • Global Homography Estimation
      • Combine the alignment between the keyframe and the previous frame with the transformation between the previous frame and the current frame
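One simple reading of "combining" the two alignments is to chain them: a homography from the keyframe to the previous frame followed by one from the previous frame to the current frame gives a keyframe-to-current homography. The sketch below shows only that composition, as an assumption about the idea on the slide; it is not the system's actual estimation procedure, and the function names are illustrative.

```python
def matmul3(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply_homography(H, x, y):
    """Map pixel (x, y) through H using homogeneous coordinates."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


def compose_global_homography(H_key_to_prev, H_prev_to_cur):
    """Keyframe -> current frame: apply key->prev first, then prev->cur."""
    return matmul3(H_prev_to_cur, H_key_to_prev)
```

The order of the product matters: the transformation applied first sits on the right. In practice such a chained estimate would typically serve as a starting point for further alignment against the current image.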

slide-54
SLIDE 54

Multi-Homography based Tracking

  • Specific Homography Estimation
      • For a 3D plane Pj visible in keyframe Fk, its homography from Fk to Ii can be derived as […]

slide-55
SLIDE 55

Multi-Homography based Tracking

  • Local Homography Estimation
      • Same as in the ENFT algorithm
      • Use the inlier matches to estimate a set of local homographies
  • Matching with Multi-Homography
      • Provides better initial positions
      • Alleviates patch distortion
      • Robust to fast motion

slide-56
SLIDE 56

Sliding-Window based Pose Optimization

  • Assume IMU data is available
  • Set […] and estimate […] by […]

slide-57
SLIDE 57

Sliding-Window based Optimization Comparison

slide-58
SLIDE 58

Results and Comparisons

slide-59
SLIDE 59

Quantitative Evaluation with TUM RGB-D Dataset

  • Group A: simple translation
  • Group B: sequences with loops
  • Group C: slow and nearly pure rotation
  • Group D: fast motion with strong rotation

From left to right: RMSE (cm) of keyframes, the starting ratio (i.e., the initialization frame index divided by the total number of frames), and the tracking success ratio after initialization.
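The three reported quantities are simple to compute once a run has been logged. The sketch below assumes the estimated keyframe positions are already aligned to the ground truth and expressed in the same unit, and that tracking success is logged as one boolean per frame; both are assumptions about the evaluation setup, and the function names are illustrative.

```python
import math


def rmse(estimated, ground_truth):
    """Root-mean-square position error over paired keyframe positions."""
    squared = [sum((e - g) ** 2 for e, g in zip(p, q))
               for p, q in zip(estimated, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))


def starting_ratio(init_frame_index, total_frames):
    """Initialization frame index divided by the total number of frames."""
    return init_frame_index / total_frames


def tracking_success_ratio(tracked_flags, init_frame_index):
    """Fraction of frames tracked successfully from initialization onward."""
    after_init = tracked_flags[init_frame_index:]
    return sum(after_init) / len(after_init)
```

A lower starting ratio means the system initialized earlier in the sequence, so the three numbers together describe accuracy, how quickly tracking started, and how reliably it continued.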

slide-60
SLIDE 60

Timing

  • Computation time on a desktop PC
  • For a mobile device: 20~50 fps on an iPhone 6

slide-61
SLIDE 61

Comparison of monocular V-SLAM systems

slide-62
SLIDE 62

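Trends in Visual SLAM (1)

  • Reducing the dependence on features
      • Edge-based tracking
      • Direct image tracking or semi-dense tracking
      • Combining machine learning with prior/semantic information
  • Dense 3D reconstruction
      • Real-time monocular/multi-camera 3D reconstruction
      • Real-time 3D reconstruction with depth cameras
      • Planar representation and adaptive model simplification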

slide-63
SLIDE 63

Trends in Visual SLAM (2)

  • Multi-sensor fusion
      • Combining IMU, GPS, depth cameras, optical flow sensors, and odometers

slide-64
SLIDE 64

Our SLAM systems

  • RDSLAM
      • http://www.zjucvg.net/rdslam/rdslam.html
  • RKSLAM
      • http://www.zjucvg.net/rkslam/rkslam.html
  • More systems will be released in the future
      • http://www.zjucvg.net

slide-65
SLIDE 65

Recommended open-source systems

 PTAM

 https://github.com/Oxford-PTAM/PTAM-GPL

 ORB-SLAM

 https://github.com/raulmur/ORB_SLAM

 LSD-SLAM

 https://github.com/tum-vision/lsd_slam

 DSO

 https://github.com/JakobEngel/dso

 SVO

 https://github.com/uzh-rpg/rpg_svo

slide-66
SLIDE 66

Thank you!