StructVIO: Visual-Inertial Odometry with Structural Regularity of Man-Made Environments
Danping Zou
VALSE online seminar, 2019.7.10
▪ Visual SLAM techniques have been widely applied to unmanned vehicles.
Visual SLAM
Fisheye camera, Stereo camera, Stereo camera
▪ Augmented reality (AR) (HoloLens glasses, Project Tango tablet)
Visual SLAM
HoloLens uses four cameras for visual SLAM. Tango uses one fisheye camera for visual SLAM.
▪ Operating systems on cellphones
▪ Google and Apple integrate visual SLAM into their OS (iOS, Android).
Visual SLAM
▪ A lot of algorithms have been proposed for visual SLAM in the past 15 years.
▪ MonoSLAM (2003), StructSLAM (2014)
▪ PTAM (2007), ORB-SLAM (2015)
▪ SVO (2014), LSD-SLAM (2014), DSO (2016)
▪ Pure visual SLAM systems are not robust in practical applications.
▪ Visual-inertial systems have become predominant for real applications.
▪ MSCKF (2007), ROVIO (2009)
▪ OKVIS (2015), VINS (2017), ICE-BA (2018)
Visual SLAM
▪ Most visual SLAM or visual-inertial systems choose points as the landmarks.
Features in v/vi-SLAM systems
▪ Man-made environments exhibit strong geometric regularity.
Features in v/vi-SLAM systems
Natural scenes, Street, Indoor, Underground parking
Structural regularity - Manhattan world
1. Rich in line features
2. Three known directions (x, y, z)
▪ StructSLAM (presented at the VALSE online seminar, 30 Mar 2016)
▪ Point + structural lines (lines aligned with x, y, z directions)
▪ The direction of lines improves the observability of camera orientation
Visual SLAM with Manhattan world model
Zhou, Huizhong, Danping Zou, et al. "StructSLAM: Visual SLAM with building structure lines." Vehicular Technology, IEEE Transactions on 64.4 (2015): 1364-1375. - Special session for indoor localization
▪ A lot of man-made environments cannot be well described by the Manhattan world model.
▪ Oblique/curvy structures.
The real world is full of diversity
▪ A novel visual-inertial odometry method is presented
▪ Uses the Atlanta world model to better describe irregular scenes.
▪ Makes several improvements to the existing VIO approach.
▪ A VIO dataset that can be used to evaluate different methods.
StructVIO
Zou, Danping, et al. “StructVIO: Visual-inertial Odometry with Structural Regularity of Man-made Environments.” IEEE Trans. on Robotics, 2019
Executable, tools & dataset: http://drone.sjtu.edu.cn/dpzou/project/structvio.html
▪ We can approximate an irregular world by a group of local Manhattan worlds.
▪ Each one of them can be represented by a heading direction: $\phi$.
Key idea – Atlanta world model
One Manhattan world, Two Manhattan worlds, Three Manhattan worlds
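The idea of describing each local Manhattan world by a single heading can be sketched as follows. This is only an illustration, assuming the world Z axis is aligned with gravity and the heading is a rotation about that axis; the function name `manhattan_axes` and the example headings are hypothetical.

```python
import numpy as np

def manhattan_axes(phi):
    """Columns are the X, Y, Z axes of a local Manhattan world with
    heading `phi` (radians), expressed in a gravity-aligned world frame.
    Assumption: the heading is a pure rotation about the world Z axis."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

# An Atlanta world sketched as a list of headings (values are made up).
atlanta_world = [0.0, np.deg2rad(30.0), np.deg2rad(55.0)]
axes = [manhattan_axes(phi) for phi in atlanta_world]
```

All local worlds share the same Z axis, so only one scalar per world is needed to describe its horizontal directions.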
▪ Locally, the world is a Manhattan world. We can still use
▪ Three directions
▪ Structural line features
▪ to improve the performance of the VIO system.
Key idea – Atlanta world model
Three directions
X, Y directions: render the yaw angle observable (locally)
Z direction: renders the gravity direction observable
Line features
A good complement to point features in texture-less scenes.
▪ We adopt the multi-state EKF filter based framework.
▪ Compared with the classic EKF filter
▪ Much faster since the features are not included in the state vector.
▪ Compared with key-frame optimization
▪ Short feature trajectories are fully exploited.
▪ State update using a single feature trajectory.
▪ Efficient but without losing much accuracy.
The framework of StructVIO
Classic EKF filter, Key-frame optimization, Multi-state EKF filter
▪ The pipeline of StructVIO is as follows:
The framework of StructVIO
▪ The state vector consists of the current IMU state, historical IMU poses, calibration parameters, and the heading directions of local Manhattan worlds
State definition of StructVIO
Current IMU state, Camera-IMU calibration, Manhattan worlds, Historical IMU poses
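The four blocks of the state can be pictured with a small container like the one below. It is a sketch only: the class and field names are hypothetical, and the contents of the IMU state (velocity and biases) and the error-state sizes follow the usual multi-state EKF convention, which is an assumption rather than something stated on the slide.

```python
from dataclasses import dataclass, field
from typing import List
import numpy as np

@dataclass
class Pose:
    q: np.ndarray  # orientation as a unit quaternion, shape (4,)
    p: np.ndarray  # position, shape (3,)

@dataclass
class StructVioState:
    imu_pose: Pose                 # current IMU pose
    velocity: np.ndarray           # assumed part of the current IMU state
    gyro_bias: np.ndarray          # assumed part of the current IMU state
    accel_bias: np.ndarray         # assumed part of the current IMU state
    cam_imu: Pose                  # camera-IMU calibration
    headings: List[float] = field(default_factory=list)   # one per Manhattan world
    past_poses: List[Pose] = field(default_factory=list)  # historical IMU poses

    def error_state_dim(self) -> int:
        # 15 (IMU) + 6 (calibration) + 1 per heading + 6 per historical pose
        return 15 + 6 + len(self.headings) + 6 * len(self.past_poses)
```

Point and line features do not appear in this container, which matches the point made earlier that features are kept out of the state vector.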
▪ Inside of the filter
▪ Parameterization
▪ Measurement equation
StructVIO – Technical details
▪ Outside of the filter
▪ Structural line related tasks:
▪ Line detection & tracking
▪ Classification of structural lines
▪ Initialization & triangulation
▪ Handling long feature tracks
▪ Manhattan world:
▪ Detection
▪ Merging
▪ Other details
▪ Outlier rejection
▪ World frame:
▪ Z axis aligned with gravity
▪ Starting point as the origin
▪ Local Manhattan frame:
▪ Camera frame:
▪ Z axis aligned with the optical axis toward the viewing direction.
▪ X, Y axes aligned with x, y axes of the image
▪ Starting frame: moving Manhattan frame
▪ The origin is located at the camera center.
▪ Three axes aligned with those of the local Manhattan frame
Coordinate frames
▪ We use a camera-centric representation.
▪ Parameter space: used for line representation
Representation of a structural line
Camera frame, Starting frame, Parameter space, World frame
▪ In parameter space $\{L\}$, a structural line can be represented by a point and a vertical direction.
▪ To achieve better linearization, the intersection point can be represented using the inverse-depth approach. We have
Structural line parameter space
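One way to picture the inverse-depth representation is the sketch below. It assumes the line is parallel to the Z axis of the parameter space and meets the XY plane at a point (a, b, 0), and that this point is encoded by an angle `theta` and an inverse distance `rho`. The function names and this exact convention are assumptions for illustration.

```python
import numpy as np

# Assumed convention: the line direction is the Z axis of the parameter space.
LINE_DIRECTION = np.array([0.0, 0.0, 1.0])

def params_from_point(a, b):
    """Encode the intersection point (a, b, 0) as (theta, rho)."""
    theta = np.arctan2(b, a)
    rho = 1.0 / np.hypot(a, b)   # inverse distance to the origin
    return theta, rho

def point_from_params(theta, rho):
    """Decode (theta, rho) back to the intersection point on the XY plane."""
    return np.array([np.cos(theta), np.sin(theta), 0.0]) / rho

theta, rho = params_from_point(3.0, 4.0)
point = point_from_params(theta, rho)
```

With this encoding a far-away line has `rho` close to zero instead of a very large coordinate, which is the usual reason inverse depth linearizes better.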
▪ The structural line can be transformed into three axes of the starting frame by the rotation ${}^{S}_{L}R$.
Line space -> Starting frame
Line space, Starting frame, World frame, Camera frame
▪ The structural line can be further transformed into the world frame by using the heading direction ($\phi_i$) of the local Manhattan world.
▪ The structural line is then transformed to the current camera frame by the camera pose ${}^{C}_{W}\tau$.
Starting frame -> World frame
Line space, Starting frame, World frame, Camera frame
▪ Apply the transformations to both the point $l_p$ and the vertical direction $Z$
Line projection on the image
Parameter space, Starting frame, World frame, Camera frame, Line equation
▪ Line projection can be written as the following functions, where ${}^{S}_{L}R$ are known constants after line direction classification.
▪ Hence we further write
▪ We can use the above functions to derive the measurement equations.
Line projection on the image
(unknown camera-IMU calibration ${}^{C}_{I}\tau$)
${}^{im}l = \Pi(l, \phi_i, {}^{C}_{W}\tau)$
${}^{im}l = \Pi(l, \phi_i, {}^{C}_{I}\tau, {}^{C}_{W}\tau)$
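The chain from line space to the image can be sketched as one function. This is a simplified illustration, assuming a pinhole camera with intrinsics `K`, a heading that rotates about the world Z axis, and a starting frame whose origin has a known world position. The argument names (`R_SL`, `p_S_in_W`, `R_CW`, `t_CW`) are hypothetical, and the camera pose is passed in directly instead of being composed from the IMU pose and the camera-IMU calibration.

```python
import numpy as np

def rot_z(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def project_structural_line(p_L, d_L, R_SL, phi, p_S_in_W, R_CW, t_CW, K):
    """Project a 3D line (point p_L, direction d_L in line space) to a
    homogeneous image line (a, b, c) with a^2 + b^2 = 1."""
    # Line space -> starting frame: rotation only (assumed shared origin).
    p_S, d_S = R_SL @ p_L, R_SL @ d_L
    # Starting frame -> world: rotate by the heading, shift by the origin.
    R_WS = rot_z(phi)
    p_W, d_W = R_WS @ p_S + p_S_in_W, R_WS @ d_S
    # World -> current camera frame.
    p_C, d_C = R_CW @ p_W + t_CW, R_CW @ d_W
    # The image line joins the projected point and the vanishing point
    # of the line direction.
    line = np.cross(K @ p_C, K @ d_C)
    return line / np.hypot(line[0], line[1])
```

Points transform with rotation and translation while the direction only rotates, which is why the point and the direction are carried separately through the chain.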
▪ Measurement equation by re-projection errors
▪ The line projection at time $k$ is given by:
▪ The line segment detected on the image is denoted by: $s_a \leftrightarrow s_b$
▪ Hence the re-projection error can be computed as the signed distances between the line projection and the two end points:
Measurement equations
${}^{im}l_k = \Pi(l, \phi_i, {}^{C}_{I}\tau, {}^{C}_{W}\tau)$
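The signed-distance residual can be written in a few lines. This is a minimal sketch, assuming the projected line is given in homogeneous form (a, b, c) and the end points are pixel coordinates; the function name is hypothetical.

```python
import numpy as np

def line_reprojection_error(line, s_a, s_b):
    """Signed distances from the two segment end points to the image line
    a*x + b*y + c = 0, returned as a 2-vector."""
    l = np.asarray(line, dtype=float)
    norm = np.hypot(l[0], l[1])          # makes the result a metric distance
    ends = np.array([[s_a[0], s_a[1], 1.0],
                     [s_b[0], s_b[1], 1.0]])
    return ends @ l / norm

# Line x = 0.5 and a segment with one end point on each side of it.
residual = line_reprojection_error([1.0, 0.0, -0.5], (1.5, 0.0), (0.0, 2.0))
```

Here `residual` is `[1.0, -0.5]`: the sign tells on which side of the projected line each end point lies.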
▪ After local linearization, we have
▪ By stacking all observations from time 1 to time $M$
Measurement equations
Heading of Manhattan world, Camera-IMU calibration, Historical IMU poses, Line parameters
▪ By projecting the residual onto the left null space of $H_l$, we can get rid of the line parameters:
▪ The measurement equation involves
▪ 1. Heading direction of the local Manhattan world
▪ 2. IMU-camera relative pose
▪ 3. Historical IMU poses
Measurement equations
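The null-space projection can be illustrated numerically. The sketch assumes a stacked linearized model of the form r ≈ H_x·δx + H_l·δl + noise, where `H_x` collects the Jacobians of the filter states and `H_l` those of the line parameters; the function name and the use of an SVD to obtain the null-space basis are illustrative choices.

```python
import numpy as np

def project_out_line_params(r, H_x, H_l):
    """Project residual and state Jacobian onto the left null space of H_l,
    so that the line-parameter error no longer appears."""
    U, _, _ = np.linalg.svd(H_l, full_matrices=True)
    rank = np.linalg.matrix_rank(H_l)
    N = U[:, rank:]              # columns satisfy N.T @ H_l = 0
    return N.T @ r, N.T @ H_x

# Synthetic check with 4 views (8 residual rows) and 2 line parameters.
rng = np.random.default_rng(0)
H_l = rng.standard_normal((8, 2))
H_x = rng.standard_normal((8, 5))
dx, dl = rng.standard_normal(5), rng.standard_normal(2)
r0, H0 = project_out_line_params(H_x @ dx + H_l @ dl, H_x, H_l)
```

After projection the residual depends only on `dx`, at the cost of two rows, one per eliminated line parameter.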
▪ Structural line related tasks:
▪ Line detection & classification of structural lines
▪ Initialization & triangulation
▪ Line tracking
▪ Handling long feature tracks
Outside of the filter
▪ Structural line related tasks:
▪ Line detection & classification of structural lines
Outside of the filter
Detection of line segments
Classification of line directions (X, Y, or Z) and identification of the Manhattan world $\phi_i$
For a line segment $s_a \leftrightarrow s_b$, find its Manhattan world $\phi_i$ and its direction (X, Y, or Z)
▪ Structural line related tasks:
▪ Initialization
Outside of the filter
- 1. Longer line segment first
- 2. Establish the starting frame (in which Manhattan world it lies)
- 3. Use the middle point $m$ of the line segment for initialization
Camera frame, World frame, Starting frame (Local Manhattan), Line parameter space
For horizontal lines (aligned with X, Y axes of a certain Manhattan frame)
▪ Structural line related tasks:
▪ Initialization
Outside of the filter
- 1. Longer line segment first
- 2. Establish the starting frame (in which Manhattan world it lies)
- 3. Use the middle point $m$ of the line segment for initialization
Camera frame, World frame, A dummy Manhattan world $\phi_0 = 0$, Line parameter space
For vertical lines (aligned with the Z axis of any Manhattan frame)
▪ We can write the initialization process as a function:
Outside of the filter
$s$: line segment
$\phi_i$: local Manhattan frame
${}^{C}_{W}R$: current camera orientation
$\rho_0$: initial inverse depth
Initial covariance
Initial parameters
$\sigma_{\theta_0}^2$: small value to account for line detection error (2-4 pixels)
$\sigma_{\rho_0}^2$: uncertainty of inverse depth (5 by default)
▪ Line triangulation with prior knowledge
Outside of the filter
$l_0$: prior line parameters
$r_k(l)$: line projection error
$\mathcal{V}$: set of visible views
Terms of the objective: prior knowledge, line projection error
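The objective itself is not recoverable from this transcript. A form consistent with the listed symbols would combine a prior term on the line parameters with the projection errors over the visible views. The weighting matrices $\Sigma_0$ and $\Sigma_k$ below are assumptions introduced for illustration.

```latex
l^{*} = \arg\min_{l} \;
  (l - l_0)^{\top} \Sigma_0^{-1} (l - l_0)
  + \sum_{k \in \mathcal{V}} r_k(l)^{\top} \Sigma_k^{-1} r_k(l)
```

Under this reading, the first term keeps the estimate close to the prior $l_0$ when few views are available, and the second term dominates as more views are added.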
▪ Line tracking
▪ 1. Sample several points on the line
▪ 2. Project those points onto the image, searching for corresponding points perpendicular to the line projection.
▪ 3. Use the small patches around those points as the descriptor
Outside of the filter
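Steps 1 and 2 can be sketched as generating candidate search positions along the normal of the projected line. The function name, the number of samples, and the search radius are hypothetical, and the patch comparison of step 3 is left out.

```python
import numpy as np

def sample_search_points(p0, p1, num_samples=5, search_radius=3):
    """Return an array of shape (num_samples, 2*search_radius + 1, 2):
    for each sample on the segment p0-p1, the candidate pixel positions
    along the direction perpendicular to the segment."""
    p0, p1 = np.asarray(p0, dtype=float), np.asarray(p1, dtype=float)
    direction = (p1 - p0) / np.linalg.norm(p1 - p0)
    normal = np.array([-direction[1], direction[0]])
    ts = np.linspace(0.0, 1.0, num_samples)
    samples = p0 + ts[:, None] * (p1 - p0)                 # points on the line
    offsets = np.arange(-search_radius, search_radius + 1)  # pixels along normal
    return samples[:, None, :] + offsets[None, :, None] * normal

candidates = sample_search_points((0.0, 0.0), (10.0, 0.0), num_samples=3, search_radius=1)
```

A tracker would then compare the patch stored for each sample against the patches at its candidate positions and keep the best match.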
▪ Handling long feature tracks
Outside of the filter
Dropped views
Normal equation in the last Gauss-Newton iteration
Step 1: Absorb dropped measurements into prior information
Step 2: Change the starting frame $S \rightarrow S'$
Current estimate
Prior information
▪ Manhattan world detection:
▪ 1. Starts once vertical lines are identified
▪ 2. Compute the horizon line $l_\infty = K^{-T}\,{}^{C}_{W}R\,[0, 0, 1]^{T}$
▪ 3. Run 1-line RANSAC to detect one of the two horizontal directions (X or Y)
▪ Randomly select one line and extend it to intersect $l_\infty$ to get a vanishing point $v_x$
▪ Compute the other vanishing point $v_y$
▪ Check the consistent line segments aligned with $v_x$ or $v_y$
▪ Repeat the aforementioned steps
▪ It is a possible Manhattan world if the maximum consensus set contains sufficient inliers.
Outside of the filter
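The detection steps can be sketched as a small 1-line RANSAC. This is an illustration under stated assumptions: a pinhole camera with intrinsics `K`, a world-to-camera rotation `R_CW` with the world Z axis along gravity, and input segments that are already known not to be vertical. The function names, the angular tolerance, and the inlier threshold are hypothetical.

```python
import numpy as np

def hom(p):
    """Homogeneous coordinates of a 2D image point."""
    return np.array([p[0], p[1], 1.0])

def hypothesis_from_segment(seg, K, R_CW):
    """Vanishing points (v_x, v_y) implied by a single segment."""
    up_c = R_CW @ np.array([0.0, 0.0, 1.0])       # vertical axis in camera frame
    l_inf = np.linalg.inv(K).T @ up_c             # horizon line
    line = np.cross(hom(seg[0]), hom(seg[1]))     # image line through the segment
    v_x = np.cross(line, l_inf)                   # intersection with the horizon
    d_x = np.linalg.inv(K) @ v_x                  # 3D direction behind v_x
    v_y = K @ np.cross(up_c, d_x)                 # orthogonal horizontal direction
    return v_x, v_y

def is_consistent(seg, v, tol_deg=2.0):
    """True if the segment points toward the vanishing point v."""
    a, b = np.asarray(seg[0], float), np.asarray(seg[1], float)
    to_v = np.cross(hom((a + b) / 2.0), v)        # line from midpoint to v
    d1, d2 = np.array([to_v[1], -to_v[0]]), b - a
    cos_angle = abs(d1 @ d2) / (np.linalg.norm(d1) * np.linalg.norm(d2) + 1e-12)
    return cos_angle >= np.cos(np.deg2rad(tol_deg))

def detect_manhattan(segments, K, R_CW, iters=100, min_inliers=10, seed=0):
    """Return ((v_x, v_y), inlier_indices), or (None, []) if support is weak."""
    rng = np.random.default_rng(seed)
    best = (None, [])
    for _ in range(iters):
        seg = segments[rng.integers(len(segments))]
        v_x, v_y = hypothesis_from_segment(seg, K, R_CW)
        inliers = [i for i, s in enumerate(segments)
                   if is_consistent(s, v_x) or is_consistent(s, v_y)]
        if len(inliers) > len(best[1]):
            best = ((v_x, v_y), inliers)
    return best if len(best[1]) >= min_inliers else (None, [])
```

One sampled line is enough for a hypothesis because the vertical direction is already known, which fixes the horizon and leaves a single unknown heading.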
▪ Manhattan world merging:
▪ The heading directions of two Manhattan worlds could be very close.
$|\phi_i - \phi_j| < \Delta\phi$
▪ We merge them by removing the newly detected one and updating the information of related structural lines
Outside of the filter
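The merging rule can be sketched as a threshold test on headings. Two details are assumptions: headings are compared modulo 90 degrees (the horizontal axes of a Manhattan world repeat every quarter turn), and the threshold value is made up. Updating the structural lines attached to the removed world is omitted.

```python
import numpy as np

def heading_difference(phi_i, phi_j):
    """Smallest heading difference, assuming headings are equivalent
    modulo a quarter turn."""
    d = (phi_i - phi_j) % (np.pi / 2)
    return min(d, np.pi / 2 - d)

def add_or_merge(headings, new_heading, threshold=np.deg2rad(5.0)):
    """Return (index, headings). A newly detected world is dropped and
    mapped to an existing one when their headings are within `threshold`."""
    for i, phi in enumerate(headings):
        if heading_difference(phi, new_heading) < threshold:
            return i, headings
    return len(headings), headings + [new_heading]

worlds = [0.0, np.deg2rad(30.0)]
merged_index, merged_worlds = add_or_merge(worlds, np.deg2rad(92.0))
new_index, grown_worlds = add_or_merge(worlds, np.deg2rad(60.0))
```

In this example the 92 degree detection is folded into the first world, while the 60 degree detection is kept as a third world.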
▪ Benchmark tests on the EuRoC dataset
Results
V2_03_difficult, MH_05_difficult
▪ EuRoC dataset
Results
RMSE: Root Mean Squared Error
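For reference, the RMSE of a trajectory can be computed as below. The sketch assumes the estimated and ground-truth positions are already time-associated and expressed in the same frame; how the benchmark aligns trajectories is not stated in the slides.

```python
import numpy as np

def rmse(estimated, ground_truth):
    """Root mean squared position error over N matched 3D positions."""
    err = np.asarray(estimated, dtype=float) - np.asarray(ground_truth, dtype=float)
    return float(np.sqrt(np.mean(np.sum(err ** 2, axis=1))))

# Two poses: one exact, one off by 5 units, so the RMSE is sqrt(25 / 2).
example = rmse([[0, 0, 0], [3, 4, 0]], [[0, 0, 0], [0, 0, 0]])
```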