Object Detection and Tracking in 3D World
Xinshuo Weng
3D Object Detection

Goal
Inputs:
○ LiDAR point cloud
○ Monocular Images
○ Stereo images (left and right)
○ Or fusion
○ Eight corners
○ Four corners + height
○ Size (l, w, h) + center (x, y, z) + heading (θ)
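The third parameterization determines the other two: given size, center, and heading, the eight corners follow from a rotation about the vertical axis. A minimal sketch, assuming a z-up frame with heading measured about the z axis (axis conventions vary between datasets, e.g. KITTI uses a y-down camera frame):

```python
import numpy as np

def box3d_corners(center, size, heading):
    """Return the 8 corners of a 3D box given center (x, y, z),
    size (l, w, h), and heading angle about the up (z) axis."""
    x, y, z = center
    l, w, h = size
    # Corner offsets in the box's local frame (before rotation).
    xs = np.array([ l,  l, -l, -l,  l,  l, -l, -l]) / 2.0
    ys = np.array([ w, -w, -w,  w,  w, -w, -w,  w]) / 2.0
    zs = np.array([ h,  h,  h,  h, -h, -h, -h, -h]) / 2.0
    # Rotate about the vertical axis by the heading angle.
    c, s = np.cos(heading), np.sin(heading)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    corners = R @ np.vstack([xs, ys, zs])        # 3 x 8
    return corners.T + np.array([x, y, z])       # 8 x 3

# A 4 m long, 2 m wide, 1.5 m tall box centered at (1, 2, 0), heading 0.
corners = box3d_corners((1.0, 2.0, 0.0), (4.0, 2.0, 1.5), 0.0)
```

With zero heading, the corners span x ∈ [−1, 3], i.e. the center ±l/2, which is a quick sanity check on the convention.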
Shi et al, “PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud”, CVPR, 2019.
Mousavian et al, “3D Bounding Box Estimation Using Deep Learning and Geometry”, CVPR, 2017.
○ The 2D bounding box provides 4 constraints
○ Need at least another three, since the 3D box has 7 parameters (size, center, heading)
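The four constraints come from perspective projection: each side of the 2D box should be touched by the projection of some 3D box corner. A sketch of the projection step only, with illustrative KITTI-like intrinsics (the values are assumptions, not taken from the paper):

```python
import numpy as np

# Illustrative pinhole intrinsics (KITTI-like magnitudes, assumed here).
K = np.array([[721.5,   0.0, 609.6],
              [  0.0, 721.5, 172.9],
              [  0.0,   0.0,   1.0]])

def project(p_cam):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    p = K @ p_cam
    return p[:2] / p[2]

# Two corners of a box 10 m ahead: the left edge of the 2D box is the
# minimum projected u over all corners, the right edge the maximum.
us = [project(np.array([x, 0.5, 10.0]))[0] for x in (-1.0, 1.0)]
```

The left corner (x = −1) projects left of the right corner (x = +1), and equations of this form, one per 2D box side, pin down the translation once size and orientation are regressed.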
Li et al, “Stereo R-CNN based 3D Object Detection for Autonomous Driving”, CVPR, 2019.
Size (l, w, h) + 2D bounding boxes → center and heading (x, y, z, θ)
Matching loss
Qi et al, “Frustum PointNets for 3D Object Detection from RGB-D Data”, CVPR, 2018.
LiDAR-based 3D detection vs. monocular 3D detection
○ Pseudo-LiDAR framework
○ Two observations:
■ Long tail – instance mask proposal
■ Local misalignment – bounding box consistency loss (BBCL) and optimization (BBCO)
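The pseudo-LiDAR framework back-projects an estimated depth map into a 3D point cloud and then applies a LiDAR-based detector to it. A minimal sketch of the back-projection under a standard pinhole model (the intrinsics and depth values below are made up for illustration):

```python
import numpy as np

def depth_to_pseudo_lidar(depth, fx, fy, cx, cy):
    """Back-project a per-pixel depth map into a 3D point cloud
    ("pseudo-LiDAR") using the pinhole camera model."""
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))  # pixel grids
    z = depth
    x = (us - cx) * z / fx   # right, in camera frame
    y = (vs - cy) * z / fy   # down, in camera frame
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Toy example: a constant 10 m depth map over a 4 x 6 image whose
# principal point is at pixel (u=3, v=2).
points = depth_to_pseudo_lidar(np.full((4, 6), 10.0),
                               fx=700.0, fy=700.0, cx=3.0, cy=2.0)
```

The pixel at the principal point maps to (0, 0, 10), i.e. straight ahead along the optical axis, which is a quick check that the intrinsics are applied correctly.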
monocular methods
[6] R. Urtasun et al. (University of Toronto). Monocular 3D Object Detection for Autonomous Driving. CVPR 2016.
[30] J. Kosecka (George Mason University). 3D Bounding Box Estimation Using Deep Learning and Geometry. CVPR 2017.
[58] Z. Chen et al. (Wuhan University). Multi-Level Fusion based 3D Object Detection from Monocular Images. CVPR 2018.
Inputs:
○ LiDAR point cloud
○ Monocular Images
○ Stereo images, plus video
○ Or fusion
Output:
○ Eight corners
○ Four corners + height
○ Size + center + orientation
○ Identity – association problem
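Identity is what turns detection into tracking: each new detection must be associated with an existing track. A common formulation is bipartite matching on an affinity matrix (e.g. 3D IoU or negative centroid distance), solved with the Hungarian algorithm. A toy sketch using SciPy, where the affinity values and the 0.3 threshold are made-up illustrations:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Toy affinity matrix: rows = existing tracks, columns = new detections.
# Higher entries mean a better track-detection match.
affinity = np.array([[0.8, 0.1, 0.0],
                     [0.2, 0.7, 0.1]])

# The Hungarian algorithm minimizes total cost, so negate the affinity.
rows, cols = linear_sum_assignment(-affinity)

# Reject weak matches below an (illustrative) affinity threshold.
matches = [(r, c) for r, c in zip(rows, cols) if affinity[r, c] > 0.3]
```

Unmatched detections then spawn new tracks and unmatched tracks age out, which is how identities are born and die in this family of trackers.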
○ Deep motion network
○ Deep association network
○ Deep appearance network
Luo et al, “Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net”, CVPR, 2018.
Baser et al, “FANTrack: 3D Multi-Object Tracking with Feature Association Network”, arXiv, 2019.
SimNet AssocNet
Frossard et al, “End-to-end Learning of Multi-sensor 3D Tracking by Detection”, ICRA, 2018.
○ Detection: state-of-the-art 3D object detector (PointRCNN)
○ Tracking: Kalman filter with a 3D constant velocity model + Hungarian algorithm; no appearance model
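A minimal sketch of such a constant-velocity Kalman filter, assuming a 10-dimensional state (the 7 box parameters plus a 3D velocity); the noise magnitudes here are placeholders, not tuned values:

```python
import numpy as np

dim = 10  # assumed state layout: [x, y, z, theta, l, w, h, vx, vy, vz]
F = np.eye(dim)
F[0, 7] = F[1, 8] = F[2, 9] = 1.0    # position += velocity each frame
H = np.eye(7, dim)                   # we observe the 7 box parameters

x = np.zeros(dim)
P = np.eye(dim) * 10.0               # initial state uncertainty
Q = np.eye(dim) * 0.01               # process noise (placeholder)
R = np.eye(7)                        # measurement noise (placeholder)

def predict(x, P):
    """Propagate the state with the constant-velocity model."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Fuse a detected 3D box z = [x, y, z, theta, l, w, h]."""
    y = z - H @ x                      # innovation
    S = H @ P @ H.T + R                # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)     # Kalman gain
    return x + K @ y, (np.eye(dim) - K @ H) @ P

x, P = predict(x, P)
x, P = update(x, P, np.array([1.0, 2.0, 0.0, 0.1, 4.0, 2.0, 1.5]))
```

With a large initial uncertainty the update pulls the state most of the way toward the detection, which is the desired behavior when a track is first initialized.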
published works
2D tracking results on KITTI test set
3D tracking results on KITTI validation set
[1] R. Urtasun. End-to-End Learning of Multi-Sensor 3D Tracking by Detection. ICRA 2018.
[2] K. Czarnecki (University of Waterloo). FANTrack: 3D Multi-Object Tracking with Feature Association Network. arXiv 2019.
[3] K. Granstrom (Chalmers University of Technology). Mono-Camera 3D Multi-Object Tracking Using Deep Learning Detections and PMBM Filtering. ITSC 2018.
[5] K. Madhava Krishna (IIIT Hyderabad, India). Beyond Pixels: Leveraging Geometry and Shape Cues for Online Multi-Object Tracking. ICRA 2018.
performance in practice