

SLIDE 1

Graph Neural Network for 3D Multi-Object Tracking

SLIDE 2
SLIDE 2

Graph Neural Network for 3D Multi-Object Tracking

Xinshuo Weng, Yongxin Wang, Yunze Man, Kris Kitani

Robotics Institute, Carnegie Mellon University European Conference on Computer Vision (ECCV) Workshops


SLIDE 3

Standard 3D MOT Pipeline


Sensor Data → 3D Object Detection → Data Association → Object trajectories

SLIDE 4

Standard 3D MOT Pipeline


Sensor Data (LiDAR point clouds, RGB frames) → 3D Object Detection → Data Association → Object trajectories

SLIDE 5

Standard 3D MOT Pipeline


Sensor Data → 3D Object Detection → Detection results → Data Association → Object trajectories

SLIDE 6

Standard 3D MOT Pipeline


Sensor Data → 3D Object Detection → Data Association → Object trajectories

Inside Data Association: Past Tracklets + New Detections → Feature Extraction → Affinity matrix → Bipartite Matching → 3D MOT results
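The association step in this pipeline can be sketched in a few lines: score every (past tracklet, new detection) pair into an affinity matrix, then match pairs. Below is a minimal greedy stand-in for the bipartite matching box; real pipelines often use the Hungarian algorithm (e.g. `scipy.optimize.linear_sum_assignment`), and the affinity values and threshold here are made up for illustration.

```python
def associate(affinity, threshold=0.5):
    """Greedy bipartite matching: affinity[t][d] scores tracklet t vs detection d."""
    pairs = [
        (affinity[t][d], t, d)
        for t in range(len(affinity))
        for d in range(len(affinity[0]))
    ]
    matched_t, matched_d, matches = set(), set(), []
    # Take highest-affinity pairs first; skip already-matched rows/columns
    # and pairs below the acceptance threshold.
    for score, t, d in sorted(pairs, reverse=True):
        if score < threshold or t in matched_t or d in matched_d:
            continue
        matched_t.add(t)
        matched_d.add(d)
        matches.append((t, d))
    return matches

# Toy affinity matrix: 2 past tracklets (rows) vs 3 new detections (columns).
affinity = [
    [0.9, 0.1, 0.2],
    [0.2, 0.8, 0.3],
]
print(associate(affinity))  # [(0, 0), (1, 1)]; detection 2 starts a new track
```

Greedy matching is not globally optimal, but it makes the role of the affinity matrix concrete: each accepted pair extends an existing trajectory, and unmatched detections spawn new tracklets.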

SLIDE 7

Limitation of the Prior Work


Pipeline: Sensor Data → 3D Object Detection → Data Association (Feature Extraction → Matching) → Object trajectories

Limitation

  • 1. The feature representation does not take into account the context of other objects
  • 2. The feature representation does not fully utilize the complementary information from multiple modalities

SLIDE 8


Our Contributions

  • 1. A novel feature interaction mechanism to encode context via object interaction
  • 2. A 2D-3D joint feature extractor to learn complementary multi-modal features

SLIDE 9

Our Contributions

Prior work

  • Feature extraction is independent for each object
  • Employs features from one modality (2D or 3D)

Our Approach

  • A joint feature extractor to learn multi-modal features
  • A novel feature interaction mechanism to iteratively encode context and improve discriminative feature learning
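As a rough sketch of the feature-interaction idea (my own simplification, not the paper's exact GNN layer): each object's feature is refined by aggregating the features of the other objects, and stacking a few such rounds makes every feature context-aware. The mean aggregation, linear weights, and ReLU here are illustrative choices.

```python
import numpy as np

def interaction_layer(feats, W_self, W_nbr):
    """One round of feature interaction over (N, D) object features."""
    n = feats.shape[0]
    # Context for each object: mean of all *other* objects' features.
    context = (feats.sum(axis=0, keepdims=True) - feats) / max(n - 1, 1)
    # Mix the object's own feature with the aggregated context, then ReLU.
    return np.maximum(feats @ W_self + context @ W_nbr, 0.0)

rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8))         # 4 objects, 8-dim features
W_self = 0.1 * rng.standard_normal((8, 8))  # placeholder weights; learned in practice
W_nbr = 0.1 * rng.standard_normal((8, 8))

out = feats
for _ in range(3):                          # a few interaction rounds
    out = interaction_layer(out, W_self, W_nbr)
print(out.shape)  # (4, 8): same shape, now context-refined
```

In the actual method the weights are trained end-to-end with the affinity objective; the point of the sketch is only the update pattern, in which every object's feature depends on the surrounding objects.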


SLIDE 10

Our Approach

  • (a) Obtain the appearance / motion features from the 3D space
  • (b) Obtain the appearance / motion features from the 2D space
  • (c) Learn discriminative object features by encoding context through object feature interaction
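Steps (a) and (b) can be pictured as follows: per object, an appearance and a motion feature are extracted in both the 3D (LiDAR) and 2D (image) space and then fused into one multi-modal feature. Concatenation is used here as one simple fusion choice; the feature dimensions are placeholders, and the paper's actual extractors are learned networks.

```python
import numpy as np

def fuse_object_features(app_2d, mot_2d, app_3d, mot_3d):
    """Fuse per-object 2D/3D appearance and motion features by concatenation."""
    return np.concatenate([app_2d, mot_2d, app_3d, mot_3d])

# Placeholder feature vectors (dimensions are illustrative, not the paper's).
app_2d = np.zeros(64)   # appearance feature from the image crop
mot_2d = np.zeros(16)   # motion feature from the 2D box trajectory
app_3d = np.zeros(64)   # appearance feature from the object point cloud
mot_3d = np.zeros(16)   # motion feature from the 3D box trajectory

fused = fuse_object_features(app_2d, mot_2d, app_3d, mot_3d)
print(fused.shape)  # (160,) combined multi-modal feature
```

The fused feature is what then enters the interaction step (c), so the context encoding operates on information from both modalities at once.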


SLIDE 11


Ablation Study

SLIDE 12

Improve Feature Learning for 3D MOT


  • Is encoding the multi-modal features really useful?

(A: appearance feature, M: motion feature.) Comparing features from a single modality against features from multiple modalities: performance increases with the multi-modal features.

SLIDE 13

Improve Feature Learning for 3D MOT


  • Is feature interaction using GNNs useful for 3D MOT?

Performance increases substantially with 3 GNN layers vs. 0.

SLIDE 14


Qualitative Results

SLIDE 15

Qualitative Results


SLIDE 16

Graph Neural Network for 3D Multi-Object Tracking

Xinshuo Weng, Yongxin Wang, Yunze Man, Kris Kitani

Robotics Institute, Carnegie Mellon University European Conference on Computer Vision (ECCV) Workshops
