New Perspective on Perception and Prediction Pipeline for Autonomous Driving
Xinshuo Weng, Kris Kitani Robotics Institute, Carnegie Mellon University
August 28, 2020
Perception and prediction are important components in the autonomous
Pipeline: Sensor Data -> 3D Object Detection -> 3D Multi-Object Tracking -> Trajectory Forecasting
Perception: 3D Object Detection, 3D Multi-Object Tracking
Prediction: Trajectory Forecasting
Sensor Data: LiDAR, RGB
3D Object Detection -> Detection results
3D Multi-Object Tracking -> Tracking results
Trajectory Forecasting -> Forecasting results
Is this really the best place to perform prediction?
Can we do prediction here?
Better models from bigger datasets!
150x increase! (Waymo)
* Mined trajectory data not counted for the Argo dataset
Sun et al. Scalability in Perception for Autonomous Driving: Waymo Open Dataset. CVPR 2020
Liang et al. The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction. CVPR 2020
Dataset with multi-modal ground truth
Green: multi-modal ground-truth futures. Yellow: past observations.
Each modality of the future is generated by setting a different goal in the simulator.
In contrast to prior datasets with a single ground-truth future, this makes multi-future evaluation possible.
What are the right metrics for evaluation?
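To make the metric question concrete, here is a minimal sketch of one candidate: for every ground-truth future, take the average displacement error (ADE) of the closest prediction, then average over the ground-truth futures. This is an illustrative assumption, not necessarily the metric used by the cited benchmark; the names `ade` and `multi_future_min_ade` and the toy trajectories are hypothetical.

```python
import math

def ade(pred, gt):
    """Average displacement error between two equal-length 2D trajectories."""
    assert len(pred) == len(gt) and len(gt) > 0
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

def multi_future_min_ade(predictions, gt_futures):
    """Assumed metric: each ground-truth future is scored by its closest
    prediction, and the scores are averaged over all ground-truth futures."""
    per_future = [min(ade(p, g) for p in predictions) for g in gt_futures]
    return sum(per_future) / len(per_future)

# Two ground-truth futures for the same agent (go straight vs. go diagonally).
gt_futures = [[(1.0, 0.0), (2.0, 0.0)], [(1.0, 1.0), (2.0, 2.0)]]

# A forecaster that only predicts one mode is penalized on the other future.
single_mode = [[(1.0, 0.0), (2.0, 0.0)]]
multi_mode = single_mode + [[(1.0, 1.0), (2.0, 2.0)]]

print(multi_future_min_ade(single_mode, gt_futures))  # 0.75
print(multi_future_min_ade(multi_mode, gt_futures))   # 0.0
```

With a single ground-truth future this reduces to the usual minADE over K predictions, which cannot distinguish the two forecasters above.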
Multi-agent interaction modeling with Graph Neural Networks (GNNs)
Contextual features are encoded to take into account
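As a rough illustration of interaction modeling on a graph, the sketch below runs one round of mean-aggregation message passing over agent features. It is a toy under stated assumptions: the update rule is hand-set (a real GNN would use learned weights), and the function name `message_passing_step`, the agent ids and the feature values are all hypothetical.

```python
def message_passing_step(node_feats, edges):
    """One round of mean-aggregation message passing.

    node_feats: dict mapping agent id -> feature vector (list of floats)
    edges: list of undirected (i, j) pairs of interacting agents
    """
    neighbors = {i: [] for i in node_feats}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    updated = {}
    for i, feat in node_feats.items():
        if not neighbors[i]:
            # Isolated agents keep their own feature unchanged.
            updated[i] = list(feat)
            continue
        n = len(neighbors[i])
        mean_msg = [sum(node_feats[j][d] for j in neighbors[i]) / n
                    for d in range(len(feat))]
        # Hand-set update (assumption): average own feature with the message.
        updated[i] = [0.5 * (f + m) for f, m in zip(feat, mean_msg)]
    return updated

feats = {"car": [1.0, 0.0], "ped": [0.0, 1.0], "bike": [4.0, 4.0]}
out = message_passing_step(feats, edges=[("car", "ped")])
print(out["car"], out["ped"], out["bike"])
```

After one step the two connected agents carry information about each other, while the isolated agent is unaffected.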
Phan-Minh et al. CoverNet: Multimodal Behavior Prediction using Trajectory Sets. CVPR 2020
Road context / physical constraint helps
Using road structure semantics as inputs eliminates physically impossible trajectories
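The effect of a road constraint on a set of candidate trajectories can be sketched with an explicit filter against a drivable-area grid. This is only an illustration of the idea, not the cited method, which feeds road semantics to a learned model; the grid, the candidates and the names `on_road` and `filter_trajectory_set` are hypothetical.

```python
def on_road(trajectory, drivable, cell_size=1.0):
    """True if every (x, y) waypoint lies in a drivable cell of the grid."""
    for x, y in trajectory:
        col, row = int(x // cell_size), int(y // cell_size)
        inside = 0 <= row < len(drivable) and 0 <= col < len(drivable[0])
        if not inside or not drivable[row][col]:
            return False
    return True

def filter_trajectory_set(candidates, drivable, cell_size=1.0):
    """Keep only the candidates that never leave the drivable area."""
    return [t for t in candidates if on_road(t, drivable, cell_size)]

# 1 = drivable. Row index is y, column index is x: an L-shaped road.
drivable = [
    [1, 1, 1],
    [0, 0, 1],
    [0, 0, 1],
]
straight = [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5)]
turn = [(0.5, 0.5), (2.5, 0.5), (2.5, 1.5), (2.5, 2.5)]
off_road = [(0.5, 0.5), (0.5, 1.5)]

kept = filter_trajectory_set([straight, turn, off_road], drivable)
print(len(kept))  # 2
```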
Rhinehart et al. PRECOG: PREdiction Conditioned On Goals in Visual Multi-Agent Settings. ICCV 2019
Goal-conditioned forecasting
Different goals could lead to different forecasts
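A toy example shows how the same current state yields different forecasts under different goals. The straight-line, constant-speed rollout below is a deliberately simple stand-in, not the cited model; `forecast_to_goal` and its parameters are hypothetical.

```python
import math

def forecast_to_goal(position, goal, speed, steps):
    """Roll out waypoints moving in a straight line toward `goal` at constant
    `speed` per step, staying at the goal once it is reached."""
    x, y = position
    waypoints = []
    for _ in range(steps):
        dx, dy = goal[0] - x, goal[1] - y
        dist = math.hypot(dx, dy)
        if dist <= speed:
            x, y = goal
        else:
            x, y = x + speed * dx / dist, y + speed * dy / dist
        waypoints.append((x, y))
    return waypoints

start = (0.0, 0.0)
go_right = forecast_to_goal(start, goal=(3.0, 0.0), speed=1.0, steps=3)
go_up = forecast_to_goal(start, goal=(0.0, 2.0), speed=1.0, steps=3)
print(go_right)  # [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(go_up)     # [(0.0, 1.0), (0.0, 2.0), (0.0, 2.0)]
```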
End-to-end perception and prediction pipeline
Liang et al. PnPNet: End-to-End Perception and Prediction with Tracking in the Loop. CVPR 2020
Separately optimized: 3D Object Detection, 3D Multi-Object Tracking and Trajectory Forecasting are each trained on their own
Jointly optimized: gradients flow from Trajectory Forecasting back through 3D Multi-Object Tracking and 3D Object Detection
All modules are optimized for the end goal: trajectory prediction
Lots of progress on (1) building better/larger datasets and (2) improving forecasting models
The pipeline stays the same!
Any possible improvement at the pipeline level?
Each module takes the outputs of its upstream module as inputs
Errors made by the upstream module cannot be corrected and will degrade the performance of the downstream module
Figure panels: GT past trajectories with predicted trajectories vs. tracking results with predicted trajectories
Data association error in tracking
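The effect of a data association error on the downstream forecaster can be sketched with a constant-velocity extrapolation. This is a toy under stated assumptions: the forecaster, the trajectories and the name `constant_velocity_forecast` are hypothetical, and the "tracked" past simply has its last position swapped with a neighboring object.

```python
import math

def constant_velocity_forecast(past, steps):
    """Extrapolate using the displacement between the last two past positions."""
    (x0, y0), (x1, y1) = past[-2], past[-1]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, steps + 1)]

# Ground-truth past: the object drives straight along the x axis.
gt_past = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
# Tracked past: in the last frame the tracker associates the wrong detection,
# one that belongs to a neighboring object at (2.0, 1.0).
tracked_past = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)]

from_gt = constant_velocity_forecast(gt_past, steps=2)
from_tracking = constant_velocity_forecast(tracked_past, steps=2)
final_error = math.dist(from_gt[-1], from_tracking[-1])

print(from_gt)        # [(3.0, 0.0), (4.0, 0.0)]
print(from_tracking)  # [(3.0, 2.0), (4.0, 3.0)]
print(final_error)    # 3.0
```

A single wrong association in the last frame changes the estimated velocity, and the forecaster has no way to undo it.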
Sequential Pipeline: Sensor Data -> 3D Object Detection -> 3D Multi-Object Tracking (Feature Extraction, Matching) -> Trajectory Forecasting (Feature Extraction, Trajectory Decoder)
Parallelized Tracking and Forecasting Pipeline: Sensor Data -> 3D Object Detection -> Shared Feature Learning -> 3D Multi-Object Tracking (Matching) in parallel with Trajectory Forecasting (Trajectory Decoder)
Feature Extraction: similar components, which aim to encode object features from past information
Matching and Trajectory Decoder: module-specific components
the association information in the current frame
Joint 3D Tracking and Forecasting
Inputs: detected objects in the current frame; object trajectories in the past H frames (up to the last frame)
Shared Feature Learning: feature extraction for each input, then a GNN for feature interaction (node features, edge features)
3D MOT head
Trajectory forecasting head: diversity sampling, predicted trajectories in the future T frames
interaction with GNNs
between every pair of objects
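In tracking, a score computed between every pair of tracked and detected objects is usually turned into an assignment. The sketch below is an assumption-laden illustration: it uses negative Euclidean distance as the affinity (a learned model would produce it from features) and a greedy matcher as a simple stand-in for optimal assignment. The names `affinity_matrix` and `greedy_match` and the threshold value are hypothetical.

```python
import math

def affinity_matrix(tracks, detections):
    """Affinity between every (track, detection) pair; higher means more similar.
    Negative Euclidean distance is used here purely for illustration."""
    return [[-math.dist(t, d) for d in detections] for t in tracks]

def greedy_match(affinity, min_affinity):
    """Repeatedly take the highest remaining affinity above the threshold."""
    pairs = sorted(
        ((a, i, j) for i, row in enumerate(affinity) for j, a in enumerate(row)),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for a, i, j in pairs:
        if a < min_affinity:
            break  # all remaining pairs are too dissimilar
        if i in used_tracks or j in used_dets:
            continue
        matches.append((i, j))
        used_tracks.add(i)
        used_dets.add(j)
    return sorted(matches)

tracks = [(0.0, 0.0), (5.0, 5.0)]
detections = [(5.2, 5.0), (0.1, 0.0), (9.0, 9.0)]
matches = greedy_match(affinity_matrix(tracks, detections), min_affinity=-2.0)
print(matches)  # [(0, 1), (1, 0)]
```

Detection 2 stays unmatched and would start a new track in a full tracker.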
latent code covering various modes
latent codes
Forecasting evaluation without 3D MOT
Performance improved after adding MOT!
3D MOT evaluation without forecasting module
SPF2: Sequential Pointcloud Forecasting for Sequential Pose Forecasting
Switch the order
Weng et al. Unsupervised Sequence Forecasting of 100,000 Points for Unsupervised Trajectory Forecasting. arXiv 2020
(b) LSTM for temporal modeling
(d) Losses
at the pipeline level
Many large-scale datasets exist, but their sensor suites and annotations are not unified
Trajectory forecasting is improving but should be coupled more tightly with perception modules
Does not take into account the multi-level optimization problem (planning, control)
Should also take into account sensor optimization and redundancy