Hands-on Intro To Developing Vision and LiDAR Classifier
Formula Student Driverless Workshop
https://fsg.one/academy 30.08.2020 Author: Sibo Zhu, Zhijian Liu, Haotian Tang
Sibo Zhu
▪ Perception Lead at MIT Driverless
▪ Research Assistant at MIT HAN Lab
Zhijian Liu
▪ Perception Lead at MIT Driverless
▪ PhD student at MIT HAN Lab
Haotian Tang
▪ Perception Lead at MIT Driverless
▪ PhD student at MIT HAN Lab
[Figure: sensor layout with Wide Angle Camera, VIO Camera, and Stereovision Pair]
▪ Latency
  ▪ Maximum view-to-actuation time for an emergency stop from top speed during an acceleration run
▪ Mapping Accuracy
  ▪ Driven by the downstream mapper
▪ Horizontal Field-of-View (FOV)
  ▪ Must perceive landmarks on the inside apex of a hairpin turn
▪ Look-ahead Distance
  ▪ Depends on the full-stack latency and the vehicle deceleration rate
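The look-ahead requirement can be made concrete with a small calculation. The sketch below assumes a constant-deceleration stop and a fixed full-stack latency; the function name and the numbers are hypothetical and are not taken from the workshop.

```python
def look_ahead_distance(speed_mps, latency_s, decel_mps2):
    """Distance travelled from first sight of an obstacle until standstill.

    Assumes the car keeps its speed during the latency window and then
    brakes at a constant deceleration (an idealised model).
    """
    if decel_mps2 <= 0:
        raise ValueError("deceleration must be positive")
    reaction = speed_mps * latency_s                 # travelled before brakes act
    braking = speed_mps ** 2 / (2.0 * decel_mps2)    # constant-deceleration stop
    return reaction + braking


# Hypothetical example: 20 m/s, 200 ms full-stack latency, 10 m/s^2 braking.
print(look_ahead_distance(speed_mps=20.0, latency_s=0.2, decel_mps2=10.0))
```

Under these assumptions, halving the latency only shortens the first term; the braking term grows with the square of the speed.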
Mono pipeline
▪ Step 1: Object Detection
▪ Step 2: Keypoints Detection
▪ Step 3: Perspective-n-Point (PnP)
Stereo pipeline
▪ Step 1: Object Detection
▪ Step 2: Stereo Matching Algorithm
▪ Detects seven keypoints on each YOLOv3 detection
▪ A residual NN that leverages the geometric relationship between keypoints
▪ The seven detected keypoints are then used in a Perspective-n-Point (PnP) solve to get depth
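To give a feel for how keypoints turn into depth, here is a deliberately simplified sketch. It is not PnP: it uses only the pinhole similar-triangles relation between a known object height and its height in pixels, which is the one-dimensional intuition behind the full PnP pose solve. The focal length, cone height, and pixel coordinates are hypothetical values.

```python
def depth_from_height(focal_px, real_height_m, top_y_px, base_y_px):
    """Pinhole estimate: depth = focal length * real height / pixel height.

    top_y_px and base_y_px stand for the image rows of two keypoints
    (for example the tip and the base of a cone).
    """
    pixel_height = abs(base_y_px - top_y_px)
    if pixel_height == 0:
        raise ValueError("keypoints coincide; depth is undefined")
    return focal_px * real_height_m / pixel_height


# Hypothetical camera and cone: f = 1000 px, 0.3 m tall, 60 px tall in the image.
print(depth_from_height(1000.0, 0.3, top_y_px=400.0, base_y_px=460.0))
```

A full PnP solve uses all seven 2D keypoints together with their 3D positions on the cone model and recovers rotation as well as translation, which makes it more robust than this two-point estimate.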
Open Sourced here: github.com/cv-core
▪ Change the object detection backbone from YOLOv3 to YOLOv4, EfficientNet, etc.
▪ Add temporal information for more stable and accurate detection
  ▪ Temporal Shift Module: hanlab.mit.edu/projects/tsm/
▪ Inference with TensorRT in C++
  ▪ Open sourced here: github.com/cv-core
▪ Prune the full YOLO architecture for the cone detection task
▪ Quantization (int8) for even faster inference
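As an illustration of what int8 quantization does to a tensor, the sketch below applies symmetric per-tensor quantization in plain Python. It is an assumption-laden toy, not the TensorRT calibration procedure: the helper names and the sample weights are made up for this example.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the int8 codes."""
    return [qi * scale for qi in q]


weights = [0.5, -1.27, 0.02, 1.0]        # hypothetical layer weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
print(codes, scale)
print(max(abs(a - b) for a, b in zip(weights, restored)))  # rounding error
```

The rounding error per value is bounded by half the scale, so tensors with a few large outliers lose more precision on their small entries.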
3D LiDAR Sensor → 3D Point Cloud: 500k+ points, each (x, y, z, intensity)
Velodyne 32C LiDAR
▪ High Accuracy (Prevent Collisions)
▪ Low Latency (Drive Faster)
Self-driving cars carry a whole trunk of computers! We need more efficient algorithms that do not require intensive computation.
▪ Longer distance: fewer laser rings on objects, fewer laser points
▪ Shorter distance: too many laser rings on objects, too many laser points
Off-chip DRAM access is much more expensive than arithmetic operation! Random memory access is inefficient due to the potential bank conflicts!
[Figure: Bandwidth (GB/s): Mult and Add 668, SRAM Memory 167, DRAM Memory 30 (20x slower)]
[Figure: Random Memory Access vs. Sequential Memory Access]
VoxNet [IROS’15], 3D ShapeNets [CVPR’15], 3D U-Net [MICCAI’16]
[Figure: GPU Memory (GB) vs. Voxel Resolution]
▪ 64 x 64 x 64 resolution: 11 GB (Titan XP x 1), 42% information loss
▪ 128 x 128 x 128 resolution: 83 GB (Titan XP x 7), 7% information loss
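The trade-off between voxel resolution, memory, and information loss can be reproduced on a toy point cloud. In the sketch below, "information loss" is taken to mean the fraction of points that share a voxel with another point and so become indistinguishable; that definition, the random cloud, and the function names are assumptions made for illustration.

```python
import random


def voxelization_loss(points, resolution):
    """Fraction of points that collapse into an already occupied voxel."""
    occupied = set()
    for point in points:
        # Coordinates are assumed to lie in [0, 1); clamp the upper edge.
        idx = tuple(min(int(c * resolution), resolution - 1) for c in point)
        occupied.add(idx)
    return 1.0 - len(occupied) / len(points)


def dense_grid_cells(resolution):
    """Number of cells in a dense cubic grid; memory grows with this count."""
    return resolution ** 3


rng = random.Random(0)
cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(20000)]
for r in (16, 32, 64):
    print(r, dense_grid_cells(r), round(voxelization_loss(cloud, r), 3))
```

Doubling the resolution multiplies the number of dense grid cells by eight, which is why the loss can only be driven down at a cubic memory cost.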
PointCNN [NeurIPS’18], PointNet [CVPR’17], DGCNN [SIGGRAPH’19]
Runtime (%) split into Irregular Access / Dynamic Kernel / Actual Computation:
▪ DGCNN: 51.8 / 2.9 / 45.3
▪ PointCNN: 36.3 / 51.5 / 12.2
▪ SpiderCNN: 57.4 / 27.0 / 15.6
▪ Ours: 4.9 / 0.0 / 95.1
▪ Point-Based Feature Transformation (Fine-Grained): Multi-Layer Perceptron
▪ Voxel-Based Feature Aggregation (Coarse-Grained): Normalize → Voxelize → Convolve → Devoxelize
▪ Fuse: combine the outputs of the two branches
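The data flow of the two branches can be sketched in a few lines of NumPy. This is a structural illustration only, under several simplifying assumptions: a 1×1×1 convolution (a per-voxel linear map) stands in for a larger 3D kernel, devoxelization is a nearest-voxel lookup, the point branch is a single linear layer with ReLU, and fusion is addition. All function and parameter names are hypothetical.

```python
import numpy as np


def voxelize(coords, feats, r):
    """Normalize coordinates, then average features per voxel on an r^3 grid."""
    lo, hi = coords.min(axis=0), coords.max(axis=0)
    norm = (coords - lo) / np.maximum(hi - lo, 1e-9)         # into [0, 1]
    idx = np.minimum((norm * r).astype(int), r - 1)          # (N, 3) voxel index
    flat = idx[:, 0] * r * r + idx[:, 1] * r + idx[:, 2]      # (N,) flat index
    grid = np.zeros((r ** 3, feats.shape[1]))
    count = np.zeros(r ** 3)
    np.add.at(grid, flat, feats)                             # sum features per voxel
    np.add.at(count, flat, 1)
    grid /= np.maximum(count, 1)[:, None]                    # mean; empty voxels stay 0
    return grid, flat


def pvconv_sketch(coords, feats, w_voxel, w_point, r=4):
    grid, flat = voxelize(coords, feats, r)
    voxel_out = grid @ w_voxel             # stand-in 1x1x1 convolution
    coarse = voxel_out[flat]               # devoxelize: nearest-voxel lookup
    fine = np.maximum(feats @ w_point, 0)  # point branch: linear layer + ReLU
    return coarse + fine                   # fuse by addition


rng = np.random.default_rng(0)
coords = rng.random((100, 3))
feats = rng.random((100, 8))
out = pvconv_sketch(coords, feats, np.eye(8), np.eye(8))
print(out.shape)
```

The voxel branch touches memory in a regular grid pattern, while the point branch keeps per-point detail that the coarse grid would otherwise average away.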
[Figure: Original Scene vs. Downsampled Scene]
▪ Sparse Convolution Branch: Voxelize → Sparse Convolution (×N) → Devoxelize
▪ Point Branch: Multi-Layer Perceptron
▪ Fuse: combine the outputs of the two branches
[Figure: point cloud segmentation example with labels person, cyclist, trunk, traffic sign]
[Figure: Elastic Trans. Channel and Dynamic ResBlocks (Elastic Res. Channel, Elastic Mid. Channel, Elastic Res. Channel), shown with the Sparse Convolution Branch (Voxelize, Sparse Convolution ×N, Devoxelize), the Point Branch (Multi-Layer Perceptron), and Fuse]
▪ Train Super Network: Fine-Grained Channel + Elastic Depth, Weight Sharing, Uniform Sampling (GPU #1 … GPU #N)
  ▪ Channels: #Cin and #Cout are selected out of max #Cin and max #Cout
  ▪ Depth: Stage I (Depth: 3), Stage II (Depth: 2, 3), Stage III (Depth: 1, 2, 3)
▪ Evolutionary Architecture Search: mutate + crossover
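Weight sharing with elastic channels and elastic depth can be pictured as slicing one large weight matrix and drawing a depth from a per-stage list. The sketch below is a toy under stated assumptions: the stages are read as training stages that progressively unlock shallower depths, sub-networks reuse the leading rows and columns of the largest weight, and every name is hypothetical.

```python
import random

# Depth choices unlocked at each training stage, as listed on the slide.
DEPTH_CHOICES = {"I": [3], "II": [2, 3], "III": [1, 2, 3]}


def slice_weight(full_weight, c_in, c_out):
    """Reuse the leading c_out rows and c_in columns of the largest weight."""
    return [row[:c_in] for row in full_weight[:c_out]]


def sample_subnet(stage, max_c_in, max_c_out, rng):
    """Uniformly sample a depth and channel counts for one block."""
    return {
        "depth": rng.choice(DEPTH_CHOICES[stage]),
        "c_in": rng.randint(1, max_c_in),
        "c_out": rng.randint(1, max_c_out),
    }


rng = random.Random(0)
# Largest layer: max #Cout = 6 rows, max #Cin = 8 columns.
full = [[float(o * 10 + i) for i in range(8)] for o in range(6)]
cfg = sample_subnet("III", max_c_in=8, max_c_out=6, rng=rng)
sub = slice_weight(full, cfg["c_in"], cfg["c_out"])
print(cfg, len(sub), len(sub[0]))
```

Because every sub-network reads from the same stored weights, training one sampled configuration also updates parameters that other configurations will reuse.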
Large kernels are efficient in 2D-NAS
▪ #MACs: Two 3x3x3 = 1x, One 5x5x5 = 2x
▪ Kernel Map Cost: Two 3x3x3 = 1x, One 5x5x5 = 5x, Hybrid = 6x
The cost of large kernels in 3D deep learning is more prohibitive than in 2D.
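The #MACs gap follows from counting kernel weights. The sketch below compares one 5-wide kernel with two stacked 3-wide kernels, assuming equal input and output channels and a fully populated neighbourhood (sparse convolutions visit fewer neighbours, so this is an upper-bound style estimate); the function names are made up for the example.

```python
def kernel_weights(kernel_size, dims):
    """Weights per input/output channel pair for a cubic kernel."""
    return kernel_size ** dims


def large_vs_small_ratio(dims):
    """MACs of one 5-wide kernel relative to two stacked 3-wide kernels."""
    return kernel_weights(5, dims) / (2 * kernel_weights(3, dims))


print(round(large_vs_small_ratio(2), 2))  # 2D: 25 / 18
print(round(large_vs_small_ratio(3), 2))  # 3D: 125 / 54
```

In 2D the larger kernel costs about 1.4x the stacked small kernels, while in 3D the ratio is about 2.3x, in line with the roughly 2x shown on the slide.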
Scaling network channels alone does not yield efficient models.
▪ FLOPs: 7.5G → 1.9G (4x)
▪ Latency: 105 ms → 96 ms (1.1x)
▪ Number of Training Samples: ImageNet 1,280,000 vs. SemanticKITTI 19,112 (67x less)
▪ Number of Archs Sampled: ImageNet 85,000 vs. SemanticKITTI 17,940 (4.7x less)
▪ Distributed Sampling: different sub-networks on different GPUs (GPU #1, GPU #2, …, GPU #N)
▪ Synchronized Sampling: the same sub-networks on different GPUs
[Figure: mIoU vs. #MACs (G); legend: Distributed Sampling, Synchronized Sampling; plotted mIoU series: 61.5, 62.9, 63.5, 64.7 and 60.4, 62.0, 62.8, 63.5]
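The difference between the two sampling schemes comes down to how each GPU seeds its random draw. The sketch below simulates this on one machine with the standard library; the candidate channel ratios, the seeding rule, and the function name are assumptions for illustration and do not reflect a specific training framework.

```python
import random

CHOICES = [0.5, 0.75, 1.0]  # hypothetical channel ratios to pick from


def sample_per_gpu(num_gpus, step, distributed):
    """Return the sub-network choice drawn by each GPU at one training step."""
    picks = []
    for rank in range(num_gpus):
        # Distributed: the seed depends on the rank, so GPUs may diverge.
        # Synchronized: every rank shares the seed and draws the same choice.
        seed = step * num_gpus + rank if distributed else step
        picks.append(random.Random(seed).choice(CHOICES))
    return picks


print(sample_per_gpu(num_gpus=8, step=3, distributed=True))
print(sample_per_gpu(num_gpus=8, step=3, distributed=False))
```

With rank-dependent seeds, one optimisation step can average gradients from several different sub-networks instead of repeating a single one.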
▪ SemanticKITTI: 29x larger than ScanNet, 160x larger than S3DIS
▪ Annotation for video sequences
We achieve 8x MACs reduction and 3x speedup over MinkowskiNet with SPVNAS.
▪ mIoU: MinkowskiNet 63, SPVNAS 64
▪ #MACs: MinkowskiNet 114G, SPVNAS 15G (7.6x smaller)
▪ GTX 1080Ti Latency: MinkowskiNet 294 ms, SPVNAS 110 ms (2.7x faster)
Both a better module (SPVConv) and 3D-NAS improve the performance of MinkowskiNet.
[Figure: mIoU vs. #MACs (G) and mIoU vs. GPU Latency (ms)]
▪ MinkowskiNet mIoU: 57.5, 60.0, 62.8, 63.1
▪ SPVCNN (Ours) mIoU: 58.5, 61.6, 64.4, 65.3
▪ SPVNAS (Ours) mIoU: 63.6, 64.5, 65.2, 66.0, 66.4
We achieve up to 25 mIoU improvement on safety-critical small objects.
▪ Person IoU: MinkowskiNet 61, SPVNAS 66 (+4.8 IoU)
▪ Bicycle IoU: MinkowskiNet 40, SPVNAS 52 (+11.2 IoU)
▪ Motorcyclist IoU: MinkowskiNet 19, SPVNAS 44 (+25 IoU)
We achieve up to 58x MACs reduction and 46x params reduction over projection-based methods.
▪ mIoU: DarkNet 49.9, SqueezeSegV3 56, PolarNet 57, SPVNAS 60 (+10.4 mIoU)
▪ #MACs: DarkNet 376G, SqueezeSegV3 515G, PolarNet 135G, SPVNAS 9G (58x smaller)
▪ #Params: DarkNet 50M, SqueezeSegV3 26M, PolarNet 14M, SPVNAS 1M (46x smaller)
[Figure: MinkowskiNet vs. SPVNAS (Ours) vs. Ground Truth; highlighted classes: traffic sign, boundary, bicycle, person, pole]
▪ MinkowskiNet: Mean IoU 63.1, Throughput 3.4 FPS (21.7M Params, 114.0G FLOPs)
▪ SPVNAS (Ours): Mean IoU 63.6, Throughput 9.1 FPS (2.6M Params, 15.0G FLOPs)
SPVNAS outperforms the state-of-the-art MinkowskiNet (with 3x measured speedup and 8x model size reduction).
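The throughput and reduction figures can be cross-checked against the latencies reported earlier in the deck (294 ms and 110 ms on a GTX 1080Ti). The helper names below are made up for this check, and it assumes one frame is processed at a time so that throughput is simply the inverse of latency.

```python
def fps_from_latency(latency_ms):
    """Frames per second when frames are processed one after another."""
    return 1000.0 / latency_ms


def reduction(baseline, improved):
    """How many times smaller (or faster) the improved value is."""
    return baseline / improved


print(round(fps_from_latency(294), 1))    # MinkowskiNet throughput
print(round(fps_from_latency(110), 1))    # SPVNAS throughput
print(round(reduction(294, 110), 1))      # measured speedup
print(round(reduction(114.0, 15.0), 1))   # FLOPs reduction
print(round(reduction(21.7, 2.6), 1))     # parameter reduction
```

The results line up with the 3.4 FPS and 9.1 FPS on the slide and with the rounded 3x speedup and 8x size reduction.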
▪ DarkNet53Seg: Mean IoU 49.9, Throughput 9.7 FPS (50.4M Params, 376.3G FLOPs)
▪ SPVNAS (Ours): Mean IoU 60.3 (> KPConv), Throughput 11.2 FPS (1.1M Params, 8.9G FLOPs)
[Figure: Evolutionary Search vs. Random Search over Search Iteration (5 to 25); second panel over Network Index (5 to 25) with annotated values 61.5, 60.7, 60.0, 61.1]
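An evolutionary search of the mutate-plus-crossover kind can be sketched on a toy problem. The version below is a generic illustration, not the search used for SPVNAS: architectures are short lists of channel choices, the fitness function is a made-up stand-in for accuracy under a resource budget, and the population sizes and mutation probability are arbitrary.

```python
import random


def mutate(arch, choices, prob, rng):
    """Resample each gene with probability prob."""
    return [rng.choice(choices) if rng.random() < prob else g for g in arch]


def crossover(a, b, rng):
    """Pick each gene from one of the two parents."""
    return [rng.choice(pair) for pair in zip(a, b)]


def evolve(fitness, choices, length, rng, population=20, parents=5, iterations=25):
    pop = [[rng.choice(choices) for _ in range(length)] for _ in range(population)]
    for _ in range(iterations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[:parents]                      # the best candidates survive
        children = []
        while len(elite) + len(children) < population:
            if rng.random() < 0.5:
                children.append(mutate(rng.choice(elite), choices, 0.2, rng))
            else:
                children.append(crossover(rng.choice(elite), rng.choice(elite), rng))
        pop = elite + children
    return max(pop, key=fitness)


def toy_fitness(arch, budget=12):
    """Hypothetical score: best when total width exactly meets the budget."""
    return -abs(sum(arch) - budget)


best = evolve(toy_fitness, choices=[1, 2, 3], length=6, rng=random.Random(0))
print(best, toy_fitness(best))
```

Keeping the elite unchanged between iterations means the best score never gets worse, which is one reason such a search tends to improve steadily where purely random sampling does not.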
[Figure: Encoder vs. Decoder computation share; MinkowskiNet 23% / 77%, SPVNAS - 20G 47% / 53%]
SPVNAS balances the encoder / decoder computation ratio.
[Figure: SECOND vs. SPVCNN vs. Ground Truth; annotations: missing prediction, duplicate predictions, correct prediction]
▪ Original Solution: Accuracy 95%, Range 8 meters, Latency 2 ms/object
▪ PVCNN: Accuracy 99.93%, Range 12 meters, Latency 1.25 ms/object