POINT CLOUD DEEP LEARNING
Innfarn Yoo, 3/29/2018
AGENDA
Introduction
Previous Work
Method
Result
Conclusion
INTRODUCTION
2D OBJECT CLASSIFICATION
Deep Learning for 2D Object
works really well
precise boundaries of objects (FCN, Mask R-CNN)
[1] He et al., Mask R-CNN (2017)
more attention
than before (LiDAR & other sensors)
detection accuracy
VoxelNet, & VRN Ensemble
[2] Zhou and Tuzel, VoxelNet (2017)
classification
Perceptron (MLP) Max Pool (MP) Classification
[5] Qi et al., PointNet (2017)
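The labels above (MLP, Max Pool, Classification) follow the PointNet idea [5]: the same MLP is applied to every point, and a max pool over all points gives a feature that does not depend on point order. A minimal sketch of that property, assuming a single hypothetical layer with random weights (the layer sizes and names below are illustrative, not taken from the slides):

```python
import random

def relu(x):
    return x if x > 0.0 else 0.0

def shared_mlp(point, weights, biases):
    """Apply one fully connected layer + ReLU to a single 3D point."""
    return [relu(sum(w * c for w, c in zip(row, point)) + b)
            for row, b in zip(weights, biases)]

def global_feature(points, weights, biases):
    """Per-point features followed by an element-wise max over all points."""
    feats = [shared_mlp(p, weights, biases) for p in points]
    return [max(column) for column in zip(*feats)]

rng = random.Random(0)
# Hypothetical layer: 3 inputs (x, y, z) -> 8 features, random weights.
W = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]
B = [rng.uniform(-1, 1) for _ in range(8)]
cloud = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(16)]
shuffled = cloud[:]
rng.shuffle(shuffled)  # same points, different order
```

Calling `global_feature` on `cloud` and on `shuffled` gives the same vector, which is why a classifier built on top of it can consume unordered points.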
Network (MVCNN)
[3] Su et al., MVCNN (2015)
[4] Krizhevsky et al., AlexNet (2012)
[6] Maturana and Scherer, VoxNet (2015)
[7] Brock et al., VRN Ensemble (2016)
[2] Zhou and Tuzel, VoxelNet (2017)
Trainer: C++ program
Loading 3D Object: Load 3D Objects, Sample Points
Sampler Threads: Thread 1, Thread 2, Thread 3, …, Thread N
Each thread: 3D Point Sample → Converter (Pixel, Point, Voxel)
Epoch #i: Call Python NN Model Functions: Train, Test, Eval, Report, & Save
Increase epoch
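The sampler stage above can be sketched as follows. The slides describe a C++ trainer with sampler threads; this is a Python illustration only, and the sampling scheme (area-weighted triangle choice with uniform barycentric coordinates) and all function names are assumptions, not the author's implementation.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def triangle_area(a, b, c):
    """Area of a 3D triangle from the cross product of two edges."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    cx = ab[1] * ac[2] - ab[2] * ac[1]
    cy = ab[2] * ac[0] - ab[0] * ac[2]
    cz = ab[0] * ac[1] - ab[1] * ac[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def sample_surface(vertices, faces, n_points, seed):
    """Sample points on a triangle mesh, picking triangles by area."""
    rng = random.Random(seed)
    tris = [(vertices[i], vertices[j], vertices[k]) for i, j, k in faces]
    areas = [triangle_area(*t) for t in tris]
    points = []
    for a, b, c in rng.choices(tris, weights=areas, k=n_points):
        # Square-root trick gives uniform barycentric coordinates.
        r1, r2 = math.sqrt(rng.random()), rng.random()
        u, v, w = 1.0 - r1, r1 * (1.0 - r2), r1 * r2
        points.append(tuple(u * a[i] + v * b[i] + w * c[i] for i in range(3)))
    return points

def sample_batch(meshes, n_points, n_threads=4):
    """Run one sampling job per mesh on a pool of sampler threads."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [pool.submit(sample_surface, v, f, n_points, seed)
                   for seed, (v, f) in enumerate(meshes)]
        return [fut.result() for fut in futures]

# Unit square in the z = 0 plane, split into two triangles.
square_v = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
square_f = [(0, 1, 2), (0, 2, 3)]
clouds = sample_batch([(square_v, square_f)] * 3, n_points=50)
```

In CPython the thread pool mainly shows the structure of the pipeline; a C++ trainer, as on the slide, would get real parallelism from its threads.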
SHAPENET CORE V2: ShapeNet, https://www.shapenet.org/, 55 Categories, 51,191 Objects (90 GB), OBJ File Format
MODELNET40: Princeton ModelNet Data, http://modelnet.cs.princeton.edu/, 40 Categories, 12,431 Objects (10 GB), OFF (CAD) File Format
MODELNET10: Princeton ModelNet Data, http://modelnet.cs.princeton.edu/, 10 Categories, 4,930 Objects (2 GB), OFF (CAD) File Format
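Loading the OFF files listed above needs only a small parser. The following is a rough sketch, assuming plain ASCII OFF with no comment lines and no per-face colors; `parse_off` is a hypothetical helper, not part of any dataset toolkit. It also tolerates a header fused with the counts (for example `OFF4 4 0`), which some OFF files in the wild contain.

```python
def parse_off(text):
    """Parse ASCII OFF text into (vertices, faces)."""
    tokens = text.split()
    if not tokens or not tokens[0].startswith("OFF"):
        raise ValueError("not an OFF file")
    # Handle both "OFF\n4 4 0" and the fused form "OFF4 4 0".
    fused = tokens[0][3:]
    rest = ([fused] if fused else []) + tokens[1:]
    n_vertices, n_faces = int(rest[0]), int(rest[1])
    pos = 3  # skip vertex, face, and edge counts
    vertices = []
    for _ in range(n_vertices):
        vertices.append(tuple(float(t) for t in rest[pos:pos + 3]))
        pos += 3
    faces = []
    for _ in range(n_faces):
        n = int(rest[pos])  # number of vertex indices in this face
        faces.append(tuple(int(t) for t in rest[pos + 1:pos + 1 + n]))
        pos += 1 + n
    return vertices, faces

# A tetrahedron as a tiny test mesh.
TETRA = """OFF
4 4 0
0 0 0
1 0 0
0 1 0
0 0 1
3 0 1 2
3 0 1 3
3 0 2 3
3 1 2 3
"""
tetra_vertices, tetra_faces = parse_off(TETRA)
```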
Point-Based Models Pixel-Based Models Voxel-Based Models
surfaces
Pool Layers
[Network diagram. Input: 3D points. Blocks: Flatten Vector, Fully Connected Layer, ReLU + Dropout. Loss: Softmax Cross Entropy with Class Onehot Vector.]
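Every network in this part ends in a softmax cross entropy against a one-hot class vector. A small sketch of that loss in plain Python, using the usual max-subtraction for numerical stability (function names are illustrative):

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_cross_entropy(logits, onehot):
    """Cross entropy between softmax(logits) and a one-hot target."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    # log softmax_i = z_i - log_sum; only the target class contributes.
    return -sum(t * (z - log_sum) for z, t in zip(logits, onehot))

uniform_loss = softmax_cross_entropy([0.0, 0.0, 0.0, 0.0], [1, 0, 0, 0])
confident_loss = softmax_cross_entropy([10.0, 0.0, 0.0, 0.0], [1, 0, 0, 0])
```

With four equal logits the loss is `log(4)`; it shrinks as the logit of the correct class grows.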
[Network diagram. Input: 3D points. Blocks: Random 3x3 Rotation, 3D Conv Layer, Max Pooling Layer, Flatten Vector, Fully Connected Layer, ReLU + Dropout. Loss: Softmax Cross Entropy with Class Onehot Vector.]
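The Random 3x3 Rotation block augments the input by rotating the whole point set. The slides do not say how the rotation is drawn, so the sketch below makes an assumption: a random axis and a random angle, turned into a matrix with Rodrigues' formula. This is not a uniform distribution over all rotations, and the names are illustrative.

```python
import math
import random

def random_rotation(rng):
    """Build a 3x3 rotation matrix from a random axis and angle."""
    while True:
        axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(a * a for a in axis))
        if norm > 1e-12:
            break
    x, y, z = (a / norm for a in axis)
    theta = rng.uniform(0.0, 2.0 * math.pi)
    c, s = math.cos(theta), math.sin(theta)
    t = 1.0 - c
    # Rodrigues' rotation formula written out element by element.
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]

def rotate(points, R):
    """Apply rotation matrix R to every 3D point."""
    return [tuple(sum(R[i][j] * p[j] for j in range(3)) for i in range(3))
            for p in points]

R = random_rotation(random.Random(1))
rotated = rotate([(1.0, 2.0, 3.0), (0.0, 0.0, 1.0)], R)
```

A rotation keeps distances to the origin unchanged, so the shape is preserved while its orientation varies between training samples.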
[Network diagram. Input: 3D points. Blocks: Random 3x3 Rotation, 3D Conv Layer, Max Pooling Layer, Flatten Vector, Fully Connected Layer, ReLU + Dropout (x2). Loss: Softmax Cross Entropy with Class Onehot Vector.]
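The ReLU + Dropout blocks in these diagrams sit after fully connected layers. A minimal sketch of one such block, assuming inverted dropout (kept activations are divided by the keep probability, so nothing changes at test time); the drop rate and weights are made up for illustration:

```python
import random

def dense_relu_dropout(x, weights, biases, drop_rate, rng, training=True):
    """Fully connected layer, then ReLU, then (inverted) dropout."""
    keep = 1.0 - drop_rate
    out = []
    for row, b in zip(weights, biases):
        h = max(0.0, sum(w * v for w, v in zip(row, x)) + b)  # ReLU
        if training and drop_rate > 0.0:
            # Keep the unit with probability `keep`, rescale survivors.
            h = h / keep if rng.random() < keep else 0.0
        out.append(h)
    return out

W_demo = [[1.0, -1.0], [-1.0, 1.0]]  # hypothetical 2x2 weights
b_demo = [0.0, 0.0]
rng_demo = random.Random(0)
eval_out = dense_relu_dropout([2.0, 1.0], W_demo, b_demo, 0.5, rng_demo, training=False)
train_out = dense_relu_dropout([2.0, 1.0], W_demo, b_demo, 0.5, rng_demo, training=True)
```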
[Network diagram. Input: 3D points. Blocks: Random 3x3 Rotation, 3D Conv Layer, Max Pooling Layer, Flatten Vector, Fully Connected Layer, ReLU + Dropout (x2). Loss: Softmax Cross Entropy with Class Onehot Vector.]
[Network diagram. Input: 3D points. Blocks: Random 3x3 Rotation, Resample Layer, Max Pooling Layer, Flatten Vector, Fully Connected Layer, ReLU + Dropout (x2). Loss: Softmax Cross Entropy with Class Onehot Vector.]
[Network diagram. Input: 3D points. Blocks: Random 3x3 Rotation, Resample Layer, Max Pooling Layer, Flatten Vector, Fully Connected Layer, ReLU + Dropout (x3). Loss: Softmax Cross Entropy with Class Onehot Vector.]
[Network diagram. Input: Images (32x32x5). Blocks: Flatten Vector, Fully Connected Layer. Loss: Softmax Cross Entropy with Class Onehot Vector.]
[Network diagram. Input: Images (32x32x5). Blocks: Image Separation, 3D Conv Layer, Max Pooling Layer, Concat, Flatten Vector, Fully Connected Layer. Loss: Softmax Cross Entropy with Class Onehot Vector.]
[Network diagram. Input: Images (32x32x5). Blocks: Flatten Vector, Fully Connected Layer. Loss: Softmax Cross Entropy with Class Onehot Vector.]
[Network diagram. Input: Voxels 32x32x32. Blocks: 3D Conv Layer, Max Pooling Layer, Flatten Vector, Fully Connected Layer. Loss: Softmax Cross Entropy with Class Onehot Vector.]
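The voxel-based models take a 32x32x32 grid as input, so sampled points have to be binned into voxels first. A rough sketch of one way to do that, assuming a binary occupancy grid scaled to the object's bounding cube (the slides do not state the exact conversion, and `voxelize` is a hypothetical name):

```python
def voxelize(points, resolution=32):
    """Return the set of occupied (i, j, k) voxel indices."""
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    # Use one scale for all axes so the object keeps its proportions.
    extent = max(maxs[i] - mins[i] for i in range(3)) or 1.0
    occupied = set()
    for p in points:
        index = tuple(
            min(int((p[i] - mins[i]) / extent * resolution), resolution - 1)
            for i in range(3)
        )
        occupied.add(index)
    return occupied

grid = voxelize([(0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (1.0, 1.0, 1.0)])
```

The sparse set can be expanded into a dense 32x32x32 array of zeros and ones before it is fed to a 3D convolution.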
[Network diagram. Input: Voxels 32x32x32. Blocks: 3D Conv Layer, Avg Pooling Layer, Resample Layer, Max Pooling Layer, Concat, Flatten Vector, Fully Connected Layer. Loss: Softmax Cross Entropy with Class Onehot Vector.]
[Figure: # of Train Models and # of Test Models per category (table, toilet, monitor, bathtub, sofa, chair, desk, dresser, night_stand, bed); axis ticks 100 to 1000]
[Figure: Train Accu, Test Accu, and mAP (%) at Iter: 1000 for PC MLP1, PC CNN1, PC MLPs, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet]
[Figure: # of Train Models and # of Test Models; axis ticks 100 to 900]
[Figure: Train Accu, Test Accu, and mAP (%) at Iter: 1000 for PC MLP1, PC CNN1, PC MLPs, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet]
[Figure: # of Train Models and # of Test Models; axis ticks 100 to 900]
[Figure: Train Accu, Test Accu, and mAP (%) at Iter: 1000 for PC MLP1, PC CNN1, PC MLPs, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet]
[Figure: # of Train Models and # of Test Models; axis ticks 100 to 900]
[Figure: Train Accu, Test Accu, and mAP for PC MLP1, PC CNN1, PC MLPs, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet; axis ticks 10 to 100]
[Figure: # of Train Models and # of Test Models; axis ticks 100 to 900]
[Figure: Train Accu, Test Accu, and mAP (%) at Iter: 1000 for PC MLP1, PC CNN1, PC MLPs, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet]
[Figure: Total Training Time (hours; axis ticks 0.00 to 3.00) for PC MLP1, PC CNN1, PC MLPs, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet]
[Figure: Inference Time Per Batch (seconds; axis ticks 0.01 to 0.07) for PC MLP1, PC CNN1, PC MLPs, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet]
[Figure: Total Training Time (hours; axis ticks 0.00 to 8.00) for PC MLP1, PC CNN1, PC MLPs, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet]
[Figure: Inference Time Per Batch (seconds; axis ticks 0.01 to 0.07) for PC MLP1, PC CNN1, PC MLPs, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet]
[Figure: Train Accu, Test Accu, and mAP (%) at Iter: 600 for PC MP, PX MVCNN, VX CNN]
[Figure: per-model curves, 11 panels (PC MLP1, PC MLPs, PC CNN1, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet); y-axis ticks 0.2 to 1, x-axis ticks 50 to 15410]
[Figure: per-model curves, 11 panels (PC MLP1, PC MLPs, PC CNN1, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet); y-axis ticks 0.2 to 1, x-axis ticks 1000 to 4000]
[Figure: per-model curves, 11 panels (PC MLP1, PC MLPs, PC CNN1, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet); y-axis ticks 0.5 to 2.5, x-axis ticks 50 to 15410]
accuracy and higher computation cost than ordered data
computation cost (converge faster, & more precise)
this problem or not
Point Cloud Based 3D Object Detection
3d shape recognition
Convolutional Neural Networks
classification and segmentation
Neural Network for Real-Time Object Recognition
Modeling with Convolutional Neural Networks
Detection with Region Proposal Networks
Repository
Volumetric Shapes
and Next-Best-View Prediction