3D ShapeNets: A Deep Representation for Volumetric Shape Modeling
by Wu, Song, Khosla, Yu, Zhang, Tang, Xiao
presented by Abhishek Sinha
3D Shape Prior
Slide Credit: Wu, Song et al. 3D ShapeNets: A Deep Representation for Volumetric Shape Modeling, CVPR 2015
Shape Synthesis
Feature Extractor
Shape Completion
2.5D Object Recognition (e.g., person, tricycle)
Simple Shapes Complex Shapes
building blocks full object
mesh classification, shape completion, shape generation, 2.5D object recognition (e.g., person, tricycle)
mesh binary voxel
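One way to picture the mesh-to-voxel step is to mark every grid cell that contains a surface sample as occupied. The sketch below is only an illustration, not the authors' pipeline: it assumes the mesh has already been sampled into an (N, 3) array of points, and the name `voxelize_points` and the default resolution of 30 are illustrative choices (30 matches the input size shown on the architecture slide).

```python
import numpy as np

def voxelize_points(points, resolution=30):
    """Return a binary occupancy grid of shape (resolution,) * 3.

    points: (N, 3) array of surface samples (assumed input, e.g. sampled from a mesh).
    """
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    extent = (points.max(axis=0) - lo).max()
    if extent == 0.0:
        extent = 1.0  # degenerate case: all points coincide
    # Scale uniformly so the longest side spans the grid and the aspect ratio is kept.
    idx = np.floor((points - lo) / extent * resolution).astype(int)
    idx = np.clip(idx, 0, resolution - 1)  # points on the max face map to the last cell
    grid = np.zeros((resolution,) * 3, dtype=np.uint8)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return grid

# Usage: the eight corners of a unit cube occupy the eight corner voxels.
corners = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)])
grid = voxelize_points(corners, resolution=30)
print(grid.shape, int(grid.sum()))
```

A real mesh would need dense surface (or interior) sampling before this step; sparse samples leave holes in the grid.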
Convolutional Deep Belief Network
A Deep Belief Network is a generative graphical model that describes the distribution of input x
Layer configurations:
Layer 1-3: convolutional RBM
Layer 4: fully connected RBM
Layer 5: multinomial label + Bernoulli feature form an associative memory
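For reference, a standard (non-convolutional) RBM with binary visible units v and hidden units h defines its distribution through an energy function. The notation below (W, b, c, Z, and the logistic function σ) is the usual textbook one and is an assumption here, since the slide does not spell it out; convolutional layers share weights across positions but follow the same pattern.

```latex
E(v, h) = -b^\top v - c^\top h - v^\top W h, \qquad
p(v, h) = \frac{e^{-E(v, h)}}{Z}, \qquad
p(h_j = 1 \mid v) = \sigma\!\left(c_j + \sum_i v_i W_{ij}\right)
```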
generative process, discriminative process
* 3D ShapeNets can be converted into a CNN, and discriminatively trained with back-propagation.
Maximum Likelihood Learning
Layer-wise pre-training:
Lower four layers are trained by CD
Last layer is trained by FPCD [1]
Fine-tuning:
Wake-sleep [2], but keep weights tied
[1] Tieleman, et al. "Using fast weights to improve persistent contrastive divergence."
[2] Hinton, et al. "A fast learning algorithm for deep belief nets." Neural Computation
generation process:
Gibbs Sampling
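As a rough illustration of the sampling loop, the sketch below runs alternating Gibbs sampling in a single binary RBM. It is a toy under stated assumptions: the real model samples through several convolutional layers, whereas here there is one fully connected layer with untrained random weights, and the names `gibbs_sample`, `W`, `b`, `c` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_sample(W, b, c, n_steps=50, rng=None):
    """Run alternating Gibbs sampling in a binary RBM; return the final visible sample.

    W: (n_visible, n_hidden) weights, b: visible biases, c: hidden biases.
    """
    rng = np.random.default_rng() if rng is None else rng
    v = (rng.random(W.shape[0]) < 0.5).astype(float)  # random starting state
    for _ in range(n_steps):
        p_h = sigmoid(c + v @ W)                         # p(h = 1 | v)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        p_v = sigmoid(b + W @ h)                         # p(v = 1 | h)
        v = (rng.random(p_v.shape) < p_v).astype(float)
    return v

# Toy usage with untrained random weights (illustrative sizes only).
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(27, 8))   # 27 = a 3x3x3 voxel grid, flattened
sample = gibbs_sample(W, np.zeros(27), np.zeros(8), n_steps=20, rng=rng)
voxels = sample.reshape(3, 3, 3)
```

With trained weights, the reshaped sample would be read as a small binary voxel grid.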
Query keywords: common object categories from the SUN database that contain no fewer than 20 object instances per category
Gibbs sampling with clamping
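Clamping can be sketched as ordinary Gibbs sampling in which the observed units are reset to their known values after every sweep, so only the unknown units are actually sampled. The code below is a minimal single-RBM illustration with hypothetical names (`complete_with_clamping`, `observed`) and random untrained weights, not the authors' multi-layer procedure.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def complete_with_clamping(v_obs, observed, W, b, c, n_steps=50, rng=None):
    """Gibbs sampling where observed visible units are held at their known values.

    v_obs: (n_visible,) binary values, only meaningful where observed is True.
    observed: (n_visible,) boolean mask of clamped units.
    """
    rng = np.random.default_rng() if rng is None else rng
    v = np.where(observed, v_obs, rng.random(v_obs.shape) < 0.5).astype(float)
    for _ in range(n_steps):
        p_h = sigmoid(c + v @ W)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        p_v = sigmoid(b + W @ h)
        v_new = (rng.random(p_v.shape) < p_v).astype(float)
        v = np.where(observed, v_obs, v_new)  # clamp: keep observed units fixed
    return v

# Toy usage: the first 9 of 27 voxels are observed as occupied.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(27, 8))
v_obs = np.zeros(27)
v_obs[:9] = 1.0
observed = np.zeros(27, dtype=bool)
observed[:9] = True
completed = complete_with_clamping(v_obs, observed, W, np.zeros(27), np.zeros(8), rng=rng)
```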
[29] R. Socher, B. Huval, B. Bhat, C. D. Manning, and A. Y. Ng. Convolutional-recursive deep learning for 3d object classification. In NIPS 2012.
Training on CAD models and no discriminative tuning!
Next-Best-View
Volumetric representation: What is it? sofa? bathtub? dresser?
Not sure. Look from another view?
Where to look next?
New depth map, 3D ShapeNets: Aha! It is a sofa!
Mathematically, this is equivalent to evaluating the conditional mutual information.
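A conditional mutual information of this kind can be estimated as the entropy of the current class belief minus the expected entropy after the new observation. The sketch below assumes that, for each candidate view, we already have class posteriors computed for a handful of sampled hypothetical observations; the function names and the toy numbers are illustrative, and how the samples are produced is outside this sketch.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def expected_information_gain(posteriors):
    """Monte Carlo estimate of the information a candidate view gives about the class.

    posteriors: list of class distributions, one per sampled hypothetical
    observation from that view (assumed drawn given what was already observed).
    """
    n = len(posteriors)
    k = len(posteriors[0])
    # The current belief is approximated by averaging over the sampled observations.
    marginal = [sum(p[j] for p in posteriors) / n for j in range(k)]
    return entropy(marginal) - sum(entropy(p) for p in posteriors) / n

def next_best_view(candidates):
    """Return the key of the view with the largest estimated information gain."""
    return max(candidates, key=lambda name: expected_information_gain(candidates[name]))

candidates = {
    "front": [[0.5, 0.5], [0.5, 0.5]],   # the new view never changes the belief
    "side":  [[1.0, 0.0], [0.0, 1.0]],   # the new view always resolves the class
}
best = next_best_view(candidates)
print(best)  # side
```

The "side" view wins because every hypothetical observation from it removes all uncertainty, while "front" leaves the belief unchanged.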
Recognition Accuracy from Two Views.
Based on the algorithms' choice, we obtain the actual depth map for the next view and recognize the objects using two views with our 3D ShapeNets.
Our algorithm stands out as the uncertainty of the first view increases.
3D ShapeNets: 48 filters of stride 2, 160 filters of stride 2, 512 filters of stride 1; layer sizes 30, 13, 5, 2, 1200, 4000
3D CNN: 48 filters of stride 2, 160 filters of stride 2, 512 filters of stride 1; layer sizes 30, 13, 5, 2, 1200
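The grid sizes 30, 13, 5 and 2 on the slide are consistent with unpadded strided convolutions. The sketch below checks that arithmetic; note that the filter sizes 6, 5 and 4 are an assumption chosen to reproduce the slide's numbers, since only the filter counts and strides appear in the deck.

```python
def conv_output_size(input_size, filter_size, stride):
    """Output side length of a valid (unpadded) strided convolution."""
    return (input_size - filter_size) // stride + 1

# (number of filters, filter size, stride); filter sizes 6, 5, 4 are assumed here,
# only the filter counts and strides appear on the slide.
layers = [(48, 6, 2), (160, 5, 2), (512, 4, 1)]

size = 30
sizes = [size]
for n_filters, filter_size, stride in layers:
    size = conv_output_size(size, filter_size, stride)
    sizes.append(size)

print(sizes)  # [30, 13, 5, 2]
```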
Mesh Classification & Retrieval
2.5D object recognition
performance on classification tasks?
http://3dshapenets.cs.princeton.edu/
approximate training and inference techniques rather than standard back-prop?
Hinton, et al. "A fast learning algorithm for deep belief nets." Neural Computation 18.7 (2006): 1527-1554
"Restricted Boltzmann machines for collaborative filtering." Proceedings of the 24th international conference on Machine
In particular, doesn’t the voxel representation have the bottleneck of cubic dependency on grid size?
from multiple 2D views instead of voxel representation and get better results for classification.
ICCV 2015.
problem
contain only 4 non-rigid categories: persons, plants, sofas, curtains.
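[Figure: alternating Gibbs chain between visible units i and hidden units j, from t = 0 (data) to t = 1 (reconstruction), with pairwise statistics $\langle v_i h_j \rangle^0$ and $\langle v_i h_j \rangle^1$]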
Start with a training vector on the visible units.
Update all the hidden units in parallel.
Update all the visible units in parallel to get a “reconstruction”.
Update all the hidden units again.
This is not following the gradient of the log likelihood. But it works well. It is approximately following the gradient of another objective function.
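The steps above can be sketched as a single contrastive divergence (CD-1) update for a small binary RBM. This is a minimal illustration with hypothetical names (`cd1_update`, `lr`) and toy sizes; using hidden probabilities rather than sampled states for the statistics is a common practical choice and an assumption here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, lr=0.1, rng=None):
    """One CD-1 step for a binary RBM; returns updated (W, b, c).

    v0: (batch, n_visible) training vectors.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Update all hidden units in parallel from the data.
    p_h0 = sigmoid(c + v0 @ W)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Update all visible units in parallel to get a "reconstruction".
    p_v1 = sigmoid(b + h0 @ W.T)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    # Update all hidden units again (probabilities are used for the statistics).
    p_h1 = sigmoid(c + v1 @ W)
    n = v0.shape[0]
    # Difference of <v_i h_j> between data (t = 0) and reconstruction (t = 1).
    W = W + lr * (v0.T @ p_h0 - v1.T @ p_h1) / n
    b = b + lr * (v0 - v1).mean(axis=0)
    c = c + lr * (p_h0 - p_h1).mean(axis=0)
    return W, b, c

# Toy usage: 16 random binary vectors, 6 visible and 4 hidden units.
rng = np.random.default_rng(0)
data = (rng.random((16, 6)) < 0.5).astype(float)
W, b, c = rng.normal(scale=0.01, size=(6, 4)), np.zeros(6), np.zeros(4)
for _ in range(10):
    W, b, c = cd1_update(data, W, b, c, rng=rng)
```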
Slide Credit: Geoffrey Hinton
Wake phase: use the recognition weights to perform a bottom-up pass.
– Train the generative weights to reconstruct activities in each layer from the layer above.
Sleep phase: use the generative weights to generate samples from the model.
– Train the recognition weights to reconstruct activities in each layer from the layer below.
[Figure: layered network with data, h1, h2, h3]
Slide Credit: Geoffrey Hinton