Learning Dense Correspondence via 3D-guided Cycle Consistency
Tinghui Zhou¹, Philipp Krähenbühl¹, Mathieu Aubry², Qixing Huang³, Alexei A. Efros¹
¹UC Berkeley, ²ENPC ParisTech, ³TTI-Chicago
The Unreasonable Effectiveness of Deep Learning?
[Bar chart: performance gain over traditional methods (axis ticks at 15%, 30%, 45%, 60%) for object detection, semantic segmentation, human pose, intrinsic image, video segmentation, and dense matching, ordered from lots of direct labels to very few direct labels]
Dense Semantic Correspondence
Traditional Pairwise Methods
- SIFT flow: Liu et al., ECCV 2008
- Generalized PatchMatch: Barnes et al., ECCV 2010
- Deformable Spatial Pyramid: Kim et al., CVPR 2013
[Diagram: hand-crafted features from each image → feature matching]
Collection Correspondence
- Congealing: Learned-Miller, PAMI 2006
- Collection Flow: Kemelmacher-Shlizerman et al., CVPR 2012
- Object discovery and segmentation: Rubinstein et al., CVPR 2013
- Compositional Image Model: Mobahi et al., CVPR 2014
- Object discovery and localization: Cho et al., CVPR 2015
- FlowWeb: T. Zhou et al., CVPR 2015
- Multi-image Matching: X. Zhou et al., ICCV 2015
Labels for CNN Training?
Infeasible to label at large scale
Cycle-consistency as Supervision
- Composite flows along a cycle should be zero
- 2-cycle consistency: $F_{i,j} \circ F_{j,i} = 0$
- 3-cycle consistency: $F_{i,k} \circ F_{k,j} \circ F_{j,i} = 0$
[Diagram: the amount of inconsistency along a cycle is the training signal for the CNN]
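The cycle constraints above can be made concrete with a small NumPy sketch. It assumes a flow is an (H, W, 2) array of per-pixel (dx, dy) displacements and composes flows with nearest-neighbour lookup. The helper names `constant_flow`, `compose`, and `cycle_inconsistency` are illustrative and do not come from the slides.

```python
import numpy as np

def constant_flow(dx, dy, shape=(4, 5)):
    """Hypothetical helper: a flow that shifts every pixel by (dx, dy)."""
    return np.broadcast_to(np.array([dx, dy], dtype=float), shape + (2,)).copy()

def compose(f_ab, f_bc):
    """Chain two flows: follow f_ab from image a to b, then f_bc from b to c.

    Sampling is nearest-neighbour with border clipping, an assumption
    made to keep the sketch short.
    """
    h, w, _ = f_ab.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Landing position in image b of every pixel of image a.
    xb = np.clip(np.rint(xs + f_ab[..., 0]).astype(int), 0, w - 1)
    yb = np.clip(np.rint(ys + f_ab[..., 1]).astype(int), 0, h - 1)
    return f_ab + f_bc[yb, xb]

def cycle_inconsistency(*flows):
    """Mean length of the composite flow along a cycle (0 when consistent)."""
    total = flows[0]
    for f in flows[1:]:
        total = compose(total, f)
    return float(np.linalg.norm(total, axis=-1).mean())

f_ij = constant_flow(2, 0)
f_ji = constant_flow(-2, 0)
f_ik = constant_flow(1, 1)
f_kj = constant_flow(1, -1)

two_cycle = cycle_inconsistency(f_ij, f_ji)            # i -> j -> i
three_cycle = cycle_inconsistency(f_ik, f_kj, f_ji)    # i -> k -> j -> i
broken = cycle_inconsistency(f_ij, constant_flow(-1, 0))
```

Here `two_cycle` and `three_cycle` are zero because the translations cancel, while `broken` is positive because the return flow undoes only part of the forward shift.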
Cycle Consistency in Vision
- Shape matching: Huang et al, SGP’13
- SfM: Zach et al, CVPR’10
- Co-segmentation: Wang et al, ICCV’13
- Collection correspondence: Zhou et al, CVPR’15; Zhou et al, ICCV’15
Could be consistent but wrong…
Need an anchor edge!
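A toy illustration of why consistency alone is not enough, under the assumption that the images differ only by a horizontal shift, so that composing flows reduces to adding translations. The `offsets` table and both predictors are hypothetical.

```python
import numpy as np

# Hypothetical setup: three images of the same object, shifted horizontally.
offsets = {"i": 0.0, "j": 3.0, "k": 5.0}

def true_flow(a, b):
    """Ground-truth translation from image a to image b."""
    return np.array([offsets[b] - offsets[a], 0.0])

def lazy_flow(a, b):
    """Degenerate predictor: claims that nothing ever moves."""
    return np.zeros(2)

def cycle_residual(flow, cycle):
    """Length of the summed flow around a closed cycle of image names."""
    steps = zip(cycle, cycle[1:] + cycle[:1])
    return float(np.linalg.norm(sum(flow(a, b) for a, b in steps)))

cycle = ["i", "k", "j"]
residual_true = cycle_residual(true_flow, cycle)
residual_lazy = cycle_residual(lazy_flow, cycle)
error_lazy = float(np.linalg.norm(lazy_flow("i", "j") - true_flow("i", "j")))
```

Both predictors have zero cycle residual, yet `lazy_flow` is off by three pixels on the edge from `i` to `j`. A cycle loss cannot tell them apart unless at least one edge is tied to known correspondence.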
Synthetic Correspondence as the Anchor
- 3D CAD model + viewpoint → renderer
- Correspondence from renderer
3D-guided Cycle Consistency
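[Diagram, training time: a cycle through synthetic $s_1$, real $r_1$, real $r_2$, and synthetic $s_2$, with edges $F_{s_1,r_1}$, $F_{r_1,r_2}$, $F_{r_2,s_2}$ and the ground-truth edge $\tilde{F}_{s_1,s_2}$]
- Accumulate flow vectors along the cycle: $\tilde{F}_{s_1,s_2} = F_{s_1,r_1} \circ F_{r_1,r_2} \circ F_{r_2,s_2}$
- Ground truth: $\tilde{F}_{s_1,s_2}$
- Training objective: $\min \sum_{\langle s_1, s_2, r_1, r_2 \rangle} \mathcal{L}\left(\tilde{F}_{s_1,s_2},\; F_{s_1,r_1} \circ F_{r_1,r_2} \circ F_{r_2,s_2}\right)$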
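The training objective can be sketched in NumPy as follows. This is a minimal sketch, not the presented implementation: it assumes flows are (H, W, 2) arrays of (dx, dy) displacements, composes them with nearest-neighbour lookup, and uses mean endpoint error as a stand-in for the loss $\mathcal{L}$, whose exact form the slides do not give.

```python
import numpy as np

def compose(f_ab, f_bc):
    """Follow f_ab, then f_bc sampled at the landing position (nearest pixel)."""
    h, w, _ = f_ab.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xb = np.clip(np.rint(xs + f_ab[..., 0]).astype(int), 0, w - 1)
    yb = np.clip(np.rint(ys + f_ab[..., 1]).astype(int), 0, h - 1)
    return f_ab + f_bc[yb, xb]

def cycle_loss(gt_s1_s2, f_s1_r1, f_r1_r2, f_r2_s2):
    """Mean endpoint error between the composed flow and the rendered ground truth.

    Endpoint error is an assumption standing in for the unspecified loss L.
    """
    pred = compose(compose(f_s1_r1, f_r1_r2), f_r2_s2)
    return float(np.linalg.norm(pred - gt_s1_s2, axis=-1).mean())

def shift(dx, dy, shape=(6, 6)):
    """Hypothetical helper: constant translation flow."""
    return np.broadcast_to(np.array([dx, dy], dtype=float), shape + (2,)).copy()

gt = shift(4, 0)  # synthetic-to-synthetic flow, known from the renderer
good = cycle_loss(gt, shift(1, 1), shift(2, 0), shift(1, -1))
lazy = cycle_loss(gt, shift(0, 0), shift(0, 0), shift(0, 0))
```

With the ground-truth edge in the cycle, the all-zero prediction is no longer free: `lazy` equals the length of the ground-truth shift, while `good` is zero.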
Network Architecture
[Architecture diagram: source and target branches with weight sharing, decoded into a flow field]
Matchability Prediction
[Diagram: source + target → CNN → flow field + matchability]
- Background: ✗
- Occlusion: ✗
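One way a predicted matchability map might be used downstream is to discard correspondences for pixels that have no match. The sketch below assumes matchability is an (H, W) array of scores in [0, 1]; the 0.5 threshold and the function name `matched_targets` are illustrative choices, not part of the slides.

```python
import numpy as np

def matched_targets(flow, matchability, threshold=0.5):
    """Return source pixels judged matchable and their positions in the target.

    flow: (H, W, 2) array of (dx, dy); matchability: (H, W) scores.
    Both outputs have shape (N, 2) and hold (x, y) coordinates.
    """
    h, w, _ = flow.shape
    ys, xs = np.mgrid[0:h, 0:w]
    keep = matchability > threshold
    src = np.stack([xs[keep], ys[keep]], axis=1)
    dst = src + flow[keep]
    return src, dst

flow = np.zeros((2, 3, 2))
flow[..., 0] = 1.0  # every pixel moves one step to the right
matchability = np.array([[0.9, 0.1, 0.8],
                         [0.2, 0.7, 0.3]])
src, dst = matched_targets(flow, matchability)
```

Only the three pixels with scores above the threshold survive, so background or occluded pixels never produce a correspondence.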
Training Set Construction
- PASCAL 3D (bbox + viewpoint): Xiang et al, WACV’14
- ShapeNet (synthetic rendering): Chang et al, arXiv’15
- Image-to-shape retrieval: Single view reconstruction via joint analysis of image and shape collections, Huang et al., SIGGRAPH 2015
One training example
- ~80,000 examples per category
- A single network for all 12 PASCAL3D categories (aero, boat, bus, car, chair, etc.)
RESULTS
Image Warping Visualization
[Figure columns: Target | Source | SIFT flow | Ours]
Keypoint Transfer
[Figure columns: Source | Target | SIFT flow | Ours]
Accuracy (PCK):
          SIFT flow   Ours
Mean      19.6        24.0
…
Car       22.4        33.3
Bus       28.6        40.3
Bottle    28.3        40.3
TV        42.9        51.1
…
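For readers unfamiliar with the metric, PCK (percentage of correct keypoints) counts a transferred keypoint as correct when it lands within a fraction `alpha` of the image size from its ground-truth location. The sketch below assumes `alpha = 0.1` and normalisation by `max(height, width)`, which are common conventions but are not stated in the slides; the keypoints and flow are made-up toy data.

```python
import numpy as np

def transfer_keypoints(flow, keypoints):
    """Move integer (x, y) source keypoints to the target using the flow there."""
    kp = np.asarray(keypoints, dtype=int)
    return kp + flow[kp[:, 1], kp[:, 0]]

def pck(pred, gt, height, width, alpha=0.1):
    """Fraction of predictions within alpha * max(height, width) of ground truth."""
    diff = np.asarray(pred, dtype=float) - np.asarray(gt, dtype=float)
    dist = np.linalg.norm(diff, axis=1)
    return float((dist <= alpha * max(height, width)).mean())

height, width = 20, 20
flow = np.zeros((height, width, 2))
flow[..., 0] = 3.0  # toy flow: every pixel moves 3 px to the right

source_kp = [(2, 2), (5, 10), (10, 4), (7, 7)]
target_kp = [(5, 2), (8, 10), (13, 9), (20, 7)]  # hypothetical annotations

pred_kp = transfer_keypoints(flow, source_kp)
score = pck(pred_kp, target_kp, height, width)
```

In this toy case two of the four keypoints land within 2 px of their annotations, so `score` is 0.5.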
Matchability Prediction
[Figure columns: Source | Target | Ours | Ground truth]
Accuracy: SIFT flow 64.5, Ours 72.0
t-SNE Feature Visualization
[Architecture diagram: source and target branches with weight sharing, with flow field and matchability outputs; the global image features are the ones visualized]
[t-SNE plot groups: side views, 45° views, frontal views]
Application: Cross-domain Dense Label Transfer
[Figure columns: Source | Target | Dense CRF | SIFT flow | Ours]
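Dense label transfer can be sketched as a backward warp: each target pixel looks up the label of the source pixel it corresponds to. The sketch assumes the flow points from target to source, that both images have the same size, and that nearest-neighbour lookup with border clipping is acceptable; none of these details are specified in the slides.

```python
import numpy as np

def transfer_labels(source_labels, flow_target_to_source):
    """Warp an (H, W) label map onto the target grid using an (H, W, 2) flow."""
    h, w = source_labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Source position that each target pixel corresponds to, clipped to the image.
    xs_src = np.clip(np.rint(xs + flow_target_to_source[..., 0]).astype(int), 0, w - 1)
    ys_src = np.clip(np.rint(ys + flow_target_to_source[..., 1]).astype(int), 0, h - 1)
    return source_labels[ys_src, xs_src]

source_labels = np.array([[0, 1, 2],
                          [3, 4, 5]])
flow = np.zeros((2, 3, 2))
flow[..., 0] = 1.0  # each target pixel corresponds to the source pixel to its right

target_labels = transfer_labels(source_labels, flow)
```

The same lookup applied to colour values instead of labels gives an image warp like the ones in the warping visualization.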
Conclusion
[Diagram, training time: cycle through synthetic $s_1$, real $r_1$, real $r_2$, and synthetic $s_2$, with edges $F_{s_1,r_1}$, $F_{r_1,r_2}$, $F_{r_2,s_2}$ and ground truth $\tilde{F}_{s_1,s_2}$]
- Cycle consistency is effective when direct labels are not available
- ‘Meta’-supervision: supervising the behavior of the data