Semantic 3D Modelling
Ľubor Ladický
work with Christian Häne, Nikolay Savinov, Jianbo Shi, Bernhard Zeisl, Marc Pollefeys
Schedule
Graph-Cut (st-mincut)

[Figure: example s-t graph with edge costs between source, sink and internal nodes; the minimum cut separates the source set S from the sink set T with cost = 18]

- Set formulation
- Algebraic formulation
- Algebraic formulation after substitution

Algorithms
- Augmenting path method
- Push-relabel method
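As a concrete illustration, here is a minimal sketch of computing an s-t min-cut with networkx (assuming networkx is available; the small graph below is illustrative, not the one from the slide figure). The cut cost is the total capacity of edges going from the source set S to the sink set T.

```python
import networkx as nx

# Build a small directed graph with edge capacities (illustrative values).
G = nx.DiGraph()
G.add_edge("s", "a", capacity=9)
G.add_edge("s", "b", capacity=5)
G.add_edge("a", "b", capacity=2)
G.add_edge("a", "t", capacity=6)
G.add_edge("b", "t", capacity=8)

# Min-cut value and the induced partition into source set S and sink set T.
cut_value, (S, T) = nx.minimum_cut(G, "s", "t")
print(cut_value, S, T)
```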
Foreground / Background Estimation
Rother et al. SIGGRAPH04

Energy = Data term + Smoothness term
- Data term: estimated using FG / BG colour models
- Smoothness term: intensity-dependent smoothness

Formulated as a Min-Cut problem on a graph with a source (foreground) and a sink (background) terminal.
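OpenCV ships an implementation of exactly this model (Rother et al.'s GrabCut: colour-model data term plus contrast-sensitive smoothness, solved with graph cuts). A minimal sketch, assuming an input file and a user-supplied bounding box:

```python
import cv2
import numpy as np

img = cv2.imread("input.jpg")                  # placeholder path
mask = np.zeros(img.shape[:2], np.uint8)
bgd_model = np.zeros((1, 65), np.float64)      # internal GMM state
fgd_model = np.zeros((1, 65), np.float64)
rect = (50, 50, 300, 400)                      # placeholder FG bounding box

cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Keep definite and probable foreground pixels.
fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0).astype(np.uint8)
result = img * fg[:, :, None]
```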
Solvability using GraphCut

Submodularity: a pairwise term θij is submodular if
θij(0,0) + θij(1,1) ≤ θij(0,1) + θij(1,0)
The energy is solvable with graph cut if all terms are submodular; submodularity is a necessary condition for this graph construction.

General pairwise potential: can be reparametrized into a constant plus unary terms (which could be arbitrary) plus a pairwise part.
Submodularity of the remaining pairwise part = sufficient condition.

General GraphCut pipeline
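A small sketch checking the submodularity condition for a pairwise potential given as a 2x2 cost table (illustrative values):

```python
def is_submodular(theta):
    """theta[a][b] = cost of the label pair (a, b) for a binary pair."""
    return theta[0][0] + theta[1][1] <= theta[0][1] + theta[1][0]

potts = [[0, 1], [1, 0]]          # Potts model: submodular
antipotts = [[1, 0], [0, 1]]      # inverted Potts: not submodular
print(is_submodular(potts))       # True
print(is_submodular(antipotts))   # False
```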
Energy minimization transformed into GraphCut:
energy → Encoding → Transform into submodular → Graph Cut → Invert Encoding → Obtain solution
Multi-label energy with linear pairwise potentials
Ishikawa PAMI03

Encoding: [Figure: Ishikawa graph construction; each variable xi becomes a chain of nodes between source and sink, with data-term edges along the chain, ∞-cost reverse edges enforcing a single cut per chain, and K-weighted edges between neighbouring chains for the smoothness term]

Multi-label energy with convex pairwise potentials
Ishikawa PAMI03

The same construction generalizes to any convex pairwise function.
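A minimal sketch of the linear-pairwise (total-variation) version of this construction for a 1D chain of variables, assuming the PyMaxflow library (`pip install PyMaxflow`); the node indexing and the INF constant are choices of this sketch:

```python
import numpy as np
import maxflow

def ishikawa_1d(data, K):
    """data[i, k] = cost of assigning label k to variable i; K = smoothness weight."""
    N, L = data.shape
    INF = 1e9
    g = maxflow.Graph[float]()
    g.add_nodes(N * (L - 1))            # PyMaxflow assigns ids 0, 1, 2, ... in order
    nid = lambda i, k: i * (L - 1) + k  # node k of the chain for variable i

    for i in range(N):
        # Chain source -> v_0 -> ... -> v_{L-2} -> sink; the single cut edge
        # along the chain encodes the chosen label.
        g.add_tedge(nid(i, 0), data[i, 0], 0)            # cutting s->v_0 selects label 0
        g.add_tedge(nid(i, L - 2), 0, data[i, L - 1])    # cutting v_{L-2}->t selects label L-1
        for k in range(L - 2):
            # Forward edge carries the data cost of label k+1; the INF reverse
            # edge forbids cutting the chain more than once.
            g.add_edge(nid(i, k), nid(i, k + 1), data[i, k + 1], INF)
        if i + 1 < N:
            for k in range(L - 1):
                # Linear (TV) pairwise term K * |l_i - l_{i+1}|.
                g.add_edge(nid(i, k), nid(i + 1, k), K, K)

    g.maxflow()
    # Label = number of chain nodes on the source side (PyMaxflow: segment 0).
    return np.array([sum(g.get_segment(nid(i, k)) == 0 for k in range(L - 1))
                     for i in range(N)])
```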
Higher order minimization with GraphCut
Kolmogorov ECCV06, Ramalingam et al. DAM12

Higher order term → Pairwise term (by introducing auxiliary variables)

Pipeline: energy → Encoding → Transform into submodular → Graph Cut → Invert Encoding → Obtain solution

General GraphCut pipeline
What if no encoding leads to a pairwise submodular problem?

Move making algorithms
Boykov et al., PAMI01
- Iterate: Initial solution → Propose move → Encoding → Transform into submodular → Graph Cut → Invert Encoding → Update solution
- Each move is solvable with graph cut
- Each move optimizes the current solution in a restricted search space

α-swap
- Each variable taking label α or β can change its label to α or β
- Move space defined by the transformation function

α-expansion
- Each variable may keep its old label or change to α
- Move space defined by the transformation function

Sufficient condition for submodularity of each move:
- α-swap: semi-metricity
- α-expansion: metricity
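A sketch of the α-expansion outer loop under these conditions; the `expansion_move` helper is hypothetical and would encode the binary keep-label-or-switch-to-α subproblem and solve it with a graph cut (submodular whenever the pairwise cost is a metric):

```python
def alpha_expansion(energy_fn, expansion_move, labels, num_labels, max_rounds=10):
    """energy_fn(labels) -> float; expansion_move(labels, alpha) -> new labelling
    from the binary graph-cut subproblem (hypothetical helper)."""
    for _ in range(max_rounds):
        changed = False
        for alpha in range(num_labels):
            proposal = expansion_move(labels, alpha)
            if energy_fn(proposal) < energy_fn(labels):
                labels, changed = proposal, True
        if not changed:   # no expansion lowered the energy: local optimum
            break
    return labels
```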
Semantic Segmentation
- Data term: discriminatively trained classifier
- Smoothness term

[Figure: α-expansion on an example image; from an initial all-grass solution, successive building, sky, tree and aeroplane expansions progressively refine the labelling]

Non-submodular energy minimization
What can we do?

QPBO
- Double the variables: for each xi introduce x̄i with the constraint x̄i = 1 - xi
- Relax the constraint and solve the resulting symmetric problem with graph cut
- Variables with consistent assignments (xi = 1 - x̄i) are part of a globally optimal solution (persistency)
- The remaining variables can be resolved per node (ICM), or by keeping old labels for move-making algorithms
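A brute-force toy illustration of the persistency property on a tiny non-submodular energy (this enumerates all labellings rather than building the QPBO graph; variables taking the same value in every global optimum are the ones QPBO can label):

```python
import itertools

def energy(x):
    # Illustrative 3-variable energy with one non-submodular pairwise term.
    unary = 0.5 * x[0] + (1.0 - x[1]) + 0.2 * x[2]
    pair = 1.0 * (x[0] != x[1])      # Potts term: submodular
    pair += -0.8 * (x[1] != x[2])    # rewards disagreement: non-submodular
    return unary + pair

states = list(itertools.product([0, 1], repeat=3))
best = min(energy(s) for s in states)
optima = [s for s in states if abs(energy(s) - best) < 1e-9]
print(optima)   # variables constant across all optima are 'persistent'
```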
Other Structural Properties Solvable with Graph-Cut
Schedule
Semantic classifier
Shotton et al. ECCV06, Ladický et al. ICCV09

Data-driven Depth Estimation
Desired properties:
- Super-pixels are not necessarily planar
- Sufficient to train a binary classifier predicting a single canonical depth dC; for other depths d, the input is rescaled accordingly (using the image pyramid)
- Generalized to multiple semantic classes, conditioning the prediction on the semantic label

Training the classifier
1. Image pyramid is built
2. Training data randomly sampled
3. Samples of each class at dC used as positives
4. Samples of other classes or at d ≠ dC used as negatives
5. Multi-class classifier trained (see the sketch after the next slide)

Classifying the patch
- Dense features: SIFT, LBP, Self Similarity, Texton
- Representation: soft BOW representations over a set of random rectangles
- Classifier: AdaBoost
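A minimal sketch of the classifier-training step using scikit-learn's AdaBoost (the actual system uses its own boosting over soft-BOW rectangle features; feature extraction is stubbed out here with random data purely for illustration):

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 256))       # stand-in for soft-BOW rectangle features
y = rng.integers(0, 5, size=1000)      # stand-in for class-at-dC labels

clf = AdaBoostClassifier(n_estimators=200)
clf.fit(X, y)
probs = clf.predict_proba(X[:3])       # per-class scores for new patches
```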
Experiments
KITTI dataset

KITTI results

NYU2 results
Surface Normal Estimation
Not explored much in the literature… so how to approach it?
Pixels or Super-pixels?
Pixel-based Classifiers
[Shotton06, Shotton08]
[Figure: input image → pixel-wise feature representation]

Segment-based Classifiers
[Figure: input image → segment-wise feature representation]

Joint Regularization
[Figure: input image → independent classifiers, regularized jointly]
How to convert the segment representation into a pixel representation?

Joint Learning
[Figure: input image → segment representation]
- To simplify the regression problem
Pipeline of our Method
RMRC Challenge Results
[Figure: qualitative results on nine test images, with per-image errors between 28.379 and 40.366]
Schedule
Semantic 3D Reconstruction
[Figure: input images → depth estimates + semantic estimates → semantic 3D model]

Pixel predictions are predictions about the first occupied voxel along the ray:
- predictions of the depth of the first occupied voxel
- predictions of the semantic label of the first occupied voxel

Volumetric formulation: Ray potentials + Pairwise regularizer
- Ray potentials are typically approximated by unary potentials (Zach 3DPVT08, Häne CVPR13, Kundu ECCV14, ...)
- We try to solve the right problem!
- The cost is based on the first occupied voxel along the ray: its depth and semantic label, with freespace in front of it

Two-label problem
Discrete formulation using QPBO relaxation
[Figure: ray variables x0 … x6 and their complements x̄0 … x̄6 between source and sink]

Our goal is to find a graph construction for the ray potential such that it is:
1) A pairwise function
2) The number of edges grows linearly with the length of a ray
3) Symmetric, to inherit QPBO properties

To find it we do these steps:
1) Polynomial representation of the ray potential
2) Transformation into a submodular function over x and x̄
3) Pairwise construction using auxiliary variables z
4) Merging variables (Ramalingam12) for linear complexity
5) Symmetrization of the graph
Polynomial representation of the ray potential

The two-label ray potential is a cost on the first occupied voxel along the ray, where xi = 0 denotes an occupied voxel and xi = 1 free-space. We want to transform the potential into a polynomial over prefix products of the xi; plugging the indicator of "voxel i is the first occupied one" into the cost and regrouping gives the coefficients.
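A reconstruction of this algebra (the notation φi for the cost paid when voxel i is the first occupied one is introduced here for illustration):

```latex
% Voxel i is the first occupied one iff x_0 = ... = x_{i-1} = 1 and x_i = 0.
\psi(\mathbf{x}) = \sum_{i=0}^{N} \phi_i \,(1 - x_i) \prod_{j<i} x_j
% Using (1 - x_i)\prod_{j<i} x_j = \prod_{j<i} x_j - \prod_{j\le i} x_j
% and regrouping by prefix products:
\psi(\mathbf{x}) = \phi_0
  + \sum_{i=1}^{N} (\phi_i - \phi_{i-1}) \prod_{j=0}^{i-1} x_j
  - \phi_N \prod_{j=0}^{N} x_j
```

If the first occupied voxel is k, the prefix products equal 1 exactly for i ≤ k, so the sum telescopes to φk; if the whole ray is free-space, everything cancels and the cost is 0.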
Transformation into a submodular function

For the prefix-product terms with positive coefficients (which are not submodular as they stand): starting from the last term, we can iteratively transform the potential into a submodular function over x and x̄.
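One way to make the iterative step concrete, using x_i = 1 - x̄_i on the last factor (a sketch consistent with the auxiliary-variable construction on the next slides, not necessarily the authors' exact derivation):

```latex
% A positive prefix-product term splits into a shorter positive term
% and a negative product involving the complement variable:
c \prod_{j \le i} x_j
  = c \prod_{j \le i-1} x_j \;-\; c\, \bar{x}_i \prod_{j \le i-1} x_j ,
  \qquad c > 0 .
% The negative product is graph-representable (Freedman CVPR05); the
% positive remainder is one variable shorter, so the recursion, applied
% from the last term downwards, terminates with only representable terms.
```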
Pairwise graph construction

Standard graph constructions (Freedman CVPR05) for negative products
[Figure: one auxiliary node per negative product term, connected to every variable in the product]
This leads to a quadratic growth of the number of edges!

Non-standard graph constructions for negative products
[Figure: auxiliary variables z0 … z6 and z'0 … z'6 chained along the ray, one per prefix product, with their optimal values indicated]

Merging theorem (Ramalingam12)
If two auxiliary variables always take the same optimal value, then they can be merged; merging the chained auxiliary variables keeps the number of edges linear in the ray length.

[Figure: the merged construction for the whole ray, combining the a, b and f edge weights]

Symmetrization of the graph
Multi-label problem
Implementation details
Semantic cost and depth cost, defined for the top n matches:
Results
[Figure: input, depth, semantics and the resulting 3D model]
Conclusions
Continuous Formulation

Is a continuous approach possible?

The last term can be dropped by introducing visibility variables; the resulting formulation is convex for c_i^l ≤ 0.

Can we make c_i^l ≤ 0? Yes!
First, we notice that the cost function does not change by adding a correction term with c_i^l ≥ 0 along each ray.

Integer Formulation
Convex relaxation

Will it work? Unfortunately not.
[Figure: the desired solution vs. the global optimum of the relaxation]

The problem is solved if the cost is taken where there is a visibility drop:
Non-convex Formulation

Solved using a majorize-minimize strategy: we replace the non-convex constraint by a convex surrogate constructed around the current solution.
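A generic majorize-minimize loop, as a sketch of the strategy named on the slide (not the authors' exact update; `build_surrogate` and `minimize_convex` are hypothetical helpers):

```python
def majorize_minimize(f, build_surrogate, minimize_convex, x0, iters=20, tol=1e-6):
    """f: true (non-convex) objective; build_surrogate(x) returns a convex g
    with g >= f everywhere and g(x) == f(x); minimize_convex(g) minimizes g."""
    x = x0
    for _ in range(iters):
        g = build_surrogate(x)
        x_new = minimize_convex(g)
        # MM guarantees monotone descent: f(x_new) <= g(x_new) <= g(x) = f(x).
        if f(x) - f(x_new) < tol:
            return x_new
        x = x_new
    return x
```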
Results on Middlebury dataset
Results
Results on Thin Structures
[Figure: TV and Flux regularizers at high, medium and low regularization strength, compared against the raw data and our method]
Multi-class results
[Figure columns: input data, Häne et al. CVPR13, discrete result, continuous result]
Questions?