Analysis-by-Synthesis a.k.a Generative Modeling
Tejas D Kulkarni (tejask@mit.edu)
Monday, October 19, 15
Traditional paradigm for AI research
Traditional machine learning and pattern recognition have been remarkably successful at questions
Krizhevsky et al.
source: https://www.youtube.com/watch?v=3f3rOz0NzPc
Hermann von Helmholtz (1865): The general rule determining the ideas of vision that are formed whenever an impression is made on the eye, is that such objects are always imagined as being present in the field of vision as would have to be there in order to produce the same impression on the nervous mechanism.
Karl Friston (2010): The free-energy principle says that any self-organizing system that is at equilibrium with its environment must minimize its free energy.
Geoff Hinton et al.: Boltzmann Machines (1985), Helmholtz Machine (1995)
Kersten, NIPS 1998 Tutorial on Computational Vision
Assumptions violated: broad posterior
Assumptions satisfied: narrower posterior
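Light (L)
Shape (S): Nose, Eyes, Outline, Mouth
Texture (T): Nose, Eyes, Outline, Mouth
Affine (A)
Shading, Simulator, Image (I)

$$P(S, T, L, A \mid I) \propto P(I \mid S, T, L, A)\,P(L)\,P(S)\,P(T)\,P(A) \propto \mathcal{N}(I - O;\ 0,\ 0.1)\,P(L)\,P(A)\prod_i P(S_i)\,P(T_i)$$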
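The factorized posterior can be scored up to a constant by summing log terms. The following is a minimal Python sketch, not the author's code. It assumes standard-normal priors on every latent, reads 0.1 as the standard deviation of a pixel-wise Gaussian (the slide does not say whether it is a standard deviation or a variance), and uses a hypothetical `toy_render` function in place of the shading and simulator stages. The latent sizes for light and affine are also assumptions.

```python
import numpy as np

PARTS = ("nose", "eyes", "outline", "mouth")

def log_normal(x, mean, std):
    """Sum of element-wise Gaussian log densities log N(x; mean, std^2)."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(-0.5 * ((x - mean) / std) ** 2
                        - np.log(std) - 0.5 * np.log(2.0 * np.pi)))

def toy_render(shape, texture, light, affine):
    """Hypothetical stand-in for the shading/simulator stage (not a real renderer)."""
    parts = sum(shape[p] * texture[p] for p in PARTS)
    return light[0] * parts + affine[0]

def log_posterior(observed, shape, texture, light, affine,
                  render=toy_render, noise_std=0.1):
    """Unnormalized log P(S, T, L, A | I): log likelihood plus log priors."""
    rendered = render(shape, texture, light, affine)
    # Likelihood term: Gaussian on the difference between observed and rendered images.
    log_lik = log_normal(observed - rendered, 0.0, noise_std)
    # Prior terms: P(L) P(A) prod_i P(S_i) P(T_i), all assumed standard normal.
    log_prior = log_normal(light, 0.0, 1.0) + log_normal(affine, 0.0, 1.0)
    for p in PARTS:
        log_prior += log_normal(shape[p], 0.0, 1.0) + log_normal(texture[p], 0.0, 1.0)
    return log_lik + log_prior

rng = np.random.default_rng(0)
shape = {p: rng.standard_normal(50) for p in PARTS}
texture = {p: rng.standard_normal(50) for p in PARTS}
light = rng.standard_normal(3)    # assumed size
affine = rng.standard_normal(6)   # assumed size
observed = toy_render(shape, texture, light, affine)
score = log_posterior(observed, shape, texture, light, affine)
```

Working in log space avoids the underflow that multiplying many small densities would cause.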
Figure panels: observed image; inferred (reconstruction); inferred model re-rendered with novel poses; inferred model re-rendered with novel lighting
Figure panels: test image; inference trajectory
Light
Shape: Nose, Eyes, Outline, Mouth
Texture: Nose, Eyes, Outline, Mouth
Affine
Shading, Simulator, Image
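$$S_{\text{Nose}} \sim \mathrm{randn}(50), \quad T_{\text{Nose}} \sim \mathrm{randn}(50)$$
$$S_{\text{Mouth}} \sim \mathrm{randn}(50), \quad T_{\text{Mouth}} \sim \mathrm{randn}(50)$$
$$P(I \mid S, T) \propto \mathrm{Normal}(O - R;\ 0,\ \sigma_0)$$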
Repeat until convergence:
(1) Let $x$ be either $S_i$ or $T_i$. Sample a new $x' \sim \mathrm{randn}(50)$.
(2) $r = \dfrac{p(x')\,q(x \mid x')}{p(x)\,q(x' \mid x)}$
(3) Accept $x'$ with probability $\alpha = \min\{1, r\}$; otherwise, $x' = x$.
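The loop above can be sketched in Python as follows. This is an illustrative sketch, not the author's implementation. Because the proposal draws $x'$ from the prior independently of $x$, the prior terms in $r$ cancel and $r$ reduces to a likelihood ratio. The renderer `toy_render`, the latent names, the noise level `sigma`, and the step count are all hypothetical choices made for the example.

```python
import numpy as np

def log_likelihood(observed, rendered, sigma):
    """log Normal(O - R; 0, sigma) up to an additive constant."""
    diff = observed - rendered
    return -0.5 * float(np.sum(diff ** 2)) / sigma ** 2

def toy_render(latents):
    """Hypothetical renderer: the 'image' is the sum of all latent blocks."""
    return sum(latents.values())

def metropolis_hastings(observed, render, latents, sigma, n_steps, rng):
    """Single-site MH that resamples one latent block from the prior per step."""
    names = list(latents)
    current = {k: v.copy() for k, v in latents.items()}
    cur_ll = log_likelihood(observed, render(current), sigma)
    accepted = 0
    for _ in range(n_steps):
        # (1) Pick x (some S_i or T_i) and propose x' ~ randn.
        name = names[rng.integers(len(names))]
        proposal = dict(current)
        proposal[name] = rng.standard_normal(current[name].shape)
        new_ll = log_likelihood(observed, render(proposal), sigma)
        # (2) With a prior proposal, log r is the log-likelihood difference.
        log_r = new_ll - cur_ll
        # (3) Accept with probability min{1, r}; otherwise keep the current x.
        if rng.uniform() < np.exp(min(0.0, log_r)):
            current, cur_ll = proposal, new_ll
            accepted += 1
    return current, cur_ll, accepted

rng = np.random.default_rng(0)
truth = {k: rng.standard_normal(50)
         for k in ("S_nose", "T_nose", "S_mouth", "T_mouth")}
observed = toy_render(truth)
init = {k: rng.standard_normal(50) for k in truth}
final, final_ll, accepted = metropolis_hastings(
    observed, toy_render, init, sigma=0.5, n_steps=200, rng=rng)
```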
Diagram labels: Long-term Memory; Unconditional Runs; Hallucinated Data (Sleep); Learning (Krizhevsky et al.)
Long-term Memory (Sleep)
Conditional Density Estimator: $q(S^{\rho} \leftarrow S'^{\rho} \mid I_D)$
Now run inference:
90%: Data-driven (Pattern Matching)
10%: Sampling/Search (Reasoning)
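One way to read the 90%/10% split is as a mixture proposal inside Metropolis-Hastings. The sketch below assumes that reading and is not taken from the slides. The data-driven component is modeled as a Gaussian around a prediction `data_mean` from a hypothetical conditional density estimator, the remaining 10% falls back to the standard-normal prior, and the target `log_target` is a made-up posterior used only to exercise the code.

```python
import numpy as np

def log_gauss(x, mean, std):
    """Sum of element-wise Gaussian log densities."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(-0.5 * ((x - mean) / std) ** 2
                        - np.log(std) - 0.5 * np.log(2.0 * np.pi)))

def log_mixture_q(x, data_mean, data_std, p_data=0.9):
    """Log density of the mixture: p_data * data-driven + (1 - p_data) * prior."""
    a = np.log(p_data) + log_gauss(x, data_mean, data_std)
    b = np.log(1.0 - p_data) + log_gauss(x, 0.0, 1.0)
    return float(np.logaddexp(a, b))

def sample_mixture_q(rng, data_mean, data_std, p_data=0.9):
    """Draw from the mixture and report which component produced the sample."""
    if rng.uniform() < p_data:
        noise = rng.standard_normal(data_mean.shape)
        return data_mean + data_std * noise, "data-driven"
    return rng.standard_normal(data_mean.shape), "prior"

def mh_step(x, log_target, rng, data_mean, data_std, p_data=0.9):
    """One MH step; the proposal does not depend on x, so q(x) and q(x') both appear in r."""
    x_new, kind = sample_mixture_q(rng, data_mean, data_std, p_data)
    log_r = (log_target(x_new) + log_mixture_q(x, data_mean, data_std, p_data)
             - log_target(x) - log_mixture_q(x_new, data_mean, data_std, p_data))
    if rng.uniform() < np.exp(min(0.0, log_r)):
        return x_new, kind
    return x, kind

rng = np.random.default_rng(0)
data_mean = np.full(3, 2.0)                     # hypothetical estimator output
log_target = lambda v: log_gauss(v, 2.0, 0.5)   # hypothetical posterior
x = rng.standard_normal(3)
kinds = []
for _ in range(2000):
    x, kind = mh_step(x, log_target, rng, data_mean, 0.5)
    kinds.append(kind)
frac_data = kinds.count("data-driven") / len(kinds)
```

Keeping a prior component in the mixture means every region with prior mass can still be proposed even when the estimator's prediction is poor.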
Figure panels: with data-driven proposals; without data-driven proposals
Tijmen Tieleman (Thesis, 2014)
Encoder (De-rendering), Q(zi|x): Convolution + Pooling
  image x, 150x150
  Filters = 96, kernel size (KS) = 5
  Filters = 64, KS = 5
  Filters = 32, KS = 5
  7200
Graphics code {µ200, Σ200}: pose, light, shape, ...
Decoder (Renderer), P(x|z): Unpooling (Nearest Neighbor) + Convolution
  Filters = 32, KS = 7
  Filters = 64, KS = 7
  Filters = 96, KS = 7
Objective Function:
$$-\log P(x \mid Z) + \mathrm{KL}\big(Q(Z \mid x)\,\|\,P(Z)\big)$$
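For concreteness, the two terms of this objective can be computed in closed form under common variational-autoencoder assumptions. The sketch below is one possible instantiation, not necessarily the one used in the talk. It assumes $P(Z)$ is a standard normal, $Q(Z \mid x)$ is a diagonal Gaussian (reading {µ200, Σ200} as a 200-dimensional mean and variance), and $P(x \mid Z)$ is a pixel-wise Gaussian with a fixed standard deviation `sigma`.

```python
import numpy as np

def kl_diag_gaussian_to_standard_normal(mu, var):
    """KL(N(mu, diag(var)) || N(0, I)) in closed form."""
    return 0.5 * float(np.sum(var + mu ** 2 - 1.0 - np.log(var)))

def gaussian_nll(x, x_hat, sigma=1.0):
    """-log P(x | Z) for a pixel-wise Gaussian decoder with fixed std sigma."""
    n = x.size
    sq_err = float(np.sum((x - x_hat) ** 2))
    return 0.5 * sq_err / sigma ** 2 + n * (np.log(sigma) + 0.5 * np.log(2.0 * np.pi))

def objective(x, x_hat, mu, var, sigma=1.0):
    """Reconstruction term plus KL term, matching the slide's objective."""
    return gaussian_nll(x, x_hat, sigma) + kl_diag_gaussian_to_standard_normal(mu, var)

rng = np.random.default_rng(0)
x = rng.uniform(size=(150, 150))                 # stand-in for an input image
x_hat = x + 0.05 * rng.standard_normal(x.shape)  # stand-in for a decoder output
mu = 0.1 * rng.standard_normal(200)              # stand-in encoder mean
var = np.full(200, 0.9)                          # stand-in encoder variance
loss = objective(x, x_hat, mu, var)
```

The KL term is zero exactly when the encoder outputs `mu = 0` and `var = 1`, so it penalizes graphics codes that drift away from the prior.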
Kemp et al., The discovery of structural form, PNAS 2008
Ref: http://mlg.eng.cam.ac.uk/zoubin/talks/uai05tutorial-b.pdf