MODELING ANNOTATED DATA
Reviewer: Saurabh Singh (ss1@uiuc.edu)
Problem
Modeling of associated document items:
Images & Annotations
Papers & Bibliographies
Genes & Functions
Documents are considered as pairs of data
Models Images (r) and Annotations (w)
Three primary tasks
Organization)
(Automatic annotation, text based retrieval)
K factors or topics
Latent variables
Features
Document: (r, w), N regions, M words
Distributions: p(r, w), p(w | r), p(w | r, r_n)
Vocabulary: 168 Terms (V)
Captions: 2-4 Words per Image
Composed of 6-10 regions via N-cuts (normalized cuts)
Each region summarized as a feature vector ~40
inertia etc.
Three hierarchical probabilistic models
1. Gaussian Multinomial mixture (GM-Mixture)
2. Gaussian Multinomial LDA (GM-LDA)
3. Correspondence LDA (Corr-LDA)
[Figure: graphical model in plate notation (nodes α, θd, Zd,n, Wd,n, βk, η; plates N, D, K)]
$$ p(z, r, w) = p(z \mid \lambda) \prod_{n=1}^{N} p(r_n \mid z, \mu, \sigma) \prod_{m=1}^{M} p(w_m \mid z, \beta). $$
But no
$$ p(w \mid r) = \sum_{z} p(z \mid r) \, p(w \mid z). $$
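A minimal sketch of annotation under the mixture model, p(w | r) = sum over z of p(z | r) p(w | z). It assumes diagonal-covariance Gaussians for the region features and a prior p(z) over factors. The function names and toy parameter values are hypothetical, chosen only for illustration.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def posterior_over_z(regions, prior, means, variances):
    """p(z | r) proportional to p(z) * prod_n p(r_n | z), computed in log space."""
    log_post = []
    for k in range(len(prior)):
        lp = math.log(prior[k])
        for r in regions:
            lp += log_gaussian_diag(r, means[k], variances[k])
        log_post.append(lp)
    shift = max(log_post)  # subtract the max before exponentiating for stability
    weights = [math.exp(lp - shift) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

def annotate(regions, prior, means, variances, beta):
    """p(w | r) = sum_z p(z | r) p(w | z) for every word index in the vocabulary."""
    post = posterior_over_z(regions, prior, means, variances)
    vocab_size = len(beta[0])
    return [
        sum(post[k] * beta[k][w] for k in range(len(prior)))
        for w in range(vocab_size)
    ]

# Toy setup (assumed values): 2 factors, 1-D region features, 3-word vocabulary.
prior = [0.5, 0.5]
means = [[0.0], [5.0]]
variances = [[1.0], [1.0]]
beta = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]  # beta[k][w] = p(w | z = k)
regions = [[4.8], [5.3]]                    # both regions lie near factor 1
p_w_given_r = annotate(regions, prior, means, variances, beta)
```

Since both toy regions sit close to the second factor's mean, the posterior concentrates on that factor and the predicted word distribution is close to that factor's row of beta.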
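$$ p(r, w, \theta, z, v) = p(\theta \mid \alpha) \left( \prod_{n=1}^{N} p(z_n \mid \theta) \, p(r_n \mid z_n, \mu, \sigma) \right) \cdot \left( \prod_{m=1}^{M} p(v_m \mid \theta) \, p(w_m \mid v_m, \beta) \right). $$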
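The GM-LDA factorization can be read as a generative process: draw θ once per document, then draw a factor for each region and, independently, a factor for each word. The sketch below follows that reading, assuming independent (diagonal) Gaussians per feature. All function names and parameter values are hypothetical.

```python
import random

def sample_dirichlet(rng, alpha):
    """Draw theta ~ Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(rng, probs):
    """Draw one index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def sample_gm_lda(rng, alpha, means, stds, beta, n_regions, n_words):
    """Sample one (r, w) document from the GM-LDA joint distribution."""
    theta = sample_dirichlet(rng, alpha)           # p(theta | alpha)
    z, regions = [], []
    for _ in range(n_regions):
        zn = sample_categorical(rng, theta)        # p(z_n | theta)
        z.append(zn)
        # p(r_n | z_n, mu, sigma), one Gaussian draw per feature dimension
        regions.append([rng.gauss(m, s) for m, s in zip(means[zn], stds[zn])])
    v, words = [], []
    for _ in range(n_words):
        vm = sample_categorical(rng, theta)        # p(v_m | theta), independent of z
        v.append(vm)
        words.append(sample_categorical(rng, beta[vm]))  # p(w_m | v_m, beta)
    return theta, z, regions, v, words

# Toy setup (assumed values): 2 factors, 1-D features, 4-word vocabulary.
rng = random.Random(0)
theta, z, regions, v, words = sample_gm_lda(
    rng,
    alpha=[1.0, 1.0],
    means=[[0.0], [5.0]],
    stds=[[1.0], [1.0]],
    beta=[[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]],
    n_regions=5,
    n_words=3,
)
```

Region factors z and word factors v share only θ, so nothing in this process forces a word to be generated by a factor that also generated some region.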
All
All
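$$ p(r, w, \theta, z, y) = p(\theta \mid \alpha) \left( \prod_{n=1}^{N} p(z_n \mid \theta) \, p(r_n \mid z_n, \mu, \sigma) \right) \cdot \left( \prod_{m=1}^{M} p(y_m \mid N) \, p(w_m \mid y_m, z, \beta) \right). $$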
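A sketch of the Corr-LDA factorization as a generative process. It assumes that p(y_m | N) is uniform over the N region indices and that each word is drawn from the factor of the region it selects. Gaussians are taken as diagonal, and all names and values are hypothetical.

```python
import random

def sample_dirichlet(rng, alpha):
    """Draw theta ~ Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(rng, probs):
    """Draw one index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def sample_corr_lda(rng, alpha, means, stds, beta, n_regions, n_words):
    """Sample one (r, w) document from the Corr-LDA joint distribution."""
    theta = sample_dirichlet(rng, alpha)           # p(theta | alpha)
    z, regions = [], []
    for _ in range(n_regions):
        zn = sample_categorical(rng, theta)        # p(z_n | theta)
        z.append(zn)
        regions.append([rng.gauss(m, s) for m, s in zip(means[zn], stds[zn])])
    y, words = [], []
    for _ in range(n_words):
        ym = rng.randrange(n_regions)              # p(y_m | N): uniform region index
        y.append(ym)
        # p(w_m | y_m, z, beta): the word uses the factor of the selected region
        words.append(sample_categorical(rng, beta[z[ym]]))
    return theta, z, regions, y, words

# Toy setup (assumed values): 3 factors; beta is the identity matrix, so each
# factor emits exactly one word and the word index reveals the factor used.
rng = random.Random(0)
theta, z, regions, y, words = sample_corr_lda(
    rng,
    alpha=[1.0, 1.0, 1.0],
    means=[[0.0], [5.0], [10.0]],
    stds=[[1.0], [1.0], [1.0]],
    beta=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    n_regions=4,
    n_words=6,
)
```

Unlike GM-LDA, every word here is tied to a factor that generated one of the image's regions, which is what allows region-level predictions such as p(w | r, r_n).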
[Figure: average negative log probability vs. number of factors, for Corr-LDA, GM-Mixture, GM-LDA, and ML]
[Figure: caption perplexity vs. number of factors, for Corr-LDA, GM-Mixture, GM-LDA, and ML; two panels: maximum likelihood, empirical Bayes smoothed]
$$ \text{perplexity} = \exp\left\{ - \sum_{d=1}^{D} \sum_{m=1}^{M_d} \log p(w_m \mid r_d) \Big/ \sum_{d=1}^{D} M_d \right\}. $$
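The perplexity formula is the exponentiated negative average log probability per caption word. A minimal sketch, assuming the per-word values log p(w_m | r_d) have already been computed by some model (the function name and input layout are hypothetical):

```python
import math

def caption_perplexity(log_probs_per_doc):
    """exp{-(sum over all caption words of log p(w_m | r_d)) / (total word count)}.

    log_probs_per_doc[d][m] holds log p(w_m | r_d) for word m of document d.
    """
    total_log_prob = sum(sum(doc) for doc in log_probs_per_doc)
    total_words = sum(len(doc) for doc in log_probs_per_doc)
    return math.exp(-total_log_prob / total_words)

# Sanity check: a model that spreads mass uniformly over a 168-term vocabulary
# assigns log(1/168) to every caption word.
uniform = math.log(1.0 / 168.0)
ppl = caption_perplexity([[uniform, uniform], [uniform, uniform, uniform]])
```

For such a uniform model the perplexity equals the vocabulary size, which gives a reference point for reading the caption perplexity curves: lower values mean the model concentrates probability on the true caption words.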
Example annotations

Image 1
  True caption: scotland water
  Corr-LDA:     scotland water flowers hills tree
  GM-LDA:       tree water people mountain sky
  GM-Mixture:   water sky clouds sunset scotland

Image 2
  True caption: clouds jet plane
  Corr-LDA:     sky plane jet mountain clouds
  GM-LDA:       sky water people tree clouds
  GM-Mixture:   sky plane jet clouds pattern

Image 3
  True caption: fish reefs water
  Corr-LDA:     fish water ocean tree coral
  GM-LDA:       water sky vegetables tree people
  GM-Mixture:   fungus mushrooms tree flowers leaves
[Figure: image regions numbered 1-6, with labels from Corr-LDA and GM-LDA]
[Figure: precision vs. recall for Corr-LDA, GM-Mixture, and GM-LDA; three panels: candy, sunset, people & fish]
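A sketch of how precision-recall points like those plotted could be computed for one text query. It assumes each image receives a model score for the query (for example p(query word | r)) and that a relevance label is available per image; the function name, scores, and labels are hypothetical.

```python
def precision_recall_curve(scores, relevant):
    """Return (recall, precision) at each rank where a relevant image appears.

    scores[i]   : model score of image i for the query (higher ranks first)
    relevant[i] : True if image i is considered relevant to the query
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_relevant = sum(relevant)
    hits = 0
    curve = []
    for rank, i in enumerate(ranked, start=1):
        if relevant[i]:
            hits += 1
            curve.append((hits / n_relevant, hits / rank))
    return curve

# Toy example (assumed values): four images, two of them relevant.
scores = [0.9, 0.1, 0.8, 0.3]
relevant = [True, False, False, True]
curve = precision_recall_curve(scores, relevant)
```

Here the ranking is image 0, 2, 3, 1, so the first relevant image is found at rank 1 and the second at rank 3.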
If conditionals are needed, then model them explicitly