MODELING ANNOTATED DATA Reviewer: Saurabh Singh (ss1@uiuc.edu)



slide-1
SLIDE 1

MODELING ANNOTATED DATA

Reviewer: Saurabh Singh (ss1@uiuc.edu)

slide-2
SLIDE 2

Problem

  • Modeling of associated document items
      • Images & Annotations
      • Papers & Bibliographies
      • Genes & Functions
  • Documents are considered as pairs of data streams.
  • One type provides annotation for the other type.
slide-3
SLIDE 3

Uses

  • Retrieval, Clustering, Classification
  • Automatic annotation
  • Retrieval of un-annotated data.
slide-4
SLIDE 4

This paper

Models Images (r) and Annotations (w)

Three primary tasks

  • Joint distribution of an image and its caption (Clustering, Organization)
  • Conditional distribution of words given an image (Automatic annotation, text based retrieval)
  • Conditional distribution of words given a region of an image (Automatic labeling of regions)
slide-5
SLIDE 5

Modeling

K factors or topics

  • Each a distribution over words
  • Each a distribution over image regions

Latent variables

  • Topic assignments
  • Distribution parameters (for components)

Features

Document: (r, w), N regions, M words

Distributions: p(r, w), p(w | r), p(w | r, r_n)

slide-6
SLIDE 6

Text annotations

Vocabulary: 168 Terms (V)

Captions: 2-4 Words per Image

Multinomials on V conditioned on topics

slide-7
SLIDE 7

Images

Composed of 6-10 regions via N-cuts

Each region summarized as a feature vector (~40 features)

  • Size: Percentage of image
  • Position: Center of mass [0, 1]
  • Color: µ, σ of R, G, B, L, a, b etc.
  • Texture: µ, σ of filter responses
  • Shape: area/perimeter², moment of inertia etc.

Multivariate Gaussian over features: µ, Σ
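
To make the region model on this slide concrete, here is a minimal sketch of scoring one region's feature vector under a topic's Gaussian. It assumes a diagonal covariance (one variance per feature), which is an assumption on top of the slide; the function name and the toy 3-dimensional descriptor are hypothetical.

```python
import math


def diag_gaussian_logpdf(x, mu, var):
    """Log density of x under a Gaussian with mean mu and diagonal covariance var."""
    if not (len(x) == len(mu) == len(var)):
        raise ValueError("x, mu and var must have the same length")
    total = 0.0
    for xi, mi, vi in zip(x, mu, var):
        # Each feature contributes an independent 1-D Gaussian log density.
        total += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return total


# Hypothetical toy descriptor: [size, x-position, mean red]; real vectors have ~40 entries.
region = [0.25, 0.5, 0.8]
topic_mean = [0.25, 0.5, 0.8]
topic_var = [1.0, 1.0, 1.0]
log_density = diag_gaussian_logpdf(region, topic_mean, topic_var)
```

Working in log space avoids underflow when the densities of all regions in an image are multiplied together.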

slide-8
SLIDE 8

Models

Three hierarchical probabilistic models

  1. Gaussian Multinomial mixture
  2. Gaussian Multinomial LDA
  3. Correspondence LDA

slide-9
SLIDE 9

Gaussian Multinomial Mixture

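[Figure: plate diagram of the Gaussian Multinomial mixture. Variables: z, r, w; parameters: λ, µ, σ, β; plates: N, M, D.]

[Figure: LDA plate diagram. Labels: θd, Zd,n, Wd,n, βk, α, η; plates: N, D, K.]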

slide-10
SLIDE 10

Distributions

$$p(z, r, w) = p(z \mid \lambda) \prod_{n=1}^{N} p(r_n \mid z, \mu, \sigma) \cdot \prod_{m=1}^{M} p(w_m \mid z, \beta).$$

  • p(r, w)
  • p(w | r) = $\sum_{z} p(z \mid r)\, p(w \mid z)$

But no

  • p(w | r, r_n)
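
The conditional p(w | r) on this slide can be sketched in a few lines: compute the posterior over the single mixture component z from the regions, then average the per-component word distributions. This is a minimal illustration assuming diagonal Gaussians and already-fitted parameters; all function names and toy values are hypothetical.

```python
import math


def diag_gaussian_logpdf(x, mu, var):
    """Log density of x under a diagonal-covariance Gaussian (assumed form)."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (a - m) ** 2 / v)
               for a, m, v in zip(x, mu, var))


def mixture_caption_distribution(regions, lam, mus, variances, beta):
    """Return p(w | r) = sum_z p(z | r) p(w | z) as a list over the vocabulary."""
    # Unnormalized log posterior: log p(z | lambda) + sum_n log p(r_n | z).
    log_post = []
    for k in range(len(lam)):
        lp = math.log(lam[k])
        for r in regions:
            lp += diag_gaussian_logpdf(r, mus[k], variances[k])
        log_post.append(lp)
    # Normalize with the log-sum-exp trick to get p(z | r).
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    norm = sum(weights)
    post = [w / norm for w in weights]
    # Mix the per-component multinomials over words.
    vocab_size = len(beta[0])
    return [sum(post[k] * beta[k][v] for k in range(len(lam)))
            for v in range(vocab_size)]


# Toy setup: two components, 1-D region features, two-word vocabulary.
p_w = mixture_caption_distribution(
    regions=[[0.1], [-0.2]],
    lam=[0.5, 0.5],
    mus=[[0.0], [10.0]],
    variances=[[1.0], [1.0]],
    beta=[[0.9, 0.1], [0.1, 0.9]],
)
```

Because every region and every word shares the same z, there is no per-region latent variable to condition on, which is why the mixture offers no p(w | r, r_n).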

slide-11
SLIDE 11

Gaussian Multinomial LDA

[Figure: plate diagram of Gaussian Multinomial LDA. Variables: θ, z, r, v, w; parameters: α, µ, σ, β; plates: N, M, D.]

slide-12
SLIDE 12

Distributions

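$$p(r, w, \theta, z, v) = p(\theta \mid \alpha) \left( \prod_{n=1}^{N} p(z_n \mid \theta)\, p(r_n \mid z_n, \mu, \sigma) \right) \cdot \left( \prod_{m=1}^{M} p(v_m \mid \theta)\, p(w_m \mid v_m, \beta) \right).$$

All

  • p(r, w)
  • p(w | r)
  • p(w | r, r_n)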
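
The factorization on this slide reads as a generative recipe, sketched below for one image/caption pair. It assumes p(θ | α) is a Dirichlet (as in standard LDA) and that the region Gaussians are diagonal; the function names and toy parameter values are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_gm_lda(alpha, mus, sigmas, beta, n_regions, n_words, rng):
    """Sample (theta, z, regions, v, words) following the GM-LDA factorization."""
    num_topics = len(alpha)
    theta = sample_dirichlet(alpha, rng)                           # p(theta | alpha)
    z = rng.choices(range(num_topics), weights=theta, k=n_regions)  # p(z_n | theta)
    regions = [[rng.gauss(m, s) for m, s in zip(mus[zn], sigmas[zn])]
               for zn in z]                                        # p(r_n | z_n, mu, sigma)
    v = rng.choices(range(num_topics), weights=theta, k=n_words)    # p(v_m | theta)
    words = [rng.choices(range(len(beta[vm])), weights=beta[vm], k=1)[0]
             for vm in v]                                          # p(w_m | v_m, beta)
    return theta, z, regions, v, words


rng = random.Random(0)
theta, z, regions, v, words = sample_gm_lda(
    alpha=[0.5, 0.5],
    mus=[[0.0, 0.0], [5.0, 5.0]],
    sigmas=[[1.0, 1.0], [1.0, 1.0]],
    beta=[[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]],
    n_regions=6,
    n_words=3,
    rng=rng,
)
```

Note that the word topics v are drawn from θ independently of the region topics z, so regions and words are tied together only through the shared θ.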
slide-13
SLIDE 13

Correspondence LDA

[Figure: plate diagram of Correspondence LDA. Variables: θ, z, r, y, w; parameters: α, µ, σ, β; plates: N, M, D.]

slide-14
SLIDE 14

Distributions

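$$p(r, w, \theta, z, y) = p(\theta \mid \alpha) \left( \prod_{n=1}^{N} p(z_n \mid \theta)\, p(r_n \mid z_n, \mu, \sigma) \right) \cdot \left( \prod_{m=1}^{M} p(y_m \mid N)\, p(w_m \mid y_m, z, \beta) \right)$$

All

  • p(r, w)
  • p(w | r)
  • p(w | r, r_n)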
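
As a sketch of how this factorization differs in practice, the sampler below draws the regions first and then, for each word, picks one region index y_m and emits the word from that region's topic. It assumes a Dirichlet prior on θ, diagonal region Gaussians, and a uniform p(y_m | N) over the N regions; names and toy values are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_corr_lda(alpha, mus, sigmas, beta, n_regions, n_words, rng):
    """Sample (theta, z, regions, y, words) following the Corr-LDA factorization."""
    num_topics = len(alpha)
    theta = sample_dirichlet(alpha, rng)                           # p(theta | alpha)
    z = rng.choices(range(num_topics), weights=theta, k=n_regions)  # p(z_n | theta)
    regions = [[rng.gauss(m, s) for m, s in zip(mus[zn], sigmas[zn])]
               for zn in z]                                        # p(r_n | z_n, mu, sigma)
    # Each word selects one region (0-based index), assumed uniform over regions.
    y = [rng.randrange(n_regions) for _ in range(n_words)]         # p(y_m | N)
    # The word is drawn from the topic of the selected region.
    words = [rng.choices(range(len(beta[z[ym]])), weights=beta[z[ym]], k=1)[0]
             for ym in y]                                          # p(w_m | y_m, z, beta)
    return theta, z, regions, y, words


rng = random.Random(0)
# One-hot word distributions make the correspondence visible: word id == region topic.
theta, z, regions, y, words = sample_corr_lda(
    alpha=[0.5, 0.5],
    mus=[[0.0], [5.0]],
    sigmas=[[1.0], [1.0]],
    beta=[[1.0, 0.0], [0.0, 1.0]],
    n_regions=6,
    n_words=4,
    rng=rng,
)
```

With the one-hot toy β, every sampled word equals the topic of the region it points to, which illustrates why caption words can only use topics that actually generated some region of the image.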

slide-15
SLIDE 15

Inference & Estimation

  • Variational Inference
      • Exact inference is intractable
      • Approximate the posterior with a factorizable distribution
      • Minimize KL-Divergence via iterative updates to parameters
  • Parameter Estimation
      • EM algorithm
      • E: Compute variational posterior.
      • M: Maximum likelihood estimate of the model parameters.
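
The link between the KL-divergence bullet and the EM steps can be written out as a standard identity, shown here for the Corr-LDA latent variables (θ, z, y) as one possible instance. The fully factorized form of q is an assumption about what "factorizable" means on this slide, and model parameters are omitted from the notation for brevity.

$$\log p(r, w) = \mathcal{L}(q) + \mathrm{KL}\big(q(\theta, z, y) \,\|\, p(\theta, z, y \mid r, w)\big)$$

$$\mathcal{L}(q) = \mathbb{E}_q[\log p(r, w, \theta, z, y)] - \mathbb{E}_q[\log q(\theta, z, y)]$$

$$q(\theta, z, y) = q(\theta) \prod_{n=1}^{N} q(z_n) \prod_{m=1}^{M} q(y_m)$$

Since log p(r, w) does not depend on q, minimizing the KL term over q is the same as maximizing the lower bound L(q); the E step does this per document, and the M step then maximizes the summed bound over the model parameters.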
slide-16
SLIDE 16

Evaluation

  • 7000 Images and their captions
  • 75% Training & 25% Testing
  • Test set likelihood
  • Automatic annotation
  • Text based retrieval
slide-17
SLIDE 17

Eval: Test set likelihood

[Figure: average negative log probability (y-axis) vs. number of factors (x-axis) for Corr-LDA, GM-Mixture, GM-LDA and ML.]

slide-18
SLIDE 18

Eval: Automatic Annotation

[Figure: caption perplexity (y-axis) vs. number of factors (x-axis) for Corr-LDA, GM-Mixture, GM-LDA and ML. Two panels: "Maximum likelihood" and "Empirical Bayes smoothed".]

$$\text{perplexity} = \exp\left\{ - \sum_{d=1}^{D} \sum_{m=1}^{M_d} \log p(w_m \mid r_d) \Big/ \sum_{d=1}^{D} M_d \right\}$$
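
The perplexity formula on this slide is straightforward to compute once a model supplies log p(w_m | r_d) for every caption word of every test image. The sketch below assumes those log probabilities are already available as nested lists; the function name and the toy input are hypothetical.

```python
import math


def caption_perplexity(log_probs_per_image):
    """Perplexity from log_probs_per_image[d][m] = log p(w_m | r_d)."""
    total_log_prob = sum(sum(lps) for lps in log_probs_per_image)
    total_words = sum(len(lps) for lps in log_probs_per_image)
    if total_words == 0:
        raise ValueError("need at least one caption word")
    # exp of the negative average per-word log probability.
    return math.exp(-total_log_prob / total_words)


# Sanity check: a model that is uniform over a 168-term vocabulary.
uniform = math.log(1.0 / 168.0)
ppl = caption_perplexity([[uniform, uniform], [uniform, uniform, uniform]])
```

A model that spreads its mass uniformly over the 168-term vocabulary scores a perplexity of 168, so lower values on the plots mean the model concentrates probability on the true caption words.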

slide-19
SLIDE 19

Eval: Automatic Annotation (Qual.)

Image 1

  • True caption: scotland water
  • Corr-LDA: scotland water flowers hills tree
  • GM-LDA: tree water people mountain sky
  • GM-Mixture: water sky clouds sunset scotland

Image 2

  • True caption: clouds jet plane
  • Corr-LDA: sky plane jet mountain clouds
  • GM-LDA: sky water people tree clouds
  • GM-Mixture: sky plane jet clouds pattern

Image 3

  • True caption: fish reefs water
  • Corr-LDA: fish water ocean tree coral
  • GM-LDA: water sky vegetables tree people
  • GM-Mixture: fungus mushrooms tree flowers leaves

slide-20
SLIDE 20

Eval: Automatic Annotation (Qual.)

[Figure: test image segmented into six regions, numbered 1 to 6.]

Corr-LDA:

  • 1. PEOPLE, TREE
  • 2. SKY, JET
  • 3. SKY, CLOUDS
  • 4. SKY, MOUNTAIN
  • 5. PLANE, JET
  • 6. PLANE, JET

GM-LDA:

  • 1. HOTEL, WATER
  • 2. PLANE, JET
  • 3. TUNDRA, PENGUIN
  • 4. PLANE, JET
  • 5. WATER, SKY
  • 6. BOATS, WATER

slide-21
SLIDE 21

Text Based Retrieval

[Figure: precision (y-axis) vs. recall (x-axis) curves for Corr-LDA, GM-Mixture and GM-LDA. Three panels, one per query: "candy", "sunset", "people & fish".]
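
For readers who want to reproduce curves of this kind, here is a minimal sketch of turning a ranked list of test images into precision/recall points. It assumes each image already has a relevance score for the query (how the score is obtained from a model is not stated on this slide) and a ground-truth relevance flag; the function name and toy data are hypothetical.

```python
def precision_recall_points(scores, relevant):
    """Return (recall, precision) after each rank, ranking by descending score."""
    total_relevant = sum(1 for flag in relevant if flag)
    if total_relevant == 0:
        raise ValueError("need at least one relevant item")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    points = []
    for rank, i in enumerate(order, start=1):
        if relevant[i]:
            hits += 1
        # recall = hits / all relevant items, precision = hits / items retrieved so far
        points.append((hits / total_relevant, hits / rank))
    return points


# Toy query: four images, two of which truly match the query.
points = precision_recall_points(
    scores=[0.9, 0.8, 0.3, 0.1],
    relevant=[True, False, True, False],
)
```

Each point corresponds to cutting the ranked list at one depth; plotting precision against recall over all depths gives one curve per model and query.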

slide-22
SLIDE 22

Text Based Retrieval (Qual.)

[Figure: example retrieved images for the queries "Candy", "Sunset" and "People & Fish".]

slide-23
SLIDE 23

Conclusion

If conditionals are needed, then model them explicitly