Learning Object Categories from Google's Image Search (R. Fergus et al.)

SLIDE 1

Learning Object Categories from Google’s Image Search

  • R. Fergus et al.

Presented by Jie Xiao

  • Dept. of Computer Science
  • Univ. of Texas at San Antonio
SLIDE 2

Outline

  • Motivation
  • “Bag of words” model
  • Approaches (pLSA, ABS-pLSA, TSI-pLSA)
  • Dataset
  • Experiment
  • Conclusion

jxiao@cs.utsa.edu

SLIDE 3

Motivation

  • Current approaches to object categorization require a manually labeled dataset as the training set.
  • Collecting the data is time-consuming and involves a great deal of human work.
  • Finding good examples is another concern.

SLIDE 4

Bag of Words Model

Of all the sensory impressions proceeding to the brain, the visual experiences are the dominant ones. Our perception of the world around us is based essentially on the messages that reach the brain from our eyes. For a long time it was thought that the retinal image was transmitted point by point to visual centers in the brain; the cerebral cortex was a movie screen, so to speak, upon which the image in the eye was projected. Through the discoveries of Hubel and Wiesel we now know that behind the origin of the visual perception in the brain there is a considerably more complicated course of events. By following the visual impulses along their path to the various cell layers of the optical cortex, Hubel and Wiesel have been able to demonstrate that the message about the image falling on the retina undergoes a step-wise analysis in a system of nerve cells stored in columns. In this system each cell has its specific function and is responsible for a specific detail in the pattern of the retinal image.

sensory, brain, visual, perception, retinal, cerebral cortex, eye, cell, optical nerve, image, Hubel, Wiesel

China is forecasting a trade surplus of $90bn (£51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with a 18% rise in imports to $660bn. The figures are likely to further annoy the US, which has long argued that China's exports are unfairly helped by a deliberately undervalued yuan. Beijing agrees the surplus is too high, but says the yuan is only one factor. Bank of China governor Zhou Xiaochuan said the country also needed to do more to boost domestic demand so more goods stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value.

China, trade, surplus, commerce, exports, imports, US, yuan, bank, domestic, foreign, increase, trade, value

Slide credit: Rob Fergus

SLIDE 5

Bag of Words Model

  • LSA: a singular value decomposition (SVD) process, in which U and V are orthonormal matrices
  • pLSA
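The slide's equation did not survive extraction. For reference, the standard LSA factorization it alludes to can be written as below. The symbols N (word-document co-occurrence matrix), Σ (diagonal matrix of singular values), and the truncation rank k are notation assumed here, not taken from the slide.

```latex
N \approx U_k \, \Sigma_k \, V_k^{\top},
\qquad U_k^{\top} U_k = I,
\qquad V_k^{\top} V_k = I
```

Keeping only the k largest singular values gives a low-rank approximation of the co-occurrence matrix, which is the sense in which LSA finds latent structure.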

SLIDE 6

Bag of Words Model -- pLSA

  • D: set of documents
  • W: visual words
  • Z: topics
  • A latent variable z is associated with each occurrence of a word w in a document d.
  • Matrix N (size M × N): co-occurrence counts of words and documents
  • N(w, d): the number of times word w appears in document d

SLIDE 7

Bag of Words Model – pLSA (Cont.)

  • P(w | z): co-occurrence of words within a topic
  • P(z | d): density of topics in a given document

SLIDE 8

Bag of Words Model – pLSA (Cont.)

  • P(w | z): topic-specific word distribution
  • P(z | d): document-specific mixing proportion
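The equation these two labels annotate is missing from the extracted text. In the standard pLSA formulation, which is assumed here, the two densities combine as follows:

```latex
P(w, d) = P(d) \sum_{z \in Z} P(w \mid z) \, P(z \mid d)
```

Each document is thus a mixture of topics with weights P(z | d), and each topic has its own distribution over visual words, P(w | z).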

SLIDE 9

Bag of Words Model – pLSA (Cont.)

SLIDE 10

Bag of Words Model – pLSA (Cont.)

The model parameters are estimated with EM, alternating an E step and an M step.
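The E-step and M-step formulas are missing from the extracted slide. The sketch below uses the textbook pLSA updates, which may differ in detail from what the slide showed. The function name `plsa_em`, the random initialization, and the fixed iteration count are all assumptions made for illustration.

```python
import random


def plsa_em(counts, n_topics, n_iters=50, seed=0):
    """Fit pLSA by EM; counts[w][d] is n(w, d). Returns (P(w|z), P(z|d))."""
    rng = random.Random(seed)
    n_words, n_docs = len(counts), len(counts[0])

    def random_columns(rows, cols):
        # Random positive matrix whose columns each sum to 1.
        m = [[rng.random() + 1e-3 for _ in range(cols)] for _ in range(rows)]
        for c in range(cols):
            s = sum(m[r][c] for r in range(rows))
            for r in range(rows):
                m[r][c] /= s
        return m

    p_w_z = random_columns(n_words, n_topics)  # p_w_z[w][z] = P(w|z)
    p_z_d = random_columns(n_topics, n_docs)   # p_z_d[z][d] = P(z|d)

    for _ in range(n_iters):
        acc_w_z = [[0.0] * n_topics for _ in range(n_words)]
        acc_z_d = [[0.0] * n_docs for _ in range(n_topics)]
        for w in range(n_words):
            for d in range(n_docs):
                if counts[w][d] == 0:
                    continue
                # E step: P(z|w,d) is proportional to P(w|z) * P(z|d).
                post = [p_w_z[w][z] * p_z_d[z][d] for z in range(n_topics)]
                total = sum(post)
                if total == 0.0:
                    continue
                for z in range(n_topics):
                    expected = counts[w][d] * post[z] / total
                    acc_w_z[w][z] += expected
                    acc_z_d[z][d] += expected
        # M step: renormalize the expected counts into probabilities.
        for z in range(n_topics):
            s = sum(acc_w_z[w][z] for w in range(n_words))
            for w in range(n_words):
                p_w_z[w][z] = acc_w_z[w][z] / s if s > 0 else 1.0 / n_words
        for d in range(n_docs):
            s = sum(acc_z_d[z][d] for z in range(n_topics))
            for z in range(n_topics):
                p_z_d[z][d] = acc_z_d[z][d] / s if s > 0 else 1.0 / n_topics
    return p_w_z, p_z_d
```

A real implementation would normally monitor the log-likelihood for convergence instead of running a fixed number of iterations.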

SLIDE 11

Bag of Words Model (Cont.)

[Figure: an object and its bag-of-words representation]

Slide credit: Rob Fergus

SLIDE 12

Bag of Words Model (Cont.)

Representation

  1. Feature detection & representation
  2. Codewords dictionary
  3. Image representation

Slide credit: Rob Fergus
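The last step of this pipeline, turning an image's descriptors into a bag of words, can be sketched as follows. This is a minimal illustration: it assumes the codewords dictionary has already been built (for example by clustering descriptors), and the function names `nearest_codeword` and `bag_of_words` are hypothetical.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to `descriptor` (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for i, word in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def bag_of_words(descriptors, codebook):
    """Histogram of codeword counts for one image; spatial layout is discarded."""
    hist = [0] * len(codebook)
    for desc in descriptors:
        hist[nearest_codeword(desc, codebook)] += 1
    return hist
```

Each image then contributes one column of word counts to the co-occurrence table that pLSA is fitted to.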

SLIDE 13

Approach

ABS-pLSA

  • Quantize the location within the image into one of X bins.
  • Use the joint distribution P(w, x | z) instead of P(w | z).
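One way to realize the "one of X bins" quantization is a regular grid over the image, sketched below. The grid layout, the default of 3 × 3 bins, and the function names are assumptions for illustration; the slide does not say how the bins are arranged.

```python
def location_bin(x, y, width, height, grid=3):
    """Map an absolute image position to one of grid * grid location bins."""
    col = min(int(x / width * grid), grid - 1)   # clamp points on the right edge
    row = min(int(y / height * grid), grid - 1)  # clamp points on the bottom edge
    return row * grid + col


def joint_word_index(word, loc_bin, n_bins):
    """Flatten a (word, location bin) pair into a single index."""
    return word * n_bins + loc_bin
```

Flattening the pair into one index means an ordinary pLSA co-occurrence table can be reused unchanged, just with W × X rows instead of W.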

SLIDE 14

Approach (Cont.)

TSI-pLSA

  • Introduce a latent variable, c, that represents the centroid of the object.
  • Locations are binned relative to the object: foreground bins inside it, plus a single background bin.
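The idea of binning locations relative to a hypothesized object, instead of relative to the image, can be sketched as below. This is only an illustration: the box parameterization (centroid plus width and height), the 3 × 3 foreground grid, and the name `tsi_location_bin` are assumptions, not details given on the slide.

```python
def tsi_location_bin(x, y, centroid, scale, grid=3):
    """Bin a region's position relative to a hypothesized object box.

    centroid = (cx, cy) is the box centre and scale = (sx, sy) its width and
    height. Returns 0 for the background bin and 1..grid*grid for the
    foreground bins inside the box.
    """
    cx, cy = centroid
    sx, sy = scale
    # Normalized box coordinates: inside the box both lie in [0, 1).
    u = (x - cx) / sx + 0.5
    v = (y - cy) / sy + 0.5
    if not (0.0 <= u < 1.0 and 0.0 <= v < 1.0):
        return 0  # outside the box: single background bin
    return 1 + int(v * grid) * grid + int(u * grid)
```

Because positions are measured from the centroid and divided by the box size, shifting or rescaling the object together with its regions leaves the bin indices unchanged, which is what makes the representation translation and scale invariant.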

SLIDE 15

Approach (Cont.)

SLIDE 16

Datasets

  • PT: prepared training set, manually gathered
  • P: prepared test set
  • G: raw data downloaded from Google Image Search

Downloaded images are labeled as:

  • Good images: good examples, related to the keyword category
  • Intermediate images: related to the keyword category, but of lower quality than the good images
  • Junk images: totally unrelated to the keyword category

SLIDE 17

Datasets (Cont.)

  • V: Google validation set
  • Assume that the images from the first pages of results are positive examples.
  • Cross-language collections

SLIDE 18

Datasets (Cont.)

SLIDE 19

Datasets (Cont.) statistics

SLIDE 20

Experiments

  • Convert to grayscale
  • Resize to a moderate size
  • Detect regions
  • Represent each region by a SIFT descriptor
  • Quantize the descriptor vector
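The first two preprocessing steps can be sketched without any imaging library, as below. The luminance weights (ITU-R BT.601) and nearest-neighbour resampling are assumptions chosen for simplicity; the slide does not state which conversion or interpolation was used, and both function names are hypothetical.

```python
def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples to rows of luminance values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def resize_nearest(gray, new_h, new_w):
    """Nearest-neighbour resize of a 2-D list of pixel values."""
    h, w = len(gray), len(gray[0])
    return [
        [gray[i * h // new_h][j * w // new_w] for j in range(new_w)]
        for i in range(new_h)
    ]
```

Region detection and SIFT description would follow on the resized grayscale image; those steps need a dedicated vision library and are not sketched here.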

SLIDE 21

Experiments – region detector

Region detectors:

  • Kadir & Brady saliency operator
  • Multi-scale Harris detector
  • Difference of Gaussians
  • Edge-based operator

SLIDE 22

Experiments (Cont.)

SLIDE 23

Experiments (Cont.)

SLIDE 24

Experiments (Cont.)

SLIDE 25

Experiments (Cont.)

  • Red: pLSA
  • Green: ABS-pLSA
  • Blue: TSI-pLSA
  • Solid line: performance of the automatically chosen topic within the model
  • Dashed line: performance of the best topic within the model

SLIDE 26

Discussion

  • Limited categories
  • Prior knowledge about the number of categories
  • Image background
  • Similar visual words

SLIDE 27

Conclusion

  • Introduces spatial information into pLSA.
  • Learns an object category from its category name.

SLIDE 28

Thank you!
