Learning Object Categories from Google's Image Search (R. Fergus et al.)

Learning Object Categories from Google's Image Search
  R. Fergus et al.
  Presented by Jie Xiao
  Dept. of Computer Science, Univ. of Texas at San Antonio

Outline
  Motivation
  "Bag of words" model
  Approaches (pLSA, ABS-pLSA, TSI-pLSA)
  Dataset
  Experiment
  Conclusion
  jxiao@cs.utsa.edu

Motivation
  Current approaches to object categorization require a manually labeled dataset as the training set.
  Collecting the data is time-consuming and involves a great deal of human work.
  Finding good examples is another concern.

Bag of Words Model
  [Figure: two example text passages, each shown next to the bag of words extracted from it.]
  Passage on visual perception (Hubel and Wiesel): sensory, brain, visual, perception, retinal, cerebral cortex, eye, cell, optical nerve, image, Hubel, Wiesel
  Passage on China's trade surplus: China, trade, surplus, commerce, exports, imports, US, yuan, bank, domestic, foreign, increase, trade, value
  Slide credit: Rob Fergus
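
The bag-of-words idea on this slide can be illustrated with a few lines of Python. This is only a minimal sketch: the function name `bag_of_words`, the six-word vocabulary, and the two shortened sentences are assumptions chosen for illustration, loosely based on the two passages in the figure.

```python
from collections import Counter
import re

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in text, ignoring word order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t in vocabulary)
    # One count per vocabulary entry, in vocabulary order.
    return [counts[w] for w in vocabulary]

# Hypothetical vocabulary and shortened example sentences.
vocabulary = ["brain", "visual", "retinal", "china", "trade", "yuan"]
doc_vision = "the retinal image was transmitted point by point to visual centers in the brain"
doc_trade = "China is forecasting a trade surplus and the US wants the yuan to trade freely"

print(bag_of_words(doc_vision, vocabulary))
print(bag_of_words(doc_trade, vocabulary))
```

Each document is reduced to a vector of counts, so two documents about different subjects end up with mass on different vocabulary entries even though all word order has been discarded.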

Bag of Words Model
  LSA: a singular value decomposition (SVD) process, in which U and V are orthonormal matrices
  pLSA
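
As a rough illustration of the SVD step behind LSA, the sketch below factors a small count matrix with NumPy and keeps the two largest singular values. The matrix values and the choice k = 2 are made up for the example and do not come from the slides.

```python
import numpy as np

# Hypothetical 4-word x 3-document count matrix.
N = np.array([[2.0, 0.0, 1.0],
              [1.0, 0.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# Thin SVD: N = U @ diag(s) @ Vt, with orthonormal columns in U and rows in Vt.
U, s, Vt = np.linalg.svd(N, full_matrices=False)

# Rank-k approximation keeps only the k largest singular values.
k = 2
N_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(np.round(N_k, 2))
```

The truncated product is the closest rank-k matrix to N in the least-squares sense, which is what gives LSA its low-dimensional "topic" space.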

Bag of Words Model – pLSA
  D: set of documents
  W: set of visual words
  Z: set of topics
  A latent variable z is associated with each word w and document d.
  Matrix N (M × N): co-occurrence of words and documents
  N(w,d): the number of times word w appears in document d
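
The co-occurrence matrix N(w,d) defined above could be built as follows. This is a sketch under the assumption that every document has already been converted to a list of integer visual-word indices; the function name `cooccurrence_matrix` and the toy data are hypothetical.

```python
import numpy as np

def cooccurrence_matrix(docs, num_words):
    """Return the matrix whose entry (w, d) counts occurrences of word w in document d."""
    N = np.zeros((num_words, len(docs)), dtype=int)
    for d, words in enumerate(docs):
        # bincount tallies how often each word index appears in this document.
        N[:, d] = np.bincount(words, minlength=num_words)
    return N

# Three toy documents over a 5-word vocabulary (indices 0..4).
docs = [[0, 2, 2, 3], [1, 1, 4], [0, 0, 0, 4, 4]]
N = cooccurrence_matrix(docs, num_words=5)
print(N)
```

Rows index words and columns index documents, so each column sum equals the number of words in that document.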

Bag of Words Model – pLSA (Cont.)
  Co-occurrence of words within a topic
  Density of a topic in a given document

Bag of Words Model – pLSA (Cont.)
  Topic-specific word distribution
  Document-specific mixing proportion
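
The formulas that these labels annotated are not preserved in the transcript. For orientation, the standard pLSA formulation, written with the symbols D, W, Z and N(w,d) defined earlier, is given below. It is assumed, not confirmed, that the slides showed this form; here P(w|z) is the topic-specific word distribution and P(z|d) is the document-specific mixing proportion.

```latex
% Joint and conditional forms of the pLSA model
P(w, d) = \sum_{z \in Z} P(w \mid z)\, P(z \mid d)\, P(d)
\qquad
P(w \mid d) = \sum_{z \in Z} P(w \mid z)\, P(z \mid d)

% Log-likelihood of the observed counts
L = \sum_{d \in D} \sum_{w \in W} N(w, d)\, \log P(w, d)
```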

Bag of Words Model – pLSA (Cont.)

Bag of Words Model – pLSA (Cont.)
  The model is calculated by EM, alternating between two steps:
  E step
  M step
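
The E-step and M-step equations are missing from the transcript. The sketch below implements the textbook EM updates for pLSA in NumPy, as an assumption about what the slide covered rather than a copy of it. The function name `plsa_em`, the random initialisation, and the small constant that guards against division by zero are choices made for this example.

```python
import numpy as np

def plsa_em(N, num_topics, num_iters=50, seed=0):
    """Fit P(w|z) and P(z|d) to a word-by-document count matrix N with EM."""
    rng = np.random.default_rng(seed)
    num_words, num_docs = N.shape
    # Random initialisation; every column is a probability distribution.
    p_w_z = rng.random((num_words, num_topics))
    p_w_z /= p_w_z.sum(axis=0, keepdims=True)
    p_z_d = rng.random((num_topics, num_docs))
    p_z_d /= p_z_d.sum(axis=0, keepdims=True)
    eps = 1e-12
    for _ in range(num_iters):
        # E step: P(z|w,d) is proportional to P(w|z) P(z|d); shape (words, topics, docs).
        post = p_w_z[:, :, None] * p_z_d[None, :, :]
        post /= post.sum(axis=1, keepdims=True) + eps
        # M step: re-estimate both distributions from the expected counts.
        weighted = N[:, None, :] * post
        p_w_z = weighted.sum(axis=2)
        p_w_z /= p_w_z.sum(axis=0, keepdims=True) + eps
        p_z_d = weighted.sum(axis=0)
        p_z_d /= p_z_d.sum(axis=0, keepdims=True) + eps
    return p_w_z, p_z_d

# Toy counts: 4 words x 4 documents.
N = np.array([[4, 3, 0, 0],
              [5, 2, 0, 1],
              [0, 0, 6, 3],
              [0, 1, 4, 5]])
p_w_z, p_z_d = plsa_em(N, num_topics=2)
print(np.round(p_z_d, 2))
```

The dense (words, topics, docs) array keeps the code short but would be too large for real vocabularies, where the updates are usually computed only over nonzero counts.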

Bag of Words Model (Cont.)
  [Figure: an object and its bag of words.]
  Slide credit: Rob Fergus

Bag of Words Model (Cont.)
  Representation
  1. Feature detection & representation
  2. Codewords dictionary
  3. Image representation
  Slide credit: Rob Fergus
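
Steps 2 and 3 of this slide can be sketched in NumPy. The slides do not say how the dictionary was built, so plain k-means is used here as an assumption, and random vectors stand in for real region descriptors. All function names are hypothetical.

```python
import numpy as np

def assign_words(descriptors, codebook):
    """Index of the nearest codeword for every descriptor (squared Euclidean distance)."""
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def build_codebook(descriptors, num_words, num_iters=20, seed=0):
    """Plain k-means: the cluster centers act as the codewords dictionary."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), num_words, replace=False)
    centers = descriptors[start].copy()
    for _ in range(num_iters):
        labels = assign_words(descriptors, centers)
        for k in range(num_words):
            members = descriptors[labels == k]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[k] = members.mean(axis=0)
    return centers

def image_histogram(descriptors, codebook):
    """Image representation: how often each codeword occurs in one image."""
    return np.bincount(assign_words(descriptors, codebook), minlength=len(codebook))

# Random stand-ins for descriptors pooled over a training set.
rng = np.random.default_rng(1)
all_descriptors = rng.normal(size=(200, 8))
codebook = build_codebook(all_descriptors, num_words=5)
hist = image_histogram(all_descriptors[:30], codebook)
print(hist)
```

The histogram plays the same role for an image that the word-count vector plays for a text document.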

Approach
  ABS-pLSA
  Quantize the location within the image into one of X bins.
  Use P(w,x|z) instead of P(w|z).
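
One way to picture the location quantization is to map each region's position to a cell of a regular grid and then pair the word index with the cell index. This is a sketch under assumptions: the slides only say "one of X bins", so the square grid (here 3 x 3, giving X = 9), the row-major numbering, and both function names are hypothetical.

```python
def location_bin(x, y, width, height, grid=3):
    """Map an absolute pixel position to one of grid*grid location bins."""
    col = min(int(grid * x / width), grid - 1)
    row = min(int(grid * y / height), grid - 1)
    return row * grid + col

def joint_index(word, bin_index, num_bins):
    """Single index for a (word, location bin) pair, so existing pLSA code can be reused."""
    return word * num_bins + bin_index

# A region at the center of a 640 x 480 image, quantized to visual word 7.
b = location_bin(320, 240, width=640, height=480)
print(b, joint_index(7, b, num_bins=9))
```

With this encoding the count matrix simply gains X times as many rows, one per word-and-location pair.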

Approach (Cont.)
  TSI-pLSA
  Introduces a latent variable c that represents the centroid of the object.
  [Figure labels: foreground bins, background bin]
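
The foreground and background bins in the figure suggest that locations are measured relative to the object rather than to the image. The sketch below shows one plausible reading: positions are binned inside a box centered on the centroid, and anything outside the box falls into a single background bin. The box dimensions, the 3 x 3 grid, the bin numbering, and the function name are all assumptions, not details taken from the slides.

```python
def tsi_location_bin(x, y, cx, cy, box_w, box_h, grid=3):
    """Bin a position relative to a box centered at (cx, cy); bin 0 is the background."""
    # Position as a fraction of the box, measured from its top-left corner.
    u = (x - (cx - box_w / 2)) / box_w
    v = (y - (cy - box_h / 2)) / box_h
    if not (0 <= u < 1 and 0 <= v < 1):
        return 0  # outside the box: background bin
    return 1 + int(v * grid) * grid + int(u * grid)

# The same relative position gives the same bin after a shift or a change of scale.
print(tsi_location_bin(100, 100, cx=100, cy=100, box_w=60, box_h=60))
print(tsi_location_bin(150, 120, cx=150, cy=120, box_w=60, box_h=60))
print(tsi_location_bin(100, 100, cx=100, cy=100, box_w=120, box_h=120))
```

Because the bin depends only on the position relative to the box, moving or resizing the object leaves the binned representation unchanged, which is the invariance the model name refers to.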

Approach (Cont.)

Datasets
  PT: prepared training set, manually gathered
  P: prepared test set
  G: raw data downloaded from Google Image Search
    Good images: good examples, related to the keyword category
    Intermediate images: related to the keyword category, but of lower quality than good images
    Junk images: totally unrelated to the keyword category

Datasets (Cont.)
  V: Google validation set. Assume that the images from the first pages are positive examples.
  Cross-language collections

Datasets (Cont.)

Datasets (Cont.)
  Statistics

Experiments
  Region detectors:
  1. Convert to grayscale
  2. Resize to a moderate size
  3. Detect regions
  4. Represent each region by a SIFT descriptor
  5. Quantize the descriptor vector
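
The first two steps of this pipeline can be sketched in plain NumPy. The slides do not give the target size or the conversion formula, so the 300-pixel limit, the common luminance weights, and the nearest-neighbor resampling are assumptions for illustration. Region detection and SIFT are left out because they need a vision library.

```python
import numpy as np

def to_grayscale(rgb):
    """Weighted sum of the R, G, B channels (a common luminance approximation)."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb.astype(float) @ weights

def resize_nearest(gray, max_side=300):
    """Shrink the image so its longer side is at most max_side (nearest neighbor)."""
    h, w = gray.shape
    scale = max_side / max(h, w)
    if scale >= 1:
        return gray  # already small enough
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    # For each output pixel, pick the corresponding source row and column.
    rows = (np.arange(new_h) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / scale).astype(int).clip(0, w - 1)
    return gray[rows][:, cols]

# A pure-red 600 x 400 test image.
rgb = np.zeros((600, 400, 3), dtype=np.uint8)
rgb[..., 0] = 255
gray = to_grayscale(rgb)
small = resize_nearest(gray, max_side=300)
print(gray.shape, small.shape)
```

Working at a fixed moderate size keeps the number of detected regions, and so the cost of the later steps, comparable across images.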

Experiments – region detector
  Region detectors:
  Kadir & Brady saliency operator
  Multi-scale Harris detector
  Difference of Gaussians
  Edge-based operator

Experiments (Cont.)

Experiments (Cont.)
  Red: pLSA
  Green: ABS-pLSA
  Blue: TSI-pLSA
  Solid line: performance of the automatically chosen topic within the model
  Dashed line: performance of the best topic within the model

Discussion
  Limited categories
  Prior knowledge about the number of categories
  Image background
  Similar visual words

Conclusion
  Introduces spatial information into pLSA.
  Learns an object category from the category name.

Thank you!


