 
Region-Based Image Retrieval with High Level Semantics
Ying Liu, Dengsheng Zhang and Guojun Lu
Gippsland School of Info Tech, Monash University, Churchill, Victoria, 3842
{dengsheng.zhang, guojun.lu}@infotech.monash.edu.au
www.infotech.monash.edu

Outline
• The Problem
• Content-based Image Retrieval (CBIR)
• Semantic Gap
• Low Level Image Features
• Learning Image Semantics Using Decision Tree
• Performance Test
• Integrate with Google: SIEVE
• Experiments and Results
• Conclusions

The Problem
• We live in a digital world and are inundated with digital images.
• How can a large image database be organized to facilitate convenient search?
• Finding the required images has become a headache for Internet users.
• It's a gold mining issue.

The Problem
• Find similar images from a database.
[Figure: example query, "Tiger"]

Challenges
• Images are not as structured as text documents.
• Metadata description of an image is not enough.
• Human description of image content is subjective.

Content-based Image Retrieval (CBIR)
• Represent images with content features.
• Color: histogram, dominant color.
• Shape: moments, Fourier descriptors, scale space method.
• Texture: statistical method, fractal method, spectral method.
• Region: blob, arbitrary, block.

CBIR: State of the Art
• Limited success in a number of specific domains.
• Industrial object recognition.
• CAD and other design database management.
• Museum visual document management.
• Trademark retrieval.
• No commercial CBIR system for the WWW.

Challenges: Semantic Gap
• Conventional content-based image retrieval (CBIR) systems put visual features ahead of textual information.
• However, there is a gap between visual features and semantic features (textual information) which cannot be closed easily.
[Figure courtesy of Md. Monirul Islam]

Cause of the Semantic Gap
• Low level features are usually used individually.
• A single type of feature cannot describe an image completely.
• Spatial information is usually ignored.
• Images are usually treated globally, while users are usually interested in objects instead of the whole image.

Narrow Down the Semantic Gap
• Divide an image into objects/regions.
• Describe objects/regions using multiple types of features.
• Learn semantic concepts from a large number of region samples.
• Use the learned semantic concepts to describe images.

Image Segmentation
• Segment images into regions using the JSEG technique (Y. Deng and B. S. Manjunath, IEEE PAMI, 2001).
• JSEG segments images using a combination of color and texture features.

Region Representation: Color Features
• Represent regions using their dominant color in HSV space.
[Figure: segmentation, HSV histogram, and regions with their dominant colors]

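One way to picture the dominant-color step is the sketch below. It is an illustration only: the slide does not say how the HSV histogram is quantized, so the 8 x 4 x 4 binning, the function name `dominant_hsv_color`, and the choice to return the bin centre are all assumptions.

```python
import colorsys
from collections import Counter


def dominant_hsv_color(rgb_pixels, bins=(8, 4, 4)):
    """Return the centre of the most populated HSV histogram bin of a region.

    rgb_pixels: iterable of (r, g, b) tuples with components in 0..255.
    bins: number of histogram bins for H, S and V (assumed values).
    """
    counts = Counter()
    for r, g, b in rgb_pixels:
        # colorsys works on floats in [0, 1] and returns H, S, V in [0, 1].
        hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # min(...) keeps a component equal to 1.0 inside the last bin.
        index = tuple(min(int(c * n), n - 1) for c, n in zip(hsv, bins))
        counts[index] += 1
    best_bin, _ = counts.most_common(1)[0]
    # Report the bin centre as the region's dominant color.
    return tuple((i + 0.5) / n for i, n in zip(best_bin, bins))
```

Called on the pixels of one segmented region, the function gives a single (H, S, V) triple that can serve as that region's color feature.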
Gabor Filters

$$G_{mn}(x, y) = \sum_{s} \sum_{t} I(x - s, y - t)\, \psi_{mn}^{*}(s, t)$$

$$\psi(x, y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\left[-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)\right] \cdot \exp(j 2\pi W x)$$

$$\psi_{mn}(x, y) = a^{-m}\, \psi(\tilde{x}, \tilde{y})$$

$$\tilde{x} = a^{-m}(x\cos\theta + y\sin\theta)$$

$$\tilde{y} = a^{-m}(-x\sin\theta + y\cos\theta)$$

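The Gabor definitions on this slide can be sketched in plain Python as follows. This is a minimal, unoptimized illustration: the slide gives no values for `a`, `W`, `sigma_x`, `sigma_y` or `theta`, so they are left as arguments, and the finite window `half` and the zero padding at the image border are assumptions made here.

```python
import cmath
import math


def gabor_mother(x, y, sigma_x, sigma_y, W):
    """Mother wavelet: Gaussian envelope times a complex sinusoid along x."""
    envelope = math.exp(-0.5 * (x * x / sigma_x ** 2 + y * y / sigma_y ** 2))
    return envelope / (2 * math.pi * sigma_x * sigma_y) * cmath.exp(2j * math.pi * W * x)


def gabor_mn(x, y, m, theta, a, sigma_x, sigma_y, W):
    """Dilated (scale m) and rotated (angle theta) copy of the mother wavelet."""
    scale = a ** (-m)
    x_rot = scale * (x * math.cos(theta) + y * math.sin(theta))
    y_rot = scale * (-x * math.sin(theta) + y * math.cos(theta))
    return scale * gabor_mother(x_rot, y_rot, sigma_x, sigma_y, W)


def filter_response(image, psi, half):
    """G(x, y) = sum_s sum_t I(x - s, y - t) * conj(psi(s, t)).

    image: 2-D list indexed as image[y][x].
    psi:   function (s, t) -> complex filter value.
    half:  filter taps run over -half..half (assumed finite support).
    Pixels outside the image are treated as zero (an assumption).
    """
    rows, cols = len(image), len(image[0])
    out = [[0j] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            total = 0j
            for t in range(-half, half + 1):
                for s in range(-half, half + 1):
                    xs, yt = x - s, y - t
                    if 0 <= xs < cols and 0 <= yt < rows:
                        total += image[yt][xs] * psi(s, t).conjugate()
            out[y][x] = total
    return out
```

A filter bank is obtained by calling `gabor_mn` for every scale and orientation of interest and running `filter_response` once per filter.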
Gabor Filters
[Figure]

Region Representation: Gabor Texture Features
• Gabor texture features are obtained by computing the mean and standard deviation of each filtered image (B. S. Manjunath and W. Y. Ma, IEEE PAMI, 1996).

$$E(m, n) = \sum_{x} \sum_{y} |G_{mn}(x, y)|, \quad m = 1, \ldots, M;\; n = 1, \ldots, N$$

$$\mu_{mn} = \frac{E(m, n)}{P \times Q}$$

$$\sigma_{mn} = \frac{\sqrt{\sum_{x} \sum_{y} \left(|G_{mn}(x, y)| - \mu_{mn}\right)^2}}{P \times Q}$$

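The mean and standard deviation features could be computed as in the sketch below. It assumes that P x Q is the size of the filtered image and that the square root covers only the sum of squared deviations, with the result then divided by P x Q; the function name and the list-of-lists input are hypothetical.

```python
import math


def gabor_texture_features(filtered):
    """Return (mu, sigma) for one filtered image G_mn.

    filtered: 2-D list of real or complex filter responses, P rows by Q columns.
    """
    P, Q = len(filtered), len(filtered[0])
    magnitudes = [abs(g) for row in filtered for g in row]
    energy = sum(magnitudes)              # E(m, n)
    mu = energy / (P * Q)                 # mean magnitude
    # Assumed form: sqrt of the summed squared deviations, divided by P * Q.
    sigma = math.sqrt(sum((v - mu) ** 2 for v in magnitudes)) / (P * Q)
    return mu, sigma
```

Concatenating the (mu, sigma) pairs over all scales and orientations gives one texture feature vector per region.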
Region Similarity Measurement: Earth Mover Distance (EMD)
• EMD is a distance modelled on the traditional transportation problem, which is solved using linear programming optimisation (Y. Rubner, ICCV98).

Given a query image $I_Q = \{(R_Q^i, w_Q^i) \mid i = 1, \ldots, m\}$ and a target image $I_T = \{(R_T^j, w_T^j) \mid j = 1, \ldots, n\}$, where $R_Q^i$ and $R_T^j$ are regions of the query and the target image, and $w_Q^i$ and $w_T^j$ are the weights of the regions.

Minimize

$$EMD(I_Q, I_T) = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} v_{ij}\, d_{ij}}{\sum_{i=1}^{m} \sum_{j=1}^{n} v_{ij}}$$

Subject to

$$v_{ij} \ge 0$$

$$\sum_{j=1}^{n} v_{ij} \le w_Q^i, \quad 1 \le i \le m$$

$$\sum_{i=1}^{m} v_{ij} \le w_T^j, \quad 1 \le j \le n$$

$$\sum_{i=1}^{m} \sum_{j=1}^{n} v_{ij} = \min\left(\sum_{i=1}^{m} w_Q^i,\; \sum_{j=1}^{n} w_T^j\right)$$

$d_{ij}$ is the Euclidean distance between regions $R_Q^i$ and $R_T^j$.

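A small worked example may help to read the EMD formulation. All numbers below are invented for illustration: two query regions, two target regions, and an arbitrary distance matrix.

```latex
% Hypothetical data: m = n = 2
% w_Q = (0.6, 0.4), w_T = (0.5, 0.5)
% d_{11} = 1, d_{12} = 3, d_{21} = 2, d_{22} = 1
\begin{align*}
\sum_{i}\sum_{j} v_{ij} &= \min(0.6 + 0.4,\; 0.5 + 0.5) = 1
  && \text{so every row and column constraint is tight} \\
v_{11} &= t, \quad v_{12} = 0.6 - t, \quad v_{21} = 0.5 - t, \quad v_{22} = t - 0.1,
  && 0.1 \le t \le 0.5 \\
\sum_{i}\sum_{j} v_{ij} d_{ij} &= t + 3(0.6 - t) + 2(0.5 - t) + (t - 0.1) = 2.7 - 3t \\
t^{*} &= 0.5 \;\Rightarrow\; v = \begin{pmatrix} 0.5 & 0.1 \\ 0 & 0.4 \end{pmatrix} \\
EMD(I_Q, I_T) &= \frac{0.5 \cdot 1 + 0.1 \cdot 3 + 0 \cdot 2 + 0.4 \cdot 1}{1} = 1.2
\end{align*}
```

Because the total weights are equal here, the feasible flows form a one-parameter family and the minimum can be found by hand; in general the problem is handed to a linear programming solver.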
Learning Image Semantics Using Decision Tree
• Given a set of training samples described by a set of input attributes, a decision tree classifies the samples based on the values of the given attributes.
• A decision tree (DT) is obtained by recursively splitting the training data into different subsets according to the possible values of the selected attribute, until the data samples in each subset belong to the same class.
• DT is very close to human reasoning, and is the only machine learning tool which can produce human comprehensible rules.

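The recursive splitting described above can be sketched as follows. The slide does not say how the splitting attribute is selected, so this sketch assumes information gain (as in ID3) and discrete attribute values; the function names and the dict-based sample format are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())


def build_tree(samples, labels, attributes):
    """Recursively split until every subset holds a single class.

    samples:    list of dicts mapping attribute name -> discrete value.
    labels:     list of class labels (strings), one per sample.
    attributes: attribute names still available for splitting.
    Returns a label (leaf) or a tuple (attribute, {value: subtree}).
    """
    if len(set(labels)) == 1:
        return labels[0]
    if not attributes:
        # No attribute left: fall back to the majority class.
        return Counter(labels).most_common(1)[0][0]

    def gain(attr):
        remainder = 0.0
        for value in set(s[attr] for s in samples):
            subset = [l for s, l in zip(samples, labels) if s[attr] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder

    best = max(attributes, key=gain)  # assumed selection criterion
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in set(s[best] for s in samples):
        idx = [i for i, s in enumerate(samples) if s[best] == value]
        branches[value] = build_tree(
            [samples[i] for i in idx], [labels[i] for i in idx], remaining
        )
    return (best, branches)


def classify(tree, sample):
    """Follow branches until a leaf; raises KeyError for an unseen value."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[sample[attr]]
    return tree
```

Each root-to-leaf path of the returned tree reads as a rule of the form "if color is blue and texture is smooth then sky", which is the kind of human comprehensible rule the slide refers to.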