Region-Based Image Retrieval with High Level Semantics – PowerPoint PPT Presentation



SLIDE 1

www.infotech.monash.edu

Region-Based Image Retrieval with High Level Semantics

Ying Liu, Dengsheng Zhang and Guojun Lu
Gippsland School of Info Tech, Monash University, Churchill, Victoria, 3842

{dengsheng.zhang, guojun.lu}@infotech.monash.edu.au

SLIDE 2

Outline

  • The Problem
  • Content-based Image Retrieval – CBIR
  • Semantic Gap
  • Low Level Image Features
  • Learning Image Semantics Using Decision Tree
  • Performance Test
  • Integrate with Google – SIEVE
  • Experiments and Results
  • Conclusions

SLIDE 3

The Problem

  • We are in a digital world, and we are inundated with digital images.
  • How to organize large image databases to facilitate convenient search.
  • How to find required images becomes a headache for Internet users.
  • It's a gold mining issue.

SLIDE 4

The Problem

Find similar images from a database: Tiger

SLIDE 5

Challenges

  • Images are not as structured as text documents.
  • Metadata description of an image is not enough.
  • Human description of image content is subjective.

SLIDE 6

Content-based Image Retrieval (CBIR)

  • Represent images with content features.
  • Color: histogram, dominant color.
  • Shape: moments, Fourier descriptors, scale space method.
  • Texture: statistical method, fractal method, spectral method.
  • Region: blob, arbitrary, block.
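As a sketch of the color features listed above, here is a minimal NumPy example of a hue histogram and a "dominant color" taken as its peak bin. The 8-bin quantisation and the function names are illustrative choices, not the authors' exact scheme:

```python
import numpy as np

def hue_histogram(rgb, bins=8):
    """Normalised hue histogram of an RGB image (H x W x 3, floats in [0, 1])."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    mx, mn = rgb.max(axis=-1), rgb.min(axis=-1)
    diff = np.where(mx == mn, 1.0, mx - mn)  # avoid division by zero for gray pixels
    h = np.select(
        [mx == mn, mx == r, mx == g],
        [np.zeros_like(mx),
         ((g - b) / diff) % 6.0,
         (b - r) / diff + 2.0],
        default=(r - g) / diff + 4.0) / 6.0  # hue mapped into [0, 1)
    hist, _ = np.histogram(h, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()

def dominant_hue_bin(rgb, bins=8):
    """Index of the most populated hue bin -- a crude 'dominant color'."""
    return int(np.argmax(hue_histogram(rgb, bins)))
```

A pure red image peaks in bin 0 and a pure green image (hue 120 degrees) in bin 2 of the 8-bin histogram.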

SLIDE 7

CBIR – State-of-the-Art

  • Limited success in a number of specific domains.
  • Industrial object recognition.
  • CAD and other design database management.
  • Museum visual document management.
  • Trademark retrieval.
  • No commercial CBIR system for the WWW.

SLIDE 8

Challenges – Semantic Gap

  • Conventional content-based image retrieval (CBIR) systems put visual features ahead of textual information.
  • However, there is a gap between visual features and semantic features (textual information) which cannot be closed easily.

Courtesy of Md. Monirul Islam

SLIDE 9

Cause of the Semantic Gap

  • Low level features are usually used individually.
  • A single type of feature cannot describe an image completely.
  • Spatial information is usually ignored.
  • Images are usually treated globally, while users are usually interested in objects instead of the whole image.

SLIDE 10

Narrow Down the Semantic Gap

  • Divide an image into objects/regions.
  • Describe objects/regions using multiple types of features.
  • Learn semantic concepts from a large number of region samples.
  • Use the learned semantic concepts to describe images.

SLIDE 11

Image Segmentation

  • Segment images into regions using the JSEG technique. (Y. Deng and B. S. Manjunath, IEEE PAMI, 2001)
  • JSEG segments images using a combination of color and texture features.

SLIDE 12

Region Representation – Color Features

  • Represent regions using their dominant color in HSV space.

Figure: regions and their dominant colors (segmentation, HSV histogram, dominant color).

SLIDE 13

Gabor Filters

The Gabor filtered image is the convolution of the image I with the complex conjugate of each filter:

G_{mn}(x, y) = \sum_s \sum_t I(x - s, y - t)\, \psi_{mn}^{*}(s, t)

The filter bank is generated from the mother Gabor function by scaling and rotation:

\psi_{mn}(x, y) = a^{-m}\, \psi(\tilde{x}, \tilde{y})

\psi(x, y) = \frac{1}{2\pi \sigma_x \sigma_y} \exp\left[-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)\right] \exp(j 2\pi W x)

\tilde{x} = a^{-m}(x \cos\theta + y \sin\theta), \quad \tilde{y} = a^{-m}(-x \sin\theta + y \cos\theta)

SLIDE 14

Gabor Filters

SLIDE 15

Region Representation – Gabor Texture Features

E(m, n) = \sum_x \sum_y |G_{mn}(x, y)|, \quad m = 1, \dots, M;\ n = 1, \dots, N

\mu_{mn} = \frac{E(m, n)}{P \times Q}, \quad \sigma_{mn} = \sqrt{\frac{\sum_x \sum_y \left(|G_{mn}(x, y)| - \mu_{mn}\right)^2}{P \times Q}}

  • Gabor texture features are obtained by computing the mean and standard deviation of each filtered image. (B. S. Manjunath and W. Y. Ma, IEEE PAMI, 1996)
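The formulas above can be sketched in a few lines of NumPy. This is my own minimal implementation with arbitrary filter-bank parameters (W, sigma, four orientations); a real system would use the paper's scales and orientations:

```python
import numpy as np

def gabor_kernel(W=0.5, theta=0.0, sigma=2.0, size=15):
    """Mother Gabor function from the slide, rotated by theta (sigma_x = sigma_y = sigma)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = (1.0 / (2 * np.pi * sigma ** 2)) * np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    return g * np.exp(2j * np.pi * W * xr)  # complex sinusoid modulation

def gabor_features(img, thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Mean and standard deviation of |filtered image| for each orientation."""
    feats = []
    for theta in thetas:
        k = gabor_kernel(theta=theta)
        # FFT-based circular convolution (kernel zero-padded to the image size)
        resp = np.abs(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(k, img.shape)))
        feats.extend([resp.mean(), resp.std()])
    return np.array(feats)
```

For four orientations this yields an 8-dimensional (mean, std) texture feature vector per region.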

SLIDE 16

Region Similarity Measurement – Earth Mover's Distance (EMD)

  • EMD is a distance modelled on the traditional transportation problem, which is solved using linear programming optimisation. (Y. Rubner, ICCV98)

Given a query image I_Q = \{(R_{Qi}, w_{Qi}) \mid i = 1, \dots, m\} and a target image I_T = \{(R_{Tj}, w_{Tj}) \mid j = 1, \dots, n\}, where R_{Qi} and R_{Tj} are regions of the query and the target image, w_{Qi} and w_{Tj} are the weights of the regions, and d_{ij} is the Euclidean distance between regions R_{Qi} and R_{Tj}:

EMD(I_Q, I_T) = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} v_{ij} d_{ij}}{\sum_{i=1}^{m} \sum_{j=1}^{n} v_{ij}}

Minimize \sum_{i=1}^{m} \sum_{j=1}^{n} v_{ij} d_{ij} subject to:

v_{ij} \geq 0; \quad \sum_{j=1}^{n} v_{ij} \leq w_{Qi}, \ 1 \leq i \leq m; \quad \sum_{i=1}^{m} v_{ij} \leq w_{Tj}, \ 1 \leq j \leq n; \quad \sum_{i=1}^{m} \sum_{j=1}^{n} v_{ij} = \min\left(\sum_{i=1}^{m} w_{Qi}, \sum_{j=1}^{n} w_{Tj}\right)
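The transportation LP above can be sketched with an off-the-shelf solver. A minimal illustration, assuming SciPy's `linprog` is available (the function name and array layout are mine, not the authors'):

```python
import numpy as np
from scipy.optimize import linprog

def emd(w_q, w_t, dist):
    """Earth Mover's Distance between two region signatures, solved as the
    transportation LP from the slide. w_q (m,), w_t (n,): region weights;
    dist (m, n): ground distances d_ij between regions."""
    m, n = dist.shape
    c = dist.reshape(-1)                      # objective: sum v_ij * d_ij
    A_ub, b_ub = [], []
    for i in range(m):                        # sum_j v_ij <= w_q[i]
        row = np.zeros(m * n)
        row[i * n:(i + 1) * n] = 1.0
        A_ub.append(row)
        b_ub.append(w_q[i])
    for j in range(n):                        # sum_i v_ij <= w_t[j]
        col = np.zeros(m * n)
        col[j::n] = 1.0
        A_ub.append(col)
        b_ub.append(w_t[j])
    total = min(np.sum(w_q), np.sum(w_t))     # total-flow equality constraint
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  A_eq=[np.ones(m * n)], b_eq=[total], bounds=(0, None))
    return res.fun / total                    # normalised by the total flow
```

Two identical signatures give an EMD of 0, since all flow can travel at zero cost.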

SLIDE 17

Learning Image Semantics Using Decision Tree

  • Given a set of training samples described by a set of input attributes, a decision tree classifies the samples based on the values of the given attributes.
  • A decision tree (DT) is obtained by recursively splitting the training data into different subsets according to the possible values of the selected attribute, until the data samples in each subset belong to the same class.
  • DT is very close to human reasoning, and is the only machine learning tool which can produce human-comprehensible rules.

SLIDE 18

Decision Tree

  • A decision tree consists of leaf nodes and non-leaf nodes.
  • Each leaf node of the decision tree represents a decision (outcome), whereas each non-leaf node corresponds to an input attribute, with each branch being a possible value of the attribute.
  • Given a decision tree generated from the training samples, a new data instance can be classified by starting at the root node of the decision tree, testing the attribute specified by this node and moving down the tree branch corresponding to the value of the attribute.
  • This process is then repeated until a leaf node (a decision) is reached.
  • Once trained, a set of decision rules in 'if-then' format can be derived for decision making.

SLIDE 19

Decision Tree

  • DT is an intuitive top-down approach.
  • DT is used by human beings for day-to-day decision making.

SLIDE 20

Learning Mechanism

SLIDE 21

Semantic Templates

  • 19 concepts are selected for training; 30 training sample regions are collected for every concept.
  • Semantic templates (ST) are generated as the centroid of the low-level features of the 30 sample regions.
  • Using the STs, the continuous-valued color and texture features are converted to color and texture labels which are discrete in value.
  • The labels are used as discrete attribute values for input to the decision tree.
  • The decision tree so derived is called DT-ST.
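The template construction and label conversion above can be sketched in a few lines. A minimal illustration, assuming feature vectors are NumPy arrays and labelling a region by its nearest template under Euclidean distance (the function names are mine):

```python
import numpy as np

def build_templates(samples):
    """samples: {concept: (n_regions, n_features) array}.
    The semantic template of a concept is the centroid of its sample features."""
    return {c: feats.mean(axis=0) for c, feats in samples.items()}

def discretize(feature, templates):
    """Map a continuous feature vector to the label of the nearest template,
    turning it into a discrete attribute value for the decision tree."""
    return min(templates, key=lambda c: np.linalg.norm(feature - templates[c]))
```

A region whose color feature sits near the "sky" centroid thus receives the discrete label "sky" regardless of its exact continuous values.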

SLIDE 22

Decision Criteria

  • Start with 19 x 30 = 570 sample regions.
  • Every region is described by a set of three discrete attributes: color label, texture label and color-texture (CT) label.
  • Calculate the gain of each attribute A_i:

Gain(A_i) = H(C) - E(A_i)

where H(C) is the entropy of the training dataset and E(A_i) is the expected information of A_i.
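The gain criterion above is the standard ID3-style information gain; a minimal sketch (my own helper names, entropy in bits):

```python
import numpy as np
from collections import Counter

def entropy(labels):
    """H(C): entropy of the class distribution, in bits."""
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def info_gain(attr_values, labels):
    """Gain(A) = H(C) - E(A), where E(A) is the expected entropy
    of the subsets obtained by splitting the samples on attribute A."""
    n = len(labels)
    e_a = 0.0
    for v in set(attr_values):
        subset = [c for a, c in zip(attr_values, labels) if a == v]
        e_a += len(subset) / n * entropy(subset)
    return entropy(labels) - e_a
```

An attribute that perfectly separates the classes achieves the full gain H(C); an attribute independent of the classes achieves gain 0.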

SLIDE 23

Learning Image Semantics Using DT-ST

  • Gain(Color) = 2.46
  • Gain(Texture) = 2.01
  • Gain(CT) = 2.90
  • CT has the highest information gain; it requires the least amount of information to split the dataset for decision making.
  • Therefore CT is selected as the root attribute of the decision tree.

SLIDE 24

Learning Image Semantics Using DT-ST

  • The dataset is then divided into 19 subsets corresponding to each of the 19 possible values of CT.
  • The 19 subsets so formed are then pre-pruned to remove the data samples whose class probability is less than a threshold k (10%).

SLIDE 25

Learning Image Semantics Using DT-ST

  • Subsets corresponding to CT values 2, 4, 5, 7, 9, 10 and 11 are homogeneous.
  • Therefore, branches with CT labels 2, 4, 5, 7, 9, 10 and 11 end up as leaf nodes with the corresponding outcomes: Blue sky, Flower, Sunset, Firework, Ape fur, Eagle and Building, respectively.

SLIDE 26

Learning Image Semantics Using DT-ST

First half of the tree

SLIDE 27

Learning Image Semantics Using DT-ST

  • The subsets corresponding to the rest of the CT labels are declared as non-leaf nodes.
  • These subsets require further splitting by using other attributes.
  • Attribute CT is removed from the attribute set A since it has been used.
  • The tree induction process is recursively applied to each of the non-leaf node subsets.

SLIDE 28

Pre-pruning

  • During each repeat, the tree is pre-pruned to remove the data samples whose class probability is less than a threshold k.

SLIDE 29

Post-pruning

  • The tree so formed has leaf nodes labelled 'unknown'.
  • However, a tree with too many 'unknown' outcomes fails to classify many data instances.
  • It is necessary to post-prune the 'unknown' leaves.
  • Replace all 'unknown' leaf nodes with the class label having the highest probability in the immediate parent node.

SLIDE 30

Post-pruning

  • Color = Not 8 is replaced with concept Tiger as it has the highest probability.
  • Similarly, for Color = 8, Texture = Not (7, 8), the subset comprises concepts for which the probability of Tiger is the highest. Therefore the 'unknown' for the branch Texture = Not (7, 8) is changed to 'Tiger'.

SLIDE 31

Final Decision Tree

SLIDE 32

Derived Decision Rules

  • This set of decision rules is the actual classifier or machine.
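As an illustration of what such derived if-then rules look like in code (the labels and splits here are hypothetical; the actual rules come from the trained DT-ST, with homogeneous CT branches ending in leaves and post-pruned branches defaulting to the most probable class):

```python
def classify_region(ct, color, texture):
    """Hypothetical if-then rules of the kind a trained DT-ST yields.
    ct, color, texture are the discrete labels from the semantic templates."""
    if ct == 2:
        return "Blue sky"      # homogeneous CT branch -> leaf (per slide 25)
    if ct == 4:
        return "Flower"
    if ct == 1 and color == 8 and texture in (7, 8):
        return "Grass"         # hypothetical second-level split
    if ct == 1:
        return "Tiger"         # post-pruned 'unknown' branch (per slide 30)
    return "unknown"
```

Each rule is a root-to-leaf path of the tree, which is what makes the classifier human-readable.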
SLIDE 33

Experiments and Results

  • Three different datasets are selected to test the performance of the decision tree.
  • TestSet 1: 19 x 20 = 380 regions.
  • TestSet 2: 19 x 25 = 475 regions.
  • TestSet 3: 19 x 40 = 760 regions.

SLIDE 34

Test on Pre-pruning Threshold

  • In all the test sets, the decision tree obtained gives the best performance when k = 0.1.
  • Pre-pruning effectively prevents tree fragmentation and noise.

SLIDE 35

Performance Test with Post-pruning

  • The 'unknown' outcomes are replaced with the highest probability class label.
  • The average classification accuracy is improved by about 24.7% by using post-pruning.

SLIDE 36

Test of False Positive Error

  • 50 regions irrelevant to any of the 19 concepts are selected to test false positive classification.
  • Overall, 82% of the 50 irrelevant regions are categorized as irrelevant to the training concepts.
  • The false positive error is 18%.

SLIDE 37

Comparison of DT-ST with ID3 and C4.5

  • We compare the classification accuracy of DT-ST with ID3 and C4.5.
  • ID3 and C4.5 are implemented using the WEKA machine learning package. (http://www.cs.waikato.ac.nz/ml/weka/)

Classification accuracies for the three datasets:

Tree induction method                TestSet1  TestSet2  TestSet3
DT-ST                                0.768     0.758     0.707
C4.5 (discrete attribute values)     0.768     0.743     0.704
C4.5 (continuous attribute values)   0.684     0.667     0.626
ID3 (discrete attribute values)      0.642     0.623     0.642

SLIDE 38

Retrieval Performance of DT-ST

  • The commonly used Corel image database is selected for the image retrieval test.
  • The Corel database consists of 5,000 images in 50 categories.
  • JSEG segmentation produces 29,187 regions from these images, an average of 5.84 regions per image.
  • Average Precision (P) of 40 queries is obtained at each level of Recall (R = 10, 20, ..., 100%).
  • The 40 query images are selected from the 50 categories, excluding those with very abstract category labels such as 'Australia'.

SLIDE 39

Performance of DT-ST on Image Retrieval

  • The proposed RBIR system supports query by regions.
  • The user specifies the dominant regions in the query image.
  • Using DT-ST, the system first finds a list of images that contain regions with the same concept as that of the query.
  • Then, based on their low-level color and texture features, these images are ranked according to their EMD distances to the query image.
  • This is named retrieval with concepts.
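The two-stage "retrieval with concepts" above can be sketched as follows. This is a hypothetical skeleton: `distance` stands in for the EMD computation on region signatures, and the database layout is my own illustration:

```python
def retrieve_with_concepts(query_concepts, query_sig, database, distance):
    """Two-stage region-based retrieval.
    database: iterable of (image_id, concepts, signature) tuples.
    Stage 1: keep only images sharing a semantic concept with the query.
    Stage 2: rank the survivors by low-level distance (EMD in the paper)."""
    candidates = [(img_id, sig) for img_id, concepts, sig in database
                  if set(query_concepts) & set(concepts)]  # semantic filter
    return [img_id for img_id, sig in
            sorted(candidates, key=lambda t: distance(query_sig, t[1]))]
```

With toy scalar signatures and absolute difference as the distance, an image with no shared concept is filtered out before ranking ever happens.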

SLIDE 40

Performance of DT-ST on Image Retrieval

SLIDE 41

Retrieval Examples

SLIDE 42

Integration of DT-ST with Google – SIEVE

  • Currently, text-based image search engines are not based on CBIR.
  • The textual description in existing search engines may not capture image content.
  • We propose to integrate the existing text-based image search engine with visual features.
  • A post-filtering algorithm is proposed, called SIEVE – Search Images Effectively through Visual Elimination.
  • Practical fusion methods are also proposed to integrate SIEVE with contemporary text-based search engines.

SLIDE 43

SIEVE – The Idea

  • The idea of SIEVE is very similar to object classification done by a human being.
  • First, objects of interest are roughly distinguished from other very different objects, either manually or through certain hand tools (Google in this case).
  • Then, the collected objects are subject to visual inspection (SIEVE in this case) to separate each object of interest from unwanted objects.

SLIDE 44

SIEVE – The Approach

  • In our approach, text-based image search results for a given query are obtained in the first step.
  • SIEVE is then used to filter out those images which are semantically irrelevant to the query.
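The post-filtering step above reduces to a simple membership test once each downloaded image has been segmented and its regions labelled by DT-ST. A hypothetical sketch (the data layout is mine):

```python
def sieve(query_concept, results, image_concepts):
    """Post-filter a text-search result list: keep only images whose
    learned region concepts include the query concept, preserving the
    original (text-based) ranking of the survivors."""
    return [url for url in results
            if query_concept in image_concepts.get(url, set())]
```

Images the semantic classifier knows nothing about are dropped, which is exactly the "visual elimination" the name refers to.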

SLIDE 45

SIEVE – The System

SLIDE 46

Experiment – Image Collections

  • To test the retrieval performance of SIEVE, 10 queries are selected, including mountain, beach, building, firework, flower, forest, snow, sunset, tiger and sea.
  • Google image search can return up to thousands of images for a query; however, users are usually only interested in the first few pages.
  • Therefore, for each query, the top 100 images are downloaded from the first 5 pages.

SLIDE 47

Experiment – Measurement

  • In the Web image search scenario, it is not known how many relevant images there are in the database for a given query.
  • The bull's eye measurement is used.
  • The bull's eye score measures the retrieval precision among the top K retrieved images.
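The bull's eye measurement described above is precision-at-K; a minimal sketch (my own function name):

```python
def bulls_eye_precision(retrieved, relevant, k):
    """Precision among the top-k retrieved items: |top-k intersect relevant| / k."""
    top_k = retrieved[:k]
    return sum(1 for item in top_k if item in relevant) / k
```

Unlike recall, it needs no knowledge of how many relevant images exist in total, which is why it suits the Web search setting.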

SLIDE 48

Results – Retrieval Accuracy

Figure: average retrieval precision for the 10 image concepts, SIEVE vs. Google, plotted against K = 10, 20, 30, 40, 50 (precision roughly in the range 0.6 to 1).

As more images are retrieved, SIEVE shows a significant gain over Google.

SLIDE 49

Results – Retrieval Examples

Above: search result by Google using query 'Tiger'. Left: result by SIEVE using the same query 'Tiger'.

SLIDE 50

Results – Retrieval Examples

Above: search result by Google using query 'Snow'. Left: result by SIEVE using the same query 'Snow'.

SLIDE 51

Results – Retrieval Examples

Above: search result by Google using query 'Firework'. Right: result by SIEVE using the same query 'Firework'.

SLIDE 52

Method of Integration

  • Scenario 1 – SIEVE is installed on the server. The user sends an image search query via a Web browser. The search engine returns the SIEVEd images to the user.
  • Scenario 2 – SIEVE is integrated with the Web browser as a plug-in. A user query is directed by SIEVE to the search engine. The returned list is subject to SIEVE.
  • Scenario 3 – SIEVE is used as application software. SIEVE directs the user query to various Web image search engines. The returned lists from the search engines are further SIEVEd.

SLIDE 53

Issues

  • Significant time is spent on image segmentation and computing image semantics. This can be solved by indexing image semantics upfront in image search engines.
  • Although a limited concept set is used to test its performance, the decision tree can accommodate more semantic concepts.
  • SIEVE can be applied more effectively if images in the database are first classified into categories.

SLIDE 54

Conclusions

  • A semantic image retrieval approach using decision tree learning has been proposed.
  • The key characteristics of the DT-ST are discrete attribute inputs, pre-pruning and post-pruning.
  • DT produces human-comprehensible rules, which no other machine learning tool can do.
  • Test results on concept learning show the proposed DT-ST outperforms existing DT techniques.
  • Experimental results on image retrieval show the DT-ST is promising and gives better results than the conventional CBIR technique.
  • Application of DT-ST on the WWW has also been tested and has shown promising results.

SLIDE 55

Future Work

  • The system will be extended to learn a much larger number of concepts for practical semantic image retrieval.
  • The system will be improved to accept keyword search, to create a prototype Image Google.
  • Various query interfaces will be investigated, including relevance feedback, spatial queries, etc.

SLIDE 56

Publications from This Research

1. Y. Liu, D. S. Zhang and G. Lu, "Region-Based Image Retrieval with High-Level Semantics using Decision Tree Learning", accepted in Pattern Recognition, Dec. 2007.
2. Y. Liu, D. S. Zhang and G. Lu, "Narrowing Down The 'Semantic Gap' in Content-Based Image Retrieval – A Survey", Pattern Recognition, 40(1):262-282, 2007.
3. Y. Liu, D. S. Zhang and G. Lu, "SIEVE – Search Images Effectively through Visual Elimination", Lecture Notes in Computer Science, Springer, ISBN 978-3-540-73416-1, Vol. 4577, pp. 381-390, Intl. Workshop on Multimedia Content Analysis and Mining (MCAM07), Weihai, China, 30 June-1 July, 2007.
4. Y. Liu, D. S. Zhang, G. Lu and A. Tan, "Integrating Semantic Templates with Decision Tree for Image Semantic Learning", Lecture Notes in Computer Science, Springer Berlin/Heidelberg, ISSN 0302-9743, (MMM07), 4352:185-195, 2007.
5. Y. Liu, D. S. Zhang, G. Lu and W.-Y. Ma, "Study on Texture Feature Extraction in Region-based Image Retrieval System", in Proc. of IEEE International Conf. on Multimedia Modeling (MMM06), pp. 264-271, Beijing, Jan. 4-6, 2006.
6. Y. Liu, D. S. Zhang and G. Lu, "Deriving High-Level Concepts Using Fuzzy-ID3 Decision Tree for Image Retrieval", in Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP05), pp. 501-504, Philadelphia, PA, USA, March 18-23, 2005.
7. Y. Liu, D. S. Zhang, G. Lu and W.-Y. Ma, "Region-Based Image Retrieval with High-Level Semantic Color Names", in Proc. of IEEE 11th International Multi-Media Modelling Conference (MMM05), pp. 180-187, Melbourne, Australia, January 12-14, 2005.
8. Y. Liu, D. S. Zhang, G. Lu and W.-Y. Ma, "Region-based Image Retrieval with Perceptual Colors", in Proc. of 5th Pacific-Rim Conference on Multimedia (PCM04), Tokyo, Japan, Nov. 30-Dec. 3, 2004.
9. Y. Liu, W. Ma, D. S. Zhang and G. Lu, "An Efficient Texture Feature Extraction Algorithm for Arbitrary-Shaped Regions", in Proc. of IEEE 7th International Conference on Signal Processing (ICSP04), Beijing, China, Aug. 31-Sept. 4, Vol. 2, pp. 1037-1040, 2004.

SLIDE 57

References

  • Y. Deng and B. S. Manjunath, "Unsupervised Segmentation of Color-Texture Regions in Images and Video", IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), 23(8):800-810, 2001.
  • B. S. Manjunath and W. Y. Ma, "Texture Features for Browsing and Retrieval of Large Image Data", IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8):837-842, 1996.
  • Y. Rubner, C. Tomasi and L. J. Guibas, "A Metric for Distributions with Applications to Image Databases", in Proc. of IEEE Inter. Conf. on Computer Vision (ICCV'98), pp. 59-67, Jan. 1998.
  • http://www.cs.waikato.ac.nz/ml/weka/, accessed in Feb. 2008.