Adaptable Similarity Search using NonRelevant Information

Ashwin T V, Rahul Gupta, Sugata Ghosal
{vashwin, rahulgup, gsugata}@in.ibm.com
IBM India Research Lab, New Delhi


SLIDE 1

Application of Similarity Search with Relevance Feedback

Shopper's workflow (figure labels):

  • Shopper starts by querying for cars by Ford (parametric search result).
  • Shopper wishes to see more cars like this SUV (similarity search).
  • Shopper marks likes and dislikes in the similarity search result (shopper's relevance judgement): she prefers the Toyota Sequoia to the others.
  • System presents the shopper with cars matching her feedback (similarity search with relevance feedback; results using the buyer's relevance judgements).

SLIDE 2

MindReader's Formulation for Convex Query Concepts

To find the parameters of the generalized quadratic distance

$$D(x) = (x - q)^T M (x - q)$$

Given:
  • G: feature vectors of Relevant objects.
  • B: feature vectors of NonRelevant objects.

Solution: solve for the parameters $q$ and $M$.
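As a small illustration of the generalized quadratic distance, the sketch below evaluates $D(x) = (x - q)^T M (x - q)$ for three choices of $M$. The function name `quadratic_distance` and the example matrices are illustrative and do not come from the slides. A diagonal $M$ corresponds to a weighted Euclidean distance, while a full symmetric positive definite $M$ gives an ellipsoid with arbitrary orientation.

```python
def quadratic_distance(x, q, M):
    """Return D(x) = (x - q)^T M (x - q) for plain Python lists."""
    d = [xi - qi for xi, qi in zip(x, q)]
    # Md is the matrix-vector product M d
    Md = [sum(m_rc * d_c for m_rc, d_c in zip(row, d)) for row in M]
    return sum(d_r * md_r for d_r, md_r in zip(d, Md))


q = [0.0, 0.0]
x = [1.0, 2.0]

identity = [[1.0, 0.0], [0.0, 1.0]]   # squared Euclidean distance
weighted = [[4.0, 0.0], [0.0, 0.25]]  # diagonal M: weighted Euclidean
full = [[2.0, 1.0], [1.0, 2.0]]       # full M: rotated ellipsoid

print(quadratic_distance(x, q, identity))  # 5.0
print(quadratic_distance(x, q, weighted))  # 5.0
print(quadratic_distance(x, q, full))      # 14.0
```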

MindReader Contd....

Starting Query

SLIDE 3

MindReader does NOT consider NonRelevant objects!

Top-K objects retrieved: objects close to NonRelevant objects are also retrieved.
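The effect can be reproduced on toy data. In the sketch below (hypothetical data and function name, weighted squared Euclidean distance assumed), the ranking depends only on the query point and the weights, so an object sitting next to a NonRelevant point is still returned in the top K.

```python
def top_k(database, q, weights, k):
    """Return indices of the k objects nearest to q (weighted squared Euclidean)."""
    def dist(i):
        return sum(w * (xi - qi) ** 2 for w, xi, qi in zip(weights, database[i], q))
    return sorted(range(len(database)), key=dist)[:k]


database = [[0.1, 0.0], [0.0, 0.2], [0.9, 0.0], [3.0, 3.0]]
nonrelevant = [1.0, 0.0]  # disliked by the user, but never used by the ranking
q = [0.0, 0.0]

result = top_k(database, q, [1.0, 1.0], 3)
print(result)  # [0, 1, 2]: object 2 lies right next to the NonRelevant point
```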

Adapting MindReader to Incorporate NonRelevant Objects ...

  • Top-K objects retrieved
  • Ellipsoid with radius c

SLIDE 4
Modified Formulation Incorporating NonRelevant Objects..

Figure labels: convex hull (CH) of relevant points; expand CH; maximally expanded CH; approximated by a piecewise-linear surface.

Two subproblems:
  • Decision surface
  • Distance metric: use MindReader's solution with only Relevant objects

SLIDE 5

Constructing the Piecewise Linear Decision Surface ..

Solve: to determine $p_i$, the point of CH closest to the NonRelevant point $b_i$.

Computation of CH not needed.

Expressions for $H_i^+$, $H_i^-$.
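One way to see why the convex hull never has to be built: the closest hull point is a convex combination of the Relevant points, so an iterative method that only takes dot products with those points is enough. The sketch below uses the Frank-Wolfe method, picked here for brevity and not necessarily the solver used by the authors. The function name `closest_point_in_hull` and the toy data are illustrative.

```python
def closest_point_in_hull(points, b, iters=1000, tol=1e-12):
    """Frank-Wolfe iteration for min ||p - b||^2 with p in the convex hull of `points`.

    The hull is never constructed: each step only needs dot products with
    the given points.
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    p = list(points[0])  # any input point is a valid starting point
    for _ in range(iters):
        r = [pi - bi for pi, bi in zip(p, b)]      # gradient direction p - b
        s = min(points, key=lambda g: dot(r, g))   # best vertex for the linearized objective
        d = [si - pi for si, pi in zip(s, p)]
        gap = -dot(r, d)                           # Frank-Wolfe duality gap, always >= 0
        if gap <= tol:
            break
        step = min(1.0, gap / dot(d, d))           # exact line search clipped to [0, 1]
        p = [pi + step * di for pi, di in zip(p, d)]
    return p


relevant = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # unit square
p_out = closest_point_in_hull(relevant, [2.0, 0.5])  # NonRelevant point outside the hull
p_in = closest_point_in_hull(relevant, [0.5, 0.5])   # NonRelevant point inside the hull
print(p_out)  # [1.0, 0.5]
print(p_in)   # [0.5, 0.5]: the closest point coincides with the NonRelevant point
```

Frank-Wolfe converges slowly on hard instances, so a production system would more likely hand the same quadratic program to a dedicated QP solver.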

Summary of the Proposed Technique..

Figure: Estimated Relevant Region (ERR)
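The retrieval step can be sketched as a halfspace-intersection test followed by a distance ranking. The slides do not give the hyperplane expressions, so the code makes a loud assumption: hyperplane $i$ passes through the NonRelevant point $b_i$ with normal $b_i - p_i$, where $p_i$ is the closest hull point. All names and data below are hypothetical.

```python
def in_err(x, pairs):
    """Return True if x is on the Relevant side of every hyperplane.

    `pairs` holds (p_i, b_i): the closest hull point and its NonRelevant point.
    ASSUMPTION (not from the slides): hyperplane i passes through b_i with
    normal b_i - p_i, so the Relevant side is normal . (x - b_i) < 0.
    """
    for p, b in pairs:
        normal = [bi - pi for bi, pi in zip(b, p)]
        side = sum(n * (xi - bi) for n, xi, bi in zip(normal, x, b))
        if side >= 0:
            return False
    return True


def retrieve(database, q, weights, pairs, k):
    """Rank only the objects inside the ERR by weighted squared distance to q."""
    def dist(i):
        return sum(w * (xi - qi) ** 2 for w, xi, qi in zip(weights, database[i], q))
    inside = [i for i, x in enumerate(database) if in_err(x, pairs)]
    return sorted(inside, key=dist)[:k]


database = [[0.1, 0.0], [0.0, 0.2], [0.9, 0.0], [3.0, 3.0]]
pairs = [([0.5, 0.0], [0.8, 0.0])]  # one (p_i, b_i) pair
result = retrieve(database, [0.0, 0.0], [1.0, 1.0], pairs, 3)
print(result)  # [0, 1]: objects 2 and 3 fall outside the ERR
```

With K = 3 only two objects are returned here, because the ERR holds fewer than K database points.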

SLIDE 6

Addressing Some Issues.

Some NonRelevant points may lie inside CH
  • We do not construct hyperplanes for these points
  • Closest point $p_i$ = NonRelevant point $b_i$

Insufficient (<K) database points in Relevant region
  • Either retrieve < K or retrieve from NonRelevant region
  • Likely due to #NonRelevant points >> #Relevant points

Small number of Relevant points
  • Not possible to robustly estimate generalised Euclidean metric
  • Use weighted Euclidean metric (MARS)
  • Algorithm to find piecewise linear decision surface is robust
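With few Relevant points, a full matrix $M$ cannot be estimated reliably because the sample covariance is singular when there are fewer points than features, whereas a diagonal metric only needs one variance per feature. The sketch below assumes inverse-variance weights, a common choice in relevance-feedback systems. The slides do not state the exact weighting formula, and the function name and the `eps` guard are illustrative.

```python
def inverse_variance_weights(relevant, eps=1e-9):
    """Per-feature weights 1 / (variance + eps) estimated from Relevant points.

    ASSUMPTION: inverse-variance weighting; eps avoids division by zero when
    a feature is constant across the Relevant points.
    """
    n = len(relevant)
    dims = len(relevant[0])
    weights = []
    for j in range(dims):
        column = [x[j] for x in relevant]
        mean = sum(column) / n
        variance = sum((v - mean) ** 2 for v in column) / n
        weights.append(1.0 / (variance + eps))
    return weights


relevant = [[1.0, 10.0], [3.0, 30.0]]
weights = inverse_variance_weights(relevant)
print(weights)  # roughly [1.0, 0.01]: the tightly clustered feature counts more
```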

Datasets Used in Experiments.

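|                      | CAR                                      | LETTER                      | DIGITS       |
|----------------------|------------------------------------------|-----------------------------|--------------|
| # Features           | 24 Numerical                             | 16 Numerical                | 16 Numerical |
| # Classes            | 21 Vehicle types                         | 26 letters                  | 10 digits    |
| # Items              | 1270 Cars                                | 2600                        | 1000         |
| Examples of classes  | Midsize Sedan, Fullsize SUV, Sport Coupe | Letters of English alphabet | Digits 0-9   |
| Examples of features | Price, HP, Highway Econ, Weight, Length  |                             |              |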

  • LETTER & DIGITS obtained from UCI ML repository
  • CAR dataset obtained from online automobile stores

SLIDE 7

Experimental Setup

Algorithms experimentally compared

  A. Proposed algorithm with weighted Euclidean metric
  B. MindReader with weighted Euclidean distance metric (MARS)

Incremental Classification experiments

  • To retrieve objects from a particular class
  • Objects for feedback chosen randomly from retrieved set
  • Retrieved target class objects marked Relevant
  • Other retrieved objects marked NonRelevant

$$\text{Precision} = \frac{\#\ \text{target class objects retrieved}}{\#\ \text{objects retrieved}}$$

$$\text{Recall} = \frac{\#\ \text{target class objects retrieved}}{\#\ \text{target class objects}}$$
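The two measures can be computed directly from the class labels of the retrieved objects. The sketch below uses made-up labels and a hypothetical function name.

```python
def precision_recall(retrieved_labels, target, n_target_in_db):
    """Precision and recall of one retrieved set with respect to a target class."""
    hits = sum(1 for label in retrieved_labels if label == target)
    precision = hits / len(retrieved_labels) if retrieved_labels else 0.0
    recall = hits / n_target_in_db if n_target_in_db else 0.0
    return precision, recall


# 4 objects retrieved, 3 of them SUVs, and 6 SUVs in the whole database
precision, recall = precision_recall(["SUV", "Sedan", "SUV", "SUV"], "SUV", 6)
print(precision, recall)  # 0.75 0.5
```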

Parameters

  • # Relevance Feedback (RF) iterations
  • # Feedback (Relevant + NonRelevant) per RF iteration

Car Dataset Results

SLIDE 8

Letter Dataset Results

Digits Dataset Results

SLIDE 9

Avg class size: CAR 5%, LETTER 4%, DIGITS 10%
Avg size of ERR = avg fraction of database points in ERR
Avg accuracy of ERR = avg fraction of target class objects in ERR

Results with Different Sizes of Feedback

SLIDE 10

Query Time = Time for Parameter Estimation + Ranking Time.

Conclusions and Future Work..

  • Novel algorithm to handle NonRelevant feedback robustly
  • Improved performance over three real datasets
  • Can incorporate many previously proposed distance metrics
  • Comparison with pattern classifiers (SVM)
  • Experiments with higher dimensional multimedia datasets
  • Indexing structure to answer intersection of halfspace query efficiently
  • Extend to Non-Convex query concepts
  • Reduce time complexity of closest point in CH optimization