
Adaptable Similarity Search using NonRelevant Information
Ashwin T V, Rahul Gupta, Sugata Ghosal
{vashwin, rahulgup, gsugata}@in.ibm.com
IBM India Research Lab, New Delhi


  1. Application of Similarity Search with Relevance Feedback
     - Shopper starts by querying for cars by Ford (parametric search result).
     - Shopper wishes to see more cars like this SUV (similarity search).
     - Shopper prefers the Toyota Sequoia to the others and marks her likes and dislikes (similarity search result and shopper's relevance judgement).
     - System presents the shopper with cars matching her feedback (similarity search with relevance feedback; results using the buyer's relevance judgements).

  2. MindReader's Formulation for convex query concepts
     - To find the parameters of the generalized quadratic distance $D(x) = (x - q)^T M (x - q)$
     - Given: G, the feature vectors of Relevant objects; B, the feature vectors of NonRelevant objects.
     - Solve: [optimization problem not recovered from the slide]
     - Solution: [closed form not recovered from the slide]

     MindReader Contd....
     - [Figure: Relevant (+) and NonRelevant (-) points around the starting query x]
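As a minimal sketch of the generalized quadratic distance on this slide, the following evaluates $D(x) = (x - q)^T M (x - q)$ for a given query point q and matrix M. The function name and the example matrices are illustrative assumptions; estimating q and M from feedback is not shown.

```python
def quadratic_distance(x, q, M):
    """Evaluate D(x) = (x - q)^T M (x - q) for plain Python sequences."""
    d = [xi - qi for xi, qi in zip(x, q)]
    # Matrix-vector product M d
    Md = [sum(M[r][c] * d[c] for c in range(len(d))) for r in range(len(d))]
    # Inner product d^T (M d)
    return sum(d[r] * Md[r] for r in range(len(d)))


# With the identity matrix, D reduces to the squared Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]
d_identity = quadratic_distance([3.0, 4.0], [0.0, 0.0], identity)

# A diagonal M stretches one axis and shrinks the other.
diagonal = [[2.0, 0.0], [0.0, 0.5]]
d_diagonal = quadratic_distance([3.0, 4.0], [0.0, 0.0], diagonal)
```

Off-diagonal entries in M rotate the ellipsoidal level sets of D, which is what distinguishes the generalized distance from a per-feature weighting.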

  3. MindReader does NOT consider NonRelevant objects!
     - [Figure: top-K objects retrieved around the Relevant (+) points]
     - Objects close to NonRelevant objects are also retrieved.

     Adapting MindReader to incorporate NonRelevant Objects ...
     - [Figure: top-K objects retrieved inside an ellipsoid with radius c; Relevant (+) and NonRelevant (-) points]

  4. Modified formulation incorporating NonRelevant Objects..
     Two subproblems: Decision Surface and Distance Metric
     - Decision Surface: approximated by a piecewise-linear surface.
     - [Figure labels: convex hull (CH) of relevant points; expand CH; maximally expanded CH]
     - Distance Metric: use MindReader's solution with only the Relevant objects.
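The split into a decision surface and a distance metric can be sketched as a filter followed by a ranking. In this sketch the piecewise-linear surface is represented as a list of halfspaces `(n, c)` meaning `n . x <= c`. That representation, the function names, and the squared Euclidean default are assumptions made for illustration, not details taken from the slides.

```python
def in_region(x, halfspaces):
    """True if x satisfies n . x <= c for every (n, c) pair."""
    return all(
        sum(ni * xi for ni, xi in zip(n, x)) <= c for n, c in halfspaces
    )


def squared_euclidean(x, q):
    """Placeholder metric; any distance D(x, q) could be passed instead."""
    return sum((xi - qi) ** 2 for xi, qi in zip(x, q))


def retrieve_top_k(database, q, halfspaces, k, distance=squared_euclidean):
    """Keep points inside the region, then rank them by distance to q."""
    candidates = [x for x in database if in_region(x, halfspaces)]
    return sorted(candidates, key=lambda x: distance(x, q))[:k]


database = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
halfspaces = [((1.0, 0.0), 1.5)]  # keep points whose first coordinate is <= 1.5
top = retrieve_top_k(database, (2.0, 0.0), halfspaces, k=2)
```

Because the filter runs first, the function may return fewer than k points when the region contains too few database points.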

  5. Constructing the Piecewise Linear Decision Surface ..
     - To determine $p_i$
     - Expressions for $H_i^+$, $H_i^-$
     - Computation of CH not needed
     - Solve: [optimization problem not recovered from the slide]

     Summary of the Proposed Technique..
     - [Figure: Estimated Relevant Region (ERR) enclosing the Relevant (+) points, with NonRelevant (-) points outside]
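One way to picture the construction is to find, for a NonRelevant point b, the closest point p in the convex hull of the Relevant points, and then take a hyperplane with normal b - p. The sketch below approximates p with Frank-Wolfe iterations, which is a stand-in solver chosen for brevity and not necessarily the one the authors used. It needs only the Relevant points themselves, never an explicit hull. The parameter `t`, which places the hyperplane on the segment from p to b, is hypothetical; `t=1.0` puts it through b.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def closest_point_in_hull(points, b, iters=500):
    """Approximate the point of conv(points) nearest to b (Frank-Wolfe)."""
    p = list(points[0])
    for _ in range(iters):
        grad = [pi - bi for pi, bi in zip(p, b)]  # gradient of 0.5*||p - b||^2
        # Vertex that minimizes the linearized objective
        s = min(points, key=lambda g: dot(g, grad))
        d = [si - pi for si, pi in zip(s, p)]
        dd = dot(d, d)
        if dd == 0.0:
            break
        # Exact line search along d, clipped so p stays inside the hull
        step = max(0.0, min(1.0, -dot(grad, d) / dd))
        if step == 0.0:
            break
        p = [pi + step * di for pi, di in zip(p, d)]
    return p


def separating_halfspace(p, b, t=1.0, tol=1e-9):
    """Return (n, c) with the kept side n . x <= c, or None if b coincides with p."""
    n = [bi - pi for bi, pi in zip(b, p)]
    if dot(n, n) <= tol:
        return None  # b lies inside the hull: no hyperplane is built
    anchor = [pi + t * ni for pi, ni in zip(p, n)]
    return n, dot(n, anchor)


relevant = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
p_out = closest_point_in_hull(relevant, (2.0, 0.5))
h_out = separating_halfspace(p_out, (2.0, 0.5))
p_in = closest_point_in_hull(relevant, (0.5, 0.5))
h_in = separating_halfspace(p_in, (0.5, 0.5))
```

Frank-Wolfe converges slowly in general, so a real implementation would likely use a quadratic-programming solver and a looser tolerance when deciding that b lies inside the hull.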

  6. Addressing Some Issues.
     - Some NonRelevant points may lie inside CH
       - Closest point $p_i$ = NonRelevant point $b_i$
       - We do not construct hyperplanes for these points
     - Insufficient (<K) database points in Relevant region
       - Likely due to #NonRelevant points >> #Relevant points
       - Either retrieve < K or retrieve from NonRelevant region
     - Small number of Relevant points
       - Not possible to robustly estimate generalised Euclidean metric
       - Use weighted Euclidean metric (MARS)
       - Algorithm to find piecewise linear decision surface is robust

     Datasets Used in Experiments.

                            CAR                         LETTER             DIGITS
     # Features             24 Numerical                16 Numerical       16 Numerical
     # Classes              21 Vehicle types            26 letters         10 digits
     # Items                1270 Cars                   2600               1000
     Examples of classes    Midsize Sedan,              Letters of         Digits 0-9
                            Fullsize SUV,               English alphabet
                            Sport Coupe
     Examples of Features   Price, HP, Highway Econ,    ---                ---
                            Weight, Length

     - LETTER & DIGITS obtained from UCI ML repository
     - CAR dataset obtained from online automobile stores
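The weighted Euclidean fallback for small numbers of Relevant points can be sketched as a diagonal metric with one weight per feature. The inverse-variance weighting below is an assumption for illustration and is not taken from the slides; the function names and the `eps` guard are likewise hypothetical.

```python
def inverse_variance_weights(relevant, eps=1e-9):
    """One weight per feature: 1 / (variance of that feature over relevant points)."""
    n = len(relevant)
    dims = len(relevant[0])
    means = [sum(x[j] for x in relevant) / n for j in range(dims)]
    variances = [
        sum((x[j] - means[j]) ** 2 for x in relevant) / n for j in range(dims)
    ]
    # eps avoids division by zero when a feature is constant
    return [1.0 / (v + eps) for v in variances]


def weighted_euclidean(x, q, w):
    """Squared weighted Euclidean distance: sum_j w_j * (x_j - q_j)^2."""
    return sum(wj * (xj - qj) ** 2 for wj, xj, qj in zip(w, x, q))


# Feature 1 varies much less than feature 0 among the relevant points,
# so it receives the larger weight.
relevant = [(1.0, 10.0), (3.0, 10.2), (2.0, 9.8)]
w = inverse_variance_weights(relevant)
d = weighted_euclidean((1.0, 0.0), (0.0, 0.0), [2.0, 3.0])
```

A diagonal metric needs only one parameter per feature, which is why it can be estimated from far fewer Relevant points than a full matrix M.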

  7. Experimental Setup
     - Algorithms experimentally compared
       A. Proposed Algorithm with Weighted Euclidean metric
       B. MindReader with Weighted Euclidean Distance metric (MARS)
     - Incremental Classification experiments
       - To retrieve objects from a particular class
       - Objects for feedback chosen randomly from retrieved set
       - Retrieved target class objects marked Relevant
       - Other retrieved objects marked NonRelevant
     - Precision = (# Target class objects retrieved) / (# Objects retrieved)
     - Recall = (# Target class objects retrieved) / (# Target class objects)
     - Parameters
       - # Relevance Feedback (RF) iterations
       - # Feedback (Relevant + NonRelevant) per RF iteration

     Car Dataset Results
     - [Figure: result plots not recovered]
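The precision and recall definitions above translate directly into code. The sketch below assumes each retrieved object carries a class label; the function name, argument names, and example labels are illustrative.

```python
def precision_recall(retrieved_labels, target, num_target_in_db):
    """Precision and recall for one retrieval against a target class."""
    hits = sum(1 for label in retrieved_labels if label == target)
    precision = hits / len(retrieved_labels) if retrieved_labels else 0.0
    recall = hits / num_target_in_db if num_target_in_db else 0.0
    return precision, recall


# 3 of the 4 retrieved objects belong to the target class,
# which has 6 members in the whole database.
retrieved = ["SUV", "SUV", "Sedan", "SUV"]
precision, recall = precision_recall(retrieved, "SUV", num_target_in_db=6)
```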

  8. Letter Dataset Results
     - [Figure: result plots not recovered]

     Digits Dataset Results
     - [Figure: result plots not recovered]

  9. Results with Different Sizes of Feedback
     - Avg Accuracy of ERR = Avg Fraction of Target Class objects in ERR
     - Avg Size of ERR = Avg Fraction of database points in ERR
     - Avg Class Size: CAR 5%, LETTER 4%, DIGITS 10%

  10. Query Time = Time for Parameter Estimation + Ranking time.

      Conclusions and Future Work..
      - Novel algorithm to handle NonRelevant feedback robustly
      - Improved performance over three real datasets
      - Can incorporate many previously proposed distance metrics
      - Comparison with pattern classifiers (SVM)
      - Experiments with higher dimensional multimedia datasets
      - Indexing structure to answer intersection of halfspace query efficiently
      - Extend to Non-Convex query concepts
      - Reduce time complexity of closest point in CH optimization
