iLike: Integrating Visual and Textual Features for Vertical Search (PowerPoint PPT Presentation)


SLIDE 1

A KTEC Center of Excellence 1

iLike: Integrating Visual and Textual Features for Vertical Search

Yuxin Chen1, Nenghai Yu2, Bo Luo1, Xue-wen Chen1

1 Department of Electrical Engineering and Computer Science

The University of Kansas, Lawrence, KS, USA

2 Department of Electrical Engineering and Information Sciences

University of Science and Technology of China, Hefei, China

SLIDE 2

Motivation

  • The problem
  • A huge amount of multimedia information is available
  • Browsing and searching multimedia is even harder than searching text
  • Text-based image search
SLIDE 3

Motivation

  • Text-based image search
  • Adopted by most image search engines

– Efficient: text-based index
– Text similarity, PageRank

  • Some queries work very well

– Clearly labeled images
– Distinct keywords

  • Some queries don’t

– Insufficient tags
– Gap between tag terms and query terms
– Descriptive queries: “paintings of people wearing capes”

SLIDE 4

Motivation

  • Content-based Image Retrieval (CBIR)
  • Visual features: color, texture, shape…
  • Semantic gap

– Low-level visual features vs. image content
– sun -> nice sunshine -> a beautiful day

  • Excessive computation: high-dimensional indexing?
SLIDE 5

Motivation

  • Put textual and visual features together?
  • In the literature: hybrid approaches
  • Text-based search: candidates
  • CBIR-based re-ranking or clustering
  • Our idea
  • Connect textual features (keywords) with visual features
  • Represent keywords in the visual feature space

– Learn users’ visual perception for keywords

SLIDE 6

Preliminaries

  • Data set
  • Vertical search: online shopping for apparel and accessories
  • Text contents are better organized
  • We can associate keywords and images with higher confidence
  • In this domain, text description and images are both important
  • Data collection
  • Focused crawling: 20K items from six online retailers

– Mid-sized, high-quality images with text descriptions

  • Feature extraction

– 263 low-level visual features: color, texture, and shape
– Normalization
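The slides do not specify the normalization scheme, so the sketch below assumes simple per-feature min-max scaling (toy values, hypothetical function name) to put all 263 features on a comparable [0, 1] scale:

```python
# Hypothetical per-feature min-max normalization for the visual
# feature matrix (one row per item, one column per feature).
def min_max_normalize(matrix):
    """Normalize each feature (column) independently to [0, 1]."""
    cols = list(zip(*matrix))
    normalized_cols = []
    for col in cols:
        lo, hi = min(col), max(col)
        span = hi - lo or 1.0  # avoid division by zero for constant features
        normalized_cols.append([(v - lo) / span for v in col])
    return [list(row) for row in zip(*normalized_cols)]

# Toy 3-item, 2-feature matrix standing in for the real 20K x 263 data.
raw = [[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]]
norm = min_max_normalize(raw)
```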

SLIDE 7

Representing keywords

  • Keywords
  • Image -> Human perception -> text description
  • Perception is subjective; the same impression can be described through different words
  • Calculating text similarity (or distance) is difficult: distance measurements (such as cosine distance in TF/IDF space) do NOT perfectly represent the distances in human perception

SLIDE 8

Representing keywords

  • Items that share the same keyword(s) may also share some consistency in selected visual features
  • If such consistency is observed over a significant number of items described by the same keyword, that set of features and their values may represent the human “visual” perception of the keyword

SLIDE 9

Representing keywords

  • Example: checked
SLIDE 10

Representing keywords

  • Example: floral
SLIDE 11

Representing keywords

  • For each term, we have
  • Positive set: items described by the term
  • Negative set: items not described by the term
  • “Good” features
  • are coherent with the human perception of the keyword
  • have consistent values in the positive set
  • show different distributions in the positive and negative sets
  • How do we identify “good” features for each keyword?
  • Compare the distributions in the positive and negative sets…
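The positive/negative split described above can be sketched as follows; the catalog, keywords, and feature values are hypothetical toy data, not the paper's dataset:

```python
# Hypothetical item catalog: each item has a set of descriptive
# keywords and a (toy, 3-dimensional) visual feature vector.
items = [
    ({"floral", "dress"},  [0.8, 0.2, 0.5]),
    ({"floral", "skirt"},  [0.7, 0.3, 0.6]),
    ({"checked", "shirt"}, [0.1, 0.9, 0.4]),
    ({"plain", "shirt"},   [0.2, 0.8, 0.5]),
]

def split_sets(term, catalog):
    """Positive set: feature vectors of items described by the term.
    Negative set: feature vectors of all remaining items."""
    positive = [feats for kws, feats in catalog if term in kws]
    negative = [feats for kws, feats in catalog if term not in kws]
    return positive, negative

pos, neg = split_sets("floral", items)
```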
SLIDE 12

Representing keywords

  • Distribution of visual features (term=“floral”)
SLIDE 13

Kolmogorov-Smirnov test

  • Two-sample K-S test
  • Identifies whether two data sets come from the same distribution
  • Makes no assumptions about the distribution
  • Null hypothesis: the two samples are drawn from the same distribution
  • P-value: measures the confidence of the comparison result under the null hypothesis
  • Higher p-value -> accept the null hypothesis -> insignificant difference between the positive and negative sets -> “bad” feature
  • Lower p-value -> reject the null hypothesis -> statistically significant difference between the positive and negative sets -> “good” feature
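A minimal pure-Python sketch of the two-sample K-S statistic applied to one feature; the sample distributions below are invented for illustration (a "good" feature differs between the sets, so its D is large):

```python
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum
    vertical distance between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = sum(1 for v in a if v <= x) / len(a)
        cdf_b = sum(1 for v in b if v <= x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

random.seed(42)
# Hypothetical values of one visual feature for items tagged with a
# keyword (positive set) versus items without it (negative set).
positive = [random.gauss(0.7, 0.1) for _ in range(200)]
negative = [random.gauss(0.4, 0.2) for _ in range(200)]

d_good = ks_statistic(positive, negative)  # distinct distributions -> large D
d_same = ks_statistic(positive, positive)  # identical samples -> D = 0
```

In practice a library routine such as SciPy's `ks_2samp` would replace this O(n²) loop; the hand-rolled version just makes the empirical-CDF comparison explicit.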

SLIDE 14

Weighting visual features

  • The inverted p-value of the Kolmogorov-Smirnov test could be used as the weight for the feature
  • “floral”:
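A sketch of turning K-S p-values into per-feature weights. The one-term asymptotic p-value approximation and all feature names and values here are assumptions for illustration, not the authors' exact computation:

```python
import math

def ks_pvalue(d, n, m):
    """One-term asymptotic approximation to the two-sample K-S
    p-value: p ~ 2 * exp(-2 * d^2 * n*m/(n+m)), capped at 1."""
    n_eff = n * m / (n + m)
    return min(1.0, 2.0 * math.exp(-2.0 * d * d * n_eff))

# Hypothetical per-feature p-values for one keyword: inverting the
# p-value (1 - p) gives high weight to features whose positive and
# negative distributions differ significantly.
p_values = {"color_hist_3": 1e-6, "texture_energy": 0.8, "shape_moment_1": 0.03}
weights = {feature: 1.0 - p for feature, p in p_values.items()}
```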
SLIDE 15

Weighting visual features

  • More examples: “shades”
SLIDE 16

Weighting visual features

  • More examples: “cute”
SLIDE 17

Query expansion and search

  • The user employs text-based search to obtain an initial set
  • For each item in the initial set:
  • Load the corresponding weight vector for each keyword
  • Obtain an expanded weight vector from the textual description
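The expansion step could be sketched as below; the averaging rule, the helper `expand_weights`, and the toy weight vectors are hypothetical stand-ins for however the slides' system combines per-keyword vectors:

```python
# Hypothetical learned weight vectors (one per keyword) over a toy
# 3-dimensional visual feature space.
keyword_weights = {
    "floral": [0.9, 0.1, 0.6],
    "red":    [0.7, 0.2, 0.1],
}

def expand_weights(description, table):
    """Average the weight vectors of all keywords in the item's
    description that have a learned representation, so features
    that are 'good' for several keywords dominate the query."""
    vecs = [table[k] for k in description if k in table]
    if not vecs:
        return None
    dims = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dims)]

expanded = expand_weights(["floral", "red", "dress"], keyword_weights)
```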

SLIDE 18

Query expansion and search

  • Query: “floral”
  • Initial set:
SLIDE 19

Query expansion and search

  • CBIR-query vectors
SLIDE 20

Query expansion and search

  • iLike-query vectors
SLIDE 21

Results


“Floral”

SLIDE 22

Results

  • iLike: our approach
  • Baseline: pure CBIR
  • Query: “floral”

We are able to infer the implicit user intention behind the query term, identify a subset of visual features that are significant to that intention, and yield better results.

SLIDE 23

Visual thesaurus

  • Statistical similarities of the visual representations of the text terms
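One natural way to measure such similarity is cosine similarity between the keywords' weight vectors in the visual feature space; the terms and values below are toy assumptions, and the slides do not state which similarity measure was used:

```python
import math

# Toy keyword weight vectors (hypothetical values): visually similar
# terms should share their "good" features and thus point in similar
# directions in the feature space.
vectors = {
    "floral":  [0.9, 0.1, 0.6],
    "flowery": [0.85, 0.15, 0.55],
    "checked": [0.1, 0.9, 0.2],
}

def cosine(u, v):
    """Cosine similarity between two weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# High-similarity pairs become neighbors in the visual thesaurus.
sim_close = cosine(vectors["floral"], vectors["flowery"])
sim_far = cosine(vectors["floral"], vectors["checked"])
```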

SLIDE 24

Conclusion and future work

  • iLike: find the “visual perception” of keywords
  • Better recall compared with text-based search
  • Better precision: understand the needs of the users

  • Better “understanding” of keywords: NLP?
  • More features?
  • Segmentation: feature+region?
SLIDE 25

Thank you! Questions?