SLIDE 1

Learning Concept Taxonomies from Multi-modal Data

Hao Zhang, Zhiting Hu, Yuntian Deng, Mrinmaya Sachan, Zhicheng Yan and Eric P. Xing

Carnegie Mellon University

SLIDE 2

Outline

  • Problem
  • Taxonomy Induction Model
  • Features
  • Evaluation and Analysis

SLIDE 3

Problem

  • Taxonomy induction
    – Input: a set of lexical terms, e.g. {consumer goods, fashion, uniform, neckpiece, handwear, finery, disguise, ...}
    – Output: a taxonomy organizing the terms
  • Why it matters:
    – Human knowledge
    – Interpretability
    – Question answering
    – Information extraction
    – Computer vision
SLIDE 4

Problem

  • Existing taxonomies
    – Knowledge- and time-intensive to build
    – Limited coverage
    – Unavailable

SLIDE 5

Related Works (NLP)

  • Automatic induction of taxonomies
    – Widdows [2003]; Snow et al. [2006]; Yang and Callan [2009]; Kozareva and Hovy [2010]; Poon and Domingos [2010]; Navigli et al. [2011]; Fu et al. [2014]; Bansal et al. [2014]

SLIDE 6

Problem

  • What evidence helps taxonomy induction?
    – Surface features
      • Ends with
      • Contains
      • Suffix match
    – Examples: shark → white shark; bird → bird of prey
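A minimal Python sketch of such surface features; the function and feature names are illustrative, not the paper's exact definitions:

```python
def surface_features(parent: str, child: str) -> dict:
    """Illustrative string-surface cues for a candidate (parent, child) pair."""
    def longest_common_suffix(a: str, b: str) -> int:
        n = 0
        while n < min(len(a), len(b)) and a[-(n + 1)] == b[-(n + 1)]:
            n += 1
        return n

    return {
        # "white shark" ends with "shark" -> evidence it is a kind of shark
        "ends_with": child.endswith(parent),
        # the child term contains the parent term anywhere
        "contains": parent in child,
        # length of the longest common suffix of the two terms
        "suffix_match": longest_common_suffix(parent, child),
    }

print(surface_features("shark", "white shark"))
# {'ends_with': True, 'contains': True, 'suffix_match': 5}
```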
SLIDE 7

Problem

  • What evidence helps taxonomy induction?
    – Semantics from text descriptions
      • Parent-child relation: “seafish, such as shark…”, “rays are a group of seafishes…”
      • Sibling relation [Bansal 2014]: “Either shark or ray…”, “Both shark and ray…”

(Example terms: seafish, shark, ray)
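A hedged sketch of how such lexico-syntactic patterns could be counted over text; the regexes and the helper are illustrative stand-ins for the richer pattern inventories used in practice:

```python
import re

# Illustrative Hearst-style patterns of the kind the slide quotes.
PARENT_CHILD_PATTERNS = [
    r"{parent}\w*, such as {child}",       # "seafish, such as shark..."
    r"{child}s? are a group of {parent}",  # "rays are a group of seafishes..."
]
SIBLING_PATTERNS = [
    r"either {a} or {b}",                  # "Either shark or ray..."
    r"both {a} and {b}",                   # "Both shark and ray..."
]

def count_parent_child(text: str, parent: str, child: str) -> int:
    """Count parent-child pattern hits for a term pair in a text snippet."""
    return sum(len(re.findall(p.format(parent=parent, child=child), text, re.I))
               for p in PARENT_CHILD_PATTERNS)

text = "Seafish, such as shark and ray, live in the sea."
print(count_parent_child(text, "seafish", "shark"))  # 1
print(len(re.findall(SIBLING_PATTERNS[1].format(a="shark", b="ray"),
                     "Both shark and ray are seafish.", re.I)))  # 1
```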

SLIDE 8

Problem

  • What evidence helps taxonomy induction?
    – Semantics from text descriptions
      • Parent-child relation: “seafish, such as shark…”, “rays are a group of seafishes…”
      • Sibling relation [Bansal 2014]: “Either shark or ray…”, “Both shark and ray…”
    – Where the patterns are extracted from:
      • Wikipedia abstracts: term presence and distance, pattern matches
      • Web n-grams

SLIDE 9

Problem

  • What evidence helps taxonomy induction?
    – Word embeddings (word2vec)
    – Projections between parent and child [Fu 2014]

      w(king) − w(queen) ≈ w(man) − w(woman)
      w(seafish) − w(shark) ≈ w(human) − w(woman) ?
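A minimal numpy sketch of the projection idea, fitting a single linear map by least squares so that the projected child embedding lands near its parent's embedding; Fu et al. [2014] actually cluster relation offsets and fit one projection per cluster, and the synthetic vectors here are stand-ins for real word2vec embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50
# Stand-in embeddings for (child, parent) training pairs.
child_vecs = rng.normal(size=(200, dim))
true_map = 0.1 * rng.normal(size=(dim, dim))
parent_vecs = child_vecs @ true_map.T + 0.01 * rng.normal(size=(200, dim))

# Solve min_M ||child_vecs @ M.T - parent_vecs||^2 by least squares.
M_t, *_ = np.linalg.lstsq(child_vecs, parent_vecs, rcond=None)
M = M_t.T

def parent_prediction_distance(child_vec, parent_vec):
    """Small distance between M @ w(child) and w(parent) is evidence
    for a parent-child edge."""
    return np.linalg.norm(M @ child_vec - parent_vec)
```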

SLIDE 10

Motivation

  • How about images?

(Figure: example images of seafish, shark, and ray)

SLIDE 11

Motivation

  • Our motivation
    – Images may include perceptual semantics
    – Jointly leverage text and visual information (from the web)
  • Problems to be addressed:
    – How to design visual features to capture the perceptual semantics?
    – How to design models to integrate visual and text information?

SLIDE 12

Related Works (CV)

  • Building visual hierarchies
    – Chen et al. [2013]; Sivic et al. [2008]; Griffin and Perona [2008]

SLIDE 13

Task Definition

  • Assume a set of N categories x = {x_1, x_2, …, x_N}
    – Each category has a name and a set of images
  • Goal: induce a taxonomy tree over x
    – using both text and visual features
  • Setting: supervised learning of category hierarchies from data

Example: x = {Animal, Fish, Shark, Cat, Tiger, Terrestrial animal, Seafish, Feline}
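A minimal sketch of the input this setting assumes; the Category class and its field names are illustrative, with each image represented by a precomputed convnet feature vector:

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Category:
    name: str
    # (num_images, feature_dim) matrix of convnet features; 4096 is illustrative
    image_features: np.ndarray = field(default_factory=lambda: np.empty((0, 4096)))

categories = [Category(n) for n in
              ["Animal", "Fish", "Shark", "Cat", "Tiger",
               "Terrestrial animal", "Seafish", "Feline"]]
# Goal: predict a parent index for every category, which jointly
# encodes a taxonomy tree over the whole set.
```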

SLIDE 14

Model

Let z_n (1 ≤ z_n ≤ N) be the index of the parent of category x_n.

  – The set z = {z_1, z_2, …, z_N} encodes the whole tree structure.

  • Our goal → infer the conditional distribution p(z | x)

Example: x = {Animal, Fish, Shark, Cat, Tiger, Terrestrial animal, Seafish, Feline}

SLIDE 15

Model Overview

  • Intuition: categories tend to be closely related to their parents and siblings
    – (text) hypernym-hyponym relation: shark → cat shark
    – (visual) similarity: images of shark ⇔ images of ray
  • Method: induce features from distributed representations of images and text
    – images: deep convnet
    – text: word embeddings

SLIDE 16

Taxonomy Induction Model

  • Notation:
    – c_n: child nodes of x_n
    – x_n^m ∈ c_n: the m-th child of x_n
    – g: consistency term, depending on the features
    – w: model weights to be learned
  • Annotations on the model equation:
    – z: the parent indexes of the categories
    – a prior on the popularity of each parent (its number of children)
    – the consistency of x_n^m with its parent x_n and siblings c_n \ x_n^m

SLIDE 17

Taxonomy Induction Model

  • Looking into g:
    – g(x_n, x_n^m, c_n \ x_n^m) evaluates how consistent a parent-child group is: the child x_n^m with its parent x_n and siblings c_n \ x_n^m.
    – The whole model is a factorization of the consistency terms of all local parent-child groups.

SLIDE 18

Model: Develop g

  • Notation (as before):
    – c_n: child nodes of x_n, with x_n^m ∈ c_n
    – g: consistency term; w: model weights to be learned
  • g is log-linear in the features:

      g(x_n, x_n^m, c_n \ x_n^m) = exp( w^T f(x_n, x_n^m, c_n \ x_n^m) )

    – w: weight vector (to be learned)
    – f: feature vector of x_n^m with its parent x_n and siblings c_n \ x_n^m
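A minimal sketch of scoring a candidate tree under this factorization, assuming the log-linear form above; it omits the popularity prior and the normalization constant, and feature_fn is an illustrative stand-in for f:

```python
import numpy as np

def log_score(parents, w, feature_fn):
    """Unnormalized log-probability of a tree.

    parents[n] = index of the parent of category n (the root has parent -1);
    feature_fn(i, m, siblings) returns the feature vector f for child m
    under parent i with the given siblings.
    """
    n_cat = len(parents)
    children = {i: [m for m in range(n_cat) if parents[m] == i]
                for i in range(n_cat)}
    total = 0.0
    for i, kids in children.items():
        for m in kids:
            siblings = [s for s in kids if s != m]
            total += w @ feature_fn(i, m, siblings)  # log g(x_i, x_i^m, ...)
    return total
```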

SLIDE 19

Feature: Develop f

  • Visual features:
    – Sibling similarity
    – Parent-child similarity
    – Parent prediction
  • Text features:
    – Parent prediction [Fu et al.]
    – Sibling similarity
    – Surface features [Bansal et al.]

SLIDE 20

Feature: Develop f

  • Visual features: Sibling similarity (S-V1*)
    – Step 1: fit a Gaussian to the image features of each category
    – Step 2: derive the pairwise similarity vissim(x_i, x_j)
    – Step 3: derive the groupwise similarity by averaging

  S-V1 evaluates the visual similarity between siblings.

  * S: sibling, V: visual
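A hedged numpy sketch of S-V1, assuming diagonal Gaussians and an exp(−symmetric KL) pairwise similarity; the paper's exact similarity function may differ:

```python
import numpy as np

def fit_gaussian(feats):
    """feats: (num_images, dim) convnet features of one category."""
    return feats.mean(axis=0), feats.var(axis=0) + 1e-6  # mean, diag variance

def vissim(g1, g2):
    """exp(-symmetric KL) between two diagonal Gaussians."""
    (m1, v1), (m2, v2) = g1, g2
    kl12 = 0.5 * np.sum(np.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1)
    kl21 = 0.5 * np.sum(np.log(v1 / v2) + (v2 + (m2 - m1) ** 2) / v1 - 1)
    return np.exp(-0.5 * (kl12 + kl21))

def sibling_similarity(sibling_feats):
    """Groupwise S-V1: average pairwise similarity over sibling pairs."""
    gs = [fit_gaussian(f) for f in sibling_feats]
    pairs = [(i, j) for i in range(len(gs)) for j in range(i + 1, len(gs))]
    return float(np.mean([vissim(gs[i], gs[j]) for i, j in pairs])) if pairs else 0.0
```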

SLIDE 21

Feature: Develop f

  • Visual features: Parent-child similarity (PC-V1*)
    – Step 1: fit a Gaussian for each child category
    – Step 2: fit a Gaussian for only the top-K images of the parent category
    – Steps 3-4: same as in S-V1

  * PC: parent-child, V: visual

(Figure: example images of seafish and shark)

SLIDE 22

Feature: Develop f

  • Visual features: Parent prediction (PC-V2*)
    – Step 1: learn a projection matrix that maps the mean image feature of a child category to the word embedding of its parent category
    – Step 2: calculate the distance between the projection and the parent embedding
    – Step 3: bin the distance into a feature vector

  * PC: parent-child, V: visual
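A hedged sketch of PC-V2, using a least-squares projection and illustrative bin edges:

```python
import numpy as np

def learn_projection(mean_imgs, parent_embs):
    """Least-squares M such that M @ mean_image_feature ~ parent embedding.

    mean_imgs: (num_pairs, img_dim); parent_embs: (num_pairs, emb_dim).
    """
    M_t, *_ = np.linalg.lstsq(mean_imgs, parent_embs, rcond=None)
    return M_t.T

def pc_v2_feature(M, child_mean_img, parent_emb, edges=(0.5, 1.0, 2.0, 4.0)):
    """Bin the projection distance into a one-hot indicator feature."""
    d = np.linalg.norm(M @ child_mean_img - parent_emb)
    feat = np.zeros(len(edges) + 1)
    feat[np.searchsorted(edges, d)] = 1.0
    return feat
```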

SLIDE 23

Feature: Develop f

  • Text features
    – Parent prediction [Fu et al.]
      • projection from the child embedding to the parent embedding
    – Sibling similarity
      • distance between word vectors
    – Surface features [Bansal et al.]
      • ends with (e.g. catshark is a sub-category of shark), LCS, capitalization, etc.

SLIDE 24

Parameter Estimation

  • Inference
    – Gibbs sampling
  • Learning
    – Supervised learning from the gold taxonomies of the training data
    – Gradient-descent-based maximum likelihood estimation
  • Output taxonomies
    – Chu-Liu/Edmonds algorithm (see the sketch below)
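A minimal sketch of the decoding step for a simplified model with per-edge scores only, using the Chu-Liu/Edmonds implementation that ships with networkx (the paper's groupwise consistency terms make its actual decoding richer than this); edge_scores is an illustrative stand-in for model outputs:

```python
import networkx as nx

edge_scores = {("animal", "fish"): 2.0, ("animal", "cat"): 1.5,
               ("fish", "shark"): 2.5, ("cat", "shark"): 0.3}

G = nx.DiGraph()
for (parent, child), s in edge_scores.items():
    G.add_edge(parent, child, weight=s)

# Chu-Liu/Edmonds: the maximum spanning arborescence is the best tree.
tree = nx.maximum_spanning_arborescence(G)
print(sorted(tree.edges()))
# [('animal', 'cat'), ('animal', 'fish'), ('fish', 'shark')]
```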

SLIDE 25

Experiment Setup

  • Implementation
    – Word embeddings: Google word2vec
    – Convnet: VGG-16
  • Evaluation metric: Ancestor-F1 = 2·P·R / (P + R), where P and R are the precision and recall of the ancestor-descendant pairs implied by the predicted tree (see the sketch at the end of this slide)
  • Data
    – Training set: ImageNet taxonomies
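A minimal sketch of Ancestor-F1, computing precision and recall over the ancestor-descendant pairs implied by predicted vs. gold parent maps; helper names are illustrative:

```python
def ancestor_pairs(parents):
    """parents: dict child -> parent. Return all (ancestor, descendant) pairs."""
    pairs = set()
    for node in parents:
        # walk up to the root, collecting every ancestor of `node`
        p = parents[node]
        while p is not None:
            pairs.add((p, node))
            p = parents.get(p)
    return pairs

def ancestor_f1(pred_parents, gold_parents):
    pred, gold = ancestor_pairs(pred_parents), ancestor_pairs(gold_parents)
    if not pred or not gold:
        return 0.0
    p = len(pred & gold) / len(pred)
    r = len(pred & gold) / len(gold)
    return 2 * p * r / (p + r) if p + r else 0.0

gold = {"fish": "animal", "shark": "fish", "cat": "animal"}
pred = {"fish": "animal", "shark": "animal", "cat": "animal"}
print(round(ancestor_f1(pred, gold), 3))  # 0.857
```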

SLIDE 26

Evaluation

Results: Comparison to baseline methods

  • The embedding-based feature set (LV) is comparable to the state of the art
  • The full feature set (LVB) achieves the best results

Feature-set legend:
  • L: language features
    – surface features
    – embedding features
  • V: visual features
  • B: Bansal2014 features
    – web n-grams etc.
  • E: embedding features
SLIDE 27

Evaluation

Results: How much do visual features help?

Messages:
  • Visual similarity features (S-V1, PC-V1) help a lot
  • The complexity of the visual representation does not matter much

SLIDE 28

Evaluation

Results: Investigating PC-V1

  • Images of a parent category are not necessarily all visually similar to images of a child category

(Figure: example images of seafish and shark)

SLIDE 29

Evaluation

Results: When/where do visual features help?

  • Messages:
    – Shallow (near-root) layers ↔ abstract categories ↔ text features more effective
    – Deep (near-leaf) layers ↔ specific categories ↔ visual features more effective

(Figure: feature weights vs. taxonomy depth)

SLIDE 30

Take-home Message

  • Visual similarity helps taxonomy induction a lot
    – Sibling similarity
    – Parent-child similarity
  • Which features are more important?
    – Visual features are more indicative in near-leaf layers
    – Text features are more evident in near-root layers
  • Embedding features augment word-count features

SLIDE 31

Thank You! Q & A

SLIDE 32

Evaluation

Results: Visualization