Background Unprecedented growth of multimedia data on the Internet. - PowerPoint PPT Presentation

Supervised Hierarchical Cross-Modal Hashing
Changchang Sun†, Xuemeng Song†, Fuli Feng‡, Wayne Xin Zhao$, Hao Zhang*, Liqiang Nie†
† School of Computer Science and Technology, Shandong University
‡ School of Computing, National University of Singapore
$ School of Information, Renmin University of China
* Mercari, Inc., Japan

Background
- Unprecedented growth of multimedia data on the Internet.
- Application: cross-modal retrieval.
- Solution: supervised cross-modal hashing.
Example (labels, images, and texts mapped into a shared Hamming space):
- Mini-skirt: "UNIQLO Women Cotton Mini Skirt."
- Long Skirt: "Chicwish Endless Blooming Rose Max Skirt."
- Wide-leg Jeans: "Chloé Frayed High-rise Wide-leg Jeans."
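As a rough illustration of why hashing helps retrieval, the sketch below ranks image codes by Hamming distance to a text query code. The 4-bit codes and the helper names `hamming_distance` and `rank_by_hamming` are made up for this example and do not come from the slides.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query, database):
    """Return database indices sorted by ascending Hamming distance to the query."""
    return sorted(range(len(database)),
                  key=lambda i: hamming_distance(query, database[i]))


# Hypothetical 4-bit codes in {-1, +1}; the values are illustrative only.
text_query = [1, -1, 1, 1]
image_codes = [
    [-1, 1, -1, -1],  # differs in all 4 bits
    [1, -1, 1, -1],   # differs in 1 bit
    [1, -1, 1, 1],    # identical to the query
]
ranking = rank_by_hamming(text_query, image_codes)
print(ranking)  # [2, 1, 0]
```

Because the comparison is a bit-wise count, retrieval over binary codes is cheap compared with distances over real-valued features.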

Related Work
- Define cross-modal similarity matrix: Qingyuan Jiang and Wujun Li. Deep Cross-Modal Hashing. In CVPR, 2017.

Related Work
- Learn semantic information from multiple labels: Chao Li, Cheng Deng, Ning Li, Wei Liu, Xinbo Gao, and Dacheng Tao. Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval. In CVPR, 2018.

Motivation
- Explore the rich semantic information conveyed by the label hierarchy.
- Finest-grained layer: I_1 and I_3 are dissimilar.
- Coarser-grained layer: I_1 and I_3 are similar.
Figure 1: Illustration of the label hierarchy.
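The idea that two items can be similar at a coarse layer yet dissimilar at the finest layer can be sketched as a per-layer label comparison. The function name `layerwise_similarity` and the tuple encoding (coarsest label first) are assumptions made for this example; the label names are the ones used elsewhere in the slides.

```python
def layerwise_similarity(labels_a, labels_b):
    """Compare two hierarchical labels layer by layer.

    Each argument is a tuple ordered from the coarsest to the finest layer.
    Returns a list with 1 where both items share the label at that layer, else 0.
    """
    return [int(x == y) for x, y in zip(labels_a, labels_b)]


mini_skirt = ("Skirt", "Mini-skirt")
long_skirt = ("Skirt", "Long Skirt")
jeans = ("Jeans", "Wide-leg Jeans")

print(layerwise_similarity(mini_skirt, long_skirt))  # [1, 0]
print(layerwise_similarity(mini_skirt, jeans))       # [0, 0]
```

A flat labeling would only see the last entry and treat both pairs as equally dissimilar.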

Challenges
- How to employ the label hierarchy to guide the cross-modal hashing and preserve the underlying correlations from the original space to the Hamming space.
(Figure: points A, B, and C mapped from the original space to the Hamming space.)

Challenges
- How to enhance the hierarchical discriminative power of hash codes.
(Figure: hash codes for Skirt / Mini-Skirt and for Jeans / Wide-leg Jeans.)

Challenges
- The lack of a benchmark dataset whose data points involve multiple modalities and are hierarchically labeled.

Super-class | Class
Flowers     | Rose, Sunflower, Lily...
Fish        | Goldfish, Shark, Dolphin...
Insect      | Bee, Butterfly, Caterpillar...
Fruit       | Apple, Peach, Pear...
...         | ...

Table 1: Hierarchical labels of the benchmark dataset CIFAR-100 (unimodal data points).

Framework
Figure 2: Illustration of the proposed scheme, HiCHNet. (Labeled components in the figure include VGG-F and a concatenation step.)

Framework: Regularized Cross-modal Hashing
Layer-wise Hash Representation (K fully connected networks, one for each of the K layers)

$$h^k_{v_i} = s(W^k_v \tilde{v}_i + g^k_v), \quad k = 1, \ldots, K$$
$$h^k_{t_j} = s(W^k_t \tilde{t}_j + g^k_t), \quad k = 1, \ldots, K$$
$$b^k_{v_i} = \operatorname{sign}(h^k_{v_i}), \quad k = 1, \ldots, K$$
$$b^k_{t_j} = \operatorname{sign}(h^k_{t_j}), \quad k = 1, \ldots, K$$

- $h^k_{v_i}$ ($h^k_{t_j}$): layer-wise hash representation.
- $b^k_{v_i}$ ($b^k_{t_j}$): layer-wise binary hash codes.
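A minimal sketch of the layer-wise hash representation, assuming one fully connected layer per hierarchy layer and tanh as the activation s (the slides do not name the activation). The weights are random placeholders, and the helper names `fc_layer`, `sign`, and `layerwise_codes` are hypothetical.

```python
import math
import random


def fc_layer(W, g, x, activation=math.tanh):
    """One fully connected layer: activation(W x + g), applied element-wise."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W, g)]


def sign(h):
    """Element-wise sign into {-1, +1}; zero is mapped to +1 by convention here."""
    return [1 if v >= 0 else -1 for v in h]


def layerwise_codes(x, layers):
    """layers holds one (W, g) pair per hierarchy layer k = 1..K."""
    reps = [fc_layer(W, g, x) for W, g in layers]
    return reps, [sign(h) for h in reps]


random.seed(0)
feat_dim, code_len, K = 6, 4, 2  # illustrative sizes, not from the slides
x = [random.uniform(-1, 1) for _ in range(feat_dim)]
layers = [
    ([[random.gauss(0, 1) for _ in range(feat_dim)] for _ in range(code_len)],
     [0.0] * code_len)
    for _ in range(K)
]
reps, codes = layerwise_codes(x, layers)
print(len(codes), len(codes[0]))  # 2 4
```

The same feature vector yields K separate codes, one per layer of the label hierarchy.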

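Framework: Regularized Cross-modal Hashing
Layer-wise Semantic Similarity Preserving
- Ground truth: $S^k_{ij} = 1$ if the i-th image and the j-th text have the same label at the k-th layer; $S^k_{ij} = 0$ if they have different labels at the k-th layer.
- Objective function (negative log likelihood):

$$\mathcal{L}_1 = -\sum_{k=1}^{K} \mu_k \sum_{i,j=1}^{N} \left( S^k_{ij} \phi^k_{ij} - \log\left(1 + e^{\phi^k_{ij}}\right) \right), \qquad \phi^k_{ij} = \frac{1}{2} (h^k_{v_i})^T h^k_{t_j}$$

where $\mu_k$ is the layer confidence and $\phi^k_{ij}$ is the semantic similarity.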
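The negative log likelihood can be sketched in plain Python as a sum over layers and image-text pairs. The nested-list layout (`H_v[k][i]`, `H_t[k][j]`, `S[k][i][j]`), the name `similarity_nll`, and the stable `softplus` helper are choices made for this example.

```python
import math


def softplus(x):
    """Numerically stable log(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def similarity_nll(H_v, H_t, S, mu):
    """Layer-wise similarity-preserving loss.

    H_v[k][i], H_t[k][j]: hash representations at layer k.
    S[k][i][j]: 1 if image i and text j share the label at layer k, else 0.
    mu[k]: confidence weight of layer k.
    """
    total = 0.0
    for k, mu_k in enumerate(mu):
        for i, hv in enumerate(H_v[k]):
            for j, ht in enumerate(H_t[k]):
                phi = 0.5 * sum(a * b for a, b in zip(hv, ht))
                total -= mu_k * (S[k][i][j] * phi - softplus(phi))
    return total


# One layer, one image-text pair with identical 2-bit representations.
H_v = [[[1.0, 1.0]]]
H_t = [[[1.0, 1.0]]]
loss_similar = similarity_nll(H_v, H_t, [[[1]]], [1.0])
loss_dissimilar = similarity_nll(H_v, H_t, [[[0]]], [1.0])
print(loss_similar < loss_dissimilar)  # True
```

Aligned representations are cheap when the pair is labeled similar and costly when it is labeled dissimilar, which is the behavior the objective is meant to encourage.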

Framework: Regularized Cross-modal Hashing
Binarization Difference Penalizing
To derive the optimal continuous surrogates of the hash codes:

$$B^k_v = \operatorname{sgn}(H^k_v), \qquad B^k_t = \operatorname{sgn}(H^k_t), \qquad a = [1, 1, \ldots, 1]^T$$

$$\mathcal{L}_2 = \sum_{k=1}^{K} \left[ \left( \|B^k_v - H^k_v\|_F^2 + \|B^k_t - H^k_t\|_F^2 \right) + \left( \|H^k_v a\|_2^2 + \|H^k_t a\|_2^2 \right) \right]$$

The first group is the binarization difference regularization; the second group is the information maximization term.
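For a single modality and a single layer, the two penalty terms can be sketched as follows. The representation matrix is stored here as a list of N rows (one per instance), so multiplying by the all-ones vector becomes a per-bit sum over instances; this layout and the name `binarization_penalty` are assumptions of the example.

```python
def binarization_penalty(H):
    """Penalty terms for one modality at one layer.

    H: list of N hash representations, each a list of c floats.
    Returns (quantization, balance):
      quantization = ||B - H||_F^2 with B = sgn(H)
      balance      = ||H a||_2^2 with a the all-ones vector,
                     i.e. the squared per-bit sums over all instances.
    """
    quantization = sum(((1.0 if v >= 0 else -1.0) - v) ** 2
                       for h in H for v in h)
    bit_sums = [sum(column) for column in zip(*H)]
    balance = sum(s * s for s in bit_sums)
    return quantization, balance


# Two instances with 2-bit representations; values are illustrative.
H = [[0.9, -0.8],
     [-0.9, 0.8]]
quantization, balance = binarization_penalty(H)
print(round(quantization, 6), balance)  # 0.1 0.0
```

The quantization term shrinks as the continuous values approach -1 or +1, and the balance term is zero when each bit is +1 for half of the instances and -1 for the other half.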

Framework: Hierarchical Discriminative Learning

$$p^k_{v_i} = \operatorname{softmax}(U^k_v h^k_{v_i} + q^k_v), \quad k = 1, \ldots, K$$
$$p^k_{t_j} = \operatorname{softmax}(U^k_t h^k_{t_j} + q^k_t), \quad k = 1, \ldots, K$$

- Objective function (negative log likelihood):

$$\mathcal{L}_h = -\sum_{k=1}^{K} \mu_k \sum_{i=1}^{N} \left( (y^k_i)^T \log(p^k_{v_i}) + (y^k_i)^T \log(p^k_{t_i}) \right)$$

where $\mu_k$ is the layer confidence and $y^k_i$ is the ground-truth label of the i-th instance at the k-th layer.
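The hierarchical discriminative objective amounts to a confidence-weighted cross-entropy at every layer for both modalities. The sketch below takes precomputed class scores (the U h + q values) as input; the data layout and the function names are hypothetical.

```python
import math


def softmax(z):
    """Softmax with max-subtraction for numerical stability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_onehot, p):
    """-(y^T log p) for a one-hot ground-truth vector y."""
    return -sum(y * math.log(pi) for y, pi in zip(y_onehot, p) if y)


def hierarchical_loss(scores_v, scores_t, y, mu):
    """scores_v[k][i], scores_t[k][i]: class scores at layer k for instance i.

    y[k][i]: one-hot ground truth at layer k; mu[k]: layer confidence.
    """
    total = 0.0
    for k, mu_k in enumerate(mu):
        for zv, zt, yk in zip(scores_v[k], scores_t[k], y[k]):
            total += mu_k * (cross_entropy(yk, softmax(zv))
                             + cross_entropy(yk, softmax(zt)))
    return total


# One layer, one instance, two classes, uninformative scores.
loss = hierarchical_loss([[[0.0, 0.0]]], [[[0.0, 0.0]]], [[[1, 0]]], [1.0])
print(round(loss, 4))  # 1.3863
```

With uniform predictions each modality contributes log 2, so the example value is 2 log 2.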

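Framework: Final Objective Function

$$\min_{B^k, \Theta_v, \Theta_t} \mathcal{L} = \lambda \mathcal{L}_r + (1 - \lambda) \mathcal{L}_h$$

where $\lambda$ is the non-negative tradeoff parameter, $\mathcal{L}_r$ is the regularized cross-modal hashing term, and $\mathcal{L}_h$ is the hierarchical discriminative learning term.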

Experiment: Dataset
- Two datasets: FashionVC (public) and Ssense (created by us).
- Ssense: collected from the online fashion platform Ssense (December 14 to 16, 2018).
- Raw data: 25,974 image-text instances with hierarchical labels.
- Preprocessing: removed the noisy instances that involve multiple items; filtered out the categories with fewer than 70 instances.
(Figure: examples of noisy instances.)
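The category filtering step might look like the sketch below. The dictionary field name `category` and the function name `filter_rare_categories` are hypothetical; only the threshold of 70 comes from the slide, and the toy call uses a smaller threshold so the effect is visible.

```python
from collections import Counter


def filter_rare_categories(instances, min_count=70):
    """Keep only instances whose category has at least min_count instances."""
    counts = Counter(inst["category"] for inst in instances)
    return [inst for inst in instances if counts[inst["category"]] >= min_count]


# Toy data with a lowered threshold for illustration.
instances = [
    {"category": "Skirt"},
    {"category": "Skirt"},
    {"category": "Jeans"},
]
kept = filter_rare_categories(instances, min_count=2)
print([inst["category"] for inst in kept])  # ['Skirt', 'Skirt']
```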

Experiment: Dataset
Table 1: Statistics of our datasets.

Experiment: Dataset
- FashionVC label hierarchy: 35 categories with two layers.

Experiment: Dataset
- Ssense label hierarchy: 32 categories with two layers.

Experiment: Experiment Setting
- Tasks: Image to Text; Text to Image.
- Protocol: Mean Average Precision (MAP).
- Baselines, shallow learning: CCA, SCM-Or, SCM-Se, DCH.
- Baselines, deep learning: CDQ, SSAH, DCMH.
- Features: 500-D SIFT features and 4096-D deep features.
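Mean Average Precision can be computed from per-query relevance flags in ranked order, as in the sketch below. How relevance is decided (for instance, which layer's label must match) is not stated on the slide, so the flags are taken as given; the function names are illustrative.

```python
def average_precision(ranked_relevance):
    """AP for one query; ranked_relevance is a list of 0/1 flags in ranked order.

    Precision is accumulated at each relevant position and averaged over
    the number of relevant items that were retrieved.
    """
    hits, total = 0, 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(all_ranked_relevance):
    """Mean of the per-query average precision values."""
    return (sum(average_precision(r) for r in all_ranked_relevance)
            / len(all_ranked_relevance))


# Two hypothetical queries, e.g. ranked by Hamming distance.
queries = [[1, 0, 1], [0, 1]]
map_score = mean_average_precision(queries)
print(round(map_score, 4))  # 0.6667
```

The first query scores (1/1 + 2/3) / 2 and the second 1/2, giving a mean of 2/3.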

Experiment: On Model Comparison
Table 2: The MAP scores of different methods on two datasets. The shallow learning baselines use the SIFT features.
Table 3: The MAP scores of different methods on two datasets. The shallow learning baselines use the VGG-F features.

Experiment: On Label Hierarchy
Figure 3: HiCHNet-flat, one derivative of our HiCHNet model.

Experiment: On Label Hierarchy
Figure 4: Performance of HiCHNet and HiCHNet-flat on FashionVC.

Experiment: On Case Study 1
- Retrieve from the whole retrieval set.
Figure 5: Illustration of ranking results from the whole retrieval set. The irrelevant images are highlighted in red boxes.

Experiment: On Case Study 2
- Retrieve from the constrained subset of 10 images of different categories.
Figure 6: Illustration of ranking results from the constrained retrieval set.

Conclusion
- We first validate the benefits of utilizing the category hierarchy in cross-modal hashing.
- We propose a novel supervised hierarchical cross-modal hashing framework.
- We build a large-scale benchmark dataset from the global fashion platform Ssense.
- Extensive experiments demonstrate the superiority of HiCHNet over the state-of-the-art methods.

Thanks
Q&A
Thanks for the travel grant from SIGIR.
Email: sunchangchang123@gmail.com

Backup

Experiment: On Category Analysis
Figure 7: Performance of HiCHNet and DCMH on different categories of FashionVC and Ssense in the task of "Text→Image".

Experiment: On Component Analysis
Figure 8: Sensitivity analysis of the hyper-parameters.
