SLIDE 1

Deep Learning of Binary Hash Codes for Fast Image Retrieval

Kevin Lin, Huei-Fang Yang, Jen-Hao Hsiao, Chu-Song Chen (Yahoo! Taiwan), CVPR 2015

  • 2016. 11. 6.

Presenter: 박중언

SLIDE 2

Index

  • Review
  • Background & Motivation
  • Method
  • Experiment & Result
  • Q & A
  • Quiz

SLIDE 3

Review

SLIDE 4

Review - Video Object Segmentation

http://sglab.kaist.ac.kr/~sungeui/IR/Presentation/first_2016/%EC%A3%BC%EC%84%B8%ED%98%84.pdf

SLIDE 5

Review - Video Object Segmentation

http://sglab.kaist.ac.kr/~sungeui/IR/Presentation/first_2016/%EC%A3%BC%EC%84%B8%ED%98%84.pdf

SLIDE 6

Background & Motivation

SLIDE 7

Background - Inverted Index

  • Reduce the search space effectively with an acceptable loss of accuracy
  • Use ANN techniques to efficiently find near clusters

http://sglab.kaist.ac.kr/~sungeui/IR/Slides2016/Lec4b-bow.pdf

SLIDE 8

Motivation

  • Need fast retrieval over huge image datasets.
  • Need to generate compact binary codes directly from a deep CNN.

SLIDE 9

Motivation

  • Consider the characteristics of CNN layer depth
  • Features from deep layers capture high-level semantics
  • Features from shallow layers capture low-level appearance

SLIDE 10

Method

SLIDE 11

Method

  • The method consists of three main steps:

  • Pre-Training
  • Fine-Tuning
  • Hierarchical Search

SLIDE 12

Method – pre-training

  • Supervised pre-training on the large-scale ImageNet dataset
  • > 1M images, 1,000 object categories
  • Trained with AlexNet

SLIDE 13

Method – fine-tuning

  • Fine-tune the network with a latent layer to simultaneously learn a domain-specific feature representation and a set of hash-like functions

SLIDE 14

Method – fine-tuning

  • The weights of the latent layer H and the final classification layer F8 are randomly initialized.
  • The initial random weights of the latent layer H act like LSH [6], which uses random projections to construct the hashing bits.

  • [6] A. Gionis, P. Indyk, R. Motwani, et al. Similarity search in high dimensions via hashing. In VLDB, volume 99, pages 518–529, 1999.
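The random-projection idea behind LSH can be sketched as follows (a minimal illustration, not the paper's code; the 4096-d input and 48-bit output merely mirror the feature and code sizes used elsewhere in the deck):

```python
import numpy as np

def lsh_hash(x, projections):
    """Hash a feature vector to bits: sign of its random projections."""
    return (x @ projections > 0).astype(np.uint8)

rng = np.random.default_rng(0)
# One random hyperplane per output bit: 4096-d feature -> 48 bits
projections = rng.standard_normal((4096, 48))
feature = rng.standard_normal(4096)
bits = lsh_hash(feature, projections)
```

Nearby points tend to fall on the same side of most hyperplanes, so their bit strings agree in most positions.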

SLIDE 15

Method – fine-tuning

  • The latent layer H is activated by sigmoid functions, so the activations are approximated to {0, 1}.
  • To achieve domain adaptation, the network is fine-tuned on the target-domain dataset via back-propagation.
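The thresholding step can be sketched as follows (a minimal illustration; binarizing sigmoid outputs at 0.5 is equivalent to taking the sign of the pre-activations):

```python
import numpy as np

def sigmoid(z):
    """Squash pre-activations into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def binarize(latent_activations, threshold=0.5):
    """Threshold sigmoid outputs of latent layer H into a binary code."""
    return (latent_activations > threshold).astype(np.uint8)

z = np.array([-2.0, 0.1, 3.0, -0.5])     # hypothetical latent pre-activations
codes = binarize(sigmoid(z))             # -> [0, 1, 1, 0]
```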

SLIDE 16

Method – image retrieval

  • Retrieves images similar to the query via a hierarchical deep search
  • The hierarchical search has two steps:
  • First, a coarse-level search finds candidates whose binary codes are near the query's code.
  • Second, a fine-level search ranks the candidates from the coarse step.
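The two-step search above can be sketched as follows (a minimal illustration with assumed names; the Hamming `radius` and the Euclidean metric on F7 features are modeling choices for this sketch, not parameter values from the paper):

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two binary code arrays."""
    return int(np.count_nonzero(a != b))

def hierarchical_search(query_code, query_feat, db_codes, db_feats,
                        radius=2, top_k=5):
    """Coarse step: keep images whose code is within `radius` of the query.
    Fine step: rank those candidates by feature-space distance."""
    candidates = [i for i, c in enumerate(db_codes)
                  if hamming(query_code, c) <= radius]
    candidates.sort(key=lambda i: np.linalg.norm(query_feat - db_feats[i]))
    return candidates[:top_k]
```

The coarse pass prunes most of the database with cheap bit comparisons; only the surviving pool pays for the expensive real-valued distance.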

SLIDE 17

Method – image retrieval

  • Similarity in the coarse-level search is measured by the Hamming distance between binary codes
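On codes packed into machine words, the Hamming distance reduces to a XOR followed by a popcount (an illustrative sketch, not the paper's implementation):

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Number of differing bits between two packed binary codes."""
    return bin(code_a ^ code_b).count("1")

# 101100 vs 100110 differ in two bit positions
assert hamming_distance(0b101100, 0b100110) == 2
```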

SLIDE 18

Method – image retrieval

SLIDE 19

Experiment & Result

SLIDE 20

Experiment

  • Supervised pre-training on ImageNet
  • Fine-tuning on the target domain
  • MNIST, CIFAR-10
  • Image retrieval via hierarchical deep search

SLIDE 21

Experiment

  • Experiments were done on the MNIST, CIFAR-10, and Yahoo-1M datasets
  • Precision@k is defined to measure performance:
  • precision@k = (number of ground-truth images in top k) / k
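The metric follows directly from the definition above (a minimal sketch; a "ground-truth" image here means one sharing the query's label):

```python
def precision_at_k(retrieved_labels, query_label, k):
    """precision@k = (# ground-truth images in top k) / k."""
    top_k = retrieved_labels[:k]
    return sum(1 for lbl in top_k if lbl == query_label) / k

# 3 of the top 4 results share the query label -> 0.75
assert precision_at_k(["cat", "cat", "dog", "cat"], "cat", 4) == 0.75
```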

SLIDE 22

Experiment - MNIST

  • F8 is set to a 10-way classifier (10 object categories); h is set to 48 and 128.
  • 50,000 training iterations

SLIDE 23

Experiment - MNIST

  • Classification performed for 1,000 images on the training set (left) and test set (right)

SLIDE 24

Experiment – CIFAR-10

  • On CIFAR-10, F8 is set to a 10-way classifier (10 object categories); h is set to 48 and 128.

SLIDE 25

Experiment – CIFAR-10

  • Classification performed for 1,000 images on the training set (left) and test set (right)

SLIDE 26

Experiment – Yahoo! 1M dataset

  • 116 object categories; h in the latent layer is set to 128
  • 1,000 images randomly selected

SLIDE 27

Experiment – Yahoo! 1M dataset

  • (1) AlexNet: F7 features from the pre-trained CNN [14]
  • (2) Ours-ES: F7 features from our network
  • (3) Ours-BCS: latent binary codes from our network
  • (4) Ours-HDS: F7 features and latent binary codes from our network

SLIDE 28

Result – Yahoo! 1M dataset

  • Classification performed for 1,000 images on the training set (left) and test set (right)

SLIDE 29

Result - Speed

  • 971.3x faster than traditional exhaustive search with 4096-dimensional features.

SLIDE 30

Conclusion

  • Introduces a simple yet effective supervised learning framework for rapid image retrieval.
  • Proposes CNN techniques that learn domain-specific image representations and a set of hash-like functions for rapid image retrieval.
  • The proposed method outperforms the state-of-the-art works on the public datasets.
  • The approach learns binary hash codes in a pointwise manner and scales easily with the data size, in contrast to conventional pairwise approaches.

SLIDE 31

Q & A

SLIDE 32

Thank you!
