SLIDE 1

Deep Learning of Binary Hash Codes for Fast Image Retrieval

Kevin Lin, Huei-Fang Yang, Jen-Hao Hsiao, Chu-Song Chen (Yahoo! Taiwan), CVPR 2015

  • 2016. 11. 6.

Presenter: 박중언

SLIDE 2

Index

  • Review
  • Background & Motivation
  • Method
  • Experiment & Result
  • Q & A
  • Quiz

SLIDE 3

Review

SLIDE 4

Review - Video Object Segmentation

http://sglab.kaist.ac.kr/~sungeui/IR/Presentation/first_2016/%EC%A3%BC%EC%84%B8%ED%98%84.pdf

SLIDE 5

Review - Video Object Segmentation

http://sglab.kaist.ac.kr/~sungeui/IR/Presentation/first_2016/%EC%A3%BC%EC%84%B8%ED%98%84.pdf

SLIDE 6

Background & Motivation

SLIDE 7

Background - Inverted Index

  • Reduce the search space effectively with an acceptable loss of accuracy
  • Use ANN techniques to efficiently find near clusters

http://sglab.kaist.ac.kr/~sungeui/IR/Slides2016/Lec4b-bow.pdf

SLIDE 8

Motivation

  • Need fast retrieval over huge image datasets.
  • Need to generate compact binary codes directly from a deep CNN.

SLIDE 9

Motivation

  • Consider the characteristics of CNN layer depth
  • Features from deep layers capture high-level semantics
  • Features from shallow layers capture low-level appearance

SLIDE 10

Method

SLIDE 11

Method

  • The method consists of three main steps:

  • Pre-Training
  • Fine-Tuning
  • Hierarchical Search

SLIDE 12

Method – pre-training

  • Supervised pre-training on the large-scale ImageNet dataset
  • > 1M images, 1,000 object categories
  • Trained with AlexNet

SLIDE 13

Method – fine-tuning

  • Fine-tune the network with a latent layer to simultaneously learn a domain-specific feature representation and a set of hash-like functions

SLIDE 14

Method – fine-tuning

  • The weights of the latent layer H and the final classification layer F8 are randomly initialized.
  • The initial random weights of the latent layer H act like LSH [6], which uses random projections to construct the hashing bits.

  • [6] A. Gionis, P. Indyk, R. Motwani, et al. Similarity search in high dimensions via hashing. In VLDB, volume 99, pages 518–529, 1999.
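The random-projection idea behind LSH can be sketched as follows (a minimal illustration, not the paper's code; the 4096-d input and 48-bit output merely mirror the feature and code sizes used elsewhere in the deck):

```python
import numpy as np

def lsh_hash(x, projections):
    """Hash a feature vector to bits: sign of its random projections."""
    return (x @ projections > 0).astype(np.uint8)

rng = np.random.default_rng(0)
# One random hyperplane per output bit: 4096-d feature -> 48 bits
projections = rng.standard_normal((4096, 48))
feature = rng.standard_normal(4096)
bits = lsh_hash(feature, projections)
```

Nearby points tend to fall on the same side of most hyperplanes, so their bit strings agree in most positions.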

SLIDE 15

Method – fine-tuning

  • The latent layer H is activated by sigmoid functions, so the activations are approximated to {0, 1}.
  • To achieve domain adaptation, the network is fine-tuned on the target-domain dataset via back-propagation.
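The thresholding step can be sketched as follows (a minimal illustration; binarizing sigmoid outputs at 0.5 is equivalent to taking the sign of the pre-activations):

```python
import numpy as np

def sigmoid(z):
    """Squash pre-activations into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def binarize(latent_activations, threshold=0.5):
    """Threshold sigmoid outputs of latent layer H into a binary code."""
    return (latent_activations > threshold).astype(np.uint8)

z = np.array([-2.0, 0.1, 3.0, -0.5])     # hypothetical latent pre-activations
codes = binarize(sigmoid(z))             # -> [0, 1, 1, 0]
```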

SLIDE 16

Method – image retrieval

  • Retrieves images similar to the query via a hierarchical deep search
  • The hierarchical search has two steps:
  • First, a coarse-level search finds candidates whose binary codes are near the query's code.
  • Second, a fine-level search ranks the candidates from the coarse step.
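The two-step search above can be sketched as follows (a minimal illustration with assumed names; the Hamming `radius` and the Euclidean metric on F7 features are modeling choices for this sketch, not parameter values from the paper):

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two binary code arrays."""
    return int(np.count_nonzero(a != b))

def hierarchical_search(query_code, query_feat, db_codes, db_feats,
                        radius=2, top_k=5):
    """Coarse step: keep images whose code is within `radius` of the query.
    Fine step: rank those candidates by feature-space distance."""
    candidates = [i for i, c in enumerate(db_codes)
                  if hamming(query_code, c) <= radius]
    candidates.sort(key=lambda i: np.linalg.norm(query_feat - db_feats[i]))
    return candidates[:top_k]
```

The coarse pass prunes most of the database with cheap bit comparisons; only the surviving pool pays for the expensive real-valued distance.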

SLIDE 17

Method – image retrieval

  • Similarity in the coarse-level search is measured by the Hamming distance between binary codes
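On codes packed into machine words, the Hamming distance reduces to a XOR followed by a popcount (an illustrative sketch, not the paper's implementation):

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Number of differing bits between two packed binary codes."""
    return bin(code_a ^ code_b).count("1")

# 101100 vs 100110 differ in two bit positions
assert hamming_distance(0b101100, 0b100110) == 2
```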

SLIDE 18

Method – image retrieval

SLIDE 19

Experiment & Result

SLIDE 20

Experiment

  • Supervised pre-training on ImageNet
  • Fine-tuning on the target domain
  • MNIST, CIFAR-10
  • Image retrieval via hierarchical deep search

SLIDE 21

Experiment

  • Experiments were done on the MNIST, CIFAR-10, and Yahoo-1M datasets
  • Precision@k is defined to measure performance:
  • precision@k = (number of ground-truth images in top k) / k
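The metric follows directly from the definition above (a minimal sketch; a "ground-truth" image here means one sharing the query's label):

```python
def precision_at_k(retrieved_labels, query_label, k):
    """precision@k = (# ground-truth images in top k) / k."""
    top_k = retrieved_labels[:k]
    return sum(1 for lbl in top_k if lbl == query_label) / k

# 3 of the top 4 results share the query label -> 0.75
assert precision_at_k(["cat", "cat", "dog", "cat"], "cat", 4) == 0.75
```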

SLIDE 22

Experiment - MNIST

  • F8 is set to a 10-way classifier (10 object categories); h is set to 48 and 128.
  • 50,000 training iterations

SLIDE 23

Experiment - MNIST

  • Classification performed for 1,000 images on the training set (left) and test set (right)

SLIDE 24

Experiment – CIFAR-10

  • On CIFAR-10, F8 is set to a 10-way classifier (10 object categories); h is set to 48 and 128.

SLIDE 25

Experiment – CIFAR-10

  • Classification performed for 1,000 images on the training set (left) and test set (right)

SLIDE 26

Experiment – Yahoo! 1M dataset

  • 116 object categories; h in the latent layer is set to 128
  • 1,000 images randomly selected

SLIDE 27

Experiment – Yahoo! 1M dataset

  • (1) AlexNet: F7 features from the pre-trained CNN [14]
  • (2) Ours-ES: F7 features from our network
  • (3) Ours-BCS: latent binary codes from our network
  • (4) Ours-HDS: F7 features and latent binary codes from our network

SLIDE 28

Result – Yahoo! 1M dataset

  • Classification performed for 1,000 images on the training set (left) and test set (right)

SLIDE 29

Result - Speed

  • 971.3x faster than traditional exhaustive search with 4096-dimensional features.

SLIDE 30

Conclusion

  • Introduces a simple yet effective supervised learning framework for rapid image retrieval.
  • Proposes CNN techniques that learn domain-specific image representations and a set of hash-like functions for rapid image retrieval.
  • The proposed method outperforms the state-of-the-art works on the public datasets.
  • The approach learns binary hash codes in a pointwise manner and scales easily with the data size, in contrast to conventional pairwise approaches.

SLIDE 31

Q & A

SLIDE 32

Thank you!
