Hashing Techniques (Sung-Eui Yoon) Professor KAIST - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Hashing Techniques

윤성의 (Sung-Eui Yoon)

Professor KAIST http://sgvr.kaist.ac.kr

slide-2
SLIDE 2

2

Student Presentation Guidelines

  • Good summary, not full detail, of the paper
  • Talk about motivations of the work
  • Give a broad background on the related work
  • Explain main idea and results of the paper
  • Discuss strengths and weaknesses of the method
  • Prepare an overview slide
  • Talk about most important things and connect them well

slide-3
SLIDE 3

3

High-Level Ideas

  • Deliver the most important ideas and results
  • Do not talk about minor details
  • Give enough background instead
  • A deeper understanding of the paper is required
  • Go over at least two related papers and explain them in a few slides
  • Spend most of your time figuring out the most important things and preparing good slides for them

slide-4
SLIDE 4

4

Deliver Main Ideas of the Paper

  • Identify the main ideas/contributions of the paper and deliver them
  • If there are prior techniques that you need to understand, study those prior techniques and explain them
  • For example, suppose paper A uses B’s technique in its main idea. In this case, you need to explain B to explain A well.

slide-5
SLIDE 5

5

Be Honest

  • Do not skip important ideas that you don’t know
  • Explain as much as you know and mention that you don’t understand some parts
  • If you get a question you cannot answer well, just say so
  • In the end, you need to explain them on the KLMS board before the semester ends

slide-6
SLIDE 6

6

Result Presentation

  • Give full experiment settings and present data with the related information
  • What does the x-axis mean in the image below?
  • After showing the data, give the message that we can draw from the data
  • Show images/videos, if there are any
slide-7
SLIDE 7

7

Utilizing Existing Resources

  • Use the authors’ slides, code, and videos, if they exist
  • Give proper credits or citations
  • Without them, you are cheating!
slide-8
SLIDE 8

8

Audience feedback form

Date:          Talk title:          Speaker:

  • 1. Was the talk well organized and well prepared?
       5: Excellent  4: Good  3: Okay  2: Less than average  1: Poor
  • 2. Was the talk comprehensible? How well were important concepts covered?
       5: Excellent  4: Good  3: Okay  2: Less than average  1: Poor

Any comments to the speaker:

slide-9
SLIDE 9

9

Prepare Quiz

  • Review the most important concepts of your talk
  • Prepare two multiple-choice questions
  • Example: What is a biased algorithm?
  • A: Given N samples, the expected mean of the estimator is I
  • B: Given N samples, the expected mean of the estimator is I + e
  • C: Given N samples, the expected mean of the estimator is I + e, where e goes to zero as N goes to infinity
  • Grade them on a scale of 0 to 10 and send the grades to the TA

slide-10
SLIDE 10

10

Class Objectives

  • Understand the basic hashing techniques based on hyperplanes
  • Unsupervised approach
  • Supervised approach using deep learning
  • In the last class:
  • Discussed re-ranking methods: spatial verification and query expansion
  • Talked about the inverted index
slide-11
SLIDE 11

11

Questions

  • When we talk about accuracy, I don't understand why we only think about the accuracy of matching visual points/patches/features. I think we should also be concerned with finding images with a similar style, images with a similar emotion, images reflecting a similar activity...

slide-12
SLIDE 12

12

Review of Basic Image Search

Near cluster search

feature space

Shortlist

Inverted file

Re-ranking

Ack.: Dr. Heo

slide-13
SLIDE 13

13

Image Search

Finding visually similar images

slide-14
SLIDE 14

14

Image Descriptor

High dimensional point

(BoW, GIST, Color Histogram, etc.)

slide-15
SLIDE 15

15

Image Descriptor

High dimensional point

(BoW, GIST, Color Histogram, etc.)

Nearest neighbor search (NNS) in high dimensional space
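
As a point of reference, exact nearest neighbor search can be done with a linear scan over all descriptors. The sketch below assumes Euclidean distance and uses made-up 4-dimensional points; the function name and data are hypothetical and only illustrate why the cost grows with both the number of images and the descriptor dimension.

```python
import math

def nearest_neighbor(query, database):
    """Return (index, distance) of the database point closest to query.

    Exact linear scan: cost is O(N * D) for N points of dimension D.
    """
    best_idx, best_dist = -1, math.inf
    for idx, point in enumerate(database):
        dist = math.dist(query, point)  # Euclidean distance (assumed metric)
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx, best_dist

# Toy 4-D "descriptors" (hypothetical values, for illustration only).
database = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.9, 1.1, 1.0, 0.8],
]
idx, dist = nearest_neighbor([1.0, 1.0, 1.0, 0.9], database)
```

With real descriptors (hundreds to thousands of dimensions, millions of images), this scan is what hashing tries to avoid.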

slide-16
SLIDE 16

16

Challenge

             BoW      CNN
Dimensions   1000+    4000+
1 image      4 KB+    16 KB+
1B images    4 TB+    16 TB+
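
The storage figures can be checked with simple arithmetic. The sketch below assumes each dimension is stored as a 4-byte float (an assumption, though it matches 1000 dimensions giving about 4 KB); the 64-bit code length in the last line is a hypothetical example, not a number from the slides.

```python
BYTES_PER_FLOAT32 = 4  # assumption: one 32-bit float per dimension

def storage_bytes(dimensions, num_images=1):
    """Raw storage for float descriptors."""
    return dimensions * BYTES_PER_FLOAT32 * num_images

def binary_code_bytes(bits, num_images=1):
    """Storage for binary codes; assumes the total bit count is a multiple of 8."""
    return bits * num_images // 8

bow_one_image = storage_bytes(1000)           # 4,000 bytes, about 4 KB
bow_1b_images = storage_bytes(1000, 10**9)    # 4e12 bytes, about 4 TB
cnn_1b_images = storage_bytes(4000, 10**9)    # 16e12 bytes, about 16 TB
codes_1b_images = binary_code_bytes(64, 10**9)  # 8e9 bytes, about 8 GB (hypothetical 64-bit codes)
```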

slide-17
SLIDE 17

17

Binary Code

11000 11000 11001 00001 00011 00111

slide-18
SLIDE 18

18

Binary Code

11000 11000 11001 00001 00011 00111

* Benefits

  • Compression
  • Very fast distance computation (Hamming distance, XOR)
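
A minimal sketch of the XOR-based Hamming distance, using the six 5-bit codes shown on the slide. Codes are held as Python integers; the function name is an illustrative choice.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits: XOR the codes, then count the 1 bits."""
    return bin(a ^ b).count("1")

# The six 5-bit codes from the slide.
codes = [0b11000, 0b11000, 0b11001, 0b00001, 0b00011, 0b00111]
query = codes[0]
distances = [hamming_distance(query, c) for c in codes]
```

On hardware, the XOR and bit count each map to a single instruction per machine word, which is where the speed comes from.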

slide-19
SLIDE 19

19

Hyper-Plane based Binary Coding

[Figure: a single hyper-plane; the side labeled 1 is encoded as bit 1]

slide-20
SLIDE 20

20

Hyper-Plane based Binary Coding

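[Figure: three hyper-planes, each with its positive side labeled 1, partition the space into regions with codes 111, 011, 010, 110, 000, 100]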
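
A minimal sketch of hyper-plane based coding: each bit records which side of one hyper-plane a point lies on. The three 2-D hyper-planes below (normals and offsets) are hypothetical values chosen for illustration; they are not the ones drawn in the figure, and the slides do not say how the hyper-planes are chosen.

```python
def hyperplane_code(x, normals, offsets):
    """One bit per hyper-plane: 1 if w . x + b >= 0, else 0."""
    bits = []
    for w, b in zip(normals, offsets):
        side = sum(wi * xi for wi, xi in zip(w, x)) + b
        bits.append(1 if side >= 0 else 0)
    return bits

# Hypothetical hyper-planes in 2-D: x >= 0, y >= 0, x + y >= 1.
normals = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
offsets = [0.0, 0.0, -1.0]

code_a = hyperplane_code([2.0, 2.0], normals, offsets)
code_b = hyperplane_code([-1.0, 0.5], normals, offsets)
code_c = hyperplane_code([-1.0, -1.0], normals, offsets)
```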

slide-21
SLIDE 21

21

Distance between Two Points

[Figure: hyper-plane partition with region codes 111, 011, 010, 110, 000, 100]

  • Measured by bit differences, known as Hamming distance
  • Efficiently computed by XOR bit operations

slide-22
SLIDE 22

22

Good and Bad Hyper-Planes

Previous work focused on how to determine good hyper-planes

slide-23
SLIDE 23

23

Components of Spherical Hashing

  • Spherical hashing
  • Hyper-sphere setting strategy
  • Spherical Hamming distance
slide-25
SLIDE 25

25

Spherical Hashing [Heo et al., CVPR 12]

[Figure: a single hyper-sphere; the inside is encoded as bit 1]

slide-26
SLIDE 26

26

Spherical Hashing [Heo et al., CVPR 12]

[Figure: three hyper-spheres partition the space into regions with codes 111, 011, 010, 110, 000, 100, 001, 101]
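
A minimal sketch of sphere-based encoding, assuming each bit is 1 when the point lies inside the corresponding hyper-sphere (distance to the pivot at most the radius). The pivots and radii below are hypothetical 2-D values for illustration, not learned ones.

```python
import math

def spherical_code(x, pivots, radii):
    """One bit per hyper-sphere: 1 if x is inside (distance <= radius), else 0."""
    return [1 if math.dist(x, p) <= r else 0 for p, r in zip(pivots, radii)]

# Hypothetical pivots (sphere centers) and radii.
pivots = [[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]]
radii = [1.0, 1.0, 1.0]

inside_all = spherical_code([0.5, 0.4], pivots, radii)
inside_first = spherical_code([-0.5, 0.0], pivots, radii)
outside_all = spherical_code([5.0, 5.0], pivots, radii)
```

Unlike the hyper-plane case, the region with code 111 here is bounded on every side.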

slide-27
SLIDE 27

27

Hyper-Sphere vs Hyper-Plane

Average of maximum distances within a partition:

  • Hyper-spheres give a tighter bound!

[Figure labels: open, closed]

slide-28
SLIDE 28

28

Components of Spherical Hashing

  • Spherical hashing
  • Hyper-sphere setting strategy
  • Spherical Hamming distance
slide-29
SLIDE 29

29

Good Binary Coding [Weiss 2008, He 2011]

  • 1. Balanced partitioning
  • 2. Independence

slide-30
SLIDE 30

30

Intuition of Hyper-Sphere Setting

  • 1. Balance
  • 2. Independence
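
One common way to write these two criteria, assuming binary hash functions $h_i(x) \in \{0, 1\}$ (the notation is an assumption and is not taken from the slide): each bit splits the data in half, and any two bits fire together as often as two independent fair coins would.

```latex
\Pr[h_i(x) = 1] = \tfrac{1}{2},
\qquad
\Pr[h_i(x) = 1,\ h_j(x) = 1] = \Pr[h_i(x) = 1]\,\Pr[h_j(x) = 1] = \tfrac{1}{4}
\quad (i \neq j)
```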
slide-31
SLIDE 31

31

Hyper-Sphere Setting Process

Iteratively repeat steps 1 and 2 until convergence.
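
The slide does not spell out the two steps, so the sketch below rests on an assumption: one step sets each radius so that half of the data falls inside the sphere (balance), and the other moves pivots apart or together so that each pair of spheres shares about a quarter of the data (independence). The force weighting, the fixed iteration count in place of a convergence test, and all names are illustrative choices, not the authors' implementation.

```python
import math
import random

def balanced_radii(data, pivots):
    """Balance step: pick each radius so that half of the data falls inside."""
    half = len(data) // 2
    radii = []
    for p in pivots:
        d = sorted(math.dist(x, p) for x in data)
        radii.append((d[half - 1] + d[half]) / 2)  # midpoint of the two middle distances
    return radii

def encode(x, pivots, radii):
    return [1 if math.dist(x, p) <= r else 0 for p, r in zip(pivots, radii)]

def train_spheres(data, n_bits, iters=20, seed=0):
    rng = random.Random(seed)
    m, dim = len(data), len(data[0])
    pivots = [list(p) for p in rng.sample(data, n_bits)]  # start from random data points
    for _ in range(iters):  # fixed iteration count instead of a convergence test
        radii = balanced_radii(data, pivots)
        codes = [encode(x, pivots, radii) for x in data]
        moves = [[0.0] * dim for _ in range(n_bits)]
        for i in range(n_bits):
            for j in range(i + 1, n_bits):
                # Independence step: compare the pair's overlap with the m/4 target.
                overlap = sum(c[i] & c[j] for c in codes)
                weight = 0.5 * (overlap - m / 4) / (m / 4)
                for k in range(dim):
                    f = weight * (pivots[i][k] - pivots[j][k])
                    moves[i][k] += f  # too much overlap pushes i away from j
                    moves[j][k] -= f
        for i in range(n_bits):
            for k in range(dim):
                pivots[i][k] += moves[i][k] / n_bits
    return pivots, balanced_radii(data, pivots)

# Toy data: 200 random 4-D points (hypothetical, for illustration only).
data_rng = random.Random(1)
data = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
pivots, radii = train_spheres(data, n_bits=3)
codes = [encode(x, pivots, radii) for x in data]
ones_per_bit = [sum(c[i] for c in codes) for i in range(3)]
```

Because the radii are recomputed after the last pivot update, every bit ends up splitting the training data in half; how close the pairwise overlaps get to a quarter depends on the data and the number of iterations.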

slide-32
SLIDE 32

32

Components of Spherical Hashing

  • Spherical hashing
  • Hyper-sphere setting strategy
  • Spherical Hamming distance
slide-33
SLIDE 33

33

Max Distance and Common ‘1’

[Figure: regions with codes 111, 011, 010, 110, 100, 001, 101]

Common ‘1’s: 1

slide-34
SLIDE 34

34

Max Distance and Common ‘1’

[Figure: regions with codes 111, 011, 110, 101]

Common ‘1’s: 2

slide-35
SLIDE 35

35

Max Distance and Common ‘1’

Common ‘1’s: 1        Common ‘1’s: 2

The average of maximum distances between two partitions decreases as the number of common ‘1’s increases.

slide-36
SLIDE 36

36

Spherical Hamming Distance (SHD)

SHD: Hamming distance divided by the number of common ‘1’s.
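
A minimal sketch of this distance on integer codes: the Hamming distance comes from XOR, the number of common ‘1’s from AND. Returning infinity when two codes share no ‘1’ is an assumption made here to avoid dividing by zero; the slide does not say how that case is handled.

```python
import math

def popcount(v: int) -> int:
    return bin(v).count("1")

def spherical_hamming_distance(a: int, b: int) -> float:
    """Hamming distance (XOR) divided by the number of common 1 bits (AND)."""
    common = popcount(a & b)
    if common == 0:
        return math.inf  # assumption: the zero-common case is not specified on the slide
    return popcount(a ^ b) / common

d_close = spherical_hamming_distance(0b111, 0b011)  # 1 differing bit, 2 common 1s
d_far = spherical_hamming_distance(0b110, 0b011)    # 2 differing bits, 1 common 1
d_none = spherical_hamming_distance(0b100, 0b001)   # no common 1s
```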
slide-37
SLIDE 37

37

Results

75 million 384-dimensional GIST descriptors

slide-38
SLIDE 38

38

Results of Image Retrieval

  • Collaborated with Adobe
  • 11M images
  • Uses deep neural nets for image representations
  • Takes only 35 ms on a single CPU thread
slide-39
SLIDE 39

39

Supervised Hashing

  • Utilize image labels
  • Conducted by using deep learning
slide-40
SLIDE 40

40

Supervised hashing for image retrieval via image representation learning, AAAI 14

  • First step: approximate hash codes
  • S (similarity matrix, i.e., 1 when two images i & j have the same label)
  • H (Hamming embedding, binary codes): the dot product between two similar codes gives 1
  • Minimize the reconstruction error between S and the similarity between codes
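
A minimal sketch of the reconstruction error in this first step. It assumes codes with entries in {-1, +1}, a similarity of -1 for images with different labels, and a dot product normalized by the code length q so that identical codes give 1; these conventions and all names are assumptions for illustration. The optimization that finds H is not shown.

```python
def similarity_matrix(labels):
    """S[i][j] = 1 if images i and j share a label, else -1 (assumed convention)."""
    return [[1 if a == b else -1 for b in labels] for a in labels]

def reconstruction_error(S, H):
    """Sum of squared differences between S and the normalized code dot products."""
    q = len(H[0])  # code length
    err = 0.0
    for i, hi in enumerate(H):
        for j, hj in enumerate(H):
            approx = sum(x * y for x, y in zip(hi, hj)) / q
            err += (S[i][j] - approx) ** 2
    return err

labels = [0, 0, 1]  # hypothetical labels for three images
S = similarity_matrix(labels)

good_codes = [[1, 1], [1, 1], [-1, -1]]  # same label -> same code
bad_codes = [[1, 1], [1, -1], [-1, -1]]  # second image's code disagrees with the first
err_good = reconstruction_error(S, good_codes)
err_bad = reconstruction_error(S, bad_codes)
```

Codes that agree with the labels reproduce S exactly, while the mismatched code raises the error.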

slide-41
SLIDE 41

41

Supervised hashing for image retrieval via image representation learning, AAAI 14

  • Second step: learning image features and hash functions
  • Train AlexNet using the approximate hash codes as targets, optionally together with class labels
  • Once the network is trained, it is used for test images

slide-42
SLIDE 42

42

Class Objectives were:

  • Understand the basic hashing techniques based on hyperplanes
  • Unsupervised approach
  • Supervised approach using deep learning
  • Codes are available: http://sglab.kaist.ac.kr/software.htm

slide-43
SLIDE 43

43

Homework for Every Class

  • Go over the next lecture slides
  • Come up with one question on what we have discussed today
  • Write questions three times
  • Go over recent papers on image search, and submit a summary of them before Tuesday's class

slide-44
SLIDE 44

44

Next Time…

  • CNN based image search techniques
slide-45
SLIDE 45

45

[Figure: hyper-plane partition with region codes 111, 011, 010, 110, 000, 100, and hyper-sphere partition with region codes 111, 011, 010, 110, 000, 100, 001, 101]