Hashing Techniques (Sung-Eui Yoon) Professor KAIST - PowerPoint PPT Presentation

Hashing Techniques 윤성의 (Sung-Eui Yoon) Professor KAIST http://sgvr.kaist.ac.kr

Student Presentation Guidelines ● Good summary, not full detail, of the paper ● Talk about motivations of the work ● Give a broad background on the related work ● Explain main idea and results of the paper ● Discuss strengths and weaknesses of the method ● Prepare an overview slide ● Talk about the most important things and connect them well 2

High-Level Ideas ● Deliver the most important ideas and results ● Do not talk about minor details ● Give enough background instead ● A deeper understanding of the paper is required ● Go over at least two related papers and explain them in a few slides ● Spend most of your time figuring out the most important things and preparing good slides for them 3

Deliver Main Ideas of the Paper ● Identify main ideas/contributions of the paper and deliver them ● If there are prior techniques that you need to understand, study those prior techniques and explain them ● For example, suppose paper A uses paper B’s technique in its main idea. In this case, you need to explain B in order to explain A well. 4

Be Honest ● Do not skip important ideas that you don’t know ● Explain as much as you know and mention that you don’t understand some parts ● If you get questions that you cannot answer well, just say so ● In the end, you need to answer them on the KLMS board before the semester ends 5

Result Presentation ● Give full experiment settings and present data with the related information ● What does the x-axis mean in the below image? ● After showing the data, give the message that we can draw from the data ● Show images/videos, if there are any 6

Utilizing Existing Resources ● Use the authors’ slides, code, and videos, if they exist ● Give proper credits or citations ● Without them, you are cheating! 7

Audience feedback form
Date:
Talk title:
Speaker:
1. Was the talk well organized and well prepared?
   5: Excellent 4: Good 3: Okay 2: Less than average 1: Poor
2. Was the talk comprehensible? How well were important concepts covered?
   5: Excellent 4: Good 3: Okay 2: Less than average 1: Poor
Any comments to the speaker:
8

Prepare Quiz ● Review most important concepts of your talk ● Prepare two multiple-choice questions ● Example: What is the biased algorithm? ● A: Given N samples, the expected mean of the estimator is I ● B: Given N samples, the expected mean of the estimator is I + e ● C: Given N samples, the expected mean of the estimator is I + e, where e goes to zero as N goes to infinity ● Grade them on a scale of 0 to 10 and send them to the TA 9

Class Objectives ● Understand the basic hashing techniques based on hyperplanes ● Unsupervised approach ● Supervised approach using deep learning ● In the last class: ● Discussed re-ranking methods: spatial verification and query expansion ● Talked about inverted index 10

Questions ● When we talk about accuracy, I don't understand why we only think about the accuracy of matching visual points/patches/features. I think we should also be concerned with finding images with a similar style, images with a similar emotion, images reflecting a similar activity... 11

Review of Basic Image Search [Figure labels: feature space, inverted file, near cluster … search, shortlist, re-ranking] Ack.: Dr. Heo 12

Image Search Finding visually similar images 13

Image Descriptor High dimensional point (BoW, GIST, Color Histogram, etc.) 14

Image Descriptor High dimensional point Nearest neighbor search (NNS) (BoW, GIST, Color Histogram, etc.) in high dimensional space 15

Challenge
             BoW      CNN
Dimensions   1000+    4000+
1 image      4 KB+    16 KB+
1B images    4 TB+    16 TB+
16
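The sizes on this slide can be reproduced with simple arithmetic. The sketch below assumes each dimension is stored as a 4-byte float, which the slide does not state, and uses a hypothetical 64-bit binary code for comparison; the function names are illustrative only.

```python
BYTES_PER_VALUE = 4  # assumption: each dimension is a 32-bit float


def descriptor_bytes(dimensions, bytes_per_value=BYTES_PER_VALUE):
    """Storage needed for one descriptor, in bytes."""
    return dimensions * bytes_per_value


def collection_bytes(dimensions, num_images, bytes_per_value=BYTES_PER_VALUE):
    """Storage needed for one descriptor per image, in bytes."""
    return descriptor_bytes(dimensions, bytes_per_value) * num_images


def binary_code_bytes(num_bits, num_images):
    """Storage when each image is reduced to a num_bits binary code."""
    bytes_per_code = (num_bits + 7) // 8  # round up to whole bytes
    return bytes_per_code * num_images


if __name__ == "__main__":
    one_billion = 10**9
    print("BoW, 1 image:", descriptor_bytes(1000), "bytes")
    print("BoW, 1B images:", collection_bytes(1000, one_billion), "bytes")
    print("CNN, 1B images:", collection_bytes(4000, one_billion), "bytes")
    # 64 bits is a hypothetical code length, not a number from the slides.
    print("64-bit codes, 1B images:", binary_code_bytes(64, one_billion), "bytes")
```

Under these assumptions a 1000-dimensional descriptor takes 4000 bytes, so one billion of them take 4 x 10^12 bytes, matching the 4 TB+ entry.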

Binary Code 00001 11000 00011 11000 00111 11001 17

Binary Code 11000 00001 11000 00011 11001 00111 * Benefits - Compression - Very fast distance computation (Hamming Distance, XOR) 18

Hyper-Plane based Binary Coding 0 1 19

Hyper-Plane based Binary Coding [Figure: hyper-planes split the space into regions labeled with 3-bit codes such as 011, 010, 000, 110, 111, 100] 20
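A minimal sketch of hyper-plane based coding: each hyper-plane contributes one bit, set by which side of the plane the point falls on. The three planes below are hand-picked for illustration and are an assumption, not the planes in the figure; `encode_hyperplane` is a hypothetical name.

```python
def encode_hyperplane(point, hyperplanes):
    """Return a bit string with one bit per hyper-plane.

    Each hyper-plane is a (normal, offset) pair. The bit is 1 when
    normal . point - offset >= 0, and 0 otherwise.
    """
    bits = []
    for normal, offset in hyperplanes:
        side = sum(w * x for w, x in zip(normal, point)) - offset
        bits.append("1" if side >= 0 else "0")
    return "".join(bits)


if __name__ == "__main__":
    # Illustrative 2D planes: x = 0, y = 0, and x + y = 1.
    planes = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((1.0, 1.0), 1.0)]
    for p in [(2.0, 2.0), (0.2, 0.3), (-1.0, -1.0)]:
        print(p, encode_hyperplane(p, planes))
```

How the planes are chosen is exactly where hashing methods differ; this sketch only shows the encoding step once the planes are fixed.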

Distance between Two Points ● Measured by bit differences, known as Hamming distance ● Efficiently computed by XOR bit operations 21
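As a small illustration of the XOR trick, the sketch below stores each binary code as a Python integer, XORs two codes, and counts the set bits. The helper names are hypothetical, and the example codes are the 5-bit strings from the Binary Code slide.

```python
def hamming_distance(a, b):
    """Number of differing bits between two codes stored as integers."""
    return bin(a ^ b).count("1")  # XOR marks differing bits, then count them


def nearest_codes(query, database):
    """Return database codes sorted by Hamming distance to the query."""
    return sorted(database, key=lambda code: hamming_distance(query, code))


if __name__ == "__main__":
    database = [int(s, 2) for s in ["00001", "11000", "00011", "00111", "11001"]]
    query = int("11000", 2)
    for code in nearest_codes(query, database):
        print(format(code, "05b"), hamming_distance(query, code))
```

On most hardware the XOR and bit count map to a couple of machine instructions per word, which is why the comparison is so much cheaper than a floating-point distance in the original feature space.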

Good and Bad Hyper-Planes Previous work focused on how to determine good hyper-planes 22

Components of Spherical Hashing ● Spherical hashing ● Hyper-sphere setting strategy ● Spherical Hamming distance 23

Spherical Hashing [Heo et al., CVPR 12] 1 0 25

Spherical Hashing [Heo et al., CVPR 12] [Figure: hyper-spheres split the space into regions labeled with the 3-bit codes 101, 001, 100, 111, 011, 110, 000, 010] 26
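A minimal sketch of the encoding step in spherical hashing: each hyper-sphere contributes one bit, which is 1 when the point lies inside the sphere and 0 otherwise. The centers and radii below are hand-picked assumptions for illustration, and `encode_spherical` is a hypothetical name.

```python
import math


def euclid(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def encode_spherical(point, spheres):
    """Return a bit string with one bit per (center, radius) sphere.

    A bit is 1 when the point is inside or on the sphere, else 0.
    """
    return "".join(
        "1" if euclid(point, center) <= radius else "0"
        for center, radius in spheres
    )


if __name__ == "__main__":
    # Illustrative 2D spheres, not taken from the figure.
    spheres = [((0.0, 0.0), 1.0), ((1.0, 0.0), 1.0), ((5.0, 5.0), 1.0)]
    for p in [(0.5, 0.0), (-0.8, 0.0), (10.0, 10.0)]:
        print(p, encode_spherical(p, spheres))
```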

Hyper-Sphere vs Hyper-Plane ● Hyper-plane partitions can be open (unbounded); hyper-sphere partitions are closed ● Average of maximum distances within a partition: hyper-spheres give a tighter bound! 27

Good Binary Coding [Weiss 2008, He 2011] 1. Balanced partitioning 2. Independence 29
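One way to read these two criteria, sketched here as an assumption rather than as the cited papers' exact formulation: a bit is balanced when about half of the data points have it set to 1, and two bits are independent when about a quarter of the points have both set to 1. The helper names are hypothetical.

```python
from itertools import combinations


def bit_balance(codes):
    """Fraction of codes with a 1 at each bit position (ideal: 0.5)."""
    n = len(codes)
    num_bits = len(codes[0])
    return [sum(c[k] == "1" for c in codes) / n for k in range(num_bits)]


def pairwise_joint(codes):
    """Fraction of codes with 1 at both positions i and j (ideal: 0.25)."""
    n = len(codes)
    num_bits = len(codes[0])
    return {
        (i, j): sum(c[i] == "1" and c[j] == "1" for c in codes) / n
        for i, j in combinations(range(num_bits), 2)
    }


if __name__ == "__main__":
    codes = ["00", "01", "10", "11"]  # a perfectly balanced, independent toy set
    print(bit_balance(codes))
    print(pairwise_joint(codes))
```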

Intuition of Hyper-Sphere Setting 1. Balance 2. Independence 30

Hyper-Sphere Setting Process Iteratively repeat steps 1 and 2 until convergence.
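The slide does not spell out the two steps, so the following is only a simplified sketch under stated assumptions: one step sets each radius so that about half of the points fall inside the sphere (balance), and the other moves pairs of centers apart or together until each pair shares about a quarter of the points (independence). The fixed iteration count, the update rule, and all names are assumptions, not the authors' implementation.

```python
import math
import random
from itertools import combinations


def euclid(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def balanced_radius(points, center):
    """Radius that puts about half of the points inside the sphere."""
    dists = sorted(euclid(p, center) for p in points)
    return dists[(len(dists) - 1) // 2]


def fit_spheres(points, num_bits, iterations=20, seed=0):
    """Return a list of (center, radius) pairs, one per bit."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    centers = [list(p) for p in rng.sample(points, num_bits)]
    for _ in range(iterations):
        # Balance: choose each radius from the median distance.
        radii = [balanced_radius(points, c) for c in centers]
        inside = [[euclid(p, c) <= r for p in points]
                  for c, r in zip(centers, radii)]
        # Independence: push apart spheres that share too many points,
        # pull together spheres that share too few (target is n / 4).
        moves = [[0.0] * dim for _ in centers]
        for i, j in combinations(range(num_bits), 2):
            overlap = sum(a and b for a, b in zip(inside[i], inside[j]))
            strength = 0.5 * (overlap - n / 4) / (n / 4)
            for d in range(dim):
                delta = strength * (centers[i][d] - centers[j][d])
                moves[i][d] += delta
                moves[j][d] -= delta
        for c, m in zip(centers, moves):
            for d in range(dim):
                c[d] += m[d] / num_bits
    radii = [balanced_radius(points, c) for c in centers]
    return [(tuple(c), r) for c, r in zip(centers, radii)]
```

A real implementation would stop when the pairwise overlaps are close enough to the target rather than after a fixed number of iterations.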

Max Distance and Common ‘1’ Common ‘1’s : 1 101 001 100 111 011 110 010 33

Max Distance and Common ‘1’ Common ‘1’s : 2 101 111 011 110 34

Max Distance and Common ‘1’ Common ‘1’s: 1 Common ‘1’s: 2 Average of maximum distances between two partitions: decreases as the number of common ‘1’s increases 35

Spherical Hamming Distance (SHD) SHD: Hamming Distance divided by the number of common ‘1’s. 36
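The definition above can be sketched with two bit operations: XOR gives the differing bits and AND gives the common ‘1’s. Codes are stored as Python integers; the function name is hypothetical, and replacing a zero denominator with a small constant `eps` is an assumption made here to avoid division by zero.

```python
def spherical_hamming_distance(a, b, eps=0.1):
    """Hamming distance divided by the number of common 1 bits.

    eps stands in for the denominator when the codes share no 1 bits
    (an assumption of this sketch).
    """
    differing = bin(a ^ b).count("1")    # bits that differ
    common_ones = bin(a & b).count("1")  # bits set to 1 in both codes
    return differing / (common_ones if common_ones else eps)


if __name__ == "__main__":
    print(spherical_hamming_distance(0b101, 0b111))  # 1 differing, 2 common
    print(spherical_hamming_distance(0b100, 0b110))  # 1 differing, 1 common
    print(spherical_hamming_distance(0b100, 0b010))  # no common 1 bits
```

Two pairs with the same Hamming distance are ranked differently when one pair shares more ‘1’s, which matches the observation that more common ‘1’s imply a tighter distance bound.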

Results: 75 million 384-dimensional GIST descriptors 37

Results of Image Retrieval ● Collaborated with Adobe ● 11M images ● Use deep neural nets for image representations ● Takes only 35 ms on a single CPU thread 38

Supervised Hashing ● Utilize image labels ● Conducted by using deep learning 39

Supervised hashing for image retrieval via image representation learning, AAAI 14 ● First step: approximate hash codes ● S (similarity matrix): entry (i, j) is 1 when images i and j have the same label ● H (Hamming embedding, binary codes): the dot product between two similar codes gives 1 ● Minimize the reconstruction error between S and the similarity between codes 40
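A minimal sketch of the quantity being minimized in the first step. It assumes codes with entries in {-1, +1}, a dot product normalized by the code length so that identical codes give 1, and a similarity of -1 for images with different labels; the slide only states the value for the same-label case, so the -1 and the function names are assumptions.

```python
def similarity_matrix(labels):
    """S[i][j] is 1 when images i and j share a label, else -1 (assumed)."""
    return [[1 if a == b else -1 for b in labels] for a in labels]


def reconstruction_error(S, H):
    """Sum of squared differences between S and normalized code similarity.

    H is a list of codes with entries in {-1, +1}. The similarity of two
    codes is their dot product divided by the code length.
    """
    q = len(H[0])
    error = 0.0
    for i, hi in enumerate(H):
        for j, hj in enumerate(H):
            code_similarity = sum(a * b for a, b in zip(hi, hj)) / q
            error += (S[i][j] - code_similarity) ** 2
    return error


if __name__ == "__main__":
    S = similarity_matrix([0, 0, 1])
    good = [[1, 1], [1, 1], [-1, -1]]   # same label -> same code
    bad = [[1, 1], [1, -1], [-1, -1]]   # second image gets a mixed code
    print(reconstruction_error(S, good))
    print(reconstruction_error(S, bad))
```

Codes that agree with the labels reproduce S exactly, so the error is zero; any disagreement raises it. This sketch only evaluates the objective and does not show how the codes are optimized.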

Supervised hashing for image retrieval via image representation learning, AAAI 14 ● Second step: learning image features and hash functions ● Train AlexNet using the approximate target hash codes and, optionally, class labels ● Once the network is trained, it is applied to test images 41

Class Objectives were: ● Understand the basic hashing techniques based on hyperplanes ● Unsupervised approach ● Supervised approach using deep learning ● Code is available at http://sglab.kaist.ac.kr/software.htm 42

Homework for Every Class ● Go over the next lecture slides ● Come up with one question on what we have discussed today ● Write questions three times ● Go over recent papers on image search, and submit their summary before Tue. class 43

Next Time… ● CNN based image search techniques 44

