SLIDE 1

Learning to Hash with its Application to Big Data Retrieval and Mining

Wu-Jun Li

Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China

Joint work with Weihao Kong, Minyi Guo, and others. Dec 21, 2013

SLIDE 2

Outline

1. Introduction
   - Problem Definition
   - Existing Methods
2. Isotropic Hashing
   - Model
   - Learning
   - Experiment
3. Multiple-Bit Quantization
   - Double-Bit Quantization
   - Manhattan Quantization
4. Conclusion
5. Reference

SLIDE 3

Introduction

Outline

1. Introduction
   - Problem Definition
   - Existing Methods
2. Isotropic Hashing
   - Model
   - Learning
   - Experiment
3. Multiple-Bit Quantization
   - Double-Bit Quantization
   - Manhattan Quantization
4. Conclusion
5. Reference

SLIDE 4

Introduction Problem Definition

Nearest Neighbor Search (Retrieval)

- Given a query point q, return the points in the database (e.g., images) that are closest (most similar) to q.
- Underlies many machine learning, data mining, and information retrieval problems.
- Challenges in big data applications:
  - Curse of dimensionality
  - Storage cost
  - Query speed

SLIDE 5

Introduction Problem Definition

Similarity Preserving Hashing

SLIDE 6

Introduction Problem Definition

Reduce Dimensionality and Storage Cost

SLIDE 7

Introduction Problem Definition

Querying

Hamming distance: ||01101110, 00101101||_H = 3, ||11011, 01011||_H = 1
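Because the codes are binary, the Hamming distance is just an XOR followed by a bit count. A minimal Python sketch reproducing the two examples above:

```python
def hamming_distance(a: int, b: int) -> int:
    """Hamming distance between two equal-length binary codes packed into integers."""
    return bin(a ^ b).count("1")  # XOR marks the differing bits; count them

# The two examples from this slide:
print(hamming_distance(0b01101110, 0b00101101))  # 3
print(hamming_distance(0b11011, 0b01011))        # 1
```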

SLIDE 8

Introduction Problem Definition

Querying

SLIDE 9

Introduction Problem Definition

Querying

SLIDE 10

Introduction Problem Definition

Fast Query Speed

By using a hashing scheme, we can achieve constant or sub-linear search time. Even exhaustive search becomes acceptable, because computing distances between compact binary codes is now cheap.
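One common way to get such constant-time lookups, sketched below with illustrative toy codes (the table layout and probing radius are my choices, not something specified in the talk): index the database codes in a hash table and probe the query code together with every code within Hamming radius 1.

```python
from collections import defaultdict

def build_table(codes):
    """Map each binary code (stored as an int) to the ids of database items having that code."""
    table = defaultdict(list)
    for idx, code in enumerate(codes):
        table[code].append(idx)
    return table

def probe(table, query_code, num_bits, radius=1):
    """Return candidate ids whose codes equal the query code or differ from it in one bit."""
    candidates = list(table.get(query_code, []))
    if radius >= 1:
        for i in range(num_bits):                    # flip one bit at a time
            candidates.extend(table.get(query_code ^ (1 << i), []))
    return candidates

# Toy usage with 4-bit codes for five database items.
db_codes = [0b1010, 0b1011, 0b0110, 0b1110, 0b0001]
table = build_table(db_codes)
print(probe(table, 0b1010, num_bits=4))              # ids at Hamming distance <= 1 from the query
```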

SLIDE 11

Introduction Problem Definition

Two Stages of Hash Function Learning

Projection Stage (Dimension Reduction)
- Given a point x, each projected dimension i is associated with a real-valued projection function f_i(x) (e.g., f_i(x) = w_i^T x).

Quantization Stage
- Turn the real values into binary codes.
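A minimal numpy sketch of this two-stage pipeline, using random projections for the projection stage (the data-independent, LSH-style choice described on the next slide) and a zero threshold for single-bit quantization; the data, dimensions, and code length below are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 128))   # n = 1000 points with d = 128 features (placeholder data)
m = 32                             # code length in bits

# Projection stage: f_i(x) = w_i^T x with random w_i
W = rng.normal(size=(128, m))
projections = X @ W                # real-valued, shape (n, m)

# Quantization stage: turn the real values into binary (single-bit quantization, threshold 0)
codes = (projections > 0).astype(np.uint8)
print(codes.shape, codes[0])       # (1000, 32) and the 32-bit code of the first point
```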

SLIDE 12

Introduction Existing Methods

Data-Independent Methods

The hashing function family is defined independently of the training dataset:
- LSH: locality-sensitive hashing (Gionis et al., 1999; Andoni and Indyk, 2008) and its extensions (Datar et al., 2004; Kulis and Grauman, 2009; Kulis et al., 2009).
- SIKH: shift-invariant kernel hashing (Raginsky and Lazebnik, 2009).

Hashing functions: random projections.

SLIDE 13

Introduction Existing Methods

Data-Dependent Methods

Hashing functions are learned from a given training dataset.
- Relatively short codes.
- Seminal papers: (Salakhutdinov and Hinton, 2007, 2009; Torralba et al., 2008; Weiss et al., 2008).
- Two categories:
  - Unimodal
    - Supervised methods: given labels y_i or triplets (x_i, x_j, x_k)
    - Unsupervised methods
  - Multimodal
    - Supervised methods
    - Unsupervised methods

SLIDE 14

Introduction Existing Methods

(Unimodal) Unsupervised Methods

No labels to denote the categories of the training points.
- PCAH: principal component analysis hashing.
- SH: spectral hashing (Weiss et al., 2008); eigenfunctions computed from the data similarity graph.
- ITQ: iterative quantization (Gong and Lazebnik, 2011); an orthogonal rotation matrix refines the initial projection matrix learned by PCA.
- AGH: graph-based hashing (Liu et al., 2011).

SLIDE 15

Introduction Existing Methods

(Unimodal) Supervised (semi-supervised) Methods

Class labels or pairwise constraints:
- SSH: semi-supervised hashing (Wang et al., 2010a,b) exploits both labeled and unlabeled data for hash function learning.
- MLH: minimal loss hashing (Norouzi and Fleet, 2011), based on the latent structural SVM framework.
- KSH: kernel-based supervised hashing (Liu et al., 2012).
- LDAHash: linear discriminant analysis based hashing (Strecha et al., 2012).

Triplet-based methods:
- HDML: Hamming distance metric learning (Norouzi et al., 2012).
- CGHash: column generation based hashing (Li et al., 2013).

SLIDE 16

Introduction Existing Methods

Multimodal Methods

- Multi-Source Hashing
- Cross-Modal Hashing

SLIDE 17

Introduction Existing Methods

Multi-Source Hashing

- Aims to learn better codes than unimodal hashing by leveraging auxiliary views.
- Assumes that all views are provided for a query, which is typically not feasible in many multimedia applications.
- Multiple Feature Hashing (Song et al., 2011)
- Composite Hashing (Zhang et al., 2011)

SLIDE 18

Introduction Existing Methods

Cross-Modal Hashing

Given a query in either modality (image or text), return the images or texts similar to it.
- CVH: cross-view hashing (Kumar and Udupa, 2011)
- MLBE: multimodal latent binary embedding (Zhen and Yeung, 2012a)
- CRH: co-regularized hashing (Zhen and Yeung, 2012b)
- IMH: inter-media hashing (Song et al., 2013)
- RaHH: relation-aware heterogeneous hashing (Ou et al., 2013)

SLIDE 19

Introduction Existing Methods

Related Research Groups

- FDU: Yugang Jiang, Xuanjing Huang
- HKUST: Dit-Yan Yeung
- IA-CAS: Cheng-Lin Liu, Yan-Ming Zhang
- ICT-CAS: Hong Chang
- MSRA: Kaiming He, Jian Sun, Jingdong Wang
- NUST: Fumin Shen
- SYSU: Weishi Zheng
- Tsinghua: Peng Cui, Shiqiang Yang, Wenwu Zhu
- ZJU: Jiajun Bu, Deng Cai, Xiaofei He, Yueting Zhuang
- ......

SLIDE 20

Isotropic Hashing

Outline

1. Introduction
   - Problem Definition
   - Existing Methods
2. Isotropic Hashing
   - Model
   - Learning
   - Experiment
3. Multiple-Bit Quantization
   - Double-Bit Quantization
   - Manhattan Quantization
4. Conclusion
5. Reference

SLIDE 21

Isotropic Hashing

Motivation

Problem: All existing methods use the same number of bits for different projected dimensions, even though the dimensions have different variances.

Possible solutions:
- Use a different number of bits for different dimensions (unfortunately, no effective way has been found).
- Make the variances isotropic (equal) across all dimensions.

SLIDE 22

Isotropic Hashing

Contribution

- Isotropic hashing (IsoHash) (Kong and Li, 2012b): hashing with isotropic variances for all dimensions.
- Multiple-bit quantization:
  1. Double-bit quantization (DBQ) (Kong and Li, 2012a): Hamming distance driven.
  2. Manhattan hashing (MH) (Kong et al., 2012): Manhattan distance driven.

SLIDE 23

Isotropic Hashing

PCA Hash

To generate a code of m bits, PCAH performs PCA on X and then uses the top m eigenvectors of the matrix XX^T as the columns of the projection matrix W ∈ R^{d×m}. Here, the top m eigenvectors are those corresponding to the m largest eigenvalues {λ_k}_{k=1}^m, arranged in non-increasing order λ_1 ≥ λ_2 ≥ ... ≥ λ_m.

Let λ = [λ_1, λ_2, ..., λ_m]^T. Then Λ = W^T X X^T W = diag(λ).

Define the hash function as h(x) = sgn(W^T x).
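A small numpy sketch of PCAH as described above, assuming X stores the zero-centered data points as columns (d × n), so that the top m eigenvectors of XX^T form the columns of W; the data here are placeholders:

```python
import numpy as np

def pcah_projection(X, m):
    """Top-m eigenvectors of X X^T as columns of W; X is d x n and zero-centered."""
    eigvals, eigvecs = np.linalg.eigh(X @ X.T)   # ascending eigenvalues of the d x d matrix
    order = np.argsort(eigvals)[::-1][:m]        # indices of the m largest eigenvalues
    W = eigvecs[:, order]                        # d x m projection matrix
    lam = eigvals[order]                         # lambda_1 >= ... >= lambda_m
    return W, lam

def pcah_hash(W, x):
    """h(x) = sgn(W^T x), written as a 0/1 code."""
    return (W.T @ x > 0).astype(np.uint8)

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5000))                  # placeholder data: d = 64, n = 5000
X -= X.mean(axis=1, keepdims=True)               # zero-center
W, lam = pcah_projection(X, m=16)
print(pcah_hash(W, X[:, 0]))                     # 16-bit code of the first point
# Note: diag(W^T X X^T W) = lam, so the per-dimension variances differ (the weakness discussed next).
```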

SLIDE 24

Isotropic Hashing

Weakness of PCA Hash

Using the same number of bits for different projected dimensions is unreasonable because larger-variance dimensions will carry more information.

SLIDE 25

Isotropic Hashing

Weakness of PCA Hash

Using the same number of bits for different projected dimensions is unreasonable because larger-variance dimensions will carry more information. Solve it by making variances equal (isotropic)!

SLIDE 26

Isotropic Hashing Model

Idea of IsoHash

Learn an orthogonal matrix Q ∈ R^{m×m} that makes Q^T W^T X X^T W Q a matrix with equal diagonal values.

Effect of Q: each projected dimension gets the same variance while the Euclidean distances between any two points remain unchanged.
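A quick numerical check of this effect, with placeholder data and an arbitrary orthogonal Q rather than the learned one: any orthogonal rotation of the projected data leaves pairwise Euclidean distances unchanged while redistributing the per-dimension variances; IsoHash looks for the particular Q that makes those variances equal.

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.normal(size=(1000, 4)) * np.array([3.0, 1.5, 0.8, 0.2])  # projected data, unequal variances
Q = np.linalg.qr(rng.normal(size=(4, 4)))[0]                     # an arbitrary orthogonal matrix
Z = Y @ Q                                                        # rotated projected data

print(Y.var(axis=0).round(2), Z.var(axis=0).round(2))            # per-dimension variances change
d_before = np.linalg.norm(Y[0] - Y[1])
d_after = np.linalg.norm(Z[0] - Z[1])
print(np.isclose(d_before, d_after))                             # True: distances are preserved
```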

SLIDE 27

Isotropic Hashing Model

Problem Definition

Since tr(Q^T W^T X X^T W Q) = tr(W^T X X^T W) = tr(Λ) = Σ_{i=1}^m λ_i, let a = [a_1, a_2, ..., a_m] with a_i = (Σ_{i=1}^m λ_i) / m, and let T(z) = {T ∈ R^{m×m} | diag(T) = diag(z)}.

Problem: The IsoHash problem is to find an orthogonal matrix Q making Q^T W^T X X^T W Q ∈ T(a).

SLIDE 28

Isotropic Hashing Model

IsoHash Formulation

Because Q^T Λ Q = Q^T [W^T X X^T W] Q, let M(Λ) = {Q^T Λ Q | Q ∈ O(m)}, where O(m) is the set of all orthogonal matrices in R^{m×m}. Then the IsoHash problem is equivalent to finding T and Z with ||T − Z||_F = 0, where T ∈ T(a), Z ∈ M(Λ), and || · ||_F denotes the Frobenius norm.

SLIDE 29

Isotropic Hashing Model

Existence Theorem

Lemma (Schur-Horn; Horn, 1954): Let c = {c_i} ∈ R^m and b = {b_i} ∈ R^m be real vectors in non-increasing order, i.e., c_1 ≥ c_2 ≥ ... ≥ c_m and b_1 ≥ b_2 ≥ ... ≥ b_m. There exists a Hermitian matrix H with eigenvalues c and diagonal values b if and only if

Σ_{i=1}^k b_i ≤ Σ_{i=1}^k c_i for any k = 1, 2, ..., m, and Σ_{i=1}^m b_i = Σ_{i=1}^m c_i.

So we can prove: there exists a solution to the IsoHash problem, and this solution lies in the intersection of T(a) and M(Λ).

SLIDE 30

Isotropic Hashing Learning

Learning Methods

Two methods (Chu, 1995):
- Lift and projection (LP)
- Gradient flow (GF)

SLIDE 31

Isotropic Hashing Learning

Lift and projection (LP)

SLIDE 32

Isotropic Hashing Learning

Gradient Flow

Objective function:

min_{Q ∈ O(m)} F(Q) = (1/2) ||diag(Q^T Λ Q) − diag(a)||_F^2.

SLIDE 33

Isotropic Hashing Learning

Gradient Flow

Objective function:

min_{Q ∈ O(m)} F(Q) = (1/2) ||diag(Q^T Λ Q) − diag(a)||_F^2.

The gradient of F at Q: ∇F(Q) = 2 Λ Q β(Q), where β(Q) = diag(Q^T Λ Q) − diag(a).

SLIDE 34

Isotropic Hashing Learning

Gradient Flow

Objective function:

min_{Q ∈ O(m)} F(Q) = (1/2) ||diag(Q^T Λ Q) − diag(a)||_F^2.

The gradient of F at Q: ∇F(Q) = 2 Λ Q β(Q), where β(Q) = diag(Q^T Λ Q) − diag(a).

The projection of ∇F(Q) onto the tangent space of O(m):

g(Q) = Q [Q^T Λ Q, β(Q)],

where [A, B] = AB − BA is the Lie bracket.

SLIDE 35

Isotropic Hashing Learning

Gradient Flow

The vector field dQ/dt = −g(Q) defines a steepest descent flow on the manifold O(m) for the function F(Q). Letting Z = Q^T Λ Q and α(Z) = β(Q), we get dZ/dt = [Z, [α(Z), Z]], an isospectral flow that moves so as to reduce the objective function F(Q).
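A numerical sketch of this flow (not the authors' implementation): discretize dQ/dt = −g(Q) = −Q[Q^T Λ Q, β(Q)] with a small step and a matrix-exponential retraction, which keeps Q orthogonal because the Lie bracket of the two symmetric matrices is skew-symmetric. The step size, iteration count, and toy eigenvalues below are arbitrary choices.

```python
import numpy as np
from scipy.linalg import expm

def isohash_gradient_flow(lam, steps=20000, eta=1e-3):
    """Find an orthogonal Q such that diag(Q^T Lambda Q) is (approximately) constant."""
    m = len(lam)
    Lam = np.diag(lam)
    target = np.full(m, lam.sum() / m)                 # a_i = (sum_j lambda_j) / m
    Q = np.linalg.qr(np.random.default_rng(0).normal(size=(m, m)))[0]  # random orthogonal start
    for _ in range(steps):
        Z = Q.T @ Lam @ Q
        beta = np.diag(np.diag(Z) - target)            # beta(Q), a diagonal matrix
        bracket = Z @ beta - beta @ Z                  # [Q^T Lambda Q, beta(Q)], skew-symmetric
        Q = Q @ expm(-eta * bracket)                   # retraction: Q stays in O(m)
    return Q

lam = np.array([5.0, 3.0, 1.5, 0.5])                   # toy eigenvalues from PCA
Q = isohash_gradient_flow(lam)
print(np.diag(Q.T @ np.diag(lam) @ Q))                 # should be close to tr(Lambda)/m = 2.5 each
print(np.allclose(Q.T @ Q, np.eye(4)))                 # True: Q remains orthogonal
```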

SLIDE 36

Isotropic Hashing Experiment

Accuracy (mAP)

CIFAR data set:

Method     32 bits   64 bits   96 bits   128 bits   256 bits
IsoHash    0.2249    0.2969    0.3256    0.3357     0.3651
PCAH       0.0319    0.0274    0.0241    0.0216     0.0168
ITQ        0.2490    0.3051    0.3238    0.3319     0.3436
SH         0.0510    0.0589    0.0802    0.1121     0.1535
SIKH       0.0353    0.0902    0.1245    0.1909     0.3614
LSH        0.1052    0.1907    0.2396    0.2776     0.3432

SLIDE 37

Isotropic Hashing Experiment

Training Time

Figure: training time (in seconds) versus the number of training points (up to 6 × 10^4) for IsoHash-GF, IsoHash-LP, ITQ, SH, SIKH, LSH, and PCAH.

SLIDE 38

Multiple-Bit Quantization

Outline

1. Introduction
   - Problem Definition
   - Existing Methods
2. Isotropic Hashing
   - Model
   - Learning
   - Experiment
3. Multiple-Bit Quantization
   - Double-Bit Quantization
   - Manhattan Quantization
4. Conclusion
5. Reference

SLIDE 39

Multiple-Bit Quantization Double-Bit Quantization

Double Bit Quantization

Figure: point distribution of the real values computed by PCA on the 22K LabelMe data set, and different coding results based on this distribution: (a) single-bit quantization (SBQ); (b) hierarchical hashing (HH) (Liu et al., 2011); (c) double-bit quantization (DBQ). The popular coding strategy SBQ adopts zero as the threshold.
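A sketch of the double-bit coding shown in panel (c), under the assumption that each projected dimension is split by two thresholds into three regions coded "01", "00", and "10", so adjacent regions differ in one bit and the two outer regions differ in two bits. How DBQ actually learns the thresholds from the point distribution is not shown, and the thresholds below are placeholders.

```python
def dbq_encode(value, t_low, t_high):
    """Double-bit code for one projected dimension, given thresholds t_low < t_high."""
    if value < t_low:
        return "01"
    elif value <= t_high:
        return "00"
    else:
        return "10"

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

print(dbq_encode(-0.8, -0.3, 0.3), dbq_encode(0.0, -0.3, 0.3), dbq_encode(0.9, -0.3, 0.3))
print(hamming("01", "00"), hamming("00", "10"), hamming("01", "10"))  # 1 1 2: outer regions are farthest
```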

SLIDE 40

Multiple-Bit Quantization Double-Bit Quantization

Experiment I

Precision-recall curve on 22K LabelMe data set

Figure: precision-recall curves for SH with SBQ, HH, and DBQ at 32, 64, 128, and 256 bits.

SLIDE 41

Multiple-Bit Quantization Double-Bit Quantization

Experiment II

mAP on LabelMe data set

# bits          32                          64
        SBQ     HH      DBQ         SBQ     HH      DBQ
ITQ     0.2926  0.2592  0.3079      0.3413  0.3487  0.4002
SH      0.0859  0.1329  0.1815      0.1071  0.1768  0.2649
PCA     0.0535  0.1009  0.1563      0.0417  0.1034  0.1822
LSH     0.1657  0.105   0.12272     0.2594  0.2089  0.2577
SIKH    0.0590  0.0712  0.0772      0.1132  0.1514  0.1737

# bits          128                         256
        SBQ     HH      DBQ         SBQ     HH      DBQ
ITQ     0.3675  0.4032  0.4650      0.3846  0.4251  0.4998
SH      0.1730  0.2034  0.3403      0.2140  0.2468  0.3468
PCA     0.0323  0.1083  0.1748      0.0245  0.1103  0.1499
LSH     0.3579  0.3311  0.4055      0.4158  0.4359  0.5154
SIKH    0.2792  0.3147  0.3436      0.4759  0.5055  0.5325

SLIDE 42

Multiple-Bit Quantization Manhattan Quantization

Quantization Stage

SLIDE 43

Multiple-Bit Quantization Manhattan Quantization

Natural Binary Code (NBC)

SLIDE 44

Multiple-Bit Quantization Manhattan Quantization

Manhattan Distance

Let x = [x_1, x_2, ..., x_d]^T and y = [y_1, y_2, ..., y_d]^T. The Manhattan distance between x and y is defined as

d_m(x, y) = Σ_{i=1}^d |x_i − y_i|,

where |x| denotes the absolute value of x.

SLIDE 45

Multiple-Bit Quantization Manhattan Quantization

Manhattan Distance Driven Quantization

We divide each projected dimension into 2^q regions and then use q bits of natural binary code to encode the index of each region.

SLIDE 46

Multiple-Bit Quantization Manhattan Quantization

Manhattan Distance Driven Quantization

We divide each projected dimension into 2^q regions and then use q bits of natural binary code to encode the index of each region. For example, if q = 3, the indices of the regions are {0, 1, 2, 3, 4, 5, 6, 7} and the natural binary codes are {000, 001, 010, 011, 100, 101, 110, 111}.

SLIDE 47

Multiple-Bit Quantization Manhattan Quantization

Manhattan Distance Driven Quantization

Manhattan quantization (MQ) with q bits per dimension is denoted as q-MQ. For example, if q = 2, d_m(000100, 110000) = d_d(00, 11) + d_d(01, 00) + d_d(00, 00) = 3 + 1 + 0 = 4, where d_d denotes the decimal (absolute) distance between the region indices encoded by the natural binary codes.
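A sketch of the q-MQ distance computation reproducing the example above: each q-bit group is decoded back to its region index and the absolute index differences are summed, so the distance is Manhattan in index space rather than Hamming in code space. The threshold-learning part of Manhattan hashing is not shown.

```python
def manhattan_code_distance(code_a: str, code_b: str, q: int) -> int:
    """Manhattan distance between two concatenated q-bit natural binary codes."""
    assert len(code_a) == len(code_b) and len(code_a) % q == 0
    dist = 0
    for i in range(0, len(code_a), q):
        # Decode each q-bit group back to its region index, then take the absolute difference.
        dist += abs(int(code_a[i:i + q], 2) - int(code_b[i:i + q], 2))
    return dist

# The example from this slide (q = 2):
print(manhattan_code_distance("000100", "110000", q=2))  # 3 + 1 + 0 = 4
```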

SLIDE 48

Multiple-Bit Quantization Manhattan Quantization

Experiment I

Figure: precision-recall curves on the 22K LabelMe data set for SH and SIKH with SBQ, HQ, and 2-MQ at 32, 64, 128, and 256 bits.

SLIDE 49

Multiple-Bit Quantization Manhattan Quantization

Experiment II

Table: mAP on the ANN SIFT1M data set. Under every setting here, 2-MQ gives the best mAP among SBQ, HQ, and 2-MQ.

# bits          32                          64                          96
        SBQ     HQ      2-MQ        SBQ     HQ      2-MQ        SBQ     HQ      2-MQ
ITQ     0.1657  0.2500  0.2750      0.4641  0.4745  0.5087      0.5424  0.5871  0.6263
SIKH    0.0394  0.0217  0.0570      0.2027  0.0822  0.2356      0.2263  0.1664  0.2768
LSH     0.1163  0.0961  0.1173      0.2340  0.2815  0.3111      0.3767  0.4541  0.4599
SH      0.0889  0.2482  0.2771      0.1828  0.3841  0.4576      0.2236  0.4911  0.5929
PCA     0.1087  0.2408  0.2882      0.1671  0.3956  0.4683      0.1625  0.4927  0.5641

SLIDE 50

Conclusion

Outline

1. Introduction
   - Problem Definition
   - Existing Methods
2. Isotropic Hashing
   - Model
   - Learning
   - Experiment
3. Multiple-Bit Quantization
   - Double-Bit Quantization
   - Manhattan Quantization
4. Conclusion
5. Reference

SLIDE 51

Conclusion

Conclusion

- Hashing can significantly improve search speed and reduce storage cost.
- Projections with isotropic variances are better than those with anisotropic variances (IsoHash).
- The quantization stage is at least as important as the projection stage (DBQ/MQ).

SLIDE 52

Conclusion

Q & A

Thanks! Questions? Code available at http://www.cs.sjtu.edu.cn/~liwujun

SLIDE 53

Reference

Outline

1. Introduction
   - Problem Definition
   - Existing Methods
2. Isotropic Hashing
   - Model
   - Learning
   - Experiment
3. Multiple-Bit Quantization
   - Double-Bit Quantization
   - Manhattan Quantization
4. Conclusion
5. Reference

SLIDE 54

References

A. Andoni and P. Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Commun. ACM, 51(1):117–122, 2008.

M. Chu. Constructing a Hermitian matrix from its diagonal entries and eigenvalues. SIAM Journal on Matrix Analysis and Applications, 16(1):207–217, 1995.

M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In Proceedings of the ACM Symposium on Computational Geometry, 2004.

A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. In Proceedings of International Conference on Very Large Data Bases, 1999.

Y. Gong and S. Lazebnik. Iterative quantization: A procrustean approach to learning binary codes. In Proceedings of Computer Vision and Pattern Recognition, 2011.

A. Horn. Doubly stochastic matrices and the diagonal of a rotation matrix. American Journal of Mathematics, 76(3):620–630, 1954.

SLIDE 55

References

W. Kong and W.-J. Li. Double-bit quantization for hashing. In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI), 2012a.

W. Kong and W.-J. Li. Isotropic hashing. In Proceedings of the 26th Annual Conference on Neural Information Processing Systems (NIPS), 2012b.

W. Kong, W.-J. Li, and M. Guo. Manhattan hashing for large-scale image retrieval. In The 35th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2012.

B. Kulis and K. Grauman. Kernelized locality-sensitive hashing for scalable image search. In Proceedings of International Conference on Computer Vision, 2009.

B. Kulis, P. Jain, and K. Grauman. Fast similarity search for learned metrics. IEEE Trans. Pattern Anal. Mach. Intell., 31(12):2143–2157, 2009.

S. Kumar and R. Udupa. Learning hash functions for cross-view similarity search. In IJCAI, pages 1360–1365, 2011.

SLIDE 56

References

X. Li, G. Lin, C. Shen, A. van den Hengel, and A. R. Dick. Learning hash functions using column generation. In ICML, 2013.

W. Liu, J. Wang, S. Kumar, and S.-F. Chang. Hashing with graphs. In Proceedings of International Conference on Machine Learning, 2011.

W. Liu, J. Wang, R. Ji, Y.-G. Jiang, and S.-F. Chang. Supervised hashing with kernels. In CVPR, pages 2074–2081, 2012.

M. Norouzi and D. J. Fleet. Minimal loss hashing for compact binary codes. In Proceedings of International Conference on Machine Learning, 2011.

M. Norouzi, D. J. Fleet, and R. Salakhutdinov. Hamming distance metric learning. In NIPS, pages 1070–1078, 2012.

M. Ou, P. Cui, F. Wang, J. Wang, W. Zhu, and S. Yang. Comparing apples to oranges: a scalable solution with heterogeneous hashing. In KDD, pages 230–238, 2013.

M. Raginsky and S. Lazebnik. Locality-sensitive binary codes from shift-invariant kernels. In Proceedings of Neural Information Processing Systems, 2009.

SLIDE 57

References

R. Salakhutdinov and G. Hinton. Semantic hashing. In SIGIR Workshop on Information Retrieval and Applications of Graphical Models, 2007.

R. Salakhutdinov and G. E. Hinton. Semantic hashing. Int. J. Approx. Reasoning, 50(7):969–978, 2009.

J. Song, Y. Yang, Z. Huang, H. T. Shen, and R. Hong. Multiple feature hashing for real-time large scale near-duplicate video retrieval. In ACM Multimedia, pages 423–432, 2011.

J. Song, Y. Yang, Y. Yang, Z. Huang, and H. T. Shen. Inter-media hashing for large-scale retrieval from heterogeneous data sources. In SIGMOD Conference, pages 785–796, 2013.

C. Strecha, A. A. Bronstein, M. M. Bronstein, and P. Fua. LDAHash: Improved matching with smaller descriptors. IEEE Trans. Pattern Anal. Mach. Intell., 34(1):66–78, 2012.

A. Torralba, R. Fergus, and Y. Weiss. Small codes and large image databases for recognition. In Proceedings of Computer Vision and Pattern Recognition, 2008.

J. Wang, S. Kumar, and S.-F. Chang. Sequential projection learning for hashing with compact codes. In Proceedings of International Conference on Machine Learning, 2010a.

SLIDE 58

J. Wang, S. Kumar, and S.-F. Chang. Semi-supervised hashing for large-scale image retrieval. In Proceedings of Computer Vision and Pattern Recognition, 2010b.

Y. Weiss, A. Torralba, and R. Fergus. Spectral hashing. In Proceedings of Neural Information Processing Systems, 2008.

D. Zhang, F. Wang, and L. Si. Composite hashing with multiple information sources. In Proceedings of International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011.

Y. Zhen and D.-Y. Yeung. A probabilistic model for multimodal hash function learning. In KDD, pages 940–948, 2012a.

Y. Zhen and D.-Y. Yeung. Co-regularized hashing for multimodal data. In NIPS, pages 1385–1393, 2012b.