Discrete Hashing: Fast, Scalable Retrieval and Classification
Fumin Shen, Center for Future Media, University of Electronic Science and Technology of China
Outline
- Introduction to Hashing
- Discrete optimization for Hashing
- Applications of Discrete Hashing
- Classification by Hamming Retrieval
Background: Hashing
Extremely fast!
Hamming distance
- Locality-Sensitive Hashing (LSH):
[Gionis, Indyk, and Motwani 1999], [Datar et al. 2004], etc.
[Figure: a random hash function maps each data vector and the query to short binary codes (e.g., 101), which are compared by Hamming distance]
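The random-hash-function idea above can be sketched in a few lines of NumPy (a minimal illustration; the bit width, seed, and function names are my own, not from the slides): each bit is the sign of a random projection, and codes are compared by Hamming distance.

```python
import numpy as np

def lsh_codes(X, n_bits=8, seed=0):
    """Random-hyperplane LSH (sketch): each bit is the sign of a random
    projection, so cosine-similar vectors tend to get similar codes."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_bits))  # one random hyperplane per bit
    return (X @ W > 0).astype(np.uint8)

def hamming(a, b):
    """Hamming distance between two binary codes."""
    return int(np.count_nonzero(a != b))
```

Similar vectors agree on most bits, so candidate retrieval reduces to cheap Hamming comparisons (or hash-table lookups) instead of full distance computations.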
Background: Hashing
Recent study: learn to hash
Learned from data
Background: Hashing
The main application: Approximate Nearest Neighbor Search (ANNS)
[Figure: a query retrieves visually relevant images from the image database]
LSH
- LSH has many variants covering different similarities: cosine similarity, Gaussian kernels, general kernels (KLSH), etc.
- Good: sublinear search time, O(n^ρ) with ρ < 1.
- Bad: long hash codes and hundreds of hash tables (large memory footprint).
Learning based Hashing
Unsupervised Hashing
Learning binary codes preserving data similarities
- PCAH: generate W by principal component analysis (PCA)
- SH: (Weiss et al., 2008) introduce unsupervised graph hashing
- ITQ: (Gong and Lazebnik, 2011) orthogonal rotation matrix to
refine the initial projection matrix by PCA
- AGH: (Liu et al., 2011) solve SH by anchor graphs
- IMH: (Shen et al., 2013) generate binary codes from general
data manifolds
- DGH: (Liu et al., 2014) Solve SH by discrete optimization
- AIBC: (Shen et al., 2015) asymmetric hashing
- …
Supervised Hashing
Learning binary codes supervised by pointwise or pairwise/ranking labels
- SSH: (Wang et al., 2010) exploits both labeled and
unlabeled data for hashing
- MLH: (Norouzi and Fleet, 2011) based on structural SVM
- KSH: (Liu et al., 2012) kernel based supervised hashing
- FastH: (Lin et al., 2014) solve hashing by Graph cuts
- SDH: (Shen et al., 2015) generate binary codes by discrete
optimization
- COSDISH: (Kang, et al., 2016) column sampling based
discrete supervised hashing
- DSeRH (Liu et al., 2017) deep ranking hashing
- …
Deep learning based Hashing
- Lots of supervised methods
- DAPH (Shen et al., MM’17)
- DSeRH (Liu et al., CVPR’17)
- DPSH (Li et al., IJCAI’16)
- VDSH (Zhang et al., CVPR’16)
- DSH (Liu et al., CVPR’16)
- CNNH (Xia et al., AAAI’15)
- Very few unsupervised ones
- DH (Liong et al., CVPR’15)
- Deepbit (Lin et al., CVPR’16)
- UH-BDNN (Do et al., ECCV’16)
Deep vs. Shallow
Deep learning boosts supervised hashing; unsupervised deep hashing still has a long way to go.

Method   ITQ     IMH     CNN+ITQ   DH      UH-BDNN
MAP      17.76   18.38   0.255     16.62   18.35
Manifold learning vs. Hashing
Optimal hash codes
Spectral Hashing
- Very similar formulation
- Key difference: discrete constraint
The hashing problem
- Mixed integer program; generally NP-hard
- Difficult to optimize due to the discrete variables
Solution in literature
- Step 1: Relaxation -- discard the discrete constraints
- Mimic sign function by continuous Sigmoid
- Hard to achieve good (local) optima
- Step 2: Rounding – thresholding after learning
- Quantization techniques: ITQ (Gong and Lazebnik, 2011)
- Increasing quantization distortion with long hash codes
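The two-step relax-then-round recipe above can be sketched as follows — a PCAH-style illustration I am assuming for concreteness (relax the binary constraint, solve the continuous problem by PCA, then threshold at zero):

```python
import numpy as np

def relax_and_round(X, n_bits=16):
    """Relax-then-round baseline (sketch).
    Step 1 (relaxation): drop the discrete constraint and solve the
    continuous problem -- here, top principal directions of the data.
    Step 2 (rounding): threshold the projections at zero to get bits."""
    Xc = X - X.mean(axis=0)                      # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    proj = Xc @ Vt[:n_bits].T                    # continuous embedding
    return (proj > 0).astype(np.uint8)           # rounding step
```

The rounding step is exactly where quantization distortion enters, and it grows with code length — the motivation for direct discrete optimization on the following slides.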
Our solution
(I) Supervised Discrete Hashing
- F. Shen, C. Shen, W. Liu, H. T. Shen, “Supervised Discrete Hashing”, CVPR’15.
Formulation: joint learning of the binary codes (B), the feature representation (F), and the linear classifier (W), supervised by the ground-truth label matrix
Algorithm: Alternating minimization until convergence
- solve the W-subproblem (multi-class classification);
- solve the F-subproblem (feature learning);
- solve the B-subproblem (hash learning) – the key problem
(I) Supervised Discrete Hashing
Algorithm: Discrete Cyclic Coordinate descent (DCC)
learn bit-by-bit
Optimal, closed-form solution in each iteration!
The key binary code optimization problem
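The bit-by-bit DCC idea can be sketched on a simplified objective min_B ||Y − BW||_F² (an assumption for illustration — the actual SDH B-subproblem carries extra terms): with all other bit columns fixed, the optimal column is a closed-form sign update.

```python
import numpy as np

def dcc_binary_codes(Y, W, n_sweeps=10, seed=0):
    """Discrete cyclic coordinate descent (sketch) for
    min_B ||Y - B W||_F^2  s.t.  B in {-1,+1}^(n x L).
    With every column of B fixed except the l-th, the objective is
    minimized in closed form by b_l = sign(R_l w_l), where R_l is the
    residual excluding bit l's contribution."""
    rng = np.random.default_rng(seed)
    n, L = Y.shape[0], W.shape[0]
    B = rng.choice([-1.0, 1.0], size=(n, L))         # random binary init
    for _ in range(n_sweeps):
        for l in range(L):
            R = Y - B @ W + np.outer(B[:, l], W[l])  # residual without bit l
            b = np.sign(R @ W[l])                    # closed-form optimum
            b[b == 0] = 1.0                          # break ties
            B[:, l] = b
    return B
```

Each column update is exactly optimal given the others, so the objective is monotonically non-increasing — no relaxation or rounding is ever applied.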
(I) Supervised Discrete Hashing
Discrete optimization vs. relaxed optimization on the CIFAR-10 dataset
Results
Discrete Optimization is Important for Hashing!
(I) Supervised Discrete Hashing
- SDH supports other losses such as the hinge loss; the B-subproblem
still has a closed-form update, while the W-subproblem becomes a multi-class SVM.
- SDH scales linearly with the number of labeled examples,
so it can incorporate massive labeled data into training.
Binary optimization
How to solve the general binary code learning problem?
- Design a new algorithm for every different loss?
- The loss can be too complex to design a feasible discrete
optimization algorithm.
(II) Discrete Proximal Linearized Minimization
Motivation: minimize an equivalent smooth + non-smooth loss
Algorithm: Discrete Proximal Linearized Minimization (DPLM)
Each iteration: closed-form, optimal solution!
- F. Shen, X. Zhou, Y. Yang, J. Song, H. T. Shen and D. Tao, ʺA Fast Optimization Method for General Binary
Code Learningʺ, IEEE Transactions on Image Processing (TIP), 2016.
- Theoretical: guaranteed to converge!
- Practical:
- Very fast, even faster than DCC in SDH
- Successfully applied to supervised and unsupervised hashing
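The DPLM iteration can be sketched generically (a minimal illustration; `grad_f`, the step size, and the test problem below are my own assumptions): take a gradient step on the smooth part of the loss, then apply the proximal map of the binary constraint — which for {−1,+1}^L is just the sign function, so every iterate is closed-form and stays binary.

```python
import numpy as np

def dplm(grad_f, b0, step, n_iter=50):
    """Proximal linearized minimization over binary codes (sketch):
    gradient step on the smooth loss, then projection onto {-1,+1}^L.
    The projection (proximal map of the binary constraint) is sign()."""
    b = b0.copy()
    for _ in range(n_iter):
        z = b - step * grad_f(b)   # linearized (gradient) step
        b = np.sign(z)             # closed-form proximal step
        b[b == 0] = 1.0            # arbitrary tie-break to stay binary
    return b
```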
(II) Discrete Proximal Linearized Minimization
(III) Asymmetric Inner-product Binary Coding
Hashing for Maximum Inner Product Search (MIPS):
Retrieve the datum having the largest inner product with query q from database A
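For reference, exact MIPS is a brute-force argmax of inner products over the database — the linear-time baseline that binary-code methods like AIBC approximate far more cheaply. A minimal sketch:

```python
import numpy as np

def mips(q, A):
    """Exact Maximum Inner Product Search (brute force): index of the
    database row of A with the largest inner product with query q."""
    return int(np.argmax(A @ q))
```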
Algorithm: Inner product fitting by asymmetric hash functions
- F. Shen, W. Liu, S. Zhang, Y. Yang, and H. T. Shen, “Learning Binary Codes for Maximum Inner Product
Search”, ICCV 2015
Decompose this hard problem into two sub-problems, each solved by DCC as in SDH; the target matrix collects the inner products of database and query vectors.
Results: unsupervised hashing
Asymmetric Inner-product Binary Coding (AIBC)
(IV) Discrete Collaborative Filtering
Collaborative Filtering
Our proposal: Discrete Collaborative Filtering
- H. Zhang, F. Shen, L. Liu, W. Liu, X. He, H. Luan and T.‐S. Chua, “Discrete Collaborative Filtering”, SIGIR 2016.
Best Paper Award Honorable Mention
(IV) Discrete Collaborative Filtering
(V) Classification by Hamming Retrieval
Motivation:
- Very few learn-to-hash works target classification!
- Existing classification methods treat hash codes as real-valued features
- Goal: boost even linear classification by hashing
Idea: Classify binary data with binary weights
Replace floating-point multiplications with XNOR operations
(V) Classification by Hamming Retrieval
- F. Shen, Y. Mu, Y. Yang, W. Liu, L. Liu, J. Song, H. T. Shen, “Classification by Retrieval:
Binarizing Data and Classifier”, SIGIR 2017. Best Paper Award Honorable Mention
Framework
Classifying an image reduces to retrieving its nearest class codes in the Hamming space.
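The classify-by-retrieval rule can be sketched directly (a minimal illustration with unpacked 0/1 codes; a real implementation would bit-pack the codes and use XNOR + popcount): prediction is simply a Hamming nearest-neighbor lookup over the class codes.

```python
import numpy as np

def hamming_classify(b, class_codes):
    """Classification by Hamming retrieval (sketch): predict the class
    whose binary code is nearest to the data code b in Hamming distance."""
    dists = np.count_nonzero(class_codes != b, axis=1)
    return int(np.argmin(dists))
```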
(V) Classification by Hamming Retrieval
Formulation: joint learning of binary codes and binary weights
- The loss can be any proper empirical loss; we particularly study the exponential loss and the linear loss.
Inter-class margin
(V) Classification by Hamming Retrieval
- W-subproblem: Binary Quadratic Program (BQP), solved bit-by-bit
Sequential bit-flipping algorithm – locally optimal
- B-subproblem: solved bit-by-bit (solution for the exponential loss)
- P-subproblem
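The sequential bit-flipping idea for the BQP can be sketched as greedy single-bit descent (a generic illustration on min_w wᵀQw + pᵀw; the exact update in the paper may differ): flip whichever bit most decreases the objective until no flip helps, which yields a local optimum.

```python
import numpy as np

def bit_flip_bqp(Q, p, w0, max_iter=100):
    """Sequential bit flipping (sketch) for
    min_w  w^T Q w + p^T w,  w in {-1,+1}^L.
    Greedily flips the single bit giving the largest objective decrease;
    terminates at a local optimum when no flip improves."""
    w = w0.astype(float).copy()
    obj = float(w @ Q @ w + p @ w)
    for _ in range(max_iter):
        best_gain, best_i = 0.0, -1
        for i in range(len(w)):
            w2 = w.copy()
            w2[i] = -w2[i]                       # try flipping bit i
            gain = obj - float(w2 @ Q @ w2 + p @ w2)
            if gain > best_gain + 1e-12:
                best_gain, best_i = gain, i
        if best_i < 0:                           # no improving flip: local optimum
            break
        w[best_i] = -w[best_i]
        obj -= best_gain
    return w, obj
```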
(V) Classification by Hamming Retrieval
Results:
LibLinear vs. our method on SUN 397
(V) Classification by Hamming Retrieval
Results:
Comparison in accuracy (%), training and testing time (seconds).
(V) Classification by Hamming Retrieval
Results:
Accuracy (%) with increasing binary code length
(V) Classification by Hamming Retrieval
- Convert linear classification to Hamming retrieval
- Binarize both data and classifier in a joint problem
- Support many empirical loss functions
- Significant reduction in storage and in training/testing computation
(VI) Deep Sketch Hashing
Sketch based image retrieval
Existing methods:
- Hand-crafted feature engineering (e.g., SIFT, HOG, HELO [1], LKS [2])
- Deep learning based feature extraction
Li Liu, Fumin Shen, Yuming Shen, Xianglong Liu, Ling Shao, “Deep Sketch Hashing: Fast Free-hand Sketch-Based Image Retrieval”, CVPR 2017
Framework of DSH
We integrate a convolutional neural network and discrete binary code learning into a unified framework.
(VI) Deep Sketch Hashing
Objective Formulation of DSH
non-convex and non-smooth
(VI) Deep Sketch Hashing
Alternating Optimization
(VI) Deep Sketch Hashing
Comparison with previous SBIR methods
(VI) Deep Sketch Hashing
Comparison with cross-modality methods
Experimental results of DSH
(VI) Deep Sketch Hashing
Successful Cases of DSH:
(VI) Deep Sketch Hashing
(VII) Hashing for Partial Action Recognition
Motivation:
- Most action recognition approaches analyze after-the-fact actions. However,
capturing complete actions is often difficult due to occlusions, interruptions, etc.
- Partial action recognition (PAR) has a wide range of applications in intelligent
surveillance, smart homes, retrieval systems, etc.
Traditional Action Recognition | Action Prediction      | Partial Action Recognition (Ours)
Preserving similarity          | Feature reconstruction | Learning coding matrix
The flowchart of Partial Reconstructive Binary Coding (PRBC)
Objective:
Discrete Alternating Optimization
(VII) Hashing for Partial Action Recognition
- Quantitative results on three tasks:
1) Action prediction 2) Partial action retrieval 3) Partial action recognition
- J. Qin, L. Liu, L. Shao, B. Ni, C. Chen, F. Shen and Y. Wang, “Binary Coding for Partial Action Analysis with Limited Observation Ratios”, CVPR 2017.
(VII) Hashing for Partial Action Recognition
Our work on discrete hashing
- Deep Asymmetric Pairwise Hashing (ACM MM’17)
- DSeRH for deep ranking hashing (CVPR’17)
- Asymmetric Binary Coding (TMM 2016)
- Discrete Cross-modal Hashing (TIP 2016)
- ZSECOC for Action Recognition (CVPR’17)
- Compressed K-means (AAAI’17)
- Discrete Spectral Clustering (IJCAI’16)
- Attribute Hashing (ICME’17) Best Paper Award – Platinum Award
- Zero-shot Hashing (MM’16)
- AIBC for Medical Image Retrieval (ISBI’16)