Composite Quantization for Approximate Nearest Neighbor Search
Jingdong Wang, Lead Researcher, Microsoft Research
http://research.microsoft.com/~jingdw
ICML 2014, joint work with my interns Ting Zhang from USTC and Chao Du from Tsinghua University
5/6/2015
Recall the complexity of linear scan:
Projection Trees for Approximate Nearest Neighbor Search. IEEE Trans. Pattern Anal. Mach. Intell. 36(2): 388-403 (2014)
visual descriptor indexing. CVPR 2010: 3392-3399
Graph Search Using Cartesian Concatenation. ICCV 2013: 2128-2135
construction for visual descriptors. CVPR 2012: 1106-1113
[Figure: 1-NN search results; legend: ICCV13, ACMMM12, CVPR10]
Retrieve candidates with an index structure using compact codes
Load raw features for retrieved candidates from disk
Reranking using the true distances
Efficient and small memory consumption
IO cost is small
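The retrieve-then-rerank pipeline above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `approx_vectors` stands in for whatever the compact codes decode to, and `load_raw` is a hypothetical callable standing in for the disk read; neither name comes from the slides.

```python
import heapq


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def search_with_rerank(query, approx_vectors, load_raw, n_candidates, top_k):
    """Shortlist by approximate distance, then rerank the shortlist exactly.

    approx_vectors: in-memory approximations (stand-in for compact codes).
    load_raw: hypothetical callable id -> raw feature (stand-in for a disk read).
    """
    # Stage 1: cheap candidate retrieval using only the compact representation.
    candidates = heapq.nsmallest(
        n_candidates,
        range(len(approx_vectors)),
        key=lambda i: sq_dist(query, approx_vectors[i]),
    )
    # Stage 2: load raw features only for the shortlist and rerank by true distance.
    reranked = sorted(candidates, key=lambda i: sq_dist(query, load_raw(i)))
    return reranked[:top_k]


raw = [(0.0, 0.0), (1.0, 1.0), (0.9, 1.2), (5.0, 5.0)]
approx = [(0.0, 0.0), (1.0, 1.0), (1.0, 1.0), (5.0, 5.0)]  # coarse approximations
result = search_with_rerank((0.9, 1.1), approx, lambda i: raw[i],
                            n_candidates=2, top_k=1)
```

Items 1 and 2 share the same approximation here, so only the reranking step on raw features can separate them; raw features are read for the two shortlisted items only.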
Product quantization: split x into M subvectors and quantize each subvector with its own codebook
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_M \end{bmatrix} \approx \bar{x} = \begin{bmatrix} p_{1k_1} \\ p_{2k_2} \\ \vdots \\ p_{Mk_M} \end{bmatrix}$$
Codebook in the 1st subspace: $\{p_{11}, p_{12}, \cdots, p_{1K}\}$
Codebook in the 2nd subspace: $\{p_{21}, p_{22}, \cdots, p_{2K}\}$
Codebook in the Mth subspace: $\{p_{M1}, p_{M2}, \cdots, p_{MK}\}$
Distance from a query $q = [q_1; q_2; \cdots; q_M]$ to $\bar{x}$:
$$\|q - \bar{x}\|_2^2 = d(q_1, p_{1k_1})^2 + d(q_2, p_{2k_2})^2 + \cdots + d(q_M, p_{Mk_M})^2$$
Lookup table for the 1st subspace: $\{d(q_1, p_{11}), d(q_1, p_{12}), \cdots, d(q_1, p_{1K})\}$
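A minimal sketch of product quantization encoding and table-based distance evaluation, in plain Python. The codebooks are toy values chosen for illustration (in practice they would be learned, for example by k-means per subspace), and all function names are assumptions made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def split(x, M):
    """Split x into M equal-length subvectors (assumes len(x) is divisible by M)."""
    step = len(x) // M
    return [x[m * step:(m + 1) * step] for m in range(M)]


def pq_encode(x, codebooks):
    """Return [k_1, ..., k_M]: the nearest codeword index in each subspace."""
    subs = split(x, len(codebooks))
    return [min(range(len(cb)), key=lambda k: sq_dist(sub, cb[k]))
            for sub, cb in zip(subs, codebooks)]


def build_tables(q, codebooks):
    """tables[m][k] = squared distance from the m-th query subvector to p_{mk}."""
    subs = split(q, len(codebooks))
    return [[sq_dist(sub, p) for p in cb] for sub, cb in zip(subs, codebooks)]


def table_distance(code, tables):
    """Sum M table entries: the squared distance from q to the quantized vector."""
    return sum(tables[m][k] for m, k in enumerate(code))


# Toy codebooks: M = 2 subspaces, K = 3 codewords each (illustrative values).
codebooks = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    [(0.0, 0.0), (2.0, 2.0), (4.0, 0.0)],
]
x = (0.9, 0.1, 1.8, 2.1)
q = (1.0, 1.0, 2.0, 0.0)

code = pq_encode(x, codebooks)
tables = build_tables(q, codebooks)
approx = table_distance(code, tables)
x_bar = [v for m, k in enumerate(code) for v in codebooks[m][k]]
```

The tables are built once per query; after that, each database vector costs M table lookups and additions regardless of the dimension.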
[Figure: example with M = 2, K = 3]
Source codebook 1
Source codebook 2
Source codebook M
Each source codebook is composed of K d-dimensional vectors
[Figure: 2 source codebooks; a composite center is the sum of one element from each source codebook]
Composite codebook: 9 composite centers
Space partition: 9 groups
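To make the composite codebook concrete, the sketch below enumerates every composite center for two small source codebooks. The vectors are made-up illustrative values and `composite_centers` is a hypothetical helper name; with 3 elements per source codebook it yields the 9 composite centers mentioned on the slide.

```python
from itertools import product


def composite_centers(source_codebooks):
    """Every sum of one element per source codebook, keyed by the index tuple."""
    centers = {}
    index_ranges = [range(len(cb)) for cb in source_codebooks]
    for ks in product(*index_ranges):
        picked = [source_codebooks[m][k] for m, k in enumerate(ks)]
        # Add the picked elements coordinate by coordinate.
        centers[ks] = tuple(sum(coords) for coords in zip(*picked))
    return centers


# Two source codebooks with three 2-dimensional elements each (illustrative values).
source_codebooks = [
    [(0.0, 0.0), (1.0, 0.5), (-1.0, 0.5)],
    [(0.0, 2.0), (2.0, 0.0), (0.5, -2.0)],
]
centers = composite_centers(source_codebooks)
```

Only the 6 source elements are stored; the 9 centers are implied by the index pairs, which is what keeps the codes compact as M and K grow.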
Product quantization written as a sum of zero-padded codewords:
$$x \approx \bar{x} = \begin{bmatrix} p_{1k_1} \\ p_{2k_2} \\ \vdots \\ p_{Mk_M} \end{bmatrix} = \begin{bmatrix} p_{1k_1} \\ 0 \\ \vdots \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ p_{2k_2} \\ \vdots \\ 0 \end{bmatrix} + \cdots + \begin{bmatrix} 0 \\ 0 \\ \vdots \\ p_{Mk_M} \end{bmatrix}$$
$$x \approx \bar{x} = c_{1k_1} + c_{2k_2} + \cdots + c_{Mk_M}$$
With a rotation R:
$$x \approx \bar{x} = R \begin{bmatrix} p_{1k_1} \\ p_{2k_2} \\ \vdots \\ p_{Mk_M} \end{bmatrix} = R \begin{bmatrix} p_{1k_1} \\ 0 \\ \vdots \\ 0 \end{bmatrix} + R \begin{bmatrix} 0 \\ p_{2k_2} \\ \vdots \\ 0 \end{bmatrix} + \cdots + R \begin{bmatrix} 0 \\ 0 \\ \vdots \\ p_{Mk_M} \end{bmatrix}$$
$$x \approx \bar{x} = c_{1k_1} + c_{2k_2} + \cdots + c_{Mk_M}$$
Product quantization: Coordinate aligned space partition
Cartesian k-means: Rotated coordinate aligned space partition
Composite quantization: Flexible space partition
Approximate distance computation:
$$x \approx \bar{x} = \sum_{m=1}^{M} c_{mk_m(x)}, \qquad \|q - x\|_2^2 \approx \Big\|q - \sum_{m=1}^{M} c_{mk_m(x)}\Big\|_2^2$$
Computing the right-hand side directly is time-consuming
$$\Big\|q - \sum_{m=1}^{M} c_{mk_m(x)}\Big\|_2^2 = \sum_{m=1}^{M} \|q - c_{mk_m(x)}\|_2^2 - (M-1)\|q\|_2^2 + \sum_{i \neq j} c_{ik_i(x)}^\top c_{jk_j(x)}$$
First term: $O(M)$ additions, implemented with a pre-computed distance lookup table
Distance lookup table: Store the distances from source codebook elements to q
Second term: Constant for a given query
Third term: $O(M^2)$ additions, using a pre-computed dot product lookup table
Dot product lookup table: Store the dot products between codebook elements
If the third term is constant, computing the first term is enough for search
Minimize quantization error $\sum_x \big\|x - \sum_{m=1}^{M} c_{mk_m(x)}\big\|_2^2$, subject to the third term being a constant
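The three-term expansion of the distance to a composite center can be checked numerically. The sketch below is plain Python with made-up vectors; `selected` holds the M chosen elements $c_{mk_m(x)}$, and the function names are illustrative assumptions.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def sq_norm(a):
    """Squared Euclidean norm."""
    return dot(a, a)


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def direct_distance(q, selected):
    """||q - sum_m c_m||^2 computed by first forming the composite center."""
    x_bar = [sum(coords) for coords in zip(*selected)]
    return sq_dist(q, x_bar)


def three_term_distance(q, selected):
    """Same quantity via the three-term expansion."""
    M = len(selected)
    first = sum(sq_dist(q, c) for c in selected)   # M table lookups in practice
    second = (M - 1) * sq_norm(q)                  # depends on the query only
    third = sum(dot(selected[i], selected[j])      # inter-codebook inner products
                for i in range(M) for j in range(M) if i != j)
    return first - second + third


q = (1.0, -2.0, 0.5)
selected = [(0.5, 0.0, 1.0), (-1.0, 2.0, 0.0), (0.25, 0.25, -0.5)]
direct = direct_distance(q, selected)
expanded = three_term_distance(q, selected)
```

The second term is shared by every database vector for a given query, so it never changes the ranking; only the third term stands between the first term and an exact ordering.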
Formulation:
$$\min_{\{C_m\}, \{k_m(x)\}, \epsilon} \sum_x \Big\|x - \sum_{m=1}^{M} c_{mk_m(x)}\Big\|_2^2 \quad \text{s.t.} \quad \sum_{i \neq j} c_{ik_i(x)}^\top c_{jk_j(x)} = \epsilon$$
Minimize quantization error for search accuracy
Constant constraint for search efficiency
Product quantization and Cartesian k-means: non-overlapped space partitioning, codebooks are mutually orthogonal, so $\sum_{i \neq j} c_{ik_i(x)}^\top c_{jk_j(x)} = 0$
Penalty objective:
$$\phi(\{C_m\}, \{k_m(x)\}, \epsilon) = \sum_x \Big\|x - \sum_{m=1}^{M} c_{mk_m(x)}\Big\|_2^2 + \mu \sum_x \Big(\sum_{i \neq j} c_{ik_i(x)}^\top c_{jk_j(x)} - \epsilon\Big)^2$$
First term: Distortion error
Second term: Constraints violation
$\mu$: Selected by validation
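As an illustration of the penalized objective (distortion error plus a weighted constraints-violation term), the sketch below evaluates it for given codebooks and codes. It only evaluates the objective and does not optimize it; `objective`, `epsilon`, and `mu` are names chosen for this example, and the data are toy values.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def objective(data, codes, source_codebooks, epsilon, mu):
    """Distortion error plus mu times the squared constraint violation."""
    distortion = 0.0
    violation = 0.0
    for x, code in zip(data, codes):
        selected = [source_codebooks[m][k] for m, k in enumerate(code)]
        x_bar = [sum(coords) for coords in zip(*selected)]
        distortion += sq_dist(x, x_bar)
        M = len(selected)
        # Sum of inner products between elements chosen from different codebooks.
        cross = sum(dot(selected[i], selected[j])
                    for i in range(M) for j in range(M) if i != j)
        violation += (cross - epsilon) ** 2
    return distortion + mu * violation


source_codebooks = [
    [(1.0, 0.0), (0.0, 1.0)],
    [(1.0, 1.0), (0.0, 0.0)],
]
data = [(2.0, 1.0), (0.0, 1.5)]
codes = [(0, 0), (1, 1)]
value = objective(data, codes, source_codebooks, epsilon=1.0, mu=0.5)
```

A larger `mu` pushes the inter-codebook term toward a constant at the cost of some distortion, which is why the slide notes that it is selected by validation.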
Average of the inter-codebook term over the data:
$$\frac{1}{\#\{x\}} \sum_x \sum_{i \neq j} c_{ik_i(x)}^\top c_{jk_j(x)}$$
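Why an average appears here can be seen from a short derivation. Assume a penalty of the form $\mu \sum_x (s(x) - \epsilon)^2$, where $s(x)$ is shorthand introduced only for this note for the inter-codebook inner-product sum, and hold the codebooks and codes fixed. Setting the derivative with respect to $\epsilon$ to zero gives the mean of $s(x)$:

```latex
% s(x) is shorthand introduced here, not notation from the slides.
\begin{align*}
s(x) &= \sum_{i \neq j} c_{ik_i(x)}^\top c_{jk_j(x)} \\
\frac{\partial}{\partial \epsilon} \, \mu \sum_x \big(s(x) - \epsilon\big)^2
  &= -2\mu \sum_x \big(s(x) - \epsilon\big) = 0 \\
\Longrightarrow \quad \epsilon &= \frac{1}{\#\{x\}} \sum_x s(x)
\end{align*}
```

The distortion term does not depend on $\epsilon$, so it drops out of this step.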
1M SIFT, 64 bits: converges in about 10~15 iterations
Search pipeline, repeated for n database vectors:
Query q and source codebooks -> distance tables (between query and codebook elements)
Code of database vector x and distance tables -> distance between q and x
Output the nearest vectors
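The search pipeline (build distance tables for the query, score each database code by table lookups, output the nearest vectors) might look like the following in plain Python. It assumes the inter-codebook term is constant so that summing table entries preserves the ranking; the toy codebooks below are orthogonal, which makes that term exactly zero. All names and values are illustrative.

```python
import heapq


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def build_distance_tables(q, source_codebooks):
    """tables[m][k] = squared distance between q and element k of codebook m."""
    return [[sq_dist(q, c) for c in cb] for cb in source_codebooks]


def search(q, codes, source_codebooks, top_k):
    """Rank database codes by the sum of M table entries; return the best ids."""
    tables = build_distance_tables(q, source_codebooks)
    scores = ((sum(tables[m][k] for m, k in enumerate(code)), idx)
              for idx, code in enumerate(codes))
    return [idx for _, idx in heapq.nsmallest(top_k, scores)]


# Toy source codebooks whose elements are mutually orthogonal across codebooks.
source_codebooks = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)],
]
codes = [(0, 0), (1, 1), (2, 2), (2, 0)]  # one code per database vector
nearest = search((1.9, 0.2), codes, source_codebooks, top_k=2)
```

The score differs from the true squared distance to each composite center by the same query-dependent amount for every item, so the order of results matches the exact order.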
Recall@10, 64 bits: Ours 71.59%, CKM 63.83%
The relatively small improvement on 1M GIST might be because CKM has already achieved a large improvement
Ours (64 bits): 71.59%; ITQ (128 bits): 53.95%
ITQ without asymmetric distance underperformed ITQ with asymmetric distance
Our approach with 64 bits outperforms (A) ITQ with 128 bits, with slightly smaller search cost
Recall@100: Ours 70.12%, CKM 64.57%
Average query time
MAP on the holiday dataset
Scores on the UKBench dataset
Learnt $\epsilon$ versus $\epsilon = 0$: minimize the quantization error subject to
$$\sum_{i \neq j} c_{ik_i(x)}^\top c_{jk_j(x)} = \epsilon \quad \text{versus} \quad \sum_{i \neq j} c_{ik_i(x)}^\top c_{jk_j(x)} = 0$$
Search performance with learnt $\epsilon$ is better, since learning $\epsilon$ is more flexible
[Figure: recall@R for each (R,T)]
Quantization with an offset $t$:
$$\sum_x \|x - t - \bar{x}\|_2^2$$
Contribution of the offset is relatively small compared with the composite quantization
$$\min_{\{C_m\}, \{k_m(x)\}, \epsilon} \sum_x \Big\|x - \sum_{m=1}^{M} c_{mk_m(x)}\Big\|_2^2 \quad \text{s.t.} \quad \sum_{i \neq j} c_{ik_i(x)}^\top c_{jk_j(x)} = \epsilon$$
with $\sum_{m=1}^{M} \|C_m\|_1$ bounded above
Minimize quantization error for search accuracy
Constant constraint for search efficiency
Sparsity constraint for precomputation efficiency
With dimension reduction by a matrix P, under the same constraints:
$$\min_{\{C_m\}, \{k_m(x)\}, \epsilon} \sum_x \Big\|x - P \sum_{m=1}^{M} c_{mk_m(x)}\Big\|_2^2$$
[Figure: Database -> Vector quantization (1) -> Residuals -> Vector quantization (2) -> Residuals -> …… -> Vector quantization (M)]
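The chain of vector quantizers over residuals in this diagram can be sketched as a greedy encoder. This is a minimal illustration that assumes the M codebooks are already trained and passed in; the function name and the toy values are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def residual_encode(x, codebooks):
    """Quantize x stage by stage; each stage quantizes the previous residual."""
    residual = list(x)
    code = []
    for cb in codebooks:
        # Nearest codeword to the current residual.
        k = min(range(len(cb)), key=lambda j: sq_dist(residual, cb[j]))
        code.append(k)
        # What this stage failed to explain is passed on to the next stage.
        residual = [r - c for r, c in zip(residual, cb[k])]
    return code, residual


# Two stages with two 2-dimensional codewords each (illustrative values).
codebooks = [
    [(0.0, 0.0), (4.0, 4.0)],
    [(0.0, 0.0), (1.0, -1.0)],
]
code, residual = residual_encode((5.0, 3.2), codebooks)
```

The reconstruction is the sum of the selected codewords, one per stage, and `residual` is the part left unexplained after the last stage.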
http://research.microsoft.com/~jingdw/pubs/lthsurvey.pdf
http://research.microsoft.com/~jingdw/cfp/CFP_TBDSI_BMD.pdf
http://research.microsoft.com/~jingdw/cfp/CFP_ICDM15WORKSHOP_BMD.pdf
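$x \approx \bar{x}$, so $\|q - x\|_2 \approx \|q - \bar{x}\|_2$
Triangle inequality: $\big| \|q - x\|_2 - \|q - \bar{x}\|_2 \big| \le \|x - \bar{x}\|_2$
Minimize the distortion $\sum_x \|x - \bar{x}\|_2^2$
Generalized triangle inequality:
$$\tilde{d}(q, \bar{x}) = \Big(\sum_{m=1}^{M} \|q - c_{mk_m(x)}\|_2^2\Big)^{1/2}$$
$$\hat{d}(q, x) = \big(\|q - x\|_2^2 + (M-1)\|q\|_2^2\big)^{1/2}$$
$$\delta = \sum_{i \neq j} c_{ik_i(x)}^\top c_{jk_j(x)}, \qquad \bar{x} = \sum_{m=1}^{M} c_{mk_m(x)}$$
$$\big|\tilde{d}(q, \bar{x}) - \hat{d}(q, x)\big| \le \|x - \bar{x}\|_2 + |\delta|^{1/2}$$
Distortion: $\|x - \bar{x}\|_2$
Efficiency: $|\delta|^{1/2}$
$$\min_{\{C_m\}, \{k_m(x)\}, \epsilon} \sum_x \|x - \bar{x}\|_2^2 \quad \text{s.t.} \quad \delta = \epsilon$$
Minimize distortion for search accuracy
Constant constraint for search efficiency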
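The bound that combines distortion and the inter-codebook term can be spot-checked on random data. The sketch below takes the table-based distance as the square root of the summed squared distances to the selected elements, and the reference distance as the square root of $\|q - x\|_2^2 + (M-1)\|q\|_2^2$. The random setup, the seed, and the function names are assumptions made for this check.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def bound_terms(q, x, selected):
    """Return (lhs, rhs) of |d_tilde - d_hat| <= ||x - x_bar|| + |delta|^(1/2)."""
    M = len(selected)
    x_bar = [sum(coords) for coords in zip(*selected)]
    d_tilde = math.sqrt(sum(sq_dist(q, c) for c in selected))
    d_hat = math.sqrt(sq_dist(q, x) + (M - 1) * dot(q, q))
    delta = sum(dot(selected[i], selected[j])
                for i in range(M) for j in range(M) if i != j)
    lhs = abs(d_tilde - d_hat)
    rhs = math.sqrt(sq_dist(x, x_bar)) + math.sqrt(abs(delta))
    return lhs, rhs


random.seed(0)


def rand_vec(dim):
    """Random vector with entries drawn uniformly from [-1, 1]."""
    return [random.uniform(-1.0, 1.0) for _ in range(dim)]


results = [bound_terms(rand_vec(4), rand_vec(4), [rand_vec(4) for _ in range(3)])
           for _ in range(1000)]
holds = all(lhs <= rhs + 1e-9 for lhs, rhs in results)
```

Both terms on the right matter: a small distortion with a large inter-codebook term still leaves the table-based distance loosely tied to the true one, which motivates constraining that term to a constant.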