Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, Sanjiv Kumar - PowerPoint PPT Presentation

SLIDE 1

Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, Sanjiv Kumar

SLIDE 2

Overview

SLIDE 3

Application: recommender systems


SLIDE 10

MIPS: partitioning and quantization

Partitioning:

  • Split the database into disjoint subsets
  • Search only the most promising subsets

Quantization:

  • Reduce the number of bits used to describe data points
  • Leads to a smaller index size and faster inner product calculations

SLIDE 11

Quantization overview: codebooks

Given a set of vectors x1, x2, …, xn, we want to create a quantized dataset x̃1, x̃2, …, x̃n.

Quantize to an element of the codebook, Cθ

SLIDE 13

Example codebook: vector quantization

Parameters are a set of centers c1, c2, …, ck. Codebook Cθ is the set of all centers: {c1, c2, …, ck}.

Product quantization:

  • splits the space into multiple subspaces
  • uses a vector quantization codebook for each subspace.
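As a concrete illustration, the two bullets above can be written as a toy product quantizer (a minimal NumPy sketch, not the paper's implementation; `train_pq` and `quantize_pq` are names invented here):

```python
import numpy as np

rng = np.random.default_rng(0)

def train_pq(data, num_subspaces, num_centers, iters=10):
    """Train a toy product quantizer: split the dimensions into
    subspaces and fit a small k-means codebook in each one."""
    sub_dims = np.array_split(np.arange(data.shape[1]), num_subspaces)
    codebooks = []
    for dims in sub_dims:
        sub = data[:, dims]
        centers = sub[rng.choice(len(sub), num_centers, replace=False)].copy()
        for _ in range(iters):
            # Assign each point to its nearest center, then recompute centers.
            d2 = ((sub[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            assignment = d2.argmin(axis=1)
            for c in range(num_centers):
                if np.any(assignment == c):
                    centers[c] = sub[assignment == c].mean(axis=0)
        codebooks.append((dims, centers))
    return codebooks

def quantize_pq(x, codebooks):
    """Quantize x by concatenating, per subspace, its nearest codeword."""
    x_tilde = np.empty_like(x, dtype=float)
    for dims, centers in codebooks:
        idx = ((x[dims] - centers) ** 2).sum(-1).argmin()
        x_tilde[dims] = centers[idx]
    return x_tilde

data = rng.normal(size=(1000, 8))
codebooks = train_pq(data, num_subspaces=2, num_centers=16)
x_tilde = quantize_pq(data[0], codebooks)
```

Each subspace stores only a 4-bit code (one of 16 centers) per point, which is where the index-size and speed savings come from.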
SLIDE 14

Quantization basics: assignment

To assign a datapoint to a codeword, we select the codeword that minimizes a loss function.
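A minimal sketch of this assignment rule (NumPy; the `assign` helper is invented here, and the loss function is left pluggable so alternative losses can be swapped in):

```python
import numpy as np

def assign(x, codebook, loss):
    """Return the index of the codeword minimizing the loss."""
    return int(np.argmin([loss(x, c) for c in codebook]))

codebook = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
x = np.array([0.9, 0.1])

# Classic choice of loss: squared reconstruction error ||x - c||^2.
reconstruction = lambda x, c: float(np.sum((x - c) ** 2))
assign(x, codebook, reconstruction)  # → 0: [1.0, 0.0] is closest
```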

SLIDE 16

Traditional loss function choice

Classic approach: minimize the reconstruction error ||x − x̃||². By Cauchy-Schwarz, this bounds the inner-product error: |⟨q, x⟩ − ⟨q, x̃⟩| = |⟨q, x − x̃⟩| ≤ ||q|| · ||x − x̃||.
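The Cauchy-Schwarz bound on the inner-product error can be checked numerically (a sketch with random vectors standing in for a real query and datapoint):

```python
import numpy as np

rng = np.random.default_rng(1)
q = rng.normal(size=16)                   # query
x = rng.normal(size=16)                   # database point
x_tilde = x + 0.1 * rng.normal(size=16)   # its quantized approximation

inner_error = abs(q @ x - q @ x_tilde)
bound = np.linalg.norm(q) * np.linalg.norm(x - x_tilde)
assert inner_error <= bound               # the Cauchy-Schwarz guarantee
```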

SLIDE 23

Some inner product errors are worse than others

Consider a query q and database points x1, …, xn. Rank the points by inner product with q, from low to high:

a1, a2, a3, a4, a5, …, a(n-3), a(n-2), a(n-1), an (the high end supplies the MIPS results)

Perturbations of low inner products are unlikely to result in changes to the top-k. Perturbations of high inner products change the top-k and lead to recall loss.

Takeaway: to maximize recall, emphasize reducing quantization error for high inner products.
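The takeaway can be illustrated with a small simulation (a sketch; the noise level and index ranges are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, sigma = 10000, 10, 0.05

scores = np.sort(rng.normal(size=n))   # inner products, ascending
true_topk = set(range(n - k, n))       # indices of the k largest scores

def recall_after_perturbing(idx):
    """Perturb the scores at `idx` and measure top-k recall."""
    noisy = scores.copy()
    noisy[idx] += sigma * rng.normal(size=len(idx))
    approx_topk = set(np.argsort(noisy)[-k:])
    return len(true_topk & approx_topk) / k

low_recall = recall_after_perturbing(np.arange(n // 2))       # low scores
high_recall = recall_after_perturbing(np.arange(n - 100, n))  # high scores
# Perturbing low inner products leaves the top-k intact;
# perturbing high ones can change it.
```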

SLIDE 24

Visualization of query distribution

[Figure: a database point x shown against the query distribution]

SLIDE 25

Visualization of query distribution

Quantization error: little impact on MIPS recall

SLIDE 26

Visualization of query distribution

Quantization error: some impact on MIPS recall

SLIDE 27

Visualization of query distribution

Quantization error: significant impact on MIPS recall

SLIDE 28

Visualization of query distribution

Quantization error: significant impact on MIPS recall
[Figure: contour of equal reconstruction loss around the point x]

SLIDE 29

Score-aware quantization loss

Traditional quantization loss: ||x − x̃||². Score-aware loss: E_q[ w(⟨q, x⟩) · ⟨q, x − x̃⟩² ], summed over the N database points. By the earlier intuition, w should put more weight on higher inner products ⟨q, x⟩. Example weight function: w(t) = 1(t ≥ T).
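A Monte-Carlo sketch of this loss under the threshold weight (random unit-norm queries stand in for the real query distribution; T = 0.2 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(3)

def score_aware_loss(x, x_tilde, queries, T):
    """Monte-Carlo estimate of E_q[ w(<q, x>) * <q, x - x_tilde>^2 ]
    with the threshold weight w(t) = 1(t >= T)."""
    w = (queries @ x >= T).astype(float)
    err = (queries @ (x - x_tilde)) ** 2
    return float(np.mean(w * err))

d = 8
x = rng.normal(size=d)
x /= np.linalg.norm(x)
queries = rng.normal(size=(5000, d))
queries /= np.linalg.norm(queries, axis=1, keepdims=True)

# Two quantizations with the same error magnitude: one residual
# parallel to x, one orthogonal to x.
v = rng.normal(size=d)
v -= (v @ x) * x
v /= np.linalg.norm(v)
loss_parallel = score_aware_loss(x, x - 0.1 * x, queries, T=0.2)
loss_orthogonal = score_aware_loss(x, x - 0.1 * v, queries, T=0.2)
# The weight on high inner products makes the parallel error cost more.
```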

SLIDE 30

Evaluating and minimizing score-aware loss

Expand the expectation over the query distribution:

SLIDE 31

Evaluating and minimizing score-aware loss

[Figure: the quantization error x − x̃ decomposed into r|| (parallel to x) and r⟂ (orthogonal to x)]

SLIDE 32

Evaluating and minimizing score-aware loss

The integral evaluates to a weighted sum of ||r||||² and ||r⟂||². For w that weight higher inner products more, the parallel term receives the larger weight, so error parallel to x is penalized more than orthogonal error.
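The weighted-sum form can be sketched directly (NumPy; `h_par` and `h_orth` are illustrative constants invented here, whereas the paper derives the actual weights from w and the query distribution):

```python
import numpy as np

def residual_components(x, x_tilde):
    """Split the residual x - x_tilde into its components parallel
    and orthogonal to the datapoint x."""
    r = x - x_tilde
    x_hat = x / np.linalg.norm(x)
    r_par = (r @ x_hat) * x_hat
    return r_par, r - r_par

def anisotropic_loss(x, x_tilde, h_par, h_orth):
    """Weighted sum of the squared parallel and orthogonal residuals."""
    r_par, r_orth = residual_components(x, x_tilde)
    return h_par * (r_par @ r_par) + h_orth * (r_orth @ r_orth)

x = np.array([1.0, 0.0])
c1 = np.array([1.0, 0.5])  # residual orthogonal to x, ||x - c1|| = 0.5
c2 = np.array([0.6, 0.0])  # residual parallel to x,   ||x - c2|| = 0.4
loss1 = anisotropic_loss(x, c1, h_par=4.0, h_orth=1.0)  # 1.0 * 0.25 = 0.25
loss2 = anisotropic_loss(x, c2, h_par=4.0, h_orth=1.0)  # 4.0 * 0.16 = 0.64
# c1 incurs the lower loss despite its larger Euclidean distance to x.
```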

SLIDE 33

Visualization of result

c1 gives a lower inner-product error than c2 even though ||x - c1|| > ||x - c2||. Reason: x - c1 is orthogonal, not parallel, to x.

SLIDE 34

Applications to quantization

Given a family of codewords C, we now want to solve the following optimization problem. We work out an approach for efficient approximate optimization in the large-scale setting for:

1. Vector Quantization
2. Product Quantization

SLIDE 35

Constant-bitrate comparison

GloVe: 100 dimensions, 1,183,514 points
Cosine-distance dataset; the dataset is normalized to unit norm at training time
25 codebooks, 16 centers each / 50 codebooks, 16 centers each

SLIDE 36

GloVe: QPS-recall experiment setup

Pruning via k-means tree: 2000 centers; all but the closest a centers to the query are pruned.

Quantized scoring: approximate inner products are computed against the quantized database (product quantization with the anisotropic loss).

Exact re-scoring: the top b inner products from the quantized (AH) scoring are re-computed exactly; the top 10 are returned as the MIPS results.

Higher a and b result in higher recall and lower QPS.
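The three stages can be strung together in a toy end-to-end sketch (all sizes are scaled down from the slide's settings, and rounding stands in for real product-quantized scoring):

```python
import numpy as np

rng = np.random.default_rng(4)
d, n = 32, 5000
database = rng.normal(size=(n, d))
query = rng.normal(size=d)

# Scaled-down stand-ins for the slide's settings (2000 centers, top-a
# leaves searched, top-b candidates re-scored exactly, top 10 returned).
num_centers, a, b, k = 32, 8, 100, 10

def kmeans_assign(points, centers):
    # Squared distances via the expansion ||p||^2 - 2 p.c + ||c||^2.
    d2 = ((points ** 2).sum(1, keepdims=True)
          - 2.0 * points @ centers.T + (centers ** 2).sum(1))
    return d2.argmin(axis=1)

# Stage 1: pruning via a (single-level) k-means tree.
centers = database[rng.choice(n, num_centers, replace=False)].copy()
for _ in range(5):  # a few Lloyd iterations
    assignment = kmeans_assign(database, centers)
    for c in range(num_centers):
        if np.any(assignment == c):
            centers[c] = database[assignment == c].mean(axis=0)
assignment = kmeans_assign(database, centers)
best_leaves = np.argsort(centers @ query)[-a:]
candidates = np.flatnonzero(np.isin(assignment, best_leaves))

# Stage 2: quantized scoring (rounding is a crude stand-in for
# product-quantized inner products).
approx = np.round(database[candidates] @ query, 1)

# Stage 3: exact re-scoring of the top-b candidates.
top_b = candidates[np.argsort(approx)[-b:]]
exact = database[top_b] @ query
result = top_b[np.argsort(exact)[-k:]][::-1]  # final top-10 MIPS results
```

Raising a (leaves searched) and b (candidates re-scored) trades queries per second for recall, which is exactly the Pareto frontier the next slides plot.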

SLIDE 38

GloVe: QPS-recall Pareto frontier

Source code: https://github.com/google-research/google-research/tree/master/scann