
SLIDE 1

On Top-k Selection from m-wise Partial Rankings via Borda Counting

Wenjing Chen¹, Ruida Zhou¹, Chao Tian¹, Cong Shen²

¹ Department of Electrical and Computer Engineering, Texas A&M University

² Electrical and Computer Engineering Department, University of Virginia

ISIT, June 2020

ISIT On Top-k Selection from m-wise Partial Rankings via Borda Counting June 2020 1 / 22

SLIDE 2

Motivation and Relevant Results

Ranking aggregation is important in applications such as information retrieval and recommender systems. A well-studied case is ranking aggregation from pairwise comparisons. Algorithms based on a parametric model (e.g., BTL models) may perform poorly when the data do not match the model. Shah et al., 2017 applied the Borda counting procedure to ranking aggregation from pairwise comparisons in a nonparametric setting. Our work extends the pairwise comparison problem to ranking aggregation from m-wise comparisons.


SLIDE 4

Borda Counting Procedure: An Example, n=6, m=3


SLIDE 5

Problem Setup

A total of $n$ items, indexed by $[n] = \{1, 2, \dots, n\}$. The noisy partial ranking samples are collected in $r$ rounds. In each round $\ell \in [r]$, every subset $A \subseteq [n]$ of $m$ items is compared. Each ranking result is observed with probability $p$.
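The sampling scheme above can be sketched in a few lines of Python. This is only an illustration: the function name `observed_comparisons`, the seed, and the assumption that each result is observed independently of the others are not taken from the slides.

```python
import itertools
import math
import random


def observed_comparisons(n, m, r, p, seed=0):
    """Yield (round, subset) pairs whose ranking result is observed.

    Assumption: every m-subset is offered in every round and is observed
    independently with probability p.
    """
    rng = random.Random(seed)
    for ell in range(r):
        for subset in itertools.combinations(range(1, n + 1), m):
            if rng.random() < p:
                yield ell, subset


n, m, r, p = 6, 3, 4, 0.5
total = r * math.comb(n, m)  # number of comparisons offered over all rounds
seen = list(observed_comparisons(n, m, r, p))
print(f"{total} comparisons offered, {len(seen)} observed, {p * total:.0f} expected")
```

Each round offers $\binom{n}{m}$ comparisons, so about $r p \binom{n}{m}$ ranking results are observed on average.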



SLIDE 7

Borda Counting Procedure

In a comparison among the set $A$, if the result is observed, the item ranked $i$-th receives score $\beta_i$, where $1 = \beta_1 \ge \beta_2 \ge \dots \ge \beta_m \ge 0$. Otherwise, each item in this comparison receives score 0.

$X^{(\ell)}_{a,A^-}$: the score item $a$ receives in the $\ell$-th round in the comparison among the set $A = \{a\} \cup A^-$.

After $r$ rounds, the total score item $a$ receives is

$$W_a = \sum_{\ell \in [r]} \sum_{A^- \subseteq [n] \setminus \{a\}} X^{(\ell)}_{a,A^-}.$$

The top-$k$ estimate $\tilde{S}_k$: the $k$ items that receive the highest empirical scores.
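A minimal Python sketch of the scoring rule described above. The function names, the toy rankings, and the tie-breaking rule (smaller index wins) are illustrative assumptions; the slides do not specify how ties are broken.

```python
def borda_scores(observed_rankings, beta, items):
    """Accumulate W_a: the item in position i of an observed ranking gets beta[i].

    Unobserved comparisons are simply absent from observed_rankings, which
    matches giving every item in them a score of 0.
    """
    scores = {a: 0.0 for a in items}
    for ranking in observed_rankings:  # each ranking lists items best first
        for position, item in enumerate(ranking):
            scores[item] += beta[position]
    return scores


def top_k(scores, k):
    """Return the k highest-scoring items (ties broken by smaller index: an assumption)."""
    return sorted(scores, key=lambda a: (-scores[a], a))[:k]


beta = (1.0, 0.5, 0.0)  # 1 = beta_1 >= beta_2 >= beta_3 >= 0
observed = [(1, 2, 3), (1, 3, 4), (2, 1, 4), (2, 3, 4)]
scores = borda_scores(observed, beta, items=range(1, 5))
print(top_k(scores, 2), scores)
```

Passing `items` explicitly keeps items that never appear in an observed comparison in the score table with a total of 0.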


SLIDE 8

Probabilistic Model

When the items in the set $A = \{v_1, v_2, \dots, v_m\}$ are compared, the probability that the order $\vec{v} = (v_1, v_2, \dots, v_m)$ occurs is $M_{v_1 v_2 \dots v_m}$, or $M_{\vec{v}}$.

$\vec{v} \doteq A$: the items in the vector $\vec{v}$ are those in the set $A$.

Constraints: $M_{\vec{v}} \ge 0$ and $\sum_{\vec{v} \doteq A} M_{\vec{v}} = 1$ for any $\vec{v}$ with distinct elements.


SLIDE 9

A Few Notations

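$R_{a,A^-}(t)$: the probability that item $a$ ranks at the $t$-th position in the set $A = \{a\} \cup A^-$. Then the expected score of item $a$ relative to $A^-$ is

$$\mathbb{E}\left[X^{(\ell)}_{a,A^-}\right] = p \sum_{t=1}^{m} \beta_t R_{a,A^-}(t), \quad \forall a \in [n],\ \forall A = \{a\} \cup A^-.$$

Associated score of any item $a$: the average expected score

$$\tau_a = \frac{1}{\rho_{n,m}} \sum_{A^- \subseteq [n] \setminus \{a\}} \left( \sum_{t=1}^{m} \beta_t R_{a,A^-}(t) \right),$$

where $\rho_{n,m} = \binom{n-1}{m-1}$.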
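To make the associated score concrete, the sketch below computes $\tau_a$ by brute-force enumeration for a small instance. The ranking probabilities $M_{\vec{v}}$ are generated from a Plackett-Luce model purely as a stand-in; the slides assume no parametric model, and the weights `w` and all function names are hypothetical.

```python
import itertools
import math


def plackett_luce_prob(order, w):
    """Probability of an ordering (best first) under Plackett-Luce weights w.

    Used only as an example source of the probabilities M_v.
    """
    prob, remaining = 1.0, sum(w[v] for v in order)
    for v in order:
        prob *= w[v] / remaining
        remaining -= w[v]
    return prob


def associated_scores(w, m, beta):
    """tau_a = (1 / rho) * sum over A^- of sum_t beta_t * R_{a,A^-}(t)."""
    items = sorted(w)
    rho = math.comb(len(items) - 1, m - 1)  # number of m-subsets containing a
    tau = {a: 0.0 for a in items}
    for subset in itertools.combinations(items, m):
        for order in itertools.permutations(subset):
            prob = plackett_luce_prob(order, w)
            for t, a in enumerate(order):
                # Summing over all orders accumulates beta_t * R_{a,A^-}(t).
                tau[a] += beta[t] * prob
    return {a: s / rho for a, s in tau.items()}


w = {1: 5.0, 2: 4.0, 3: 3.0, 4: 2.0, 5: 1.0}  # hypothetical item weights
tau = associated_scores(w, m=3, beta=(1.0, 0.5, 0.0))
for a in sorted(tau):
    print(a, round(tau[a], 4))
```

The factor $p$ does not appear in $\tau_a$; it enters only through the expected score of a single comparison.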


SLIDE 10

Main Results: Upper Bound

Theorem 1

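For any $\alpha > 0$, the probability that the Borda counting method selects an incorrect top-$k$ set under a probability distribution $M \in \mathcal{F}_k(\alpha)$ is upper-bounded as

$$\sup_{M \in \mathcal{F}_k(\alpha)} \mathbb{P}_M\left[\tilde{S}_k \ne S^*_k\right] \le n^{-\alpha^2/4 + 2}.$$

$\tilde{S}_k$: the Borda counting estimator of the top-$k$ subset. $S^*_k$: the true top-$k$ subset, i.e., the $k$ items with the highest associated scores. $\Delta_k = \tau_{(k)} - \tau_{(k+1)}$ is the $k$-th threshold of associated scores.

$\mathcal{F}_k(\alpha) = \left\{ M \in \mathcal{M} : \Delta_k \ge \alpha \sqrt{\frac{\log n}{r p \rho_{n,m}}} \right\}$: the set of "good" ranking probability distributions.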
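As a rough numerical companion to Theorem 1, the following sketch evaluates the separation threshold $\alpha \sqrt{\log n / (r p \rho_{n,m})}$ and the bound $n^{-\alpha^2/4+2}$. The square-root form of the threshold is assumed here, and the function names and parameter values are illustrative only.

```python
import math


def separation_threshold(alpha, n, m, r, p):
    """Smallest gap Delta_k allowed in F_k(alpha), assuming the square-root form."""
    rho = math.comb(n - 1, m - 1)
    return alpha * math.sqrt(math.log(n) / (r * p * rho))


def borda_error_bound(alpha, n):
    """Upper bound n^(-alpha^2/4 + 2) on the top-k error probability."""
    return n ** (-alpha**2 / 4 + 2)


for alpha in (2.0, 3.0, 4.0, 8.0):
    gap = separation_threshold(alpha, n=100, m=3, r=10, p=0.5)
    print(f"alpha={alpha}: gap >= {gap:.4f}, error <= {borda_error_bound(alpha, 100):.3e}")
```

The bound is informative only when $\alpha > 2\sqrt{2}$, since otherwise the exponent $-\alpha^2/4 + 2$ is non-negative.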


SLIDE 11

Main Results: Converse Part

Theorem 2

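Let $n$ and $k$ be chosen such that $2k \le n$. If $\alpha \le \bar{\alpha}(g, m, \beta) \triangleq \frac{\sqrt{2}}{7}\, g(n, m, \beta) \sqrt{\frac{1}{h(n) \rho_{n,m}}}$, $p \ge \frac{\log n}{4 r h(n)}$, and $n \ge 7$, then the error probability of any estimator $\hat{S}_k$ is lower-bounded as

$$\sup_{M \in \mathcal{F}_k(\alpha)} \mathbb{P}_M\left[\hat{S}_k \ne S^*_k\right] \ge \frac{1}{7},$$

where $g(n, m, \beta)$ and $h(n)$ are two constants to be specified in the proof.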


SLIDE 12

Example: Special Case of Theorem 1

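Special case $m = 2$, $(\beta_1, \beta_2) = (1, 0)$: the case in Shah et al., 2017. The set of ranking probabilities simplifies to

$$\mathcal{F}_k(\alpha) = \left\{ M \in \mathcal{M} : \Delta_k \ge \alpha \sqrt{\frac{\log n}{r p (n-1)}} \right\}.$$

The result in Shah et al., 2017 is stated for the set

$$\mathcal{F}^{(1)}_k(\alpha) = \left\{ M \in \mathcal{M} : \Delta_k \ge \alpha \sqrt{\frac{\log n}{r p (n-1)}} \sqrt{\frac{n}{n-1}} \right\}.$$

If we set $\alpha \ge 8$, then the bound becomes

$$\sup_{M \in \mathcal{F}_k(\alpha)} \mathbb{P}_M\left[\tilde{S}_k \ne S^*_k\right] \le n^{-\alpha^2/4 + 2} \le n^{-14}.$$

The bound matches the result in Shah et al., 2017, and $\mathcal{F}^{(1)}_k(\alpha) \subseteq \mathcal{F}_k(\alpha)$, so Theorem 1 here is slightly tighter than that in Shah et al., 2017.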
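Both claims on this slide can be checked in two lines, assuming the thresholds have the square-root form $\alpha\sqrt{\log n / (rp(n-1))}$, with an extra factor $\sqrt{n/(n-1)}$ for the set of Shah et al.:

```latex
\[
\alpha \ge 8 \;\Longrightarrow\; -\frac{\alpha^2}{4} + 2 \le -\frac{64}{4} + 2 = -14
\;\Longrightarrow\; n^{-\alpha^2/4 + 2} \le n^{-14},
\]
\[
\sqrt{\frac{n}{n-1}} > 1 \;\Longrightarrow\;
\alpha \sqrt{\frac{\log n}{rp(n-1)}} \sqrt{\frac{n}{n-1}}
> \alpha \sqrt{\frac{\log n}{rp(n-1)}} .
\]
```

The second line says that membership in $\mathcal{F}^{(1)}_k(\alpha)$ requires a strictly larger gap $\Delta_k$, so every distribution in $\mathcal{F}^{(1)}_k(\alpha)$ also lies in $\mathcal{F}_k(\alpha)$.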


SLIDE 13

Example: Special Case of Theorem 2

When $m = 2$, the assumptions in Theorem 2 reduce to $\alpha \le \frac{1}{7} \sqrt{\frac{n}{n-1}}$ and $p \ge \frac{\log n}{2 r n}$.

Theorem 2 then matches precisely the converse part in Shah et al., 2017.

Next: details of the proofs. Assume w.l.o.g. that the underlying ranking is consistent with the index of the items. Then $S^*_k = [k] = \{1, 2, \dots, k\}$.


SLIDE 14

Proof: Upper Bound

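Outline of the proof of Theorem 1. From the fact that the event

$$\left\{ \tilde{S}_k \ne S^*_k \right\} = \left\{ \exists\, a \in S^*_k,\ b \in [n] \setminus S^*_k \text{ such that } a \text{ is ranked after } b \right\},$$

we know

$$\mathbb{P}_M\left[\tilde{S}_k \ne S^*_k\right] \le \sum_{a \in [k],\, b \in [n] \setminus [k]} \mathbb{P}(W_b - W_a > 0).$$

By applying Bernstein's inequality, for any $a \in [k]$, $b \in [n] \setminus [k]$, $\mathbb{P}(W_b - W_a > 0)$ can be upper-bounded by $n^{-\alpha^2/4}$. Then

$$\mathbb{P}_M\left[\tilde{S}_k \ne S^*_k\right] \le k(n-k)\, n^{-\alpha^2/4} \le n^{-\alpha^2/4 + 2}.$$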
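The last inequality only needs a crude count of the pairs $(a, b)$ in the union bound. As a brief check:

```latex
\[
k(n-k) \le \left( \frac{k + (n-k)}{2} \right)^2 = \frac{n^2}{4} \le n^2
\quad\Longrightarrow\quad
k(n-k)\, n^{-\alpha^2/4} \le n^{-\alpha^2/4 + 2}.
\]
```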


SLIDE 15

Proof: Upper Bound

Differences from the pairwise case:

$X^{(\ell)}_{a,b}$ vs. $X^{(\ell)}_{a,A^-}$: $X^{(\ell)}_{a,b}$ is an indicator of whether $a$ beats $b$ in round $\ell$. In the $m$-wise case the situation is more complicated, so we use the more general quantity $X^{(\ell)}_{a,A^-}$.

$M_{ab}$ vs. $R_{a,A^-}(t)$: $M_{ab}$ is the probability that $a$ beats $b$. In the $m$-wise case, to handle the more complicated ranking results, we use $R_{a,A^-}(t)$ to aggregate the probabilities of several cases and simplify the analysis.

Lemma 3 (Bernstein's inequality)

Let $Y_1, \dots, Y_n$ be independent zero-mean random variables. Suppose that $|Y_i| \le M$ almost surely for all $i$. Then, for all positive $t$,

$$\mathbb{P}\left( \sum_{i=1}^{n} Y_i > t \right) \le \exp\left( - \frac{\frac{1}{2} t^2}{\sum_{i=1}^{n} \mathbb{E}\left[Y_i^2\right] + \frac{1}{3} M t} \right).$$
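The bound in Lemma 3 is easy to evaluate numerically. The sketch below computes it for a sum of independent $\pm 1$ variables and compares it with a seeded Monte Carlo estimate of the tail; the choice of distribution, the sample sizes, and the function name are illustrative assumptions.

```python
import math
import random


def bernstein_tail_bound(t, second_moments, bound_m):
    """Bernstein bound exp(-(t^2 / 2) / (sum_i E[Y_i^2] + M * t / 3))."""
    variance_sum = sum(second_moments)
    return math.exp(-0.5 * t**2 / (variance_sum + bound_m * t / 3.0))


# Y_i uniform on {-1, +1}: zero mean, E[Y_i^2] = 1, |Y_i| <= M = 1.
n_vars, t = 400, 60.0
bound = bernstein_tail_bound(t, [1.0] * n_vars, 1.0)

rng = random.Random(0)
trials = 2000
hits = sum(
    sum(rng.choice((-1, 1)) for _ in range(n_vars)) > t for _ in range(trials)
)
empirical = hits / trials
print(f"Bernstein bound: {bound:.4f}, empirical tail frequency: {empirical:.4f}")
```

The empirical frequency should sit well below the bound, which is expected: Bernstein's inequality is a worst-case guarantee over all distributions with the given moments and range.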


SLIDE 16

Proof: Upper Bound

How do we apply Bernstein's inequality? Notice that

$$W_b - W_a = \sum_{\ell \in [r]} \left( \sum_{A^- \subseteq [n] \setminus \{b\}} X^{(\ell)}_{b,A^-} - \sum_{A^- \subseteq [n] \setminus \{a\}} X^{(\ell)}_{a,A^-} \right).$$

We need to prepare the following steps.

1 First, transform the r.v.s $X^{(\ell)}_{a,A^-}$ and $X^{(\ell)}_{b,A^-}$ on the RHS of the equation above into independent zero-mean r.v.s.

2 Bound the first-order and second-order moments of the new r.v.s obtained in the last step.

SLIDE 17

Proof: Upper Bound

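Step 1: define the centralized score

$$\bar{X}^{(\ell)}_{a,A^-} \triangleq X^{(\ell)}_{a,A^-} - \mathbb{E}\left[X^{(\ell)}_{a,A^-}\right] = X^{(\ell)}_{a,A^-} - p \sum_{t=1}^{m} \beta_t R_{a,A^-}(t), \quad \forall a \in [n],$$

and the centralized cross-score

$$\begin{aligned}
\bar{X}^{(\ell)}_{\{a,b\},A^{--}} &\triangleq X^{(\ell)}_{a,\{b,A^{--}\}} - X^{(\ell)}_{b,\{a,A^{--}\}} - \mathbb{E}\left[X^{(\ell)}_{a,\{b,A^{--}\}}\right] + \mathbb{E}\left[X^{(\ell)}_{b,\{a,A^{--}\}}\right] \\
&= X^{(\ell)}_{a,\{b,A^{--}\}} - X^{(\ell)}_{b,\{a,A^{--}\}} - \left( p \sum_{t=1}^{m} \beta_t R_{a,\{b,A^{--}\}}(t) - p \sum_{t=1}^{m} \beta_t R_{b,\{a,A^{--}\}}(t) \right).
\end{aligned}$$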



SLIDE 19

Proof: Upper Bound

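Step 2: for the centralized scores.

First-order moment bound: $\left|\bar{X}^{(\ell)}_{a,A^-}\right| \le \beta_1$ and $\left|\bar{X}^{(\ell)}_{\{a,b\},A^{--}}\right| \le 2\beta_1$.

Bound the variance of the centralized score as

$$\mathbb{E}\left[\left(\bar{X}^{(\ell)}_{b,A^-}\right)^2\right] = \mathbb{E}\left[\left(X^{(\ell)}_{b,A^-}\right)^2\right] - \left(\mathbb{E}\left[X^{(\ell)}_{b,A^-}\right]\right)^2 \le \mathbb{E}\left[\left(X^{(\ell)}_{b,A^-}\right)^2\right] = p \sum_{t=1}^{m} \beta_t^2 R_{b,A^-}(t).$$

Bound the variance of the centralized cross-score: ...

Combining the above, the sum of the variances of the centralized random variables can be upper-bounded as

$$\Sigma_{\bar{X}_{a,b}} \triangleq \sum_{A^- \subseteq [n] \setminus \{a,b\}} \left( \mathbb{E}\left[\left(\bar{X}^{(\ell)}_{b,A^-}\right)^2\right] + \mathbb{E}\left[\left(\bar{X}^{(\ell)}_{a,A^-}\right)^2\right] \right) + \sum_{A^{--} \subseteq [n] \setminus \{a,b\}} \mathbb{E}\left[\left(\bar{X}^{(\ell)}_{\{a,b\},A^{--}}\right)^2\right] \le 2 p \rho_{n,m} - p \rho_{n,m} \Delta_k.$$

Apply Bernstein's inequality plus some algebraic simplification $\Longrightarrow$ the upper bound.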
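One way the "algebraic simplification" can go is sketched below. It is a reconstruction under stated assumptions, not the authors' exact derivation: the variance bound $2p\rho_{n,m} - p\rho_{n,m}\Delta_k$ is taken to hold per round (so the total over $r$ rounds is $r$ times larger), the almost-sure bound is $M = 2\beta_1 = 2$, and $t = r p \rho_{n,m} \Delta_k$, which is at most $-\mathbb{E}[W_b - W_a] = r p \rho_{n,m} (\tau_a - \tau_b)$ for $a \in [k]$, $b \notin [k]$.

```latex
\begin{align*}
\mathbb{P}(W_b - W_a > 0)
&\le \exp\left( - \frac{\tfrac{1}{2} (r p \rho_{n,m} \Delta_k)^2}
     {r p \rho_{n,m} (2 - \Delta_k) + \tfrac{2}{3}\, r p \rho_{n,m} \Delta_k} \right)
 = \exp\left( - \frac{r p \rho_{n,m} \Delta_k^2}{2 \left( 2 - \tfrac{1}{3} \Delta_k \right)} \right) \\
&\le \exp\left( - \frac{r p \rho_{n,m} \Delta_k^2}{4} \right)
 \le \exp\left( - \frac{\alpha^2 \log n}{4} \right) = n^{-\alpha^2/4}.
\end{align*}
```

The second-to-last step uses $\Delta_k \ge 0$, and the last step uses $\Delta_k \ge \alpha \sqrt{\log n / (r p \rho_{n,m})}$, i.e. $M \in \mathcal{F}_k(\alpha)$.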


SLIDE 20

Proof: Converse Part

Constructing a set of probability distributions that are difficult to distinguish: for each $a \in \{k, \dots, n\}$, let $S^*[a] \doteq \{1, 2, \dots, k-1\} \cup \{a\}$ and $M^a \doteq \left\{ M^a_{\vec{v}} : \vec{v} \doteq A,\ A \subseteq [n] \right\}$. Define $M^a_{\vec{v}}$ as


SLIDE 21

Proof: Converse Part

Suppose the underlying distribution is drawn uniformly at random from the set $\mathcal{F} = \{ M^a \mid a \in [n] \setminus [k-1] \}$, and the true index is $a^*$. By Fano's inequality, any estimator $\hat{a}$ must satisfy

$$\mathbb{P}_M[\hat{a} \ne a^*] \ge 1 - \frac{\max_{a,b \in [n] \setminus [k-1]} D_{KL}(P_a \| P_b) + \log 2}{\log(n - k + 1)}.$$

The error probability of the top-$k$ set has the same lower bound. For any distinct $a, b \in \{k, k+1, \dots, n\}$,

$$D_{KL}(P_a \| P_b) \le r p\, h(n)\, \frac{4\delta^2}{1 - \delta^2},$$

where

$$h(n) = \frac{1}{\binom{m}{q}} \left[ \binom{k-1}{q-1} \binom{n-k}{m-q} + \binom{k}{q} \binom{n-k-1}{m-q-1} \right].$$
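The Fano step can be evaluated numerically once a bound on the pairwise KL divergence is available. The sketch below is only illustrative: the function name and the example values of the KL bound are assumptions, and natural logarithms are used throughout.

```python
import math


def fano_lower_bound(max_kl, num_hypotheses):
    """Fano lower bound 1 - (max KL + log 2) / log(number of hypotheses)."""
    return 1.0 - (max_kl + math.log(2)) / math.log(num_hypotheses)


n, k = 1000, 10
num_hypotheses = n - k + 1  # one candidate distribution per a in {k, ..., n}
for max_kl in (0.5, 2.0, 5.0):  # hypothetical values of the KL upper bound
    print(max_kl, round(fano_lower_bound(max_kl, num_hypotheses), 4))
```

The smaller the maximum KL divergence between the candidate distributions, the closer the error probability of any estimator is forced toward 1; this is why the construction aims for distributions that are hard to distinguish.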


SLIDE 22

Proof: Converse Part

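Using the property of the KL divergence, we can write

$$D_{KL}(P_a \| P_b) = \sum_{\ell \in [r]} p \sum_{A \subseteq [n] : |A| = m} D_{KL}\left( P_a\left(V^{(\ell)}_A\right) \,\middle\|\, P_b\left(V^{(\ell)}_A\right) \right) = r p \sum_{A \subseteq [n] : |A| = m} D_{KL}\left( P_a\left(V^{(\ell)}_A\right) \,\middle\|\, P_b\left(V^{(\ell)}_A\right) \right),$$

where $V^{(\ell)}_A$ is the result of the comparison among the elements in $A$.

If $a, b \notin A$, then clearly $D_{KL}\left( P_a\left(V^{(\ell)}_A\right) \,\middle\|\, P_b\left(V^{(\ell)}_A\right) \right) = 0$.

If $a \in A$ but $b \notin A$, and $|A \cap [k-1]| = q - 1$, we have

$$D_{KL}\left( P_a\left(V^{(\ell)}_A\right) \,\middle\|\, P_b\left(V^{(\ell)}_A\right) \right) = \frac{q!\,(m-q)!}{m!}\, \mu(\delta) \le 2 \omega_{m,q} \frac{p \delta^2}{1 - \delta^2}.$$

Many other cases...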


SLIDE 23

Conclusion

1 Extend the study of the nonparametric model to the m-wise case.

2 Prove upper and lower bounds on the error probability of the top-k estimate under the Borda counting procedure.

3 Other results: approximate Hamming error. Details are in the ISIT proceedings.

4 Future work: study the effect of various β configurations.

SLIDE 24

Thank you!



Contents

1 Problem Setup

2 Borda Counting Procedure

3 Main Results

4 Proof
