Statistical inference for incomplete ranking data: A comparison of two likelihood-based estimators
SLIDE 1

DA2PL 2016 Paderborn, Nov 7, 2016

Statistical inference for incomplete ranking data: A comparison of two likelihood-based estimators

Inés Couso Mohsen Ahmadi Eyke Hüllermeier

SLIDE 2

Statement of the problem

❖ Sample of N individuals.
❖ K alternatives.
❖ Every individual provides a pairwise comparison.
❖ Goal: estimate the most popular complete ranking over the K alternatives.
❖ Intermediate goal: estimate the probability distribution over the collection of K! possible rankings.
SLIDE 3

Notation

  • Alternatives: a1, . . . , aK.
  • Complete ranking: π : {1, . . . , K} → {1, . . . , K}.
  • π(i) = position of ai in the ranking.
  • Observable incomplete rankings: ai ≻ aj.
  • E(ai ≻ aj) = {π : π(i) < π(j)}.
  • Dataset: (τ1, . . . , τN), sequence of i.i.d. observations (pairwise rankings).
SLIDE 4

A stochastic model of coarsening

  • pθ(π) = probability of appearance of π (complete ranking).
  • pλ(τ|π) = probability of observing τ, provided the true ranking is π.
  • pθ,λ(π, τ): joint mass function.
SLIDE 5

Two likelihood-based approaches

LV(θ, λ) = P(~τ | θ, λ) = ∏_{i=1}^N P(Y = τi | θ, λ) = ∏_{i=1}^N Σ_{π∈SK} pθ(π) pλ(τi | π).

LF(θ, λ) = ∏_{i=1}^N P(X ∈ E(τi) | θ, λ) = ∏_{i=1}^N Σ_{π∈E(τi)} pθ(π).
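For small K, both likelihoods can be evaluated by brute-force enumeration of SK. A minimal Python sketch, assuming dictionary representations of pθ and pλ (these representations, and the function names, are ours, not the slides'):

```python
import itertools
import math

def rankings(K):
    """All complete rankings: tuples pi with pi[i] = position of alternative a_i."""
    return list(itertools.permutations(range(K)))

def visible_loglik(p_theta, p_lambda, data, K):
    """log LV(theta, lambda) = sum_i log sum_{pi in SK} p_theta(pi) p_lambda(tau_i | pi)."""
    return sum(math.log(sum(p_theta[pi] * p_lambda[(tau, pi)]
                            for pi in rankings(K)))
               for tau in data)

def face_value_loglik(p_theta, data):
    """log LF(theta) = sum_i log sum_{pi in E(tau_i)} p_theta(pi),
    where tau = (i, j) encodes a_i > a_j and E(tau) = {pi : pi[i] < pi[j]}."""
    return sum(math.log(sum(p for pi, p in p_theta.items()
                            if pi[i] < pi[j]))
               for (i, j) in data)

# Toy example (K = 3): uniform p_theta, and a coarsening that reveals each of
# the three comparisons consistent with pi with probability 1/3.
K = 3
S = rankings(K)
p_theta = {pi: 1.0 / len(S) for pi in S}
pairs = [(i, j) for i in range(K) for j in range(K) if i != j]
p_lambda = {(tau, pi): (1 / 3 if pi[tau[0]] < pi[tau[1]] else 0.0)
            for tau in pairs for pi in S}
data = [(0, 1), (1, 2)]   # a1 > a2 and a2 > a3 observed
```

Under the uniform pθ, each pairwise event E(τ) has mass 1/2 in LF, while each τ has visible probability 1/6 in LV, so the two likelihoods differ even on this tiny example.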
SLIDE 6

Plackett-Luce parametric family

  • Parameter θ = (θ1, . . . , θK).
  • PLθ({π}) = ∏_{i=1}^K θ_{π⁻¹(i)} / (θ_{π⁻¹(i)} + θ_{π⁻¹(i+1)} + . . . + θ_{π⁻¹(K)}).
  • π∗ = arg max_{π∈SK} PLθ(π) = arg sort_{k∈[K]} {θ1, . . . , θK}.
    (If, for instance, θ1 > . . . > θK, then π∗ = [1 . . . K].)
  • PLθ(E(ai ≻ aj)) = θi / (θi + θj).
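These three formulas can be checked directly on small instances; a sketch (the function names are ours):

```python
def pl_prob(theta, pi):
    """Plackett-Luce probability of a complete ranking pi, where pi[i] is the
    (0-based) position of alternative a_i and theta[i] is its strength:
    product over positions k of theta[item at k] / sum of theta over items at k..K-1."""
    K = len(theta)
    order = [None] * K                     # order[k] = alternative ranked at position k
    for i, pos in enumerate(pi):
        order[pos] = i
    prob = 1.0
    for k in range(K):
        prob *= theta[order[k]] / sum(theta[i] for i in order[k:])
    return prob

def pl_mode(theta):
    """Modal ranking: sort alternatives by decreasing theta; returns pi with
    pi[i] = position of a_i."""
    order = sorted(range(len(theta)), key=lambda i: -theta[i])
    pi = [None] * len(theta)
    for pos, i in enumerate(order):
        pi[i] = pos
    return tuple(pi)

def pl_pairwise(theta, i, j):
    """PL marginal probability that a_i is ranked before a_j."""
    return theta[i] / (theta[i] + theta[j])

theta = (0.5, 0.3, 0.2)
print(pl_mode(theta))              # (0, 1, 2): a1 before a2 before a3
print(pl_prob(theta, (0, 1, 2)))   # 0.5 * 0.6 * 1.0 = 0.3
print(pl_pairwise(theta, 0, 1))    # 0.5 / 0.8 = 0.625
```

Summing `pl_prob` over all of S3 returns 1, confirming the family is a proper distribution.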
SLIDE 7

Known coarsening: example

Coarsening table (K = 3) with columns a1≻a2, a2≻a1, a1≻a3, a3≻a1, a2≻a3, a3≻a2 and rows a1a2a3, . . . , a3a2a1: the coarsening is known and deterministic, i.e. each complete ranking reveals one fixed pairwise comparison with probability 1 (all entries pλ(τ|π) ∈ {0, 1}).

LF(~τ; θ) = ∏_{i=1}^3 ∏_{j≠i} (θi / (θi + θj))^{nij},

where nij is the number of observations of ai ≻ aj.

θ = (0.99, 0.05, 0.05);  θ̂ = arg max LF(~τ; θ) → (0, 0.5, 0.5).
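The biased face-value estimate can be reproduced with a small fixed-point solver for the pairwise likelihood above (an MM iteration for its Bradley-Terry form; the solver and the counts below are our illustration, chosen so that only a2-vs-a3 comparisons are ever observed, as happens when the coarsening never reveals a1):

```python
def bt_mle(n, iters=100):
    """Maximize prod_{i != j} (theta_i / (theta_i + theta_j))^{n[i][j]}
    by an MM fixed point; n[i][j] = times a_i was observed preferred to a_j."""
    K = len(n)
    theta = [1.0 / K] * K
    wins = [sum(row) for row in n]
    for _ in range(iters):
        new = []
        for i in range(K):
            if wins[i] == 0:
                new.append(0.0)   # never observed winning: the MLE pushes theta_i to 0
                continue
            denom = sum((n[i][j] + n[j][i]) / (theta[i] + theta[j])
                        for j in range(K)
                        if j != i and theta[i] + theta[j] > 0)
            new.append(wins[i] / denom)
        s = sum(new)
        theta = [t / s for t in new]
    return theta

# Hypothetical counts: a1 never appears; a2 > a3 and a3 > a2 each 500 times.
n = [[0, 0, 0],
     [0, 0, 500],
     [0, 500, 0]]
theta_hat = bt_mle(n)   # -> [0.0, 0.5, 0.5]
```

Even though the true θ puts almost all strength on a1, the face-value estimate collapses θ̂1 to 0, matching the slide's (0, 0.5, 0.5).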
SLIDE 8

Unknown coarsening

Number of unknowns: 2 · K! · C(K, 2).

Coarsening parameters λ^π_{r,s} (pair of positions (r, s) revealed, given the true ranking π); entries not shown are 0:

                  a1≻a2       a2≻a1       a1≻a3       a3≻a1       a2≻a3       a3≻a2
π1 = a1a2a3   λ^π1_{1,2}       0       λ^π1_{1,3}       0       λ^π1_{2,3}       0
π2 = a1a3a2   λ^π2_{1,3}       0       λ^π2_{1,2}       0           0       λ^π2_{2,3}
π3 = a2a1a3       0       λ^π3_{1,2}   λ^π3_{2,3}       0       λ^π3_{1,3}       0
π4 = a2a3a1       0       λ^π4_{1,3}       0       λ^π4_{2,3}   λ^π4_{1,2}       0
π5 = a3a1a2   λ^π5_{2,3}       0           0       λ^π5_{1,2}       0       λ^π5_{1,3}
π6 = a3a2a1       0       λ^π6_{2,3}       0       λ^π6_{1,3}       0       λ^π6_{1,2}
SLIDE 9

Rank-dependent coarsening assumption

Assumption: the coarsening probabilities depend only on the pair of positions compared, not on the ranking itself: λ^π_{r,s} = λ_{r,s}. Entries not shown are 0:

             a1≻a2     a2≻a1     a1≻a3     a3≻a1     a2≻a3     a3≻a2
a1a2a3       λ1,2        0       λ1,3        0       λ2,3        0
a1a3a2       λ1,3        0       λ1,2        0         0       λ2,3
a2a1a3         0       λ1,2      λ2,3        0       λ1,3        0
a2a3a1         0       λ1,3        0       λ2,3      λ1,2        0
a3a1a2       λ2,3        0         0       λ1,2        0       λ1,3
a3a2a1         0       λ2,3        0       λ1,3        0       λ1,2

Number of unknowns: K(K−1)/2 − 1.

The FLM is able to estimate the mode of the PL model (a.s., provided N is sufficiently large).
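Under this assumption, generating an observation requires only the position-pair probabilities; a sampling sketch (the function and variable names are ours):

```python
import random

def coarsen(pi, lam, rng):
    """Rank-dependent coarsening: pick a pair of positions (r, s), r < s, with
    probability lam[(r, s)], and reveal the comparison between the alternatives
    occupying those positions in the complete ranking pi (pi[i] = position of a_i)."""
    pairs = list(lam)
    r, s = rng.choices(pairs, weights=[lam[p] for p in pairs], k=1)[0]
    at = {pos: i for i, pos in enumerate(pi)}   # invert pi: position -> alternative
    return (at[r], at[s])                        # (i, j) encodes a_i > a_j

rng = random.Random(0)
lam = {(0, 1): 1 / 3, (0, 2): 1 / 3, (1, 2): 1 / 3}   # uniform over position pairs (0-based)
pi = (2, 0, 1)            # a2 first, a3 second, a1 last
obs = coarsen(pi, lam, rng)
```

By construction every sampled observation (i, j) is consistent with π, i.e. pi[i] < pi[j].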
SLIDE 10

Three experimental settings

❖ The MLM (solid), the FLM (dashed) and the TLM (dotted) are compared.
❖ Three different coarsening processes are considered.
❖ The same PL parameter is taken in the three cases.
❖ Left column: Euclidean distance between the true parameter and the parameter estimate.
❖ Right column: Kendall distance between the true mode and the predicted ranking.

[Figure: six panels of learning curves, sample size 500–2000 on the x-axis, error on the y-axis.]
SLIDE 11

Case 1: uniform selection of pairwise comparison

[Figure: learning curves for case 1, sample size 500–2000 on the x-axis.]

In this experiment, the pair of positions to be compared is selected uniformly at random: λ1,2 = . . . = λ3,4 = 1/6.
SLIDE 12

Case 2: top 2 case

[Figure: learning curves for case 2, sample size 500–2000 on the x-axis.]

In this experiment, λ1,2 = 1, which corresponds to the top-2 setting: the observation always compares the alternatives ranked first and second.
SLIDE 13

Case 3: rank proportional selection

[Figure: learning curves for case 3, sample size 500–2000 on the x-axis.]

The selection probability is proportional to the ranks: λi,j ∝ (8 − i − j), so pairs of top positions are selected with a higher probability than pairs of bottom positions.
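Assuming K = 4 (consistent with λ3,4 appearing in case 1, so positions run from 1 to 4; this value is our inference, not stated on the slide), the normalized rank-proportional probabilities can be computed directly:

```python
# Rank-proportional selection over position pairs (i, j), 1 <= i < j <= 4:
# raw weight 8 - i - j, normalized so the lambdas sum to 1.
pairs = [(i, j) for i in range(1, 5) for j in range(i + 1, 5)]
weights = {p: 8 - p[0] - p[1] for p in pairs}
total = sum(weights.values())                 # 5+4+3+3+2+1 = 18
lam = {p: w / total for p, w in weights.items()}
# lam[(1, 2)] = 5/18 (top pair, most likely); lam[(3, 4)] = 1/18 (bottom pair, least likely)
```

This makes the asymmetry explicit: the top-two comparison is five times more likely to be revealed than the bottom-two comparison.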
SLIDE 14

Conclusion

❖ The MLM is theoretically the best one, but it involves too many parameters.
❖ The FLM is simpler, but it ignores the coarsening process; this may lead to biased estimates.
❖ Biased estimates of the parameter do not necessarily imply inaccurate predictions of the most popular ranking.
❖ Future directions: search for computationally acceptable methods with good performance.