SLIDE 1

Cautious label-wise ranking with constraint satisfaction

Sébastien Destercke, Yonatan Carlos Carranza Alarcon DA2PL 2018

  • S. Destercke, Y. Alarcon

Cautious label ranking DA2PL 2018 1 / 26

SLIDE 2

Some announcements: SUM 2019

When: 16-18 December 2019
Where: Compiègne
What: (scalable) uncertainty management
How: papers (long/short/abstracts), but also tutorials/surveys of particular areas

SLIDE 3

Where is Compiègne?

SLIDE 4

Our approach in a nutshell

What? Cautious label ranking by rank-wise decomposition

How?

  • Rank-wise decomposition
  • For each label, predict a set of ranks using imprecise probabilities
  • Use a CSP to resolve inconsistencies and to remove impossible assignments

Why?

  • Weak information is more likely to be of use in structured settings
  • Few rank-wise approaches (except score-based ones) exist for this problem

SLIDE 5

Introduction

Introduction and decomposition

SLIDE 6

Introduction

Ranking data - preferences

To each instance x corresponds an ordering over the possible labels.

  • Blog theme: a blog x can be about Politics ≻ Literature ≻ Movies ≻ . . .
  • Customer preferences: a customer x may prefer White wine ≻ Red wine ≻ Beer ≻ . . .

SLIDE 7

Introduction

Classification problem/data

C = {c1, c2, c3}

X1      X2      X3      X4      Y
107.1   25      Blue    60      c3
−50     10      Red     40      c1
200.6   30      Blue    58      c2
107.1   5       Green   60      c4
. . .   . . .   . . .   . . .   . . .

SLIDE 8

Introduction

Label ranking problem/data

W = {w1, w2, w3}

X1      X2      X3      X4      Y
107.1   25      Blue    60      w1 ≻ w3 ≻ w2
−50     10      Red     40      w2 ≻ w1 ≻ w3
200.6   30      Blue    58      w1 ≻ w2 ≻ w3
107.1   5       Green   60      w3 ≻ w1 ≻ w2
. . .   . . .   . . .   . . .   . . .

Potentially huge output space (K! rankings with K labels) → the naive extension (one ranking = one class) is doomed to fail
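To get a feel for how quickly the K! output space grows, the short sketch below simply tabulates the number of complete rankings for a few label counts. The values of K are illustrative and do not come from the slides.

```python
from math import factorial


def n_rankings(k: int) -> int:
    """Number of complete rankings (permutations) of k labels."""
    return factorial(k)


# Illustrative label counts (assumption, not taken from the slides).
sizes = {k: n_rankings(k) for k in (3, 5, 10)}
print(sizes)
```

With only 10 labels the "one ranking = one class" view already has more than three million classes, far more than most data sets have instances.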

SLIDE 9

Introduction

One solution: rank-wise decomposition

D

X1      X2      X3      X4      Y
107.1   25      Blue    60      λ1 ≻ λ3 ≻ λ2
−50     10      Red     40      λ2 ≻ λ3 ≻ λ1
200.6   30      Blue    58      λ2 ≻ λ1 ≻ λ3
107.1   5       Green   33      λ1 ≻ λ2 ≻ λ3
. . .   . . .   . . .   . . .   . . .

D1 (rank of λ1)

X1      X4      Y
107.1   60      1
−50     40      3
200.6   58      2
107.1   33      1
. . .   . . .   . . .

D2 (rank of λ2)

X1      X4      Y
107.1   60      3
−50     40      1
200.6   58      1
107.1   33      2
. . .   . . .   . . .

D3 (rank of λ3)

X1      X4      Y
107.1   60      2
−50     40      2
200.6   58      3
107.1   33      3
. . .   . . .   . . .

For each label, solve an ordinal regression problem
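The decomposition can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `rankwise_decomposition` and the data layout (a list of feature tuples paired with rankings) are assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

Ranking = Sequence[str]  # labels ordered from most to least preferred


def rankwise_decomposition(
    data: List[Tuple[Sequence, Ranking]],
) -> Dict[str, List[Tuple[Sequence, int]]]:
    """Split a label-ranking data set into one (features, rank) data set per label."""
    per_label: Dict[str, List[Tuple[Sequence, int]]] = {}
    for features, ranking in data:
        # The position of a label in the ranking (1-based) is its rank.
        for position, label in enumerate(ranking, start=1):
            per_label.setdefault(label, []).append((features, position))
    return per_label


# The four instances of table D on the slide.
D = [
    ((107.1, 25, "Blue", 60), ("λ1", "λ3", "λ2")),
    ((-50, 10, "Red", 40), ("λ2", "λ3", "λ1")),
    ((200.6, 30, "Blue", 58), ("λ2", "λ1", "λ3")),
    ((107.1, 5, "Green", 33), ("λ1", "λ2", "λ3")),
]

datasets = rankwise_decomposition(D)
ranks = {label: [rank for _, rank in rows] for label, rows in datasets.items()}
print(ranks)
```

The rank columns obtained for λ1, λ2 and λ3 match the Y columns of D1, D2 and D3, and each per-label data set can then be handed to any ordinal regression learner.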

SLIDE 10

Predicting ranks

Predicting candidate ranks

SLIDE 11

Predicting ranks

Learning with IP: a crash course

Classical case:

  • input space X and output space Y
  • set D = {(x1, y1), . . . , (xn, yn)} of data
  • given x, estimate P(y|x) using D
  • P(y|x) = information about y when observing x

However, the estimate P̂(y|x) of P(y|x) can be pretty bad if

  • data are noisy, missing or imprecise
  • the estimation is based on little data

Replace the estimate P̂(y|x) by a set 𝒫 of estimates
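The slides do not say how the set of estimates is built. One standard construction, used here purely as an assumption for illustration, is the imprecise Dirichlet model: with counts n_y over N observations and a hyperparameter s > 0, each probability is bounded by [n_y / (N + s), (n_y + s) / (N + s)]. The function name `idm_intervals` is hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Sequence, Tuple


def idm_intervals(
    observations: Iterable[Hashable],
    outcomes: Sequence[Hashable],
    s: float = 2.0,
) -> Dict[Hashable, Tuple[float, float]]:
    """Lower/upper probability of each outcome under the imprecise Dirichlet model."""
    counts = Counter(observations)
    n = sum(counts[y] for y in outcomes)
    return {y: (counts[y] / (n + s), (counts[y] + s) / (n + s)) for y in outcomes}


# Three observations: the intervals are still wide.
few = idm_intervals(["a", "a", "b"], outcomes=("a", "b"), s=1.0)
# No observation at all: every interval is the vacuous [0, 1].
none = idm_intervals([], outcomes=("a", "b"), s=2.0)
print(few, none)
```

The intervals shrink as N grows, which matches the idea that little data should translate into a larger set of estimates.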

SLIDE 12

Predicting ranks

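[Figure: two panels over the features X1 and X2, each with an unlabelled query point "?"]

Ambiguity: at the query point, the probabilities of the two classes both lie in [0.49, 0.51]

Lack of information: at the query point, the probabilities of the two classes lie in [0, 0.7] and [0.3, 1]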

SLIDE 13

Predicting ranks

Decision with probability sets

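Probability sets: if ℓ_ω : Y → ℝ is the loss function of choice ω ∈ Y, then

$$
\omega \succ \omega' \iff \underline{E}(\ell_{\omega'} - \ell_{\omega}) = \inf_{P \in \mathcal{P}(y|x)} E_P(\ell_{\omega'} - \ell_{\omega}) \geq 0
$$

$$
\iff \inf_{P \in \mathcal{P}(y|x)} \sum_{y \in Y} P(y|x)\,\bigl(\ell_{\omega'}(y) - \ell_{\omega}(y)\bigr) \geq 0
$$

⇒ if information is insufficient, we can have ω ⊁ ω′ and ω′ ⊁ ω, that is, $\underline{E}(\ell_{\omega'} - \ell_{\omega}) < 0$ and $\underline{E}(\ell_{\omega} - \ell_{\omega'}) < 0$

⇒ Possibly optimal decisions = maximal element(s) of ≻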
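The dominance test can be sketched for the simple case where the probability set is given as a finite list of candidate distributions (for a convex set its extreme points suffice, since the expectation is linear in P). All names below are hypothetical, and the distributions are made up for the example. The sketch uses a strict inequality so that two choices with identical expected loss do not eliminate each other, whereas the slide states the criterion with ≥ 0.

```python
from typing import Callable, Dict, List, Sequence

Dist = Dict[int, float]  # outcome -> probability


def lower_expectation(f: Callable[[int], float], dists: Sequence[Dist]) -> float:
    """Smallest expectation of f over the candidate distributions."""
    return min(sum(p * f(y) for y, p in d.items()) for d in dists)


def dominates(a: int, b: int, loss, dists: Sequence[Dist]) -> bool:
    """a dominates b if the lower expectation of loss(b, .) - loss(a, .) is positive."""
    return lower_expectation(lambda y: loss(b, y) - loss(a, y), dists) > 0


def maximal_choices(choices: Sequence[int], loss, dists: Sequence[Dist]) -> List[int]:
    """Choices that are not dominated by any other choice."""
    return [
        a for a in choices
        if not any(dominates(b, a, loss, dists) for b in choices if b != a)
    ]


def l1_loss(j: int, k: int) -> float:
    return abs(j - k)


# Made-up distributions over three ranks (assumption for illustration).
d1 = {1: 0.6, 2: 0.2, 3: 0.2}
d2 = {1: 0.2, 2: 0.2, 3: 0.6}

imprecise = maximal_choices([1, 2, 3], l1_loss, [d1, d2])
precise = maximal_choices([1, 2, 3], l1_loss, [d1])
print(imprecise, precise)
```

With the two conflicting distributions no rank dominates another, so all three remain possibly optimal; with a single distribution the usual precise decision (rank 1) is recovered.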

SLIDE 14

Predicting ranks

Our choice of ℓ, P

What?

  • ℓ = L1 norm between ranks: the loss of predicting rank j if k is the true rank is ℓ_j(k) = |j − k|
  • 𝒫 described by lower/upper cumulative distributions F̲, F̄

Why?

  • the prediction is guaranteed to be an "interval" of ranks (dedicated CSP models)
  • it corresponds to the set of possible medians (very easy to get)

SLIDE 15

Predicting ranks

An example of rank prediction

Rank j    1       2       3       4       5
F̄(j)      0.15    0.55    0.7     0.95    1
F̲(j)      0.1     0.3     0.45    0.8     1

[Figure: the two cumulative distributions plotted over ranks 1 to 5]

Predicted rank for label: {2, 3}
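One way to compute a set of possible medians from the two bounds is sketched below. It assumes that the median of a cumulative distribution F is the smallest rank j with F(j) ≥ 0.5 and that every non-decreasing F lying between the bounds is allowed; both are assumptions of this sketch, and the function name is hypothetical.

```python
from typing import List, Sequence


def possible_medians(lower_cdf: Sequence[float], upper_cdf: Sequence[float]) -> List[int]:
    """Ranks (1-based) that are the median of some CDF lying between the two bounds."""

    def first_at_least_half(cdf: Sequence[float]) -> int:
        return next(j for j, value in enumerate(cdf, start=1) if value >= 0.5)

    smallest = first_at_least_half(upper_cdf)  # upper bound reaches 0.5 first
    largest = first_at_least_half(lower_cdf)   # lower bound reaches 0.5 last
    return list(range(smallest, largest + 1))


# Values from the table on the slide.
upper = [0.15, 0.55, 0.7, 0.95, 1.0]
lower = [0.1, 0.3, 0.45, 0.8, 1.0]
medians = possible_medians(lower, upper)
print(medians)
```

The result is always a set of consecutive ranks, which is the "interval" property mentioned on the previous slide. With the values of the table this particular definition returns {2, 3, 4}, one rank more than the {2, 3} reported on the slide, so the slide's prediction may rest on a different median convention or on values that differ slightly from the extracted table.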

SLIDE 16

Making a final prediction

Making a final cautious prediction

SLIDE 17

Making a final prediction

Inconsistency and assignment reductions

Inconsistency: consider four labels λ1, λ2, λ3, λ4. The predicted possible ranks

R̂1 = {1, 3}, R̂2 = {1, 3}, R̂3 = {1, 3}, R̂4 = {2, 4}

are inconsistent → λ1, λ2, λ3 should all take different values, yet only two ranks are available to them.

Removal of impossible solutions: consider the predictions

R̂1 = {1, 2}, R̂2 = {1, 2, 3}, R̂3 = {2}, R̂4 = {1, 2, 3, 4}.

As λ3 has to take value 2, λ1 has to take value 1, . . . until we get

R̂′1 = {1}, R̂′2 = {3}, R̂′3 = {2}, R̂′4 = {4}

SLIDE 18

Making a final prediction

Dealing with the issue: CSP modelling

  • A set of possible ranks R̂i ⊆ {1, . . . , K} is predicted for each label
  • Need to find out whether each label can take a different value
  • This is exactly what the all-different constraint does in CSP
  • So, just apply standard libraries
  • Bonus: if all R̂i are intervals, efficient (polynomial) algorithms exist
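To make the all-different reasoning concrete without depending on a particular solver, here is a small self-contained sketch based on bipartite matching (augmenting paths). It is not the dedicated filtering algorithm a CSP library would use, only a simple stand-in; the function names are hypothetical. It is applied to the two examples of the inconsistency slide.

```python
from typing import Dict, Optional, Set


def has_assignment(domains: Dict[str, Set[int]]) -> bool:
    """True if every label can receive a distinct rank from its domain."""
    owner: Dict[int, str] = {}  # rank -> label currently holding it

    def try_assign(label: str, seen: Set[int]) -> bool:
        for rank in domains[label]:
            if rank in seen:
                continue
            seen.add(rank)
            # Take a free rank, or move its current owner to another rank.
            if rank not in owner or try_assign(owner[rank], seen):
                owner[rank] = label
                return True
        return False

    return all(try_assign(label, set()) for label in domains)


def filter_all_different(domains: Dict[str, Set[int]]) -> Optional[Dict[str, Set[int]]]:
    """Prune ranks that appear in no all-different assignment; None if inconsistent."""
    if not has_assignment(domains):
        return None
    return {
        label: {r for r in ranks if has_assignment({**domains, label: {r}})}
        for label, ranks in domains.items()
    }


inconsistent = filter_all_different(
    {"λ1": {1, 3}, "λ2": {1, 3}, "λ3": {1, 3}, "λ4": {2, 4}}
)
pruned = filter_all_different(
    {"λ1": {1, 2}, "λ2": {1, 2, 3}, "λ3": {2}, "λ4": {1, 2, 3, 4}}
)
print(inconsistent, pruned)
```

The first call reports the inconsistency, and the second reproduces the reduced sets R̂′1 = {1}, R̂′2 = {3}, R̂′3 = {2}, R̂′4 = {4}. A constraint solver would do the same job more efficiently, especially when all sets are intervals.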

SLIDE 19

Making a final prediction

Experiments

SLIDE 20

Making a final prediction

Setting

Material and method

  • Classification and regression data sets turned into ranking data
  • Binary decomposition + naive imprecise classifier

Measuring result quality

Completeness (CP):

$$
CP(\hat{R}) = \frac{k^2 - \sum_{i=1}^{k} |\hat{R}_i|}{k^2 - k}
$$

Max if only one ranking is possible, min if all rankings are possible.

Correctness (CR):

$$
CR(\hat{R}) = 1 - \frac{\sum_{i=1}^{k} \min_{\hat{r}_i \in \hat{R}_i} |\hat{r}_i - r_i|}{0.5\,k^2}
$$

Equivalent to Spearman footrule if one ranking is predicted.
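Both measures translate directly into code. The sketch below follows the two formulas as written on the slide; the function names, the dictionary layout and the example prediction are assumptions made for illustration, and k ≥ 2 labels are assumed so that the denominators are non-zero.

```python
from typing import Dict, Set


def completeness(pred: Dict[str, Set[int]]) -> float:
    """CP = (k^2 - sum of |R_i|) / (k^2 - k)."""
    k = len(pred)
    return (k * k - sum(len(ranks) for ranks in pred.values())) / (k * k - k)


def correctness(pred: Dict[str, Set[int]], truth: Dict[str, int]) -> float:
    """CR = 1 - (sum over labels of the distance to the closest predicted rank) / (0.5 k^2)."""
    k = len(pred)
    total = sum(
        min(abs(r - truth[label]) for r in ranks) for label, ranks in pred.items()
    )
    return 1 - total / (0.5 * k * k)


# Made-up prediction and true ranking over four labels.
truth = {"λ1": 1, "λ2": 2, "λ3": 3, "λ4": 4}
pred = {"λ1": {1, 2}, "λ2": {3}, "λ3": {2}, "λ4": {4}}
vacuous = {label: {1, 2, 3, 4} for label in truth}

print(completeness(pred), correctness(pred, truth))
print(completeness(vacuous), correctness(vacuous, truth))
```

The vacuous prediction is always fully correct but has zero completeness, which illustrates why the two measures have to be read together.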
SLIDE 21

Making a final prediction

An example of results

[Figure: two panels, each titled "Discretization 6 intervals", with axes Completeness and Correctness and a legend of values 0.1, 1.1, . . . , 10.1]

SLIDE 22

Making a final prediction

Why rank-wise approaches?

SLIDE 23

Making a final prediction

Yes, why?

  • Different expressiveness when it comes to representing partial predictions: e.g., the set-valued prediction {λ1 ≻ λ2 ≻ λ3, λ3 ≻ λ2 ≻ λ1} between three labels is perfectly representable by imprecise ranks, but not through pairwise information or partial orders (i.e., interval-valued scores)
  • Not (entirely) clear how to make score-based methods imprecise (IP-SVM?); would they then need to be turned into imprecise ranks?
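The three-label example can be checked by brute force. The sketch below reads the second ranking of the example as λ3 ≻ λ2 ≻ λ1 (an assumption about the intended example) and encodes the prediction as the imprecise ranks R̂1 = {1, 3}, R̂2 = {2}, R̂3 = {1, 3}; the function name is hypothetical.

```python
from itertools import permutations
from typing import Dict, List, Set, Tuple


def consistent_rankings(rank_sets: Dict[str, Set[int]]) -> List[Tuple[str, ...]]:
    """All complete rankings in which every label gets a rank from its own set."""
    labels = list(rank_sets)
    rankings = []
    for ranks in permutations(range(1, len(labels) + 1)):
        assignment = dict(zip(labels, ranks))
        if all(assignment[label] in rank_sets[label] for label in labels):
            # Order labels from rank 1 (most preferred) downwards.
            rankings.append(tuple(sorted(labels, key=assignment.get)))
    return rankings


rankings = consistent_rankings({"λ1": {1, 3}, "λ2": {2}, "λ3": {1, 3}})
print(rankings)
```

Exactly the two rankings of the example are returned. Pairwise information cannot isolate them: every pair of labels appears in both orders across the two rankings, so no pairwise preference can be asserted and all six rankings would remain possible.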
