Unsupervised Ensemble of Ranking Models for News Comments Using Pseudo Answers

slide-1
SLIDE 1

Unsupervised Ensemble of Ranking Models for News Comments Using Pseudo Answers

Soichiro Fujita 1, Hayato Kobayashi 2, Manabu Okumura 1

  • 1. Tokyo Institute of Technology
  • 2. Yahoo Japan Corporation / RIKEN AIP
slide-2
SLIDE 2

Background

Task: Ranking comments on online news services

Goal: Display high-quality comments

Problem: "High quality" involves complex factors

[Figure: a news article and its user comments are fed to a model, which outputs a ranking of the comments (No.1, No.2, No.3)]

slide-3
SLIDE 3
Ranking news comments is difficult

  • A comment can be judged good for various reasons, for example:
    • Indicating rare user experiences
    • Providing new ideas
    • Causing discussions
  • Ranking models often fail to capture this information

How to deal with this problem? → Ensemble techniques

If we prepare many models, some of them can capture this information

slide-4
SLIDE 4

Two Basic Ensemble Techniques

Selecting

  • ✓ Removes the noise of lower-accuracy models
  • ✘ Depends on a single model's output

Averaging

  • ✓ Makes up for other models' mistakes
  • ✘ A lower-accuracy model could act as noise

[Figure: diagrams of the two techniques, labeled "model outputs", "selected outputs", and "Final output"; averaging combines the outputs with ⨁]
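The two basic techniques can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the toy scores, and the use of a per-model validation accuracy as the selection criterion are all assumptions made for the example.

```python
from typing import List


def select_best(outputs: List[List[float]], accuracies: List[float]) -> List[float]:
    """Selecting: keep only the output of the model with the highest accuracy.

    `accuracies` is a hypothetical per-model quality estimate
    (for example, measured on validation data).
    """
    best = max(range(len(outputs)), key=lambda i: accuracies[i])
    return outputs[best]


def average(outputs: List[List[float]]) -> List[float]:
    """Averaging: mean score of each comment over all models."""
    n_models = len(outputs)
    return [sum(column) / n_models for column in zip(*outputs)]


def to_ranking(scores: List[float]) -> List[int]:
    """Comment indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


# Toy example: three models scoring three comments.
outputs = [
    [0.9, 0.2, 0.5],
    [0.1, 0.8, 0.6],  # this model disagrees with the other two
    [0.7, 0.3, 0.4],
]
print(to_ranking(select_best(outputs, [0.8, 0.4, 0.7])))
print(to_ranking(average(outputs)))
```

Selecting ignores the disagreeing model entirely but trusts one model; averaging uses every model, so the disagreeing model still pulls the scores toward its own ordering.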

slide-5
SLIDE 5
Proposed method: HPA ensemble

  • Hybrid method using the Pseudo Answer (HPA)
  • Hybrid of an output selection and a typical averaging method
  • Dynamic denoising of outputs via a pseudo answer $\bar{r}$

[Figure: Selecting removes lower-accuracy outputs, treating $\bar{r}$ as an ideal ranking; Weighting then combines the remaining outputs]

slide-6
SLIDE 6

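Step 1: Calculate a pseudo answer

Normalize and average the ranking scores $r$ of all models:

$$\bar{r} = \frac{1}{|R|} \sum_{r \in R} \frac{r}{\|r\|}$$

[Figure: each model outputs ranking scores $r$ (each block represents the ranking score of a comment); the rankings are normalized and averaged into the pseudo answer $\bar{r}$]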
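As a rough sketch of Step 1, the pseudo answer can be computed by normalizing each model's score vector and averaging the results. The slides do not say which norm is used; the L2 norm below is an assumption, and the function names are hypothetical.

```python
import math
from typing import List


def l2_normalize(scores: List[float]) -> List[float]:
    """Divide a score vector by its L2 norm (assumed norm; not stated in the slides)."""
    norm = math.sqrt(sum(x * x for x in scores))
    return [x / norm for x in scores] if norm > 0 else list(scores)


def pseudo_answer(rankings: List[List[float]]) -> List[float]:
    """Average of the normalized score vectors of all models."""
    normalized = [l2_normalize(r) for r in rankings]
    n_models = len(normalized)
    return [sum(column) / n_models for column in zip(*normalized)]


# Two models scoring two comments on very different scales.
print(pseudo_answer([[3.0, 4.0], [0.0, 2.0]]))
```

Normalizing first keeps a model with large raw scores from dominating the average.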

slide-7
SLIDE 7

Step 2: Calculate similarity scores of each predicted ranking

Regard the pseudo answer as an ideal ranking, using NDCG as the similarity function:

$$g(r) = \mathrm{NDCG}(\bar{r}, r)$$

[Figure: each ranking is compared with the pseudo answer $\bar{r}$, giving similarity scores such as 0.87, 0.75, 0.92, and 0.65]
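One way to read Step 2 in code: the pseudo answer supplies the relevance of each comment, and a model's scores supply the ordering being evaluated. The sketch below assumes a linear gain, a log2 position discount, and non-negative pseudo-answer scores; the exact NDCG variant is not given in the slides.

```python
import math
from typing import List


def dcg(gains: List[float]) -> float:
    """Discounted cumulative gain with a log2 position discount."""
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains))


def ndcg(ideal_scores: List[float], predicted_scores: List[float]) -> float:
    """NDCG of the ordering induced by `predicted_scores`,
    using `ideal_scores` (here, the pseudo answer) as graded relevance."""
    order = sorted(range(len(predicted_scores)), key=lambda i: -predicted_scores[i])
    gains = [ideal_scores[i] for i in order]
    best = dcg(sorted(ideal_scores, reverse=True))
    return dcg(gains) / best if best > 0 else 0.0


pseudo = [0.3, 0.9, 0.5]           # toy pseudo answer for three comments
agreeing = [0.2, 0.8, 0.4]         # same ordering as the pseudo answer
disagreeing = [0.2, 0.1, 0.9]      # different ordering
print(ndcg(pseudo, agreeing), ndcg(pseudo, disagreeing))
```

A model whose ordering matches the pseudo answer gets the maximum similarity, and the score drops as comments the pseudo answer rates highly are pushed down.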

slide-8
SLIDE 8

Step 3: Calculate the final ranking from similarity scores

  • Select the top k models with the highest similarity scores
  • Final ranking: weighted average of the selected models

[Figure: of the rankings with similarity scores 0.87, 0.75, 0.92, and 0.65, those with 0.87 and 0.92 are selected and combined by weighted average into the final ranking]
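Putting the three steps together, a minimal end-to-end sketch might look like the following. Several details are assumptions rather than facts from the slides: L2 normalization, a linear-gain NDCG, using the similarity scores themselves as the averaging weights, and averaging the normalized (not raw) scores. The name `hpa_ensemble` is hypothetical.

```python
import math
from typing import List, Tuple


def l2_normalize(scores: List[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in scores))
    return [x / norm for x in scores] if norm > 0 else list(scores)


def dcg(gains: List[float]) -> float:
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains))


def ndcg(ideal_scores: List[float], predicted_scores: List[float]) -> float:
    order = sorted(range(len(predicted_scores)), key=lambda i: -predicted_scores[i])
    best = dcg(sorted(ideal_scores, reverse=True))
    return dcg([ideal_scores[i] for i in order]) / best if best > 0 else 0.0


def hpa_ensemble(rankings: List[List[float]], k: int) -> Tuple[List[float], List[int]]:
    """Return (final scores, indices of the selected models)."""
    normalized = [l2_normalize(r) for r in rankings]
    n_models = len(normalized)
    # Step 1: pseudo answer = average of normalized score vectors.
    pseudo = [sum(column) / n_models for column in zip(*normalized)]
    # Step 2: similarity of each model's ranking to the pseudo answer.
    sims = [ndcg(pseudo, r) for r in normalized]
    # Step 3: keep the top-k models and take a similarity-weighted average.
    selected = sorted(range(n_models), key=lambda i: -sims[i])[:k]
    total = sum(sims[i] for i in selected)
    if total == 0:
        return pseudo, selected  # degenerate case: fall back to the plain average
    final = [
        sum(sims[i] * normalized[i][j] for i in selected) / total
        for j in range(len(pseudo))
    ]
    return final, selected


models = [
    [0.9, 0.6, 0.1],
    [0.8, 0.7, 0.2],
    [0.1, 0.2, 0.9],  # outlier that ranks the comments in reverse
]
final, selected = hpa_ensemble(models, k=2)
print(sorted(selected), sorted(range(len(final)), key=lambda j: -final[j]))
```

In this toy run the reversed model has the lowest similarity to the pseudo answer, so it is dropped before the weighted average is taken.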
slide-9
SLIDE 9

Experimental Settings

Dataset: YJ Constructive Comment Ranking Dataset

Train 1,300 articles, Validation 113 articles, Test 200 articles (each article associated with more than 100 comments)

Models: LSTM-based RankNet

Prepared 100 different models by random initialization

Metrics: NDCG@k and Precision@k (k ∈ {1, 5, 10})
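For reference, the two evaluation metrics could be computed along these lines. The slides do not define them precisely, so this sketch assumes graded gold scores with a linear gain for NDCG@k and binary relevance labels for Precision@k; the function names and toy data are hypothetical.

```python
import math
from typing import List


def top_k(scores: List[float], k: int) -> List[int]:
    """Indices of the k highest-scoring comments."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]


def ndcg_at_k(gold: List[float], scores: List[float], k: int) -> float:
    """NDCG@k of the predicted ordering against graded gold scores."""
    def dcg(gains: List[float]) -> float:
        return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains))

    best = dcg(sorted(gold, reverse=True)[:k])
    return dcg([gold[i] for i in top_k(scores, k)]) / best if best > 0 else 0.0


def precision_at_k(relevant: List[int], scores: List[float], k: int) -> float:
    """Fraction of the top-k predicted comments that are labeled relevant (1)."""
    return sum(relevant[i] for i in top_k(scores, k)) / k


gold = [3.0, 0.0, 2.0, 1.0]      # toy graded gold scores
relevant = [1, 0, 1, 0]          # toy binary labels
scores = [0.9, 0.8, 0.1, 0.7]    # toy model output
for k in (1, 2, 4):
    print(k, ndcg_at_k(gold, scores, k), precision_at_k(relevant, scores, k))
```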

Dataset Link

slide-10
SLIDE 10

Evaluation Results

Methods                            NDCG@1  NDCG@5  NDCG@10  Prec.@1  Prec.@5  Prec.@10
RankNet (best single model)         76.35   77.97    79.52     15.0    33.20     42.99
NormAvg (unsupervised baseline)     79.83   80.77    82.16    17.08    37.18     46.48
SupWeight (supervised baseline)     78.64   80.33    81.94    16.28    35.47     46.58
HPA (ours)                          79.87   81.43    82.33    17.08    37.39     47.34
SPA (ours w/o weighting)            79.68   80.96    82.19    17.08    35.87     46.68
WPA (ours w/o selecting)            79.87   81.39    82.17    17.08    37.88     46.63

This is a part of the results. Please see Table 1 in our paper if you want to find other baselines.

slide-11
SLIDE 11

Evaluation Results

Our method achieved the best performance

slide-12
SLIDE 12

Evaluation Results

Hybrid of weighting and selecting is effective

slide-13
SLIDE 13

Conclusion

Proposed Method:

  • A hybrid unsupervised method using pseudo answers

Result:

  • Our method achieved the best performance
  • Denoising predicted rankings using the pseudo answer is effective

Future work:

  • Combine various types of network structures
  • Investigate effectiveness of our methods on other ranking datasets