SLIDE 1

Modeling User Behavior and Interactions

Lecture 3: Improving Ranking with Behavior Data

Eugene Agichtein Emory University


Eugene Agichtein, Emory University, RuSSIR 2009 (Petrozavodsk, Russia)

slide-2
SLIDE 2

Lecture 3 Plan

  • 1. Review: Learning to Rank
  • 2. Exploiting User Behavior for Ranking:

– Automatic relevance labels
– Enriching feature space

  • 3. Implementation and System Issues

– Dealing with scale
– Dealing with data sparseness

  • 4. New Directions

– Active learning
– Ranking for diversity

SLIDE 3

Review: Learning to Rank

  • Goal: instead of using fixed retrieval models, learn them:

– Usually: supervised learning on document/query pairs embedded in a high-dimensional feature space
– Labeled by relevance of the document to the query
– Features: provided by IR methods

  • Given training instances:

– (xq,d, yq,d) for q = {1..N}, d = {1 .. Nq}

  • Learn a ranking function

– f(xq,1, …, xq,Nq)
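As a concrete sketch of this setup, the fragment below builds a toy training set of (feature vector, label) pairs per query and fits a perceptron-style pairwise ranker. All names and the two-feature toy data are illustrative, not from the lecture.

```python
# Hypothetical sketch of the learning-to-rank setup: each (query, document)
# pair is a feature vector x_{q,d} with a relevance label y_{q,d}; a pairwise
# ranker learns weights w so that w.x_{q,i} > w.x_{q,j} whenever y_{q,i} > y_{q,j}.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_pairwise(training, dims, epochs=50, lr=0.1):
    """Perceptron-style pairwise ranker over per-query lists of (features, label)."""
    w = [0.0] * dims
    for _ in range(epochs):
        for docs in training.values():
            for xi, yi in docs:
                for xj, yj in docs:
                    if yi > yj and dot(w, xi) <= dot(w, xj):
                        # Misordered pair: nudge w toward the preferred document.
                        w = [wk + lr * (a - b) for wk, a, b in zip(w, xi, xj)]
    return w

# Toy data: feature = (BM25 score, PageRank); higher label = more relevant.
training = {
    "q1": [((2.0, 0.1), 2), ((1.0, 0.9), 1), ((0.2, 0.2), 0)],
    "q2": [((1.5, 0.3), 1), ((0.5, 0.5), 0)],
}
w = train_pairwise(training, dims=2)
ranked = sorted(training["q1"], key=lambda d: -dot(w, d[0]))
```

On this toy data the learned weights rank q1's documents in label order.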

SLIDE 4

Ordinal Regression Approaches

  • Learn multiple thresholds:

– Maintain T thresholds (b1, …, bT), b1 < b2 < … < bT => learn the model parameters + (b1, …, bT)
– Chu & Keerthi, "New Approaches to Support Vector Ordinal Regression." ICML 2005
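A minimal sketch of how learned thresholds turn a real-valued score into an ordinal grade; the threshold values here are made up.

```python
import bisect

# Threshold-based ordinal prediction: given a learned score f(x) and
# thresholds b1 < ... < bT, the grade is the number of thresholds the
# score exceeds (0..T, i.e. T+1 ordinal grades).

def ordinal_grade(score, thresholds):
    """Map a real-valued score to a grade in 0..T via sorted thresholds."""
    return bisect.bisect_right(thresholds, score)

thresholds = [-1.0, 0.5, 2.0]   # T = 3 thresholds => 4 ordinal grades
grades = [ordinal_grade(s, thresholds) for s in (-2.0, 0.0, 1.0, 3.0)]
```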

  • Learn multiple classifiers:

– Use T different training sets, train classifiers C1..CT, then combine (sum) their outputs
– T. Qin et al., "Ranking with Multiple Hyperplanes." SIGIR 2007

  • Optimize pairwise preferences:

– RankNet: Burges et al., "Learning to Rank Using Gradient Descent." ICML 2005

  • Optimize rank-based measures:

– Directly optimize (N)DCG via a local approximation of the gradient
– LambdaRank: C. Burges et al., "Learning to Rank with Non-Smooth Cost Functions." NIPS 2006
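The pairwise-preference objective can be illustrated with RankNet's cross-entropy loss on the score difference. This is a bare-bones sketch of the loss alone, not the full gradient-descent training loop.

```python
import math

# RankNet pairwise objective (Burges et al., ICML 2005): for a pair where
# document i should rank above document j, the model outputs scores s_i, s_j
# and the loss is the cross entropy of P(i > j) = sigmoid(s_i - s_j).

def ranknet_pair_loss(s_i, s_j):
    """Cross-entropy loss for a pair with target 'i preferred over j'."""
    p = 1.0 / (1.0 + math.exp(-(s_i - s_j)))   # modeled P(i > j)
    return -math.log(p)

well_ordered = ranknet_pair_loss(2.0, 0.0)   # s_i >> s_j: small loss
mis_ordered  = ranknet_pair_loss(0.0, 2.0)   # s_i << s_j: large loss
```

When the two scores are equal the loss is ln 2, and it grows as the pair becomes more misordered.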

SLIDE 5

Learning to Rank Summary

  • Many learning algorithms available to choose from
  • Require training data (feature vectors + labels)
  • Where does training data come from?


– "Expert" human judges (TREC, editors, …)
– Users: logs of user behavior

  • Rest of this lecture:

– Learning formulation and setup to train and use learning-to-rank algorithms

SLIDE 6

Approaches to Use Behavior Data

  • Use “clicks” as new training examples

– Joachims, KDD 2002
– Radlinski & Joachims, KDD 2005

  • Incorporate behavior data as additional features

– Richardson et al., WWW 2006
– Agichtein et al., SIGIR 2006
– Bilenko and White, WWW 2008
– Zhu and Mishne, KDD 2009

SLIDE 7

Recap: Available Behavior Data

SLIDE 8

Training Examples from Click Data

[ Joachims 2002 ]
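The slide's figure is not reproduced here, but the idea of harvesting training examples from clicks can be sketched with the "a clicked result is preferred over skipped results ranked above it" heuristic, in the spirit of Joachims (2002); the helper name is mine.

```python
# Extract pairwise training preferences from a single result page with clicks:
# each clicked result is preferred over every higher-ranked result that the
# user skipped ("click > skip above").

def preferences_from_clicks(ranking, clicked):
    """ranking: list of doc ids top-to-bottom; clicked: set of clicked doc ids."""
    prefs = []
    for pos, doc in enumerate(ranking):
        if doc in clicked:
            for above in ranking[:pos]:
                if above not in clicked:
                    prefs.append((doc, above))   # doc preferred over 'above'
    return prefs

# A click on the third result yields two preferences over the skipped results.
prefs = preferences_from_clicks(["d1", "d2", "d3", "d4"], clicked={"d3"})
```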

SLIDE 9

Loss Function

[ Joachims 2002 ]

SLIDE 10

Learned Retrieval Function

[ Joachims 2002 ]

SLIDE 11

Features

[ Joachims 2002 ]

SLIDE 12

Results

[ Joachims 2002 ]

Summary:
– The learned ranker outperforms all base methods in the experiment
– Learning from clickthrough data is possible
– Relative preferences are useful training data

SLIDE 13

Extension: Query Chains

[Radlinski & Joachims, KDD 2005]

SLIDE 14

Query Chains (Cont’d)

[Radlinski & Joachims, KDD 2005]

SLIDE 15

Query Chains (Results)

[Radlinski & Joachims, KDD 2005]

  • Query Chains add slight improvement over clicks

SLIDE 16

Lecture 3 Plan

  • Review: Learning to Rank
  • Exploiting User Behavior for Ranking:

– Automatic relevance labels
– Enriching the ranking feature space

1. Implementation and System Issues

– Dealing with scale
– Dealing with data sparseness

2. New Directions

– Active learning
– Ranking for diversity
– Fun and games

SLIDE 17

Incorporating Behavior for Static Rank

[Richardson et al., WWW2006]

[Diagram: Web → crawl → build index → answer queries; static rank informs which pages to crawl, the (efficient) index order, and dynamic ranking]

SLIDE 18

fRank: Machine Learning for Static Ranking

[Richardson et al., WWW2006]

[Diagram: page features from the Web (words on page, # inlinks, PageRank, contains 'Viagra', …) feed a machine-learning model that outputs fRank]

SLIDE 19

Features: Summary

[Richardson et al., WWW2006]

  • Popularity
  • Anchor text and inlinks
  • Page
  • Domain
  • PageRank

SLIDE 20

Features: Popularity

[Richardson et al., WWW2006]

  • Data from MSN Toolbar
  • Smoothed

Function     Example
Exact URL    cnn.com/2005/tech/wikipedia.html?v=mobile
No Params    cnn.com/2005/tech/wikipedia.html
Page         wikipedia.html
URL-1        cnn.com/2005/tech
URL-2        cnn.com/2005
…
Domain       cnn.com
Domain+1     cnn.com/2005
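A hypothetical sketch of the URL generalization functions in the popularity-feature table (the function and key names are mine, not the paper's):

```python
from urllib.parse import urlsplit

# Derive the back-off variants of a URL used to aggregate popularity counts:
# exact URL, parameters stripped, page name, progressively shorter path
# prefixes (URL-1, URL-2, ...), and the domain.

def url_variants(url):
    parts = urlsplit("//" + url)               # treat the input as scheme-less
    path_segs = [s for s in parts.path.split("/") if s]
    variants = {
        "exact": url,
        "no_params": parts.netloc + parts.path,
        "page": path_segs[-1] if path_segs else parts.netloc,
        "domain": parts.netloc,
    }
    # URL-1, URL-2, ...: drop one trailing path segment at a time.
    for k in range(1, len(path_segs)):
        variants[f"url-{k}"] = parts.netloc + "/" + "/".join(path_segs[:-k])
    return variants

v = url_variants("cnn.com/2005/tech/wikipedia.html?v=mobile")
```

Applied to the table's example URL, this reproduces the No Params, Page, URL-1, URL-2, and Domain rows.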

SLIDE 21

Features: Anchor, Page, Domain

[Richardson et al., WWW2006]

  • Anchor text and inlinks

– Total amount of anchor text, unique anchor text words, number of inlinks, etc.

  • Page

– 8 Features based on page alone: Words in body, frequency of most common term, etc.

  • Domain

– Averages in domain: average #outlinks, etc.

SLIDE 22

Data

[Richardson et al., WWW2006]

  • Human judgments
  • 1. Randomly choose query from MSN users
  • 2. Choose top URLs returned by the search engine
  • 3. Rate quality of URL for that query
  • 500k (Query,URL,Rating) tuples
  • Judged URLs biased to good pages

– Results apply to index ordering, relevance
– Crawl ordering requires an unbiased sample

SLIDE 23

Becoming Query Independent

[Richardson et al., WWW2006]

  • (Query,URL,Rating) → (URL,Rating)


  • Take maximum rating for each URL

– Good page if relevant for at least one query

  • Queries are common → likely correct index order and relevance order

SLIDE 24

Measure

[Richardson et al., WWW2006]

  • Goal: find the static ranking algorithm that most correctly reproduces the judged order

pairwise accuracy = |H ∩ S| / |H|

where H is the set of pairs ordered by the human judges and S is the set of pairs ordered the same way by the static ranking.

  • Fraction of pairs that, when the humans claim one is better than the other, the static rank algorithm orders correctly
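A direct sketch of this measure, assuming integer human ratings and real-valued static-rank scores:

```python
from itertools import combinations

# Pairwise accuracy: over all document pairs the human judges order strictly
# (the set H), count the fraction the static ranking scores in the same
# direction (H intersected with S).

def pairwise_accuracy(human_ratings, static_scores):
    """Both arguments map doc id -> value; higher is better."""
    agree, total = 0, 0
    for a, b in combinations(human_ratings, 2):
        if human_ratings[a] == human_ratings[b]:
            continue                     # only strictly ordered pairs are in H
        total += 1
        human_prefers_a = human_ratings[a] > human_ratings[b]
        static_prefers_a = static_scores[a] > static_scores[b]
        agree += (human_prefers_a == static_prefers_a)
    return agree / total

# The static scores invert one of the three judged pairs (d2 vs. d3).
acc = pairwise_accuracy({"d1": 3, "d2": 2, "d3": 1},
                        {"d1": 0.9, "d2": 0.1, "d3": 0.5})
```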

SLIDE 25

RankNet, Burges et al., ICML 2005

[Richardson et al., WWW2006]

[Diagram: Feature Vector and Label in; NN output; the error is a function of the label and the output]

SLIDE 26

RankNet [Burges et al. 2005]

[Richardson et al., WWW2006]

  • Training Phase:

– Present pair of vectors with label1 > label2

SLIDE 27

RankNet [Burges et al. 2005]

[Richardson et al., WWW2006]

  • Training Phase:

– Present pair of vectors with label1 > label2

[Diagram: Feature Vector1, Label1 → NN output 1]

SLIDE 28

RankNet [Burges et al. 2005]

[Richardson et al., WWW2006]

  • Training Phase:

– Present pair of vectors with label1 > label2

[Diagram: Feature Vector2, Label2 → NN output 2]

SLIDE 29

RankNet [Burges et al. 2005]

[Richardson et al., WWW2006]

  • Training Phase:

– Present pair of vectors with label1 > label2

NN output 1, NN output 2; the error is a function of both outputs (desire output1 > output2)

SLIDE 30

RankNet [Burges et al. 2005]

[Richardson et al., WWW2006]

  • Test Phase:

– Present individual vector and get score

[Diagram: Feature Vector → NN output]

SLIDE 31

Experimental Methodology

[Richardson et al., WWW2006]

  • Split ratings

– 84% training set
– 8% validation set
– 8% test set

  • Training set: train RankNet
  • Validation set: choose the best net
  • Test set: measure pairwise accuracy

SLIDE 32

Accuracy of Each Feature Set

[Richardson et al., WWW2006]

Feature Set    Accuracy (%)
PageRank       56.70
Popularity     60.82
Anchor         59.09
Page           63.93
Domain         59.03
All Features   67.43

  • Accuracy with only the given feature set
  • Every feature set outperformed PageRank

  • Best feature sets contain no link information
SLIDE 33

Qualitative Evaluation

[Richardson et al., WWW2006]

  • Top ten URLs for PageRank vs. fRank

PageRank                        fRank
google.com                      google.com
apple.com/quicktime/download    yahoo.com
amazon.com                      americanexpress.com
yahoo.com                       hp.com
microsoft.com/windows/ie        target.com
apple.com/quicktime             bestbuy.com
mapquest.com                    dell.com
ebay.com                        autotrader.com
mozilla.org/products/firefox    dogpile.com
ftc.gov                         bankofamerica.com

(The PageRank list is technology-oriented; the fRank list is consumer-oriented.)
SLIDE 34

Behavior for Dynamic Ranking

[Agichtein et al., SIGIR2006]

Sample behavior features (from Lecture 2):

Group         Feature             Description
Presentation  ResultPosition      Position of the URL in the current ranking
Presentation  QueryTitleOverlap   Fraction of query terms in the result title
Clickthrough  DeliberationTime    Seconds between the query and the first click
Clickthrough  ClickFrequency      Fraction of all clicks landing on the page
Clickthrough  ClickDeviation      Deviation from the expected click frequency
Browsing      DwellTime           Result page dwell time
Browsing      DwellTimeDeviation  Deviation from the expected dwell time for the query

SLIDE 35

Feature Merging: Details

[Agichtein et al., SIGIR2006]

Query: SIGIR, fake results with fake feature values:

Result URL           BM25  PageRank  …  Clicks  DwellTime  …
sigir2007.org        2.4   0.5       …  ?       ?          …
sigir2006.org        1.4   1.1       …  150     145.2      …
acm.org/sigs/sigir/  1.2   2         …  60      23.5       …

  • Value scaling:

– Binning vs. log-linear vs. linear (e.g., μ=0, σ=1)

  • Missing values:

– 0? (what does 0 mean for feature values normalized so that μ=0?)

  • "Real-time" use: significant architecture/system problems
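One plausible reading of the scaling and missing-value choices above (an assumption, not the paper's exact recipe): z-score each feature over the values that are present, then impute missing cells as 0, i.e. the mean.

```python
import math

# Z-score a feature column to mean 0, std 1, excluding missing values (None)
# from the statistics. Imputing 0 after scaling means "assume the average" --
# one reading of the slide's question about what 0 means for normalized features.

def zscore_with_missing(values):
    present = [v for v in values if v is not None]
    mu = sum(present) / len(present)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in present) / len(present)) or 1.0
    return [0.0 if v is None else (v - mu) / sigma for v in values]

# The '?' cells from the table become 0.0 after normalization.
clicks = zscore_with_missing([None, 150, 60])
```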

SLIDE 36

Results for Incorporating Behavior into Ranking

[Agichtein et al., SIGIR2006]

[Figure: NDCG at K = 1..10 for RN, Rerank-All, and RN+All]

Method    MAP    Gain
RN        0.270
RN+ALL    0.321  0.052 (19.13%)
BM25      0.236
BM25+ALL  0.292  0.056 (23.71%)

SLIDE 37

Which Queries Benefit Most

[Agichtein et al., SIGIR2006]

[Figure: query frequency and average NDCG gain, bucketed by original ranking quality]
Most gains are for queries with poor original ranking

SLIDE 38

Lecture 3 Plan

  • Review: Learning to Rank
  • Exploiting User Behavior for Ranking:
  • Automatic relevance labels
  • Enriching feature space

1. Implementation and System Issues

– Dealing with data sparseness
– Dealing with scale

2. New Directions

– Active learning
– Ranking for diversity
– Fun and games

SLIDE 39

Extension to Unseen Queries/Documents: Search Trails

[Bilenko and White, WWW 2008]

  • Trails start with a search engine query
  • Continue until a terminating event

– Another search
– Visit to an unrelated site (social networks, webmail)
– Timeout, browser homepage, browser closing
SLIDE 40

Probabilistic Model

[Bilenko and White, WWW 2008]

  • IR via language modeling [Zhai-Lafferty, Lavrenko]

  • Query-term distribution gives more mass to rare terms
  • Term-website weights combine dwell time and counts
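A hedged sketch of this model as I read the slide (the exact formulas are in the paper; the IDF-style term weighting, helper names, and toy numbers below are my assumptions):

```python
import math

# Sketch of a trail-based relevance model: query terms are weighted by rarity
# (IDF-like, normalized to sum to 1), and each term-website weight stands in
# for a dwell-time-and-count association learned from search trails.

def term_weights(query_terms, doc_freq, n_trails):
    """Give more mass to rare terms via an IDF-style weight."""
    raw = {t: math.log(n_trails / doc_freq.get(t, 1)) for t in query_terms}
    z = sum(raw.values())
    return {t: w / z for t, w in raw.items()}

def site_relevance(query_terms, doc_freq, n_trails, term_site_weight, site):
    w = term_weights(query_terms, doc_freq, n_trails)
    return sum(w[t] * term_site_weight.get((t, site), 0.0) for t in query_terms)

# Toy numbers: 'jaguar' is rarer than 'car', so it dominates the score.
doc_freq = {"car": 1000, "jaguar": 10}
tsw = {("jaguar", "jaguar.com"): 0.9, ("car", "jaguar.com"): 0.2}
rel = site_relevance(["car", "jaguar"], doc_freq, 10000, tsw, "jaguar.com")
```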

SLIDE 41

Results: Learning to Rank

[Bilenko and White, WWW 2008]

Add Rel(q, di) as a feature to RankNet

[Figure: NDCG@1, NDCG@3, NDCG@10 for Baseline, Baseline+Heuristic, Baseline+Probabilistic, and Baseline+Probabilistic+RW]

SLIDE 42

BBM: Bayesian Browsing Model from Petabyte-scale Data, Liu et al, KDD 2009

Scalability: (Peta?)bytes of Click Data

[Diagram: for a query with results URL1..URL4, latent relevance variables S1..S4 and snippet examination events E1..E4 generate the observed clickthroughs C1..C4]

BBM: Bayesian Browsing Model

SLIDE 43

Training BBM: One-Pass Counting

BBM: Bayesian Browsing Model from Petabyte-scale Data, Liu et al., KDD 2009

[Diagram: one-pass counting to find R_j]

SLIDE 44

Training BBM on MapReduce

BBM: Bayesian Browsing Model from Petabyte-scale Data, Liu et al, KDD 2009

  • Map: emit((q, u), idx)
  • Reduce: construct the count vector
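An in-memory sketch of this map/reduce counting step. Note the real BBM index also encodes the distance from the preceding click; the simplified index here is positional only, an assumption for illustration.

```python
from collections import defaultdict

# Map phase: emit ((query, url), index) for every impression.
# Reduce phase: build a per-(query, url) count vector plus a click total.

def map_phase(sessions):
    """sessions: list of (query, shown_urls, clicked_urls)."""
    for query, shown, clicked in sessions:
        for idx, url in enumerate(shown):
            yield (query, url), (idx, url in clicked)

def reduce_phase(pairs, n_positions=4):
    counts = defaultdict(lambda: [0] * n_positions)
    clicks = defaultdict(int)
    for (q, u), (idx, was_clicked) in pairs:
        counts[(q, u)][idx] += 1       # impressions of u at this index
        clicks[(q, u)] += was_clicked  # total clicks on (q, u)
    return counts, clicks

sessions = [("sigir", ["d1", "d2"], {"d2"}), ("sigir", ["d2", "d1"], {"d2"})]
counts, clicks = reduce_phase(map_phase(sessions))
```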

SLIDE 45

Model Comparison on Efficiency

BBM: Bayesian Browsing Model from Petabyte-scale Data, Liu et al, KDD 2009

57 times faster

SLIDE 46

Large-Scale Experiment

BBM: Bayesian Browsing Model from Petabyte-scale Data, Liu et al, KDD 2009

  • Setup:

– 8 weeks of data, 8 jobs
– Job k uses the first k weeks of data

  • Experiment platform

– SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets [Chaiken et al, VLDB’08]

SLIDE 47

Scalability of BBM

BBM: Bayesian Browsing Model from Petabyte-scale Data, Liu et al, KDD 2009

  • Increasing computation load

– More queries, more URLs, more impressions

  • Near-constant elapsed time: 3 hours
  • Scans 265 terabytes of data
  • Full posteriors for 1.15 billion (query, URL) pairs

[Figure: computation load and elapsed time on SCOPE]

SLIDE 48

Lecture 3 Plan

Review: Learning to Rank
Exploiting User Behavior for Ranking:

  • Automatic relevance labels
  • Enriching feature space

Implementation and System Issues

  • Dealing with data sparseness
  • Dealing with Scale

New Directions

  • Active learning
  • Ranking for diversity

SLIDE 49

New Direction: Active Learning

[Radlinski & Joachims, KDD 2007]

  • Goal: learn the relevances with as little training data as possible.

  • Search involves a three-step process:
  • 1. Given relevance estimates, pick a ranking to display to users.
  • 2. Given a ranking, users provide feedback: user clicks provide pairwise relevance judgments.
  • 3. Given feedback, update the relevance estimates.

SLIDE 50

Overview of Approach

[Radlinski & Joachims, KDD 2007]

  • Available information:

1. Have an estimate of the relevance of each result.
2. Can obtain pairwise comparisons of the top few results.
3. Do not have absolute relevance information.

  • Goal: Learn the document relevance quickly.
  • Will address four questions:

1. How to represent knowledge about document relevance.
2. How to maintain this knowledge as we collect data.
3. Given our knowledge, what is the best ranking?
4. What rankings do we show users to get useful data?

SLIDE 51

1: Representing Document Relevance

[Radlinski & Joachims, KDD 2007]

  • Given a query q, let M* = (µ*_1, …, µ*_|C|) be the true relevance values of the documents.
  • Model knowledge of M* with a Bayesian posterior:

P(M|D) = P(D|M) P(M) / P(D)

  • Assume P(M|D) is a spherical multivariate normal:

P(M|D) = N(ν_1, …, ν_|C|; σ_1², …, σ_|C|²)

SLIDE 52

1: Representing Document Relevance

[Radlinski & Joachims, KDD 2007]

  • Given a fixed query, maintain knowledge about relevance as clicks are observed.

– This tells us which documents we are sure about, and which ones need more data.

SLIDE 53

2: Maintaining P(M|D)

[Radlinski & Joachims, KDD 2007]

Model noisy pairwise judgments with the Bradley-Terry model ['52]. Adding a Gaussian prior, apply an off-the-shelf algorithm to maintain the posterior: the Glicko rating system, commonly used for chess [Glickman 1999].
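A sketch of the Bradley-Terry building block (Glicko additionally tracks per-item uncertainty, which is omitted here; the update rule below is a generic gradient step, not Glicko's exact formula):

```python
import math

# Bradley-Terry model: P(i beats j) = e^{mu_i} / (e^{mu_i} + e^{mu_j}).
# Fit the latent ratings mu by gradient ascent on the log-likelihood of
# observed pairwise outcomes (here: pairwise click judgments).

def p_beats(mu_i, mu_j):
    return 1.0 / (1.0 + math.exp(mu_j - mu_i))

def update(mu, outcomes, lr=0.5, steps=100):
    """outcomes: list of (winner, loser) document ids; mu: dict id -> rating."""
    for _ in range(steps):
        for w, l in outcomes:
            g = 1.0 - p_beats(mu[w], mu[l])   # gradient of the observed win
            mu[w] += lr * g
            mu[l] -= lr * g
    return mu

# One observed preference d1 > d2 pushes the ratings apart.
mu = update({"d1": 0.0, "d2": 0.0}, [("d1", "d2")])
```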

SLIDE 54

3: Ranking (Inference)

[Radlinski & Joachims, KDD 2007]

  • Want to assign relevances M = (µ_1, …, µ_|C|) such that L(M, M*) is small, but M* is unknown.

  • Minimize expected loss (pairwise):

SLIDE 55

4: Getting Useful Data

[Radlinski & Joachims, KDD 2007]

  • Problem: we could present the ranking based on the current best estimate of relevance.

– Then the data we get would always be about the documents already ranked highly.

  • Instead, optimize the ranking shown to users:
  • 1. Pick the top two docs to minimize future loss
  • 2. Append the current best-estimate ranking.

SLIDE 56

4: Exploration Strategies

[Radlinski & Joachims, KDD 2007]

SLIDE 57

4: Loss Functions

[Radlinski & Joachims, KDD 2007]

SLIDE 58

Results: TREC Data

[Radlinski & Joachims, KDD 2007]

Optimizing for relevance estimates better than for ordering

SLIDE 59

Need for Diversity (in IR)

[Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008]

  • Ambiguous Queries

– Users with different information needs issuing the same textual query ("Jaguar")

  • Informational (Exploratory) Queries

– User interested in "a specific detail or entire breadth of knowledge available" [Swaminathan et al., 2008]
– Want results with high information diversity

SLIDE 60

Optimizing for Diversity

[Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008]

  • Long interest in the IR community
  • Requires inter-document dependencies

– Impossible given current learning-to-rank methods

  • Problem: no consensus on how to measure diversity.

– Formulate as predicting diverse subsets

  • Experiment:

– Use training data with explicitly labeled subtopics (TREC 6-8 Interactive Track)
– Use a loss function to encode subtopic loss
– Train using structural SVMs [Tsochantaridis et al., 2005]

SLIDE 61

Representing Diversity

[Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008]

  • Existing datasets with manual subtopic labels

– E.g., "Use of robots in the world today"

  • Nanorobots
  • Space mission robots
  • Underwater robots

– Manual partitioning of the total information regarding a query
– Relatively reliable

SLIDE 62

Example

[Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008]

  • Choose K documents with maximal information coverage.
  • For K = 3, the optimal set is {D1, D2, D10}

SLIDE 63

Maximizing Subtopic Coverage

[Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008]

  • Goal: select K documents which collectively cover as many subtopics as possible.
  • Perfect selection takes (n choose K) time.

– Set cover problem.

  • Greedy selection gives a (1 - 1/e)-approximation bound.

– Special case of Max Coverage (Khuller et al., 1997)
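The greedy selection can be sketched directly; the document-to-subtopic assignments below are made up for illustration.

```python
# Greedy Max Coverage: repeatedly pick the document that covers the most
# not-yet-covered subtopics. Achieves the (1 - 1/e) approximation bound.

def greedy_cover(doc_subtopics, k):
    """doc_subtopics: dict doc id -> set of subtopic ids; pick k docs."""
    covered, chosen = set(), []
    for _ in range(k):
        candidates = [d for d in doc_subtopics if d not in chosen]
        best = max(candidates, key=lambda d: len(doc_subtopics[d] - covered))
        chosen.append(best)
        covered |= doc_subtopics[best]
    return chosen, covered

docs = {"D1": {1, 2, 3}, "D2": {3, 4}, "D3": {4, 5}, "D4": {1, 2}}
chosen, covered = greedy_cover(docs, k=2)
```

With k=2 the greedy picks D1 first (3 new subtopics), then D3 (2 new), covering all five subtopics.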

SLIDE 64

Weighted Word Coverage

[Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008]

  • More distinct words = more information

– Weight word importance
– Does not depend on human labels

  • Goal: select K documents which collectively cover as many distinct (weighted) words as possible

– Greedy selection also yields the (1 - 1/e) bound.
– Need to find a good weighting function (a learning problem).

SLIDE 65

Example

[Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008]

Word benefits: V1 = 1, V2 = 2, V3 = 3, V4 = 4, V5 = 5

Document word counts:

      V1  V2  V3  V4  V5
D1            X   X   X
D2        X       X   X
D3    X   X   X   X

Marginal benefit:

        D1  D2  D3  Best
Iter 1  12  11  10  D1
Iter 2

SLIDE 66

Example (cont’d)

[Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008]

Word benefits: V1 = 1, V2 = 2, V3 = 3, V4 = 4, V5 = 5

Document word counts:

      V1  V2  V3  V4  V5
D1            X   X   X
D2        X       X   X
D3    X   X   X   X

Marginal benefit:

        D1  D2  D3  Best
Iter 1  12  11  10  D1
Iter 2  --   2   3  D3
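The marginal-benefit computation in the example can be reproduced with a few lines; the document-word sets are read off the slide's example table (D1 = {V3, V4, V5}, etc.).

```python
# Greedy weighted word coverage: each not-yet-covered word V contributes its
# weight to a document's marginal benefit; pick the highest at each iteration.

def marginal_benefit(doc_words, covered, weights):
    return sum(weights[w] for w in doc_words - covered)

def greedy_weighted(docs, weights, k):
    covered, picks, trace = set(), [], []
    for _ in range(k):
        gains = {d: marginal_benefit(ws, covered, weights)
                 for d, ws in docs.items() if d not in picks}
        best = max(gains, key=gains.get)
        trace.append((dict(gains), best))   # record the table row
        picks.append(best)
        covered |= docs[best]
    return picks, trace

weights = {"V1": 1, "V2": 2, "V3": 3, "V4": 4, "V5": 5}
docs = {"D1": {"V3", "V4", "V5"}, "D2": {"V2", "V4", "V5"},
        "D3": {"V1", "V2", "V3", "V4"}}
picks, trace = greedy_weighted(docs, weights, k=2)
```

The trace matches the slide's table: iteration 1 gives benefits 12/11/10 and picks D1; iteration 2 gives D2 = 2, D3 = 3 and picks D3.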

SLIDE 67

Results: TREC data

[Predicting Diverse Subsets Using Structural SVMs Y. Yue and Joachims, ICML 2008]

  • 12/4/1 train/validation/test split

– Approx. 500 documents in the training set

  • Permuted until all 17 queries were tested once
  • Set K = 5 (some queries have very few documents)
  • SVM-div: uses term-frequency thresholds to define importance levels
  • SVM-div2: in addition uses TF-IDF thresholds

SLIDE 68

Results: TREC data

[Predicting Diverse Subsets Using Structural SVMs Y. Yue and Joachims, ICML 2008]

Method            Loss
Random            0.469
Okapi             0.472
Unweighted Model  0.471
Essential Pages   0.434
SVM-div           0.349
SVM-div2          0.382

Methods                   W / T / L
SVM-div vs. Ess. Pages    14 / 0 / 3 **
SVM-div2 vs. Ess. Pages   13 / 0 / 4
SVM-div vs. SVM-div2      9 / 6 / 2

SLIDE 69

Results: TREC data

[Predicting Diverse Subsets Using Structural SVMs Y. Yue and Joachims, ICML 2008]

Can expect further benefit from having more training data.

SLIDE 70

Summary

Predicting Diverse Subsets Using Structural SVMs Y. Yue and Joachims, ICML 2008

  • Formulated diversified retrieval as predicting diverse subsets

– Efficient training and prediction algorithms

  • Used weighted word coverage as a proxy for information coverage.

  • Encode diversity criteria using loss function

– Weighted subtopic loss

http://projects.yisongyue.com/svmdiv/

SLIDE 71

Lecture 3 Summary

Review: Learning to Rank
Exploiting User Behavior for Ranking:

  • Automatic relevance labels
  • Enriching feature space

Implementation and System Issues

  • Dealing with data sparseness
  • Dealing with Scale

New Directions

  • Active learning
  • Ranking for diversity

SLIDE 72

Key References and Further Reading

Joachims, T. Optimizing search engines using clickthrough data. KDD 2002.
Agichtein, E., Brill, E., and Dumais, S. Improving web search ranking by incorporating user behavior information. SIGIR 2006.
Radlinski, F. and Joachims, T. Query chains: learning to rank from implicit feedback. KDD 2005.
Radlinski, F. and Joachims, T. Active exploration for learning rankings from clickthrough data. KDD 2007.
Bilenko, M. and White, R. Mining the search trails of surfing crowds: identifying relevant websites from user activity. WWW 2008.
Yue, Y. and Joachims, T. Predicting diverse subsets using structural SVMs. ICML 2008.
