Before we start

- Oral exams: July 28, the full day; if you have any temporal constraints, let us know
- Q&A sessions – suggestion:
  - Thursday, July 21: Vinay and “his topics”
  - Monday, July 25: Jannik and “his topics”
© Jannik Strötgen – ATIR-10
Advanced Topics in Information Retrieval
Learning to Rank
Vinay Setty Jannik Strötgen
vsetty@mpi-inf.mpg.de jannik.stroetgen@mpi-inf.mpg.de
ATIR – July 14, 2016
LeToR Framework Modeling User Feedback Evaluation Time Beyond Search
The Beginning of LeToR
- learning to rank (LeToR) builds on established methods from machine learning
- allows different targets derived from different kinds of user input
- active area of research for the past 10–15 years
- early work already at the end of the 1980s (e.g., Fuhr 1989)
The Beginning of LeToR
Why wasn’t LeToR successful earlier?

- IR and ML communities were not very connected; sometimes ideas take time
- limited training data: it was hard to gather (real-world) test collection queries and relevance judgments that are representative of real user needs, and judgments on returned documents; this has changed in academia and industry
- poor machine learning techniques
- insufficient customization to the IR problem
- not enough features for ML to show value
- traditional ranking functions in IR exploit very few features: term frequency / inverse document frequency, Okapi BM25, language models, ...
- standard approach to combining different features:
  - normalize features (zero mean, unit standard deviation)
  - feature combination function (typically: weighted sum)
  - tune weights (either manually or exhaustively via grid search)
- traditional ranking functions are easy to tune
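The normalize-and-tune recipe above can be sketched in a few lines. This is a toy sketch: the two features, the precision@1 tuning objective, and the 11-step grid are illustrative assumptions, not from the lecture.

```python
def zscore(values):
    """Normalize one feature column to zero mean, unit standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / std for v in values]

def grid_search_weight(bm25, pagerank, relevance, steps=11):
    """Exhaustively tune the weight alpha of a two-feature weighted sum
    alpha * bm25 + (1 - alpha) * pagerank, keeping the alpha whose top
    result has the highest relevance label (a toy precision@1 objective)."""
    bm25, pagerank = zscore(bm25), zscore(pagerank)
    best_alpha, best = 0.0, -1.0
    for i in range(steps):
        alpha = i / (steps - 1)
        scores = [alpha * b + (1 - alpha) * p for b, p in zip(bm25, pagerank)]
        top = max(range(len(scores)), key=scores.__getitem__)
        if relevance[top] > best:
            best_alpha, best = alpha, relevance[top]
    return best_alpha
```

With more than a handful of features, this exhaustive grid search becomes infeasible, which is exactly the gap learning to rank fills.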
Why learning to rank?
modern systems use a huge number of features (especially Web search engines):

- textual relevance (e.g., using LM, Okapi BM25)
- proximity of query keywords in document content
- link-based importance (e.g., determined using PageRank)
- depth of URL (top-level page vs. leaf page)
- spamminess (e.g., determined using SpamRank)
- host importance (e.g., determined using host-level PageRank)
- readability of content
- location and time of the user
- location and time of documents
- ...
high creativity in the feature engineering task:

- query word in color on page?
- number of images on page?
- URL contains ~?
- number of (out) links on a page?
- page edit recency
- page length

learning to rank makes combining features more systematic
Outline

1. LeToR Framework
2. Modeling Approaches
3. Gathering User Feedback
4. Evaluating Learning to Rank
5. Learning-to-Rank for Temporal IR
6. Learning-to-Rank – Beyond Search
LeToR Framework
(schematic: query and documents go into a learning method, which produces ranked results for the user)

open issues:
- how do we model the problem? is it a regression or a classification problem?
- what is our prediction target?
scoring as a function of different input signals (features) x_i with weights α_i:

score(d, q) = f(x1, ..., xm, α1, ..., αm)

- weights α_i are learned
- features are derived from d, q, and context
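A minimal sketch of such a scoring function; the concrete feature names and weight values below are invented for illustration.

```python
def score(features, weights):
    """score(d, q) = f(x1, ..., xm, a1, ..., am) as a simple weighted sum;
    the features x_i are derived from the document d, the query q, and the
    context; the weights a_i are learned."""
    return sum(x * a for x, a in zip(features, weights))

# Illustrative signals (not from the lecture):
# x1 = BM25 text score, x2 = link importance, x3 = spamminess (penalized)
print(score([1.2, 0.8, 0.5], [0.6, 0.5, -0.2]))
```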
Classification – Regression
classification example:
- dataset of (q, d, r) triples
  - r: relevance (binary or multiclass)
  - d: a document represented by a feature vector
- train an ML model to predict the class r of a d–q pair
- decide “relevant” if the score is above a threshold
- classification problems result in an unordered set of classes
- regression problems map to real values
- ordinal regression problems result in an ordered set of classes
LeToR Modeling
LeToR can be modeled in three ways:

- pointwise: predict the goodness of individual documents
- pairwise: predict users’ relative preference for pairs of documents
- listwise: predict the goodness of entire query results

each has advantages and disadvantages; for each, concrete approaches exist (in-depth discussion of concrete approaches by Liu 2009)
Pointwise Modeling
pointwise approaches predict, for every (query, document) pair represented by a feature vector x, a document goodness y = f(x, θ) (e.g., a yes/no label or a measure of engagement in (−∞, +∞))

- training determines the parameter θ based on a loss function (e.g., root-mean-square error)
- main disadvantage: since the input is a single document, the relative order between documents cannot be naturally considered in the learning process
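As a sketch of the pointwise setting: documents, feature values, and θ below are invented, and the linear f is just one possible model choice.

```python
def rmse(predictions, targets):
    """Root-mean-square error, one possible pointwise loss."""
    n = len(predictions)
    return (sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n) ** 0.5

def pointwise_rank(docs, theta):
    """Score every document independently with a linear f(x, theta) = theta . x
    and sort; note that no comparison between documents enters the model."""
    def f(x):
        return sum(t * xi for t, xi in zip(theta, x))
    return sorted(docs, key=lambda d: f(d[1]), reverse=True)

# Hypothetical (id, feature vector) pairs; theta is assumed already trained.
docs = [("d1", [0.9, 0.1]), ("d2", [0.4, 0.8]), ("d3", [0.2, 0.3])]
print([d for d, _ in pointwise_rank(docs, theta=[0.7, 0.3])])  # → ['d1', 'd2', 'd3']
```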
Pairwise Modeling
pairwise approaches predict, for every pair of documents (query, document 1, document 2) represented by a feature vector x, the user’s relative preference y = f(x, θ) ∈ {−1, +1} (+1 shows preference for document 1; −1 for document 2)

- training determines the parameter θ based on a loss function (e.g., the number of inverted pairs)
- advantage: models relative order
- main disadvantages:
  - no distinction between excellent–bad and fair–bad pairs
  - sensitive to noisy labels (1 wrong label leads to many mislabeled pairs)
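A sketch of how graded judgments turn into pairwise training data, and of the inverted-pairs loss; the function names are mine, and the feature vectors are illustrative.

```python
from itertools import combinations

def to_pairs(docs):
    """Turn (feature_vector, relevance) examples for one query into pairwise
    preferences: y = +1 if the first document is preferred. The grade
    difference is discarded, so excellent-vs-bad and fair-vs-bad collapse
    to the same +1 label."""
    pairs = []
    for (xi, ri), (xj, rj) in combinations(docs, 2):
        if ri != rj:
            pairs.append((xi, xj, 1 if ri > rj else -1))
    return pairs

def inverted_pairs(ranking, relevance):
    """Pairwise loss: number of document pairs the ranking orders wrongly."""
    return sum(1 for i, j in combinations(range(len(ranking)), 2)
               if relevance[ranking[i]] < relevance[ranking[j]])
```

Note how one mislabeled document generates wrong pairs against every differently-graded document, which is exactly the noise sensitivity named above.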
Listwise Modeling
listwise approaches predict, for a ranked list of documents (query, doc. 1, ..., doc. k) represented by a feature vector x, the effectiveness y = f(x, θ) ∈ (−∞, +∞) of the ranked list (e.g., MAP or nDCG)

- training determines the parameter θ based on a loss function
- advantage: positional information is visible to the loss function
- disadvantage: high training complexity, ...
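Since nDCG is named as a listwise target, here is a small implementation of it, using the common exponential-gain variant (the gain formula is one of several in use).

```python
from math import log2

def dcg(rels):
    """Discounted cumulative gain of a relevance-graded ranked list,
    with exponential gain (2^rel - 1) and log2 position discount."""
    return sum((2 ** r - 1) / log2(i + 2) for i, r in enumerate(rels))

def ndcg(rels):
    """nDCG: DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal > 0 else 0.0

# Relevance grades of the returned list, in rank order (illustrative):
print(ndcg([3, 2, 0, 1]))  # 1.0 only if higher grades come first
```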
Typical Learning-to-Rank Pipeline
learning to rank is typically deployed as a re-ranking step (it is infeasible to apply it to the entire document collection):

- step 1: determine a top-K result (K ≈ 1,000) using a proven baseline retrieval method (e.g., Okapi BM25 + PageRank)
- step 2: re-rank documents from the top-K using the learning-to-rank approach, then return the top-k (k ≈ 100) to the user
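The two-step pipeline can be written schematically as follows; the two scoring callables are placeholders for a real baseline and a real learned model.

```python
def search(query, collection, baseline_score, ltr_score, K=1000, k=100):
    """Two-step retrieval: a cheap, proven baseline over the whole
    collection, then the expensive learned model over only K candidates."""
    # step 1: e.g., Okapi BM25 + PageRank over the entire collection
    top_K = sorted(collection, key=lambda d: baseline_score(query, d),
                   reverse=True)[:K]
    # step 2: learning-to-rank re-ranks only these K documents
    return sorted(top_K, key=lambda d: ltr_score(query, d),
                  reverse=True)[:k]
```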
Gathering User Feedback
independent of pointwise, pairwise, or listwise modeling, some input from the user is required to determine the prediction target y

two types of user input:
- explicit user input (e.g., relevance assessments)
- implicit user input (e.g., by analyzing their behavior)
Relevance Assessments
procedure:
- construct a collection of (difficult) queries
- pool results from different baselines
- gather graded relevance assessments from human assessors

problems:
- hard to represent a query workload within 50, 500, or 5K queries
- difficult for queries that require personalization or localization
- expensive, time-consuming, and subject to Web dynamics
Clicks
track user behavior and measure their engagement with results:
- click-through rate of a document when shown for a query
- dwell time, i.e., how much time the user spent on the document

problems (with countermeasures in parentheses):
- position bias (consider only the first result shown)
- spurious clicks (consider only clicks with dwell time above a threshold)
- feedback loop (add some randomness to results)

on the reliability of click data, see Joachims et al. (2007) and Radlinski & Joachims (2005)
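A sketch of click-through-rate computation with a dwell-time cutoff against spurious clicks; the 30-second threshold and the log record format are assumptions for illustration.

```python
def clickthrough_rate(log, min_dwell=30.0):
    """CTR per (query, doc) from impression records (query, doc, dwell),
    where dwell is None for an impression without a click; clicks with
    dwell time below the threshold are treated as spurious and dropped."""
    shown, clicked = {}, {}
    for query, doc, dwell in log:
        key = (query, doc)
        shown[key] = shown.get(key, 0) + 1
        if dwell is not None and dwell >= min_dwell:
            clicked[key] = clicked.get(key, 0) + 1
    return {key: clicked.get(key, 0) / n for key, n in shown.items()}
```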
Skips
user behavior tells us more: skips in addition to clicks as a source of implicit feedback

example: top 5 results are d7, d1, d3, d9, d8, with clicks on d1 and d9
- “click > skip previous”: d1 > d7 and d9 > d3 (the user prefers d1 over d7)
- “click > skip above”: d1 > d7, d9 > d3, and d9 > d7

user study (Joachims et al., 2007): derived relative preferences
- are less biased than measures merely based on clicks
- show moderate agreement with explicit relevance assessments
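The “click > skip above” heuristic from this slide, written out as code:

```python
def skip_above(results, clicked):
    """Derive relative preferences: each clicked document is preferred
    over every skipped (unclicked) document ranked above it."""
    prefs = []
    for i, doc in enumerate(results):
        if doc in clicked:
            prefs.extend((doc, other) for other in results[:i]
                         if other not in clicked)
    return prefs

# Slide example: top 5 = d7 d1 d3 d9 d8, clicks on d1 and d9
print(skip_above(["d7", "d1", "d3", "d9", "d8"], {"d1", "d9"}))
# → [('d1', 'd7'), ('d9', 'd7'), ('d9', 'd3')]
```

The resulting pairs feed directly into pairwise learning to rank.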
Learning to Rank – Evaluation
several benchmark datasets have been released to allow for a comparison of different learning-to-rank methods:

- LETOR 2.0, 3.0, 4.0 (2007–2009) by Microsoft Research Asia
  - based on publicly available document collections
  - come with precomputed low-level features and relevance assessments
- Yahoo! Learning to Rank Challenge (2010) by Yahoo! Labs
  - comes with precomputed low-level features and relevance assessments
- Microsoft Learning to Rank Datasets by Microsoft Research U.S.
  - come with precomputed low-level features and relevance assessments
Features
Yahoo! Features

- queries, URLs, and feature descriptions are not given – only the feature values!
- feature engineering is critical for any commercial search engine; releasing queries and URLs leads to a risk of reverse engineering
- a reasonable consideration, but it prevents IR researchers from studying which features are most effective

LETOR / Microsoft Features

- each query–URL pair is represented by a 136-dimensional feature vector
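The LETOR/MSLR files use an SVMlight-style line format (relevance label, qid, then indexed feature values, optionally a `#` comment); a parsing sketch, to be checked against the dataset’s own README.

```python
def parse_letor_line(line):
    """Parse one line of the form
        <relevance> qid:<qid> 1:<v1> 2:<v2> ... # <comment>
    into (relevance, qid, {feature_index: value})."""
    body = line.split("#", 1)[0].split()
    relevance = int(body[0])
    qid = body[1].split(":", 1)[1]
    features = {int(idx): float(val)
                for idx, val in (tok.split(":", 1) for tok in body[2:])}
    return relevance, qid, features

print(parse_letor_line("2 qid:10 1:0.056537 2:0.000000 # docid = X"))
```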
LETOR Features

(feature tables shown on the slides)
Learning to Rank – Starting Point
all details:
- http://research.microsoft.com/en-us/um/beijing/projects/letor/
- https://www.microsoft.com/en-us/research/project/mslr/

available there:
- datasets and dataset descriptions, partitioned in subsets (for cross-validation)
- evaluation scripts, significance test scripts
- feature list

everything required to get started is available
Learning-to-Rank for Temporal IR
Kanhabua & Nørvåg (2012)
learning-to-rank approach for time-sensitive queries

standard temporal IR approaches:
- mixture model: linearly combines textual similarity and temporal similarity
- probabilistic model: generates a query from the textual and temporal parts of a document independently

learning-to-rank approach:
- two classes of features: entity features and time features
- both derived from annotations (NER, temporal tagging)
document model:
- a document collection over time
- a document is composed of a bag of words and time:
  - publication date
  - temporal expressions mentioned in the document
- an annotated document is composed of:
  - a set of named entities
  - a set of temporal expressions
  - a set of annotated sentences

temporal query model:
- q = {q_text, q_time}
- q_time might be explicit or implicit
learning-to-rank:
- a wide range of temporal features and a wide range of entity features
- models trained using labeled query/document pairs
- documents ranked according to the weighted sum of their feature scores

experiments show improvements over baselines and other time-aware models (many queries also contained entities; news corpus)
Learning-to-Rank – Beyond Search
learning to rank is applicable beyond web search

example: matching in eharmony.com (slides by Vaclav Petricek: http://www.slideshare.net/VaclavPetricek/data-science-of-love)

basic idea:
- the standard approach is search-based: filter out non-matches
- the eharmony approach is learning to rank: suggest potential matches
Matching in eHarmony.com
starting point in the 1990s: distinguish marriages that work well from those that don’t

step 1: compatibility matching
- based on 150 questions: personality, values, attitudes, beliefs
- important attributes for the long term
- predicts marital satisfaction

but: even if people are compatible, they might not be interested in talking to each other
step 2: affinity matching
- based on other features: distance, height difference, zoom level of photo
- predicts the probability of a message exchange

however: who should be introduced to whom, and when?
- match distribution based on a graph optimization problem (constrained max flow)
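The actual system solves a constrained max-flow problem; as a toy stand-in, here is a brute-force one-to-one assignment that maximizes the total two-way-communication probability (the probability matrix values are invented).

```python
from itertools import permutations

def best_matches(prob):
    """prob[i][j] = estimated probability that users i (group A) and
    j (group B) would both reply. Return the one-to-one assignment
    maximizing the total probability. Brute force over permutations,
    so only viable for tiny instances; the real system uses a flow solver."""
    n = len(prob)
    best = max(permutations(range(n)),
               key=lambda assign: sum(prob[i][assign[i]] for i in range(n)))
    return list(enumerate(best))

print(best_matches([[0.9, 0.1], [0.8, 0.2]]))  # → [(0, 0), (1, 1)]
```

Note the constraint at work: user 1 would prefer partner 0 (0.8 > 0.2), but partner 0 is better spent on user 0 for the global objective.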
(plot: blue = happy marriages, red = distressed marriages)

is that person arguing with anything you say? there is a relation between obstreperousness and marriage happiness
self-reported attractiveness: people who report the same attractiveness match better
zoom size matters:
- only face: doesn’t tell much
- no face: someone is hiding
- useful feature: ratio of face size to picture size
although many are compatible, not all should be suggested
optimization problem

goal: maximize two-way communication (highest chance that both are interested)
Summary
learning to rank provides systematic ways to combine features

modeling:
- pointwise: predict the goodness of an individual document
- pairwise: predict the relative preference for document pairs
- listwise: predict the effectiveness of a ranked list of documents

explicit and implicit user inputs include relevance assessments, clicks, and skips
Thank you for your attention!
References
- Fuhr (1989): Optimum Polynomial Retrieval Functions Based on the Probability Ranking Principle, ACM TOIS 7(3).
- Liu (2009): Learning to Rank for Information Retrieval, Foundations and Trends in Information Retrieval 3(3):225–331.
- Joachims et al. (2007): Evaluating the Accuracy of Implicit Feedback from Clicks and Query Reformulations in Web Search, ACM TOIS 25(2).
- Radlinski & Joachims (2005): Query Chains: Learning to Rank from Implicit Feedback, KDD.
- Kanhabua & Nørvåg (2012): Learning to Rank Search Results for Time-Sensitive Queries, CIKM.
Thanks
some slides / examples are taken from / similar to those of:
- Klaus Berberich, Saarland University, previous ATIR lecture
- Manning, Raghavan, Schütze: Introduction to Information Retrieval (including slides to the book)
- the eharmony.com slides by Vaclav Petricek: http://www.slideshare.net/VaclavPetricek/data-science-of-love