Document Selection Methodologies for Efficient and Effective Learning-to-Rank
Javed Aslam, Evangelos Kanoulas, Virgil Pavlu, Stefan Savev, Emine Yilmaz
Search Engines
User's Request → Search Engine → Results
Document Corpus; hundreds of features (BM25, tf*idf, PageRank, …)
Training Search Engines
Queries → Search Engine → Metric → Judges
Document Corpus; features: BM25, tf*idf, PageRank, …
Ranking functions: 1. Neural Network 2. Support Vector Machine 3. Regression Function 4. Decision Tree …
Training Data Sets
• Data Collections
– Billions of documents
– Thousands of queries
• Ideal, in theory; infeasible, in practice…
– Extract features from all query-document pairs
– Judge each document with respect to each query (extensive human effort)
– Train over all query-document pairs
Training Data Sets
• Train the ranking function over a subset of the complete collection
• Few queries with many documents judged vs. many queries with few documents judged
– Better to train over many queries with few judged documents [Yilmaz and Robertson '09]
• How should we select documents?
Training Data Sets
• Machine Learning (Active Learning)
– Iterative process
– Tightly coupled with the learning algorithm
• IR Evaluation
– Many test collections already available
– Efficient and effective techniques to construct test collections
• Intelligent ways of selecting documents
• Inferences of effectiveness metrics
Duality between LTR and Evaluation
• This work: explore the duality between Evaluation and Learning-to-Rank
– Employ techniques used for efficient and effective test collection construction to construct training collections
Duality between LTR and Evaluation
• Can test collection construction methodologies be used to construct training collections?
• If yes, which of these methodologies is better?
• What makes one training set better than another?
Methodology
• Depth-100 pool (as the complete collection)
• Select subsets of documents from the depth-100 pool
– Using different document selection methodologies
• Train over the different training sets
– Using a number of learning-to-rank algorithms
• Test the performance of the resulting ranking functions
– Five-fold cross-validation
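The five-fold cross-validation step can be sketched as a query-level split, each fold serving once as the test set (a minimal Python sketch; the query ids and round-robin fold assignment are illustrative, not the paper's exact splits):

```python
def five_fold_splits(query_ids):
    """Partition queries into 5 folds; yield (train, test) with each fold
    used exactly once as the held-out test set."""
    folds = [query_ids[i::5] for i in range(5)]  # round-robin assignment
    for k in range(5):
        test = folds[k]
        train = [q for i, f in enumerate(folds) if i != k for q in f]
        yield train, test

queries = list(range(150))  # 150 TREC queries, as in the data-set slide
splits = list(five_fold_splits(queries))
```

Ranking functions are then trained on each 120-query training split and scored on the held-out 30 queries.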
Data Sets
• Data from TREC 6, 7, and 8
– Document corpus: TREC Discs 4 and 5
– Queries: 150 queries; ad-hoc tracks
– Relevance judgments: depth-100 pools
• Features from each query-document pair
– 22 features; subset of LETOR features (BM25, Language Models, TF-IDF, …)
Document Selection Methodologies
Select subsets of documents
• Subset size varying from 6% to 60%
1. Depth-k pooling
2. InfAP (uniform random sampling)
3. StatAP (stratified random sampling)
4. MTC (greedy on-line algorithm)
5. LETOR (top-k by BM25; current practice)
6. Hedge (greedy on-line algorithm)
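Two of the selection strategies above are simple to sketch (illustrative Python; the run names, document ids, and sampling details are assumptions, not the TREC implementations):

```python
import random

def depth_k_pool(ranked_lists, k):
    """Depth-k pooling: union of the top-k documents from each
    system's ranked list for a query."""
    pool = set()
    for docs in ranked_lists.values():
        pool.update(docs[:k])
    return pool

def uniform_sample(pool, fraction, seed=0):
    """infAP-style selection: a uniform random sample of the pool."""
    docs = sorted(pool)
    random.Random(seed).shuffle(docs)
    n = max(1, int(fraction * len(docs)))
    return set(docs[:n])

# hypothetical ranked lists from two retrieval systems
runs = {"sysA": ["d1", "d2", "d3", "d4"], "sysB": ["d3", "d5", "d1", "d6"]}
pool = depth_k_pool(runs, k=2)
sample = uniform_sample(pool, fraction=0.5)
```

StatAP replaces the uniform draw with stratified sampling that favors highly ranked documents; Hedge and MTC instead pick documents greedily, one at a time.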
Document Selection Methodologies
[Figure: two panels vs. percentage of data used for training — left: discrepancy between relevant and non-relevant documents (symmetrized KL divergence); right: precision of the selection methods; one curve per method (depth, hedge, infAP, mtc, statAP, LETOR)]
• Precision: fraction of selected documents that are relevant
• Discrepancy: symmetrized KL divergence between documents' language models
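The discrepancy statistic can be sketched as a symmetrized KL divergence between smoothed unigram language models (illustrative Python; the additive smoothing scheme and example texts are assumptions, not the paper's exact estimator):

```python
import math
from collections import Counter

def language_model(text, vocab, mu=0.01):
    """Unigram language model with additive smoothing over a shared vocabulary."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: (counts[w] + mu) / (total + mu * len(vocab)) for w in vocab}

def symmetrized_kl(p, q):
    """Symmetrized KL divergence: KL(p||q) + KL(q||p) over a common vocabulary."""
    kl_pq = sum(p[w] * math.log(p[w] / q[w]) for w in p)
    kl_qp = sum(q[w] * math.log(q[w] / p[w]) for w in q)
    return kl_pq + kl_qp

rel = "ranking documents by relevance"
non = "weather forecast for tomorrow"
vocab = set(rel.lower().split()) | set(non.lower().split())
p, q = language_model(rel, vocab), language_model(non, vocab)
d = symmetrized_kl(p, q)  # larger value => more dissimilar document sets
```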
LTR Algorithms
• Train over the different data sets
1. Regression (classification error)
2. Ranking SVM (AUC)
3. RankBoost (pairwise preferences)
4. RankNet (probability of correct order)
5. LambdaRank (nDCG)
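Two of the pairwise objectives above can be sketched directly (illustrative Python; the scores and labels are made up):

```python
import math

def ranknet_pair_prob(s_i, s_j):
    """RankNet models P(doc i ranked above doc j) as a logistic
    function of the score difference."""
    return 1.0 / (1.0 + math.exp(-(s_i - s_j)))

def pairwise_error(scores, labels):
    """Fraction of misordered relevant/non-relevant pairs: the quantity
    a RankBoost-style learner drives toward zero."""
    pairs = [(i, j) for i in range(len(labels)) for j in range(len(labels))
             if labels[i] > labels[j]]
    wrong = sum(1 for i, j in pairs if scores[i] <= scores[j])
    return wrong / len(pairs) if pairs else 0.0

scores = [2.0, 0.5, 1.5]   # hypothetical model scores
labels = [1, 0, 1]         # 1 = relevant, 0 = non-relevant
err = pairwise_error(scores, labels)  # 0.0: both relevant docs outscore the non-relevant one
```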
Results (1)
[Figure: MAP vs. percentage of data used for training for Regression (left) and Ranking SVM (right); one curve per selection method (depth, hedge, infAP, MTC, statAP, LETOR)]
Results (2)
[Figure: MAP vs. percentage of data used for training for RankBoost (left) and LambdaRank (right); one curve per selection method (depth, hedge, infAP, MTC, statAP, LETOR)]
Results (3)
[Figure: MAP vs. percentage of data used for training for RankNet (left) and RankNet with a hidden layer (right); one curve per selection method (depth, hedge, infAP, MTC, statAP, LETOR)]
Observations (1)
• Some learning-to-rank algorithms are robust to document selection methodologies
– LambdaRank vs. RankBoost
[Figure: MAP vs. percentage of data used for training for LambdaRank (left) and RankBoost (right); one curve per selection method]
Observations (2)
• Near-optimal performance with 1%-2% of the complete collection (depth-100 pool)
– No significant differences at greater percentages (t-test)
– The number of features matters [Taylor et al. '06]
[Figure: MAP vs. percentage of data used for training for RankNet; one curve per selection method]
Observations (3)
• Selection methodology matters
– Hedge (worst performance)
– Depth-k pooling and statAP (best performance)
– LETOR-like (neither most efficient nor most effective)
[Figure: MAP vs. percentage of data used for training for Ranking SVM; one curve per selection method]
Relative Importance on Effectiveness
• Learning-to-Rank algorithm vs. document selection methodology
– 2-way ANOVA model
• Variance decomposition over all data sets
– 26% due to document selection
– 31% due to LTR algorithm
• Variance decomposition (small data sets, <10%)
– 44% due to document selection
– 31% due to LTR algorithm
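The 2-way variance attribution can be sketched as a sum-of-squares decomposition over a selection-method-by-algorithm table of MAP scores (illustrative Python; the MAP values are made up and interaction terms are omitted):

```python
def two_way_ss(table):
    """Main-effect sum-of-squares fractions for a 2-factor table
    (rows = selection method, cols = LTR algorithm)."""
    r, c = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]
    ss_total = sum((table[i][j] - grand) ** 2 for i in range(r) for j in range(c))
    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    return ss_rows / ss_total, ss_cols / ss_total  # variance fraction per factor

# hypothetical MAP values: 2 selection methods x 3 LTR algorithms
map_table = [[0.20, 0.22, 0.24],
             [0.14, 0.16, 0.18]]
frac_selection, frac_algorithm = two_way_ss(map_table)
```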
What makes one training set better than another?
• Different methods have different properties
– Precision
– Recall
– Similarities between relevant documents
– Similarities between relevant and non-relevant documents
– ...
• Model selection
– Linear model (adjusted R² = 0.99)
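The adjusted R² used to judge such a linear model can be computed as below (illustrative Python; the MAP values and model predictions are made up):

```python
def adjusted_r2(y, y_hat, n_predictors):
    """Adjusted R^2: R^2 penalized for the number of predictors in the model."""
    n = len(y)
    mean = sum(y) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

# hypothetical MAP values and linear-model predictions from set properties
y     = [0.15, 0.18, 0.21, 0.24, 0.20, 0.17]
y_hat = [0.16, 0.18, 0.20, 0.23, 0.21, 0.16]
r2a = adjusted_r2(y, y_hat, n_predictors=2)
```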
What makes one training set better than another?
[Figure: scatter plots of MAP vs. precision of the training set, for RankBoost (left) and Ranking SVM (right)]
What makes one training set better than another?
[Figure: scatter plots of MAP vs. discrepancy between relevant and non-relevant documents in the training data, for RankBoost (left) and Ranking SVM (right)]
Conclusions
• Some LTR algorithms are robust to document selection methodologies
• For those that are not, the selection methodology matters
– Depth-k pooling, stratified sampling
• Harmful to select too many relevant documents
• Harmful to select relevant and non-relevant documents that are too similar