

slide-1
SLIDE 1

Chapter 13: Ranking Models

I apply some basic rules of probability theory to calculate the probability of God's existence – the odds of God, really.

  • -- Stephen Unwin

God does not roll dice.

  • -- Albert Einstein

Not only does God play dice, but He sometimes confuses us by throwing them where they can't be seen.

  • -- Stephen Hawking

IRDM WS 2015 13-1

slide-2
SLIDE 2

Outline

13.1 IR Effectiveness Measures
13.2 Probabilistic IR
13.3 Statistical Language Model
13.4 Latent-Topic Models
13.5 Learning to Rank

following Büttcher/Clarke/Cormack Chapters 12, 8, 9 and/or Manning/Raghavan/Schütze Chapters 8, 11, 12, 18, plus additional literature for 13.4 and 13.5

IRDM WS 2015 13-2

slide-3
SLIDE 3

13.1 IR Effectiveness Measures

Capability to return only relevant documents:
Precision (Präzision) = (# relevant docs among top-r) / r
  typically evaluated for r = 10, 100, 1000

Capability to return all relevant documents:
Recall (Ausbeute) = (# relevant docs among top-r) / (# relevant docs)
  typically evaluated for r = corpus size

[Precision-recall plots: typical quality vs. ideal quality]

The ideal measure is user satisfaction, heuristically approximated by benchmark measures (on test corpora with a query suite and relevance assessments by experts).
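To make the two measures concrete, here is a minimal Python sketch (the function name and the toy judgment data are illustrative, not from the slides) that computes precision and recall at a cutoff r:

```python
def precision_recall_at_r(ranked_docs, relevant_docs, r):
    """Precision@r and Recall@r for one query.

    ranked_docs: list of doc ids in ranking order (best first)
    relevant_docs: set of doc ids judged relevant
    """
    top_r = ranked_docs[:r]
    hits = sum(1 for d in top_r if d in relevant_docs)   # relevant docs among top-r
    precision = hits / r
    recall = hits / len(relevant_docs) if relevant_docs else 0.0
    return precision, recall

# toy example: 3 of the top-5 results are relevant, 4 relevant docs exist overall
print(precision_recall_at_r(["d3", "d7", "d1", "d9", "d4"], {"d1", "d3", "d4", "d8"}, r=5))
# -> (0.6, 0.75)
```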

IRDM WS 2015 13-3

slide-4
SLIDE 4

IR Effectiveness: Aggregated Measures

For a set of n queries q1, ..., qn (e.g. TREC benchmark):

Macro evaluation (user-oriented) of precision:
$\frac{1}{n} \sum_{i=1}^{n} precision(q_i)$

Micro evaluation (system-oriented) of precision:
$\frac{\sum_{i=1}^{n} \#(\text{relevant docs found for } q_i)}{\sum_{i=1}^{n} \#(\text{docs found for } q_i)}$

analogous for recall and F1

Combining precision and recall into the F measure (e.g. with α = 0.5: harmonic mean F1):
$F_\alpha = \frac{1}{\alpha \cdot \frac{1}{precision} + (1-\alpha) \cdot \frac{1}{recall}}$

Precision-recall breakeven point of query q: point on precision-recall curve p = f(r) with p = r
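A small illustrative Python sketch of the macro vs. micro aggregation and the F measure (the per-query counts are made up for the example):

```python
def macro_precision(per_query):
    # per_query: list of (relevant_found, found) pairs, one per query
    return sum(rel / found for rel, found in per_query) / len(per_query)

def micro_precision(per_query):
    total_rel = sum(rel for rel, _ in per_query)
    total_found = sum(found for _, found in per_query)
    return total_rel / total_found

def f_measure(precision, recall, alpha=0.5):
    # F_alpha = 1 / (alpha/precision + (1-alpha)/recall); alpha=0.5 gives the harmonic mean F1
    return 1.0 / (alpha / precision + (1 - alpha) / recall)

queries = [(8, 10), (1, 2)]        # query 1: 8 of 10 results relevant; query 2: 1 of 2
print(macro_precision(queries))    # (0.8 + 0.5) / 2 = 0.65
print(micro_precision(queries))    # (8 + 1) / (10 + 2) = 0.75
print(f_measure(0.5, 0.25))        # harmonic mean of 0.5 and 0.25 = 1/3
```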

IRDM WS 2015 13-4

slide-5
SLIDE 5

IR Effectiveness: Integrated Measures

  • Uninterpolated average precision of query q

with top-m search result rank list d1, ..., dm and relevant results d_{i_1}, ..., d_{i_k} (k ≤ m, i_j ≤ i_{j+1} ≤ m):
$AP(q) = \frac{1}{k} \sum_{j=1}^{k} \frac{j}{i_j}$

  • Interpolated average precision of query q

with precision p(x) at recall x and step width Δ (e.g. 0.1):
$\frac{1}{1/\Delta} \sum_{i=1}^{1/\Delta} p(i \cdot \Delta)$   (≈ area under the precision-recall curve)

  • Mean average precision (MAP) of a query benchmark suite Q

macro-average of the per-query interpolated average precision for top-m results (usually with recall step width 0.01):
$MAP = \frac{1}{|Q|} \sum_{q \in Q} \frac{1}{1/\Delta} \sum_{i=1}^{1/\Delta} precision_q(recall = i \cdot \Delta)$
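An illustrative Python sketch of uninterpolated average precision (normalized by the number of relevant results found, as in the definition above) and of MAP as a macro-average over queries; the relevance labels are hypothetical:

```python
def average_precision(ranked_docs, relevant_docs):
    """Uninterpolated AP: mean of precision@i over the ranks i of relevant results."""
    hits, precisions = 0, []
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant_docs:
            hits += 1
            precisions.append(hits / rank)       # j / i_j in the slide's notation
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(runs):
    # runs: list of (ranked_docs, relevant_docs) pairs, one per query
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# toy example: relevant docs found at ranks 1 and 3 -> AP = (1/1 + 2/3) / 2
print(average_precision(["a", "b", "c", "d"], {"a", "c"}))                      # ~0.833
print(mean_average_precision([(["a", "b", "c", "d"], {"a", "c"}),
                              (["b", "a"], {"a"})]))                            # ~0.667
```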

IRDM WS 2015 13-5

slide-6
SLIDE 6

IR Effectiveness: Integrated Measures

Plot the ROC curve (receiver operating characteristic): true-positive rate vs. false-positive rate, which corresponds to Recall vs. Fallout, where
Fallout = (# irrelevant docs among top-r) / (# irrelevant docs in corpus)

[Plot: Recall vs. Fallout for a good ROC curve]

The area under the curve (AUC) is a quality indicator.

IRDM WS 2015 13-6

slide-7
SLIDE 7

IR Effectiveness: Weighted Measures

Mean reciprocal rank (MRR) over a query set Q:
$MRR = \frac{1}{|Q|} \sum_{q \in Q} \frac{1}{FirstRelevantRank(q)}$
Variation: summand is 0 if FirstRelevantRank(q) > k

Discounted Cumulative Gain (DCG) for query q,
with a finite set of result ratings: 0 (irrelevant), 1 (ok), 2 (good), ...:
$DCG = \sum_{i=1}^{k} \frac{2^{rating(i)} - 1}{\log_2(1 + i)}$

Normalized Discounted Cumulative Gain (NDCG) for query q:
$NDCG = DCG \,/\, DCG(PerfectResult)$
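A minimal Python sketch of DCG and NDCG with graded ratings, following the formula above (the ratings in the example are made up):

```python
import math

def dcg(ratings, k=None):
    """DCG = sum over ranks i of (2^rating(i) - 1) / log2(1 + i)."""
    k = len(ratings) if k is None else k
    return sum((2 ** r - 1) / math.log2(1 + i)
               for i, r in enumerate(ratings[:k], start=1))

def ndcg(ratings, k=None):
    ideal = sorted(ratings, reverse=True)        # perfect result: best ratings first
    denom = dcg(ideal, k)
    return dcg(ratings, k) / denom if denom > 0 else 0.0

print(ndcg([2, 0, 1, 2, 0]))   # ~0.89 for this ranking relative to the ideal ordering
```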

IRDM WS 2015 13-7

slide-8
SLIDE 8

IR Effectiveness: Ordered List Measures

Consider the top-k of two rankings σ1 and σ2, or full permutations of 1..n

  • overlap similarity: OSim(σ1, σ2) = |top(k, σ1) ∩ top(k, σ2)| / k

  • Kendall's τ measure:
    $KDist(\sigma_1, \sigma_2) = \frac{|\{(u,v) \mid u, v \in U,\ u \neq v,\ \sigma_1 \text{ and } \sigma_2 \text{ disagree on the relative order of } u, v\}|}{|U| \, (|U| - 1)}$
    with U = top(k, σ1) ∪ top(k, σ2) (missing items are assigned rank k+1)

  • footrule distance:
    $Fdist(\sigma_1, \sigma_2) = \frac{1}{|U|} \sum_{u \in U} |\sigma_1(u) - \sigma_2(u)|$

(normalized) Fdist is an upper bound for KDist, and Fdist/2 is a lower bound.
With ties in one ranking and an order in the other, count p with 0 ≤ p ≤ 1: p = 0 gives the weak KDist, p = 1 the strict KDist.
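A sketch of KDist and Fdist for two full permutations (for top-k lists, missing items would first be assigned rank k+1 as described above); the example rankings are illustrative, and note that this version normalizes by the number of unordered pairs, |U|(|U|−1)/2:

```python
from itertools import combinations

def kendall_dist(rank1, rank2):
    """Fraction of item pairs on whose relative order the two rankings disagree.
    rank1, rank2: dicts item -> rank (1 = best)."""
    pairs = list(combinations(rank1, 2))
    disagree = sum(1 for u, v in pairs
                   if (rank1[u] - rank1[v]) * (rank2[u] - rank2[v]) < 0)
    return disagree / len(pairs)

def footrule_dist(rank1, rank2):
    """Mean absolute rank displacement."""
    return sum(abs(rank1[u] - rank2[u]) for u in rank1) / len(rank1)

r1 = {"a": 1, "b": 2, "c": 3, "d": 4}
r2 = {"a": 2, "b": 1, "c": 3, "d": 4}               # a and b swapped
print(kendall_dist(r1, r2), footrule_dist(r1, r2))   # 1/6 of the pairs disagree; mean displacement 0.5
```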

IRDM WS 2015 13-8

slide-9
SLIDE 9

Outline

13.1 IR Effectiveness Measures
13.2 Probabilistic IR
  13.2.1 Prob. IR with the Binary Model
  13.2.2 Prob. IR with the Poisson Model (Okapi BM25)
  13.2.3 Extensions with Term Dependencies
13.3 Statistical Language Model
13.4 Latent-Topic Models
13.5 Learning to Rank

IRDM WS 2015 13-9

slide-10
SLIDE 10

13.2 Probabilistic IR

based on a generative model: a probabilistic mechanism for producing a document (or query), usually with a specific family of parameterized distributions,
often with the assumption of independence among words,
justified by the „curse of dimensionality": a corpus with n docs and m terms has 2^m possible docs; model parameters would have to be estimated from n << 2^m samples (problems of sparseness & computational tractability)

IRDM WS 2015 13-10

slide-11
SLIDE 11

13.2.1 Multivariate Bernoulli Model (aka. Multi-Bernoulli Model)

For generating doc x

  • consider binary RVs: xw = 1 if w occurs in x, 0 otherwise
  • postulate independence among these RVs

$P[x \mid \theta] = \prod_{w \in W} \theta_w^{x_w} \, (1 - \theta_w)^{1 - x_w} = \prod_{w \in x} \theta_w \;\cdot \prod_{w \in W,\, w \notin x} (1 - \theta_w)$

with vocabulary W and parameters θ_w = P[randomly drawn word is w]

  • product for absent words underestimates prob. of likely docs
  • too much prob. mass given to very unlikely word combinations

IRDM WS 2015 13-11

slide-12
SLIDE 12

Probability Ranking Principle (PRP)

[Robertson and Sparck Jones 1976]
Goal: ranking based on sim(doc d, query q) = P[R|d]
  = P[doc d is relevant for query q | d has term vector X1, ..., Xm]

Probability Ranking Principle (PRP) [Robertson 1977]:
For a given retrieval task, the cost of retrieving d as the next result in a ranked list is:
  cost(d) := C_R · P[R|d] + C_notR · P[not R|d]
with cost constants
  C_R = cost of retrieving a relevant doc
  C_notR = cost of retrieving an irrelevant doc
For C_R < C_notR, the cost is minimized by choosing argmax_d P[R|d].

IRDM WS 2015 13-12

slide-13
SLIDE 13

Derivation of PRP

Consider doc d to be retrieved next, i.e., preferred over all other candidate docs d':
cost(d) = C_R P[R|d] + C_notR P[notR|d] ≤ C_R P[R|d'] + C_notR P[notR|d'] = cost(d')
⇔ C_R P[R|d] + C_notR (1 − P[R|d]) ≤ C_R P[R|d'] + C_notR (1 − P[R|d'])
⇔ C_R P[R|d] − C_notR P[R|d] ≤ C_R P[R|d'] − C_notR P[R|d']
⇔ (C_R − C_notR) P[R|d] ≤ (C_R − C_notR) P[R|d']
⇔ P[R|d] ≥ P[R|d'] for all d', as C_R < C_notR

IRDM WS 2015 13-13

slide-14
SLIDE 14

Probabilistic IR with Binary Independence Model

[Robertson and Sparck Jones 1976]
based on the Multi-Bernoulli generative model and the Probability Ranking Principle;
the BIR principle is analogous to a Naive Bayes classifier

Assumptions:

  • Relevant and irrelevant documents differ in their terms.
  • Binary Independence Retrieval (BIR) Model:
    • Probabilities of term occurrence of different terms are pairwise independent.
    • Term frequencies are binary ∈ {0,1}.
  • For terms that do not occur in query q, the probabilities for such a term occurring are the same for relevant and irrelevant documents.

IRDM WS 2015 13-14

slide-15
SLIDE 15

Ranking Proportional to Relevance Odds

$sim(d, q) = O(R \mid d) = \frac{P[R \mid d]}{P[\neg R \mid d]}$   (odds for relevance)

$= \frac{P[d \mid R] \, P[R]}{P[d \mid \neg R] \, P[\neg R]}$   (Bayes' theorem)

$\sim \frac{P[d \mid R]}{P[d \mid \neg R]} = \prod_{i=1}^{m} \frac{P[d_i \mid R]}{P[d_i \mid \neg R]}$   (independence or linked dependence)

$= \prod_{i \in q} \frac{P[d_i \mid R]}{P[d_i \mid \neg R]}$   (since P[d_i | R] = P[d_i | ¬R] for i ∉ q)

with d_i = 1 if d includes term i, 0 otherwise; X_i = 1 if a random doc includes term i, 0 otherwise

$= \prod_{i \in q,\, d_i = 1} \frac{P[X_i = 1 \mid R]}{P[X_i = 1 \mid \neg R]} \cdot \prod_{i \in q,\, d_i = 0} \frac{P[X_i = 0 \mid R]}{P[X_i = 0 \mid \neg R]}$

IRDM WS 2015 13-15

slide-16
SLIDE 16

Ranking Proportional to Relevance Odds

with estimators p_i = P[X_i = 1 | R] and q_i = P[X_i = 1 | ¬R]:

$sim(d, q) \;\sim\; \prod_{i \in q,\, d_i = 1} \frac{p_i}{q_i} \cdot \prod_{i \in q,\, d_i = 0} \frac{1 - p_i}{1 - q_i} \;=\; \prod_{i \in q} \frac{p_i^{d_i} (1-p_i)^{1-d_i}}{q_i^{d_i} (1-q_i)^{1-d_i}}$

Taking logarithms:
$\log sim(d, q) \;\sim\; \sum_{i \in q} d_i \log \frac{p_i}{1-p_i} + \sum_{i \in q} d_i \log \frac{1-q_i}{q_i} + \sum_{i \in q} \log \frac{1-p_i}{1-q_i}$

The last sum does not depend on the document and can be dropped for ranking:
$sim(d, q) \;\sim\; \sum_{i \in q} d_i \log \frac{p_i}{1-p_i} + \sum_{i \in q} d_i \log \frac{1-q_i}{q_i}$

IRDM WS 2015 13-16

slide-17
SLIDE 17

Estimating pi and qi values: Robertson / Sparck Jones Formula

Estimate p_i and q_i based on a training sample (query q on a small sample of the corpus) or based on intellectual assessment of the first round's results (relevance feedback):

Let N be the # docs in the sample, R the # relevant docs in the sample,
n_i the # docs in the sample that contain term i, r_i the # relevant docs in the sample that contain term i.

Estimate:
$p_i = \frac{r_i}{R}$       $q_i = \frac{n_i - r_i}{N - R}$

or, with Lidstone smoothing (λ = 0.5):
$p_i = \frac{r_i + 0.5}{R + 1}$       $q_i = \frac{n_i - r_i + 0.5}{N - R + 1}$

$\Rightarrow \; sim(d, q) = \sum_{i \in q} d_i \log \frac{r_i + 0.5}{R - r_i + 0.5} + \sum_{i \in q} d_i \log \frac{N - R - n_i + r_i + 0.5}{n_i - r_i + 0.5}$

⇒ Weight of term i in doc d (Robertson / Sparck Jones weight):
$w_i = \log \frac{(r_i + 0.5)(N - R - n_i + r_i + 0.5)}{(R - r_i + 0.5)(n_i - r_i + 0.5)}$

IRDM WS 2015 13-17

slide-18
SLIDE 18

Example for Probabilistic Retrieval

Documents with relevance feedback, query q = {t1, t2, t3, t4, t5, t6}:

      t1  t2  t3  t4  t5  t6   R
 d1    1   0   1   1   0   0   1
 d2    1   1   0   1   1   0   1
 d3    0   0   0   1   1   0   0
 d4    0   0   1   0   0   0   0

R = 2, N = 4

$sim(d, q) = \sum_{i \in q} d_i \log \frac{p_i}{1-p_i} + \sum_{i \in q} d_i \log \frac{1-q_i}{q_i}$

With Lidstone smoothing (λ = 0.5):

      t1   t2   t3   t4   t5   t6
 ni    2    1    2    3    2    0
 ri    2    1    1    2    1    0
 pi   5/6  1/2  1/2  5/6  1/2  1/6
 qi   1/6  1/6  1/2  1/2  1/2  1/6

Score of new document d5 with d5 ∩ q = <1 1 0 0 0 1>:
 sim(d5, q) = log 5 + log 1 + log 0.2 + log 5 + log 5 + log 5

IRDM WS 2015 13-18
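The following sketch recomputes the slide's example with the Robertson/Sparck-Jones estimators and Lidstone smoothing (λ = 0.5); the variable and function names are mine:

```python
import math

# feedback sample: term incidence vectors for d1..d4 and their relevance labels
docs = {"d1": [1, 0, 1, 1, 0, 0], "d2": [1, 1, 0, 1, 1, 0],
        "d3": [0, 0, 0, 1, 1, 0], "d4": [0, 0, 1, 0, 0, 0]}
relevant = {"d1", "d2"}
N, R = len(docs), len(relevant)

def estimates(i, lam=0.5):
    n_i = sum(v[i] for v in docs.values())                      # docs containing term i
    r_i = sum(v[i] for d, v in docs.items() if d in relevant)   # relevant docs containing term i
    p_i = (r_i + lam) / (R + 2 * lam)
    q_i = (n_i - r_i + lam) / (N - R + 2 * lam)
    return p_i, q_i

def sim(doc_vector):
    score = 0.0
    for i, d_i in enumerate(doc_vector):
        p, q = estimates(i)
        score += d_i * (math.log(p / (1 - p)) + math.log((1 - q) / q))
    return score

d5 = [1, 1, 0, 0, 0, 1]   # new document d5, restricted to the query terms
print(sim(d5))            # = log5 + log1 + log0.2 + log5 + log5 + log5 = 3*log(5) ~ 4.83 (natural log)
```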

slide-19
SLIDE 19

Relationship to tf*idf Formula

$sim(d, q) = \sum_{i \in q} d_i \log \frac{p_i}{1-p_i} + \sum_{i \in q} d_i \log \frac{1-q_i}{q_i}$

Assumptions (without training sample or relevance feedback):

  • p_i is the same for all i
  • Most documents are irrelevant.
  • Each individual term i is infrequent.

This implies:
$\sum_{i \in q} d_i \log \frac{p_i}{1-p_i} = c \cdot \sum_{i \in q} d_i$   with constant c

$q_i = P[X_i = 1 \mid \neg R] \approx \frac{df_i}{N}$   and   $\frac{1-q_i}{q_i} = \frac{N - df_i}{df_i} \approx \frac{N}{df_i}$

$\Rightarrow \; sim(d, q) \approx c \cdot \sum_{i \in q} d_i + \sum_{i \in q} d_i \cdot \log \frac{N}{df_i}$

a scalar product over the product of tf and dampened idf values for the query terms

IRDM WS 2015 13-19

slide-20
SLIDE 20

Laplace Smoothing (with Uniform Prior)

Probabilities pi and qi for term i are estimated by MLE for binomial distribution

(repeated coin tosses for relevant docs, showing term i with pi, repeated coin tosses for irrelevant docs, showing term i with qi)

To avoid overfitting to feedback/training data, the estimates should be smoothed (e.g. with a uniform prior):
Instead of estimating p_i = k/n, estimate (Laplace's law of succession):
  p_i = (k + 1) / (n + 2)
or, with heuristic generalization (Lidstone's law of succession):
  p_i = (k + λ) / (n + 2λ)   with λ > 0 (e.g. λ = 0.5)
And for a multinomial distribution (n rolls of a w-faceted dice), estimate:
  p_i = (k_i + 1) / (n + w)

IRDM WS 2015 13-20

slide-21
SLIDE 21

Laplace Smoothing as Bayesian Parameter Estimation

IRDM WS 2015 13-21

Bayes' rule for parameter estimation:
$P[\text{param } \theta \mid \text{data } d] = \frac{P[d \mid \theta] \, P[\theta]}{P[d]}$   (posterior ∝ likelihood × prior)

Consider binom(n, x) with observation k; assume uniform(x) as prior for parameter x ∈ [0,1], i.e. $f_{uniform}(x) = 1$.

$P[x \mid k, n] = \frac{P[k,n \mid x] \, P[x]}{P[k,n]} = \frac{x^k (1-x)^{n-k}}{\int_0^1 y^k (1-y)^{n-k} \, dy} = \frac{P[k,n \mid x] \, f_{uniform}(x)}{\int_0^1 P[k,n \mid y] \, f_{uniform}(y) \, dy}$

Posterior expectation:
$E[x \mid k, n] = \int_0^1 x \, P[x \mid k, n] \, dx = \int_0^1 x \, \frac{x^k (1-x)^{n-k}}{\int_0^1 z^k (1-z)^{n-k} \, dz} \, dx = \frac{B(k+2,\, n-k+1)}{B(k+1,\, n-k+1)} = \frac{\Gamma(k+2)\,\Gamma(n+2)}{\Gamma(n+3)\,\Gamma(k+1)} = \frac{(k+1)! \, (n+1)!}{(n+2)! \, k!} = \frac{k+1}{n+2}$

with Beta function $B(x, y) = \int_0^1 u^{x-1} (1-u)^{y-1} \, du = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x+y)}$
and Gamma function $\Gamma(z) = \int_0^\infty u^{z-1} e^{-u} \, du$, with $\Gamma(z+1) = z!$ for $z \in \mathbb{N}$.

slide-22
SLIDE 22

13.2.2 Poisson Model

For generating doc x

  • consider counting RVs: x_w = number of occurrences of w in x
  • still postulate independence among these RVs

Poisson model with word-specific parameters μ_w:
$P[x \mid \mu] = \prod_{w \in W} e^{-\mu_w} \, \frac{\mu_w^{x_w}}{x_w!}$

MLE for μ_w is straightforward: $\mu_w = \frac{1}{n} \sum_{j=1..n} tf(w, d_j)$
no likelihood penalty by absent words; no control of doc length

IRDM WS 2015 13-22

slide-23
SLIDE 23

Probabilistic IR with Poisson Model (Okapi BM25)

Generalize the term weight
$w = \log \frac{p \, (1-q)}{q \, (1-p)}$
into
$w = \log \frac{p_{tf} \, q_0}{q_{tf} \, p_0}$
with p_j, q_j denoting the probability that the term occurs j times in a relevant / irrelevant doc.

Postulate Poisson distributions:
$p_{tf} = e^{-\lambda} \frac{\lambda^{tf}}{tf!}$   (relevant docs)      $q_{tf} = e^{-\mu} \frac{\mu^{tf}}{tf!}$   (irrelevant docs)

combined into a 2-Poisson mixture for all docs

IRDM WS 2015 13-23

slide-24
SLIDE 24

Okapi BM25 Scoring Function

Approximation of the Poisson model by a similarly-shaped function:
$w := \log \frac{p \, (1-q)}{q \, (1-p)} \cdot \frac{tf}{k_1 + tf}$

finally leads to the Okapi BM25 weights:
$w_j(d) := \frac{(k_1 + 1) \cdot tf_j}{k_1 \cdot \left((1-b) + b \cdot \frac{length(d)}{avgdoclength}\right) + tf_j} \cdot \log \frac{N - df_j + 0.5}{df_j + 0.5}$

with avgdoclength = average document length of the corpus and tuning parameters k1, k2, k3, b;
sub-linear influence of tf (via k1), consideration of doc length (via b)

or, in the most comprehensive, tunable form:
$w_j(d) = \log \frac{N - df_j + 0.5}{df_j + 0.5} \cdot \frac{(k_1 + 1) \, tf_j}{k_1 \left((1-b) + b \frac{len(d)}{avgdoclength}\right) + tf_j} \cdot \frac{(k_3 + 1) \, qtf_j}{k_3 + qtf_j} \;+\; k_2 \cdot |q| \cdot \frac{avgdoclength - len(d)}{avgdoclength + len(d)}$

BM25 performs very well; it has won many benchmark competitions (TREC etc.).
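A compact Python sketch of the common BM25 variant shown above (tf and document-length normalization plus the idf term); the k1 and b defaults and the toy corpus are illustrative, and the query-side k2/k3 factors of the full form are omitted:

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Score one document for a query; corpus is a list of term lists."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc_terms)
    score = 0.0
    for t in query_terms:
        df = sum(1 for d in corpus if t in d)
        if df == 0:
            continue
        idf = math.log((N - df + 0.5) / (df + 0.5))
        norm = k1 * ((1 - b) + b * len(doc_terms) / avgdl) + tf[t]
        score += idf * (k1 + 1) * tf[t] / norm
    return score

corpus = [["probabilistic", "ir", "ranking"], ["poisson", "model", "ranking"],
          ["cooking", "recipes"]]
print(bm25_score(["probabilistic", "ranking"], corpus[0], corpus))
```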

IRDM WS 2015 13-24

slide-25
SLIDE 25

Poisson Mixtures for Capturing tf Distribution

Source: Church/Gale 1995

[Figure: distribution of tf values for the term „said", fitted by Katz's K-mixture]

IRDM WS 2015 13-25

slide-26
SLIDE 26

13.2.3 Extensions with Term Dependencies

Consider term correlations in documents (with binary Xi)
→ problem of estimating the m-dimensional probability distribution
P[X1=... ∧ X2=... ∧ ... ∧ Xm=...] =: f_X(X1, ..., Xm)   (curse of dimensionality)

One possible approach, the Tree Dependence Model:
a) Consider only 2-dimensional probabilities (for term pairs):
   $f_{ij}(X_i, X_j) = P[X_i = .. \wedge X_j = ..] = \sum_{X_1} .. \sum_{X_{i-1}} \sum_{X_{i+1}} .. \sum_{X_{j-1}} \sum_{X_{j+1}} .. \sum_{X_m} P[X_1 = .. \wedge ... \wedge X_m = ..]$
b) For each term pair estimate the error between independence and the actual correlation.
c) Construct a tree with terms as nodes and the m−1 highest error (or correlation) values as weighted edges.

IRDM WS 2015 13-26

slide-27
SLIDE 27

Considering Two-dimensional Term Correlation

Variant 1: Error of approximating f by g (Kullback-Leibler divergence), with g assuming pairwise term independence:

$\epsilon(f, g) := \sum_{X \in \{0,1\}^m} f(X) \log \frac{f(X)}{g(X)} = \sum_{X \in \{0,1\}^m} f(X) \log \frac{f(X)}{\prod_{i=1..m} g_i(X_i)}$

Variant 2: Correlation coefficient for term pairs:
$\rho(X_i, X_j) := \frac{Cov(X_i, X_j)}{\sqrt{Var(X_i) \, Var(X_j)}}$

Variant 3: level-α values or p-values of a Chi-square independence test

IRDM WS 2015 13-27

slide-28
SLIDE 28

Example for Approximation Error  by KL Divergence

m = 2; given are documents: d1=(1,1), d2=(0,0), d3=(1,1), d4=(0,1)

Estimation of the 2-dimensional prob. distribution f:
  f(1,1) = P[X1=1 ∧ X2=1] = 2/4,  f(0,0) = 1/4,  f(0,1) = 1/4,  f(1,0) = 0

Estimation of the 1-dimensional marginal distributions g1 and g2:
  g1(1) = P[X1=1] = 2/4, g1(0) = 2/4
  g2(1) = P[X2=1] = 3/4, g2(0) = 1/4

Estimation of the 2-dim. distribution g with independent Xi:
  g(1,1) = g1(1)·g2(1) = 3/8,  g(0,0) = 1/8,  g(0,1) = 3/8,  g(1,0) = 1/8

Approximation error ε (KL divergence):
  ε = 2/4 · log(4/3) + 1/4 · log 2 + 1/4 · log(2/3) + 0
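A short Python sketch reproducing this computation (terms with f = 0 contribute 0 by convention; natural logarithms are used here):

```python
import math

docs = [(1, 1), (0, 0), (1, 1), (0, 1)]
n = len(docs)

# 2-dimensional joint distribution f and 1-dimensional marginals g1, g2
f = {xy: sum(1 for d in docs if d == xy) / n for xy in [(0, 0), (0, 1), (1, 0), (1, 1)]}
g1 = {v: sum(1 for d in docs if d[0] == v) / n for v in (0, 1)}
g2 = {v: sum(1 for d in docs if d[1] == v) / n for v in (0, 1)}

# independence approximation g(x,y) = g1(x)*g2(y) and KL divergence eps(f, g)
eps = sum(f[x, y] * math.log(f[x, y] / (g1[x] * g2[y]))
          for x in (0, 1) for y in (0, 1) if f[x, y] > 0)
print(eps)   # = 2/4*log(4/3) + 1/4*log(2) + 1/4*log(2/3) ~ 0.216
```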

IRDM WS 2015 13-28

slide-29
SLIDE 29

Constructing the Term Dependence Tree

Given: a complete graph (V, E) with m nodes Xi ∈ V and O(m²) undirected edges ∈ E with weights ε (or ρ)
Wanted: a spanning tree (V, E') with maximal sum of edge weights

Algorithm:
  Sort the edges of E in descending order of weight
  E' := ∅
  Repeat until |E'| = m−1:
    E' := E' ∪ {(i,j) ∈ E | (i,j) has max. weight in E}, provided that E' remains acyclic
    E := E − {(i,j) ∈ E | (i,j) has max. weight in E}

Example: complete graph over {Web, Internet, Surf, Swim} with edge weights 0.9, 0.7, 0.1, 0.3, 0.5, 0.1;
the resulting spanning tree keeps the edges Web–Internet (0.9), Web–Surf (0.7), Surf–Swim (0.3).
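A sketch of the greedy maximum-weight spanning-tree construction above (essentially Kruskal's algorithm with a union-find cycle check). The assignment of the non-tree weights 0.5 and 0.1 to specific edges is an assumption, chosen here only so that the output matches the slide's resulting tree:

```python
def max_spanning_tree(nodes, weighted_edges):
    """Greedy construction: take edges in descending weight order, skip edges that close a cycle."""
    parent = {v: v for v in nodes}          # union-find forest

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for w, u, v in sorted(weighted_edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:                         # adding (u, v) keeps the edge set acyclic
            parent[ru] = rv
            tree.append((u, v, w))
        if len(tree) == len(nodes) - 1:
            break
    return tree

# only the three tree-edge weights 0.9, 0.7, 0.3 are explicit on the slide; the rest is assumed
edges = [(0.9, "Web", "Internet"), (0.7, "Web", "Surf"), (0.5, "Internet", "Surf"),
         (0.3, "Surf", "Swim"), (0.1, "Web", "Swim"), (0.1, "Internet", "Swim")]
print(max_spanning_tree(["Web", "Internet", "Surf", "Swim"], edges))
# -> [('Web', 'Internet', 0.9), ('Web', 'Surf', 0.7), ('Surf', 'Swim', 0.3)]
```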

IRDM WS 2015 13-29

slide-30
SLIDE 30

Estimation of Multidimensional Probabilities with Term Dependence Tree

Given is a term dependence tree (V = {X1, ..., Xm}, E'). Let X1 be the root, let the nodes be preorder-numbered, and assume that Xi and Xj are independent for (i,j) ∉ E'. Then:

P[X1=.. ∧ ... ∧ Xm=..]
  = P[X1=..] · Π_{i=2..m} P[Xi=.. | X1=.. ∧ ... ∧ X(i−1)=..]     (chain rule)
  = P[X1] · Π_{(i,j)∈E'} P[Xj | Xi]
  = P[X1] · Π_{(i,j)∈E'} P[Xi, Xj] / P[Xi]

Example:
P[Web, Internet, Surf, Swim]
  = P[Web] · P[Web, Internet]/P[Web] · P[Web, Surf]/P[Web] · P[Surf, Swim]/P[Surf]

IRDM WS 2015 13-30

slide-31
SLIDE 31

Digression: Bayesian Networks

A Bayesian network (BN) is a directed, acyclic graph (V, E) with the following properties:

  • Nodes ∈ V represent random variables and
  • Edges ∈ E represent dependencies.
  • For a root R ∈ V the BN captures the prior probability P[R = ...].
  • For a node X ∈ V with parents parents(X) = {P1, ..., Pk}
    the BN captures the conditional probability P[X=... | P1, ..., Pk].
  • Node X is conditionally independent of a non-parent node Y
    given its parents parents(X) = {P1, ..., Pk}: P[X | P1, ..., Pk, Y] = P[X | P1, ..., Pk].

This implies:

  • by the chain rule:
    $P[X_1 \wedge ... \wedge X_n] = P[X_1 \mid X_2 ... X_n] \cdot P[X_2 \wedge ... \wedge X_n] = ... = \prod_{i=1}^{n} P[X_i \mid X_{i+1} ... X_n]$
  • by conditional independence:
    $= \prod_{i=1}^{n} P[X_i \mid parents(X_i), \text{other nodes}] = \prod_{i=1}^{n} P[X_i \mid parents(X_i)]$

IRDM WS 2015 13-31

slide-32
SLIDE 32

Example of Bayesian Network

Network structure: Cloudy → Sprinkler, Cloudy → Rain, Sprinkler → Wet, Rain → Wet

P[C]:
   P[C]   P[¬C]
   0.5    0.5

P[S | C]:
  C    P[S]   P[¬S]
  F    0.5    0.5
  T    0.1    0.9

P[R | C]:
  C    P[R]   P[¬R]
  F    0.2    0.8
  T    0.8    0.2

P[W | S, R]:
  S  R   P[W]   P[¬W]
  F  F   0.0    1.0
  F  T   0.9    0.1
  T  F   0.9    0.1
  T  T   0.99   0.01
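A minimal sketch (not from the slides) that encodes the tables above and uses the factorization P[C,S,R,W] = P[C]·P[S|C]·P[R|C]·P[W|S,R] to compute a marginal by enumeration:

```python
from itertools import product

P_C = {True: 0.5, False: 0.5}
P_S_given_C = {False: {True: 0.5, False: 0.5}, True: {True: 0.1, False: 0.9}}
P_R_given_C = {False: {True: 0.2, False: 0.8}, True: {True: 0.8, False: 0.2}}
P_W_given_SR = {(False, False): {True: 0.0, False: 1.0},
                (False, True):  {True: 0.9, False: 0.1},
                (True, False):  {True: 0.9, False: 0.1},
                (True, True):   {True: 0.99, False: 0.01}}

def joint(c, s, r, w):
    # chain-rule factorization along the network structure
    return P_C[c] * P_S_given_C[c][s] * P_R_given_C[c][r] * P_W_given_SR[(s, r)][w]

p_wet = sum(joint(c, s, r, True) for c, s, r in product([False, True], repeat=3))
print(p_wet)   # marginal probability of Wet ~ 0.647
```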

IRDM WS 2015 13-32

slide-33
SLIDE 33

Bayesian Inference Networks for IR

Network: document nodes d1, ..., dj, ..., dN; term nodes t1, ..., ti, ..., tl, ..., tM; query node q
(all binary random variables)

P[dj] = 1/N
P[ti | dj ∈ parents(ti)] = 1 if ti occurs in dj, 0 otherwise
P[q | parents(q)] = 1 if ∀ t ∈ parents(q): t is relevant for q, 0 otherwise

$P[q \wedge d_j] = \sum_{(t_1 ... t_M)} P[q \wedge d_j \mid t_1 ... t_M] \, P[t_1 ... t_M] = \sum_{(t_1 ... t_M)} P[q \wedge d_j \wedge t_1 ... t_M]$

$= \sum_{(t_1 ... t_M)} P[q \mid d_j \wedge t_1 ... t_M] \, P[d_j \wedge t_1 ... t_M] = \sum_{(t_1 ... t_M)} P[q \mid t_1 ... t_M] \, P[t_1 ... t_M \mid d_j] \, P[d_j]$

IRDM WS 2015 13-33

slide-34
SLIDE 34

Advanced Bayesian Network for IR

Network extended with concept / topic nodes c1, ..., ck, ..., cK between the term nodes t1, ..., tM and the query q.

Problems:

  • parameter estimation (sampling / training)
  • (non-) scalable representation
  • (in-) efficient prediction
  • lack of fully convincing experiments

$P[c_k \mid t_i \wedge t_l]$ estimated via the term co-occurrence $P[t_i \wedge t_l] = \frac{df_{il}}{df_i + df_l - df_{il}}$

Alternative to BN is MRF (Markov Random Field) to model query term dependencies

[Metzler/Croft 2005]

IRDM WS 2015 13-34

slide-35
SLIDE 35

Summary of Section 13.2

  • Probabilistic IR reconciles principled foundations

with practically effective ranking

  • Binary Independence Retrieval (Multi-Bernoulli model)

can be thought of as a Naive Bayes classifier: simple but effective

  • Parameter estimation requires smoothing
  • Poisson-model-based Okapi BM25 often performs best
  • Extensions with term dependencies (e.g. Bayesian Networks) are

(too) expensive for Web IR but may be interesting for specific apps

IRDM WS 2015 13-35

slide-36
SLIDE 36

Additional Literature for Section 13.2

  • K. van Rijsbergen: Information Retrieval, Chapter 6: Probabilistic Retrieval, 1979,

http://www.dcs.gla.ac.uk/Keith/Preface.html

  • R. Madsen, D. Kauchak, C. Elkan: Modeling Word Burstiness Using the

Dirichlet Distribution, ICML 2005

  • S.E. Robertson, K. Sparck Jones: Relevance Weighting of Search Terms,

JASIS 27(3), 1976

  • S.E. Robertson, S. Walker: Some Simple Effective Approximations to the

2-Poisson Model for Probabilistic Weighted Retrieval, SIGIR 1994

  • A. Singhal: Modern Information Retrieval – a Brief Overview,

IEEE CS Data Engineering Bulletin 24(4), 2001

  • K.W. Church, W.A. Gale: Poisson Mixtures,

Natural Language Engineering 1(2), 1995

  • C.T. Yu, W. Meng: Principles of Database Query Processing for

Advanced Applications, Morgan Kaufmann, 1997, Chapter 9

  • D. Heckerman: A Tutorial on Learning with Bayesian Networks,

Technical Report MSR-TR-95-06, Microsoft Research, 1995

  • D. Metzler, W.B. Croft: A Markov Random Field Model for Term Dependencies.

SIGIR 2005

IRDM WS 2015 13-36

slide-37
SLIDE 37

Outline

13.1 IR Effectiveness Measures
13.2 Probabilistic IR
13.3 Statistical Language Model
  13.3.1 Principles of LMs
  13.3.2 LMs with Smoothing
  13.3.3 Extended LMs
13.4 Latent-Topic Models
13.5 Learning to Rank

God does not roll dice. -- Albert Einstein

IRDM WS 2015 13-37

slide-38
SLIDE 38

13.3.1 Key Idea of Statistical Language Models

generative model for word sequences (generates a probability distribution over word sequences,
or bag-of-words, or set-of-words, or structured docs, or ...)

Example: P[„Today is Tuesday“] = 0.001 P[„Today Wednesday is“] = 0.00000000001 P[„The Eigenvalue is positive“] = 0.000001

LM itself highly context- / application-dependent Examples:

  • speech recognition: given that we heard „Julia“ and „feels“,

how likely will we next hear „happy“ or „habit“?

  • text classification: given that we saw „soccer“ 3 times and „game“

2 times, how likely is the news about sports?

  • information retrieval: given that the user is interested in math,

how likely would the user use „distribution“ in a query?

IRDM WS 2015 13-38

slide-39
SLIDE 39

Historical Background: Source-Channel Framework [Shannon 1948]

Source

Transmitter (Encoder) Noisy Channel Receiver (Decoder) Destination

P[X]          P[Y|X]          P[X|Y] = ?
X  →  (Noisy Channel)  →  Y  →  X'

$\hat{X} = \arg\max_X P[X \mid Y] = \arg\max_X P[Y \mid X] \, P[X]$

When X is text, P[X] is the language model.

Applications:               X                    Y
  speech recognition        word sequence        speech signal
  machine translation       English sentence     German sentence
  OCR error correction      correct word         erroneous word
  summarization             document             summary
  information retrieval     document             query

IRDM WS 2015 13-39

slide-40
SLIDE 40

Text Generation with (Unigram) LMs

LM for topic 1 (IR&DM): text 0.2, mining 0.1, n-gram 0.01, cluster 0.02, ..., food 0.000001
LM for topic 2 (Health): food 0.25, nutrition 0.1, healthy 0.05, diet 0.02, ...

An LM θ defines P[word | θ]; a document d (e.g. a "text mining paper" or a "food nutrition paper") is a sample from its LM, with a different θ_d for each d; one may also define LMs over n-grams.

IRDM WS 2015 13-40

slide-41
SLIDE 41

LMs for Ranked Retrieval

Two document LMs with unknown parameters to be estimated:
  LM(doc1) over text ?, mining ?, n-gram ?, cluster ?, ..., food ?   ("text mining paper")
  LM(doc2) over food ?, nutrition ?, healthy ?, diet ?, ...          ("food nutrition paper")

query q: data mining algorithms

Which LM is more likely to generate q (i.e., which better explains q)?

IRDM WS 2015 13-41

slide-42
SLIDE 42

LM Parameter Estimation

Parameters θ of LM(doc i) are estimated from doc i and a background corpus,
e.g. θ_j = P[t_j | θ] ~ tf(t_j, d_i) ...

query q: data mining algorithms

IRDM WS 2015 13-42

slide-43
SLIDE 43

LM Illustration: Document as Model and Query as Sample

model M — document d = "A A C A D E E E E C C B A E B" is a sample of M, used for parameter estimation

query = "A A B C E E" — estimate the likelihood P[query | M] of observing the query

IRDM WS 2015 13-43

slide-44
SLIDE 44

LM Illustration: Need for Smoothing

model M — document d = "A A C A D E E E E C C B A E B", plus a background corpus ("C A D A B E F") and/or smoothing, used for parameter estimation

query = "A B C E F" — estimate the likelihood P[query | M] of observing the query
(query term F does not occur in d, hence the need for smoothing)

IRDM WS 2015 13-44

slide-45
SLIDE 45

Probabilistic IR vs. Language Models

P[R | d, q]: the user considers a doc relevant, given that it has features d and the user has posed query q

Prob. IR ranks according to relevance odds:
$\sim \frac{P[d \mid R, q]}{P[d \mid \neg R, q]}$

Statistical LMs rank according to query likelihood:
$\sim \frac{P[q, d \mid R]}{P[q, d \mid \neg R]} \sim \frac{P[q \mid R, d]}{P[q \mid \neg R, d]} \cdot \frac{P[R \mid d]}{P[\neg R \mid d]} \sim \ldots \sim P[q \mid R, d]$

IRDM WS 2015 13-45

slide-46
SLIDE 46

13.3.2 Query Likelihood Model with Multi-Bernoulli LM

Query q is a set of terms, generated from d by tossing a coin for every term in vocabulary V:

$P[q \mid d] = \prod_{t \in V} p_t(d)^{X_t(q)} \, (1 - p_t(d))^{1 - X_t(q)}$

with X_t(q) = 1 if t ∈ q, 0 otherwise

Parameters θ of LM(d) are P[t|d]; the MLE is tf(t,d) / len(d), but the model works better with smoothing
→ MAP: maximum posterior likelihood given a prior for the parameters

$= \prod_{t \in q} P[t \mid d] \;\sim\; \sum_{t \in q} \log P[t \mid d]$

IRDM WS 2015 13-46

slide-47
SLIDE 47

Query Likelihood Model with Multinomial LM

Query q is a bag of terms, generated from d by rolling a |V|-faceted dice for each query token:

$P[q \mid d] = \frac{|q|!}{f_{t_1}(q)! \, f_{t_2}(q)! \cdots} \; \prod_{t \in q} p_t(d)^{f_t(q)}$

with f_t(q) = frequency of t in q

Can capture relevance feedback and user context (relative importance of terms).
Parameters θ of LM(d) are P[t|d] and P[t|q].
The multinomial LM is more expressive as a generative model and thus usually preferred over the Multi-Bernoulli LM.
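An illustrative Python sketch of scoring by multinomial query log-likelihood with MLE parameters (the multinomial coefficient is constant per query and dropped); without smoothing an unseen query term gives probability 0, which motivates the smoothing methods that follow:

```python
import math
from collections import Counter

def query_log_likelihood(query_terms, doc_terms):
    """log P[q | d] with MLE parameters P[t|d] = tf(t,d) / |d| (multinomial coefficient dropped)."""
    tf = Counter(doc_terms)
    dlen = len(doc_terms)
    score = 0.0
    for t in query_terms:
        p = tf[t] / dlen
        if p == 0.0:
            return float("-inf")      # unseen query term: zero likelihood without smoothing
        score += math.log(p)
    return score

doc = "text mining for web data mining applications".split()
print(query_log_likelihood(["data", "mining"], doc))        # log(1/7) + log(2/7)
print(query_log_likelihood(["data", "algorithms"], doc))    # -inf: "algorithms" not in the doc
```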

IRDM WS 2015 13-47

slide-48
SLIDE 48

Alternative Form of Multinomial LM: Ranking by Kullback-Leibler Divergence

IRDM WS 2015 13-48

$\log P[q \mid d] = \log \frac{|q|!}{f_{t_1}(q)! \cdots} + \sum_j f_j(q) \log p_j(d) \;\sim\; \sum_j f_j(q) \log p_j(d)$

$= -H(f(q), p(d))$   (negative cross-entropy)

$\sim -H(f(q), p(d)) + H(f(q)) = -D(f(q) \,\|\, p(d)) = -\sum_j f_j(q) \log \frac{f_j(q)}{p_j(d)}$   (negative KL divergence of q and d)

makes the query LM explicit

slide-49
SLIDE 49

Smoothing Methods

possible methods:

  • Laplace smoothing
  • Absolute Discounting
  • Jelinek-Mercer smoothing
  • Dirichlet-prior smoothing
  • Katz smoothing
  • Good-Turing smoothing
  • ...

Most methods come with their own parameters. Smoothing is absolutely crucial to avoid overfitting and to make LMs useful (one LM per doc, one LM per query!). The choice of method and its parameter setting are still pretty much black art (or empirical).

IRDM WS 2015 13-49

slide-50
SLIDE 50

Laplace Smoothing and Absolute Discounting

estimation of d: pj(d) by MLE would yield Additive Laplace smoothing:

m | d | 1 ) d , j ( freq ) d ( p ˆ j   

| | ) , ( d d j freq Absolute discounting: | | ) , ( | | ) , ) , ( max( ) ( ˆ C C j freq d d j freq d p j     

j

d j freq d ) , ( | | where with corpus C, [0,1] where

| | # d d in terms distinct    

for multinomial over vocabulary W with |W|=m

IRDM WS 2015 13-50

slide-51
SLIDE 51

Jelinek-Mercer Smoothing

Idea: use linear combination of doc LM with background LM (corpus LM, common language); could also consider query log as background LM for query

$\hat{p}_j(d) = \lambda \cdot \frac{freq(j,d)}{|d|} + (1-\lambda) \cdot \frac{freq(j,C)}{|C|}$

Parameter tuning of λ by cross-validation with held-out data:

  • divide the set of relevant (d,q) pairs into n partitions
  • build the LM on the pairs from n−1 partitions
  • choose λ to maximize precision (or recall or F1) on the n-th partition
  • iterate with different choices of the held-out partition and average
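A minimal sketch of the Jelinek-Mercer estimate above (the λ value, function name, and toy corpus are illustrative):

```python
from collections import Counter

def jelinek_mercer(term, doc_terms, corpus_terms, lam=0.5):
    """p_hat_j(d) = lam * freq(j,d)/|d| + (1-lam) * freq(j,C)/|C|"""
    tf_d = Counter(doc_terms)
    tf_c = Counter(corpus_terms)
    return lam * tf_d[term] / len(doc_terms) + (1 - lam) * tf_c[term] / len(corpus_terms)

doc = "food nutrition healthy diet".split()
corpus = "food nutrition healthy diet text mining data mining".split()
print(jelinek_mercer("mining", doc, corpus))   # unseen in the doc, but nonzero via the corpus LM
```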

IRDM WS 2015 13-51

slide-52
SLIDE 52

Jelinek-Mercer Smoothing: Relationship to tf*idf

IRDM WS 2015 13-52

$P[q_i \mid \theta_d] = \lambda \, P[q_i \mid d] + (1-\lambda) \, P[q_i] \;\sim\; 1 + \frac{\lambda}{1-\lambda} \cdot \frac{P[q_i \mid d]}{P[q_i]}$

$\log P[q \mid \theta_d] \;\sim\; \sum_{i \in q} \left( \log P[q_i \mid d] + \log \frac{1}{P[q_i]} \right)$

$\sim \sum_{i \in q} \left( \log \frac{tf(i,d)}{\sum_k tf(k,d)} + \log \frac{\sum_k df(k)}{df(i)} \right)$

i.e. a tf·idf-style scoring over the query terms

slide-53
SLIDE 53

Burstiness and the Dirichlet Model

Problem:

  • Poisson/multinomial underestimate likelihood of doc with high tf
  • bursty word occurrences are not unlikely:
  • rare term may be frequent in doc
  • P[tf>0] is low, but P[tf=10 | tf>0] is high

Solution: two-level model

  • hypergenerator:

to generate doc, first generate word distribution in corpus (parameters of doc-specific generative model)

  • generator:

then generate word frequencies in doc, using doc-specific model

IRDM WS 2015 13-53

slide-54
SLIDE 54

Dirichlet Distribution as Hypergenerator for Two-Level Multinomial Model

MAP (Maximum Posterior) of a Multinomial with Dirichlet prior is again Dirichlet (with different parameter values)
(„Dirichlet is the conjugate prior of the Multinomial")

$P[\theta \mid \alpha] = \frac{\Gamma(\sum_w \alpha_w)}{\prod_w \Gamma(\alpha_w)} \; \prod_w \theta_w^{\alpha_w - 1}$

where $\sum_w \theta_w = 1$, $\theta_w \geq 0$, and $\alpha_w > 0$ for all w, with $\Gamma(x) = \int_0^\infty z^{x-1} e^{-z} \, dz$

[Figure: 3-dimensional examples of Dirichlet and Multinomial distributions for α = (0.44, 0.25, 0.31), α = (1.32, 0.75, 0.93), α = (3.94, 2.25, 2.81)]

(Source: R.E. Madsen et al.: Modeling Word Burstiness Using the Dirichlet Distribution)

IRDM WS 2015 13-54

slide-55
SLIDE 55

Bayesian Viewpoint of Parameter Estimation

  • assume a prior distribution g(θ) of parameter θ
  • choose a statistical model (generative model) f(x | θ) that reflects our beliefs about RV X
  • given RVs X1, ..., Xn for observed data, the posterior distribution is h(θ | x1, ..., xn) for X1=x1, ..., Xn=xn

the likelihood is
$L(x_1 ... x_n, \theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$
which implies
$h(\theta \mid x_1 ... x_n) \;\sim\; L(x_1 ... x_n, \theta) \cdot g(\theta)$
(posterior is proportional to likelihood times prior), with
$h(\theta \mid x_i) = \frac{f(x_i \mid \theta) \, g(\theta)}{\int f(x_i \mid \theta') \, g(\theta') \, d\theta'}$

MAP estimator (maximum a posteriori): compute the θ that maximizes h(θ | x1, ..., xn), given a prior for θ

IRDM WS 2015 13-55

slide-56
SLIDE 56

Dirichlet-Prior Smoothing

Posterior for θ with a Dirichlet distribution as prior:
$M(\theta):\; P[\theta \mid x] = \frac{P[x \mid \theta] \, P[\theta]}{\int P[x \mid \theta'] \, P[\theta'] \, d\theta'} \;\sim\; Dirichlet(\alpha + x)$
with term frequencies x in document d

(Dirichlet is the conjugate prior for the parameters of a multinomial distribution:
a Dirichlet prior implies a Dirichlet posterior, only with different parameter values)

Dirichlet(α):
$f(\theta_1, ..., \theta_m) = \frac{\Gamma(\sum_{j=1..m} \alpha_j)}{\prod_{j=1..m} \Gamma(\alpha_j)} \; \prod_{j=1..m} \theta_j^{\alpha_j - 1}$   with $\sum_{j=1..m} \theta_j = 1$

MAP estimator:
$\hat{p}_j(d) = \hat{\theta}_j = \arg\max_\theta M(\theta) = \frac{|d| \cdot P[j \mid d] + \mu \cdot P[j \mid C]}{|d| + \mu}$
with α_i set to μ·P[i|C] + 1 and μ > 1 set to a multiple of the average document length

IRDM WS 2015 13-56

slide-57
SLIDE 57

Dirichlet-Prior Smoothing (cont‘d)

$\hat{p}_j(d) = \frac{|d| \cdot P[j \mid d] + \mu \cdot P[j \mid C]}{|d| + \mu} = \lambda \, P[j \mid d] + (1-\lambda) \, P[j \mid C]$   with $\lambda = \frac{|d|}{|d| + \mu}$

where α_1 = μ·P[1|C], ..., α_m = μ·P[m|C] are the parameters of the underlying Dirichlet distribution, with constant μ > 1 typically set to a multiple of the average document length, and with MLEs P[j|d] = tf_j / |d| from the doc and P[j|C] from the corpus.

Note 1: conceptually, d is extended by μ terms randomly drawn from the corpus.
Note 2: Dirichlet smoothing thus takes the syntactic form of Jelinek-Mercer smoothing.

IRDM WS 2015 13-57

slide-58
SLIDE 58

Multinomial LM with Dirichlet Smoothing (Final Wrap-Up)

$score(d, q) = P[q \mid d] = \prod_{j \in q} \left( \lambda \, P[j \mid d] + (1-\lambda) \, P[j \mid C] \right) = \prod_{j \in q} \frac{|d| \cdot P[j \mid d] + \mu \cdot P[j \mid C]}{|d| + \mu}$

setting $\lambda = \frac{|d|}{|d| + \mu}$

Multinomial LMs with Dirichlet smoothing are often the best performing approach – the method of choice for ranking.
LMs of this kind are composable building blocks (via probabilistic mixture models).
Can also integrate P[j|R] with a relevance feedback LM, or P[j|U] with a user (context) LM.
IRDM WS 2015 13-58

slide-59
SLIDE 59

Two-Stage Smoothing [Zhai/Lafferty, TOIS 2004]

$P(w \mid d) = (1-\lambda) \cdot \frac{c(w,d) + \mu \cdot p(w \mid Corpus)}{|d| + \mu} + \lambda \cdot p(w \mid Universe)$

Stage 1 (Dirichlet prior, Bayesian): explain unseen words
Stage 2 (2-component mixture): explain noise in the query

Source: Manning/Raghavan/Schütze, lecture12-lmodels.ppt

IRDM WS 2015 13-59

slide-60
SLIDE 60

13.3.3 Extended LMs

large variety of extensions and combinations:

  • N-gram (Sequence) Models and Mixture Models
  • (Semantic) Translation Models
  • Cross-Lingual Models
  • Query-Log- and Click-Stream-based LM
  • Temporal Search
  • LMs for Entity Search
  • LMs for Passage Retrieval for Question Answering

IRDM WS 2015 13-60

slide-61
SLIDE 61

N-Gram and Mixture Models

Mixture of an LM for bigrams and an LM for unigrams, for both docs and queries, aiming to capture query phrases / term dependencies, e.g.: "Bob Dylan cover songs by African singers"
→ query segmentation / query understanding

Mixture models with LMs for unigrams, bigrams, ordered term pairs in a window, unordered term pairs in a window, ...

Parameter estimation needs Big Data
→ tap n-gram web/book collections, query logs, dictionaries, etc.
→ data mining to obtain the most informative correlations

HMM-style models to capture informative N-grams → P[ti | d] ~ P[ti | ti−1] · P[ti−1 | d] ...

IRDM WS 2015 13-61

slide-62
SLIDE 62

(Semantic) Translation Model

$P[q \mid d] = \prod_{j \in q} \sum_{w} P[j \mid w] \, P[w \mid d]$

with word-word translation model P[j|w]

Opportunities and difficulties:

  • synonymy, hypernymy/hyponymy, etc.
  • efficiency
  • training

estimate P[j|w] by overlap statistics on background corpus (Dice coefficients, Jaccard coefficients, etc.)

IRDM WS 2015 13-62

slide-63
SLIDE 63

Translation Models for Cross-Lingual IR

see also the benchmark CLEF: http://www.clef-campaign.org/

$P[q \mid d] = \prod_{j \in q} \sum_{w} P[j \mid w] \, P[w \mid d]$

with q in language F (e.g. French) and d in language E (e.g. English);
needs estimation of P[j|w] from parallel corpora (docs available in both F and E);
can rank docs in E (or F) for queries in F

Example: q: „moteur de recherche" returns d: „Quaero is a French initiative for developing a search engine that can serve as a European alternative to Google ... "

IRDM WS 2015 13-63

slide-64
SLIDE 64

Query-Log-Based LM (User LM)

Idea: for the current query qk, leverage the prior query history Hq = q1 ... qk−1 and the prior click stream Hc = d1 ... dk−1 as background LMs.
Example: qk = „java library" benefits from qk−1 = „python programming"

Mixture model with fixed-coefficient interpolation:
$P[w \mid q_i] = \frac{freq(w, q_i)}{|q_i|}$        $P[w \mid H_q] = \frac{1}{k-1} \sum_{i=1..k-1} P[w \mid q_i]$
$P[w \mid d_i] = \frac{freq(w, d_i)}{|d_i|}$        $P[w \mid H_c] = \frac{1}{k-1} \sum_{i=1..k-1} P[w \mid d_i]$
$P[w \mid H_q, H_c] = \beta \, P[w \mid H_q] + (1-\beta) \, P[w \mid H_c]$
$P[w] = \alpha \, P[w \mid q_k] + (1-\alpha) \, P[w \mid H_q, H_c]$

IRDM WS 2015 13-64

slide-65
SLIDE 65

LM for Temporal Search [K. Berberich et al.: ECIR 2010]

Keyword queries that express temporal interest.
Example: q = „FIFA world cup 1990s" would not retrieve doc d = „France won the FIFA world cup in 1998".

$P[q \mid d] = P[text(q) \mid text(d)] \cdot P[time(q) \mid time(d)]$

Approach:

  • extract temporal phrases from docs
  • normalize temporal expressions
  • split query and docs into text ∪ time

$P[time(q) \mid time(d)] = \prod_{x \in tempexpr(q)} \sum_{y \in tempexpr(d)} P[x \mid y]$
$P[x \mid y] \sim \frac{|x \cap y|}{|x| \cdot |y|}$   plus smoothing, with |x| = end(x) − begin(x)

IRDM WS 2015 13-65

slide-66
SLIDE 66

Entity Search with LM [Nie et al.: WWW’07]

query: keywords → answer: entities
Assume entities are marked in docs by information extraction methods (docs weighted by extraction confidence).
LM(entity e) = probability distribution of the words seen in the context of e

$score(e, q) = \prod_i \left( \lambda \, P[q_i \mid e] + (1-\lambda) \, P[q_i] \right)$   or   $\sim KL(LM(q) \,\|\, LM(e))$

Example query q: „French soccer player Bayern"
candidate entities: e1: Franck Ribery, e2: Manuel Neuer, e3: Kingsley Coman, e4: Zinedine Zidane, e5: Real Madrid
(each with a context LM over phrases such as: French soccer champions, champions league with Bayern, French national team, Equipe Tricolore, played soccer, FC Bayern Munich, Zizou, champions league 2002, Real Madrid, Johan Cruyff, Dutch soccer, world cup best player 2002, won against Bayern)

IRDM WS 2015 13-66

slide-67
SLIDE 67

Language Models for Question Answering (QA)

Use of LMs:

  • Passage retrieval: likelihood of a passage generating the question
  • Translation model: likelihood of an answer generating the question,
    with parameter estimation from a manually compiled question-answer corpus

Pipeline: question → query → passages → answers
  • question-type-specific NL parsing (e.g. factoid questions: who? where? when? ...)
  • finding the most promising short text passages
  • NL parsing and entity extraction

Example: "Where is the Louvre museum located?"
  query: Louvre museum location
  passage: "... The Louvre is the most visited and one of the oldest, largest, and most famous art galleries and museums in the world. It is located in Paris, France. Its address is Musée du Louvre, 75058 Paris cedex 01. ..."
  answer: The Louvre museum is in Paris.

More on QA in Chapter 16 of this course.

IRDM WS 2015 13-67

slide-68
SLIDE 68

Summary of Section 13.3

  • LMs are a clean form of generative models

for docs, corpora, queries:

  • one LM per doc (with doc itself for parameter estimation)
  • likelihood of LM generating query yields ranking of docs
  • for multinomial model: equivalent to ranking by KL (q || d)
  • parameter smoothing is essential:
  • use background corpus, query&click log, etc.
  • Jelinek-Mercer and Dirichlet smoothing perform very well
  • LMs very useful for advanced IR:

cross-lingual, passages for QA, entity search, etc.

IRDM WS 2015 13-68

slide-69
SLIDE 69

Additional Literature for Section 13.3

Statistical Language Models in General:

  • Djoerd Hiemstra: Language Models, Smoothing, and N-grams, in: Encyclopedia
of Database Systems, Springer, 2009
  • ChengXiang Zhai, Statistical Language Models for Information Retrieval,

Morgan & Claypool Publishers, 2008

  • ChengXiang Zhai, Statistical Language Models for Information Retrieval:

A Critical Review, Foundations and Trends in Information Retrieval 2(3), 2008

  • X. Liu, W.B. Croft: Statistical Language Modeling for Information Retrieval,

Annual Review of Information Science and Technology 39, 2004

  • J. Ponte, W.B. Croft: A Language Modeling Approach to Information Retrieval,

SIGIR 1998

  • C. Zhai, J. Lafferty: A Study of Smoothing Methods for Language Models

Applied to Information Retrieval, TOIS 22(2), 2004

  • C. Zhai, J. Lafferty: A Risk Minimization Framework for Information Retrieval,

Information Processing and Management 42, 2006

  • M.E. Maron, J.L. Kuhns: On Relevance, Probabilistic Indexing, and Information

Retrieval, Journal of the ACM 7, 1960

IRDM WS 2015 13-69

slide-70
SLIDE 70

Additional Literature for Section 13.3

LMs for Specific Retrieval Tasks:

  • X. Shen, B. Tan, C. Zhai: Context-Sensitive Information Retrieval Using

Implicit Feedback, SIGIR 2005

  • Y. Lv, C. Zhai: Positional Language Models for Information Retrieval, SIGIR 2009
  • V. Lavrenko, M. Choquette, W.B. Croft: Cross-lingual relevance models. SIGIR‘02
  • D. Nguyen, A. Overwijk, C. Hauff, D. Trieschnigg, D. Hiemstra, F. de Jong:

WikiTranslate: Query Translation for Cross-Lingual Information Retrieval Using Only Wikipedia. CLEF 2008

  • C. Clarke, E.L. Terra: Passage retrieval vs. document retrieval for factoid

question answering. SIGIR 2003

  • Z. Nie, Y. Ma, S. Shi, J.-R. Wen, W.-Y. Ma: Web object retrieval. WWW 2007
  • H. Zaragoza et al.: Ranking very many typed entities on wikipedia. CIKM 2007
  • P. Serdyukov, D. Hiemstra: Modeling Documents as Mixtures of Persons for

Expert Finding. ECIR 2008

  • S. Elbassuoni, M. Ramanath, R. Schenkel, M. Sydow, G. Weikum:

Language-model-based Ranking for Queries on RDF-Graphs. CIKM 2009

  • K. Berberich, O. Alonso, S. Bedathur, G. Weikum: A Language Modeling

Approach for Temporal Information Needs. ECIR 2010

  • D. Metzler, W.B. Croft: A Markov Random Field Model for Term Dependencies.

SIGIR 2005

  • S. Huston, W.B. Croft: A Comparison of Retrieval Models using Term Dependencies.

CIKM 2014

IRDM WS 2015 13-70