Score Distribution Models
Evangelos Kanoulas, Keshi Dai, Virgil Pavlu, Javed Aslam

Score Distributions
Ranked retrieval scores: 9.6592, 9.5761, 9.4919, 9.4784, 9.2693, 9.2066, 9.1407, 9.0824, 9.0110, 9.0084, 8.9826, 8.9351
Relevance marks, in extracted order: ✓ ✓ ✗ ✗ ✗ ✓ ✗ ✓ ✗ ✗ ✗ ✓

Score Distributions
• Applications: score normalization for multiple sources
  – Information Filtering (e.g. news retrieval)
  – Recall-oriented IR (e.g. legal, patent IR)
  – Distributed IR (multiple data collections)
  – Diversity/Faceted IR (news, images, video, web pages, feeds)
  – Meta-search
• To be useful, score distribution models must be reasonably accurate

Modeling Score Distributions
• Modeling score distributions is key to inference
• EM is used to fit the model to the data
• Dozens of models in the literature
  – Negative Exponential (nonrel) & Gaussian (rel)
  – Gamma & Gaussian
  – 2 Poisson
  – 2 Gaussian
  – …
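As a rough illustration of the "EM to fit the model" step, the sketch below fits the first model in the list (negative exponential for non-relevant scores, Gaussian for relevant scores) to a pool of unlabeled scores. The function names, the starting values, and the iteration count are assumptions made for this example; the slides do not specify an implementation.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def exp_pdf(x, rate):
    # Negative exponential density, defined for non-negative scores only.
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def fit_exp_gauss(scores, iters=100):
    """EM for pi * Normal(mu, sigma) + (1 - pi) * Exponential(rate)."""
    n = len(scores)
    ranked = sorted(scores, reverse=True)
    top = ranked[: max(1, n // 10)]
    mean_all = sum(scores) / n
    # Crude starting point (an assumption): the top 10% of scores seed the
    # Gaussian and the overall mean seeds the exponential.
    pi = 0.1
    mu = sum(top) / len(top)
    sigma = max(1e-3, math.sqrt(sum((x - mean_all) ** 2 for x in scores) / n))
    rate = 1.0 / mean_all
    for _ in range(iters):
        # E-step: posterior probability that each score came from the Gaussian.
        resp = []
        for x in scores:
            g = pi * normal_pdf(x, mu, sigma)
            e = (1.0 - pi) * exp_pdf(x, rate)
            resp.append(g / (g + e) if g + e > 0 else 0.5)
        # M-step: weighted maximum-likelihood updates.
        w_rel = sum(resp)
        w_non = n - w_rel
        pi = w_rel / n
        mu = sum(r * x for r, x in zip(resp, scores)) / w_rel
        var = sum(r * (x - mu) ** 2 for r, x in zip(resp, scores)) / w_rel
        sigma = max(1e-3, math.sqrt(var))
        rate = w_non / sum((1.0 - r) * x for r, x in zip(resp, scores))
    return pi, mu, sigma, rate


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic scores: 2000 non-relevant (exponential), 500 relevant (Gaussian).
    data = [rng.expovariate(1.0) for _ in range(2000)]
    data += [rng.gauss(6.0, 1.0) for _ in range(500)]
    print(fit_exp_gauss(data))
```

The same E-step/M-step structure carries over to the other pairs in the list; only the two density functions and their weighted updates change.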

Motivation
• What is wrong with Neg. Exponential & Gaussian?
  – It simply does not fit the data
  – Undesirable IR properties

Our work (some previous)
• New model
  – Theoretical basis
  – Fits the data better
• Focus on getting it right rather than making it simple

Overview
• Many related problems
  – hardest: modeling [TREC] relevant documents
• This talk: three of these problems
  – Theory
  – BM25 and LM
  – Relevant docs score distribution via PR curves

1. DL/TF variable: A case for a Gamma-mixture-based distribution model

Why DL/TF
• BM25
• LM

Quality classes and term frequency
• Quality class = set of documents for which query terms are consistently “generated” by a Poisson process
  – can model aspects/facets, doc types, etc.
• Distance between term occurrences = waiting time between Poisson events
[Figure: timeline with term occurrences 1, 2, 3, 4 along the time axis, marking the waiting times (exponentially distributed) and the average waiting time]

DL/TF variable
• θ = average waiting time between terms
  – depends on class quality Q, query generality (hardness) g, collection size, etc.
• ADL = average document length
• For each class, model the DL/TF variable separately for each TF value k
  – DL = sum of waiting times
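The "DL = sum of waiting times" step can be checked by simulation. A sum of k independent exponential waiting times with mean θ follows a Gamma(shape k, scale θ), so DL/TF for a fixed TF value k follows a Gamma(shape k, scale θ/k), with mean θ and variance θ²/k. The sketch below is a minimal check under that Poisson-process assumption; the function name and the parameter values are illustrative only.

```python
import random
import statistics


def simulate_dl_over_tf(k, theta, n_docs, rng):
    """Sample DL/TF for documents with TF = k, where DL is the sum of k
    exponential waiting times, each with mean theta."""
    samples = []
    for _ in range(n_docs):
        dl = sum(rng.expovariate(1.0 / theta) for _ in range(k))
        samples.append(dl / k)
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    theta = 50.0
    for k in (1, 2, 4, 8):
        xs = simulate_dl_over_tf(k, theta, 20000, rng)
        # Theory: mean theta, variance theta**2 / k.
        print(k, statistics.mean(xs), statistics.pvariance(xs), theta ** 2 / k)
```

The mean stays at θ for every k while the spread shrinks as k grows, which is why the slide models each TF value separately.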

Mixture over TF values
• P_Q[·] = geometric mixture over TF values (k = 1, 2, 3, 4, ...) with rate 1-p
  – example: relevant class p=0.1
  – nonrelevant class p=0.7
  – avg TF = mean(P_Q) = 1/p
• Model DL/TF as a mixture of gammas
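The "rate 1-p" and "avg TF = 1/p" statements are consistent with the geometric distribution on k = 1, 2, 3, ... with P[k] = p(1-p)^(k-1). That parameterization is an assumption here, since the slide does not write the formula out. A small numeric check for the two example classes:

```python
def geometric_pmf(k, p):
    """P[TF = k] = p * (1 - p)**(k - 1) for k = 1, 2, 3, ..."""
    return p * (1.0 - p) ** (k - 1)


def mean_tf(p, k_max=10000):
    """Mean of the geometric distribution, by direct (truncated) summation."""
    return sum(k * geometric_pmf(k, p) for k in range(1, k_max + 1))


if __name__ == "__main__":
    for label, p in (("relevant", 0.1), ("nonrelevant", 0.7)):
        print(label, p, mean_tf(p), 1.0 / p)
```

Under this reading, the relevant class (p=0.1) has a much heavier tail over TF values than the nonrelevant class (p=0.7).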

DL/TF per quality class
• For a geometric P[], the mixture is actually a single gamma
• Multiple query terms: requires a proportionality
  – usually not achievable in practice
  – but can be approximated by a gamma with higher “shape”
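A closely related and well-known collapse can be checked by simulation: if TF is geometric on 1, 2, 3, ... with parameter p, and DL is the sum of TF exponential waiting times with mean θ, then DL itself is exponential with mean θ/p, i.e. a single gamma with shape 1. The sketch below illustrates that fact only; it is not claimed to be the exact derivation behind the slide, and the names and parameter values are hypothetical.

```python
import math
import random
import statistics


def sample_dl(p, theta, rng):
    """Draw TF from a geometric on 1, 2, ..., then DL as the sum of TF
    exponential waiting times with mean theta."""
    k = 1
    while rng.random() >= p:
        k += 1
    return sum(rng.expovariate(1.0 / theta) for _ in range(k))


def summarize(p, theta, n, rng):
    xs = [sample_dl(p, theta, rng) for _ in range(n)]
    mean = statistics.mean(xs)
    std = statistics.pstdev(xs)
    # For an exponential with mean m: std equals m and P(X > m) = exp(-1).
    tail = sum(1 for x in xs if x > theta / p) / n
    return mean, std, tail


if __name__ == "__main__":
    print(summarize(0.3, 10.0, 20000, random.Random(0)), 10.0 / 0.3, math.exp(-1))
```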

Gamma mixture for DL/TF
• mixture
• approximate with a single gamma
[Figure: empirical histogram of DL/TF with the MLE gamma fit; x-axis: DL/TF (0 to 1000)]
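A gamma fit of the kind shown in the figure can be sketched without external libraries. The exact maximum-likelihood shape has no closed form, so the example below uses a standard closed-form approximation to the MLE shape (based on s = log(mean) − mean(log)) and then sets scale = mean / shape. The approximation and the function name are choices made for this sketch, not necessarily what the authors used.

```python
import math
import random


def fit_gamma(xs):
    """Approximate maximum-likelihood gamma fit for positive samples.

    Returns (shape, scale). The shape uses a closed-form approximation to
    the MLE; the scale then follows from the exact relation mean = shape * scale.
    """
    n = len(xs)
    mean = sum(xs) / n
    mean_log = sum(math.log(x) for x in xs) / n
    s = math.log(mean) - mean_log  # positive unless all samples are equal
    shape = (3.0 - s + math.sqrt((s - 3.0) ** 2 + 24.0 * s)) / (12.0 * s)
    scale = mean / shape
    return shape, scale


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic DL/TF values drawn from Gamma(shape=2, scale=100).
    samples = [rng.gammavariate(2.0, 100.0) for _ in range(20000)]
    print(fit_gamma(samples))
```

A Newton refinement on the shape would tighten the estimate further if needed.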

Score Transformations
• r = non-decreasing differentiable function
• f(X) = distribution modeled
  – Many basic transformations preserve gamma-like distribution shape

Score Transform: Inversion
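For the inversion Y = 1/X (for example TF/DL as the inverse of DL/TF), the standard change-of-variables rule gives the density f_Y(y) = f_X(1/y) / y². The sketch below applies it to a gamma-distributed X and checks numerically that P(a < Y < b) equals P(1/b < X < 1/a). The gamma parameters are arbitrary illustrative values.

```python
import math


def gamma_pdf(x, shape, scale):
    if x <= 0:
        return 0.0
    return x ** (shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale ** shape)


def inverted_pdf(y, shape, scale):
    """Density of Y = 1/X when X ~ Gamma(shape, scale): f_X(1/y) / y**2."""
    if y <= 0:
        return 0.0
    return gamma_pdf(1.0 / y, shape, scale) / (y * y)


def integrate(f, a, b, steps=2000):
    """Midpoint-rule numerical integration of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    shape, scale = 3.0, 100.0
    p_y = integrate(lambda y: inverted_pdf(y, shape, scale), 0.002, 0.01)
    p_x = integrate(lambda x: gamma_pdf(x, shape, scale), 100.0, 500.0)
    print(p_y, p_x)  # the two probabilities should agree
```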

Score Transformations
• Saturators (Robertson's TF) can make the distribution more “hill”-like
[Figure, top: the saturator r (Robertson's TF) plotted against TF (0 to 20) for k1=1, k1=3, k1=5]
[Figure, bottom: frequency histograms of BM25 scores (0 to 6) for k1=1, k1=3, k1=5]
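The slide does not give the saturator's formula. A commonly used form of Robertson's TF, assumed here, is r(tf) = tf·(k1 + 1) / (tf + k1), which equals 1 at tf = 1 and approaches k1 + 1 as tf grows. That limit is consistent with the vertical range of the top plot.

```python
def robertson_tf(tf, k1):
    """Saturating TF transform (assumed form): tf * (k1 + 1) / (tf + k1)."""
    return tf * (k1 + 1.0) / (tf + k1)


if __name__ == "__main__":
    for k1 in (1, 3, 5):
        # Values at a few TF points; each curve flattens toward k1 + 1.
        print(k1, [round(robertson_tf(tf, k1), 3) for tf in (1, 2, 5, 10, 20)])
```

Because large TF values are squeezed toward the ceiling k1 + 1, mass that was spread along a long tail piles up near the top of the score range, which is one way the transformed distribution becomes more hill-like.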

2. Popular retrieval functions: BM25 and LM

Three fits
• Theory models
  – Mixture of gammas inverted, score transformations
• Data-driven approach
  – maximum likelihood gamma fit
• Analytical approach
  – Traditional ranking functions: TF-IDF, BM25, LM
  – Make basic assumptions about low-level components
  – Derive score distribution

Analytical Approach: BM25
[Figure: query terms “Ireland”, “Peace”, “Talks” combined into a BM25 score]

BM25
• X = DL/TF
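One way to see why X = DL/TF is the natural variable: in the usual BM25 term-frequency component, tf·(k1 + 1) / (tf + k1·(1 − b + b·DL/ADL)), setting b = 1 and dividing through by tf leaves (k1 + 1) / (1 + k1·X/ADL), a function of X alone. The sketch below checks that identity numerically. The choice b = 1, the default k1, and the omission of the IDF factor are simplifying assumptions for illustration.

```python
def bm25_tf_component(tf, dl, adl, k1=1.2, b=1.0):
    """Standard BM25 TF component for one term (IDF factor omitted)."""
    norm = k1 * (1.0 - b + b * dl / adl)
    return tf * (k1 + 1.0) / (tf + norm)


def bm25_from_x(x, adl, k1=1.2):
    """The same component with b = 1, written in terms of X = DL/TF only."""
    return (k1 + 1.0) / (1.0 + k1 * x / adl)


if __name__ == "__main__":
    adl = 300.0
    # Two documents with different TF and DL but the same DL/TF ratio.
    for tf, dl in ((3, 450.0), (6, 900.0)):
        print(bm25_tf_component(tf, dl, adl), bm25_from_x(dl / tf, adl))
```

With b < 1 the component also depends on TF directly, so the reduction to X is exact only in the fully length-normalized case.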

BM25
[Figure: BM25 score histogram (x-axis: BM25 score, 0 to 6; y-axis: frequency) overlaid with three curves: analytical (numerical), MLE gamma fit, and model (theory)]

Analytical Approach: LM
Query: Ireland Peace Talks
Terms: ireland, 6.155, 7698 docs; peac, 3.876, 35454 docs; talk, 2.777, 70795 docs
[Figure: per-term histograms, one column per term, with panels for TF; Normalized TF; log(Normalized TF); log(Normalized TF + CTF/TN); log(lambda*Normalized TF + (1 − lambda)*CTF/TN); followed by the histogram of the combined Language Model score]
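The last per-term panel label, log(lambda*Normalized TF + (1 − lambda)*CTF/TN), is the Jelinek-Mercer smoothed term score. A minimal sketch follows, assuming Normalized TF means TF/DL, CTF is the term's collection frequency, and TN is the total number of tokens in the collection. These readings, the value of lambda, and all numbers in the usage example are assumptions, not taken from the slides.

```python
import math


def jm_term_score(tf, dl, ctf, tn, lam=0.5):
    """log(lambda * TF/DL + (1 - lambda) * CTF/TN) for one query term."""
    return math.log(lam * tf / dl + (1.0 - lam) * ctf / tn)


def jm_query_score(doc_tfs, dl, coll_tfs, tn, lam=0.5):
    """Sum of per-term scores over the query terms listed in coll_tfs."""
    return sum(
        jm_term_score(doc_tfs.get(term, 0), dl, ctf, tn, lam)
        for term, ctf in coll_tfs.items()
    )


if __name__ == "__main__":
    # Hypothetical collection statistics for the three stemmed query terms.
    coll = {"ireland": 20000, "peac": 90000, "talk": 150000}
    doc = {"ireland": 4, "peac": 2}  # "talk" is absent from this document
    print(jm_query_score(doc, dl=400, coll_tfs=coll, tn=100_000_000))
```

The smoothing term keeps the logarithm finite for query terms that do not occur in the document, which is why the final panels have no mass at minus infinity.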

LM (Jelinek-Mercer smoothing)
[Figure: histogram of LM (Jelinek-Mercer smoothing) scores (x-axis: −7 to −1; y-axis: frequency) overlaid with three curves: analytical (numerical), MLE gamma fit, and model (theory)]

3. Inferring the relevant distribution using a Precision-Recall model

Precision-Recall curves
• Model
[Figure: precision-recall curves for various values of rp; x-axis: recall (0 to 1), y-axis: precision (0 to 1)]

Score Distributions for Relevant Docs
• Previous work
• Input:
  – Score distribution of relevant documents
  – Score distribution of non-relevant documents
• Output:
  – PR-curve model
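The input/output relationship listed under previous work can be sketched directly: given the two score distributions and the fraction of relevant documents, each score threshold t yields one point on the PR curve, with recall = P(score > t | relevant) and precision = π·recall / (π·recall + (1 − π)·P(score > t | non-relevant)). The example below uses a Gaussian for relevant and a negative exponential for non-relevant scores purely for illustration; the prior π and all parameter values are hypothetical.

```python
import math


def gaussian_sf(t, mu, sigma):
    """P(score > t) for a Gaussian score distribution."""
    return 0.5 * math.erfc((t - mu) / (sigma * math.sqrt(2.0)))


def exponential_sf(t, rate):
    """P(score > t) for a negative exponential score distribution."""
    return math.exp(-rate * t) if t > 0 else 1.0


def pr_point(t, prior_rel, mu, sigma, rate):
    """(recall, precision) when retrieving every document scoring above t."""
    recall = gaussian_sf(t, mu, sigma)
    fallout = exponential_sf(t, rate)
    retrieved = prior_rel * recall + (1.0 - prior_rel) * fallout
    precision = prior_rel * recall / retrieved if retrieved > 0 else 1.0
    return recall, precision


if __name__ == "__main__":
    # Sweep the threshold from high to low to trace the curve.
    for t in (10.0, 8.0, 6.0, 4.0, 2.0, 0.0):
        print(t, pr_point(t, prior_rel=0.1, mu=6.0, sigma=1.0, rate=1.0))
```

Going the other way, from a PR-curve model back to the relevant-document score distribution, amounts to inverting this mapping.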
