Harmony Assumptions: Extending Probability Theory for Information Retrieval (IR) and for Databases (DB) and for Knowledge Management (KM) and for Machine Learning (ML) and for Artificial Intelligence (AI)


SLIDE 1


Harmony Assumptions: Extending Probability Theory for Information Retrieval (IR) and for Databases (DB) and for Knowledge Management (KM) and for Machine Learning (ML) and for Artificial Intelligence (AI)

  • Lernen. Wissen. Daten. Analysen. (LWDA), Potsdam, September 2016
  • Thomas Roelleke, Queen Mary University of London

1 / 25

SLIDE 2

Outline: 17 slides

  1. Outline: 17 slides
  2. Introduction
  3. TF-IDF
  4. TF Quantifications
  5. Harmony Assumptions
  6. Experimental Study: IR and Social Networks
  7. Impact
  8. Summary
  9. Background

SLIDE 3

Introduction: TF-IDF and Probability Theory

  • Probability theory: the independence assumption
    P(sailing, boats, sailing) = P(sailing)^2 · P(boats)
  • Applied in AI, DB, IR, “Big Data”, “Data Science”, and ...
  • TF-IDF: the best-known ranking formula? Known in IR, DB, AI, and other disciplines?
  • TF-IDF and probability theory?
    log(P(sailing, boats, sailing)) = 2 · log(P(sailing)) + ...
  • TF-IDF and LM (language modelling)?


SLIDE 5

Introduction: Why Research on Foundations!?

Research on foundations required for ...
Abstraction: DB+IR+KM+ML: probabilistic logical programming

  # Probabilistic facts and rules are great, BUT ...
  # one needs more expressiveness.
  # For example:
  # P(t|d) = tf_d / doclen
  p_t_d SUM(T,D) :- term_doc(T,D) | (D);

extended probability theory → DB+IR+KM+ML on the road

SLIDE 6

Introduction: The wider picture: Penrose, “Shadows of the Mind”

  • a search for the missing science of consciousness

Preface: dad and daughter enter a cave:

  • “Dad, that boulder at the entrance, if it comes down, we are locked in.”
  • “Well, it stood there the last 10,000 years, so it won’t fall down just now.”
  • “Dad, will it fall down one day?”
  • “Yes.”
  • “So it is more likely to fall down with every day it did not fall down?”

Taxi: on average, 1/6 taxis are free; busy, busy, ... after 7 busy taxis, keep waiting or give up?

SLIDE 7

TF-IDF: Hardcore

TF-IDF:

  RSV_TF-IDF(d, q) := Σ_t TF(t, d) · TF(t, q) · IDF(t)

How can someone spend 10 years looking at the equation? Maybe because of what Norbert Fuhr said: We know why TF-IDF works; we have no idea why LM (language modelling) works.

  RSV_LM(d, q) ∝ P(q|d) / P(q)       !!!
  RSV_TF-IDF(d, q) ∝ P(d|q) / P(d)   ???


SLIDE 9

TF-IDF: Example: Naive TF-IDF

% A document: d1[sailing boats are sailing with other sailing boats in greece ...]

  w_TF-IDF(sailing, d1) = TF(sailing, d1) · IDF(sailing) = 3 · log(1000/10) = 3 · 2 = 6
  w_TF-IDF(boats, d1) = TF(boats, d1) · IDF(boats) = 2 · log(1000/1) = 2 · 3 = 6

NOTE: w_TF-IDF(sailing, d1) = w_TF-IDF(boats, d1)
Both terms have the same impact on the score of d1! The rare term should have MORE impact than the frequent one!
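The arithmetic above can be checked in a few lines; a minimal sketch (assuming base-10 logarithms, as the slide's log(1000/10) = 2 implies; the function name is mine):

```python
import math

def w_tf_idf(tf, df, n_docs):
    """Naive TF-IDF: raw term frequency times log10(N/df)."""
    return tf * math.log10(n_docs / df)

# d1: "sailing" occurs 3 times (df = 10), "boats" twice (df = 1); N = 1000 docs.
w_sailing = w_tf_idf(3, 10, 1000)  # 3 * 2 = 6
w_boats = w_tf_idf(2, 1, 1000)     # 2 * 3 = 6
print(w_sailing == w_boats)        # True: the rare term gets no extra impact
```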

SLIDE 10

TF Quantifications: Theoretical Justifications!?!?

TF(t, d) :=
  tf_d                   total TF: independence!
  1 + log(tf_d)          log TF: dependence?
  log(tf_d + 1)          another log TF
  tf_d / (tf_d + K_d)    BM25 TF: dependence? K_d: pivoted document length; K_d > 1 for long documents
  ...

Experimental results:

  • log-TF much better than total TF (ltc, [Lewis, 1998])
  • BM25-TF better than log-TF

Theoretical results? Why? How? What for?
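The three main quantifications can be compared directly; a small sketch (function names are mine, and K_d is passed as a plain parameter rather than derived from the document length):

```python
import math

def tf_total(tf):
    """Total TF: the raw count; corresponds to full independence."""
    return tf

def tf_log(tf):
    """Log TF: 1 + log(tf), dampening repeated occurrences."""
    return 1 + math.log(tf) if tf > 0 else 0.0

def tf_bm25(tf, K=1.0):
    """BM25 TF: tf/(tf + K), saturating towards 1; K models the pivoted doc length."""
    return tf / (tf + K)

# How much does seeing a term 10 times weigh relative to seeing it once?
print(tf_total(10) / tf_total(1))          # 10.0
print(round(tf_log(10) / tf_log(1), 2))    # 3.3
print(round(tf_bm25(10) / tf_bm25(1), 2))  # 1.82
```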

SLIDE 11

TF Quantifications: BM25-TF

[Figure: BM25-TF saturation curves over n_L(t,d) from 1 to 20, for K = 1, 2, 5, 10]

  TF_BM25(t, d) := tf_d / (tf_d + K_d)

SLIDE 12

TF Quantifications: Example: BM25-TF

Remember Naive TF-IDF? Now, try BM25-TF-IDF:

  w_BM25-TF-IDF(sailing, d1) = 3/(3+1) · log(1000/10) = 3/4 · 2 = 1.5
  w_BM25-TF-IDF(boats, d1) = 2/(2+1) · log(1000/1) = 2/3 · 3 = 2

IMPORTANT: w_BM25-TF-IDF(sailing, d1) < w_BM25-TF-IDF(boats, d1)
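Re-running the earlier example with the saturated TF reverses the ranking of the two terms; a minimal sketch (again assuming base-10 logs and K_d = 1; the function name is mine):

```python
import math

def w_bm25_tf_idf(tf, df, n_docs, K=1.0):
    """BM25-TF times IDF: tf/(tf + K) * log10(N/df)."""
    return tf / (tf + K) * math.log10(n_docs / df)

w_sailing = w_bm25_tf_idf(3, 10, 1000)  # (3/4) * 2 = 1.5
w_boats = w_bm25_tf_idf(2, 1, 1000)     # (2/3) * 3 = 2
print(w_sailing < w_boats)              # True: the rare term now has more impact
```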

SLIDE 13

TF Quantifications: Series-based Explanations

Series-based explanations of the TF quantifications:

  TF_total:  tf_d = 1 + 1 + ... + 1
  TF_log:    1 + log(tf_d) ≈ 1 + 1/2 + ... + 1/tf_d
  TF_BM25:   tf_d / (tf_d + 1) = 1/2 · (1 + 1/(1+2) + ... + 1/(1+2+...+tf_d))
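The BM25 line is an exact identity, not an approximation: half the sum of reciprocal triangular numbers telescopes to tf/(tf+1). A quick check with exact rationals (function name is mine):

```python
from fractions import Fraction

def half_reciprocal_triangular_sum(n):
    """(1/2) * (1/1 + 1/(1+2) + ... + 1/(1+2+...+n)), in exact arithmetic."""
    total, triangular = Fraction(0), 0
    for k in range(1, n + 1):
        triangular += k          # k-th triangular number 1+2+...+k
        total += Fraction(1, triangular)
    return total / 2

for n in range(1, 30):
    assert half_reciprocal_triangular_sum(n) == Fraction(n, n + 1)  # = tf/(tf+1)
print(half_reciprocal_triangular_sum(3))  # 3/4
```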
SLIDE 14

Harmony Assumptions

FORGET Information Retrieval ... BACK TO Probability Theory

SLIDE 15

Harmony Assumptions

  P(k × sailing, ...) = 1/Ω · P(sailing)^k = 1/Ω · P(sailing)^(1+1+...+1)

  P_α(k × sailing, ...) = 1/Ω · P(sailing)^(1 + 1/2^α + ... + 1/k^α)

  • independent: α = 0
  • square-root-harmonic: α = 0.5
  • naturally harmonic: α = 1
  • square-harmonic: α = 2
  • ...
  • Ω: later
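Leaving the normaliser Ω aside, the α-harmonic probability simply replaces the independent exponent k by a generalised harmonic sum; a sketch (function name is mine):

```python
def harmonic_exponent(k, alpha):
    """1 + 1/2**alpha + ... + 1/k**alpha; alpha = 0 recovers the exponent k."""
    return sum(1 / i ** alpha for i in range(1, k + 1))

p = 0.5  # single event probability P(sailing)
for alpha in (0, 0.5, 1, 2):
    # P_alpha(k x sailing) up to the factor 1/Omega, for k = 3 occurrences;
    # larger alpha means a smaller exponent, hence a larger probability.
    print(alpha, p ** harmonic_exponent(3, alpha))
```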

SLIDE 16

Harmony Assumptions: The Main Harmony Assumptions

  assumption name    assumption function af(n)                        description / comment
  zero harmony       1 + 1/2^0 + ... + 1/n^0                          independence: 1+1+1+...+1
  natural harmony    1 + 1/2 + ... + 1/n                              harmonic sum
  alpha-harmony      1 + 1/2^α + ... + 1/n^α                          generalised harmonic sum
  sqrt harmony       1 + 1/2^(1/2) + ... + 1/n^(1/2)                  α = 1/2; divergent
  square harmony     1 + 1/2^2 + ... + 1/n^2                          α = 2; convergent: π^2/6 ≈ 1.645
  Gaussian harmony   2 · n/(n+1) = 1 + 1/(1+2) + ... + 1/(1+...+n)    explains the BM25-TF tf_d/(tf_d + pivdl)
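The table's assumption functions are all instances of one generalised harmonic sum; a sketch verifying the boundary cases (the 10**6 cutoff for the convergence check is an arbitrary choice of mine):

```python
import math

def af(n, alpha):
    """Assumption function: 1 + 1/2**alpha + ... + 1/n**alpha."""
    return sum(1 / k ** alpha for k in range(1, n + 1))

assert af(5, 0) == 5                                   # zero harmony: independence
assert abs(af(4, 1) - (1 + 1/2 + 1/3 + 1/4)) < 1e-12   # natural harmony: harmonic sum
assert abs(af(10**6, 2) - math.pi**2 / 6) < 1e-5       # square harmony -> pi^2/6 ~ 1.645
print("all harmony checks passed")
```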

SLIDE 17

Harmony Assumptions: Illustration

  • independent: α = 0:          0.5 · 0.5 = 0.25
  • sqrt-harmonic: α = 1/2:      0.5 · 0.5^(1/√2) ≈ 0.306
  • naturally harmonic: α = 1:   0.5 · 0.5^(1/2) ≈ 0.353

The area of each circle corresponds to the single event probability: p = 0.5. The overlap becomes larger for growing α (harmony).

SLIDE 18

Experimental Study: IR and Social Networks: Data & Test

“Africa” in TREC-3: 742,611 = 734,078 + 8,533

  k                            0         1        2        3       4       5       6       7       8
  documents                    734,078   4,584    1,462    809     550     345     271     182     137
  P_obs                        0.9885    0.0062   0.0019   0.0011  0.0007  0.0005  0.0004  0.0002  0.0002
  P_binomial                   0.9738    0.0258   0.0003
  P_alpha-harmonic, α=0.41     0.9787    0.018    0.0023   0.0005  0.0002  0.0001

The binomial assumes independence:

  • P_binomial(1) > P_obs(1)!
  • P_binomial(2) < P_obs(2)!
  • P_binomial(3) = 0!

SLIDE 19

Experimental Study: IR and Social Networks: Distribution of α’s

[Figure: two plots of the distribution of the fitted harmonic alpha — % of terms (topical IR case) and % of users (social network case) — with markers for independence, sqrt-harmony, and natural harmony]

Distribution of alpha’s: for many terms, 0.3 ≤ α ≤ 0.8. Sqrt-harmony appears to be a good default assumption.

SLIDE 20

Impact

Extended probability theory: applicable in DB+IR+KM+ML and other disciplines where probabilities and ranking are involved. DB+IR+KM+ML: a new generation.

  w_BM25(Term,Doc) :- tf_d(Term,Doc) | BM25 & piv_dl(Doc);
  # w_BM25: a probabilistic variant of the BM25-TF weight.
  # What to add for modelling ranking algorithms (TF-IDF, BM25, LM, DFR)?
  # What makes engineers happy???

[Frommholz and Roelleke, 2016]: DB Spektrum

SLIDE 21

Summary

The independence assumption: easy and scales, BUT ...!!!
Many disciplines rely on probability theory.
Between disjointness and subsumption, there is more than independence. For example:

  • Natural harmony: log_2(k + 1)
  • Gaussian harmony: 2 · k/(k + 1)
  • BM25-TF: 2 · tf_d/(tf_d + 1) = 1 + 1/(1+2) + ... + 1/(1+2+...+tf_d)

Harmony Assumptions: a link between TF-IDF and probability theory

SLIDE 22

Summary

Other theories to model dependencies? Questions?

SLIDE 23

Background

  • [Fagin and Halpern, 1994]: Reasoning about Knowledge and Probabilities
  • [Church and Gale, 1995a, Church and Gale, 1995b]: IDF ...
  • [Fuhr and Roelleke, 1997]: PRA (bibdb: Fuhr/Roelleke:94! 3 years!)
  • [Lewis, 1998]: Naive Bayes at Forty: The Independence Assumption in Information Retrieval
  • [Roelleke, 2003]: The Probability of Being Informative ... idf/maxidf
  • [Robertson, 2004]: On theoretical arguments for IDF
  • [Robertson, 2005]: Event spaces
  • [Roelleke and Wang, 2006, Roelleke and Wang, 2008]: ...
  • [Roelleke et al., 2008]: The Relational Bayes: ...
  • [Roelleke et al., 2013]: Modelling Ranking Algorithms in PDatalog
  • [Roelleke, 2013]: IR Models: Foundations & Relationships
  • [Roelleke et al., 2015]: Harmony Assumptions in IR and Social Networks
  • [Frommholz and Roelleke, 2016]: Scalable DB+IR Tech: ProbDatalog with HySpirit

A red thread between IR theory and abstraction for DB+IR

SLIDE 24

Background

Church, K. and Gale, W. (1995a). Inverse document frequency (idf): A measure of deviation from Poisson. In Proceedings of the Third Workshop on Very Large Corpora, pages 121–130.

Church, K. and Gale, W. (1995b). Poisson mixture. Natural Language Engineering, 1(2):163–190.

Fagin, R. and Halpern, J. (1994). Reasoning about knowledge and probability. Journal of the ACM, 41(2):340–367.

Frommholz, I. and Roelleke, T. (2016). Scalable DB+IR technology: Processing probabilistic datalog with HySpirit. Datenbank-Spektrum, 16(1):39–48.

Fuhr, N. and Roelleke, T. (1997). A probabilistic relational algebra for the integration of information retrieval and database systems. ACM Transactions on Information Systems, 14(1):32–66.

Lewis, D. D. (1998). Naive (Bayes) at forty: The independence assumption in information retrieval. In ECML ’98: Proceedings of the 10th European Conference on Machine Learning, pages 4–15, London, UK. Springer-Verlag.

Robertson, S. (2004). Understanding inverse document frequency: On theoretical arguments for idf. Journal of Documentation, 60:503–520.

Robertson, S. (2005). On event spaces and probabilistic models in information retrieval.

SLIDE 25

Background

Information Retrieval Journal, 8(2):319–329.

Roelleke, T. (2003). A frequency-based and a Poisson-based probability of being informative. In ACM SIGIR, pages 227–234, Toronto, Canada.

Roelleke, T. (2013). Information Retrieval Models: Foundations and Relationships. Synthesis Lectures on Information Concepts, Retrieval, and Services. Morgan & Claypool Publishers.

Roelleke, T., Bonzanini, M., and Martinez-Alvarez, M. (2013). On the modelling of ranking algorithms in probabilistic datalog. In Proceedings of the 7th International Workshop on Ranking in Databases (DBRank). ACM.

Roelleke, T., Kaltenbrunner, A., and Baeza-Yates, R. A. (2015). Harmony assumptions in information retrieval and social networks. Comput. J., 58(11):2982–2999.

Roelleke, T. and Wang, J. (2006). A parallel derivation of probabilistic information retrieval models. In ACM SIGIR, pages 107–114, Seattle, USA.

Roelleke, T. and Wang, J. (2008). TF-IDF uncovered: A study of theories and probabilities. In ACM SIGIR, pages 435–442, Singapore.

Roelleke, T., Wu, H., Wang, J., and Azzam, H. (2008). Modelling retrieval models in a probabilistic relational algebra with a new operator: The relational Bayes. VLDB Journal, 17(1):5–37.