CS6200: Information Retrieval
Slides by: Jesse Anderton
Model Divergence Retrieval
LM, session 10
Retrieval With Language Models

There are three obvious ways to perform retrieval using language models:

1. Query Likelihood Retrieval trains a model on the document and estimates the query's likelihood. We've focused on these so far.
2. Document Likelihood Retrieval trains a model on the query and estimates the document's likelihood. Queries are very short, so these seem less promising.
3. Model Divergence Retrieval trains models on both the document and the query, and compares them (the three scoring functions are sketched below).
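As a rough summary, with the choice of smoothing left open, the three approaches can be written as the following scoring functions:

$$\text{Query likelihood: } p(q\,|\,d) = \prod_{w \in q} p(w\,|\,d)$$
$$\text{Document likelihood: } p(d\,|\,q) = \prod_{w \in d} p(w\,|\,q)$$
$$\text{Model divergence: } -D_{KL}\big(p(w|q)\,\|\,p(w|d)\big)$$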
The most common way to compare probability distributions is with Kullback-Leibler ("KL") divergence. This is a measure from Information Theory which can be interpreted as the expected number of bits you would waste if you compressed data distributed according to p as if it were distributed according to q. If p = q, then $D_{KL}(p\,\|\,q) = 0$.
$$D_{KL}(p \,\|\, q) = \sum_e p(e) \log \frac{p(e)}{q(e)}$$
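As a minimal illustration of this formula, the sum can be computed directly; the two distributions below are made-up numbers, and base-2 logarithms give the answer in bits:

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) in bits, for discrete distributions given as dicts of event -> probability.
    Events with p(e) = 0 contribute nothing; q(e) must be > 0 wherever p(e) > 0."""
    return sum(pe * math.log(pe / q[e], 2) for e, pe in p.items() if pe > 0)

p = {"a": 0.5, "b": 0.25, "c": 0.25}
q = {"a": 0.25, "b": 0.25, "c": 0.5}

print(kl_divergence(p, p))  # 0.0  -- identical distributions waste no bits
print(kl_divergence(p, q))  # 0.25 -- bits wasted coding p's data with a code built for q
```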
Model Divergence Retrieval works as follows:

1. Train a language model for the query, p(w|q).
2. Train a language model for the document, p(w|d).
3. Rank documents by the (negative) KL divergence between the two models; a larger divergence means a worse match. This can be simplified to a cross-entropy calculation, as shown below.
$$
D_{KL}\big(p(w|q) \,\|\, p(w|d)\big)
= \sum_w p(w|q) \log \frac{p(w|q)}{p(w|d)}
= \sum_w p(w|q) \log p(w|q) - \sum_w p(w|q) \log p(w|d)
\stackrel{\mathrm{rank}}{=} -\sum_w p(w|q) \log p(w|d)
$$

The first sum does not depend on the document, so it can be dropped without changing the ranking. What remains is a cross-entropy calculation: ranking by negative divergence is the same as ranking by $\sum_w p(w|q) \log p(w|d)$, where higher is better.
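A minimal sketch of this ranking rule, assuming the query and document models have already been estimated as word-to-probability dictionaries (the names are illustrative; the smoothing needed to keep p(w|d) nonzero for query words is discussed below):

```python
import math

def neg_cross_entropy(query_model, doc_model):
    """sum_w p(w|q) * log p(w|d): rank-equivalent to -D_KL(query model || doc model).
    Assumes doc_model gives nonzero probability to every word with p(w|q) > 0."""
    return sum(p_wq * math.log(doc_model[w]) for w, p_wq in query_model.items() if p_wq > 0)

def rank(query_model, doc_models):
    """Return document ids sorted best-first (largest score first)."""
    return sorted(doc_models, key=lambda d: neg_cross_entropy(query_model, doc_models[d]),
                  reverse=True)

# Toy models: d1 gives the query's words more probability mass, so it ranks first.
query_model = {"world": 0.4, "war": 0.4, "one": 0.2}
doc_models = {
    "d1": {"world": 0.05, "war": 0.04, "one": 0.02, "history": 0.89},
    "d2": {"world": 0.001, "war": 0.001, "one": 0.01, "rebellion": 0.988},
}
print(rank(query_model, doc_models))  # ['d1', 'd2']
```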
Model Divergence Retrieval generalizes the Query and Document Likelihood models, and is the most flexible of the three. Any language model can be used for the query or document. They don’t have to be the same. It can help to smooth or normalize them differently. If you pick the maximum likelihood model for the query, this is equivalent to the query likelihood model.
Equivalence to Query Likelihood Model

Pick $p(w|q) := \frac{tf_{w,q}}{|q|}$, the maximum likelihood query model. Then

$$
-D_{KL}\big(p(w|q) \,\|\, p(w|d)\big)
\stackrel{\mathrm{rank}}{=} \sum_w p(w|q) \log p(w|d)
= \frac{1}{|q|} \sum_{w \in q} \log p(w|d)
\stackrel{\mathrm{rank}}{=} \log p(q|d),
$$

so ranking by model divergence with the maximum likelihood query model produces exactly the ranking of the query likelihood model.
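A quick numerical check of this equivalence, using made-up document models: both scores differ only by the constant factor $1/|q|$ per query, so they produce the same ordering.

```python
import math
from collections import Counter

query = ["world", "war", "one"]
doc_models = {  # hypothetical smoothed document models, word -> p(w|d)
    "d1": {"world": 0.05, "war": 0.04, "one": 0.02},
    "d2": {"world": 0.001, "war": 0.001, "one": 0.01},
}

# Maximum likelihood query model: p(w|q) = tf_{w,q} / |q|
p_wq = {w: tf / len(query) for w, tf in Counter(query).items()}

def divergence_score(d):  # sum_w p(w|q) log p(w|d)
    return sum(p * math.log(doc_models[d][w]) for w, p in p_wq.items())

def query_likelihood(d):  # log p(q|d) = sum over query tokens of log p(w|d)
    return sum(math.log(doc_models[d][w]) for w in query)

print(sorted(doc_models, key=divergence_score, reverse=True))  # ['d1', 'd2']
print(sorted(doc_models, key=query_likelihood, reverse=True))  # ['d1', 'd2']
```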
We make the following model choices:

1. The query model is smoothed with a Dirichlet prior (μ = 2) using a background of words used in historical queries (a query log).
2. The document model is smoothed with a Dirichlet prior (μ = 2,000) using a background of words used in documents from the corpus.
Let $qf_w :=$ count of word $w$ in the query log and $cf_w :=$ count of word $w$ in the corpus, with $|Q|$ and $|C|$ the total number of word occurrences in the query log and in the corpus. The smoothed models are

$$
p(w|q; \mu = 2) = \frac{tf_{w,q} + 2 \cdot \frac{qf_w}{|Q|}}{|q| + 2},
\qquad
p(w|d; \mu = 2000) = \frac{tf_{w,d} + 2000 \cdot \frac{cf_w}{|C|}}{|d| + 2000},
$$

and the ranking score is

$$
-D_{KL}\big(p(w|q) \,\|\, p(w|d)\big)
\stackrel{\mathrm{rank}}{=} \sum_w p(w|q) \log p(w|d)
= \sum_w \frac{tf_{w,q} + 2 \cdot \frac{qf_w}{|Q|}}{|q| + 2}
  \log \frac{tf_{w,d} + 2000 \cdot \frac{cf_w}{|C|}}{|d| + 2000}.
$$
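A sketch of this scoring function, assuming the query-log and corpus background statistics are available as dictionaries of raw counts ($qf_w$ and $cf_w$) with totals total_qf and total_cf; the names are illustrative, and the μ values of 2 and 2,000 follow the choices above:

```python
import math
from collections import Counter

def dirichlet(tf, length, bg_count, bg_total, mu):
    """Dirichlet-smoothed estimate: (tf + mu * background probability) / (length + mu)."""
    return (tf + mu * bg_count / bg_total) / (length + mu)

def divergence_score(query_terms, doc_terms, qf, total_qf, cf, total_cf, mu_q=2, mu_d=2000):
    """sum_w p(w|q) log p(w|d), rank-equivalent to -D_KL(query model || doc model).
    Following the slides, the sum is restricted to the query's terms; the Dirichlet
    prior keeps p(w|d) > 0 as long as each query word occurs somewhere in the corpus."""
    q_tf, d_tf = Counter(query_terms), Counter(doc_terms)
    score = 0.0
    for w, tf in q_tf.items():
        p_wq = dirichlet(tf, len(query_terms), qf.get(w, 0), total_qf, mu_q)
        p_wd = dirichlet(d_tf.get(w, 0), len(doc_terms), cf.get(w, 0), total_cf, mu_d)
        score += p_wq * math.log(p_wd, 2)  # base-2 logs, matching the bits interpretation
    return score
```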
Wikipedia: WWI

World War I (WWI or WW1 or World War One), also known as the First World War or the Great War, was a global war centred in Europe that began on 28 July 1914 and lasted until 11 November 1918. More than 9 million combatants and 7 million civilians died as a result of the war, a casualty rate exacerbated by the belligerents' technological and industrial sophistication, and tactical stalemate. It was one of the …

Query: "world war one"

| w | qf_w | cf_w | p(w\|q) | p(w\|d) | Score |
|---|---|---|---|---|---|
| world | 2,500 | 90,000 | 0.202 | 0.002 | -1.891 |
| war | 2,000 | 35,000 | 0.202 | 0.003 | -1.700 |
| one | 6,000 | 50,000,000 | 0.205 | 0.049 | -0.893 |

Each per-term Score is $p(w|q) \log_2 p(w|d)$, computed with the smoothed models defined above (the displayed probabilities are rounded).
Wikipedia: Taiping Rebellion
The Taiping Rebellion was a massive civil war in southern China from 1850 to 1864, against the ruling Manchu Qing dynasty. It was a millenarian movement led by Hong Xiuquan, who announced that he had received visions, in which he learned that he was the younger brother of Jesus. At least 20 million people died, mainly civilians, in one of the deadliest military conflicts in history.
Query: "world war one"

| w | qf_w | cf_w | p(w\|q) | p(w\|d) | Score |
|---|---|---|---|---|---|
| world | 2,500 | 90,000 | 0.202 | 8.75E-05 | -2.723 |
| war | 2,000 | 35,000 | 0.202 | 0.001 | |
| one | 6,000 | 50,000,000 | 0.205 | 0.049 | |

Every query term receives at most as much probability from this document as from the WWI article, so its total score is lower and it ranks below the WWI article for this query.
Ranking by (negative) KL divergence provides a very flexible and theoretically sound retrieval system. You are free to model queries and documents any way you like, so you don't have to assume people use the same linguistic behaviors to write each. Next, we'll see how to use a divergence retrieval model to build a pseudo-relevance feedback method that outperforms the Rocchio algorithm.