
Pseudo-Relevance Feedback CS6200: Information Retrieval Slides by: Jesse Anderton

Pseudo-Relevance Feedback

If we assume the first k documents are relevant, we can update our query to find more relevant documents.

Rocchio’s Algorithm for VSMs takes a linear combination of the original query and the set F of documents labeled as relevant:

$$\vec{q}\,' = \alpha \cdot \vec{q} + \frac{\beta}{|F|} \sum_{\vec{d} \in F} \vec{d}$$

[Figure: vector-space diagram. Legend: ideal query, initial query, relevant doc, non-relevant doc.]

How can we update this for language models?
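As a rough illustration of the Rocchio update with positive feedback only, the sketch below adds a scaled centroid of the feedback documents to the original query. The sparse-dict vector representation, the function name `rocchio_update`, and the weights `alpha` and `beta` are assumptions made for this example, not values taken from the slides.

```python
from collections import Counter

def rocchio_update(query, feedback_docs, alpha=1.0, beta=0.75):
    """Return alpha * query + (beta / |F|) * sum of the feedback vectors.

    Vectors are sparse dicts mapping term -> weight. The default alpha and
    beta are illustrative assumptions only.
    """
    updated = Counter()
    for term, weight in query.items():
        updated[term] += alpha * weight
    if feedback_docs:
        scale = beta / len(feedback_docs)
        for doc in feedback_docs:
            for term, weight in doc.items():
                updated[term] += scale * weight
    return dict(updated)

# Toy example: the top-2 documents are assumed relevant.
query = {"jaguar": 1.0}
top_k = [{"jaguar": 0.5, "cat": 0.5}, {"jaguar": 0.5, "habitat": 1.0}]
new_query = rocchio_update(query, top_k, alpha=1.0, beta=0.5)
print(new_query)
```

The updated query keeps its original term and picks up terms such as "cat" and "habitat" from the feedback set, which is how feedback expands an under-specified query.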

Relevance Feedback with LM

A natural way to incorporate feedback documents into a query language model is to create a generative model of feedback documents, and smooth the query model together with it. This generates an updated query model for use in Model Divergence Retrieval.

1. Generate query model p(w | q).
2. Pick top k ranking documents as feedback set F.
3. Smooth query model together with feedback model, obtaining p(w | q, F).
4. Rank documents using p(w | q, F) as query model and display results.
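The four steps can be sketched end to end on a toy collection. Everything named here is hypothetical: the helper names, the linear-interpolation smoothing of document models, the tiny floor for unseen terms, and the placeholder feedback model (a plain average of the feedback documents' maximum-likelihood models) are assumptions chosen to keep the example short, not the method from the slides.

```python
import math
from collections import Counter

def mle(tokens):
    """Maximum-likelihood unigram model: term -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def score(query_model, doc_tokens, corpus_model, mix=0.5):
    """Negative cross-entropy, rank-equivalent to -KL(query || document).
    The document model is smoothed with the corpus model (an assumption)."""
    doc_model = mle(doc_tokens)
    total = 0.0
    for w, p_q in query_model.items():
        # 1e-12 is an arbitrary floor for terms missing from the corpus.
        p_d = (1 - mix) * doc_model.get(w, 0.0) + mix * corpus_model.get(w, 1e-12)
        total += p_q * math.log(p_d)
    return total

def rank(query_model, docs, corpus_model):
    scores = {d: score(query_model, toks, corpus_model) for d, toks in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)

def average_feedback_model(feedback_docs):
    """Placeholder feedback model: mean of the documents' MLE models."""
    model = Counter()
    for toks in feedback_docs:
        for w, p in mle(toks).items():
            model[w] += p / len(feedback_docs)
    return dict(model)

def feedback_retrieval(query_tokens, docs, k=1, alpha=0.5):
    corpus_model = mle([w for toks in docs.values() for w in toks])
    q_model = mle(query_tokens)                             # step 1
    first_pass = rank(q_model, docs, corpus_model)
    feedback = [docs[d] for d in first_pass[:k]]            # step 2
    f_model = average_feedback_model(feedback)              # step 3
    vocab = set(q_model) | set(f_model)
    updated = {w: alpha * q_model.get(w, 0.0) + (1 - alpha) * f_model.get(w, 0.0)
               for w in vocab}
    return rank(updated, docs, corpus_model)                # step 4

docs = {
    "d1": "jaguar cat habitat jungle".split(),
    "d2": "jaguar car engine speed".split(),
    "d3": "cat jungle habitat prey".split(),
    "d4": "engine oil speed car".split(),
}
ranking = feedback_retrieval(["jaguar", "cat"], docs, k=1, alpha=0.5)
print(ranking)
```

In this toy run, d2 and d3 tie on the original query, but after feedback from d1 the expanded query favors d3, which shares "habitat" and "jungle" with the feedback document.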

Incorporating Feedback

One effective way to combine the query and feedback document models is to choose a feedback model that minimizes the average KL divergence to the feedback documents. It’s important to pay attention only to terms that are distinctive to the feedback documents in F, so we also want to maximize KL divergence to the corpus model C.

Feedback Model:

$$p(w \mid F, C) := \arg\min_{\theta} \; \frac{1}{|F|} \sum_{i=1}^{|F|} KL\big(\theta \,\|\, p(w \mid F_i)\big) - \lambda \, KL\big(\theta \,\|\, p(w \mid C)\big)$$

$$p(w \mid F, C) \propto \exp\left( \frac{1}{1-\lambda} \cdot \frac{1}{|F|} \sum_{i=1}^{|F|} \log p(w \mid F_i) - \frac{\lambda}{1-\lambda} \log p(w \mid C) \right)$$

Updated Query Model:

$$p(w \mid q, F, C) := \alpha \cdot p(w \mid q) + (1 - \alpha) \cdot p(w \mid F, C)$$
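A minimal sketch of the feedback model follows, assuming the closed-form solution from Zhai and Lafferty (2001): the unnormalized weight of a term is exp((1/(1-λ)) · average log p(w | F_i) - (λ/(1-λ)) · log p(w | C)). The function names, the toy probabilities, and the assumption that every document model is already smoothed (so no probability is zero) are illustrative choices, not part of the slides.

```python
import math

def feedback_model(doc_models, corpus_model, lam=0.5):
    """Divergence-minimizing feedback model over the corpus vocabulary.

    doc_models: list of smoothed unigram models (term -> probability > 0),
    one per feedback document. lam trades closeness to the feedback
    documents against distance from the corpus model; 0 <= lam < 1.
    """
    n = len(doc_models)
    unnormalized = {}
    for w, p_c in corpus_model.items():
        avg_log = sum(math.log(m[w]) for m in doc_models) / n
        unnormalized[w] = math.exp(avg_log / (1 - lam)
                                   - lam / (1 - lam) * math.log(p_c))
    z = sum(unnormalized.values())
    return {w: v / z for w, v in unnormalized.items()}

def interpolate(query_model, fb_model, alpha=0.5):
    """Updated query model: alpha * p(w|q) + (1 - alpha) * p(w|F,C)."""
    vocab = set(query_model) | set(fb_model)
    return {w: alpha * query_model.get(w, 0.0) + (1 - alpha) * fb_model.get(w, 0.0)
            for w in vocab}

# Toy numbers: "the" is frequent everywhere, "jaguar" is distinctive.
corpus = {"the": 0.5, "jaguar": 0.25, "engine": 0.25}
docs = [{"the": 0.5, "jaguar": 0.4, "engine": 0.1},
        {"the": 0.5, "jaguar": 0.4, "engine": 0.1}]
fb = feedback_model(docs, corpus, lam=0.5)
updated = interpolate({"jaguar": 1.0}, fb, alpha=0.5)
print(fb)
print(updated)
```

Although "the" is the most probable term in each feedback document, the corpus penalty pushes "jaguar" above it in the feedback model, which is the intended effect of maximizing divergence from the corpus model.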

Does it work?

This method consistently improves both average precision and recall. It finds more relevant documents, and places them higher in the ranking.

The disproportionate results from AP88-89 may be because vocabulary usage in this collection is more uniform, and thus easier.

Collection  Metric  No Feedback  Feedback   Change
AP88-89     AP      0.21         0.295      40%
AP88-89     Recall  3067/4805    3665/4805  19%
TREC8       AP      0.256        0.269      5%
TREC8       Recall  2853/4728    3129/4728  10%
WEB         AP      0.281        0.312      11%
WEB         Recall  1755/2279    1798/2279  2%

(Zhai and Lafferty, 2001)
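The Change column appears to be the relative improvement over the no-feedback run. The short check below recomputes it from the reported figures, using the number of relevant documents retrieved for recall; the function name and data layout are assumptions for this example.

```python
def relative_change(before, after):
    """Relative improvement of `after` over `before`, as a fraction."""
    return (after - before) / before

# (collection, metric) -> (no feedback, feedback), copied from the results.
results = {
    ("AP88-89", "AP"): (0.21, 0.295),
    ("AP88-89", "Recall"): (3067, 3665),
    ("TREC8", "AP"): (0.256, 0.269),
    ("TREC8", "Recall"): (2853, 3129),
    ("WEB", "AP"): (0.281, 0.312),
    ("WEB", "Recall"): (1755, 1798),
}
changes = {key: round(100 * relative_change(*pair)) for key, pair in results.items()}
for key, pct in changes.items():
    print(key, f"{pct}%")
```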

Comparing to Rocchio’s Algorithm

Here we compare to Rocchio’s algorithm using a VSM with BM25 term scores.

Collection  Metric  Rocchio    LM         Change
AP88-89     AP      0.291      0.295      1%
AP88-89     Recall  3729/4805  3665/4805  -2%
TREC8       AP      0.26       0.269      3%
TREC8       Recall  3204/4728  3129/4728  -2%
WEB         AP      0.271      0.312      15%
WEB         Recall  1826/2279  1798/2279  -2%

(Zhai and Lafferty, 2001)

Average precision has improved, but recall has decreased. This may be because the cutoff used to ignore low-probability words was more carefully tuned for the VSM.

For the LM approach, the authors calculate matching scores only for terms having p(w | q, F) ≥ 0.001.
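The probability cutoff can be sketched as a simple filter over the updated query model. The function name and toy probabilities are hypothetical, and the slides do not say whether the surviving probabilities are renormalized, so this sketch leaves them unchanged.

```python
def prune_query_model(query_model, threshold=0.001):
    """Keep only terms with p(w | q, F) >= threshold.

    No renormalization is applied; whether the original experiments
    renormalized is not stated, so that step is deliberately left out.
    """
    return {w: p for w, p in query_model.items() if p >= threshold}

updated_query = {"jaguar": 0.6, "cat": 0.3995, "zebra": 0.0005}
pruned = prune_query_model(updated_query)
print(pruned)
```

Raising the threshold scores fewer terms per document, which is faster but can drop low-probability expansion terms that would have matched additional relevant documents.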

Wrapping Up

This approach was developed in the following paper:

Chengxiang Zhai and John Lafferty. 2001. Model-based feedback in the language modeling approach to information retrieval. In Proceedings of the tenth international conference on Information and knowledge management (CIKM '01), Henrique Paques, Ling Liu, and David Grossman (Eds.). ACM, New York, NY, USA, 403-410.

Pseudo-relevance feedback can make a big impact on retrieval performance, partly because queries tend to be under-specified. This approach, based on minimizing KL divergence, is just one possibility.
