A Probability Ranking Principle for Interactive IR Norbert Fuhr - PowerPoint PPT Presentation

A Probability Ranking Principle for Interactive IR Norbert Fuhr October 18, 2008

Outline Motivation Approach The Model Towards application Conclusion and Outlook

Motivation The classical PRP Questioning the PRP assumptions Interactive Retrieval

The classical PRP
◮ Task: Retrieve relevant documents
◮ Relevance of a document to a query is independent of other documents
◮ Scanning through the ranked list is the major task of the user (and the only one considered)

Questioning the PRP assumptions
◮ Relevance depends on documents the user has seen before
◮ Relevance judgment is not the most expensive task for a user

Interactive Retrieval
◮ User has a rich set of interaction possibilities
  ◮ (re)formulate query
  ◮ selection based on summaries of various granularity
  ◮ select related terms from list
  ◮ follow document link
  ◮ relevance judgment
◮ Information need changes during a search
◮ No theoretic foundation for constructing IIR systems

Approach Requirements for an IIR-PRP Basic Assumptions Abstraction: Situations with Lists of Choices

Requirements for an IIR-PRP
◮ Consider the complete interaction process
◮ Allow for different costs for different activities
◮ Allow for changes of the information need

Basic Assumptions
◮ Focus on a functional level of interaction (usability issues disregarded here)
◮ System presents list of choices to the user
◮ Users evaluate choices in linear order
◮ Only positive decisions/choices are of benefit for a user

Examples of decision lists
◮ ranked list of documents
◮ list of summaries
◮ list of document clusters
◮ KWIC list
◮ list of expansion terms
◮ links to related documents
◮ ...

Example: Non-linear decision list

Abstraction: Situations with Lists of Choices

The Model Choices Selection lists Ranking of choices

Basic ideas
◮ A user moves from situation to situation
◮ In each situation $s_i$, the user is presented a list of (binary) choices $\langle c_{i1}, c_{i2}, \ldots, c_{i,n_i} \rangle$
◮ The user decides about each of these choices sequentially
◮ The first positive decision moves the user to a new situation $s_j$
◮ A decision may be wrong, requiring backtracking

Probabilistic model focusing on single situation

Probabilistic event space
$U_i$: users in situation $s_i$
$C_i$: choices in situation $s_i$
$J \subset U_i \times C_i$: judged choices
$A \subset J$: accepted choices
$R \subseteq A$: 'right' choices

Expected benefit of a choice
$p_{ij}$: probability that the user will accept choice $c_{ij}$
$q_{ij}$: probability that this decision was right
$e_{ij} < 0$: effort for evaluating the choice $c_{ij}$
$b_{ij} > 0$: resulting benefit from a positive, correct decision
$g_{ij} \le 0$: cost for correcting a wrong decision
Expected benefit of choice $c_{ij}$:
$E(c_{ij}) = e_{ij} + p_{ij} \left( q_{ij} b_{ij} + (1 - q_{ij}) g_{ij} \right)$
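The expected-benefit formula on this slide can be written as a one-line function. The following is a minimal sketch; the function name `expected_benefit` and the parameter values are hypothetical and chosen only for illustration, since the slides give no concrete values for $q_{ij}$, $b_{ij}$ or $g_{ij}$.

```python
def expected_benefit(e, p, q, b, g):
    """E(c) = e + p * (q*b + (1 - q)*g).

    Sign conventions follow the slide: e < 0 (effort), b > 0 (benefit),
    g <= 0 (cost of correcting a wrong decision).
    """
    return e + p * (q * b + (1.0 - q) * g)


# Hypothetical parameter values, not taken from the slides.
value = expected_benefit(e=-1.0, p=0.5, q=0.9, b=10.0, g=-2.0)
print(f"E(c) = {value:.2f}")
```

Because effort and correction cost enter as negative numbers, the function only adds terms; a choice whose result is negative costs the user more than it is expected to return.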

Example
Web search: 'Java' → $n_0$ = 290 mio. hits
System proposes extension terms:
term      n_i       p_ij   b_ij   p_ij * b_ij
program   195 mio   0.67   0.4    0.268
blend       5 mio   0.02   4.0    0.08
island      2 mio   0.01   4.9    0.049
benefit $b_{ij} = \log \frac{n_0}{n_i}$
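The benefit column of this example can be recomputed from the hit counts. This is a sketch under one assumption not stated on the slide: the logarithm is taken to be the natural logarithm, which reproduces the tabulated values only approximately (the slide's figures look rounded or truncated). The helper name `benefit` is hypothetical.

```python
import math

n0 = 290e6  # hits for the original query 'Java'

# (term, hits after adding the term, acceptance probability from the slide)
terms = [
    ("program", 195e6, 0.67),
    ("blend", 5e6, 0.02),
    ("island", 2e6, 0.01),
]


def benefit(n0, ni):
    # Assumption: natural logarithm; the slide only writes "log".
    return math.log(n0 / ni)


rows = [(term, p, benefit(n0, ni), p * benefit(n0, ni)) for term, ni, p in terms]
for term, p, b, pb in rows:
    print(f"{term:8s} p={p:.2f}  b={b:.2f}  p*b={pb:.3f}")
```

The example shows the tradeoff between the two factors: 'program' narrows the result set least but is accepted most often, so it has the largest product $p_{ij} b_{ij}$.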

Strategies for maximizing expected benefit
$E(c_{ij}) = e_{ij} + p_{ij} \left( q_{ij} b_{ij} + (1 - q_{ij}) g_{ij} \right)$
(assume that benefit $b_{ij}$ and correction effort $g_{ij}$ are given)
1. minimize effort $|e_{ij}|$, but keep $p_{ij}$ (selection prob.) and $q_{ij}$ (success prob.) high
2. maximize $p_{ij}$: user should choose $c_{ij}$ whenever it is appropriate, but keep success probability $q_{ij}$ high → increased effort $e_{ij}$
3. maximize $q_{ij}$ by avoiding erroneous positive decisions → increased effort $e_{ij}$

Further remarks
$E(c_{ij}) = e_{ij} + p_{ij} \left( q_{ij} b_{ij} + (1 - q_{ij}) g_{ij} \right)$
◮ Expected benefit should be positive: choices with negative values should not be presented to a user.
◮ Methods for estimating parameters $p_{ij}$, $q_{ij}$, $b_{ij}$, $e_{ij}$, $g_{ij}$: issue of further research
◮ In the following, let $a_{ij} = q_{ij} b_{ij} + (1 - q_{ij}) g_{ij}$ ("average benefit"), so that
$E(c_{ij}) = e_{ij} + p_{ij} a_{ij}$

Selection list
Situation $s_i$ with list of choices $r_i = \langle c_{i1}, c_{i2}, \ldots, c_{i,n_i} \rangle$
Expected benefit of the choice list (writing $n$ for $n_i$):
$E(r_i) = e_{i1} + p_{i1} a_{i1} + (1 - p_{i1}) \bigl( e_{i2} + p_{i2} a_{i2} + (1 - p_{i2}) \bigl( e_{i3} + p_{i3} a_{i3} + \ldots (1 - p_{i,n-1}) ( e_{in} + p_{in} a_{in} ) \bigr) \bigr)$
$= \sum_{j=1}^{n} \Bigl( \prod_{k=1}^{j-1} (1 - p_{ik}) \Bigr) ( e_{ij} + p_{ij} a_{ij} )$

Expected benefit of a choice list
$E(r_i) = \sum_{j=1}^{n} \Bigl( \prod_{k=1}^{j-1} (1 - p_{ik}) \Bigr) ( e_{ij} + p_{ij} a_{ij} )$
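The nested form and the sum-of-products form of the list benefit can be checked against each other numerically. The sketch below represents each choice as a hypothetical `(e, p, a)` tuple; the function names and the sample values are illustrative assumptions, not part of the slides.

```python
def list_benefit_closed(choices):
    """Sum over j of prod_{k<j}(1 - p_k) * (e_j + p_j * a_j)."""
    total = 0.0
    reach = 1.0  # probability that the user gets as far as position j
    for e, p, a in choices:
        total += reach * (e + p * a)
        reach *= 1.0 - p
    return total


def list_benefit_nested(choices):
    """Nested form, evaluated from the last choice backwards."""
    value = 0.0
    for e, p, a in reversed(choices):
        value = e + p * a + (1.0 - p) * value
    return value


# Hypothetical (effort, acceptance probability, average benefit) triples.
choices = [(-1.0, 0.5, 10.0), (-1.0, 0.25, 16.0), (-2.0, 0.8, 5.0)]
print(list_benefit_closed(choices), list_benefit_nested(choices))
```

The running product `reach` is the probability that all earlier choices were rejected, which is why a choice far down the list contributes little even when its own expected benefit is high.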

Ranking of choices
Consider two subsequent choices $c_{il}$ and $c_{i,l+1}$:
$E(r_i) = \sum_{\substack{j=1 \\ l \ne j \ne l+1}}^{n} \Bigl( \prod_{k=1}^{j-1} (1 - p_{ik}) \Bigr) ( e_{ij} + p_{ij} a_{ij} ) + t_i^{l,l+1}$
where
$t_i^{l,l+1} = ( e_{il} + p_{il} a_{il} ) \prod_{k=1}^{l-1} (1 - p_{ik}) + ( e_{i,l+1} + p_{i,l+1} a_{i,l+1} ) \prod_{k=1}^{l} (1 - p_{ik})$
analogously $t_i^{l+1,l}$ for $\langle \ldots, c_{i,l+1}, c_{il}, \ldots \rangle$

Difference between alternative rankings
$d_i^{l,l+1} = \frac{t_i^{l,l+1} - t_i^{l+1,l}}{\prod_{k=1}^{l-1} (1 - p_{ik})}$
$= e_{il} + p_{il} a_{il} + (1 - p_{il}) ( e_{i,l+1} + p_{i,l+1} a_{i,l+1} ) - \bigl( e_{i,l+1} + p_{i,l+1} a_{i,l+1} + (1 - p_{i,l+1}) ( e_{il} + p_{il} a_{il} ) \bigr)$
$= p_{i,l+1} ( e_{il} + p_{il} a_{il} ) - p_{il} ( e_{i,l+1} + p_{i,l+1} a_{i,l+1} )$
For $d_i^{l,l+1} \overset{!}{\ge} 0$, we get
$a_{il} + \frac{e_{il}}{p_{il}} \ge a_{i,l+1} + \frac{e_{i,l+1}}{p_{i,l+1}}$

PRP for Interactive IR
$a_{il} + \frac{e_{il}}{p_{il}} \ge a_{i,l+1} + \frac{e_{i,l+1}}{p_{i,l+1}}$
→ Rank choices by decreasing values of $\varrho(c_{ij}) = a_{ij} + \frac{e_{ij}}{p_{ij}}$
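The ranking criterion can be tested by brute force on a small list: sorting by decreasing $\varrho$ should give the same expected list benefit as the best of all permutations. The sketch below assumes every $p_{ij} > 0$ and uses hypothetical `(e, p, a)` triples; the function names are illustrative.

```python
from itertools import permutations


def list_benefit(choices):
    """Expected benefit of a list of (e, p, a) choices in the given order."""
    total, reach = 0.0, 1.0
    for e, p, a in choices:
        total += reach * (e + p * a)
        reach *= 1.0 - p
    return total


def rho(choice):
    """Ranking criterion a + e / p (requires p > 0)."""
    e, p, a = choice
    return a + e / p


def rank_choices(choices):
    return sorted(choices, key=rho, reverse=True)


# Hypothetical (effort, acceptance probability, average benefit) triples.
choices = [(-1.0, 0.5, 10.0), (-1.0, 0.25, 16.0), (-2.0, 0.8, 5.0), (-0.5, 0.1, 30.0)]
ranked = rank_choices(choices)
best = max(list_benefit(list(perm)) for perm in permutations(choices))
print(list_benefit(ranked), best)
```

The exhaustive search is only feasible for very short lists; it serves here as a check on the pairwise exchange argument, not as a ranking method.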

Expected benefit: single choices vs. list
expected benefit: $E(c_{ij}) = p_{ij} a_{ij} + e_{ij}$
ranking criterion: $\varrho(c_{ij}) = a_{ij} + \frac{e_{ij}}{p_{ij}}$
Example:
choice   p_ij   a_ij   e_ij   E(c_ij)   ϱ(c_ij)
c_1      0.5    10     -1     4         8
c_2      0.25   16     -1     3         12
$E(\langle c_1, c_2 \rangle) = 4 + 0.5 \cdot 3 = 5.5$
$E(\langle c_2, c_1 \rangle) = 3 + 0.75 \cdot 4 = 6$
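The numbers in this example can be reproduced directly. The sketch below uses the slide's values for $c_1$ and $c_2$; the dictionary layout and function names are illustrative assumptions.

```python
c1 = {"p": 0.5, "a": 10.0, "e": -1.0}
c2 = {"p": 0.25, "a": 16.0, "e": -1.0}


def single_benefit(c):
    """E(c) = p * a + e."""
    return c["p"] * c["a"] + c["e"]


def rho(c):
    """Ranking criterion a + e / p."""
    return c["a"] + c["e"] / c["p"]


def pair_benefit(first, second):
    """Expected benefit of the two-element list <first, second>."""
    return single_benefit(first) + (1.0 - first["p"]) * single_benefit(second)


print(single_benefit(c1), single_benefit(c2))      # 4.0 3.0
print(rho(c1), rho(c2))                            # 8.0 12.0
print(pair_benefit(c1, c2), pair_benefit(c2, c1))  # 5.5 6.0
```

Although $c_1$ has the higher expected benefit as a single choice, placing $c_2$ first gives the better list, which is what the criterion $\varrho$ predicts.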

IIR-PRP vs. PRP
$a_{il} + \frac{e_{il}}{p_{il}} \ge a_{i,l+1} + \frac{e_{i,l+1}}{p_{i,l+1}}$
Let $e_{ij} = -\bar{C}$, $\bar{C} > 0$ and $a_{ij} = C$ for all choices:
$C - \frac{\bar{C}}{p_{il}} \ge C - \frac{\bar{C}}{p_{i,l+1}} \Rightarrow p_{il} \ge p_{i,l+1}$
→ Classic PRP still holds!
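The reduction to the classical PRP can be illustrated numerically: with the same effort and the same average benefit for every choice, sorting by $\varrho$ gives the same order as sorting by probability. The constants and probabilities below are hypothetical.

```python
def rho(e, p, a):
    """Ranking criterion a + e / p (requires p > 0)."""
    return a + e / p


C = 1.0      # hypothetical constant average benefit a_ij = C
C_bar = 0.2  # hypothetical constant effort magnitude, e_ij = -C_bar

# Hypothetical acceptance probabilities of five choices.
probs = [0.9, 0.1, 0.5, 0.3, 0.7]

by_probability = sorted(probs, reverse=True)
by_rho = sorted(probs, key=lambda p: rho(-C_bar, p, C), reverse=True)
print(by_probability == by_rho)
```

Since $C - \bar{C}/p$ increases with $p$ whenever $\bar{C} > 0$, the two orderings differ only once effort or benefit varies between choices.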

IIR-PRP: Observations
Rank choices by $a_{ij} + \frac{e_{ij}}{p_{ij}}$
◮ $p_{ij}$ 'probability of relevance' still involved
◮ tradeoff between effort $e_{ij}$ and benefit $a_{ij}$
◮ difference between PRP and IIR-PRP due to variable values for $e_{ij}$ and $a_{ij}$
◮ IIR-PRP looks only for the first positive decision

Towards application Parameter estimation Saved effort

Parameter estimation
1. Selection probability $p_{ij}$: focus of many IR models, but models for dynamic info needs required
2. Effort parameters $e_{ij}$, $g_{ij}$ + success probability $q_{ij}$: most research needed
3. Benefit $b_{ij}$:
  ◮ information value?
  ◮ saved effort (see below)
