Boolean retrieval & basics of indexing
CE-324: Modern Information Retrieval
Sharif University of Technology
- M. Soleymani
Fall 2017
Most slides have been adapted from: Profs. Manning, Nayak & Raghavan lectures (CS-276, Stanford)
Shows presence or absence of terms in each doc
[Figure: the classic search model. Labels: Task, Info need, Verbal form, Query, Search engine, Corpus, Results, Query refinement. Example query: mouse trap]
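Term-document incidence matrix (1 if the play contains the term, 0 otherwise):

Term       Antony and Cleopatra  Julius Caesar  The Tempest  Hamlet  Othello  Macbeth
Antony     1                     1              0            0       0        1
Brutus     1                     1              0            1       0        0
Caesar     1                     1              0            1       1        1
Calpurnia  0                     1              0            0       0        0
Cleopatra  1                     0              0            0       0        0
mercy      1                     0              1            1       1        1
worser     1                     0              1            1       1        0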
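A Boolean query can be answered directly from incidence rows with bitwise operations. The sketch below is illustrative only: the query (Brutus AND Caesar AND NOT Calpurnia) is not stated in the surviving slide text, the zero entries of each row are assumed from the standard form of this example, and the names `INCIDENCE`, `ALL_DOCS`, and `docs_for` are hypothetical.

```python
# Plays in the column order of the incidence matrix.
PLAYS = ["Antony and Cleopatra", "Julius Caesar", "The Tempest",
         "Hamlet", "Othello", "Macbeth"]

# Incidence rows as bit vectors (leftmost bit = first play).
# Assumption: zeros fill every position where the matrix shows no 1.
INCIDENCE = {
    "Brutus":    0b110100,
    "Caesar":    0b110111,
    "Calpurnia": 0b010000,
}

# Mask that keeps the complement (NOT) within six bits.
ALL_DOCS = (1 << len(PLAYS)) - 1


def docs_for(vector):
    """Translate a bit vector back into the list of matching play titles."""
    n = len(PLAYS)
    return [PLAYS[i] for i in range(n) if vector & (1 << (n - 1 - i))]


# Brutus AND Caesar AND NOT Calpurnia
not_calpurnia = ALL_DOCS & ~INCIDENCE["Calpurnia"]
answer = INCIDENCE["Brutus"] & INCIDENCE["Caesar"] & not_calpurnia
matching_plays = docs_for(answer)  # ["Antony and Cleopatra", "Hamlet"]
```

The mask matters because Python integers are unbounded, so a bare `~` would produce a negative number rather than a six-bit complement.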
Some tradeoffs in size/ease of insertion
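The surviving text does not name the alternatives being compared. The sketch below assumes the usual choice between linked lists and variable-length arrays for storing each term's postings, and uses the array option. The documents and the name `build_index` are hypothetical.

```python
from bisect import insort
from collections import defaultdict


def build_index(docs):
    """Map each term to a sorted list of the docIDs that contain it."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        # set() records each term at most once per document.
        for term in set(text.lower().split()):
            # A Python list is a variable-length array: compact, but an
            # insertion in the middle shifts every later element.
            insort(index[term], doc_id)
    return dict(index)


# Hypothetical three-document corpus.
DOCS = {
    1: "brutus killed caesar",
    2: "caesar was ambitious",
    3: "brutus was noble",
}

index = build_index(DOCS)
# index["caesar"] is [1, 2] and index["brutus"] is [1, 3]
```

A linked list would make insertion cheap at the cost of one pointer per posting, which is one way to read the size versus ease-of-insertion tradeoff.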
For each query term, retrieve its postings.
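Once the postings of two terms have been retrieved, a conjunctive (AND) query can be answered by intersecting the two lists. This is a minimal sketch, assuming both lists are sorted by docID. The function name `intersect` and the sample lists are illustrative.

```python
def intersect(p1, p2):
    """Intersect two postings lists sorted by docID.

    Each step advances at least one pointer, so the cost is linear
    in len(p1) + len(p2).
    """
    answer = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])  # docID present in both lists
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1  # p1[i] cannot appear later in p2
        else:
            j += 1  # p2[j] cannot appear later in p1
    return answer


# Hypothetical postings lists for two terms.
brutus = [2, 4, 8, 16, 32, 64, 128]
caesar = [1, 2, 3, 5, 8, 13, 21, 34]
common = intersect(brutus, caesar)  # [2, 8]
```

The linear walk only works because both lists share the same sort order, which is why postings are kept sorted by docID.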
Suppose we are processing 8 on each list. We match it and advance.
We then have 41 and 11. The skip successor of 11 is 31 (31 < 41), so we can skip ahead to 31 without examining the postings in between.
More skips: more likely to skip, but lots of comparisons to skip pointers (and also more storage).
Fewer skips: few successful skips, but also few pointer comparisons (and also less storage).
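The skip walk-through and the spacing tradeoff can be sketched as follows. This is a sketch under stated assumptions, not the slides' own code: skip pointers are represented as a dict from source index to target index, and the two lists are chosen so that 8 appears on both, 41 follows it on one list, and 11 (with skip successor 31) follows it on the other. The names `intersect_with_skips` and `evenly_spaced_skips` are hypothetical.

```python
def evenly_spaced_skips(n, span):
    """Place a skip pointer every `span` postings: {source_index: target_index}.

    A small span gives many short skips, a large span gives few long ones.
    """
    return {i: i + span for i in range(0, n - span, span)}


def intersect_with_skips(p1, skips1, p2, skips2):
    """Intersect two sorted postings lists, following skip pointers when safe."""
    answer = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            # Follow the skip only if its target does not overshoot p2[j].
            if i in skips1 and p1[skips1[i]] <= p2[j]:
                i = skips1[i]
            else:
                i += 1
        else:
            if j in skips2 and p2[skips2[j]] <= p1[i]:
                j = skips2[j]
            else:
                j += 1
    return answer


# Illustrative lists and hand-placed skips (assumed, see lead-in).
upper = [2, 4, 8, 41, 48, 64, 128]
upper_skips = {0: 3, 3: 6}   # 2 -> 41, 41 -> 128
lower = [1, 2, 3, 8, 11, 17, 21, 31]
lower_skips = {0: 4, 4: 7}   # 1 -> 11, 11 -> 31

result = intersect_with_skips(upper, upper_skips, lower, lower_skips)  # [2, 8]
```

After matching 8, the loop compares 41 with 11, sees that the skip target 31 is still no larger than 41, and jumps over 17 and 21. Each failed skip test is an extra comparison, which is the cost that grows as skip pointers become denser.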
Usually either too few or too many docs in response to a user query