Inverted Files. Searching
- What to search on inverted files:
– Single-word queries: the process ends by delivering the list of occurrences.
– Context queries are more difficult to solve with inverted indices
- each element must be searched separately and a list generated for
each one.
- then the lists of all elements are traversed in synchronization to find
places where all the words appear in sequence (for a phrase) or appear close enough (for proximity).
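The synchronized traversal described above can be sketched as follows. This is a minimal illustration, not the course's reference implementation: it assumes each query word already has a sorted list of word positions, and the function names phrase_matches and proximity_matches are hypothetical.

```python
def phrase_matches(lists):
    """lists[k] is the sorted position list of the k-th query word.
    Returns the start positions where the words occur consecutively."""
    if not lists or any(not lst for lst in lists):
        return []
    cursors = [0] * len(lists)  # one cursor per list, only ever moves forward
    matches = []
    for start in lists[0]:
        found = True
        for k in range(1, len(lists)):
            target = start + k  # the k-th word must sit k positions later
            lst = lists[k]
            i = cursors[k]
            while i < len(lst) and lst[i] < target:
                i += 1
            cursors[k] = i
            if i == len(lst) or lst[i] != target:
                found = False
                break
        if found:
            matches.append(start)
    return matches


def proximity_matches(a, b, max_gap):
    """Pairs (p, q), p from a and q from b, with |p - q| <= max_gap.
    Both lists are assumed sorted."""
    pairs = []
    lo = 0  # first index in b that can still be within range of p
    for p in a:
        while lo < len(b) and b[lo] < p - max_gap:
            lo += 1
        j = lo
        while j < len(b) and b[j] <= p + max_gap:
            pairs.append((p, b[j]))
            j += 1
    return pairs


# Hypothetical position lists for the words "inverted", "files", "search".
inverted, files, search = [0, 7], [1, 12], [2, 8]
print(phrase_matches([inverted, files, search]))
print(proximity_matches(inverted, search, max_gap=2))
```

Because every list is sorted, each cursor only advances, so one pass over the lists is enough for the phrase case.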
TDT4215 – Indexing & Searching 10
Inverted Files. Search Algorithm
The search algorithm on an inverted index follows three general steps:
- 1. Vocabulary search.
– The words and patterns present in the query are isolated and searched in the vocabulary.
– Notice that phrases and proximity queries are split into single words.
- 2. Retrieval of occurrences.
– The lists of the occurrences of all the words found are retrieved.
- 3. Manipulation of occurrences.
– The occurrences are processed to solve phrases, proximity or Boolean operations.
– If block addressing is used, it may be necessary to directly search the text to find the information missing from the occurrences.
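The three steps can be illustrated with a small sketch. It assumes a toy in-memory index with full word addressing (word -> document -> positions), so the block-addressing fallback of scanning the text is not shown. The names tokenize, build_index and search, and the sample documents, are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """docs: dict doc_id -> text. Returns word -> {doc_id: [positions]}."""
    index = {}
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index.setdefault(word, {}).setdefault(doc_id, []).append(pos)
    return index


def search(index, query, phrase=False):
    # Step 1: vocabulary search. The query is split into single words
    # and each one is looked up in the vocabulary.
    words = tokenize(query)
    if not words or any(w not in index for w in words):
        return []
    # Step 2: retrieval of occurrences for every word found.
    postings = [index[w] for w in words]
    # Step 3: manipulation of occurrences. First a Boolean AND over documents.
    docs = set(postings[0])
    for p in postings[1:]:
        docs &= set(p)
    if not phrase:
        return sorted(docs)
    # For a phrase, the words must also occupy consecutive positions.
    hits = []
    for d in sorted(docs):
        later = [set(p[d]) for p in postings[1:]]
        if any(all(start + k + 1 in s for k, s in enumerate(later))
               for start in postings[0][d]):
            hits.append(d)
    return hits


docs = {
    1: "inverted files are fast",
    2: "files can be inverted",
    3: "search inverted files",
}
index = build_index(docs)
print(search(index, "inverted files"))
print(search(index, "inverted files", phrase=True))
```

The phrase check here uses set membership for brevity; a production index would more likely merge the sorted position lists in one synchronized pass.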