SLIDE 4 7
Keyword-based querying
Boolean queries
– OR (e1 OR e2)
– AND (e1 AND e2)
– BUT (e1 BUT e2)
– NOT
- No ranking of documents provided
- “Fuzzy boolean”: Meaning of AND and OR relaxed
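A minimal sketch of how the Boolean operators above can be evaluated, assuming each term maps to a set of document ids and assuming that (e1 BUT e2) selects documents satisfying e1 but not e2. The index contents and the helper names (`docs`, `q_or`, `q_and`, `q_but`) are hypothetical and chosen only for illustration. Note that the result is a plain set, so no ranking is produced.

```python
# Toy inverted index: term -> set of document ids (hypothetical data).
index = {
    "query": {1, 2, 4},
    "language": {2, 3, 4},
    "pattern": {3, 4},
}

def docs(term):
    """Return the set of documents that contain the term."""
    return index.get(term, set())

def q_or(e1, e2):
    # e1 OR e2: documents satisfying either operand
    return e1 | e2

def q_and(e1, e2):
    # e1 AND e2: documents satisfying both operands
    return e1 & e2

def q_but(e1, e2):
    # e1 BUT e2: documents satisfying e1 but not e2 (assumed semantics)
    return e1 - e2

# (query AND language) BUT pattern
result = q_but(q_and(docs("query"), docs("language")), docs("pattern"))
print(sorted(result))
```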
Natural language:
- Query is an enumeration of words and context queries
- All documents matching a portion of the user query are retrieved
- Higher ranking is assigned to those documents matching more parts of the query
- Query and documents viewed as vectors
TDT4215 Query Languages
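The ranking idea for natural language queries can be sketched as follows, assuming raw term-frequency vectors and cosine similarity as the matching score. Both are illustrative choices; the slides do not fix a weighting scheme. The function names and the sample documents are hypothetical.

```python
import math
from collections import Counter

def to_vector(text):
    """Represent a text as a term-frequency vector (term -> count)."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank(query, documents):
    """Return names of documents matching part of the query, best first."""
    q = to_vector(query)
    scored = [(cosine(q, to_vector(text)), name)
              for name, text in documents.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Documents sharing no term with the query are not retrieved.
    return [name for score, name in scored if score > 0]

# Hypothetical sample collection.
documents = {
    "d1": "query languages for text retrieval",
    "d2": "pattern matching in text",
    "d3": "cooking recipes",
}
ranking = rank("text query languages", documents)
print(ranking)
```

Here d1 shares three query words and d2 only one, so d1 is ranked first, while d3 shares none and is not retrieved.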
8
Pattern matching
- A pattern is a set of syntactic features that must occur in a text segment, ranging
from simple (e.g. words) to complex (e.g. regular expressions) terms
– words
– prefixes: ‘comput’ -> ‘computer’, ‘computation’, ‘computing’
– suffixes: ‘ters’ -> ‘computers’, ‘testers’, ‘printers’
– substrings: ‘tal’ -> ‘coastal’, ‘talk’, ‘metallic’
– ranges: between ‘held’ and ‘hold’ -> ‘hoax’, ‘hissing’
– allowing errors
– regular expressions
– extended patterns
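Several of the pattern types listed above can be sketched over a small vocabulary. The word list, the function names, and the choice of Levenshtein edit distance for "allowing errors" are assumptions made for illustration. Ranges are taken as strictly between the two bounds in lexicographic order.

```python
import re

# Hypothetical vocabulary built from the slide's examples.
vocabulary = ["computer", "computation", "computing", "computers",
              "testers", "printers", "coastal", "talk", "metallic",
              "hoax", "hissing", "held", "hold"]

def by_prefix(p):
    return [w for w in vocabulary if w.startswith(p)]

def by_suffix(s):
    return [w for w in vocabulary if w.endswith(s)]

def by_substring(s):
    return [w for w in vocabulary if s in w]

def by_range(low, high):
    # Words lexicographically strictly between low and high.
    return [w for w in vocabulary if low < w < high]

def edit_distance(a, b):
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def allowing_errors(word, max_errors):
    return [w for w in vocabulary if edit_distance(word, w) <= max_errors]

def by_regex(pattern):
    # The whole word must match the regular expression.
    return [w for w in vocabulary if re.fullmatch(pattern, w)]

print(by_suffix("ters"))
print(by_substring("tal"))
print(by_range("held", "hold"))
print(allowing_errors("hald", 1))
print(by_regex(r"comput(er|ing)s?"))
```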