Concept Search: Semantics Enabled Syntactic Search
Fausto Giunchiglia, Uladzimir Kharkevich, Ilya Zaihrayeu
June 2nd, 2008, Tenerife, Spain
Outline
Information Retrieval (IR)
Syntactic IR
Problems of Syntactic IR
Semantic Continuum
Concept Search (C-Search)
C-Search via Inverted Indices
Preliminary Evaluation
Conclusion and Future Work
needs, (optionally) ordered according to the degree of relevance.
search for equivalent words
search for words with common prefixes
search for words within a certain edit distance of a given word
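The three word-matching strategies listed above can be sketched as follows. This is only an illustration, not the matching code of any particular IR system: the function names and the thresholds `min_prefix` and `max_distance` are assumptions introduced here.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one table row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters shared by a and b."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


def syntactic_match(query_word, doc_word, min_prefix=4, max_distance=1):
    """Return which syntactic test matched, or None.

    min_prefix and max_distance are illustrative thresholds (assumptions).
    """
    q, d = query_word.lower(), doc_word.lower()
    if q == d:
        return "equal"
    if common_prefix_len(q, d) >= min_prefix:
        return "prefix"
    if edit_distance(q, d) <= max_distance:
        return "edit-distance"
    return None
```

All three tests compare character strings only, so none of them can tell that two different words name the same concept, or that one word names two different concepts.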
e.g., baby is a young mammal or a human child
e.g., mark and print – a visible indication made on a surface
Natural Language Phrases (e.g., Noun Phrases).
E.g., Computer table → A laptop computer is on a coffee table
E.g., carnivores (flesh-eating mammals) is more general than dog OR cat
[Figure: Semantic Continuum. A three-axis space running from Pure Syntax (0, 0, 0) to Full Semantics (1, 1, 1), with C-Search placed inside it.
NL2FL axis: NL (0) to FL (1).
W2P axis: Word (0), +Noun Phrase, +Verb Phrase, …, Free Text (1).
KNOW axis: String Similarity (0), +Statistical Knowledge, +Lexical knowledge, …, Complete Ontological Knowledge (1).]
A laptop computer is on a coffee table →
given word (e.g., a concept does not exist in the lexical database). In this case, the word itself is used as the identifier for a concept.
descriptive phrase ::= noun phrase {OR noun phrase}
E.g., C(A little dog OR a huge cat) = (little-2 ⊓ dog-1) ⊔ (huge-1 ⊓ cat-3)
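The translation of a descriptive phrase into a concept can be sketched as below. This is a minimal illustration under strong assumptions: the `SENSES` table is a hypothetical stand-in for a lexical database, its sense numbers are copied from the example above rather than looked up anywhere, and no real word sense disambiguation or noun phrase parsing is performed. A word missing from the table is used as its own identifier.

```python
# Hypothetical sense table standing in for a lexical database.
# The sense numbers are taken from the slide's example only.
SENSES = {"little": "little-2", "dog": "dog-1", "huge": "huge-1", "cat": "cat-3"}
STOP_WORDS = {"a", "an", "the"}  # assumption: articles carry no concept


def atomic_concept(word):
    """Map a word to an atomic concept; fall back to the word itself."""
    return SENSES.get(word, word)


def phrase_to_dnf(phrase):
    """descriptive phrase ::= noun phrase {OR noun phrase}

    Each noun phrase becomes a conjunction (tuple) of atomic concepts;
    the whole phrase becomes a disjunction (list) of those conjunctions.
    """
    clauses = []
    for noun_phrase in phrase.split(" OR "):
        words = [w for w in noun_phrase.lower().split() if w not in STOP_WORDS]
        clauses.append(tuple(atomic_concept(w) for w in words))
    return clauses


def render(dnf):
    """Print a DNF concept in the notation used on the slide."""
    return " ⊔ ".join("(" + " ⊓ ".join(clause) + ")" for clause in dnf)


print(render(phrase_to_dnf("A little dog OR a huge cat")))
# prints: (little-2 ⊓ dog-1) ⊔ (huge-1 ⊓ cat-3)
```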
[Figure: C-Search in the Semantic Continuum, between Pure Syntax (0, 0, 0) and Full Semantics (1, 1, 1), with axes NL2FL, W2P, and KNOW. New labels in this version: +Descriptive Phrase (W2P axis) and NL&FL (NL2FL axis).]
Boolean Model (retrieval), Vector Space Model (ranking)
→ Index atomic concepts by more general atomic concepts
→ Index conjunctive clauses by their components (i.e., atomic concepts)
→ Index DNF formulas by their components (i.e., conjunctive clauses)
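The three indexing steps above can be sketched with three in-memory inverted indices. This is an assumption-laden illustration, not the C-Search implementation: the `PARENTS` is-a links, the class `ConceptIndex`, and its method names are hypothetical, and retrieval is restricted to a query consisting of a single atomic concept. A document concept in DNF counts as more specific than the query only if every one of its conjunctive clauses contains an atom that is equal to or more specific than the query atom.

```python
from collections import defaultdict

# Hypothetical is-a links (child -> parents), for illustration only.
PARENTS = {
    "dog-1": ["canine-1"],
    "canine-1": ["carnivore-1"],
    "cat-3": ["feline-1"],
    "feline-1": ["carnivore-1"],
}


def ancestors(concept):
    """The concept itself plus every more general concept above it."""
    seen, stack = set(), [concept]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(PARENTS.get(c, []))
    return seen


class ConceptIndex:
    def __init__(self):
        self.atom_index = defaultdict(set)    # general atom -> more specific atoms
        self.clause_index = defaultdict(set)  # atom -> clauses containing it
        self.dnf_index = defaultdict(set)     # clause -> (doc_id, DNF) pairs

    def add(self, doc_id, dnf):
        dnf = frozenset(frozenset(clause) for clause in dnf)
        for clause in dnf:
            self.dnf_index[clause].add((doc_id, dnf))
            for atom in clause:
                self.clause_index[atom].add(clause)
                for general in ancestors(atom):
                    self.atom_index[general].add(atom)

    def search(self, query_atom):
        """Documents whose DNF concept is more specific than query_atom."""
        atoms = self.atom_index.get(query_atom, set())
        clauses = set()
        for atom in atoms:
            clauses |= self.clause_index[atom]
        hits = set()
        for clause in clauses:
            for doc_id, dnf in self.dnf_index[clause]:
                if dnf <= clauses:  # every clause of the DNF must match
                    hits.add(doc_id)
        return hits


index = ConceptIndex()
index.add("d1", [("little-2", "dog-1"), ("huge-1", "cat-3")])
index.add("d2", [("dog-1",)])
index.add("d3", [("dog-1",), ("table-1",)])
```

With this toy data, a query for `carnivore-1` retrieves `d1` and `d2`, while a query for `dog-1` retrieves only `d2`: `d1` also describes a cat, and `d3` also describes a table, so neither is more specific than dog.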
[Figure: concept ∪-index example. Recoverable entries: C1, …; C2 (little ∩ dog); C3 (huge ∩ cat).]
an improvement when semantics is available
fully semantic IR in which indexing and retrieval can be performed at any point of the continuum depending on how much semantics is available
semantic similarities of query and document descriptions
state-of-the-art syntactic IR systems using a syntactic IR benchmark