MIMIR: Multi-paradigm Information Management Index and Repository
Valentin Tablan, Niraj Aswani, Ian Roberts
University of Sheffield
University of Sheffield, NLP
2009 GATE Summer School, Sheffield

MIMIR is an IR engine that can search over:
○ Text
○ Semantic Annotations
○ Ontologies and Knowledge Bases
...represented as GATE documents
○ Ontotext ORDI
○ MG4J text indexing engine
○ Ontology learning
○ Ontology population (though it sometimes includes it)
[Figure: architecture diagram. Labels: Document Collection; Ontology + Knowledge Base; Mentions Index; Token Index (three instances)]
○ string (the document text, downcased)
○ root (morphological root of each word)
○ category (part-of-speech of each word)

○ Measurement (indexed features: type, dimension)
○ Reference (indexed feature: type)
○ Section (indexed feature: type)
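
The three token features listed above (string, root, category) can be pictured as three parallel inverted indexes over the same token positions. The following is a minimal sketch under that assumption only; it does not reflect how MG4J actually stores its indexes, and the sample tokens and the helper name `build_token_indexes` are hypothetical.

```python
from collections import defaultdict

# Hypothetical pre-annotated tokens: (string, root, category).
# In a real pipeline these values would come from GATE annotations.
TOKENS = [
    ("The", "the", "DT"),
    ("readings", "reading", "NNS"),
    ("were", "be", "VBD"),
    ("stable", "stable", "JJ"),
]


def build_token_indexes(tokens):
    """Build one inverted index (term -> token positions) per token feature."""
    indexes = {
        "string": defaultdict(list),
        "root": defaultdict(list),
        "category": defaultdict(list),
    }
    for pos, (string, root, category) in enumerate(tokens):
        indexes["string"][string.lower()].append(pos)  # text is downcased
        indexes["root"][root].append(pos)
        indexes["category"][category].append(pos)
    return indexes


indexes = build_token_indexes(TOKENS)
```

Because all three indexes share the same position numbering, a hit in one index can be combined with a hit in another at the same or a neighbouring position.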
Syntax: sequence of words
Example: device for measurement of light intensity

Syntax: {Type feature1=value1 feature2=value2...}
Example: {Measurement type=scalarValue}

Syntax: Query1 [n..m] Query2...
Example: up to {Measurement} [1..5] {Measurement}
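
One way to read the gap operator `Query1 [n..m] Query2` is as a join over the hits of the two sub-queries, keeping pairs separated by between n and m tokens. The sketch below assumes exactly that reading, with hits represented as (start, end) token offsets and an exclusive end. The function name `sequence_with_gap` and the sample offsets are hypothetical, not part of MIMIR.

```python
def sequence_with_gap(left_hits, right_hits, min_gap, max_gap):
    """Join hits of two sub-queries separated by min_gap..max_gap tokens.

    Hits are (start, end) token offsets with an exclusive end.
    """
    results = []
    for l_start, l_end in left_hits:
        for r_start, r_end in right_hits:
            gap = r_start - l_end  # number of tokens between the two hits
            if min_gap <= gap <= max_gap:
                results.append((l_start, r_end))
    return results


# Hypothetical {Measurement} annotation spans in one document.
measurements = [(2, 4), (7, 9), (20, 22)]

# Roughly: {Measurement} [1..5] {Measurement}
pairs = sequence_with_gap(measurements, measurements, 1, 5)
```

Here only the first two measurements are close enough to be joined; the third is too far from either of them.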
Syntax: Query1 IN Query2
Example: London IN {Reference}

Syntax: Query1 OVER Query2
Example: {Reference} OVER London
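
The IN and OVER operators can be sketched as span-containment filters. This assumes that `Query1 IN Query2` returns the Query1 hits lying inside some Query2 hit, and that `Query1 OVER Query2` returns the Query1 hits that cover some Query2 hit. The helper names and sample offsets below are hypothetical.

```python
def contains(outer, inner):
    """True if span `inner` lies entirely within span `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def query_in(inner_hits, outer_hits):
    """Query1 IN Query2: keep Query1 hits that lie inside some Query2 hit."""
    return [i for i in inner_hits if any(contains(o, i) for o in outer_hits)]


def query_over(outer_hits, inner_hits):
    """Query1 OVER Query2: keep Query1 hits that cover some Query2 hit."""
    return [o for o in outer_hits if any(contains(o, i) for i in inner_hits)]


# Hypothetical hits as (start, end) token offsets, end exclusive.
london = [(3, 4), (15, 16)]
references = [(10, 18), (30, 35)]

in_hits = query_in(london, references)      # London IN {Reference}
over_hits = query_over(references, london)  # {Reference} OVER London
```

The two operators test the same containment relation and differ only in which side's spans are returned: IN yields the inner "London" span, OVER yields the enclosing Reference span.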
Syntax: indexName:term
Example: root:be [matches is, am, was, were, ...]

Syntax: Query +n, Query +n..m
Example: {Measurement}[2], category:JJ[1..3]
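
As a rough illustration of the two constructs above, the sketch below resolves an `indexName:term` query against a dictionary of per-feature indexes and then expands a set of matching positions into runs of n to m consecutive tokens. It assumes that repetition means adjacent matches; the functions `lookup` and `repeat` and the sample index contents are hypothetical, not MIMIR's API.

```python
def lookup(indexes, query):
    """Resolve 'indexName:term' to token positions; bare terms use 'string'."""
    index_name, sep, term = query.partition(":")
    if not sep:
        index_name, term = "string", query.lower()
    return indexes[index_name].get(term, [])


def repeat(positions, min_n, max_n):
    """Return (start, end) spans of min_n..max_n consecutive matching tokens."""
    present = set(positions)
    spans = []
    for start in sorted(present):
        length = 0
        while start + length in present and length < max_n:
            length += 1
            if length >= min_n:
                spans.append((start, start + length))
    return spans


# Hypothetical index contents (term -> token positions).
indexes = {
    "string": {"is": [1], "was": [6]},
    "root": {"be": [1, 6]},
    "category": {"JJ": [2, 3, 8]},
}

be_hits = lookup(indexes, "root:be")                            # "is" and "was"
adjective_runs = repeat(lookup(indexes, "category:JJ"), 1, 3)   # 1 to 3 adjectives
```

Querying the root index finds both "is" and "was" with a single term, which is the point of indexing the morphological root separately from the surface string.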
[Figure: bar chart of index size (times raw input); ANNIC appears among the labels. Data labels: 87.77, 103.04, 21.51, 6.9, 8.33, 0.82]
                     ANNIC        Mimir
Annotation features  All (+)      Only configured (-)
Hit details          Full (+)     Text only (-)
JAPE Compatible      Yes (+)      Partial (-)
Scalability          Poor (-)     Very Good (+)
Index Size           Large (-)    ~ Input (+)
Search Speed         Fair (-)     Fast (+)