MIMIR: Multi-paradigm Information Management Index and Repository


  1. MIMIR: Multi-paradigm Information Management Index and Repository
     Valentin Tablan, Niraj Aswani, Ian Roberts
     University of Sheffield

  2. University of Sheffield, NLP
     MIMIR
     □ … is an IR engine that can search over:
       ○ Text
       ○ Semantic Annotations
       ○ Ontologies and Knowledge Bases
       ...represented as GATE documents
     □ … is built on top of:
       ○ Ontotext ORDI
       ○ MG4J text indexing engine
     2009 GATE Summer School, Sheffield

  3. Semantic Annotation
     □ … is an annotation process where [parts of] the schema (annotation types, annotation features) are ontological objects.
     □ … is different from:
       ○ Ontology learning
       ○ Ontology population (though it sometimes includes it)

  4. Semantic Annotation

  5. Under the Hood
     [Diagram labels: Document Collection; Ontology + Knowledge Base; ...; Mentions Index; Token Index (x3)]

  6. A Mimir Configuration
     □ Text fields
       ○ string (the document text, downcased)
       ○ root (morphological root of each word)
       ○ category (part-of-speech of each word)
     □ Annotations
       ○ Measurement (indexed features: type, dimension)
       ○ Reference (indexed feature: type)
       ○ Section (indexed feature: type)
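The configuration on this slide can be written down as a small data structure. The following is only an illustrative sketch in Python: the dictionary layout and the helper `is_indexed` are hypothetical and do not reflect Mimir's actual configuration format. It shows why only configured features are searchable.

```python
# Hypothetical in-memory view of the slide's configuration.
# This is NOT Mimir's real configuration format, just an illustration.
MIMIR_CONFIG = {
    "text_fields": {
        "string": "the document text, downcased",
        "root": "morphological root of each word",
        "category": "part-of-speech of each word",
    },
    "annotations": {
        # annotation type -> features that get indexed
        "Measurement": ["type", "dimension"],
        "Reference": ["type"],
        "Section": ["type"],
    },
}


def is_indexed(annotation_type: str, feature: str) -> bool:
    """Return True if the feature is configured for indexing on that type."""
    return feature in MIMIR_CONFIG["annotations"].get(annotation_type, [])


print(is_indexed("Measurement", "dimension"))  # True
print(is_indexed("Reference", "dimension"))    # False
```

Under this sketch, a query that constrains an unconfigured feature (for example `dimension` on `Reference`) would have nothing in the index to match against.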

  7. Query Types (basic)
     □ Text. Matches plain text.
       Syntax: sequence of words
       Example: device for measurement of light intensity
     □ Annotation. Matches annotations.
       Syntax: {Type feature1=value1 feature2=value2...}
       Example: {Measurement type=scalarValue}
     □ Sequence Query. Sequence of other queries.
       Syntax: Query1 [n..m] Query2...
       Example: up to {Measurement} [1..5] {Measurement}
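To make the annotation and sequence syntax more concrete, here is a toy sketch in Python. It is not Mimir's implementation or API: the helpers `annotation_query` and `sequence` are hypothetical, hits are modelled as (start, end) token offsets with an exclusive end, and the gap `[n..m]` is assumed to mean "between n and m tokens separate the two hits".

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def annotation_query(ann_type: str, **features: str) -> str:
    """Format an annotation query using the slide's {Type f=v ...} syntax."""
    parts = [ann_type] + [f"{name}={value}" for name, value in features.items()]
    return "{" + " ".join(parts) + "}"


def sequence(left: List[Span], right: List[Span],
             min_gap: int = 0, max_gap: int = 0) -> List[Span]:
    """Combine hits of two queries when the right hit starts
    min_gap..max_gap tokens after the left hit ends (toy semantics)."""
    hits = []
    for left_start, left_end in left:
        for right_start, right_end in right:
            if min_gap <= right_start - left_end <= max_gap:
                hits.append((left_start, right_end))
    return sorted(hits)


print(annotation_query("Measurement", type="scalarValue"))
# {Measurement type=scalarValue}

# Made-up hit positions for {Measurement} in one document.
measurements = [(2, 4), (6, 8), (20, 22)]
print(sequence(measurements, measurements, min_gap=1, max_gap=5))
# [(2, 8)]
```

Only the first two measurements are close enough (two tokens apart) to form a hit for `{Measurement} [1..5] {Measurement}` in this toy data.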

  8. Query Types (inclusion)
     □ IN Query. Hits of one query only if in hits of another.
       Syntax: Query1 IN Query2
       Example: London IN {Reference}
     □ OVER Query. Hits of a query, only if overlapping hits of another.
       Syntax: Query1 OVER Query2
       Example: {Reference} OVER London
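The difference between IN and OVER is easiest to see on concrete spans. The sketch below is a toy model in Python, not Mimir code: `in_query` and `over_query` are hypothetical helpers, hits are (start, end) token offsets with an exclusive end, and "overlapping" is assumed to mean sharing at least one token position, following the slide's wording.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def in_query(inner: List[Span], outer: List[Span]) -> List[Span]:
    """Query1 IN Query2: keep hits of Query1 that lie fully inside a hit of Query2."""
    return [
        (s, e) for s, e in inner
        if any(os <= s and e <= oe for os, oe in outer)
    ]


def over_query(first: List[Span], second: List[Span]) -> List[Span]:
    """Query1 OVER Query2: keep hits of Query1 that overlap a hit of Query2
    (assumed here to mean the two spans share at least one token)."""
    return [
        (s, e) for s, e in first
        if any(s < oe and os < e for os, oe in second)
    ]


# Made-up positions: "London" occurs at tokens 5 and 40,
# and a {Reference} annotation covers tokens 3..8.
london = [(5, 6), (40, 41)]
reference = [(3, 9)]

print(in_query(london, reference))    # [(5, 6)]
print(over_query(reference, london))  # [(3, 9)]
```

Both queries select the same region of the document, but `London IN {Reference}` returns the word and `{Reference} OVER London` returns the whole annotation.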

  9. Query Types (advanced)
     □ Named Index. Search different text indexes.
       Syntax: indexName:term
       Example: root:be [matches is, am, was, were, ...]
     □ Kleene. Specified number of repeats.
       Syntax: Query +n, Query +n..m
       Example: {Measurement}[2], category:JJ[1..3]
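A small toy model can illustrate named indexes and repeats. This is a hypothetical Python sketch, not Mimir's implementation: each token carries the three text fields from the configuration slide, `search` looks a term up in one field, and `repeat` is assumed to return every run of adjacent matching tokens whose length falls within the requested bounds.

```python
from typing import List, Tuple

# Each token: (string, root, category). The sentence and tags are made up.
FIELDS = ("string", "root", "category")
TOKENS = [
    ("it", "it", "PRP"),
    ("was", "be", "VBD"),
    ("a", "a", "DT"),
    ("big", "big", "JJ"),
    ("old", "old", "JJ"),
    ("red", "red", "JJ"),
    ("house", "house", "NN"),
]


def search(tokens, index_name: str, term: str) -> List[int]:
    """indexName:term -> positions of tokens whose field equals the term."""
    field = FIELDS.index(index_name)
    return [i for i, tok in enumerate(tokens) if tok[field] == term]


def repeat(positions: List[int], min_count: int, max_count: int) -> List[Tuple[int, int]]:
    """Spans (end exclusive) made of min_count..max_count adjacent matches."""
    present = set(positions)
    spans = []
    for start in sorted(present):
        length = 0
        while start + length in present and length < max_count:
            length += 1
            if length >= min_count:
                spans.append((start, start + length))
    return spans


print(search(TOKENS, "root", "be"))      # [1]  ("was" has root "be")
adjectives = search(TOKENS, "category", "JJ")
print(adjectives)                        # [3, 4, 5]
print(repeat(adjectives, 1, 3))
# [(3, 4), (3, 5), (3, 6), (4, 5), (4, 6), (5, 6)]
```

The `root` lookup shows why `root:be` can match an inflected form such as "was", and the repeat over `category:JJ` returns every run of one to three adjacent adjectives.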

  10. MIMIR ancestry: ANNIC

  11. MIMIR v. ANNIC: Index size
      [Bar chart: Index Size (times raw input) for ANNIC and Mimir v. 0.1, v. 0.2, v. 0.3, v. 0.4, v. 1.0; bar values shown: 103.04, 87.77, 21.51, 8.33, 6.9, 0.82]

  12. MIMIR v. ANNIC: Features
                             ANNIC       Mimir
      Annotation features    All (+)     Only configured (-)
      Hit details            Full (+)    Text only (-)
      JAPE Compatible        Yes (+)     Partial (-)
      Scalability            Poor (-)    Very Good (+)
      Index Size             Large (-)   ~ Input (+)
      Search Speed           Fair (-)    Fast (+)

  13. DEMO!
