A Node Indexing Scheme for Web Entity Retrieval Renaud Delbru, Nickolai Toupikov, Michele Catasta, and Giovanni Tummarello Digital Enterprise Research Institute, Galway June 2, 2010

Introduction Tree Model Query Model Implementation Comparison Experimental Results Conclusion

Introduction

Web of Data
    Pages with semantic markup: RDF, RDFa, Microformats.
    Currently in the area of X00.000.000 pages with semantic markup.
    How to consume these data?
    Traditional search engines are ineffective;
    Shift from text document to data entity.

Semi-structured IR: node indexing scheme
    Technique from the XML IR world;
    Good compromise between query expressiveness, query processing time and update complexity.

SIREn (Semantic Information Retrieval Engine)
    Open-source implementation;
    At the core of the Sindice search engine.

From “Web” to “Web of Data”

Web of Data                 Web
Dataset - Entity            Document
Bag of RDF assertions       Bag of words
Semi-structured             Unstructured
Dataset - entity centric    Document centric

Entity Retrieval

Entity Retrieval
    Given an entity search query, find the most relevant entities (list of entities ordered by relevance).

Entity Search Query
    We aim to support three types of queries:
        full-text search: keyword-based queries when the data structure is unknown;
        structural query: complex queries specified in a star-shaped structure when the data schema is known;
        semi-structural query: combination of the two (where full-text search can be used on any part of the star-shaped query) when the data structure is partially known.
    Relevant subset of SPARQL;
    Matches well with IR.

Entity Retrieval: Star Query

Figure: (a) Visual representation of an RDF graph; (b) Star-shaped query. Oval nodes represent resources and rectangular ones represent literals.
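The three query types on the Entity Retrieval slide can be illustrated with a minimal sketch. This is not SIREn's implementation: the names `Arm`, `arm_matches` and `search`, the whitespace tokenisation, and the sample entities are all assumptions made for illustration, and the sketch only filters entities, it does not rank them by relevance.

```python
from typing import Dict, List, Optional, Tuple

# One arm of a star query: (predicate, keyword). None acts as a wildcard.
# This representation is an assumption for the sketch, not SIREn's query model.
Arm = Tuple[Optional[str], Optional[str]]
# An entity description: its outgoing (predicate, object) pairs.
Description = List[Tuple[str, str]]


def arm_matches(arm: Arm, description: Description) -> bool:
    """True if at least one (predicate, object) pair satisfies the arm."""
    q_pred, q_kw = arm
    for pred, obj in description:
        pred_ok = q_pred is None or q_pred == pred
        # Keyword match on whitespace-separated, lower-cased tokens.
        kw_ok = q_kw is None or q_kw.lower() in obj.lower().split()
        if pred_ok and kw_ok:
            return True
    return False


def search(entities: Dict[str, Description], query: List[Arm]) -> List[str]:
    """Return the identifiers of entities that satisfy every arm of the query."""
    return [eid for eid, desc in entities.items()
            if all(arm_matches(arm, desc) for arm in query)]


# Hypothetical data, not taken from the slides.
entities = {
    "_:b1": [("rdf:type", "foaf:Person"), ("foaf:name", "Alice Smith")],
    "_:b2": [("rdf:type", "foaf:Person"), ("foaf:name", "Bob Jones")],
}

# Full-text search: the predicate is unknown, only a keyword is given.
full_text = search(entities, [(None, "alice")])
# Structural query: both predicate and value are known.
structural = search(entities, [("rdf:type", "foaf:Person")])
# Semi-structural query: a known arm combined with a keyword arm.
semi_structural = search(entities, [("rdf:type", "foaf:Person"), ("foaf:name", "jones")])
```

Each arm of the star is checked independently against the same entity, which is what makes the query star-shaped: all arms share the entity as their centre.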

Outline: Tree Model

Tree Model
    Conceptual Model
    Node-Labelled Tree: Model
    Node-Labelled Tree: Example

Conceptual Model

Figure: Conceptual representation of the node-labelled tree model

Node-Labelled Tree: Model

Origin: Semi-structured information retrieval, more recently XML retrieval.
Goal: Encode relationships between nodes.
Operators: Parent-Child and Ancestor-Descendant (as in XPath).
Requirement: Assign unique identifiers (node labels) that encode relationships between the nodes.
Solution: Node labelling scheme (e.g., Dewey Encoding).

Node-Labelled Tree: Example

Figure: Node-labelled tree using Dewey’s encoding
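The Dewey encoding named on the Node-Labelled Tree slides can be sketched as follows. In a Dewey label, each node's identifier is its parent's identifier extended by one component, so both the Parent-Child and the Ancestor-Descendant operators reduce to prefix tests on the labels. The function names and the dotted string form ("1.2.3") are assumptions for this sketch, not taken from the slides.

```python
from typing import Tuple

# A Dewey label as a tuple of integers, e.g. "1.2.3" -> (1, 2, 3).
Label = Tuple[int, ...]


def parse(label: str) -> Label:
    """Parse a dotted Dewey label such as '1.2.3'."""
    return tuple(int(part) for part in label.split("."))


def is_ancestor(ancestor: Label, descendant: Label) -> bool:
    """Ancestor-Descendant: the ancestor label is a strict prefix of the descendant label."""
    return len(ancestor) < len(descendant) and descendant[:len(ancestor)] == ancestor


def is_parent(parent: Label, child: Label) -> bool:
    """Parent-Child: the child label is the parent label plus exactly one component."""
    return len(child) == len(parent) + 1 and child[:-1] == parent


# Example relationships on hypothetical labels.
parent_ok = is_parent(parse("1.2"), parse("1.2.3"))        # direct child
ancestor_ok = is_ancestor(parse("1"), parse("1.2.3"))      # two levels up
not_parent = is_parent(parse("1"), parse("1.2.3"))         # ancestor, but not parent
other_branch = is_ancestor(parse("1.2"), parse("1.3.1"))   # different subtree
```

Because the relationship is decided from the two labels alone, no tree traversal is needed to evaluate a structural operator.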

Outline: Query Model

Query Model
    Operator Overview
    Structure Operators

Operator Overview

Content Operators
    Orthogonal to the structure operators;
    Atomic search element: keyword;
    Boolean operators (intersection, union, difference), proximity operators (phrase, etc.), ...
    Allow composing complex keyword queries to retrieve nodes.

Structure Operators
    Atomic search element: node;
    Allow composing path queries to retrieve quads;
    Allow combining quads.
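The split between content operators and structure operators can be illustrated with a small sketch. It assumes a toy inverted index from keywords to sets of Dewey-style node labels. Content operators are then plain set operations on the nodes retrieved per keyword, and a structure operator joins two node sets through their labels. The names `build_index`, `lookup` and `child_of`, the three-level labels, and the sample nodes are hypothetical. Proximity operators and the actual SIREn index layout are not covered.

```python
from typing import Dict, Set, Tuple

Label = Tuple[int, ...]          # Dewey-style node label
Index = Dict[str, Set[Label]]    # keyword -> labels of the nodes containing it


def build_index(nodes: Dict[Label, str]) -> Index:
    """Index every whitespace-separated, lower-cased token of every node."""
    index: Index = {}
    for label, text in nodes.items():
        for token in text.lower().split():
            index.setdefault(token, set()).add(label)
    return index


def lookup(index: Index, keyword: str) -> Set[Label]:
    """Atomic content search: the set of nodes containing the keyword."""
    return index.get(keyword.lower(), set())


def child_of(parents: Set[Label], children: Set[Label]) -> Set[Label]:
    """Structure operator: keep the child nodes whose parent label is in `parents`."""
    return {child for child in children if child[:-1] in parents}


# Hypothetical nodes: (entity, attribute) and (entity, attribute, value) labels.
nodes = {
    (1, 1): "name",
    (1, 1, 1): "alice smith",
    (1, 2): "workplace",
    (1, 2, 1): "deri galway",
    (2, 1): "name",
    (2, 1, 1): "deri",
}
index = build_index(nodes)

# Content operators: Boolean combinations of keyword lookups.
both = lookup(index, "deri") & lookup(index, "galway")       # intersection
either = lookup(index, "alice") | lookup(index, "galway")    # union
without = lookup(index, "deri") - lookup(index, "galway")    # difference

# Structure operator: value nodes containing "deri" under a "name" attribute.
named_deri = child_of(lookup(index, "name"), lookup(index, "deri"))
```

The content operators never look at the labels and the structure operator never looks at the keywords, which is one way to read the slide's statement that the two kinds of operators are orthogonal.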
