Landmark indexing for evaluation of label-constrained reachability queries
Lucien Valstar†, George Fletcher†, Yuichi Yoshida‡
† TU Eindhoven (Netherlands), ‡ National Institute of Informatics and Preferred Infrastructure, Inc. (Japan)
SIGMOD 2017
Chicago, 16 May 2017

Labeled networks
Big graph data sets are ubiquitous
◮ social networks (e.g., LinkedIn, Facebook)
◮ scientific networks (e.g., Uniprot, PubChem)
◮ knowledge graphs (e.g., DBPedia, MS Academic Graph)
◮ transportation and utility networks
◮ ...
[Figure: example labeled graph on vertices $v_1, \dots, v_5$ with edge labels friendOf, likes, and follows]
Focus is on “things” (i.e., nodes, vertices) and their relationships (i.e., labeled directed edges)
Landmark indexing for LCR query evaluation (SIGMOD 2017, Chicago) Valstar, Fletcher, and Yoshida

Label-constrained reachability queries on networks
We study Label-Constrained Reachability (LCR) Queries on networks:
Given vertices $s$ and $t$ of labeled graph $G$ and a subset $L$ of the set of all edge labels $\mathcal{L}$ of $G$, determine whether or not there is a path from $s$ to $t$ using only edges with labels in $L$.
When such a path exists, we denote this by $s \overset{L}{\rightsquigarrow} t$.

Label-constrained reachability queries on networks
[Figure: the example graph on $v_1, \dots, v_5$ with edge labels friendOf, likes, and follows]
Example.
The query $(v_1, v_5, \{\text{friendOf}\})$ is true.
The query $(v_1, v_3, \{\text{friendOf}\})$ is false.
LCR Queries
◮ Natural generalization of reachability queries.
◮ An important fragment of the language of regular path queries.
◮ Implemented in W3C’s SPARQL 1.1, Neo4j’s Cypher, and Oracle’s PGQL.
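
The LCR definition can be illustrated with a plain breadth-first search that ignores every edge whose label is outside the query's label set. The following is a minimal sketch, not the authors' implementation. The edge list `EDGES` is an assumption: the figure did not survive extraction, so the edges were chosen only to be consistent with the worked examples on the slides. The names `build_adjacency` and `lcr_query` are hypothetical.

```python
from collections import defaultdict, deque

# Assumed edge list (source, label, target). The original figure is not
# available, so this graph is a guess that agrees with the slide examples.
EDGES = [
    ("v1", "friendOf", "v2"),
    ("v1", "friendOf", "v4"),
    ("v2", "friendOf", "v5"),
    ("v2", "likes", "v3"),
    ("v5", "likes", "v3"),
    ("v5", "follows", "v4"),
    ("v4", "follows", "v5"),
]


def build_adjacency(edges):
    """Map each vertex to its outgoing (label, target) pairs."""
    adj = defaultdict(list)
    for src, label, dst in edges:
        adj[src].append((label, dst))
    return adj


def lcr_query(adj, s, t, allowed):
    """Return True if t is reachable from s using only edges labeled in `allowed`."""
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for label, w in adj.get(u, ()):
            # Skip edges whose label is not in the query's label set.
            if label in allowed and w not in seen:
                seen.add(w)
                queue.append(w)
    return False


ADJ = build_adjacency(EDGES)
print(lcr_query(ADJ, "v1", "v5", {"friendOf"}))
print(lcr_query(ADJ, "v1", "v3", {"friendOf"}))
```

On the assumed graph, the first call corresponds to the true query from the example and the second to the false one. This index-free search is the baseline that the indexing methods aim to accelerate.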

LCR queries: current evaluation solutions
Despite the importance of LCR queries, current solutions do not scale to large graphs occurring in practice.
There are two approaches to solving LCR queries:
exhaustive search using state-of-the-art methods such as direction-optimizing BFS (DBFS)
◮ Beamer et al. Scientific Programming 21, 2013
or graph indexing for accelerated search
◮ Jin et al. SIGMOD 2010
◮ Bonchi et al. EDBT, 2014
◮ Zou et al. Information Systems 40, 2014.

LCR queries: our contributions
Our contributions. New indexing methods for LCR queries exploiting landmark vertices.
◮ Scales to orders of magnitude larger graphs than current indexing methods.
◮ Up to orders of magnitude faster query evaluation than current solutions.
◮ Our implementation is publicly available as open-source at https://github.com/DeLaChance/LCR

Landmark indexing for LCR: naive solution
Naive Idea (Full-LI)
Given a graph $(V, E, \mathcal{L})$, for each vertex $v \in V$, store in an index the pairs $\{ (w, L) \mid w \in V,\ L \subseteq \mathcal{L},\ \text{and } v \overset{L}{\rightsquigarrow} w \}$.
Given a query $(s, t, L)$, just check whether or not $(t, L)$ is in the index for $s$.

Landmark indexing for LCR: naive solution
[Figure: the example graph on $v_1, \dots, v_5$ with edge labels friendOf, likes, and follows]
Example. The Full-LI index entry for $v_2$:
$(v_3, \{\text{likes}\})$,
$(v_3, \{\text{friendOf}, \text{likes}\})$,
$(v_3, \{\text{friendOf}, \text{follows}, \text{likes}\})$,
$(v_4, \{\text{friendOf}, \text{follows}\})$,
$(v_5, \{\text{friendOf}\})$,
$(v_5, \{\text{friendOf}, \text{follows}\})$.
Naive Idea (Full-LI)
◮ Excellent query performance.
◮ Does not scale to large graphs.
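
To make the cost of the naive idea concrete, here is a brute-force sketch that materializes a full index by running one label-restricted search per vertex and per label subset, then answers queries by set membership. It is an illustration under assumptions, not the authors' construction. The edge list is a guess consistent with the slide examples because the figure was lost, and `reachable`, `build_full_index`, and `full_li_query` are hypothetical names.

```python
from collections import defaultdict, deque
from itertools import combinations

# Assumed edge list (source, label, target); chosen to agree with the slides.
EDGES = [
    ("v1", "friendOf", "v2"),
    ("v1", "friendOf", "v4"),
    ("v2", "friendOf", "v5"),
    ("v2", "likes", "v3"),
    ("v5", "likes", "v3"),
    ("v5", "follows", "v4"),
    ("v4", "follows", "v5"),
]
LABELS = sorted({label for _, label, _ in EDGES})
VERTICES = sorted({x for s, _, t in EDGES for x in (s, t)})


def reachable(adj, v, allowed):
    """Vertices reachable from v by a non-empty path over `allowed` labels."""
    seen, queue = set(), deque([v])
    while queue:
        u = queue.popleft()
        for label, w in adj.get(u, ()):
            if label in allowed and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def build_full_index(edges):
    """Store, per vertex v, every pair (w, L) such that v reaches w under L."""
    adj = defaultdict(list)
    for s, label, t in edges:
        adj[s].append((label, t))
    index = defaultdict(set)
    for v in VERTICES:
        # One search per label subset: exponential in the number of labels.
        for k in range(len(LABELS) + 1):
            for subset in combinations(LABELS, k):
                L = frozenset(subset)
                for w in reachable(adj, v, L):
                    index[v].add((w, L))
    return index


def full_li_query(index, s, t, allowed):
    """Answer a query by a single membership test in the index of s."""
    return s == t or (t, frozenset(allowed)) in index[s]


FULL = build_full_index(EDGES)
print(full_li_query(FULL, "v1", "v5", {"friendOf"}))
print(full_li_query(FULL, "v1", "v3", {"friendOf"}))
print(sum(len(entries) for entries in FULL.values()), "stored pairs")
```

The lookup is a single hash probe, which matches the "excellent query performance" remark, while the nested loop over all vertices and all label subsets shows why the approach cannot scale.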

Landmark indexing for LCR: selective landmarking
Landmark Index (LI)
Only build indexes for a select small number of landmark vertices
◮ e.g., top $k$ vertices of highest degree
Furthermore, only store entries $(w, L)$ such that $L$ is a minimal label set connecting $v$ to $w$
◮ that is, there is no $L'$ strictly contained in $L$ such that $v \overset{L'}{\rightsquigarrow} w$.
Given a query $(s, t, L)$, perform BFS from $s$ only using edges with labels in $L$. When we hit a landmark vertex, we use its index to obtain the answer immediately.

Landmark indexing for LCR: selective landmarking
[Figure: the example graph on $v_1, \dots, v_5$ with edge labels friendOf, likes, and follows]
Example. The LI index entry for $v_2$:
$(v_3, \{\text{likes}\})$,
$(v_4, \{\text{friendOf}, \text{follows}\})$,
$(v_5, \{\text{friendOf}\})$.
Half as many entries as Full-LI entry for $v_2$.
Landmark index (LI)
◮ Balances space/time.
◮ Can significantly reduce index size.
◮ Still obtain the benefits of accelerated search.
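
The two ingredients of LI, minimal label sets for landmarks and a BFS that consults a landmark's index when it reaches one, can be sketched as follows. This is a brute-force illustration under assumptions rather than the authors' algorithm: the edge list is a guess consistent with the slides, the landmark $v_2$ is picked by hand to match the example instead of by degree, and `minimal_entries` and `li_query` are hypothetical names. Treating a landmark with no matching entry as a dead end is this sketch's reading of "obtain the answer immediately".

```python
from collections import defaultdict, deque
from itertools import combinations

# Assumed edge list (source, label, target); chosen to agree with the slides.
EDGES = [
    ("v1", "friendOf", "v2"),
    ("v1", "friendOf", "v4"),
    ("v2", "friendOf", "v5"),
    ("v2", "likes", "v3"),
    ("v5", "likes", "v3"),
    ("v5", "follows", "v4"),
    ("v4", "follows", "v5"),
]


def build_adjacency(edges):
    adj = defaultdict(list)
    for src, label, dst in edges:
        adj[src].append((label, dst))
    return adj


def reachable(adj, v, allowed):
    """Vertices reachable from v by a non-empty path over `allowed` labels."""
    seen, queue = set(), deque([v])
    while queue:
        u = queue.popleft()
        for label, w in adj.get(u, ()):
            if label in allowed and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def minimal_entries(adj, labels, v):
    """Map each reachable w to the minimal label sets connecting v to w."""
    entries = defaultdict(list)
    # Subsets are tried by increasing size, so any smaller connecting set
    # has already been recorded (or is dominated by a recorded one).
    for k in range(1, len(labels) + 1):
        for subset in combinations(sorted(labels), k):
            L = frozenset(subset)
            for w in reachable(adj, v, L):
                if not any(m <= L for m in entries[w]):
                    entries[w].append(L)
    return entries


def li_query(adj, index, s, t, allowed):
    """BFS over `allowed` labels that defers to a landmark's index when hit."""
    allowed = frozenset(allowed)
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        if u in index:
            # The landmark's entries are complete, so they decide u -> t.
            if any(m <= allowed for m in index[u].get(t, ())):
                return True
            continue  # u cannot reach t; no need to expand it further
        for label, w in adj.get(u, ()):
            if label in allowed and w not in seen:
                seen.add(w)
                queue.append(w)
    return False


ADJ = build_adjacency(EDGES)
LABELS = {label for _, label, _ in EDGES}
INDEX = {"v2": minimal_entries(ADJ, LABELS, "v2")}  # hand-picked landmark
print(dict(INDEX["v2"]))
print(li_query(ADJ, INDEX, "v1", "v3", {"friendOf", "likes"}))
```

On the assumed graph, the entries computed for $v_2$ are the three minimal pairs from the example, and the query from $v_1$ stops as soon as the search reaches the landmark.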

Landmark indexing for LCR: extended indexing
Extended Landmark Index (LI+)
Two extensions to make LI more efficient.
(1) It may take a long time before finding a landmark. We can remedy this by building an incomplete index for non-landmarks: for each non-landmark $v$, we insert a small number of entries $(v', L)$ where $v'$ is a landmark and $v \overset{L}{\rightsquigarrow} v'$. These provide shortcuts to landmarks during search.
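
One way to picture extension (1) is a per-vertex list of at most `budget` shortcut entries that lets the search consult a landmark before reaching it edge by edge. The sketch below is only an illustration under assumptions: the edge list is a guess consistent with the slides, the landmark $v_5$ is chosen because it has the highest degree in that assumed graph, the shortcut entries are found by brute force and then truncated, and `build_shortcuts`, `landmark_reaches`, `li_plus_query`, and `budget` are hypothetical names.

```python
from collections import defaultdict, deque
from itertools import combinations

# Assumed edge list (source, label, target); chosen to agree with the slides.
EDGES = [
    ("v1", "friendOf", "v2"), ("v1", "friendOf", "v4"),
    ("v2", "friendOf", "v5"), ("v2", "likes", "v3"),
    ("v5", "likes", "v3"), ("v5", "follows", "v4"),
    ("v4", "follows", "v5"),
]


def build_adjacency(edges):
    adj = defaultdict(list)
    for src, label, dst in edges:
        adj[src].append((label, dst))
    return adj


def reachable(adj, v, allowed):
    """Vertices reachable from v by a non-empty path over `allowed` labels."""
    seen, queue = set(), deque([v])
    while queue:
        u = queue.popleft()
        for label, w in adj.get(u, ()):
            if label in allowed and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def minimal_entries(adj, labels, v):
    """Map each reachable w to the minimal label sets connecting v to w."""
    entries = defaultdict(list)
    for k in range(1, len(labels) + 1):
        for subset in combinations(sorted(labels), k):
            L = frozenset(subset)
            for w in reachable(adj, v, L):
                if not any(m <= L for m in entries[w]):
                    entries[w].append(L)
    return entries


def build_shortcuts(adj, labels, vertices, landmarks, budget):
    """For each non-landmark u, keep at most `budget` entries (landmark, L)."""
    shortcuts = {}
    for u in vertices:
        if u in landmarks:
            continue
        # Brute force for clarity; a real index would not compute all entries.
        entries = minimal_entries(adj, labels, u)
        found = [(v, L) for v in sorted(landmarks) for L in entries.get(v, ())]
        shortcuts[u] = found[:budget]
    return shortcuts


def landmark_reaches(index, v, t, allowed):
    return v == t or any(m <= allowed for m in index[v].get(t, ()))


def li_plus_query(adj, index, shortcuts, s, t, allowed):
    allowed = frozenset(allowed)
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        if u in index:
            if landmark_reaches(index, u, t, allowed):
                return True
            continue  # a landmark that cannot reach t is a dead end
        # Shortcut: consult landmarks known to be reachable from u under `allowed`.
        for v, L in shortcuts.get(u, ()):
            if L <= allowed and v not in seen:
                seen.add(v)  # decided here, so it is never expanded later
                if landmark_reaches(index, v, t, allowed):
                    return True
        for label, w in adj.get(u, ()):
            if label in allowed and w not in seen:
                seen.add(w)
                queue.append(w)
    return False


ADJ = build_adjacency(EDGES)
LABELS = {label for _, label, _ in EDGES}
VERTICES = sorted({x for s, _, t in EDGES for x in (s, t)})
LANDMARKS = {"v5"}
INDEX = {v: minimal_entries(ADJ, LABELS, v) for v in LANDMARKS}
SHORTCUTS = build_shortcuts(ADJ, LABELS, VERTICES, LANDMARKS, budget=2)
print(SHORTCUTS["v1"])
print(li_plus_query(ADJ, INDEX, SHORTCUTS, "v1", "v3", {"friendOf", "likes"}))
```

In this sketch the query from $v_1$ is answered at the first dequeued vertex, because the shortcut to $v_5$ is usable under the query's label set and the landmark's own entries settle the rest.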

Landmark indexing for LCR: extended indexing
Extended Landmark Index (LI+)
Two extensions to make LI more efficient.
(2) There is a strong asymmetry in evaluation of true- and false-queries. A true-query can stop after finding a landmark, whereas a false-query often needs to explore larger parts of the graph.
To remedy this, we can maintain for each landmark $v$ and label set $L$ the “reachable-by” set $R_L(v) = \{ w \in V \mid v \overset{L}{\rightsquigarrow} w \}$.

Landmark indexing for LCR: extended indexing
[Figure: the example graph on $v_1, \dots, v_5$ with edge labels friendOf, likes, and follows]
Example. $R_{\{\text{friendOf}\}}(v_1) = \{ v_2, v_4, v_5 \}$.

Landmark indexing for LCR: extended indexing
Extended Landmark Index (LI+)
Two extensions to make LI more efficient.
(2, cont.) During evaluation of query $(s, t, L)$, suppose we have found $s \overset{L}{\rightsquigarrow} v$ and $v \overset{L}{\not\rightsquigarrow} t$, for some landmark $v$.
Then, for every $w \in R_L(v)$, we must have $w \overset{L}{\not\rightsquigarrow} t$.
Hence, we can mark and never visit any vertex of $R_L(v)$ during the rest of the search.
For practical purposes, we only keep a subset of the $R_L(v)$’s, and only for relatively small label sets.
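
The pruning argument of extension (2) can be sketched by precomputing $R_L(v)$ for small label sets and marking those vertices as visited whenever a landmark fails to reach the target. The code is an illustration under assumptions, not the authors' implementation: the edge list is a guess consistent with the slides, $v_1$ is used as the landmark to match the example, and `reachable_by_sets`, `query_with_pruning`, and `max_size` are hypothetical names. The sketch also prunes with any stored $R_{L'}(v)$ where $L' \subseteq L$, which is safe because every vertex of $R_{L'}(v)$ is then also reachable from $v$ under $L$.

```python
from collections import defaultdict, deque
from itertools import combinations

# Assumed edge list (source, label, target); chosen to agree with the slides.
EDGES = [
    ("v1", "friendOf", "v2"), ("v1", "friendOf", "v4"),
    ("v2", "friendOf", "v5"), ("v2", "likes", "v3"),
    ("v5", "likes", "v3"), ("v5", "follows", "v4"),
    ("v4", "follows", "v5"),
]


def build_adjacency(edges):
    adj = defaultdict(list)
    for src, label, dst in edges:
        adj[src].append((label, dst))
    return adj


def reachable(adj, v, allowed):
    """Vertices reachable from v by a non-empty path over `allowed` labels."""
    seen, queue = set(), deque([v])
    while queue:
        u = queue.popleft()
        for label, w in adj.get(u, ()):
            if label in allowed and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def minimal_entries(adj, labels, v):
    """Map each reachable w to the minimal label sets connecting v to w."""
    entries = defaultdict(list)
    for k in range(1, len(labels) + 1):
        for subset in combinations(sorted(labels), k):
            L = frozenset(subset)
            for w in reachable(adj, v, L):
                if not any(m <= L for m in entries[w]):
                    entries[w].append(L)
    return entries


def reachable_by_sets(adj, labels, v, max_size):
    """R_L(v) for every label set L with at most `max_size` labels."""
    return {
        frozenset(subset): reachable(adj, v, frozenset(subset))
        for k in range(1, max_size + 1)
        for subset in combinations(sorted(labels), k)
    }


def query_with_pruning(adj, index, r_sets, s, t, allowed):
    allowed = frozenset(allowed)
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        if u in index:
            if any(m <= allowed for m in index[u].get(t, ())):
                return True
            # u cannot reach t, so nothing reachable from u can either:
            # mark those vertices so the search never adds them.
            for L, members in r_sets[u].items():
                if L <= allowed:
                    seen |= members
            continue
        for label, w in adj.get(u, ()):
            if label in allowed and w not in seen:
                seen.add(w)
                queue.append(w)
    return False


ADJ = build_adjacency(EDGES)
LABELS = {label for _, label, _ in EDGES}
LANDMARKS = ["v1"]  # hand-picked to match the slide example
INDEX = {v: minimal_entries(ADJ, LABELS, v) for v in LANDMARKS}
R_SETS = {v: reachable_by_sets(ADJ, LABELS, v, max_size=1) for v in LANDMARKS}
print(sorted(R_SETS["v1"][frozenset({"friendOf"})]))
print(query_with_pruning(ADJ, INDEX, R_SETS, "v1", "v3", {"friendOf"}))
```

Limiting `max_size` mirrors the practical remark that reachable-by sets are kept only for relatively small label sets, since the number of candidate sets grows quickly with the number of labels.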
