
Multi-Attribute Range Queries on Read-Only DHT
Verdi March, Yong Meng Teo
Department of Computer Science, National University of Singapore
Email: [verdimar,teoym]@comp.nus.edu.sg
ICCCN 2006, 11 September 2006

Outline
• Introduction to R-DHT
• Problem Statement
• Related Works
• Midas
  • Indexing
  • Range-Query Optimizations
• Analysis
• Conclusion

Introduction
Goal: provide a lookup service in large distributed systems with
• Minimum dependency on a third-party infrastructure
• Effective: result guarantee (minimize false negatives)
• Efficient: short, bounded lookup path length; scalable in the number of nodes
DHT: a distributed implementation of the hash-table abstraction, i.e. ‹key, value›, get(key), and put(key, value)
• Distributed file systems (CFS, PAST)
• Multicast (Scribe)
• RSS distribution (Corona, FeedTree)
• Grid resource discovery (DGRID, MAAN, Self-Organizing Condor, RIC, XenoSearch)

DHT Lookups
• User: look up key k
• DHT: walk along a path in a certain direction
• User: I’ve walked 10 steps, and I haven’t seen k
• DHT: continue for another 10 steps
• …
• User: I’ve been walking for a total of 50 steps
• DHT: look around. If k is not around, then k does not exist

DHT Concepts
• Map keys to nodes: keys (and values) are stored at the responsible nodes
  • Node = bucket
  • Locating a key is equivalent to locating the responsible node
  • Data items are distributed across the overlay network, and this distribution is controlled by the hash function
  • Higher result guarantee
• Structured overlay network: topology + node ordering
  • Routing to a node takes a short, bounded path length
  • Each node maintains a small amount of routing state
  • Scalability
• Nodes under different administrative domains (e.g. commercial organizations):
  • Ownership: don’t proactively “push” data
  • Self-interest to protect investment
[Figure: ring overlay labeled with identifiers 10, 14, 21, 38, 54, 55 and 56]
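The key-to-node mapping described on this slide can be sketched in a few lines. This is a minimal illustration, assuming a Chord-style rule in which a key is stored at its successor (the first node identifier at or after the key, wrapping around the ring). The split of the figure's labels into nodes (10, 21, 38, 56) and keys (14, 54, 55), the 6-bit identifier space, and the function name `successor` are all assumptions made for the example.

```python
import bisect

def successor(node_ids, key, m=6):
    """Return the node responsible for `key` on a ring of 2**m identifiers.

    Assumed rule (Chord-style): the responsible node is the first node
    whose identifier is >= key, wrapping around to the smallest node.
    """
    ring = sorted(node_ids)
    key %= 2 ** m
    i = bisect.bisect_left(ring, key)
    return ring[i % len(ring)]

# Hypothetical reading of the figure: these four labels are nodes.
nodes = [10, 21, 38, 56]

for key in (14, 54, 55, 60):
    print(f"key {key} -> node {successor(nodes, key)}")
```

Locating a key then reduces to locating its successor node, which is what the routing layer does in a bounded number of hops.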

R-DHT Framework
• A class of DHT
• Framework to turn an existing DHT into a read-only version

Hash-Table Abstraction | Conventional DHT | R-DHT
Store                  | Yes              | No
Lookup                 | Yes              | Yes

• No distribution of key-value pairs
  • Each node stores only its own key-value pairs (data items)
  • Keys are mapped to their original location

R-DHT
• A node identifier k|h is the m-bit key k followed by the m-bit host identifier h
• Lookup is O(log N) hops, similar to Chord
  • N = # hosts
[Figure: hosts are virtualized into nodes k|h (2|3, 9|3, 5|6, 2|9, 5|9, 9|9) and organized into an R-Chord ring with segments S2, S5 and S9]
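The k|h identifier can be illustrated with plain bit operations. The sketch below assumes m = 4 and takes the six nodes from the figure as read here (hosts 3, 6 and 9); the helper names `node_id`, `split` and `hosts_for_key` are hypothetical. The lookup is a linear scan for clarity and does not model the O(log N) overlay routing.

```python
M = 4  # bits per key and per host identifier (assumed small value)

def node_id(key, host, m=M):
    """Concatenate an m-bit key and an m-bit host id into a 2m-bit node id."""
    return (key << m) | host

def split(nid, m=M):
    """Recover (key, host) from a 2m-bit node id."""
    return nid >> m, nid & ((1 << m) - 1)

def hosts_for_key(ring, key, m=M):
    """Hosts that own `key`: every node whose id has `key` as its prefix."""
    low = key << m
    high = low | ((1 << m) - 1)
    return sorted(split(n, m)[1] for n in ring if low <= n <= high)

# Nodes as read from the figure: 2|3, 9|3, 5|6, 2|9, 5|9, 9|9.
ring = [node_id(k, h) for k, h in [(2, 3), (9, 3), (5, 6), (2, 9), (5, 9), (9, 9)]]

print(hosts_for_key(ring, 2))  # hosts sharing key 2
print(hosts_for_key(ring, 7))  # no host shares key 7
```

Because all nodes with the same key prefix are adjacent in the 2m-bit space, a lookup for key k only needs to reach the segment of identifiers that start with k.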

R-DHT Example
• Administrative domain 3 (an MDS host) shares resource types 2 and 9, so its key set is T3 = {2, 9}
• Virtualize: host 3 becomes nodes 2|3 and 9|3
• Organize: the nodes join the Chord-based R-DHT overlay
• R-DHT terminology: hosts and keys use the m-bit identifier space; nodes use the 2m-bit identifier space

Outline: Problem Statement

Multi-Attribute Resources
• The basic lookup operation in a DHT supports only exact queries
  • lookup(3) searches for resource type 3
• Efficient multi-attribute range queries in DHTs are a topic of ongoing research
  • A resource type is described by d attributes, e.g. cpu and ram
  • A multi-attribute range query:
    • Find resources where {cpu = *, ‘1 GB’ ≤ ram ≤ ‘2 GB’}
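To make the query semantics concrete, here is a small centralized sketch of the example query {cpu = *, 1 GB ≤ ram ≤ 2 GB}. It only shows what a multi-attribute range query selects, not how a DHT answers it. The resource list, the `ram_gb` field and the `matches` helper are invented for illustration; a wildcard is represented by `None`.

```python
# Hypothetical resource types, each described by d = 2 attributes.
resources = [
    {"cpu": "P3", "ram_gb": 1},
    {"cpu": "P4", "ram_gb": 2},
    {"cpu": "P4", "ram_gb": 4},
]

def matches(resource, query):
    """True if every attribute lies in its [low, high] range.

    A bound of None stands for the wildcard '*'.
    """
    for attr, bounds in query.items():
        if bounds is None:
            continue
        low, high = bounds
        if not (low <= resource[attr] <= high):
            return False
    return True

query = {"cpu": None, "ram_gb": (1, 2)}  # cpu = *, 1 GB <= ram <= 2 GB
answers = [r for r in resources if matches(r, query)]
print(answers)
```

An exact query is the special case where low equals high for every attribute.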

Modeling Multi-Attribute Resource
• We index resources by their type (the d attributes)
• A d-attribute resource type is modeled in a d-dimensional attribute space
  • Dimension: attribute
  • Point: resource type (≥ 1 resource instances)
[Figure: 2-dimensional attribute space with axes cpu (P3, P4) and ram (1 GB, 2 GB)]

Proposed Scheme
• Objective: efficient searching through multi-dimensional indexing on top of R-DHT to answer multi-attribute range queries
  • Find {cpu = ‘P3’, ‘1 GB’ ≤ ram ≤ ‘2 GB’}
• Our approach, Midas, is based on a d-to-one mapping scheme
  • Multi-dimensional indexing of resource types
  • Search strategy to efficiently retrieve answers

Contribution
• The Midas scheme to support multi-attribute range queries on R-DHT
• A study of how data-item distribution affects the performance of multi-attribute range queries

Outline: Related Works

Related Works (1)
[Figure: taxonomy of schemes for indexing a d-attribute resource type]
• Schemes: distributed inverted index, d-to-d mapping, d-to-one mapping
• 1-dimensional DHT: ring (Chord, Pastry), tree (Kademlia)
• d-dimensional DHT: d-dimensional torus (CAN)

Related Works (2)
• Distributed inverted index
  • MAAN (Cai et al., 2004), CANDy (Bauer et al., 2004), Harren 2002, KSS (Gnawali 2002), and MLP (Shi et al., 2004)
• d-to-d mapping
  • pSearch (Tang et al., 2003), MURK (Ganesan et al., 2004), and 2CAN (Agrawal et al., 2005)
• d-to-one mapping
  • Squid (Schmidt et al., 2003), CONE (Agrawal et al., 2005), ZNet (Shu et al., 2005), SCRAP (Ganesan et al., 2004), and CISS (Lee et al., 2004)

Distributed Inverted Index (1)
• Resource R = {cpu = ‘P3’, ram = ‘1 GB’}
• Order-preserving hashing: h(‘P3’) = 1, h(‘1 GB’) = 30
• Indexing: store each key to the DHT
[Figure: ring with identifiers 1, 30 and 56; R is stored under keys 1 and 30]

Distributed Inverted Index (2)
Find resources where {cpu = ‘P3’, ram = ‘1 GB’}, with h(‘P3’) = 1 and h(‘1 GB’) = 30
[Figure: three ways of evaluating the query on the ring with identifiers 1, 30 and 56]
• RS1 = σ(cpu = P3) and RS2 = σ(ram = 1 GB) are retrieved separately; the answer is RS1 ∩ RS2
• RS1 = σ(cpu = P3), then RS2 = RS1 ∩ σ(ram = 1 GB)
• RS = σ(cpu = P3) ∩ σ(ram = 1 GB)
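The inverted-index approach from these two slides can be sketched with a dictionary standing in for the DHT. Only h('P3') = 1 and h('1 GB') = 30 come from the slides; the other hash values, the resource names R2 and R3, and the helpers `put_resource` and `find` are assumptions for illustration. The sketch corresponds to the first variant, where one result set per attribute is fetched and then intersected.

```python
# Hash values for 'P3' and '1 GB' are from the slide; the rest are made up.
HASH = {"P3": 1, "1 GB": 30, "P4": 2, "2 GB": 40}

dht = {}  # stand-in for the DHT: key -> set of resource names

def put_resource(name, attrs):
    """Index a resource once per attribute value (d puts for d attributes)."""
    for value in attrs.values():
        dht.setdefault(HASH[value], set()).add(name)

def find(attrs):
    """Fetch one result set per attribute value and intersect them."""
    result_sets = [dht.get(HASH[value], set()) for value in attrs.values()]
    return set.intersection(*result_sets) if result_sets else set()

put_resource("R", {"cpu": "P3", "ram": "1 GB"})
put_resource("R2", {"cpu": "P3", "ram": "2 GB"})  # hypothetical
put_resource("R3", {"cpu": "P4", "ram": "1 GB"})  # hypothetical

print(find({"cpu": "P3", "ram": "1 GB"}))
```

Each attribute costs one store at indexing time and one lookup at query time, and the intersection may discard most of what was fetched.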

d-to-d Mapping
• Maps the d-dimensional attribute space to a d-dimensional DHT (CAN)
  • With the exception of 2CAN, which maps the d-dimensional attribute space to a 2d-dimensional CAN
• A range query is modeled as a region in d-dimensional space
  • Route a search request to any point in the query region
  • Flood to the remaining points in the region
[Figure: 2-dimensional attribute space with axes cpu and ram; points are resource types]

d-to-one Mapping
• Map a point in d-dimensional space to a one-dimensional key
• Store keys to the DHT
• Used for indexing resources and for query processing
[Figure: points in the (cpu, ram) attribute space mapped onto a ring labeled 3, 8, 10, 48 and 56; hash(P3, 1 GB) = 3, hash(sparc, 4 GB) = 10]

Outline: Midas (Indexing, Range-Query Optimizations)

Midas Framework
• Indexing: resource r → d-to-one mapping → key k → R-DHT mapping → R-DHT
• Query processing: query q → d-to-one mapping → search keys {k} → R-DHT lookups → R-DHT

Space-Filling Curve
• The Hilbert SFC is an example of a d-to-one mapping function
• Hilbert(3, 0) = 15

 y=3 |  5   6   9  10
 y=2 |  4   7   8  11
 y=1 |  3   2  13  12
 y=0 |  0   1  14  15
     +----------------
      x=0 x=1 x=2 x=3
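A Hilbert index for a 2-dimensional grid can be computed with the widely used iterative bit-manipulation conversion. The sketch below is that textbook routine, not necessarily the implementation used in Midas; the function name `hilbert_index` is an assumption. For a 4 × 4 grid it maps (3, 0) to 15, as on the slide.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its
    position along the Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d

# Print the 4x4 grid with y increasing upward.
for y in range(3, -1, -1):
    print(" ".join(f"{hilbert_index(4, x, y):2d}" for x in range(4)))
```

Cells that are adjacent along the curve are also adjacent in the grid, which is why a rectangular query region tends to map to a few runs of consecutive keys.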

Indexing
• Resource r = (cpu = ‘P3’, memory = ‘1 GB’) = (3, 0)
• d-to-one mapping: the point (3, 0) maps to the m-bit key k = 15
• Virtualization: host h gets the node n(k, h) = 15|h, a 2m-bit identifier
• Organize: node 15|h joins the overlay in segment S15

Query Processing
• Start: search keys = {1, 2, 13, 14}, result set = {}
• lookup(1): search keys = {2, 13, 14}, result set = {1}
• lookup(2): search keys = {13, 14}, result set = {1, 2}
• lookup(13): search keys = {}, result set = {1, 2}
[Figure: query region covering Hilbert cells 1, 2, 13 and 14; ring with segments S1, S2, S3 and S15]

Outline: Analysis

Experiment Setup
• Compare Midas on R-Chord and on Chord
• Parameters
  • m = 16 bits
  • d = 3–4
  • K = 10,000–50,000
    • Keys follow a normal distribution in d-dimensional space
  • N = 25,000
    • Each administrative domain has 4–10 resource types
  • Query selectivity = 1% (of 2^m)

Resiliency to Node Failures (1)
• Resiliency: the ability to locate available resources when F × N nodes fail simultaneously (0 ≤ F ≤ 1)
  • Resources are not replicated (i.e. we are not looking at resource availability)
• With R-Chord as the underlying infrastructure, nearly all keys are retrieved
  • Even though there is no replication
• In Chord, without replication, the number of keys retrieved is affected by F

Resiliency to Node Failures (2)
• In R-DHT, a node is responsible for only one key, i.e. its own resources
• In a conventional DHT, a node is responsible for several keys (even clusters of keys), i.e. it indexes other nodes' resources. When a node is down, it affects resources belonging to other nodes.
