SLIDE 1

Scalable Consistency in Scatter

A Distributed Key-Value Storage System

Lisa Glendenning, Ivan Beschastnikh, Arvind Krishnamurthy, Thomas Anderson
University of Washington

October 2011. Supported by NSF CNS-0963754.

SLIDE 2

Internet services depend on distributed key-value stores

[Figure: the consistency vs. scalability trade-off. Dynamo sits at the scalability end; Scatter targets both.]

SLIDE 3

Scatter: Goals

✓ linearizable consistency semantics
✓ scalable in a wide area network
✓ high availability
✓ performance close to existing systems

SLIDE 4

Scatter: Approach

Combine ideas from scalable peer-to-peer systems and consistent datacenter systems:

  • from peer-to-peer systems: distributed hash tables, self-organization, decentralization
  • from datacenter systems: consensus, replication, transactions

SLIDE 5

Distributed Hash Tables: Background

Core functionality: partition the key-space and assign keys to nodes.

System structure: links between nodes form an overlay; knowledge of system state is distributed among all nodes.

System management: nodes coordinate locally to respond to churn, e.g.,

  • give keys to new nodes
  • take over the keys of failed nodes
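A minimal sketch of this core functionality, assuming a Chord-style ring (the `ident` hashing scheme and `Ring` class are illustrative, not from the talk): each key is assigned to the first node clockwise of its hash on a circular identifier space.

```python
import hashlib
from bisect import bisect_right

def ident(name: str) -> int:
    """Hash a key or node name onto the circular identifier space."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

class Ring:
    """Each node owns the key-range between its predecessor and itself."""

    def __init__(self, node_names):
        # sort nodes by their position on the ring
        self.nodes = sorted((ident(n), n) for n in node_names)

    def owner(self, key: str) -> str:
        """First node clockwise from the key's hash, wrapping around."""
        i = bisect_right(self.nodes, (ident(key), ""))
        return self.nodes[i % len(self.nodes)][1]

ring = Ring(["node-a", "node-b", "node-c"])
print(ring.owner("some-key"))  # the same key always maps to the same node
```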

SLIDE 6

Distributed Hash Tables: Faults Cause Inconsistencies

Example: c JOINs between a and b, taking over the key-range (ka,kc]. The join updates:

  c.pred = a   c.succ = b
  a.succ = c   b.pred = c
  b.keys = (kc,kb]   c.keys = (ka,kc]

SLIDE 7

Distributed Hash Tables: Faults Cause Inconsistencies

what could go wrong?

FAULT → OUTCOME

  • communication fault between b and c → both b and c claim ownership of (ka,kc]
  • c fails during the operation → no node claims ownership of (ka,kc]
  • communication fault between a and c → routes through a skip over c
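These outcomes follow from the join being a sequence of separate messages. A hedged sketch (the `Node` class and step ordering are illustrative assumptions, not the exact protocol): a fault between any two steps leaves overlapping or orphaned key-ranges.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Node:
    ident: int
    keys: Tuple[int, int]          # half-open range (lo, hi] this node claims
    pred: Optional["Node"] = None
    succ: Optional["Node"] = None

def join(a: Node, c: Node, b: Node) -> None:
    """Ad-hoc, non-atomic join of c between a and b."""
    c.pred, c.succ = a, b          # c learns its neighbors
    c.keys = (a.ident, c.ident)    # c claims (ka, kc]
    a.succ = c                     # a now routes through c
    # If c fails here, a routes to a dead node: nobody serves (ka, kc].
    # If the next message to b is lost, b still claims (ka, kb], so
    # both b and c claim (ka, kc].
    b.pred = c
    b.keys = (c.ident, b.ident)    # b shrinks its claim to (kc, kb]

# a correct, fault-free run:
a = Node(10, (90, 10)); b = Node(50, (10, 50)); c = Node(30, (30, 30))
a.succ, b.pred = b, a
join(a, c, b)                      # c now claims (10, 30], b claims (30, 50]
```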

SLIDE 8

Distributed Hash Tables: Weak Atomicity Causes Anomalies

DHTs use ad hoc protocols to add and remove nodes. What happens if...

  • two nodes join at the same place at the same time
  • two adjacent nodes leave at the same time
  • during a node join the predecessor leaves
  • one node mistakenly thinks another node has failed

...


SLIDE 9

Scatter: Design Overview

How is Scatter different? It uses groups as building blocks instead of individual nodes.

What is a group? A set of nodes that cooperatively manage a key-range.

What does this give us?

  • nodes within a group act as a single entity
  • a group is much less likely to fail than an individual node
  • distributed transactions for operations involving multiple groups

SLIDE 10

Scatter: Group Anatomy

Example group: nodes = {a,b,c}, keys = (kz,kc], values = {...}

  • the group replicates all state among its members with Paxos
  • changes to group membership are Paxos reconfigurations:
      • include new nodes
      • exclude failed nodes
  • the key-range is further partitioned among the nodes of the group for performance:
      a.keys = (kz,ka]   b.keys = (ka,kb]   c.keys = (kb,kc]
  • each node orders client operations on its keys
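A minimal sketch of this anatomy (the `GroupState` type and its field names are assumptions for illustration, not Scatter's actual data structures): the whole record is what Paxos replicates on every member, while the per-node sub-ranges only determine which member orders a given client operation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class GroupState:
    """State machine replicated by Paxos on every group member."""
    members: List[str]                      # e.g. ["a", "b", "c"]
    key_range: Tuple[int, int]              # half-open range (kz, kc]
    values: Dict[str, str] = field(default_factory=dict)
    # performance optimization: each member is the primary for a sub-range
    assignments: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def primary_for(self, key_hash: int) -> str:
        """The member that orders client operations on this key."""
        for node, (lo, hi) in self.assignments.items():
            if lo < key_hash <= hi:
                return node
        raise KeyError("key outside this group's range")

g = GroupState(members=["a", "b", "c"], key_range=(0, 300),
               assignments={"a": (0, 100), "b": (100, 200), "c": (200, 300)})
print(g.primary_for(150))  # -> "b"
```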

SLIDE 11

Scatter: Self-Reorganization

Some problems can't be handled within a single group:

  • small groups are at risk of failing
  • large groups are slow
  • load imbalance across groups

Multi-group operations, implemented as distributed transactions coordinated locally by groups:

  • MERGE: merge two small groups into one
  • SPLIT: split one large group into two
  • rebalance keys and nodes between groups

SLIDE 12

Example: Group Split

The split of group b runs as a two-phase commit (2PC), with every step agreed upon by consensus inside the participating groups:

  1. "split?" — a split of group b is proposed; the members of b vote "ok!" by consensus.
  2. "split b?" — neighboring groups a and c are asked to participate; each votes "ok!" within its own group.
  3. "split!" — with all votes in, the transaction is committed.
  4. RECONFIGURE! — b reconfigures into two new groups, b1 and b2, which divide b's key-range and nodes between them.
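A hedged sketch of this flow (the `Group` class, the `paxos_agree` stand-in, and the even member split are illustrative assumptions, not Scatter's implementation): each 2PC vote and commit is itself a consensus decision inside a group.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Group:
    name: str
    members: List[str]

    def paxos_agree(self, proposal: str) -> bool:
        """Stand-in for intra-group consensus: in a real system this is
        a Paxos round among self.members; here every proposal succeeds."""
        print(f"group {self.name}: {self.members} agree on {proposal!r}")
        return True

def split(coordinator: Group, left: Group, right: Group) -> bool:
    """2PC across groups: the splitting group coordinates; its neighbors
    participate so that adjacency pointers stay consistent."""
    participants = [coordinator, left, right]
    # phase 1: prepare -- every group votes via its own consensus round
    if not all(g.paxos_agree(f"split {coordinator.name}?") for g in participants):
        return False  # any "no" vote aborts the transaction
    # phase 2: commit -- the decision is likewise replicated in each group
    for g in participants:
        g.paxos_agree(f"split {coordinator.name}!")
    # the coordinator reconfigures into two new groups
    half = len(coordinator.members) // 2
    b1 = Group(coordinator.name + "1", coordinator.members[:half])
    b2 = Group(coordinator.name + "2", coordinator.members[half:])
    print(f"RECONFIGURE! {coordinator.name} -> {b1.name}, {b2.name}")
    return True

split(Group("b", ["b.1", "b.2", "b.3"]), Group("a", ["a.1"]), Group("c", ["c.1"]))
```

Because each vote is itself replicated, a participating group never forgets its decision when individual nodes fail, which is exactly what the ad hoc DHT join protocols earlier could not guarantee.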

SLIDE 20

Scatter

✓ linearizable consistency semantics
✓ scalable in a wide area network
✓ high availability
✓ performance close to existing systems

...local operations
...replication, reconfiguration
...group consensus, transactions
...key partitioning, optimizations

SLIDE 21

Evaluation: Overview

Questions:

  1. How robust is Scatter in a high-churn peer-to-peer environment?
  2. How does Scatter adapt to a dynamic workload in a datacenter environment?

Comparisons:

  Environment  | Comparison System
  P2P          | OpenDHT
  Datacenter   | ZooKeeper

SLIDE 22

Comparison: OpenDHT

Layered OpenDHT's recursive routing on top of Scatter groups. Implemented a Twitter-like application, Chirp.

Experimental Setup:

  • 840 PlanetLab nodes
  • injected node churn at varying rates
  • Twitter traces as a workload
  • tweets and social network stored in DHT


SLIDE 23

Comparison: OpenDHT

Scatter has zero inconsistencies and high availability even under churn.

[Two plots, Consistency and Availability: consistent fetches (%) and completed fetches (%) vs. node lifetime (seconds), comparing Scatter and OpenDHT.]

SLIDE 24

Comparison: OpenDHT

Scalable consistency is cheap: Scatter's fetch latency stays within 10-12% of OpenDHT's.

[Plot, Latency: fetch latency (ms) vs. node lifetime (seconds), Scatter vs. OpenDHT.]

SLIDE 25

Comparison: Replicated ZooKeeper

ZooKeeper: a small-scale, centralized coordination service.

Replicated ZooKeeper: the global key-space statically partitioned across multiple, isolated ZooKeeper instantiations (Z1, Z2, Z3).

Experimental Setup:

  • testbed: Emulab
  • varied total number of nodes
  • no churn
  • same Chirp workload
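A minimal sketch of this baseline's routing (the split points, ensemble names, and first-character scheme are illustrative assumptions): keys map to a fixed ZooKeeper ensemble chosen at deployment time, so unlike Scatter's groups, the partitions can never split, merge, or rebalance under load.

```python
from bisect import bisect_left

RANGE_UPPER_BOUNDS = ["h", "p", "~"]   # static split points, fixed at deployment
ENSEMBLES = ["Z1", "Z2", "Z3"]         # one isolated ZooKeeper ensemble per range

def ensemble_for(key: str) -> str:
    """Route a key to the ensemble owning its static range."""
    return ENSEMBLES[bisect_left(RANGE_UPPER_BOUNDS, key[:1])]

assert ensemble_for("alice") == "Z1"
assert ensemble_for("mallory") == "Z2"
assert ensemble_for("zed") == "Z3"
```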

SLIDE 26

Comparison: Replicated ZooKeeper

Dynamic partitioning adapts to changes in workload.

[Plot, Scalability: throughput (1,000 ops/sec) vs. total number of nodes, Scatter vs. replicated ZooKeeper.]

SLIDE 27

Scatter: Summary

✓ consensus groups of nodes as fault-tolerant building blocks
✓ distributed transactions across groups to repartition the global key-space
✓ evaluation against OpenDHT and ZooKeeper shows strict consistency, linear scalability, and high availability