amazon dynamo

Amazon Dynamo: A Highly Available Key-value Store



  1. Amazon Dynamo: A Highly Available Key-value Store. Presented by Jian Fang (jianf@cmu.edu)

  2. What is Dynamo
     - Eventually consistent key-value store
     - Supports scalable, highly available data access
     - Optimized for availability to maximize customer satisfaction

  3. Why not RDBMS?
     - Only primary-key access is needed
     - RDBMSs have limited scalability
     - RDBMSs require expensive hardware and skilled administrators

  4. Amazon’s Requirements
     - Objects are smaller than 1 MB
     - No operation spans multiple data items
     - <300 ms response time for 99.9% of requests
     - Heterogeneous commodity hardware infrastructure
     - Decentralized, loosely coupled services
     - Highly available (always writable)

  5. Techniques used in Dynamo
     - Consistent hashing
     - Vector clocks
     - Sloppy quorum and hinted handoff
     - Merkle trees
     - Gossip-based membership protocol

  6. Interfaces
     - Key-value storage system with two operations:
     - Get(key): returns a single object, or a list of objects with conflicting versions
     - Put(key, context, object): the context contains the version information
     - MD5 hashing is applied to the key to generate a 128-bit identifier
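The key-hashing step on this slide can be illustrated with Python's standard `hashlib`. This is a minimal sketch, not Dynamo's code; the helper name `key_to_id` and the sample key are made up for illustration.

```python
import hashlib


def key_to_id(key: bytes) -> int:
    """Map a key to a 128-bit integer identifier via MD5 (hypothetical helper)."""
    digest = hashlib.md5(key).digest()   # MD5 always yields 16 bytes = 128 bits
    return int.from_bytes(digest, "big")


ident = key_to_id(b"cart:12345")         # sample key, made up for this sketch
print(f"{ident:032x}")                   # the identifier as 32 hex digits
```

The same key always maps to the same identifier, which is what lets any node locate the key's position on the ring without coordination.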

  7. Partitioning
     - Scale incrementally
     - Consistent hashing
     - Variant of consistent hashing

  8. Consistent Hashing
     - Simple non-consistent hashing: hash(key) mod N (figure: 12 keys, N = 3, servers S1, S2, S3)
     - What if N = N + 1? 6 keys (a half) are remapped (figure: 12 keys, N = 4, servers S1, S2, S3, S4)
     - Consistent hashing: only K/N keys need to be remapped

  9. Consistent Hashing (figure: ring with nodes A, B, C, D and keys X, Y, Z)

  10. Consistent Hashing
      - Not good enough: non-uniform load distribution, and it ignores heterogeneity in node performance
      - Variant of consistent hashing: virtual nodes

  11. Variant of Consistent Hashing (figure: ring of virtual nodes owned by S1, S2, S3)
      - Q = 12 (virtual nodes)
      - S = 3 (physical nodes)
      - T = Q/S = 4 (tokens per physical node)

  12. Variant of Consistent Hashing (figure: the same ring after S4 joins, owned by S1, S2, S3, S4)
      - Q = 12 (virtual nodes)
      - S = 4 (physical nodes)
      - T = Q/S = 3 (tokens per physical node)
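The virtual-node scheme on the two slides above can be sketched as follows, using the slides' symbols Q (virtual nodes) and S (physical nodes). The function names and the rule a joining node uses to take over partitions are assumptions made for illustration; the slides do not specify them.

```python
import hashlib
from collections import Counter


def partition_of(key: str, q: int) -> int:
    """Map a key to one of q equal-sized ranges of the 128-bit MD5 space."""
    h = int.from_bytes(hashlib.md5(key.encode()).digest(), "big")
    return (h * q) >> 128


def assign_partitions(q: int, nodes: list) -> dict:
    """Give each physical node q / len(nodes) virtual nodes (round-robin)."""
    return {p: nodes[p % len(nodes)] for p in range(q)}


def add_node(owner: dict, new_node: str) -> dict:
    """Assumed join rule: the new node takes partitions from overloaded
    nodes until it holds Q/S of them; all other partitions stay put."""
    owner = dict(owner)
    load = Counter(owner.values())
    target = len(owner) // (len(load) + 1)      # new T = Q / S
    for p in sorted(owner):
        if load[new_node] == target:
            break
        donor = owner[p]
        if load[donor] > target:
            owner[p] = new_node
            load[donor] -= 1
            load[new_node] += 1
    return owner


before = assign_partitions(12, ["S1", "S2", "S3"])   # Q = 12, S = 3, T = 4
after = add_node(before, "S4")                       # Q = 12, S = 4, T = 3
moved = [p for p in before if before[p] != after[p]]
print(Counter(after.values()), moved)
```

Because keys map to partitions and only partition ownership changes, adding S4 moves 3 of the 12 partitions (Q/S for the new S) and leaves every other key where it was.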

  13. Replication (figure: Key Z on the ring of nodes A, B, C, D, with A as Node(i))
      - Each key has a coordinator, Node(i)
      - The (N-1) clockwise successor nodes act as replicas
      - Node(i) updates all other (N-1) replicas
      - A preference list of nodes, with list size > N
      - Example: Preference List = [A,B,C,D]
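Building a preference list by walking the ring clockwise from a key's position might look like the sketch below. The ring positions, the 0..255 ring size, and the `extra` parameter (how many nodes beyond N to include) are hypothetical values chosen so the output matches the slide's example.

```python
import bisect

# Hypothetical node positions on a ring of size 256, sorted by position.
RING = [(0, "A"), (64, "B"), (128, "C"), (192, "D")]


def preference_list(key_pos: int, n: int, extra: int = 1) -> list:
    """Return the coordinator followed by its clockwise successors.

    The list holds n + extra nodes (capped at the ring size), so it is
    longer than N and leaves room to skip failed nodes.
    """
    positions = [pos for pos, _ in RING]
    # First node at or after the key position, wrapping past the end.
    start = bisect.bisect_left(positions, key_pos) % len(RING)
    size = min(n + extra, len(RING))
    return [RING[(start + i) % len(RING)][1] for i in range(size)]


print(preference_list(250, n=3))   # ['A', 'B', 'C', 'D']
```

With virtual nodes, a real implementation would also have to skip ring positions that belong to a physical node already in the list; this sketch has one position per node, so that case does not arise.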

  14. Data Versioning
      - Eventual consistency
      - Put() returns before all replicas are updated
      - Get() can return multiple versions for the same key
      - Each data mutation is stored as a new version
      - Vector clocks

  15. Vector Clock (Example): Supplier A writes $500. Replicas Sx, Sy, and Sz each store $500 with clock (1,0,0).

  16. Vector Clock (Example): Supplier A writes $550. Sx, Sy, and Sz each replace $500 (1,0,0) with $550 (2,0,0).

  17. Vector Clock (Example): Supplier B writes $600. One replica stores $600 with clock (2,1,0); the others still hold $550 (2,0,0).

  18. Vector Clock (Example): Supplier C writes $650, stored with clock (2,0,1). The versions $600 (2,1,0) and $650 (2,0,1) now coexist. Conflict!

  19. Vector Clock (Example): Supplier B reads both versions, $600 (2,1,0) / $650 (2,0,1), resolves the conflict, and chooses $650.

  20. Vector Clock (Example): Supplier B writes the resolved value, $650 with clock (2,1,1). Sx, Sy, and Sz each replace the conflicting versions with $650 (2,1,1).
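The comparison behind this example can be sketched in a few lines. The clocks are written as fixed-length tuples ordered (Sx, Sy, Sz), as on the slides; that representation and the function names are simplifications chosen for illustration.

```python
def descends(a: tuple, b: tuple) -> bool:
    """True if clock a has seen every event in clock b (a >= b component-wise)."""
    return all(x >= y for x, y in zip(a, b))


def concurrent(a: tuple, b: tuple) -> bool:
    """Neither clock descends from the other: the versions conflict."""
    return not descends(a, b) and not descends(b, a)


def merge(a: tuple, b: tuple) -> tuple:
    """Component-wise maximum: a clock that descends from both inputs."""
    return tuple(max(x, y) for x, y in zip(a, b))


v600 = (2, 1, 0)   # Supplier B's write
v650 = (2, 0, 1)   # Supplier C's write

print(descends((2, 0, 0), (1, 0, 0)))   # True: $550 supersedes $500
print(concurrent(v600, v650))           # True: conflict
print(merge(v600, v650))                # (2, 1, 1): clock of the resolved write
```

Once the resolved version carries (2,1,1), it descends from both (2,1,0) and (2,0,1), so replicas can discard the two conflicting versions.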

  21. Processing get() and put()
      - How to select a coordinator node: a load balancer (server-driven) or a partition-aware client library (client-driven)
      - Quorum-like system for consistency: W + R > N
      - Typical values: N=3, W=2, R=2
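Why W + R > N matters can be checked by brute force: the condition holds exactly when every possible write set of W replicas shares at least one replica with every possible read set of R replicas. The function name below is made up for this sketch.

```python
from itertools import combinations


def quorums_always_intersect(n: int, w: int, r: int) -> bool:
    """Check every W-sized write set against every R-sized read set."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


print(quorums_always_intersect(3, 2, 2))   # True:  W + R = 4 > 3
print(quorums_always_intersect(3, 1, 2))   # False: W + R = 3, not > 3
```

With the typical values N=3, W=2, R=2, any read therefore contacts at least one replica that took part in the latest successful write.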

  22. Hinted Handoff (figure: a Put() on the ring of nodes A, B, C, D)

  23. Hinted Handoff (figure, continued: the ring of nodes A, B, C, D)
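Since the figures for these two slides do not survive as text, here is a minimal sketch of the idea: when one of the first N nodes in the preference list is down, the write goes to the next healthy node together with a hint naming the intended owner, and the hint is delivered once the owner is back. The `Node` class and both function names are hypothetical.

```python
class Node:
    def __init__(self, name: str):
        self.name = name
        self.up = True
        self.data = {}     # key -> value for replicas this node owns
        self.hints = []    # (intended owner name, key, value) held for others


def put(preference_list: list, n: int, key: str, value: str) -> int:
    """Write to the first n healthy nodes; stand-ins keep a hint instead."""
    top_n = preference_list[:n]
    missing = [node for node in top_n if not node.up]
    targets = [node for node in preference_list if node.up][:n]
    for node in targets:
        if node in top_n:
            node.data[key] = value
        else:
            owner = missing.pop(0)               # node this stand-in covers
            node.hints.append((owner.name, key, value))
    return len(targets)                          # number of replicas written


def deliver_hints(holder: Node, nodes_by_name: dict) -> None:
    """Hand hinted writes back to owners that have recovered."""
    remaining = []
    for owner_name, key, value in holder.hints:
        owner = nodes_by_name[owner_name]
        if owner.up:
            owner.data[key] = value
        else:
            remaining.append((owner_name, key, value))
    holder.hints = remaining


a, b, c, d = (Node(name) for name in "ABCD")
a.up = False
put([a, b, c, d], n=3, key="k", value="v")       # D stands in for A
a.up = True
deliver_hints(d, {node.name: node for node in (a, b, c, d)})
print(a.data, d.hints)
```

The write still reaches N nodes while A is down, which is how the system stays always writable.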

  24. Replica Synchronization (Merkle Tree). Example from: http://bit.ly/1fUa0CS
      - Range: (0,256], depth: 3, tokens: 8 * 32 (eight leaf ranges of 32 tokens each)
      - Rows: key1 (token 5, hash 0x1001), key2 (token 135, hash 0x1100), key3 (token 170, hash 0x0101), key4 (token 185, hash 0x0010)
      - Leaves: (0,32] = 0x1001, (32,64] = 0, (64,96] = 0, (96,128] = 0, (128,160] = 0x1100, (160,192] = 0x0111, (192,224] = 0, (224,256] = 0
      - Each parent is the XOR of its two children: (0,64] = 0x1001, (64,128] = 0, (128,192] = 0x1011, (192,256] = 0
      - Next level: (0,128] = 0x1001, (128,256] = 0x1011
      - Root: 0x0010
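The tree in this example can be rebuilt and compared between replicas with a short sketch. It combines children with XOR because that is what the slide's example does; Merkle trees more commonly hash the children instead. The function names are made up for illustration.

```python
def build_tree(rows, depth=3, lo=0, hi=256):
    """Return the tree as a list of levels, leaves first and root last.

    rows is a list of (token, row_hash) pairs; leaf ranges are (start, end].
    """
    leaves = 2 ** depth
    width = (hi - lo) // leaves
    level = [0] * leaves
    for token, row_hash in rows:
        level[(token - lo - 1) // width] ^= row_hash
    levels = [level]
    while len(level) > 1:
        # Each parent is the XOR of its two children.
        level = [level[i] ^ level[i + 1] for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(a, b, level=None, idx=0):
    """Descend only into subtrees whose hashes differ between two trees."""
    if level is None:
        level = len(a) - 1
    if a[level][idx] == b[level][idx]:
        return []
    if level == 0:
        return [idx]
    return (differing_leaves(a, b, level - 1, 2 * idx)
            + differing_leaves(a, b, level - 1, 2 * idx + 1))


rows = [(5, 0x1001), (135, 0x1100), (170, 0x0101), (185, 0x0010)]
tree = build_tree(rows)
print(hex(tree[-1][0]))                    # 0x10, the slide's root 0x0010

stale = build_tree(rows[:3])               # a replica missing row key4
print(differing_leaves(tree, stale))       # [5], the leaf range (160,192]
```

Two replicas with equal roots need no further comparison; otherwise only the ranges reported by `differing_leaves` have to be exchanged.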

  25. Performance

  26. Q&A Thank you!
