SLIDE 1

DISTRIBUTED HASH TABLES

Soumya Basu November 5, 2015 CS 6410

SLIDE 2

OVERVIEW

  • Why DHTs?
  • Chord
  • Dynamo

SLIDE 3

PEER TO PEER

  • What guarantees does IP provide?
  • What features do you get?
  • What happens if you want more?
  • Overlay networks!

SLIDE 4

CHORD PROTOCOL

  • Intended as another building block
  • Supports one operation: mapping keys to nodes (sketched below)
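
Chord's whole interface is lookup(key) → node. A minimal sketch of that operation via consistent hashing, assuming an in-memory list of node ids; `M_BITS`, `chord_id`, and `lookup` are illustrative names, and the real system uses a 160-bit SHA-1 ring:

```python
import hashlib

M_BITS = 8  # tiny ring for illustration; Chord uses m = 160 (SHA-1)

def chord_id(value: str) -> int:
    """Hash a key or node name onto the 2^m identifier ring."""
    digest = hashlib.sha1(value.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** M_BITS)

def lookup(key: str, node_ids: list[int]) -> int:
    """Chord's one operation: the key's successor node on the ring."""
    kid = chord_id(key)
    ring = sorted(node_ids)
    # first node at or after the key's id, wrapping past zero
    return next((nid for nid in ring if nid >= kid), ring[0])

nodes = [chord_id(f"node-{i}") for i in range(4)]
print(lookup("my-key", nodes))  # id of the node responsible for "my-key"
```
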
SLIDE 5

FEATURES OF CHORD

  • Scalability
  • Provable correctness and performance
    • O(log N) lookups
  • Simplicity

SLIDE 6

HOW CHORD WORKS

Finger Table for a node
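
The slide's figure is not reproduced here; the rule behind it, from the Chord paper, is that finger i of node n points at successor((n + 2^i) mod 2^m). A sketch reusing `M_BITS` and the sorted ring from the lookup sketch above:

```python
def successor_of(ident: int, ring: list[int]) -> int:
    """First node on the sorted ring at or after `ident` (wrapping)."""
    return next((nid for nid in ring if nid >= ident), ring[0])

def finger_table(n: int, node_ids: list[int]) -> list[int]:
    """Entry i points to the successor of (n + 2^i) mod 2^m."""
    ring = sorted(node_ids)
    return [successor_of((n + 2 ** i) % (2 ** M_BITS), ring)
            for i in range(M_BITS)]
```

Because the finger targets double in distance, m entries are enough to cover the whole ring.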

SLIDE 7

HOW CHORD WORKS

How routing works
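
Only the idea behind the routing figure is sketched here: each hop forwards to the finger that most closely precedes the key, at least halving the remaining ring distance, which is where the O(log N) hop bound comes from. The in-memory `fingers` and `successor` dicts are assumptions standing in for real RPCs between nodes:

```python
def ring_dist(a: int, b: int) -> int:
    """Clockwise distance from a to b on the identifier ring."""
    return (b - a) % (2 ** M_BITS)

def route(key: str, start: int, fingers: dict[int, list[int]],
          successor: dict[int, int]) -> int:
    kid, node = chord_id(key), start
    # stop once the key lies in the interval (node, successor(node)]
    while ring_dist(node, kid) > ring_dist(node, successor[node]):
        # hop to the finger that gets closest to the key without passing it
        node = max((f for f in fingers[node]
                    if 0 < ring_dist(node, f) < ring_dist(node, kid)),
                   key=lambda f: ring_dist(node, f),
                   default=successor[node])
    return successor[node]
```
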

SLIDE 8

UNFAIR LOADS

SLIDE 9

LOAD BALANCING
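
The load-balancing figures are not reproduced; the standard fix in both the Chord and Dynamo papers is virtual nodes: each physical host takes many positions on the ring, so no single unlucky hash placement leaves one host owning a huge arc. A minimal sketch, with `VNODES` as an illustrative setting:

```python
VNODES = 32  # virtual nodes per host; illustrative choice

def build_ring(hosts: list[str]) -> dict[int, str]:
    """Map many ring positions per host back to the owning host."""
    ring = {}
    for host in hosts:
        for v in range(VNODES):
            # with a realistic m (160 bits), position collisions are negligible
            ring[chord_id(f"{host}#v{v}")] = host
    return ring
```
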

SLIDE 10

FAULT TOLERANCE

SLIDE 11

IMPACT

  • Distributed Hash Tables were a hot topic!
  • Chord: 12193* citations
  • Pastry: 9606* citations
  • CAN: 9010* citations

*According to Google Scholar

SLIDE 12

DISCUSSION

  • Why was this so impactful?
  • What limitations are there to Chord? Are they easy to overcome? Why/why not?

SLIDE 13

DYNAMO

  • Another distributed hash table
  • Similar structure to Chord: a ring
  • Only supports get() and put()
  • Gives up strong consistency for availability (CAP theorem)
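
A sketch of how narrow that interface is: per the Dynamo paper, get() can return several causally unrelated versions along with an opaque context (a vector clock), which the caller passes back into put(). The class name and method bodies here are illustrative assumptions:

```python
class DynamoLikeStore:
    """Illustrative stand-in for Dynamo's two-operation interface."""

    def get(self, key: str) -> tuple[list[bytes], object]:
        """All conflicting versions of `key`, plus an opaque context."""
        raise NotImplementedError  # sketch only

    def put(self, key: str, context: object, value: bytes) -> None:
        """Write a version descended from the versions in `context`."""
        raise NotImplementedError  # sketch only
```
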
SLIDE 14

STRICT PERFORMANCE

  • Service level agreements at the 99.9th percentile
    • Availability
    • Latency
  • Explicitly don’t care about averages!
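
A toy calculation (numbers invented) showing why the slides dismiss averages: a few slow requests barely move the mean but dominate the 99.9th percentile the SLA is written against:

```python
latencies_ms = [10] * 997 + [900] * 3  # 3 slow requests in 1,000

mean = sum(latencies_ms) / len(latencies_ms)
p999 = sorted(latencies_ms)[int(0.999 * len(latencies_ms)) - 1]

print(f"mean  = {mean:.1f} ms")  # 12.7 ms: looks healthy
print(f"p99.9 = {p999} ms")      # 900 ms: blows any reasonable latency SLA
```
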
SLIDE 15

FAULT TOLERANCE

  • Nodes fail all the time
  • Keys can’t be lost
  • Solution: replicate each key on the next N successors
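
A sketch of that replication rule, reusing `chord_id` from the earlier Chord sketch: the key's coordinator plus the next nodes clockwise hold the replicas (`N_REPLICAS` is an illustrative setting):

```python
N_REPLICAS = 3  # Dynamo's N; illustrative setting

def replica_nodes(key: str, node_ids: list[int]) -> list[int]:
    """The key's coordinator and its clockwise successors, N in total."""
    ring = sorted(node_ids)
    kid = chord_id(key)
    # index of the coordinator: first node at or after the key's id
    start = next((i for i, nid in enumerate(ring) if nid >= kid), 0)
    return [ring[(start + j) % len(ring)] for j in range(N_REPLICAS)]
```
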
SLIDE 16

REPLICATION

  • Sloppy quorum
    • Each node maintains a “preference list” of replicas
    • Requests go to the first N healthy nodes on that list
    • Need R nodes to respond for a read
    • Need W nodes to respond for a write
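
A sketch of the quorum arithmetic, with a hypothetical per-node store/fetch API standing in for Dynamo's internal RPCs; when R + W > N, every read set overlaps every write set in at least one node:

```python
N, R, W = 3, 2, 2  # a common configuration (R + W > N)

def quorum_write(preference_list, key, value) -> bool:
    """Succeeds once W of the first N healthy nodes acknowledge."""
    acks = sum(1 for node in preference_list[:N]
               if node.store(key, value))  # hypothetical node API
    return acks >= W

def quorum_read(preference_list, key):
    """Returns versions only if at least R of N nodes reply."""
    replies = [node.fetch(key) for node in preference_list[:N]]  # hypothetical
    replies = [r for r in replies if r is not None]
    return replies if len(replies) >= R else None
```
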
SLIDE 17

REPLICATION

  • Sloppy quorum
    • Developers can tune R, W, and N
  • Hinted handoff
    • If a node is down, periodically check for recovery
    • Writes carry a “hint” naming the key’s original replica
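
A sketch of hinted handoff using the same hypothetical node API: a write that skips past a down replica carries a hint naming the intended owner, and the stand-in holder periodically tries to hand the data back:

```python
def write_with_hint(preference_list, key, value):
    """Write to the first healthy node, hinting at the intended owner."""
    intended = preference_list[0]
    for node in preference_list:
        if node.healthy:                           # hypothetical attribute
            hint = None if node is intended else intended.name
            node.store(key, value, hint=hint)      # hypothetical API
            return node
    raise RuntimeError("no healthy node in the preference list")

def handoff_scan(node):
    """Periodically retry delivering hinted writes to their real owner."""
    for key, value, hint in node.hinted_writes():  # hypothetical API
        owner = node.peer(hint)
        if owner.healthy:
            owner.store(key, value)                # return the write home
            node.drop_hint(key)
```
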
SLIDE 18

CONSISTENCY

  • Replication leads to consistency problems
  • Most systems resolve conflicts on writes
  • Amazon needs high write throughput
    • e.g., adding to a cart
  • Gives up on consistent reads: “eventual consistency”

SLIDE 19

HANDLING CONFLICTS
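
The slide's figure is not reproduced; per the Dynamo paper, versions carry vector clocks, two versions conflict when neither clock dominates the other, and the application reconciles them (the paper's shopping-cart example merges by union). A minimal sketch:

```python
def dominates(a: dict, b: dict) -> bool:
    """Clock `a` descends from `b` if it is at least as new everywhere."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def reconcile_carts(versions):
    """App-level merge for conflicting carts: keep every added item."""
    merged = set()
    for _clock, cart in versions:
        merged |= cart
    return merged

v1 = ({"A": 1}, {"book"})  # written through coordinator A
v2 = ({"B": 1}, {"pen"})   # concurrent write through coordinator B
if not dominates(v1[0], v2[0]) and not dominates(v2[0], v1[0]):
    print(reconcile_carts([v1, v2]))  # {'book', 'pen'}: no write is lost
```
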

SLIDE 20

PERFORMANCE