A Sybil-Proof Distributed Hash Table, by Chris Lesniewski-Laas and M. Frans Kaashoek




SLIDE 1

A Sybil-Proof Distributed Hash Table

Chris Lesniewski-Laas M. Frans Kaashoek MIT

28 April 2010 NSDI http://pdos.csail.mit.edu/whanau/slides.pptx

SLIDE 2

Distributed Hash Table

  • Interface: PUT(key, value), GET(key)→value
  • Route to peer responsible for key

GET( sip://alice@foo ) PUT( sip://alice@foo, 18.26.4.9 )
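As a rough illustration of the PUT/GET interface, the sketch below routes each key to one responsible peer inside a single process. The successor-on-a-ring placement, the SHA-1 hashing, and the names `ToyDHT`, `ring_id`, and `responsible_peer` are assumptions made for this example (a Chord-style placement), not the scheme used by the protocol in these slides.

```python
import hashlib
from bisect import bisect_left


def ring_id(name: str) -> int:
    """Map a string to a point on the identifier ring (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class ToyDHT:
    """Single-process stand-in for a DHT: each peer is just a dict."""

    def __init__(self, peer_names):
        self.ring = sorted((ring_id(p), p) for p in peer_names)
        self.stores = {p: {} for p in peer_names}

    def responsible_peer(self, key: str) -> str:
        # First peer at or after the key's position on the ring, wrapping around.
        ids = [pid for pid, _ in self.ring]
        i = bisect_left(ids, ring_id(key)) % len(self.ring)
        return self.ring[i][1]

    def put(self, key, value):
        self.stores[self.responsible_peer(key)][key] = value

    def get(self, key):
        return self.stores[self.responsible_peer(key)].get(key)


dht = ToyDHT(["peer-a", "peer-b", "peer-c", "peer-d"])
dht.put("sip://alice@foo", "18.26.4.9")
print(dht.get("sip://alice@foo"))  # prints 18.26.4.9
```

Both calls hash the key the same way, so GET reaches the peer that PUT stored the value on.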

SLIDE 3

The Sybil attack on open DHTs

  • Create many pseudonyms (Sybils), join DHT
  • Sybils join the DHT as usual, disrupt routing

[Diagrams: brute-force attack; clustering attack]

SLIDE 4

Sybil state of the art

[Timeline, 2001 to 2010, in the order shown on the slide:]

  • P2P mania! Chord, Pastry, Tapestry, CAN
  • The Sybil Attack [Douceur], Security Considerations [Sit, Morris]
  • Restricted tables [Castro et al]
  • BFT [Rodrigues, Liskov]
  • SPROUT, Turtle, Bootstrap graphs
  • Puzzles [Borisov]
  • CAPTCHA [Rowaihy et al]
  • SybilLimit [Yu et al]
  • SybilInfer, SumUp, DSybil
  • (This work)

SLIDE 5

Contribution

  • Whānau: an efficient Sybil-proof DHT protocol

– GET cost: O(1) messages, one RTT latency
– Cost to build routing tables: O(√N log N) storage/bandwidth per node (for N keys)
– Oblivious to number of Sybils!

  • Proof of correctness
  • PlanetLab implementation
  • Large-scale simulations vs. powerful attack
SLIDE 6

Division of labor

  • Application provides integrity
  • Whānau provides availability
  • E.g., application signs values using private key
  • Proc GET(key):

      Until valid value found:
          Try value = LOOKUP(key)
      Repeat
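The GET procedure above can be sketched as a retry loop around LOOKUP with an application-level validity check. This is only a sketch under stated assumptions: an HMAC from the standard library stands in for the private-key signature mentioned on the slide, and `SECRET`, `sign`, `is_valid`, `make_lookup`, and the `max_tries` cap are hypothetical names introduced for the example.

```python
import hashlib
import hmac
import random

SECRET = b"application-secret"  # hypothetical; stands in for the signing key


def sign(key, value):
    """Tag a (key, value) pair; an HMAC replaces a real signature here."""
    return hmac.new(SECRET, f"{key}|{value}".encode(), hashlib.sha256).hexdigest()


def is_valid(key, record):
    """Application-level integrity check on a (value, tag) record."""
    value, tag = record
    return hmac.compare_digest(tag, sign(key, value))


def make_lookup(honest_record, bad_fraction, rng):
    """Simulated LOOKUP: a bad node answers with a forged record sometimes."""
    def lookup(key):
        if rng.random() < bad_fraction:
            return ("6.6.6.6", "forged-tag")
        return honest_record
    return lookup


def get(key, lookup, max_tries=50):
    """Repeat LOOKUP until a valid value is found or the tries run out."""
    for attempt in range(1, max_tries + 1):
        record = lookup(key)
        if record is not None and is_valid(key, record):
            return record[0], attempt
    return None, max_tries


key = "sip://alice@foo"
record = ("18.26.4.9", sign(key, "18.26.4.9"))
value, attempts = get(key, make_lookup(record, 0.5, random.Random(1)))
print(value, attempts)
```

The division of labor shows up in the code: `lookup` only has to return the right record often enough, while `is_valid` rejects anything forged.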

SLIDE 7

Approach

  • Use a social network to limit Sybils

– Addresses brute-force attack

  • New technique: layered identifiers

– Addresses clustering attacks

SLIDE 8

Two main phases

  • SETUP: periodically build tables using social links
  • LOOKUP: use tables to route efficiently

[Diagram: SETUP takes the social network and the PUT queue of PUT(key, value) requests and builds the routing tables of (key, value) records; LOOKUP uses the routing tables]

SLIDE 9

Social links created

SLIDE 10

Social links maintained over Internet

SLIDE 11

Social network

[Diagram: an honest region and a Sybil region, connected by attack edges]

SLIDE 12

Random walks

cf. SybilLimit [Yu et al 2008]

SLIDE 13

Building tables using random walks

cf. SybilLimit [Yu et al 2008]

What have we accomplished?

  • Small fraction (e.g. < 50%) of bad nodes in routing tables
  • Bad fraction is independent of number of Sybil nodes
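To make the random-walk sampling idea concrete, the following toy simulation samples nodes by short random walks that start in the honest region and counts how many walks end on a Sybil. Everything about the graph is an assumption for illustration: the ring-plus-chord wiring, the sizes, the walk length of 10, and the helper names `build_graph`, `random_walk`, and `sybil_fraction` are not taken from the slides.

```python
import random


def build_graph(n_honest, n_sybil, n_attack_edges, rng):
    """Honest nodes are ints 0..n_honest-1; Sybils are ("s", i) tuples."""
    adj = {}

    def link(a, b):
        if a != b:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)

    def wire_region(nodes):
        # Ring plus one random chord per node: a toy stand-in for a
        # well-connected social graph.
        for i, u in enumerate(nodes):
            link(u, nodes[(i + 1) % len(nodes)])
            link(u, nodes[(i + 2) % len(nodes)])
            link(u, rng.choice(nodes))

    honest = list(range(n_honest))
    sybils = [("s", i) for i in range(n_sybil)]
    wire_region(honest)
    wire_region(sybils)
    for _ in range(n_attack_edges):  # at most this many distinct attack edges
        link(rng.choice(honest), rng.choice(sybils))
    return {u: sorted(vs, key=repr) for u, vs in adj.items()}, honest


def random_walk(adj, start, length, rng):
    """Follow `length` uniformly random edges and return the end node."""
    node = start
    for _ in range(length):
        node = rng.choice(adj[node])
    return node


def sybil_fraction(n_sybil, n_attack_edges=5, n_honest=200,
                   walk_len=10, walks=2000, seed=1):
    """Fraction of walks from random honest nodes that end on a Sybil."""
    rng = random.Random(seed)
    adj, honest = build_graph(n_honest, n_sybil, n_attack_edges, rng)
    ends = [random_walk(adj, rng.choice(honest), walk_len, rng)
            for _ in range(walks)]
    return sum(isinstance(e, tuple) for e in ends) / walks


for n_sybil in (50, 5000):
    print(n_sybil, sybil_fraction(n_sybil))
```

In this toy model a walk can only reach a Sybil by crossing one of the few attack edges, so growing the Sybil region from 50 to 5000 nodes should leave the bad fraction roughly unchanged.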
SLIDE 14

[Diagram repeated from slide 8: SETUP, LOOKUP, social network, PUT queue of PUT(key, value) requests, routing tables of (key, value) records]

SLIDE 15

Routing table structure

  • O(√n) fingers and O(√n) keys stored per node
  • Fingers have random IDs, cover all keys WHP
  • Lookup: query closest finger to target key

Finger tables: (ID, address)
Key tables: (key, value)

[Diagram: example keys Keynes, Aardvark, Zyzzyva, Kelvin]
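A minimal sketch of the two tables and of the "query closest finger" step might look as follows. It assumes that "closest" means the finger with the largest ID not exceeding the key (wrapping around), and that this finger holds the key in its key table; the `Node` class, the single-letter IDs, and the stored values are made up for the example.

```python
from bisect import bisect_right


class Node:
    def __init__(self, node_id, address):
        self.node_id = node_id
        self.address = address
        self.key_table = {}  # key table: (key, value) records stored here
        self.fingers = []    # finger table: sorted (ID, Node) pairs

    def closest_finger(self, key):
        ids = [fid for fid, _ in self.fingers]
        i = bisect_right(ids, key) - 1  # last ID <= key; -1 wraps to the largest ID
        return self.fingers[i][1]

    def lookup(self, key):
        # One hop: ask the closest finger for the key.
        return self.closest_finger(key).key_table.get(key)


# Toy network: four nodes, and every node is a finger of every other node.
nodes = [Node(i, f"10.0.0.{n}") for n, i in enumerate("AHPW", start=1)]
for node in nodes:
    node.fingers = sorted(((f.node_id, f) for f in nodes), key=lambda p: p[0])

records = {"Aardvark": "animal", "Kelvin": "physicist",
           "Keynes": "economist", "Zyzzyva": "weevil"}
for key, value in records.items():
    nodes[0].closest_finger(key).key_table[key] = value

print(nodes[2].lookup("Keynes"))  # prints economist
```

In the real structure each node keeps only O(√n) fingers and keys; the toy network stores everything so that a single hop always succeeds.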

SLIDE 16

From social network to routing tables

  • Finger table: randomly sample O(√n) nodes
  • Most samples are honest

[Table columns: ID, IP address]

SLIDE 17

Honest nodes pick IDs uniformly

[Diagram: ID space from A to Z]

Plenty of fingers near key

SLIDE 18

Sybil ID clustering attack

[Diagram: ID space from A to Z]

[Hypothetical scenario: 50% Sybil IDs, 50% honest IDs]

Many bad fingers near key
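The effect of clustering can be illustrated with a small experiment in the same 50% Sybil, 50% honest scenario. The sketch assumes IDs are numbers in [0, 1), that the fingers "near" a key are the k IDs that most closely precede it, and that the Sybils place their IDs just before the key; `nearest_before` and `bad_fraction_near_key` are hypothetical helpers.

```python
import random


def nearest_before(fingers, key, k):
    """The k (id, label) pairs with the largest IDs not exceeding key."""
    return sorted(f for f in fingers if f[0] <= key)[-k:]


def bad_fraction_near_key(sybil_ids, key=0.5, n_honest=100, k=10, seed=7):
    """Fraction of Sybils among the k fingers nearest to the key."""
    rng = random.Random(seed)
    fingers = [(rng.random(), "honest") for _ in range(n_honest)]
    fingers += [(i, "sybil") for i in sybil_ids]
    near = nearest_before(fingers, key, k)
    return sum(who == "sybil" for _, who in near) / len(near)


rng = random.Random(0)
uniform_sybils = [rng.random() for _ in range(100)]               # no clustering
clustered_sybils = [0.5 - 1e-6 * (j + 1) for j in range(100)]     # packed before the key

print("uniform:  ", bad_fraction_near_key(uniform_sybils))
print("clustered:", bad_fraction_near_key(clustered_sybils))
```

With the same total number of Sybil IDs, clustering them next to the target key crowds honest fingers out of the neighborhood that a lookup for that key would query.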

SLIDE 19

Honest layered IDs mimic Sybil IDs

[Diagram: ID space from A to Z, shown for Layer 0 and Layer 1]

SLIDE 20

Every range is balanced in some layer

[Diagram: ID space from A to Z, shown for Layer 0 and Layer 1]

SLIDE 21

Two layers is not quite enough

[Diagram: ID space from A to Z, shown for Layer 0 and Layer 1, labeled "Ratio = 1 honest : 10 Sybils" and "Ratio = 10 honest : 100 Sybils"]

SLIDE 22

Log n parallel layers is enough

  • log n layered IDs for each node
  • Lookup steps:
      1. Pick a random layer
      2. Pick a finger to query
      3. GOTO 1 until success or timeout

[Diagram: ID space from A to Z, shown for Layer 0, Layer 1, Layer 2, ..., Layer L]
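The three lookup steps above can be sketched as a loop. The details are assumptions for illustration: integer IDs, choosing the finger at random among the `width` fingers that most closely precede the key, a `max_tries` cap standing in for the timeout, and a `query` callback that returns `None` when a finger has no answer (as a Sybil would).

```python
import random
from bisect import bisect_right


def candidates(fingers, key, width):
    """Up to `width` fingers whose IDs most closely precede `key` (wrapping)."""
    ids = [fid for fid, _ in fingers]
    i = bisect_right(ids, key)
    return fingers[max(0, i - width):i] or fingers[-width:]


def layered_lookup(key, layers, query, rng, width=2, max_tries=100):
    for attempt in range(1, max_tries + 1):
        fingers = rng.choice(layers)                              # 1. pick a random layer
        _, finger = rng.choice(candidates(fingers, key, width))   # 2. pick a finger to query
        value = query(finger, key)
        if value is not None:
            return value, attempt
    return None, max_tries                                        # 3. timeout


# Toy data: in layer 0 the Sybils cluster just before key 500;
# in layer 1 the fingers near the key are honest.
stored = {"honest-3": {500: "18.26.4.9"}}


def query(finger, key):
    return stored.get(finger, {}).get(key)


layer0 = [(100, "honest-1"), (498, "sybil-1"), (499, "sybil-2")]
layer1 = [(120, "honest-2"), (480, "honest-3")]

value, attempts = layered_lookup(500, [layer0, layer1], query, random.Random(3))
print(value, attempts)
```

A lookup restricted to layer 0 would never succeed in this toy setup; choosing the layer at random gives every attempt a constant chance of landing in a layer where the key's neighborhood is not dominated by Sybils.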

SLIDE 23

Main theorem: secure DHT routing

If we run Whānau’s SETUP using:

  1. A social network with walk length = O(log n) and number of attack edges = O(n/log n)
  2. Routing tables of size Ω(√N log N) per node

Then, for any input key and all but εn nodes:

  • Each lookup attempt (i.e., coin flip) succeeds with probability Ω(1)
  • Thus GET(key) uses O(1) messages (expected)
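The step from a constant success probability to O(1) expected messages can be spelled out as follows. Here T (the number of lookup attempts made by GET) and p (a constant lower bound on the success probability of each attempt, given that all earlier attempts failed) are symbols introduced for this note, and one attempt is assumed to cost O(1) messages.

```latex
\Pr[T > t] \le (1-p)^{t}
\quad\Longrightarrow\quad
\mathbb{E}[T] \;=\; \sum_{t \ge 0} \Pr[T > t]
\;\le\; \sum_{t \ge 0} (1-p)^{t}
\;=\; \frac{1}{p} \;=\; O(1).
```

For example, p = 1/4 gives at most four attempts in expectation, independent of n.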
SLIDE 24

Evaluation: Hypotheses

  1. Random walk technique yields good samples
  2. Lookups succeed under clustering attacks
  3. Layered identifiers are necessary for security
  4. Performance scales the same as a one-hop DHT
  5. Whānau handles network failures and churn
SLIDE 25

Method

  • Efficient message-based simulator

– Social network data spidered from Flickr, YouTube, DBLP, and LiveJournal (n=5.2M)
– Clustering attack, varying number of attack edges

  • PlanetLab implementation
SLIDE 26

Escape probability

[Plot: curves for 2M, 200K, and 20K attack edges; x-axis: random walk length, 10 to 80; y-axis ticks: 0.2 to 1. Flickr social network: n ≈ 1.6M, average degree ≈ 9.5]

SLIDE 27

Walk length tradeoff

[Plot: curves for 2M, 200K, and 20K attack edges, plus a "Clumpiness" curve; x-axis: random walk length, 10 to 80; y-axis ticks: 0.2 to 1. Flickr social network: n ≈ 1.6M, average degree ≈ 9.5]

SLIDE 28

Whānau delivers high availability

[Plot: median lookup messages (axis ticks 10 to 40) vs. table size (100 to 1,000,000); curves for 2M attack edges (>n), 200K attack edges, 20K attack edges, and no attacker; 3√n is marked on the plot. Flickr social network: n ≈ 1.6M, 3√n ≈ 4000]
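The 3√n figure follows directly from the size of the Flickr graph; as a quick check of the arithmetic:

```latex
3\sqrt{n} \;\approx\; 3\sqrt{1.6 \times 10^{6}} \;\approx\; 3 \times 1265 \;\approx\; 3795 \;\approx\; 4000.
```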

SLIDE 29

Everything rests on the model…

SLIDE 30

Contributions

  • Whānau: an efficient Sybil-proof DHT

– Use a social network to filter good nodes
– Resist up to O(n/log n) attack edges
– Table size per node: O(√N log N)
– Messages to route: O(1)

  • Introduced layers to combat clustering attacks