Improving Performance in the Gnutella Protocol Jonathan Hess - - PowerPoint PPT Presentation


SLIDE 1

Improving Performance in the Gnutella Protocol

Jonathan Hess Benjamin Poon

University of California at Berkeley Department of Computer Science Cs294-4 Peer-to-Peer Systems

SLIDE 2

Cs294-4 Jonathan Hess | Benjamin Poon 2

Outline

  • Background
  • Motivation
  • Solution
      • Mirroring
      • Directed Search
  • Results
  • Possible Future Work

SLIDE 3

Background

Gnutella
  • Protocol for distributed search
  • No centralization
  • Searches through query flooding

Opponents
  • Censorship and threatening of Gnutella users

SLIDE 4

Motivation

1. Opponents cause decreased participation
2. Decreased participation causes decreased replication of shared files
  • Same files being shared, but not as many copies
3. Decreased replication causes
  • Increased workload for sharing peers
  • Need for deeper query depths
  • Overall decrease in performance

SLIDE 5

Solution

Improve performance given decreased participation
  • Mirroring
  • Directed Search

SLIDE 6

Mirroring – Main Idea

  • Achieve more replication by copying file to a willing peer (a mirror)
  • Only replicate on demand
  • Preserve blame on original sharer of file
  • i.e., mirrors should retain plausible deniability despite sharing the file

SLIDE 7

Mirroring Request Messages

  • Mirror requestor (originator) sends Mirroring Request Message (MRM) to find a client to act as mirror
  • MRM(header, listeningPort, fileIndex)
  • No need to flood
  • Clients pass MRMs on only one randomly chosen outgoing connection
  • MRM TTL should be relatively high
  • Prevents people from intercepting query traffic to see what the file is
  • Con: originator must stay in network in order for mirroring to occur
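
The forwarding rule above (one randomly chosen outgoing connection per hop, bounded by a TTL) can be sketched as a random walk. This is only an illustration under assumptions: the `neighbors` map, the `willing` predicate, and storing the TTL as `header["ttl"]` are hypothetical and do not come from the slides; only the three MRM fields do.

```python
import random
from dataclasses import dataclass


@dataclass
class MRM:
    """Mirroring Request Message; field names follow the slide, types are assumed."""
    header: dict          # assumed to carry the TTL under the key "ttl"
    listening_port: int
    file_index: int


def walk_mrm(start, msg, neighbors, willing, rng):
    """Pass an MRM along one random outgoing connection per hop.

    neighbors: dict of node id -> list of outgoing neighbor ids (hypothetical topology)
    willing:   function(node id) -> bool, True if the node agrees to act as mirror
    Returns the id of the accepting node, or None if the TTL runs out first.
    """
    current = start
    ttl = msg.header["ttl"]
    while ttl > 0:
        outgoing = neighbors.get(current, [])
        if not outgoing:
            return None  # dead end: nowhere to forward the message
        current = rng.choice(outgoing)  # no flooding, a single connection
        ttl -= 1
        if current != start and willing(current):
            return current
    return None


if __name__ == "__main__":
    topology = {"O": ["A"], "A": ["B"], "B": ["C"]}
    msg = MRM(header={"ttl": 15}, listening_port=6346, file_index=3)
    print(walk_mrm("O", msg, topology, lambda n: n == "C", random.Random(0)))
```

Because only one copy of the message travels through the network, a high TTL costs one message per hop rather than an exponentially growing flood.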

SLIDE 8

Mirroring – Sending MRMs

  • Procedure per client sharing n files F_1…F_n
1. Record demand D_i (# uploads) for locally shared file F_i
2. When D_i > mirrorThresh_i, request a mirror
  • Send MRM on one random outbound connection
3. Having a new mirror means we shouldn't create additional mirrors as readily
  • mirrorThresh_i += threshIncrement
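
The three steps above can be sketched as a small per-file counter. This is a minimal sketch, not the authors' implementation: `SharedFile`, `record_upload`, and the `send_mrm` callback are hypothetical names, and the threshold is raised as soon as the request is sent, whereas the slide raises it once a new mirror exists.

```python
from dataclasses import dataclass


@dataclass
class SharedFile:
    file_index: int
    mirror_thresh: int   # mirrorThresh_i
    demand: int = 0      # D_i, number of uploads so far


def record_upload(f, thresh_increment, send_mrm):
    """Count one upload of f and request a mirror when D_i > mirrorThresh_i.

    send_mrm is a hypothetical callback that sends an MRM for the given
    fileIndex on one random outbound connection.
    Returns True if a mirror was requested for this upload.
    """
    f.demand += 1
    if f.demand > f.mirror_thresh:
        send_mrm(f.file_index)
        # Assumption: raise the threshold at request time, so the next
        # mirror is requested less readily.
        f.mirror_thresh += thresh_increment
        return True
    return False


if __name__ == "__main__":
    sent = []
    song = SharedFile(file_index=7, mirror_thresh=2)
    for _ in range(6):
        record_upload(song, thresh_increment=3, send_mrm=sent.append)
    print(sent, song.mirror_thresh)
```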

SLIDE 9

Mirroring – Receiving MRMs

1. Mirror M sends file transfer request for MRM.fileIndex to originator O
2. O receives request for fileIndex
3. O adds M to its list of mirrors of fileIndex
4. O sends M encrypted file associated with fileIndex
  • Preserves plausible deniability for mirror
  • Con: still a possibility for a client to figure out what original file was – how?

SLIDE 10

Mirroring – Using Mirrors

  • Procedure for originator of MRMs
      • If originator has enough bandwidth
          • Serve files
      • If not enough bandwidth
          • Check if there are mirrors for fileIndex
          • If no mirrors
              • Proceed according to original Gnutella protocol
          • If has mirrors
              • Multiplex requests over set of mirrors M_1...M_x
              • Send QueryHits as if they were from M_i (1 <= i <= x) containing the decryption key
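
The decision procedure above can be sketched as follows. The class and method names are hypothetical, the round-robin rotation is just one assumed way to "multiplex" over mirrors, and no real encryption or Gnutella message format is modelled: a QueryHit is reduced to a pair of serving address and optional decryption key.

```python
from dataclasses import dataclass, field


@dataclass
class Originator:
    address: str
    mirrors: dict = field(default_factory=dict)  # fileIndex -> list of mirror addresses
    keys: dict = field(default_factory=dict)     # fileIndex -> decryption key
    turn: dict = field(default_factory=dict)     # fileIndex -> next mirror position

    def add_mirror(self, file_index, mirror, key):
        """Record a peer that fetched the encrypted copy of file_index."""
        self.mirrors.setdefault(file_index, []).append(mirror)
        self.keys[file_index] = key

    def query_hit(self, file_index, has_bandwidth):
        """Return (serving address, decryption key or None) for a matching query."""
        candidates = self.mirrors.get(file_index, [])
        if has_bandwidth or not candidates:
            # Serve the plain file ourselves, as in the original protocol.
            return (self.address, None)
        # Assumed multiplexing policy: rotate through the mirrors in order.
        i = self.turn.get(file_index, 0)
        self.turn[file_index] = (i + 1) % len(candidates)
        # Answer as if the hit came from mirror M_i, and include the key.
        return (candidates[i], self.keys[file_index])


if __name__ == "__main__":
    o = Originator("O")
    o.add_mirror(1, "M1", b"secret")
    o.add_mirror(1, "M2", b"secret")
    print([o.query_hit(1, has_bandwidth=False)[0] for _ in range(3)])
```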

SLIDE 11

Directed Search – Motivation

  • As the ratio of free-loaders to serving peers increases, search moves towards needle-in-a-haystack
  • Flood excels at finding piles of hay
  • Much research effort has gone into successive deepening and file indexing
  • Directed search is not as well understood

SLIDE 12

Directed Search – Main Idea

  • Pay a one-time up-front cost for a Bloom filter broadcast
  • Nodes within N hops merge filter into a collection associated with each edge
  • Collection is depth aware
  • Upon receiving a query, forward message to n edges with highest scores
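
A rough sketch of the idea above: each node keeps, per outgoing edge, one merged Bloom filter per hop distance, and ranks edges when a query arrives. The slides do not give the score function, so the 1/hops weighting below is an assumption, as are the class names, the SHA-256 based hashing, and the default of 3 hash functions.

```python
import hashlib


class BloomFilter:
    def __init__(self, buckets=384, hashes=3):
        self.buckets = buckets
        self.hashes = hashes
        self.bits = 0  # bit set stored in a Python int

    def _positions(self, item):
        for k in range(self.hashes):
            digest = hashlib.sha256(f"{k}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.buckets

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item):
        return all((self.bits >> p) & 1 for p in self._positions(item))

    def merge(self, other):
        """Union with another filter of the same size (bitwise OR)."""
        self.bits |= other.bits


class EdgeTable:
    """Depth-aware collection: per edge, one merged filter per hop distance."""

    def __init__(self, max_depth, buckets=384):
        self.max_depth = max_depth
        self.buckets = buckets
        self.filters = {}  # edge -> list of BloomFilter, index = hops - 1

    def receive(self, edge, hops, bloom):
        levels = self.filters.setdefault(
            edge, [BloomFilter(self.buckets) for _ in range(self.max_depth)])
        levels[hops - 1].merge(bloom)

    def score(self, edge, filename):
        # Assumed score: a match h hops away contributes 1/h.
        return sum(1.0 / (d + 1)
                   for d, f in enumerate(self.filters.get(edge, []))
                   if filename in f)

    def best_edges(self, filename, n):
        """Return the n edges with the highest scores for this query."""
        ranked = sorted(self.filters,
                        key=lambda e: self.score(e, filename), reverse=True)
        return ranked[:n]
```

A node would call `receive` for every filter broadcast that arrives within N hops, and `best_edges` for every query it has to forward.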

SLIDE 13

Directed Search

  • Query reaches n^queryTTL nodes
  • n may be much smaller than out-degree, and queryTTL can be larger than normal TTLs
  • n^queryTTL < out-degree^TTL
  • Reach more and better users
  • Avoid free-loaders
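
The inequality above can be checked with a quick calculation. All four numbers below are hypothetical, chosen only to show that a smaller fan-out with a larger TTL can still touch fewer nodes than a flood; the count ignores cycles and duplicate deliveries, so it is an upper bound.

```python
def reach(fanout, ttl):
    """Nodes reached at the last hop when every node forwards to `fanout`
    edges for `ttl` hops, ignoring cycles and duplicates."""
    return fanout ** ttl


# Hypothetical flood: out-degree 4, TTL 5.
flood = reach(4, 5)
# Hypothetical directed search: n = 2 best edges, queryTTL 7.
directed = reach(2, 7)

if __name__ == "__main__":
    print(flood, directed, directed < flood)
```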

SLIDE 14

Results

Simulation: BloomNet
  • Models real-world Gnutella network as closely as possible
  • Uses statistics from many previous measurement studies of Gnutella networks

File sharing/requesting
  • Master filename list of 5072 files
  • Each client chooses to share certain number of files from master list
  • Queries generated by taking a random filename at most once from master list according to modified Zipf distribution (à la Efficient search in peer-to-peer networks, B. Yang, H. Garcia-Molina)
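
Query generation of this kind can be sketched as weighted sampling without replacement. The slides do not say how the Zipf distribution was modified, so this sketch uses a plain Zipf weighting (weight 1/rank^exponent) as a stand-in; the function names and the `exponent` parameter are assumptions, and "at most once" is read as each filename being drawn at most once per generated sequence.

```python
import random


def zipf_weights(num_files, exponent=1.0):
    """Plain Zipf weights: the file at rank r gets weight 1 / r**exponent."""
    return [1.0 / (rank ** exponent) for rank in range(1, num_files + 1)]


def generate_queries(master_list, num_queries, rng, exponent=1.0):
    """Draw distinct filenames, favouring low-rank (popular) entries."""
    remaining = list(master_list)
    weights = zipf_weights(len(remaining), exponent)
    queries = []
    for _ in range(min(num_queries, len(remaining))):
        i = rng.choices(range(len(remaining)), weights=weights)[0]
        # Remove the chosen name so it cannot be queried again.
        queries.append(remaining.pop(i))
        weights.pop(i)
    return queries


if __name__ == "__main__":
    master = [f"file{i}" for i in range(5072)]  # list size from the slide
    print(generate_queries(master, 5, random.Random(0)))
```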

SLIDE 15

Results – Overview

Advantages
  • BloomNet finds hits better than Gnutella
      • Uses approximately 3x less query bandwidth
      • As network size increases, gap in performance increases
  • BloomNet achieves higher % successful queries than Gnutella
      • Uses approximately 3x less query bandwidth

Disadvantages
  • 20% more total bandwidth used to run BloomNet
      • Can be improved using different Bloom parameters

SLIDE 16

Results – Query Success

[Figure: Query Success Over Bloom Parameters. x-axis: Bloom Parameters (Depth/Buckets): 0/0, 3/384, 3/768, 3/1536, 3/3072, 4/384, 4/768, 4/1536, 4/3072. y-axis: Query Success, ticks from 0.05 to 0.45.]

SLIDE 17

Results – Query Bandwidth

[Figure: Query Bandwidth Over Bloom Parameters. x-axis: Bloom Parameters (Depth/Buckets): 0/0, 3/384, 3/768, 3/1536, 3/3072, 4/384, 4/768, 4/1536, 4/3072. y-axis: Query Bandwidth, ticks from 10 to 70.]

SLIDE 18

Results – Total Bandwidth

[Figure: Total Bandwidth Over Bloom Parameters. x-axis: Bloom Parameters (Depth/Buckets): 0/0, 3/384, 3/768, 3/1536, 3/3072, 4/384, 4/768, 4/1536, 4/3072. y-axis: Total Bandwidth, ticks from 200 to 1800.]

SLIDE 19

Possible Future Work

Mirroring

  • More sophisticated demand realization techniques – gossiping protocols?

Directed Search

  • Only highly-connected peers exchange Bloom Filters
  • Better score functions for edge selection
  • Better understanding of filter merging

SLIDE 20

Questions

SLIDE 22

Simulation Parameters

  Parameter     Value
  Clients       1024
  Bloom Depth   3-4
  Bloom Size    384-3072
  Ping TTL      5
  Query TTL     5-7
  Mirror TTL    15