
Making Gnutella-like P2P Systems Scalable - PowerPoint PPT Presentation

Making Gnutella-like P2P Systems Scalable. Presented by: Karthik Lakshminarayanan. Central philosophy of the work: file-sharing is a dominant P2P application; DHTs might not be suitable for file-sharing; Gnutella's design.


  1. Making Gnutella-like P2P Systems Scalable
     Yatin Chawathe, Sylvia Ratnasamy, Lee Breslau, Nick Lanham, and Scott Shenker
     Presented by: Karthik Lakshminarayanan

     Central philosophy of the work
     • File-sharing is a dominant P2P application
     • DHTs might not be suitable for file-sharing
     • Gnutella's design
       – Simplicity
       – Unscalable (number of queries and system size)
     • Improve Gnutella
       – Adapt overlay topology and search algorithms to accommodate heterogeneity

     Why not DHTs?
     • P2P clients are extremely transient
       – Can DHTs handle churn as well as unstructured systems?
       – How would Bamboo compare with Gnutella?
     • Keyword searches are more prevalent
       – Inverted indices might not scale
       – No unambiguous naming convention
     • Most queries are for hay (not needles)
       – Well-replicated content is queried for most

     Gnutella's scaling problems
     • Gnutella performs flooding-based search
       – Finds files if they are replicated at a small number of nodes
       – Obvious scaling issues
     • Random walks
       – Forwarding is oblivious to node contents
       – Forwarding is oblivious to node load
     • Bias towards high degree
       – Node capacity still not taken into account

  2. GIA design
     1. Dynamic topology adaptation
        – Nodes are close to high-capacity nodes
     2. Active flow control scheme
        – Avoid overloaded hot-spots
        – Explicitly handles heterogeneity
     3. One-hop replication of pointers to content
        – Allows high-capacity nodes to answer more queries
     4. Search protocol
        – Based on random walks towards high-capacity nodes
     Exploit heterogeneity

     1. Topology adaptation
     • Goal: Make high-capacity nodes have high degree (i.e., more neighbors)
     • Each node has a level of satisfaction, S
       – S = 0 if no neighbors (dissatisfied)
       – S = 1 if enough good neighbors (fully satisfied)
       – S is a function of capacity, degree, age of neighbors and capacity of node
       – Improve the neighbor set as long as S < 1

     1. Topology adaptation
     • Improving neighbor set
       – Pick a new neighbor
       – Decide whether to preempt an existing neighbor
         • Depends on degree, capacity of neighbors
         • Asymmetric links?
     • Issues
       – Avoid oscillations: use hysteresis
       – Converge to a stable state

     2. Proactive flow control
     • Allocate tokens to neighbors based on processing capability
       – Cannot perform arbitrary dropping due to random walk mechanism of GIA
     • Allocation is proportional to neighbors' capacities
       – Incentive to announce true capacities
     • Uses token assignment based on SFQ
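The proportional token allocation on the flow control slide can be illustrated with a short sketch. This is not GIA's actual scheduler: the slides say token assignment is based on SFQ, while the code below uses a plain proportional split for clarity. The function name `allocate_tokens`, the `tokens_per_round` parameter, and the capacity values are all hypothetical.

```python
def allocate_tokens(neighbor_capacities, tokens_per_round):
    """Split a node's tokens among neighbors in proportion to their
    advertised capacities (simplified stand-in for SFQ-based assignment)."""
    total = sum(neighbor_capacities.values())
    if total <= 0:
        # No usable capacity information: hand out nothing.
        return {name: 0.0 for name in neighbor_capacities}
    return {
        name: tokens_per_round * capacity / total
        for name, capacity in neighbor_capacities.items()
    }


# Hypothetical neighbors whose capacities differ by orders of magnitude.
shares = allocate_tokens({"a": 1000, "b": 100, "c": 10}, tokens_per_round=111)
for name, share in shares.items():
    print(name, round(share, 3))
```

A neighbor that understates its capacity receives fewer tokens and can therefore forward fewer of its own queries, which is the incentive the slide mentions.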

  3. 3. One-hop replication
     • Each GIA node maintains index of contents of all neighbors
     • Exchanged during neighbor setup
     • Periodically incrementally updated
     • Flushed on node failures

     4. Search protocol
     • Biased random walk
       – Pick highest capacity node to which it has tokens
       – If no tokens, queues till tokens arrive
     • TTLs to bound duration of random walks
     • Book-keeping
       – Maintain list of neighbors to which a query (unique GUID) has been forwarded

     System model
     • Capacities of nodes based on UW study
       – Separated by 4 orders of magnitude
     • Query generation rate for each node
       – Limited by node capacity
     • Keyword queries are performed
       – Files are randomly replicated
     • Control traffic consumes resources
     • Use uniformly random graphs
       – Prevent bias against FLOOD and RWRT

     Simulation results
     • Compare four systems
       – FLOOD: TTL-scoped, random topologies
       – RWRT: Random walks, random topologies
       – SUPER: Supernode-based search
       – GIA: search using GIA protocol suite
     • Metric:
       – Success-rate, Delay, Hop-count
     • Knee/collapse point at a particular query rate
       – Collapse point:
         • Per-node query rate at the knee
         • Aggregate throughput that the system can sustain
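The one-hop replication and search protocol slides can be tied together with a small sketch of a single forwarding step. It is a simplified reading of the bullets, not the GIA implementation: the function names, the dictionary-based data structures, and the substring keyword match are assumptions made for illustration.

```python
def local_matches(keyword, own_files, neighbor_index):
    """Answer a keyword query from the node's own files plus the
    one-hop index of its neighbors' contents (substring match assumed)."""
    hits = [("self", name) for name in own_files if keyword in name]
    for neighbor, files in neighbor_index.items():
        hits.extend((neighbor, name) for name in files if keyword in name)
    return hits


def next_hop(neighbors, capacity, tokens, already_sent):
    """Biased random walk step: choose the highest-capacity neighbor for
    which we hold a token and which has not yet received this query.
    Returns None when no such neighbor exists, in which case the caller
    would queue the query until tokens arrive."""
    candidates = [
        n for n in neighbors
        if tokens.get(n, 0) > 0 and n not in already_sent
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda n: capacity[n])


# Hypothetical node state.
capacity = {"a": 10, "b": 1000, "c": 100}
tokens = {"a": 1, "b": 0, "c": 2}          # no tokens for b right now
sent_for_guid = set()                       # book-keeping for one query GUID

hop = next_hop(["a", "b", "c"], capacity, tokens, sent_for_guid)
print(hop)
print(local_matches("song", ["song1.mp3"], {"c": ["song2.mp3", "doc.txt"]}))
```

In a fuller version the TTL would be decremented on each hop and `sent_for_guid` would be stored per query GUID, matching the book-keeping bullet.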

  4. Questions addressed by simulations
     • What is the relative performance of the four algorithms?
     • Which of the GIA components matters the most?
     • What is the impact of heterogeneity?
     • How does the system behave in the face of transient nodes?

     Single search response
     [Figure: collapse point (qps/node) versus replication rate (percentage) for GIA, SUPER, RWRT, and FLOOD, each with N=10,000]
     • GIA outperforms SUPER, RWRT & FLOOD by many orders of magnitude in terms of aggregate query load
     • Also scales to very large network sizes, as the replication factor determines scalability

     Factor Analysis
     Algorithm       Collapse point     Algorithm       Collapse point
     RWRT            0.0005             GIA             7
     RWRT+OHR        0.005              GIA – OHR       0.004
     RWRT+BIAS       0.0015             GIA – BIAS      6
     RWRT+TADAPT     0.001              GIA – TADAPT    0.2
     RWRT+FLWCTL     0.0006             GIA – FLWCTL    2
     10000 nodes, 0.1% replication
     • No single component is useful by itself; the combination of all of them is what makes GIA scalable

     Impact of Heterogeneity
     10000 nodes, 0.1% replication
     • GIA improves under heterogeneity
     • Large CP-HC for GIA under uniform capacities as queries are directed towards high capacity nodes

  5. Node failures
     [Figure: collapse point (qps/node) versus per-node max-lifetime (seconds) for replication rates of 1.0%, 0.5%, and 0.1%; 10000 nodes, GIA system]
     • Even under heavy churn GIA outperforms the other algorithms (under no churn) by many orders of magnitude

     Implementation
     • Capacity settings
       – Bandwidth, CPU, disk access
       – Configured by user
     • Satisfaction level
       – Based on capacity, degree, age of neighbors and capacity of node
       – Adaptation interval I = T × K^(-(1-S)), K = degree of aggressiveness
     • Query resilience
       – Keep-alive message periodically sent
       – Optimizations on adaptation to avoid query dropping

     Deployment
     • Ran GIA on 83 nodes of PlanetLab for 15 min
     • Artificially imposed capacities on nodes
     • Progress of topology adaptation shown

     Conclusions
     • GIA: scalable Gnutella
       – 3–5 orders of magnitude improvement in system capacity
     • Unstructured approach is good enough!
       – DHTs may be overkill
       – Incremental changes to deployed systems
     • Can DHTs be used for file-sharing at all?
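The adaptation interval from the implementation slide, I = T × K^(-(1-S)), can be checked numerically with a few lines. The slides do not define T; from the formula it is the interval used when the node is fully satisfied (S = 1). The sketch assumes K > 1, and the values T = 10 and K = 256 are illustrative choices, not taken from the slides.

```python
def adaptation_interval(t, k, satisfaction):
    """Compute I = T * K^(-(1 - S)) for a satisfaction level S in [0, 1]."""
    if not 0.0 <= satisfaction <= 1.0:
        raise ValueError("satisfaction must lie in [0, 1]")
    return t * k ** -(1.0 - satisfaction)


# Illustrative parameters (assumed, not from the slides).
T, K = 10.0, 256.0
for s in (0.0, 0.5, 1.0):
    print(s, adaptation_interval(T, K, s))
```

With K > 1, a dissatisfied node (S near 0) gets an interval of roughly T/K and adapts its neighbor set often, while a fully satisfied node waits the full interval T.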
