
Modeling Peer-Peer File Sharing Systems
Zihui Ge, Daniel R. Figueiredo, Sharad Jaiswal, Jim Kurose, Don Towsley
INFOCOM 2003


  1. Modeling Peer-Peer File Sharing Systems Zihui Ge, Daniel R. Figueiredo, Sharad Jaiswal, Jim Kurose, Don Towsley INFOCOM 2003

  2. Outline
     - P2P file sharing architectures
       - Napster (centralized)
       - Gnutella (flooding)
       - Chord (routing)
     - General model framework for P2P file sharing
       - closed queuing network
       - model parameters
       - model solution
     - Apply model to study performance
       - scalability, freeloaders, etc.
     - Summary

  3. User behavior in P2P file sharing
     [Figure: timeline of one user request. The user generates a query; search phase (query, response); file request; file transfer phase (file transfer).]

  4. Centralized Indexing Architecture (CIA)
     - Central server (cluster) stores global index
     - Napster
     [Figure: peers N1 to N5 and a central DB connected through the Internet; the user issues Lookup(“LetItBe”); one peer is marked as the file owner. Timeline: user generates query; search phase (query, response); file request; file transfer phase.]

  5. Distributed Indexing with Flooded queries Architecture (DIFA)
     - Limited-scope flooding to locate files
     - Gnutella
     [Figure: peers N1 to N5 connected through the Internet; the user issues Lookup(“LetItBe”); one peer is marked as the file owner. Timeline: user generates query; search phase (query, response); file request; file transfer phase.]
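The limited-scope flooding named on the DIFA slide can be sketched as a breadth-first search with a hop limit. This is a minimal illustration, not Gnutella's actual protocol: the overlay topology, the `ttl` parameter name, and the choice of N2 as file owner are assumptions made for the example.

```python
from collections import deque


def flood_query(graph, start, has_file, ttl):
    """Return the peers holding the file that are reachable within `ttl` hops."""
    visited = {start}
    frontier = deque([(start, ttl)])
    hits = []
    while frontier:
        node, hops_left = frontier.popleft()
        if has_file(node):
            hits.append(node)
        if hops_left == 0:
            continue  # scope exhausted: do not forward the query further
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, hops_left - 1))
    return hits


# Hypothetical overlay: a ring of five peers.
overlay = {
    "N5": ["N4", "N1"],
    "N4": ["N5", "N3"],
    "N1": ["N5", "N2"],
    "N3": ["N4", "N2"],
    "N2": ["N1", "N3"],
}
owners = {"N2"}  # assumed file owner

near = flood_query(overlay, "N5", lambda n: n in owners, ttl=1)
far = flood_query(overlay, "N5", lambda n: n in owners, ttl=2)
```

With a scope of one hop the query from N5 never reaches N2, while a scope of two hops does, which is the trade-off between query cost and the chance of locating a file.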

  6. Distributed Indexing with Hash-directed queries Architecture (DIHA)
     - Hash-directed query
     - Tapestry, Chord, CAN
     - Only handles exact query
     [Figure: peers N1 to N5 connected through the Internet, each labeled RT; the user issues Lookup(“LetItBe”) and Hash(“LetItBe”)=N4 directs the query to N4; one peer is marked as the file owner. Timeline: search phase (query, response); file request; file transfer phase.]
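The idea behind a hash-directed query can be sketched as follows: the file name is hashed onto an identifier ring and the query goes to the node responsible for that identifier. This is a simplified sketch that assigns a key to the first node at or after its identifier (the successor rule used in consistent hashing); it omits routing tables entirely, and the ring size and node names are assumptions for the example.

```python
import hashlib
from bisect import bisect_left

RING_BITS = 16  # hypothetical, deliberately small identifier space


def ring_id(name: str) -> int:
    """Map a string to a position on the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


def responsible_node(key: str, node_names):
    """Return the node whose ring position is the successor of the key's position."""
    ring = sorted((ring_id(n), n) for n in node_names)
    ids = [node_id for node_id, _ in ring]
    # Wrap around to the first node when the key hashes past the last node.
    idx = bisect_left(ids, ring_id(key)) % len(ring)
    return ring[idx][1]


nodes = ["N1", "N2", "N3", "N4", "N5"]
owner = responsible_node("LetItBe", nodes)
```

Because the lookup hashes the whole name, a query for a partial or misspelled name generally maps to an unrelated node, which is why this architecture only handles exact queries.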

  7. File transfers
     - File transfers directly between file owner and receiver
     [Figure: peers N1 to N5 connected through the Internet; the user issues download(“LetItBe”) to the file owner. Timeline: user generates query; search phase (query, response); file request; file transfer phase.]

  8. Modeling P2P file sharing systems: challenges
     - Unique workload/service model:
       - peers generate workload (queries, downloads) but also add service capacity (file sharing, query processing)
     - Complex peer behavior:
       - transient: off-line, on-line (inactive), on-line (active: query, download)
       - different classes of peers:
         - freeloaders
         - service capacity

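  9. A general model
     [Figure: closed queueing network. The On-Line part contains Thinking, Query Processing, and File download (queues 1 to M, entered with probabilities $p_1, \dots, p_M$); an Off-Line station sits outside it. Transitions are also labeled $q$ and $p_{off}$.]
     - Closed loop, fixed population of peers
     - No structural dependency on architecture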

  10. A general model w/ multiple classes of peers
     [Figure: the same closed queueing network (Off-Line, Thinking, Query Processing, File download queues 1 to M), now shared by several peer classes.]
     - Different classes of peers have different behaviors

  11. Modeling query processing
     - Modeled by a single-server queue
     - Service rate of the queue is a function of # peers on-line: $\mu_q(N_a)$
     - Query failure probability ($q$) associated with each file request

  12. Modeling file downloading
     - Associate each unique file in the system with a “service capacity”
       - modeled by a single-server queue
     - Requests chosen w/ probability $p_j$: rank $j$ (by request popularity)
     - Service capacity $\mu_f(N_a, i)$ is a function of # replicas:
       - file availability: rank $i$ (by # of replicas)
       - # peers on-line

  13. Model parameters
     - Capacity for processing queries in CIA: $C_1$
     - Capacity for processing queries in DIHA: $C_3 N_a / \log N_a$
     - Capacity for downloading a file w/ availability rank $i$: $C_0 N_a / i^{\alpha}$
       - with multiple classes of peers, $N_a$ is replaced by the weighted sum $\sum_c W^{(c)} N_a^{(c)}$
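The capacity functions on the model-parameters slide can be written out directly. The sketch below assumes the garbled slide reads $C_1$ for CIA query processing, $C_3 N_a / \log N_a$ for DIHA query processing, and $C_0 N_a / i^{\alpha}$ for file download; the numeric values of the constants and of the exponent are hypothetical, since the slides give only the symbols.

```python
import math

# Hypothetical values; the slides name only the symbols C_0, C_1, C_3 and alpha.
C0, C1, C3 = 1.0, 100.0, 1.0
ALPHA = 1.0


def query_capacity_cia(n_online):
    """Central index: capacity does not grow with the number of on-line peers."""
    return C1


def query_capacity_diha(n_online):
    """Hash-directed queries: capacity grows as N_a / log N_a (needs n_online > 1)."""
    return C3 * n_online / math.log(n_online)


def download_capacity(n_online, availability_rank):
    """Download capacity of the file with the given availability rank."""
    return C0 * n_online / availability_rank ** ALPHA


def download_capacity_multiclass(online_by_class, weight_by_class, availability_rank):
    """Multi-class variant: N_a is replaced by a weighted sum over peer classes."""
    weighted = sum(weight_by_class[c] * n for c, n in online_by_class.items())
    return C0 * weighted / availability_rank ** ALPHA
```

For example, `download_capacity(1000, 1)` exceeds `download_capacity(1000, 10)` because the most replicated file has the most peers able to serve it, and a class with weight 0 in `weight_by_class` adds no download capacity.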

  14. Model solutions
     - Performance metric: expected system throughput (# files downloaded per unit of time)
     - Approximate numerical solution
       - bottleneck analysis with multiple classes of peers
       - set of non-linear equations, solved via fixed-point
       - mostly independent of service rate functions
         - flexibility to use other functions
     - Simulation in more general cases
       - approximations validated
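Solving a set of non-linear equations via fixed-point iteration can be sketched generically as below. The equations used here are a toy system chosen only to make the example runnable; they are not the model's equations, and the damping factor, tolerance, and function names are assumptions.

```python
import math


def fixed_point(g, x0, damping=0.5, tol=1e-10, max_iter=10_000):
    """Iterate x <- (1 - damping) * x + damping * g(x) until successive steps are below tol."""
    x = list(x0)
    for _ in range(max_iter):
        gx = g(x)
        new_x = [(1 - damping) * xi + damping * gi for xi, gi in zip(x, gx)]
        if max(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    raise RuntimeError("fixed-point iteration did not converge")


def toy_system(x):
    """Toy coupled equations: a = cos(b), b = a / 2 (illustration only)."""
    a, b = x
    return [math.cos(b), 0.5 * a]


solution = fixed_point(toy_system, [1.0, 1.0])
```

Damping is a common safeguard when the undamped iteration oscillates; whether the model's own equations need it is not stated in the slides.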

  15. Scalability with Population
     - Workload:
       - 1000 files
       - 12 hours off-line
       - 30 minutes on-line idle period
       - average of 5 downloads per active period
     [Figure: System Throughput T (log scale, 1 to 1000) versus Total Population N (1.E+04 to 1.E+09); curves for CIA, DIFA, DIHA.]
     - System throughput scales with population size in distributed indexing architectures

  16. Impact of Freeloaders
     - 100,000 non-freeloaders
     - Freeloaders:
       - Do not share files
       - Support query processing
       - More aggressive
         - shorter think times
         - double file downloads
     [Figure: System Throughput (0 to 160) versus Number of Freeloaders (in thousands, 0 to 4000); curves for CIA, DIFA, DIHA.]
     - P2P can support large number of freeloaders

  17. Impact of Freeloaders (Cont’d)
     - 100,000 non-freeloaders
     - Freeloaders impact non-freeloaders
       - marginal effect when system not saturated
     [Figure: Throughput of non-freeloader (0 to 12) versus Number of Freeloaders (in thousands, 0 to 4000); curves for CIA, DIFA, DIHA.]
     - P2P can support large number of freeloaders

  18. Mismatch between file availability and request popularity
     - What if service capacity doesn’t match popularity?
     - Each file ranked by
       - request popularity ($j$)
       - # replicas available ($i$)
     - Randomly match ranks within a window
       - $i - w < j < i + w$
     [Figure: System throughput versus rank permutation window $w$ (log scale, 1 to 10000) for 500,000 peers; curves for CIA, DIFA, DIHA.]
     - Small mismatches have little effect; large mismatches do
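One way to produce a random rank matching that satisfies the window constraint $i - w < j < i + w$ is to shuffle ranks inside consecutive blocks of $w$ files. This is a sketch of one possible construction, not necessarily the procedure used in the study, and it does not sample uniformly from all permutations that satisfy the constraint; the function and parameter names are hypothetical.

```python
import random


def windowed_rank_permutation(num_files, w, rng):
    """Return popularity ranks j, listed by availability rank i = 1..num_files,
    such that |j - i| < w for every file."""
    popularity_rank = [0] * (num_files + 1)  # index 0 unused; ranks start at 1
    for block_start in range(1, num_files + 1, w):
        block = list(range(block_start, min(block_start + w, num_files + 1)))
        shuffled = block[:]
        rng.shuffle(shuffled)
        # Ranks never leave their block, so they move by at most w - 1 positions.
        for i, j in zip(block, shuffled):
            popularity_rank[i] = j
    return popularity_rank[1:]


perm = windowed_rank_permutation(1000, 10, random.Random(0))
```

With `w = 1` the matching is the identity (availability and popularity ranks agree); larger windows allow a popular file to have few replicas.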

  19. Supernodes (Kazaa)
     - Kazaa: 2-level hierarchy
       - top-level
         - well provisioned supernodes (higher capacity)
         - gnutella-like
       - bottom-level
         - connect to single supernode
     [Figure: System Throughput (0 to 1400) versus Total Population N (0.E+00 to 4.E+08); one curve per ratio # supernodes : # total nodes = 1:1, 1:6, 1:10, 1:11, 1:12, 1:20.]
     - Hierarchical design improves system throughput

  20. Summary
     - Simple models: insights into fundamental performance questions of P2P file sharing systems
       - compare different architectures
       - scalability on peer population
       - impact of freeloaders
       - impact of imbalance of file availability and request popularity
     - Model extensions:
       - hierarchical peer structure
       - off-line to on-line transition phase
