Analysis of peer-to-peer systems: workload characterization and effects on traffic cacheability

Mauro Andreolini, University of Rome “Tor Vergata”
Riccardo Lancellotti, University of Modena and Reggio Emilia
Philip S. Yu, IBM T.J. Watson Research Center

File sharing
- Killer application of peer-to-peer systems
- More than 10^5 peers involved
- More than 30% of Internet traffic is related to file sharing
- Not yet widely studied
- Our contribution:
  - Workload overview
  - Analytical models of some workload characteristics
  - Analysis of factors reducing cacheability

Experimental methodology
- Traffic interception
  - Analyzes actual file-sharing traffic
  - Needs representative traffic to analyze (e.g., backbone links)
- Crawling
  - Crawler sends queries and analyzes responses
  - Needs known protocols: Gnutella network
  - Does not need high-traffic links
  - Some workload characteristics (e.g., resource popularity) are defined differently than with packet interception

Overview of experiments
[Figure: the crawler sends queries to the file-sharing network and collects the responses]
- Crawling for nearly three months (Aug-Oct 2003)
- Average of 78,900 nodes per crawler run, with peaks above 100,000 nodes
- Up to 1,500,000 resources per run
- File sharing is a killer application for P2P

Working set composition
- 4 sets of resources: video, audio, documents, archives
- Type identification based on filename extension
  - Sample downloads show that the extension reliably identifies the file type
- Results stable over time
- For each type we consider
  - shared resources
  - shared bytes
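The extension-based classification above can be sketched as follows. This is a minimal illustration, not the authors' tool: the extension lists and the names `EXTENSION_TYPES`, `classify`, and `composition` are assumptions, since the slides do not say which extensions were mapped to each type.

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical extension lists; the slides do not state the real mapping.
EXTENSION_TYPES = {
    "video": {".avi", ".mpg", ".mpeg", ".mov", ".wmv"},
    "audio": {".mp3", ".wav", ".ogg", ".wma"},
    "documents": {".pdf", ".doc", ".txt", ".ps"},
    "archives": {".zip", ".rar", ".gz", ".tar", ".iso"},
}


def classify(filename):
    """Return the resource type for a filename, or 'other' if unknown."""
    ext = PurePosixPath(filename.lower()).suffix
    for rtype, exts in EXTENSION_TYPES.items():
        if ext in exts:
            return rtype
    return "other"


def composition(resources):
    """Aggregate (filename, size_bytes) pairs into per-type file and byte counts."""
    files, volume = Counter(), Counter()
    for name, size in resources:
        rtype = classify(name)
        files[rtype] += 1      # shared resources
        volume[rtype] += size  # shared bytes
    return files, volume
```

Calling `composition` on the list of resources returned by one crawler run would give the two views used in the following slides: the share of files and the share of bytes per type.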

Working set composition by type
Audio clips account for most of the shared files

Working set composition by type
Archives account for most of the shared bytes

Working set composition by type
[Figure: shared files and shared bytes broken down into video, audio, documents, and archives]
Our result confirms the observations of Leibowitz et al. (obtained through traffic interception)

Analytical models
- Resource size according to type
  - Video and archives: heavy-tailed size distribution
    - Lognormal body
    - Pareto tail
  - Audio and documents: lognormal size distribution (not heavy-tailed)
- Volume shared by each node
  - Lognormal body, Pareto tail
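One simple way to generate synthetic sizes with a lognormal body and a Pareto tail is to splice the two distributions at a cutoff. The sketch below does that with the Python standard library. The function name `sample_size` and all parameters (`mu`, `sigma`, `cutoff`, `alpha`, `tail_prob`) are illustrative assumptions; the slides do not report fitted parameter values or how the body and tail were joined.

```python
import random


def sample_size(rng, mu, sigma, cutoff, alpha, tail_prob):
    """Draw one size from a lognormal body spliced with a Pareto tail.

    With probability tail_prob the value comes from a Pareto distribution
    with shape alpha that starts at cutoff. Otherwise it comes from a
    lognormal(mu, sigma) restricted to values below cutoff, obtained by
    rejection sampling.
    """
    if rng.random() < tail_prob:
        # paretovariate returns values >= 1, so scale by the cutoff.
        return cutoff * rng.paretovariate(alpha)
    while True:
        x = rng.lognormvariate(mu, sigma)
        if x < cutoff:
            return x


# Hypothetical parameters, chosen only to make the example run.
rng = random.Random(0)
sizes = [sample_size(rng, mu=15.0, sigma=1.5, cutoff=1e8, alpha=1.2, tail_prob=0.1)
         for _ in range(1000)]
tail_fraction = sum(s >= 1e8 for s in sizes) / len(sizes)
```

Setting `tail_prob` to 0 gives a lognormal restricted to sizes below `cutoff`, which roughly corresponds to the non-heavy-tailed case described for audio and documents.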

Analytical models

Analytical models Volume of resources shared by each node

File sharing traffic cacheability
- Common belief: “File sharing download is based on HTTP, hence we can use off-the-shelf Web caches”
  - Not completely true
- Cache hit rate estimation should take into account two differences from Web traffic
  - Resource identifiers
    - File name
    - Hash code
  - Firewalled nodes with unroutable IP addresses

Filename vs. Content hash
For popular resources the filename is not a suitable identifier: multiple files share the same name

Filename vs. Hash: Impact on cacheability
- Previous studies based on traffic interception used filenames as resource IDs
- Using the name as resource ID leads to
  - Over-estimation of the Zipf alpha parameter (popularity seems more skewed)
  - Under-estimation of the working set size (with hashes we have a greater number of distinct resources)
  - A cache hit rate that seems higher than it is
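A toy example makes the effect of the identifier choice concrete. The records below are made up for illustration (the field names `name` and `content_hash` and the helper `popularity` are assumptions, not the crawler's data format): one popular filename hides three different contents.

```python
from collections import Counter


def popularity(records, key):
    """Count replicas per resource ID, where key is 'name' or 'content_hash'."""
    return Counter(r[key] for r in records)


# Hypothetical crawl records: the same popular filename maps to different contents.
records = [
    {"name": "song.mp3", "content_hash": "aa11"},
    {"name": "song.mp3", "content_hash": "aa11"},
    {"name": "song.mp3", "content_hash": "bb22"},
    {"name": "song.mp3", "content_hash": "cc33"},
    {"name": "talk.avi", "content_hash": "dd44"},
]

by_name = popularity(records, "name")
by_hash = popularity(records, "content_hash")

print(len(by_name), len(by_hash))  # distinct resources: 2 by name, 4 by hash
print(by_name.most_common(1))      # [('song.mp3', 4)]
print(by_hash.most_common(1))      # [('aa11', 2)]
```

Identified by name, the working set has 2 resources and the most popular one has 4 replicas. Identified by hash, it has 4 resources and the most popular one has only 2 replicas, so popularity is less skewed and the working set is larger.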

Filename vs. Hash: Reduction of cache hit rate

Non-routable IP addresses: Impact on cacheability
- Previous studies did not take non-routable IP addresses into account
- 10% of nodes are behind a firewall
- Downloads from these nodes need a push-based mechanism, which is not compatible with Web caching
- Resources on these nodes are not cacheable
- Ignoring them makes the cache hit rate seem higher than it is
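The over-estimation can be illustrated with a deliberately simplified model: an infinite cache fed a stream of requests, where each request is tagged with whether the serving node is routable. The function `hit_rate`, the request format, and the assumption that push-based downloads bypass the cache entirely are illustrative choices, not the estimation procedure used in the study.

```python
def hit_rate(requests, skip_unroutable):
    """Hit rate of an infinite cache over (resource_id, routable) requests.

    If skip_unroutable is True, downloads served by unroutable (firewalled)
    nodes bypass the cache: they never hit and never populate it, but they
    still count as requests in the denominator.
    """
    cached, hits = set(), 0
    for resource_id, routable in requests:
        if skip_unroutable and not routable:
            continue
        if resource_id in cached:
            hits += 1
        else:
            cached.add(resource_id)
    return hits / len(requests) if requests else 0.0


# Hypothetical request stream: two resources, some served by firewalled nodes.
requests = [("a", True), ("a", True), ("a", False),
            ("b", False), ("b", True), ("b", True)]

naive = hit_rate(requests, skip_unroutable=False)    # 4 hits out of 6 requests
adjusted = hit_rate(requests, skip_unroutable=True)  # 2 hits out of 6 requests
```

In this toy stream, ignoring unroutable nodes doubles the apparent hit rate.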

Non-routable IPs: Reduction of cache hit rate

Conclusion on cacheability
- File sharing traffic is cacheable
- Web caches need to be modified to take into account file-sharing characteristics
  - Cache must also consider the content hash (it has to interact with the query mechanism as well)
  - Cache must deal with push-based downloads

Open issues
- Comparison of data obtained through different methods
  - Crawling
  - Traffic analysis
- Study of time-related patterns at different time scales
  - Daily patterns
  - Weekly patterns
  - Yearly patterns
