Peer-to-Peer Computing


  1. Peer-to-Peer Computing
     D. Milojicic, V. Kalogeraki, R. Lukose, K. Nagaraja, J. Pruyne, B. Richard, S. Rollins and Z. Xu
     Technical Report HPL-2002-57, HP Laboratories, Palo Alto, March 2002

     Introduction
     • Peer-to-Peer (P2P) employs distributed resources to perform a function in a decentralized manner
       – Resource can be: computing, storage, bandwidth …
       – Function can be: computing, data sharing, collaboration …
     • The goal of this paper is to describe what is P2P and what is not P2P
     • P2P gained visibility during Napster
       – But was here before (Doom, Internet telephony)
       – But has moved beyond (KaZaa, Gnutella)
       – And includes more (Seti@home)
     • Simple definition: it involves sharing, i.e. giving to and obtaining from the peer community

     Taxonomy of Computer Systems
     (figure)

     What’s New and What’s Not
     (figure)

     Simplified Architecture
     (figure; labels: Centralized, Peer-to-Peer, Client-Server, Degree of Centralization)

     Taxonomy of P2P Systems
     • “Hybrid”: initial communication is centralized (tough to get around; for example, how to find peers?)
     • Pure: Gnutella, Freenet
     • Hybrid: Napster
     • Intermediate: KaZaa (super peers)

  2. Outline
     • Introduction (done)
     • Components and Algorithms (next)
     • Systems
     • Case Studies
     • Summary

     Decentralization and Taxonomy
     (figure)

     P2P Components
     (figure; annotations: Specific applications here · Different data types · Robust when peers autonomous · Find and move data among · Overcome dynamic nature of peers)

     P2P Algorithms – Centralized Index
     • Search central index, download content from peer
       – Popular with Napster
     • Need representation for “best” peer
       – Cheapest, closest, most available

     P2P Algorithms – Flooded Requests
     • Each request flooded (broadcast) to directly connected peers
       – Repeat until answered or too many hops (5-9)
     • Uses lots of network capacity
     • Revise with
       – “Super-Peer” to concentrate most requests
       – Caching of recent requests

     P2P Algorithms – Document Routing
     • When document published, generate hash based on name and content
     • Move document to node with ID closest to hash
     • Requests also migrate to such node
       – Note, requires knowing document name ahead of time, so harder to do search
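The Flooded Requests slide can be illustrated with a small simulation. This is a minimal sketch, not any real protocol: the overlay is a plain adjacency dictionary, `flood_search` and its parameters are hypothetical names, and the `seen` set stands in for peers dropping requests they have already handled. The default hop limit of 7 is simply a value inside the 5-9 range the slide mentions.

```python
from collections import deque

def flood_search(neighbors, holders, start, max_hops=7):
    """Flood a request outward from `start`, one hop at a time.

    neighbors: dict mapping each peer to its directly connected peers
    holders:   set of peers that can answer the request
    Returns (peer, hops) for the first answering peer, or None if the
    hop limit is reached before anyone answers.
    """
    seen = {start}                      # peers that already got the request
    frontier = deque([(start, 0)])
    while frontier:
        peer, hops = frontier.popleft()
        if peer in holders:
            return peer, hops           # request answered
        if hops == max_hops:
            continue                    # too many hops: stop forwarding
        for nxt in neighbors.get(peer, ()):
            if nxt not in seen:         # assumed duplicate suppression
                seen.add(nxt)
                frontier.append((nxt, hops + 1))
    return None

# Hypothetical five-peer overlay: A-B, A-C, B-D, D-E
overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"],
           "D": ["B", "E"], "E": ["D"]}
print(flood_search(overlay, {"E"}, "A"))               # found three hops away
print(flood_search(overlay, {"E"}, "A", max_hops=2))   # hop limit too small
```

Every peer within the hop limit receives the request, which is why the slide notes that flooding uses a lot of network capacity and suggests super-peers and caching as refinements.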

  3. Outline
     • Introduction (done)
     • Components and Algorithms (done)
     • Systems (next)
     • Case Studies
     • Summary

     P2P Systems
     • Historical
     • Distributed Computing
     • File Sharing
     • Collaboration

     Historical (1 of 2)
     • Prior to continuously connected computers (Internet) had UUNet and Fidonet
       – Would periodically dial-up and exchange information (email and bboard)
       – Local servers communicated with peers
         • Similar to Gnutella
     • In “modern” era, first widely used P2P was instant messaging
     • P2P interest shift came because of legal ramifications (Napster)
       – (MLC: plus traffic! See next paper.)

     Historical (2 of 2)
     • Most early distributed systems were P2P
       – Examples:
         • Email (on top of SMTP peers)
         • Usenet News (on top of NNTP peers)
       – Message routing
     • File Transfer (via FTP) centralized
       – But since many ran own server, similar to today’s file sharing
       – Indexing system named “Archie” to query across FTP servers
         • Exactly like Napster

     Distributed Computing
     • Clusters
       – Inexpensive PCs plus open source software → super computer
         • NASA’s Beowulf project, MOSIX, …
       – Issues include delegation and migration
     • Grid computing
       – Connect distributed computers so can use idle cycles
       – Transparent way to add jobs, have work executed, results returned

     P2P Systems
     • Historical
     • Distributed Computing
     • File Sharing
     • Collaboration

  4. How it Works
     • Parallelizable job
       – Split into subtasks
     • PCs agree to participate
     • Centralized dispatcher
     • When PCs idle (screensaver), subtasks work
     • Send results to centralized DB
     • P2P?

     Distributed Computing
     • Historical
       – January 1999, 10k computers broke RSA challenge in less than 24 hours
     • Users realized the power of Internet PCs
     • Recent
       – seti@home and genome@home
       – Realize a teraflop

     Application Area Examples
     • Financial
       – Complex market simulations (pricing, portfolios, credit, …)
       – Run during the night, but real-time is important
       – Plus, larger, so only big institutions can run them
       – Use P2P: speedup from 15 hours to 30 minutes, and available to smaller companies
     • Biotechnology
       – Colossal amounts of data (3 billion sequences in human genome dbase)
       – Only high-performance clusters and approximation
       – But using P2P can do exact and used by smaller companies

     P2P Systems
     • Historical
     • Distributed Computing
     • File Sharing
     • Collaboration

     File Sharing
     • One of the most successful
     • Features
       – Large, when otherwise could not store
         • Multimedia content inherently large files
       – Available, from multiple sources
       – Anonymity to protect publisher and reader
       – Manageability for better performance (download from close hosts)
     • Issues: bandwidth consumption, search, and security

     File Sharing Examples
     • Napster
       – Centralized index, single peer download
       – Since centralized does not scale well, performance may suffer
     • Morpheus
       – Simultaneous downloads from multiple peers
       – Encryption for privacy
     • KaZaa
       – Distributes the centralized index among SuperNodes
       – Use “intelligent” selection for peers
       – MD5 checksums to verify content
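The steps on the How it Works slide (split a parallelizable job, dispatch subtasks, collect results centrally) can be sketched in a few lines. This is an illustrative toy under stated assumptions: the job is a sum of squares, workers are simulated in-process and assigned round-robin, and a dictionary stands in for the centralized results DB. All function names are hypothetical.

```python
def split_job(numbers, chunk_size):
    """Split a parallelizable job into independent subtasks."""
    return [numbers[i:i + chunk_size]
            for i in range(0, len(numbers), chunk_size)]

def run_subtask(chunk):
    """Work a participating PC would do while idle (here: sum of squares)."""
    return sum(x * x for x in chunk)

def dispatch(numbers, chunk_size, n_workers):
    """Centralized dispatcher: hand out subtasks and store results centrally."""
    results_db = {}                      # stand-in for the centralized DB
    for task_id, chunk in enumerate(split_job(numbers, chunk_size)):
        worker = task_id % n_workers     # assumed round-robin assignment
        results_db[task_id] = (worker, run_subtask(chunk))
    return results_db

def combine(results_db):
    """Merge the partial results once all subtasks have reported back."""
    return sum(partial for _worker, partial in results_db.values())

db = dispatch(list(range(10)), chunk_size=3, n_workers=2)
print(len(db), combine(db))
```

Both the dispatcher and the results store are central, which is the point behind the slide's closing question of whether this model counts as P2P at all.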

  5. P2P Systems
     • Historical
     • Distributed Computing
     • File Sharing
     • Collaboration

     Collaboration
     • Instant messaging to chat to online games
     • Finding location of peers still a challenge
     • Use centralized server for peer location
       – NetMeeting, GameSpy, …
     • Use out-of-band system to identify peers
       – e.g. call on the telephone and give the IP address

     Outline
     • Introduction (done)
     • Components and Algorithms (done)
     • Systems (done)
     • Case Studies (next)
     • Summary

     Case Studies
     • Avaki (distributed computing)
     • seti@home (distributed computing)
     • Groove (collaboration)
     • Magi (collaboration)
     • FreeNet (file sharing)
     • Gnutella (file sharing)
     • JXTA (platforms)
     • .Net (platforms)

     Seti@home
     • Search for Extraterrestrial Intelligence
     • Background
       – Search through massive amounts of radio telescope data to look for signals
       – Build huge virtual computer by using idle cycles on Internet computers
     • Runs computation as part of screen saver
       – Old enough project so robust tools
     • Features
       – Fault resilience: since clients can stop at any time, use checkpointing every 10 minutes
       – Scalability: horizontal, but vertical (to db) could still be a bottleneck (still, many users)
     • Lessons
       – Can apply this technology to real problems
       – Expected 100k participants, but have 3 million

     Magi (1 of 2)
     • P2P infrastructure for building secure, collaborative applications
       – Started as research project from UC Berkeley 1998, commercial release 2001
     • Uses standard technology: HTTP, XML, WebDAV
       – “Web-based Distributed Authoring and Versioning”: extensions to HTTP to allow collaborative edits at remote web servers
     • Was largest non-Sun Java project
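The fault-resilience point on the Seti@home slide (clients can stop at any time, so progress is checkpointed) can be illustrated as follows. This is only a sketch of the general idea, not the seti@home client: it checkpoints after a fixed number of work items rather than every 10 minutes, the "work" is a running sum, and `process`, `every` and `stop_after` are hypothetical names.

```python
import json
import os
import tempfile

def process(units, checkpoint_path, every=3, stop_after=None):
    """Sum `units`, saving progress to disk every `every` items.

    If a checkpoint file exists, resume from it instead of starting over.
    `stop_after` simulates the client being stopped after that many items
    in this run; the function then returns None.
    """
    state = {"next": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)             # resume from last checkpoint
    done_this_run = 0
    while state["next"] < len(units):
        if stop_after is not None and done_this_run == stop_after:
            return None                      # stopped; unsaved work is lost
        state["total"] += units[state["next"]]
        state["next"] += 1
        done_this_run += 1
        if state["next"] % every == 0:
            with open(checkpoint_path, "w") as f:
                json.dump(state, f)          # checkpoint
    return state["total"]

path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
work = list(range(1, 11))
print(process(work, path, stop_after=5))     # interrupted run
print(process(work, path))                   # resumed run finishes the job
```

After the interruption only the items processed since the last checkpoint are redone, which bounds the work a volunteer machine can lose.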

  6. Magi (2 of 2)
     • Core is micro-Apache server
     • Users could build modules over Magi services
     • Uses DNS to find Magi servers
     • No fault resilience
     • JVM and server means maybe tough for PDA
     • Existing standards make it highly interoperable

     FreeNet
     • File sharing whose primary design goal is to make the system anonymous
       – Read, Publish, Store
     • Completely decentralized
       – File location based on hash (and on path in-between)
       – Hash generated automatically
       – Users find hash names by out-of-band source (e.g. posted on a Web page)
     • Nodes cache until full, then LRU
     • Nodes do “search” to announce presence to others
     • Scales to O(log n)
     • Available as open source
     • Lessons: issues of anonymity (good for discourse, bad for intellectual property rights)

     .NET
     • More than P2P (C#, tools, Web servers), but “My Services” has a lot of P2P stuff
     • Microsoft introduced in 2000
     • Goal is to enable Web servers for a variety of devices. Focus on user data.
     • “Passport” login gives PUID. That is used for services.
     • Cons: only Windows?

     Summary
     • As P2P matures, infrastructure will improve
       – Increased interoperability
       – More robust software
     • Will remain an important technology because:
       – Scalability a concern, especially with global connections
       – Ad-hoc, disconnected networks lend themselves to P2P
       – Some applications inherently P2P

     Future Work
     • Algorithms
       – Scalable, anonymity, connectivity
     • Applications
       – Beyond music and movie sharing
     • Platforms
       – Tools to build better, newer P2P systems
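Two points from the FreeNet slide, files located by an automatically generated hash and nodes that cache until full and then evict by LRU, can be sketched together. This is a single-node toy under stated assumptions, not the FreeNet implementation: the slide does not say which hash function is used, so SHA-1 over the content is an assumption here, and `NodeCache` with its methods is a hypothetical name.

```python
import hashlib
from collections import OrderedDict

def content_key(data: bytes) -> str:
    """Derive a key from the content (SHA-1 is an assumed choice)."""
    return hashlib.sha1(data).hexdigest()

class NodeCache:
    """One node's store: keeps items until full, then evicts the
    least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()      # oldest entry first

    def store(self, data: bytes) -> str:
        key = content_key(data)
        self._items[key] = data
        self._items.move_to_end(key)     # mark as most recently used
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict least recently used
        return key

    def read(self, key):
        if key not in self._items:
            return None                  # a real node would forward the request
        self._items.move_to_end(key)     # reading refreshes recency
        return self._items[key]

node = NodeCache(capacity=2)
key_a = node.store(b"file a")
key_b = node.store(b"file b")
node.read(key_a)                         # "file a" is now the most recent
key_c = node.store(b"file c")            # cache full: "file b" is evicted
print(node.read(key_b), node.read(key_a), node.read(key_c))
```

Because a reader needs the key before asking for a file, keys have to be learned out of band, as the slide notes.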
