iLab P2P Networks


SLIDE 1

Network Architectures and Services, Georg Carle Faculty of Informatics Technische Universität München, Germany

iLab²

P2P Networks

Dirk Haage Chair for Network Architectures and Services Department of Computer Science Technische Universität München http://www.net.in.tum.de

SLIDE 2

Motivation

• More and more private users on the Internet (further away from the infrastructure)
• Powerful private end systems
• Flat rates with always-on users

Why waste these resources?

Let's work together and be our own network.

• A need to provide services independent of commercial or dedicated server providers
• Application-specific network structures instead of machine-location-based addressing

For instance, friends want their computers to be together. → Why not their own network, their own overlay network?

As of 2007, P2P causes more than 50% of all traffic on the Internet.

SLIDE 3

Term: Peer-to-Peer

Peer-to-Peer systems

• Distributed systems that consist of equals (peers) with no predefined distinction between client and server and no dedicated servers or central authority.

Characteristics

• Peer-to-Peer networks are decentralized and take advantage of resources at the edge of the Internet, e.g. the users' computers, the users themselves, etc.
• End systems do not primarily serve the purpose of the Peer-to-Peer system. → Their resources must not be exhausted by the Peer-to-Peer network.
• Computers are not always on. → The environment is less stable and more dynamic than in the traditional client-server case.

SLIDE 4

Peer-to-Peer or not Peer-to-Peer

Auctions / eBay

• Peer-to-Peer
  • Money and goods exchange (nothing to do with the network)
• Not Peer-to-Peer
  • The platform itself (auctions, accounts, information transfer) and its information management

Skype

• Peer-to-Peer
  • Lookup, user interaction, data exchange
• Not Peer-to-Peer
  • Login, account management

Many Peer-to-Peer systems are not purely Peer-to-Peer.

SLIDE 5

Some terms from Graph Theory

• Graph G = (V, E)
• Vertex set V = {v1, v2, …, vn}
  • We usually say nodes.
  • n = |V|
• Edge set E = {e1, e2, …, em}
  • We usually say links.
  • m = |E|
  • Can have attributes like distance, etc.

[Figure: example graph G with vertex set V = {v1, …, v6} and labeled edges]

SLIDE 6

Some terms from Graph Theory

Distance d(i,j)

  • Length of the shortest path between nodes vi and vj

Diameter D of G

  • Longest distance in graph G

Degree

  • Node degree = number of edges incident to the node
  • Degree of a graph = max. node degree

A graph is connected if there is a path from any node in the graph to any other node in the graph.

A graph is k-connected if any k-1 nodes can be removed without causing the resulting subgraph to become disconnected.

[Figure: weighted example graph with diameter(G) = 5, degree(v5) = 3 and distance d(v1, v6) = 5]
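The terms above can be computed directly on a small graph. The following is a minimal sketch, assuming a hypothetical weighted graph `GRAPH` (not the one from the figure) stored as an adjacency dictionary; the function names are illustrative only.

```python
import heapq

# Hypothetical weighted graph: node -> {neighbor: link weight}.
GRAPH = {
    "v1": {"v2": 2, "v3": 1},
    "v2": {"v1": 2, "v4": 1},
    "v3": {"v1": 1, "v4": 3},
    "v4": {"v2": 1, "v3": 3},
}

def distances_from(graph, source):
    """Dijkstra: shortest-path distance from source to every reachable node."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale queue entry, a shorter path was already found
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist

def is_connected(graph):
    """Connected: every node is reachable from an arbitrary start node."""
    start = next(iter(graph))
    return len(distances_from(graph, start)) == len(graph)

def diameter(graph):
    """Longest shortest-path distance over all node pairs (connected graph)."""
    return max(max(distances_from(graph, v).values()) for v in graph)

def graph_degree(graph):
    """Degree of the graph = maximum node degree."""
    return max(len(neighbors) for neighbors in graph.values())
```

For this example graph, `distances_from(GRAPH, "v1")["v4"]` is 3 (via v2), the diameter is 3 and the graph degree is 2.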

SLIDE 7

Peer-to-Peer network

Underlay

• Provides connectivity between all peers in the Peer-to-Peer network (overlay).

Peers V = {v1, v2, …, vn}

• Peers are the nodes of the graph G.
• Peers may have a name (identities are usually necessary).
• The set of edges E needs to be created by the Peer-to-Peer algorithms.
  • The graph needs to be connected.
  • The structure should be good for the purpose of the Peer-to-Peer system.

[Figure: peers forming an overlay on top of the underlay]

SLIDE 8

P2P network is not static – Peers join and leave

Node joins

  • Needs to be added to the network
  • Usually via some node in the network that is already known (rendezvous point, list/cache of nodes)

Node leaves

  • Important to keep the graph connected
  • Better not to rely on a single node that could leave at any time

How to organize such a network?

  • e.g. k-connected graph

[Figure: a node joining and a node leaving. In the 2-connected graph, any single node can be removed without disconnecting the graph; the other graph becomes disconnected when the marked node fails or leaves.]
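Whether an overlay survives the departure of any single peer can be checked by brute force. This is a minimal sketch under the assumption that the overlay is small enough to test every node; `RING` and `STAR` are hypothetical example topologies, not the graphs from the figure.

```python
def reachable(graph, start, removed=frozenset()):
    """Nodes reachable from start when the nodes in `removed` are gone."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in graph[node]:
            if neighbor not in removed and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen

def critical_nodes(graph):
    """Nodes whose departure alone disconnects the remaining graph."""
    critical = []
    for leaving in graph:
        remaining = [v for v in graph if v != leaving]
        if not remaining:
            continue
        alive = reachable(graph, remaining[0], removed={leaving})
        if len(alive) != len(remaining):
            critical.append(leaving)
    return critical

# Hypothetical overlays: a ring is 2-connected, a star depends on its hub.
RING = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
STAR = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
```

A connected graph with at least three nodes and no critical nodes is 2-connected: `critical_nodes(RING)` is empty, while `critical_nodes(STAR)` returns the hub.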

SLIDE 9

Application Requirements

Application

Peer-to-Peer networks are usually created for an application or application scenario.

  • Filesharing
  • File Distribution
  • Instant Messaging and Voice-over-IP
  • Multicast
  • Peer-to-Peer Video Streaming
  • Anonymous communication and services

The application is the purpose of the Peer-to-Peer network.

The application and its requirements determine if a given graph is a good or a bad choice.

[Figure: multicast from the white game server to its peers over the overlay]

Is this a good graph for fast delivery? → No, a balanced tree allows O(log n) diameter.
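To make the O(log n) claim concrete, the hop counts of two extreme topologies can be compared. This is a rough sketch, assuming hop count as the delivery metric and a complete binary tree as the balanced tree; both function names are illustrative.

```python
def chain_diameter(n):
    """Hop diameter of n peers connected in a line."""
    return n - 1

def balanced_binary_tree_depth(n):
    """Depth of a complete binary tree with n >= 1 nodes (root has depth 0)."""
    return n.bit_length() - 1

def balanced_binary_tree_diameter_bound(n):
    """Upper bound on the hop diameter: leaf -> root -> leaf."""
    return 2 * balanced_binary_tree_depth(n)
```

With 1023 peers, a chain needs up to 1022 hops, while a server at the root of a balanced binary tree reaches every peer within 9 hops.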

SLIDE 10

Operational aspects

• Find someone to
  • get something
  • use a service
  • interact
  • interact for a cooperative service or goal
  • maintain network
• Find something (item, data, information, etc.) to
  • get it
  • set it
• Interact with other nodes to cooperatively
  • provide a service
  • share resources
  • run an algorithm
• …

SLIDE 11

Network Coordinate Systems

SLIDE 12

Main Goal: Localization of Node

SLIDE 13

Main Goal: Localization of Node

• Choosing of servers
  • Load balancing between hosting locations
  • Choose nearest instance of a service (anycast)
  • Locate nearest peers in P2P networks
  • Content delivery networks
  • Online games (game server)
  • Resource placement in distributed systems
  • TO
• Optimization of application-layer multicast trees
• …

SLIDE 14

What it isn’t

• Provide location-based services
  • Local advertisements
  • Extend/reduce service for local/non-local users (e.g. IPTV is often restricted to country boundaries)
• Find friends, coworkers, …
  • Google Latitude
• …

For this, you use GeoIP or similar approaches

  • GPS, cellular positioning, triangulation, etc.
SLIDE 15

Network coordinates

• Latencies between nodes as a metric for distance
  • Round-trip time
  • Simplest measurement of all (ping)
  • Most accurate (only one clock involved)
  • Similar to real distance (propagation speed nearly constant)
• How to get them? → Simple approach: measurements between all pairs of nodes
  • O(n²)
  • Does not scale (cannot be used for large networks)
  • Rely on actual traffic → hybrid measurement
  • Normally no traffic to all nodes available
  • Active measurements (even worse scaling)
• You want to know the distance to a node without having to communicate with it in the first place
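As a short worked example of why all-pairs measurement does not scale: the number of node pairs grows quadratically with n. The node counts below are arbitrary illustrations, not figures from the lecture.

```latex
\[
  \#\text{pairs}(n) = \binom{n}{2} = \frac{n(n-1)}{2} \in O(n^2)
\]
\[
  n = 10^{3}: \; 499\,500 \text{ pairs}
  \qquad
  n = 10^{6}: \; 499\,999\,500\,000 \approx 5 \cdot 10^{11} \text{ pairs}
\]
```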

SLIDE 16

Network coordinates (II)

• Measure the distances to some neighbors
  • Neighbors may be any known hosts, not necessarily nearby hosts
• Calculate an artificial coordinate in a metric space
  • Metric space = distance between nodes can be calculated
  • E.g. Euclidean n-space
• Approximate the latency
  • The distance between nodes in the coordinate system is an approximation of the latency
• Abstract definition:
  • Embed the network graph into a metric space
  • Metric embedding / graph embedding
SLIDE 17

Example

[Figure: nodes A, B, C, D in the Internet with measured distances RTT(A,D), RTT(D,C), RTT(B,C), RTT(D,B) are embedded into 2D Euclidean space at coordinates (x1,y1) to (x4,y4); the estimated distance d(B,A) is read off the embedding.]

$$d(A,B) = \lVert (x_1,y_1), (x_2,y_2) \rVert = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
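Estimating a latency from coordinates is a single distance computation. The sketch below assumes hypothetical 2D coordinates in milliseconds for two nodes; the values and helper names are made up for illustration.

```python
import math

# Hypothetical 2D coordinates in milliseconds; not taken from the slides.
coords = {"A": (10.0, 20.0), "B": (40.0, 60.0)}

def estimated_rtt(a, b):
    """Euclidean distance between the coordinates of nodes a and b."""
    (x1, y1), (x2, y2) = coords[a], coords[b]
    return math.hypot(x2 - x1, y2 - y1)

def relative_error(estimated, measured):
    """Relative deviation of the estimate from a measured RTT."""
    return abs(estimated - measured) / measured
```

Here `estimated_rtt("A", "B")` yields 50.0 ms without any packet being exchanged between A and B; against a measured RTT of 40 ms the relative error would be 0.25.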

SLIDE 18

Network coordinates (III)

• Advantages
  • Small overhead
  • Only requires a small number of measurements
  • No additional traffic (application traffic = measurement traffic)
  • Piggy-back the coordinate information
  • Each host can calculate the distance to every other host
  • Only requires the coordinates
• Design goals
  • Accuracy: small error for RTT estimations
  • Scalability: large-scale networks, small overhead, no bottlenecks
  • Flexibility: adapt coordinates to network changes
  • Stability: no drift, no oscillation of coordinates
  • Robustness: small impact of errors caused by malicious nodes or nodes with high errors

SLIDE 19

Triangle inequality

• Intuition: the direct latency between two nodes should be no larger than the latency via any indirection

  $$d(a,b) + d(b,c) \geq d(a,c)$$

• Triangle inequality violations (TIVs) are inherent to the Internet routing structure
  • Selective/private peering
  • Hot-potato routing
  • Link metric ≠ latency
  • Asymmetric links (e.g. DSL, UMTS)
• TIVs are common
  • >85% of all host pairs are part of a TIV
  • For 20-35%, a path exists that is at least 20% shorter (traces: King, Azureus)

SLIDE 20

Triangle inequality (II)

• Possible spaces for embedding are metric
  • Distance function satisfies the triangle inequality
• Embedding cannot be exact
  • Number and weight of TIVs limit the embedding quality

[Figure: measured latencies between A, B and C (22 ms, 17 ms, 53 ms) violate the triangle inequality; after embedding, the distances are 26 ms, 19 ms and 38 ms.]
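A triangle inequality violation can be detected by comparing every direct latency with every one-hop detour. The following is a minimal sketch; the latency values are the ones shown in the figure, but which value belongs to which edge is an assumption made here for illustration.

```python
from itertools import permutations

def triangle_violations(latency):
    """Return (a, b, c) triples where the detour a-b-c beats the direct a-c."""
    nodes = sorted({n for pair in latency for n in pair})

    def d(u, v):
        return latency[frozenset((u, v))]

    found = []
    for a, b, c in permutations(nodes, 3):
        # a < c avoids reporting the same triangle in both directions.
        if a < c and d(a, b) + d(b, c) < d(a, c):
            found.append((a, b, c))
    return found

# Edge assignment is assumed; the figure only gives the three values per triangle.
measured = {frozenset("AB"): 22, frozenset("BC"): 17, frozenset("AC"): 53}
embedded = {frozenset("AB"): 26, frozenset("BC"): 19, frozenset("AC"): 38}
```

For the measured values, the detour A-B-C (22 + 17 = 39 ms) is shorter than the direct latency of 53 ms, so one violation is reported. The embedded distances satisfy the triangle inequality, which is exactly why they cannot reproduce the measurements.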

SLIDE 21

History

Global Network Positioning (Ng, Zhang, 2002)

  • Landmark nodes measure the distances between each other
  • New nodes measure their distance to the landmarks
  • Coordinates relative to landmarks
  • Embedding via Downhill-Simplex in 3D space
  • Problems:
    • Scalability
    • Placement of landmarks
    • Single point of failure

Lighthouse (Pias et al., 2003)

  • Several groups of landmarks

PIC (Costa, Castro, Rowstron, Key, 2004)

  • Generalization of GNP
  • All nodes with known coordinates can be landmarks

Big-Bang-Simulation (2004)

  • Analogy to physics: nodes as particles in a force field

[Figure: several landmark nodes (L) and a node C that measures its distance to them]

SLIDE 22

Vivaldi (Dabek, Cox, Kaashoek, Morris, 2004)

• Fully distributed
  • No infrastructure, no specialized nodes
• Continuous update of coordinates with new latency values
• Based on application traffic
• Small number of communication partners required for meaningful results
• Can be used with various types of spaces
• State of the art
• Actively used (e.g. BitTorrent, Azureus)

SLIDE 23

Vivaldi Algorithm

1. Choose a random (obviously wrong) position
2. Initiate communication with some nodes
3. Measure latency
4. Nodes provide coordinates and error estimation
5. Revise coordinates (relative to other nodes)
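The steps above can be sketched as a single update routine. This is a minimal sketch, assuming the adaptive-timestep update rule as published in the Vivaldi paper; the class name `VivaldiNode`, the constants `CC` and `CE` and their values, and the initial error of 1.0 are assumptions for illustration, not the lab's implementation.

```python
import math
import random

CC = 0.25  # timestep tuning constant (assumed value)
CE = 0.25  # error-averaging tuning constant (assumed value)

class VivaldiNode:
    def __init__(self, dims=2):
        # Step 1: start at a random, obviously wrong position.
        self.coord = [random.uniform(-1.0, 1.0) for _ in range(dims)]
        self.error = 1.0  # high initial error: nothing is known yet

    def update(self, rtt, remote_coord, remote_error):
        """Steps 4-5: revise our coordinate from one RTT sample (rtt > 0)."""
        diff = [a - b for a, b in zip(self.coord, remote_coord)]
        dist = math.sqrt(sum(c * c for c in diff))
        if dist == 0.0:
            # Coincident coordinates: push apart in a random direction.
            diff = [random.gauss(0.0, 1.0) for _ in diff]
            norm = math.sqrt(sum(c * c for c in diff))
            unit = [c / norm for c in diff]
        else:
            unit = [c / dist for c in diff]
        # Weight the sample by how uncertain we are relative to the remote node.
        weight = self.error / max(self.error + remote_error, 1e-9)
        sample_error = abs(dist - rtt) / rtt
        # Moving average of the local error estimate.
        self.error = sample_error * CE * weight + self.error * (1 - CE * weight)
        # Move along the line to the remote node, proportional to the mismatch.
        delta = CC * weight
        self.coord = [x + delta * (rtt - dist) * u
                      for x, u in zip(self.coord, unit)]
```

A node that is confident about its own position (low error) barely moves when it hears from an uncertain node, while a new node with high error converges quickly toward nodes that already have good coordinates.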

SLIDE 24

Optimization (II)

• Spring Embedder
  • Physical analogy: network of springs
  • Between each pair (i,j) of hosts exists a spring
  • Length in equilibrium position: L_ij
  • Current length: ||x_i - x_j||
  • Potential energy proportional to expansion squared: (L_ij - ||x_i - x_j||)²
    – Energy of the spring = error
    – Minimal energy in the system = minimal global error
  • Force between i and j (Hooke's law):

    $$F_{ij} = \left(L_{ij} - \lVert x_i - x_j \rVert\right) \cdot u(x_i - x_j)$$

    where u(x_i - x_j) is the unit vector pointing from x_j to x_i
  • Move node to minimize its energy
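As a short worked example of the spring force, take a measured latency of 80 ms as the rest length and a current coordinate distance of 120 ms. The step size δ = 0.25 is an assumed value chosen only for illustration.

```latex
\begin{aligned}
F_{ij} &= \left(L_{ij} - \lVert x_i - x_j \rVert\right) u(x_i - x_j)
        = (80 - 120)\, u(x_i - x_j) = -40\, u(x_i - x_j) \\
x_i &\leftarrow x_i + \delta F_{ij} = x_i - 10\, u(x_i - x_j)
        \qquad (\delta = 0.25) \\
\lVert x_i - x_j \rVert &: 120 \rightarrow 110,
        \qquad \left(L_{ij} - \lVert x_i - x_j \rVert\right)^2 : 1600 \rightarrow 900
\end{aligned}
```

The spring is stretched, so the force pulls node i toward node j; one step shortens the coordinate distance and lowers the spring energy.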

SLIDE 25

Example

[Figure: nodes A and B with measured rtt(a,b) = 80 ms. At time t the coordinate distance is d(a,b) = 120 ms; after the update at t+1 it is d(a,b) = 95 ms.]

SLIDE 26

Example (II)

[Figure: nodes A, B and C with measured rtt(a,c) = 80 ms. At time t+2 the coordinate distance is d(a,c) = 50 ms; after the update at t+3 it is d(a,c) = 75 ms.]

SLIDE 27

Which space to choose?

• Physics:
  • Analogy uses 3D space
  • Any space with a definition of distance, difference between coordinates and scalar multiplication is possible
• Question: which space characterizes the Internet best?
  • 2D, 3D
  • Sphere, torus
  • Complex network → complex space?
  • From GNP: embedding in 3D, why?
• Result from tests and simulations:
  • 2-3 dimensions are sufficient
  • More dimensions require more computation without significant improvement

SLIDE 28

Handling TIVs

• Again:
  • TIVs occur for asymmetric routes, links, …
  • They occur quite often
  • They enlarge the error of the embedding
• Instead of using n dimensions, use n-1 + height
  • Euclidean n-space models the core network
  • High connectivity
  • Fast, symmetric links
  • Height models the slow access links
  • Packets are transmitted in the core, not above it
  • Slow hosts are pushed out of the plane
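The height model changes only the distance function. The sketch below assumes the height-vector distance described in the Vivaldi paper, where a packet travels down the sender's access link, across the core, and up the receiver's access link; the tuple layout `(coords, height)` is an illustrative choice.

```python
import math

def height_distance(a, b):
    """Predicted latency between two (coords, height) pairs.

    The Euclidean part models the core; each height is added once
    because every packet crosses both access links.
    """
    (xa, ha), (xb, hb) = a, b
    plane = math.sqrt(sum((p - q) ** 2 for p, q in zip(xa, xb)))
    return plane + ha + hb

# Hypothetical hosts: a well-connected server and a DSL client (values in ms).
server = ((0.0, 0.0), 5.0)
dsl_client = ((3.0, 4.0), 10.0)
```

Here `height_distance(server, dsl_client)` is 5 + 5 + 10 = 20 ms. Two hosts with identical plane coordinates still have a nonzero predicted latency, which is how slow access links are represented without distorting the core.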
SLIDE 29

Overview

Setup:

Outline of the final lab:

  • Emulate a Kademlia network
  • Extend Kademlia implementation with ICS (skeleton provided in PreLab)
  • Generate DHT lookups; measure lookup delay without and with ICS
  • Plot CDFs
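For the final step, the points of an empirical CDF can be computed from the measured lookup delays before plotting them with any tool. This is a minimal sketch; the delay values are hypothetical placeholders, not measurement results.

```python
def empirical_cdf(samples):
    """Return the step points (value, cumulative fraction) of the ECDF."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(value, (i + 1) / n) for i, value in enumerate(ordered)]

# Hypothetical lookup delays in milliseconds, not measured data.
delays_without_ics = [120, 340, 95, 410, 220]
delays_with_ics = [80, 150, 60, 210, 130]

for label, delays in (("without ICS", delays_without_ics),
                      ("with ICS", delays_with_ics)):
    for value, fraction in empirical_cdf(delays):
        print(f"{label}: {fraction:.0%} of lookups finished within {value} ms")
```

Plotting both point lists in one diagram makes the effect visible at a glance: a curve further to the left means shorter lookup delays.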
SLIDE 30

Thanks for listening! Questions?