Network Architectures and Services, Georg Carle Faculty of Informatics Technische Universität München, Germany
iLab P2P Networks
Dirk Haage, Chair for Network Architectures and Services, Department of Computer Science, Technische Universität München
ILab2
2
Motivation
More and more private users are on the Internet (further away from the infrastructure)
Powerful private end systems
Flat rates with always-on users
Why waste these resources?
Let's work together and be our own network
A need to provide services independent of commercial or dedicated server providers
Application-specific network structures instead of machine-location-based addressing
- For instance, friends want their computers to be together. Why not their own network, their own overlay network?
As of 2007, P2P causes more than 50 % of all traffic on the Internet
Term: Peer-to-Peer
Peer-to-Peer systems
Distributed systems that consist of equals (peers), with no predefined distinction between client and server and no dedicated servers or central authority.
Characteristics
Peer-to-Peer networks are decentralized and take advantage of resources at the edge of the Internet, i.e. the users' computers, the users themselves, etc.
End systems do not primarily serve the purpose of the Peer-to-Peer system.
- Their resources must not be exhausted by the Peer-to-Peer network.
Computers are not always on.
- The environment is less stable and more dynamic than in the traditional client-server case.
Peer-to-Peer or not Peer-to-Peer
Auctions / eBay
Peer-to-Peer
- Exchange of money and goods (nothing to do with the network)
Not Peer-to-Peer
- The platform itself (auctions, accounts, information transfer) and its information management
Skype
Peer-to-Peer
- Lookup, User Interaction, Data Exchange
Not Peer-to-Peer
- Login, Account Management
Many Peer-to-Peer systems are not purely Peer-to-Peer.
Some terms from Graph Theory
Graph G = (V, E)
Vertex set V = {v1, v2, …, vn}
- We usually say nodes.
- n = |V|
Edge set E = {e1, e2, …em}
- We usually say links.
- m = |E|
- Can have attributes like distance, etc.
[Figure: example graph G with vertex set V = {v1, …, v6} and edge set E = {e1, …, e7}]
Some terms from Graph Theory
Distance d(i,j)
- Length of the shortest path between nodes vi and vj
Diameter D of G
- Largest distance between any two nodes in graph G
Degree
- Node degree = number of edges incident to the node
- Degree of a graph = max. node degree
A graph is connected if there is a path from any node in the graph to any other node in the graph.
A graph is k-connected if any k-1 nodes can be removed without causing the resulting subgraph to become disconnected.
[Figure: example graph with weighted edges, annotated with diameter(G) = 5, degree(v5) = 3, and distance d(v1, v6) = 5]
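The terms above can be tried out on a small example. The following Python sketch assumes an unweighted graph in which every link counts as one hop (the slide's figure appears to use weighted links instead). The six-node graph `EDGES` and all function names are illustrative assumptions, not the graph from the slide.

```python
from collections import deque

def build_adjacency(edges):
    """Build an undirected adjacency list from a list of (u, v) links."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def hop_distances(adj, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def is_connected(adj):
    """A graph is connected if one search reaches every node."""
    if not adj:
        return True
    start = next(iter(adj))
    return len(hop_distances(adj, start)) == len(adj)

def diameter(adj):
    """Largest distance between any two nodes (assumes a connected graph)."""
    return max(max(hop_distances(adj, u).values()) for u in adj)

def node_degree(adj, u):
    """Number of links incident to node u."""
    return len(adj[u])

def graph_degree(adj):
    """Maximum node degree in the graph."""
    return max(len(neighbors) for neighbors in adj.values())

# Hypothetical example graph (not the one on the slide).
EDGES = [("v1", "v2"), ("v2", "v3"), ("v3", "v4"),
         ("v4", "v5"), ("v3", "v5"), ("v5", "v6")]
ADJ = build_adjacency(EDGES)

print("d(v1, v6) =", hop_distances(ADJ, "v1")["v6"])
print("diameter  =", diameter(ADJ))
print("degree(v5) =", node_degree(ADJ, "v5"))
print("connected =", is_connected(ADJ))
```

With weighted links, the breadth-first search would have to be replaced by a shortest-path algorithm such as Dijkstra's.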
Peer-to-Peer network
Underlay
- Provides connectivity between all peers in the Peer-to-Peer network (overlay).
Peers V = {v1, v2, …, vn}
- Peers are the nodes of the graph G.
- Peers may have a name (identities are usually necessary).
The set of edges E needs to be created by the Peer-to-Peer algorithms.
- The graph needs to be connected.
- The structure should be good for the purpose of the Peer-to-Peer system.
[Figure: peers forming an overlay on top of the underlay]
P2P network is not static – Peers join and leave
Node joins
- Needs to be added to the network
- Usually via some node already known to be in the network (rendezvous point, list/cache of nodes)
Node leaves
- Important to keep the graph connected
- Better not to rely on a single node that could leave at any time
How to organize such a network?
- e.g. k-connected graph
[Figure: two example graphs with a joining and a leaving node. One is 2-connected: any single node can be removed without disconnecting the graph. The other becomes disconnected when one particular node fails or leaves.]
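The k-connectivity criterion can be checked directly from its definition: remove every possible set of k-1 nodes and test whether the rest is still connected. The sketch below does exactly that by brute force, so it is only meant for small example graphs. The function names and the two example graphs (`RING`, `STAR`) are assumptions made for illustration.

```python
from itertools import combinations

def connected_without(adj, removed):
    """True if the graph stays connected after deleting the nodes in `removed`."""
    remaining = [u for u in adj if u not in removed]
    if len(remaining) <= 1:
        return True  # nothing left that could be disconnected
    seen = {remaining[0]}
    stack = [remaining[0]]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(remaining)

def is_k_connected(adj, k):
    """Brute force: any k-1 nodes may fail or leave without a partition."""
    return all(connected_without(adj, set(removed))
               for removed in combinations(adj, k - 1))

# Hypothetical overlays: a ring survives any single departure,
# a star depends entirely on its hub.
RING = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
STAR = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}

print("ring 2-connected:", is_k_connected(RING, 2))
print("star 2-connected:", is_k_connected(STAR, 2))
```

The number of node subsets grows quickly with k, so a real system would rely on structural guarantees of the overlay rather than on an exhaustive check like this one.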
Application Requirements
Application
Peer-to-Peer networks are usually created for an application or application scenario.
- Filesharing
- File Distribution
- Instant Messaging and Voice-over-IP
- Multicast
- Peer-to-Peer Video Streaming
- Anonymous communication and services
- …
The application is the purpose of the Peer-to-Peer network.
The application and its requirements determine if a given graph is a good or a bad choice.
[Figure: peers on top of the underlay; multicast from the white game server to its peers]
Is this a good graph for fast delivery? No, a balanced tree allows a diameter of O(log n).
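To see why the graph structure matters for multicast, one can compare how many overlay hops a message needs to reach the farthest peer. The sketch below contrasts a chain, where every peer forwards to exactly one successor, with a balanced tree. The function names and the `fanout` parameter are illustrative assumptions.

```python
def chain_depth(n):
    """Hops from the source to the last peer when n peers form a chain."""
    return n - 1

def balanced_tree_depth(n, fanout=2):
    """Smallest depth of a complete fanout-ary tree that holds n peers."""
    depth = 0
    level_size = 1   # peers on the current level
    capacity = 1     # peers in the tree so far (the source alone)
    while capacity < n:
        level_size *= fanout
        capacity += level_size
        depth += 1
    return depth

for n in (15, 1023):
    print(n, "peers: chain", chain_depth(n), "hops, tree", balanced_tree_depth(n), "hops")
```

The diameter of the tree is at most twice its depth, so it grows logarithmically with the number of peers, while the chain grows linearly.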
Operational aspects
Find someone to
- get something
- use a service
- interact
- interact for a cooperative service or goal
- maintain network
Find something (item, data, information, etc.) to
- get it
- set it
Interact with other nodes to cooperatively
- provide a service
- share resources
- run an algorithm
…
iLab²
Network Coordinate Systems
Dirk Haage, Chair for Network Architectures and Services, Department of Computer Science, Technische Universität München, http://www.net.in.tum.de
Main Goal: Localization of Node
Server selection
- Load balancing between hosting locations
- Choose the nearest instance of a service (anycast)
- Locate the nearest peers in P2P networks
- Content delivery networks
- Online games (game servers)
- Resource placement in distributed systems
- TO
Optimization of application-layer multicast trees
…
What it isn’t
Provide location-based services
- Local advertisements
- Extend/reduce service for local/non-local users (e.g. IPTV is often restricted to country boundaries)
Find friends, coworkers, …
- Google Latitude
…
For this, you use GeoIP or similar approaches
- GPS, Cellular Positioning, Triangulation, etc.
Network coordinates
Latencies between nodes as a metric for distance
- Round-trip time (RTT)
- Simplest measurement of all (ping)
- Most accurate (only one clock involved)
- Similar to real distance (propagation speed is nearly constant)
How to obtain them? Simple approach:
Measurements between all pairs of nodes
- O(n²) measurements
- Does not scale (cannot be used for large networks)
- Rely on actual traffic (hybrid measurement)
- Normally, no traffic to all nodes is available
- Active measurements (even worse scaling)
You want to know the distance to a node without having to communicate with it in the first place.
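The O(n²) growth is easy to quantify. The short sketch below counts the measurements for a full mesh and, for comparison, for a scheme in which every node only measures a fixed number of neighbors. The function names and the choice of k = 8 neighbors are assumptions for illustration only.

```python
def full_mesh_measurements(n):
    """Number of node pairs that must be measured once each: n(n-1)/2."""
    return n * (n - 1) // 2

def neighbor_measurements(n, k):
    """Measurements when each of the n nodes probes only k neighbors."""
    return n * k

for n in (100, 1000, 10000):
    print(n, full_mesh_measurements(n), neighbor_measurements(n, k=8))
```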
Network coordinates (II)
Measure the distances to some neighbors
- Neighbors can be any known hosts, not necessarily nearby hosts
Calculate an artificial coordinate in a metric space
- Metric space = the distance between nodes can be calculated
- E.g. Euclidean n-space
Approximate the latency
- The distance between nodes in the coordinate system is an approximation of the latency
Abstract definition:
- Embed the network graph into a metric space
- Metric embedding / graph embedding
Example
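[Figure: nodes A, B, C, D in the Internet with measured distances RTT(A,D), RTT(D,C), RTT(B,C), RTT(D,B), and the same nodes embedded in 2D Euclidean space at coordinates (x1,y1) to (x4,y4), where the estimated distance d(B,A) is computed from the coordinates]
$$d(B,A) = \lVert (x_1,y_1) - (x_2,y_2) \rVert = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$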
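Once coordinates are known, estimating a latency is a purely local computation. The sketch below assumes 2D Euclidean coordinates in milliseconds; the coordinate values, the measured RTT, and the function names are made up for illustration.

```python
import math

# Hypothetical 2D coordinates (units: milliseconds).
COORDS = {
    "A": (0.0, 0.0),
    "B": (30.0, 40.0),
}

def estimated_rtt(coords, a, b):
    """Estimated latency = Euclidean distance between the two coordinates."""
    return math.dist(coords[a], coords[b])

def relative_error(estimated, measured):
    """Relative deviation of the estimate from a measured RTT."""
    return abs(estimated - measured) / measured

estimate = estimated_rtt(COORDS, "A", "B")
print("estimated RTT(A,B):", estimate, "ms")
print("relative error vs. a measured 40 ms:", relative_error(estimate, 40.0))
```

No packet is exchanged between A and B here; only the two coordinates are needed.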
Network coordinates (III)
Advantages
- Small overhead
- Only requires a small number of measurements
- No additional traffic (application traffic = measurement traffic)
- Piggy-back the coordinate information
- Each host can calculate the distance to every other host
- Only requires the coordinates
Design goals
- Accuracy: small error for RTT estimations
- Scalability: large-scale networks, small overhead, no bottlenecks
- Flexibility: adapt coordinates to network changes
- Stability: no drift or oscillation of coordinates
- Robustness: small impact of errors caused by malicious nodes or nodes with high errors
Triangle inequality
Intuition:
The direct latency between two nodes should be no larger than the latency via any intermediate node.
Triangle inequality violations (TIVs) are inherent to the Internet routing structure
- Selective/private peering
- Hot-potato routing
- Link metric ≠ latency
- Asymmetric links (e.g. DSL, UMTS)
TIVs are common
- More than 85% of all host pairs are part of a TIV
- For 20-35% of host pairs, there is an indirect path that is at least 20% shorter (traces: King, Azureus)
$$d(a,b) + d(b,c) \ge d(a,c)$$
Triangle inequality (II)
Possible spaces for embedding are metric
- The distance function satisfies the triangle inequality
The embedding cannot be exact
- The number and weight of TIVs limit the embedding quality
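[Figure: triangle A, B, C with measured latencies of 22 ms, 17 ms, and 53 ms (a TIV, since 22 + 17 < 53) and, after embedding, distances of 26 ms, 19 ms, and 38 ms]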
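A triangle inequality violation can be detected from three measured latencies alone. The sketch below checks a triple and searches for the cheapest one-hop detour in a small RTT table. The latency values are taken from the slide's figure, but which value belongs to which pair of nodes is an assumption, as are all function names.

```python
def violates_triangle_inequality(d_ab, d_bc, d_ac):
    """True if the longest side exceeds the sum of the other two."""
    longest = max(d_ab, d_bc, d_ac)
    return longest > (d_ab + d_bc + d_ac) - longest

def best_detour(rtt, a, c):
    """Cheapest path a -> b -> c over one relay b, as (relay, latency)."""
    nodes = set().union(*rtt)  # every node that appears in some pair
    candidates = [
        (b, rtt[frozenset((a, b))] + rtt[frozenset((b, c))])
        for b in nodes - {a, c}
    ]
    return min(candidates, key=lambda item: item[1])

# Measured latencies in ms; the assignment to node pairs is assumed.
RTT = {
    frozenset(("A", "B")): 22.0,
    frozenset(("B", "C")): 17.0,
    frozenset(("A", "C")): 53.0,
}

print("measured triple is a TIV:", violates_triangle_inequality(22.0, 17.0, 53.0))
print("embedded triple is a TIV:", violates_triangle_inequality(26.0, 19.0, 38.0))
relay, via = best_detour(RTT, "A", "C")
print("detour via", relay, "takes", via, "ms instead of", RTT[frozenset(("A", "C"))], "ms")
```

The embedded distances satisfy the triangle inequality by construction, which is why the embedding has to distort at least one of the three measured values.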
History
Global Network Positioning (Ng, Zhang, 2002)
- Landmark nodes measure the distance between each other
- New nodes measure distance to landmarks
- Coordinates relative to landmarks
- Embedding via Downhill-Simplex in 3D space
- Problems:
- Scalability
- Placement of landmarks
- Single point of failure
Lighthouse (Pias et al., 2003)
- Several groups of landmarks
PIC (Costa, Castro, Rowstron, Key, 2004)
- Generalization of GNP
- All nodes with known coordinates can be landmarks
Big-Bang-Simulation (2004)
- Analogy to physics: nodes as particles in a force field
[Figure: six landmark nodes (L) and a node C]
Vivaldi (Dabek, Cox, Kaashoek, Morris, 2004)
Fully distributed
- No infrastructure, no specialized nodes
Continuous update of coordinates with new latency values
Based on application traffic
Small number of communication partners required for meaningful results
Can be used with various types of spaces
State of the art
Actively used (e.g. BitTorrent, Azureus)
Vivaldi Algorithm
1. Choose random (obviously wrong) position
2. Initiate communication with some nodes
3. Measure latency
4. Nodes provide coordinates and error estimation
5. Revise coordinates (relative to other nodes)
Optimization (II)
Spring Embedder
- Physical analogy: network of springs
- Between each pair (i,j) of hosts there is a spring
- Length in equilibrium position: Lij
- Current length: ||xi-xj||
- Potential energy proportional to the squared displacement: (Lij-||xi-xj||)²
– Energy of the spring = error
– Minimal energy in the system = minimal global error
- Force between i and j (Hooke's law)
- Move the node to minimize its energy
$$F_{ij} = \left(L_{ij} - \lVert x_i - x_j \rVert\right) \cdot u(x_i - x_j)$$
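The spring force translates into a very small update rule: a node moves along the unit vector u(xi - xj) by a fraction of the current error. The sketch below shows one such step with a constant step size `delta`. The value 0.25, the function names, and the start coordinates are assumptions. The error-weighted step size hinted at by the exchanged error estimates is deliberately left out.

```python
import math
import random

def unit_vector(xi, xj):
    """Unit vector pointing from xj towards xi."""
    diff = [a - b for a, b in zip(xi, xj)]
    norm = math.hypot(*diff)
    while norm == 0.0:
        # Both nodes sit on the same point: pick an arbitrary direction.
        diff = [random.uniform(-1.0, 1.0) for _ in xi]
        norm = math.hypot(*diff)
    return [c / norm for c in diff]

def vivaldi_step(xi, xj, rtt, delta=0.25):
    """Move node i along the spring force caused by one RTT sample to node j."""
    error = rtt - math.dist(xi, xj)   # L_ij - ||x_i - x_j||
    direction = unit_vector(xi, xj)   # u(x_i - x_j)
    return [c + delta * error * d for c, d in zip(xi, direction)]

# The coordinates say 120 ms, but the measured RTT is only 80 ms,
# so the spring is stretched and pulls node a towards node b.
a, b = [0.0, 0.0], [120.0, 0.0]
new_a = vivaldi_step(a, b, rtt=80.0)
print("new position of a:", new_a)
print("new estimated distance:", math.dist(new_a, b), "ms")
```

A small `delta` makes the coordinates converge slowly but smoothly; a large one reacts faster at the risk of oscillation, which is the stability concern listed among the design goals.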
Example
[Figure: at time t, d(a,b) = 120 ms while rtt(a,b) = 80 ms; after the update at time t+1, d(a,b) = 95 ms]
Example (II)
[Figure: at time t+2, d(a,c) = 50 ms while rtt(a,c) = 80 ms; after the update at time t+3, d(a,c) = 75 ms]
Which space to choose?
Physics:
- Analogy uses 3D space
- Any space with a definition of distance, difference between coordinates, and scalar multiplication is possible
Question:
Which space characterizes the Internet best?
- 2D, 3D
- Sphere, torus
- Complex network, complex space?
- From GNP: embedding in 3D, why?
Result from tests and simulations:
- 2-3 dimensions are sufficient
- More dimensions require more computation without significant improvement
Handling TIVs
Again:
- TIVs occur for asymmetric routes, links, …
- Occur quite often
- Enlarge the error of the embedding
Instead of using n dimensions, use n-1 dimensions + height
- Euclidean n-space models the core network
- High connectivity
- Fast, symmetric links
- Height models the slow access links
- Packets are transmitted in the core, not above it
- Slow hosts are pushed out of the plane
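The slide does not spell out the distance function for the height model. The sketch below uses the one from the Vivaldi paper (Dabek et al.), where a packet travels down the sender's height, across the Euclidean core, and up the receiver's height. The representation of a node as a `(position, height)` pair and the example values are assumptions.

```python
import math

def height_distance(a, b):
    """Distance in the height model: core distance plus both access heights."""
    (pos_a, height_a), (pos_b, height_b) = a, b
    return math.dist(pos_a, pos_b) + height_a + height_b

# Hypothetical nodes: (position in the core plane in ms, height in ms).
dsl_host = ((0.0, 0.0), 15.0)     # slow access link, pushed out of the plane
server = ((3.0, 4.0), 1.0)        # well-connected host close to the core
neighbor_dsl = ((0.0, 0.0), 15.0)  # second DSL host behind the same core point

print("DSL host to server:  ", height_distance(dsl_host, server), "ms")
print("DSL host to DSL host:", height_distance(dsl_host, neighbor_dsl), "ms")
```

Two hosts behind the same core location still see a non-zero distance, because both access links have to be crossed. A plain Euclidean coordinate cannot express this.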
Overview
Setup:
Outline of the final lab:
- Emulate a Kademlia network
- Extend Kademlia implementation with ICS (skeleton provided in PreLab)
- Generate DHT lookups; measure lookup delay without and with ICS
- Plot CDFs
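For the last step, the measured lookup delays have to be turned into CDF points before plotting. The sketch below computes an empirical CDF in plain Python; the delay values are invented placeholders, not lab results, and the function names are assumptions. The resulting (delay, fraction) pairs can be passed to any plotting tool.

```python
def empirical_cdf(samples):
    """Return (value, fraction of samples <= value) for each sorted sample."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(value, (rank + 1) / n) for rank, value in enumerate(ordered)]

def fraction_below(samples, threshold):
    """Fraction of lookups that finished within `threshold`."""
    return sum(1 for s in samples if s <= threshold) / len(samples)

# Placeholder lookup delays in ms (hypothetical, not measured).
delays_without_ics = [30, 10, 20, 20]
delays_with_ics = [12, 8, 15, 9]

for label, delays in (("without ICS", delays_without_ics),
                      ("with ICS", delays_with_ics)):
    print(label, empirical_cdf(delays))
```

Plotting both curves in one diagram shows directly which share of lookups finishes within a given delay in each configuration.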