The Need for Collaboration between ISPs and P2P



SLIDE 1

The Need for Collaboration between ISPs and P2P

SLIDE 2

P2P systems from an ISP view

❒ Structured

❍ DHTs (distributed hash tables), e.g., Chord, Pastry, Tapestry, CAN, Tulip, …
❍ Globally consistent protocol with “efficient” search
❍ Ignores the underlay, arbitrary placement of “data”
❍ Inefficient routing (log n is no good)

❒ Unstructured

❍ Arbitrary neighbors, e.g., Gnutella, FastTrack, …
❍ Ignores the underlay, neighbor selection, download location selection
❍ Inefficient routing
❍ Does its own “traffic engineering”

SLIDE 3

P2P traffic

❒ Some sources claim >50% of Internet traffic

❍ Examples: BitTorrent, eDonkey, Skype, GoogleTalk, …

[Figure: Internet traffic distribution 2007 (Germany). Source: ipoque GmbH (Nov 2007)]

SLIDE 4

Application Detection

SLIDE 5

Problem: application detection

❒ Usually only by port number!
❒ Yet applications use arbitrary ports

❍ For benign and for malicious reasons

❒ Example: Network Intrusion Detection Systems (NIDS)

[Figure: example setup; labels: Internet, NIDS]

SLIDE 6

Ports accounting > 1% of conns.

Port        % Conns   % Success   % Payload
80 (Web)     70.82%      68.13%      72.59%
445           3.53%       0.01%       0.00%
443 (Web)     2.34%       2.08%       1.29%
22 (SSH)      2.12%       1.75%       1.71%
25 (Mail)     1.85%       1.05%       1.71%
1042          1.66%       0.00%       0.00%
1433          1.06%       0.00%       0.00%
135           1.04%       0.00%       0.00%
< 1024       83.68%      73.73%      79.05%
> 1024       16.32%       4.08%      20.95%
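
A breakdown like the one on this slide could be computed from connection records roughly as follows. This is a minimal sketch, not the authors' tooling: the record layout `(dst_port, succeeded, payload_bytes)` and the function name `port_breakdown` are assumptions made for illustration, and "% Success" is assumed to mean successful connections on a port as a share of all connections, which is consistent with the < 1024 and > 1024 rows.

```python
from collections import defaultdict

def port_breakdown(conns):
    """Per-port shares from (dst_port, succeeded, payload_bytes) records.

    Returns {port: (pct_conns, pct_success, pct_payload)}; every value is a
    percentage of the total over ALL connections (assumed convention).
    """
    conns = list(conns)
    counts = defaultdict(lambda: [0, 0, 0])  # [connections, successes, payload]
    for port, succeeded, payload in conns:
        row = counts[port]
        row[0] += 1
        row[1] += 1 if succeeded else 0
        row[2] += payload
    total_conns = len(conns)
    total_payload = sum(row[2] for row in counts.values())
    result = {}
    for port, (n, ok, payload) in counts.items():
        result[port] = (
            100.0 * n / total_conns,
            100.0 * ok / total_conns,
            100.0 * payload / total_payload if total_payload else 0.0,
        )
    return result
```

Ports with many connections but almost no successes or payload (such as 445 in the table) stand out immediately in such a breakdown.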

SLIDE 7

Signature-based app. detection

Method              HTTP      IRC      FTP     SMTP
Port (succ.)     93,429K   75,876  151,700   1,447K
Signature        94,326K   73,962  125,296   1,416K
  expected port  92,228K   71,467   98,017   1,415K
  other port      2,126K    2,495   27,279      265

❒ Port information offers no information for ports > 1024
❒ l7-filter system application signatures
❒ HTTP highly attractive for hiding other applications
❒ Most successful conns. trigger expected signature
❒ FTP: higher percentage of false negatives
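
The two methods compared in the table can be sketched side by side: a lookup by destination port versus matching the start of the payload against per-application signatures. The regular expressions below are simplified stand-ins written for this illustration; they are not the actual l7-filter patterns, and the function names are hypothetical.

```python
import re

# Well-known ports for the four protocols in the table.
PORT_TO_APP = {80: "HTTP", 21: "FTP", 25: "SMTP", 6667: "IRC"}

# Simplified, hypothetical signatures (NOT the real l7-filter patterns).
SIGNATURES = {
    "HTTP": re.compile(rb"^(GET|POST|HEAD) \S+ HTTP/1\.[01]\r\n"),
    "SMTP": re.compile(rb"^220[ -].*SMTP", re.IGNORECASE),
    "FTP": re.compile(rb"^220[ -].*FTP", re.IGNORECASE),
    "IRC": re.compile(rb"^(NICK|USER|PASS) \S+"),
}

def classify_by_port(dst_port):
    """Return the application suggested by the port number, or None."""
    return PORT_TO_APP.get(dst_port)

def classify_by_signature(payload):
    """Return every application whose signature matches the payload start."""
    return [app for app, sig in SIGNATURES.items() if sig.search(payload)]

def on_expected_port(dst_port, payload):
    """True if a matching signature agrees with the port-based guess."""
    return classify_by_port(dst_port) in classify_by_signature(payload)
```

Because `classify_by_signature` returns a list, a payload may match several signatures at once, and a match on a port that suggests a different application corresponds to the "other port" row.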

SLIDE 8

Signature detection: well known ports

❒ Some connections trigger more than one signature
❒ Not yet wide-spread abuse
❒ But some misuse of well-known ports

Port        HTTP      IRC       SMTP    Other   No match
80    92,228,291       59              41,086  1,158,977
666x       1,217   71,650               4,238        524
25           459        2  1,415,428      195     31,889

SLIDE 9

Architecture for dynamic analysis

❒ Goals

❍ Detection scheme independence
❍ Dynamic analysis
❍ Modularity
❍ Efficiency
❍ Customizability

❒ Design (USENIX Security’06)

❍ Dynamic processing path
❍ Per connection dynamic analyzer trees
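
The idea of a per-connection, dynamically changing set of analyzers can be sketched as follows. This is a heavily simplified illustration and not Bro's implementation: the tree is flattened into a list of candidate analyzers, the class names are hypothetical, and each analyzer only checks the first token of a chunk.

```python
class Analyzer:
    """Base class: feed() returns False once the payload violates the protocol."""
    name = "generic"

    def feed(self, data):
        raise NotImplementedError


class HttpAnalyzer(Analyzer):
    name = "HTTP"

    def feed(self, data):
        return data.split(b" ", 1)[0] in (b"GET", b"POST", b"HEAD")


class IrcAnalyzer(Analyzer):
    name = "IRC"

    def feed(self, data):
        return data.split(b" ", 1)[0] in (b"NICK", b"USER", b"JOIN", b"PRIVMSG")


class Connection:
    """Holds the analyzers still considered plausible for one connection."""

    def __init__(self, candidates):
        self.active = list(candidates)

    def deliver(self, data):
        # Keep only the analyzers that still accept the traffic.
        self.active = [a for a in self.active if a.feed(data)]
        return [a.name for a in self.active]
```

Starting several analyzers in parallel and dropping the ones that reject the payload makes the result independent of the port number the connection happens to use.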

SLIDE 10

Bro: a flexible NIDS

❒ Facts

❍ Open source
❍ Developed since 1995 by Vern Paxson
❍ Used in many research environments, e.g., UCB, LBL, TUM, The Grid, NERSC, ESnet, NCSA
❍ Supports anomaly as well as misuse detection

❒ Design goals

❍ Reliable detection of attacks
❍ High performance
❍ Separation of base functionality from specific security policies
❍ Robust against attacks on itself

SLIDE 11

Bro’s protocol analyzers

❒ Full analysis

❍ HTTP, FTP, telnet, rlogin, rsh, RPC, DCE/RPC, DNS, Windows Domain Service, SMTP, IRC, POP3, NTP, ARP, ICMP, Finger, Ident, Gnutella

❒ Partial analysis

❍ NFS, SMB, NCP, SSH, SSL, IPv6, TFTP, …

❒ In progress

❍ AIM, BGP, DHCP, Windows RPC, SMB, NetBIOS, NCP, Skype, BitTorrent

SLIDE 12

Reliable detection of non-standard ports

❒ UCB: 1 day

                 internal    remote
FTP servers:            6        17
HTTP servers:         568    54,830
IRC servers:            2        33
SMTP servers:           8         8

❒ MWN similar
❒ Non-standard port connections

❍ UCB: 99% HTTP (28% Gnutella, 22% Apache)
❍ MWN: 92% HTTP (21% BitTorrent, 20% Gnutella), 7% FTP
❍ Two open HTTP proxies detected: now closed
❍ SMTP server that allowed relay: now closed

SLIDE 13

Payload inspection of FTP data transfers

❒ FTP data transfers use arbitrary ports
❒ No longer a problem: dynamic prediction table
❒ File analyzer examines connection’s payload

❍ Can determine file-type (LIBMAGIC)
❍ Can check if actual file-type == expected file-type

❒ Extensions:

❍ SMTP analyzer (using pipeline)
❍ Virus checker
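
Both steps can be sketched briefly: predicting the data connection from the FTP control channel, and comparing the transferred file's type with what its name suggests. The `PORT` argument format (four address bytes, then the port as `p1*256 + p2`) is taken from RFC 959; everything else is an assumption for illustration, including the function names, the plain dict used as prediction table, and the tiny magic-number list standing in for libmagic.

```python
def parse_port_command(line):
    """Parse 'PORT h1,h2,h3,h4,p1,p2' (RFC 959) into (ip, port), else None."""
    verb, _, args = line.strip().partition(" ")
    if verb.upper() != "PORT":
        return None
    try:
        parts = [int(x) for x in args.split(",")]
    except ValueError:
        return None
    if len(parts) != 6 or any(not 0 <= p <= 255 for p in parts):
        return None
    ip = ".".join(str(p) for p in parts[:4])
    return ip, parts[4] * 256 + parts[5]


def note_control_line(table, line):
    """Record the announced endpoint so the data connection is recognized."""
    endpoint = parse_port_command(line)
    if endpoint is not None:
        table[endpoint] = "ftp-data"
    return endpoint


# A few well-known magic numbers; a stand-in for libmagic's database.
MAGIC = [(b"%PDF", "pdf"), (b"\x89PNG", "png"), (b"PK\x03\x04", "zip"), (b"MZ", "exe")]


def sniff_type(payload):
    """Return a coarse file type based on the first payload bytes, or None."""
    for prefix, kind in MAGIC:
        if payload.startswith(prefix):
            return kind
    return None


def type_matches_name(filename, payload):
    """True if the sniffed type equals the filename extension."""
    expected = filename.rsplit(".", 1)[-1].lower()
    return sniff_type(payload) == expected
```

A file announced as `report.pdf` whose payload starts with `MZ` would fail `type_matches_name`, which is the kind of mismatch the file analyzer is described as checking for.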

SLIDE 14

Detecting IRC-based Botnets

❒ Idea

❍ Botnets like IRC protocol (remote control features)
❍ Botnet detector on top of IRC analyser

  • Checks client nickname for typical patterns
  • Checks channel topics for typical botnet commands
  • Checks if new clients connect with IRC to identified bot-servers

❒ Results

❍ MWN:

  • > 100 distinct IPs with Botnet clients
  • Now part of an automatic prevention system

❍ UCB:

  • 15 distinct IPs
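
The three checks listed under "Idea" can be sketched as a small detector fed by an IRC analyzer. All concrete patterns here are hypothetical examples invented for this sketch (the slides do not list the real nickname patterns or commands), as are the class and method names.

```python
import re

# Hypothetical examples of machine-generated nickname shapes.
NICK_PATTERNS = [
    re.compile(r"^[A-Z]{2,3}\|\d{4,}$"),     # e.g. country code, bar, digits
    re.compile(r"^\[[A-Z0-9]+\]-?\d{3,}$"),  # e.g. bracketed tag plus digits
]
# Hypothetical command prefixes that might appear in a channel topic.
TOPIC_COMMANDS = (".scan", ".download", ".update")


def suspicious_nick(nick):
    return any(p.match(nick) for p in NICK_PATTERNS)


def suspicious_topic(topic):
    return topic.lstrip().lower().startswith(TOPIC_COMMANDS)


class BotnetDetector:
    def __init__(self):
        self.bot_servers = set()
        self.bot_clients = set()

    def on_nick(self, client_ip, server_ip, nick):
        # Flag on a typical nickname, or on any client of a known bot server.
        if suspicious_nick(nick) or server_ip in self.bot_servers:
            self.bot_servers.add(server_ip)
            self.bot_clients.add(client_ip)

    def on_topic(self, server_ip, topic):
        if suspicious_topic(topic):
            self.bot_servers.add(server_ip)
```

The set `bot_clients` is what an automatic prevention system could consume, for example to block the listed IPs.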
SLIDE 15

Summary: dynamic app. analysis

❒ Ideas:

❍ Dynamic processing path
❍ Per connection dynamic analyzer trees

❒ Operational at three large-scale networks
❒ Detected significant number of security incidents
❒ Bot-detection now automatically blocks IP

SLIDE 16

The Need for Collaboration between ISPs and P2P

SLIDE 17

P2P from an ISP's view

❒ Good:

❍ P2P applications fill a void
❍ P2P applications are easy to develop and deploy
❍ P2P applications spur broadband demand

❒ Bad:

❍ P2P systems form overlays at application layer
❍ Routing layer functionality duplicated at app layer
❍ P2P topology agnostic of underlay => performance loss
❍ Traffic engineering difficult with P2P traffic

❒ ISPs are in a dilemma

SLIDE 18

ISP dilemma: Unstructured networks

Random/RTT-based peer selection => inefficient network resource usage

SLIDE 19

Solution? ISP-P2P cooperation

❒ Insight: ISP knows its network

❍ Node: bandwidth, geographical location, service class
❍ Routing: policy, OSPF/BGP metrics, distance to peers

SLIDE 20

Solution?: ISP-P2P cooperation

❒ Insight: ISP knows its network

❍ Node: bandwidth, geographical location, service class
❍ Routing: policy, OSPF/BGP metrics, distance to peers

❒ One proposal:

❍ ISPs: offer oracle that provides network distance info
❍ P2P: use oracle to build P2P neighborhoods

SLIDE 21

Solution?: ISP-P2P cooperation

❒ Insight: ISP knows its network

❍ Node: bandwidth, geographical location, service class
❍ Routing: policy, OSPF/BGP metrics, distance to peers

❒ One proposal:

❍ ISPs: offer oracle that provides network distance info
❍ P2P: use oracle to build P2P neighborhoods

❒ General proposal:

❍ Offer network based interfaces to applications
❍ To enable information exchange
❍ To enable pushing services inside the network
❍ Network based enablers…

SLIDE 22

Solution?: ISP-P2P cooperation

❒ Insight: ISP knows its network

❍ Node: bandwidth, geographical location, service class
❍ Routing: policy, OSPF/BGP metrics, distance to peers

❒ Oracle concept

❍ Service of AS / ISP
❍ Input: list of possible dst IPs
❍ Output: ranked list of dst IPs

  • E.g. according to distances between src IP and dst IPs
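
The oracle's interface (a list of candidate destination IPs in, a ranked list out) can be sketched as below. This is a minimal sketch under stated assumptions: the prefix-to-AS table, the AS-hop distances and the name `oracle_rank` are invented for illustration, and AS-hop count is only one of the metrics an ISP might use. The AS numbers come from the range reserved for documentation.

```python
import ipaddress

# Hypothetical ISP knowledge: which prefix belongs to which AS.
PREFIX_TO_AS = {
    ipaddress.ip_network("10.1.0.0/16"): 64500,
    ipaddress.ip_network("10.2.0.0/16"): 64501,
    ipaddress.ip_network("10.3.0.0/16"): 64502,
}
# Hypothetical AS-hop distances, keyed by (smaller AS, larger AS).
AS_HOPS = {(64500, 64501): 1, (64500, 64502): 2, (64501, 64502): 1}
UNKNOWN = float("inf")


def as_of(ip):
    """Return the AS number owning the address, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for prefix, asn in PREFIX_TO_AS.items():
        if addr in prefix:
            return asn
    return None


def as_distance(a, b):
    if a is None or b is None:
        return UNKNOWN
    if a == b:
        return 0
    return AS_HOPS.get((min(a, b), max(a, b)), UNKNOWN)


def oracle_rank(src_ip, candidate_ips):
    """Rank candidates by AS distance from the source, closest first."""
    src_as = as_of(src_ip)
    return sorted(candidate_ips, key=lambda ip: as_distance(src_as, as_of(ip)))
```

Since `sorted` is stable, candidates at equal distance keep the order in which the P2P client supplied them, so the client's own preferences still break ties.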
SLIDE 23

Oracle service

SLIDE 24

Oracle service (2.)

Oracle-based peer selection for topology and content exchange

SLIDE 25

Oracle service (3.)

Oracle-based peer selection localizes topology and traffic

SLIDE 26

ISP-P2P cooperation?

❒ ISP-aided optimal P2P neighbour selection

❍ Simple and general solution, open for all overlays
❍ Run as Web server or UDP service at known location

❒ Benefits: P2P

❍ No need to measure path characteristics
❍ Easy to avoid bottlenecks => better performance

❒ Benefits: ISPs

❍ Regains control over traffic
❍ Cost savings
❍ No legal issues (as no content is cached)

SLIDE 27

Evaluation?????

❒ Impact

❍ Topology
❍ Congestion
❍ End-user performance

❒ Methodology????

SLIDE 28

Evaluation?????

❒ Impact

❍ Topology
❍ Congestion
❍ End-user performance

❒ Methodology

❍ Sensitivity study
❍ Use different ISP / P2P topologies
❍ Use different user behavioral patterns

  • Content availability, churn, query patterns

❍ Evaluate effects on end-user experience

SLIDE 29

End-user performance evaluation

❒ Packet-level simulations

❍ Scalable Simulation Framework (SSFNet)
❍ Models for IP, TCP, HTTP, BGP, OSPF, etc.
❍ Limited to about 700 overlay peers (memory constraints)

❒ Gnutella-based P2P system

❍ Content search via flooding
❍ Content exchange via HTTP

❒ Topologies: several
❒ User behavioral patterns: several

SLIDE 30

Topologies: ISP vs. P2P

❒ Germany

❍ 12 ISPs (subset derived from published measurements)
❍ 700 peers distributed according to ISP-published customer numbers

❒ USA

❍ 25 major ISPs (from Rocketfuel)
❍ 700 peers distributed in ASes according to city population

❒ World topologies

❍ Sub-sample of measured Internet AS topologies: 16 ASes, 700 peers

                            World1    World2    World3
Tier1 (# AS / # peers)      1 / 10   1 / 355    1 / 50
Tier2 (# AS / # peers)      5 / 46    5 / 23    5 / 46
Tier3 (# AS / # peers)     10 / 46   10 / 23   10 / 42
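
The world-topology numbers can be cross-checked with a few lines of arithmetic. The sketch assumes that the second number in each "# AS / # peers" cell is the number of peers per AS in that tier. The slide does not state this, but under that reading every world has 16 ASes and 700 peers, which matches the bullet above the table.

```python
# (number of ASes, peers per AS) for Tier1, Tier2, Tier3 of each world.
WORLDS = {
    "World1": [(1, 10), (5, 46), (10, 46)],
    "World2": [(1, 355), (5, 23), (10, 23)],
    "World3": [(1, 50), (5, 46), (10, 42)],
}


def totals(tiers):
    """Return (total ASes, total peers) for one world topology."""
    n_as = sum(count for count, _ in tiers)
    n_peers = sum(count * per_as for count, per_as in tiers)
    return n_as, n_peers


for name, tiers in WORLDS.items():
    print(name, totals(tiers))
```

The three worlds therefore differ only in where the peers sit: World2 concentrates more than half of them in the single Tier1 AS, while World1 and World3 push most of them to the lower tiers.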

SLIDE 31

P2P user behavior

❒ Churn: online/offline duration

❍ Pareto and Weibull – close to observed behavior
❍ Uniform – base comparison
❍ Poisson – reflects worst-case scenario

❒ Content: type, availability and distribution

❍ Constant size (512kB)
❍ Pareto and Weibull – typical (many free-riders)
❍ Uniform – base comparison
❍ Poisson – hypothetical case (most peers sharing)
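
Online/offline durations for the churn models could be drawn as sketched below, using only Python's standard `random` module. The shape parameters and the 600-second mean are placeholders, not values from the study. The "Poisson" case is implemented here as exponentially distributed durations, which is an assumption about what the slides mean.

```python
import math
import random


def session_durations(model, n, mean=600.0, seed=0):
    """Draw n session durations (seconds) with the given target mean."""
    rng = random.Random(seed)
    if model == "pareto":
        shape = 2.5                          # hypothetical shape parameter
        scale = mean * (shape - 1) / shape   # chosen so that E[X] = mean
        return [scale * rng.paretovariate(shape) for _ in range(n)]
    if model == "weibull":
        shape = 0.7                          # hypothetical shape parameter
        scale = mean / math.gamma(1 + 1 / shape)
        return [rng.weibullvariate(scale, shape) for _ in range(n)]
    if model == "uniform":
        return [rng.uniform(0, 2 * mean) for _ in range(n)]
    if model == "poisson":
        # Assumption: exponential durations, i.e. a Poisson departure process.
        return [rng.expovariate(1 / mean) for _ in range(n)]
    raise ValueError(f"unknown churn model: {model}")
```

Fixing the seed keeps runs reproducible, so the same churn trace can be replayed with and without the oracle.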

SLIDE 32

ISP experience: Intra-AS content

❒ Content stays within ISP's network

❍ Without oracle: 10 to 35%
❍ With oracle: 55 to 80%

❒ Consistent with Telefonica field trial results for BBC

SLIDE 33

ISP experience: Intra AS content (2.)

❒ Content stays within ISP's network

SLIDE 34

User experience: Download time

❒ Mean download time reduction: 1 – 3 secs (16 – 34%)
❒ Consistent across topologies

SLIDE 35

User experience: Download time (2.)

❒ Reduced mean download time

SLIDE 36

Overlay-underlay topology correlation

Random vs. biased P2P topology

SLIDE 37

Summary

❒ Oracle

❍ Simple and easy to implement

❒ Evaluation shows

❍ Overlay graph structure not affected
❍ Reduced AS distance

  • P2P topology correlated with AS topology

❍ Traffic congestion analysis

  • Reduces inter-AS traffic => load and costs
  • Traffic distribution close to theoretical optimum

❒ Benefits

❍ ISPs: regain control of network traffic
❍ P2P network: sees performance improvements

SLIDE 38

Potential advantage of Multi-Homing

SLIDE 39

Community network

SLIDE 40

Potential advantage of Multi-Homing

❒ Idea

❍ Share broadband connections of private customers with third-party users via WiFi
❍ Enable nomadic Internet users to get access with better coverage at a lower cost

❒ Advantage

❍ Public WiFi coverage will dramatically increase without rolling out costly infrastructure
❍ More revenues are generated by nomadic users
❍ Ubiquitous WiFi roaming can be achieved

SLIDE 41

Possible benefits of Multi-Homing?

❒ Explore impact of each component

❍ Algorithm
❍ Traffic
❍ Network

  • DSL
  • Wireless
SLIDE 42

Traffic?

❒ Artificial

❍ P2P BitTorrent
❍ Web workload

❒ Real

❍ Flow level traces

  • From TU-München: 2007
  • Crawdad
SLIDE 43

Algorithm?

❒ Direct

❍ No rerouting

❒ FatPipe

❍ Ideal case

❒ FullKnowledge

❍ Min # of bandwidth limited flows

❒ MinLarge Flows

❍ Min # of large flows
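
Two of the listed strategies can be sketched as link-selection functions for a new flow: Direct, which never reroutes, and a MinLargeFlows-style heuristic that picks the link currently carrying the fewest large flows. The slides give only one-line descriptions, so the `Link` structure, the 100 kB threshold for a "large" flow, and the tie-break in favour of the home link are all assumptions.

```python
from dataclasses import dataclass, field

LARGE_FLOW_BYTES = 100_000  # hypothetical threshold for a "large" flow


@dataclass
class Link:
    name: str
    # Bytes transferred so far by each active flow on this link.
    flow_bytes: list = field(default_factory=list)


def large_flows(link):
    """Number of active flows on the link that already count as large."""
    return sum(1 for b in link.flow_bytes if b >= LARGE_FLOW_BYTES)


def route_direct(home, links):
    """Direct: no rerouting, always use the user's own link."""
    return home


def route_min_large_flows(home, links):
    """Pick the link with the fewest large flows; prefer home on ties."""
    return min(links, key=lambda l: (large_flows(l), l is not home))
```

The heuristic needs only per-flow byte counters, whereas FullKnowledge as described would additionally have to know which flows are bandwidth limited.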

SLIDE 44

Approach

❒ Multifaceted

❍ Simulation

  • Fast special-purpose simulator
  • Flow level
  • Fair sharing
  • Slow start
  • Fluid assumption
  • Different flow types

    – RTT limited
    – Interactive
    – Bandwidth limited

❍ Test bed
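
The "fair sharing" ingredient of a flow-level fluid simulator can be illustrated with a max-min fair allocation on a single link. This is a generic water-filling sketch, not the authors' simulator. The per-flow rate caps are an assumption used here to model flows that cannot use their full share, such as RTT-limited or interactive flows.

```python
def fair_share(capacity, demands):
    """Max-min fair allocation of link capacity.

    demands maps flow id -> maximum rate the flow can use (same unit as
    capacity). Flows that need less than an equal share keep their demand;
    the leftover capacity is split equally among the remaining flows.
    """
    alloc = {}
    remaining = capacity
    pending = dict(demands)
    while pending:
        share = remaining / len(pending)
        limited = {f: d for f, d in pending.items() if d <= share}
        if not limited:
            # Every remaining flow is bandwidth limited: split equally.
            for f in pending:
                alloc[f] = share
            break
        for f, d in limited.items():
            alloc[f] = d
            remaining -= d
            del pending[f]
    return alloc
```

On a 2 Mbit/s link with one flow capped at 0.2 Mbit/s and two bulk flows, the capped flow keeps 0.2 Mbit/s and the bulk flows receive 0.9 Mbit/s each.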

SLIDE 45

Test bed

❒ Network: Wired and wireless
❒ Access: DSL or NistNet

SLIDE 46

Evaluation via simulator: 2Mbit DSL

❒ Significant benefit for bulky flows

SLIDE 47

Simulator vs. test bed?

❒ Good agreement

SLIDE 48

Simulator: direct vs. routed

❒ Clear difference :-)

SLIDE 49

Simulator: varying DSL connectivity

❒ Benefit increases with congestion

SLIDE 50

Simulator: varying DSL connectivity

❒ More congestion => more benefit

SLIDE 51

Simulator: algorithms (2Mbit DSL)

❒ Lots of potential: heuristics are promising

SLIDE 52

Test bed: Bittorrent – NistNet

❒ Three clients (1 Mbit) => factor 3 improvement :-)

[Figure: download rate [Mbits/s] over experiment time [s]; legend: direct lib w/minf]

SLIDE 53

Test bed: Web – 2Mbit NistNet

❒ Overhead for small flows (prototype) but significant benefits (~ factor 2.5 for flows > 0.5 sec)

SLIDE 54

Test bed: MWN – 2Mbit NistNet

❒ Overhead for small flows (prototype) but significant benefits (~ factor 3 for flows > 0.5 sec)

SLIDE 55

Test bed: MWN trace – 2Mbit DSL

❒ Mean improves by a factor of 2.2 for bulky (blue) flows
❒ Mean improves by a factor of 3 for bulky flows > 0.5 seconds

SLIDE 56

Test bed: MWN – DSL vs. NistNet

❒ Small differences

SLIDE 57

Test bed: MWN – wired vs. wireless

❒ Almost no difference

SLIDE 58

Benefit of flow-based routing

❒ It is possible (have prototype)
❒ Significant benefits (up to a factor of 3)
❒ Achievable benefit already quite nice. Still some room for improvement.

❒ Methodology

❍ Simulation and test bed approach valuable
❍ Simulation: quick (and dirty)
❍ Test bed: slow but with real world constraints

SLIDE 59

Two approaches: Router vs. Client

❒ Router:

❍ Operator-assisted/controlled
❍ Modifications required in the wireless router firmware, vendor participation
❍ No multihomed end-user devices needed
❍ More accurate congestion information (wired/wireless)

❒ Client:

❍ No operator control on client flow re-routing
❍ No modifications to the router, no involvement of vendors
❍ Only software running on the client