Internet measurements at complexnetworks.fr, Guillaume Valadon



SLIDE 1

Internet measurements at complexnetworks.fr

Guillaume Valadon - http://valadon.complexnetworks.fr

LIP6 (CNRS - UPMC) Complex Networks team http://complexnetworks.fr

SLIDE 2

The team

  • http://complexnetworks.fr : plots & videos

– 4 permanent members: Jean-Loup Guillaume, Matthieu Latapy, Bénédicte Le Grand, Clémence Magnien
– 2 postdocs, 9 Ph.D. students

  • Focus & interests:

– Internet topology, P2P networks, social networks
– measurements
– analysis

SLIDE 3

Outline


  • 1. Internet topology measurements

Frédéric Ouedraogo, Clémence Magnien, Matthieu Latapy

  • 2. eDonkey measurements: server side
  • 3. eDonkey measurements: honeypot
SLIDE 4

Context

  • IP topology of the Internet: measured using traceroute-like tools
  • few sources, large number of destinations
  • measurements:

– long & high cost
– bias: fake links & missed links

SLIDE 5

Traceroute measurements

SLIDE 14

Traceroute: unbalanced load

[Plot: # times probed vs. distance from the monitor; annotation: many probes on the links closest to the monitor]

Traceroute limitations: unbalanced load, information redundancy, obtained view is not a tree
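
The unbalanced load can be illustrated on a toy model. The sketch below is not the measured topology: it assumes a hypothetical complete binary tree of routers rooted at the monitor, with the leaves as destinations, and assumes one traceroute probes every link on the path to its destination. The function name `traceroute_link_load` is illustrative only.

```python
from collections import Counter


def traceroute_link_load(depth: int) -> Counter:
    """Count how often each link is probed by traceroutes to every leaf.

    Toy model (an assumption, not the real Internet): routers form a
    complete binary tree of the given depth rooted at the monitor, and a
    traceroute to a leaf probes every link between the monitor and it.
    """
    load: Counter = Counter()  # (distance, link id) -> number of probes
    for leaf in range(2 ** depth):
        for distance in range(1, depth + 1):
            # The link at this distance on the path to `leaf` is identified
            # by the `distance` most significant bits of the leaf index.
            link = leaf >> (depth - distance)
            load[(distance, link)] += 1
    return load


load = traceroute_link_load(depth=5)

# Heaviest load observed on a single link at each distance from the monitor.
per_distance: dict[int, int] = {}
for (distance, _link), count in load.items():
    per_distance[distance] = max(per_distance.get(distance, 0), count)

total_probes = sum(load.values())  # probes sent by plain traceroute
distinct_links = len(load)         # links that actually exist in the tree

for distance in sorted(per_distance):
    print(f"distance {distance}: probed {per_distance[distance]} times")
print(f"{total_probes} probes for {distinct_links} distinct links")
```

In this toy tree the links next to the monitor are probed once per destination, while the last-hop links are probed once each, and the total number of probes exceeds the number of distinct links. This is the redundancy that a tree-oriented measurement tries to avoid.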

SLIDE 15

Ego-centered view

  • tracetree http://data.complexnetworks.fr/Radar/

– one source
– fixed set of destinations
– the result is a tree
– fast measurement (~100 rounds per day)
SLIDE 16

Tracetree measurements

SLIDE 21

Parameters

  • Many parameters:

– number of destinations
– delay between rounds
– maximum TTL?
– ...

  • We want:
  • 1. high frequency
  • 2. large ego-centered view
  • 3. low network load

SLIDE 23

Parameters: frequency

[Plot: # IP vs. # Hours, for the test monitor and the control monitor]

frequency has no impact on discovered addresses

SLIDE 25

Parameters: destination number

[Plot: # IP vs. # Hours, for the test monitor and the control monitor; legend: 3000 d., 1000 d., 10000 d., 1000 d. (sim), 3000 d. (sim)]

too many destinations == loss of efficiency

SLIDE 26

Available data

  • Two parameter sets:

– normal: 3000 destinations, max TTL 30, 10 minutes delay (~100 rounds / day)
– fast: 1000 destinations, max TTL 15, 1 minute delay (~800 rounds / day)

  • Available data at http://data.complexnetworks.fr/Radar/

– several sets of random destinations
– 150 monitors
– several months of uninterrupted measures

[ADN’08, ICIMP’09]
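
The two parameter sets can be cross-checked with a little arithmetic. The sketch below assumes, which the slide does not state, that the delay is a pause between the end of one round and the start of the next, so one cycle lasts delay plus round duration. Under that assumption it derives the round duration implied by the approximate round rates quoted above. The function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def implied_round_duration(delay_s: float, rounds_per_day: float) -> float:
    """Round duration (seconds) implied by a delay and a round rate.

    Assumption: cycle time = delay between rounds + duration of one round.
    """
    return SECONDS_PER_DAY / rounds_per_day - delay_s


# Figures taken from the slide; both round rates are approximate (~).
normal = implied_round_duration(delay_s=10 * 60, rounds_per_day=100)
fast = implied_round_duration(delay_s=60, rounds_per_day=800)

print(f"normal: about {normal:.0f} s per round")
print(f"fast:   about {fast:.0f} s per round")
```

Because the round rates are only given approximately, the resulting durations are orders of magnitude rather than measured values.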

SLIDE 27

Outline


  • 1. Internet topology measurements
  • 2. eDonkey measurements: server side

Frédéric Aidouni, Matthieu Latapy, Clémence Magnien

  • 3. eDonkey measurements: honeypot
SLIDE 28

Context

  • study exchanges in P2P networks

– file diffusion
– communities of interest
– popularity

  • some motivations

– understand user behaviour
– develop new P2P protocols
– blind content detection
– detect pedophile activities
– protocol and exchange simulations
SLIDE 29

eDonkey exchanges

  • 1. inter-clients: file downloads
  • 2. inter-servers: statistical data
  • 3. clients-servers: files & sources search

[Diagram: client-server exchanges. Client to server: keywords search, sources search. Server to client: files list, sources list]
SLIDE 30

Capturing traffic on a real server

[Diagram labels: eDonkey server, capture client, PCAP dump, PCAP flow, PCAP decoding, UDP traffic, eDonkey traffic inspection, anonymisation and formatting, XML encoding, eDonkey exchanges]

<opcode dir="received" TS="2786402.373146" IP="0045125351" type="high" port="02029">
  <OP_GLOBSEARCHREQ>
    <tags count="1"><anon-string>3108886</anon-string></tags>
  </OP_GLOBSEARCHREQ>
</opcode>
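
A record like the one above can be read with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on that single sample. The meaning given to each attribute (for example, `IP` as an anonymised client identifier and `TS` as a timestamp) is an assumption based on the attribute names, and the dictionary keys are illustrative.

```python
import xml.etree.ElementTree as ET

# The sample record shown on the slide, copied verbatim.
RECORD = (
    '<opcode dir="received" TS="2786402.373146" IP="0045125351" '
    'type="high" port="02029"><OP_GLOBSEARCHREQ>'
    '<tags count="1"><anon-string>3108886</anon-string></tags>'
    '</OP_GLOBSEARCHREQ></opcode>'
)


def parse_record(xml_text: str) -> dict:
    """Extract the main fields of one anonymised eDonkey record.

    Field interpretations are assumptions drawn from the attribute names.
    """
    root = ET.fromstring(xml_text)
    message = root[0]  # first child element: the eDonkey message itself
    return {
        "direction": root.get("dir"),
        "timestamp": float(root.get("TS")),
        "client": root.get("IP"),  # kept as a string to preserve leading zeros
        "port": int(root.get("port")),
        "message": message.tag,
        "strings": [e.text for e in message.iter("anon-string")],
    }


record = parse_record(RECORD)
print(record)
```

For a data set of this size a streaming parser such as `ET.iterparse` would likely be a better fit than loading whole files, but the field extraction would stay the same.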

SLIDE 31

Basic analysis: file sizes

[Plot: # of files vs. size (in MB); labeled peaks: small files, 175 MB, 230 MB, 350 MB, 700 MB, 1 GB, 1.4 GB]

  • obtained from the server answers
  • CD-ROM size and fractions (1/2, 1/3, and 1/4)

➡ related to classical sizes of storage media

SLIDE 33

Basic analysis: time between queries

[Plot: # of peers vs. time between queries (in seconds)]

regularities of queries
SLIDE 34

Resulting data set in numbers

[HotP2P’09]

  • 10 weeks of measurements
  • ~500 GB of compressed XML
  • ~10 billion messages
  • ~90 million clients
  • ~280 million distinct files

➡ anonymized data available online at http://antipaedo.lip6.fr

SLIDE 35

Outline


  • 1. Internet topology measurements
  • 2. eDonkey measurements: server side
  • 3. eDonkey measurements: honeypot

Oussama Allali, Matthieu Latapy, Clémence Magnien

SLIDE 36

Honeypot based measurements

  • eDonkey honeypot:

– customized eDonkey client
– announce files to a server (filename, hash, size)
– log queries made by regular clients

  • Manager:

– control distributed honeypots
– send commands to honeypots: server to connect to, files to exchange, ...
SLIDE 38

eDonkey exchanges

[Diagram: two honeypot behaviours: send nothing, send random content]
SLIDE 39

Methodology

  • 24 PlanetLab nodes, running distributed honeypots:

– 12 sending no content
– 12 sending random content

  • 1 greedy honeypot:

– learn files during the first day
– afterwards, announce these files

                   distributed   greedy
Honeypots          24            1
Duration in days   32            15
Shared files       4             3 175
Distinct peers     110 049       871 445
Distinct files     28 007        267 047
SLIDE 40

Parameters : distributed or greedy

[Two plots, Distributed and Greedy: total number of peers and number of new peers vs. days]

  • long measurements are relevant
  • effects of blacklisting and file popularity
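
The two curves in these plots (total peers and new peers per day) can be derived from a honeypot log in one pass. The sketch below is a minimal illustration, assuming a hypothetical log reduced to `(day, peer_id)` pairs. The actual log format of the honeypots is not given on the slides.

```python
def peer_growth(observations):
    """Return [(day, new_peers, total_peers), ...] sorted by day.

    `observations` is an iterable of (day, peer_id) pairs. This input shape
    is an assumption made for the example, not the honeypot log format.
    """
    seen = set()
    new_per_day: dict[int, int] = {}
    for day, peer in sorted(observations):
        new_per_day.setdefault(day, 0)  # keep days that bring no new peer
        if peer not in seen:
            seen.add(peer)
            new_per_day[day] += 1

    growth = []
    total = 0
    for day in sorted(new_per_day):
        total += new_per_day[day]
        growth.append((day, new_per_day[day], total))
    return growth


# Toy log: peer "a" comes back every day, "c" first appears on day 2.
log = [(1, "a"), (1, "b"), (2, "a"), (2, "c"), (3, "a")]
growth = peer_growth(log)
for day, new, total in growth:
    print(f"day {day}: {new} new peers, {total} in total")
```

A new-peers curve that stays well above zero late in the measurement is what supports running long measurements.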

SLIDE 41

Parameters : no-content & random-content

[Two plots, random content vs. no content: number of peers vs. days (HELLO messages), and number of REQUEST-PART queries vs. days (REQUEST-PART messages)]

  • advantage of sending random content
  • global and local blacklisting
SLIDE 42

Parameters : number of honeypots

[Plot: number of peers vs. number of honeypots; lower bound, average, and upper bound of 100 samples]

  • important benefit in using several honeypots
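
One plausible way to obtain such a curve is to draw random subsets of honeypots and count the distinct peers they observed together. The sketch below follows that idea, assuming "lower bound" and "upper bound" mean the minimum and maximum over the samples. The slides do not spell out the exact procedure, and the function name and toy data are hypothetical.

```python
import random


def peers_vs_honeypots(peers_by_honeypot, samples=100, seed=0):
    """Map k -> (min, mean, max) distinct peers over random k-subsets.

    `peers_by_honeypot` maps a honeypot name to the set of peers it saw.
    Taking min/max as the lower/upper bounds is an assumption.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    honeypots = list(peers_by_honeypot)
    curve = {}
    for k in range(1, len(honeypots) + 1):
        counts = []
        for _ in range(samples):
            subset = rng.sample(honeypots, k)
            union = set().union(*(peers_by_honeypot[h] for h in subset))
            counts.append(len(union))
        curve[k] = (min(counts), sum(counts) / len(counts), max(counts))
    return curve


# Toy data: three honeypots with partially overlapping sets of peers.
toy = {
    "h1": {"a", "b"},
    "h2": {"b", "c"},
    "h3": {"c", "d", "e"},
}
curve = peers_vs_honeypots(toy)
for k, (low, mean, high) in curve.items():
    print(f"{k} honeypots: min {low}, mean {mean:.2f}, max {high}")
```

The gap between the lower and upper bounds shows how much the result depends on which honeypots are picked, not only on how many.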
SLIDE 43

Conclusion

  • several data sets available

– IP topology
– eDonkey measurement: server side, client side, honeypot

  • Ongoing work

– understand topology dynamics
– community of interests in eDonkey
– anomaly detection in the IP topology
– ...
SLIDE 44

Questions ?
