
Analysis and modeling of the KAD P2P network (Bachelor thesis)



  1. Lehrstuhl für Netzarchitekturen und Netzdienste
     Analysis and modeling of the KAD P2P network
     Bachelor thesis summary presentation
     Maximilian Sievert
     Lehrstuhl für Netzarchitekturen und Netzdienste, Institut für Informatik, Technische Universität München
     May 29, 2013
     M. Sievert (TU München) Analysis and modeling of the KAD P2P network

  2. Outline
     - Introduction and context
     - Crawling framework and conducted crawls
     - Evaluation
     - Conclusion and future research

  3. Part I: Introduction and context

  4. Motivation and goals
     - P2P network simulators
       - Used to analyze interaction between P2P overlay and IP underlay
       - Behavior of P2P nodes
       - Geographic distribution of nodes, AS
     - Analysis of KAD (aMule/eMule) to determine metrics
       - PlanetLab vantage points
     - Reasons for KAD
       - one of the largest active P2P networks
       - simple, open source protocol

  5. Kademlia / Kad
     - Kademlia: P2P distributed hash table (DHT)
     - Kademlia: structured P2P network, XOR distance metric
     - Routing table: unbalanced binary tree of k-buckets
     - Protocol changes in Kad (eMule, aMule):
       - 128 bit MD4 key/node IDs instead of 160 bit
       - 2 protocol versions: packet compression above a certain size
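The XOR distance metric named on this slide can be illustrated with a minimal sketch. The function names and the use of plain Python integers for 128 bit IDs are assumptions made for illustration; they are not taken from the thesis or from the aMule/eMule sources.

```python
ID_BITS = 128  # Kad uses 128 bit IDs according to the slide


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two IDs: bitwise XOR read as an integer."""
    return a ^ b


def common_prefix_bits(a: int, b: int, id_bits: int = ID_BITS) -> int:
    """Number of leading bits two IDs share (id_bits if they are equal)."""
    return id_bits - (a ^ b).bit_length()


def closest(target: int, ids, k: int):
    """Return the k IDs with the smallest XOR distance to target."""
    return sorted(ids, key=lambda n: n ^ target)[:k]
```

A longer shared prefix means a smaller XOR distance, which is why routing tables built from k-buckets can be organized as a binary tree over ID prefixes.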

  6. Related work
     - Steiner 2008: Blizzard crawler
       - [IP, TCP, UDP, ID] mapping snapshot
       - daily full crawls for a year, zone crawls every 5 minutes for 6 months
     - Jie Yu et al. 2009, 'ID Repetition in Kad': similar crawler
       - ID reuse
       - port aliasing
       - non-persistent IDs
       - silent peers

  7. Part II: Crawling framework

  8. Crawling process
     - Adaptation of Steiner's crawler Blizzard. Own additions:
       - Protocol version 2 decompression
       - throttle delay parameter
     - Data structures:
       - Queue U of discovered yet uncontacted nodes
       - Hashset D of all discovered nodes (stores IP, UDP port)
     - Process:
       - Initialize U with initial set of starting peers
       - Sender-thread loop, receiver-thread loop
     - Abort conditions: U empty for a while, timeout, network issue
     - Output: binary dump of sent requests / received responses, text log
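The interplay of the queue U and the hashset D can be sketched as follows. This is a simplified, single-threaded illustration and not the actual crawler: the sender and receiver threads are collapsed into one loop, and `query` is a hypothetical callback that returns the contacts reported by a peer, or `None` if the peer does not respond.

```python
from collections import deque


def crawl(bootstrap, query, max_requests=None):
    """Breadth-first crawl over (ip, udp_port) pairs.

    U: queue of discovered yet uncontacted nodes.
    D: set of all discovered nodes.
    """
    U = deque()
    D = set()
    responsive = set()
    for peer in bootstrap:
        if peer not in D:
            D.add(peer)
            U.append(peer)

    sent = 0
    # A real crawler would wait a while for late responses when U runs
    # empty; this sketch simply stops once U is empty.
    while U and (max_requests is None or sent < max_requests):
        peer = U.popleft()
        sent += 1
        contacts = query(peer)  # hypothetical: None means no response
        if contacts is None:
            continue
        responsive.add(peer)
        for contact in contacts:
            if contact not in D:  # D prevents queuing a node twice
                D.add(contact)
                U.append(contact)
    return D, responsive, sent


# Toy network: peers missing from the dict never respond.
toy = {
    ("10.0.0.1", 4672): [("10.0.0.2", 4672), ("10.0.0.3", 4672)],
    ("10.0.0.2", 4672): [("10.0.0.1", 4672), ("10.0.0.4", 4672)],
}
discovered, responsive, sent = crawl([("10.0.0.1", 4672)], toy.get)
```

Keeping D separate from U is what makes the "Discovered", "Queried" and "Responsive" columns of the crawl tables distinct quantities: a node is discovered when it enters D, queried when it leaves U, and responsive only if it answers.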

  9. Crawling parameters
     - Limitations:
       - Bandwidth
       - Firewall/rate restrictions
     - Parameters:
       - Request type: number of contacts, 1-31
       - Request burst size: under 10 to avoid remote spam block
       - Request throttling: limit on nodes queried per second
       - Zone filter: restrict 'valid' nodes to specific ID prefix
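Two of these parameters lend themselves to a short sketch: the zone filter and the throttling limit. The function names are hypothetical, and the throttling helper assumes that the delay is applied once per queried node, which the slides do not state explicitly.

```python
ID_BITS = 128  # Kad node IDs are 128 bit


def in_zone(node_id: int, prefix: int, prefix_bits: int) -> bool:
    """True if the top prefix_bits bits of node_id equal prefix."""
    if prefix_bits == 0:
        return True  # an empty prefix accepts every node
    return node_id >> (ID_BITS - prefix_bits) == prefix


def max_nodes_per_second(delay_ms: float) -> float:
    """Upper bound on the query rate, assuming one delay per queried node."""
    return 1000.0 / delay_ms
```

Under that assumption a delay of 4 ms would cap the crawler at 250 queried nodes per second, and an 8 bit zone filter would restrict the crawl to roughly 1/256 of a uniformly distributed ID space.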

  10. Conducted crawls
      TU Munich (net.in.tum.de), early crawls:

       #  Start           Duration  Discovered  Queried  Responsive
       1  22/03/11 15:24  00:59:50   2.685.010      63%         26%
       2  22/03/11 21:54  00:59:56   1.950.599      65%         29%
       3  22/03/11 22:55  00:51:20   1.920.401      64%         29%
       4  05/04/11 16:27  00:59:53   3.070.211      64%         24%
       5  06/04/11 13:24  00:46:46   2.816.465      55%         29%
       6  13/04/11 01:38  00:56:56   1.960.204     100%         21%
       7  14/04/11 20:31  01:14:24   2.334.305      69%         32%
       8  18/04/11 18:11  01:59:57   2.735.059      81%         19%
       9  19/04/11 04:47  01:53:29   2.229.108     100%         16%
      10  19/04/11 23:42  01:35:16   1.853.972     100%         20%
      11  20/04/11 09:05  01:59:58   2.421.390      90%         15%
      12  20/04/11 18:07  01:59:59   2.596.452      81%         18%
      13  20/04/11 20:26  01:53:58   2.153.941     100%         20%

      Adaptation and addition of features and parameters to crawler.

  11. Conducted crawls
      TU Munich (net.in.tum.de):

       #  Start           Duration  P         Discovered  Queried  Responsive
      14  09/05/11 12:49  01:59:56  3 ms b8    2.733.412      77%         19%
      15  10/05/11 07:34  01:39:59  3 ms       2.388.565      76%         19%
      16  10/05/11 14:52  01:59:59  3 ms       2.939.943      68%         21%
      17  10/05/11 19:25  02:23:21  3 ms       2.517.320     100%         20%
      18  10/05/11 22:15  01:50:54  3 ms       2.033.205     100%         23%
      19  18/05/11 09:50  01:59:57  2 ms       2.523.813      99%         16%
      20  18/05/11 12:24  01:59:57  2 ms       2.637.730      90%         16%
      21  19/05/11 10:23  02:08:34  2 ms       2.615.946     100%         16%
      22  19/05/11 13:46  02:30:18  2 ms       3.002.986     100%         16%
      23  19/05/11 17:24  07:18:46  2 ms       3.589.031     100%         17%
      24  23/05/11 01:28  02:38:23  5 ms -n    2.229.040     100%         20%
      25  26/05/11 13:11  03:54:46  10 ms -m   2.957.286      70%         20%
      26  30/05/11 15:12  03:35:54  4 ms -m    2.771.051      63%         21%
      27  30/05/11 20:09  03:23:33  4 ms -m    2.214.414     100%         22%
      28  20/06/11 07:28  01:09:37  4 ms r7    2.359.295      37%         43%
      29  20/06/11 08:52  07:22:31  4 ms r7    5.787.889     100%         18%
      30  22/06/11 12:00  03:09:35  4 ms r7    3.821.829     100%         17%

  12. Conducted crawls
      PlanetLab nodes as global vantage points:

       #  Start           Duration  P         Discovered  Queried  Responsive
      China (planetlab-1.sjtu.edu.cn)
      31  19/05/11 13:33  07:31:44  10 ms      3.682.385     100%         15%
      32  22/05/11 13:08  08:06:00  10 ms      3.875.197     100%         15%
      33  20/06/11 04:27  03:09:08  10 ms r7   2.795.254      49%         28%
      Brazil (plab2.larc.usp.br)
      34  19/05/11 13:38  08:08:17  10 ms      4.059.290     100%         17%
      35  22/05/11 23:56  02:29:54  10 ms      2.259.123     100%         21%
      US: Denver (linux2.cs.du.edu)
      36  20/06/11 00:50  05:12:27  10 ms      3.895.919      58%         25%
      37  20/06/11 06:12  04:22:08  10 ms      3.818.514      54%         29%
      38  22/06/11 04:03  09:59:58  20 ms      4.034.685      73%         19%
      US: California (planet4.cs.ucsb.edu)
      39  19/05/11 14:11  00:26:25  10 ms      1.934.332      12%         61%
      US: Mass. (planetlab2.cs.umass.edu)
      40  20/05/11 19:15  05:16:11  5 ms       2.300.916     100%         16%
      Italy (planetlab2.di.unito.it)
      41  19/05/11 16:42  05:24:48  10 ms      2.779.760      90%         19%
      42  20/05/11 19:32  03:03:37  5 ms       2.034.142     100%         18%

  13. Part III: Evaluation

  14. Topology: network size

  15. Topology: ID distribution
      - IDs commonly (aMule, eMule) randomly initialized once, then persistent.
      - Figure: 8 bit ID prefix histogram for nodes in crawl 20110622 M
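A histogram like the one in the figure can be computed by bucketing every observed ID by its top 8 bits. The following is a minimal sketch with a hypothetical function name; it assumes IDs are available as 128 bit integers and says nothing about how the thesis produced its plot.

```python
from collections import Counter


def prefix_histogram(ids, prefix_bits: int = 8, id_bits: int = 128):
    """Count IDs per prefix; returns a list with 2**prefix_bits entries."""
    shift = id_bits - prefix_bits
    counts = Counter(node_id >> shift for node_id in ids)
    return [counts.get(p, 0) for p in range(1 << prefix_bits)]


sample_ids = [0, 1, 0xFF << 120, (0xFF << 120) + 5, 0x09 << 120]
hist = prefix_histogram(sample_ids)
```

If IDs are drawn uniformly at random, each of the 256 bins is expected to hold about n/256 of the n observed IDs, so bins that stand out far above that level point to non-random IDs.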

  16. Topology: ID distribution (filtered)
      - Notable IDs:
        - 0x0000...
        - 0x09262ce48db41838ce94c80cdaab3fab
        - 0x025e747cea687ccab41c95fa62a27a5d
        - 0x1000000 4 byte prefix
      - Figure: Filtered 8 bit ID prefix histogram for nodes in crawl 20110622 M
      - Except for a number of client classes, the assumption of randomly initialized IDs holds.

  17. Topology: Node IN degree
      - Measured as: number of unique remote nodes having a node as a contact in their routing table
      - Figure: Histogram of observed in degrees of all/responsive nodes in crawl 20110510 M2
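The IN degree definition above translates directly into a counting step over the crawl responses. This is a minimal sketch under assumptions: `responses` is a hypothetical mapping from each queried node to the contacts it returned, and a node listing itself is not counted as a remote referrer.

```python
from collections import defaultdict


def in_degrees(responses):
    """Map each node to the number of distinct remote nodes that list it."""
    referrers = defaultdict(set)
    for source, contacts in responses.items():
        for contact in contacts:
            if contact != source:  # only remote nodes count
                referrers[contact].add(source)
    return {node: len(sources) for node, sources in referrers.items()}


example = {"a": ["b", "c", "b"], "b": ["c"], "c": ["a", "c"]}
degrees = in_degrees(example)
```

Using a set per node ensures that a contact reported several times by the same remote node is counted only once. Because only responsive nodes reveal their contacts, the value is a lower bound on the true IN degree.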

  18. Geographic distribution
      - Determined by IP mapping.
      - Figure: Overview of geographic location of nodes in crawl 20110622 M

  19. Geographic distribution: continents
      - Figure: Distribution of observed key metrics by continents of crawl 20110510 M2 (left) and 20110622 M (right)

  20. Geographic distribution: widely known peers
      - Over all crawls: stable nodes (found in at least 10 crawls) with persistently large IN degree (average over 1000)
      - Figure: Geographic location of widely known and stable peers.

  21. Autonomous systems
      - Mapping data: 38143 distinct autonomous systems
      - Participating number of AS:
        - 6658 for 20110510 M2
        - 7626 for 20110622 M

      [IP, TCP] clients   AS in 20110510 M2   AS in 20110622 M
      1                                2504               2601
      2 - 10                           2583               3092
      11 - 100                         1113               1384
      101 - 1000                        330                392
      1001 - 10000                       96                120
      10001 - 100000                     29                 33
      100001 - 1000000                    3                  4

      ASN / Name                                                     % of crawl  Responsive  ID uniqueness
      AS4134 Chinanet                                                  27.2547%      15.61%         41.08%
      AS4837 CNCGROUP China169 Backbone                                19.9828%      20.80%         41.48%
      AS3269 Telecom Italia S.p.a.                                      4.2992%      34.02%         82.11%
      AS4808 CNCGROUP IP network China169 Beijing Province Network      3.0233%      20.37%         67.42%
      AS4812 China Telecom (Group)                                      2.6414%      13.86%         74.38%
      AS3462 Data Communication Business Group                          2.1754%      40.30%         83.83%
      AS3352 Internet Access Network of TDE                             2.1413%      27.50%         82.02%
      AS3215 France Telecom - Orange                                    1.7685%      30.37%         88.64%
      AS9394 CHINA RAILWAY Internet(CRNET)                              1.7389%      27.12%         79.33%
      AS1267 Infostrada S.p.A                                           1.5942%      35.96%         77.10%

  22. Autonomous systems: characteristics
      Characterization of AS by the following features:
      - Responsiveness: percentage of nodes from AS that responded
      - ID uniqueness: number of unique IDs in AS divided by number of nodes in AS
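Both features can be computed in a single pass over the observed nodes. The sketch below assumes a hypothetical record layout of `(asn, node_id, responded)` with one record per observed node; how the thesis maps nodes to AS numbers is not shown here.

```python
from collections import defaultdict


def as_characteristics(nodes):
    """Per AS: node count, responsiveness and ID uniqueness as fractions."""
    total = defaultdict(int)
    responded_count = defaultdict(int)
    unique_ids = defaultdict(set)
    for asn, node_id, responded in nodes:
        total[asn] += 1
        if responded:
            responded_count[asn] += 1
        unique_ids[asn].add(node_id)
    return {
        asn: {
            "nodes": n,
            "responsiveness": responded_count[asn] / n,
            "id_uniqueness": len(unique_ids[asn]) / n,
        }
        for asn, n in total.items()
    }


records = [
    (4134, 1, True), (4134, 1, False), (4134, 2, False), (4134, 3, False),
    (3269, 7, True), (3269, 8, False),
]
stats = as_characteristics(records)
```

An ID uniqueness well below 1 means that many nodes in the AS were seen with the same ID, which is consistent with the ID reuse described in the related work.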
