Deploying large payloads at scale
Ramon van Alteren, Wednesday, November 9, 2011

  1. Deploying large payloads at scale Ramon van Alteren Wednesday, November 9, 2011

  2. Hyves
     • 9.7M Dutch members (16.7M population)
     • ~7M unique visitors / month (Comscore 09/2011)
     • ~2.3M unique visitors / day
     • 800,000 photo uploads / day
     • 7M chat messages / day

  3. Hyves - Operational environment
     • 3,500-node server park in 3 datacenters
     • 6Gbps daily outgoing traffic
     • System Engineering team: 12
     • Development team: 33

  5. Weekend project
     Result: 4.5x speed/throughput increase

  6. Weekend -> Company project

  11. A few minor problems
      • Compilation took ~40-60 minutes
      • Resulting binary was 750MB
      • Code issues
      • gcc 4.5.2 required, plus added dependencies

  13. Part I - Solving the build problem
      Jenkins to the rescue
      Add distcc to speed up compilation times

  15. Part I - Solving the build problem
      Add some serious hardware
      Compile / build in < 6 mins

  20. Part II: Deploying
      Sequential?
      • 500MB @ 1Gb/s = 4 seconds
      • 500MB @ 500Mb/s = 8 seconds
      • 500MB @ 200Mb/s = 20 seconds
      • 450 servers * 8 seconds = 3600 seconds = 1 hour
      • 450 servers * 20 seconds = 9000 seconds = 2.5 hours
      Diffs?
      • A binary diff would be between 10KB and 400MB
      • Even on consecutive runs without
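The arithmetic on this slide can be reproduced with a short sketch. It assumes decimal units (1 MB = 8 Mb, 1 Gb/s = 1000 Mb/s) and ignores protocol overhead; the function names are illustrative and do not come from the talk.

```python
def transfer_seconds(payload_mb: float, link_mbit_per_s: float) -> float:
    """Seconds needed to push payload_mb megabytes over a link_mbit_per_s link."""
    return payload_mb * 8 / link_mbit_per_s


def sequential_deploy_seconds(servers: int, payload_mb: float,
                              link_mbit_per_s: float) -> float:
    """Total time when the payload is copied to one server after another."""
    return servers * transfer_seconds(payload_mb, link_mbit_per_s)


if __name__ == "__main__":
    # Payload size, rates, and server count are the figures from the slide.
    for rate in (1000, 500, 200):
        per_host = transfer_seconds(500, rate)
        total = sequential_deploy_seconds(450, 500, rate)
        print(f"{rate} Mb/s: {per_host:.0f} s per host, "
              f"{total / 3600:.1f} h for 450 hosts")
```

The total grows linearly with the number of servers, which is why the slides move on to a peer-to-peer transfer.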

  21. Part II: Deploying - Bittorrent

  22. Bittorrent - Previous experiences
      Naive run using bittorrent to transport 300MB throughout our server park:
      • Near-complete network outage due to bandwidth starvation
      • Several crucial subsystems delayed or unreachable due to network bandwidth shortage

  23. Bittorrent - Previous experiences

  25. Bittorrent - The Problem
      Assumption: every server has a 1Gb/s link to every other server
      Reality: they don't

  26. Bittorrent - Actual bandwidth available
      [Diagram: core network, 1-4Gb/s]

  27. Bittorrent - Actual bandwidth available
      [Diagram: production traffic and administration traffic on the core network]

  30. Murder - Why not?
      Murder uses two tricks:
      • Clients (including the seeder) capped to 1 upload peer
      • Every client receives every peer from the tracker
      • No download bandwidth cap (easy to add though)
      Peers will still connect all over the place
      It's slow: a timing run over 25 peers took 7 mins

  31. DIY: Location aware tracker

  32. SMDB - Location metadata
      We have bandwidth information available in our server management database
      Build two-tier bittorrent swarms:
      • 1 swarm with 2 peers / rack (uplink)
      • 1 additional swarm per rack (uplink)
      • Cap every client @ 96 Mbit/s
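One way to picture the two-tier layout is the sketch below. It is not the speaker's code: the host-to-rack mapping stands in for what the SMDB would supply, the host and rack names are hypothetical, and it assumes the two representatives of each rack also take part in their own rack's swarm.

```python
from collections import defaultdict

CAP_MBIT_PER_S = 96              # per-client cap from the slide
PEERS_PER_RACK_IN_TOP_SWARM = 2  # "2 peers / rack" from the slide


def plan_swarms(host_to_rack):
    """Return (top_swarm, rack_swarms) for a mapping of host -> rack."""
    racks = defaultdict(list)
    for host, rack in sorted(host_to_rack.items()):
        racks[rack].append(host)

    top_swarm = []
    rack_swarms = {}
    for rack, hosts in sorted(racks.items()):
        # Only the first two hosts of a rack talk across the rack uplink.
        top_swarm.extend(hosts[:PEERS_PER_RACK_IN_TOP_SWARM])
        # Assumption: representatives re-seed inside their own rack swarm.
        rack_swarms[rack] = list(hosts)
    return top_swarm, rack_swarms


def worst_case_uplink_mbit(peers=PEERS_PER_RACK_IN_TOP_SWARM,
                           cap=CAP_MBIT_PER_S):
    """Upper bound on cross-rack deploy traffic per uplink, per direction."""
    return peers * cap


if __name__ == "__main__":
    hosts = {"web1": "r1", "web2": "r1", "web3": "r1", "web4": "r2"}
    top, per_rack = plan_swarms(hosts)
    print(top)                       # hosts allowed to cross rack uplinks
    print(per_rack)                  # one swarm per rack
    print(worst_case_uplink_mbit())  # Mbit/s, if the cap applies per direction
```

Under these assumptions, the deploy traffic crossing any single rack uplink is bounded by 2 x 96 Mbit/s, regardless of how many nodes sit in the rack.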

  33. DIY: 2-tier swarms

  34. DIY - Tracker
      Tracker in Python + Flask:
      • 1100 lines of code (1900 with tests)
      • Stores transfer metadata in redis
      • Connects to our SMDB using REST
      • Exposes REST interface
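The core idea of a location-aware tracker, handing a client only the peers in its own swarms rather than every peer, can be sketched with the standard library alone. This is an illustration, not the Hyves tracker: a plain dict stands in for redis, the swarm membership is passed in directly where the real tracker would query the SMDB, and the class and method names are hypothetical. The actual announce protocol (info hashes, ports, intervals) is left out.

```python
class LocationAwareTracker:
    """Toy tracker that restricts the peer list to a client's own swarms."""

    def __init__(self, host_to_swarms):
        # host -> list of swarm ids (stand-in for an SMDB lookup)
        self._host_to_swarms = host_to_swarms
        # swarm id -> set of hosts that have announced (stand-in for redis)
        self._peers = {}

    def announce(self, host):
        """Register host and return the peers it may connect to, sorted."""
        allowed = set()
        for swarm in self._host_to_swarms.get(host, []):
            members = self._peers.setdefault(swarm, set())
            allowed |= members
            members.add(host)
        allowed.discard(host)  # never hand a client its own address
        return sorted(allowed)


if __name__ == "__main__":
    tracker = LocationAwareTracker({
        "web1": ["top", "r1"],  # rack representative: both tiers
        "web2": ["r1"],         # ordinary node: rack swarm only
        "web4": ["top", "r2"],
    })
    for host in ("web1", "web2", "web4", "web1"):
        print(host, "->", tracker.announce(host))
```

In this sketch `web2` never learns about `web4`, so it cannot open a connection across the rack uplink.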

  35. DIY - Client
      We use a slightly modified rtorrent client
      Same things Twitter modified:
      • Remove features related to operating on the big bad internet
      • Make various timeouts more aggressive
      • No DHT, UPnP etc.
      Nice bonus: RPC remote API

  36. Results - single deploy ~300 hosts
      [Bar chart, axis 0-2000: classic deploy vs. Bittorrent deploy, without build and with build]

  37. Graphs - full webfarm cluster

  38. Graphs - single node

  39. Graphs - single rack (24 nodes)

  40. Bittorrent - Statistics
      == release: 101482 expected: 287 actual: 107 seeders: 0 progress: 0.00% start: 12:04:20 last_completed: none failed: 0 ==
      == release: 101482 expected: 287 actual: 267 seeders: 0 progress: 0.00% start: 12:04:20 last_completed: none failed: 0 ==
      == release: 101482 expected: 287 actual: 286 seeders: 0 progress: 0.00% start: 12:04:20 last_completed: none failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 1 progress: 0.35% start: 12:04:20 last_completed: none failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 2 progress: 42.11% start: 12:04:20 last_completed: 12:06:01 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 5 progress: 44.48% start: 12:04:20 last_completed: 12:06:05 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 34 progress: 45.80% start: 12:04:20 last_completed: 12:06:09 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 95 progress: 48.17% start: 12:04:20 last_completed: 12:06:10 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 95 progress: 48.17% start: 12:04:20 last_completed: 12:06:10 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 97 progress: 48.46% start: 12:04:20 last_completed: 12:06:15 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 101 progress: 49.23% start: 12:04:20 last_completed: 12:06:21 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 108 progress: 50.72% start: 12:04:20 last_completed: 12:06:24 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 118 progress: 52.95% start: 12:04:20 last_completed: 12:06:27 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 131 progress: 55.88% start: 12:04:20 last_completed: 12:06:30 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 172 progress: 67.77% start: 12:04:20 last_completed: 12:06:34 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 213 progress: 80.53% start: 12:04:20 last_completed: 12:06:37 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 239 progress: 87.36% start: 12:04:20 last_completed: 12:06:40 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 246 progress: 91.24% start: 12:04:20 last_completed: 12:06:43 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 265 progress: 97.55% start: 12:04:20 last_completed: 12:06:46 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 274 progress: 98.79% start: 12:04:20 last_completed: 12:06:49 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 280 progress: 99.07% start: 12:04:20 last_completed: 12:06:52 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 283 progress: 99.26% start: 12:04:20 last_completed: 12:06:55 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 284 progress: 99.36% start: 12:04:20 last_completed: 12:06:56 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 284 progress: 99.36% start: 12:04:20 last_completed: 12:06:56 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 285 progress: 99.52% start: 12:04:20 last_completed: 12:07:04 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 286 progress: 99.85% start: 12:04:20 last_completed: 12:07:06 failed: 0 ==
      == release: 101482 expected: 287 actual: 287 seeders: 287 progress: 100.00% start: 12:04:20 last_completed: 12:07:37 failed: 0 ==
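To read timings off a status line like those in slide 40, a small parser is enough. The sketch below infers the field layout from the output shown on the slide; the regular expression and function names are illustrative and are not part of the speaker's tooling. It assumes `start` and `last_completed` fall on the same day.

```python
import re
from datetime import datetime

STATUS_RE = re.compile(
    r"release: (?P<release>\d+) expected: (?P<expected>\d+) "
    r"actual: (?P<actual>\d+) seeders: (?P<seeders>\d+) "
    r"progress: (?P<progress>[\d.]+)% start: (?P<start>[\d:]+) "
    r"last_completed: (?P<last>[\d:]+|none) failed: (?P<failed>\d+)"
)


def parse_status(line):
    """Turn one status line into a dict of typed fields."""
    match = STATUS_RE.search(line)
    if match is None:
        raise ValueError(f"not a status line: {line!r}")
    f = match.groupdict()
    return {
        "release": f["release"],
        "expected": int(f["expected"]),
        "actual": int(f["actual"]),
        "seeders": int(f["seeders"]),
        "progress": float(f["progress"]),
        "start": f["start"],
        "last_completed": None if f["last"] == "none" else f["last"],
        "failed": int(f["failed"]),
    }


def elapsed_seconds(status):
    """Seconds between start and the most recent completion, or None."""
    if status["last_completed"] is None:
        return None
    fmt = "%H:%M:%S"
    delta = (datetime.strptime(status["last_completed"], fmt)
             - datetime.strptime(status["start"], fmt))
    return int(delta.total_seconds())


if __name__ == "__main__":
    final = ("== release: 101482 expected: 287 actual: 287 seeders: 287 "
             "progress: 100.00% start: 12:04:20 last_completed: 12:07:37 "
             "failed: 0 ==")
    status = parse_status(final)
    print(status["seeders"], elapsed_seconds(status))
```

Applied to the last line of the log, this gives 197 seconds from start to the final completion across all 287 hosts; the first completion (12:06:01) arrives 101 seconds after the start.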
