
TCP CONGESTION SIGNATURES Srikanth Sundaresan (Princeton Univ.)



  1. TCP CONGESTION SIGNATURES
      Srikanth Sundaresan (Princeton Univ.), Amogh Dhamdhere (CAIDA/UCSD), kc Claffy (CAIDA/UCSD), Mark Allman (ICSI)
      www.caida.org

  6. Typical Speed Tests Don’t Tell Us Much
      • Upload and download throughput measurements: no information beyond that
      • What type of congestion did the TCP flow experience?

  10. Two Potential Sources of Congestion in the End-to-end Path
      • Self-induced congestion
        - Clear path, the flow itself induced congestion
        - e.g., last-mile access link
      • External congestion
        - Flow starts on an already congested path
        - e.g., congested interconnect
      Distinguishing the two cases has implications for users, ISPs, and regulators

  12. How can we distinguish the two?
      • Cannot distinguish using just throughput numbers
        - Access plan rates vary widely, and are typically not available to content / speed test providers
        - e.g., a speed test reports 5 Mbps: is that the access link rate (DSL), or a congested path?
      We can use the dynamics of TCP’s startup phase, i.e., Congestion Signatures

  17. TCP’s RTT Congestion Signatures
      • Flows experiencing self-induced congestion fill up an empty buffer during slow start
        - Hence they increase the TCP flow RTT
      • Externally congested flows encounter an already full buffer
        - Less potential for RTT increases
      • Self-induced congestion therefore has higher RTT variance compared to external congestion
      We can quantify this using Max-Min and CoV of RTT
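
The two metrics named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `rtt_signature`, the use of the population standard deviation for the CoV, and the sample RTT values are all assumptions made for the example.

```python
from statistics import mean, pstdev


def rtt_signature(rtt_samples_ms):
    """Return (max_min, cov) for RTT samples collected during slow start.

    max_min is the difference between the largest and smallest sample;
    cov is the coefficient of variation (standard deviation / mean).
    Using the population standard deviation is an assumption here.
    """
    if len(rtt_samples_ms) < 2:
        raise ValueError("need at least two RTT samples")
    max_min = max(rtt_samples_ms) - min(rtt_samples_ms)
    cov = pstdev(rtt_samples_ms) / mean(rtt_samples_ms)
    return max_min, cov


# Hypothetical samples in ms (not measured data):
# the first flow fills an empty buffer, the second meets an already full one.
self_induced = [20, 22, 30, 45, 70, 110]
external = [68, 70, 71, 69, 72, 70]

print(rtt_signature(self_induced))
print(rtt_signature(external))
```

With these made-up samples, the self-induced flow shows a much larger Max-Min and CoV than the externally congested one, which is the contrast the slide describes.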

  19. Example Controlled Experiment
      • 20 Mbps “access” link with 100 ms buffer
      • 1 Gbps “interconnect” link with 50 ms buffer
      • Self-induced congestion flows have higher values for both metrics and are clearly distinguishable
      [Figure: two CDF panels, Max-Min RTT (x-axis 10^1 to 10^2) and CoV RTT (x-axis 10^-2 to 10^0), each with “External” and “Self” curves]
      The two types of congestion exhibit widely contrasting behaviors
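
As a rough aid to reading the buffer figures above, the sketch below converts a buffer expressed in milliseconds into bytes. It assumes that an "N ms buffer" means a queue that takes N ms to drain at the link rate; the slides do not state this definition, and the helper name `buffer_bytes` is hypothetical.

```python
def buffer_bytes(link_rate_bps, buffer_ms):
    """Bytes held by a buffer that drains in buffer_ms at link_rate_bps.

    Assumes the buffer size is quoted as drain time at line rate.
    """
    return link_rate_bps * buffer_ms / 1000 / 8


access = buffer_bytes(20e6, 100)      # 20 Mbps link, 100 ms buffer
interconnect = buffer_bytes(1e9, 50)  # 1 Gbps link, 50 ms buffer

print(access)        # 250000.0 bytes
print(interconnect)  # 6250000.0 bytes
```

Under the same assumption, a flow that fills the empty access buffer could add up to roughly 100 ms of queueing delay to its own RTT.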

  20. Model
      • Max-Min and CoV of RTT are derived from RTT samples during slow start
      • We feed the two metrics into a simple Decision Tree
        - We limit the depth of the tree to a low value to minimize complexity
      • We build the decision tree classifier using controlled experiments and apply it to real-world data
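
The idea of a shallow decision tree over the two metrics can be illustrated with a small, dependency-free sketch. This is a generic Gini-impurity tree written for the example, not the classifier used in the talk; the function names, the `max_depth` value, and the training points are all hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2  # midpoint between adjacent distinct values
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, thr), score
    return best


def build_tree(rows, labels, max_depth):
    """Build a tree as nested tuples; leaves are plain label strings."""
    split = best_split(rows, labels) if max_depth > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority label
    f, thr = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= thr]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > thr]
    return (f, thr,
            build_tree([r for r, _ in left], [y for _, y in left], max_depth - 1),
            build_tree([r for r, _ in right], [y for _, y in right], max_depth - 1))


def classify(tree, row):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if row[f] <= thr else right
    return tree


# Hypothetical training data: (max_min_rtt_ms, cov_rtt) per flow.
train_rows = [(90, 0.6), (75, 0.5), (60, 0.45), (8, 0.03), (4, 0.02), (12, 0.05)]
train_labels = ["self", "self", "self", "external", "external", "external"]

tree = build_tree(train_rows, train_labels, max_depth=2)
print(classify(tree, (80, 0.55)))
print(classify(tree, (5, 0.02)))
```

Capping `max_depth` keeps the resulting rules few and easy to inspect, which matches the stated goal of minimizing complexity.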

  23. Validating the Method: Step 1 - Controlled Experiments
      [Diagram: testbed with labels Pi 1, Pi 2, shaped “access” link, routers R1 and R2, 100 Mbps link, 1 Gbps link, Servers 1 to 4, and the Internet; annotated with background cross-traffic, interconnect cross-traffic, and throughput tests]

  25. It’s Real
      [Photo of the testbed, annotated “Fantastic cabling effort” and “Post-it defined networking”]

  26. Validating the Method: Step 1 - Controlled Experiments
      • Emulated access link + “core” link
        - Wide range of access link throughputs, buffer sizes, loss rates, and cross-traffic (background and congestion-inducing)
        - Can accurately label flows in training data as “self” or “externally” congested

  27. Validating the Method: Step 1 - Controlled Experiments
      High accuracy: precision and recall > 80%, robust to model settings
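
For readers unfamiliar with the two accuracy measures quoted above, the sketch below computes precision and recall for one class using their standard definitions. The function name and the label lists are hypothetical and do not reproduce the experiment's results.

```python
def precision_recall(true_labels, predicted_labels, positive):
    """Precision and recall for the class named by `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical ground truth and classifier output for ten flows.
truth = ["self"] * 5 + ["external"] * 5
predicted = ["self", "self", "self", "self", "external",
             "external", "external", "external", "external", "self"]

print(precision_recall(truth, predicted, "self"))      # (0.8, 0.8)
print(precision_recall(truth, predicted, "external"))  # (0.8, 0.8)
```

Precision answers "of the flows predicted as this class, how many were right", while recall answers "of the flows truly in this class, how many were found".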

  29. Validating the Method: Step 2
      [Diagram: Ark VP in ISP A; congested link between ISP A and ISP B]
      • From the Ark VP in ISP A, identified a congested link with ISP B using TSLP*
      *Luckie et al., “Challenges in Inferring Internet Interdomain Congestion”, IMC 2014

  30. Validating the Method: Step 2
      [Diagram: Ark VP in ISP A; congested link to ISP B; M-Lab NDT server]
      • Periodic NDT tests from the Ark VP to an M-Lab NDT server “behind” the congested interdomain link

  33. Validation of the Method: Step 2
      [Figure: two time series from 02/18 to 03/11: download throughput (d/l Mbps, 0 to 30) and TSLP latency on the far side (10 to 70); throughput points annotated as “externally” congested and “self” congested]
      Strong correlation between throughput and TSLP latency: flows during elevated TSLP latency are labeled as “externally” congested
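
The labeling rule described above can be sketched as follows. The slides do not say how "elevated" latency was defined, so the baseline, the elevation threshold, the function name `label_tests`, and the sample series below are all assumptions for illustration.

```python
from bisect import bisect_right


def label_tests(test_times, tslp_series, baseline_ms, elevation_ms):
    """Label each throughput test by the TSLP latency seen at test time.

    tslp_series: list of (timestamp, far_side_latency_ms), sorted by timestamp.
    A test is labeled "external" when the most recent TSLP sample exceeds
    the baseline by at least elevation_ms (a hypothetical rule), else "self".
    Tests that precede every TSLP sample get None.
    """
    times = [t for t, _ in tslp_series]
    labels = []
    for t in test_times:
        i = bisect_right(times, t) - 1  # latest TSLP sample at or before t
        if i < 0:
            labels.append(None)
            continue
        latency = tslp_series[i][1]
        elevated = latency - baseline_ms >= elevation_ms
        labels.append("external" if elevated else "self")
    return labels


# Hypothetical TSLP samples: (seconds since start, far-side latency in ms).
tslp = [(0, 20), (300, 21), (600, 55), (900, 60), (1200, 22)]
tests = [100, 650, 1000, 1250]

print(label_tests(tests, tslp, baseline_ms=20, elevation_ms=15))
# ['self', 'external', 'external', 'self']
```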

  34. Validation of the Method: Step 2
      [Figure: same download throughput and TSLP latency time series, 02/18 to 03/11]
      75%+ accuracy in detecting external congestion, 100% accuracy for self-induced congestion

  35. Validation of the Method: Step 3
      • We use Measurement Lab’s NDT test data for real-world validation
      • Cogent interconnect issue in late 2013 / early 2014
        - NDT tests to Cogent servers saw significant drops in throughput during peak hours
        - Several major U.S. ISPs were affected, except Cox
        - The problem was identified as congested interconnects

  36. Using the M-Lab Data
      [Figure: throughput (Mbps, 0 to 40) by hour of day (local) for Comcast, TimeWarner, Cox, and Verizon; one panel for January 2014 and one for April 2014]
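
A per-ISP, per-hour view like the one plotted above could be produced with a grouping step such as the sketch below. Using the median as the summary statistic is an assumption (the slide does not say which statistic is plotted), and the ISP names and throughput values are placeholders, not M-Lab data.

```python
from collections import defaultdict
from statistics import median


def hourly_median(tests):
    """Group tests by ISP and local hour; return {isp: {hour: median_mbps}}.

    tests: iterable of (isp, local_hour, throughput_mbps).
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for isp, hour, mbps in tests:
        buckets[isp][hour].append(mbps)
    return {
        isp: {h: median(v) for h, v in sorted(hours.items())}
        for isp, hours in buckets.items()
    }


# Placeholder records: (ISP, local hour of day, download throughput in Mbps).
records = [
    ("ISP-X", 4, 30), ("ISP-X", 4, 34), ("ISP-X", 20, 8), ("ISP-X", 20, 12),
    ("ISP-Y", 4, 28), ("ISP-Y", 20, 27),
]

print(hourly_median(records))
```

In this made-up data, ISP-X drops sharply in the evening hour while ISP-Y stays flat, which is the kind of diurnal pattern one would look for when checking for peak-hour congestion.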
