Using CPU as a Traffic Co-processing Unit in Commodity Switches


  1. Using CPU as a Traffic Co-processing Unit in Commodity Switches • Guohan Lu, Rui Miao+, Yongqiang Xiong, and Chuanxiong Guo • Microsoft Research Asia, +Tsinghua University

  2. Background • Commodity switches are the basic building blocks of enterprise and data center networks – PortLand and VL2 build an entire data center network (DCN) from 1U commodity switches

  3. Background (cont'd) • Commodity switches now widely adopt a single-switching-chip design • This greatly simplifies switch design and lowers cost [Figure labels: CPU for control plane, DRAM, all-in-one switching ASIC]

  4. Limitation (I) • Limited forwarding table size for flow-based forwarding schemes, e.g. OpenFlow – OpenFlow provides the finest granularity for better security (Ethane), traffic load balancing (Hedera), and energy saving (ElasticTree) – 4k flow entries for the most recent BRCM switching chip • Measurement: in a data center for map-reduce style applications with 120 ToR switches and ~5k servers, the number of active flows is ≥ 4096 for 95%+ of the time [Figure: number of active flows over time, with the 4k line marked]

  5. Limitation (II) • Shallow packet buffer for bursty traffic – Switching ASIC has only several MB of buffer – Bursty traffic patterns, e.g. TCP incast, TCP flash crowds – Packet drops lead to degraded network performance [Figure labels: senders, receiver, R0, R1, R2]

  6. Design Goals • Large forwarding table – Support large forwarding table for forwarding schemes such as OpenFlow • Deep packet buffer – Absorb temporary traffic bursts, e.g., TCP incast, TCP flash crowds

  7. Assumptions for Commodity Switches • Multicore CPU for packet processing • Large DRAM as off-chip packet buffer • High speed interconnect as high speed data channel • All-in-one switching ASIC • Ethernet ports [Figure: future switch box]

  8. Large forwarding table • Complete forwarding table in software • Partial forwarding table in hardware [Figure labels: CPU with software fwd table, switch ASIC with hw fwd table]

  9. TraFfic Offloading Ratio (TFOR) • TFOR: the ratio of traffic forwarded by hardware to all traffic • Obtaining TFOR: for every minute, get the flow rates, sort the flows by rate, and put the k fastest flows in hardware • TFOR ≥ 92% for 95%+ of the time when k = 4096
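
The TFOR computation for a single interval can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `tfor`, the dictionary of per-flow byte counts, and the sample numbers are all assumptions made for the example.

```python
import heapq


def tfor(flow_bytes, k):
    """Return the fraction of bytes carried by the k largest flows.

    flow_bytes maps a (hypothetical) flow ID to the bytes it sent in
    one measurement interval; k is the hardware table size.
    """
    total = sum(flow_bytes.values())
    if total == 0:
        return 0.0  # no traffic in this interval
    # The k fastest flows are the ones assumed to sit in hardware.
    offloaded = sum(heapq.nlargest(k, flow_bytes.values()))
    return offloaded / total


# Made-up byte counts for one interval, for illustration only.
interval = {"f1": 900, "f2": 50, "f3": 30, "f4": 20}
print(tfor(interval, k=1))  # 0.9
```

Because rates are compared within one fixed-length interval, ranking by bytes and ranking by rate give the same order.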

  10. Flow Management • The k fastest flows are forwarded by hardware, the rest are forwarded by software • Assume one byte counter per flow in hardware • Procedure – Count software-forwarded flow bytes and periodically read the counters from hardware – Rank flows by rate and determine the k fastest flows – Offload fast flows to hardware and onload slow flows to software
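
One round of the ranking and offload/onload step might look like the sketch below. The function `plan_moves`, its arguments, and the sample flows are hypothetical; the slides do not say how ties are broken or how idle flows are handled, so here ties are broken by flow ID and a flow with no measured rate is simply evicted.

```python
def plan_moves(rates, in_hw, k):
    """Decide which flows to move between hardware and software.

    rates: flow ID -> measured rate for the last period (assumed to
           merge software byte counts and hardware counter reads).
    in_hw: set of flow IDs currently installed in hardware.
    k:     number of hardware flow entries.
    Returns (offload, onload) as two sets of flow IDs.
    """
    # Rank by descending rate; break ties by flow ID for determinism.
    ranked = sorted(rates, key=lambda f: (-rates[f], f))
    target = set(ranked[:k])
    offload = target - in_hw  # fast flows to install in hardware
    onload = in_hw - target   # slow flows to hand back to software
    return offload, onload


# Made-up rates for illustration only.
rates = {"a": 10, "b": 500, "c": 300, "d": 5}
print(plan_moves(rates, in_hw={"a", "b"}, k=2))  # ({'c'}, {'a'})
```

Returning only the differences keeps the number of hardware table updates per period small.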

  11. Deep Packet Buffer • Phase 1: Traffic redirection • Phase 2: Cancel redirection [Figure labels: CPU, memory, internal high bandwidth channel, switching chip, server, low watermark, high watermark]
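
The two phases and the two watermarks suggest a hysteresis rule. The sketch below assumes that redirection to the CPU starts when the ASIC queue reaches the high watermark and is cancelled once the queue drains to the low watermark; the slide does not state the exact trigger conditions, and the class name, method names, and thresholds are hypothetical.

```python
class RedirectController:
    """Two-watermark hysteresis for traffic redirection (assumed rule)."""

    def __init__(self, low, high):
        if not 0 <= low < high:
            raise ValueError("need 0 <= low < high")
        self.low = low
        self.high = high
        self.redirecting = False

    def update(self, queue_bytes):
        """Feed the current queue length; return True while redirecting."""
        if not self.redirecting and queue_bytes >= self.high:
            self.redirecting = True   # Phase 1: traffic redirection
        elif self.redirecting and queue_bytes <= self.low:
            self.redirecting = False  # Phase 2: cancel redirection
        return self.redirecting


ctl = RedirectController(low=100, high=400)
print([ctl.update(q) for q in [50, 450, 300, 100, 200]])
# [False, True, True, False, False]
```

Using two thresholds instead of one avoids flapping between the phases when the queue length hovers near a single limit.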

  12. Internal Bandwidth Needed • Data flow in: $R^{data}_{in} = \frac{2C}{MSS}$ • Data flow out: $R^{data}_{out} = \frac{C}{MSS}$ • ACK flow: $R^{ack}_{in} = \frac{C}{MSS}$ • Internal bandwidth ≥ 2C? • Receiver: delayed ack disabled • Senders: TCP slow start • No packet drops when internal bandwidth is larger than 2C. [Figure labels: senders S, receiver R, switch ASIC, CPU]
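
The factor of 2 can be read as a rate-balance argument. The derivation below is a sketch under stated assumptions, not taken from the slides: $C$ is assumed to be the capacity of the receiver's link, $MSS$ the segment size, and each ACK received during slow start is assumed to release two new segments.

```latex
\begin{align*}
R^{data}_{out} &= \frac{C}{MSS}
  && \text{segments per second delivered to the receiver} \\
R^{ack}_{in} &= R^{data}_{out} = \frac{C}{MSS}
  && \text{one ACK per segment, since delayed ACK is disabled} \\
R^{data}_{in} &= 2\,R^{ack}_{in} = \frac{2C}{MSS}
  && \text{slow start: two segments sent per ACK}
\end{align*}
```

Under these assumptions data arrives at twice the rate the receiver's link can drain, which matches the slide's statement that no packets are dropped once the internal bandwidth exceeds 2C.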

  13. Prototype • A 16xGE port switch using 4 ServerSwitch cards • HP z800 workstation – 8 CPU cores – 48GB DRAM • Kernel code for packet forwarding • User space code for switch ASIC management [Figure labels: 16xGE, 10GE]

  14. Large Forwarding Table • 10 min of synthesized traffic using the flow size distribution from DCN measurements • 1,792 HW flow entries [Figure: pairs of senders S and receivers R]
      Interval   Total bytes (GB)   # of active flows   TFOR
      1x         33.6               10,644              96.1%
      1/10x      336                106,544             90.5%

  15. Deep Packet Buffer • TCP flash crowds last for 1 second • 1024 requests [Figure: client C and 15 servers S exchanging requests and responses]
      Scheme    SYN/ACK timeout   Data timeout   Fast recovery   Packet drops
      TCP       109               180            690             15962
      DCTCP     23                395            173             3302
      DeepBuf   0                 0              0               0

  16. Conclusions • Two major limitations of current commodity switches – Limited forwarding table for OpenFlow – Shallow packet buffer for bursty traffic patterns • Use the CPU as a traffic co-processor to address these two limitations

  17. QUESTIONS?
