Towards Optimal-Performance Datacenters


  1. Towards Optimal-Performance Datacenters. HotNets’15: Xpander: Unveiling the Secrets of High-Performance Datacenters (Asaf Valadarsky³, Michael Dinitz¹, Michael Schapira³). CoNEXT’16: Xpander: Towards Optimal-Performance Datacenters (Asaf Valadarsky³, Michael Dinitz¹, Gal Shahaf³, Michael Schapira³). SIGCOMM’17: Beyond Fat-Trees Without Antennae, Mirrors, and Disco-Balls (Simon Kassing², Asaf Valadarsky³, Gal Shahaf³, Michael Schapira³, Ankit Singla²).

  2. Designing A Datacenter Architecture: Network topology? Routing? Congestion control? [Figures: page excerpts and diagrams from papers on reconfigurable datacenter networks, including FireFly (steerable FSO links on ToR switches, ceiling mirror) and a design built from lasers, an array of micromirrors (DMDs), a mirror assembly, and photodetectors]

  3. Designing A Datacenter Architecture. Performance: ➡ Throughput ➡ Resiliency to failures ➡ Path diversity ➡ Flow completion time ➡ … Deployability: ➡ Cabling complexity ➡ Operations cost ➡ Equipment costs ➡ “Easy to reason about” ➡ …

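  4. What Is The “RIGHT” Datacenter Architecture? [Figure: existing designs placed on PERFORMANCE and DEPLOYABILITY axes: Jellyfish; Slim-Fly; Small-World Datacenters, DCell, BCube, LEGUP, Hedera, c-Through, etc.; FatTree. “????” marks the corner that no existing design occupies]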

  5. In This (and the next) Talk • Reaching that upper-right corner entails designing “expander datacenters” • Xpander: a tangible and near-optimal datacenter design • Next talk: Theoretical advances in the field of expander datacenters

  6. Expander Datacenters • An expander datacenter architecture: ➡ Utilizes an expander graph as its network topology ( see next slide + Michael’s talk ) ➡ Employs multi-path routing to exploit path diversity

  7. Expander Graphs: Intuition • A graph is called an “expander graph” if it has “good” edge expansion • Intuition: In a d-regular graph with constant edge expansion c, at least c·|S| links cross any cut (S, V\S) with |S| ≤ |V|/2 ➡ We want high values of c (ideally ~d/2) ➡ Traffic is never bottlenecked at a small set of links ➡ Many paths between any source/destination pair
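The edge-expansion definition on this slide can be checked directly on toy graphs. The following is a minimal sketch, not taken from the talk: it brute-forces every vertex set S with |S| ≤ |V|/2, so it is only practical for very small graphs. The helper names (`edge_expansion`, `complete_graph`, `ring`) are illustrative assumptions.

```python
from itertools import combinations

def edge_expansion(adj):
    """Minimum, over nonempty S with |S| <= n/2, of (links leaving S) / |S|.

    adj maps each vertex to a list of its neighbours (undirected graph).
    Brute force: exponential in the number of vertices.
    """
    nodes = list(adj)
    n = len(nodes)
    best = float("inf")
    for size in range(1, n // 2 + 1):
        for subset in combinations(nodes, size):
            s = set(subset)
            # Count links with exactly one endpoint inside S.
            crossing = sum(1 for u in s for v in adj[u] if v not in s)
            best = min(best, crossing / size)
    return best

def complete_graph(n):
    # Hypothetical helper: every vertex linked to every other vertex.
    return {u: [v for v in range(n) if v != u] for u in range(n)}

def ring(n):
    # Hypothetical helper: each vertex linked to its two ring neighbours.
    return {u: [(u - 1) % n, (u + 1) % n] for u in range(n)}

k6 = edge_expansion(complete_graph(6))  # 5-regular, well connected
c8 = edge_expansion(ring(8))            # 2-regular, easy to cut in half
print(k6, c8)
```

The ring scores far lower than the complete graph because two link failures (or two congested links) separate half of its vertices from the rest, which is the bottleneck the slide warns about.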

  8. Expander Datacenters Achieve Near-Optimal Performance ➡ Support higher traffic loads ➡ More resilient to failures ➡ Support more servers with fewer network devices ➡ Multiple short paths between hosts ➡ Incrementally expandable

  9. Our Evaluation ➡ Theoretical analyses ➡ Flow- and packet-level simulations ➡ Experiments on a network emulator ➡ Experiments on an SDN-capable network

  10. Expander Datacenters ARE The State-Of-The-Art Datacenters [Figure: the same PERFORMANCE and DEPLOYABILITY axes, with Jellyfish annotated “Random Graph” and Slim-Fly annotated “Low-Diameter Graph”, alongside Small-World Datacenters, DCell, BCube, LEGUP, Hedera, c-Through, etc., FatTree, and “????”. Callout: “Breaking news! Large low-diameter graphs are good expanders”]

  11. CAN WE HAVE IT ALL? A well-structured design and near-optimal performance: YES! :)

  12. Designing A Datacenter Architecture. Performance: ➡ Throughput ➡ Resiliency to failures ➡ Path diversity ➡ Flow completion time ➡ … Deployability: ➡ Cabling complexity ➡ Operations cost ➡ Equipment costs ➡ “Could reason about” ➡ … Overlay labels: “Expander Datacenter” and “Deployment-Oriented Construction”

  13. Xpander Datacenter Architecture [Figure: ToR switches grouped into meta-nodes] ➡ No links within the same meta-node ➡ Same number of links between every two meta-nodes ➡ Same number of ToRs within any meta-node ➡ Leverages a deterministic graph-theoretic construction of expanders [BL ’06]
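The three structural properties listed on this slide can be illustrated with a small wiring generator. This is a hedged sketch under a stated assumption: it connects every pair of meta-nodes with a random perfect matching (a random lift of a complete graph), which is swapped in here for the deterministic [BL ’06] construction the slide cites. It reproduces the structure only; a random matching carries no guarantee about expansion quality. All names are hypothetical.

```python
import random

def random_lift(num_meta_nodes, tors_per_meta_node, seed=0):
    """Return a set of links ((meta_a, i), (meta_b, j)) between ToRs.

    ASSUMPTION: random matchings stand in for the deterministic
    construction referenced on the slide.
    """
    rng = random.Random(seed)
    k = tors_per_meta_node
    links = set()
    for a in range(num_meta_nodes):
        for b in range(a + 1, num_meta_nodes):
            # Perfect matching between the k ToRs of a and the k ToRs of b.
            perm = list(range(k))
            rng.shuffle(perm)
            for i in range(k):
                links.add(((a, i), (b, perm[i])))
    return links

links = random_lift(num_meta_nodes=5, tors_per_meta_node=3)

# Check the structural properties from the slide.
degree = {}
between = {}
for (a, i), (b, j) in links:
    degree[(a, i)] = degree.get((a, i), 0) + 1
    degree[(b, j)] = degree.get((b, j), 0) + 1
    between[(a, b)] = between.get((a, b), 0) + 1

no_intra_links = all(u[0] != v[0] for u, v in links)
print(no_intra_links, set(degree.values()), set(between.values()))
```

With 5 meta-nodes of 3 ToRs each, every ToR uses 4 network ports (one link into each other meta-node) and every pair of meta-nodes shares exactly 3 links, which is what makes the cables between two meta-nodes easy to bundle.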

  14. Xpander Datacenter Architecture • Topology: Xpander (ToR-to-ToR links, as on the previous slide) • Routing: K-Shortest Paths • Congestion Control: DCTCP [SIGCOMM’10]
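To make the routing row concrete, here is a minimal sketch of K-shortest-paths selection on a toy ToR graph. It is an illustration only and makes no claim about how the authors implement routing: it uses a simple best-first search over loop-free partial paths, which returns paths in order of hop count but can be exponentially slow on large graphs. The function name and example graph are assumptions.

```python
import heapq

def k_shortest_paths(adj, src, dst, k):
    """Return up to k loop-free paths from src to dst, fewest hops first."""
    found = []
    heap = [(0, [src])]  # (hop count, partial path)
    while heap and len(found) < k:
        hops, path = heapq.heappop(heap)
        node = path[-1]
        if node == dst:
            found.append(path)
            continue
        for nxt in adj[node]:
            if nxt not in path:  # keep paths loop-free
                heapq.heappush(heap, (hops + 1, path + [nxt]))
    return found

# Toy 4-switch topology: a 4-cycle 0-1-3-2-0 plus the chord 1-2.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
paths = k_shortest_paths(adj, 0, 3, 3)
print(paths)
```

A multi-path scheme would then spread flows between switches 0 and 3 across the returned paths rather than pinning them to a single shortest path.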

  15. Expander Datacenters Achieve Near-Optimal Performance ➡ Support higher traffic loads ➡ More resilient to failures ➡ Support more servers with fewer network devices ➡ Multiple short paths between hosts ➡ Incrementally expandable

  16. Datacenter Throughput • How much traffic can a datacenter network support? o The network is modelled as a capacitated graph G=(V,E,c) coupled with a demand matrix D o The maximum concurrent flow α(D) is the largest α such that every commodity in D can simultaneously send an α fraction of its demand o Common choices of D: All-to-All, Permutation, Many-to-One, and One-to-Many
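Computing the maximum concurrent flow exactly requires solving a linear program, but a simple upper bound already shows why short paths matter. The sketch below is an illustration, not the talk's evaluation method: each unit of demand from s to t consumes capacity on at least dist(s, t) links, so α times the total distance-weighted demand cannot exceed the total link capacity. It assumes a connected graph, full-duplex links with the same capacity in each direction, and hypothetical helper names.

```python
from collections import deque

def hop_distances(adj, src):
    """Breadth-first search: hop count from src to every reachable vertex."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def all_to_all(nodes):
    # One unit of demand between every ordered pair of distinct vertices.
    return {(s, t): 1.0 for s in nodes for t in nodes if s != t}

def throughput_upper_bound(adj, demands, link_capacity=1.0):
    """Upper bound on the maximum concurrent flow (assumes a connected graph).

    ASSUMPTION: each undirected link carries link_capacity in each direction.
    """
    total_capacity = link_capacity * sum(len(nbrs) for nbrs in adj.values())
    dist = {s: hop_distances(adj, s) for s in adj}
    capacity_needed = sum(d * dist[s][t] for (s, t), d in demands.items())
    return total_capacity / capacity_needed

k4 = {u: [v for v in range(4) if v != u] for u in range(4)}
ring4 = {u: [(u - 1) % 4, (u + 1) % 4] for u in range(4)}
bound_k4 = throughput_upper_bound(k4, all_to_all(k4))
bound_ring4 = throughput_upper_bound(ring4, all_to_all(ring4))
print(bound_k4, bound_ring4)
```

The ring's bound is half that of the complete graph because it has fewer links and longer average paths, so each unit of demand uses up more of a smaller capacity budget.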

  17. Near-Optimal All-To-All Throughput [Figure: all-to-all throughput with 18-port switches; y-axis: normalized throughput (0.5 to 1); x-axis: number of servers (0 to 2000); series: Xpander, Jellyfish, LPS_54, LPS_62] Theorem: In the all-to-all setting, the throughput of any d-regular expander G on n vertices is within a factor of O(log d) of that of the throughput-optimal d-regular graph on n vertices

  18. Resilience To Failures Observation: In any d-regular expander (with edge expansion ≥ 1), any two vertices are connected by exactly d edge-disjoint paths.
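The observation can be checked on small graphs by counting edge-disjoint paths with a unit-capacity maximum flow (Menger's theorem). The following is a minimal sketch under that standard technique, not code from the talk; the function name and example graphs are assumptions. The 3-dimensional hypercube is used as a 3-regular example whose edge expansion is 1.

```python
from collections import deque

def edge_disjoint_paths(adj, s, t):
    """Number of edge-disjoint s-t paths in an undirected graph (s != t).

    Unit-capacity max flow with BFS augmenting paths: each undirected
    link becomes two directed arcs of capacity 1.
    """
    cap = {(u, v): 1 for u in adj for v in adj[u]}
    count = 0
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return count
        # Push one unit of flow back along the path found.
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        count += 1

k5 = {u: [v for v in range(5) if v != u] for u in range(5)}  # 4-regular
q3 = {u: [u ^ 1, u ^ 2, u ^ 4] for u in range(8)}            # 3-regular hypercube
paths_k5 = edge_disjoint_paths(k5, 0, 1)
paths_q3 = edge_disjoint_paths(q3, 0, 7)
print(paths_k5, paths_q3)
```

In both examples the count equals the degree d, so d − 1 link failures cannot disconnect the chosen pair of switches.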

  19. Datacenter Traffic • Datacenter traffic is unpredictable o Different tenants want different things o The mix of long and short flows varies • Traffic shows different degrees of skew (i.e., the percentage of chatty servers) o Distributions can range from uniform to highly skewed

  20. Near-Optimal Throughput Even Against Adversarial Traffic! Theorem 1: The throughput of any expander on n vertices is within a logarithmic (in n) factor of the optimum with respect to any traffic pattern. Theorem 2: For any d-regular graph G on n vertices there is some traffic matrix under which the throughput of G is a logarithmic (in n) factor away from the optimum.
      Distance from optimum       | Xpander
      throughput < 80%            | <1%
      80% ≤ throughput < 85%      | 2.3%
      85% ≤ throughput < 90%      | 16.14%
      90% ≤ throughput < 95%      | 44.48%
      95% ≤ throughput            | 36.61%

  21. Dynamic Networks: Set Up Network Connections On The Fly

  22. Are Static Networks Irrelevant? • Are fewer but flexible ports better than many cheaper static ones? • Do static networks need sophisticated routing/congestion control schemes to match the performance of dynamic networks? We show that Xpander attains performance comparable to state-of-the-art dynamic networks at a comparable cost! This and more in our new SIGCOMM paper

  23. Deploying A New Datacenter Architecture • Need to address the concerns of the IT staff managing the datacenter, mainly: o Keep changes to the protocol stack to a minimum: DCTCP as the congestion control mechanism and K-Shortest Paths routing o Minimize cabling complexity (see next slide) o Retain the ability to increase the datacenter size. More on this in Michael’s talk (coming up next)

  24. Cabling Xpander [Figure: ToRs grouped into meta-nodes; no links within the same meta-node; same number of links between every two meta-nodes] ➡ Place ToRs of each meta-node in close proximity ➡ Bundle cables between two meta-nodes ➡ Use color-coding to distinguish between different meta-nodes and bundles of cables

  25. Conclusion • We show that expander datacenters outperform traditional datacenters ✓ Sheds light on past results about random and low-diameter datacenter networks • We present Xpander, a novel datacenter architecture ✓ Suggests a tangible alternative to today’s datacenter architectures ✓ Achieves near-optimal performance

  26. Thank you! Questions?
