CompSci 514: Computer Networks L18: Datacenter Network Architectures II (PowerPoint PPT Presentation)


SLIDE 1

CompSci 514: Computer Networks L18: Datacenter Network Architectures II

Xiaowei Yang

1

SLIDE 2

Outline

  • Design and evaluation of VL2
  • Discussion

– FatTree vs VL2

  • What common challenges did each address?
  • What methods did each use to address those challenges?

2

SLIDE 3

Virtual Layer 2: A Scalable and Flexible Data-Center Network

Work with Albert Greenberg, James R. Hamilton, Navendu Jain, Srikanth Kandula, Parantap Lahiri, David A. Maltz, Parveen Patel, and Sudipta Sengupta

Microsoft Research

Changhoon Kim

SLIDE 4

Tenets of Cloud-Service Data Center

  • Agility: Assign any servers to any services

– Boosts cloud utilization

  • Scaling out: Use large pools of commodities

– Achieves reliability, performance, low cost

4

Statistical Multiplexing Gain

Economies of Scale
SLIDE 5

What is VL2?

  • Why is agility important?

– Today’s DC network inhibits the deployment of other technical advances toward agility

  • With VL2, cloud DCs can enjoy agility in full

5

The first DC network that enables agility in a scaled-out fashion

SLIDE 6

Status Quo: Conventional DC Network

Reference – “Data Center: Load balancing Data Center Services”, Cisco 2004

[Diagram: conventional DC topology. The Internet connects to Core Routers (CR) and Access Routers (AR) in the DC Layer-3 domain; below them, Ethernet switches (S) and racks of application servers (A) form the DC Layer-2 domains.]

Key

  • CR = Core Router (L3)
  • AR = Access Router (L3)
  • S = Ethernet Switch (L2)
  • A = Rack of app. servers

~ 1,000 servers/pod == IP subnet

6

SLIDE 7

Conventional DC Network Problems

7

[Diagram: the same conventional topology, annotated with oversubscription ratios of roughly 5:1, 40:1, and 200:1 at successively higher layers.]

  • Dependence on high-cost proprietary routers
  • Extremely limited server-to-server capacity
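The ratios above are just total server-facing bandwidth divided by uplink bandwidth at each layer. The sketch below illustrates that arithmetic with hypothetical port counts and speeds of my own choosing; they are not the measured values behind the ~5:1 / ~40:1 / ~200:1 annotations.

```python
# Illustration only: hypothetical port counts and speeds.
def oversubscription(server_facing_gbps, uplink_gbps):
    """Ratio of bandwidth entering a layer from below to bandwidth leaving it upward."""
    return server_facing_gbps / uplink_gbps

print(oversubscription(20 * 1.0, 4.0))    # a ToR: 20 x 1G servers over 4G of uplinks -> 5.0
print(oversubscription(800.0, 20.0))      # an aggregation-layer example -> 40.0
```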
SLIDE 8

And More Problems …

8

[Diagram: the conventional topology partitioned into IP subnets (VLAN #1 and VLAN #2), with ~200:1 oversubscription toward the core.]

  • Resource fragmentation, significantly lowering cloud utilization (and cost-efficiency)

SLIDE 9

And More Problems …

9

[Diagram: the conventional topology partitioned into IP subnets (VLAN #1 and VLAN #2), with ~200:1 oversubscription toward the core.]

  • Resource fragmentation, significantly lowering cloud utilization (and cost-efficiency)
  • Complicated manual L2/L3 re-configuration

SLIDE 10

And More Problems …

10

[Diagram: the conventional topology with servers fragmented across subnets; one service is starved of capacity (revenue lost) while another holds idle servers it cannot lend out (expense wasted).]

  • Resource fragmentation, significantly lowering cloud utilization (and cost-efficiency)

SLIDE 11

Designing VL2

  • Measure to learn the characteristics of datacenter networks
  • Design routing schemes that work well with the observed traffic patterns

  • Q: limitations of this design approach?

11

SLIDE 12

Measuring Traffic

  • Instrumented a large cluster used for data mining and identified distinctive traffic patterns

– A highly utilized 1,500-node cluster in a data center that supports data mining on petabytes of data
– The servers are distributed roughly evenly across 75 ToR switches
– Collected socket-level event logs from all machines over two months

12

SLIDE 13

Traffic analysis

  • 1. The ratio of traffic volume between servers in our data centers to traffic entering/leaving our data centers is currently around 4:1 (excluding CDN applications).

  • 2. Datacenter computation is focused where high-speed access to data on memory or disk is fast and cheap. Although data is distributed across multiple data centers, intense computation and communication on data does not straddle data centers due to the cost of long-haul links.

  • 3. The demand for bandwidth between servers inside a data center is growing faster than the demand for bandwidth to external hosts.

  • 4. The network is a bottleneck to computation. We frequently see ToR switches whose uplinks are above 80% utilization.

13

SLIDE 14

Flow Distribution Analysis

14

[Plot: PDF and CDF of flow size (bytes), both by flow count and weighted by total bytes.]

Figure: Mice are numerous; 99% of flows are smaller than 100 MB. However, more than 90% of bytes are in flows between 100 MB and 1 GB.

SLIDE 15

Number of Concurrent Flows

15

[Plot: PDF and CDF of the number of concurrent flows in/out of each machine, by fraction of time.]

Figure: The number of concurrent connections has two modes: (1) 10 flows per node more than 50% of the time and (2) 80 flows per node for at least 5% of the time.

SLIDE 16

Implications

  • The distributions of flow size and number of concurrent flows both imply that VLB will perform well on this traffic. Since even big flows are only 100 MB (about 1 s of transmit time at 1 Gbps), randomizing at flow granularity (rather than packet granularity) will not cause perpetual congestion if there is unlucky placement of a few flows.
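The parenthetical "1 s of transmit time at 1 Gbps" is simply the flow size divided by the line rate:

```latex
t = \frac{100\,\mathrm{MB} \times 8\ \mathrm{bits/byte}}{1\ \mathrm{Gb/s}}
  = \frac{8 \times 10^{8}\ \mathrm{bits}}{10^{9}\ \mathrm{bits/s}}
  = 0.8\ \mathrm{s} \approx 1\ \mathrm{s}
```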

  • Moreover, adaptive routing schemes may be difficult to implement in the data center, since any reactive traffic engineering will need to run at least once a second if it wants to react to individual flows.

16

SLIDE 17

Traffic Matrix Analysis

  • Q: Is there regularity in the traffic that might be exploited through careful measurement and traffic engineering?
  • Method

– Compute the ToR-to-ToR TM: the entry TM(t)[i,j] is the number of bytes sent from servers in ToR i to servers in ToR j during the 100 s beginning at time t. We compute one TM for every 100 s interval, and servers outside the cluster are treated as belonging to a single "ToR".
– Cluster similar TMs and choose one representative TM per cluster
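A minimal sketch of this TM construction, assuming hypothetical inputs (flow_records as (timestamp, src, dst, bytes) tuples from the socket-level logs, and tor_of mapping each in-cluster server to its ToR); this is not the paper's actual tooling.

```python
from collections import defaultdict

INTERVAL = 100          # seconds per TM
OUTSIDE = "external"    # servers outside the cluster collapse into one pseudo-ToR

def build_tms(flow_records, tor_of):
    """Return {interval_start: {(ToR_i, ToR_j): bytes}} aggregated over 100 s buckets."""
    tms = defaultdict(lambda: defaultdict(float))
    for ts, src, dst, nbytes in flow_records:
        t = int(ts // INTERVAL) * INTERVAL        # start of the 100 s interval
        i = tor_of.get(src, OUTSIDE)
        j = tor_of.get(dst, OUTSIDE)
        tms[t][(i, j)] += nbytes
    return tms
```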

17

SLIDE 18

Results

  • No representative TMs emerge
  • On a time series of 864 TMs, approximating with 50-60 clusters still leaves a high fitting error (60%), and the error decreases only moderately beyond that point
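A sketch of this representativeness test, assuming scikit-learn is available and that each TM has been flattened into a vector; the 60% figure is the paper's measured result, not something this snippet reproduces on its own.

```python
import numpy as np
from sklearn.cluster import KMeans  # assumption: scikit-learn is installed

def fitting_error(tms, k):
    """Relative error when each TM is replaced by its cluster's representative."""
    X = np.array([tm.flatten() for tm in tms])        # e.g., 864 x (n_tor * n_tor)
    km = KMeans(n_clusters=k, n_init=10).fit(X)
    reps = km.cluster_centers_[km.labels_]            # representative TM for each sample
    return np.linalg.norm(X - reps) / np.linalg.norm(X)

# tms = [...]                      # e.g., matrices built from build_tms(); hypothetical here
# print(fitting_error(tms, 60))    # stays high (~0.6) on the paper's data
```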

18

SLIDE 19

Instability of Traffic Patterns

19

[Plots: (a) index of the containing cluster over time (100 s intervals); (b) run-length frequency; (c) frequency of log(time to repeat).]

Figure : Lack of short-term predictability: Tie cluster to which a trafc matrix belongs, i.e., the type of trafc mix in the TM, changes quickly and randomly.

SLIDE 20

Failure Characteristics

  • Most failures are small in size

– 50% of network device failures involve fewer than 4 devices
– 95% of network device failures involve fewer than 20 devices, while large correlated failures are rare (e.g., the largest correlated failure involved 217 switches)
– Downtimes can be significant: 95% of failures are resolved in 10 min, 98% in < 1 hr, 99.6% in < 1 day, but 0.09% last > 10 days

20

SLIDE 21

Questions to ponder

  • What design choices might change if different traffic patterns were observed?

21

SLIDE 22

22

Know Your Cloud DC: Challenges

  • Instrumented a large cluster used for data mining and identified distinctive traffic patterns

  • Traffic patterns are highly volatile

– A large number of distinctive patterns even in a day

  • Traffic patterns are unpredictable

– Correlation between patterns very weak

Optimization should be done frequently and rapidly

SLIDE 23

Know Your Cloud DC: Opportunities

  • DC controller knows everything about hosts
  • Host OS’s are easily customizable
  • Probabilistic flow distribution would work well enough, because …

– Flows are numerous and not huge (no elephants!)
– Commodity switch-to-switch links are substantially thicker (~10x) than the maximum thickness of a flow

23

DC network can be made simple

SLIDE 24

24

All We Need is Just a Huge L2 Switch, or an Abstraction of One

[Diagram: the racks of servers (A) from the conventional topology, all attached to a single giant virtual Layer-2 switch.]

SLIDE 25

25

All We Need is Just a Huge L2 Switch, or an Abstraction of One

  • 1. L2 semantics
  • 2. Uniform high capacity
  • 3. Performance isolation

[Diagram: servers (A) attached to one giant virtual Layer-2 switch, annotated with the three properties above.]

SLIDE 26

Specific Objectives and Solutions

26

Objective → Approach → Solution

  • 1. Layer-2 semantics

– Approach: Employ flat addressing
– Solution: Name-location separation & resolution service

  • 2. Uniform high capacity between servers

– Approach: Guarantee bandwidth for hose-model traffic
– Solution: Flow-based random traffic indirection (Valiant LB)

  • 3. Performance isolation

– Approach: Enforce hose model using existing mechanisms only
– Solution: TCP

SLIDE 27

Hose model

27

Figure 1: A VPN based on the customer-pipe model. A mesh of customer-pipes is needed, each extending from one customer endpoint to another. A customer endpoint must maintain a logical interface for each of its customer-pipes.

Figure 2: A VPN based on the hose model. A customer endpoint maintains just one logical interface, a hose, to the provider access router. In the figure, we show the implementation of one hose (based at A) using provider-pipes.

Service level agreements following the characterization phase might be based on the current traffic load, with provisions made for expected gradual growth as well as expected drastic traffic changes that the customer might foresee (or protect against). Both the customer and the provider may play a role in testing whether the SLAs are met. The provider may police (and possibly shape) the incoming traffic to a hose from the customer's access link to ensure that it stays within the specified profile. Similarly, traffic leaving the VPN at a hose egress (i.e., traffic potentially generated from multiple sources that has traversed the network) may have to be monitored and measured at the hose egress point, to ensure that such traffic stays within the specified profile and that the provider has met the SLA. The customer might also be required to specify a policy for actions to be taken should egress traffic exceed the specified egress hose capacity.

2.1 Capacity Management

From a provider's perspective, it is potentially more challenging to support the hose model, due to the need to meet the SLAs with a very weak specification of the traffic matrix. To manage resources so as to deal with this increased uncertainty, we consider two basic mechanisms:

Statistical multiplexing: As a single QoS assurance applies to a hose, the provider can consider multiplexing all the traffic of a given hose together. Similarly, the set of hoses making up the VPN have a common QoS assurance, and the provider can consider multiplexing all the traffic of a given VPN together. These techniques can be applied on both access links and network internal links.

Resizing: In order to provide tight QoS assurances, the provider may use (aggregate) network resource reservation mechanisms that allocate capacity on a set of links for a given hose or VPN. A provider can take the approach of allocating this capacity statically, taking into account worst-case demands. Alternatively, a provider can make an initial allocation and then resize that allocation based on online measurements. Again, such techniques can be applied on both access and network internal links. Resizing is allowed only within the envelope defined by the SLA, and would occur at a substantially finer time scale than the time scale over which SLAs might be renegotiated. These two resource management mechanisms can be used separately or in combination.

Some more remarks are in order on resizing. Provisioning decisions normally have an impact over fairly long timescales. Within the context of a VPN framework, measurements of actual usage can be used on much shorter timescales to enable efficient capacity management. Underlying this is an assumption that, within the network, boundaries will exist between resources that might be used by different classes of traffic, to ensure that performance guarantees are met. For example, traffic from different VPNs might be isolated from each other, and from other classes of traffic. In the context of this paper, resources available for VPN traffic cannot be used by other traffic requiring performance guarantees. We assume that this perspective holds whether the boundaries reflect reservation of resources, as in the case of Intserv, or some allocation in a bandwidth broker in a Diffserv environment. If the measurements of actual usage can be used to resize the boundary for a given VPN's traffic, more bandwidth is made available to other traffic and better use is made of available capacity. In reality, measurements of current usage would be used to make a prediction about near-term future usage, and this prediction would be used to resize the share of resources allocated.

In the hose model, this approach can be realized by allowing customers to resize the respective hose capacities of a VPN. Presumably there will be some cost incentive for customers to resize their hose capacities. While this mechanism is envisaged mainly as a way to track actual usage, exposing this interface to the customer would also enable the customer to resize its hose capacities based on local policy decisions. How frequently hoses may be resized will depend on the implementation and the overheads for resizing and measurement. More important, however, is whether frequent resizing is beneficial and whether it is possible to make predictions with sufficient accuracy. Finally, short-timescale resizing is not a replacement for provisioning and admission control, and the appropriate relationship between these resource management approaches is important.
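For VL2, the useful content of the hose model is that a service specifies only per-endpoint aggregate ingress and egress rates rather than a full endpoint-to-endpoint matrix. A minimal sketch of that constraint check, with made-up endpoints and capacities:

```python
# Hose model: each endpoint promises only aggregate egress/ingress rates.
def fits_hose(tm, egress_cap, ingress_cap):
    """True if traffic matrix `tm` (dict of (src, dst) -> rate) respects every hose."""
    endpoints = egress_cap.keys()
    for e in endpoints:
        sent = sum(tm.get((e, d), 0.0) for d in endpoints)
        recv = sum(tm.get((s, e), 0.0) for s in endpoints)
        if sent > egress_cap[e] or recv > ingress_cap[e]:
            return False
    return True

caps = {"A": 10.0, "B": 10.0, "C": 10.0}                  # Gbps per endpoint (hypothetical)
tm = {("A", "B"): 4.0, ("A", "C"): 5.0, ("B", "C"): 4.0}
print(fits_hose(tm, caps, caps))                           # True: every endpoint within its hose
```

The point for VL2 is that the network is sized so that any traffic matrix passing a check like this can be carried, which is what frees services from worrying about where their servers are placed.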

SLIDE 28

28

Addressing and Routing: Name-Location Separation

[Diagram: servers x, y, z attached to ToR1–ToR4; packets carry the destination ToR in an outer header (e.g., "payload | ToR3 | y", "payload | ToR4 | z"). A Directory Service maps names to locations (x → ToR2, y → ToR3, z → ToR4) and answers lookups; when z moves to ToR3, only its entry changes (z → ToR3).]

  • Servers use flat names
  • Switches run link-state routing and maintain only switch-level topology
  • Cope with host churn with very little overhead

SLIDE 29

29

Addressing and Routing: Name-Location Separation

[Diagram: the same name-location separation example as the previous slide.]

  • Servers use flat names
  • Switches run link-state routing and maintain only switch-level topology
  • Cope with host churn with very little overhead

  • Allows the use of low-cost switches
  • Protects network and hosts from host-state churn
  • Obviates host and switch reconfiguration
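A toy sketch of this name-location separation (simplified semantics of my own; real VL2 uses a kernel-level agent on each host and a replicated directory system): applications address servers by a flat application address (AA), and the sender's shim resolves the AA to the destination's current ToR locator (LA) and encapsulates.

```python
# Directory maps flat application addresses (AA) to ToR locators (LA), as on the slide.
directory = {"x": "ToR2", "y": "ToR3", "z": "ToR4"}

def send(dst_aa, payload):
    dst_tor = directory[dst_aa]                # lookup (cached by the host agent in practice)
    return {"outer_dst": dst_tor,              # routed on the switch-level topology
            "inner_dst": dst_aa,               # delivered unchanged to the server
            "payload": payload}

# When z migrates to ToR3, only the directory entry changes; server addresses and
# switch routing state stay the same.
directory["z"] = "ToR3"
print(send("z", "hello"))   # {'outer_dst': 'ToR3', 'inner_dst': 'z', 'payload': 'hello'}
```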
SLIDE 30

Example Topology: Clos Network

30

[Diagram: a Clos network. Each ToR connects 20 servers and uplinks into aggregation (Aggr) switches, which in turn connect to intermediate (Int) switches. With K aggregation switches of D ports each, the network supports 20*(DK/4) servers.]

Offers huge aggregate capacity and multiple paths at modest cost

SLIDE 31

Example Topology: Clos Network

31

[Diagram: the same Clos network as the previous slide, offering huge aggregate capacity and multiple paths at modest cost.]

D (# of 10G ports)   Max DC size (# of servers)
48                   11,520
96                   46,080
144                  103,680
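The table follows from the 20*(DK/4) expression, with the additional assumption (mine, to match the numbers) that K = D, i.e., aggregation and intermediate switches all have D ports:

```python
# Servers supported by the Clos topology on the slide: 20 per ToR, K aggregation
# switches with D ports each; the table assumes K = D.
def max_servers(d_ports, k_aggr=None):
    k = d_ports if k_aggr is None else k_aggr
    return 20 * (d_ports * k) // 4

for d in (48, 96, 144):
    print(d, max_servers(d))   # 48 -> 11520, 96 -> 46080, 144 -> 103680
```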

SLIDE 32

32

Traffic Forwarding: Random Indirection

[Diagram: ToRs T1–T6 under intermediate switches that all share the anycast address IANY. A packet from x to y is encapsulated as "payload | T3 | y" (and from x to z as "payload | T5 | z") and bounced off a randomly chosen intermediate switch; up-links carry it to the intermediate, down-links carry it to the destination ToR.]

Cope with arbitrary TMs with very little overhead

SLIDE 33

33

Traffic Forwarding: Random Indirection

[Diagram: the same random-indirection example as the previous slide.]

Cope with arbitrary TMs with very little overhead

[ ECMP + IP Anycast ]

  • Harness huge bisection bandwidth
  • Obviate esoteric traffic engineering or optimization
  • Ensure robustness to failures
  • Work with switch mechanisms available today
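A rough sketch of the forwarding decision, reusing names from the slide (T1–T6, IANY) and hypothetical otherwise; real VL2 realizes the random choice with ECMP hashing toward a single anycast locator shared by all intermediate switches, not host-side selection.

```python
INTERMEDIATES = ["I1", "I2", "I3"]      # in VL2 these all advertise the anycast address IANY
directory = {"y": "T3", "z": "T5"}      # flat name -> destination ToR, as on the slide

def path_for(flow_id, dst_name):
    # Hashing the flow id keeps every packet of a flow on one path (no reordering),
    # while different flows spread roughly uniformly across the intermediates.
    bounce = INTERMEDIATES[hash(flow_id) % len(INTERMEDIATES)]
    return [bounce, directory[dst_name], dst_name]   # up to a random intermediate, then down

print(path_for(("x", "y", 5534, 80, "tcp"), "y"))    # e.g. ['I2', 'T3', 'y']
```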
SLIDE 34

Does VL2 Ensure Uniform High Capacity?

  • How “high” and “uniform” can it get?

– Performed all-to-all data shuffle tests, then measured aggregate and per-flow goodput

  • The cost for flow-based random spreading

34

[Plot: fairness of the Aggr-to-Int links' utilization over time; Jain's fairness index§ stays close to 1.0 for the whole run.]

Goodput efficiency: 94%. Fairness between flows: 0.995.

§ Jain's fairness index, defined as (∑ xi)² / (n · ∑ xi²)
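For reference, the fairness metric in the footnote is easy to compute directly; a quick sketch:

```python
# Jain's fairness index: (sum x_i)^2 / (n * sum x_i^2); 1.0 means perfectly even.
def jain_fairness(xs):
    n = len(xs)
    return sum(xs) ** 2 / (n * sum(x * x for x in xs))

print(jain_fairness([1.0, 1.0, 1.0, 1.0]))   # 1.0: all links equally utilized
print(jain_fairness([1.0, 0.0, 0.0, 0.0]))   # 0.25: one link carries everything
```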

SLIDE 35

35

[Plot: aggregate goodput (Gbps) and number of active flows over time during the shuffle.]

Figure: Aggregate goodput during a 2.7 TB shuffle among 75 servers.

SLIDE 36

Good performance isolation

36

[Plot: aggregate goodput (Gbps) of Service 1 and Service 2 over time.]

Figure: Aggregate goodput of two services with servers intermingled on the ToRs. Service one's goodput is unaffected as service two ramps traffic up and down.

SLIDE 37

37

[Plot: aggregate goodput (Gbps) of service one and the number of mice started by service two, over time.]

Figure: Aggregate goodput of service one as service two creates bursts containing successively more short TCP connections (mice).

SLIDE 38

38

  • Figure : Aggregate goodput as all links to switches Interme-

diate and Intermediate are unplugged in succession and then reconnected in succession. Approximate times of link manipu- lation marked with vertical lines. Network re-converges in < 1s afer each failure and demonstrates graceful degradation.

SLIDE 39

VL2 Conclusion

  • VL2 achieves agility at scale via
  • 1. L2 semantics
  • 2. Uniform high capacity between servers
  • 3. Performance isolation between services

39

Lessons

  • Randomization can tame volatility
  • Add functionality where you have control
  • There’s no need to wait!