CrystalNet: Faithfully Emulating Large Production Networks
Hongqiang Harry Liu, Yibo Zhu, Jitu Padhye, Jiaxin Cao, Sri Tallapragada, Nuno Lopes, Andrey Rybalchenko, Guohan Lu, Lihua Yuan (Microsoft Research and Microsoft Azure)
I can trust clouds, right? Cloud Computing Services
[Survey figures: cost of downtime of $…k/hour or above; availability expectations of … nines or above]
Sources: USA Today survey of 200 data center managers; high-availability survey of over 100 companies by Information Technology and Intelligence Corp.
Cloud   | Date           | Impact                                                   | Root Cause                                  | Down Time
Cloud A | Aug 22nd, 2017 | Service Fabric, SQL DB, IoT Hub, HDInsight, etc.         | An incorrect network configuration change.  | 2 hours
Cloud B | Sep 20th, 2017 | DynamoDB service disruption in US-East                   | A network disruption.                       | 3.5 hours
Cloud C | Jun 8th, 2017  | asia-northeast1 region experienced a loss of connectivity | Configuration error during network upgrades. | 1.1 hours
Availability         | Maximum Downtime per Year
99.99% (four nines)  | 52.56 minutes
99.999% (five nines) | 5.26 minutes
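The downtime budgets above follow directly from the availability targets; a quick sketch of the arithmetic (using a 365-day year, which reproduces the slide's numbers):

```python
# Yearly downtime budget implied by an availability target ("nines").
MINUTES_PER_YEAR = 365 * 24 * 60  # = 525,600

def max_downtime_minutes(availability: float) -> float:
    """Maximum allowed downtime per year for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR

for label, a in [("four nines", 0.9999), ("five nines", 0.99999)]:
    print(f"{label}: {max_downtime_minutes(a):.2f} minutes/year")
# four nines: 52.56 minutes/year
# five nines: 5.26 minutes/year
```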
We must prevent such outages proactively!
Management software, switch configuration, switch
Unit tests, feature tests, testbeds, vendor tests
These tests say little about how they work in production.
▪ Software bugs (36%): bugs in routers, middleboxes and management tools
▪ Configuration bugs (27%): wrong ACL policies, route leaking, route blackholes
▪ Human errors (6%)
▪ Hardware failures (29%): ASIC driver failures, silent packet drops, fiber cuts, power failures
▪ Unidentified (2%)
Data Interval: 01/2015 – 01/2017
Production network: each device is configuration + software + hardware.
A copy of the production network needs the same configuration, software and hardware per switch. Most of the cost is from hardware.
An emulation of the production network keeps each device's real configuration and software but replaces the hardware with virtual hardware (vHardware); selected devices can also be mixed in with real hardware. The result is a high-fidelity production environment.
Software bugs, configuration bugs and human errors together account for >69% of customer-impacting network incidents in Azure (2015-2017); only hardware failures and unidentified causes fall outside what such an emulation can reproduce.
[CrystalNet architecture] The orchestrator takes inputs from production (topology, configuration, software versions, routes) and drives three phases: prepare, control, monitor. Device sandboxes (B1; S1, S2; L1, L2; T1-T4) are spread across hosts (Host A-D) and connected by virtual overlay links; a management VM (Linux, Windows, etc.) runs operators' tools and injects probing & testing traffic toward the emulated network and external production.
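The prepare/control/monitor workflow can be sketched as a minimal orchestrator loop. This is an illustrative sketch only: the class, its fields, and the round-robin placement are hypothetical, not CrystalNet's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Orchestrator:
    """Toy orchestrator: place device sandboxes, wire links, report status."""
    topology: dict   # device name -> list of neighbor device names
    configs: dict    # device name -> configuration text pulled from production
    images: dict     # device name -> switch software image/version
    placements: dict = field(default_factory=dict)

    def prepare(self, hosts):
        """Assign each device sandbox to a host (round-robin for simplicity)."""
        for i, device in enumerate(sorted(self.topology)):
            self.placements[device] = hosts[i % len(hosts)]

    def control(self):
        """Derive the virtual links to create (deduplicated, both directions)."""
        links = {tuple(sorted((a, b)))
                 for a, nbrs in self.topology.items() for b in nbrs}
        return [f"virtual link {a}<->{b}" for a, b in sorted(links)]

    def monitor(self):
        """Report placement status per device."""
        return {device: "placed" for device in self.placements}
```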
Challenges:
▪ scalability to emulate large networks
▪ flexibility to accommodate heterogeneous switches
▪ correctness and cost efficiency of the emulation boundary
Scaling out the switch network over clouds: sandboxes run on public-cloud VMs, while a private cloud hosts the parts that need a private network, load balancers, or special hardware.
▪ scalability to emulate large networks
  ▪ scaling out emulations transparently on multiple hosts and clouds
▪ flexibility to accommodate heterogeneous switches
▪ correctness and cost efficiency of the emulation boundary
Potential switch sandboxes:
▪ Docker container
▪ Virtual machine
▪ Bare metal
[Diagram] The topology (B1; S1, S2; L1, L2; T1-T4) emulated across Host A/B/C with mixed sandbox types (container, VM, bare metal), each host running a management agent.
[Diagram] The same topology with heterogeneous sandboxes (container, VM, bare metal): each sandbox type exposes networking differently, which complicates wiring the virtual links.
Key idea: maintain the network in a homogeneous layer of PhyNet containers. Every switch sandbox, whatever its type (container, VM, bare metal), shares the network namespace of a PhyNet container on its host (S1 inside S1', T1 inside T1', and so on), so virtual links are always wired between uniform containers. A management agent on each host manages the local sandboxes.
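With a uniform container layer, every link reduces to a veth pair between two network namespaces. A minimal dry-run sketch of that wiring (command strings only; the `phynet-*` namespace names and helper are illustrative, not CrystalNet's):

```python
# Build the ip(8) commands that connect two network namespaces with a
# veth pair, one endpoint moved into each namespace. Dry run: commands
# are returned as strings, not executed (executing them needs root).

def veth_link_commands(ns_a: str, ns_b: str, link_id: int) -> list:
    """Return the shell commands to wire one virtual link."""
    ep_a, ep_b = f"veth{link_id}a", f"veth{link_id}b"
    return [
        f"ip link add {ep_a} type veth peer name {ep_b}",  # create the pair
        f"ip link set {ep_a} netns {ns_a}",                # move end A
        f"ip link set {ep_b} netns {ns_b}",                # move end B
        f"ip netns exec {ns_a} ip link set {ep_a} up",     # bring up end A
        f"ip netns exec {ns_b} ip link set {ep_b} up",     # bring up end B
    ]

cmds = veth_link_commands("phynet-S1", "phynet-L1", 0)
```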
▪ scalability to emulate large networks
  ▪ scaling out emulations transparently on multiple hosts and clouds
▪ flexibility to accommodate heterogeneous devices
  ▪ introducing a homogeneous PhyNet layer to open and unify the network namespace of devices
▪ correctness and cost efficiency of the emulation boundary
[Diagram] A transparent boundary: the data center network (T1-T6, L1-L6, S1, S2) is emulated; the core network & Internet behind the boundary devices C1, C2 are not.
Static speaker devices stand in for the non-emulated neighbors: they inject routing information recorded from production (e.g., 0.0.0.0/0 from the core network & Internet) but do not react to dynamics inside the emulation. Correctness?
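A static speaker's behavior is deliberately trivial; a minimal sketch (the class name and methods are illustrative, not CrystalNet's implementation):

```python
# A "static speaker" replays routes captured from production and
# ignores everything it hears from inside the emulation.

class StaticSpeaker:
    def __init__(self, announced_prefixes):
        # e.g. ["0.0.0.0/0"] recorded from the real boundary device
        self.announced_prefixes = list(announced_prefixes)

    def announcements(self):
        """Always the same recorded routes, regardless of emulation state."""
        return list(self.announced_prefixes)

    def on_update(self, prefix, attrs):
        """Deliberately a no-op: no reaction to dynamics inside the emulation."""
        return None

c1 = StaticSpeaker(["0.0.0.0/0"])
```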
Suppose the emulated network announces a new prefix: Add 10.1.0.0/16. If, in the real network, that announcement would travel through the non-emulated part and come back in through another boundary device, static speakers cannot reproduce it: an unsafe boundary.
A proven safe boundary (AS100 / AS200 / AS300): cut the network so that the boundary is a single AS. Then BGP loop prevention guarantees that an announcement such as Add 10.1.0.0/16 leaving the emulation never returns, so static speakers lose nothing. See the paper for proofs and for safe boundaries under OSPF, IS-IS, etc.
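The BGP mechanism behind this guarantee is the standard AS_PATH loop-prevention rule; a small sketch (AS numbers follow the slide's labels):

```python
# BGP loop prevention: a speaker rejects any announcement whose AS_PATH
# already contains its own AS number. If the entire non-emulated boundary
# is one AS, a route that left the emulation through it can never re-enter.

def accept_announcement(receiver_as: int, as_path: list) -> bool:
    """Accept unless our own AS already appears on the path."""
    return receiver_as not in as_path

# A route born inside the emulation (AS100) reaches the boundary AS200
# with path [100]: accepted. Any copy offered back to AS200 later carries
# 200 on its path, so it is dropped and never re-enters the emulation.
assert accept_announcement(200, [100])
assert not accept_announcement(200, [300, 200, 100])
```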
Safe boundaries enable smaller emulations (AS100 / AS200 / AS300): emulate an individual PodSet, or only particular layers, and replace everything beyond the boundary with static speakers. Cost savings relative to emulating the entire DC: 96%~98% (see the paper for the boundary-selection algorithm).
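The scale of the saving is easy to see with a toy calculation (the device counts below are made up for illustration and are not the paper's numbers):

```python
# Fraction of device sandboxes saved by shrinking the emulation to a
# safe boundary instead of booting the whole data center.

def cost_saving(total_devices: int, emulated_devices: int) -> float:
    """1 - (devices we still emulate) / (devices in the full DC)."""
    return 1.0 - emulated_devices / total_devices

# Hypothetical DC: 5,000 switches total; one PodSet plus its spine
# layer is 120 switches.
saving = cost_saving(5_000, 120)
print(f"{saving:.1%}")  # 97.6%, within the reported 96%~98% range
```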
[Diagram] Case study: US-EAST and US-WEST, with DC-1 through DC-4 connected via regional backbones and a core backbone. Good news: better performance for intra-region traffic once the migration is finished. Bad news: it is hard to carry out this migration without user impact.
Common policies in Azure's datacenters: 1. Reachability among the servers is always on;
[Severity 1: Id XXX] Date: 10/19/2016. Impact: the entire region X unreachable. Root cause: human typo.
[Severity 1: Id YYY] Date: 10/26/2016. Impact: VM crashes and service failures in region Y. Root cause: wrong operation order.
▪ Cost of emulation: $30/hour ($1000/hour without the safe & small boundary design)
▪ Bugs found: 50+, including configuration, management script, switch software and operation errors
▪ Potential saving: 5+ outages
▪ Incidents in production: 0
One of the largest DC networks: #Borders O(10), #Spines O(100), #Leaves O(1000), #ToRs O(1000), #Routes O(10M).
[Figure: Emulation startup latency] Minutes until network-ready and route-ready, for 500 VMs/2000 cores vs. 1000 VMs/4000 cores.
▪ Scalability
▪ Flexibility to handle heterogeneous switch sandboxes
▪ Correctness & cost efficiency of a transparent emulation boundary
▪ Used in the network validation process in Azure
crystalnet-dev@microsoft.com