SLIDE 1 FairCloud: Sharing the Network in Cloud Computing
Lucian Popa (HP Labs), Gautam Kumar (UC Berkeley), Mosharaf Chowdhury (UC Berkeley), Arvind Krishnamurthy (Univ Washington), Sylvia Ratnasamy (UC Berkeley), Ion Stoica (UC Berkeley)
SLIDE 2
Motivation
[Figure: cloud resources shared among tenants — Network?]
SLIDE 3
Context
Networks are more difficult to share than other resources
SLIDE 4 Context
- Several proposals that share the network differently, e.g.:
– proportional to # source VMs (Seawall [NSDI'11])
– statically reserve bandwidth (Oktopus [Sigcomm'11])
– …
- Each provides a specific type of sharing policy
- Can we characterize the solution space and relate policies to each other?
SLIDE 5 This Talk
- 1. Framework for understanding network sharing in cloud computing
– Goals, tradeoffs, properties
- 2. Solutions for sharing the network
– Existing policies in this framework
– New policies representing different points in the design space
SLIDE 6 Goals
- 1. Minimum Bandwidth Guarantees
– Provides predictable performance
– Example: file transfer finishes within time limit
[Figure: VM A1 transfers a file to A2 with guaranteed bandwidth Bmin; Time_max = Size / Bmin]
SLIDE 7 Goals
- 1. Minimum Bandwidth Guarantees
- 2. High Utilization
– Do not leave useful resources unutilized
– Requires both work-conservation and proper incentives
[Figure: when both tenants A and B are active they split the link; when only B is active, a non-work-conserving allocation leaves capacity idle, while a work-conserving one gives B the full link]
SLIDE 8 Goals
- 1. Minimum Bandwidth Guarantees
- 2. High Utilization
- 3. Network Proportionality
– As with other services, the network should be shared proportionally to payment
– Currently, tenants pay a flat rate per VM → network share should be proportional to #VMs (assuming identical VMs)
SLIDE 9 Goals
- 1. Minimum Bandwidth Guarantees
- 2. High Utilization
- 3. Network Proportionality
– Example: A has 2 VMs, B has 3 VMs
[Figure: A1, A2 and B1, B2, B3 share a link; aggregate bandwidths Bw_A and Bw_B]
Bw_A / Bw_B = 2/3
When exact sharing is not possible, use (weighted) max-min
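The weighted max-min fallback above can be sketched as a small routine (a hypothetical helper, not code from the talk): each tenant's share is proportional to its weight (here, #VMs), and any tenant whose demand is below its proportional share is capped at that demand, with the surplus redistributed among the rest.

```python
def weighted_max_min(capacity, weights, demands):
    """Weighted max-min fair split of one link's capacity.

    weights: {tenant: weight}, e.g. the tenant's number of VMs.
    demands: {tenant: offered load on this link}.
    """
    alloc = {t: 0.0 for t in weights}
    active = set(weights)
    remaining = capacity
    while active and remaining > 1e-9:
        share = remaining / sum(weights[t] for t in active)
        # Tenants that cannot use their full proportional share
        # are frozen at their demand; the rest keep competing.
        capped = {t for t in active
                  if demands[t] - alloc[t] <= weights[t] * share}
        if not capped:
            for t in active:
                alloc[t] += weights[t] * share
            break
        for t in capped:
            remaining -= demands[t] - alloc[t]
            alloc[t] = demands[t]
        active -= capped
    return alloc
```

With A (2 VMs) and B (3 VMs) both backlogged on a 100-unit link, this yields the 2:3 split from the slide; if A's demand drops to 10, B picks up the slack.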
SLIDE 10 Goals
- 1. Minimum Bandwidth Guarantees
- 2. High Utilization
- 3. Network Proportionality
Not all goals are achievable simultaneously!
SLIDE 11
Tradeoffs
Not all goals are achievable simultaneously!
SLIDE 12
Tradeoffs
SLIDE 13 Tradeoffs
[Figure: A (2 VMs) and B share access link L of capacity C; B has 10 more VMs elsewhere in the network]
Network Proportionality: Bw_A = 2/13 C, Bw_B = 11/13 C; as B adds VMs, Bw_A ≈ C/N_T → 0, where N_T = #VMs in the network
Minimum Guarantee: Bw_A = 1/2 C, Bw_B = 1/2 C
SLIDE 14
Tradeoffs
SLIDE 15
Tradeoffs
[Figure: tenants A and B with 4 VMs each; flows A1→A2, A3→A4, B1→B2, B3→B4 all cross congested link L]
SLIDE 16
Tradeoffs
[Figure: same setup, all flows crossing congested link L]
Network Proportionality: Bw_A = Bw_B = 1/2 C
SLIDE 17 Tradeoffs
[Figure: same setup; in addition, A has an uncongested path P available (e.g., A1–A3)]
SLIDE 18 Tradeoffs
[Figure: A now also sends A1→A3 over the uncongested path P]
Network Proportionality: Bw_A(L) + Bw_A(P) = Bw_B(L), hence Bw_A(L) < Bw_B(L)
Tenants can be disincentivized to use free resources:
if A values A1→A2 or A3→A4 more than A1→A3, A is better off not using P
SLIDE 19 Tradeoffs
[Figure: same setup with uncongested path P]
Refinement: proportionality applied only to flows traversing congested links shared by multiple tenants
SLIDE 20 Tradeoffs
[Figure: same setup with uncongested path P]
Congestion Proportionality: Bw_A(L) = Bw_B(L)
SLIDE 21
Tradeoffs
Still conflicts with high utilization
SLIDE 22
Tradeoffs
[Figure: two links L1 and L2 with capacities C1 = C2 = C; flows A1→A2 and B1→B2 share L1, flows A3→A4 and B3→B4 share L2]
SLIDE 23 Tradeoffs
[Figure: same two-link setup, C1 = C2 = C]
Congestion Proportionality: Bw_A(L1) = Bw_B(L1) and Bw_A(L2) = Bw_B(L2)
SLIDE 24
Tradeoffs
[Figure: same setup; the demand of one flow on L1 (say A1→A2) drops to ε]
C1 = C2 = C
SLIDE 25
Tradeoffs
[Figure: on L1 the ε-demand flow gets ε and the competing flow gets C − ε; to keep tenant totals equal, the allocation on L2 is mirrored: C − ε and ε]
Tenants incentivized to not fully utilize resources
C1 = C2 = C
SLIDE 26
Tradeoffs
[Figure: the throttled tenant responds by limiting its L1 flow to C − 2ε, leaving L1 uncongested while L2 still carries ε and C − ε]
C1 = C2 = C
Tenants incentivized to not fully utilize resources
SLIDE 27
Tradeoffs
[Figure: with L1 uncongested (ε and C − 2ε), congestion proportionality applies only to L2, where each tenant now gets C/2; C1 = C2 = C]
Tenants incentivized to not fully utilize resources
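The gain from gaming can be checked with a few lines of arithmetic (my reading of the example on slides 24–27; the concrete values of C and ε are assumptions for illustration):

```python
# Two links L1, L2 of capacity C. One flow on L1 drops its demand to
# eps; congestion proportionality then forces the other tenant (B)
# down to eps on L2 so that tenant totals stay equal.
C, eps = 1.0, 0.01

# Honest B: C - eps on L1, but only eps on L2.
b_honest = (C - eps) + eps

# Gaming B: cap the L1 flow at C - 2*eps so L1 becomes uncongested;
# proportionality now applies to L2 alone, giving B C/2 there.
b_gaming = (C - 2 * eps) + C / 2

assert b_gaming > b_honest  # B gains by leaving L1 capacity unused
```

For small ε, B's total grows from C to roughly 3C/2, which is exactly the incentive to underutilize that the slide points out.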
SLIDE 28
Tradeoffs
Proportionality applied to each link independently (link proportionality)
[Figure: same two-link setup, L1 and L2]
SLIDE 29
Tradeoffs
[Figure: same two-link setup, L1 and L2]
Full incentives for high utilization
SLIDE 30
Goals and Tradeoffs
SLIDE 31
Guiding Properties
Break down goals into lower-level necessary properties
SLIDE 32
Properties
SLIDE 33 Work Conservation
- Bottleneck links are fully utilized
- Static reservations do not have this property
SLIDE 34
Properties
SLIDE 35 Utilization Incentives
- Tenants are not incentivized to lie about demand to leave links underutilized
- Network and congestion proportionality do not have this property
- Allocating links independently provides this property
SLIDE 36
Properties
SLIDE 37 Communication-pattern Independence
- Allocation does not depend on the communication pattern
- Per flow allocation does not have this property
– (per flow = give equal shares to each flow)
[Figure: different communication patterns receive the same bandwidth]
SLIDE 38
Properties
SLIDE 39 Symmetry
- Swapping demand directions preserves allocation
- Per source allocation lacks this property
– (per source = give equal shares to each source)
[Figure: reversing the flow directions yields the same bandwidth allocation]
SLIDE 40
Goals, Tradeoffs, Properties
SLIDE 41 Outline
- 1. Framework for understanding network sharing in cloud computing
– Goals, tradeoffs, properties
- 2. Solutions for sharing the network
– Existing policies in this framework
– New policies representing different points in the design space
SLIDE 42
Per Flow (e.g. today)
SLIDE 43
Per Source (e.g., Seawall [NSDI'11])
SLIDE 44
Static Reservation (e.g., Oktopus [Sigcomm'11])
SLIDE 45
New Allocation Policies
3 new allocation policies that take different stands on tradeoffs
SLIDE 46
Proportional Sharing at Link-level (PS-L)
SLIDE 47 Proportional Sharing at Link-level (PS-L)
- Per tenant WFQ where weight = # tenant’s VMs on link
[Figure: tenants A and B on link L; W_A = #VMs of A on L]
Bw_A / Bw_B = (#VMs of A on L) / (#VMs of B on L)
Can easily be extended to use heterogeneous VMs (by using VM weights)
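The PS-L rule above can be sketched directly (function names are mine; the talk only specifies the weights):

```python
def ps_l_weights(vms_on_link):
    """PS-L: per-tenant WFQ weight on a link = number of the
    tenant's VMs communicating over that link."""
    return dict(vms_on_link)

def ps_l_shares(capacity, vms_on_link):
    """Bandwidth each tenant receives when all tenants are
    backlogged on the link (WFQ converges to weighted shares)."""
    weights = ps_l_weights(vms_on_link)
    total = sum(weights.values())
    return {t: capacity * w / total for t, w in weights.items()}
```

Heterogeneous VMs would simply replace the VM count with a sum of per-VM weights, as the slide notes.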
SLIDE 48
Proportional Sharing at Network-level (PS-N)
SLIDE 49 Proportional Sharing at Network-level (PS-N)
- Congestion proportionality in severely restricted context
- Per source-destination WFQ, total tenant weight = # VMs
SLIDE 50 Proportional Sharing at Network-level (PS-N)
[Figure: VM A1 communicates with N_A1 VMs, and VM A2 with N_A2 VMs]
W_A1→A2 = 1/N_A1 + 1/N_A2
Total weight of tenant A = #VMs of A (each VM contributes 1 in total across its pairs)
- Congestion proportionality in severely restricted context
- Per source-destination WFQ, total tenant weight = # VMs
SLIDE 51 Proportional Sharing at Network-level (PS-N)
W_A / W_B = #VMs A / #VMs B
- Congestion proportionality in severely restricted context
- Per source-destination WFQ, total tenant weight = # VMs
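The PS-N weight rule can be sketched as follows (a hypothetical helper; it assumes each source-destination pair is listed once, so N_x is just the number of pairs VM x participates in):

```python
from collections import defaultdict

def ps_n_weights(pairs):
    """PS-N: the WFQ weight of a source-destination pair (s, d) is
    1/N_s + 1/N_d, where N_x = number of VMs that VM x talks to.
    Each VM contributes 1 in total across its pairs, so a tenant's
    weights sum to its number of communicating VMs."""
    degree = defaultdict(int)
    for s, d in pairs:
        degree[s] += 1
        degree[d] += 1
    return {(s, d): 1 / degree[s] + 1 / degree[d] for s, d in pairs}
```

For example, with A1 talking to both A2 and A3, each pair gets weight 1/2 + 1 = 1.5, and the tenant's total weight is 3 — its number of VMs.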
SLIDE 52
Proportional Sharing on Proximate Links (PS-P)
SLIDE 53
Proportional Sharing on Proximate Links (PS-P)
- Assumes a tree-based topology: traditional, fat-tree, VL2
(currently working on removing this assumption)
SLIDE 54
Proportional Sharing on Proximate Links (PS-P)
- Assumes a tree-based topology: traditional, fat-tree, VL2
(currently working on removing this assumption)
– Hose model
– Admission control
[Figure: hose model — each VM Ai is guaranteed bandwidth Bw_Ai to/from the rest of the tenant's VMs]
SLIDE 55
Proportional Sharing on Proximate Links (PS-P)
- Assumes a tree-based topology: traditional, fat-tree, VL2
(currently working on removing this assumption)
– Hose model
– Admission control
– Per source fair sharing towards tree root
SLIDE 56
Proportional Sharing on Proximate Links (PS-P)
- Assumes a tree-based topology: traditional, fat-tree, VL2
(currently working on removing this assumption)
– Hose model
– Admission control
– Per source fair sharing towards tree root
– Per destination fair sharing from tree root
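A rough sketch of the PS-P sharing rule as I read slides 55–56 (function names and the max-min helper are mine): on a link pointing up toward the tree root, capacity is shared fairly among the source VMs below it; on a link pointing down from the root, among the destination VMs below it.

```python
def fair_share(capacity, demands):
    """Max-min fair split of `capacity` among per-VM `demands` (dict)."""
    alloc = {v: 0.0 for v in demands}
    active = set(demands)
    remaining = capacity
    while active:
        share = remaining / len(active)
        capped = {v for v in active if demands[v] - alloc[v] <= share}
        if not capped:
            for v in active:
                alloc[v] += share
            break
        for v in capped:
            remaining -= demands[v] - alloc[v]
            alloc[v] = demands[v]
        active -= capped
    return alloc

def ps_p_uplink(capacity, src_demands):
    # Per-source fair sharing towards the tree root.
    return fair_share(capacity, src_demands)

def ps_p_downlink(capacity, dst_demands):
    # Per-destination fair sharing from the tree root.
    return fair_share(capacity, dst_demands)
```

Together with the hose model and admission control, this per-VM (rather than per-flow) sharing is what lets PS-P provide minimum guarantees while staying work-conserving.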
SLIDE 57 Deploying PS-L, PS-N and PS-P
– All allocations can use hardware queues (per tenant, per VM, or per source-destination)
– PS-N and PS-P can be deployed using CSFQ [Sigcomm'98]
– PS-N can be deployed using only hypervisors
– PS-P could be deployed using only hypervisors; we are currently working on it
SLIDE 58 Evaluation
- Small testbed + Click Modular Router
– 15 servers, 1 Gbps links
- Simulation
– 3200 nodes, flow-level simulator, Facebook MapReduce traces
SLIDE 59 Many to one
[Figure: one link, testbed; N VMs of tenant B send to one VM, competing with tenant A; Bw_A and Bw_B plotted vs. N]
One link, testbed
PS-P offers guarantees
SLIDE 60 MapReduce
One link, testbed
[Figure: Bw_A and Bw_B (Mbps) as B's split between M mappers and R reducers varies, with M + R = 10]
PS-L offers link proportionality
SLIDE 61
MapReduce
Network, simulation, Facebook trace
SLIDE 62
MapReduce
Network, simulation, Facebook trace
PS-N is close to network proportionality
SLIDE 63
MapReduce
Network, simulation, Facebook trace
PS-N and PS-P reduce shuffle time of small jobs by 10-15X
SLIDE 64 Conclusion
- Sharing cloud networks is not trivial
- First step towards a framework to analyze network
sharing in cloud computing
– Key goals (min guarantees, high utilization and proportionality), tradeoffs and properties
- New allocation policies whose properties are a superset of those of past work
– PS-L: link proportionality + high utilization
– PS-N: restricted network proportionality
– PS-P: min guarantees + high utilization