SIGCOMM 2018 Student Research Competition (Undergraduate Category)
Also in Proceedings of SIGCOMM Posters and Demos 2018
PAM: When Overloaded, Push Your Neighbor Aside!
Zili Meng, Jun Bi, Chen Sun, Shuhe Wang, Minhu Wang, Hongxin Hu
NFV — Bright Side vs. Dark Side
2
Dedicated hardware devices: VPN, Firewall, Monitor, Load Balancer
NFV: the same functions run in VMs on commodity hardware devices
Bright side: Low Cost, Flexibility, Scalability
Dark side: High Latency (Virtualization Techniques): 200 μs ~ 1 ms × 7 …… (Service Chain)
Accelerating NFV with SmartNICs
- NPU-based Multicore SmartNICs
– Netronome, Mellanox
– Offloading NFs to SmartNIC to improve performance.
– Easy to develop & debug
- NF Migration between SmartNIC and CPU
– SmartNIC may also be overloaded.
– UNO [SoCC’17]: Ensure consistency.
3
Existing Solutions Cause Performance Degradation
4
[Figure: Before migration (Monitor is overloaded). Firewall, Monitor, Logger, and Load Balancer are placed across the CPU and the SmartNIC, which are connected by PCIe.]
[Figure: Naive migration. Packets incur redundant transmissions over PCIe.]
Measurement of Transmission Latency
6
Packet transmission time is comparable to the packet processing time.
Service chain latency: 644 μs
One round-trip transmission: 100 μs
Can we reduce the additional transmission latency due to NF migration?
7
Key Novelty – Push Aside Migration
8
When overloaded, push your neighbor aside and occupy its resources.
Push Aside Migration
9
[Figure: Before migration (Monitor is overloaded) vs. Push Aside Migration. Firewall, Monitor, Logger, and Load Balancer are placed across the CPU and the SmartNIC over PCIe, with the packet path shown.]
At the service chain scope…
Can we reduce the additional transmission latency due to NF migration?
Yes, push border NFs away to make space for the overloaded NF.
10
Dynamic Scaling of Service Chains
11
[Figure: vNFs of a service chain placed on the CPU and the SmartNIC, with the border vNFs marked ℬ.]
Which border NF to migrate?
Greedy-based Border vNF Selection Algorithm
Goal: Minimize the Number of vNFs to Migrate
- Always select the border vNF with minimum capacity.
– Minimum capacity given fixed resources is equivalent to maximum resource consumption given fixed throughput.
- Constraints ensured:
– Overload on SmartNIC will be alleviated.
– Migration should not create new hot spots on CPU.
- Please refer to our paper for more details.
12
Evaluation – Latency Reduction
13
20% latency reduction
[Figure: service chain latency of FW, MON, LOG, LB under the Original, Naive, and PAM placements.]
Evaluation – Throughput Maintenance
14
[Figure: service chain throughput of FW, MON, LOG, LB under the Original, Naive, and PAM placements.]
throughput maintained
Discussions
- Suitability of vNFs on different devices.
– Some kinds of NFs may be suitable only for SmartNIC or only for CPU.
– Potential solution (ongoing work): Introduce suitability of NFs.
- Isolation on SmartNICs.
– Unlike on CPUs, there are no mature isolation mechanisms on SmartNICs.
– Potential solution (future work): Software isolation (NetBricks [OSDI’16])
- Precise analysis of PCIe and SmartNIC resources.
– Potential solution (future work): PCIe modelling (pcie-bench [SIGCOMM’18])
17
Applying PAM to Other Scenarios
- PAM aims to bring a new direction for NF scaling.
- When multiple NFs share resources, the overloaded NF can push other NFs away and automatically preempt their resources for scaling.
18
[Figure: NF1, NF2, NF3, NF4, and NF3’ (sync) in a VM.]
Applying PAM to Other Scenarios
19
- PAM is designed for the scenario of SmartNIC-CPU cooperation.
- Can it be extended to other application scenarios?
– Multiple kinds of devices, e.g., GPU and FPGA.
Future Thoughts on Selecting NFs to Migrate
- PAM is a heuristic; which NF to migrate remains an open problem.
- Can we improve the performance further and globally?
– Inspired by the scheduling of cluster jobs, use reinforcement learning for further performance improvement (DeepRM [HotNets’16]).
20
[Figure: reinforcement learning loop between Agent and Environment. State: network traffic, vNF load (observations from the environment). Action a: vNFs to migrate. Reward r: end-to-end performance. Policy $\pi_\theta(s, a)$.]
21
[Figure: reinforcement learning loop between Agent and Environment. State: network traffic, vNF load (observations from the environment), fed through a State Embedding into a Policy Network. Action a: vNFs to migrate. Reward r: end-to-end performance.]
Conclusion & Takeaway
22
Problem: Migration between SmartNIC and CPU degrades performance.
Intuition: When one NF is overloaded, we can migrate other NFs away and grab their resources to alleviate the hot spot.
Question: Which NFs to migrate?
Answer: Migrate NFs on the border between SmartNIC and CPU with minimum capacity.
Evaluation: 18% latency benefits.
Thank you!
www.zilimeng.info mengzl15@mails.tsinghua.edu.cn
Backup Slides
24
Resource Analysis
25
Throughput capacity $\theta_i^{\mathcal{C}}$, $\theta_i^{\mathcal{S}}$: throughput capacity of vNF $i$ on CPU ($\mathcal{C}$) or SmartNIC ($\mathcal{S}$).

vNF $i$             $\theta_i^{\mathcal{S}}$    $\theta_i^{\mathcal{C}}$
Firewall            10 Gbps     4 Gbps
Logger              2 Gbps      4 Gbps
Monitor             3.2 Gbps    10 Gbps
Load Balancer       >10 Gbps    4 Gbps
Payload Analyzer    5 Gbps      200 Mbps
Resource Analysis
27
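Assumption
Resource utilization of a vNF increases linearly with its throughput:
$r_i^{\mathcal{S}} = \frac{\theta_{cur}}{\theta_i^{\mathcal{S}}}, \quad r_i^{\mathcal{C}} = \frac{\theta_{cur}}{\theta_i^{\mathcal{C}}}$
Deduction
The capacity $\theta'$ of the chain $E_1 \to E_2$:
$\frac{\theta'}{\theta_1^{\mathcal{S}}} + \frac{\theta'}{\theta_2^{\mathcal{S}}} = 1 \;\Rightarrow\; \theta' = \frac{\theta_1^{\mathcal{S}} \theta_2^{\mathcal{S}}}{\theta_1^{\mathcal{S}} + \theta_2^{\mathcal{S}}}$
For “Payload Analyzer → Monitor”: $\theta'_{measure} = 1.8$ Gbps ≈ $\theta'_{theory} = 1.9$ Gbps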
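The deduction above can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are made up for illustration, and extending the two-vNF formula to any number of co-located vNFs is an assumption that follows from the same linear-utilization model. The capacities are the SmartNIC values listed in the Resource Analysis table.

```python
# SmartNIC throughput capacities in Gbps, as listed in the Resource Analysis table.
# Load Balancer is omitted because its capacity is only given as ">10 Gbps".
SMARTNIC_CAPACITY_GBPS = {
    "Firewall": 10.0,
    "Logger": 2.0,
    "Monitor": 3.2,
    "Payload Analyzer": 5.0,
}


def utilization(current_gbps, capacity_gbps):
    """Linear assumption from the slide: r_i = theta_cur / theta_i."""
    return current_gbps / capacity_gbps


def chain_capacity(capacities_gbps):
    """Throughput theta' at which the utilizations of co-located vNFs sum to 1.

    Solves sum(theta' / theta_i) = 1, i.e. theta' = 1 / sum(1 / theta_i).
    For two vNFs this reduces to theta_1 * theta_2 / (theta_1 + theta_2).
    """
    return 1.0 / sum(1.0 / c for c in capacities_gbps)


theory = chain_capacity([
    SMARTNIC_CAPACITY_GBPS["Payload Analyzer"],
    SMARTNIC_CAPACITY_GBPS["Monitor"],
])
print(f"{theory:.2f} Gbps")  # prints 1.95 Gbps
```

The result, about 1.95 Gbps, agrees with the theoretical value of roughly 1.9 Gbps quoted on the slide for "Payload Analyzer → Monitor".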
Border vNF Selection Algorithm
Step 1: Border vNFs Identification
- ℬ: border elements on SmartNIC in a service chain (graph).
- Check whether each NF is placed on the same device as its upstream/downstream NFs.
28
[Figure: vNFs of a service chain placed on the CPU and the SmartNIC, with the border vNFs marked ℬ.]
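Step 1 can be sketched as follows. This is an illustrative reading rather than the authors' implementation: it assumes a linear service chain (the slide allows a general graph), and the function name `border_vnfs`, the `placement` dictionary, and the example placement are all hypothetical.

```python
def border_vnfs(chain, placement):
    """Return the SmartNIC vNFs that have an upstream or downstream neighbor on the CPU.

    chain: list of vNF names in service-chain order (linear chain assumed).
    placement: dict mapping each vNF name to "smartnic" or "cpu".
    """
    border = []
    for idx, nf in enumerate(chain):
        if placement[nf] != "smartnic":
            continue
        neighbors = []
        if idx > 0:
            neighbors.append(chain[idx - 1])  # upstream NF
        if idx < len(chain) - 1:
            neighbors.append(chain[idx + 1])  # downstream NF
        if any(placement[n] == "cpu" for n in neighbors):
            border.append(nf)
    return border


# Hypothetical placement, chosen only to illustrate the check.
chain = ["Firewall", "Monitor", "Logger", "Load Balancer"]
placement = {
    "Firewall": "smartnic",
    "Monitor": "smartnic",
    "Logger": "smartnic",
    "Load Balancer": "cpu",
}
print(border_vnfs(chain, placement))  # prints ['Logger']
```

In this placement only Logger sits next to an NF on the CPU, so it is the only border element.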
Border vNF Selection Algorithm
Step 2: Migration vNFs Selection
- Select the NF with minimum capacity.
– Intuition: migrating the NF with minimum capacity will alleviate overload more efficiently.
$b_0 = \arg\min_{b \in \mathcal{B}} \theta_b^{\mathcal{S}}$
29
Border vNF Selection Algorithm
Step 3: Overload Alleviation Check
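- (ℂ1): Migration should not cause new hot spots on CPU.
$\sum_{i \in \text{NFs on } \mathcal{C}} \frac{\theta_{cur}}{\theta_i^{\mathcal{C}}} + \frac{\theta_{cur}}{\theta_{b_0}^{\mathcal{C}}} < 1$
- Migrate $b_0$ if (ℂ1) is satisfied. Otherwise go back to Step 2.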
30
Border vNF Selection Algorithm
Step 3: Overload Alleviation Check
- (ℂ2): The overload on SmartNIC should be alleviated.
$\sum_{i \in \text{NFs on } \mathcal{S},\, i \neq b_0} \frac{\theta_{cur}}{\theta_i^{\mathcal{S}}} < 1$
- Algorithm ends if (ℂ2) is satisfied. Otherwise go back to Step 2.
31
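The three steps can be put together in a short sketch. This is one possible reading under stated assumptions, not the authors' implementation: the chain is linear, border vNFs are recomputed after every migration, a border vNF that fails (ℂ1) is skipped in later rounds, and Load Balancer's SmartNIC capacity (given only as ">10 Gbps") is set to 10 Gbps. All function names, the placement, and the traffic rate are hypothetical.

```python
def border_vnfs(chain, placement):
    """Step 1: SmartNIC vNFs with an upstream or downstream neighbor on the CPU."""
    border = []
    for idx, nf in enumerate(chain):
        if placement[nf] != "smartnic":
            continue
        neighbors = chain[max(idx - 1, 0):idx] + chain[idx + 1:idx + 2]
        if any(placement[n] == "cpu" for n in neighbors):
            border.append(nf)
    return border


def select_migrations(chain, placement, cap_nic, cap_cpu, theta_cur):
    """Greedily pick border vNFs to move to the CPU until the SmartNIC is not overloaded.

    Returns the list of migrated vNFs, or None if no border vNF can be moved
    without violating (C1) while the SmartNIC is still overloaded.
    """
    placement = dict(placement)  # work on a copy
    migrated, rejected = [], set()

    def load(device, caps):
        # Linear model: utilization of each vNF is theta_cur / capacity.
        return sum(theta_cur / caps[nf] for nf in chain if placement[nf] == device)

    # (C2): stop once the remaining SmartNIC vNFs fit, i.e. their utilization sum is < 1.
    while load("smartnic", cap_nic) >= 1.0:
        candidates = [nf for nf in border_vnfs(chain, placement) if nf not in rejected]
        if not candidates:
            return None
        # Step 2: border vNF with minimum SmartNIC capacity.
        b0 = min(candidates, key=lambda nf: cap_nic[nf])
        # (C1): the CPU must not become a new hot spot after receiving b0.
        if load("cpu", cap_cpu) + theta_cur / cap_cpu[b0] >= 1.0:
            rejected.add(b0)
            continue
        placement[b0] = "cpu"
        migrated.append(b0)
    return migrated


chain = ["Firewall", "Monitor", "Logger", "Load Balancer"]
placement = {"Firewall": "smartnic", "Monitor": "smartnic",
             "Logger": "smartnic", "Load Balancer": "cpu"}  # hypothetical placement
# Capacities in Gbps from the Resource Analysis table (Load Balancer on SmartNIC assumed 10).
cap_nic = {"Firewall": 10.0, "Monitor": 3.2, "Logger": 2.0, "Load Balancer": 10.0}
cap_cpu = {"Firewall": 4.0, "Monitor": 10.0, "Logger": 4.0, "Load Balancer": 4.0}

result = select_migrations(chain, placement, cap_nic, cap_cpu, theta_cur=1.2)
print(result)  # prints ['Logger']
```

At 1.2 Gbps the SmartNIC utilization sums to about 1.1, so it is overloaded. Logger is the only border vNF, moving it keeps the CPU at 0.6, and the SmartNIC drops to about 0.5, so a single migration suffices in this example.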
Policy Gradient Algorithm – REINFORCE
33
From https://lilianweng.github.io/lil-log/2018/04/08/policy-gradient-algorithms.html#reinforce