Open Infrastructure Summit, Shanghai
Nov 4, 2019 Ash Bhalgat (Sr. Director, Cloud Marketing, Mellanox Technologies) Yossef Efraim (Director, Software Development, Mellanox Technologies) Zhike Wang (JD Cloud IaaS Architect and Product Lead, JD.com)
Software Defined Security Reloaded
Agenda
▪ JD Cloud SmartNIC Requirements
▪ JD Cloud Conntrack Use Cases (Virtualized and Baremetal Cloud)
▪ JD Cloud SmartNIC CT Offload Performance in Production
Futuriom Survey: Security bottlenecks in SDN world
To Learn More: Download the “Untold Secrets of the Efficient Data Center” Futuriom Report
Software Defined Everything (SDX) Kills Performance
[Figure: CPU cores in three server configurations: Bare Metal; Virtualized & Software Defined; Software Defined, Hardware Accelerated. Labels: Application cores; cores lost to the Virtualization, Security & SDX Penalty; Smart NIC.]
Smart NIC Improves Security & Restores Server Application Performance!
Mellanox SmartNICs – an Acceleration Strategy
Basic NICs (Commodity NICs)
▪ Not programmable
▪ Stateless offloads
▪ 1G/10G NICs with CPU doing heavy lifting
ConnectX SmartNIC (ConnectX-5/6/6-Dx)
▪ Priced as per the value
▪ Best performance for price
▪ Built-in hardware offloads
▪ Extra flexibility, efficiency and performance
BlueField SmartNIC (BlueField 1 and 2)
▪ Highly customizable
▪ Leverage hardware accelerations
▪ Full programmability
[Figure legend: x86 cores available for the application; x86 cores processing packets (virtualization, security, storage)]
NIC Common Operations in Networking
▪ Most network functions share some data-path operations
  ▪ Packet classification (into flows)
  ▪ Action based on the classification result
▪ Mellanox SmartNICs offload both the classification and actions in hardware
[Figure: Packets In -> Classification A, Action A -> Classification B, Action B -> ... -> Classification N, Action N -> Processed Packets Out]
SDN: Flow Tables Overview
▪ Each table contains Match and Action Rule Entries
▪ Multiple tables
  ▪ Programmable table size
  ▪ Programmable table cascading
▪ Dedicated, isolated tables per hypervisor, VM, container, port
▪ Practically unlimited table size
▪ Can support millions of rules/flows
SDN: Flow Tables – Classify/Match Rules and Take Action
▪ Match Fields (5-tuple, 7-tuple)
  ▪ Ethernet Layer 2
    ▪ Destination MAC
    ▪ 2 outer VLANs / priority
    ▪ Ethertype
  ▪ IP (v4/v6)
    ▪ Source address
    ▪ Destination address
    ▪ Protocol / Next header
  ▪ TCP/UDP
    ▪ Source port
    ▪ Destination port
  ▪ Flexible fields extraction by Mellanox “Flexparse”
▪ Actions (ConnectX-5)
  ▪ Steering and Forwarding
  ▪ Drop / Allow
  ▪ Encap/Decap
    ▪ VXLAN, NVGRE, Geneve, NSH, MPLSoGRE/UDP
  ▪ Flex encap/decap
  ▪ Report Flow ID
  ▪ Header rewrite
  ▪ Hairpin mode
  ▪ Counter set
  ▪ Connection Tracking (new)
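The match/action model on the two flow-table slides can be sketched in a few lines of Python. This is only an illustrative software model, not the eSwitch implementation: the class names, the `goto:N` action string used for table cascading, and the example addresses are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    dst_mac: str
    ethertype: int
    src_ip: str
    dst_ip: str
    ip_proto: int
    src_port: int
    dst_port: int


@dataclass
class Rule:
    match: dict        # field name -> required value; missing fields are wildcards
    action: str        # hypothetical encoding: "drop", "forward:<port>", "goto:<table>"
    priority: int = 0
    hits: int = 0      # per-rule counter, in the spirit of the "Counter set" action


class FlowTable:
    def __init__(self):
        self.rules = []

    def add(self, rule):
        self.rules.append(rule)
        # Highest priority first; equal priorities keep insertion order.
        self.rules.sort(key=lambda r: -r.priority)

    def lookup(self, pkt):
        for rule in self.rules:
            if all(getattr(pkt, name) == value for name, value in rule.match.items()):
                rule.hits += 1
                return rule.action
        return None


def process(tables, pkt, max_hops=16):
    """Walk cascaded tables starting at table 0 until a terminal action."""
    table_id = 0
    for _ in range(max_hops):
        action = tables[table_id].lookup(pkt)
        if action is None:
            return "miss"
        if action.startswith("goto:"):
            table_id = int(action.split(":", 1)[1])
            continue
        return action
    return "drop"  # guard against goto loops


# Table 0 classifies on L2/L3 fields, table 1 on the L4 destination.
tables = [FlowTable(), FlowTable()]
tables[0].add(Rule({"ethertype": 0x0800, "ip_proto": 6}, "goto:1", priority=10))
tables[0].add(Rule({}, "drop"))
tables[1].add(Rule({"dst_ip": "10.0.0.2", "dst_port": 80}, "forward:vf1", priority=10))
tables[1].add(Rule({}, "drop"))

web = Packet("aa:bb:cc:dd:ee:01", 0x0800, "10.0.0.1", "10.0.0.2", 6, 40000, 80)
dns = Packet("aa:bb:cc:dd:ee:01", 0x0800, "10.0.0.1", "10.0.0.2", 17, 40000, 53)
print(process(tables, web))
print(process(tables, dns))
```

The TCP packet is cascaded from table 0 to table 1 and forwarded; the UDP packet only matches the wildcard rule in table 0 and is dropped.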
ASAP2: OVS Datapath Hardware Offload
▪ Mellanox Branding: Accelerated Switching and Packet Processing (ASAP2)
▪ Best of Both Worlds! Enable SR-IOV data path with OVS control plane
▪ Enable support for most SDN controllers with SR-IOV data plane
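[Figure: without hardware offload. VMs attach through SR-IOV VFs to the ConnectX-5 eSwitch; OVS runs in the hypervisor (user space and kernel) behind the PF. Data path labels: First Packet; Subsequent Packets (Slow Path), both through the OVS kernel slow path.]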
[Figure: with ASAP2 hardware offload, same topology. Data path labels: First Packet through the OVS kernel (Slow Path); Subsequent packets with HW Offload through the SmartNIC eSwitch (Fast Path).]
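The slow-path/fast-path split shown in the two diagrams can be modelled as a cache miss followed by rule installation. The sketch below is a toy model under that assumption; `SlowPathOVS`, `ESwitch`, and the policy dictionary are hypothetical names, and real offload is driven through the kernel rather than a direct method call.

```python
class SlowPathOVS:
    """Stands in for the OVS software datapath, which owns the forwarding policy."""

    def __init__(self, policy):
        self.policy = policy      # flow key -> action
        self.handled = 0          # packets that needed software processing

    def handle_miss(self, flow_key):
        self.handled += 1
        return self.policy.get(flow_key, "drop")


class ESwitch:
    """Stands in for the NIC embedded switch with a hardware flow table."""

    def __init__(self, slow_path):
        self.slow_path = slow_path
        self.hw_rules = {}        # offloaded flow key -> action
        self.hw_hits = 0

    def receive(self, flow_key):
        if flow_key in self.hw_rules:
            # Fast path: the rule is already in hardware, no CPU involved.
            self.hw_hits += 1
            return self.hw_rules[flow_key], "fast path"
        # Miss: the first packet of the flow goes to software, which decides
        # the action; the resulting rule is then installed in hardware.
        action = self.slow_path.handle_miss(flow_key)
        self.hw_rules[flow_key] = action
        return action, "slow path"


flow = ("10.0.0.1", "10.0.0.2", 6, 40000, 80)
ovs = SlowPathOVS({flow: "forward:vf1"})
nic = ESwitch(ovs)

paths = [nic.receive(flow)[1] for _ in range(3)]
print(paths)
```

Only the first packet of the flow reaches the software datapath; the following packets hit the installed rule.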
Basic Architecture: Linux TC_Flower API
[Figure: three layers. User Space: OVS vswitchd, connected over netlink. Kernel: TC_Flower, Driver, OVS Data Path. HW: eSwitch FDB rules table (match | op entries). Legend: HW offload control path; Standard control path.]
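OVS programs TC flower over netlink, but the same kind of rule can be written by hand with the `tc` command line, which makes the control path easier to picture. The sketch below only builds and prints the commands; it does not run them. The device name `enp3s0f0_0` and the choice of dropping TCP port 80 are assumptions for illustration.

```python
import shlex


def enable_ovs_hw_offload():
    """Command that turns on hardware offload in OVS (takes effect after an OVS restart)."""
    return ["ovs-vsctl", "set", "Open_vSwitch", ".", "other_config:hw-offload=true"]


def tc_flower_drop_rule(dev, dst_port):
    """Commands for an ingress TC flower rule that must be installed in hardware.

    `skip_sw` asks the kernel to place the rule only in hardware, so the
    command fails if the NIC driver cannot offload it.
    """
    return [
        ["tc", "qdisc", "add", "dev", dev, "ingress"],
        ["tc", "filter", "add", "dev", dev, "ingress",
         "protocol", "ip", "prio", "1",
         "flower", "skip_sw", "ip_proto", "tcp", "dst_port", str(dst_port),
         "action", "drop"],
    ]


# Hypothetical device name; substitute the actual netdev on a real system.
commands = [enable_ovs_hw_offload()] + tc_flower_drop_rule("enp3s0f0_0", 80)
for argv in commands:
    print(shlex.join(argv))
```

Keeping the commands as argument lists makes it straightforward to pass them to `subprocess.run` on a test host once the device name has been checked.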
OVS Performance: DPDK vs. ASAP2
▪ Mellanox OVS Offload (ASAP2)
  ▪ 20X higher performance than vanilla OVS
  ▪ 8X-10X better performance than OVS-DPDK
  ▪ Line rate performance at 25/40/50/100Gbps
▪ Open Source - No Vendor Lock-In
  ▪ Adopted broadly by Linux community & industry
  ▪ Full Community Support (OVS, Linux, OpenStack)
  ▪ Industry Ecosystem Support
    ▪ Nuage/Nokia, Red Hat, Ubuntu, Dell, F5, etc.
[Figure: "ASAP2: Highest Packet Rate with Zero CPU Load". Bar chart in million packets per second (higher is better): OVS over DPDK, 7.6 Mpps using 2 cores; OVS Offload (ASAP2), 66 Mpps with zero CPU load. Callout: 8X-10X better.]
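As a quick check, the two packet rates on the chart can be compared directly; the ratio falls inside the quoted 8X-10X range. The variable names are illustrative only.

```python
# Packet rates taken from the chart, in million packets per second.
ovs_dpdk_mpps = 7.6
asap2_mpps = 66.0

speedup = asap2_mpps / ovs_dpdk_mpps
print(f"ASAP2 vs OVS-DPDK: {speedup:.1f}x")
```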
Open Ecosystem Components
Open Source Components:
✓ Kernel code is upstream: Kernel 4.8+
✓ OVS code is upstream: OVS 2.8+
✓ OpenStack Release: Queens
Commercial Products:
✓ Mellanox SmartNICs: ConnectX-5 and BlueField
✓ Red Hat: RHEL 7.5+ and RHOSP 13 (Tech Preview)
✓ Nuage Networks: VSP 5.4.1
What is Connection Tracking (conntrack)?
OVS CT (connection tracking)
mechanism as the iptables/nftables
action is determined
Connection Tracking, packet sent to Connection Tracker Netfilter module
from CT Table sent to OVS
recirculated and steered as per the rule
ASAP2 CT Offload Concept
software (conntrack)
hardware offload programming
path (OVS-CT and Linux ConnTrack)
switched to VF by SmartNIC e-switch
hardware for two reasons:
NIC eSwitch (datapath)
vPorts
packets configuration
[Figure labels: NetDev1, NetDev2, NetDev n; VF 1, PF, VF 2; HW vendor driver; generic modified SW conntrack; flow-based CT DB (5-tuple, CT state); TC]
ASAP2 CT - OpenStack Integration
▪ All changes for OVS Connection Tracking offloads are transparent to OpenStack
▪ No OpenStack changes. Works seamlessly with no modifications!
▪ Just use the OVS Firewall Driver to provision OpenStack Security Groups
▪ It just works!
ASAP2 CT - Open Source Software Contributions
➢ Linux Kernel modules
  ✓ TC flower (support CT match/action)
  ✓ CT offload modules
  ✓ Netfilter
  ✓ Mellanox drivers
➢ OVS User-space
➢ Linux iproute2
Demo: Setup
▪ Load generator
  ▪ Runs TRex
▪ OVS host (24 cores)
  ▪ 1 core for slow-path (softirq+OVS)
  ▪ 1 VM, running DPDK over 4 vCPUs
  ▪ 1 VF passthrough to the VM via SR-IOV
  ▪ VXLAN is configured
▪ Both connected by ConnectX-5 Ex 100GbE
[Figure: load generator (TRex) connected over 100GbE, with VXLAN, to the OVS host (12 cores) running OVS and the VM]
Demo: OpenFlow Rules for Connection Tracking
▪ 'table=0, arp, action=normal'
▪ 'table=0, ip, ct_state=-trk, action=ct(table=1)'
▪ 'table=1, priority=1, ip, ct_state=+trk+new, action=ct(commit),normal'
▪ 'table=1, priority=1, ip, ct_state=+trk+est, action=normal'
[Figure: table 0 sends arp to action=normal and ip to CT; table 1 applies action=ct(commit) to new connections and action=normal to established connections]
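The four rules can be traced with a small Python model of the two-table pipeline. This is a deliberately simplified sketch: the `ConnTracker` class and the packet dictionaries are hypothetical, a connection is treated as established once a reply has been seen on a committed entry, and the real conntrack states beyond new/established (and all protocol-specific tracking) are left out.

```python
class ConnTracker:
    """Toy connection table keyed by a direction-independent 5-tuple."""

    def __init__(self):
        self.conns = {}  # key -> {"orig": (src, sport), "reply_seen": bool}

    @staticmethod
    def _key(pkt):
        ends = sorted([(pkt["src"], pkt["sport"]), (pkt["dst"], pkt["dport"])])
        return (pkt["proto"], ends[0], ends[1])

    def lookup(self, pkt):
        conn = self.conns.get(self._key(pkt))
        if conn is None:
            return "new"
        if (pkt["src"], pkt["sport"]) != conn["orig"]:
            conn["reply_seen"] = True   # traffic seen in the reply direction
        return "est" if conn["reply_seen"] else "new"

    def commit(self, pkt):
        self.conns.setdefault(
            self._key(pkt),
            {"orig": (pkt["src"], pkt["sport"]), "reply_seen": False},
        )


def pipeline(pkt, ct):
    """Return the list of rules a packet matches, in order."""
    if pkt["type"] == "arp":
        return ["table=0, arp, action=normal"]
    if pkt["type"] != "ip":
        return ["table=0, miss"]
    # Untracked IP packet: send it through conntrack, then recirculate to table 1.
    matched = ["table=0, ip, ct_state=-trk, action=ct(table=1)"]
    if ct.lookup(pkt) == "new":
        ct.commit(pkt)
        matched.append("table=1, ip, ct_state=+trk+new, action=ct(commit),normal")
    else:
        matched.append("table=1, ip, ct_state=+trk+est, action=normal")
    return matched


ct = ConnTracker()
request = {"type": "ip", "proto": 6, "src": "10.0.0.1", "sport": 40000,
           "dst": "10.0.0.2", "dport": 80}
reply = {"type": "ip", "proto": 6, "src": "10.0.0.2", "sport": 80,
         "dst": "10.0.0.1", "dport": 40000}

for pkt in (request, reply, request):
    print(pipeline(pkt, ct)[-1])
```

The first packet is committed as a new connection; the reply and every later packet in either direction match the established rule, which is the state a hardware offload can then serve from the fast path.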
Demo
▪ Best Case
▪ Worst Case
[Figure phases: Initial state; Slow Path; Slow Path to Fast Path; Fast Path]
Demo Results
miss)
Mellanox Lab Tested Results
Date 2019.11.04
Zhike Wang, Architect of IaaS, JD Cloud
➢ Why SmartNIC
➢ What are JD Cloud SmartNIC Solution Requirements?
➢ Flavors of SmartNIC
➢ Security Group & Conntrack
➢ Conntrack offload challenge
➢ Use Case scenario in JD Cloud
➢ JD Cloud SmartNIC Performance in production
➢ JD Cloud on-going work
Why SmartNIC
➢ Committed SLA
  ✓ High throughput and PPS
  ✓ Low latency and jitter
  ✓ Committed bandwidth
➢ Release host CPU resource
JD Cloud SmartNIC Solution Requirements
➢ Data path offload
  ✓ Including traffic through Security Group/Conntrack
➢ HW programmable/flexible
➢ Keep SmartNIC HW simple and robust
  ✓ Match and Forward/Drop
  ✓ Tunnel push/pop
  ✓ Packet modification
  ✓ QoS policer and shaper
➢ Keep complexity in SW
  ✓ New services are introduced frequently
  ✓ Some logic is hard for SmartNIC HW
Flavors of SmartNIC
Flavor        | Advantage                                              | Disadvantage
Multicore CPU | Easy to program                                        | Performance limited
NP            | Programmable, flexible                                 | Hard to program, Performance good
FPGA          | Programmable, flexible                                 | Hard to program, Performance good
ASIC          | Best performance, Software Programmable                | Cannot change HW logic
SoC           | Easy to program, Best performance, Software Programmable | Expensive due to on-board compute and memory
➢ Key point: Stateful firewall
➢ Key design:
  ✓ How is Conntrack state kept in sync between HW and SW?
  ✓ How does a partial match go to OVS?
➢ Solution and development:
  ✓ Joint work between MLNX and JD Cloud team
➢ Changed modules
  ✓ TC flower (support CT match/action)
  ✓ CT offload modules
  ✓ Kernel
  ✓ MLNX drivers & FW
JD Cloud SmartNIC Performance in Production
[Figure: two bar charts of packet rate (Mpps) against Number_of_conn_tracked. ASAP2 CT (Mpps): 10k, 100k and 500k connections, axis ticks 36 to 42. OVS_DPDK CT (Mpps): 4K, 8K and 20K connections, axis ticks 0.5 to 3.5. Callout: 15X Better.]
➢ 64-byte packet size
➢ 2 cores with HT
➢ Tested with Mellanox ConnectX-5 25G SmartNIC
JD Cloud On-going Work
➢ User space Conntrack Offloading
  ✓ Less kernel dependency
  ✓ Rapid recovery
  ✓ Easy maintenance
➢ Virtio Acceleration
  ✓ Live migration
  ✓ No special driver inside the VM
Key Takeaways
▪ Software Defined Security is Reloaded now with SmartNICs
  ▪ Performance and Efficiency
▪ SmartNICs efficiently augment existing software mechanisms through CT state hardware offloads for faster performance while saving CPU cores
▪ CT offloads are transparent to OpenStack and configured via open APIs (Linux_TC)
▪ JD Cloud achieved 1500% packet rate improvement for thousands of flows by deploying Mellanox ConnectX-5 and BlueField SmartNICs
  ▪ Both Virtualized and Baremetal JD Clouds are using Mellanox ASAP2 Security in production
▪ ASAP2 Security offloads improve performance of many Data Center security features/applications including security groups, firewall, NAT, DDoS, IDS/IPS, etc.