Software Defined Security Reloaded – Open Infrastructure Summit, Shanghai


SLIDE 1

Software Defined Security – Reloaded

Open Infrastructure Summit, Shanghai
Nov 4, 2019

Ash Bhalgat (Sr. Director, Cloud Marketing, Mellanox Technologies)
Yossef Efraim (Director, Software Development, Mellanox Technologies)
Zhike Wang (JD Cloud IaaS Architect and Product Lead, JD.com)

SLIDE 2

Agenda

▪ Mellanox ASAP2 Security
  ▪ Futuriom Survey: Security bottlenecks in the SDN world
  ▪ Network virtualization challenges
  ▪ Mellanox ASAP2 overview
  ▪ Traditional connection tracking: Linux and OVS
  ▪ ASAP2 Security: Efficient conntrack offloads
  ▪ Demo and benchmark comparison
▪ JD Cloud: ASAP2 Security Deployment
  ▪ JD Cloud SmartNIC requirements
  ▪ JD Cloud conntrack use cases (virtualized and bare-metal cloud)
  ▪ JD Cloud SmartNIC CT offload performance in production
▪ Key Takeaways

SLIDE 3

Futuriom Survey: Security Bottlenecks in the SDN World

To learn more, download the Futuriom report "Untold Secrets of the Efficient Data Center".

SLIDE 4

Software Defined Everything (SDX) Kills Performance

[Diagram: server CPU cores in three scenarios. Bare metal: all cores run the application. Virtualized and software defined: many cores are consumed by the virtualization, security, and SDX penalty. Software defined and hardware accelerated: a SmartNIC absorbs that penalty and returns the cores to the application.]

SmartNIC improves security and restores server application performance!

SLIDE 5

Mellanox SmartNICs – an Acceleration Strategy

▪ Basic NICs (commodity): not programmable, stateless offloads only; 1G/10G NICs with the CPU doing the heavy lifting
▪ ConnectX SmartNIC (ConnectX-5/6/6-Dx): priced per value, best performance for the price; built-in hardware offloads for extra flexibility, efficiency, and performance
▪ BlueField SmartNIC (BlueField 1 and 2): highly customizable; leverages hardware accelerations with full programmability

[Diagram legend: x86 cores available for the application vs. x86 cores spent processing packets for virtualization, security, and storage]

SLIDE 6

Common NIC Operations in Networking

▪ Most network functions share the same data-path operations:
  ▪ Packet classification (into flows)
  ▪ An action based on the classification result
▪ Mellanox SmartNICs offload both the classification and the actions in hardware

[Diagram: packets in → Classification A/B/…/N → Action A/B/…/N → processed packets out]

SLIDE 7

SDN: Flow Tables Overview

▪ Each table contains Match and Action rule entries
▪ Multiple tables
  ▪ Programmable table size
  ▪ Programmable table cascading
▪ Dedicated, isolated tables per hypervisor, VM, container, or port
▪ Practically unlimited table size
▪ Can support millions of rules/flows

SLIDE 8

SDN: Flow Tables – Classify/Match Rules and Take Action

▪ Match Fields (5-tuple, 7-tuple) – see the sketch after this list
  ▪ Ethernet Layer 2: destination MAC, 2 outer VLANs / priority, Ethertype
  ▪ IP (v4/v6): source address, destination address, protocol / next header
  ▪ TCP/UDP: source port, destination port
  ▪ Flexible field extraction via Mellanox "Flexparse"
▪ Actions (ConnectX-5)
  ▪ Steering and forwarding
  ▪ Drop / Allow
  ▪ Encap/decap: VXLAN, NVGRE, Geneve, NSH, MPLSoGRE/UDP
  ▪ Flex encap/decap
  ▪ Report flow ID
  ▪ Header rewrite
  ▪ Hairpin mode
  ▪ Counter set
  ▪ Connection tracking (new)
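A match/action pair of this kind can be expressed through the Linux TC flower classifier, which Mellanox NICs offload to the eSwitch. A minimal sketch, assuming a representor netdev named enp3s0f0_0 (the interface name and addresses are illustrative):

```shell
# Enable TC hardware offload on the port (interface name is illustrative)
ethtool -K enp3s0f0_0 hw-tc-offload on

# Match an IPv4/TCP 5-tuple and drop it; skip_sw asks for hardware-only placement
tc filter add dev enp3s0f0_0 ingress protocol ip prio 1 flower \
    skip_sw \
    src_ip 10.0.0.5 dst_ip 10.0.0.7 \
    ip_proto tcp src_port 5001 dst_port 80 \
    action drop
```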

SLIDE 9

ASAP2: OVS Datapath Hardware Offload

▪ Mellanox branding: Accelerated Switching and Packet Processing (ASAP2)
▪ Best of both worlds! Enables the SR-IOV data path with the OVS control plane
▪ Enables support for most SDN controllers with an SR-IOV data plane

[Diagram: VMs attach to SR-IOV VFs on the ConnectX-5 eSwitch in the hypervisor; ovs-vswitchd in user space drives the OVS kernel datapath (slow path) through the PF. Without offload, the first packet and all subsequent packets take the slow path.]

SLIDE 10

ASAP2: OVS Datapath Hardware Offload (with Offload Enabled)

▪ Mellanox branding: Accelerated Switching and Packet Processing (ASAP2)
▪ Best of both worlds! Enables the SR-IOV data path with the OVS control plane
▪ Enables support for most SDN controllers with an SR-IOV data plane

[Diagram: same topology with hardware offload enabled. The first packet still traverses ovs-vswitchd and the OVS kernel datapath (slow path); subsequent packets are switched by the SmartNIC eSwitch (fast path) directly to the SR-IOV VF.]
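On the host, switching ASAP2 on amounts to one OVS knob plus TC offload on the uplink. A minimal sketch, assuming a PF named enp3s0f0 (interface and service unit names vary by distribution):

```shell
# Enable OVS hardware offload (requires an OVS restart to take effect)
ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
systemctl restart openvswitch   # the unit may be "openvswitch-switch" on Debian/Ubuntu

# Enable TC hardware offload on the ConnectX PF
ethtool -K enp3s0f0 hw-tc-offload on

# Confirm the setting stuck
ovs-vsctl get Open_vSwitch . other_config:hw-offload
```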

SLIDE 11

Basic Architecture: Linux TC_Flower API

[Diagram: ovs-vswitchd in user space programs rules over netlink into the kernel TC_Flower classifier; the driver mirrors them into the eSwitch FDB and rules table in hardware. The OVS datapath remains the standard control path; TC_Flower is the hardware offload control path.]
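The rules that ovs-vswitchd pushes down this path can be observed from the TC side; entries placed in hardware are flagged in the dump. A sketch, with an illustrative representor name:

```shell
# Show the flower rules OVS installed via netlink, with statistics;
# entries offloaded to the eSwitch are marked "in_hw"
tc -s filter show dev enp3s0f0_0 ingress
```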

SLIDE 12

OVS Performance: DPDK vs. ASAP2

▪ Mellanox OVS Offload (ASAP2)
  ▪ 20X higher performance than vanilla OVS
  ▪ 8X-10X better performance than OVS-DPDK
  ▪ Line-rate performance at 25/40/50/100Gbps
▪ Open source – no vendor lock-in
  ▪ Adopted broadly by the Linux community and industry
  ▪ Full community support (OVS, Linux, OpenStack)
  ▪ Industry ecosystem support: Nuage/Nokia, Red Hat, Ubuntu, Dell, F5, etc.

[Chart: OVS over DPDK reaches 7.6 Mpps using 2 cores; OVS Offload (ASAP2) reaches 66 Mpps with zero CPU load. 8X-10X better; higher is better.]

ASAP2: highest packet rate with zero CPU load

SLIDE 13

Open Ecosystem Components

Open source components:
✓ Kernel code is upstream: kernel 4.8+
✓ OVS code is upstream: OVS 2.8+
✓ OpenStack release: Queens

Commercial products:
✓ Mellanox SmartNICs: ConnectX-5 and BlueField
✓ Red Hat: RHEL 7.5+ and RHOSP 13 (Tech Preview)
✓ Nuage Networks: VSP 5.4.1

SLIDE 14

What is Connection Tracking (conntrack)?

  • Tracks connections and stores information about the state of each connection
  • For each packet, finds the connection in the DB or creates a new entry
  • The CT state of a packet can be one of (see the sketch after this list):
  • New – the connection is starting (SYN for TCP)
  • Established – the connection has already been established
  • Related – the connection is related to an established connection
  • Invalid – the packet does not follow the expected behavior of a connection
  • CT is also used for NAT
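For reference, the kernel's connection table and these states can be inspected with the standard conntrack tool (from the conntrack-tools package); a quick sketch:

```shell
# List tracked connections with their protocol states
conntrack -L

# Stream conntrack events (new/update/destroy) as they happen
conntrack -E
```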
SLIDE 15

OVS CT (Connection Tracking)

  • OVS CT uses the same conntrack mechanism as iptables/nftables
  • Step 1: The incoming packet is classified and an action is determined
  • Step 2: If there is an OVS action for connection tracking, the packet is sent to the connection tracker (Netfilter) module
  • Step 3: Connection state information from the CT table is sent to OVS
  • Step 4: The packet, with CT metadata, is recirculated and steered as per the rule
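Since OVS CT rides on the same Netfilter conntrack, the equivalent stateful-firewall logic can be written in nftables terms for comparison. A minimal sketch (table and chain names are illustrative):

```shell
# Stateful firewall using the same Netfilter conntrack engine
nft add table inet fw
nft add chain inet fw input '{ type filter hook input priority 0; policy drop; }'
nft add rule inet fw input ct state established,related accept
nft add rule inet fw input ct state new tcp dport 22 accept
```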

SLIDE 16

ASAP2 CT Offload Concept

  • Connection establishment is done by software (conntrack)
  • Hardware enforces the CT state, augmenting software CT
  • Control plane: the Linux TC API is extended for CT hardware offload programming
  • The first packet in a connection follows the slow path (OVS CT and Linux conntrack)
  • Data plane: subsequent packets are fast-switched to the VF by the SmartNIC eSwitch
  • The software CT state table is replicated in hardware for two reasons:
  • Fast switching/forwarding (match : action)
  • Saving CPU cores through hardware offloads

[Diagram: the generic, modified software conntrack feeds CT state into a flow-based DB keyed by 5-tuple; TC programs the HW vendor driver, which populates the NIC eSwitch datapath serving the PF and VF netdevs/vPorts.]
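The CT extension to TC mentioned above (flower ct_state matching plus the ct action) can be exercised directly on recent kernels. A minimal sketch of the rule shape, not the exact rules ASAP2 installs (device name is illustrative):

```shell
# Chain 0: send untracked IP traffic through conntrack, then continue in chain 1
tc filter add dev enp3s0f0_0 ingress chain 0 prio 1 proto ip \
    flower ct_state -trk \
    action ct pipe action goto chain 1

# Chain 1: pass packets that belong to an established connection
tc filter add dev enp3s0f0_0 ingress chain 1 prio 1 proto ip \
    flower ct_state +trk+est \
    action pass
```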

SLIDE 17

ASAP2 CT - OpenStack Integration

▪ All changes for OVS connection tracking offloads are transparent to OpenStack
▪ No OpenStack changes; it works seamlessly with no modifications
▪ Just use the OVS firewall driver to provision OpenStack Security Groups
▪ It just works!
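Concretely, the only knob is the Neutron OVS agent's firewall driver; security groups are then created as usual. A sketch, assuming the default config path and agent unit name on the compute node:

```shell
# Select the native OVS (conntrack-based) firewall driver for the Neutron agent
crudini --set /etc/neutron/plugins/ml2/openvswitch_agent.ini \
    securitygroup firewall_driver openvswitch
systemctl restart neutron-openvswitch-agent   # unit name varies by distribution

# Security group rules provision exactly as before, e.g. allow SSH ingress
openstack security group rule create --ingress --protocol tcp --dst-port 22 default
```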

SLIDE 18

ASAP2 CT - Open Source Software Contributions

➢ Linux kernel modules
  ✓ TC flower (CT match/action support)
  ✓ CT offload modules
  ✓ Netfilter
  ✓ Mellanox drivers
➢ OVS user space
➢ Linux iproute2

SLIDE 19

Demo: Setup

▪ Load generator: runs T-Rex (launch sketch below)
▪ OVS host (24 cores)
  ▪ 1 core for the slow path (softirq + OVS)
  ▪ 1 VM running DPDK over 4 vCPUs
  ▪ 1 VF passed through to the VM via SR-IOV
  ▪ VXLAN is configured
▪ Both connected by ConnectX-5 Ex 100GbE

[Diagram: T-Rex load generator connected at 100GbE to the OVS host; the VM hangs off OVS over a VXLAN tunnel.]
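For context, a T-Rex run of this shape would be launched roughly as follows; the profile, rate multiplier, and core count here are illustrative, not the actual demo settings:

```shell
# Start T-Rex with a bundled profile for 60 seconds
# (-m scales the traffic rate, -c sets the number of datapath cores)
./t-rex-64 -f cap2/dns.yaml -m 1000 -d 60 -c 4
```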

SLIDE 20

Demo: OpenFlow Rules for Connection Tracking

▪ Trivial connection tracking rules (installed via ovs-ofctl; see the sketch below):
  ▪ 'table=0, arp, action=normal'
  ▪ 'table=0, ip, ct_state=-trk, action=ct(table=1)'
  ▪ 'table=1, priority=1, ip, ct_state=+trk+new, action=ct(commit),normal'
  ▪ 'table=1, priority=1, ip, ct_state=+trk+est, action=normal'

[Diagram: table 0 sends ARP to action=normal and untracked IP to CT; table 1 commits new connections and forwards established ones.]
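These rules are installed with ovs-ofctl; a sketch assuming the bridge is named br0:

```shell
# Install the demo conntrack pipeline on bridge br0
ovs-ofctl add-flow br0 "table=0, arp, action=normal"
ovs-ofctl add-flow br0 "table=0, ip, ct_state=-trk, action=ct(table=1)"
ovs-ofctl add-flow br0 "table=1, priority=1, ip, ct_state=+trk+new, action=ct(commit),normal"
ovs-ofctl add-flow br0 "table=1, priority=1, ip, ct_state=+trk+est, action=normal"

# Confirm the flows and watch their packet counters
ovs-ofctl dump-flows br0
```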

SLIDE 21

Demo

Best Case

Worst Case

SLIDE 22

Initial state

SLIDE 23

Slow Path

SLIDE 24

Slow Path to Fast Path

SLIDE 25

Fast Path

SLIDE 26

Demo Results

▪ 5k UDP streams over OVS with HW offload
  ▪ Best-case performance from the HW perspective
  ▪ 45 Mpps
▪ 200k UDP streams over OVS with HW offload
  ▪ Worst-case performance from the HW perspective (100% cache miss)
  ▪ 15 Mpps

Mellanox Lab Tested Results
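To see how much of such a run is actually in hardware, the OVS host's datapath flows can be split by placement while traffic flows; a sketch:

```shell
# Count flows running in the eSwitch vs. in the kernel datapath
ovs-appctl dpctl/dump-flows type=offloaded | wc -l
ovs-appctl dpctl/dump-flows type=ovs | wc -l

# Per-flow packet/byte counters for the offloaded entries
ovs-appctl dpctl/dump-flows type=offloaded
```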

SLIDE 27

JD Cloud SmartNIC and Conntrack Offload

Zhike Wang, IaaS Architect, JD Cloud
Nov 4, 2019

SLIDE 28

Agenda

➢ Why SmartNIC?
➢ What are JD Cloud's SmartNIC solution requirements?
➢ Flavors of SmartNIC
➢ Security Groups & conntrack
➢ Conntrack offload challenges
➢ Use-case scenarios in JD Cloud
➢ JD Cloud SmartNIC performance in production
➢ JD Cloud ongoing work

SLIDE 29

Why SmartNIC?

➢ Committed SLA
  ✓ High throughput and PPS
  ✓ Low latency and jitter
  ✓ Committed bandwidth
➢ Releases host CPU resources

SLIDE 30

JD Cloud SmartNIC Solution Requirements

➢ Data-path offload
  ✓ Including traffic through Security Groups/conntrack
➢ Programmable, flexible hardware
➢ Keep the SmartNIC hardware simple and robust
  ✓ Match and forward/drop
  ✓ Tunnel push/pop
  ✓ Packet modification
  ✓ QoS policer and shaper
➢ Keep complexity in software
  ✓ New services are introduced frequently
  ✓ Some logic is hard to implement in SmartNIC hardware

SLIDE 31

Flavors of SmartNIC Implementation

Flavor        | Advantages                                               | Disadvantages
Multicore CPU | Easy to program                                          | Performance limited
NP            | Programmable, flexible; good performance                 | Hard to program
FPGA          | Programmable, flexible; good performance                 | Hard to program
ASIC          | Best performance; software programmable                  | Cannot change hardware logic
SoC           | Easy to program; best performance; software programmable | Expensive due to on-board compute and memory

SLIDE 32

Use Case: Security Group & Conntrack

SLIDE 33

Security Group & Conntrack Challenges

➢ Key point: stateful firewall
➢ Key design questions:
  ✓ How does conntrack state stay in sync between hardware and software?
  ✓ How do partial matches get punted to OVS?
➢ Solution and development:
  ✓ Joint work between the Mellanox and JD Cloud teams
➢ Changed modules:
  ✓ TC flower (CT match/action support)
  ✓ CT offload modules
  ✓ Kernel
  ✓ Mellanox drivers & firmware

SLIDE 34

Use-Case Scenarios in JD Cloud

SLIDE 35

JD Cloud – SmartNIC CT Performance in Production

[Charts: ASAP2 CT sustains packet rates in the 36-42 Mpps range at 10k, 100k, and 500k tracked connections, while OVS-DPDK CT stays in the 0.5-3.5 Mpps range at 4K, 8K, and 20K tracked connections (15X better).]

➢ 64-byte packet size
➢ 2 cores with hyper-threading
➢ Tested with a Mellanox ConnectX-5 25G SmartNIC

SLIDE 36

JD Cloud Ongoing Work

➢ User-space conntrack offloading
  ✓ Less kernel dependency
  ✓ Rapid recovery
  ✓ Easy maintenance
➢ Virtio acceleration
  ✓ Live migration
  ✓ No special driver inside the VM

SLIDE 37

Key Takeaways

▪ Software Defined Security is reloaded with SmartNICs: performance and efficiency
▪ SmartNICs efficiently augment existing software mechanisms through CT-state hardware offloads, speeding up the data path while saving CPU cores
▪ CT offloads are transparent to OpenStack and configured via open APIs (Linux TC)
▪ JD Cloud achieved a 1500% packet-rate improvement for thousands of flows by deploying Mellanox ConnectX-5 and BlueField SmartNICs
▪ Both the virtualized and bare-metal JD Clouds use Mellanox ASAP2 Security in production
▪ ASAP2 Security offloads improve the performance of many data center security features and applications, including security groups, firewall, NAT, DDoS, and IDS/IPS

SLIDE 38

THANKS!