The Need for Complex Analytics from Forwarding Pipelines
Tom Tofigh, AT&T; Nic Viljoen, Netronome; Bryan Sullivan, AT&T
Agenda
- Problem Statement
- Gaps in Real Time Observability
- Proposed SDN Based Observability
- Importance of Real-time Programmable Analytics
- Data Plane Programmability for Complex Analytics
- Programmable NIC Cards
- Summary
Problem Statement
- Require real-time observability at the data plane and control plane levels
- Require programmable, granular systems without the unscalable approach of metering all the data all the time
Looking for the Call Drop Reason!
Gaps: Dynamic & Real-Time Programmable Analytics
- Achieve autonomous control through programmable data plane analytics
- Real-time dynamic instrumentation: virtual probes that gather trend data
- Targets specific flows, SOC/SmartNICs, VMs or containers for observation
- Enables instant root cause analysis
- Provide scalable solutions for fine-grained observation
Autonomous Control System Concept
Measure Analyze
Proposed Evolution for Dynamic Probing
Dynamic Probe & Measurement Examples
QoE
- Flow jitter, latency measurement
- Packet drop rate
- Application analysis
- DDoS detection
- Deep packet inspection
- Stateful flow monitor
Customer Care
- Custom statistics
- Flow tracing
- Root cause analysis
Optimization
- Load estimation
- Traffic matrix calculation
- Elephant flow identification
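As one illustration of the measurements listed above, elephant flow identification can be sketched as a threshold on each flow's share of total bytes. This is a minimal sketch, assuming per-flow byte counters have already been collected; the function name, the 10% cutoff, and the sample counters are illustrative assumptions and do not come from the slides.

```python
from typing import Dict, List, Tuple

# Hypothetical flow key: (src_ip, dst_ip, src_port, dst_port, protocol)
FlowKey = Tuple[str, str, int, int, str]


def find_elephant_flows(byte_counts: Dict[FlowKey, int],
                        share_threshold: float = 0.10) -> List[FlowKey]:
    """Return flows whose share of total bytes is at least share_threshold.

    The 10% default is an assumed cutoff for illustration only.
    Results are ordered from the largest flow to the smallest.
    """
    total = sum(byte_counts.values())
    if total == 0:
        return []
    elephants = [key for key, count in byte_counts.items()
                 if count / total >= share_threshold]
    return sorted(elephants, key=lambda key: byte_counts[key], reverse=True)


# Made-up counters for four flows.
counters = {
    ("10.0.0.1", "10.0.1.9", 40000, 443, "tcp"): 9_000_000,
    ("10.0.0.2", "10.0.1.9", 40001, 443, "tcp"): 50_000,
    ("10.0.0.3", "10.0.1.7", 40002, 53, "udp"): 2_000,
    ("10.0.0.4", "10.0.1.8", 40003, 80, "tcp"): 1_500_000,
}
elephants = find_elephant_flows(counters)
```

With these sample counters, the flows from 10.0.0.1 and 10.0.0.4 exceed the assumed 10% share and are reported as elephants.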
[Figure labels: compile, disseminate, configure, collect, analyze, present; dynamic P4 query; models; complex analytics]
Security
ACORD Observability @ L0 – L7
[Figure labels: ROADM (Core); Leaf-Spine Fabric (spine routers, leaf routers); VMs on OVS; SmartNIC; GPON (Access); PON OLT MACs; Measurement Abstraction Interface; Analytics Platform (XOS + Services); Apps (Customer Care, Security, Diagnosis); ONOS + XOS; 2.8Tbps]
The SmartNIC
Nic Viljoen, Netronome Systems
The Programmable SmartNIC
Challenges with Fixed-Function NICs
- Networking applications have diverse requirements
- Fixed-function ASICs have “baked-in” functionality and lack flexibility
Programmable NIC Advantages
- Develop custom networking applications
- High performance at network
- Preserve CPU cycles
- CPU OVS @ 40 Gbps: 12 cores
- Offload OVS @ 40 Gbps: 1 core
- Dynamic analytics
- High-level languages: P4/C
- Examples of SmartNICs: Netronome’s Agilio, Cavium LiquidIO
Programmable NIC Architecture
- “Sea of Workers” for customized networking workloads
- Support for P4 and Match/Action structures
- Optimized memory architecture
vProbe Application
- Interpret flow stats and features
- Aggregate info to controllers (more on next slide)
Flow Cache
- Keep state for more than a million flows
- Programmable state based on vProbe application requirements
- 25G/40G line rate
- Programmable payload size/number of flows tradeoff
- Self-learning
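The flow cache behaviour described above can be sketched in software as follows. This is a minimal sketch under stated assumptions, not Netronome's implementation: the class, its method names, and the toy fallback rule are hypothetical, and a hardware cache would hold this state on the NIC rather than in a Python dictionary. It shows an exact-match lookup keyed on the 5-tuple, per-flow state updated on every packet, and "self-learning" in the sense that a miss consults a slower fallback path and installs the result.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical key: (src_ip, dst_ip, src_port, dst_port, ip_proto)
FlowKey = Tuple[str, str, int, int, int]


@dataclass
class FlowState:
    action: str          # decision learned from the fallback path
    packets: int = 0     # per-flow packet counter
    byte_count: int = 0  # per-flow byte counter


class ExactMatchFlowCache:
    """Toy exact-match flow cache; names and structure are illustrative."""

    def __init__(self, capacity: int = 1_000_000):
        self.capacity = capacity
        self.entries: Dict[FlowKey, FlowState] = {}
        self.misses = 0

    def slow_path_lookup(self, key: FlowKey) -> str:
        # Stand-in for the fallback path. A real datapath would consult
        # its match tables here; this toy rule drops telnet (port 23).
        return "drop" if key[3] == 23 else "forward"

    def process(self, key: FlowKey, length: int) -> str:
        state = self.entries.get(key)
        if state is None:
            # First packet of a flow: miss, ask the fallback path, learn.
            self.misses += 1
            state = FlowState(self.slow_path_lookup(key))
            if len(self.entries) < self.capacity:
                self.entries[key] = state
        # Every packet updates the flow's statistics.
        state.packets += 1
        state.byte_count += length
        return state.action


cache = ExactMatchFlowCache()
flow = ("10.0.0.1", "10.0.0.2", 51000, 443, 6)
actions = [cache.process(flow, size) for size in (60, 1500, 1500)]
```

Only the first of the three packets takes the fallback path; the remaining two hit the cached entry and update its counters.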
Augmenting Netronome’s Agilio OVS Software for Virtual Probing
[Figure labels: Compute Node; vProbe Application; VMs; OVS Userspace Processes (ovsdb-server, ovs-vswitchd); Controller; Linux Kernel OVS Datapath (Kernel Flow Table, Fallback Path, Actions); Agilio-CX Adapter OVS Datapath (Match Tables, Actions, Action Arguments, Tunnels, Deliver to Host, Update Statistics, Exact Match Flow Cache); First Packet of Flow; Remaining Packets of Flow; Flow Stats and Features Offload; Packet Rx/Tx]
vProbe Application
- Flow-based data and stat aggregation using techniques such as machine learning
- Enables powerful use-cases through use of flow analytics:
- Dynamic configuration for DDoS at VM level using high speed clustering/classification algorithms (next slide)
- Network shaping based on predictive flow characteristics (work with the University of Arizona has shown a 50% improvement in offload utilisation)
- Elastic VM resource provisioning
- Filtering and grouping for analysis at various levels of visibility
- Rack, Data Center, Metro, Regional, National
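Grouping at different levels of visibility can be sketched as a roll-up over a location hierarchy. This is a minimal sketch, assuming each sample is tagged with a location tuple ordered from national down to rack; the tuple layout, the function name, and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Assumed ordering of the location tuple, coarsest level first.
LEVELS = ("national", "regional", "metro", "data_center", "rack")

Location = Tuple[str, ...]


def roll_up(samples: Iterable[Tuple[Location, int]],
            level: str) -> Dict[Location, int]:
    """Sum byte counts per location prefix at the requested level."""
    depth = LEVELS.index(level) + 1
    totals: Counter = Counter()
    for location, byte_count in samples:
        totals[location[:depth]] += byte_count
    return dict(totals)


# Made-up per-rack byte counts.
samples = [
    (("us", "west", "sf", "dc1", "r1"), 100),
    (("us", "west", "sf", "dc1", "r2"), 50),
    (("us", "west", "la", "dc2", "r1"), 25),
]
per_metro = roll_up(samples, "metro")
per_nation = roll_up(samples, "national")
```

The same rack-level samples answer questions at any coarser level by truncating the location key, so no extra collection is needed per level.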
[Figure: four-step cycle between OVS and vProbe: 1) Classify, 2) Aggregate, 3) Analyze, 4) React and Configure. Cycle required in < 12s.]
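The four steps of the cycle can be sketched as a small pipeline with a time budget. This is a minimal sketch under stated assumptions: the record fields, the byte thresholds, and the `rate_limit` action are hypothetical, and the 12-second budget is taken from the slide text as extracted.

```python
import time
from collections import defaultdict
from typing import Dict, List, Tuple

CYCLE_BUDGET_S = 12.0  # from the slide text: cycle required in < 12s


def classify(record: dict) -> str:
    # Hypothetical rule: large flows are "bulk", the rest "interactive".
    return "bulk" if record["bytes"] >= 1_000_000 else "interactive"


def aggregate(records: List[dict]) -> Dict[str, Dict[str, int]]:
    # Sum bytes per VM and per traffic class.
    totals: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for record in records:
        totals[record["vm"]][classify(record)] += record["bytes"]
    return totals


def analyze(totals: Dict[str, Dict[str, int]], bulk_limit: int) -> List[str]:
    # Flag VMs whose bulk traffic exceeds an assumed limit.
    return sorted(vm for vm, per_class in totals.items()
                  if per_class["bulk"] > bulk_limit)


def react(vms: List[str]) -> List[dict]:
    # Emit one hypothetical configuration change per flagged VM.
    return [{"vm": vm, "action": "rate_limit"} for vm in vms]


def run_cycle(records: List[dict],
              bulk_limit: int = 5_000_000) -> Tuple[List[dict], bool]:
    start = time.monotonic()
    rules = react(analyze(aggregate(records), bulk_limit))
    within_budget = (time.monotonic() - start) <= CYCLE_BUDGET_S
    return rules, within_budget


records = [
    {"vm": "vm1", "bytes": 4_000_000},
    {"vm": "vm1", "bytes": 3_000_000},
    {"vm": "vm2", "bytes": 2_000_000},
    {"vm": "vm2", "bytes": 500},
]
rules, within_budget = run_cycle(records)
```

Here vm1 accumulates 7,000,000 bulk bytes, exceeds the assumed limit, and receives a rate-limit rule; vm2 stays below it.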
East/West DDOS Use Case
Per VM egress clustering
Drop traffic (targeted/all), Reduce VM resources, Shut down VM
- E/W DDoS attacks are prevalent
- Use vProbe to quickly identify infected VMs and react by modifying flow rules or VMs
- Policy dictated by higher-level orchestrator
- Aggregated data can be disseminated to multiple orchestration levels
- Enables distributed response at server/rack/DC/regional levels
1) Classify 2) Aggregate 3) Analyze 4) Configure
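The identify-and-react step above can be sketched as follows. This is a minimal sketch, not the method used in the presentation: a median/MAD outlier test on per-VM egress fan-out stands in for the clustering algorithm, which the slides do not specify. The function names, the cutoff `k`, and the sample data are hypothetical; the three response options mirror those named on the slide.

```python
from statistics import median
from typing import Dict, List

# Responses named on the slide, written as hypothetical policy identifiers.
ALLOWED_POLICIES = {"drop_traffic", "reduce_vm_resources", "shut_down_vm"}


def flag_suspect_vms(egress_fanout: Dict[str, int], k: float = 5.0) -> List[str]:
    """Flag VMs whose distinct-destination count is a high outlier.

    Uses the median and the median absolute deviation (MAD) as a simple
    stand-in for per-VM egress clustering. k is an assumed cutoff.
    """
    values = list(egress_fanout.values())
    if not values:
        return []
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = max(mad, 1.0)  # avoid dividing by zero when all values match
    return sorted(vm for vm, v in egress_fanout.items()
                  if (v - med) / scale > k)


def plan_response(suspects: List[str], policy: str = "drop_traffic") -> Dict[str, str]:
    """Map each suspect VM to the response chosen by the orchestrator."""
    if policy not in ALLOWED_POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    return {vm: policy for vm in suspects}


# Made-up counts of distinct east/west destinations per VM.
fanout = {"vm-a": 12, "vm-b": 9, "vm-c": 15, "vm-d": 11, "vm-e": 4200}
suspects = flag_suspect_vms(fanout)
response = plan_response(suspects, policy="reduce_vm_resources")
```

In this sample the median fan-out is 12 and the MAD is 3, so vm-e stands far outside the normal range and is the only VM flagged.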
- Intelligent network would benefit from programmable switches, NICs and CPU
- NIC-based offload is essential as CPU power is not scaling at the rate of network traffic increase
- AT&T’s John Donovan estimated our traffic has increased by 150,000% since 2007
- This means offload is essential to negate cost and maintain performance
- Flexible offload opens up potential analytics use cases that have previously not been tenable
Observability: Intelligence at the Edge
Overview: What do you need to find a needle
- OBSERVABILITY: the ability to statefully observe connections
- COMPUTABILITY: the ability to monitor and aggregate complex data in real time
- FLEXIBILITY: the ability to create a real time feedback loop using dynamic data plane and control functions
With Dynamic Programmable vProbe
Call to Action: We Need Your Use Cases!
- We are looking to gather a list of use cases for a dynamic analytics platform currently being developed
- Email: Tom Tofigh (Tofigh@att.com) or Nic Viljoen (nick.viljoen@netronome.com; note the address is spelled with a k)
- Join us for the next series of POCs
Thank You!