Algorithms and Data Structures to Accelerate Network Analysis

Reservoir Labs
Jordi Ros-Giralt, Alan Commike, Peter Cullen, Richard Lethin
{giralt, commike, cullen, lethin}@reservoir.com
4th International Workshop on Innovating the Network for Data Intensive Science, November 12, 2017
632 Broadway, Suite 803, New York, NY 10012

Roadmap
• Problem definition
• Optimizations
  - Long queue emulation
  - Lockless bimodal queues
  - Tail early dropping
  - LFN tables
  - Multiresolution priority queues
• Benchmarks

Problem Definition
• System-wide optimization of network components such as routers, firewalls, or network analyzers is complex.
• Hundreds of different software algorithms and data structures are interrelated in subtle ways.
• Two interrelated problems:
  - Shifting micro-bottlenecks
  - Nonlinear performance collapse

Problem Definition: Shifting Micro-Bottlenecks
It's difficult to optimize bottlenecks that keep moving every microsecond or so.

Non-linear Performance Collapse
System diagram (figure labels):
• Net: 40 Gbps
• PCIe I/O: 64 Gbps
• CPU: 56 GHz
• Cache: L1-I 896 kB, L1-D 896 kB, L2 7168 kB, L3 71680 kB
• Memory: 1092 Gbps
• Disk I/O: 10.4 Gbps

State 1: network is the bottleneck
• Healthy cache regime:
  - CPU operates out of cache
  - High cache hit ratios

State 2: network is no longer the bottleneck
• Highly inefficient memory regime:
  - CPU operates out of RAM
  - High cache miss ratios
  - 10x penalty
• By removing the network bottleneck, the system spends more time processing packets that will need to be dropped anyway → net performance degradation (performance collapse).

Performance Optimization: Approach
• Performance optimization needs to be a meticulous process of small but safe steps, to avoid the pitfall of pursuing short-term gains that lead to new and bigger bottlenecks down the path.

Performance Optimization: Algorithms and Data Structures
• Long queue emulation: reduces packet drops due to fixed-size hardware rings
• Lockless bimodal queues: improve packet capturing performance
• Tail early dropping: increases information entropy and extracted metadata
• LFN tables: reduce state sharing overhead
• Multiresolution priority queues: reduce the cost of processing timers

Long Queue Emulation
• Dispatcher model:
  - Packet read cache penalty
  - Descriptor read cache penalty
• Long queue emulation model:
  - Packet drop penalty under certain conditions

Long Queue Emulation
[Figures not reproduced; surviving labels: "Use LQE", "Optimal LQE size"]
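The idea of backing a fixed-size hardware ring with a longer queue can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' LQE design: the tick-based traffic model, the function name `simulate`, and all of its parameters are hypothetical, and the cache penalties that the slides attribute to the dispatcher model are not modeled at all. It only shows why a longer queue behind a small ring can absorb a burst that the ring alone would drop.

```python
from collections import deque

def simulate(arrivals, ring_size, service_per_tick, soft_queue_size=0):
    """Return the number of packets dropped over a hypothetical traffic trace.

    arrivals: number of packets arriving at each tick.
    ring_size: capacity of the fixed-size ring.
    service_per_tick: packets the application can process per tick.
    soft_queue_size: capacity of the longer software queue (0 disables it).
    """
    ring = deque()
    soft = deque()
    dropped = 0
    seq = 0
    for burst in arrivals:
        # Arrival side: packets land in the ring, or are dropped if it is full.
        for _ in range(burst):
            if len(ring) < ring_size:
                ring.append(seq)
            else:
                dropped += 1
            seq += 1
        # Drain step: move packets into the longer queue so the ring
        # has room for the next burst.
        while ring and len(soft) < soft_queue_size:
            soft.append(ring.popleft())
        # Application side: the longer queue holds the older packets,
        # so serve it first to preserve arrival order.
        budget = service_per_tick
        while budget and (soft or ring):
            (soft if soft else ring).popleft()
            budget -= 1
    return dropped

trace = [4, 4, 4, 0, 0, 0, 0, 0]  # a short burst followed by idle ticks
print(simulate(trace, ring_size=4, service_per_tick=2))
print(simulate(trace, ring_size=4, service_per_tick=2, soft_queue_size=16))
```

With this trace the ring alone overflows during the burst, while the longer queue absorbs it. How large that queue should be in a real system depends on costs this sketch ignores.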

Lockless Bimodal Queues
• Goal: move packets from the memory ring to the disk without using locks
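One common way to pass items between two threads without locks is a single-producer/single-consumer ring buffer, sketched below. This is a generic technique swapped in for illustration, not the authors' bimodal queue, whose details the slides do not give. The names `SpscRing` and `transfer` are hypothetical, and a Python list stands in for the disk.

```python
import threading
import time

class SpscRing:
    """Single-producer/single-consumer ring buffer that uses no locks.

    Only the producer writes `_tail`; only the consumer writes `_head`.
    In C or C++ the index updates would also need atomics or memory
    barriers; this sketch relies on the CPython interpreter for that.
    """

    def __init__(self, capacity):
        # One slot stays empty so that "full" and "empty" are distinguishable.
        self._buf = [None] * (capacity + 1)
        self._head = 0
        self._tail = 0

    def push(self, item):
        nxt = (self._tail + 1) % len(self._buf)
        if nxt == self._head:
            return False           # ring is full
        self._buf[self._tail] = item
        self._tail = nxt           # publish only after the slot is written
        return True

    def pop(self):
        if self._head == self._tail:
            return None            # ring is empty (items must not be None)
        item = self._buf[self._head]
        self._buf[self._head] = None
        self._head = (self._head + 1) % len(self._buf)
        return item

def transfer(packets, capacity=64):
    """Move packets through the ring from a producer thread to a consumer."""
    ring = SpscRing(capacity)
    written = []                   # stands in for the disk

    def producer():
        for pkt in packets:
            while not ring.push(pkt):
                time.sleep(0)      # ring full: yield and retry

    t = threading.Thread(target=producer)
    t.start()
    while len(written) < len(packets):
        pkt = ring.pop()
        if pkt is None:
            time.sleep(0)          # ring empty: yield and retry
        else:
            written.append(pkt)
    t.join()
    return written

print(transfer(list(range(10)), capacity=4))
```

Because each index has exactly one writer, neither side ever needs to take a lock; the cost is that the design only supports one producer and one consumer.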

Tail Early Dropping
[Figure: information/entropy versus connection bits in sequence of arrival]
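A minimal sketch of dropping the tail of each connection early follows. It assumes, as one reading of the figure's axes, that the early bits of a connection carry most of the information, and it uses a fixed per-connection byte cutoff. The function name `tail_early_drop`, the `(connection_id, payload_length)` packet format, and the cutoff value are all hypothetical; the slides do not say how the authors choose what to drop.

```python
from collections import defaultdict

def tail_early_drop(packets, cutoff_bytes):
    """Yield only packets that start within the first `cutoff_bytes`
    of their connection; later packets (the tail) are dropped.

    packets: iterable of (connection_id, payload_length) tuples.
    """
    seen = defaultdict(int)  # bytes observed so far on each connection
    for conn_id, length in packets:
        if seen[conn_id] < cutoff_bytes:
            yield conn_id, length
        seen[conn_id] += length

trace = [("a", 100), ("b", 100), ("a", 100), ("a", 100), ("a", 100), ("b", 100)]
kept = list(tail_early_drop(trace, cutoff_bytes=150))
print(kept)
```

In this trace the long connection "a" loses its last two packets while the short connection "b" is kept whole, so the processing budget is spent on the start of more connections rather than on the tail of a few large ones.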

LFN Tables

Multiresolution Priority Queues
• Priority queue: the element at the front of the queue is the greatest of all the elements it contains, according to some total ordering defined by their priority.
• Found at the core of important computer science problems:
  - Shortest path problem
  - Packet scheduling in Internet routers
  - Event driven engines
  - Huffman compression codes
  - Operating systems
  - Bayesian spam filtering
  - Discrete optimization
  - Simulation of colliding particles
  - Artificial intelligence

Multiresolution Priority Queues

Year | Author | Data structure | Insert | Extract | Notes
1964 | Williams [3] | Binary heap | O(log(n)) | O(log(n)) | Simple to implement.
1984 | Fredman et al. [4] | Fibonacci heaps | O(1) | O(log(n)) | More complex to implement.
1988 | Brown [8] | Calendar queues | O(1) | O(c) | Need to be balanced and resolution cannot be tuned.
2000 | Chazelle [5] | Soft heaps | O(1) | O(1) | Unbounded error.
2008 | Mehlhorn et al. [7] | Bucket queues | O(1) | O(c) | Priorities must be small integers and resolution cannot be tuned.
2017 | Ros-Giralt et al. (this work) | Multiresolution priority queue | O(1), O(r) or O(log(r)) | O(1) | Tunable/bounded resolution error. Error is zero if priority space is multi-resolutive.

n: number of elements in the queue
c: maximum integer priority value
r: number of resolution groups supported by the multiresolution priority queue

Multiresolution Priority Queues
• A multiresolution priority queue is a container data structure that at all times maintains the following invariant: [formula not reproduced]
• Intuitively:
  1. Discretize the priority space into a sequence of slots or resolution groups.
  2. Prioritize elements according to the slot in which they belong.
  3. Elements belonging to lower slots are given higher priority.
  4. Within a slot, ordering is not guaranteed.
• This enables a mechanism to control the trade-off between accuracy and performance.
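The four intuitive steps above can be sketched as a bucket-based queue. This is a simplified illustration, not the authors' data structure: the class name `MultiresolutionPQ`, the slot function `floor((priority - p_min) / p_delta)`, and the parameters `p_min` and `p_delta` (the slot width) are assumptions, and the complexity of this variant differs from the authors' (it keeps a small heap of non-empty slot indices, so some operations cost O(log r)).

```python
import heapq
from collections import deque

class MultiresolutionPQ:
    """Queue that orders elements by slot, not by exact priority."""

    def __init__(self, p_min, p_delta):
        self.p_min = p_min        # assumed lower bound of the priority space
        self.p_delta = p_delta    # assumed slot width (resolution)
        self._slots = {}          # slot index -> deque of (priority, item)
        self._nonempty = []       # min-heap of slot indices holding elements

    def _slot(self, priority):
        # Step 1: discretize the priority space into slots.
        return int((priority - self.p_min) // self.p_delta)

    def insert(self, priority, item):
        s = self._slot(priority)
        q = self._slots.get(s)
        if q is None:
            q = self._slots[s] = deque()
            heapq.heappush(self._nonempty, s)
        q.append((priority, item))  # step 4: no ordering inside a slot

    def extract(self):
        # Steps 2 and 3: serve the lowest non-empty slot first.
        s = self._nonempty[0]
        q = self._slots[s]
        entry = q.popleft()
        if not q:
            heapq.heappop(self._nonempty)
            del self._slots[s]
        return entry

    def __len__(self):
        return sum(len(q) for q in self._slots.values())

pq = MultiresolutionPQ(p_min=0, p_delta=10)
for priority, item in [(17, "c"), (9, "d"), (3, "b"), (12, "a"), (25, "e")]:
    pq.insert(priority, item)
print([pq.extract()[1] for _ in range(len(pq))])
```

With a slot width of 10, elements 9 and 3 share a slot and come out in insertion order rather than priority order, while every element of a lower slot still precedes every element of a higher one.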

Multiresolution Priority Queues
• The larger the parameter p_Δ → the lower the resolution of the queue → the higher the error → the higher the performance (and vice versa).
• Instead of ordering the space of elements, an MR-PQ orders the space of priorities.
• The information-theoretic barriers of the problem are broken by introducing error in a way that reduces entropy:
  - In many real-world problems, the space of priorities has much lower entropy than the space of keys.
  - Example:
    - Space of keys is the set of real numbers (S_k)
    - Space of priorities is the set of distances between any two US cities (S_p)
    - Entropy(S_k) >> Entropy(S_p)
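The relation between p_Δ and error can be checked numerically with a small experiment. The sketch below assumes integer priorities, slots of width `p_delta` starting at `p_min`, and first-in-first-out order within a slot; the function names and the random test data are hypothetical. It measures the worst priority inversion, that is, the largest amount by which an extracted element's priority falls below that of an element extracted before it.

```python
import random

def extraction_order(priorities, p_delta, p_min=0):
    """Order in which a slot-ordered queue would release the elements.

    sorted() is stable, so elements in the same slot keep arrival order.
    """
    return sorted(priorities, key=lambda p: (p - p_min) // p_delta)

def max_priority_inversion(order):
    """Largest drop from an earlier extracted priority to a later one."""
    worst = 0
    highest_seen = order[0]
    for p in order[1:]:
        worst = max(worst, highest_seen - p)
        highest_seen = max(highest_seen, p)
    return worst

random.seed(0)
priorities = [random.randrange(1000) for _ in range(500)]
for p_delta in (1, 10, 100):
    error = max_priority_inversion(extraction_order(priorities, p_delta))
    print(p_delta, error)
```

In this model two elements can only come out in the wrong order when they share a slot, so the inversion is always smaller than `p_delta`, and it is zero when `p_delta` is 1 for integer priorities. A coarser resolution means fewer slots to manage at the price of a larger, but still bounded, error.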
