Generalized Pattern Matching Micro-Engine
Yuanwei Fang*, Raihan Rasool‡, Dilip Vasudevan*, Andrew A. Chien*†
* University of Chicago
† Argonne National Laboratory
‡ King Faisal University

Big Data Applications
◦ Deep Packet Inspection
◦ Bioinformatics (DNA Alignment)
◦ JSON/XML Parsing
◦ Signal Triggering
6/24/2014 UNIVERSITY OF CHICAGO

Deep Packet Inspection
◦ High-speed network: 100 Gb/s
◦ Growing number of patterns: 6000 Snort rules
◦ Speed requirement: > 75 Tera DFAops/s
◦ Power budget: 200 W
◦ Energy efficiency requirement: > 375 Gops/J
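The speed and efficiency requirements on this slide can be reproduced with a short calculation. This is only a sketch: it assumes every rule's DFA takes one transition per input byte, which the slide does not state explicitly, and the variable names are illustrative.

```python
# Back-of-the-envelope check of the Deep Packet Inspection numbers.
LINK_RATE_BITS_PER_S = 100e9   # 100 Gb/s network
NUM_RULES = 6000               # Snort rules
POWER_BUDGET_W = 200           # watts

# Assumption: one DFA transition per input byte, per rule.
bytes_per_s = LINK_RATE_BITS_PER_S / 8
dfa_ops_per_s = bytes_per_s * NUM_RULES
ops_per_joule = dfa_ops_per_s / POWER_BUDGET_W

print(f"{dfa_ops_per_s / 1e12:.0f} Tera DFAops/s")  # 75
print(f"{ops_per_joule / 1e9:.0f} Gops/J")          # 375
```

Under that assumption the result matches the slide: 12.5 GB/s across 6000 rules gives 75 Tera DFAops/s, and dividing by the 200 W budget gives 375 Gops/J.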

Bioinformatics (DNA Alignment)
◦ Genome size: 130G base pairs
◦ Bioinformatics database: millions of species
◦ Speed requirement: > 1 Tera DFAops/s
◦ Power budget: 200 W
◦ Energy efficiency requirement: > 5 Gops/J

Deterministic Finite Automata (DFA)

Programmable Approaches
◦ Intel Xeon E5-2600: 17G DFAops/second at 130 W, 0.13 Gops/J

Approach
◦ Workload
  • M input characters (M DFA transitions per rule)
  • N DFA rules run over the M input characters
◦ Goal
  • Compute N x M transitions efficiently
◦ Approach
  • Parallelize DFA execution
  • Fused instructions
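The N x M workload can be illustrated in plain software. The sketch below is not GenPM code: `build_dfa` is a hypothetical helper that uses a standard KMP-style construction just to obtain small transition tables, and `run_rules` steps all N rule DFAs over the same M characters while counting transitions.

```python
def build_dfa(pattern, alphabet):
    """Hypothetical helper: DFA that sits in state len(pattern)
    exactly when the input read so far ends with `pattern`."""
    m = len(pattern)
    table = [{c: 0 for c in alphabet} for _ in range(m + 1)]
    table[0][pattern[0]] = 1
    restart = 0  # state reached after the longest proper border
    for j in range(1, m + 1):
        for c in alphabet:
            table[j][c] = table[restart][c]
        if j < m:
            table[j][pattern[j]] = j + 1
            restart = table[restart][pattern[j]]
    return table, m  # (transition table, accepting state)


def run_rules(dfas, text):
    """Step every rule's DFA over every input character."""
    states = [0] * len(dfas)
    matches = [[] for _ in dfas]
    transitions = 0
    for pos, ch in enumerate(text):                  # M input characters
        for i, (table, accept) in enumerate(dfas):   # N rules
            states[i] = table[states[i]][ch]
            transitions += 1
            if states[i] == accept:
                matches[i].append(pos)  # index where the match ends
    return matches, transitions


ALPHABET = "abc"
dfas = [build_dfa(p, ALPHABET) for p in ("ab", "ca", "aa")]
matches, transitions = run_rules(dfas, "abcaab")
print(matches)      # [[1, 5], [3], [4]]
print(transitions)  # 18, i.e. N x M = 3 x 6
```

The inner loop over rules is the part the slide proposes to parallelize; the outer loop over characters is where a multi-step fused instruction would apply.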

What Is a Micro-Engine
◦ Generalized Pattern Matching Micro-Engine (GenPM) is one micro-engine of the 10x10 approach
[Figure: block diagram with a local memory, a basic RISC CPU, micro-engines 2, 3, 4, 6, 7, 8, and GenPM, each unit with its own I-cache, all sharing an L1 data cache.]

GenPM Micro Architecture

Fused Instructions: Multi-Step
[Figure, four animation steps: datapath with a string buffer holding the input characters (a, b, c), an Acc_Vec register filling with accept bits, Current State, Accept, ALU, address, Local Mem (ENB), and Next State. The final step adds a CHECK on Acc_Vec.]
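One way to read the multi-step figure is sketched below in software. This is an interpretation, not the GenPM instruction semantics: `multi_step` is a hypothetical function that consumes a whole chunk of characters in one call and records one accept bit per step, and the caller only inspects the bits when the check on the accept vector fires. The hand-written table matches the pattern "ab" over the alphabet {a, b, c}.

```python
# Transition table for the pattern "ab"; state 2 is the accepting state.
TABLE = [
    {"a": 1, "b": 0, "c": 0},  # state 0: nothing matched yet
    {"a": 1, "b": 2, "c": 0},  # state 1: just saw "a"
    {"a": 1, "b": 0, "c": 0},  # state 2: just saw "ab"
]
ACCEPT = 2


def multi_step(table, accept, state, chunk):
    """Advance the DFA over every character in `chunk` in one call.

    Returns the final state and an accept vector with one bit per step.
    """
    acc_vec = []
    for ch in chunk:
        state = table[state][ch]
        acc_vec.append(1 if state == accept else 0)
    return state, acc_vec


def match_positions(table, accept, text, step_length):
    """Scan `text` in chunks of `step_length` and recover exact match ends."""
    state = 0
    positions = []
    for base in range(0, len(text), step_length):
        chunk = text[base:base + step_length]
        state, acc_vec = multi_step(table, accept, state, chunk)
        if any(acc_vec):  # check: look at individual bits only on a hit
            positions.extend(base + i for i, bit in enumerate(acc_vec) if bit)
    return positions


print(multi_step(TABLE, ACCEPT, 0, "abc"))            # (0, [0, 1, 0])
print(match_positions(TABLE, ACCEPT, "cabcab", 4))    # [2, 5]
```

Because the accept vector keeps one bit per step, the precise matching position survives even though several transitions are fused into one operation; the result is the same for any step length.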

Parallel DFA: Vector Instruction
[Figure: an SSE ADD performs one + in each vector lane.]

Parallel DFA: Vector Instruction
[Figure: GMVSNEXT performs one DFAop in each vector lane.]

GenPM Code Example
◦ Data movement
◦ Multi-step parallel DFA execution
◦ Find precise matching position

Methodology
• Design space: parallelism and step length
• Baseline
  ◦ 32-bit 6-stage in-order RISC
  ◦ 4GB DDR3 DRAM
  ◦ 32KB L1 I-cache, 24KB L1 D-cache, 512KB L2 (modeled on Intel Silverthorne)
• GenPM
  ◦ 1MB local memory (up to 64 banks)
  ◦ Vector and fused instructions
• Performance/Power Model
  ◦ Core: 32nm synthesis by Synopsys Processor Designer
  ◦ Memories: MARSSX86/CACTI 6 + DRAMSim2
• Workload
  ◦ 64 Snort rules from 2.9.5.6 snapshot, 10KB random network dump

Performance Speedup
[Figure: bar chart of speedup versus RISC for GenPM_8way and GenPM_64way at step lengths 1, 8, and 16. Bar labels: 36, 243, 289, 300, 1947, 2498.]

Energy Efficiency
[Figure: bar chart of energy improvement versus RISC for GenPM_8way and GenPM_64way at step lengths 1, 8, and 16. Bar labels: 31, 151, 174, 213, 861, 980.]

Throughput/watt (absolute)
[Figure: bar chart of throughput per watt (Gops/J, axis 0 to 40) for GenPM_8way and GenPM_64way at step lengths 1, 8, and 16.]
◦ Scaled to a 75W chip, GenPM delivers > 2.6 Tera DFAops/second
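The 75W scaling claim can be put next to the other figures quoted in the deck. The sketch below only divides numbers that appear on the slides (2.6 Tera DFAops/second at 75 W, 0.13 Gops/J for the Xeon E5-2600, 375 Gops/J required for deep packet inspection); the comparison itself is an illustration and is not taken from the presentation.

```python
CHIP_POWER_W = 75
GENPM_OPS_PER_S = 2.6e12        # > 2.6 Tera DFAops/second (lower bound)
XEON_OPS_PER_J = 0.13e9         # Intel Xeon E5-2600 figure from the deck
DPI_REQUIRED_OPS_PER_J = 375e9  # deep packet inspection requirement

# Energy efficiency implied by the scaled chip.
genpm_ops_per_j = GENPM_OPS_PER_S / CHIP_POWER_W

# Ratios against the two reference points.
vs_xeon = genpm_ops_per_j / XEON_OPS_PER_J
vs_dpi_requirement = genpm_ops_per_j / DPI_REQUIRED_OPS_PER_J

print(f"GenPM: {genpm_ops_per_j / 1e9:.1f} Gops/J")
print(f"vs Xeon: {vs_xeon:.0f}x")
print(f"fraction of DPI requirement: {vs_dpi_requirement:.2f}")
```

The implied efficiency is a little under 35 Gops/J, which sits inside the chart's 0 to 40 Gops/J axis range.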

Energy Breakdown
[Figure: stacked bars of total energy (0% to 100%) for RISC, GenPM_8B_1S, GenPM_8B_8S, GenPM_8B_16S, GenPM_64B_1S, GenPM_64B_8S, and GenPM_64B_16S, broken down into LM, L1_I, L1_D, L2, DRAM, and Core. Annotation: LM_max = 83%.]

General Comparison

Related Work
◦ ASIC: [Brodie et al., ISCA 2006], [Titanic System RXP], [Cisco SCE]
◦ FPGA: [Yang Xu et al., ANCS 2011], [T. Song et al., INFOCOM 2008], [I. Sourdis et al., VLSI 2008]
◦ CPU: [Mytkowicz et al., ASPLOS 2014], [Intel HyperScan]
◦ GPU: [Vasiliadis G. et al., CCS 2011], [Lin C.H. et al., INFOCOM 2012]
◦ SoC: [C. Johnson et al., ISSCC 2010], [Cavium Octeon], [IBM PowerEN]

Summary
• GenPM is a high-performance, energy-efficient accelerator for pattern matching workloads
• The ISA exploits parallelism and multi-step execution
• Scaled to a 75W chip, GenPM delivers > 2.6 Tera DFAops/second
• GenPM approaches ASIC efficiency and integrates it into a programmable core

Future Work
• DFA table compression
• Scale up with multiple GenPM micro-engines
• Explore more applications

Acknowledgements
• Defense Advanced Research Projects Agency (DARPA)
• Agilent Technologies (now Keysight Technologies)
• Synopsys Academic Program
• Dr. Tung Hoang and members of the Large Scale Systems Group in the Department of Computer Science
