Multi-Core Architecture on FPGA for Large Dictionary String Matching ∗

Qingbo Wang, Viktor K. Prasanna
Ming Hsieh Department of Electrical Engineering
University of Southern California, Los Angeles, CA 90089-2562
qingbow, prasanna@usc.edu

Abstract

FPGA has long been considered an attractive platform for high performance implementations of string matching. However, as pattern dictionaries continue to grow, such large dictionaries can be stored only in external DRAM. The increased memory latency and limited bandwidth pose new challenges to FPGA-based designs, and the lack of spatial and temporal locality in data access also leads to low utilization of memory bandwidth. In this paper, we propose a multi-core architecture on FPGA to address these challenges. We adopt the popular Aho-Corasick (AC-opt) algorithm for our string matching engine. Exploiting the data access pattern of this algorithm, we design a specialized BRAM buffer for the cores to capture the data reuse present in such applications. Several design optimization techniques are applied to realize a simple, high-clock-rate design for the string matching engine. An implementation of a 2-core system with one shared BRAM buffer on a Virtex-5 LX155 achieves up to 3.2 Gbps throughput on a 64 MB state transition table stored in DRAM. Performance of systems with more cores is also evaluated for this architecture, and a throughput of over 5.5 Gbps can be obtained for some application scenarios.

1 Introduction

String matching looks for all occurrences of the patterns in a dictionary in a stream of input data. It is the key operation in search engines, and is a core function of network monitoring, intrusion detection systems (IDS), virus scanners, and spam/content filters [3, 4, 15]. For example, the open-source IDS Snort [15] has thousands of content-based rules, many of which require string matching against entire network packets, i.e. deep packet inspection. To support heavy

∗Supported by the United States National Science Foundation under grant No. CCR-0702784. Equipment grant from Xilinx Inc. is gratefully acknowledged.

network traffic, high performance algorithms are required to prevent an IDS from becoming a network bottleneck. FPGAs have been attractive for high performance implementations of string matching due to their high I/O bandwidth and computational parallelism. Application-specific optimizations for string matching algorithms have been proposed for FPGA-based designs [18]. They typically use a small dictionary, on the order of a few thousand patterns (e.g., see [3, 4]). Thus the state transition table (STT) generated from a Deterministic Finite Automaton (DFA) representation of the pattern dictionary, or the pattern signatures themselves, can be stored in the on-chip memory or in the logic of FPGAs.

However, the size of dictionaries has increased greatly. A dictionary can now have 10,000 patterns or more [14, 15], resulting in an STT tens of megabytes in size. Such large tables can be stored only in external memory and incur long access latency. Since every character searched requires a memory reference, this latency increase degrades string matching performance. The problem is worsened by the fact that string matching presents little memory access locality and that access to the STT is irregular.

In this paper, we propose a multi-core architecture on FPGA for large dictionary string matching. We use the Aho-Corasick algorithm (AC-opt) for design verification, but the architecture can be applied to any such algorithm that employs a DFA stored in DRAM for pattern matching [16]. Our study shows, using the AC-opt algorithm, that a small number of frequently visited states exist in the process of string matching, and that the majority of memory references during string matching go to these "hot" states. When we allocate these states on FPGA to enable on-chip access to them, not only can the traffic to external memory be significantly reduced, but the throughput of the string matching engine is also improved due to fast on-chip access. Our major contributions are:

  • To the best of our knowledge, our architecture is the first multi-core architecture on FPGA for large dictionary string matching to address the challenge of DRAM access latency. The BRAM buffer scheme in this architecture is an application of a data usage feature of the AC-opt algorithm, where a set of states is visited more often than the others.

  • Several design optimizations are proposed to improve the performance of this architecture, such as DFA re-mapping, pipelined BRAM buffers, and thread virtualization with shift registers for thread scheduling and synchronization. These schemes result in a simple design with a high clock rate on FPGA, and DRAM access latency can also be partially hidden.

  • An implementation of a two-core system on a Xilinx Virtex-5 LX155 demonstrates the high performance of the proposed architecture. The design employs BRAM buffers of 1K states to serve a 64K-state STT. Based on the Place & Route results, it can run at over 200 MHz using less than 2% of the logic resources on the chip. It can achieve up to 3.2 Gbps in throughput for some input streams. We evaluate systems with more cores based on the implementation experience.

The rest of the paper is organized as follows. In Section 2, we introduce related work and background on the Aho-Corasick algorithm. The feature of the AC-opt algorithm is introduced in Section 3, where the FPGA-based architecture for large dictionary string matching is also presented. In Section 4, design optimization techniques are introduced. The implementation experience and performance evaluation are presented in Section 5. We conclude with a summary and a discussion of future work.

2 Related Work and Background

2.1 Related Work

Many schemes for string matching on FPGA have been proposed. In [5], a novel implementation using a Bloom filter was introduced. The hash-table lookup uses only a moderate amount of logic and memory, but searches thousands of strings for matches in one pass. Also, a change in the rule set does not require FPGA reconfiguration. However, the tradeoff between the false positive rate and the number of rules stored in the memory leads to performance degradation for large dictionary string matching.

A search engine using the Knuth-Morris-Pratt (KMP) algorithm [7] on FPGA was presented in [4]. The authors adopt a systolic array architecture for multiple pattern matching, where each unit is responsible for one pattern. The unit architecture uses a modified KMP algorithm with two comparators and a buffered input to guarantee that one character can be accepted into the unit at every clock cycle. The pattern and its pre-computed jump table are stored in BRAMs. This design results in highly efficient area consumption on FPGAs, but is limited by the available BRAM blocks on-chip.

Chip multiprocessors (CMP) present new opportunities for fast string matching with their unprecedented computing power. In [19], researchers proposed new regular expression rewriting techniques to reduce memory usage on general purpose multi-core processors, and used grouping of regular expressions to enable processing on multiple threads or multiple hardware cores. Scarpazza et al. studied the optimization of the Aho-Corasick algorithm on the Cell B.E. for both small and large pattern dictionary string matching [14]. When the patterns number in the few hundreds, one Synergistic Processing Element (SPE) using local store can obtain 5 Gbps throughput. However, when the dictionary includes hundreds of thousands of patterns, they must be stored in the external XDR DRAM, and the throughput reaches only 3.15 Gbps for 2 processors with 16 SPEs.

Recently, soft processors on FPGAs have gained much interest in the research community. Due to the demand for high performance in network security, bioinformatics and other applications [9, 12], FPGA and ASIC solutions have become more attractive. Using soft processors on FPGA, engineering time can be reduced and software engineers can program the high performance hardware platform. In [13], a simplified IPv4 packet forwarding engine was implemented on an FPGA using an array of MicroBlazes. The softcore architecture exploited both spatial and temporal parallelism to reach performance comparable to designs on an application-specific multi-core processor, i.e. the Intel IXP2400. These studies motivate us to explore a multi-core architecture on FPGA to achieve high performance for large dictionary string matching.

2.2 Aho-Corasick Algorithm

A class of algorithms using automata has become more attractive [8] for string matching. From the classic Aho-Corasick algorithm [1] and its many variants, we selected AC-opt for our design, since its theoretical performance is independent of dictionary size and keyword length [16].

The Aho-Corasick algorithm and its variants perform efficient string matching of dictionary patterns on an input stream S. It finds instances of the pattern keywords P = (P1, P2, ..., Pn) in S, even when keywords overlap with one another. All variants of Aho-Corasick function by constructing a finite state transition table (STT) and processing the input text character-by-character in a single pass. Once a state transition is made based on the current character, that character of the input text no longer needs to be considered. The construction of the STT needs to take place only once, and the STT can be reused as long as the pattern dictionary does not change. Each state also contains an output function. If the output function is defined on a given state, that state is considered to be a final state, and the output function gives the keyword or keywords that have been found.


[Figure 1(a): goto function — a trie over the keywords cat, bat, at, car; characters outside ~{a,b,c} return to state 0.]

(b) Failure function:

  i     1  2  3  4  5  6  7  8  9
  f(i)  0  7  8  0  7  8  0  0  0

(c) Output function:

  i   output(i)
  3   cat, at
  6   bat, at
  8   at
  9   car

Figure 1. AC-Failure Example

The most basic variant of Aho-Corasick, known as AC-fail, requires the construction and usage of two functions in addition to the standard output function: goto and failure. The goto function contains the basic STT discussed above. A state transition is made using the goto function based on the current state and the current character of input. The failure function supplements the goto function. If no transition can be made with the current state and input character, the failure function is consulted so that an alternate state can be chosen and text processing may continue. This example illustrates usage of the AC-fail variant of Aho-Corasick. Figure 1 shows the goto, failure and output functions for a dictionary of the following keywords: cat, bat, at, car.

  State    a  b  c  r  t
  0        7  4  1  0  0
  1        2  4  1  0  0
  2        7  4  1  9  3
  4        5  4  1  0  0
  5        7  4  1  0  6
  7        7  4  1  0  8
  3,6,8,9  7  4  1  0  0

Table 1. DFA state transition table

A more optimized version of Aho-Corasick is also presented in [1], known as AC-opt. We use this algorithm in

our study. AC-opt eliminates the failure function by combining it with the goto function to obtain a next-move function. The result is a true DFA, which is capable of string matching by making only one state transition per input character. Searching is therefore simplified and more efficient. However, since the construction of the next-move function requires both the goto and failure functions, it is less efficient than AC-fail when processing the dictionary initially.

Table 1 illustrates an AC-opt DFA. The DFA is constructed from the dictionary consisting of the keywords used before. Figure 2 shows a search on the input string "caricature."

  input      c  a  r  i  c  a  t  u  r  e
  state  0   1  2  9  0  1  2  3  0  0  0

Figure 2. A String Search Example
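The next-move function of Table 1 and the search of Figure 2 can be reproduced with a short software sketch (Python, purely illustrative — the hardware engine makes exactly one table lookup per input character):

```python
# AC-opt next-move function for the dictionary {cat, bat, at, car},
# transcribed from Table 1; entries not listed fall back to state 0.
NEXT = {
    0: {'a': 7, 'b': 4, 'c': 1},
    1: {'a': 2, 'b': 4, 'c': 1},
    2: {'a': 7, 'b': 4, 'c': 1, 'r': 9, 't': 3},
    3: {'a': 7, 'b': 4, 'c': 1},
    4: {'a': 5, 'b': 4, 'c': 1},
    5: {'a': 7, 'b': 4, 'c': 1, 't': 6},
    6: {'a': 7, 'b': 4, 'c': 1},
    7: {'a': 7, 'b': 4, 'c': 1, 't': 8},
    8: {'a': 7, 'b': 4, 'c': 1},
    9: {'a': 7, 'b': 4, 'c': 1},
}
# Output function from Figure 1(c): final state -> keywords found there.
OUTPUT = {3: {'cat', 'at'}, 6: {'bat', 'at'}, 8: {'at'}, 9: {'car'}}

def search(text):
    """One state transition per character; emit (start, keyword) at final states."""
    state, matches = 0, []
    for i, ch in enumerate(text):
        state = NEXT[state].get(ch, 0)
        for kw in OUTPUT.get(state, ()):
            matches.append((i - len(kw) + 1, kw))
    return matches

print(sorted(search("caricature")))  # -> [(0, 'car'), (4, 'cat'), (5, 'at')]
```

The state sequence traversed for "caricature" is exactly the 0 1 2 9 0 1 2 3 0 0 0 of Figure 2.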

3 Algorithm Analysis and Multi-Core Architecture

3.1 Analysis of AC-opt

In our study, the STTs usually have more than 60K states, which constitute the rows of a 2-dimensional array. The number of columns is the same as the size of the alphabet. Thus, a large dictionary needs 60 MB of storage or more, which can be stored on DRAM only. We believe that even for a significantly large input stream, the states traversed during string matching are concentrated on a small part of that STT only. Furthermore, the few states visited most by the string matching engine satisfy the majority of state transitions during the process of matching.

                       % States in levels 0,1   % Hits
  Full-text search     0.114%                   46.79%
  Network monitoring   0.114%                   81.46%
  Intrusion detection  0.506%                   89.84%
  Virus scanning       0.506%                   88.39%

Table 2. A few states in levels 0 and 1 are responsible for the vast majority of the hits

The four representative application scenarios in our study are (1) a full-text search system, (2) a network content monitor, (3) a network intrusion detection system, and (4) an anti-virus scanner. These are the same as in [14]. In scenario (1), a text file (the King James Bible) is searched against a dictionary containing the 20,000 most used words in the English language. In scenario (2), network traffic is captured at the transport control layer with "wireshark" [6] while a user is browsing multiple popular news websites. This capture is searched against the same English dictionary as before. In scenario (3), the same network capture is searched against a dictionary of about 10,000 randomly generated binary patterns, whose length is uniformly distributed between 4 and 10 characters. In scenario (4), a randomly generated binary file is searched against the randomly generated dictionary.

The authors in [14] showed that the DFA states at levels 0 and 1, whose distances to the initial state are 0 and 1 respectively, attract a vast majority of the hits to memory, as listed in Table 2. We further recorded the number of hits to each state during the search processes in the same four scenarios, and ranked the states in descending order of visits. From the results presented in Figure 3, which shows the number of top states and the percentage of memory accesses that go to these states, we can see that more than 85% of the accesses by the string matching engine go to the top 1000 states. We define the hot states as the set of states that are visited most during a string matching process.

[Figure 3. Memory Accesses to Hot States — percentage of accesses (y-axis) versus number of hot states (x-axis, 200 to 1000), for full-text search, network content monitor, network IDS, and anti-virus scanner.]

3.2 Architecture Overview

Our proposed architecture is shown in Figure 4. There are p cores sharing an STT through an interface to DRAM(s). Each core, e.g. Ci in the figure, is equipped with a copy of the on-chip buffer Bi, serving an input stream Si, and producing an output Oi when available.

[Figure 4. Architecture Overview — cores C0 … Cp−1 with buffers B0 … Bp−1 on the FPGA, fed by input streams S0 … Sp−1 and producing outputs O0 … Op−1, connected through an interface module and a DRAM controller to the DRAM holding the STT.]

Utilizing the data usage feature identified in the last section, the buffers are employed to store the hot states on-chip to reduce off-chip memory references, and to take advantage of fast on-chip memory access. When an address into the STT arrives, the core logic decides whether to direct it to off-chip DRAM or to on-chip storage. If a large number of hot states are stored on-chip, the buffer can resolve the majority of references to the STT, thus improving the throughput of string matching.

The cores are connected to external DRAM through an on-chip interface module and the DRAM controller. The DRAM controller can be off-chip, but is usually implemented on the FPGA.

3.3 On-Chip Buffer for Hot States

To populate the on-chip buffers with hot states, we first run a search against the STT using a training input trace which exhibits statistical similarity to the incoming traffic. The number of visits to each state of the DFA is recorded and the list is sorted in descending order. The states represented by the top n entries in the list are the hot states. The larger n is, the higher the hit rate to the on-chip buffer by a string matching engine. However, the selection of n is also affected by the type of buffer device, the available buffer size, etc.

A state corresponds to a row in the STT, with 256 next-state entries for an 8-bit alphabet. When we use a 32-bit word to store a next-state entry, including the next state ID and output functions, 1 KB of storage is needed for each state. The on-chip storage can be implemented as a fully associative cache on FPGA, such as a CAM. However, a CAM of thousands of entries can lower the system's clock rate and become a performance bottleneck. Due to its fast access speed and large volume, Block RAM (BRAM) is an ideal choice for implementing an on-chip buffer to hold a large number of hot states.
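The profiling step above amounts to counting visits per state on a training trace and keeping the top n; a minimal sketch (the toy DFA and trace are illustrative, not from the paper's dictionaries):

```python
from collections import Counter

def select_hot_states(next_move, training_trace, n):
    """Profile a training trace against the DFA (one transition per character)
    and return the n most-visited states -- the 'hot' states to be placed in
    the on-chip BRAM buffer."""
    visits = Counter()
    state = 0
    for ch in training_trace:
        state = next_move[state].get(ch, 0)  # unlisted symbols return to state 0
        visits[state] += 1
    return [s for s, _ in visits.most_common(n)]

# Toy 3-state DFA (illustrative only): 0 -a-> 1 -b-> 2
toy_next = {0: {'a': 1}, 1: {'b': 2}, 2: {}}
hot = select_hot_states(toy_next, "ababxab", n=2)  # the initial state is hottest
```

In practice n is then capped by the BRAM capacity (1 KB per state in our layout).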

3.4 Structure of a Core

A single-core architecture is shown in Figure 5. An input character is received by the Address Generator, and combined with the current state to generate a new address for the STT reference. The STT is organized as a 2-dimensional matrix, with the row indices representing the state IDs, and the column indices the 8-bit input characters. In the DRAM, as well as in the BRAM, the data are stored physically as a one-dimensional array in row-major order. Thus, address generation can be achieved by simply concatenating the current state ID with the 8-bit input character. This address is then used to determine whether the memory reference should go to the on-chip buffer or to off-chip DRAM.

To exploit the available DRAM bandwidth, we add multiple virtual threads to a core to take advantage of temporal parallelism. Each of these virtual threads processes an input stream, which could be a segment of a long input stream, or an independent stream. A virtual thread is a DFA engine traversing the STT with its input characters. The thread manager, along with thread context storage, is added to provide scheduling and synchronization among the virtual threads. The thread context stores the status of a virtual thread, including the core ID, the virtual thread ID, the address into the STT, the

returned current state after a reference to the STT is resolved, etc. It also keeps track of whether the reference to the STT is on-chip or not. The thread manager chooses a ready thread, and picks its input stream through the Mux, for the address generator to process. The Output unit checks the thread context registers to output the match.

[Figure 5. Structure of a Core — input streams are selected through a Mux by the Thread Manager; the Address Generator combines the current state with the input character into a new address; an on-chip check routes the reference either to the hot-state buffer or to the DRAM interface module; an Output unit reports matches.]
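The address generation and routing just described reduce to a bit concatenation and a comparator; a sketch with the widths used in our implementation (16-bit state IDs, 8-bit characters, and 1K hot states, assuming the hot states have already been re-mapped to the low IDs as in Section 4.1):

```python
def stt_address(state_id: int, ch: int) -> int:
    """New STT address: concatenate the 16-bit current state ID with the
    8-bit input character (row-major layout, 256 entries per state)."""
    assert 0 <= state_id < (1 << 16) and 0 <= ch < 256
    return (state_id << 8) | ch

def goes_on_chip(state_id: int, n_hot: int = 1024) -> bool:
    """Routing decision: with hot states re-mapped to IDs [0, n_hot),
    a single comparator selects the BRAM buffer over external DRAM."""
    return state_id < n_hot

addr = stt_address(0x0012, ord('a'))  # state 18, character 'a' -> 0x1261
```

No arithmetic is needed in hardware: the concatenation is just wiring, and the comparator is a handful of LUTs.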

3.5 Reconfigurable Multi-Core Architecture

In our architecture, each core executes on its own, even though the cores may affect one another's performance through shared global resources, such as the interconnect and DRAM. The p cores can be abstracted as p hardware threads, which add spatial parallelism to the architecture by multiplying the execution of the single cores. As a result, this parallelism collectively exploits the DRAM bandwidth.

The multi-core architecture on FPGA can be described by parameters such as the number of cores, the number of virtual threads per core, the bandwidth and latency of external DRAM, and the size, latency and access bandwidth of the on-chip buffer. Constrained by the resources of the FPGA, these parameters can be chosen to achieve high performance for string matching.

4 Design Optimizations

4.1 DFA Re-mapping

The IDs of the identified hot states were initially assigned during the building of the DFA, and are unlikely to be contiguous. As the state IDs are used by hardware logic to decide which memory to reference, the discontinuity of the hot state IDs complicates the hardware design. We adopted an ID re-mapping scheme to ease this by shifting the IDs of the hot states to the beginning of the STT, making them the top states. The state space is divided into two domains after the re-mapping: a state with an ID number lower than n, the number of selected hot states, accesses the on-chip buffer, and the others access external DRAM. Hence, the design for this decision-making becomes a comparator.
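The re-mapping can be sketched as a permutation that moves the hot-state IDs to the front of the STT (hot-state list as produced by the profiling of Section 3.3; the dictionary-of-dictionaries DFA representation is illustrative):

```python
def remap_dfa(next_move, hot_states):
    """Renumber states so the hot states occupy IDs 0..n-1 and cold states
    follow; afterwards 'state_id < len(hot_states)' is the buffer test."""
    hot = set(hot_states)
    order = list(hot_states) + [s for s in next_move if s not in hot]
    new_id = {old: new for new, old in enumerate(order)}
    return {new_id[s]: {ch: new_id[t] for ch, t in row.items()}
            for s, row in next_move.items()}

# Toy DFA: 0 -a-> 1 -b-> 2; suppose profiling found states 0 and 2 hot.
remapped = remap_dfa({0: {'a': 1}, 1: {'b': 2}, 2: {}}, hot_states=[0, 2])
# Old state 2 becomes ID 1 (hot region); old state 1 becomes ID 2.
```

The permutation is applied once, offline, when the STT is built; the hardware only ever sees the re-mapped IDs.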

4.2 Simplified Thread Synchronization by Input Interleaving

For thread scheduling and synchronization, there are different ways to design the thread manager and the thread context store shown in Figure 5. To reduce implementation complexity, we used an input interleaving scheme for thread scheduling and synchronization. In this scheme, every virtual thread within a single core, identified by a thread ID from 0 to m − 1, is assigned a string segment, where m is the number of virtual threads per core. These threads are polled in a round-robin fashion by the thread manager, so the input streams are essentially interleaved. Using this scheme, the thread contexts can be maintained by a shift register with help from the BRAM buffer design, which is introduced in the next section. However, a drawback of this design is that a stall of execution can occur when the head thread's reference to the STT, specifically to DRAM, has not returned, thus losing performance. Our evaluation shows that when m is reasonably large, our design does not suffer significantly from the stalls.

When the interleaved input streams are segments of one input trace, it can happen that matchable patterns across the boundary of two consecutive segments are missed. To avoid this hazard, an overlap, equal to the length of the longest pattern minus one, is preserved between neighboring segments when partitioning [14]. This method slightly decreases the overall throughput, but guarantees the correctness of the string matching engine.
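The partitioning with the overlap guard can be sketched as follows (a minimal software model; the overlap equals the longest pattern length minus one, per [14]):

```python
def segment_with_overlap(stream: str, m: int, max_pat_len: int):
    """Split a stream into m segments for the m virtual threads, extending
    each segment by max_pat_len - 1 characters so that no match straddling
    a segment boundary is lost."""
    overlap = max_pat_len - 1
    base = (len(stream) + m - 1) // m  # ceiling division
    return [stream[i * base: (i + 1) * base + overlap] for i in range(m)]

segs = segment_with_overlap("abcdefgh", m=2, max_pat_len=3)
# segs == ['abcdef', 'efgh']: a 3-character pattern at the boundary
# (e.g. "def") still lies entirely within the first segment.
```

Matches found inside an overlap region are reported by exactly one thread once duplicates at segment heads are discarded.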

4.3 Shared and Pipelined Buffer Access Module

We utilized a shared and pipelined design for the BRAM buffer access module. BRAMs on FPGA can be naturally configured for dual-port access without loss of performance, so two cores can share one buffer, where a set of hot states is stored. To increase the clock frequency of buffer access, we adopt a pipelined architecture inside this module.

As illustrated in Figure 6, the BRAMs in the buffer module are divided into k even partitions, denoted M0, M1, · · ·, Mk−1. The selected hot states are also divided evenly into k groups, and each group is stored on one of the BRAM partitions. The accessing elements, denoted AE, are separated by the pipeline registers. An AE is responsible for accessing its local BRAM and relays data from stage to stage.

As illustrated in Figure 6, a core sends a thread's context with address information into the BRAM buffer module. For a thread accessing DRAM, its thread context is simply passed through the stage registers without any processing. However, if a thread needs to access the BRAM partitions, an AE, getting data from both the BRAM and the pipeline register of the previous stage, must do the following:

[Figure 6. BRAM Buffer Access Module — k BRAM partitions M0 … Mk−1, each with an accessing element (AE) and a pipeline register; two pipelines (AE0 … AEk−1 and AE′0 … AE′k−1) serve the two cores sharing the dual-ported buffer, with a control signal from the thread manager.]

  • If the arriving thread needs access to the BRAM buffer, and the STT address falls in the range of its local buffer, send the address into the local BRAM.

  • If the output of the previous BRAM partition is the resulting current state for this thread, pass it to the next pipeline register.

  • Otherwise, the data from the previous pipeline register is passed along to the next stage.

The "control" signal is used by the thread manager to hold the pipeline when a stall is necessary. Note that all BRAM partitions have output registers, and do not need pipeline registers in between.
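The three rules above can be mirrored in a small behavioral model (Python, not the RTL; the context fields and partition bounds are illustrative):

```python
def ae_stage(ctx, local_bram, lo, hi):
    """One accessing element (AE). ctx = (addr, on_chip, resolved, data);
    this AE's local BRAM partition holds the STT entries for addresses
    in the range [lo, hi)."""
    addr, on_chip, resolved, data = ctx
    if on_chip and not resolved and lo <= addr < hi:
        # Rule 1: the reference falls in this partition -- read the local BRAM.
        return (addr, on_chip, True, local_bram[addr - lo])
    # Rules 2 and 3: a result resolved by an earlier partition, or an
    # off-chip (DRAM-bound) context, passes through to the next stage register.
    return ctx

bram = [100 + i for i in range(10)]                      # serves addresses 10..19
hit  = ae_stage((12, True, False, None), bram, 10, 20)   # resolved in this stage
miss = ae_stage((42, False, False, None), bram, 10, 20)  # passed through unchanged
```

In hardware each rule is a mux select; a context traverses all k stages, so a buffer hit always costs the same fixed pipeline latency.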

4.4 Interface between Cores and DRAM Controller

Our design uses a multiplexing FIFO to interface the cores and the DRAM controller. A simple time-slotted round-robin schedule serves the incoming requests from each core. By "time-slotted," we mean that the addresses arriving in one clock cycle go through the FIFO in a pre-determined order, e.g. the IDs of the cores, before the requests from the next clock cycle. This scheme is consistent with the virtual thread management introduced in Section 4.2. It performs well because our input streams to the cores are independent of each other, and they all have equal priority.

We adopt a design from Le, et al. [10], and adapt it to a 4-to-1 basic unit of synchronous FIFO with conversion. Higher-ratio FIFOs can be formed using multiple basic units. The FIFOs are implemented using registers and logic only, to save BRAM for the hot state buffer. A common implementation of a FIFO is a circular buffer, with read and write addresses. The size of the queue sets the maximum number of entries that can be stored, and is bounded by m × p in our design. This is due to the fact that a thread does not send a new request until its last request is served; therefore the maximum number of active threads, i.e. outstanding STT references, is m × p.
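The ordering guarantee of the time-slotted round-robin merge can be illustrated with a behavioral model (not the register-based FIFO of [10]):

```python
from collections import deque

def merge_requests(cycles):
    """cycles: a list of clock cycles, each mapping core_id -> request.
    All requests from one cycle enter the FIFO in core-ID order, strictly
    before any request from the following cycle."""
    fifo = deque()
    for cycle in cycles:
        for core_id in sorted(cycle):  # pre-determined order: the core IDs
            fifo.append((core_id, cycle[core_id]))
    return fifo

q = merge_requests([{1: "rd_a", 0: "rd_b"}, {0: "rd_c"}])
# FIFO order: (0, 'rd_b'), (1, 'rd_a'), (0, 'rd_c')
```

Because all streams have equal priority, this fixed order is fair over any window of p slots.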

5 Implementation and Evaluation

5.1 Implementation on FPGA Platforms

We implemented our architecture on a Virtex-5 LX155 for a 64K-state STT with 2 cores. An STT of 64K states needs a 16-bit representation of its state IDs. According to the analysis in Section 3.1, the larger the buffer size, the better the throughput. We target a buffer of 1K states, which needs at least 4 Mb of BRAM on the FPGA. The Virtex-5 LX155 has 192 BRAM blocks of 36 Kb, i.e. 6912 Kb in total, which is sufficient to buffer 1K states in our design.

Our implementation can run at over 200 MHz using the design optimizations introduced in Section 4. While the BRAM usage is 66% of the chip, the logic resource consumption for the two cores is less than 2%. We can place dozens of cores on a mid-sized FPGA. Our design works with a customized DRAM controller connected by a FIFO queue that can have different clock frequencies for write and read. The controller is based on a DDR2 controller generated by the Memory Interface Generator (MIG) tool in Xilinx ISE design suite 10.1.

5.2 DRAM Access Module

To evaluate system performance for string matching with a large dictionary and a long input trace, the behavior of DRAM modules must be studied. DDR SDRAMs, with their ever increasing operating frequencies, have become the current standard for DRAM. In Table 3, we identify some critical timing specifications for DDR SDRAM from the published technical documentation of DRAM vendors [11].

  SDRAM         DDR   DDR2  DDR3
  tRTP          12    7.5   10
  tRC           55    55    50
  tRRD          10    10    10
  tRAS          40    45    37.5
  tRP           15    15    15
  tRCD          15    15    15
  tCL           15    15    15
  Clock period  5     3     2.5
  Banks         4     4/8   8

Table 3. DDR SDRAM key electrical timing specifications. All units are ns except for the banks. The clock period is that used when working with FPGA. tRTP: Read-to-Precharge delay. tRC: Active-to-Active delay in the same bank. tRRD: Active-to-Active delay between different banks. tRAS: Active-to-Precharge delay. tRP: Precharge latency. tRCD: Activate latency. tCL: Read latency.


FPGA vendors support DDR3 with a clock rate of 400 MHz and DDR2 with 333 MHz on their development platforms [2, 17]. However, when data access is irregular and spatial locality cannot be exploited, DRAM delays and latencies matter more than peak data rates. The parameters in Table 3 can be divided into two classes. The first includes timing requirements such as tRTP, tRC, tRRD, tRAS, etc., which specify the delays to be satisfied between consecutive operations on the DRAM. The second is latencies, including tRP, tRCD and tCL, which are the times needed for an operation to complete. These parameters are used in our simulation program to estimate the performance of our design. Note that the numbers may vary slightly for different DRAM modules.

5.3 Performance Evaluation

The performance of string matching is measured by throughput in Gbps. We first study the performance of our design implementation for a 2-core system. The two cores share a BRAM buffer of 1K states and an external DRAM, the Micron MT47H64M16 DDR2 SDRAM. The DRAM has a 16-bit data bus and 8 internal banks, with a selected burst length of 4. The number of pipeline stages in the BRAM buffer is set to 8. The number of virtual threads was varied to see the impact on performance.

[Figure 7. Throughput for a 2-Core System — throughput in Gbps (1.0 to 3.5, y-axis) versus number of threads per core (10 to 70, x-axis), for full-text search, network content monitor, network IDS, and anti-virus scanner.]

Our simulation program set a 200 MHz clock rate for the cores while allowing the DRAM module to run at 333 MHz. The DRAM refresh cycles were ignored due to their negligible effect on performance. With 2-way 8-bit input at 200 MHz, the maximum throughput achievable by the design is 3.2 Gbps. The results for the four application scenarios from Section 3.1 are shown in Figure 7. As the number of virtual threads per core increases, the throughput generated by the two cores grows to about 3 Gbps. For application scenarios such as the network content monitor and the network IDS, the optimal throughput of 3.2 Gbps is approached when the number of threads is large. This means all accesses to DRAM are returned before a requesting thread gets its turn to run again, thus eliminating the stalls of execution.

We then fixed the total size of the BRAM buffer on chip, and varied the number of cores to study the performance of our architecture under memory constraints. As the number of cores increases, the buffer size of each core decreases, and so does the number of hot states, which reduces the hit rate to the on-chip buffers.

[Figure 8. Performance of Multi-Core Architecture for Full-Text Search — throughput in Gbps (1.0 to 3.5) versus number of cores (2 to 18), for 8, 16, 24, 32, 48 and 64 threads per core.]

Figure 8 shows that for full-text search, our design with two cores yields the best performance in all cases, regardless of the number of threads per core. The throughput of all designs then decreases even as the number of cores keeps increasing. For the network content monitor experiment, the performance peaks at about 5.5 Gbps for a four-core design, as shown in Figure 9, and then the throughput degrades in a similar fashion.

[Figure 9 plots throughput (Gbps, 2.5 to 6.0) against the number of cores (2 to 18), with one curve per thread count per core (8, 16, 24, 32, 48, 64).]

Figure 9. Performance of Multi-Core Architecture for Network Content Monitor

As the design changes to have more cores, the hit rate to the on-chip buffer drops, due to the reduced per-core buffer size, to a level at which the references to DRAM deplete the available bandwidth, and no further benefit comes from increasing the number of threads. At that point, adding more cores only intensifies the contention for DRAM, deteriorating the overall performance of the design. The reason for the different peaking points in the two experiments is that the hit-rate curves shown in Figure 3 differ between them: the full-text search's hit rate to the on-chip buffer is lower than the network content monitor's for the same number of hot states, so its performance saturates earlier. The difference in peak throughputs between the two experiments also shows that the performance of our proposed architecture is contingent on the characteristics of the input stream.
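This core-count sweet spot can be sketched with a simple analytical model. All numbers below are assumed for illustration, including the hit-rate curve, which is a stand-in for the measured curves of Figure 3.

```python
# Toy model of the core-count sweet spot: total BRAM is fixed, so more
# cores mean fewer hot states per core, a lower buffer hit rate, and
# eventually a DRAM-bandwidth-bound design.
CLOCK_HZ = 200e6
DRAM_ACCESSES_PER_SEC = 100e6   # assumed DRAM random-access budget
TOTAL_HOT_STATES = 4096         # assumed total buffered hot states

def hit_rate(hot_states_per_core):
    # Assumed diminishing-returns curve (stand-in for Figure 3).
    return 1.0 - 1.0 / (1.0 + hot_states_per_core / 512)

def throughput_gbps(cores):
    h = hit_rate(TOTAL_HOT_STATES // cores)
    compute_bound = cores * 8 * CLOCK_HZ        # bits/s with no stalls
    # Each input byte misses the buffer with probability (1 - h); once
    # misses exhaust the DRAM access budget, bandwidth caps throughput.
    bandwidth_bound = DRAM_ACCESSES_PER_SEC / (1.0 - h) * 8
    return min(compute_bound, bandwidth_bound) / 1e9
```

Under these assumed numbers the model peaks at two cores, mirroring the full-text search curve; a higher hit-rate curve for the same buffer size shifts the peak toward more cores, as with the network content monitor.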


5.4 Performance Comparison

While there is no published research on FPGA with which to compare our work, we studied large dictionary string matching on a Dell XPS 410 with an Intel Core 2 Quad Q6600 processor for the full-text search scenario. The Intel C/C++ 10.1 compiler applies system-specific optimizations, which can bring the Cycles-Per-Instruction (CPI) of a program close to 1/4. Similar to the scheme in our FPGA design, a "training" prefetch is used to boost performance: the technique runs the search over training input statistically similar to the real input, with the intention of loading the cache with the most-visited states. The results for 4 cores are presented in Table 4.

Table 4. Performance of String Matching on a Multicore System

                           Measured Throughput (Gbps)
                           Best      Average
  5,000 patterns           5.5       3.2
  5,000 patterns, Trained  10.0      5.8
  50,000 patterns          3.3       2.3
  50,000 patterns, Trained 4.7       3.4

Another thorough study of large dictionary string matching, on the Cell B.E., is presented in [14]. In that study, the authors found that the XDR DRAM in the system performs best in random access when an SPE performs 16 concurrent transfers of 64-bit blocks, and proposed a 16-input interleaving scheme for one SPE. Assisted by other optimization techniques for the memory system and local stores, the design achieves a theoretical aggregate pattern-search throughput of 3.15 Gbps on a 2-Cell-processor system with 16 SPEs. Our proposed architecture with only 2 to 4 cores exhibits performance competitive with these highly advanced multi-core CMPs, without resorting to a high-performance proprietary DRAM system as the Cell B.E. does.
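The "training" idea can be sketched in software: run the automaton over representative input, count state visits, and preload the k most-visited ("hot") states. The three-state DFA below is a toy dictionary for the single pattern "ab", not from our experiments.

```python
from collections import Counter

# Toy AC-opt DFA over the alphabet {'a', 'b'} for the pattern "ab":
# NEXT[state][symbol] gives the next state; state 2 reports a match.
NEXT = {
    0: {'a': 1, 'b': 0},
    1: {'a': 1, 'b': 2},
    2: {'a': 1, 'b': 0},
}

def hot_states(training_input, k):
    """Return the k most-visited states on statistically similar input."""
    visits = Counter()
    state = 0
    for symbol in training_input:
        state = NEXT[state][symbol]
        visits[state] += 1
    return {s for s, _ in visits.most_common(k)}

# The returned states are the ones to preload into the cache
# (or, in the FPGA design, into the on-chip BRAM buffer).
```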

6 Conclusions

This paper presented a multi-core architecture on FPGA for large dictionary string matching. By buffering "hot states" in on-chip BRAM, off-chip DRAM references are significantly reduced. Using the proposed optimization techniques, our architecture can be realized at a high operating clock rate on FPGA. The achieved throughput is comparable to solutions on other state-of-the-art multicore systems, while consuming only a small fraction of the FPGA's logic resources. Our future work will study the effect of deploying more DRAM modules on system performance, and explore the possibility of adding external SRAMs to the design.

7 Acknowledgments

We are grateful to Yi-Hua E. Yang, Weirong Jiang, Danko Krajisnik and Ju-wook Jang for helpful discussions and comments on an early draft of the paper.

References

[1] A. V. Aho and M. J. Corasick. Efficient string matching: an aid to bibliographic search. Commun. ACM, 18(6):333–340, 1975.
[2] Altera Corporation. http://www.altera.com.
[3] S. Antonatos, K. G. Anagnostakis, E. P. Markatos, and M. Polychronakis. Performance analysis of content matching intrusion detection systems. In Proc. of the International Symposium on Applications and the Internet, January 2004.
[4] Z. Baker and V. Prasanna. A computationally efficient engine for flexible intrusion detection. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 13(10):1179–1189, October 2005.
[5] S. Dharmapurikar, P. Krishnamurthy, T. Sproull, and J. Lockwood. Deep packet inspection using parallel bloom filters. In Hot Interconnects, pages 44–51, August 2003.
[6] Gerald Combs. http://www.wireshark.org.
[7] D. E. Knuth, J. H. Morris, and V. R. Pratt. Fast pattern matching in strings. SIAM Journal on Computing, 6(2):323–350, 1977.
[8] S. Kumar, S. Dharmapurikar, F. Yu, P. Crowley, and J. Turner. Algorithms to accelerate multiple regular expressions matching for deep packet inspection. In SIGCOMM '06: Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communications, pages 339–350, 2006.
[9] M. Labrecque, P. Yiannacouras, and J. G. Steffan. Scaling soft processor systems. In Proceedings of FCCM, pages 99–110, 2008.
[10] H. Le, W. Jiang, and V. K. Prasanna. An SRAM-based architecture for trie-based IP lookup using FPGA. In Proceedings of the Annual IEEE Symposium on Field-Programmable Custom Computing Machines, pages 33–42, 2008.
[11] Micron Technology, Inc. http://www.micron.com.
[12] G.-G. Mplemenos and I. Papaefstathiou. Soft multicore system on FPGAs. In Proceedings of FCCM, pages 199–201, 2008.
[13] K. Ravindran, N. Satish, Y. Jin, and K. Keutzer. An FPGA-based soft multiprocessor system for IPv4 packet forwarding. In Proceedings of FPL, pages 487–492. IEEE, 2005.
[14] D. P. Scarpazza, O. Villa, and F. Petrini. Exact multi-pattern string matching on the Cell/B.E. processor. In Conf. Computing Frontiers, pages 33–42, 2008.
[15] SNORT: The Open Source Network Intrusion Prevention and Detection System. http://www.snort.org.
[16] B. W. Watson. The performance of single-keyword and multiple-keyword pattern matching algorithms. Technical Report, Eindhoven University of Technology, October 1994.
[17] Xilinx Incorporated. http://www.xilinx.com.
[18] S. Yi, B.-K. Kim, J. Oh, J. Jang, G. Kesidis, and C. R. Das. Memory-efficient content filtering hardware for high-speed intrusion detection systems. In Proceedings of the 2007 ACM Symposium on Applied Computing, pages 264–269, 2007.
[19] F. Yu, Z. Chen, Y. Diao, T. V. Lakshman, and R. H. Katz. Fast and memory-efficient regular expression matching for deep packet inspection. In ANCS '06: Proceedings of the 2006 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, pages 93–102, New York, NY, USA, 2006. ACM.