Beyond TCAMs: An SRAM-based Parallel Multi-Pipeline Architecture for Terabit IP Lookup

Weirong Jiang, Qingbo Wang and Viktor K. Prasanna
Ming Hsieh Department of Electrical Engineering
University of Southern California
Los Angeles, CA 90089, USA
Email: {weirongj, qingbow, prasanna}@usc.edu

Abstract

Continuous growth in network link rates poses a strong demand on high speed IP lookup engines. While Ternary Content Addressable Memory (TCAM) based solutions serve most of today's high-end routers, they do not scale well for the next generation [1]. On the other hand, pipelined SRAM-based algorithmic solutions become attractive. Intuitively, multiple pipelines can be utilized in parallel to have a multiplicative effect on the throughput. However, several challenges must be addressed for such solutions to realize high throughput. First, the memory distribution across the stages of each pipeline, as well as across different pipelines, must be balanced. Second, the traffic on the various pipelines should be balanced.

In this paper, we propose a parallel SRAM-based multi-pipeline architecture for terabit IP lookup. To balance the memory requirement over the stages, a two-level mapping scheme is presented. By trie partitioning and subtrie-to-pipeline mapping, we ensure that each pipeline contains an approximately equal number of trie nodes. Then, within each pipeline, a fine-grained node-to-stage mapping is used to achieve evenly distributed memory across the stages. To balance the traffic on different pipelines, both pipelined prefix caching and dynamic subtrie-to-pipeline remapping are employed. Simulation using real-life data shows that the proposed architecture with 8 pipelines can store a core routing table with over 200K unique routing prefixes using 3.5 MB of memory. It achieves a throughput of up to 3.2 billion packets per second, i.e. 1 Tbps for minimum size (40 bytes) packets.

I. INTRODUCTION

IP lookup with longest prefix matching is a core function of Internet routers. It has become a major bottleneck for backbone routers as the Internet continues to grow rapidly [2]. With the advances in optical networking technology, link rates in high speed IP routers are being pushed from OC-768 (40 Gbps) to even higher rates. Such high rates demand that IP lookup in routers be performed in hardware. For instance, a 40 Gbps link requires one lookup every 8 ns for minimum size (40 bytes) packets. Such throughput is impossible using existing software-based solutions [3].

Most hardware-based solutions for high speed IP lookup fall into two main categories: TCAM (ternary content addressable memory)-based and DRAM/SRAM (dynamic/static random access memory)-based solutions. Although TCAM-based engines can retrieve IP lookup results in just one clock cycle, their throughput is limited by the relatively low speed of TCAMs. They are also expensive and offer little flexibility for adapting to new addressing and routing protocols [4]. As shown in Table I, SRAM outperforms TCAM with respect to speed, density and power consumption. However, traditional SRAM-based solutions, most of which can be regarded as some form of tree traversal, need multiple clock cycles to complete a lookup. For example, the trie [3], a tree-like data structure representing a collection of prefixes, is widely used in SRAM-based solutions. Searching a trie to find the longest matched prefix for an IP packet needs multiple memory accesses.

TABLE I
COMPARISON OF TCAM AND SRAM TECHNOLOGIES (18 Mbit CHIP)

                                              TCAM          SRAM
  Maximum clock rate (MHz)                    266 [5]       400 [6], [7]
  Power consumption (Watts)                   12 ∼ 15 [8]   ≈ 0.1 [9]
  Cell size (# of transistors per bit) [10]   16            6

Several researchers have explored pipelining to improve the throughput significantly. Taking trie-based solutions as an example, a simple pipelining approach is to map each trie level onto a pipeline stage with its own memory and processing logic, so that one IP lookup can be performed every clock cycle. However, this approach results in an unbalanced distribution of trie nodes over the pipeline stages, which has been identified as a dominant issue for pipelined architectures [11], [12]. In an unbalanced pipeline, the “fattest” stage, which stores the largest number of trie nodes, becomes a bottleneck. It adversely affects the overall performance of the pipeline for the following reasons. First, it needs more time to access the larger local memory, which reduces the global clock rate. Second, a fat stage receives many updates, since the number of updates a stage sees is proportional to the number of trie nodes it stores. In particular, during an update process caused by intensive route insertion, the fattest stage can also suffer memory overflow. Furthermore, since it is unclear at hardware design time which stage will be the fattest, memory of the maximum size must be allocated for every stage, which results in memory wastage.

To achieve a balanced memory distribution across stages, several novel pipeline architectures have been proposed [13], [14]. However, their non-linear pipeline structures degrade throughput, and most of them must disrupt ongoing operations during a route update. Our previous work
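As a back-of-the-envelope check on the line-rate figures quoted above (a sketch added here for illustration, not part of the paper), the following C snippet recomputes the 8 ns per-lookup budget at OC-768 and the aggregate bit rate implied by 3.2 billion minimum-size packets per second:

```c
#include <stdio.h>

int main(void)
{
    const double pkt_bits   = 40 * 8;   /* minimum-size packet: 40 bytes = 320 bits */
    const double oc768_bps  = 40e9;     /* OC-768 line rate: 40 Gbps                */
    const double target_pps = 3.2e9;    /* packet rate reported for 8 pipelines     */

    /* 320 bits / 40 Gbps = 8 ns available per lookup at OC-768 rates */
    printf("lookup budget at 40 Gbps: %.1f ns\n", pkt_bits / oc768_bps * 1e9);

    /* 3.2 Gpps * 320 bits/packet = 1.024 Tbps, i.e. roughly 1 Tbps */
    printf("aggregate rate at 3.2 Gpps: %.3f Tbps\n", target_pps * pkt_bits / 1e12);

    return 0;
}
```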
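To make the trie discussion concrete, here is a minimal sketch of longest-prefix matching on a uni-bit binary trie for IPv4. The node layout, field names and the NO_NEXT_HOP sentinel are illustrative assumptions rather than the paper's data structure; the point is that every step descends one trie level and costs one memory access, so a single lookup may need up to 32 dependent accesses.

```c
#include <stdint.h>
#include <stddef.h>

#define NO_NEXT_HOP 0xFFFFFFFFu       /* sentinel: this node stores no prefix */

struct trie_node {
    struct trie_node *child[2];       /* child[0]: next address bit 0, child[1]: bit 1 */
    uint32_t next_hop;                /* valid only if this node corresponds to a prefix */
};

/* Walk the trie one address bit at a time, remembering the last node that
 * carried a prefix (the longest match seen so far). Each iteration is one
 * memory access.                                                           */
uint32_t trie_lookup(const struct trie_node *root, uint32_t ip)
{
    uint32_t best = NO_NEXT_HOP;
    const struct trie_node *node = root;
    int bit = 31;

    while (node != NULL) {
        if (node->next_hop != NO_NEXT_HOP)
            best = node->next_hop;             /* a longer matching prefix */
        if (bit < 0)
            break;                             /* all 32 address bits consumed */
        node = node->child[(ip >> bit) & 1];   /* descend one trie level */
        bit--;
    }
    return best;                               /* NO_NEXT_HOP if nothing matched */
}
```

It is this chain of dependent memory accesses that level-to-stage pipelining overlaps across consecutive packets.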
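The simple pipelining approach described above assigns every node at trie depth i to stage i of a linear pipeline. The sketch below (again with an assumed, illustrative node layout) tallies nodes per depth; applied to a real routing trie, such a count exposes the skewed per-stage memory that makes the fattest stage the bottleneck.

```c
#include <stddef.h>

#define MAX_LEVELS 33                 /* IPv4 uni-bit trie: root at depth 0, depth <= 32 */

struct node {
    struct node *child[2];
    int has_prefix;
};

/* Recursively tally how many trie nodes each pipeline stage would store
 * under the naive level-to-stage mapping.                                */
static void count_per_stage(const struct node *n, int depth,
                            size_t per_stage[MAX_LEVELS])
{
    if (n == NULL || depth >= MAX_LEVELS)
        return;
    per_stage[depth]++;                              /* node mapped to stage `depth` */
    count_per_stage(n->child[0], depth + 1, per_stage);
    count_per_stage(n->child[1], depth + 1, per_stage);
}
```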
