Fang Yu and Randy H. Katz - - PDF document


Efficient Multi-Match Packet Classification with TCAM

Fang Yu and Randy H. Katz

Abstract- Today's packet classification systems are designed to provide the highest priority matching result, e.g., the longest prefix match, even if a packet matches multiple classification rules. However, new network applications, such as intrusion detection systems, require information about all the matching results. We call this the multi-match classification problem. In several complex network applications, multi-match classification is immediately followed by other processing that depends on the classification results. Therefore, classification should be even faster than the line rate. Pure software solutions cannot be used due to their slow speeds.

In this paper, we present a solution based on Ternary Content Addressable Memory (TCAM), which produces multi-match classification results with only one TCAM lookup and one SRAM lookup per packet, about ten times fewer memory lookups than a pure software approach. In addition, we present a scheme to remove the negation format in rule sets, which can save up to 95% of TCAM space compared with the straightforward solution. We show that using the pre-processing scheme we present, header processing for the SNORT rule set can be done with one TCAM and one SRAM lookup using a 135KB TCAM.

Index Terms- Packet Classification, Multi-Match Packet Classification, Ternary CAM, Negation Removing.

I. INTRODUCTION

New network applications are emerging that demand multi-match classification, that is, they require all matching results instead of only the highest priority match. One example of such an application is the network intrusion detection system, which monitors packets in a network and detects malicious intrusions or DoS attacks. Systems like SNORT [1] employ thousands of rules. Figure 1.a gives an example SNORT rule that detects an MS-SQL worm probe. Figure 1.b is a rule for detecting an RPC old password overflow attempt. Each rule has two components: a rule header and a rule option. The rule header is a classification rule that consists of five fixed fields: protocol, source IP, source port, destination IP, and destination port. The rule option is more complicated: it specifies intrusion patterns to be used to scan packet contents. Rule headers may overlap, so a packet may match multiple rule headers (e.g., both examples above). Multi-match classification is used to find all the rule headers that match a given packet so that we can check the corresponding rule options one by one later.

This work was supported in part by the UC Micro grant number 03-041 and 02-032 with matching support from NTT MCL, HP, Cisco, and Microsoft. Fang Yu and Randy H. Katz are with the Electrical Engineering and Computer Science Department, University of California Berkeley, Berkeley, CA 94720 (phone: 510-642-8284; e-mail: {fyu, randy}@eecs.berkeley.edu).

Fig. 1. Snort rule examples. 1.a: a rule for the MS-SQL worm probe; 1.b: a rule for the RPC old password overflow attempt.

Another application is programmable network elements (PNEs) [2, 3], proposed for implementing edge network functions. Typically, a packet traverses a number of network devices that perform different functions, e.g., firewall, HTTP load balancing, intrusion detection, NAT, etc. This can be highly inefficient because a packet has to traverse every device even if only a subset of them needs to operate on the packet contents. In addition, because each network device is separately built, common functions like classification are repeatedly applied. This wastes resources and induces extra delay. To address this problem, PNEs are evolving to support multiple functions in one device. Multi-match classification is one important building block in PNEs: when a packet first enters a PNE, it is classified to identify the relevant functions. Then, only those selected functions will be applied, which saves resources and increases processing speed.

As we can see from the above two applications, multi-match classification is usually the first step in performing complex network system functions, followed by processing that depends on the classification results. Applications that require only single-match classification, however, tend to have further processing that is also simple (e.g., forward to a specific port or drop the packet). Therefore, to maintain the same line rate, multi-match classification must be much faster than single-match classification to leave enough time for subsequent processing without increasing latency too much.

0-7803-8686-8/04/$20.00 ©2004 IEEE

Authorized licensed use limited to: National Cheng Kung University. Downloaded on January 13, 2009 at 03:09 from IEEE Xplore. Restrictions apply.


The single-match problem on multiple fields is complex [4]. For n arbitrary non-overlapping regions in F dimensions, it is possible to achieve an optimized query time of $O(\log n)$, with a complexity of $O(n^F)$ storage in the theoretical worst case [5]. However, real-world rule sets are typically simpler than the theoretical worst case, and heuristic approaches [6, 7, 8, 9] provide faster solutions, e.g., 20-30 memory accesses per packet in the "worst case".

The multi-match classification problem is more complex to implement than single-match classification since it needs all the matching results. Thus, some of the heuristic optimizations used for single-match classification do not apply to the multi-match case. Pure software solutions for multi-match classification are expected to take longer than those for single-match classification. Furthermore, multi-match classification, because of the complex follow-up processing, is likely to have much tighter time requirements. Hence pure software solutions, which require tens of memory accesses, are insufficient. Instead, we need a solution that requires few memory lookups, with a deterministic lookup time for any input, to keep up with the high data rate.

In this paper, we present a scheme that solves the multi-match problem with two memory lookups: one using a Ternary Content Addressable Memory (TCAM), a type of memory that can do parallel search at high speed, and the other using a standard Static Random Access Memory (SRAM). Our solution can save 95% of TCAM space compared with the straightforward solution. Using our scheme, header processing for the SNORT rule set can be done with one TCAM and one SRAM lookup using a 135KB TCAM.

The remainder of the paper is organized as follows: we begin by exploring some design choices and technical challenges in Section 2. Sections 3 and 4 present our solution to the multi-match classification problem with TCAM. Finally, we present simulation results in Section 5 and conclude in Section 7.

Fig. 2. TCAM. (The input is compared against TCAM entries 1 to n; the index of the first matching entry selects a match list stored in SRAM.)
II. MOTIVATION AND TECHNICAL CHALLENGES

A TCAM consists of many entries; the top entry of the TCAM has the smallest index and the bottom entry has the largest. Each entry has several cells that can be used to store a string. A TCAM works as follows: given an input string, it compares the string against all entries in its memory in parallel, and reports the "first" entry that matches the input. The lookup time (5 ns or less) is deterministic for any input. Unlike a binary CAM, each cell in a TCAM can take one of three states: 0, 1, or 'do not care' (X). With 'do not care' states, a TCAM can support matching on variable-length CIDR IP prefixes and thus can be used in high-speed IP lookups [10, 11]. Also, because it has 'do not care' states, one input may match multiple TCAM entries. In this paper, we assume the use of the widely adopted first-match TCAM, which returns the lowest-index match of the input string if there are multiple matches, as shown in Figure 2.

To solve the multi-match classification problem with TCAM, there are two challenges to be tackled: rule ordering and negation representation.

Challenge 1: Arrange rules in TCAM compatible order

Currently, commercially available TCAMs report only the first matching result if there are multiple matches. This type of TCAM cannot directly report multi-match results. Even if we could change the TCAM hardware and let it return a bit vector of all matching results, one bit per entry, it still would not solve the problem, because we would still need to process the bit vector to extract the matching results, and the complexity of that step is still O(n). In the remainder of the paper, we use a first-match TCAM.

Rules can have different relationships such as subset, intersection, and superset. These relationships can cause problems for the matching results given a first-match TCAM. For example, suppose we have the following two rules:

(a) "Tcp $SQL_SERVER 1433 -> $EXTERNAL_NET any"
(b) "Tcp Any Any -> Any 139"

If we put rule (a) before rule (b) in the TCAM, a packet matching both rules will report a match of (a) and never report (b), and vice versa. This is because rules (a) and (b) have an intersection relationship. Hence, we need an algorithm that adds additional rules to the rule set and orders the rules in a specific way to avoid the above problem. We call such an ordering a "TCAM compatible order", which means: when a packet is compared with rules according to this order, we can retrieve all matching results solely based on the first matched rule. There should be no need to check the successive rules.
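The first-match behavior described above can be illustrated with a small software model. This is only a sketch under simplifying assumptions: entries are strings over '0', '1', and 'x', the 4-bit keys are made up for illustration, and the function names (`ternary_match`, `first_match`, `all_matches`) are hypothetical, not part of any TCAM interface.

```python
from typing import List, Optional


def ternary_match(entry: str, key: str) -> bool:
    """True if every cell of `entry` is 'x' (do not care) or equals the key bit."""
    return len(entry) == len(key) and all(e in ("x", k) for e, k in zip(entry, key))


def first_match(tcam: List[str], key: str) -> Optional[int]:
    """Model a first-match TCAM: return the lowest matching index, or None."""
    for index, entry in enumerate(tcam):
        if ternary_match(entry, key):
            return index
    return None


def all_matches(tcam: List[str], key: str) -> List[int]:
    """What a multi-match application actually needs: every matching index."""
    return [i for i, entry in enumerate(tcam) if ternary_match(entry, key)]


# Two partially overlapping entries, analogous to rules (a) and (b).
tcam = ["10xx", "xx01"]
key = "1001"                      # matches both entries
hidden = set(all_matches(tcam, key)) - {first_match(tcam, key)}
```

For the key `1001`, `hidden` contains the index of the second entry: the first-match lookup alone never reports it, whichever of the two entries is stored first.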

Challenge 2: Representing negation with TCAM

The negation (!) operation is common in rule sets. For example, if we wish to find packets that are not destined to TCP port 80, we will use a rule "tcp any any -> any !80". The 16-bit binary form of 80 is 0000 0000 0101 0000. There is no direct way to map the negation into one TCAM entry. If we directly flip every bit, 1111 1111 1010 1111 stands for 65455, which is only a subset of !80. To represent the whole range of !80, we need 16 TCAM entries as shown in Figure 3. The basic approach flips one bit in one of the 16 binary positions and puts 'do not care' in all the others.

Fig. 3. Binary representation of !80 in TCAM. (16 entries: 1xxxxxxxxxxxxxxx, x1xxxxxxxxxxxxxx, and so on, each fixing one flipped bit.)
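The one-flipped-bit-per-entry expansion can be sketched in a few lines. The helper names `negate_value` and `matches` are hypothetical; the exhaustive coverage check uses an 8-bit field only to keep the loop small, which is an assumption of this example and not something the paper does.

```python
def negate_value(value: int, width: int = 16) -> list:
    """Ternary entries that together match every `width`-bit value except `value`.

    Entry i is 'x' everywhere except position i, which holds the complement
    of the corresponding bit of `value`.
    """
    bits = format(value, f"0{width}b")
    entries = []
    for pos, bit in enumerate(bits):
        flipped = "1" if bit == "0" else "0"
        entries.append("x" * pos + flipped + "x" * (width - pos - 1))
    return entries


def matches(entry: str, value: int) -> bool:
    """Check a value against one ternary entry of the same width."""
    bits = format(value, f"0{len(entry)}b")
    return all(e in ("x", b) for e, b in zip(entry, bits))


entries_16 = negate_value(80)          # the 16 entries of Fig. 3
all_bits_flipped = 0xFFFF ^ 80         # the single value obtained by flipping every bit

# Exhaustive coverage check on an 8-bit field.
entries_8 = negate_value(80, width=8)
covered_8 = {v for v in range(256) if any(matches(e, v) for e in entries_8)}
```

Here `covered_8` is every 8-bit value other than 80, whereas flipping all bits at once yields only the single value held in `all_bits_flipped`.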

In addition to port negation, some rules require subnet addresses to be negated. For example, $EXTERNAL_NET frequently appears in rule sets, where $EXTERNAL_NET = !$HOME_NET. To represent this in TCAM directly, we need to flip every bit in the prefix of $HOME_NET, one bit per entry, and put 'do not care' in the other positions. Because IP subnet addresses are 32 bits, each negated address costs up to 32 TCAM entries. Moreover, there could be several negations in one rule. For example, the rule "tcp $EXTERNAL_NET any -> $EXTERNAL_NET !80" requires up to a total of 32*32*16 = 16384 TCAM entries for this single rule! This is obviously not an acceptable approach since TCAMs have a much smaller capacity than SRAMs (e.g., 2MB with current technology).

The next two sections describe our approach to addressing these technical challenges.

III. CREATE RULE SETS IN TCAM COMPATIBLE ORDER

To obtain multi-match results in one lookup with a first-match TCAM, we need to identify intersections between rules. Studies in [4, 12] show that the number of intersections between real-world rules is significantly smaller than the theoretical upper bound because each field has a limited number of values (e.g., all known port numbers) instead of unconstrained random values. So maintaining all the intersection rules is feasible. Indices of the rules used to generate the intersection are stored in a list. We call this a "Match List" and store the list in SRAM. Given a packet, we first perform a TCAM lookup and then use the matching index to retrieve all matching results with a secondary SRAM lookup, as shown in Figure 2. The intersection rules plus the original rules form an extended rule set. Throughout the remainder of this paper, a "rule" refers to a member of the extended rule set, unless otherwise specified as a member of the original rule set. The items in the match list are the indices of rules in the original rule set.

As defined in Section 2, the TCAM compatible order requires rules to be ordered so that the first match records all the matching results in its match list. We first enumerate the relationships between any two different rules $E_i$ and $E_j$, with match lists $M_i$ and $M_j$. There are four cases: exclusive, subset, superset, and intersection, each with the following corresponding requirement:

1. Exclusive ($E_i \cap E_j = \emptyset$): then $i$ and $j$ can have any order.
2. Subset ($E_i \subseteq E_j$): then $i < j$ and $M_j \subseteq M_i$.
3. Superset ($E_i \supseteq E_j$): then $j < i$ and $M_i \subseteq M_j$.
4. Intersection ($E_i \cap E_j \neq \emptyset$): then there is a rule $E_l = E_i \cap E_j$ ($l < i$, $l < j$), and $(M_i \cup M_j) \subseteq M_l$.

Case 1 is trivial: if $E_i$ and $E_j$ are disjoint, they can be in any order since a packet matching $E_i$ never matches $E_j$. For Case 2, where $E_i$ is a subset of $E_j$, every packet matching $E_i$ will match $E_j$ as well, so $E_i$ should be put before $E_j$ and match list $M_i$ should include $M_j$. In this way, packets first matching $E_i$ will not miss matching $E_j$. Similar operations are required for Case 3. Besides these three cases, partially overlapping rules lead to Case 4. In this case, we need a new rule $E_l$ recording the intersection of these two rules ($E_i \cap E_j$), placed before both $E_i$ and $E_j$, with both match results included in its match list ($(M_i \cup M_j) \subseteq M_l$). Note that the intersection of $E_i$ and $E_j$ may be further divided into smaller regions by other rules (e.g., $E_k$ in Figure 4). In this case, all the smaller regions ($E_i \cap E_j$ and $E_i \cap E_j \cap E_k$) have to be placed before both $E_i$ and $E_j$. This can actually be deduced from requirement (4).

Fig. 4. An example of intersections of three rules.
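The four relationships can be decided field by field when every field of a rule is modeled as an inclusive integer interval (exact ports, port ranges, and IP prefixes all fit this model). The sketch below is an illustration under that assumption; `Rule`, `intersect`, and `relationship` are hypothetical names, and the two-field rules at the end keep only source and destination port.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]      # inclusive [low, high]
Rule = Tuple[Interval, ...]     # one interval per header field


def intersect(a: Rule, b: Rule) -> Optional[Rule]:
    """Field-wise intersection of two rules, or None if they are disjoint."""
    out = []
    for (alo, ahi), (blo, bhi) in zip(a, b):
        lo, hi = max(alo, blo), min(ahi, bhi)
        if lo > hi:
            return None
        out.append((lo, hi))
    return tuple(out)


def relationship(a: Rule, b: Rule) -> str:
    """Classify how rule `a` relates to rule `b` (cases 1 to 4)."""
    common = intersect(a, b)
    if common is None:
        return "exclusive"
    if common == a:             # a lies entirely inside b (also covers a == b)
        return "subset"
    if common == b:             # b lies entirely inside a
        return "superset"
    return "intersection"


ANY_PORT = (0, 65535)
rule_a: Rule = ((1433, 1433), ANY_PORT)    # source port 1433, any destination port
rule_b: Rule = (ANY_PORT, (139, 139))      # any source port, destination port 139
overlap = intersect(rule_a, rule_b)        # the region needing an intersection rule
```

For these two rules `relationship(rule_a, rule_b)` is `"intersection"`, and `overlap` is the region that requirement (4) places ahead of both.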

Cases 1 to 4 cover all the possible relationships between any two rules. By applying the corresponding operations discussed above, we can meet the requirements and obtain a TCAM compatible order. Figure 5 is the pseudo-code for creating a TCAM compatible order. The algorithm takes the original rule set $R = \{R_1, R_2, \ldots, R_n\}$ as the input. Each rule $R_i$ is associated with a match list, which initially contains only its own index ($\{i\}$). The algorithm outputs an extended rule set E in TCAM compatible order.

The algorithm inserts one rule at a time into the extended rule set E, which is initially empty (the empty set obviously follows the requirements of TCAM compatible order). Next, we show that after each insertion, E still meets the requirements. Insert(x, E) is the routine to insert rule x into E.


It scans every rule $E_i$ in E and checks the relationship between $E_i$ and x. If they are exclusive, then we bypass $E_i$. If $E_i$ is a subset of x, we just add match list $M_x$ to $M_i$ and proceed to the next rule. If $E_i$ is a superset of x, we add $M_i$ to $M_x$, insert x before $E_i$ according to requirement (3), and ignore all the rules after $E_i$ (see the proof in the appendix). Otherwise, if they intersect, then according to requirement (4), a new rule $E_i \cap x$ is inserted before $E_i$ if it has not already been added. The match list for the new rule is $M_i \cup M_x$. Because we strictly follow the four requirements when adding every new rule, the generated extended rule set E is in TCAM compatible order. Due to space limitations, we do not present the details of the deletion algorithm.

extend_rule_set(R) {
    E = ∅;
    for all the rules R_i in R
        E = insert(R_i, E);
    return E;
}

insert(x, E) {
    for all the rules E_i in E {
        switch the relationship between E_i and x:
        case exclusive:
            continue;
        case subset:
            M_i = M_i ∪ M_x;
            continue;
        case superset:
            M_x = M_x ∪ M_i;
            add x before E_i;
            return E;
        case intersection:
            if (E_i ∩ x ∉ E and M_x ⊄ M_i)
                add t = E_i ∩ x before E_i;
                M_t = M_x ∪ M_i;
    }
    add x at the end of E and return E;
}

Fig. 5. Code for generating TCAM compatible order.

Index   Original Rule Set
1       Tcp $SQL_SERVER 1433 -> $EXTERNAL_NET any
2       Tcp $EXTERNAL_NET 119 -> $HOME_NET any
3       Tcp any any -> any 139

Table 1. Example of original rule set with 3 rules.

Extended Rule Set                              Match List
Tcp $SQL_SERVER 1433 -> $EXTERNAL_NET 139      1, 3
Tcp $SQL_SERVER 1433 -> $EXTERNAL_NET any      1
Tcp $EXTERNAL_NET 119 -> $HOME_NET 139         2, 3
Tcp $EXTERNAL_NET 119 -> $HOME_NET any         2
Tcp any any -> any 139                         3

Table 2. Extended rule set in TCAM compatible order.

To illustrate the algorithm, consider the example in Table 1, which contains three rules. To generate the extended rule set E, we first insert rule 1. Rule 2 does not intersect with rule 1, so it can be added directly. Now, we have rule 1 followed by rule 2. When inserting rule 3, we find that it intersects with both rule 1 and rule 2, so we add two intersection rules with match lists {1, 3} and {2, 3}, each placed before the rule it was derived from, and put rule 3 at the bottom of the TCAM. The final extended rule set E is presented in Table 2.
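The insertion routine of Fig. 5 and the example of Tables 1 and 2 can be reproduced with a small model. This is a sketch, not the authors' implementation: each rule is represented as a frozenset of the packets it matches over a tiny made-up universe, so that subset and intersection tests become plain set operations. The universe values and the names `insert`, `extend_rule_set`, `lookup`, and `rule` are assumptions of this example.

```python
from itertools import product


def insert(x, mx, extended):
    """Insert rule x (a frozenset of packets) with match list mx, following Fig. 5."""
    mx = set(mx)
    i = 0
    while i < len(extended):
        ei, mi = extended[i]
        common = ei & x
        if not common:                  # exclusive: bypass E_i
            pass
        elif ei <= x:                   # E_i is a subset of x
            mi |= mx                    # in-place update of the stored match list
        elif x <= ei:                   # E_i is a superset of x
            mx |= mi
            extended.insert(i, (x, mx))
            return extended
        else:                           # partial overlap
            already = any(e == common for e, _ in extended)
            if not already and not mx <= mi:
                extended.insert(i, (common, mi | mx))
                i += 1                  # E_i moved down by one position
        i += 1
    extended.append((x, mx))
    return extended


def extend_rule_set(rules):
    """Build the extended rule set; original rules are numbered from 1."""
    extended = []
    for index, r in enumerate(rules, start=1):
        insert(r, {index}, extended)
    return extended


def lookup(extended, packet):
    """One first-match 'TCAM' lookup followed by the 'SRAM' match-list read."""
    for r, match_list in extended:
        if packet in r:
            return match_list
    return set()


# Toy universe (illustrative values only): "sql" is a host inside the home network.
NETS = ["sql", "home", "ext"]
PORTS = [1433, 119, 139, 80]
UNIVERSE = list(product(NETS, PORTS, NETS, PORTS))   # (src, sport, dst, dport)


def rule(pred):
    return frozenset(p for p in UNIVERSE if pred(*p))


rule1 = rule(lambda s, sp, d, dp: s == "sql" and sp == 1433 and d == "ext")
rule2 = rule(lambda s, sp, d, dp: s == "ext" and sp == 119 and d != "ext")
rule3 = rule(lambda s, sp, d, dp: dp == 139)
originals = [rule1, rule2, rule3]

extended = extend_rule_set(originals)
match_lists = [sorted(m) for _, m in extended]
```

With these three rules, `match_lists` follows the order of Table 2, and for every packet in the toy universe the match list returned by `lookup` equals the set of original rules that the packet matches.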

IV. NEGATION REMOVING

The scheme presented in Section 3 can be used to generate a set of rules in TCAM compatible order. In this section, we describe how to insert them into TCAM. As explained before, each cell in the TCAM can take one of three states: 0, 1, or 'do not care'. Hence, each rule needs to be represented by these three states. Usually, a rule contains IP addresses, port information, protocol type, etc. IP addresses in CIDR form can be represented in the TCAM using the 'do not care' state. However, the port number may be selected from an arbitrary range. Liu [11] has proposed a scheme to efficiently solve the port range problem. However, we do not use it here because it requires two additional memory lookups and the SNORT rule set does not contain a large number of ranges. We simply map each range directly into the TCAM using multiple entries, e.g., port range 2-5 is represented as 01* and 10*.

A more complicated problem for the TCAM is that some IP and port information is in a negation form. As explained in Section 2, each negation consumes many TCAM entries, so in this section, our goal is to remove negation from the rule set to save TCAM space.

Before we present our scheme, let us first look at the combinations of source and destination IP address spaces as shown in Figure 6. Using the rule set in Table 1 as an example, rule 3 applies to all four regions since it is "any" source to "any" destination; rule 1 applies to region D because we assume $SQL_SERVER is inside $HOME_NET; and rule 2 applies to region A. The regions that contain negation ($EXTERNAL_NET) are region A ($EXTERNAL_NET to $HOME_NET), region D ($HOME_NET to $EXTERNAL_NET), and region B ($EXTERNAL_NET to $EXTERNAL_NET).
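The direct range mapping mentioned above (port range 2-5 becoming 01* and 10*) can be sketched as a greedy split into aligned power-of-two blocks. This is one common way to do the expansion and is offered as an assumption; the paper does not spell out its exact procedure. The function name `range_to_prefixes` is hypothetical, and '*' denotes a 'do not care' cell as in the text.

```python
def range_to_prefixes(low: int, high: int, width: int) -> list:
    """Split the inclusive range [low, high] into ternary prefix entries."""
    entries = []
    while low <= high:
        # Largest aligned power-of-two block that can start at `low`.
        size = low & -low if low else 1 << width
        # Shrink the block until it fits inside what is left of the range.
        while size > high - low + 1:
            size >>= 1
        free_bits = size.bit_length() - 1          # trailing 'do not care' cells
        if free_bits < width:
            prefix = format(low >> free_bits, f"0{width - free_bits}b")
        else:
            prefix = ""
        entries.append(prefix + "*" * free_bits)
        low += size
    return entries


three_bit_example = range_to_prefixes(2, 5, 3)
```

On a 3-bit field, `three_bit_example` reproduces the two entries quoted in the text; the same call with `width=16` gives the entries a real port field would need.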

Fig. 6. Source and destination IP address space. (Axes: source IP and destination IP, each split into Home Net and External Net, giving regions A, B, C, and D.)

Consider region A as an example: the rules in this region are in the form of "* $EXTERNAL_NET * -> $HOME_NET+ *". Note that * means it could be anything (e.g., "tcp" or "any" or a specific value). $HOME_NET+ stands for $HOME_NET and any subset of it, such as $SQL_SERVER. If we can extend rules in region A to regions A and C, we can replace $EXTERNAL_NET with the keyword "any", and the rules are then in the format "* any * -> $HOME_NET+ *". However, after extending the region, we change the semantics of the rule, and this may affect packets in region C. In other words, a packet in the format "* $HOME_NET+ * -> $HOME_NET+ *" will report a match of this rule as well.

This problem, however, is solvable because a TCAM only reports the first matching result. With this property, we can first extract all the rules applying to region C and put those at the top of the TCAM. Next, we add a separator rule between region C and region A: "any $HOME_NET any -> $HOME_NET any" with an empty action list. In this way, all the packets in region C will be matched first and thus ignore all the rules afterwards. With this separator rule, we can now extend all the rules in region A to regions A and C. Similarly, rules in region D can be extended to regions C and D, and rules in region B can be extended to regions A, B, C, and D. Therefore, we put all the rules in the following order:

Rules in region C: "* $HOME_NET+ * -> $HOME_NET+ *"
Separator rule 1: "any $HOME_NET any -> $HOME_NET any"
Rules in region D, specified in the form of regions C and D: "* $HOME_NET+ * -> any *"
Rules in region A, specified in the form of regions A and C: "* any * -> $HOME_NET+ *"
Separator rule 2: "any $HOME_NET any -> any any"
Separator rule 3: "any any any -> $HOME_NET any"
Rules in region B, specified in the form of regions A, B, C, and D: "* any * -> any *"

Putting extended rule sets in this order can be achieved simply by first adding all three separator rules to the beginning of the original rule set, then following the algorithm in Section 3. If a rule applies to region A, it will automatically intersect with separator 1 and generate a new rule in region C. If a rule applies to region B, it will intersect with all three separators and create three intersection rules. After that, we can replace all the $EXTERNAL_NET references with the keyword "any".
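The effect of the separator rules can be checked with a toy first-match model. The sketch below assumes a three-field packet (source network, destination network, destination port), only two networks ("home" and "ext"), and made-up rule numbers 1, 2, and 4; none of these values come from the paper, and `first_match` is a hypothetical helper.

```python
def field_matches(pattern, value):
    """A field pattern matches if it is the wildcard "any" or equals the value."""
    return pattern == "any" or pattern == value


def first_match(tcam, packet):
    """Return the match list of the first entry matching the packet, or None."""
    for pattern, match_list in tcam:
        if all(field_matches(p, v) for p, v in zip(pattern, packet)):
            return match_list
    return None


# Entries in the order given above; $EXTERNAL_NET is already replaced by "any".
tcam = [
    (("home", "home", "any"), []),   # separator rule 1 (no region C rules in this toy set)
    (("home", "any", 80), [1]),      # region D rule, originally home -> $EXTERNAL_NET
    (("any", "home", 80), [2]),      # region A rule, originally $EXTERNAL_NET -> home
    (("home", "any", "any"), []),    # separator rule 2
    (("any", "home", "any"), []),    # separator rule 3
    (("any", "any", "any"), [4]),    # region B rule, originally ext -> ext
]

# The same rules without separators, to show what goes wrong.
no_separators = [entry for entry in tcam if entry[1]]
```

A home-to-home packet on port 80 stops at separator 1 and reports an empty list, while the same packet against `no_separators` would wrongly report rule 1. Separators 2 and 3 play the same role for region D and region A packets that miss every rule of their own region and would otherwise fall through to the rewritten region B rule.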

TCAM Index   TCAM entries                             Match list
1            tcp $HOME_NET any -> $HOME_NET 139       3
2            any $HOME_NET any -> $HOME_NET any       (empty)
3            tcp $SQL_SERVER 1433 -> any 139          1, 3
4            tcp $SQL_SERVER 1433 -> any any          1
5            tcp any 119 -> $HOME_NET 139             2, 3
6            tcp any 119 -> $HOME_NET any             2
7            tcp any any -> any 139                   3

Table 3. Extended rule set in a TCAM with no negation.

Table 3 shows the result of mapping the rule set of Table 1 into TCAM. The first rule, in region C, is extracted from rule 3, which applies to all four regions. The second rule is a separator rule. With these two rules, we can replace $EXTERNAL_NET in rules 3-6 with the keyword "any". At the end, there is rule 7, which applies to all the regions. Separator rules 2 and 3 are omitted because no rule is in the form of $EXTERNAL_NET to $EXTERNAL_NET in the original rule set. In this example, by adding only two rules, we can completely remove $EXTERNAL_NET. Compared to the solution in Table 2, which needs up to 4*32 + 1 = 129 TCAM entries, this is a 94.5% space saving!

The above example is a special case because there is only one type of negation ($EXTERNAL_NET) in one field. In a more general case, there can be more than one negation in each field. For example, there could be both !80 and !90, or !subnet1 and !subnet2, in the same field. Our scheme can be easily extended. If there are k unique negations in one field and their non-negation forms do not intersect (e.g., 80 and 90), then we need k separators of the non-negation form (80, 90) and they can be in any order. If they intersect, we need up to $2^k - 1$ separation rules for this field. For instance, suppose there are !subnet1 and !subnet2. There should be three separation rules applying to subnet1 ∩ subnet2, subnet2, and subnet1. k is usually a very small number because it is limited by the number of peered subnets. In general, if each field i needs $k_i$ separators, then at most $\left(\prod_i (k_i + 1)\right) - 1$ separator rules should be added. In our previous example of removing $EXTERNAL_NET from source and destination IP addresses, $k_1 = k_2 = 1$, so we need a total of 2*2-1 = 3 separator rules.
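The separator counts above are easy to check numerically. The helper names below are hypothetical, and the second list passed in the usage note is just an illustration of the formula with one field that needs two separators.

```python
from math import prod


def separators_for_field(k: int, forms_intersect: bool) -> int:
    """Separators needed in one field holding k unique negations.

    k when the non-negated forms are disjoint, up to 2**k - 1 when they intersect.
    """
    return 2 ** k - 1 if forms_intersect else k


def max_separator_rules(separators_per_field) -> int:
    """Upper bound on separator rules: the product of (k_i + 1), minus one."""
    return prod(k + 1 for k in separators_per_field) - 1


source_and_destination_ip = max_separator_rules([1, 1])
```

`source_and_destination_ip` reproduces the three separator rules of the $EXTERNAL_NET example; `max_separator_rules([1, 1, 2, 1])` gives the bound when a third field needs two separators and a fourth needs one.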

V. SIMULATION RESULTS

To test the effectiveness of our algorithm, we use the SNORT [1] rule set. The SNORT rule set has undergone significant changes since 1999. We tested all the publicly available versions after 2.0. Although each rule set has around 1700-2000 rules, many of the rules share a common rule header. As illustrated in Table 4, the number of unique rule headers in each version is relatively stable. Note that we omitted the versions that share the same rule headers with the previous version.

Version   Release date   Unique rule headers
2.0.0     4/14/2003      ?
2.0.1     7/22/2003      ?
2.1.0     12/18/2003     257
2.1.1     2/25/2004      263

Table 4. SNORT rule headers statistics.

Our task is to put these rule headers into TCAM as classification rules, and store the corresponding matching rule indices in the match list. Hence, given an incoming packet, with one TCAM lookup and one SRAM lookup, we can implement multi-match packet classification.

The second column in Table 5 records the size of the extended rule set in TCAM compatible order. It is roughly 15 times the size of the original rule set, which is well below the theoretical upper bound. This agrees with the findings in [4, 7, 8].


Table 5. Statistics of extended rule set.

Table 6. Performance of negation removing scheme.

The number of negations in the extended rule set is significant. As shown in Table 5, on average 62.4% of the rules have one negation, 1.295% of the rules have two negations, and there are a very small number of rules with three negations. In our simulation, we assume the home network is a class C address with a 24-bit prefix, so each $EXTERNAL_NET needs 24 TCAM entries. Negation of a port, e.g., !80 or !21:23, consumes 16 TCAM entries. Under this setting, a single negation takes up to 24 TCAM entries; a double negation consumes up to 24*24 = 576 TCAM entries; and a triple negation requires up to 24*24*16 = 9216 TCAM entries. Hence, if we directly put all the rules with negation into the TCAM, it takes up to 151,923 TCAM entries, as shown in the third column of Table 6.

Our negation removing scheme presented in Section 4 significantly saves TCAM space. For the SNORT rule header set, we added 2*3*2*2-1 = 23 separation rules in front of the original rule set because there are four types of negations: $EXTERNAL_NET at source IP, $EXTERNAL_NET at destination IP, !21:23 and !80 at source port, and !80 at destination port. This only adds about 10% extra rules to the extended rule set (4th column of Table 6). However, with these 10% more rules, we reduce the number of TCAM entries by over 93%.
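The worst-case entry counts quoted above are products of the widths of the negated fields, since each negated field expands into one entry per bit and the expansions multiply. A minimal sketch, with the hypothetical helper name `negated_rule_entries`:

```python
from math import prod


def negated_rule_entries(negated_field_widths) -> int:
    """Worst-case TCAM entries for one rule stored with its negations expanded.

    Each negated field of w significant bits expands into w entries, and the
    expansions of different fields multiply. A rule without negation costs 1.
    """
    return prod(negated_field_widths)


single = negated_rule_entries([24])            # one negated 24-bit prefix
double = negated_rule_entries([24, 24])        # two negated 24-bit prefixes
triple = negated_rule_entries([24, 24, 16])    # plus one negated 16-bit port
```

The same function with widths `[32, 32, 16]` gives the 16384-entry worst case mentioned in Section 2 for full 32-bit addresses.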

Note that this total number is larger than the extended rule set size. This is because some rules contain port ranges that consume extra TCAM entries. The range mapping approach in [11] is not used because it requires two additional memory lookups for key translations, which reduces classification speed. If a lower speed is acceptable, then we can also incorporate the range mapping technique. In this case, the total number of TCAM entries needed is just the size of the extended rule set after removing negations. Each rule is 104 bits (8 bits of protocol id, 2 ports with 16 bits each, 2 IP addresses with 32 bits each), which can be rounded up to use a TCAM with 128-bit entries. The total TCAM space needed for the SNORT rule header set is 128 bits * 8649 entries = 135KB.

To study the effect of negation, we randomly vary the negation percentages in the original rule set. In the original SNORT rule header sets, 89.7% of rules contain a single negation and 1.1% of the rules contain a double negation. So, we first consider single negation. Figure 7 shows the TCAM space needed both with and without our negation removing scheme. When the percentage of negation is very low, the two schemes perform similarly. On closer inspection, when the negation percentage is very small (<2%), putting negation directly into the TCAM is better than our scheme since we introduce extra separation rules that may intersect with other rules. However, as the percentage of negation increases, the TCAM space needed for the "with negation" case grows very fast. In contrast, the curve of our scheme remains flat and thus saves significant TCAM space. For example, when 98% of the rules involve negation, our scheme saves 95.2% of the TCAM space compared to the "with negation" case. This is only for the single negation case. For double or triple negations, the saving would be even higher since each double or triple negation rule requires many more TCAM entries.

Fig. 7. Negation removing scheme.

VI. RELATED WORK

As far as we know, we are the first to study the multi-match classification problem. The most relevant work is the filter conflict study by Hari et al. [12]. They showed that even for the single-match classification problem, classification rules can intersect and thus introduce conflicts. There are cases where commonly used conflict resolution schemes based on filter ordering do not work. They proposed to solve the problem by adding new filters in a manner similar to our approach.

There have been extensive studies of the single-match classification problem, and some of them can be extended to report multi-match results. For example, Grid of Tries [9] is proposed to solve the two-dimensional (source and destination IP addresses) classification problem. The algorithm can be extended to multiple fields with caching techniques. Other heuristic algorithms like Recursive Flow Classification (RFC) [7], HiCuts, and HyperCuts [6, 8] work well on real-world rule sets for single-match classification. However, these heuristic algorithms require several memory lookups and do not provide a deterministic lookup time. So, they are not well matched to multi-match classification due to the tight time requirements for subsequent processing.


Authorized licensed use limited to: National Cheng Kung University. Downloaded on January 13, 2009 at 03:09 from IEEE Xplore. Restrictions apply.


Most recently, TCAM has been used in high-end routers for single-match packet classification. Since TCAM is smaller and more expensive than SRAM, different approaches have been proposed to save TCAM space or reduce TCAM power consumption. For example, Liu [11] proposed an algorithm for mapping range values into TCAM. CoolCAMs [13] partitions the TCAM so that, for a given packet, only a few partitions are searched, which lowers power consumption. Spitznagel et al. [10] extended this idea and organized the TCAM as a two-level hierarchy in which an index block is used to enable or disable querying of the main blocks. In addition, they incorporated circuits for range comparisons within the TCAM memory array. Our work focuses on the multi-match problem and negation removal.

VII. CONCLUSIONS

In this paper, we use a TCAM-based solution to solve the multi-match classification problem. The solution reports all the matching results with a single TCAM lookup and a single SRAM lookup. In addition, we propose a scheme to remove negation in the rule sets, thus saving 93% to 95% of the TCAM space over a straightforward implementation. From our simulation results, the SNORT rule header set can easily fit into a small TCAM of size 135KB, and all matching results can be retrieved within two memory accesses. We believe a TCAM-based approach is viable, as TCAM is now becoming a common extension to network processors. Although TCAM is more expensive and has higher power consumption than standard memory such as DRAM and SRAM, the capability and speed it offers still make it an attractive approach in high speed networks.

APPENDIX

Claim in Section 3: If $E_i$ is the first superset of $x$ ($x \subset E_i$) in $E$, we can add $x$ before $E_i$ according to requirement (3) and bypass all the rules after $E_i$.

Proof: For any rule $E_j$ after $E_i$, there are four possible cases. We study them one by one and show why we can bypass all of them.

First, we can bypass any rule $E_j$ that is disjoint with $x$, according to requirement (1).

Second, it is impossible that $E_j \subset x$. If so, $E_j \subset x \subset E_i$, which contradicts requirement (2).

Third, if $x \subset E_j$, then $E_j$ must also be a superset of $E_i$. Otherwise, the intersection of $E_j$ and $E_i$ must be a superset of $x$ as well, and it must be present before $E_i$, according to requirement (4). This contradicts the assumption that $E_i$ is the first superset of $x$ in $E$. Therefore, $E_i \subset E_j$ and we have $M_j \subset M_i$ according to requirement (2). In this case, we do not need to process $E_j$, since we can extract all the information from $M_i$.

Fourth, if $E_j$ intersects with $x$, let $z = E_j \cap x$. Then $z$ must have appeared before $E_i$. This is because $E_j$ must intersect with $E_i$ as well, since $E_i$ is a superset of $x$. Let $E_k = E_i \cap E_j$; according to requirement (4), $k < i$. In addition, $z = E_j \cap x = E_j \cap x \cap E_i = E_k \cap x$, because $x \subset E_i$. Therefore, we must have generated $z$ when processing $E_k$, which is before $E_i$. This meets requirement (4), so we can bypass $E_j$.

Hence, all the rules after $E_i$ are either exclusive to $x$, or their intersections have already been included, so we can skip all those rules.

ACKNOWLEDGEMENTS

Special thanks to Dr. T.V. Lakshman from Lucent Bell Labs for suggesting TCAM as a possible solution for the multi-match classification problem. Without his insightful discussion and timely feedback, this paper would not have been possible. We would like to extend our gratitude to the SNORT system developers for implementing this powerful tool and making it open source. We would also like to thank Li Yin, Mel Tsai, Matthew Caesar, Yanlei Diao, and Ananth Rao for proofreading. Finally, we want to thank the anonymous reviewers for their valuable comments and suggestions.

REFERENCES

[1] SNORT network intrusion detection system, www.snort.org
[2] D. L. Tennenhouse and D. J. Wetherall, “Towards an Active Network Architecture,” Computer Communication Review, Vol. 26, No. 2, April 1996
[3] G. Porter, M. Tsai, L. Yin, and R. Katz, “The OASIS Group at U.C. Berkeley: Research Summary and Future Directions,” http://oasis.cs.berkeley.edu/pubs/oasis-wp.doc
[4] M. Kounavis, A. Kumar, H. M. Vin, R. Yavatkar, and A. Campbell, “Directions in Packet Classification for Network Processors,” NP2 Workshop, February 2003
[5] M. H. Overmars and A. F. Stappen, “Range searching and point location among fat objects,” European Symposium on Algorithms, 1994
[6] P. Gupta and N. McKeown, “Packet classification using hierarchical intelligent cuttings,” in Hot Interconnects, August 1999
[7] P. Gupta and N. McKeown, “Packet classification on multiple fields,” in SIGCOMM, August 1999
[8] S. Singh, F. Baboescu, G. Varghese, and J. Wang, “Packet Classification Using Multidimensional Cutting,” in SIGCOMM, August 2003
[9] V. Srinivasan, G. Varghese, S. Suri, and M. Waldvogel, “Fast and Scalable Layer Four Switching,” in SIGCOMM, September 1998
[10] E. Spitznagel, D. Taylor, and J. Turner, “Packet Classification Using Extended TCAMs,” ICNP, November 2003
[11] P. Gupta and N. McKeown, “Algorithms for Packet Classification,” IEEE Network, March 2001
[11] H. Liu, “Reducing Routing Table Size Using Ternary-CAM,” Hot Interconnects, August 2001
[12] A. Hari, S. Suri, and G. Parulkar, “Detecting and Resolving Packet Filter Conflicts,” in INFOCOM, March 2000
[13] F. Zane, G. Narlikar, and A. Basu, “CoolCAMs: Power-Efficient TCAMs for Forwarding Engines,” in INFOCOM, March 2003
