Multi-dimensional Packet Classification Yadi Ma, Suman Banerjee - - PowerPoint PPT Presentation



slide-1
SLIDE 1

A Smart Pre-Classifier to Reduce Power Consumption of TCAMs for Multi-dimensional Packet Classification

Yadi Ma, Suman Banerjee University of Wisconsin-Madison

slide-2
SLIDE 2

Packet classification

[Figure: network topology with router R, the Internet, hosts S1, S2 and D, Subnet A, Subnet B, and links L1 and L2]

Classifier at Router R

From   To   Traffic type   Action
S1     D    Port 80        Forward via L1
S2     D    *              Drop all traffic
A      B    *              Reserve 50 Mbps

slide-3
SLIDE 3

Definition

  • Packet classification: given a classifier, find the first (highest priority) matching rule for each incoming packet
  • A classifier contains a set of rules ordered by priority
  • Our focus: n-tuple classification
  • Example classifier:

Rule #   Source IP        Dest. IP      Source Port      Dest. Port     Protocol   Action
1        *                10.112.*.*    5001 - 65535     *              TCP        deny
2        32.75.226.153    *             *                1001 - 2000    UDP        deny
3        199.36.184.*     *             49152 - 65535    *              UDP        deny
4        *                *             *                *              *          permit

  • Given a packet header: (32.75.226.153, 198.35.180.5, 80, 1040, UDP)
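A minimal sketch of first-match classification, applied to the example classifier and packet header on this slide. The names Rule, matches and classify are illustrative and not from the talk; the wildcard octets are encoded as prefixes (10.112.*.* as 10.112.0.0/16, 199.36.184.* as 199.36.184.0/24), which is an assumption about how the rules are stored.

```python
import ipaddress
from dataclasses import dataclass

ANY_IP = "0.0.0.0/0"        # stands for the wildcard * on an IP field
ANY_PORT = (0, 65535)       # stands for the wildcard * on a port field


@dataclass
class Rule:
    src: str        # source IPv4 prefix
    dst: str        # destination IPv4 prefix
    sport: tuple    # inclusive (low, high) source port range
    dport: tuple    # inclusive (low, high) destination port range
    proto: str      # "TCP", "UDP", or "*" for any protocol
    action: str


# The example classifier, in priority order (rule 1 first).
CLASSIFIER = [
    Rule(ANY_IP, "10.112.0.0/16", (5001, 65535), ANY_PORT, "TCP", "deny"),
    Rule("32.75.226.153/32", ANY_IP, ANY_PORT, (1001, 2000), "UDP", "deny"),
    Rule("199.36.184.0/24", ANY_IP, (49152, 65535), ANY_PORT, "UDP", "deny"),
    Rule(ANY_IP, ANY_IP, ANY_PORT, ANY_PORT, "*", "permit"),
]


def matches(rule, pkt):
    """True if every field of the packet header falls inside the rule."""
    src, dst, sport, dport, proto = pkt
    return (ipaddress.ip_address(src) in ipaddress.ip_network(rule.src)
            and ipaddress.ip_address(dst) in ipaddress.ip_network(rule.dst)
            and rule.sport[0] <= sport <= rule.sport[1]
            and rule.dport[0] <= dport <= rule.dport[1]
            and rule.proto in ("*", proto))


def classify(classifier, pkt):
    """Return (rule number, action) of the first matching rule, or None."""
    for number, rule in enumerate(classifier, start=1):
        if matches(rule, pkt):
            return number, rule.action
    return None


packet = ("32.75.226.153", "198.35.180.5", 80, 1040, "UDP")
print(classify(CLASSIFIER, packet))  # (2, 'deny')
```

The packet misses rule 1 on the destination IP and hits rule 2 on every field, so rule 2 is the first match. A linear scan like this is what a TCAM replaces with a single parallel lookup.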

slide-4
SLIDE 4

Packet classification schemes

  • Software-based schemes

– Tradeoff between memory usage and speed
– Examples: HiCuts, HyperCuts, EffiCuts, etc.

  • Hardware (TCAM)-based schemes

– Popular for high-throughput packet classification

slide-10
SLIDE 10

Problem Statement

  • TCAMs are power-hungry
  • Design a TCAM-based method that:

– Greatly reduces power consumption of TCAMs, especially for large classifiers
– Uses commodity TCAMs
– Is easy to implement

slide-17
SLIDE 17

Outline

Introduction and motivation

Design of SmartPC

– Algorithms to manage two-stage classification

Evaluation methods and results

Conclusion

slide-18
SLIDE 18

Packet classification system for SmartPC

  • Two-stage classification

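– First stage: pre-classifier
– Second stage: two parallel searches

[Figure: the index TCAM (pre-classifier entries) produces a match index into the index SRAM, which selects one “specific” block of the TCAM that holds the classifier rules. The “specific” block and the “general” blocks are searched in parallel; the associated SRAM supplies priorities and actions, and priority resolution outputs the final action.]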

How to build an efficient pre-classifier?

slide-19
SLIDE 19

Pre-classifier

  • How to build a pre-classifier?

– Built on two dimensions: source and destination IP addresses
– By expanding and combining two-dimensional rules recursively

  • Also shuffle original rules into different TCAM blocks accordingly

slide-20
SLIDE 20

Why is reducing 5D to 2D a good choice?

  • Analyze more than 200 real classifiers ranging in size from 3 to 15,181 rules

[Figure: maximum number of overlapping rules in the two-dimensional space]

Maximum number of overlapping rules is an order of magnitude smaller than classifier size.
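The slide does not spell out how overlapping rules are counted. The sketch below assumes one plausible reading: the largest number of rules whose (source prefix, destination prefix) rectangles share a common point. The function names and the four sample rules are hypothetical. It relies on the fact that, for axis-aligned rectangles, the deepest point can always be found at the lower-left corner of an intersection, so only combinations of lower bounds need to be tested.

```python
import ipaddress


def to_range(prefix):
    """Convert an IPv4 prefix into an inclusive integer range (low, high)."""
    net = ipaddress.ip_network(prefix)
    return int(net.network_address), int(net.broadcast_address)


def max_overlap(rules_2d):
    """Largest number of 2D rules that contain a common (src, dst) point."""
    rects = [(to_range(src), to_range(dst)) for src, dst in rules_2d]
    # Candidate points: every pairing of a source low bound with a
    # destination low bound taken from the rules themselves.
    xs = {src[0] for src, _ in rects}
    ys = {dst[0] for _, dst in rects}
    best = 0
    for x in xs:
        for y in ys:
            depth = sum(1 for src, dst in rects
                        if src[0] <= x <= src[1] and dst[0] <= y <= dst[1])
            best = max(best, depth)
    return best


# Hypothetical rules, projected onto (source prefix, destination prefix).
rules = [
    ("10.0.0.0/8", "0.0.0.0/0"),
    ("10.1.0.0/16", "192.168.0.0/16"),
    ("0.0.0.0/0", "192.168.1.0/24"),
    ("172.16.0.0/12", "10.0.0.0/8"),
]
print(max_overlap(rules))  # 3
```

The brute-force scan is cubic in the number of rules, which is acceptable for offline analysis of a classifier but is not meant as an efficient algorithm.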

slide-21
SLIDE 21

An example classifier containing 14 rules

slide-24
SLIDE 24

Same example classifier containing 14 rules

slide-27
SLIDE 27

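SmartPC

[Figure: the classifier's rules (labeled 1, 2, 3/4, 5, 6, 7, 8, 9, 10, 11/12/13) drawn in the Src_addr × Dst_addr plane, with two pre-classifier entries P0 and P1]

Pre-classifier:   P0, P1
TCAM block:       0, 1, 5, 6, 8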

slide-28
SLIDE 28

SmartPC

[Figure: same rules in the Src_addr × Dst_addr plane, with pre-classifier entries P0 and P1]

Pre-classifier:               P0, P1
Specific blocks (in TCAM):    0, 1, 5, 6, 8
                              2, 3, 4, 9, 10

slide-29
SLIDE 29

SmartPC

[Figure: same rules in the Src_addr × Dst_addr plane, with pre-classifier entries P0 and P1]

Pre-classifier:               P0, P1
Specific blocks (in TCAM):    0, 1, 5, 6, 8
                              2, 3, 4, 9, 10
General block (in TCAM):      7, 11, 12, 13

slide-35
SLIDE 35

Example: how to build a pre-classifier

[Animation across slides 35 to 43, over the same figure of rules (labeled 1, 2, 3/4, 5, 6, 7, 8, 9, 10, 11/12/13) in the Src_addr × Dst_addr plane. Entry P0 is grown step by step from a first rule: rule 1 joins its block, then rules 5 and 6, then rule 8, while rule 7 and later rules 11, 12 and 13 are set aside. A second entry, P1, is then created. Final state on slide 43, where an incoming packet is shown entering the pre-classifier:]

Pre-classifier:     P0, P1
Specific blocks:    P0: 0, 1, 5, 6, 8
                    P1: 2, 3, 4, 9, 10
General block:      7, 11, 12, 13

slide-44
SLIDE 44

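Packet classification system for SmartPC

[Figure: worked example of a lookup. The incoming packet is searched in the index TCAM (pre-classifier entries P0 and P1). The match index goes through the index SRAM and selects the specific block of P0 (rules 0, 1, 5, 6, 8); the specific block of P1 (rules 2, 3, 4, 9, 10) is not searched. The general block (rules 7, 11, 12, 13) is searched in parallel. From the associated SRAM (priorities + actions), the specific block returns "1, accept" and the general block returns "7, deny". Priority resolution outputs accept.]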
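The lookup on this slide can be emulated in software. The sketch below is a simplified model, not the hardware design: rules and pre-classifier entries are reduced to two fields with inclusive integer ranges, and all ranges are hypothetical values chosen so that a packet hits rule 1 in the specific block of P0 and rule 7 in the general block, as in the slide's example. The names Rule, search_block and smartpc_lookup are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    priority: int   # smaller number = higher priority, like the rule number
    src: tuple      # inclusive (low, high) range on the source address
    dst: tuple      # inclusive (low, high) range on the destination address
    action: str

    def matches(self, pkt):
        s, d = pkt
        return (self.src[0] <= s <= self.src[1]
                and self.dst[0] <= d <= self.dst[1])


def search_block(block, pkt):
    """Emulate one TCAM block: highest-priority matching rule, or None."""
    hits = [r for r in block if r.matches(pkt)]
    return min(hits, key=lambda r: r.priority) if hits else None


def smartpc_lookup(pre_classifier, specific_blocks, general_block, pkt):
    s, d = pkt
    # Stage 1: entries do not overlap, so at most one entry matches.
    entry = next((name for name, (src, dst) in pre_classifier.items()
                  if src[0] <= s <= src[1] and dst[0] <= d <= dst[1]), None)
    # Stage 2: search the general block and the selected specific block.
    results = [search_block(general_block, pkt)]
    if entry is not None:
        results.append(search_block(specific_blocks[entry], pkt))
    results = [r for r in results if r is not None]
    # Priority resolution between the (at most two) partial results.
    return min(results, key=lambda r: r.priority) if results else None


# Hypothetical 8-bit address space, for illustration only.
pre_classifier = {"P0": ((0, 127), (0, 127)), "P1": ((128, 255), (128, 255))}
specific_blocks = {
    "P0": [Rule(1, (0, 63), (0, 63), "accept"),
           Rule(5, (64, 127), (0, 127), "deny")],
    "P1": [Rule(2, (128, 255), (128, 191), "deny")],
}
general_block = [Rule(7, (0, 255), (0, 63), "deny")]  # not inside one entry

winner = smartpc_lookup(pre_classifier, specific_blocks, general_block, (10, 20))
print(winner.priority, winner.action)  # 1 accept
```

Only the block selected by the pre-classifier and the general block are searched, which is where the power saving comes from; the other specific blocks stay idle for this packet.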

slide-45
SLIDE 45

Properties of pre-classifiers

  • Entries in a pre-classifier are non-overlapping
  • Each rule in a classifier is either covered by only one pre-classifier entry, or marked as general
slide-46
SLIDE 46

Rule update

  • Rule update overhead of SmartPC is generally smaller than that of regular TCAMs
  • The ordering of TCAM entries is kept within one specific block or within a small number of general blocks, rather than throughout all the blocks
  • Rule update

– Insert a rule
– Delete a rule

slide-47
SLIDE 47

Outline

Introduction and motivation

Design of SmartPC

– Algorithms to manage two-stage classification

Evaluation methods and results

Conclusion

slide-48
SLIDE 48

Experimental setup (1)

  • Summary of classifiers

10 real classifiers

Name   Size     MaxOverlaps   Wildcard
R1     5233     49            18
R2     5626     63            32
R3     5874     98            48
R4     6339     47            16
R5     7356     38            5
R6     8063     64            35
R7     8475     31            4
R8     10054    1 (*)
R9     11574    334           271
R10    15181    177           143

10 synthetic classifiers

Name   Size     MaxOverlaps   Wildcard
S1     9802     22            4
S2     9416     126           57
S3     9497     76            18
S4     9624     82            12
S5     7255     28 (*)
S6     99823    27            5
S7     87039    249           79
S8     99836    89            47
S9     99866    81            38
S10    99220    10 (*)

(*) Only one of the two values survives in the source; which column it belongs to is not recoverable.

slide-49
SLIDE 49

Experimental setup (2)

  • Block size of TCAMs

– Evaluated various sizes: 32, 64, 128, 256, 512 and 1024

  • Metric

– Power reductions: percentage of reductions on activated blocks
– Storage overhead of pre-classifier entries: percentage of pre-classifier size compared to the size of a whole classifier

  • Schemes

– SmartPC
– Default TCAM (without SmartPC)
– A naïve scheme named Naive-divide
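A rough sketch of how the power-reduction metric could be computed. It assumes, beyond what the slide states, that the default TCAM activates every block holding a rule, and that SmartPC activates the block(s) holding the pre-classifier, exactly one specific block, and all general blocks. The function name, parameters and input numbers are hypothetical, not results from the talk.

```python
import math


def power_reduction(num_rules, block_size, num_pre_entries, num_general_rules):
    """Percentage reduction in activated TCAM blocks per lookup."""
    # Default TCAM: every block that stores rules is searched.
    baseline = math.ceil(num_rules / block_size)
    # SmartPC (assumed accounting): pre-classifier block(s), one specific
    # block chosen by the pre-classifier, and all general block(s).
    index_blocks = math.ceil(num_pre_entries / block_size)
    general_blocks = math.ceil(num_general_rules / block_size)
    activated = index_blocks + 1 + general_blocks
    return 100.0 * (1 - activated / baseline)


# Hypothetical classifier: 10,000 rules, block size 128,
# 100 pre-classifier entries, 200 rules marked as general.
print(round(power_reduction(10000, 128, 100, 200), 1))  # 94.9
```

With these inputs the default TCAM searches 79 blocks per lookup and the assumed SmartPC accounting searches 4, which is why the saving grows with classifier size as long as the general blocks stay few.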

slide-50
SLIDE 50

Power reductions

[Figure: percentage of power reductions vs. TCAM block size; panels: real classifiers, synthetic classifiers]

With block size 128, the median and average power reductions are 91% and 88%, respectively.

slide-51
SLIDE 51

Storage overhead

[Figure: fraction of storage overhead vs. TCAM block size; panels: real classifiers, synthetic classifiers]

Small storage overhead, less than 4% for every classifier.

slide-52
SLIDE 52

Comparison of SmartPC with Naïve-divide

[Figure: percentage of power reductions with block size 128; panels: real classifiers, synthetic classifiers]

SmartPC outperforms Naïve-divide by more than 20% on average.

slide-53
SLIDE 53

Discussion

  • Effect of prefix distribution and prefix length
  • Power reduction on small classifiers
  • Power reduction on IPv6 classifiers
slide-54
SLIDE 54

Conclusion

  • Propose SmartPC, which:

– Uses commodity TCAMs
– Is easy to implement
– Greatly reduces power consumption of TCAMs, especially for larger classifiers
slide-55
SLIDE 55

Questions

slide-56
SLIDE 56

Thanks

slide-57
SLIDE 57

Backup slides

slide-58
SLIDE 58

Prior work on Packet Classification

  • Software-based approaches

– Examples: HiCuts, HyperCuts, EffiCuts, etc.

  • TCAM-based approaches

– High speed, but suffer from deficiencies such as high power consumption
– Schemes for power efficiency:

  • CoolCAMs (INFOCOM 2003): reduces power consumption of TCAMs, but limited to IP forwarding
  • Extended TCAMs (ICNP 2003): requires a new type of TCAM that returns multiple matches
  • Significant recent work has been done within companies and is proprietary

slide-59
SLIDE 59

Number of blocks activated vs. block size

[Figure: number of blocks activated vs. block size; panels: R1, R9, S4, S10]

slide-60
SLIDE 60

Observations

  • TCAMs

– The main component of power consumption in TCAMs is proportional to the number of searched entries
– Hardware supports turning on a small number of blocks
– Hardware supports multiple searches simultaneously, such as Cisco’s TCAM4

  • Classifiers

– For each incoming packet, often only a small number of matching rules in a classifier need to be searched

http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps4324/prod_white_paper0900aecd806dc821.html

slide-61
SLIDE 61

Some stats

  • A 2006 report states:

– Data centers in the U.S. today consume about 61 billion kWh (1.5% of total U.S. electricity consumption) for a total electricity cost of about $4.5 billion
– National energy consumption by servers and data centers could nearly double by 2011 to more than 100 billion kWh

  • According to a SIGCOMM CCR 2009 paper, the network consumes 10-20% of a data center's total power.
  • With the growing sizes of classifiers, and the transition from IPv4 to IPv6, the high power consumption of TCAMs increases both power supply cost and cooling cost

Report to Congress on Server and Data Center Energy Efficiency, U.S. Environmental Protection Agency.
The cost of a cloud: research problems in data center networks, SIGCOMM CCR 2009.

slide-62
SLIDE 62

Properties of real classifiers

  • Analyze more than 200 real classifiers ranging in size from 3 to 15,181

[Figures: maximum number of overlapping rules in the two-dimensional space; number of wildcard rules in the two-dimensional space]

Reduce the five-dimensional problem to two-dimensional!

slide-63
SLIDE 63

Pre-process a classifier

  • Given a multi-dimensional classifier C containing a number of rules:

– The two-dimensional space is divided into non-overlapping rectangles. Each rectangle covers a cluster of rules and represents an entry in the pre-classifier P for C
– Shuffle rules in C such that each pre-classifier entry is associated with a TCAM block, named a specific block
– If the number of rules that intersect with a pre-classifier entry exceeds the TCAM block size, those extra rules are stored in TCAM blocks named general block(s)
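The shuffling step can be sketched as follows, assuming the pre-classifier entries have already been built. A rule goes to the specific block of the single entry that fully covers it while that block has room, and to the general list otherwise. The slide does not say which rules overflow when a block is full; keeping rules in priority order, first come first served, is an assumption here. All names, rectangles and the block size are hypothetical.

```python
def covers(outer, inner):
    """True if rectangle `outer` fully contains rectangle `inner`.

    A rectangle is ((src_low, src_high), (dst_low, dst_high)), inclusive.
    """
    (o_src, o_dst), (i_src, i_dst) = outer, inner
    return (o_src[0] <= i_src[0] and i_src[1] <= o_src[1]
            and o_dst[0] <= i_dst[0] and i_dst[1] <= o_dst[1])


def assign_blocks(entries, rules, block_size):
    """Split rules into per-entry specific blocks and a general list."""
    specific = {name: [] for name in entries}
    general = []
    for rule_id, rect in rules:              # rules arrive in priority order
        owners = [name for name, e in entries.items() if covers(e, rect)]
        # Entries do not overlap, so at most one entry can cover a rule.
        if len(owners) == 1 and len(specific[owners[0]]) < block_size:
            specific[owners[0]].append(rule_id)
        else:
            general.append(rule_id)          # uncovered, or block overflow
    return specific, general


# Hypothetical entries and rules on a small integer address space.
entries = {"P1": ((0, 99), (0, 99)), "P2": ((100, 199), (0, 99))}
rules = [
    (1, ((0, 199), (0, 199))),    # spans both entries
    (2, ((0, 49), (0, 49))),
    (3, ((50, 99), (0, 99))),
    (4, ((0, 9), (0, 9))),        # inside P1, but P1's block is already full
    (5, ((100, 149), (10, 20))),
]
specific, general = assign_blocks(entries, rules, block_size=2)
print(specific)  # {'P1': [2, 3], 'P2': [5]}
print(general)   # [1, 4]
```

Appending in priority order keeps each block's internal ordering consistent with the original classifier, so priority resolution only has to compare the winner of one specific block with the winner of the general blocks.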

slide-64
SLIDE 64

Pre-process a classifier

Given a classifier which contains 19 rules, block size = 5

[Figure: the 19 rules drawn in the Src_addr × Dst_addr plane, grouped under three pre-classifier entries P1, P2 and P3]

2-dimensional pre-classifier entries in TCAM block(s):   P1, P2, P3

5-dimensional classifier rules in TCAM blocks:

Specific blocks:   2, 3, 4, 16
                   5, 6, 7, 8, 9
                   11, 12, 13, 14, 15
General blocks:    1, 10, 17, 18, 19

slide-65
SLIDE 65

Proposed solution: SmartPC

[Figure: a search key goes through the pre-classifier and then the TCAM, which returns the result]

Expect huge power reduction on large classifiers

How to build an efficient pre-classifier?