Design for Large-Scale Collection System Using Flow Mediators

SLIDE 1

FloCon 2008, Jan 8, 2008

Design for Large-Scale Collection System Using Flow Mediators

Atsushi Kobayashi, Tsuyoshi Kondoh, and Keisuke Ishibashi
NTT Information Sharing Laboratories

SLIDE 2

Outline

Introduction

  • Why do we need a large-scale collection system?
  • What is a Flow Mediator?

Requirements

  • I explored the feasibility of a large-scale collection system for large networks.

Heuristic method of designing a traffic collection system

  • Estimate the number of flow records after aggregation or sampling.
  • Adjust several parameters based on this result.

Summary

SLIDE 3

Introduction

Traffic volumes in ISP networks have become huge over the last few years.

  • The number of exported flow records is becoming so large that a single collector cannot handle them.

A lower sampling rate makes small flows invisible.

  • Even as traffic grows, network operators would like to keep the same sampling rate as far as possible.

Flow records aggregated at the router make port numbers or IP addresses invisible.

  • Exporting 5-tuple flow records from the router is better.

The demand for a large-scale traffic-collection system is growing.

SLIDE 4

What is a Flow Mediator?

A Flow Mediator† is a device that "mediates" flow records and has the following functions:

  • collects flow records from various exporters
  • stores the original flow records
  • aggregates flow records flexibly
  • distributes appropriate flow records to collectors/analyzers

† draft-kobayashi-ipfix-mediator-model-01.txt

[Figure: Exporters send flow records to the Mediator (with DB). Other labels: Traffic Matrix measurement, Network designer, Accounting System, Customer Service, Troubleshoot System, Network Operator.]

A flow mediator should be useful for building a large-scale collection system.

SLIDE 5

You can easily write flow mediation code

The Net::Flow Perl module is available on CPAN.

  • http://search.cpan.org/~akoba/Net-Flow-0.02/
  • The module can encode and decode NetFlow/IPFIX packets.
  • The encoding and decoding functions have a similar interface.

[Figure: NetFlow v.5, NetFlow v.9, and IPFIX input → decode → Flow Mediator → encode → NetFlow v.9 and IPFIX output. Functions listed: flow aggregation, flow distribution, protocol converter, anonymization.]

    use Net::Flow;

    # Decode one received packet; the template array is passed back in on each call.
    my ( $HeaderHashRef, $TemplateArrayRef, $FlowArrayRef, $ErrorsArrayRef ) =
        Net::Flow::decode( \$packet, $TemplateArrayRef );

    # Encode flow records into packets no larger than 1400 bytes.
    # $EncodeHeaderHashRef and @MyTemplates are prepared by the caller.
    my $PktsArrayRef;
    ( $EncodeHeaderHashRef, $PktsArrayRef, $ErrorsArrayRef ) =
        Net::Flow::encode( $EncodeHeaderHashRef, \@MyTemplates, $FlowArrayRef, 1400 );

SLIDE 6

Requirements

Build a traffic-collection system that meets the following requirements.

Requirement 1: measure the traffic flow of entire networks

  • Measure traffic matrices PoP by PoP and router by router.

Requirement 2: store the 5-tuple flow records received from routers

  • When a traffic incident happens, allow inspection of the traffic.

Requirement 3: design a scalable architecture that accommodates large ISP traffic volumes

SLIDE 7

Goal

Explore a heuristic method of designing a collection system for introduction into an actual network.

The proposed collection system needs to accommodate the following network model.

  • Total traffic volume: 500 Gb/s, 100 Mpps
  • Edge routers: 20 per PoP × 10 PoPs = 200
  • NetFlow is enabled on the ingress interfaces of the edge routers.

[Figure: Network model. PoP #1, PoP #2, ..., PoP #10; edge routers connected through core routers; NetFlow observation points at the edge routers.]
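The network model implies a few derived figures that are useful when sizing the collectors. The short sketch below works them out from the slide's totals; the even spread of traffic across edge routers is an assumption of this sketch, not a claim from the slides.

```python
# Back-of-the-envelope figures for the network model on this slide.
TOTAL_BITS_PER_S = 500e9      # 500 Gb/s total traffic (from the slide)
TOTAL_PACKETS_PER_S = 100e6   # 100 Mpps total packet rate (from the slide)
ROUTERS_PER_POP = 20          # from the slide
POPS = 10                     # from the slide

edge_routers = ROUTERS_PER_POP * POPS
avg_packet_bytes = TOTAL_BITS_PER_S / TOTAL_PACKETS_PER_S / 8

# Assumption: traffic is spread evenly over all edge routers.
per_router_gbps = TOTAL_BITS_PER_S / edge_routers / 1e9
per_router_mpps = TOTAL_PACKETS_PER_S / edge_routers / 1e6

print(f"edge routers:        {edge_routers}")
print(f"average packet size: {avg_packet_bytes:.0f} bytes")
print(f"per-router load:     {per_router_gbps:.1f} Gb/s, {per_router_mpps:.1f} Mpps")
```

Real PoPs are rarely balanced, so the per-router figures are only a starting point for capacity planning.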

SLIDE 8

Hierarchical Collection System

Mediators are placed in each PoP.

  • They store all flow records, aggregate them, and export them to the next collector.

Top Collector

  • Measures wide-area traffic matrices, such as router by router and PoP by PoP.

Inspection

  • If a traffic incident happens, we can retrieve detailed flow records from the Flow Mediator.

[Figure: Top Collector above Mediator #1 (PoP #1), Mediator #2 (PoP #2), ..., Mediator #10 (PoP #10); 10 PoPs, 20 routers/PoP; observation points at the edge routers. Callouts: Requirement 1, Requirement 2, Requirement 3.]

SLIDE 9

Visualize Traffic Matrices

The top collector can visualize router, PoP, and AS traffic matrices.

  • We can select all traffic or a specific VPN (customer).
  • Color indicates the traffic volume of each source/destination pair.
  • Nail is the name of our traffic matrix visualizer.

[Figure: traffic matrix with axes Source PoP and Destination PoP.]

SLIDE 10

Heuristic Design Method

Suitable values of several parameters are decided by the following steps.

Step 0: measure the performance limits of the flow mediator and the top collector.

Step 1: reveal the relation between the number of flow records and packet sampling.

Step 2: reveal the relation between the number of flow records and aggregation, which depends on several factors.

  • Aggregation methods (BGP Next-Hop, Prefix, host)
  • Aggregation interval time (20 s, 60 s, 90 s…)

Step 3: select suitable values within the performance limits.

  • A higher sampling rate is preferable.
  • A finer aggregation granularity is preferable.

SLIDE 11

Consideration Points

The considerations are as follows.

The maximum performances of the top collector and the mediators are 5 kf/s and 10 kf/s, respectively.

  • How many flow records does the top collector receive?
  • How many flow records does the mediator receive?
  • Which aggregation method and interval time are better?
  • What is the maximum sampling rate?

These questions are addressed in Step 1, Step 2, Step 3-A, and Step 3-B.

[Figure: Top Collector (max. 5 kf/s) above Mediators #1 to #10 (max. 10 kf/s), one per PoP; 10 PoPs, 20 routers/PoP; observation points at the edge routers.]

SLIDE 12

Step 1: estimate flow records after sampling

Estimate the number of flow records based on the density function of packets per flow.

  • # of packets per flow: x
  • Packets-per-flow density function: F(x) = 0.5 \times x^{-1.73}
  • Sampling rate: 1/r
  • Total number of unsampled flows: f_{all}

f_{sampled} = f_{all} \times \sum_{x=1}^{\infty} F(x) \times \left( 1 - (1 - 1/r)^{x} \right)

The factor 1 - (1 - 1/r)^{x} is the extraction probability: the probability that at least one packet of an x-packet flow is sampled.

Roughly estimate f_{all} as follows: 100 Mpps ÷ 20 packets = 5 Mf/s (approximate # of flows when the total traffic volume is 500 Gb/s).

Sampling rate    f_{sampled}
1/100            305 kf/s
1/1000           43 kf/s
1/10000          5.2 kf/s

[Figure: density function versus # of packets per flow (log-log axes, 1 to 100000 packets), with the fit 0.5x^{-1.73}.]
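The estimate on this slide can be sketched numerically. The code below reads the slide's formula as f_sampled = f_all × Σ F(x) × (1 − (1 − 1/r)^x) with F(x) = 0.5 × x^(−1.73), and cuts the infinite sum off at `x_max` packets. The function names and the cut-off are assumptions of this sketch. Because the result depends on the cut-off and on how well the fitted F(x) matches the measured distribution, the printed values need not reproduce the slide's table.

```python
def density(x: int) -> float:
    """Packets-per-flow density from the slide: F(x) = 0.5 * x^-1.73."""
    return 0.5 * x ** -1.73


def sampled_fraction(r: int, x_max: int = 100_000) -> float:
    """Fraction of flows that keep at least one packet at sampling rate 1/r.

    x_max truncates the infinite sum; it is an assumption of this sketch.
    """
    p = 1.0 / r
    return sum(density(x) * (1.0 - (1.0 - p) ** x) for x in range(1, x_max + 1))


def sampled_flow_rate(f_all: float, r: int, x_max: int = 100_000) -> float:
    """Flow records per second that remain after 1/r packet sampling."""
    return f_all * sampled_fraction(r, x_max)


# Rough estimate from the slide: 100 Mpps / 20 packets per flow = 5 Mf/s.
f_all = 100e6 / 20

for r in (100, 1000, 10000):
    print(f"1/{r}: {sampled_flow_rate(f_all, r) / 1e3:.1f} kf/s")
```

The extraction probability 1 − (1 − 1/r)^x treats each packet as sampled independently with probability 1/r, which is the usual model for random packet sampling.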

SLIDE 13

Too many flow records without mediator

Even if the sampling rate is 1/10,000 packets, the number of flow records (5.2 kf/s) exceeds the performance limit of the top collector (max. 5 kf/s).

Sampling rate    f_{sampled}
1/100            305 kf/s
1/1000           43 kf/s
1/10000          5.2 kf/s

[Figure: Top Collector (max. 5 kf/s) receiving flow records directly from the edge routers of PoP #1 to PoP #10 (10 PoPs, 20 routers/PoP); 5.2 kf/s at a sampling rate of 1/10000 packets.]

SLIDE 14

Step 2: flow records after aggregation

What is the # of flow records after aggregation?

  • The mediator aggregates unsampled flow records (sampling rate = 1/1 packets) at 20-second intervals.
  • Aggregation ratio = fe/fr, where fr is the number of flow records the mediator receives and fe is the number it exports to the top collector.
  • Aggregation efficiency: Prefix > HOST > Pair Prefix > Pair HOST > Bi-Flow
  • The prefix length "/24" is uniformly applied to Prefix Aggregation.
  • A bi-flow is aggregated from the two directions of a flow.

[Figure: aggregation ratio (= fe/fr) versus elapsed time (s), 0 to 900 s, for PAIR_HOST, SRC_HOST, DST_HOST, PAIR_PREFIX, SRC_PREFIX, DST_PREFIX, and BIFLOW.]
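To make the aggregation ratio concrete, here is a minimal sketch of DST_HOST and DST_PREFIX (/24) aggregation over one interval. The record layout (dictionaries with `src`, `dst`, `packets`, `bytes`), the function names, and the sample records are hypothetical; a real mediator would work on decoded NetFlow/IPFIX records.

```python
from collections import defaultdict
from ipaddress import ip_network


def dst_prefix_key(flow, prefix_len=24):
    """Aggregation key for DST_PREFIX: the destination /24 network."""
    return str(ip_network(f"{flow['dst']}/{prefix_len}", strict=False))


def dst_host_key(flow):
    """Aggregation key for DST_HOST: the destination address itself."""
    return flow["dst"]


def aggregate(flows, key_fn):
    """Merge flow records that share a key, summing packets and bytes."""
    merged = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for flow in flows:
        entry = merged[key_fn(flow)]
        entry["packets"] += flow["packets"]
        entry["bytes"] += flow["bytes"]
    return dict(merged)


def aggregation_ratio(flows, key_fn):
    """fe / fr: exported records divided by received records."""
    return len(aggregate(flows, key_fn)) / len(flows)


# Hypothetical records received within one aggregation interval.
flows = [
    {"src": "10.0.0.1", "dst": "192.0.2.10", "packets": 3, "bytes": 1800},
    {"src": "10.0.0.2", "dst": "192.0.2.10", "packets": 1, "bytes": 600},
    {"src": "10.0.0.3", "dst": "192.0.2.77", "packets": 5, "bytes": 3000},
    {"src": "10.0.0.4", "dst": "198.51.100.5", "packets": 2, "bytes": 1200},
]

print(aggregation_ratio(flows, dst_host_key))    # 3 hosts out of 4 records
print(aggregation_ratio(flows, dst_prefix_key))  # 2 prefixes out of 4 records
```

A lower ratio means stronger aggregation, which is why prefix aggregation ranks above host aggregation in efficiency.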

SLIDE 15

Step 2: Flow records after aggregation and sampling

  • Each aggregation method becomes ineffective gradually.
  • Bi-flow becomes ineffective immediately.
  • Aggregation is sensitive to the sampling rate.

[Figure: two plots of aggregation ratio (= fe/fr) versus elapsed time (s), 0 to 900 s, at sampling rates 1/128 and 1/1024, for PAIR_HOST, IP_SRC_ADDR, IP_DST_ADDR, PAIR_PREFIX, SRC_PREFIX, DST_PREFIX, and BIFLOW; one legend also lists FLOW.]

SLIDE 16

Step 2: Which factor influences aggregation?

The aggregation ratio depends on several factors.

  • Traffic volume through the observation point
  • Sampling rate
  • Aggregation interval time

My hypothesis is that the aggregation ratio depends on the number of flow records received in the interval time.

Sampling rate (1/r)              1/128    1/1
Aggregation interval time (s)    300      10
Received flows                   3562     3450
DST_HOST aggregation ratio       43%      45%
DST_PREFIX aggregation ratio     32%      30%

SLIDE 17

Step 2: Which factor influences aggregation?

I plotted all experimental data in one graph.

  • Three MAWI traffic data samples with different volumes
  • Aggregation interval time: 5 to 300 s
  • Sampling rate: 1/1 to 1/1024

The aggregation ratio depends on the number of received flow records.

[Figure: aggregation ratio (= fe/fr) versus # of flow records (log axis, 10 to 1000000) for PAIR_HOST, SRC_HOST, DST_HOST, PAIR_PREFIX, SRC_PREFIX, DST_PREFIX, and BIPAIR.]

SLIDE 18

Step 2: Formulation of Aggregation Ratio

The aggregation ratio (R) can be estimated from the number of flow records (f_r), as follows.

DST Host aggregation:    R_{dsthost} = 1.80 \times f_r^{-0.18}
DST Prefix aggregation:  R_{dstprefix} = 2.34 \times f_r^{-0.26}

After all, the aggregation ratio depends on the # of unique hosts or prefixes versus the # of flows.

[Figure: DST hosts versus # of flow records, and aggregation ratio (= DST hosts/flows) versus # of flow records, on log axes.]
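The fitted curves can be evaluated with a few lines of code. This sketch reads the slide's fits as R_dsthost = 1.80 × f_r^(−0.18) and R_dstprefix = 2.34 × f_r^(−0.26). The negative exponents are an assumption based on the fact that a ratio fe/fr cannot exceed 1 for large f_r; the clamp at 1.0 for very small f_r and the helper names are also additions of this sketch.

```python
def ratio_dst_host(f_r: float) -> float:
    """Estimated DST_HOST aggregation ratio for f_r received records."""
    return min(1.0, 1.80 * f_r ** -0.18)


def ratio_dst_prefix(f_r: float) -> float:
    """Estimated DST_PREFIX (/24) aggregation ratio for f_r received records."""
    return min(1.0, 2.34 * f_r ** -0.26)


def exported_records(f_r: float, ratio_fn) -> float:
    """Estimated number of records left after aggregation: fe = R * fr."""
    return f_r * ratio_fn(f_r)


for f_r in (1_000, 10_000, 100_000, 1_000_000):
    print(
        f"f_r={f_r:>9}: "
        f"R_dsthost={ratio_dst_host(f_r):.3f}  "
        f"R_dstprefix={ratio_dst_prefix(f_r):.3f}"
    )
```

Here f_r is the number of records received within one aggregation interval, so a longer interval or a higher sampling rate both push the ratio down.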

SLIDE 19

Step 3: Selection of Suitable Values

I selected suitable values within the performance limits.

# of received flow records in mediator (f_r):

Sampling rate    1/100      1/1000      1/10000
f_r              30 kf/s    4.4 kf/s    0.6 kf/s

# of received flow records in top collector (= ∑f_e):

Aggregation method, interval time    1/100      1/1000      1/10000
DST_Prefix, 300 s                    12 kf/s    3.0 kf/s    0.62 kf/s
DST_HOST, 300 s                      34 kf/s    7.0 kf/s    1.2 kf/s
DST_Prefix, 60 s                     21 kf/s    4.7 kf/s    0.94 kf/s
DST_HOST, 60 s                       45 kf/s    9.0 kf/s    1.6 kf/s
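The selection step can be sketched as a filter over this slide's figures followed by a preference ordering. The limits (10 kf/s per mediator, 5 kf/s for the top collector) and the rates come from the slides, read as one mediator row plus four method/interval rows. The tie-break used here, highest sampling rate first and then the shortest interval, is an assumed reading of "a large sampling rate and a small aggregation granularity are preferable".

```python
MEDIATOR_LIMIT = 10.0   # kf/s, from the slides
COLLECTOR_LIMIT = 5.0   # kf/s, from the slides

# Flow records received by a mediator (kf/s), keyed by r in sampling rate 1/r.
mediator_in = {100: 30.0, 1000: 4.4, 10000: 0.6}

# Flow records received by the top collector (kf/s),
# keyed by (r, interval in seconds, aggregation method).
collector_in = {
    (100, 300, "DST_PREFIX"): 12.0,
    (1000, 300, "DST_PREFIX"): 3.0,
    (10000, 300, "DST_PREFIX"): 0.62,
    (100, 300, "DST_HOST"): 34.0,
    (1000, 300, "DST_HOST"): 7.0,
    (10000, 300, "DST_HOST"): 1.2,
    (100, 60, "DST_PREFIX"): 21.0,
    (1000, 60, "DST_PREFIX"): 4.7,
    (10000, 60, "DST_PREFIX"): 0.94,
    (100, 60, "DST_HOST"): 45.0,
    (1000, 60, "DST_HOST"): 9.0,
    (10000, 60, "DST_HOST"): 1.6,
}

# Keep only configurations that respect both performance limits.
feasible = [
    key
    for key, rate in collector_in.items()
    if rate <= COLLECTOR_LIMIT and mediator_in[key[0]] <= MEDIATOR_LIMIT
]

# Assumed preference: highest sampling rate (smallest r), then shortest interval.
best = min(feasible, key=lambda k: (k[0], k[1]))
r, interval_s, method = best
print(f"sampling 1/{r}, interval {interval_s} s, {method}")
```

With these inputs the filter keeps six configurations, and the preference ordering picks 1/1000 sampling with 60-second DST_PREFIX aggregation.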

SLIDE 20

Example of collection system

  • Sampling rate: 1/1000 packets
  • Aggregation method: DST Prefix aggregation
  • Aggregation interval time: 60 s
  • A mediator receives 4.4 kf/s (max. 10 kf/s).
  • The top collector receives 4.7 kf/s (max. 5 kf/s).

[Figure: NetFlow observation points at the edge routers feed Mediator #1 (PoP #1), Mediator #2 (PoP #2), ..., Mediator #10 (PoP #10); 10 PoPs, 20 routers/PoP. The mediators export to the top collector, which provides the traffic matrix view.]

SLIDE 21

Conclusion

A flow mediator is an efficient way to build a large-scale traffic collection system.

Revealed the relation between the number of flow records and several factors:

  • Traffic volume
  • Sampling rate
  • Aggregation method
  • Aggregation interval time

Demonstrated that a traffic collection system using mediators can be introduced into actual large-scale networks.

SLIDE 22

Thank you for your attention.

This study was supported by the Ministry of Internal Affairs and Communications of Japan.