Design for Large-Scale Collection System Using Flow Mediators
Atsushi Kobayashi, Tsuyoshi Kondoh, and Keisuke Ishibashi
NTT Information Sharing Laboratories
Jan 8, 2008, FloCon 2008
Outline
Introduction
Why do we need a large-scale collection system?
What is a Flow Mediator?
Requirements
Explore the possibility of a large-scale collection system for large networks.
Heuristic method for designing a traffic collection system
Estimate the number of flow records after aggregation or sampling.
Adjust several parameters based on this result.
Summary
Introduction
Traffic volumes in ISP networks have grown enormously in the last few years.
The number of exported flow records has become so large that a single collector cannot handle them.
A smaller sampling rate makes small flows invisible.
Even if traffic grows, network operators would like to keep the same sampling rate as far as possible.
Flow records aggregated at the router hide port numbers and IP addresses.
Exporting 5-tuple flow records from the router is better.
The demand for a large-scale traffic-collection system is growing.
What is Flow Mediator?
Flow Mediator† is a device that “mediates” flow records and has the following functions:
collects flow records from various exporters
stores original flow records
aggregates flow records flexibly
distributes appropriate flow records to collectors/analyzers
† draft-kobayashi-ipfix-mediator-model-01.txt
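[Figure: exporters send flow records to the mediator, which has a mediator DB and feeds downstream systems. Labels: Traffic Matrix measurement, Network designer, Accounting System, Customer Service, Troubleshoot System, Network Operator, Exporters.]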
A flow mediator should be useful for building a large-scale collection system.
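The four mediator functions on this slide (collect, store, aggregate, distribute) can be sketched in a few lines of Python. This is only an illustration: the `FlowRecord` fields, the `Mediator` class, and its method names are hypothetical and are not taken from the mediator draft or from any library.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical 5-tuple flow record with packet and octet counters."""
    src: str
    dst: str
    src_port: int
    dst_port: int
    proto: int
    packets: int
    octets: int


class Mediator:
    """Toy flow mediator: collect, store, aggregate, distribute."""

    def __init__(self):
        self.store = []        # stands in for the mediator DB
        self.collectors = []   # downstream callables (collectors/analyzers)

    def add_collector(self, callback):
        self.collectors.append(callback)

    def collect(self, record):
        # Keep the original record so it can be inspected later.
        self.store.append(record)

    def aggregate(self, key_fn):
        # Sum packets and octets over all stored records sharing a key.
        totals = defaultdict(lambda: [0, 0])
        for rec in self.store:
            entry = totals[key_fn(rec)]
            entry[0] += rec.packets
            entry[1] += rec.octets
        return {key: tuple(value) for key, value in totals.items()}

    def distribute(self, key_fn):
        # Send the aggregated view to every registered collector.
        summary = self.aggregate(key_fn)
        for callback in self.collectors:
            callback(summary)
        return summary


mediator = Mediator()
received = []
mediator.add_collector(received.append)
mediator.collect(FlowRecord("10.0.0.1", "192.0.2.1", 1234, 80, 6, 10, 1000))
mediator.collect(FlowRecord("10.0.0.2", "192.0.2.1", 1235, 80, 6, 5, 500))
summary = mediator.distribute(lambda rec: rec.dst)
print(summary)
```

Passing the aggregation key as a function keeps the aggregation flexible: the same stored records can be summarized by destination host, prefix, or host pair without changing the mediator.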
You can easily make Flow Mediation code
The Net::Flow Perl module is available on CPAN.
http://search.cpan.org/~akoba/Net-Flow-0.02/
The module can encode and decode NetFlow/IPFIX packets.
The encoding and decoding functions have similar interfaces.
[Figure: NetFlow v5, NetFlow v9, and IPFIX packets are decoded, handled by the flow mediator (flow aggregation, flow distribution, protocol converter, anonymization), and encoded as NetFlow v9 or IPFIX.]
    # Decode a received packet into header, templates, and flow records.
    my ( $HeaderHashRef, $TemplateArrayRef, $FlowArrayRef, $ErrorsArrayRef ) =
        Net::Flow::decode( \$packet, $TemplateArrayRef );
    # Encode flow records into packets (the last argument, 1400, limits the packet size).
    my ( $EncodeHeaderHashRef, $PktsArrayRef, $ErrorsArrayRef ) =
        Net::Flow::encode( $EncodeHeaderHashRef, \@MyTemplates, $FlowArrayRef, 1400 );
Requirements
Build a traffic-collection system that meets the following requirements.
Requirement 1: measure traffic flows across the entire network
Measure traffic matrices PoP by PoP and router by router.
Requirement 2: store the 5-tuple flow records received from routers
When a traffic incident happens, allow inspection of the traffic.
Requirement 3: design a scalable architecture that accommodates large ISP traffic volumes
Goal
[Figure: network model with PoP #1, PoP #2, ..., PoP #10; edge routers in each PoP connect to core routers, and the NetFlow observation points are at the edge routers.]
Explore a heuristic method for designing a collection system that can be introduced into an actual network.
The proposed collection system needs to accommodate the following network model.
Total traffic volume: 500 Gb/s, 100 Mp/s
Edge routers: 20/PoP × 10 PoPs = 200
NetFlow is enabled on the ingress interfaces of the edge routers.
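A quick arithmetic check of this network model, as a Python sketch. The per-PoP and per-router figures assume that traffic is spread evenly over PoPs and routers; that even split is an assumption made here for illustration and is not stated on the slide.

```python
def network_model(total_bps=500e9, total_pps=100e6, pops=10, routers_per_pop=20):
    """Derive simple averages from the totals given on the slide."""
    routers = pops * routers_per_pop
    return {
        "edge_routers": routers,
        # Average packet size implied by 500 Gb/s at 100 Mp/s.
        "avg_packet_bytes": total_bps / 8 / total_pps,
        # Even split across PoPs and routers (assumption).
        "per_pop_gbps": total_bps / pops / 1e9,
        "per_router_gbps": total_bps / routers / 1e9,
        "per_router_pps": total_pps / routers,
    }


model = network_model()
for name, value in model.items():
    print(f"{name}: {value:g}")
```

Under these assumptions the model implies an average packet size of 625 bytes and about 2.5 Gb/s (0.5 Mp/s) per edge router.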
Hierarchical Collection System
[Figure: the top collector sits above Mediator #1, Mediator #2, ..., Mediator #10, one per PoP (10 PoPs, 20 routers/PoP); the observation points are at the edge routers.]
Mediators are located in each PoP.
They store all flow records, aggregate them, and export them to the next collector.
The top collector measures wide-area traffic matrices, such as router by router and PoP by PoP.
Inspection: if a traffic incident happens, we can retrieve detailed flow records from the flow mediator.
This design addresses Requirements 1, 2, and 3.
Visualize Traffic Matrices
The top collector can visualize router, PoP, and AS traffic matrices.
We can select all traffic or a specific VPN (customer).
[Figure: traffic matrix with source PoP and destination PoP axes; color indicates the traffic volume of each source/destination pair.]
Nail is the name of our traffic matrix visualizer.
Heuristic Design Method
Suitable values for several parameters are decided by the following steps.
Step 0: measure the performance limits of the flow mediator and the top collector.
Step 1: determine the relation between the number of flow records and packet sampling.
Step 2: determine the relation between the number of flow records and aggregation, which depends on several factors.
Aggregation methods (BGP Next-Hop, Prefix, host)
Aggregation interval time (20 s, 60 s, 90 s…)
Step 3: select suitable values within the performance limits.
A higher sampling rate is preferable.
A finer aggregation granularity is preferable.
Consideration Points
Several points need to be considered.
The maximum performance of the top collector is 5 kf/s, and that of each mediator is 10 kf/s.
[Figure: hierarchical collection system with the top collector (max. 5 kf/s) above Mediator #1, #2, ..., #10 (max. 10 kf/s), one per PoP; 10 PoPs, 20 routers/PoP.]
How many flow records does the top collector receive?
How many flow records does the mediator receive?
Which aggregation method and interval time are better?
What is the maximum sampling rate?
These questions are addressed in Step 1, Step 2, Step 3-A, and Step 3-B.
Step 1: estimate flow records after sampling
Estimate the number of flow records from the density function of packets per flow.
# of packets per flow: x
Packets-per-flow density function: F(x) = 0.5 \times x^{-1.73}
Sampling rate: 1/r
Total number of unsampled flows: f_{all}
Rough estimate of f_{all} when the total traffic volume is 500 Gb/s: 100 Mp/s ÷ 20 packets per flow = 5 Mf/s.
A flow of x packets is extracted with probability 1 - (1 - 1/r)^{x}, so
f_{sampled} = f_{all} \times \sum_{x=1}^{\infty} F(x) \times \left(1 - (1 - 1/r)^{x}\right)
[Figure: density function versus # of packets per flow on log-log axes, with the fit F(x) = 0.5 \times x^{-1.73}.]
Sampling rate   f_sampled
1/100           305 kf/s
1/1000          43 kf/s
1/10000         5.2 kf/s
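The Step 1 estimate can be evaluated numerically with a short Python sketch. The function names and the truncation of the infinite sum at `x_max` are assumptions made here. The output depends on that cutoff and on using the fitted power law for every x, so it is not expected to reproduce the slide's figures exactly, which may have been computed from the measured distribution.

```python
def packets_per_flow_density(x, a=0.5, b=1.73):
    """Fitted density F(x) = 0.5 * x^-1.73 from the slide."""
    return a * x ** (-b)


def extraction_probability(x, r):
    """Probability that at least one of x packets is sampled at rate 1/r."""
    return 1.0 - (1.0 - 1.0 / r) ** x


def sampled_flow_rate(f_all, r, x_max=200_000):
    """f_all * sum over x of F(x) * extraction probability.

    The sum is cut off at x_max (an assumption of this sketch).
    """
    total = 0.0
    for x in range(1, x_max + 1):
        total += packets_per_flow_density(x) * extraction_probability(x, r)
    return f_all * total


# Rough estimate from the slide: 100 Mp/s / 20 packets per flow = 5 Mf/s.
F_ALL = 100e6 / 20

estimates = {r: sampled_flow_rate(F_ALL, r) for r in (100, 1000, 10000)}
for r, rate in estimates.items():
    print(f"sampling 1/{r}: {rate / 1e3:.1f} kf/s")
```

The estimate falls as the sampling rate decreases because flows with few packets are increasingly likely to be missed entirely.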
Too many flow records without mediator
Even if the sampling rate is 1/10,000 packets, the number of flow records (5.2 kf/s) exceeds the top collector's performance limit (5 kf/s).
[Figure: without mediators, the edge routers of all 10 PoPs (20 routers/PoP) export directly to the top collector (max. 5 kf/s), which receives 5.2 kf/s at a sampling rate of 1/10,000 packets.]
Sampling rate   f_sampled
1/100           305 kf/s
1/1000          43 kf/s
1/10000         5.2 kf/s
Step 2: flow records after aggregation
[Figure: aggregation ratio (= f_e/f_r) versus elapsed time (s) at a sampling rate of 1/1 packets, where f_r is the number of flow records received by the mediator and f_e is the number it exports to the top collector. Series: PAIR_HOST, SRC_HOST, DST_HOST, PAIR_PREFIX, SRC_PREFIX, DST_PREFIX, BIFLOW.]
What is the # of flow records after aggregation?
The mediator aggregates unsampled flow records at 20-second intervals.
Aggregation efficiency: Prefix > HOST > Pair Prefix > Pair HOST > Bi-Flow
The prefix length “/24” is uniformly applied to prefix aggregation.
Bi-flow is aggregated from the two flow directions.
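As an illustration of how the aggregation ratio f_e/f_r arises, the Python sketch below aggregates a handful of made-up flow records by destination host, by destination /24 prefix, and by host pair. The record layout, key functions, and sample addresses are hypothetical; only the /24 prefix length comes from the slide.

```python
import ipaddress
from collections import defaultdict


def dst_host_key(flow):
    return flow["dst"]


def dst_prefix_key(flow, prefix_len=24):
    # Map the destination address to its enclosing /24 prefix.
    net = ipaddress.ip_network(f'{flow["dst"]}/{prefix_len}', strict=False)
    return str(net)


def pair_host_key(flow):
    return (flow["src"], flow["dst"])


def aggregate(flows, key_fn):
    """Sum packet counts of all flows that share an aggregation key."""
    totals = defaultdict(int)
    for flow in flows:
        totals[key_fn(flow)] += flow["packets"]
    return dict(totals)


def aggregation_ratio(flows, key_fn):
    """f_e / f_r: exported (aggregated) records over received records."""
    return len(aggregate(flows, key_fn)) / len(flows)


# Flow records received in one aggregation interval (made-up data).
flows = [
    {"src": "10.0.0.1", "dst": "192.0.2.10", "packets": 3},
    {"src": "10.0.0.2", "dst": "192.0.2.10", "packets": 5},
    {"src": "10.0.0.1", "dst": "192.0.2.20", "packets": 1},
    {"src": "10.0.0.3", "dst": "198.51.100.5", "packets": 7},
]

for name, key_fn in [("DST_PREFIX", dst_prefix_key),
                     ("DST_HOST", dst_host_key),
                     ("PAIR_HOST", pair_host_key)]:
    print(name, aggregation_ratio(flows, key_fn))
```

A lower ratio means more efficient aggregation. In this toy data the prefix key gives the lowest ratio and the host-pair key the highest, which matches the ordering on the slide.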
Step 2: Flow records after aggregation, sampling
[Figure: two panels of aggregation ratio (= f_e/f_r) versus elapsed time (s), at sampling rates 1/128 and 1/1024. Series: FLOW, PAIR_HOST, IP_SRC_ADDR, IP_DST_ADDR, PAIR_PREFIX, SRC_PREFIX, DST_PREFIX, BIFLOW.]
Each aggregation method gradually becomes less effective as the sampling rate decreases.
Bi-flow becomes ineffective immediately; it is sensitive to the sampling rate.
Step 2: Which factor influences aggregation?
The aggregation ratio depends on several factors.
Traffic volume through the observation point
Sampling rate
Aggregation interval time
Hypothesis: the aggregation ratio depends on the number of flow records received during the interval time.
Sampling rate (1/r)             1/128   1/1
Aggregation interval time (s)   300     10
Received flows                  3562    3450
DST_HOST aggregation ratio      43%     45%
DST_PREFIX aggregation ratio    32%     30%
Step 2: Which factor influences aggregation?
All experimental data are plotted in one graph.
Three MAWI traffic data samples with different volumes
Aggregation interval time: 5 to 300 s
Sampling rate: 1/1 to 1/1024
[Figure: aggregation ratio (= f_e/f_r) versus # of flow records (log scale). Series: PAIR_HOST, SRC_HOST, DST_HOST, PAIR_PREFIX, SRC_PREFIX, DST_PREFIX, BIPAIR.]
The aggregation ratio depends on the number of received flow records.
Step 2: Formulation of Aggregation Ratio
The aggregation ratio (R) can be estimated from the number of flow records (f_r), as follows.
DST Host aggregation: R_{dsthost} = 1.80 \times f_r^{-0.18}
DST Prefix aggregation: R_{dstprefix} = 2.34 \times f_r^{-0.26}
After all, the aggregation ratio depends on the # of unique hosts or prefixes versus the # of flows.
[Figure: log-log plot of # of DST hosts versus # of flow records; aggregation ratio = DST hosts / flows.]
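The two fitted formulas are easy to evaluate directly. The Python sketch below uses hypothetical function names and caps the ratio at 1.0, since a ratio of exported to received records cannot exceed 1; the cap is an assumption of this sketch, not part of the slide's fit.

```python
def ratio_dst_host(f_r):
    """R_dsthost = 1.80 * f_r^-0.18, capped at 1.0 (cap is an assumption)."""
    return min(1.0, 1.80 * f_r ** -0.18)


def ratio_dst_prefix(f_r):
    """R_dstprefix = 2.34 * f_r^-0.26, capped at 1.0 (cap is an assumption)."""
    return min(1.0, 2.34 * f_r ** -0.26)


for f_r in (3450, 264_000, 9_000_000):
    print(f"f_r={f_r}: host={ratio_dst_host(f_r):.3f} "
          f"prefix={ratio_dst_prefix(f_r):.3f}")
```

For about 3,450 received flow records the fits give roughly 0.4 for DST Host and 0.3 for DST Prefix, in the same range as the measured ratios reported earlier in Step 2.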
Step 3: Selection of Suitable Values
Select suitable values within the performance limits.
Sampling rate                                    1/100     1/1000    1/10000
# of received flow records in mediator (f_r)     30 kf/s   4.4 kf/s  0.6 kf/s
# of received flow records in top collector (= \sum f_e):
DST_HOST aggregation, interval time = 60 s       45 kf/s   9.0 kf/s  1.6 kf/s
DST_Prefix aggregation, interval time = 60 s     21 kf/s   4.7 kf/s  0.94 kf/s
DST_HOST aggregation, interval time = 300 s      34 kf/s   7.0 kf/s  1.2 kf/s
DST_Prefix aggregation, interval time = 300 s    12 kf/s   3.0 kf/s  0.62 kf/s
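The selection in Step 3 can be sketched as a small search in Python. Several things here are assumptions rather than the authors' stated procedure: the fitted ratio is applied to the number of records a mediator receives per interval (f_r × interval time), the top collector's input is taken as 10 mediators × R × f_r, and ties are broken by preferring the highest sampling rate and then the shortest interval. The rates it computes are close to, but somewhat lower than, the slide's figures (about 4.0 kf/s versus 4.7 kf/s for the selected design).

```python
N_POPS = 10
TOP_COLLECTOR_LIMIT = 5_000    # flow records/s
MEDIATOR_LIMIT = 10_000        # flow records/s

# Per-mediator input f_r (records/s) for each sampling rate 1/r, from the slide.
MEDIATOR_INPUT = {100: 30_000, 1000: 4_400, 10000: 600}

# Fitted aggregation ratio R = a * n^b for n records per interval.
FITS = {"DST_HOST": (1.80, -0.18), "DST_PREFIX": (2.34, -0.26)}


def top_collector_rate(f_r, interval, method):
    """Records/s arriving at the top collector from all mediators."""
    a, b = FITS[method]
    ratio = min(1.0, a * (f_r * interval) ** b)
    return N_POPS * ratio * f_r


def feasible_designs(intervals=(60, 300)):
    """All (r, interval, method, rate) combinations within both limits."""
    designs = []
    for r, f_r in MEDIATOR_INPUT.items():
        if f_r > MEDIATOR_LIMIT:
            continue
        for interval in intervals:
            for method in FITS:
                rate = top_collector_rate(f_r, interval, method)
                if rate <= TOP_COLLECTOR_LIMIT:
                    designs.append((r, interval, method, rate))
    return designs


designs = feasible_designs()
# Prefer the highest sampling rate (smallest r), then the shortest interval.
best = min(designs, key=lambda d: (d[0], d[1]))
print(best)
```

Under these assumptions the search rejects 1/100 sampling because each mediator would receive 30 kf/s, and it selects 1/1000 sampling with DST Prefix aggregation at a 60-second interval.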
Example of collection system
Sampling rate: 1/1000 packets
Aggregation method: DST Prefix aggregation
Aggregation interval time: 60 s
Each mediator (max. 10 kf/s) receives 4.4 kf/s.
The top collector (max. 5 kf/s) receives 4.7 kf/s and provides the traffic matrix view.
[Figure: hierarchical collection system with the top collector above Mediator #1, #2, ..., #10, one per PoP (10 PoPs, 20 routers/PoP); the NetFlow observation points are at the edge routers.]
Conclusion
Flow mediators are an efficient way to build a large-scale traffic collection system.
Revealed the relation between the number of flow records and several factors:
Traffic volume
Sampling rate
Aggregation method
Aggregation interval time
Demonstrated that a traffic collection system using mediators can be introduced into actual large-scale networks.
Thank you for your attention.
This study was supported by the Ministry of Internal Affairs and Communications of Japan.