Internet Traffic: Analysis, Modeling with real-world aspects, by Pierre Borgnat


  1. Traffic Measurement | Analysis & Robust Methods | Modeling | Anomaly Detection | Traffic Classification | Conclusion
     Internet Traffic: Analysis, Modeling with real-world aspects
     Pierre BORGNAT, CNRS – ENS Lyon, Laboratoire de Physique (UMR 5672)
     TERA-NET – 07/2010

  2. • Internet traffic metrology: some basics
     • Analysis: Scale Invariance, LRD, Robust Estimation
     • Modeling: LRD / Heavy-Tails
     • Anomaly Detection; Host classification
     • Acknowledgements
       • P Abry, G Dewaele, P Flandrin, A Scherrer, P Gonçalves, P Loiseau, P Primet (Lyon, ENSL, CNRS & INRIA)
       • Ph Owezarski, N Larrieu (LAAS-CNRS)
       • Metrosec (ACI Sécurité & Informatique), ANR OSCAR
       • JL Guillaume, M Latapy, C Magnien (LIP6)
       • K Fukuda, R Fontugne, Y Himura (NII), K Cho (IIJ) (Tokyo)
       • D Veitch, N Hohn (Melbourne Univ.)
       • O Michel (GIPSA-lab, INPGrenoble)

  4. Traffic & Network Measurement
     Overview of network properties
     • Heterogeneity (of information, devices, topologies, geography,...)
     • Evolution over time (new services, increased usage,...)
     • Complexity
       • individual elements vs. behaviour of the whole
       • interplay: architecture / protocols / usages
     • Crucial choice: level of description
       • Information flows? → Signals
       • Network level? → Graphs, or Multivariate Signals
     → Need for a statistical approach

  5. Traffic & Network Measurement: What for?
     • Analysis of networks (protocols, routers, provisioning,...)
     • Modeling of traffic and of its properties
     • Classification or recognition of traffic (with new needs: Peer to Peer, real-time, wireless,...)
     • Definition of service agreements (Pricing, QoS, Committed QoS...)
     • Security of Networks; Intrusion Detection Systems; Anomaly Detection (DDoS, scans, computer viruses, worms, outages...)
     [ACI METROPOLIS 2001, AS Métrologie des réseaux de l’Internet 2003, ACI METROSEC 2007,...]

  6. Passive Measurements of traffic
     • On networks: Internet Protocol → Packets + information
     • Monitoring facilities: add a time-stamp to data (dynamics)
     • Link level, monitor packets: intercept (port-mirroring, splitter,...); capture (tcpdump, DAG, GNET,...); filter (...)
       Recorded fields: Time | IP Source Address | IP Destination Address | Source Port | Destination Port | Protocol
       → Point processes (marked)
     • Node level (router) → multivariate data
       Device: router → Netflow (CISCO), flow-tools (Juniper)
     • Network level → multivariate data, graph
       Synchronising several link or node monitoring points?

  7. Passive Measurements of traffic
     • → Huge stream of data.
     • Aggregated count process = # of packets during ∆
     [Figure: packet arrivals on a time axis grouped into bins of size ∆; # Packets versus time for ∆ = 1 ms (time in s, 0 to 1) and ∆ = 1 s (time in min, 0 to 60)]
     • Problem: understand the features of traffic
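The aggregated count process on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used for the talk: the function name `aggregate_counts`, its `t_start` parameter, and the arrival times are all assumptions made for the example.

```python
def aggregate_counts(timestamps, delta, t_start=0.0):
    """Count packets per bin of width `delta` (seconds), starting at `t_start`."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    # Bin index of each packet; packets before t_start are ignored.
    indices = [int((t - t_start) // delta) for t in timestamps if t >= t_start]
    if not indices:
        return []
    counts = [0] * (max(indices) + 1)
    for k in indices:
        counts[k] += 1
    return counts


# Hypothetical arrival times in seconds (not taken from any trace).
arrivals = [0.0, 0.25, 0.5, 0.75, 1.5]
print(aggregate_counts(arrivals, delta=0.5))  # [2, 2, 0, 1]
```

Empty bins are kept as zeros, so the output is a regularly sampled series that can be analysed like any other signal.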

  8. Short Biblio. on Longitudinal Traffic Analysis
     • Many works during the past 15 years.
     • Some focus on the newest applications at the time:
       • FTP, Mail in the early 90’s [kc claffy et al., Comm. ACM 94]
       • Web, mid-90’s [Crovella & Bestavros, ToN 95]
       • P2P, early 2000’s [Karagiannis et al., Globecom’04]
       • Video Streams, late 2000’s [Cha et al., IMC’07]
       • ...
       • Anomalies: History of Scanning [Allman et al., IMC’07]
       • Wireless, Mobile,...
     • Some focus on non-classical statistical properties:
       • ‘Failure of Poisson modeling’ / Self-similarity / Scaling / LRD [Leland et al., 94], [Paxson & Floyd, 95], [Willinger et al., 97], [Veitch & Abry, 01], [Cao et al., 02], [Karagiannis et al., 04], [Hohn et al., 05], [Ribeiro et al., 05]

  9. Internet traffic: not a simple renewal process
     The Failure of Poisson Modeling. Paxson & Floyd 1994
     • If Internet ≃ phone:
       • Packets would follow a Poisson process
       • Short-range correlations only
       • Aggregated traffic: Gaussian law (per the Central Limit Theorem)
     • The truth: much more variability and burstiness
     [Figure: IP Traffic versus Poisson Traffic, shown at ∆ = 1 ms, ∆ = 10 ms and ∆ = 1 s (axis labels 1 s, 1 s, 100 s)]

  10. Internet traffic: not a simple renewal process
      The Failure of Poisson Modeling. Paxson & Floyd 1994
      • The truth: much more variability and burstiness
      [Figure: IP traffic at ∆ = 1 ms, ∆ = 10 ms and ∆ = 1 s (axis labels 1 s, 1 s, 100 s); log10(Frequency) versus log10(#Pkts per flow), slope −0.7]
      • # packets per ∆ ≠ Poisson distribution
      • waiting times ≠ Exponential distribution
      • correlations ≠ short-range only
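One simple way to see the first departure from Poisson listed above is the index of dispersion (variance-to-mean ratio) of the counts per ∆, which equals 1 for a Poisson process. The sketch below is an illustration under assumptions, not the analysis behind the slide: the simulated rate, the duration, and the toy "bursty" series are invented for the example.

```python
import random
import statistics


def index_of_dispersion(counts):
    """Variance-to-mean ratio of a count series (equals 1 for Poisson counts)."""
    return statistics.pvariance(counts) / statistics.fmean(counts)


def poisson_counts(rate, duration, delta, seed=0):
    """Simulate a Poisson process and return its packet counts per bin of width delta."""
    rng = random.Random(seed)
    n_bins = int(duration / delta)
    counts = [0] * n_bins
    t = rng.expovariate(rate)  # exponential waiting times
    while t < duration:
        counts[min(int(t / delta), n_bins - 1)] += 1
        t += rng.expovariate(rate)
    return counts


poisson = poisson_counts(rate=100.0, duration=1000.0, delta=0.1)
bursty = [0, 0, 0, 40] * 100  # toy on/off pattern with the same mean (10 per bin)
print(round(index_of_dispersion(poisson), 2))  # close to 1
print(index_of_dispersion(bursty))             # 30.0
```

A ratio well above 1 signals burstier counts than Poisson allows. It says nothing by itself about correlations, which need the scale-based tools of the later slides.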

  11. Traffic series: aggregation at several time-scales
      [Figure: traffic series (500 samples each) for δ = 12 ms, δ = 12 × 8 ms, δ = 12 × 8 × 8 ms and δ = 12 × 8 × 8 × 8 ms]
      • The same kinds of fluctuations are seen at all the different levels
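The successive views on this slide can be reproduced by summing blocks of 8 consecutive samples, starting from the series at δ = 12 ms. The following is a minimal sketch; the function names `aggregate` and `multiscale` and their default parameters are assumptions chosen to match the slide's scales.

```python
def aggregate(series, m):
    """Sum consecutive non-overlapping blocks of m samples (incomplete tail dropped)."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) for i in range(n_blocks)]


def multiscale(series, base_delta_ms=12, factor=8, levels=4):
    """Return {bin width in ms: aggregated series} for successive coarser scales."""
    out = {}
    current, delta = list(series), base_delta_ms
    for _ in range(levels):
        out[delta] = current
        current, delta = aggregate(current, factor), delta * factor
    return out


# Placeholder input: a constant series, used only to show the bookkeeping.
scales = multiscale([1] * 512)
print(sorted(scales))  # [12, 96, 768, 6144]
```

For a series with only short-range correlations, the relative fluctuations shrink quickly as the bin width grows. The slide's point is that real traffic keeps a similar look across these scales.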

  12. Marginal probability distributions
      Traffic trace LBL-TCP-3 (1994)
      • Empirical histograms of the # of packets per ∆
      • Estimation: count the number of occurrences
      [Figure: empirical histograms for ∆ = 4 ms, ∆ = 32 ms and ∆ = 256 ms]
      • Gaussian: $p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2 / 2\sigma^2}$
      • Exp.: $p(x) = e^{-x/\beta} / \beta$
      • Fit/Model: Gamma, $\Gamma_{\alpha,\beta}(x) = \frac{1}{\beta\,\Gamma(\alpha)} \left(\frac{x}{\beta}\right)^{\alpha-1} \exp\left(-\frac{x}{\beta}\right)$
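A Gamma law with shape α and scale β has mean αβ and variance αβ², so a quick fit follows from the empirical mean and variance of the counts. The sketch below uses this method of moments as an assumption for illustration; the slide does not say which estimator was used, and the function names are hypothetical.

```python
import math
import statistics


def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and scale beta, evaluated for x > 0."""
    if x <= 0:
        return 0.0
    return (x / beta) ** (alpha - 1) * math.exp(-x / beta) / (beta * math.gamma(alpha))


def fit_gamma_moments(counts):
    """Method-of-moments estimates: alpha = mean^2 / var, beta = var / mean."""
    mean = statistics.fmean(counts)
    var = statistics.pvariance(counts)
    return mean * mean / var, var / mean


# Hypothetical counts per bin (not from the LBL-TCP-3 trace).
alpha, beta = fit_gamma_moments([1, 3])
print(alpha, beta)  # 4.0 0.5
```

With α = 1 the density reduces to the exponential law on the slide, and for large α it approaches a Gaussian shape, so one family can follow the histograms as ∆ grows.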

  14. Long-Range Dependence (or Long Memory)
      The Self-Similar Nature of Ethernet Traffic. Leland, Taqqu, Willinger & Wilson 1993
      Property of Long-Range Dependence (LRD): the covariance tends to a non-summable power law (at large lags)
      ⇒ Spectrum $F_X(\nu) \sim c\,|\nu|^{-\gamma}$, $|\nu| \to 0$, with $0 < \gamma < 1$.
      • Spectrum – (Wiener-Khintchine) → Correlation:
        $F_X(\nu) = \int C_X(\tau)\, e^{-i 2\pi\nu\tau}\, d\tau = \frac{1}{T} \left| \int_0^T e^{-i 2\pi\nu t}\, X(t)\, dt \right|^2$
      Self-similarity: statistical invariance under dilation
      A random process $\{X(t), t \ge 0\}$ is self-similar with index $H$ ("$H$-ss") if, for every dilation factor $\lambda > 0$, $X(\lambda t) \overset{d}{=} \lambda^{H} X(t)$, $t > 0$.
      • $H$-ss for $H > 0.5$ ⇒ LRD.
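The link between the two definitions on this slide can be spelled out as follows. This is a sketch under a standard assumption not stated on the slide: the stationary series is taken to be the increment process of a finite-variance $H$-ss process with stationary increments (fractional Gaussian noise is the usual example), and $c'$ and $\sigma^2$ are generic constants.

```latex
% LRD: power-law covariance at large lags <=> power-law spectrum near the origin
C_X(\tau) \sim c'\,|\tau|^{\gamma-1}, \quad |\tau| \to \infty
\qquad\Longleftrightarrow\qquad
F_X(\nu) \sim c\,|\nu|^{-\gamma}, \quad |\nu| \to 0, \qquad 0 < \gamma < 1 .

% Non-summability, since the exponent satisfies \gamma - 1 > -1
\sum_{\tau} C_X(\tau) = \infty .

% Increments of an H-sssi process with variance \sigma^2 at unit lag
C_X(\tau) \sim \sigma^2\, H\,(2H-1)\, |\tau|^{2H-2}, \quad |\tau| \to \infty
\qquad\Longrightarrow\qquad
\gamma = 2H - 1 .
```

Under this assumption, $0 < \gamma < 1$ is equivalent to $1/2 < H < 1$, which is the condition quoted in the last bullet.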

  15. Time-Scale Representation
      Definition: Wavelet transform
      Shifted (time) and dilated (scale) versions of $\psi_0$: $\psi_{j,k}(t) = 2^{-j/2}\, \psi_0(2^{-j} t - k)$.
      Wavelet coefficients: $d_{X_\Delta}(j,k) = \langle \psi_{j,k}, X_\Delta \rangle$.
      Efficient algorithm [Mallat 1989]
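The pyramid algorithm cited above computes the coefficients at octave j from the approximation at octave j − 1. The sketch below instantiates it for the Haar wavelet only, which is an assumption made to keep the example short: the slide does not name a mother wavelet, and the sign convention for the details varies between references.

```python
import math


def haar_dwt(x):
    """Haar pyramid: return (details per octave j = 1, 2, ..., final approximation)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx, details = [float(v) for v in x], []
    s = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / s for a, b in pairs])  # detail coefficients d(j, k)
        approx = [(a + b) / s for a, b in pairs]         # approximation at octave j
    return details, approx


# Hypothetical count series of length 4 (two octaves).
details, approx = haar_dwt([4, 2, 5, 5])
energy = sum(d * d for level in details for d in level) + approx[0] ** 2
print(len(details), round(energy, 6))  # 2 70.0
```

Each octave halves the number of coefficients, so the total cost is linear in the length of the series. The transform is orthonormal, which is why the printed energy equals the sum of squares of the input.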
