On Entropy in Network Traffic Anomaly Detection

Jayro Santiago-Paz, Deni Torres-Roman. Cinvestav, Campus Guadalajara, Mexico. November 2015.

Outline

1. Introduction: Databases
2. Feature Extraction: Windowing in Network Traffic
3. Entropy Calculation: Kullback-Leibler divergence, Mutual information, Entropy calculation
4. Anomaly detection
5. Classification: The Classifier, Metrics
6. Open Issues

Introduction

Chandola et al. (2009) state that the term anomaly-based intrusion detection in networks refers to the problem of finding exceptional patterns in network traffic that do not conform to the expected normal behavior.

Given network traffic and a set of selected traffic features $X = \{X_1, X_2, \ldots, X_p\}$, together with $N$ time instances of $X$, the normal and abnormal behaviors of the instances can be studied. The space of all instances of $X$ forms the feature space, which can be mapped to another space by a function such as entropy. In the literature, the Shannon entropy and the generalized Rényi and Tsallis entropy estimators, as well as probability estimators (Balanced, Balanced II), are used.

An anomaly-based network intrusion detection system (A-NIDS) usually consists of two stages: training and testing. In the training stage, a “normal” profile is built from a database of “normal” (anomaly-free) network traffic using the feature extraction, windowing, and entropy calculation modules. In the testing stage, the same modules are applied to the current network traffic, and anomalies are detected and classified.

Figure 1: General architecture of entropy-based A-NIDS.
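The two stages can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the slides: the profile is the mean and standard deviation of a normalized Shannon entropy over training windows, the detection rule is a simple k-sigma threshold, and the port values are made up.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def normalized_entropy(values):
    """Shannon entropy of the empirical distribution, divided by log(#distinct values)."""
    counts = Counter(values)
    if len(counts) < 2:
        return 0.0  # a single repeated value carries no uncertainty
    n = len(values)
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(len(counts))


def train(normal_windows):
    """Training stage: summarize anomaly-free windows as (mean, std) of their entropy."""
    scores = [normalized_entropy(w) for w in normal_windows]
    return mean(scores), pstdev(scores)


def detect(windows, profile, k=3.0):
    """Testing stage: flag windows whose entropy deviates more than k sigma (assumed rule)."""
    mu, sigma = profile
    return [abs(normalized_entropy(w) - mu) > k * sigma for w in windows]


# Hypothetical destination ports observed in three anomaly-free windows.
normal_windows = [
    [80, 80, 80, 443, 443, 53, 53, 22],
    [80, 80, 80, 80, 443, 443, 53, 22],
    [80, 80, 443, 443, 53, 53, 22, 25],
]
profile = train(normal_windows)

# One window that resembles training traffic, one concentrated on a single port.
flags = detect([[80, 80, 80, 443, 443, 53, 22, 22], [80] * 8], profile)
```

With this toy data the second test window, whose traffic collapses onto one port, is flagged while the first is not. A real system would use far more training windows and typically several features at once.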

Databases

Synthetic: The synthetic databases are generated artificially, e.g., the MIT-DARPA 1998, 1999, and 2000 databases (http://www.ll.mit.edu/ideval/index.html), which include five major categories, among them Denial of Service Attacks (DoS), User to Root Attacks (U2R), Remote to User Attacks (R2U), and probes.

Real: Some real public databases are CAIDA (https://www.caida.org/data/passive/passive_2012_dataset.xml), which contains anonymized passive traffic traces from high-speed Internet backbone links, and the traffic data repository maintained by the MAWI Working Group of the WIDE Project (http://mawi.wide.ad.jp/mawi/). Other researchers have created their own databases at different universities, e.g., Carnegie Mellon University, Xi’an Jiaotong University, and Clemson University (GENI), or have used traffic collected from the SWITCH, Abilene, and Géant backbones.

Feature Extraction

Motoda H. and Liu H. (2002):

Feature selection is a process that chooses a subset of $M$ features from the original set of $N$ features ($M \le N$) so that the feature space is optimally reduced according to a certain criterion.

Feature extraction is a process that extracts a set of new features from the original features through some functional mapping. Assuming that there are $N$ features $Z_1, Z_2, \ldots, Z_N$, after feature extraction another set of new features $X_1, X_2, \ldots, X_M$ ($M < N$) is obtained via the mapping functions $F_i$, i.e., $X_i = F_i(Z_1, Z_2, \ldots, Z_N)$.
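The difference between the two definitions can be made concrete with a small Python sketch. The record, the feature names, and the two mapping functions are hypothetical and chosen only for illustration: selection keeps some of the original features unchanged, while extraction computes each new feature as a function of all original ones.

```python
from typing import Callable, Dict, Sequence

# Hypothetical record with N = 4 original features Z_1..Z_4.
Z = {"src_port": 51514, "dst_port": 443, "n_bytes": 1500, "n_packets": 3}


def select(features: Dict[str, float], keep: Sequence[str]) -> Dict[str, float]:
    """Feature selection: keep a subset of M <= N original features as they are."""
    return {name: features[name] for name in keep}


def extract(
    features: Dict[str, float],
    mappings: Dict[str, Callable[[Dict[str, float]], float]],
) -> Dict[str, float]:
    """Feature extraction: each new feature is X_i = F_i(Z_1, ..., Z_N)."""
    return {name: f(features) for name, f in mappings.items()}


selected = select(Z, ["src_port", "dst_port"])

# Two illustrative mapping functions F_1 and F_2 (M = 2 < N = 4).
extracted = extract(
    Z,
    {
        "bytes_per_packet": lambda z: z["n_bytes"] / z["n_packets"],
        "uses_well_known_port": lambda z: int(z["dst_port"] < 1024),
    },
)
```

Here `selected` still contains original values, whereas `extracted` contains quantities that did not exist in the input record.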

Among the algorithms used to reduce the number of features in network traffic anomaly detection are PCA, mutual information and linear correlation, decision trees, and maximum entropy.

In network traffic, the most commonly employed features are source and destination IP addresses and source and destination port numbers. Other features extracted from headers are protocol field, number of bytes, service, flag, and country code. Zhang et al. (2009) divided packet sizes into seven types, and Gu et al. (2005) defined 587 packet classes based on the port number. At the flow level, the features selected were flow duration, flow size distribution (FSD), and average packet size per flow, where an IP flow corresponds to the IP port-to-port traffic exchanged between two IP addresses during a period of time T. For KDD Cup 99, the 41 features or a subset of them were employed. Tellenbach et al. (2011), on the other hand, used source port, country code, and other features, constructing the TES as input data.
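As an illustration of the flow-level features named above, the following Python sketch groups packets into flows and computes the duration, size, and average packet size of each flow. The packet tuples are invented, flows are treated as unidirectional, and all packets are assumed to fall within the same period T; these are simplifying assumptions, not details from the slides.

```python
from collections import defaultdict

# Hypothetical packets: (timestamp_s, src_ip, src_port, dst_ip, dst_port, size_bytes)
packets = [
    (0.0, "10.0.0.1", 5000, "10.0.0.2", 80, 100),
    (0.5, "10.0.0.1", 5000, "10.0.0.2", 80, 300),
    (1.0, "10.0.0.3", 6000, "10.0.0.2", 22, 60),
    (2.0, "10.0.0.1", 5000, "10.0.0.2", 80, 200),
]


def flow_features(packets):
    """Group packets by (src_ip, src_port, dst_ip, dst_port) and summarize each flow."""
    flows = defaultdict(list)
    for ts, src_ip, src_port, dst_ip, dst_port, size in packets:
        flows[(src_ip, src_port, dst_ip, dst_port)].append((ts, size))

    summary = {}
    for key, items in flows.items():
        times = [t for t, _ in items]
        sizes = [s for _, s in items]
        summary[key] = {
            "duration": max(times) - min(times),         # flow duration in seconds
            "packets": len(items),                       # flow size in packets
            "bytes": sum(sizes),                         # flow size in bytes
            "avg_packet_size": sum(sizes) / len(sizes),  # average packet size per flow
        }
    return summary


features = flow_features(packets)
web_flow = features[("10.0.0.1", 5000, "10.0.0.2", 80)]
```

A flow size distribution can then be estimated from the `packets` or `bytes` values across all flows in a window.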

Windowing in Network Traffic

Window-based methods group consecutive packets or flows using a sliding window. The $i$-th window of size $L$ packets is represented as

$$W_i(L, \tau) = \{pack_k, pack_{k+1}, \ldots, pack_{k+L-1}\}, \quad k = iL - i\tau,$$

where $\tau$ is the overlap and $\tau \in \{0, 1, \ldots, L-1\}$. When the window size is given by time, $L$ can be different in each window.

Windowing is performed in two ways: with overlapping windows ($\tau \neq 0$) and with non-overlapping windows ($\tau = 0$). The most commonly used window sizes are 5 min, 30 min, 1 min, 100 s, 5 s, and 0.5 s. Some researchers use windows with a fixed length of $L$ = 4096, 1000, or 32 packets.
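Both windowing modes can be sketched in a few lines of Python. The function names are hypothetical. The packet-count version starts the i-th window at k = iL - iτ, holds exactly L items per window, and drops an incomplete final window; the time-based version uses non-overlapping intervals and skips empty ones. The last two choices are assumptions made for the sketch.

```python
def sliding_windows(items, L, tau=0):
    """Windows of L consecutive items; consecutive windows share tau items."""
    if L <= 0:
        raise ValueError("L must be positive")
    if not 0 <= tau < L:
        raise ValueError("tau must satisfy 0 <= tau <= L - 1")
    step = L - tau  # start of window i is k = i * (L - tau) = iL - i*tau
    windows = []
    k = 0
    while k + L <= len(items):
        windows.append(items[k:k + L])
        k += step
    return windows


def time_windows(timestamped_items, width):
    """Non-overlapping windows of fixed duration; the item count varies per window."""
    buckets = {}
    for ts, item in timestamped_items:
        buckets.setdefault(int(ts // width), []).append(item)
    return [buckets[i] for i in sorted(buckets)]


packets = list(range(10))                      # stand-ins for pack_0 .. pack_9
overlapping = sliding_windows(packets, L=4, tau=2)
disjoint = sliding_windows(packets, L=4, tau=0)
by_time = time_windows(
    [(0.1, "a"), (0.4, "b"), (1.2, "c"), (2.7, "d"), (2.9, "e")], width=1.0
)
```

With `tau=2` every window repeats the last two packets of its predecessor, and the time-based call shows how the number of items per window changes when the window is defined by duration.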

Entropy Calculation

Let $X$ be a random variable which takes values in the set $\{x_1, x_2, \ldots, x_M\}$, let $p_i := P(X = x_i)$ be the probability of occurrence of $x_i$, and let $M$ be the cardinality of the finite set. The Shannon entropy is then

$$H_S(X) = -\sum_{i=1}^{M} p_i \log(p_i). \tag{1}$$

The Rényi entropy is defined as

$$H_R(X, q) = \frac{1}{1-q} \log\left(\sum_{i=1}^{M} p_i^q\right) \tag{2}$$

and the Tsallis entropy is

$$H_T(X, q) = \frac{1}{q-1}\left(1 - \sum_{i=1}^{M} p_i^q\right). \tag{3}$$

When $q \to 1$, the generalized entropies reduce to the Shannon entropy. In order to compare the changes of entropy at different times, the entropy is normalized, i.e.,

$$\bar{H}(X) = \frac{H(X)}{H_{\max}(X)}. \tag{4}$$
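Equations (1) to (4) translate directly into Python. In this sketch the probabilities are estimated by relative frequencies, the logarithm is the natural one, and the Shannon entropy is normalized by $H_{\max} = \log M$ (the entropy of a uniform distribution over the $M$ observed values). The slides leave the estimator, the base, and $H_{\max}$ unspecified, so all three are assumptions.

```python
import math
from collections import Counter


def probabilities(samples):
    """Relative-frequency estimate of p_i from observed feature values."""
    counts = Counter(samples)
    n = len(samples)
    return [c / n for c in counts.values()]


def shannon(p):
    """Eq. (1): H_S = -sum p_i log p_i."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def renyi(p, q):
    """Eq. (2): H_R = log(sum p_i^q) / (1 - q); the Shannon limit is used at q = 1."""
    if q == 1:
        return shannon(p)
    return math.log(sum(pi ** q for pi in p if pi > 0)) / (1 - q)


def tsallis(p, q):
    """Eq. (3): H_T = (1 - sum p_i^q) / (q - 1); the Shannon limit is used at q = 1."""
    if q == 1:
        return shannon(p)
    return (1 - sum(pi ** q for pi in p if pi > 0)) / (q - 1)


def normalized_shannon(p):
    """Eq. (4) with the assumed H_max = log(M)."""
    M = len(p)
    return shannon(p) / math.log(M) if M > 1 else 0.0


# Hypothetical destination ports seen in one window.
p = probabilities([80, 80, 443, 22])

# Values of q close to 1 should give results close to the Shannon entropy.
gap_renyi = abs(renyi(p, 1.0001) - shannon(p))
gap_tsallis = abs(tsallis(p, 1.0001) - shannon(p))
```

Evaluating the generalized entropies at a value of `q` near 1, as in the last two lines, is a quick numerical check of the limit stated above.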
