Anomaly Detection Based on Simplicity Theory. Giacomo Casoni, Mar Badias Simó. PowerPoint presentation transcript.



SLIDE 1

Anomaly Detection Based on Simplicity Theory

Giacomo Casoni, Mar Badias Simó. Research Project 1 - #43. Supervisor: Giovanni Sileno. Lecturer: Cees de Laat.

SLIDE 2

TABLE OF CONTENTS

01 INTRODUCTION: Basic concepts and research questions
02 THEORY TO PRACTICE: Set a context and quantify complexities
03 THE DATA: Dataset treatment and feature definition
04 IMPLEMENTATION
05 RESULTS AND CONCLUSIONS



SLIDE 6

Simplicity Theory

Cognitive probability in terms of complexity and simplicity, rather than standard mathematical, set-based terms.

Calculates the unexpectedness of a situation: U(s) = Cw(s) - Cd(s)

  • Generation complexity Cw(s)
  • Description complexity Cd(s)
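The formula above can be stated as a one-line function. This is a minimal sketch; the parameter names c_w and c_d are assumptions, standing for the two complexities measured in bits.

```python
# Minimal sketch of the unexpectedness measure U(s) = Cw(s) - Cd(s).
# c_w and c_d are assumed to be the generation and description
# complexities of a situation, already quantified in bits.
def unexpectedness(c_w: float, c_d: float) -> float:
    """High when a situation is hard to generate (high Cw)
    and/or easy to describe (low Cd)."""
    return c_w - c_d
```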

SLIDE 8

Simplicity Theory

An example:

  • Fair lottery draw: 1-2-3-4-5-6
  • Same chances as any other combination
  • Odd from a human point of view
  • Same generation cost as other combinations
  • Low description cost ("1 to 6")
  • Therefore, the unexpectedness U(s) = Cw(s) - Cd(s) is high
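The lottery reasoning can be made concrete with a toy description cost measured in characters. This sketch, including the description_cost function and its two competing description strategies, is purely illustrative and not the presentation's actual measure.

```python
# Toy description cost (illustrative assumption): a draw may be
# described either by listing every number or, when the numbers are
# consecutive, by a compact "a to b" shortcut. The cheaper wins.
def description_cost(draw):
    listing = "-".join(str(n) for n in draw)           # e.g. "1-2-3-4-5-6"
    if all(b - a == 1 for a, b in zip(draw, draw[1:])):
        shortcut = f"{draw[0]} to {draw[-1]}"          # e.g. "1 to 6"
        return min(len(listing), len(shortcut))
    return len(listing)

# "1 to 6" is far cheaper to describe than an arbitrary draw,
# even though both are equally likely to be generated.
assert description_cost([1, 2, 3, 4, 5, 6]) < description_cost([4, 17, 23, 31, 40, 42])
```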

SLIDE 9

Simplicity Theory

A situation is unexpected, in the eyes of an observer, when it is hard to generate (high Cw(s)) and/or easy to describe (low Cd(s)).

SLIDE 10

Anomaly Detection

Anomaly detection systems model the normal behavior of a target system and report abnormal activities, which are analyzed as possible intrusions.

SLIDE 11

Research Questions

1. How can an anomaly detection tool based on Simplicity Theory be designed and implemented?
2. How effective can said tool be in detecting anomalies in the network logs of a system?



SLIDE 14

Putting it Into Practice

U(s) = Cw(s) - Cd(s)

QUANTIFY COMPLEXITIES: How can generation and description complexity be quantified? The quantification needs to be representative and comparable.

SLIDE 15

Putting it Into Practice

U(s) = Cw(s) - Cd(s)

SET A CONTEXT: Simplicity Theory allows for observer point-of-view bias. Different observers might have different concepts of "abnormal".

SLIDE 16

Set a Context (1)

Define object prototypes. Prototypes, in the conceptual space, are used as a baseline to compute the generation and description complexity of a given state. They are defined in n dimensions, where n is the number of features.

SLIDE 21

Set a Context (2)

In our case, one of the categorical features...

  • Source IP: monitor an IP address's traffic for abnormal behaviours (compromised machine).
  • Destination IP: monitor for unusual traffic to a specific machine (server under attack).
  • Protocol: monitor for abnormal protocol-specific traffic (specific attacks).

...however, this is not necessary. Alternatives:

  • Combination of categorical features
  • K-Prototypes
  • No prototypes (aka one prototype)
SLIDE 22

Set a Context (3)

[Figure: object prototypes (e.g. TCP, DNS) with their dimensions (Source IP, Dst. IP, Info, Length) and example feature prototypes (192.168.0.1, 192.168.0.2, 192.168.0.3; lengths 104, 96).]

SLIDE 26

Quantifying Complexities - Generation (1)

"The length of the shortest program that a given environment must execute to achieve a given state."

Real-life events are often NOT like a fair lottery: some events are more likely to happen than others...

...so a ranking of the most frequently occurring feature prototypes has to be created.

SLIDE 27

Quantifying Complexities - Generation (2)

Rank   Code      Complexity
1st    (empty)   0
2nd    0         1
3rd    1         1
4th    00        2
5th    01        2
6th    10        2
7th    11        2
8th    000       3
9th    001       3
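The ranking-to-code scheme in the table can be sketched by enumerating binary strings in order of length ("", "0", "1", "00", "01", ...) and handing the shortest codes to the most frequent prototypes. The helper name code_for_rank is an assumption, not the project's code.

```python
from itertools import count, product

# Sketch: assign binary codes to frequency ranks by enumerating all
# binary strings in length order, so more frequent feature prototypes
# get shorter codes. The complexity of a rank is its code's length.
def code_for_rank(rank: int) -> str:
    """1-based rank -> binary code string."""
    seen = 0
    for length in count(0):
        for bits in product("01", repeat=length):
            seen += 1
            if seen == rank:
                return "".join(bits)

assert code_for_rank(1) == ""     # most frequent: complexity 0
assert code_for_rank(4) == "00"   # complexity 2
assert code_for_rank(8) == "000"  # complexity 3
```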

SLIDE 28

Quantifying Complexities - Generation (2)

IP (by frequency rank)   Code      Complexity
192.168.0.1              (empty)   0
192.168.0.2              0         1
192.168.0.3              1         1
192.168.0.4              00        2
192.168.0.5              01        2
192.168.0.6              10        2
192.168.0.7              11        2
192.168.0.8              000       3
192.168.0.9              001       3

SLIDE 29

Quantifying Complexities - Generation (2)

[Figure: the frequency ranking (1st, 2nd, 3rd, ...) mapped onto the IP code table of the previous slide.]


SLIDE 34

Quantifying Complexities - Description (1)

"The shortest possible description of a state that an observer can produce to discriminate it without ambiguity."

It could be the same as the generation complexity... but an observer can also use its own memory to achieve simpler descriptions. The cheapest option is chosen.

SLIDE 35

Quantifying Complexities - Description (2)

At observation time N, the stack pointer is here:

Position   Moves   Code     Complexity
N-1        0       (0)      1
N-2        1       (1)      1
N-3        2       (10)     2
N-4        3       (11)     2
N-5        4       (100)    3
N-6        5       (101)    3
N-7        6       (110)    3
N-8        7       (111)    3
N-9        8       (1000)   4
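The memory-stack cost above suggests that recalling the observation k steps back costs the binary code length of the number of moves (k - 1). A sketch, assuming move 0 still costs one bit:

```python
# Sketch of the memory-stack move cost: the observation k steps back is
# reached by moves = k - 1, and its cost is the length of that number's
# binary code, with move 0 (the most recent item) still costing 1 bit.
def stack_move_complexity(steps_back: int) -> int:
    moves = steps_back - 1
    return max(1, moves.bit_length())

assert stack_move_complexity(1) == 1   # N-1: move 0, code (0)
assert stack_move_complexity(3) == 2   # N-3: move 2, code (10)
assert stack_move_complexity(9) == 4   # N-9: move 8, code (1000)
```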


SLIDE 37

Quantifying Complexities - Numerical (1)

PROBLEM! The previous methods work for categorical feature prototypes; numerical feature prototypes cannot be ranked.

Idea: numerical feature prototypes could be transformed into categorical ones.

SLIDE 38

Quantifying Complexities - Numerical (2)

SOLUTION - Binary Tree

Compute the mean and standard deviation over all the possible feature prototypes. Describe a feature prototype as being n * (m𝝉) away from the mean. Populate the tree with m𝝉-wide intervals, starting from the one closest to the mean.
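The binning step can be sketched as below. The function name bucket, the parameter m_tau (the interval width, m times the standard deviation 𝝉), and the signed-interval convention are all assumptions of this sketch.

```python
import math

# Sketch: turn a numeric value into a categorical interval label by
# counting how many m*tau-wide steps it lies from the mean. The signed
# index (+1 = first interval above the mean, -1 = first below, ...) can
# then be ranked and coded like any categorical feature prototype.
def bucket(value: float, mean: float, m_tau: float) -> int:
    offset = value - mean
    if offset == 0:
        return 0
    n = math.ceil(abs(offset) / m_tau)
    return int(math.copysign(n, offset))
```

A value half an interval above the mean falls in bucket +1; one 1.2 intervals below falls in bucket -2, the second interval on the negative side.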

SLIDE 39

Quantifying Complexities - Numerical (3)

[Figure: binary tree mapping the m𝝉 intervals around the mean (±m𝝉, ±2m𝝉, ±3m𝝉, ±4m𝝉) to binary codes (e.g. 1, 10, 000, 01, 00, 11, 111); each interval's complexity is its code length.]

SLIDE 40

Quantifying Complexities - Numerical (4)

[Figure: example placement of intervals (0𝝉, ±m𝝉, -2m𝝉, -3m𝝉, -5m𝝉, -6m𝝉) in the binary tree.]

SLIDE 41

Quantifying Complexities - Numerical (5)

SOLUTION - Memory Stack

Compute the mean and standard deviation over all the possible feature prototypes. Describe an observation as being n * (m𝝉) away from a previous observation. The complexity is given by the depth of the previous observation in the stack and its distance from the current observation.

SLIDE 42

Quantifying Complexities - Numerical (6)

At observation time (N, d), the stack pointer is here:

Position     Moves   Code     Complexity
(N-1, d_1)   0       (0)      1 + log(d - d_1)
(N-2, d_2)   1       (1)      1 + log(d - d_2)
(N-3, d_3)   2       (10)     2 + log(d - d_3)
(N-4, d_4)   3       (11)     2 + log(d - d_4)
(N-5, d_5)   4       (100)    3 + log(d - d_5)
(N-6, d_6)   5       (101)    3 + log(d - d_6)
(N-7, d_7)   6       (110)    3 + log(d - d_7)
(N-8, d_8)   7       (111)    3 + log(d - d_8)
(N-9, d_9)   8       (1000)   4 + log(d - d_9)
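Combining the stack-move cost with the logarithmic distance term gives something like the sketch below. Using a base-2 logarithm and clamping the distance to at least 1 are assumptions of this sketch, not statements from the slides.

```python
import math

# Sketch of the numerical memory-stack cost: recalling the observation
# k steps back costs its stack-move code length (as on slide 35) plus
# the bits needed to encode the numeric distance to it.
def numeric_recall_complexity(steps_back: int, d_now: float, d_then: float) -> float:
    move_cost = max(1, (steps_back - 1).bit_length())
    distance = max(1.0, abs(d_now - d_then))   # clamp: assumption
    return move_cost + math.log2(distance)
```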



SLIDE 45

Dataset transformation

DARPA 1999 IDS dataset

  • Create templates for each protocol
  • Calculate the Levenshtein distance

SLIDE 46

Dataset transformation

DARPA 1999 IDS dataset

1,0.000000,Cisco_38:46:33,Cisco_38:46:33,LOOP,60,2
2,0.096519,172.16.112.20,192.168.1.10,DNS,78,26
3,0.101814,192.168.1.10,172.16.112.20,DNS,134,8
4,0.106695,172.16.112.194,196.37.75.158,TCP,60,28
5,0.111396,196.37.75.158,172.16.112.194,TCP,60,37
6,0.111587,172.16.112.194,196.37.75.158,TCP,60,24
7,0.275928,192.168.1.10,172.16.112.20,DNS,87,35
8,0.276578,172.16.112.20,192.168.1.10,DNS,176,72
9,0.278723,192.168.1.10,172.16.112.20,DNS,79,27
10,0.279158,172.16.112.20,192.168.1.10,DNS,144,49

+ Converted to CSV
+ Info field templated and Levenshtein distance calculated

SLIDE 47

Features definition

Log line: "5,0.111396,196.37.75.158,172.16.112.194,TCP,60,37"

  • 196.37.75.158 - Source IP
  • 172.16.112.194 - Destination IP
  • TCP - Protocol
  • 60 - Length of the packet
  • 37 - Information (Levenshtein string distance from the template)
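The converted log lines can be parsed into the five features with a few lines of Python. This is an illustrative sketch: the levenshtein helper is the standard textbook dynamic-programming implementation of the conversion step, and the field names are assumptions, not the project's code.

```python
# Textbook Levenshtein distance (used when templating the Info field).
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

# Parse one converted CSV line into the five features the slides define
# (the Info field already holds the precomputed distance).
def parse_log_line(line: str) -> dict:
    no, time, src, dst, proto, length, info = [f.strip() for f in line.split(",")]
    return {"src_ip": src, "dst_ip": dst, "protocol": proto,
            "length": int(length), "info": int(info)}
```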


SLIDE 52

Implementation (1)

  • Object prototypes are based on Protocols (the same could have been done with any other feature)
  • Source IP and Destination IP are categorical values
  • Length and Info are numerical values

Implementation caveats...

  • When a new feature prototype appears (i.e. a new IP address for a protocol), it is added as a leaf to the binary tree.
  • When a new object prototype appears (i.e. a new protocol), no action is taken, other than generating a message.
SLIDE 53

Implementation (2)

Feature prototype definitions are generated separately for categorical and numerical dimensions.

  • Numerical feature prototype definitions contain the mean and the standard deviation for a given dimension.
  • Categorical feature prototype definitions contain the ranking of the feature prototypes for a given dimension.
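The two kinds of feature-prototype definition described here might be represented as simple records; the class and field names below are assumptions for illustration.

```python
from dataclasses import dataclass, field

# Numerical dimensions keep the statistics needed for the m*tau bins.
@dataclass
class NumericalPrototypeDef:
    mean: float
    std_dev: float                # tau in the slides

# Categorical dimensions keep the frequency ranking used for coding.
@dataclass
class CategoricalPrototypeDef:
    ranking: list = field(default_factory=list)   # most frequent first

    def rank_of(self, value) -> int:
        """1-based frequency rank of a feature prototype."""
        return self.ranking.index(value) + 1
```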



SLIDE 56

Testing and Results (1)

  • Training done over weeks 1 and 3.
  • Testing done on week 4.
  • Testing carried out only on inside captures.
  • 96.4% of attacks detected (accuracy)
  • 80.6% true positives (= 0.81 precision)


SLIDE 58

Testing and Results (2)

  • From 09:39 to 11:15
  • From 16:32 to 18:24
  • From 18:27 to 19:50
  • From 20:03 to 21:34

SLIDE 59

Testing and Results (2)

Many attacks, plus a portsweep and a Smurf DDoS.


SLIDE 61

Conclusions (1)

1. How can an anomaly detection tool based on Simplicity Theory be designed and implemented?
2. How effective can said tool be in detecting anomalies in the network logs of a system?

SLIDE 62

Conclusions (2)

  • "Anomalous Payload-based Network Intrusion Detection", Ke Wang, Salvatore J. Stolfo
  • "Robust Support Vector Machines for Anomaly Detection in Computer Security", Wenjie Hu et al.
  • "Hierarchical Kohonenen Net for Anomaly Detection in Network Security", Suseela T. Sarasamma et al.

Usual false positive rates between <1% and 3%; accuracy usually between 90% and 94%.

SLIDE 64

Conclusions (3)

  • Hard to tell what is actually a false positive (an anomaly does not equate to an attack).
  • Evolving normality.
  • No domain-specific knowledge, poor feature selection.

Plenty of room for improvement!

SLIDE 65

QUESTIONS?