Anomaly Based Intrusion Detection in Distributed Applications

Anomaly Based Intrusion Detection in Distributed Applications without global clock
Eric Totel, Mouna Hkimi, Michel Hurfin, Mourad Leslous, Yvan Labiche
SEC2-2016, 5 July 2016

Outline of the Presentation
• Position of the problem
• Building a distributed application behavior model
  • Partial Event Ordering
  • Automaton recognizing sequences
  • Temporal properties
• Applying Detection on an example
• Results on a distributed file system application
CentraleSupelec

Intrusion Detection in Distributed Systems
• Several nodes running processes
• Intrusion Detection Systems are deployed
  • On the network (NIDS)
  • On each node (HIDS)
• Local detection of compromise
  • No relationship between the states of the different nodes
  • Each alert emitted takes into account the state of one node only
• Current solutions
  • Alert correlation: requires total ordering of all alerts
  • DIDS: requires total ordering of all events analyzed
• In Cloud environments: virtual machines are often desynchronized (clock drift)

The Case of a Distributed Application
• How can the detection be enhanced?
• Statement
  • The states of the different nodes are not independent
  • As such, the behaviors of the different nodes of the application are not independent
  • The actions performed by the nodes are causally dependent on each other
    • Local actions
    • Messages exchanged
• Solution
  • Build a reference model that takes into account the causal dependencies between the nodes
  • Without relying on a total ordering of the events (no global clock)

Logs and partial ordering
• On each node, a process produces a totally ordered log
• Partial ordering of the events on different nodes (Lamport's happened-before relation): $\forall e \in E^\alpha, \forall f \in E^\alpha$, $e \prec_\alpha f$ holds when one of the following is true
  • e occurred before f in the same log $E^\alpha_i$
  • e is a message send and f its receipt
  • there exists g such that $e \prec_\alpha g$ and $g \prec_\alpha f$
• How to learn the right sequences of actions performed by the distributed processes?
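The three rules of the happened-before relation can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that each log is a list of event labels, that labels are unique within an execution, and that a send `c1!m` is matched to its receipt `c1?m` by swapping `!` for `?`. The `logs` dictionary and the function name `happened_before` are hypothetical.

```python
from itertools import product

# Hypothetical encoding of one execution: one totally ordered log per process.
logs = {"p1": ["a", "b", "c1!m"], "p2": ["d", "c1?m", "e"]}

def happened_before(logs):
    """Return the set of pairs (e, f) such that e happened before f."""
    order = set()
    # Rule 1: e occurred before f in the same log.
    for events in logs.values():
        for i, e in enumerate(events):
            for f in events[i + 1:]:
                order.add((e, f))
    # Rule 2: e is a message send and f is its receipt (assumed naming scheme).
    all_events = {e for events in logs.values() for e in events}
    for e in all_events:
        if "!" in e and e.replace("!", "?") in all_events:
            order.add((e, e.replace("!", "?")))
    # Rule 3: transitivity, applied until no new pair appears.
    changed = True
    while changed:
        changed = False
        for (e, g), (h, f) in product(list(order), repeat=2):
            if g == h and (e, f) not in order:
                order.add((e, f))
                changed = True
    return order

hb = happened_before(logs)
print(("a", "e") in hb)                      # True: a reaches e through the message
print(("b", "d") in hb or ("d", "b") in hb)  # False: b and d are concurrent
```

The fixpoint loop is quadratic in the number of pairs, which is acceptable for a sketch but would need a proper transitive-closure algorithm on real logs.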

Example of Logs
• A trace: the two logs of two processes p1 and p2 in execution α

      $E^\alpha_1$ (p1)   $E^\alpha_2$ (p2)
  1   a                   d
  2   b                   c1?m
  3   c1!m                e

• On this execution, $a \prec_\alpha b$
• No order relation between b and d: {a, b, d, c1!m, c1?m, e} is a valid sequence, but not the only one!

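Notion of a valid sequence
• Observed correct normal sequence
• Compliant with the partial order relation
• A sequence of events is valid iff it orders the events of the execution consistently with the partial order: whenever $e \prec_\alpha f$, e appears before f in the sequence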
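One reading of this notion, assumed here, is that a valid sequence is a total order of all the events of an execution that never contradicts the partial order. Under that assumption, validity can be checked directly on the logs: the sequence must preserve the order of each local log and place every receipt after its send. The names `logs` and `is_valid` and the `!`/`?` matching of sends and receipts are illustrative choices.

```python
# Hypothetical encoding of execution alpha: one ordered log per process.
logs = {"p1": ["a", "b", "c1!m"], "p2": ["d", "c1?m", "e"]}

def is_valid(sequence, logs):
    """Check that `sequence` is a total order compatible with the logs."""
    all_events = [e for events in logs.values() for e in events]
    # The sequence must contain exactly the events of the execution.
    if sorted(sequence) != sorted(all_events):
        return False
    # The order of each local log must be preserved.
    for events in logs.values():
        if [e for e in sequence if e in events] != events:
            return False
    # Every receipt must come after the matching send.
    for position, event in enumerate(sequence):
        if "?" in event and event.replace("?", "!") not in sequence[:position]:
            return False
    return True

print(is_valid(["a", "b", "d", "c1!m", "c1?m", "e"], logs))  # True
print(is_valid(["a", "b", "d", "c1?m", "c1!m", "e"], logs))  # False: receipt before send
```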

Generation of valid sequences (1)
• Generating the lattice of consistent cuts
• A valid sequence is a sequence of events consumed by a path in the lattice of consistent cuts
[Figure: lattice of consistent cuts of execution α, with the events of p1 (a, b, c1!m) on one axis and the events of p2 (d, c1?m, e) on the other]
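The walk through the lattice can be sketched as follows. A cut is represented, as an assumption of this sketch, by the number of events already consumed in each log, and a cut is extended only with events that keep it consistent (a receipt is allowed only once its send is in the cut). The names `enabled` and `valid_sequences` are hypothetical.

```python
# Hypothetical encoding of execution alpha: one ordered log per process.
logs = {"p1": ["a", "b", "c1!m"], "p2": ["d", "c1?m", "e"]}

def enabled(cut, logs, procs):
    """Yield (event, next_cut) for every event that keeps the cut consistent."""
    seen = {e for p, n in zip(procs, cut) for e in logs[p][:n]}
    for i, p in enumerate(procs):
        if cut[i] == len(logs[p]):
            continue  # this log is fully consumed
        event = logs[p][cut[i]]
        if "?" in event and event.replace("?", "!") not in seen:
            continue  # receipt whose send is not yet in the cut
        yield event, cut[:i] + (cut[i] + 1,) + cut[i + 1:]

def valid_sequences(logs):
    """Enumerate the event sequences along all paths of the lattice."""
    procs = sorted(logs)
    final = tuple(len(logs[p]) for p in procs)

    def walk(cut):
        if cut == final:
            yield []
            return
        for event, nxt in enabled(cut, logs, procs):
            for rest in walk(nxt):
                yield [event] + rest

    return list(walk(tuple(0 for _ in procs)))

sequences = valid_sequences(logs)
print(len(sequences))  # 4: d can be placed at four positions before c1?m
```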

Generation of valid sequences (2)
• Generation of an automaton containing all the paths in the lattice of consistent cuts
[Figure: automaton derived from the lattice of consistent cuts of execution α, with transitions labeled by the events a, b, c1!m, d, c1?m, e]
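A possible way to obtain such an automaton, sketched here under the assumption that each consistent cut becomes a state and each event that extends a cut becomes a transition, is a breadth-first exploration of the lattice. The state numbering of the slide's figure is not reproduced; states are simply the cuts themselves, and `build_automaton` and `accepts` are hypothetical names.

```python
from collections import deque

# Hypothetical encoding of execution alpha: one ordered log per process.
logs = {"p1": ["a", "b", "c1!m"], "p2": ["d", "c1?m", "e"]}

def build_automaton(logs):
    """Return (start, final, transitions) with one state per consistent cut."""
    procs = sorted(logs)
    start = tuple(0 for _ in procs)
    final = tuple(len(logs[p]) for p in procs)
    transitions = {}  # (cut, event) -> next cut
    queue, visited = deque([start]), {start}
    while queue:
        cut = queue.popleft()
        seen = {e for p, n in zip(procs, cut) for e in logs[p][:n]}
        for i, p in enumerate(procs):
            if cut[i] == len(logs[p]):
                continue
            event = logs[p][cut[i]]
            if "?" in event and event.replace("?", "!") not in seen:
                continue  # the cut would become inconsistent
            nxt = cut[:i] + (cut[i] + 1,) + cut[i + 1:]
            transitions[(cut, event)] = nxt
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return start, final, transitions

def accepts(automaton, sequence):
    """Run the deterministic automaton on a sequence of events."""
    start, final, transitions = automaton
    state = start
    for event in sequence:
        if (state, event) not in transitions:
            return False
        state = transitions[(state, event)]
    return state == final

automaton = build_automaton(logs)
start, final, transitions = automaton
states = {start} | set(transitions.values())
print(len(states), len(transitions))  # 10 12
print(accepts(automaton, ["a", "b", "d", "c1!m", "c1?m", "e"]))  # True
```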

Automaton from several executions
• Logs of execution α and execution β

      $E^\alpha_1$   $E^\alpha_2$   $E^\beta_1$   $E^\beta_2$
  1   a              d              a             f
  2   b              c1?m           c1!m          c1?m
  3   c1!m           e                            g

• Merge the start states of all the automata
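Merging the start states can be sketched as follows. The logs of execution β are taken from the slide's table as it could be read after extraction, so they should be treated as an assumption. Each execution gets its own automaton over consistent cuts, states are tagged with the execution name to keep them apart, and the two start states are then replaced by one shared state. Since both executions begin with `a`, the merged automaton is nondeterministic and is run with a set of current states. All function names are hypothetical.

```python
alpha = {"p1": ["a", "b", "c1!m"], "p2": ["d", "c1?m", "e"]}
beta = {"p1": ["a", "c1!m"], "p2": ["f", "c1?m", "g"]}  # assumed reading of the slide

def cut_automaton(name, logs):
    """Automaton over the consistent cuts of one execution, states tagged by name."""
    procs = sorted(logs)
    start = tuple(0 for _ in procs)
    final = tuple(len(logs[p]) for p in procs)
    edges, todo, visited = [], [start], {start}
    while todo:
        cut = todo.pop()
        seen = {e for p, n in zip(procs, cut) for e in logs[p][:n]}
        for i, p in enumerate(procs):
            if cut[i] == len(logs[p]):
                continue
            event = logs[p][cut[i]]
            if "?" in event and event.replace("?", "!") not in seen:
                continue
            nxt = cut[:i] + (cut[i] + 1,) + cut[i + 1:]
            edges.append(((name, cut), event, (name, nxt)))
            if nxt not in visited:
                visited.add(nxt)
                todo.append(nxt)
    return (name, start), (name, final), edges

def merge_start_states(automata):
    """Fuse the start states of all automata into a single state."""
    delta, finals = {}, set()
    for start, final, edges in automata:
        def rename(state, start=start):
            return "start" if state == start else state
        finals.add(rename(final))
        for source, event, target in edges:
            delta.setdefault((rename(source), event), set()).add(rename(target))
    return "start", finals, delta

def accepts(merged, sequence):
    """Nondeterministic run: track the set of states reachable so far."""
    start, finals, delta = merged
    current = {start}
    for event in sequence:
        current = {t for s in current for t in delta.get((s, event), ())}
    return bool(current & finals)

merged = merge_start_states([cut_automaton("alpha", alpha), cut_automaton("beta", beta)])
print(accepts(merged, ["a", "b", "d", "c1!m", "c1?m", "e"]))  # True (execution alpha)
print(accepts(merged, ["a", "f", "c1!m", "c1?m", "g"]))       # True (execution beta)
print(accepts(merged, ["a", "b", "c1!m", "d", "c1?m", "g"]))  # False before generalization
```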

Analysis of the automaton
• Contains only the observed valid sequences
• In practice
  • In a large distributed application, it is very difficult to exhibit all the behaviors of the application due to concurrency
  • It is thus very difficult to learn a complete behavior model
• Solution
  • Generalization of the automaton
  • Allows new, unlearned behaviors to be introduced
  • Ensures that all the original valid sequences are included in the generalized automaton

Generalization (k-tail algorithm)
• k=1 (a low k permits a higher generalization)
• Advantage: can introduce new valid unlearned sequences of events
• Disadvantage: can introduce incorrect sequences of events at the same time
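The effect of k can be illustrated with a simplified, single-pass variant of k-tail merging. This sketch is not the exact algorithm used by the authors: it assumes that two states are merged when they have the same set of outgoing event sequences of length at most k and the same final status. The training sequences are made up for the illustration, and the automaton is a prefix tree built from them.

```python
def prefix_tree(sequences):
    """Build a prefix-tree automaton: (start, finals, delta)."""
    delta, finals, fresh = {}, set(), 1
    for sequence in sequences:
        state = 0
        for event in sequence:
            targets = delta.setdefault((state, event), set())
            if not targets:
                targets.add(fresh)
                fresh += 1
            state = next(iter(targets))
        finals.add(state)
    return 0, finals, delta

def tails(state, delta, k):
    """Event sequences of length at most k that can be read from `state`."""
    if k == 0:
        return {()}
    result = set()
    for (source, event), targets in delta.items():
        if source == state:
            for target in targets:
                result |= {(event,) + rest for rest in tails(target, delta, k - 1)}
    return result or {()}

def k_tail(automaton, k):
    """Merge states that share the same k-tails (simplified, single pass)."""
    start, finals, delta = automaton
    states = {start} | set(finals) | {s for s, _ in delta}
    states |= {t for targets in delta.values() for t in targets}
    signature = {s: (frozenset(tails(s, delta, k)), s in finals) for s in states}
    representative = {}
    for s in sorted(states):
        representative.setdefault(signature[s], s)
    rep = {s: representative[signature[s]] for s in states}
    merged = {}
    for (source, event), targets in delta.items():
        merged.setdefault((rep[source], event), set()).update(rep[t] for t in targets)
    return rep[start], {rep[f] for f in finals}, merged

def accepts(automaton, sequence):
    start, finals, delta = automaton
    current = {start}
    for event in sequence:
        current = {t for s in current for t in delta.get((s, event), ())}
    return bool(current & finals)

learned = prefix_tree([["a", "b", "c"], ["d", "b", "e"]])  # made-up training sequences
print(accepts(learned, ["a", "b", "e"]))             # False: never observed
print(accepts(k_tail(learned, 1), ["a", "b", "e"]))  # True: introduced by k=1
print(accepts(k_tail(learned, 2), ["a", "b", "e"]))  # False: k=2 generalizes less
```

Whether the new sequence `a b e` is a legitimate unlearned behavior or an incorrect one cannot be decided by the automaton alone, which is the trade-off stated on the slide.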

How to deal with incorrect sequences?
• Duality of models
  • Automaton: exhaustive list of sequences
  • Temporal properties: properties on the types of events
• Temporal invariants
  • Borrowed from the software testing domain
  • Three invariants considered (a and b are event types)
    • a is always followed by b
    • a is never followed by b
    • a always precedes b
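The three invariant types can be mined from a set of sequences with a direct check over every ordered pair of event types. The sketch below is an illustration rather than the tool used by the authors. The `traces` are the four total orders compatible with the two example logs (a, b, c1!m on one process and d, c1?m, e on the other), and the function names are hypothetical.

```python
from itertools import permutations

# The four total orders compatible with the example logs of execution alpha.
traces = [
    ["d", "a", "b", "c1!m", "c1?m", "e"],
    ["a", "d", "b", "c1!m", "c1?m", "e"],
    ["a", "b", "d", "c1!m", "c1?m", "e"],
    ["a", "b", "c1!m", "d", "c1?m", "e"],
]

def occurrences(traces, event_type):
    """Yield (trace, index) for every occurrence of `event_type`."""
    for trace in traces:
        for index, event in enumerate(trace):
            if event == event_type:
                yield trace, index

def mine_invariants(traces):
    """Return the invariants that hold on every trace."""
    types = sorted({event for trace in traces for event in trace})
    invariants = set()
    for a, b in permutations(types, 2):
        # Every occurrence of a has a later b.
        if all(b in t[i + 1:] for t, i in occurrences(traces, a)):
            invariants.add((a, "always followed by", b))
        # No occurrence of a has a later b.
        if all(b not in t[i + 1:] for t, i in occurrences(traces, a)):
            invariants.add((a, "never followed by", b))
        # Every occurrence of b has an earlier a.
        if all(a in t[:i] for t, i in occurrences(traces, b)):
            invariants.add((a, "always precedes", b))
    return invariants

invariants = mine_invariants(traces)
print(("a", "always followed by", "b") in invariants)  # True
print(("b", "never followed by", "d") in invariants)   # False: b and d are concurrent
```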

Invariants on our example
• Total of 59 invariants
[Figure: generalization and model checking steps]

Duality of models
• Model: generalized automaton
• Invariants that can be violated by the generalized automaton (total of 10 invariants)
• Non-acceptable sequence: {a, b, c1!m, d, c1?m, g}

Valid/Accepted/Acceptable sequences
• Invariants computed on the original lattice of consistent cuts
  • Invariants on valid sequences of events
• Invariants are less restrictive than the automaton
• We consider a sequence acceptable if it is accepted by the generalized automaton and complies with the invariants
[Figure: three sets of sequences: ∑ valid sequences, ∑' sequences accepted by the generalized automaton, ∑'' acceptable sequences]
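The acceptability test combines the two models, which can be sketched as a conjunction of two checks. In this sketch the generalized automaton is replaced by a stand-in predicate `automaton_accepts`, and the single invariant "f always precedes g" is an illustrative assumption about what could be mined from the example executions, not a value taken from the slides.

```python
def complies(sequence, invariants):
    """Check a sequence against (a, kind, b) temporal invariants."""
    for a, kind, b in invariants:
        for index, event in enumerate(sequence):
            if kind == "always followed by" and event == a and b not in sequence[index + 1:]:
                return False
            if kind == "never followed by" and event == a and b in sequence[index + 1:]:
                return False
            if kind == "always precedes" and event == b and a not in sequence[:index]:
                return False
    return True

def acceptable(sequence, automaton_accepts, invariants):
    """A sequence is acceptable if the automaton accepts it and no invariant is violated."""
    return automaton_accepts(sequence) and complies(sequence, invariants)

# Stand-in for a generalized automaton that accepts both sequences below (assumption).
def automaton_accepts(sequence):
    return True

invariants = {("f", "always precedes", "g")}  # illustrative invariant

print(acceptable(["a", "f", "c1!m", "c1?m", "g"], automaton_accepts, invariants))       # True
print(acceptable(["a", "b", "c1!m", "d", "c1?m", "g"], automaton_accepts, invariants))  # False
```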

Detection algorithm
• Given a trace, is this trace compliant
  • With the generalized automaton
  • With the temporal invariants
• Two strategies
  • All total orderings of the events of the trace are compliant with the model
  • At least one total ordering of the events of the trace is compliant with the model
• In practice
  • Strategy "all" is more time consuming
  • Similar false positive rate in both approaches
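The two strategies can be sketched as follows. The model is abstracted as a predicate `model_accepts` over one total ordering (in the approach above it would combine the generalized automaton and the invariants); here it is a stand-in that knows only two orderings, which is an assumption made for the example. A trace that admits no total ordering at all, for instance because a send was removed, is treated as non-compliant in this sketch.

```python
def total_orderings(logs, cut=None):
    """Yield every total ordering of the trace compatible with its partial order."""
    procs = sorted(logs)
    if cut is None:
        cut = tuple(0 for _ in procs)
    if all(n == len(logs[p]) for p, n in zip(procs, cut)):
        yield []
        return
    seen = {e for p, n in zip(procs, cut) for e in logs[p][:n]}
    for i, p in enumerate(procs):
        if cut[i] == len(logs[p]):
            continue
        event = logs[p][cut[i]]
        if "?" in event and event.replace("?", "!") not in seen:
            continue  # a receipt cannot precede its send
        nxt = cut[:i] + (cut[i] + 1,) + cut[i + 1:]
        for rest in total_orderings(logs, nxt):
            yield [event] + rest

def is_compliant(logs, model_accepts, strategy="any"):
    """Apply the 'all' or the 'at least one' strategy to a trace."""
    verdicts = (model_accepts(seq) for seq in total_orderings(logs))
    if strategy == "any":
        return any(verdicts)  # stops at the first compliant ordering
    verdicts = list(verdicts)  # 'all' has to examine every ordering
    return bool(verdicts) and all(verdicts)

# Stand-in model: only two orderings are known (assumption for the example).
known = {("a", "b", "d", "c1!m", "c1?m", "e"), ("a", "b", "c1!m", "d", "c1?m", "e")}

def model_accepts(sequence):
    return tuple(sequence) in known

trace = {"p1": ["a", "b", "c1!m"], "p2": ["d", "c1?m", "e"]}
tampered = {"p1": ["a", "c1!m"], "p2": ["d", "c1?m", "e"]}  # event b removed

print(is_compliant(trace, model_accepts, "any"))     # True
print(is_compliant(trace, model_accepts, "all"))     # False
print(is_compliant(tampered, model_accepts, "any"))  # False
```

The early exit of `any` on the first compliant ordering is one reason the "all" strategy costs more on compliant traces.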

Simple Example: e-commerce
• 3 processes: article buying, 70 possible different behaviours
[Figure: automata of the three processes, with transitions labeled as follows]
  Process (p2): P2-P1!SEARCH, P1-P2?AVAILABLE, P2-P1!BUY, P1-P2?SOLD
  Server (p1): P2-P1?SEARCH, P1-P2!AVAILABLE, P2-P1?BUY, P1-P2!SOLD, P3-P1?SEARCH, P1-P3!AVAILABLE, P3-P1?BUY, P1-P3!SOLD
  Process (p3): P3-P1!SEARCH, P1-P3?AVAILABLE, P3-P1!BUY, P1-P3?SOLD

Detection Accuracy
• Simulations of an intrusion
  • Removing an event
  • Modifying the order of events
  • Adding new events
• These simulations violate the integrity of the distributed logs
• All of them are detected by the approach

Generalization and False Positive Rate
• Learning phase with 10, 20, 30, 40, 50, 60 traces
• With a generalization parameter k=1, 2, 3, 4, 5
• Result: the generalization decreases the rate of false positives, even with a low number of traces learnt
[Figure: false positive rate (axis from 0% to 90%) as a function of k (1 to 5), one series per number of traces learned (10, 20, 30, 40, 50, 60)]

Real World Evaluation: XtreemFS
• High Availability Distributed Replicated File System
• Intrusion Detection approach applied on a simple configuration of the nodes

Experimentation applied
• Writing of a set of files
  • 500 files used to learn the model
  • 1640 files written to measure the false positive rate
• Traces obtained on each node by instrumenting the code of the file servers
• One trace for a complete file write

Model Size
• As the number of traces used to learn the model grows
  • The number of invariants decreases
  • The size (number of states) of the automaton grows (k-tail applied with k=1)
[Figure: "Model Size": number of invariants (axis from 6200 to 7800) and number of states (axis from 0 to 800) against the number of traces (10, 50, 100, 200, 300, 400, 500)]
