Detecting Anomalies in Inter-hosts Communication Graph

Jan. 14, 2009
Keisuke ISHIBASHI*, Tsuyoshi KONDOH*, Shigeaki HARADA§, Tatsuya MORI§, Ryoichi KAWAHARA§, Shoichiro ASANO¶
* NTT Information Platform Labs.
§ NTT Service Integration Labs.
¶ National Information Institute
Flocon2009

Outline
• Anomalous traffic detection
• Inter-host communication graph
• Anomalies in communication graph
• Detecting method for graph anomaly
  – Similarities between graphs
• Experimental results
  – Synthesized traffic
  – Actual traffic

Anomalous traffic detection
• DDoS attacks, network failures, etc. can be detected as a sudden change in traffic volume
• Worm scans or botnet C&C traffic cannot be found as a volume change
  – Their traffic volume is very small and is buried in normal traffic
• They may be found as a sudden change in traffic pattern, not volume
• Traffic pattern
  – Entropy: can reveal traffic characteristics per host
  – Communication pattern between hosts: can reveal anomalous traffic that appears as an inter-host communication pattern

Communication pattern between hosts
• Can be represented as a graph
• Communication graphs for anomalous traffic
  – Some of them are difficult to detect with conventional methods
• Conventional methods: monitoring entropies in number of flows, etc.
[Figure: example communication graphs in three panels, "Worm scan", "Botnet", and "P2P Botnet", with nodes labeled "Worm infected hosts", "C&C server", and "Victims", arranged along an arrow labeled "More difficult to detect"]

Time series of communication graph

Challenge
• How to detect an anomaly (change) in a time series of graphs?
• Visualization or animation of the communication graph [Yurcik06]
  – Useful especially for digging into an anomalous event by hand
  – However, eyeballing by a human operator is needed to detect an anomalous event
• Automated detection: need to define a similarity between graphs S(G_t, G_{t+1}), where G_t and G_{t+1} are the graphs at time t and t+1
  – Can judge as an anomaly if S(G_t, G_{t+1}) suddenly decreases
[Figure: time series of graphs at t=0, 1, 2, 3, with similarities S(G_0, G_1), S(G_1, G_2), S(G_2, G_3) between consecutive graphs]
• [Yurcik06] William Yurcik, “VisFlowConnect-IP: A Link-Based Visualization of NetFlows for Security Monitoring,” 18th Annual FIRST Conference, June 2006.

Similarities between graphs
• Graph kernel
  – Define an “inner product”-like function f(•, •), a.k.a. kernel, on a non-linear space [Kashima03]
• Edit distance
  – Number of operations to change graph G to G’ [Bunke06]
  – Operations: add/remove edges/nodes
• Can be used to detect anomalies in graph time series
• Difficult to identify the source of the anomaly
• [Kashima03] H. Kashima et al., “Marginalized kernels between labeled graphs,” In Proc. ICML 2003, pp. 321-328.
• [Bunke06] H. Bunke et al., “Computer Network Monitoring and Abnormal Event Detection Using Graph Matching and Multidimensional Scaling,” LNCS Vol. 4065, 2006.

Linear feature space projection
• Linear feature space projection [Ide04]
  – Mapping a graph to a vector in a linear space that represents the features of the graph
• As the feature vector, adopt the principal eigenvector of the adjacency matrix of the graph
  – ≈ PageRank vector
  – Dimension of linear space: number of nodes in the graph
[Figure: communication graph (Host1, Host2, Host3) → adjacency matrix → principal eigenvector → feature vector]
Adjacency matrix shown in the figure:
       1  2  3
    1  -  1  1
    2  1  -  0
    3  1  0  -
• [Ide04] Tsuyoshi Ide and Hisashi Kashima, “Eigenspace-based Anomaly Detection in Computer Systems,” In Proc. 10th ACM SIGKDD Conference (KDD2004), Seattle, WA, USA, 2004.
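The projection step on this slide can be sketched for the 3-host adjacency matrix from the figure. This is a minimal illustration, not the authors' implementation: it assumes an undirected (symmetric) graph, treats the "-" diagonal entries as 0, and the function name `principal_eigenvector` is a hypothetical helper.

```python
import numpy as np

def principal_eigenvector(adjacency):
    """Return the unit-norm eigenvector of the largest eigenvalue.

    Assumes a symmetric adjacency matrix (undirected graph).
    """
    a = np.asarray(adjacency, dtype=float)
    # eigh returns eigenvalues in ascending order for symmetric matrices,
    # so the last column belongs to the largest eigenvalue.
    _, eigenvectors = np.linalg.eigh(a)
    v = eigenvectors[:, -1]
    # An eigenvector's sign is arbitrary; pick the non-negative orientation.
    if v.sum() < 0:
        v = -v
    return v / np.linalg.norm(v)

# Adjacency matrix from the slide, with the "-" diagonal taken as 0 (assumption).
A = [[0, 1, 1],
     [1, 0, 0],
     [1, 0, 0]]
feature = principal_eigenvector(A)
print(feature)
```

The host connected to both other hosts receives the largest element of the feature vector, which is the sense in which the vector resembles a PageRank-style importance score.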

Anomaly detection using feature vector
• Periodically generate a communication graph from observed traffic data, and calculate the feature vector of each graph
• Calculate the similarity between the graph and the previous one
  – Cosine similarity
• Judge as an anomaly if the similarity suddenly decreases
[Figure: feature vectors at time t, t+1, and t+2 plotted on axes of the vector elements for Host1, Host2, and Host3; the vectors at t and t+1 have high similarity, and the vector at t+2 has low similarity and is detected as an anomaly]
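The detection loop described above might look like the following sketch. The slides do not say how a "sudden decrease" is decided, so the fixed `threshold` of 0.8 is purely an assumption for illustration, as are the function names and the toy vectors.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

def detect_anomalies(vectors, threshold=0.8):
    """Return the time indices whose vector differs sharply from the previous one.

    threshold is an assumed cut-off, not a value from the slides.
    """
    flagged = []
    for t in range(len(vectors) - 1):
        if cosine_similarity(vectors[t], vectors[t + 1]) < threshold:
            flagged.append(t + 1)
    return flagged

# Toy feature vectors for three consecutive time slots (made-up values).
series = [
    [0.70, 0.50, 0.50],
    [0.68, 0.52, 0.50],
    [0.10, 0.10, 0.99],
]
flagged = detect_anomalies(series)
print(flagged)
```

A fixed cut-off is the simplest choice; comparing each similarity against a running average of recent values would be another way to express "suddenly decreases".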

Compressing adjacency matrix
• In a large communication graph, calculating the principal eigenvector of the adjacency matrix may be difficult
• Compress the adjacency matrix by combining a hash matrix and a Bloom filter
[Figure: a flow from source address 192.168.0.1 to destination address 10.0.0.1 is hashed to the cell (Hash(SrcIP), Hash(DstIP)) of an M × M hash matrix; the source-destination pair H(192.168.0.1.10.0.0.1) is checked against a Bloom filter to see whether the pair is new, and if it is new, the corresponding cell is incremented]
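One way to read the figure is sketched below. The slides do not give the hash functions, the Bloom filter parameters, or any class names, so everything here (SHA-256 with salts, 2^20 bits, 4 hash functions, `BloomFilter`, `CompressedAdjacency`) is an assumption chosen only to make the idea concrete.

```python
import hashlib
import numpy as np

def _hash(value, salt, modulus):
    """Map a string to an integer in [0, modulus) using salted SHA-256."""
    digest = hashlib.sha256((salt + value).encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus

class BloomFilter:
    """Small Bloom filter; sizes are illustrative assumptions."""

    def __init__(self, num_bits=1 << 20, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def add_if_new(self, item):
        """Insert item and return True if it was (probably) not seen before."""
        is_new = False
        for i in range(self.num_hashes):
            byte, bit = divmod(_hash(item, f"bloom{i}:", self.num_bits), 8)
            if not self.bits[byte] & (1 << bit):
                is_new = True
                self.bits[byte] |= 1 << bit
        return is_new

class CompressedAdjacency:
    """M x M hash matrix that counts distinct source-destination pairs."""

    def __init__(self, size=1280):
        self.size = size
        self.matrix = np.zeros((size, size), dtype=np.int64)
        self.seen = BloomFilter()

    def observe(self, src_ip, dst_ip):
        # Increment the cell only the first time this pair is observed.
        if self.seen.add_if_new(f"{src_ip}.{dst_ip}"):
            row = _hash(src_ip, "src:", self.size)
            col = _hash(dst_ip, "dst:", self.size)
            self.matrix[row, col] += 1

compressed = CompressedAdjacency(size=8)
flows = [("192.168.0.1", "10.0.0.1"),
         ("192.168.0.1", "10.0.0.1"),   # repeated pair, not counted again
         ("192.168.0.2", "10.0.0.1")]
for src, dst in flows:
    compressed.observe(src, dst)
print(compressed.matrix.sum())
```

The matrix dimension stays fixed at M regardless of how many hosts appear, at the cost of several hosts sharing one row or column, and the Bloom filter can occasionally report a new pair as already seen.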

Experimental results
• Observed data: 24 hours of packet capture data on a 1 Gbps link
• Use packets with ports 135/445 (scans) and 6667 (IRC)
  – Current Python implementation cannot handle the whole traffic
  – Focus on botnet-related traffic
• Generate graphs every minute
• Hash matrix size: 1280 × 1280

Time series of similarities of feature vectors
• Several sudden decreases in similarities
• Try to find the source of the anomaly for the first one
[Figure: similarity of eigvec feature vectors (y-axis, 0 to 1.2) versus elapsed time (x-axis, 0:00 to 0:00 over 24 hours)]

Comparison of graphs before/after the anomaly
• By comparing graphs and/or vectors before/after the anomaly, we can identify the source of the anomaly
• Comparing vectors is suited to automated identification
• In this case: a sudden large virus scan
[Figure: series degvec-before and eigvec-before; x-axis ticks 0 to 1400, y-axis ticks 0 to 8000 and 0 to 1]
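Automated identification by comparing vectors could be sketched as ranking the elements that changed most between the two feature vectors. The slides do not specify how the comparison is made, so the absolute-difference ranking, the function name `top_changed_elements`, and the toy values are all assumptions.

```python
import numpy as np

def top_changed_elements(before, after, k=5):
    """Return (index, absolute change) for the k elements that changed most."""
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)
    change = np.abs(after - before)
    order = np.argsort(change)[::-1][:k]   # largest change first
    return [(int(i), float(change[i])) for i in order]

# Toy feature vectors before and after an anomaly (made-up values).
before = np.array([0.5, 0.5, 0.5, 0.5, 0.0])
after = np.array([0.1, 0.1, 0.1, 0.1, 0.98])
suspects = top_changed_elements(before, after, k=1)
print(suspects)
```

When the vectors come from a hashed matrix, each index is a hash bucket rather than a single host, so a flagged index narrows the search to the hosts that map to that bucket.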

Evaluation with synthesized anomaly cluster
• Which types of anomaly, and how large an anomaly, can be detected by the proposed method?
• Evaluation using synthesized anomalies can answer this question
• First, mesh clusters of various sizes are inserted into the actual communication graph, and the similarity to the original graph is calculated
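The experiment above can be imitated on synthetic data. The sketch below substitutes a small random undirected graph for the actual traffic graph, which is an assumption made only so the example is self-contained; the graph size, edge probability, mesh sizes, and helper names are likewise hypothetical, and the resulting numbers are not comparable to the slides' results.

```python
import numpy as np

def principal_eigenvector(a):
    """Unit-norm eigenvector of the largest eigenvalue of a symmetric matrix."""
    _, vectors = np.linalg.eigh(a)
    v = vectors[:, -1]
    return v if v.sum() >= 0 else -v

def insert_mesh(adjacency, nodes):
    """Return a copy of the graph with a full mesh added among the given nodes."""
    a = adjacency.copy()
    idx = np.asarray(nodes, dtype=int)
    a[np.ix_(idx, idx)] = 1.0   # connect every pair in the cluster
    a[idx, idx] = 0.0           # no self-loops
    return a

def similarity(u, v):
    """Cosine similarity of two unit-norm vectors, ignoring sign."""
    return float(abs(u @ v))

# Stand-in for the actual communication graph: a sparse random undirected graph.
rng = np.random.default_rng(0)
n = 200
upper = np.triu(rng.random((n, n)) < 0.02, 1)
base = (upper | upper.T).astype(float)

base_vec = principal_eigenvector(base)
sims = {}
for k in (0, 10, 40):
    meshed = insert_mesh(base, range(k))
    sims[k] = similarity(base_vec, principal_eigenvector(meshed))
print(sims)
```

A large enough mesh dominates the spectrum, so the principal eigenvector concentrates on the mesh nodes and the similarity to the original graph drops.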

Evaluation with synthesized anomaly cluster
• With mesh size > 70, similarity decreases and the anomaly can be found
[Figure: similarity (y-axis, 0 to 1.2) versus number of mesh nodes (x-axis, 0 to 120) for degvec and eigvec]

Conclusion
• Summary
  – Propose a method to detect anomalies in communication graphs
    • Projection of graphs into linear feature spaces, and comparison of the similarities between feature vectors
  – Evaluate using actual traffic data
    • Found a sudden large worm scan
• Future work
  – Apply to other traffic data to find out which types of anomaly the proposed method can detect
  – Faster implementation

Acknowledgement
• This study was supported in part by the Ministry of Internal Affairs and Communications of Japan.
