Detecting distributed attacks using distributed processing frameworks


SLIDE 1

Detecting distributed attacks using distributed processing frameworks

RP2 #59 Sudesh Jethoe

SLIDE 2

Overview

  • Introduction
  • Problem Description
  • Research Questions
  • Method
  • Results
  • Conclusion
SLIDE 3

Introduction

http://www.eweek.com/security/slideshows/verisign-sees-sharp-climb-in-ddos-attack-volume-in-q2.html/

SLIDE 4

Overview

  • Introduction
  • Problem Description
  • Research Questions
  • Method
  • Results
  • Conclusion
SLIDE 5

Problem Description

  • Analysis of large volumes of network traffic data takes time
  • A lot of time
  • Can we make it faster?
SLIDE 6

Solution?

SLIDE 7

Overview

  • Introduction
  • Problem Description
  • Research Questions
  • Method
  • Results
  • Conclusion
SLIDE 8

Research Questions

Main research question:

  • How can a distributed processing framework be utilized to identify network anomalies in historical netflow data?

Sub questions:

  • Which processing framework is best suited for identifying DDOS attacks?
  • How can we distinguish anomalies in netflow data?
  • Which algorithms for detecting network anomalies exist and how can they be applied in a distributed processing environment?

SLIDE 9

Overview

  • Introduction
  • Problem Description
  • Research Questions
  • Method
  • Results
  • Conclusion
SLIDE 10

Method

1) Review distributed processing frameworks
2) Create application for distributed processing framework
3) Implement DDOS-algorithm in application

SLIDE 11

Distributed processing frameworks

SLIDE 12

Distributed processing frameworks

SLIDE 13

Distributed processing frameworks

  • Hive

– Limited to querying datasets

  • Pig

– Extend queries with scripting and ML

  • Spark

– Extract data, transform, query; extendable with Python

SLIDE 14

Method

1) Review distributed processing frameworks
2) Create application for distributed processing framework
3) Implement DDOS-algorithm in application

SLIDE 15

Implementing Spark

  • Cluster

– 26 nodes
– 2x2TB disks
– AMD Opteron 3vCPU
– 1GB/s ethernet

  • Dataset

Router   Dataset Size
1        83,4 MiB
2        126,7 MiB
3        1,1 GiB
4        3,1 GiB
5        10 GiB
6        41,5 GiB
7        88,2 GiB
8        99,3 GiB
9        296,4 GiB
10       444,4 GiB

SLIDE 16

Implementing Spark

  • 3 methods

– Traditional
– Parallelised
– Single MapReduce

SLIDE 17

Implementing Spark

  • Traditional

1) Retrieve unique intervals
2) Partition the data by interval
3) For each interval, create counts of packets for each found socket

  • Result

> 1,5 hours / 84,4 MiB

SLIDE 18

Implementing Spark

  • Parallelised

1) Retrieve unique intervals
2) Partition the data by interval
3) Parallel: for each interval, create counts of packets for each found socket

  • Result

~ 10 mins / 126,7 MiB
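
The three steps above can be sketched in plain Python. This is only an illustration, not the code used in the project: the record layout `(interval, socket, packets)`, the function names, and the thread pool standing in for the cluster are all assumptions.

```python
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical flow records: (interval, socket, packets).
FLOWS = [
    ("10:00", "10.0.0.1:80", 3),
    ("10:00", "10.0.0.1:80", 2),
    ("10:00", "10.0.0.2:53", 1),
    ("11:00", "10.0.0.1:80", 7),
]


def partition_by_interval(flows):
    """Steps 1 and 2: find the intervals and group the records under them."""
    parts = defaultdict(list)
    for interval, socket, packets in flows:
        parts[interval].append((socket, packets))
    return parts


def count_packets(records):
    """Step 3: sum the packets per socket within one interval."""
    counts = Counter()
    for socket, packets in records:
        counts[socket] += packets
    return dict(counts)


def counts_per_interval(flows, workers=4):
    """Run step 3 for every interval in parallel (thread pool as a stand-in)."""
    parts = partition_by_interval(flows)
    intervals = sorted(parts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(count_packets, (parts[i] for i in intervals))
        return dict(zip(intervals, results))
```

Replacing the pool with a plain loop over the intervals gives the traditional variant; the data is still partitioned before any counting starts.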

SLIDE 19

Implementing Spark

  • Single MapReduce

1) Initialize cluster
2) Read network traffic data from HDFS
3) Apply map/reduce to get flow counts for “dest IP:port:protocol:hour”
4) Filter out all counts < #threshold
5) Group results by “port:protocol”
6) Filter out all combinations < #min results
7) Normalize results by “port:protocol”
8) Plot all hits for remaining “port:protocol” combinations
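
Steps 3 to 7 can be sketched without a cluster. The version below is a standalone approximation in plain Python, not the Spark application itself: the record layout, the values of `threshold` and `min_results`, and normalizing by the largest count in each group are assumptions. In Spark, the counting step would typically be a `map` to key/value pairs followed by `reduceByKey`.

```python
from collections import Counter, defaultdict

# Hypothetical flow records: (dest_ip, port, protocol, hour).
FLOWS = (
    [("10.0.0.1", 80, "tcp", 13)] * 50
    + [("10.0.0.1", 80, "tcp", 14)] * 20
    + [("10.0.0.2", 53, "udp", 13)] * 3
)


def flow_counts(flows):
    """Step 3: use each record as the key and reduce by counting."""
    return Counter(flows)  # key: (dest_ip, port, protocol, hour)


def detect(flows, threshold=10, min_results=2):
    # Step 4: drop keys whose flow count is below the threshold.
    hits = {k: n for k, n in flow_counts(flows).items() if n >= threshold}

    # Step 5: group the remaining counts by (port, protocol).
    grouped = defaultdict(list)
    for (dest_ip, port, protocol, hour), n in hits.items():
        grouped[(port, protocol)].append((dest_ip, hour, n))

    # Step 6: drop combinations with fewer than min_results entries.
    grouped = {k: v for k, v in grouped.items() if len(v) >= min_results}

    # Step 7: normalize each group by its largest count (an assumed choice).
    normalized = {}
    for key, rows in grouped.items():
        peak = max(n for _, _, n in rows)
        normalized[key] = [(ip, hour, n / peak) for ip, hour, n in rows]
    return normalized
```

The data is read once and every later step works on the much smaller table of counts. Step 8 (plotting) is left out.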

SLIDE 20

Implementing Spark

  • Results

Dataset Size (GiB)   Execution Time (seconds)   Rate (MiB/second)
0,128                28                         4,57
1,1                  45,6                       4,07
99,3                 430,4                      231
444,4                /                          /

SLIDE 21

Results (126,7 MiB)

SLIDE 22

Results (126,7 MiB)

SLIDE 23

Results (88,2 GiB)

SLIDE 24

Results (10,0 GiB)

SLIDE 25

Method

1) Review distributed processing frameworks
2) Create application for distributed processing framework
3) Implement DDOS-algorithm in application

SLIDE 26

Implement DDOS-algorithm in application

  • Weighted Moving Average

$$\hat{x}_{i+1} = y\,x_i + (1 - y)\,\hat{x}_i$$

$\hat{x}_i$: estimate of $x$

$x_i$: current value of $x$

$y$: smoothing factor
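
A small Python sketch of this recurrence follows. Seeding the first estimate with the first observed value and the function names are assumptions, not something the slides specify.

```python
def next_estimate(estimate, x, y):
    """One step of the recurrence: y * x_i + (1 - y) * estimate_i."""
    return y * x + (1.0 - y) * estimate


def estimates(values, y):
    """Return all estimates for a series.

    Entry i is the estimate for values[i]; the extra last entry is the
    forecast for the next, not yet observed, value.
    """
    if not 0.0 <= y <= 1.0:
        raise ValueError("smoothing factor y must lie in [0, 1]")
    est = [values[0]]  # assumption: seed with the first observation
    for x in values:
        est.append(next_estimate(est[-1], x, y))
    return est
```

For example, `estimates([10, 20, 20], 0.5)` gives `[10, 10.0, 15.0, 17.5]`: a larger `y` makes the estimate follow the latest value more closely.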

SLIDE 27

Implement DDOS-algorithm in application

  • Adaptive threshold

– Uses weighted average
– Threshold: multiple of the expected value of the average

alert if $x_i > \text{threshold} \cdot \hat{x}_i$
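
The alert rule can be sketched as follows. The default values for `y` and `threshold` and the seeding of the estimate with the first value are assumptions chosen for illustration.

```python
def adaptive_alerts(values, y=0.5, threshold=2.0):
    """Return the indices i where values[i] > threshold * estimate_i."""
    estimate = values[0]  # assumption: seed with the first observation
    alerts = []
    for i, x in enumerate(values):
        # Compare against the estimate made before x was seen.
        if x > threshold * estimate:
            alerts.append(i)
        # Weighted moving average update for the next comparison.
        estimate = y * x + (1.0 - y) * estimate
    return alerts
```

With `adaptive_alerts([10, 12, 11, 40, 12])` only index 3 is reported. Note that in this sketch the spike itself also raises the estimate used for the following values.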

SLIDE 28

Implement DDOS-algorithm in application

  • Exponential Weighted Moving Average (EWMA)
  • Threshold

Gap = 0, AVG = X0, Max_Gap = #
If Xi < AVG:
    update(AVG, Xi)
If Xi > AVG:
    Alert()
If Gap >= Max_Gap:
    Gap = 0
    update(AVG, Xi)
Gap += 1
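
One possible reading of this pseudocode in Python is given below. It assumes that `update` is the EWMA step with smoothing factor `y`, that the checks run once per value in the order shown, and that `Max_Gap` is a tunable number; none of this is confirmed by the slides.

```python
def ewma_gap_alerts(values, y=0.5, max_gap=3):
    """Return (alert indices, final average) for a series of counts."""
    avg = values[0]  # AVG = X0
    gap = 0
    alerts = []
    for i, x in enumerate(values[1:], start=1):
        if x < avg:
            # Values below the average refine the baseline.
            avg = y * x + (1.0 - y) * avg
        elif x > avg:
            # Values above the average alert and leave the baseline alone.
            alerts.append(i)
        if gap >= max_gap:
            # Assumed intent: after max_gap values, force an update so the
            # baseline can follow a lasting change in traffic level.
            gap = 0
            avg = y * x + (1.0 - y) * avg
        gap += 1
    return alerts, avg
```

For `ewma_gap_alerts([10, 8, 20, 20, 20, 20])` this reading alerts on indices 2 to 5 and ends with an average of 14.5, because the forced update at index 4 pulls the baseline towards the new level.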

SLIDE 29

Overview

  • Introduction
  • Problem Description
  • Research Questions
  • Method
  • Results
  • Conclusion
SLIDE 30

Results (training 126,7MiB)

SLIDE 31

Results (training 126,7MiB)

SLIDE 32

Results (84,3MiB)

SLIDE 33

Results (88,2 GiB)

SLIDE 34

Results (88,2 GiB)

SLIDE 35

Overview

  • Introduction
  • Problem Description
  • Research Questions
  • Method
  • Results
  • Conclusion
SLIDE 36

Conclusion

  • ~ 100 GiB processed in < 10 minutes
  • Traffic from different routers requires different parameters
  • Traffic patterns differ per router and service
SLIDE 37

Future work

  • Optimize framework to handle datasets > 100 GiB

  • Test other algorithms on framework
  • Apply tuned algorithms to live data
  • Identify usage of irregular ports
SLIDE 38

Questions

  • ?