Combating Snowshoe Spam with Fire, Olivier van der Toorn


slide-1
SLIDE 1

Combating Snowshoe Spam with Fire

Olivier van der Toorn <o.i.vandertoorn@utwente.nl> November 13, 2018

University of Twente, Design and Analysis of Communication Systems ICT OPEN 2018

slide-2
SLIDE 2

Overview

  • Introduction
  • Methodology
  • Results
  • Conclusions

slide-3
SLIDE 3

Introduction

slide-4
SLIDE 4

Background Info

  • Active DNS Measurements
  • Snowshoe Spam

slide-7
SLIDE 7

Starting Point

  • Snowshoe spam is hard to detect
  • Sender Policy Framework (SPF)
  • DNS domain

slide-10
SLIDE 10

Research Question

How can we detect snowshoe spam through active DNS measurements?

slide-11
SLIDE 11

Methodology

slide-12
SLIDE 12

Methodology

slide-13
SLIDE 13

OpenINTEL

  • Active DNS Measurement Platform
  • Queries more than 60% of registered domain names

slide-15
SLIDE 15

Datasets & Features

  • Two types of datasets
      • Labeled
      • Unlabeled
  • 37 Features
  • Long Tail Analysis

slide-18
SLIDE 18

Long Tail Analysis

The long tail of the DNS

slide-20
SLIDE 20

Machine Learning

  • Trained and evaluated many classifier algorithms
  • Ranked performance based on ‘precision’ metric

slide-22
SLIDE 22

Precision

$$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$$
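The definition above can be written as a small helper. This is a minimal sketch: the function name and the example counts are hypothetical and are not taken from the talk.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of domains flagged as spam that really are spam."""
    flagged = true_positives + false_positives
    if flagged == 0:
        # Nothing was flagged, so the ratio has no meaning.
        raise ValueError("precision is undefined when nothing is flagged")
    return true_positives / flagged


# Hypothetical counts, for illustration only.
example = precision(true_positives=90, false_positives=10)
```

A classifier with high precision rarely flags a legitimate domain, which matters when the output feeds a blacklist.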

slide-23
SLIDE 23

Machine Learning

  • Trained and evaluated many classifier algorithms
  • Ranked performance based on ‘precision’ metric
  • Selected the AdaBoost classifier as the classifier of choice
    (110 false positives out of 10851 ham domains)
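The two numbers in parentheses give a false positive rate on the ham set. The sketch below only does that arithmetic; the variable names are illustrative. The slide does not report the number of true positives, so precision itself cannot be recomputed from it.

```python
FALSE_POSITIVES = 110  # ham domains wrongly flagged (from the slide)
HAM_DOMAINS = 10851    # ham domains evaluated (from the slide)

# Share of legitimate domains that the classifier flagged as spam.
false_positive_rate = FALSE_POSITIVES / HAM_DOMAINS

print(f"{false_positive_rate:.2%}")  # prints 1.01%
```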

slide-24
SLIDE 24

Realtime Blackhole List

  • DNS-based way of hosting a blacklist
  • Daily detections
  • Compared to other blacklists
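As a rough illustration of how a DNS-based blacklist is queried: by the usual convention, a client appends the blacklist zone to the domain it wants to check and looks up an A record; an answer means "listed", no answer means "not listed". The zone name `rbl.example.org` and both function names below are hypothetical, and the talk does not describe its own RBL setup at this level of detail.

```python
import socket


def rbl_query_name(domain: str, zone: str) -> str:
    """Build the DNS name to look up for `domain` in a domain-based blacklist zone."""
    return f"{domain.rstrip('.').lower()}.{zone.strip('.').lower()}"


def is_listed(domain: str, zone: str) -> bool:
    """Return True if the zone answers with an address for the domain.

    Needs network access and a real blacklist zone to be meaningful.
    """
    try:
        socket.gethostbyname(rbl_query_name(domain, zone))
    except socket.gaierror:
        # No usable answer (for example NXDOMAIN): treat as not listed.
        return False
    return True


# Hypothetical zone, shown only to illustrate the name construction.
name = rbl_query_name("Example.COM.", "rbl.example.org")
```

Because the list is just a DNS zone, mail filters can consult it per message without downloading the whole blacklist.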

slide-27
SLIDE 27

SURF

  • SURFmailfilter
  • Initially in evaluation mode

slide-29
SLIDE 29

Results

slide-30
SLIDE 30

Comparison training data

[Figure: CDF of the number of A records, blacklisted vs. benign domains; x-axis 10 to 50, y-axis 40% to 100%; annotated values 11.2 and 16.6]

slide-31
SLIDE 31

Comparison training data

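[Figure: CDF of the number of A records, blacklisted vs. benign domains; x-axis 10 to 50, y-axis 40% to 100%; annotated values 11.2 and 16.6]

[Figure: CDF of the number of MX records, blacklisted vs. benign domains; x-axis 20 to 100, y-axis 90% to 100%; annotated value 77.0]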
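For readers unfamiliar with the plot type, an empirical CDF gives, for each record count, the fraction of domains with at most that many records. The sketch below uses made-up counts: the two lists are hypothetical, are not the measured data, and say nothing about which group really has more records.

```python
from bisect import bisect_right


def ecdf(values, x):
    """Empirical CDF: fraction of observations less than or equal to x."""
    ordered = sorted(values)
    return bisect_right(ordered, x) / len(ordered)


# Hypothetical record counts per domain, NOT the data behind the figures.
benign = [1, 1, 2, 2, 4]
blacklisted = [1, 8, 16, 32, 48]

for threshold in (2, 16):
    print(threshold, ecdf(benign, threshold), ecdf(blacklisted, threshold))
```

Comparing the two curves at the same threshold shows how well a single feature separates the labeled groups.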

slide-38
SLIDE 38

Early Detection

[Figure: number of detected domains (ticks from 1 to 100000 in powers of ten) against detection in advance (days, 10 to 80); annotated values: 28984, 1961, 1144, 1095, 968, 928]
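The x-axis quantity, detection in advance, can be read as the number of days between the classifier's detection of a domain and its appearance on an existing blacklist. A minimal sketch under that reading follows; the function name and the two dates are illustrative assumptions, not measurements from the talk.

```python
from datetime import date


def days_in_advance(detected_on: date, blacklisted_on: date) -> int:
    """Days between our detection and the later blacklisting of a domain."""
    return (blacklisted_on - detected_on).days


# Illustrative dates only.
lead = days_in_advance(date(2017, 5, 24), date(2017, 6, 23))
print(lead)  # prints 30
```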

slide-39
SLIDE 39

Early Detection (update)

slide-42
SLIDE 42

SURF Results

[Figure: domain names (daadzgam.com, realdrippy.com, coachspoke.com, stillscratch.com, homerope.com, quittradition.com) against observation dates (2017-05-24, 2017-06-23, 2017-07-23); legend: Blacklisted, Detected]

slide-43
SLIDE 43

SURF Results

  • 1080 emails
  • 447 (41.39%) emails with a score of five or higher

slide-45
SLIDE 45

SURF Results

  • 633 (58.61%) emails have a score below five
  • 52 unique domains in the body
      • of which 13 domains have never appeared in an email classified as spam
      • these 13 domains appeared in 31 emails (2.87%)
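The percentages on the SURF Results slides all use the 1080 emails as the denominator. The short check below reproduces them; the dictionary labels are paraphrases chosen for this sketch.

```python
TOTAL_EMAILS = 1080  # from the SURF Results slides

counts = {
    "score of five or higher": 447,
    "score below five": 633,
    "emails containing the 13 domains": 31,
}

# Percentage of all emails, rounded to two decimals as on the slides.
shares = {label: round(100 * n / TOTAL_EMAILS, 2) for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share}%")
```

The first two groups are complementary: 447 + 633 = 1080.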

slide-49
SLIDE 49

Additional email blocked

[Figure: emails marked as spam (axis 100 to 700) against additional score of the RBL (0.0 to 5.0 in steps of 0.5); annotated values: 22, 120, 320, 335, 352, 441, 497, 554, 626, 629]
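One way to read this plot: each email already has a spam score, a hit on the RBL adds a fixed extra score, and the y-axis counts emails that were below the threshold of five but reach it after the addition. The sketch below implements that reading. It is an assumption about how the plot was produced, and the base scores are invented for illustration.

```python
SPAM_THRESHOLD = 5.0  # "score of five or higher" on the SURF Results slide


def newly_marked(base_scores, rbl_score):
    """Count emails below the threshold that reach it once rbl_score is added."""
    return sum(
        1 for score in base_scores
        if score < SPAM_THRESHOLD <= score + rbl_score
    )


# Hypothetical base scores of emails that hit the RBL, NOT the SURF data.
scores = [0.5, 2.0, 3.5, 4.5, 4.9]

for extra in (0.5, 1.5, 5.0):
    print(extra, newly_marked(scores, extra))
```

A larger additional score catches more email but also puts more weight on a single signal, which is the trade-off the plot lets an operator inspect.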

slide-50
SLIDE 50

Additional email blocked (update)

slide-51
SLIDE 51

Conclusions

slide-52
SLIDE 52

Conclusions

  • Hard-to-detect spam is detectable
  • Early detection
  • Additional spam blocked

slide-55
SLIDE 55

Questions
