Vulnerability Prediction Models: A Case Study on the Linux Kernel (PowerPoint Presentation)




SLIDE 1
Slides: Matthieu Jimenez; Theme: Sébastien Mosser

Matthieu Jimenez Mike Papadakis Yves Le Traon

Vulnerability Prediction Models: A case study on the Linux Kernel

Jimenez et al. “Vulnerability Prediction Models: A Case Study on the Linux Kernel” SCAM’16

SLIDE 2

Vulnerabilities?

SLIDE 3

A vulnerability

“An information security ‘vulnerability’ is a mistake in software that can be directly used by a hacker to gain access to a system or network.” (CVE website)

SLIDE 4

Vulnerabilities are special

  • More important, often critical
  • Rarer: there are more bugs than vulnerabilities
  • Uncovered differently: defects can be easily noticed, while vulnerabilities cannot

SLIDE 5

Vulnerabilities are

A web server used to remotely control a glassware-cleaning machine…

…and there is a CVE for that.

SLIDE 6

Prediction Model?

SLIDE 7

Prediction Models

Models analysing current and historical events to make predictions about future and/or unknown events.

SLIDE 8

Vulnerability Prediction Model?

SLIDE 9

Vulnerability Prediction

Take advantage of the knowledge of some parts of a software system and/or previous releases…

SLIDE 10

Vulnerability Prediction

…to automatically classify software entities as vulnerable or not.

SLIDE 11

Software Entities?

SLIDE 12

Granularity

It is possible to work at:

  • module level
  • file level
  • function level
SLIDE 13

In this work, we stay at the file level!

*Morrison et al. “Challenges with applying vulnerability prediction models,” in HotSoS’15.

SLIDE 14

GOAL

SLIDE 15

Replicating and comparing the main VPM approaches on the same software system.

SLIDE 16

Replication …

SLIDE 17

Exact & independent replication

SLIDE 18

Exact replication

The procedures of an experiment are followed as closely as possible, e.g. here we replicate using the same machine-learning settings.

SLIDE 19

Independent replication

Deliberately vary one or more major aspects of the conditions of the experiment, e.g. we use our own dataset.

SLIDE 20

Approaches …

SLIDE 21

#Include and f(n) calls

SLIDE 22

Include & Function calls


Introduced by Neuhaus et al. at CCS’07

SLIDE 23

Include & Function calls

Introduced by Neuhaus et al. at CCS’07. Intuition: vulnerable files share a similar set of imports and function calls.

SLIDE 24

Include & Function calls

Introduced by Neuhaus et al. at CCS’07.

Intuition: vulnerable files share a similar set of imports and function calls.

→ build a model based on either the includes or the function calls of a file.

SLIDE 25

Overview

Include & function calls:
  Preprocessing: retrieve all includes and function calls of a file
  Learning: SVM with a linear kernel

Two models are built: one from the includes, one from the function calls.
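As a rough sketch of this preprocessing step, one could pull the two feature sets out of a C file with a couple of regexes. This is an illustrative approximation only, not the study's actual tooling; `extract_features`, the patterns, and the keyword list are all assumptions of this sketch.

```python
import re

# Illustrative approximation of the preprocessing (not the study's tooling):
# collect the headers a C file includes and the names that appear in call
# position. A plain regex also catches definitions such as `f` below and
# misses macros, which is acceptable for a sketch.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)
CALL_RE = re.compile(r'\b([A-Za-z_]\w*)\s*\(')
C_KEYWORDS = {"if", "for", "while", "switch", "return", "sizeof"}

def extract_features(source: str):
    """Return (includes, called_names) for one source file."""
    includes = set(INCLUDE_RE.findall(source))
    calls = {n for n in CALL_RE.findall(source) if n not in C_KEYWORDS}
    return includes, calls

code = '''#include <linux/slab.h>
#include "internal.h"
void f(void) {
    if (ptr) ptr = kmalloc(8, flags);
    kfree(ptr);
}'''
includes, calls = extract_features(code)
# includes -> {"linux/slab.h", "internal.h"}; calls contains kmalloc, kfree
```

Each file's two sets would then be encoded as binary feature vectors and fed to the SVM with a linear kernel mentioned above.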

SLIDE 26

Software Metrics

SLIDE 27

Software Metrics


Several works on using metrics to predict vulnerabilities, mostly by Shin et al.

SLIDE 28

Software Metrics

Several works on using metrics to predict vulnerabilities, mostly by Shin et al. Software metrics are already used in defect prediction.

→ build a model based on software metrics (complexity, code churn, …).

SLIDE 29

Overview

Software metrics:
  Preprocessing: compute the complexity metrics of each function (keeping sum, avg and max), plus the code churn and the number of authors of every file
  Learning: logistic regression
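To make the feature side concrete, here is a minimal, hypothetical stand-in for two file-level metrics. The real pipeline computes per-function complexity and history-based churn/author metrics, which require access to the version history; `file_metrics` and its decision-point regex are assumptions of this sketch, not the study's metric suite.

```python
import re

# Hypothetical file-level proxies for two metric families used by the
# approach: size and a rough cyclomatic-complexity proxy counted as
# decision points (branch keywords plus short-circuit operators).
DECISION_RE = re.compile(r'\b(?:if|for|while|case)\b|&&|\|\|')

def file_metrics(source: str) -> dict:
    non_blank = [ln for ln in source.splitlines() if ln.strip()]
    return {
        "sloc": len(non_blank),                         # non-blank source lines
        "decisions": len(DECISION_RE.findall(source)),  # complexity proxy
    }

m = file_metrics("int f(int a){\n  if (a > 0 && a < 9)\n    return a;\n  return 0;\n}")
# m -> {"sloc": 5, "decisions": 2}
```

Vectors like these, one per file, are what the logistic regression classifier above would be trained on.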

SLIDE 30

Text Mining

SLIDE 31

Text Mining

Suggested by Scandariato et al. in 2014.
SLIDE 32

Text Mining

Suggested by Scandariato et al. in 2014.

Aim: building a model requiring no human intuition for feature selection.

SLIDE 33

Text Mining

Suggested by Scandariato et al. in 2014.

Aim: building a model requiring no human intuition for feature selection.

→ build a model based on a bag of words extracted from a file.

SLIDE 34

Overview

Text mining:
  Preprocessing:
  • create a bag of words (splitting the code according to the language grammar) for every file
  • discretise the features (making them boolean)
  • remove all features considered useless
  Learning: Random Forest with 100 trees
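A minimal sketch of this preprocessing, assuming a simple identifier tokenizer (the approach splits according to the full language grammar, and the feature-pruning step is elided here; `bag_of_words`, `discretise` and the example vocabulary are our own names):

```python
import re
from collections import Counter

# Sketch of the text-mining features: tokenize a source file, count the
# tokens, then discretise each count into a boolean "token occurs" feature.
# TOKEN_RE is a simplification of the grammar-based splitting actually used.
TOKEN_RE = re.compile(r'[A-Za-z_]\w*')

def bag_of_words(source: str) -> Counter:
    return Counter(TOKEN_RE.findall(source))

def discretise(bag: Counter, vocabulary: list) -> list:
    # one boolean per vocabulary token: does the file contain it at all?
    return [bag[tok] > 0 for tok in vocabulary]

bag = bag_of_words("if (copy_from_user(dst, src, n)) return -EFAULT;")
vocab = ["copy_from_user", "kmalloc", "return"]
features = discretise(bag, vocab)   # -> [True, False, True]
```

Each file's boolean vector is then used to train the Random Forest with 100 trees named on the slide.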

SLIDE 35

Dataset

SLIDE 36

Introducing the dataset

Based on commits, not releases.

SLIDE 37

Introducing the dataset

  • the CVE-NVD database as a source of vulnerabilities
  • Bugzilla as a source of bugs
SLIDE 38

Introducing the dataset

  • built automatically with the latest data available on the Linux Kernel
SLIDE 39

Overall dataset statistics

  • 1,640 vulnerable files, accounting for 743 vulnerabilities
  • 4,900 buggy files related to 3,400 bug reports
  • more than 50,000 files in total

Period covered: 2006 - June 2016

SLIDE 40

Research Questions

  • RQ1. Can we distinguish between buggy and vulnerable files?

SLIDE 41

Research Questions

  • RQ1. Can we distinguish between buggy and vulnerable files?
  • RQ2. Can we distinguish between vulnerable and non-vulnerable files?

SLIDE 42

Research Questions

  • RQ1. Can we distinguish between buggy and vulnerable files?
  • RQ2. Can we distinguish between vulnerable and non-vulnerable files?
  • RQ3. Can we predict future vulnerable files when using past data?

SLIDE 43

Research Questions

  • RQ1. Can we distinguish between buggy and vulnerable files?
  • RQ2. Can we distinguish between vulnerable and non-vulnerable files?
  • RQ3. Can we predict future vulnerable files when using past data?

✦ Distinguish between buggy and vulnerable files
✦ Distinguish between vulnerable and non-vulnerable files

SLIDE 44

Experimental Dataset

*Buggy vs Vulnerable files

SLIDE 45

Experimental dataset

Can we distinguish between buggy and vulnerable files?

  • files related to bug-report patches vs files from vulnerability patches
  • ratio 3.3 : 1
SLIDE 46

Realistic Dataset

*Vulnerable vs Non-Vulnerable files

SLIDE 47

Realistic dataset

  • Can we distinguish between vulnerable and non-vulnerable files?
  • Reproduce the observed ratio between different categories of files:
    • 3% of (likely) vulnerable files
    • 47% of (likely) buggy files
    • 50% of clear files
SLIDE 48

Evaluation

SLIDE 49

RQ1 - Bugs vs Vulnerabilities

[Boxplot: MCC (0.0-1.0) for Function Calls, Includes, Software Metrics and Text Mining]
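The evaluation metric plotted here is the Matthews Correlation Coefficient (MCC). As a quick reference, it can be computed from a confusion matrix with the standard formula (this helper is our own illustration, not code from the study):

```python
import math

# Matthews Correlation Coefficient for a binary classifier: +1 is perfect
# prediction, 0 is no better than random, -1 is total disagreement.
def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a marginal is empty
    return (tp * tn - fp * fn) / denom

perfect = mcc(tp=10, tn=90, fp=0, fn=0)   # -> 1.0
```

MCC is a sensible choice here because, unlike accuracy, it stays informative on heavily unbalanced data such as the realistic dataset (3% vulnerable files).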
SLIDE 50

RQ2 - Vulnerable vs Non-Vulnerable

[Boxplot: MCC (0.0-1.0) for Function Calls, Includes, Software Metrics and Text Mining]
SLIDE 51

RQ3 Time - Bugs vs Vulnerabilities

[Line plots: precision and recall over releases]

SLIDE 52

RQ3 Time - Bugs vs Vulnerabilities

[Line plot: MCC (0.00-1.00) over releases 5-20 for Function Calls, Includes, Software Metrics and Text Mining]

SLIDE 53

RQ3 Time - Vulnerable vs Non-Vulnerable

[Line plots: precision and recall (0.00-1.00) over releases 5-20]

SLIDE 54

RQ3 Time - Vulnerable vs Non-Vulnerable

[Line plot: MCC (0.00-1.00) over releases 5-20 for Function Calls, Includes, Software Metrics and Text Mining]

SLIDE 55

Discussion - Findings

SLIDE 56

1. VPMs work well with historical data

SLIDE 57

2. Good precision is observed even with unbalanced data

SLIDE 58

3. In the practical case, the best trade-off favours includes and function calls

SLIDE 59

4. In the general case, or when favouring precision, the best approach is text mining

SLIDE 60

Previous studies

Include and Function calls

Neuhaus et al. “Predicting vulnerable software components” CCS’07.

Reported: Precision 70%, Recall 45%
We found: Precision 70%, Recall 64%

There is no comparison with Metrics or Text Mining, and there are no results related to time.

In the context of Linux we have similar results…

SLIDE 61

Previous studies

Software Metrics

Shin et al. “Evaluating Complexity, Code Churn, and Developer Activity Metrics as Indicators of Software Vulnerabilities” TSE’11.
Shin et al. “Can traditional fault prediction models be used for vulnerability prediction?” ESE’13.
Walden et al. “Predicting Vulnerable Components: Software Metrics vs Text Mining” ISSRE’14.

10-fold cross-validation:
  Reported: Precision 3-5, 9, 2-52%; Recall 87-90, 91, 66-79%
  We found: Precision 65%; Recall 22%

Results based on time:
  Reported: Precision 3%; Recall 79-85%
  We found: Precision 42 : 39%; Recall 16 : 24%

In the context of Linux there are significant differences…

SLIDE 62

Previous studies

Text Mining

Scandariato et al. “Predicting Vulnerable Software Components via Text Mining” TSE’14.
Walden et al. “Predicting Vulnerable Components: Software Metrics vs Text Mining” ISSRE’14.

10-fold cross-validation:
  Reported: Precision 90, 2-57%; Recall 77, 74-81%
  We found: Precision 76%; Recall 58%

Results based on time:
  Reported: Precision 86%; Recall 77%
  We found: Precision 74 : 93%; Recall 37 : 27%

In the context of Linux there are again significant differences…

SLIDE 63

The dataset, replication package and additional results will be available soon…

Please contact Matthieu Jimenez (Matthieu.Jimenez@uni.lu).

SLIDE 64

Thank you for your attention!