SLIDE 1

Systems and Internet Infrastructure Security Laboratory (SIIS)

CMPSC 497: Static Analysis

Trent Jaeger
Systems and Internet Infrastructure Security (SIIS) Lab
Computer Science and Engineering Department
Pennsylvania State University

SLIDE 2

Our Goal

  • In this course, we want to develop techniques to automatically detect vulnerabilities before they are exploited
  • What is a vulnerability?
  • How do we find them?

SLIDE 3

Static Analysis

  • Provides an approximation of behavior
  • “Run in the aggregate”
      • Rather than executing on ordinary states
      • Use finite-sized descriptors, each representing a collection of states
  • “Run in a non-standard way”
      • Run in fragments
      • Stitch them together to cover all paths
  • Runtime testing is inherently incomplete, but static analysis can cover all paths
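The idea of running "in the aggregate" over finite-sized descriptors can be illustrated with a sign domain, in which four descriptors stand in for all integers. This is a minimal sketch, not material from the lecture: the `Sign` type and the helper names are assumptions chosen for illustration, and integer overflow is ignored.

```c
/* Each descriptor stands for a whole set of concrete integers. */
typedef enum { NEG, ZERO, POS, TOP } Sign;   /* TOP = "could be anything" */

/* Abstraction: map one concrete value to its descriptor. */
Sign sign_of(int v) { return v < 0 ? NEG : (v == 0 ? ZERO : POS); }

/* Abstract multiplication: zero absorbs, otherwise compare the signs. */
Sign sign_mul(Sign a, Sign b) {
    if (a == ZERO || b == ZERO) return ZERO;
    if (a == TOP || b == TOP) return TOP;
    return (a == b) ? POS : NEG;
}

/* Abstract addition: must overapproximate when the signs disagree. */
Sign sign_add(Sign a, Sign b) {
    if (a == ZERO) return b;
    if (b == ZERO) return a;
    if (a == TOP || b == TOP) return TOP;
    return (a == b) ? a : TOP;
}

/* Join: merge descriptors where two paths meet. */
Sign sign_join(Sign a, Sign b) { return (a == b) ? a : TOP; }
```

For example, `sign_add(POS, NEG)` yields `TOP`: the analysis cannot tell the sign of the sum, so it keeps every possibility. That loss of precision is the price of covering all executions with a finite description.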

SLIDE 4

Static Analysis

  • A challenge is that static analysis is a bit of an art form
  • Which analysis technique do you use to answer which question?
  • That is not so easy

SLIDE 5

Static Analysis

  • A challenge is that static analysis is a bit of an art form
  • Why is it hard?
      • Rice’s Theorem (1953) states that all non-trivial questions about the semantic properties of programs written in a universal (Turing-complete) programming language are undecidable
      • Syntactic properties (e.g., does the program contain an if-then-else?) can be answered
      • But the questions we want to answer are often about semantic properties
  • Thus, static analysis uses approximate program models

SLIDE 6

Correctness

  • How does this impact proving that a program is correct?

SLIDE 7

Correctness

  • Soundness:
      • Predicted results must apply to every system execution
      • Overapproximate the effect of every program statement
      • Absolutely mandatory for trustworthiness of analysis results!
  • Completeness:
      • Behavior of every system execution is caught by the analysis
      • Prove that any true statement about the program is really true
      • Usually not guaranteed due to approximation
      • Degree of completeness determines quality of analysis
  • Correctness: Soundness ∧ Completeness (rare)

SLIDE 8

Soundness

  • Soundness:
      • All executions are represented
      • Implication 1: No false negatives, as the static analysis model represents all possible executions
      • However, it is unlikely that the model is an exact representation of the program semantics
      • Implication 2: A sound model is generally not complete
      • Implication 3: A sound static analysis will produce some false positives
      • The number of false positives determines the quality of the analysis

SLIDE 9

Static Analysis Approaches

  • A challenge is that static analysis is a bit of an art form
  • Which analysis technique do you use to answer which question?
  • How about for control flows, type-based analysis, and taint analysis?

SLIDE 10

Static Analysis Approaches

  • Control flow
      • Does a program execute one statement (e.g., security check) before another statement (e.g., security-sensitive operation)? Ordering of statements
  • Type-based analysis
      • Does a program use data lacking properties (e.g., security check) in a statement (e.g., security-sensitive operation)? Label data using types
  • Taint analysis
      • Does a program statement use a tainted value, and what is the impact of executing the statement on its variables?
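The control-flow question (does a security check always execute before a security-sensitive operation?) can be phrased as a graph question: can the operation be reached from the entry while avoiding the check? The following is a minimal sketch under assumed names (`Cfg`, `check_precedes_op`, and the adjacency-matrix encoding are illustrative choices, not taken from the lecture or from any tool).

```c
#include <string.h>

#define MAX_NODES 16

/* A tiny CFG: edge[i][j] != 0 means control can flow from node i to node j. */
typedef struct {
    int n;
    int edge[MAX_NODES][MAX_NODES];
} Cfg;

/* Depth-first search from `node` to `target` that never enters `blocked`. */
static int reach_avoiding(const Cfg *g, int node, int target, int blocked,
                          int *seen) {
    if (node == blocked) return 0;
    if (node == target) return 1;
    if (seen[node]) return 0;
    seen[node] = 1;
    for (int next = 0; next < g->n; next++)
        if (g->edge[node][next] &&
            reach_avoiding(g, next, target, blocked, seen))
            return 1;
    return 0;
}

/* Returns 1 if every path from `entry` to `op` passes through `check`. */
int check_precedes_op(const Cfg *g, int entry, int check, int op) {
    int seen[MAX_NODES];
    memset(seen, 0, sizeof seen);
    return !reach_avoiding(g, entry, op, check, seen);
}
```

With edges 0→1→3 only, node 1 (the check) precedes node 3 (the operation) on every path. Adding a bypass 0→2→3 makes `check_precedes_op` return 0, which is the ordering violation the slide describes.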

SLIDE 11

CFG Analysis

  • Does your program have a double free?
  • Can control flow analysis detect this?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf2R1);
        free(buf1R2);
    }

SLIDE 12

CFG Analysis

  • Does your program have a double free?
  • What does the CFG look like?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf2R1);
        free(buf1R2);
    }

SLIDE 13

CFG Analysis

  • Does your program have a double free?
  • What is the property of the CFG that indicates violation?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf2R1);
        free(buf1R2);
    }
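One candidate property is: some `free(v)` node can reach another `free(v)` node in the CFG. The sketch below checks that naive rule. The `Stmt` encoding (statement kind, variable name, up to two successors) and the function names are assumptions made for illustration; the lecture does not prescribe an implementation.

```c
#include <string.h>

#define MAX_STMTS 32

typedef enum { ST_MALLOC, ST_FREE, ST_OTHER } Kind;

/* One CFG node: what the statement does, to which variable, and up to
   two successor indices (-1 = none). */
typedef struct {
    Kind kind;
    const char *var;
    int succ[2];
} Stmt;

/* Is there a path of one or more edges from `from` to `to`? */
static int reaches(const Stmt *cfg, int from, int to, int *seen) {
    for (int i = 0; i < 2; i++) {
        int next = cfg[from].succ[i];
        if (next < 0 || seen[next]) continue;
        if (next == to) return 1;
        seen[next] = 1;
        if (reaches(cfg, next, to, seen)) return 1;
    }
    return 0;
}

/* Naive rule: report a double free if some free(v) can reach another
   free(v). Returns the index of the second free, or -1 if none is found.
   Assumes n <= MAX_STMTS. */
int find_double_free(const Stmt *cfg, int n) {
    for (int a = 0; a < n; a++) {
        if (cfg[a].kind != ST_FREE) continue;
        for (int b = 0; b < n; b++) {
            int seen[MAX_STMTS] = {0};
            if (cfg[b].kind == ST_FREE &&
                strcmp(cfg[a].var, cfg[b].var) == 0 &&
                reaches(cfg, a, b, seen))
                return b;
        }
    }
    return -1;
}
```

Encoding the eight statements of `foo` as a straight-line CFG, the rule flags the second `free(buf2R1)`. It compares variable names only, so it also flags code that reassigns the pointer between the two frees.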

SLIDE 14

CFG Analysis

  • Does your program have a double free?
  • Can we identify the exploitation in this analysis?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf2R1);
        free(buf1R2);
    }

SLIDE 15

CFG Analysis

  • Does your program have a double free?
  • What about this code?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);
        buf2R1 = (char *) malloc(BUFSIZE2);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf2R1);
        free(buf1R2);
    }

SLIDE 16

CFG Analysis

  • Does your program have a double free?
  • What about this code? False positive?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);
        buf2R1 = (char *) malloc(BUFSIZE2);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf2R1);
        free(buf1R2);
    }

SLIDE 17

CFG Analysis

  • Does your program have a double free?
  • How do we change the property to detect more accurately (with fewer false positives)?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);
        buf2R1 = (char *) malloc(BUFSIZE2);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf2R1);
        free(buf1R2);
    }
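One possible refinement is to require a path from `free(v)` to another `free(v)` on which `v` is not assigned a fresh allocation in between. The sketch below implements that rule; the `Stmt` encoding and function names are illustrative assumptions, and the rule still ignores aliases and calls that may modify the pointer.

```c
#include <string.h>

#define MAX_STMTS 32

typedef enum { ST_MALLOC, ST_FREE, ST_OTHER } Kind;

/* One CFG node: statement kind, affected variable, and up to two
   successor indices (-1 = none). */
typedef struct {
    Kind kind;
    const char *var;
    int succ[2];
} Stmt;

/* Search forward from `from` for a free of `var`, abandoning any path on
   which `var` receives a fresh allocation first. Returns the index of the
   offending free, or -1. */
static int next_free_without_redef(const Stmt *cfg, int from,
                                   const char *var, int *seen) {
    for (int i = 0; i < 2; i++) {
        int next = cfg[from].succ[i];
        if (next < 0 || seen[next]) continue;
        seen[next] = 1;
        int same_var = cfg[next].kind != ST_OTHER &&
                       strcmp(cfg[next].var, var) == 0;
        if (same_var && cfg[next].kind == ST_FREE) return next;
        if (same_var && cfg[next].kind == ST_MALLOC) continue; /* redefined */
        int hit = next_free_without_redef(cfg, next, var, seen);
        if (hit >= 0) return hit;
    }
    return -1;
}

/* Refined rule: a double free needs two frees of v with no intervening
   redefinition of v. Assumes n <= MAX_STMTS. */
int find_double_free_refined(const Stmt *cfg, int n) {
    for (int a = 0; a < n; a++) {
        if (cfg[a].kind != ST_FREE) continue;
        int seen[MAX_STMTS] = {0};
        int hit = next_free_without_redef(cfg, a, cfg[a].var, seen);
        if (hit >= 0) return hit;
    }
    return -1;
}
```

On the code above, where `buf2R1` is reallocated before its second `free`, this rule reports nothing, while the version without the reallocation is still flagged.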

SLIDE 18

CFG Analysis

  • Does your program have a double free?
  • Does our new rule work for the following?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);
        bar(&buf2R1);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf2R1);
        free(buf1R2);
    }

SLIDE 19

CFG Analysis

  • Does your program have a double free?
  • What would need to be done to check?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);
        bar(&buf2R1);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf2R1);
        free(buf1R2);
    }

SLIDE 20

CFG Analysis

  • Does your program have a double free?
  • What about this one?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);
        buf3R1 = buf2R1;
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf3R1);
        free(buf1R2);
    }

SLIDE 21

Type-based Analysis

  • Does your program have a double free?
  • Can we express the rule with types (type-based)?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);
        buf2R1 = (char *) malloc(BUFSIZE2);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf2R1);
        free(buf1R2);
    }

SLIDE 22

Type-based Analysis

  • Does your program have a double free?
  • Can we express the rule with types (type-based)?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        DEF buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1), FREE x1 = buf2R1;
        DEF y = x1, y = (char *) malloc(BUFSIZE2);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(y), FREE x2 = y;
        free(buf1R2);
    }

SLIDE 23

Type-based Analysis

  • Does your program have a double free?
  • Can we express the rule with types (type-based)?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        DEF buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1), FREE x1 = buf2R1;
        DEF y = x1, y = (char *) malloc(BUFSIZE2);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(x1); FREE x2 = x1;
        free(buf1R2);
    }

SLIDE 24

Type-based Analysis

  • Does your program have a double free?
  • Can we express the rule with types (type-based)?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);
        bar(&buf2R1);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf2R1);
        free(buf1R2);
    }

SLIDE 25

Taint Analysis

  • Does your program have a double free?
  • How would taint analysis be applied?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);
        buf2R1 = (char *) malloc(BUFSIZE2);
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf2R1);
        free(buf1R2);
    }

SLIDE 26

Taint Analysis

  • Does your program have a double free?
  • What is the property to check?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);                          // taint
        buf2R1 = (char *) malloc(BUFSIZE2);    // untaint
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf2R1);                          // taint
        free(buf1R2);
    }

SLIDE 27

Taint Analysis

  • Does your program have a double free?
  • What about this one?

    foo(int x, char **y)  // need not be “main”
    {
        …
        buf1R1 = (char *) malloc(BUFSIZE2);
        buf2R1 = (char *) malloc(BUFSIZE2);
        free(buf1R1);
        free(buf2R1);                          // taint
        buf3R1 = buf2R1;                       // taint
        buf1R2 = (char *) malloc(BUFSIZE1);
        strncpy(buf1R2, argv[1], BUFSIZE1-1);
        free(buf3R1);                          // taint
        free(buf1R2);
    }
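The taint annotations above suggest a simple rule: `free` taints its argument, `malloc` untaints the assigned variable, a copy propagates taint, and a `free` of a tainted variable is a violation. Below is a minimal sketch of that rule over straight-line code. The `Stmt` and `TaintMap` types and the function names are assumptions for illustration, and the sketch handles at most `MAX_VARS` distinct variables.

```c
#include <string.h>

#define MAX_VARS 16

typedef enum { OP_MALLOC, OP_FREE, OP_COPY } Op;

/* One straight-line statement: dst = malloc(..), free(dst), or dst = src. */
typedef struct {
    Op op;
    const char *dst;
    const char *src;   /* used only by OP_COPY */
} Stmt;

typedef struct {
    const char *name[MAX_VARS];
    int tainted[MAX_VARS];
    int count;
} TaintMap;

/* Look up the slot for a variable, creating it untainted if absent. */
static int slot(TaintMap *m, const char *name) {
    for (int i = 0; i < m->count; i++)
        if (strcmp(m->name[i], name) == 0) return i;
    m->name[m->count] = name;
    m->tainted[m->count] = 0;
    return m->count++;
}

/* Returns the index of the first free() of a tainted pointer, or -1. */
int first_tainted_free(const Stmt *code, int n) {
    TaintMap m = { .count = 0 };
    for (int i = 0; i < n; i++) {
        int d = slot(&m, code[i].dst);
        switch (code[i].op) {
        case OP_MALLOC:                      /* fresh memory: untaint */
            m.tainted[d] = 0;
            break;
        case OP_COPY:                        /* dst inherits src's taint */
            m.tainted[d] = m.tainted[slot(&m, code[i].src)];
            break;
        case OP_FREE:
            if (m.tainted[d]) return i;      /* freeing freed memory */
            m.tainted[d] = 1;                /* free is the taint source */
            break;
        }
    }
    return -1;
}
```

On the copy example above, the sketch flags `free(buf3R1)` because `buf3R1` inherited the taint of `buf2R1`. Note that taint is tracked per variable: if the copy were made before the first `free`, the alias would stay untainted and the double free would be missed, so this simple version is not sound.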

SLIDE 28

Analysis Question

  • Are these proposed approaches sound?
  • What is the implication of an inaccurate, sound analysis?

SLIDE 29

Static Analysis Tools

  • Good news: there are static analysis tools available, so you don’t have to write your own
      • https://www.owasp.org/index.php/Static_Code_Analysis
  • Examples of tools
      • Fortify (HP)
      • GrammaTech
      • Checkmarx Static Code Analysis
      • Rational AppScan Source Edition (IBM)
      • Coverity

SLIDE 30

Static Analysis Tools

  • Good news: there are static analysis tools available, so you don’t have to write your own
  • Use of static analysis tools
      • https://www.rsaconference.com/writable/presentations/file_upload/asd-w02_avoiding-pitfalls-of-static-analysis_copy1.pdf
      • Describes the usage scenarios and pitfalls of these tools in practice

SLIDE 31

Overall Use Process

  • Understand what you are scanning
  • Validate the integrity of the results
  • Understand how to customize
  • Still some manual effort at the end

[Figure: use-process diagram. Labels: Web Application, Library, Trusted App, Mobile Application; Scanning & Verification; Add custom rules; Add canaries; Assess results; Address risk not covered by static analysis (“Cannot Do”)]

SLIDE 32

What Are You Scanning?

  • Tools are often language specific
      • Web applications and C/C++/Java are well supported
      • Scripting languages not so much
  • Often only scans your executable code, i.e., the code for which you have source
      • Doesn’t cover libraries
  • Amount of manual work can vary

[Figure: use-process diagram (repeated); callout: Scanning and Verification]

SLIDE 33

How Do Tools Work?

  • Tools convert the program code into an intermediate representation
  • Then various analyses are applied to the intermediate representation
  • Make sure your code has been translated to the intermediate representation
      • Check versioning of the language, compilers, any scripts, etc.

[Figure: Source Code → Intermediate Representation → Analysis (Control Flow, Dataflow, etc.)]
[Figure note, truncated in the source: you can see which files were translated by looking for “translate” in your …]

SLIDE 34

How Do Tools Work?

  • Tools convert the program code into an intermediate representation
  • Then various analyses are applied to the intermediate representation
  • A common framework for writing your own analyses is LLVM
      • Not backward compatible across versions, which can cause errors

[Figure: Source Code → Intermediate Representation → Analysis (Control Flow, Dataflow, etc.)]

SLIDE 35

Integrity of Scans

  • What are you scanning?
      • How good are the rules for that language and type of program?
  • Did all of your source code get scanned?
  • Were there errors in translation or analysis?
  • It appears to have worked fine
      • Now what?

[Figure: use-process diagram (repeated); callout: Scanning and Verification]

SLIDE 36

Customization

  • Typically, the default analysis will not be accurate enough
      • If the analysis is sound, what will happen if it is not accurate?
      • Experiences with Fortify
  • Thus, you need to customize the analysis rules to achieve your goals
      • Iterative process

[Figure: use-process diagram (repeated); callout: Add custom rules]

SLIDE 37

Customization

  • Examples of customizations
      • Sources
          • Of taint
      • Sinks
          • To detect use of tainted data
      • Passthrough
          • To untaint data
          • To retaint data
      • Summaries
          • Of libraries
      • Rules for what is a violation

[Figure: use-process diagram (repeated); callout: Add custom rules]
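Conceptually, such customizations amount to a table that assigns roles to functions. The sketch below shows one way that table might drive a taint check. It does not reflect the rule format of Fortify, Coverity, Checkmarx, or any other tool; the `Role` and `Rule` types and all function names in the table (`read_request`, `run_query`, `escape_sql`, `url_decode`) are hypothetical placeholders.

```c
#include <string.h>

typedef enum { ROLE_NONE, ROLE_SOURCE, ROLE_SINK,
               ROLE_UNTAINT, ROLE_RETAINT } Role;

typedef struct {
    const char *function;
    Role role;
} Rule;

/* Hypothetical project-specific rules; the names are placeholders. */
static const Rule custom_rules[] = {
    { "read_request", ROLE_SOURCE  },  /* returns attacker-controlled data */
    { "run_query",    ROLE_SINK    },  /* must not receive tainted data    */
    { "escape_sql",   ROLE_UNTAINT },  /* passthrough that sanitizes       */
    { "url_decode",   ROLE_RETAINT },  /* passthrough that undoes escaping */
};

/* Look up the role assigned to a function, ROLE_NONE if it has no rule. */
Role role_of(const char *function) {
    size_t n = sizeof custom_rules / sizeof custom_rules[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(custom_rules[i].function, function) == 0)
            return custom_rules[i].role;
    return ROLE_NONE;
}

/* Apply a call to a value whose taint flag is *tainted.
   Returns 1 if the call is a violation (tainted data reaches a sink). */
int apply_call(const char *function, int *tainted) {
    switch (role_of(function)) {
    case ROLE_SOURCE:  *tainted = 1; return 0;
    case ROLE_UNTAINT: *tainted = 0; return 0;
    case ROLE_RETAINT: *tainted = 1; return 0;
    case ROLE_SINK:    return *tainted;
    default:           return 0;   /* no rule: taint passes through unchanged */
    }
}
```

Adding or removing one row changes what the analysis reports, which is why customization tends to be iterative: a missing untaint rule shows up as false positives, and a missing source or sink rule shows up as missed findings.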

SLIDE 38

Customization in Tools

  • Customization flexibility in tools
      • Fortify
          • http://community.hpe.com/hpeb/attachments/hpeb/sws-Fortifyforum/217/1/HP_Fortify_SCA_Custom_Rules_Guide_4.21.pdf
      • Coverity
          • https://www.coverity.com/library/pdf/coverity_extend.pdf
      • Checkmarx
          • https://checkmarx.atlassian.net/wiki/display/KC/Working+with+Queries

    Tool        Custom-rule flexibility
    Checkmarx   Most flexible
    Fortify     Very flexible
    Coverity    Flexible
    Veracode    No custom rules allowed
    Whitehat    No custom rules allowed

SLIDE 39

Manual Review

  • Of results
      • Remove false positives
      • Improve analysis
  • Of code for more complex properties

SLIDE 40

Take Away

  • Static analysis techniques are a common way to detect software flaws
  • However, designing your own analyses is an art form
      • Demonstrated/reviewed some simple analysis problems
  • Fortunately, it is possible to look for vulnerabilities with static analysis tools
      • A variety of tools are available
      • Using them also requires a fair bit of expertise