
SLIDE 1

CO447 | LEC7 ROBUST APPLICATION CODE THROUGH ANALYSIS

Dr. Benjamin Livshits

SLIDE 2

The DAO

¨ The DAO was a complex smart contract with a focus on fair, decentralized operations.

¨ To allow investors to leave the organization in the case of a disagreement, The DAO was created with an exit, or ‘split’, function. This function allowed users to revert the involvement process and to have the Ether they had sent to The DAO returned.

¨ If someone wanted to leave The DAO, they would create their own child DAO, wait 28 days, and then approve their proposal to send Ether to another address.

https://coincodex.com/article/50/the-dao-hack-what-happened-and-what-followed/

SLIDE 3

The Hack

¨ On June 18, it was noticed that funds were leaving The DAO and that the Ether balance of the smart contract was being drained. Around 3.6M Ether, worth approximately $70M, was drained by a hacker in a few hours.

¨ The hacker was able to get the DAO smart contract to return Ether multiple times before it could update its own balance.
¤ Two main flaws allowed this to take place. First, the smart contract sent the Ether and only then updated the internal token balance.
¤ Second, The DAO coders had failed to consider the possibility of a recursive call that could act in such a way.

¨ The hack resulted in the proposal of a soft fork that would stop the stolen funds from being spent. However, this never took place after a bug was discovered within the implementation protocol. This opened up the possibility of a hard fork with wider-reaching implications.

SLIDE 4

The Hard Fork

¨ A hard fork was proposed that would return all the Ether stolen from The DAO in the form of a refund smart contract. The new contract could only withdraw, and investors in The DAO could make refund requests for lost Ether.

¨ While it makes perfect sense to seek to reimburse the victims of the attack, the hard fork uncovered a number of arguments that are still prevalent in the world of cryptocurrency today.
¤ Some opposed the hard fork and argued that the original statement of The DAO terms and conditions could never be changed.
¤ They also felt that the blockchain should be free from censorship and that things that take place on the blockchain shouldn’t be changed even in the event of negative outcomes.
¤ Opponents of these arguments felt that the hacker could not be allowed to profit from his actions and that returning the funds would keep blockchain projects free from regulation and litigation.

¨ The hard fork also made sense as it only returned funds to the original investors and would also help to stabilize the price of Ether.

SLIDE 5

What Happened: July 20, 2016

¨ The final decision was voted on and approved by Ether holders, with 89% voting for the hard fork. As a result, it took place on July 20, 2016, at block 1920000.

¨ The immediate result of this was the creation of Ethereum Classic (ETC), which shares all the data on the Ethereum blockchain up until block 1920000.

¨ The creation of Ethereum Classic showed that hard forks were very much possible, and it can be said that the creation of the second Ethereum currency has influenced the creators of subsequent Bitcoin (BTC) forks.

¨ It also became clear that while The DAO was a great idea, it was not implemented correctly, and that in order to move forward successfully blockchain projects would have to implement rigid security protocols.

SLIDE 6

The DAO Hack

¨ The attacker was analyzing DAO.sol, and noticed that the 'splitDAO' function was vulnerable to the recursive send pattern we've described above: this function updates user balances and totals at the end, so if we can get any of the function calls before this happens to call splitDAO again, we get the infinite recursion that can be used to move as many funds as we want (code comments are marked with XXXXX, you may have to scroll to see them):

http://hackingdistributed.com/2016/06/18/analysis-of-the-dao-exploit/

SLIDE 7

Call Split More than Once

The basic idea is this: propose a split. Execute the split. When the DAO goes to withdraw your reward, call the function to execute a split before that withdrawal finishes.

SLIDE 8

Moving Funds Multiple Times

¨ Basically the attacker is using this to transfer more tokens than they should be able to into their child DAO.

¨ How does the DAO decide how many tokens to move? Using the balances array of course:

¨ Because p.splitData[0] is going to be the same every time the attacker calls this function (it's a property of the proposal p, not the general state of the DAO), and because the attacker can call this function from withdrawRewardFor before the balances array is updated, the attacker can get this code to run arbitrarily many times using the described attack, with fundsToBeMoved coming out to the same value each time.
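The send-before-update flaw described above can be sketched with a toy Python model. This is not the DAO's Solidity code: `ToyVault`, `withdraw`, and `run_attack` are hypothetical names, and the callback stands in for the code a recipient contract runs when it receives funds.

```python
class ToyVault:
    """Toy ledger that pays out BEFORE updating its books (the flaw above)."""

    def __init__(self, balances):
        self.balances = dict(balances)        # per-account credit
        self.pool = sum(balances.values())    # funds actually held

    def withdraw(self, account, on_receive):
        amount = self.balances[account]
        if amount == 0 or self.pool < amount:
            return
        self.pool -= amount                   # funds leave the vault...
        on_receive(amount)                    # ...control passes to the recipient...
        self.balances[account] = 0            # ...and the books are updated too late


def run_attack(max_depth):
    """Return (amount taken by the attacker, funds left in the vault)."""
    vault = ToyVault({"attacker": 10, "victim": 90})
    stolen = []

    def reenter(amount):
        stolen.append(amount)
        if len(stolen) < max_depth:
            # The nested call still sees the stale balance of 10.
            vault.withdraw("attacker", reenter)

    vault.withdraw("attacker", reenter)
    return sum(stolen), vault.pool


print(run_attack(1))   # honest single withdrawal
print(run_attack(5))   # re-entering four more times before the update
```

In this sketch, moving `self.balances[account] = 0` above the `on_receive` call makes every nested call see a zero balance and return immediately.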

SLIDE 9

Recursive Call in MakerDAO

¨ Our team is blessed to have Dr. Christian Reitwießner, Father of Solidity, as its Advisor. During the early development of the DAO Framework 1.1 and thanks to his guidance we were made aware of a generic vulnerability common to all Ethereum smart contracts. We promptly circumvented this so-called “recursive call vulnerability” or “race to empty” from the DAO Framework 1.1 as can be seen on line 580:

SLIDE 10

More DAOs

¨ Three days ago this design vulnerability potential was raised in a blog post which subsequently led to the discovery of such an issue in an unrelated project, MakerDAO. This was highlighted in a reddit post, with MakerDAO being able to drain their own funds safely before the vulnerability could be exploited.

¨ Around 12 hours ago user Eththrowa on the DAOHub Forum spotted that while we had identified the vulnerability in one aspect of the DAO Framework, the existing (and deployed) DAO reward account mechanism was affected. His message and our prompt confirmation can be found here.

¨ We issued a fix immediately as part of the DAO Framework 1.1 milestone.

https://blog.slock.it/no-dao-funds-at-risk-following-the-ethereum-smart-contract-recursive-call-bug-discovery-29f482d348b

SLIDE 11

Cost of Fixing a Defect

SLIDE 12

How Do We Find Bugs?

SLIDE 13

Runtime Monitoring

¨ Instrument code for testing
¨ Heap memory: Purify
¨ Valgrind: http://valgrind.org
¨ Perl tainting (information flow)
¨ Java race condition checking

¨ Pros:
¤ Easy to reproduce the bug
¤ Relatively easy to implement

¨ Cons:
¤ Slows down the program significantly
¤ 10x-40x slowdowns
¤ Test only: cannot be used in production
¤ Not all paths executed

SLIDE 14

Black-box Testing

¨ Fuzzing and penetration testing
¨ Black-box web application security analysis
¨ Typically tries to provide cleverly crafted unexpected inputs
¨ Also known as inputs of death
¨ Examples:
¤ Peach fuzzer: http://peachfuzzer.com/
¤ antifuzzer, Dfuz, SPIKE, GPF, etc.

¨ Pros:
¤ Easy to reproduce the bug
¤ Don’t need to understand the code
¤ Can be done by someone else

¨ Cons:
¤ Has no visibility into program logic
¤ Has low coverage
¤ Possibly misses many vulnerabilities

SLIDE 15

Static Analysis

¨ Static code analysis tools
¤ Coverity
¤ Tools from Microsoft like Prefix and Prefast
¤ FindBugs (for Java)
¤ Fortify (for security)

¨ Pros:
¤ Near-perfect code coverage, exercises all paths
¤ Can be run incrementally as part of the development process

¨ Cons:
¤ Can be imprecise
¤ Can scale poorly
¤ Can produce results that are tough to interpret

SLIDE 16

From Coverity

SLIDE 17

From CPyChecker

SLIDE 18

From FxCop

SLIDE 19

From PVS-Studio

SLIDE 20

From Visual Lint

SLIDE 21

XSS Detect

SLIDE 22

Visual Studio

SLIDE 23

Outline

¨ General discussion of static analysis tools
¤ Goals and limitations
¤ Approach based on abstract states

¨ More about one specific approach
¤ Property checkers from Engler et al., Coverity
¤ Sample security-related results

Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, J. Franklin, A. Aiken, Mitchell …

SLIDE 24

[Figure: a program drawn as a graph from Entry through nodes 1, 2, 3, 4 to Exit (labeled "Software"), next to the set of its execution paths (labeled "Behaviors"), such as 1 2 4, 1 3 4, and 1 2 3, continuing indefinitely.]

Manual testing only examines a small subset of behaviors

Static Analysis Coverage Advantage

SLIDE 25

Program Analyzers

[Figure: a Program Analyzer takes Code and a Spec as input and produces a report.]

Report   Type            Line
1        mem leak        324
2        buffer oflow    4,353,245
3        sql injection   23,212
4        stack oflow     86,923
5        dang ptr        8,491
…        …               …
10,502   info leak       10,921

A program analyzer:
¤ must analyze large code bases
¤ potentially reports many warnings
¤ may emit false alarms (two report entries are marked "false alarm" in the figure)

SLIDE 26

Static Analysis Goals

¨ Bug finding: identify code that the programmer wishes to modify or improve

¨ Correctness: verify the absence of certain classes of errors

SLIDE 27

Soundness and Completeness

Property       Definition
Soundness      If the program contains an error, the analysis will report a warning. “Sound for reporting correctness”
Completeness   If the analysis reports a warning, the program will contain an error. “Complete for reporting correctness”
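One way to read the two definitions is as set inclusions between the real errors in a program and the warnings an analysis reports. The sketch below assumes we somehow know the true error set, which is only possible in a toy setting; the error strings are made up for illustration.

```python
def is_sound(true_errors, warnings):
    """Sound: every real error is reported (no missed errors)."""
    return true_errors <= warnings


def is_complete(true_errors, warnings):
    """Complete: every reported warning is a real error (no false alarms)."""
    return warnings <= true_errors


# Hypothetical ground truth and two hypothetical analysis reports.
true_errors = {"line 12: null deref", "line 40: leak"}
over_report = true_errors | {"line 77: overflow"}   # one false alarm
under_report = {"line 12: null deref"}              # one missed error

print(is_sound(true_errors, over_report), is_complete(true_errors, over_report))
print(is_sound(true_errors, under_report), is_complete(true_errors, under_report))
```

The over-reporting analysis is sound but incomplete, and the under-reporting one is complete but unsound.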

SLIDE 28

             Complete                     Incomplete

Sound        Reports all errors           Reports all errors
             Reports no false alarms      May report false alarms
             Undecidable                  Decidable

Unsound      May not report all errors    May not report all errors
             Reports no false alarms      May report false alarms
             Decidable                    Decidable

Decidable?

SLIDE 29

[Figure: software modules and their behaviors, enclosed by a sound over-approximation of behaviors; marked points: reported error, false alarm.]

If the approximation is too coarse, it yields too many false alarms.

Over- and Underapproximations

SLIDE 30

[Flowchart with nodes: entry; X ← 0; Is Y = 0?; X ← X + 1; X ← X - 1; Is Y = 0?; Is X < 0?; exit; crash. Each of the three tests has a yes branch and a no branch.]

Does This Program Ever Crash?

SLIDE 31

[Flowchart with nodes: entry; X ← 0; Is Y = 0?; X ← X + 1; X ← X - 1; Is Y = 0?; Is X < 0?; exit; crash. Each of the three tests has a yes branch and a no branch.]

Annotations: infeasible path! … overly imprecise … program will never crash

Does This Program Ever Crash?

SLIDE 32

[Flowchart with nodes: entry; X ← 0; Is Y = 0?; X ← X + 1; X ← X - 1; Is Y = 0?; Is X < 0?; exit; crash. The program points are annotated with concrete values of X: X = 0, then X = 1, X = 2, X = 3, and so on.]

non-termination! … therefore, need to approximate

Try Analyzing Without Approximation

SLIDE 33

[Figure: statement X ← X + 1 with transfer function f; the input din is X = 0 and the output dout is X = 1.]

dataflow elements: din, dout
transfer function: f
dataflow equation: dout = f(din)

Dataflow Analysis Framework

SLIDE 34

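[Figure: statement X ← X + 1 (transfer function f1, input din1, output dout1) followed by the test Is Y = 0? (transfer function f2, input din2, output dout2), annotated with X = 0 before the first statement and X = 1 afterwards.]

dout1 = f1(din1)
dout1 = din2
dout2 = f2(din2)

Applying the Dataflow Approach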

SLIDE 35

[Figure: two nodes with transfer functions f1 and f2 (inputs din1, din2; outputs dout1, dout2) merge into a third node with transfer function f3 (input din3, output dout3).]

dout1 = f1(din1)
dout2 = f2(din2)
djoin = dout1 ⊔ dout2
djoin = din3
dout3 = f3(din3)

⊔ is the least upper bound operator. Example: union of possible values.

What is the space of dataflow elements, D? What is the least upper bound operator, ⊔?

Meet/Join Operator ⊔

SLIDE 36

[Flowchart with nodes: entry; X ← 0; Is Y = 0?; X ← X + 1; X ← X - 1; Is Y = 0?; Is X < 0?; exit; crash. The program points are annotated with abstract values of X drawn from 0, pos, neg, and ⊤; after the branches merge the analysis only knows X = ⊤ (lost precision).]

terminates... … but reports false alarm … therefore, need more precision

Try Analyzing with “Signs” Approximation…

SLIDE 37

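signs lattice: X = ⊤ at the top; X = pos, X = 0, X = neg in the middle; X = ⊥ at the bottom
Boolean formula lattice: true at the top; Y = 0, Y ≠ 0 in the middle; false at the bottom
refined signs lattice: X = ⊤ at the top; then X ≠ neg, X ≠ pos; then X = pos, X = 0, X = neg; X = ⊥ at the bottom

Lattices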
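The plain signs lattice and its join can be written down in a few lines. The following is a minimal sketch, not taken from the slides: the `Sign` enum, the transfer functions `inc` and `dec`, and the hypothetical loop in `loop_head_fixpoint` are illustrative assumptions.

```python
from enum import Enum


class Sign(Enum):
    BOT = "bottom"   # unreachable: no value at all
    NEG = "neg"
    ZERO = "0"
    POS = "pos"
    TOP = "top"      # any value: sign unknown


def join(a, b):
    """Least upper bound of two elements of the signs lattice."""
    if a is b:
        return a
    if a is Sign.BOT:
        return b
    if b is Sign.BOT:
        return a
    return Sign.TOP


def inc(s):
    """Abstract transfer function for X <- X + 1."""
    if s in (Sign.ZERO, Sign.POS):
        return Sign.POS
    if s is Sign.NEG:
        return Sign.TOP      # neg + 1 may be neg or 0
    return s                 # BOT and TOP are unchanged


def dec(s):
    """Abstract transfer function for X <- X - 1."""
    if s in (Sign.ZERO, Sign.NEG):
        return Sign.NEG
    if s is Sign.POS:
        return Sign.TOP      # pos - 1 may be pos or 0
    return s


def loop_head_fixpoint():
    """Hypothetical loop 'X <- 0; repeat X <- X + 1': iterate to a fixed point."""
    state, rounds = Sign.ZERO, 0
    while True:
        rounds += 1
        new_state = join(Sign.ZERO, inc(state))
        if new_state is state:
            return state, rounds
        state = new_state


# Merging the two branches X <- X + 1 and X <- X - 1 from X = 0 loses precision.
print(join(inc(Sign.ZERO), dec(Sign.ZERO)))
print(loop_head_fixpoint())
```

Unlike enumerating concrete values, the abstract iteration stops after a bounded number of rounds because the lattice has finite height; the price is that the merged state is ⊤.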

SLIDE 38

[Flowchart with nodes: entry; X ← 0; Is Y = 0?; X ← X + 1; X ← X - 1; Is Y = 0?; Is X < 0?; exit; crash. The program points are annotated with pairs of a sign of X and a fact about Y, for example (X = pos, Y = 0) and (X = neg, Y ≠ 0).]

terminates... … no false alarm … soundly proved never crashes

no precision loss (refinement)

Try Analyzing with “Path-sensitive signs”
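The difference between merging states and keeping them apart per path condition can be sketched as follows. The flowchart is flattened in this transcript, so the program analyzed here is an assumed reading of it: X ← 0; if Y = 0 then X ← X + 1 else X ← X - 1; then crash only if Y = 0 and X < 0. The function name and the string encoding of states are hypothetical.

```python
def may_crash(path_sensitive):
    """Return True if the analysis cannot rule out reaching 'crash'."""
    # Abstract states after the first branch on Y: fact about Y -> sign of X.
    states = {"Y=0": "pos", "Y!=0": "neg"}

    if not path_sensitive:
        # Plain signs analysis merges both branches: pos joined with neg is top.
        states = {"any": "T"}

    # Second test "Is Y = 0?": only states compatible with Y = 0 continue.
    reaching = [sign for fact, sign in states.items() if fact != "Y!=0"]

    # Third test "Is X < 0?": a crash is possible if X may be negative.
    return any(sign in ("neg", "T") for sign in reaching)


print(may_crash(path_sensitive=False))   # merged state: false alarm
print(may_crash(path_sensitive=True))    # refined states: crash ruled out
```

Keeping one sign per fact about Y is what lets the analysis discard the infeasible combination of Y ≠ 0 on the first test and Y = 0 on the second.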

SLIDE 39

Approaches to Finding Security Bugs

¨ Runtime Monitoring
¨ Black-box Testing
¨ Static Analysis

SLIDE 40

From Coverity

SLIDE 41

Architecture of an Analysis Platform

SLIDE 42

Bugs Detected by Coverity

Crash causing defects
Null pointer dereference
Use after free
Double free
Array indexing errors
Mismatched array new/delete
Potential stack overrun
Potential heap overrun
Return pointers to local variables
Logically inconsistent code
Uninitialized variables
Invalid use of negative values
Passing large parameters by value
Under-allocations of dynamic data
Memory leaks
File handle leaks
Network resource leaks
Unused values
Unhandled return codes
Use of invalid iterators

SLIDE 43

Coverity Checkers

¨ Some coding patterns and some vulnerabilities are specific to the code base

¨ Issues that apply to the Linux kernel are unlikely to apply in application software

SLIDE 44

Example Checker: Missing Optional Arguments

¨ Prototype for open() syscall:

int open(const char *path, int oflag, /* mode_t mode */...);

¨ Typical mistake:

fd = open("file", O_CREAT);

¨ Force setting explicit file permissions!

¨ Check: look for oflag == O_CREAT without a mode argument
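A check of this kind can be approximated with a purely textual scan. The sketch below is a hypothetical illustration, not Coverity's checker: it uses a regular expression over single lines, so it does not handle nested calls, multi-line calls, or commas inside string literals.

```python
import re

# Matches open(...) but not fopen(...); the argument list must not contain parentheses.
OPEN_CALL = re.compile(r"\bopen\s*\(([^()]*)\)")


def missing_mode(source):
    """Return line numbers where open() uses O_CREAT but passes no mode."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in OPEN_CALL.finditer(line):
            args = [arg.strip() for arg in match.group(1).split(",")]
            if len(args) == 2 and "O_CREAT" in args[1]:
                hits.append(lineno)
    return hits


SAMPLE = '''
fd = open("file", O_CREAT);
fd = open("file", O_CREAT | O_WRONLY, 0600);
fd = open("file", O_RDONLY);
fp = fopen("file", "w");
'''

print(missing_mode(SAMPLE))
```

A real checker would work on the parsed program rather than on text, which is why the slides describe it as a compiler extension.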

SLIDE 45

Example: chroot Protocol Checker

¨ Goal: confine process to a “jail” on the filesystem
¤ chroot() changes filesystem root for a process

¨ Problem
¤ chroot() itself does not change the current working directory

SLIDE 46

Tainting Checkers

SLIDE 56

Sanitize Integers Before Use

Warn when unchecked integers from untrusted sources reach trusting sinks

Sources (v.tainted): syscall param, network packet, copyin(&v, p, len)
Sanitizer (v.clean): any <= v <= any
Sinks (Use(v)): array[v], while(i < v) …, memcpy(p, q, v), copyin(p,q,v), copyout(p,q,v)
ERROR: a tainted v reaches a sink

Linux: 125 errors, 24 false; BSD: 12 errors, 4 false
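The tainted/clean state machine can be sketched as a walk over the events on one path. This is a simplified illustration, not the checker from the paper: the event names `source`, `bounds_check`, and `sink` are hypothetical labels for the three roles described above.

```python
def check_path(events):
    """events: list of (action, variable) pairs along one program path.

    Returns the variables that reach a trusting sink while still tainted.
    """
    state = {}
    errors = []
    for action, var in events:
        if action == "source":            # e.g. syscall parameter, network packet
            state[var] = "tainted"
        elif action == "bounds_check":    # lower AND upper bound both checked
            state[var] = "clean"
        elif action == "sink":            # e.g. array index, memcpy length
            if state.get(var) == "tainted":
                errors.append(var)
    return errors


unchecked = [("source", "v"), ("sink", "v")]
checked = [("source", "v"), ("bounds_check", "v"), ("sink", "v")]

print(check_path(unchecked))
print(check_path(checked))
```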

SLIDE 57

Looking for Blocking Function Calls

SLIDE 58

Missed Lower-bound Check

¨ d is read from the user
¨ Signed integer d.idx is upper-bound checked but not lower-bound checked
¨ d.used is unchecked, allowing 2GB of user data to be copied into the kernel

/* 2.4.5/drivers/char/drm/i810_dma.c */
if(copy_from_user(&d, arg, sizeof(arg)))
    return -EFAULT;
if(d.idx > dma->buf_count)
    return -EINVAL;
buf = dma->buflist[d.idx];
copy_from_user(buf_priv->virtual, d.address, d.used);
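The effect of checking only the upper bound of a signed index can be shown with two small predicates. This is a Python illustration of the comparison logic only, not kernel code; `upper_bound_only` mirrors the test as printed above, and `both_bounds` is a hypothetical stricter check that assumes the buffer list holds exactly `buf_count` entries.

```python
def upper_bound_only(idx, buf_count):
    """Mirrors the printed test: reject only when idx > buf_count."""
    return not (idx > buf_count)


def both_bounds(idx, buf_count):
    """Stricter check: idx must name one of buf_count entries."""
    return 0 <= idx < buf_count


for idx in (-1, 0, 15, 16, 17):
    print(idx, upper_bound_only(idx, 16), both_bounds(idx, 16))
```

A negative index passes the upper-bound-only test, and under the stated assumption so does an index equal to `buf_count`.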

SLIDE 59

Remote Exploit

¨ msg points to arbitrary network data
¨ This can be used to overflow cmd and write data onto the stack

/* 2.4.9/drivers/isdn/act2000/capi.c:actcapi_dispatch */
isdn_ctrl cmd;
...
while ((skb = skb_dequeue(&card->rcvq))) {
    msg = skb->data;
    ...
    memcpy(cmd.parm.setup.phone,
           msg->msg.connect_ind.addr.num,
           msg->msg.connect_ind.addr.len - 1);

SLIDE 60

Example Code with Functions and Calls

¨ We would want to reason about the flow of the input (size) and name provided by the user

SLIDE 61

[Call graph with nodes: main, say_hello, atoi, exit, free, malloc, printf, fgets]

Call Graph for the Program

SLIDE 62

[Control flow graph. Nodes: char * buf[8]; | if (a) | b = new char [5]; | if (a && b) | buf[8] = a; | delete [] b; | *b = 'x'; | *a = *b; | END. Edge labels: a, !a, a && b, !(a && b).]

Control Flow Graph

Represent logical structure of code in graph form

SLIDE 63

[Control flow graph. Nodes: char * buf[8]; | if (a) | b = new char [5]; | if (a && b) | buf[8] = a; | delete [] b; | *b = 'x'; | *a = *b; | END. Edge labels: a, !a, a && b, !(a && b).]

Path Traversal

Conceptually: analyze each path through the control graph separately
Actually: perform some checking computation once per node; combine paths at merge nodes

SLIDE 64

[Path under analysis: char * buf[8]; → if (a), taking !a → if (a && b), taking !(a && b) → delete [] b; → *b = 'x'; → *a = *b; → END]

Apply Checking

See how three checkers are run for this path: null pointers, use after free, array overrun.

Checker
¤ Defined by a state diagram, with state transitions and error states

Run Checker
¤ Assign initial state to each program variable
¤ State at a program point depends on the state at the previous point and on program actions
¤ Emit error if error state reached

SLIDE 65

Apply Checking (same path and checkers)

Facts recorded so far: “buf is 8 bytes”

SLIDE 66

Apply Checking (same path and checkers)

Facts recorded so far: “buf is 8 bytes”, “a is null”

SLIDE 67

Apply Checking (same path and checkers)

Facts recorded so far: “buf is 8 bytes”, “a is null”
At the second test: already knew a was null

SLIDE 68

Apply Checking (same path and checkers)

Facts recorded so far: “buf is 8 bytes”, “a is null”, “b is deleted”

SLIDE 69

Apply Checking (same path and checkers)

Facts recorded so far: “buf is 8 bytes”, “a is null”, “b is deleted”, “b dereferenced!”

SLIDE 70

Apply Checking (same path and checkers)

Facts recorded so far: “buf is 8 bytes”, “a is null”, “b is deleted”, “b dereferenced!”

No more errors reported for b
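A checker defined by a state diagram can be sketched as a small state machine run over the operations of one path. The code below is an illustrative use-after-free checker, not Coverity's implementation: the operation names and the `reported` state are assumptions, and the path encodes `delete [] b; *b = 'x'; *a = *b;` from the example.

```python
def use_after_free(path):
    """path: list of (op, var) with op in {"new", "delete", "deref"}.

    Returns (step index, variable) for each use-after-free error found.
    """
    state = {}
    errors = []
    for step, (op, var) in enumerate(path):
        if op == "new":
            state[var] = "allocated"
        elif op == "delete":
            state[var] = "deleted"
        elif op == "deref" and state.get(var) == "deleted":
            errors.append((step, var))
            state[var] = "reported"   # stop tracking: no more errors for var
    return errors


# delete [] b;   *b = 'x';   *a = *b;  (the last statement dereferences a and b)
PATH = [("delete", "b"), ("deref", "b"), ("deref", "a"), ("deref", "b")]

print(use_after_free(PATH))
```

Moving a variable to a terminal state after the first report is one simple way to get the "no more errors reported for b" behavior.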

SLIDE 71

False Positives

¨ What is a bug? Something the user will fix.
¨ Many sources of false positives
¤ False paths
¤ Idioms
¤ Execution environment assumptions
¤ Killpaths
¤ Conditional compilation
¤ “third party code”
¤ Analysis imprecision
¤ …

SLIDE 72

[Control flow graph. Nodes: char * buf[8]; | if (a) | b = new char [5]; | if (a && b) | buf[8] = a; | delete [] b; | *b = 'x'; | *a = *b; | END. Edge labels: a, !a, a && b, !(a && b).]

A False Path

SLIDE 73

[Path under analysis: char * buf[8]; → if (a), taking !a → if (a && b), taking a && b → buf[8] = a; → END]

False Path Pruning

Facts tracked along the path: integer range, disequality, branch

SLIDE 74

False Path Pruning (same path)

Facts recorded so far: “a in [0,0]”, “a == 0 is true”

SLIDE 75

False Path Pruning (same path)

Facts recorded so far: “a in [0,0]”, “a == 0 is true”, “a != 0”

SLIDE 76

False Path Pruning (same path)

Facts recorded so far: “a in [0,0]”, “a == 0 is true”, “a != 0”

Impossible

SLIDE 77

Application to Security Bugs

¨ Stanford research project
¨ Ken Ashcraft and Dawson Engler, Using Programmer-Written Compiler Extensions to Catch Security Holes, IEEE Security and Privacy 2002
¨ Used modified compiler to find over 100 security holes in Linux and BSD

SLIDE 78

Results for BSD and Linux

                          Linux          BSD
Violation                 Bug   Fixed    Bug   Fixed
Gain control of system    18    15       3     3
Corrupt memory            43    17       2     2
Read arbitrary memory     19    14       7     7
Denial of service         17    5        0     0
Minor                     28    1        0     0
Total                     125   52       12    12

SLIDE 79

Black box testing and fuzzing

SLIDE 80

Approaches to Finding Security Bugs

¨ Runtime Monitoring
¨ Black-box Testing
¨ Static Analysis

SLIDE 81

Fuzzing Basics

¨ A form of vulnerability analysis and testing
¨ Many slightly anomalous test cases are input into the target application
¨ Application is monitored for any sign of error

SLIDE 82

Example

¨ Standard HTTP GET request
¤ GET /index.html HTTP/1.1

¨ Anomalous requests
¤ AAAAAA...AAAA /index.html HTTP/1.1
¤ GET ///////index.html HTTP/1.1
¤ GET %n%n%n%n%n%n.html HTTP/1.1
¤ GET /AAAAAAAAAAAAA.html HTTP/1.1
¤ GET /index.html HTTTTTTTTTTTTTP/1.1
¤ GET /index.html HTTP/1.1.1.1.1.1.1.1
¤ etc...
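Anomalous requests like the ones listed above can be derived mechanically from one valid request line. The sketch below is a hypothetical generator written for illustration: the function name, the repetition counts, and the choice of six variants are assumptions, and it only builds strings without sending anything.

```python
def anomalous_requests(method="GET", path="/index.html", version="HTTP/1.1"):
    """Return request lines that each distort one field of a valid request."""
    long_run = "A" * 64
    return [
        f"{long_run} {path} {version}",                         # oversized method
        f"{method} {'/' * 7}{path.lstrip('/')} {version}",      # repeated slashes
        f"{method} /{'%n' * 6}.html {version}",                 # format-string tokens
        f"{method} /{long_run}.html {version}",                 # oversized file name
        f"{method} {path} {version.replace('T', 'T' * 6, 1)}",  # stretched protocol name
        f"{method} {path} {version + '.1' * 6}",                # stretched version number
    ]


for line in anomalous_requests():
    print(line)
```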

SLIDE 83

Early Successes (1989 Fuzz Project)

SLIDE 84

iPhone Security Flaw: July 2007

Shortly after the iPhone was released, a group of security researchers at Independent Security Evaluators decided to investigate how hard it would be for a remote adversary to compromise the private information stored on the device

SLIDE 85

Success

¨ Within two weeks of part time work, we had successfully
¤ discovered a vulnerability
¤ developed a toolchain for working with the iPhone's architecture
¤ created a proof-of-concept exploit capable of delivering files from the user's iPhone to a remote attacker

¨ Notified Apple of the vulnerability and proposed a patch.

¨ Apple subsequently resolved the issue.

¨ Released an advisory

SLIDE 86

CVE-2007-3944 Issued and Patched


SLIDE 87

iPhone Attack

¨ iPhone Safari downloads malicious web page
¤ Arbitrary code is run with administrative privileges
¤ Can read SMS log, address book, call history, etc.
¤ Can transmit collected data to attacker
¤ Can perform physical actions on the phone
▪ make a system sound and vibrate the phone for a second
▪ could dial phone numbers, send text messages, or record audio (as a bugging device)

SLIDE 88

How Was This Discovered?

¨ WebKit is open source
¤ “WebKit is an open source web browser engine. WebKit is also the name of the Mac OS X system framework version of the engine that's used by Safari, Dashboard, Mail, and many other OS X applications.”

¨ So we know what they use for code testing
¤ Use code coverage to see which portions of the code are not well tested
¤ Tools such as gcov, icov, etc., measure test coverage

SLIDE 89

Collect Coverage for the Test Suite

Identify potential focus points. From the development site, on the JavaScriptCore tests: “If you are making changes to JavaScriptCore, there is an additional test suite you must run before landing changes. This is the Mozilla JavaScript test suite.”

SLIDE 90

What To Focus On?

¨ 59.3% of 13,622 lines in JavaScriptCore were covered
¤ 79.3% of main engine covered
¤ 54.7% of Perl Compatible Regular Expression (PCRE) covered

¨ Next step: focus on PCRE
¤ Wrote a PCRE fuzzer (20 lines of perl)
¤ Ran it on standalone PCRE parser (pcredemo from PCRE library)
¤ Started getting errors: PCRE compilation failed at offset 6: internal error: code overflow

¨ Evil regular expressions crash mobile Safari

SLIDE 91

Fuzzing in Office

SLIDE 92

Fuzzing for Money

SLIDE 93

Mutation Based Fuzzing

¨ Little or no knowledge of the structure of the inputs is assumed
¨ Input anomalies are added to existing valid inputs
¨ Input anomalies may be completely random or follow some heuristics
¨ Requires little to no set up time
¨ Success dependent on the inputs being modified
¨ May help to get to parts of the code protected by complex conditionals
¨ May fail for protocols with checksums, those which depend on challenge response, etc.
¨ Examples:
¤ ZZUF, very successful at finding bugs in many real-world programs, http://sam.zoy.org/zzuf/
¤ Taof, GPF, ProxyFuzz, FileFuzz, Filep, etc.

SLIDE 94

Example: Fuzzing a PDF Viewer

¨ Google for .pdf (about 1 billion results)
¨ Crawl pages to build a corpus
¨ Use a fuzzing tool (or script) to
¤ 1. Grab a file
¤ 2. Mutate that file
¤ 3. Feed it to the program
¤ 4. Record if it crashed (and the input that crashed it)
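The four steps above map directly onto a small loop. The following is a minimal sketch under stated assumptions: `toy_viewer` is a hypothetical in-process stand-in for a real PDF viewer (which would be launched as a separate process and watched for crashes), and the corpus is a single made-up byte string.

```python
import random


def mutate(data, rng, n_bytes=1):
    """Overwrite n_bytes randomly chosen positions with random byte values."""
    out = bytearray(data)
    for _ in range(n_bytes):
        out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)


def toy_viewer(data):
    """Hypothetical target: 'crashes' when byte 3 of the input is 0xff."""
    if len(data) > 3 and data[3] == 0xFF:
        raise ValueError("simulated crash")


def fuzz(corpus, target, iterations, seed=0):
    """Run the grab / mutate / feed / record loop and return crashing inputs."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        sample = rng.choice(corpus)                  # 1. grab a file
        candidate = mutate(sample, rng)              # 2. mutate that file
        try:
            target(candidate)                        # 3. feed it to the program
        except Exception as exc:
            crashes.append((candidate, repr(exc)))   # 4. record the crash and its input
    return crashes


found = fuzz([b"%PDF-1.4 toy"], toy_viewer, iterations=5000)
print(len(found), "crashing inputs recorded")
```

Keeping the crashing input next to the error is what makes the bug easy to reproduce later.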

SLIDE 95

Image Format Fuzzing?

SLIDE 96

Rupture Fuzzer

http://www.youtube.com/watch?v=dHYLu3oYpnQ&feature=youtu.be

SLIDE 97

fuzzdb: Attack and Discovery Pattern Database for Application Fuzz Testing

A TRUE FALSE 00 1

1

1.0

1.0

2

2
20

65536 268435455

268435455

2147483647 0xfffffff NULL null \0 \00 < script > < / script> %0a %00 +%00 \0 \0\0


dir%00| |dir |dir| |/bin/ls -al ?x= ?x=" ?x=| ?x=> /boot.ini ABCD|%8.8x|%8.8x|%8. 8x|%8.8x|%8.8x|%8.8x| %8.8x|%8.8x|%8.8x|%8. 8x| ../../boot.ini /../../../../../../../../%2A %25%5c..%25%5c..%25% 5c..%25%5c..%25%5c..% 25%5c..%25%5c..%25%5 c..%25%5c..%25%5c..%2 5%5c..%25%5c..% 25%5c..%2 5%5c..%00 %25%5c..%25%5c..%25% 5c..%25%5c..%25%5c..% 25%5c..%25%5c..%25%5 c..%25%5c..%25%5c..%2 5%5c..%25%5c..%

03C &#x0003C &#x00003C &#x000003C < < < < < &#x000003C; &#X3C &#X03C &#X003C &#X0003C &#X00003C &#X000003C &#X3C; &#X03C; &#X003C; &#X0003C; &#X00003C; &#X000003C; \x3c

\x3C \u003c \u003C something%00html ' /' \' ^' @' {'} ['] *' #' ">xxx<P>yyy "><script>" <script>alert("XSS")</scri pt> uname -n -s whoami pwd last cat /etc/passwd ls -la /tmp ls -la /home ping -i 30 127.0.0.1 ping 127.0.0.1 ping -n 30

SLIDE 98

Generation Based Fuzzing

¨ Test cases are generated from some description of the format: RFC, documentation, grammar, etc.
¨ Knowledge of format or protocol should give better results than random fuzzing
¨ Can take significant time to set up

SLIDE 99

Generation Based: SPIKE

s_string("POST /testme.php HTTP/1.1\r\n");
s_string("Host: testserver.example.com\r\n");
s_string("Content-Length: ");
s_blocksize_string("block1", 5);
s_string("\r\nConnection: close\r\n\r\n");
s_block_start("block1");
s_string("inputvar=");
s_string_variable("inputval");
s_block_end("block1");

Resulting request:

POST /testme.php HTTP/1.1
Host: testserver.example.com
Content-Length: [size_of_data]
Connection: close

inputvar=[fuzz_string]

s_string_variable("string"); // inserts a fuzzed string into your "SPIKE". The string "string" will be used for the first iteration of this variable, as well as for any SPIKES where other s_string_variables are being iterated
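The point of the block construct is that the Content-Length field is recomputed for every fuzzed value, so the request stays well-formed around the anomaly. The sketch below imitates that idea in Python; it is not the SPIKE API, and `build_request` is a hypothetical helper.

```python
def build_request(fuzz_string, host="testserver.example.com"):
    """Build a POST request whose Content-Length matches the fuzzed body."""
    body = f"inputvar={fuzz_string}"            # contents of "block1"
    length = len(body.encode("utf-8"))          # size of the block in bytes
    return (
        "POST /testme.php HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {length}\r\n"
        "Connection: close\r\n\r\n"
        f"{body}"
    )


for candidate in ("inputval", "A" * 100, "%n%n%n%n"):
    request = build_request(candidate)
    print(request.split("\r\n")[2])             # the Content-Length header
```

A mutation fuzzer that flipped bytes in a captured request would usually break this length field, which is one reason format knowledge can reach deeper code.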

SLIDE 100

The Problems With Fuzzing

¨ Mutation based fuzzers can generate a huge number of test cases... When has the fuzzer run long enough?

¨ Generation based fuzzers generate lots of test cases, too. What happens when they’re all run and no bugs are found?

¨ How do you monitor the target application such that you know when something “bad” has happened?

SLIDE 101

More Issues with Fuzzing

¨ What happens when you find too many bugs?
¨ Or every anomalous test case triggers the same (boring) bug?
¨ Given a crash, how do you find the actual vulnerability?
¨ After fuzzing, how do you know what changes to make to improve your fuzzer?
¨ When do you stop fuzzing an application?

SLIDE 102

Example: PDF

¨ Have a PDF file with 248,000 bytes
¤ There is one byte that, if changed to particular values, causes a crash
¤ This byte is 94% of the way through the file

¨ Any single random mutation to the file has a probability of .00000392 of finding the crash
¨ On average, need 127,512 test cases to find it
¨ At 2 seconds a test case, that’s just under 3 days
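The running-time figure can be checked with a few lines of arithmetic. The slide does not say how the probability or the 127,512 figure were derived, so the sketch below takes both as given; the comparison against 1/p assumes independent trials (a geometric model), which is an assumption added here and not a claim from the slide.

```python
p = 0.00000392            # per-test probability of hitting the crash (from the slide)
tests = 127_512           # average number of test cases (from the slide)
seconds_per_test = 2

total_seconds = tests * seconds_per_test
days = total_seconds / 86_400            # 86,400 seconds per day
mean_if_geometric = 1 / p                # expected trials under independent sampling

print(f"{total_seconds} s = {days:.2f} days")
print(f"1/p = {mean_if_geometric:.0f} test cases")
```

The time estimate agrees with "just under 3 days". Under the independent-trials assumption, 1/p is roughly twice the slide's test count, so the slide's figure appears to rest on a different counting model.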

SLIDE 103

Example: 3g2 Video Files

¨ Changing a byte in the file to 0xff crashes QuickTime Player 42% of the time
¨ All these crashes seem to be from the same bug
¨ There may be other bugs “hidden” by this bug

SLIDE 104

Types of Code Coverage

¨ Line/block coverage
¤ Measures how many lines of source code have been executed.

¨ Branch coverage
¤ Measures how many branches in code have been taken (conditional jmps)

¨ Path coverage
¤ Measures how many paths have been taken

SLIDE 105

Path Coverage Issues

¨ In general, a program with n “reachable” branches will require 2n test cases for branch coverage and 2^n test cases for path coverage!
¨ If you consider loops, there are an infinite number of paths
¨ Some paths are infeasible
¨ You can’t satisfy both of these conditionals, so at most three of the four paths through this code are possible (and for the snippet exactly as printed, where the second test is the negation of the first, only two are)

if(x>=0){ x = 1; }
if(x < 0) { x = -1; }
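Which of the four syntactic paths are feasible can be checked by brute force over sample inputs. The sketch below transcribes the two conditionals exactly as printed and records which branch combinations actually occur; the function name and the input range are arbitrary choices made for illustration.

```python
def paths_taken(values):
    """Return the set of (first branch taken, second branch taken) pairs observed."""
    seen = set()
    for x in values:
        first = x >= 0
        if first:
            x = 1
        second = x < 0
        if second:
            x = -1
        seen.add((first, second))
    return seen


observed = paths_taken(range(-1000, 1001))
print(sorted(observed))
print((True, True) in observed)   # both conditionals taken on one run?
```

For this snippet the run finds two feasible paths: taking both branches is impossible, and so is taking neither, because the second test is the negation of the first whenever the first branch is skipped.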

SLIDE 106

0days Are a Hacker Obsession

¨ An 0day is a vulnerability that’s not publicly known
¤ Modern 0days often combine multiple attack vectors & vulnerabilities into one exploit
¤ Many of these used only once on high value targets

¨ 0day statistics
¤ Often open for months, sometimes years

SLIDE 107

How to Find a 0day?

¨ Step #1: obtain information
¤ Hardware, software information
¤ Sometimes the hardest step

¨ Step #2: bug finding
¤ Manual audit
¤ (semi)automated techniques/tools
▪ Fuzz testing (focus of this lecture)