CO447 | LEC7 ROBUST APPLICATION CODE THROUGH ANALYSIS
- Dr. Benjamin Livshits
2
¨ The DAO was a complex smart contract with a focus on fair, decentralized operations.
¨ To allow investors to leave the organization in case of a disagreement, The DAO was created with an exit or ‘split function’. This function allowed users to reverse their involvement and have the Ether they had sent to The DAO returned.
¨ If someone wanted to leave The DAO, they would create their own Child DAO, wait 28 days, and then approve their proposal to send Ether to another address.
https://coincodex.com/article/50/the-dao-hack-what-happened-and-what-followed/
3
¨ On June 18, it was noticed that funds were leaving The DAO and the Ether balance of the smart contract was being drained. Around 3.6M Ether, worth approximately $70M, was drained by a hacker in a few hours.
¨ The hacker was able to get the DAO smart contract to return Ether multiple times before it could update its own balance.
¤ Two main flaws allowed this to take place. First, the smart contract sent the Ether and only then updated the internal token balance.
¤ Second, The DAO coders had failed to consider the possibility of a recursive call that could act in such a way.
¨ The hack resulted in the proposal of a soft fork that would stop the stolen funds from being spent; however, this never took place after a bug was discovered within the implementation protocol.
4
¨ A hard fork was proposed that would return all the Ether stolen from The DAO in the form of a refund smart contract. The new contract could only withdraw, and investors in The DAO could make refund requests for lost Ether.
¨ While it makes perfect sense to seek to reimburse the victims of the attack, the hard fork uncovered a number of arguments that are still prevalent in the world of cryptocurrency today.
¤ Some opposed the hard fork and argued that the original statement of The DAO terms and conditions could never be changed.
¤ They also felt that the blockchain should be free from censorship and that things that take place on the blockchain shouldn’t be changed even in the event of negative implications.
¤ Opponents of these arguments felt that the hacker could not be allowed to profit from his actions and that returning the funds would keep blockchain projects free from regulation and litigation.
¨ The hard fork also made sense as it only returned funds to the original investors and would also help to stabilize the price of Ether.
5
¨ The final decision was voted on and approved by Ether holders, with 89% voting for the hard fork; as a result, it took place on July 20, 2016, at block 1920000.
¨ The immediate result of this was the creation of Ethereum Classic (ETC), which shares all the data on the Ethereum blockchain up until block 1920000.
¨ The creation of Ethereum Classic showed that hard forks were very much possible, and it can be said that the creation of the second Ethereum currency has had an influence on the creators of subsequent Bitcoin (BTC) forks.
¨ It also became clear that while The DAO was a great idea, it was not implemented correctly, and that in order to move forward successfully blockchain projects would have to implement rigid security protocols.
6
¨ The attacker was analyzing DAO.sol, and noticed that the 'splitDAO' function was vulnerable to the recursive send pattern we've described above: this function updates user balances and totals at the end, so if we can get any of the function calls before this happens to call splitDAO again, we get the infinite recursion that can be used to move as many funds as we want (code comments are marked with XXXXX, you may have to scroll to see em):
http://hackingdistributed.com/2016/06/18/analysis-of-the-dao-exploit/
7
The basic idea is this: propose a split. Execute the split. When the DAO goes to withdraw your reward, call the function to execute a split before that withdrawal finishes.
8
¨ Basically the attacker is using this to transfer more tokens than they should be able to into their child DAO.
¨ How does the DAO decide how many tokens to move? Using the balances array of course:
¨ Because p.splitData[0] is going to be the same every time the attacker calls this function (it's a property of the proposal p, not the general state of the DAO), and because the attacker can call this function from withdrawRewardFor before the balances array is updated, the attacker can get this code to run arbitrarily many times using the described attack, with fundsToBeMoved coming out to the same value each time.
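The recursive-call pattern described above can be illustrated with a small Python simulation. This is only a sketch under stated assumptions: `Vault`, `withdraw`, `withdraw_safe` and `attack` are hypothetical names, not part of DAO.sol, and a Python callback stands in for the code that runs in the recipient when Ether is sent.

```python
class Vault:
    """Toy contract (hypothetical): holds funds and a per-user balance table."""

    def __init__(self, deposits):
        self.balances = dict(deposits)
        self.funds = sum(deposits.values())

    def withdraw(self, user, on_receive):
        amount = self.balances[user]      # same value on every re-entry
        if amount == 0 or self.funds < amount:
            return
        self.funds -= amount              # send the funds ...
        on_receive(amount)                # ... which hands control to the recipient
        self.balances[user] = 0           # balance is updated too late

    def withdraw_safe(self, user, on_receive):
        amount = self.balances[user]
        if amount == 0 or self.funds < amount:
            return
        self.balances[user] = 0           # update state before sending
        self.funds -= amount
        on_receive(amount)


def attack(withdraw_fn, depth=3):
    """Re-enter withdraw_fn from the receive callback up to `depth` times."""
    received = []

    def on_receive(amount):
        received.append(amount)
        if len(received) < depth:
            withdraw_fn("attacker", on_receive)   # call back in before the update

    withdraw_fn("attacker", on_receive)
    return sum(received)
```

With a deposit of 10 and a recursion depth of 3, the flawed `withdraw` pays out 30, while `withdraw_safe` pays out only the 10 actually owed, because the balance is already zero when the callback re-enters.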
9
¨ Our team is blessed to have Dr. Christian Reitwießner, Father of Solidity, as its Advisor. During the early development of the DAO Framework 1.1 and thanks to his guidance we were made aware of a generic vulnerability common to all Ethereum smart contracts. We promptly circumvented this so-called “recursive call vulnerability” or “race to empty” from the DAO Framework 1.1 as can be seen on line 580:
10
¨ Three days ago this design vulnerability potential was raised in a
blog post which subsequently led to the discovery of such an issue in an unrelated project, MakerDAO. This was highlighted in a reddit post, with MakerDAO being able to drain their own funds safely before the vulnerability could be exploited.
¨ Around 12 hours ago user Eththrowa on the DAOHub Forum
spotted that while we had identified the vulnerability in one aspect
account mechanism was affected. His message and our prompt confirmation can be found here.
¨ We issued a fix immediately as part of the DAO Framework 1.1
milestone.
https://blog.slock.it/no-dao-funds-at-risk-following-the-ethereum-smart-contract-recursive-call-bug-discovery-29f482d348b
12
¨ Instrument code for testing
¨ Heap memory: Purify
¨ Valgrind: http://valgrind.org
¨ Perl tainting (information flow)
¨ Java race condition checking
¨ Pros:
¤ Easy to reproduce the bug
¤ Relatively easy to implement
¨ Cons:
¤ Slows down the program significantly
¤ 10x-40x slowdowns
¤ Test only: cannot be used in production
¤ Not all paths executed
13
¨ Fuzzing and penetration testing
¨ Black-box web application security analysis
¨ Typically, tries to provide cleverly crafted unexpected inputs
¨ Also known as inputs of death
¨ Example:
¤ Peach fuzzer http://peachfuzzer.com/
¤ antifuzzer, Dfuz, SPIKE, GPF, etc.
¨ Pros:
¤ Easy to reproduce the bug
¤ Don’t need to understand the code
¤ Can be done by someone else
¨ Cons:
¤ Has no visibility into program logic
¤ Has low coverage
¤ Possibly lots of missed vulnerabilities
14
¨ Static code analysis tools
¤ Coverity
¤ Tools from Microsoft like Prefix and Prefast
¤ FindBugs (for Java)
¤ Fortify (for security)
15
¨ Pros:
¤ Near-perfect code coverage, exercises all paths
¤ Can be run incrementally as part of the development process
¨ Cons:
¤ Can be imprecise
¤ Can scale poorly
¤ Can produce results that are tough to interpret
¨ General discussion of static analysis tools
¤ Goals and limitations
¤ Approach based on abstract states
¨ More about one specific approach
¤ Property checkers from Engler et al., Coverity
¤ Sample security-related results
Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, J. Franklin, A. Aiken, Mitchell …
[Figure: software drawn as a graph from Entry through nodes 1, 2, 3, 4 to Exit; its behaviors are paths such as 1-2-4, 1-3-4 and 1-2-3. Manual testing covers only a small subset of behaviors.]
[Figure: Code and Spec are fed to a Program Analyzer, which can analyze large code bases, potentially reports many warnings, and may emit false alarms. Two entries in the sample report below are marked as false alarms.]
Report    Type            Line
1         mem leak        324
2         buffer oflow    4,353,245
3         sql injection   23,212
4         stack oflow     86,923
5         dang ptr        8,491
…         …               …
10,502    info leak       10,921
Soundness: If the program contains an error, the analysis will report a warning. (“Sound for reporting correctness”)
Completeness: If the analysis reports a warning, the program will contain an error. (“Complete for reporting correctness”)

           Complete                      Incomplete
Sound      Reports all errors            Reports all errors
           Reports no false alarms       May report false alarms
           Undecidable                   Decidable
Unsound    May not report all errors     May not report all errors
           Reports no false alarms       May report false alarms
           Decidable                     Decidable
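The two definitions can be restated as set inclusions between the real errors of a program and the warnings an analysis reports for it. The sketch below is illustrative only: `classify` is a hypothetical helper, and it judges a single run on a single program, whereas soundness and completeness are properties of an analysis over all programs.

```python
def classify(actual_errors, warnings):
    """Compare one program's real errors with the warnings reported for it."""
    sound = actual_errors <= warnings      # every real error is reported
    complete = warnings <= actual_errors   # every warning is a real error
    return sound, complete
```

For example, `classify({1, 2}, {1, 2, 3})` yields `(True, False)`: all errors are reported, but warning 3 is a false alarm.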
[Figure: a sound analysis computes an over-approximation of the behaviors of the software (its modules). A reported behavior that lies inside the over-approximation but outside the real behaviors is a false alarm; one that lies inside both is a reported error. If the approximation is too coarse, it yields too many false alarms.]
Example program (flowchart): entry; X ← 0; Is Y = 0? (yes: X ← X + 1, no: X ← X − 1); Is Y = 0?; Is X < 0?; exit or crash.
¨ The path to crash is infeasible, so the program will never crash.
¨ Tracking concrete values (X = 0, X = 1, X = 2, X = 3, …) leads to non-termination; therefore, we need to approximate.
Single statement X ← X + 1 (X = 0 before, X = 1 after): the dataflow elements $d_{in}$ and $d_{out}$ are related by the transfer function $f$ through the dataflow equation
$d_{out} = f(d_{in})$
Two statements in sequence (X ← X + 1, then Is Y = 0?):
$d_{out1} = f_1(d_{in1})$
$d_{out1} = d_{in2}$
$d_{out2} = f_2(d_{in2})$
Two branches $f_1$, $f_2$ merging before a third statement $f_3$:
$d_{out1} = f_1(d_{in1})$
$d_{out2} = f_2(d_{in2})$
$d_{join} = d_{out1} \sqcup d_{out2}$
$d_{join} = d_{in3}$
$d_{out3} = f_3(d_{in3})$
Here $\sqcup$ is the least upper bound operator. Example: union of possible values.
What is the space of dataflow elements, D? What is the least upper bound operator, ⊔?
[Figure: the same flowchart analyzed over the signs lattice: X = 0 after initialization, X = pos and X = neg on the two branches, and X = ⊤ after they merge (lost precision).]
¨ The analysis terminates, but reports a false alarm; therefore, we need more precision.
Lattices:
¤ Signs lattice: X = ⊤ above X = pos, X = 0, X = neg, above X = ⊥
¤ Boolean formula lattice: true above Y = 0, Y ≠ 0, above false
¤ Refined signs lattice: X = ⊤ above X ≠ neg, X ≠ pos, above X = pos, X = 0, X = neg, above X = ⊥
[Figure: the flowchart analyzed with the refinement, pairing each sign with a fact about Y, e.g. (X = 0, true), (X = pos, Y = 0), (X = neg, Y ≠ 0); no precision loss.]
¨ The analysis terminates with no false alarm: it soundly proves that the program never crashes.
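The difference between the plain signs lattice and the refinement can be sketched in Python. The sketch makes simplifying assumptions that are not taken from the slides: the program is treated as loop-free (X ← 0; if Y = 0 then X ← X + 1 else X ← X − 1; if Y = 0 and X < 0 then crash), and all names (`join`, `add_one`, `sub_one`, `signs_only`, `signs_with_y`) are hypothetical.

```python
TOP, POS, ZERO, NEG, BOT = "T", "pos", "0", "neg", "bot"


def join(a, b):
    """Least upper bound in the signs lattice."""
    if a == b:
        return a
    if a == BOT:
        return b
    if b == BOT:
        return a
    return TOP


def add_one(s):
    """Transfer function for X <- X + 1."""
    return {ZERO: POS, POS: POS, NEG: TOP, TOP: TOP, BOT: BOT}[s]


def sub_one(s):
    """Transfer function for X <- X - 1."""
    return {ZERO: NEG, NEG: NEG, POS: TOP, TOP: TOP, BOT: BOT}[s]


def may_be_negative(s):
    return s in (NEG, TOP)


def signs_only():
    """Merge both branches into one sign; True means a crash warning."""
    x = ZERO
    x = join(add_one(x), sub_one(x))   # pos joined with neg gives T
    return may_be_negative(x)


def signs_with_y():
    """Keep one sign per fact about Y; True means a crash warning."""
    state = {"Y=0": add_one(ZERO), "Y!=0": sub_one(ZERO)}
    # The crash test is only reached when Y = 0, so only that entry matters.
    return may_be_negative(state["Y=0"])
```

Under these assumptions `signs_only()` returns `True` (a false alarm, since the merge loses the sign of X), while `signs_with_y()` returns `False`, because X is known to be positive whenever Y = 0.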
Crash Causing Defects
Null pointer dereference
Use after free
Double free
Array indexing errors
Mismatched array new/delete
Potential stack overrun
Potential heap overrun
Return pointers to local variables
Logically inconsistent code
Uninitialized variables
Invalid use of negative values
Passing large parameters by value
Under-allocations of dynamic data
Memory leaks
File handle leaks
Network resource leaks
Unused values
Unhandled return codes
Use of invalid iterators
42 42
¨ Some coding patterns and
some vulnerabilities are specific to the code base
¨ Issues that apply to the Linux
kernel are unlikely to apply in application software
43 43
44
¨ Prototype for open() syscall:
int open(const char *path, int oflag, /* mode_t mode */...);
¨ Typical mistake:
fd = open("file", O_CREAT);
¨ Force setting explicit file permissions!
¨ Check: Look for oflag == O_CREAT without mode argument
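A very rough version of this check can be written as a text scan. The sketch below is not how a compiler-based checker works: it assumes each call sits on one line and that the arguments contain no commas or parentheses of their own, and `missing_mode` is a hypothetical name.

```python
import re

# Matches open(...) but not fopen(...); assumes no nested parentheses.
CALL = re.compile(r"\bopen\s*\(([^()]*)\)")


def missing_mode(source):
    """Return line numbers of open() calls that pass O_CREAT but no mode."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in CALL.finditer(line):
            args = [a.strip() for a in match.group(1).split(",")]
            if len(args) == 2 and "O_CREAT" in args[1]:
                hits.append(lineno)
    return hits
```

Run on the typical mistake shown above, the scan flags the line; a call that passes a third argument such as `0600` is not flagged.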
¨ Goal: confine process to a “jail” on the filesystem
¤ chroot() changes filesystem root for a process
¨ Problem
¤ chroot() itself does not change current working directory
¨ Check (state machine): chroot() must be followed by chdir("/"); error if open() is called before chdir()
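The state machine can be sketched as a scan over the calls made along one path. This is an illustration under assumptions: the path is given as a list of call names, any `chdir` is treated as `chdir("/")`, and `check_chroot` is a hypothetical name.

```python
def check_chroot(calls):
    """Return indexes of open() calls made after chroot() but before chdir()."""
    errors = []
    awaiting_chdir = False
    for index, name in enumerate(calls):
        if name == "chroot":
            awaiting_chdir = True       # jail entered, working directory unchanged
        elif name == "chdir":
            awaiting_chdir = False      # working directory is now inside the jail
        elif name == "open" and awaiting_chdir:
            errors.append(index)        # error state: open before chdir
    return errors
```

For the sequence `["chroot", "open", "chdir", "open"]` only the first `open` (index 1) is reported.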
Warn when unchecked integers from untrusted sources reach trusting sinks
¨ Untrusted sources (v becomes tainted): syscall param, network packet, copyin(&v, p, len)
¨ Check (v becomes clean): any <= v <= any
¨ Trusting sinks: array[v], while(i < v) …, memcpy(p, q, v), copyin(p,q,v), copyout(p,q,v)
¨ ERROR: Use(v) at a sink while v is still tainted
¨ Results: Linux: 125 errors, 24 false; BSD: 12 errors, 4 false
57 57
58 58
¨ d is read from the user
¨ Signed integer d.idx is upper-bound checked but not lower-bound checked
¨ d.used is unchecked, allowing 2GB of user data to be copied into the kernel
/* 2.4.5/drivers/char/drm/i810_dma.c */
if(copy_from_user(&d, arg, sizeof(arg)))
    return -EFAULT;
if(d.idx > dma->buf_count)
    return -EINVAL;
buf = dma->buflist[d.idx];
copy_from_user(buf_priv->virtual, d.address, d.used);
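The rule "unchecked integers from untrusted sources must not reach trusting sinks" can be sketched as a scan over events on one path. The event encoding and the name `find_tainted_uses` are assumptions made for this illustration; a value counts as clean only after both a lower-bound and an upper-bound check, matching the `any <= v <= any` pattern.

```python
def find_tainted_uses(events):
    """events: (kind, var) pairs; kind is 'source', 'lower', 'upper' or 'sink'.

    Returns the indexes of sink events that use a value which is tainted
    and has not been checked against both a lower and an upper bound.
    """
    checks = {}          # tainted variable -> set of bounds checked so far
    errors = []
    for index, (kind, var) in enumerate(events):
        if kind == "source":
            checks[var] = set()
        elif kind in ("lower", "upper") and var in checks:
            checks[var].add(kind)
        elif kind == "sink" and var in checks:
            if checks[var] != {"lower", "upper"}:
                errors.append(index)
    return errors
```

Modelling the driver excerpt as `source idx, source used, upper idx, sink idx, sink used` flags both sinks: `idx` lacks a lower-bound check and `used` is not checked at all.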
59 59
¨ msg points to arbitrary network data
¨ This can be used to overflow cmd and write data
/* 2.4.9/drivers/isdn/act2000/capi.c:actcapi_dispatch */
isdn_ctrl cmd;
...
while ((skb = skb_dequeue(&card->rcvq))) {
    msg = skb->data;
    ...
    memcpy(cmd.parm.setup.phone,
           msg->msg.connect_ind.addr.num,
           msg->msg.connect_ind.addr.len - 1);
¨ We would want to reason about the flow of the input (size) and name provided by the user
61
[Figure: call graph with nodes main, atoi, exit, free, malloc, printf, fgets, say_hello]
62
Represent logical structure of code in graph form
[Figure: control-flow graph]
char * buf[8];
if (a)            edge a: b = new char [5];      edge !a: skip
if (a && b)       edge a && b: buf[8] = a;       edge !(a && b): skip
delete [] b;
*b = 'x';
*a = *b;
END
64
Conceptually: analyze each path through the control graph separately.
Actually: perform some checking computation once per node; combine paths at merge nodes.
65
See how three checkers (null pointers, use after free, array overrun) are run for the path that takes the edges !a and !(a && b). Each checker has transitions and error states; running a checker combines the state at the previous point with the program actions.
¤ char * buf[8]; (array overrun: “buf is 8 bytes”)
¤ if (a), edge !a (null pointers: “a is null”)
¤ if (a && b), edge !(a && b) (already knew a was null)
¤ delete [] b; (use after free: “b is deleted”)
¤ *b = 'x'; (use after free: “b dereferenced!”)
¤ *a = *b; (no more errors reported for b)
¤ END
72
¨ What is a bug? Something the user will fix.
¨ Many sources of false positives
¤ False paths
¤ Idioms
¤ Execution environment assumptions
¤ Killpaths
¤ Conditional compilation
¤ “third party code”
¤ Analysis imprecision
¤ …
73
False path pruning on the path that takes the edges !a and a && b (reaching buf[8] = a), tracking integer range, disequality and branch facts:
¤ At edge !a: “a in [0,0]”, “a == 0 is true”
¤ At edge a && b: “a != 0”
¤ The facts contradict each other: the path is impossible
78 78
¨ Stanford research project ¨ Ken Ashcraft and Dawson Engler, Using
Programmer-Written Compiler Extensions to Catch Security Holes, IEEE Security and Privacy 2002
¨ Used modified compiler to find over 100 security
holes in Linux and BSD
79 79
Violation                Linux Bug   Linux Fixed   BSD Bug   BSD Fixed
Gain control of system   18          15            3         3
Corrupt memory           43          17            2         2
Read arbitrary memory    19          14            7         7
Denial of service        17          5             0         0
Minor                    28          1             0         0
Total                    125         52            12        12
81 81
¨Runtime Monitoring
¨Static Analysis
82 82
¨ A form of vulnerability analysis and testing
¨ Many slightly anomalous test cases are input
¨ Application is monitored for any sign of error
83
¨ Standard HTTP GET request
¤ GET /index.html HTTP/1.1
¨ Anomalous requests
¤ AAAAAA...AAAA /index.html HTTP/1.1
¤ GET ///////index.html HTTP/1.1
¤ GET %n%n%n%n%n%n.html HTTP/1.1
¤ GET /AAAAAAAAAAAAA.html HTTP/1.1
¤ GET /index.html HTTTTTTTTTTTTTP/1.1
¤ GET /index.html HTTP/1.1.1.1.1.1.1.1
¤ etc...
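Anomalous requests like the ones listed above can be derived mechanically from the standard request line. The generator below is a minimal sketch; `anomalous_requests` is a hypothetical name and the repetition counts (64, 7, 6, 13) are arbitrary choices for illustration.

```python
def anomalous_requests(method="GET", path="/index.html", version="HTTP/1.1"):
    """Yield request lines that each distort one part of a valid request."""
    yield f"{'A' * 64} {path} {version}"                        # oversized method
    yield f"{method} {'/' * 7}{path.lstrip('/')} {version}"     # repeated separators
    yield f"{method} /{'%n' * 6}.html {version}"                # format specifiers
    yield f"{method} /{'A' * 13}.html {version}"                # long file name
    yield f"{method} {path} {version.replace('TT', 'T' * 13)}"  # stretched protocol name
    yield f"{method} {path} {version}{'.1' * 6}"                # extra version fields
```

Each yielded line changes exactly one field, which makes it easier to tell which part of the parser a crash came from.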
84 84
Shortly after the iPhone was released, a group of security researchers at Independent Security Evaluators decided to investigate how hard it would be for a remote adversary to compromise the private information stored on the device
85
¨ Within two weeks of part time work, we had successfully
¤ discovered a vulnerability
¤ developed a toolchain for working with the iPhone's architecture
¤ created a proof-of-concept exploit capable […] the user's iPhone to a remote attacker
¨ Notified Apple of the vulnerability and proposed a patch.
¨ Apple subsequently resolved the issue.
¨ Released an advisory
86
87 87
88
¨ iPhone Safari downloads malicious web page
¤ Arbitrary code is run with administrative privileges
¤ Can read SMS log, address book, call history, etc.
¤ Can transmit collected data to attacker
¤ Can perform physical actions on the phone
▪ system sound and vibrate the phone for a second
▪ could dial phone numbers, send text messages, or record audio (as a bugging device)
89
¨ WebKit is open source
¤ “WebKit is an open source web browser engine. WebKit is also the name of the Mac OS X system framework version of the engine that's used by Safari, Dashboard, Mail, and many other OS X applications.”
¨ So we know what they use for code testing
¤ Use code coverage to see which portions of code are not well tested
¤ Tools such as gcov, icov, etc., measure test coverage
90 90
Identify potential focus
site: the JavaScriptCore Tests “If you are making changes to JavaScriptCore, there is an additional test suite you must run before landing changes. This is the Mozilla JavaScript test suite.”
91
¨ 59.3% of 13,622 lines in JavaScriptCore were covered
¤ 79.3% of main engine covered
¤ 54.7% of Perl Compatible Regular Expression (PCRE) covered
¨ Next step: focus on PCRE
¤ Wrote a PCRE fuzzer (20 lines of perl)
¤ Ran it on standalone PCRE parser (pcredemo from PCRE library)
¤ Started getting errors: PCRE compilation failed at offset 6: internal error: code overflow
¨ Evil regular expressions crash mobile Safari
92 92
93 93
¨ Little or no knowledge of the
structure of the inputs is assumed
¨ Input anomalies are added to
existing valid inputs
¨ Input anomalies may be
completely random or follow some heuristics
¨ Requires little to no set up time
¨ Success dependent on the inputs being modified
¨ May help to get to parts of the
code protected by complex conditionals
¨ May fail for protocols with
checksums, those which depend
¨ Examples:
¤ ZZUF, very successful at finding
bugs in many real-world programs, http://sam.zoy.org/zzuf/
¤ Taof, GPF, ProxyFuzz, FileFuzz,
Filep, etc.
94 94
95
¨ Google for .pdf (about 1 billion results)
¨ Crawl pages to build a corpus
¨ Use fuzzing tool (or script) to
¤ 1. Grab a file
¤ 2. Mutate that file
¤ 3. Feed it to the program
¤ 4. Record if it crashed (and input that crashed it)
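The four-step loop above can be sketched in a few lines of Python. This is a simplification under stated assumptions: the corpus is a list of byte strings already in memory, the target is an in-process Python callable standing in for the real program, and a raised exception stands in for a crash. `mutate` and `fuzz` are hypothetical names.

```python
import random


def mutate(data, rng, n_bytes=1):
    """Overwrite n_bytes randomly chosen positions with random byte values."""
    buf = bytearray(data)
    for _ in range(n_bytes):
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(corpus, target, iterations, seed=0):
    """Run the grab / mutate / feed / record loop and return the crashes."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        sample = rng.choice(corpus)        # 1. grab a file
        case = mutate(sample, rng)         # 2. mutate that file
        try:
            target(case)                   # 3. feed it to the program
        except Exception as exc:           # 4. record the crash and its input
            crashes.append((case, repr(exc)))
    return crashes
```

A real harness would launch the target as a separate process and watch for signals or debugger events rather than Python exceptions; fixing the seed keeps a run reproducible.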
96 96
97 97
http://www.youtube.com/watch?v=dHYLu3oYpnQ&feature=youtu.be
A TRUE FALSE 00 1
1.0
2
65536 268435455
2147483647 0xfffffff NULL null \0 \00 < script > < / script> %0a %00 +%00 \0 \0\0
98 98
dir%00| |dir |dir| |/bin/ls -al ?x= ?x=" ?x=| ?x=> /boot.ini ABCD|%8.8x|%8.8x|%8. 8x|%8.8x|%8.8x|%8.8x| %8.8x|%8.8x|%8.8x|%8. 8x| ../../boot.ini /../../../../../../../../%2A %25%5c..%25%5c..%25% 5c..%25%5c..%25%5c..% 25%5c..%25%5c..%25%5 c..%25%5c..%25%5c..%2 5%5c..%25%5c..% 25%5c..%2 5%5c..%00 %25%5c..%25%5c..%25% 5c..%25%5c..%25%5c..% 25%5c..%25%5c..%25%5 c..%25%5c..%25%5c..%2 5%5c..%25%5c..%
03C < < < < < < < < < < < < < < < < < < < < < \x3c
\x3C \u003c \u003C something%00html ' /' \' ^' @' {'} ['] *' #' ">xxx<P>yyy "><script>" <script>alert("XSS")</scri pt> uname -n -s whoami pwd last cat /etc/passwd ls -la /tmp ls -la /home ping -i 30 127.0.0.1 ping 127.0.0.1 ping -n 30
¨ Test cases are generated from
some description of the format: RFC, documentation, grammar, etc.
¨ Knowledge of format or
protocol should give better results than random fuzzing
¨ Can take significant time to
set up
99 99
s_string("POST /testme.php HTTP/1.1\r\n");
s_string("Host: testserver.example.com\r\n");
s_string("Content-Length: ");
s_blocksize_string("block1", 5);
s_string("\r\nConnection: close\r\n\r\n");
s_block_start("block1");
s_string("inputvar=");
s_string_variable("inputval");
s_block_end("block1");

POST /testme.php HTTP/1.1
Host: testserver.example.com
Content-Length: [size_of_data]
Connection: close

inputvar=[fuzz_string]
100 100
s_string_variable("string"); // inserts a fuzzed string into your "SPIKE". The string "string" will be used for the first iteration of this variable, as well as for any SPIKES where other s_string_variables are being iterated
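The point of the block-size mechanism is that the Content-Length header stays consistent with whatever fuzz string is substituted into the body. The Python sketch below mimics that behaviour for the request template shown above; it is not SPIKE's API, and `build_request` is a hypothetical name.

```python
def build_request(fuzz_string):
    """Fill the template and recompute Content-Length for the fuzzed body."""
    body = "inputvar=" + fuzz_string
    return (
        "POST /testme.php HTTP/1.1\r\n"
        "Host: testserver.example.com\r\n"
        f"Content-Length: {len(body.encode())}\r\n"
        "Connection: close\r\n\r\n" + body
    )
```

Because the length is derived from the generated body, the request stays well formed enough to reach the code that handles `inputvar`, instead of being rejected for a mismatched length.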
101
¨ Mutation based fuzzers can generate a huge number of test cases... When has the fuzzer run long enough?
¨ Generation based fuzzers generate lots of test cases,
are found?
¨ How do you monitor the target application such that
you know when something “bad” has happened?
102 102
¨ What happens when you find too many bugs?
¨ Or every anomalous test case triggers the same (boring) bug?
¨ Given a crash, how do you find the actual vulnerability?
¨ After fuzzing, how do you know what changes to make to improve your fuzzer?
¨ When do you stop fuzzing an application?
103 103
¨ Have a PDF file with 248,000 bytes
¤ There is one byte that, if changed to particular values, causes a crash
¤ This byte is 94% of the way through the file
¨ Any single random mutation to the file has a probability of .00000392 of finding the crash
¨ On average, need 127,512 test cases to find it
¨ At 2 seconds a test case, that’s just under 3 days
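The time estimate follows directly from the number of test cases and the cost of each one. A small check of that arithmetic, using only the figures given above (`campaign_days` is a hypothetical helper name):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def campaign_days(test_cases, seconds_per_case):
    """Wall-clock days needed to run the given number of test cases serially."""
    return test_cases * seconds_per_case / SECONDS_PER_DAY


days = campaign_days(127_512, 2)   # 255,024 seconds in total
```

The result is a little over 2.95 days, which matches "just under 3 days" for a single machine running the cases one after another.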
104 104
¨ Changing a byte in the file to 0xff crashes QuickTime Player 42% of the time
¨ All these crashes seem to be from the same bug
¨ There may be other bugs “hidden” by this bug
105
¨ Line/block coverage
¤ Measures how many lines of source code have been
executed.
¨ Branch coverage
¤ Measures how many branches in code have been taken
(conditional jmps)
¨ Path coverage
¤ Measures how many paths have been taken
¨ In general, a program with n “reachable” branches will require 2n test cases for branch coverage and 2^n test cases for path coverage!
¨ If you consider loops, there are an infinite number of paths
¨ Some paths are infeasible
if (x >= 0) { x = 1; }
if (x < 0) { x = -1; }
¨ You can’t satisfy both of these conditionals, so not all four combinations of branch outcomes are paths through this code (with the snippet as shown, exactly one of the two conditions holds, leaving two feasible paths)
106
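Feasible paths through the two-conditional snippet can be enumerated by running a transcription of it on sample inputs and recording which branches were taken. This is a sketch: `branch_outcomes` is a hypothetical name, and the input range is an arbitrary sample that covers negative, zero and positive values.

```python
def branch_outcomes(x):
    """Python transcription of the snippet; returns (first taken, second taken)."""
    first = x >= 0
    if first:
        x = 1
    second = x < 0
    if second:
        x = -1
    return first, second


# Collect the distinct branch combinations seen over a sample of inputs.
feasible = {branch_outcomes(x) for x in range(-3, 4)}
```

Only `(True, False)` and `(False, True)` show up: taking the first branch forces x to 1, and skipping it means x is already negative, so the other two combinations are infeasible and no test case can cover them.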
¨ An 0day is a vulnerability
that’s not publicly known
¤ Modern 0days often combine
multiple attack vectors & vulnerabilities into one exploit
¤ Many of these used only once on
high value targets
¨ 0day statistics
¤ Often open for months, sometimes years
107
108
¨ Step #1: obtain information
¤ Hardware, software information
¤ Sometimes the hardest step
¨ Step #2: bug finding
¤ Manual audit
¤ (semi)automated techniques/tools
▪ Fuzz testing (focus of this lecture)