DART: Directed Automated Random Testing
PLDI 2005. Patrice Godefroid¹, Nils Klarlund¹, Koushik Sen²
¹ Bell Laboratories, Lucent Technologies; ² University of Illinois at Urbana-Champaign
November 10, 2015 Presented by Markus
Introduction
◮ Testing makes up 50% of software development cost
◮ Failures cost $60 billion per year in the USA alone
◮ Software testing is important
◮ Software testing is...
◮ Hard
◮ Boring
◮ Tedious
◮ Automated techniques
[Code listing, lines 1-9; content not preserved]
◮ Automated random testing
◮ Hard to guess
◮ Directed random testing
◮ Specify reachability as
◮ Input One:
◮ Second branch not
◮ Path constraint:
◮ Direct tester to new paths
◮ Alter path constraint &
◮ New constraint:
◮ x = 10 ∧ y = 1000
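A minimal sketch of the steps on this slide, using the introductory example from the DART paper (the function h, which aborts when x != y and 2*x == x + 10). Whether this is the listing originally shown here is an assumption, as are the first input (x = 0, y = 1000), the helper names `run`, `flip_last`, and `solve`, and the brute-force search that stands in for a real constraint solver.

```python
def f(x):
    return 2 * x


# Branch predicates of h, written as functions of the two inputs.
def c1(x, y):
    return x != y


def c2(x, y):
    return f(x) == x + 10


def h(x, y, path):
    """Instrumented h: records each branch predicate with its outcome."""
    taken = c1(x, y)
    path.append((c1, taken))
    if taken:
        taken = c2(x, y)
        path.append((c2, taken))
        if taken:
            raise RuntimeError("abort() reached")
    return 0


def run(x, y):
    """Execute h concretely and return (status, path constraint)."""
    path = []
    try:
        h(x, y, path)
        return "ok", path
    except RuntimeError:
        return "error", path


def flip_last(path):
    """Keep the prefix and negate the outcome of the last branch."""
    pred, outcome = path[-1]
    return path[:-1] + [(pred, not outcome)]


def solve(path, y, candidates=range(-100, 101)):
    """Brute-force stand-in for a solver: keep y fixed, search for x."""
    for x in candidates:
        if all(pred(x, y) == outcome for pred, outcome in path):
            return x, y
    return None


status, path = run(0, 1000)                # first branch taken, second not
new_inputs = solve(flip_last(path), 1000)  # satisfy c1 and c2 together
```

With these assumed inputs the flipped constraint is solved by x = 10 while y keeps its old value of 1000, and running h on the new inputs reaches the abort.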
◮ Combines random testing with directed testing
◮ Run the function on random inputs
◮ Gather path constraints along the execution trace
◮ Use a constraint solver to find new inputs
◮ Static method to identify interfaces
◮ Fully automated
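The loop described by these bullets can be sketched as follows. This is an illustration under stated assumptions, not DART's implementation: `program` is a hypothetical function under test, `solve` is a brute-force search over a small domain standing in for a real constraint solver, and the interface (two integer inputs) is supplied by hand rather than extracted statically.

```python
import random


def program(x, y, path):
    """Hypothetical function under test; every branch outcome is recorded."""
    def branch(pred):
        outcome = pred(x, y)
        path.append((pred, outcome))
        return outcome

    if branch(lambda a, b: a > 5):
        if branch(lambda a, b: b == a + 3):
            raise AssertionError("bug reached")


def solve(constraints, domain):
    """Brute-force stand-in for a constraint solver over a small domain."""
    for x in domain:
        for y in domain:
            if all(pred(x, y) == want for pred, want in constraints):
                return x, y
    return None  # unsatisfiable within the domain


def dart(domain=range(-20, 21), seed=0):
    rng = random.Random(seed)
    worklist = [(rng.choice(domain), rng.choice(domain))]  # random first input
    covered = set()  # branch-outcome prefixes already executed or scheduled
    runs = 0
    while worklist:
        x, y = worklist.pop()
        path = []
        runs += 1
        try:
            program(x, y, path)
        except AssertionError:
            return (x, y), runs  # concrete failing input, so no false alarm
        outcomes = tuple(outcome for _, outcome in path)
        for i in range(1, len(outcomes) + 1):
            covered.add(outcomes[:i])
        # Negate one predicate at a time, keeping the prefix before it.
        for i, (pred, outcome) in enumerate(path):
            flipped = outcomes[:i] + (not outcome,)
            if flipped in covered:
                continue
            covered.add(flipped)
            new_inputs = solve(path[:i] + [(pred, not outcome)], domain)
            if new_inputs is not None:
                worklist.append(new_inputs)
    return None, runs  # every feasible path explored without hitting the bug


bug_input, runs = dart()
```

Whatever the first random input is, this toy program has only three paths, so the directed search reaches the failing branch within three runs. Pure random testing over the same 41 × 41 domain would hit it on roughly 1% of draws.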
[Code listing, lines 1-11; content not preserved]
◮ Concrete input:
◮ z = 20 → x = z
◮ Initially:
◮ After line 2:
◮ After line 3:
◮ Path constraint: ¬c1
◮ After line 3:
◮ Old constraint: ¬c1
◮ New constraint: c1
◮ Logic formula:
◮ Satisfying assignment:
◮ Concrete input:
◮ c1 = (x == z) = 1
◮ t2 = (x + 10) = 10
◮ After line 6:
◮ Path constraint: c1 ∧ ¬c2
◮ New constraint: c1 ∧ c2
◮ Logic formula:
◮ Unsatisfiable! (The error is
◮ Transfer functions
◮ Function from symbolic equation to symbolic equation
◮ S → S
◮ Evaluate: z = x
◮ λS. S[z := x]
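A transfer function of this kind can be sketched as a closure over a symbolic state. The dictionary representation of S, the name `assign`, and the reading of S[z := x] as "z receives the symbolic value currently held by x" are assumptions made for illustration.

```python
def assign(target, source):
    """Transfer function for `target = source`: returns a function S -> S."""
    def transfer(S):
        updated = dict(S)            # leave the incoming state untouched
        updated[target] = S[source]  # target now holds source's symbolic value
        return updated
    return transfer


# Symbolic state: program variable -> symbolic expression (here, input names).
S0 = {"x": "x0", "z": "z0"}
S1 = assign("z", "x")(S0)  # evaluate the statement z = x
```

After the call, `S1` maps both `x` and `z` to the symbolic input `x0`, and `S0` is unchanged, so states from earlier program points stay available.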
◮ Program executions may be infinite
◮ Cannot have infinitely long formulas
◮ Solution: bound the depth of the search
◮ Under-approximated analysis
◮ Bug hunting
◮ Not proof generation
◮ No false alarms:
◮ Detected bugs are guaranteed to exist in the actual
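A toy illustration of why a depth bound makes the search finite. The loop program and the names `loop_path` and `explore` are hypothetical, and incrementing `n` stands in for "negate the last predicate and solve".

```python
def loop_path(n):
    """Branch outcomes of `while n > 0: n -= 1` for a concrete n >= 0."""
    outcomes = []
    while n > 0:
        outcomes.append(True)   # loop condition held
        n -= 1
    outcomes.append(False)      # loop exit
    return outcomes


def explore(max_depth):
    """Enumerate the distinct paths with at most max_depth branch decisions."""
    explored = []
    n = 0
    while True:
        path = loop_path(n)
        if len(path) > max_depth:
            break  # deeper paths are skipped: the search under-approximates
        explored.append((n, path))
        n += 1     # smallest input that takes the loop one more time
    return explored


paths = explore(3)
```

Without the bound, every value of `n` yields a new, longer path and the enumeration never ends. With `max_depth = 3` only the paths for n = 0, 1, 2 are explored, so a bug that needs more iterations would be missed, but any bug that is found comes with a concrete input.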
◮ Pentium III 800 MHz Processor
◮ lp_solve solver
◮ CIL parser
◮ Three programs:
[Code listing, lines 1-17; content not preserved]
◮ Random testing
◮ 2^32 × 2^32 = 2^64
◮ One leads to the
◮ Never finds the bug
◮ DART: less than
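The arithmetic behind "never finds the bug" can be checked directly. The sketch assumes two independent 32-bit inputs with exactly one failing combination (reading the truncated bullet that way), and the throughput of one million tests per second is an illustrative figure, not a measurement from the paper.

```python
from fractions import Fraction

INT_BITS = 32
values_per_input = 2 ** INT_BITS
total_inputs = values_per_input * values_per_input  # two independent inputs

# Assumption: exactly one input combination reaches the error.
hit_probability = Fraction(1, total_inputs)

# With uniform random draws, the expected number of tests until the first
# hit is 1 / hit_probability, i.e. 2^64.
expected_tests = 1 / hit_probability

# Assumed throughput, purely for illustration.
TESTS_PER_SECOND = 1_000_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
expected_years = expected_tests / TESTS_PER_SECOND / SECONDS_PER_YEAR
```

Under these assumptions the expected waiting time is on the order of several hundred thousand years, which is why random testing is treated as never finding this bug.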
◮ Protocol for two users to authenticate each other
◮ Contains impersonation bug
◮ C implementation (400 LOC)
◮ Used “reasonable” environment constraints
◮ DART: 18 minutes to find error
◮ Re-ran on “fixed” version: found another bug
◮ 22 minutes
◮ oSIP: Telephone over IP library
◮ Tested external functions
◮ Found many functions not checking NULL pointers
◮ Found denial of service in parser
◮ Request too large a stack frame
◮ Return of alloca not checked
◮ “Bugs” fixed by developers
◮ Intuition: specifications make this technique much better
◮ How to handle concurrent programs?
◮ Branches and thread schedules?
◮ Assertion Guided Symbolic Execution of Multithreaded
◮ How to handle unbounded programs?
◮ How scalable is this approach?
◮ Function-test generation
◮ Fully automated
◮ Faster than random testing