AUTOMATED UNIT TEST GENERATION DURING SOFTWARE DEVELOPMENT
José Miguel Rojas j.rojas@sheffield.ac.uk Joint work with Gordon Fraser and Andrea Arcuri
A Controlled Experiment and Think-aloud Observations ISSTA 2015
“Testing is a widespread validation approach in industry, but it is still largely ad hoc, expensive, and unpredictably effective.”
“Software Testing Research: Achievements, Challenges, Dreams,”
“Test case generation has a strong impact on the effectiveness and efficiency of testing.”
“…one of the most active research topics in software testing for several decades, resulting in many different approaches and tools.”
“An orchestrated survey of methodologies for automated software test case generation,” S. Anand, E. K. Burke, T.
Harman, M.J. Harrold, P. McMinn. J. Systems and Software. Elsevier. 2013.
“Does automated white-box test generation really help software testers?,” G. Fraser, M. Staats, P. McMinn, A. Arcuri and F. Padberg
ARE UNIT TEST GENERATION TOOLS HELPFUL TO DEVELOPERS WHILE THEY ARE CODING?
[Diagram: controlled experiment setup. Class Template; 1 hour; Implementation and Test Suite; Golden Implementation and Test Suite. Slide labels: 41, 4, 2; Manual.]
[Bar chart: branch coverage (0% to 100%) for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap, Assisted vs. Manual; participants’ test suites run on their own implementations. Data labels as extracted: 50%, 26%, 57%, 39%, 41%, 83%, 38%, 63%.]
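As a reminder of what the metric in these charts measures, the following is a minimal, hand-instrumented sketch of branch coverage: the fraction of branch outcomes that a set of test inputs exercises. The function `classify`, the branch labels, and `branch_coverage` are hypothetical names made up for illustration; they are not part of the study or of EvoSuite, and real coverage tools instrument code automatically rather than by hand.

```python
# Toy illustration of branch coverage; all names here are hypothetical.

ALL_BRANCHES = {"neg:true", "neg:false", "zero:true", "zero:false"}


def classify(n, hits):
    """Function under test with two decisions, i.e. four branch outcomes.

    Every branch outcome that executes records its label in `hits`.
    """
    if n < 0:
        hits.add("neg:true")
        return "negative"
    hits.add("neg:false")
    if n == 0:
        hits.add("zero:true")
        return "zero"
    hits.add("zero:false")
    return "positive"


def branch_coverage(inputs):
    """Run classify on each input and return covered / total branch outcomes."""
    hits = set()
    for n in inputs:
        classify(n, hits)
    return len(hits & ALL_BRANCHES) / len(ALL_BRANCHES)


if __name__ == "__main__":
    # A single positive input never takes the "true" side of either decision.
    print(branch_coverage([5]))
    # One input per behaviour exercises every branch outcome.
    print(branch_coverage([-1, 0, 5]))
```

A test suite that only ever passes positive numbers leaves half of the branch outcomes unexercised, which is the kind of gap a coverage report makes visible.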
[Bar chart: times coverage was checked (axis 2 to 10) for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap, Assisted vs. Manual. Data labels as extracted: 6.4, 4, 9.6, 6, 5.3, 5.9, 1.9, 9.]
[Line chart, ListPopulation: branch coverage (0% to 100%) over time (10 to 60 min) for Manual, EvoSuite, and Assisted; participants’ test suites run on their own implementations.]
[Bar chart: branch coverage (0% to 100%) for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap, Assisted vs. Manual; participants’ test suites run on golden implementations. Data labels as extracted: 50%, 28%, 21%, 30%, 42%, 37%, 35%, 41%.]
[Line charts, one panel each for FilterIterator, ListPopulation, FixedOrderComparator, and PredicatedMap: branch coverage (0% to 100%) over time (10 to 60 min) for Assisted, Manual, and EvoSuite-generated; participants’ test suites run on golden implementations, over time.]
Coverage can be higher when using EvoSuite, depending on how the generated tests are used.
[Bar chart: number of test runs (axis 3 to 14) for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap, Assisted vs. Manual. Data labels as extracted: 5.6, 11, 12.9, 13.7, 4.4, 8.2, 7.2, 7.8.]
[Bar chart: minutes spent on testing (axis 5 to 26) for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap, Assisted vs. Manual. Data labels as extracted: 14.3, 25, 15.8, 20, 7.7, 12.6, 9.3, 18.5.]
Using EvoSuite reduces the time spent on testing.
[Bar chart: number of failures plus errors (axis 3 to 16) when golden test suites are run on participants’ implementations, for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap, Assisted vs. Manual. Data labels as extracted: 14.3, 5.3, 4.3, 6.3, 15.6, 6.4, 4.2, 6.1.]
Using EvoSuite during development did not lead to better implementations.
[Bar chart: time spent with EvoSuite, correlation with number of failures plus errors (axis 0.00 to 0.40), for FilterIterator, FixedOrderComparator, ListPopulation, and PredicatedMap. Series: Number of runs, Time spent on tests. Data labels as extracted: 0.02, 0.35, 0.32, 0.35.]
Implementation quality improves the more time developers spend with EvoSuite-generated tests.
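For readers unfamiliar with how such a correlation is obtained, here is a minimal sketch. The slides do not state which correlation coefficient was used; this example assumes Pearson's r. The function name `pearson` and the sample numbers are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical values, NOT from the experiment: minutes each developer
    # spent with generated tests, and failures plus errors that a reference
    # test suite found in their implementation.
    minutes_with_tool = [5, 10, 20, 30]
    failures_plus_errors = [9, 7, 6, 2]
    print(pearson(minutes_with_tool, failures_plus_errors))
```

The result lies between -1 and 1; values near 0 indicate little linear relationship between the two measurements.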
Information Technology, vol. 22, no. 2, pp. 127–140, 2003.
[Diagram: Observer and Subject.]
[Diagram: think-aloud study setup. Class Template; 2 hours; Implementation and Test Suite; Golden Implementation and Test Suite. Slide labels: 5, 4, 1.]
Images: http://jozef89.deviantart.com/
[Figure: think-aloud results for FixedOrderComparator, ListPopulation, FilterIterator, PredicatedMap, and PredicatedMap. Data labels as extracted: 3, 6, 2, 2, 5, 15, 20, 23, 7, 10; 94% | 92%, 97% | 72%, 94% | 95%, 100% | 94%, 100% | 83%.]
should be adaptable to them
best use automated test generation tools!
“Coverage is easy to assess because it is a number, while readability is a very non-tangible property… What is readable to me may not be readable to you. It is readable to me just because I spent the last hour and a half doing this.”
—Participant 5
j.rojas@sheffield.ac.uk