Unit Testing Tool Competition Round Four
Urko Rueda, René Just, Juan P. Galeotti, Tanja E. J. Vos
The 9th International Workshop on Search-Based Software Testing
Contents
1. About the Tool competition
2. The Tools
3. The Methodology
4. The Results
5. Lessons learned
4th Java unit testing competition
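Unit Testing Tool Competition

| Year | Round / venue | FITTEST (crest.cs.ucl.ac.uk/fittest) | Coverage metrics | Mutation metrics | CUTs / Projects / Tools | Tools (SBST & non-SBST) |
|------|---------------|--------------------------------------|------------------|------------------|-------------------------|-------------------------|
| 2012 | ICST'13 | ✓ | Cobertura | Javalanche | 77 / 5 / 2 | Manual & Randoop (baselines) |
| 2013 | Round Two, FITTEST'13 | ✓ | JaCoCo | PITest | 63 / 9 / 4 | 1st + T3 & Evosuite |
| 2014 | Round Three, SBST'15 | ✗ | JaCoCo | PITest | 63 / 9 / 8 | 2nd + Commercial & GRT & jTexPert & Mosa (Evosuite) |
| 2015 | Round Four, SBST'16 | ✗ | Defects4J (github.com/rjust/defects4j) | Defects4J, plus real fault finding metric | 68 / 5 / 4 | Randoop (baseline) & T3 & Evosuite & jTexPert |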
Benchmarked Java unit testing at the class level
About the Tool competition
About the Tool competition
§ Why?
  § Towards testing field maturity – this is just Java …
  § Tools improvements, future developments insight
§ What is new in the 4th edition?
  § Benchmark infrastructure – split into
    § Test generation
    § Test execution & Test assessment (Defects4J)
  § Benchmark subjects (from Defects4J dataset)
  § Time budgets (1, 2, 4 & 8 minutes)
  § Flaky tests (non-compilable, non-reliable pass)
The Tools
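| Tool | Technique | Static analysis | 2012 | 2013 | 2014 | 2015 |
|------|-----------|-----------------|------|------|------|------|
| Randoop (baseline) | Random | ✗ | ✓ | ✓ | ✓ | ✓ |
| T3 | Random | ✗ | ✗ | ✓ | ✓ | ✓ |
| jTexPert | Random (guided) | ✓ | ✗ | ✗ | ✓ | ✓ |
| Evosuite | Evolutionary algorithm | ✓ | ✗ | ✓ | ✓ | ✓ |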
§ SBST and non-SBST tools
§ Command line tools
§ Fully automated – no human intervention
The Methodology
§ Tool deployment
  § Installation – Linux environment
  § Wrapper implementation – runtool script
    § Std. IN/OUT communication protocol
    § 4th edition has a time budget
  § Tune-up cycle – setup, run, resolve issues
§ Benchmark infrastructure
  § Defects4J integration
  § Decoupling test generation from test execution/assessment
§ Tool – run over non-contest benchmark samples
The Methodology
[Figure: sequence diagram of the protocol between the benchmark framework and the run tool for Tool T]
Preparation: "BENCHMARK"; Src Path / Bin Path / ClassPath; ClassPath for JUnit Compilation; "READY"
Loop (time-budget): name of CUT; generate file in ./temp/testcases; "READY"; compile + execute + measure test case
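The exchange between the benchmark framework and the run tool can be sketched as a small wrapper loop. This is a minimal sketch, not the competition's actual runtool: the one-message-per-line framing, the message order, the count-prefixed ClassPath list, the budget being sent in seconds just before each CUT name, and the `generate_tests` helper are all assumptions made for illustration. Only the "BENCHMARK" and "READY" tokens, the time budget, the CUT name and the ./temp/testcases directory come from the slides, and the "ClassPath for JUnit Compilation" message is left out.

```python
import io
from pathlib import Path


def generate_tests(cut_name, budget_seconds, out_dir):
    """Hypothetical stand-in for a real generator: writes one placeholder test file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    test_file = out_dir / f"{cut_name.replace('.', '_')}Test.java"
    test_file.write_text(f"// tests for {cut_name}, budget {budget_seconds}s\n")
    return test_file


def run_tool(inp, out, out_dir=Path("temp/testcases")):
    """Serve one benchmark session over line-based text streams (assumed framing)."""
    def read():
        return inp.readline().strip()

    def reply(message):
        out.write(message + "\n")
        out.flush()  # the peer blocks on our answer, so never leave it buffered

    # Preparation phase: handshake, then project paths (order is an assumption).
    if read() != "BENCHMARK":
        raise ValueError("expected BENCHMARK")
    src_path, bin_path = read(), read()               # unused by this placeholder
    classpath = [read() for _ in range(int(read()))]  # assumed count-prefixed list
    reply("READY")

    # Loop phase: one (time budget, CUT name) request at a time, until end of input.
    generated = []
    while True:
        budget = read()
        if not budget:
            break
        cut_name = read()
        generated.append(generate_tests(cut_name, int(budget), out_dir))
        reply("READY")
    return generated
```

In a real wrapper the same function would be called as `run_tool(sys.stdin, sys.stdout)`, with `generate_tests` replaced by a call to the tool under test that stops when the budget expires.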
The Methodology
§ Benchmark infrastructure
  § Two HP Z820 workstations – each:
    § 2 CPU sockets for a total of 20 cores
    § 256 GB RAM
  § 32 virtual machines (16 per workstation)
§ Test generation
  § 1 core – control tool multi-threading capability
  § 8 GB RAM
§ Test execution/assessment (tool independent)
  § 2 cores
  § 16 GB RAM – resolves out of memory issues
The Methodology
[Figure: benchmark tool replicated across 32 VMs, with one runtool each for T3, jTexpert, EvoSuite and Randoop; 80 CUTs; time budgets of 1, 2, 4 and 8 min. Each of the two HP Z820 workstations (20-core CPU, 256 GB RAM) hosts 16 VMs, configured with a 1-core CPU and 8 GB RAM to generate test cases and a 2-core CPU and 16 GB RAM to collect metrics. RUNs 1, 2, 3 and RUNs 4, 5, 6 each generate test cases and collect metrics; an aggregator then calculates the score.]
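The size of this experiment follows from multiplying runs, tools, time budgets and CUTs. The sketch below enumerates that matrix and spreads the jobs over the 32 VMs. The round-robin assignment is an assumption for illustration only (the slides do not say how jobs were scheduled), and the number of CUTs is left as a parameter.

```python
from itertools import product

TOOLS = ["Randoop", "T3", "jTexPert", "EvoSuite"]
BUDGETS_MIN = [1, 2, 4, 8]   # time budgets in minutes
RUNS = range(1, 7)           # runs 1..6
NUM_VMS = 32                 # 16 VMs on each of the two workstations


def plan_jobs(cuts):
    """List every (run, tool, budget, CUT) job with a VM index (assumed round-robin)."""
    jobs = product(RUNS, TOOLS, BUDGETS_MIN, cuts)
    return [(index % NUM_VMS, job) for index, job in enumerate(jobs)]


def total_generation_minutes(num_cuts):
    """Sum of all generation time budgets, ignoring test execution and assessment."""
    return len(RUNS) * len(TOOLS) * sum(BUDGETS_MIN) * num_cuts
```

Each CUT contributes 6 runs × 4 tools × 15 budget minutes, so the generation budget alone is 360 minutes per CUT before any test execution or assessment time is counted.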
[Figure: test generation and assessment pipeline. For each time budget (1, 2, 4, 8 min), the benchmark tool calls the runtool of Randoop, T3, EvoSuite and jTexpert to generate test classes from the CUT (fixed). If the test classes are compilable (Y/N), they are run to detect and remove flaky tests. The remaining test classes (no flaky tests) are run to collect metrics and calculate the score. CUT versions shown: CUT (fixed), CUT (1 real fault), CUT (mutated).]
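The flaky-test filter and the real-fault check in this pipeline can be sketched as follows. This is a minimal sketch under stated assumptions, not the competition's implementation: `run_test` is a hypothetical callback that returns True when a test passes against the named CUT version, the version labels "fixed" and "faulty" are illustrative, and the three repetitions are an arbitrary choice. The score formula itself is not given in the slides and is not modelled here.

```python
def remove_flaky(tests, run_test, repetitions=3):
    """Keep only the tests that pass on every repetition against the fixed CUT."""
    return [
        test for test in tests
        if all(run_test(test, "fixed") for _ in range(repetitions))
    ]


def detects_fault(tests, run_test):
    """A suite reveals the real fault if at least one kept test fails on the faulty CUT."""
    return any(not run_test(test, "faulty") for test in tests)
```

Filtering first matters: a test that fails intermittently on the fixed CUT would otherwise be counted as revealing the fault when it happens to fail on the faulty version.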