
Test Factoring: Focusing test suites on the task at hand
David Saff, MIT
ASE 2005

The problem: large, general system tests
[Diagram: "My test suite" (one hour), annotated "Where I changed code" and "Where I broke code"]
How can I get:
• Quicker feedback?
• Less wasted time?
[Saff, Ernst, ISSRE 2003]

The problem: large, general system tests
[Diagram: "My test suite"]
• Test selection
• Test prioritization
• Test factoring

Test factoring
• Input: large, general system tests
• Output: small, focused unit tests
• Work with Shay Artzi, Jeff Perkins, and Michael D. Ernst

A factored test…
• exercises less code than system test
• should be faster if a system test is slow
• can eliminate dependence on expensive resources or human interaction
• isolates bugs in subsystems
• provides new opportunities for prioritization and selection

Test Factoring
• What?
  – Breaking up a system test
• How?
  – Automatically creating mock objects
• When?
  – Integrating test factoring into development
• What next?
  – Results, evaluation, and challenges

System Test
[Diagram: a system test with "Provided" and "Checked" values]
There’s more than one way to factor a test!
Basic strategy:
- Capture a subset of behavior beforehand.
- Replay that behavior at test time.

System Test
[Diagram: "Provided" and "Checked" values cross between Tested Code and Environment; each crossing is marked "capture"]
Tested Code: PayrollCalculator
• Fast
• Is changing
Environment: Database Server
• Expensive
• Not changing

Introduce Mock
[Diagram: "Provided" and "Checked" values at the boundary between Tested Code and Environment]
Introduce Mock [Saff, Ernst, PASTE 2004]:
• simulate part of the functionality of the original environment
• validate the unit’s interaction with the environment

How? Automating Introduce Mock
[Diagram: Tested Code and Environment. Classes: PayrollCalculator, Database, ResultSet. Calls shown: calculatePayroll(), addResultsTo(ResultSet), getResult(), addResult(String). Calls between Tested Code and Environment are marked "capture".]

Interfacing: separate type hierarchy from inheritance hierarchy
[Diagram: Tested Code and Environment. Each class is paired with an interface: IPayrollCalculator / PayrollCalculator, IDatabase / Database, IResultSet / ResultSet. Calls shown: calculatePayroll(), addResultsTo(IResultSet), getResult(), addResult(String).]

Capturing: insert recording decorators where capturing must happen
[Diagram: Tested Code and Environment. A "Capturing Database" decorator sits in front of Database and a "Callback ResultSet" decorator in front of ResultSet, both behind the IDatabase and IResultSet interfaces. The getResult() and addResult(String) calls passing through them are marked "capture".]
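The capturing step can be sketched as a recording decorator. This is a minimal illustration in Python, not the tool's implementation (which targets Java). `CapturingProxy`, the toy `Database`, the `calculate_payroll` function, and the transcript format of (method name, arguments, result) tuples are all assumptions made for the example; `get_result` stands in for the slide's getResult().

```python
class Database:
    """Toy stand-in for the expensive environment (hypothetical)."""

    def __init__(self, rows):
        self._rows = list(rows)

    def get_result(self):
        # Return the next row, or None once the rows are exhausted.
        return self._rows.pop(0) if self._rows else None


class CapturingProxy:
    """Recording decorator: forwards each call and appends it to a transcript."""

    def __init__(self, target, transcript):
        self._target = target
        self._transcript = transcript

    def __getattr__(self, name):
        # Only reached for names not defined on the proxy itself,
        # i.e. the methods of the wrapped object.
        method = getattr(self._target, name)

        def recorded(*args):
            result = method(*args)
            # One transcript entry per call that crosses the boundary.
            self._transcript.append((name, args, result))
            return result

        return recorded


def calculate_payroll(database):
    """Toy tested code: sums every result the database hands back."""
    total = 0
    row = database.get_result()
    while row is not None:
        total += row
        row = database.get_result()
    return total


transcript = []
total = calculate_payroll(CapturingProxy(Database([100, 250]), transcript))
```

The tested code is unaware of the decorator: it sees the same methods as on the real object, which is why the interfacing step comes first.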

Replay: simulate environment’s behavior
[Diagram: Tested Code and Environment. A "Replaying Database" takes the place of Database behind the IDatabase interface. Calls made by the tested code are marked "verified"; the values returned to it are marked "replayed". Calls shown: calculatePayroll(), addResultsTo(IResultSet), getResult(), addResult(String).]
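A replaying mock can be sketched along the following lines. This is a hedged Python illustration, not the Java tool itself: `ReplayingMock`, the transcript format of (method name, arguments, result) tuples, and the toy `calculate_payroll` are assumptions for the example. Only the idea of verifying each call and replaying the recorded result, and the name ReplayException, come from the slides.

```python
class ReplayException(Exception):
    """Raised when the tested code departs from the recorded transcript."""


class ReplayingMock:
    """Stands in for the environment by replaying a captured transcript."""

    def __init__(self, transcript):
        self._remaining = list(transcript)

    def __getattr__(self, name):
        def replayed(*args):
            if not self._remaining:
                raise ReplayException(f"unexpected extra call {name}{args}")
            expected_name, expected_args, result = self._remaining.pop(0)
            # Verify the call made by the tested code, then replay the result.
            if (name, args) != (expected_name, expected_args):
                raise ReplayException(
                    f"expected {expected_name}{expected_args}, got {name}{args}"
                )
            return result

        return replayed


def calculate_payroll(database):
    """Toy tested code: sums every result the database hands back."""
    total = 0
    row = database.get_result()
    while row is not None:
        total += row
        row = database.get_result()
    return total


# Hypothetical transcript: (method name, arguments, returned value).
transcript = [
    ("get_result", (), 100),
    ("get_result", (), 250),
    ("get_result", (), None),
]
replayed_total = calculate_payroll(ReplayingMock(transcript))
```

No real database is touched during replay. If the tested code changes so that it makes a different call than the one recorded, the mock cannot know what the environment would have answered, so it raises ReplayException instead of guessing.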

When? Test factoring life cycle:
[Diagram: "Slow system tests" feed "Capture", which produces a "Transcript"; "Replay" of the transcript yields "Fast unit tests". When the "Developer changes tested unit", "Run factored tests"; outcomes are "Success", "Failure", or "Replay exception". "Run system tests for replay exceptions".]

Time saved:
[Diagram: "Slow system tests" compared with "Run factored tests" plus "Run system tests for replay exceptions"]

Time saved:
[Diagram: "Slow system tests" compared with "Factored tests", marking "Time until first error" and "Time to complete tests"]

Implementation for Java
• Captures and replays
  – Static calls
  – Constructor calls
  – Calls via reflection
  – Explicit class loading
• Allows for shared libraries
  – i.e., tested code and environment are free to use disjoint ArrayLists without verification.
• Preserves behavior on Java programs up to 100KLOC

Case study
• Daikon: 347 KLOC
  – Uses most of Java: reflection, native methods, JDK callbacks, communication through side effects
• Tests found real developer errors
• Two developers
  – Fine-grained compilable changes over two months: 2505
  – CVS check-ins over six months (all developers): 104

Evaluation method
• Retrospective reconstruction of test factoring’s results during real development
  – Test on every change, or every check-in.
• Assume capture happens every night
• If transcript is too large, don’t capture; just run original test
• If factored test throws a ReplayException, run original test.
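The two fallback rules above can be sketched as a small decision function. This is an illustrative Python sketch under stated assumptions: `run_with_fallback`, its parameters, and the `max_transcript_size` threshold are hypothetical names, and the slides do not say how "too large" was decided.

```python
class ReplayException(Exception):
    """Raised when a factored test departs from its recorded transcript."""


def run_with_fallback(run_factored, run_original, transcript_size, max_transcript_size):
    """Return (which test produced the verdict, that test's result)."""
    # Rule 1: the transcript was too large to capture, so only the
    # original system test is available.
    if transcript_size > max_transcript_size:
        return ("original", run_original())
    # Rule 2: a ReplayException tells the developer nothing, so fall
    # back to the original system test.
    try:
        return ("factored", run_factored())
    except ReplayException:
        return ("original", run_original())


def diverging_factored_test():
    # Hypothetical factored test whose transcript no longer matches the code.
    raise ReplayException("tested code made an unrecorded call")


outcome = run_with_fallback(
    diverging_factored_test,
    lambda: "pass",
    transcript_size=10,
    max_transcript_size=100,
)
```

In the fallback case the developer pays for both the aborted factored run and the full system test, which is one way a factored suite can end up slower than the original.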

Measured Quantities
• Test time: total time to find out test results
• Time to failure: If tests fail, how long until first failure?
• Time to success: If tests pass, how long until all tests run?
• ReplayExceptions are treated as giving the developer no information

Results

            How often?       Test time               Time to failure    Time to success
Dev. 1      Every change     .79 (7.4 / 9.4 min)     1.56 (14 / 9 s)    .59 (5.5 / 9.4 s)
Dev. 2      Every change     .99 (14.1 / 14.3 min)   1.28 (64 / 50 s)   .77 (11.0 / 14.3 s)
All devs.   Every check-in   .09 (0.8 / 8.8 min)     n/a                .09 (0.8 / 8.8 min)
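Each leading figure in the results is the quotient of the two times printed in parentheses beside it. The short Python check below recomputes them from the printed values. It assumes the first time is the one measured with factored tests and the second the one without, which the slide does not state explicitly; `cells` and `ratios` are names chosen for the example.

```python
# (first time, second time) exactly as printed in each table cell.
# Assumption: first = with factored tests, second = original tests.
cells = {
    ("Dev. 1", "test time"): (7.4, 9.4),
    ("Dev. 1", "time to failure"): (14, 9),
    ("Dev. 1", "time to success"): (5.5, 9.4),
    ("Dev. 2", "test time"): (14.1, 14.3),
    ("Dev. 2", "time to failure"): (64, 50),
    ("Dev. 2", "time to success"): (11.0, 14.3),
    ("All devs.", "test time"): (0.8, 8.8),
    ("All devs.", "time to success"): (0.8, 8.8),
}

# Ratio of the first time to the second, rounded as on the slide.
ratios = {key: round(first / second, 2) for key, (first, second) in cells.items()}

for (who, quantity), ratio in ratios.items():
    print(f"{who:10} {quantity:16} {ratio:.2f}")
```

A ratio below 1 means the first time is the shorter one. Under the assumption above, the .09 for check-ins is a reduction of roughly 90%, while the ratios above 1 for time to failure mean failures took longer to surface on per-change runs.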

Discussion
• Test factoring dramatically reduced testing time for checked-in code (by 90%)
• Testing on every developer change catches too many meaningless versions
• Are ReplayExceptions really not helpful?
  – When they are surprising, perhaps they are

Future work: improving the tool
• Generating automated tests from UI bugs
  – Factor out the user
• Smaller factored tests
  – Use static analysis to distill transcripts to bare essentials

Future work: Helping users
• How do I partition my program?
  – Should ResultSet be tested or mocked?
• How do I use replay exceptions?
  – Is it OK to return null when “” was expected?
• Can I change my program to make it more factorable?
  – Can the tool suggest refactorings?

Conclusion
• Test factoring uses large, general system tests to create small, focused unit tests
• Test factoring works now
• How can it work better, and help users more?
• saff@mit.edu

Challenge: Better factored tests
• Allow more code changes
  – It’s OK to call toString an additional time.
• Eliminate redundant tests
  – Not all 2,000 calls to calculatePayroll are needed.

Evaluation strategy
1) Observe: minute-by-minute code changes from real development projects.
2) Simulate: running the real test factoring code on the changing code base.
3) Measure:
  – Are errors found faster?
  – Do tests finish faster?
  – Do factored tests remain valid?
4) Distribute: developer case studies

Conclusion
• Rapid feedback from test execution has measurable impact on task completion.
• Continuous testing is publicly available.
• Test factoring is working, and will be available by year’s end.
• To read papers and download:
  – Google “continuous testing”

Case Study
• Four development projects monitored
• Shown here: Perl implementation of delta tools.
• Developed by me using test-first development methodology. Tests were run often.
• Small code base with small test suite.

lines of code                        5714
total time worked (hours)            59
total test runs                      266
average time between tests (mins)    5

We want to reduce wasted time
• Test-wait time: If developers test often, they spend a lot of time waiting for tests to complete.
• Regret time: If developers test rarely, regression errors are not found quickly. Extra time is spent remembering and fixing old changes.

Results predict: continuous testing reduces wasted time
[Chart: "Wasted Time Reduction by Continuous Testing". Y-axis: wasted time, 0.00 to 0.12, split into "Regret" and "Test-wait". Bars grouped as "Without ct" and "With ct"; bar labels as extracted: Observed, Best, Random, Recent, Reorder, Errors. Annotations: "Best we can do by changing frequency", "Best we can do by changing order".]
Continuous testing drastically cuts regret time.
