Balancing Soundness and Efficiency for Practical Testing of Configurable Systems
Sabrina Souto (UEPB, Brazil), sabrinadfs@gmail.com
Marcelo d’Amorim (UFPE, Brazil), damorim@cin.ufpe.br
Rohit Gheyi (UFCG, Brazil), rohit@dsc.ufcg.edu.br

Configurable Systems
[Figure labels: Configurable System, Configurations.]
Many other configurable system examples!

Bugs in Configurable Systems
[Figure labels: System, Configurations, Configurable System; callout: configuration-related bug.]

Testing Configurable Systems
[Figure labels: System, Tests, Configurations, Monolithic Tests.]

Limitations of Existing Techniques
[Figure: techniques plotted by efficacy (#failures) against efficiency (#samples).]
• Exhaustive: finds all bugs, but is very expensive
• Default: very efficient, but can miss bugs
• Sampling: tries to find bugs with fewer samples, but produces false positives and false negatives
• Dynamic (SPLat [FSE’13, SPLC’15]): considers code and test, but may not scale in all cases

Limitations of Existing Techniques
[Same plot with S-SPLat added alongside Exhaustive, Dynamic (SPLat), Sampling, and Default.]
S-SPLat = Sampling + SPLat

Example: Notepad
• 17 configuration variables
• Only 3 are reached by the toolbar() test
Configurations explored:
• Sampling (one-enabled): 17 configurations
• SPLat: 6 configurations
• S-SPLat (one-enabled): 2 configurations

S-SPLat
Input:
• Instrumented configurable system
• Tests
• Sampling heuristic
• Feature model (optional)
Output:
• Tests executed with reachable and satisfiable configurations (C1, T1; C2, T1; C1, T2; C5, T2; C4, T3; …)
For all tests:
• Run the test Ti
• Find reachable variables
• Look for the next reachable configuration
• Check the configuration against the sampling heuristic and the feature model
• Yes: run the test with it; otherwise: look for the next reachable configuration
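The loop on this slide can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the names `Recorder`, `explore`, `at_most_one_enabled`, and `toy_test`, as well as the variables LOGGING, CACHE, and VERBOSE, are hypothetical. Newly reached variables are assumed to default to False, and the sampling heuristic and feature model are modeled as plain predicates over the partial assignment of reached variables (a real tool would typically use a constraint solver for the feature model).

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, bool]
Check = Callable[[Config], bool]


class Recorder:
    """Serves variable values to a test and records the order of first reads."""

    def __init__(self, preset: List[Tuple[str, bool]]):
        self.values: Config = dict(preset)
        self.trace: List[Tuple[str, bool]] = list(preset)

    def get(self, name: str) -> bool:
        if name not in self.values:
            # Assumption: a variable reached for the first time starts as False.
            self.values[name] = False
            self.trace.append((name, False))
        return self.values[name]


def explore(test: Callable[[Callable[[str], bool]], None],
            heuristic: Check = lambda partial: True,
            feature_model: Check = lambda partial: True) -> List[Config]:
    """Run `test` once per reachable configuration that passes both checks."""
    executed: List[Config] = []
    stack: List[Tuple[str, bool]] = []
    while True:
        partial = dict(stack)
        if heuristic(partial) and feature_model(partial):
            recorder = Recorder(stack)
            test(recorder.get)
            executed.append(dict(recorder.trace))
            stack = recorder.trace
        # Backtrack: drop trailing True entries, then flip the last False to True.
        while stack and stack[-1][1]:
            stack.pop()
        if not stack:
            return executed
        name, _ = stack.pop()
        stack.append((name, True))


def at_most_one_enabled(partial: Config) -> bool:
    """A partial assignment can still match one-enabled if at most one variable is on."""
    return sum(partial.values()) <= 1


def toy_test(get: Callable[[str], bool]) -> None:
    """Hypothetical test that reaches different variables on different paths."""
    if get("LOGGING"):
        get("VERBOSE")
    else:
        get("CACHE")


all_reachable = explore(toy_test)
sampled = explore(toy_test, heuristic=at_most_one_enabled)
print(len(all_reachable), len(sampled))
```

With no heuristic the sketch behaves like plain reachability-based exploration; passing `at_most_one_enabled` skips every configuration that already enables two variables, so fewer test executions are needed.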

EVALUATION

Research Questions
RQ1: Which heuristics maximize efficiency (#samples)?
RQ2: Which heuristics maximize efficacy (#failures)?
RQ3: Which heuristics (basic or combination) maximize efficiency and efficacy?

Scenarios
Software Product Lines (SPLs): 8 subjects
• All existing tests
• All existing options
GCC, Version 6.1 and Version 4.8.2 (17K+ tests, 2k+ variables)
• 3,557 tests
• 50 most frequently cited options in bug reports

Evaluation: SPLs

Evaluation: SPLs (8 subjects)
Techniques [ICSE’16, ASE’14]:
1. SPLat
2. SPLat + med (most-enabled-disabled)
3. SPLat + oe (one-enabled)
4. SPLat + od (one-disabled)
5. SPLat + pw (pairwise)
6. SPLat + ran (random)
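As a rough illustration of what the simpler heuristics sample, the sketch below generates one-enabled (oe), one-disabled (od), and most-enabled-disabled (med) configurations for a list of Boolean variables. It assumes no feature-model constraints, and the function names and the variable names V1..V17 are hypothetical; with 17 variables, one-enabled yields 17 configurations, matching the Notepad example.

```python
from typing import Dict, List

Config = Dict[str, bool]


def one_enabled(variables: List[str]) -> List[Config]:
    """One configuration per variable: that variable on, all others off."""
    return [{v: (v == on) for v in variables} for on in variables]


def one_disabled(variables: List[str]) -> List[Config]:
    """One configuration per variable: that variable off, all others on."""
    return [{v: (v != off) for v in variables} for off in variables]


def most_enabled_disabled(variables: List[str]) -> List[Config]:
    """Two configurations: everything on and everything off (no constraints assumed)."""
    return [{v: True for v in variables}, {v: False for v in variables}]


# Hypothetical variable names standing in for 17 configuration variables.
variables = [f"V{i}" for i in range(1, 18)]
print(len(one_enabled(variables)))
print(len(one_disabled(variables)))
print(len(most_enabled_disabled(variables)))
```

The sample sizes grow linearly (oe, od) or stay constant (med) in the number of variables, which is why these heuristics are cheap compared with exhaustive exploration.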

Evaluation: SPLs, Findings
RQ1: Which heuristics maximize efficiency (#samples)?
SPLat and SPLat+[…], SPLat+[…]
RQ2: Which heuristics maximize efficacy (#failures)?
SPLat+[…], SPLat+[…]

Evaluation: SPLs, Findings
RQ3: Which heuristics maximize efficiency (#samples) and efficacy (#failures)?
Combinations of heuristics (oe × od × med × pw):
• c1 = oe+od
• c2 = oe+med
• c3 = oe+pw
• …
• c11 = oe+od+med+pw
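The count of 11 combinations follows from choosing 2, 3, or 4 of the four basic heuristics: 6 + 4 + 1 = 11. The sketch below enumerates them; the numbering scheme (by size, then in the order oe, od, med, pw) is an assumption, although it agrees with the labels the slides do give (c1, c2, c3, c7, c11). The function name `heuristic_combinations` is hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

BASIC = ["oe", "od", "med", "pw"]


def heuristic_combinations(basic: List[str] = BASIC) -> Dict[str, Tuple[str, ...]]:
    """Label every combination of two or more basic heuristics as c1, c2, ..."""
    combos: Dict[str, Tuple[str, ...]] = {}
    index = 1
    for size in range(2, len(basic) + 1):
        for members in combinations(basic, size):
            combos[f"c{index}"] = members
            index += 1
    return combos


combos = heuristic_combinations()
for name, members in combos.items():
    print(name, "=", "+".join(members))
```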

Evaluation: SPLs, Findings
RQ3: Which heuristics maximize efficiency (#samples) and efficacy (#failures)?
[Figure: #failures plotted against #samples.]
• SPLat+med (most-enabled-disabled) optimized #samples at the expense of #failures
• SPLat+c11 (oe+od+med+pw) optimized #failures at the expense of #samples
• SPLat did not scale for some subjects
The sampling heuristics reduced the number of samples explored by SPLat while retaining its ability to reveal failures.

Evaluation

Evaluation: Version 6.1 and Version 4.8.2
Techniques [ICSE’16, ASE’14]:
1. SPLat
2. SPLat + med
3. SPLat + oe
4. SPLat + od
5. SPLat + pw
6. SPLat + ran

Evaluation: Version 6.1, Findings
RQ1: Which heuristics maximize efficiency (#samples)?
SPLat+[…] and SPLat+[…], SPLat+[…]
RQ2: Which heuristics maximize efficacy (#failures)?
SPLat+[…], SPLat+[…]

Evaluation: Version 6.1, Findings
RQ3: Which heuristics maximize efficiency (#samples) and efficacy (#failures)?
[Figure: bugs found, #bugs plotted against #samples.]
• 2 new bugs reported.
• It is preferable to pick the best-performing heuristics in the leftmost group: these are the best choices.

Evaluation: Version 4.8.2, Findings
RQ3: Which heuristics maximize efficiency (#samples) and efficacy (#failures)?
[Figure: bugs found, #bugs plotted against #samples.]
• All five bugs were captured.
• SPLat+c2 (oe+med) found all bugs with a relatively small number of samples.

Lessons Learned
• For SPLs: c11 (oe+od+med+pw)
• For GCC: c2 (oe+med)
• For SPLs and GCC: c7 (oe+od+med)
• [ICSE 2016] A comparison of 10 sampling algorithms for configurable systems.
• Combine different simple heuristics
• Avoid heuristics with a large number of requirements

S-SPLat found a good balance between bugs and samples
• The sampling heuristics helped reduce the number of samples explored by SPLat without losing the ability to find bugs
S-SPLat could deal with scalability
• It revealed bugs in potentially large configuration spaces
https://sabrinadfs.github.io/s-splat/
sabrinadfs@gmail.com

BACKUP SLIDES

Evaluation (SPLs, GCC)
RQ1: #samples
[Figure: #samples per technique.]
• SPLat and ran explored many samples.
• med explored the smallest sample sets.
• od explored the largest sample sets.
RQ2: #failures
[Figure: #failures per technique.]
• od and pw found almost the same number of failures as SPLat, but they required far fewer samples.

Evaluation (SPLs, GCC)
RQ3: #samples x #failures
[Figure: #failures plotted against #samples.]
Combinations of heuristics (oe × od × med × pw):
• c1 = oe+od
• c2 = oe+med
• c3 = oe+pw
• …
• c11 = oe+od+med+pw
SPLat and med optimize one dimension at the expense of the other.
c11 (oe+od+med+pw) performed consistently well in all cases.
The sampling heuristics reduced the number of samples explored by SPLat while retaining its ability to reveal failures.

Evaluation (SPLs, GCC): Version 6.1
RQ1: #samples
RQ2: #bugs
[Figures: #samples per technique; #bugs per technique.]
• pw found more failures, but it was one of the most expensive techniques.
• oe and od found almost the same number of failures as pw but with far fewer samples.

Evaluation (SPLs, GCC): Discussion
• c2 found all crashes with a relatively low number of configurations
• c7 performed better: it detected most failures and crashes with a relatively small number of configurations
• Combine different simple heuristics instead of using one that entails a larger number of test requirements
• S-SPLat is promising for revealing errors in potentially large configuration spaces

Handling Constraints
Complex models (SPLs)
• 54% of the selected configurations are invalid
• 43% of failures are false positives
The use of validation is not necessary (GCC)
• Crashes were only revealed in valid configurations
• The techniques performed consistently with and without feature constraints

Evaluation (SPLs, GCC): Additional Evaluations
S-SPLat x Regular Sampling
• Regular sampling detected the same bugs as S-SPLat, with more configurations.
Random sampling with more rates: 10% and 30%
• New results are proportional to the change in the sampling rates of random.

Threats to Validity and Limitations
• The selection of subjects
  • We used subjects from a variety of sources, including a large configurable system with hundreds of options
• Possible implementation errors
  • We thoroughly checked our implementation and our experimental results
  • Our datasets and implementations are publicly available: https://sabrinadfs.github.io/s-splat/
• SPLat currently only supports systems with dynamically bound feature variables
  • It remains to investigate how SPLat and S-SPLat would perform on systems with #ifdef variability
