Slide 1

Mining Coverage Data for Test Set Coverage Efficiency

Bryan Hickerson, Monica Farkash (presenter), Mike Behm, Balavinayagam Samynathan
IBM Austin / UT Austin
Slide 2
Outline
- Coverage Efficiency
- Coverage in Time
- First Time Per Test Coverage
- Hard To Hit Coverage
- Coverage Distribution
- Scenarios to Waves
- Wave Windows of Probability
- Controlling the Test Load
- Results & Conclusion
- Acknowledgments / References
Slide 3
Coverage Efficiency
- 12,000 scenario files
- Millions of tests
- Coverage
  – All events: 150k
  – Hard-to-Hit: 73k (fewer than 2,000 hits per 1M tests)
  – Never-Hit: 15k
- Coverage-driven verification
- Coverage-driven test case generation
- Graph-based test case generation (automatic or manual targeting)
[Figure: scenario files S1, S2, …, Sk are input to the test case generator (TCG); each scenario S1 yields n generated test files T11, T12, …, T1n, which run on the simulator to produce coverage data.]
A test case generator receives as input a file containing the scenario that the user desires as a pattern for generating tests. It uses it to generate as many tests as desired, longer or shorter, all following the given pattern. For each scenario we can generate tests, and for each such test we gather coverage information as a witness of the impact the test had on the design (as in the functions it exercised). The PowerPC methodology uses more than 12 thousand such scenario files while generating millions of tests. It collects information on about 150 thousand coverage events. Each coverage event is classified by how many times it was hit in the moving window of the last 1 million tests; the classification used here considers events hit fewer than 2,000 times as hard-to-hit. We notice an approximate 10% ratio of never-hit events. (A classification sketch follows.)
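To make the classification concrete, here is a minimal sketch in Python. The window size and the 2,000-hit threshold come from the slide; the data layout (a list of (test_id, event_id) hit records) and all names are illustrative assumptions, not the production tooling.

```python
# Sketch: classify coverage events by hit count in the most recent
# window of tests, following the thresholds described above.
from collections import Counter

WINDOW_TESTS = 1_000_000       # moving window: last 1M tests (from the slide)
HARD_TO_HIT_THRESHOLD = 2_000  # fewer hits than this => hard-to-hit

def classify_events(all_events, recent_hits):
    """all_events: iterable of event IDs.
    recent_hits: (test_id, event_id) hit records for the last WINDOW_TESTS tests."""
    counts = Counter(event_id for _, event_id in recent_hits)
    classes = {}
    for event in all_events:
        hits = counts.get(event, 0)
        if hits == 0:
            classes[event] = "never-hit"
        elif hits < HARD_TO_HIT_THRESHOLD:
            classes[event] = "hard-to-hit"
        else:
            classes[event] = "often-hit"
    return classes
```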
Slide 4

Coverage Efficiency
used to drive the verification process

- Coverage
  – Never-Hit
  – Hard-to-Hit
  – Often-Hit => redundancy
- Efficiency
  – Achieve the coverage goal with fewer resources
  – Reduce redundancy
- Observe
  – Summarization, model identification, probability
- Control
  – Control the test case generation
Coverage efficiency shows how quickly we cover the coverage event list given existing resources. Most of the time we talk in terms of two lists: the never-hit and the hard-to-hit events. Still, to increase the efficiency of our verification process we need to reduce the redundancy in our often-hit events too. To increase the efficiency of our methodology, the first step is to observe how coverage happens. We did so using summarization, model identification, and probability analysis. The next step is to use what we learned from the data to control the verification process to our benefit. We used that information to control only one very simple parameter in the process of test case generation, with spectacular impact on the resulting efficiency.
Slide 5
Coverage in Time
Same scenario:
- Semaphores (locking mechanism)

Same load:
- Number of instructions
- Number of cycles
The first thing we looked at was how coverage happens in time. We chose a few test cases generated from the same scenario input file, meaning they all followed the same high-level pattern, and followed the evolution of coverage in time. For this we simply put the names of the signals that represent coverage monitors in the list of signals to be recorded during simulation. What we show here is the number of coverage events hit in the same simulation cycle, for 10 tests. The scenario is a locking mechanism, and even though we could guess that a lack of activity could mean locking, that is all we could guess. We did not learn much about how efficient coverage is while running tests. (A minimal counting sketch is shown below.)
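As a rough illustration of the counting behind this plot, here is a small sketch. The trace layout (monitor name mapped to the cycles in which it fired) is an assumed, simplified stand-in for the recorded simulation signals.

```python
# Sketch: tally how many coverage monitors fire in each simulation
# cycle of one test, given recorded monitor signals.
from collections import Counter

def hits_per_cycle(trace):
    """trace: {monitor_name: [cycles in which the monitor fired]}.
    Returns {cycle: number of coverage events hit in that cycle}."""
    counts = Counter()
    for monitor, cycles in trace.items():
        for cycle in cycles:
            counts[cycle] += 1
    return counts
```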
Slide 6
First Time Per Test Coverage
[Figure: FTPT coverage over time for Test A and Test B, generated from the same scenario (DSI_EAO).]
To increase our understanding, we decided to remove the redundancy from our plotting. We post-process the data and keep only the first time a given coverage event is hit. We plot it again in time, as in how many events were covered in a given cycle for the first time. We can remove redundancy because we can assume there is less value in a coverage event being hit many times in the same test than in that event being hit in different tests. By removing redundancy we start to learn how real coverage efficiency happens. We notice that there is a rather large wave of first-time-per-test (FTPT) coverage followed by other, smaller waves later on. We also notice that, even though the overall coverage seems impressive, the part that actually matters is, comparatively, not much. This would mean that the most efficient tests would be the shortest ones, those with the highest ratio of FTPT coverage per cycle, with the big problem that some coverage events would never happen in that time frame. (A post-processing sketch follows.)
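A minimal sketch of this FTPT post-processing, assuming hit records arrive as (event_id, cycle) pairs in simulation order; the record format and function names are illustrative.

```python
# Sketch: keep only the first cycle in which each event is hit
# within one test ("first time per test", FTPT).
from collections import Counter

def ftpt(hits):
    """hits: iterable of (event_id, cycle) pairs, in simulation order.
    Returns {event_id: first cycle it was hit in this test}."""
    first_hit = {}
    for event_id, cycle in hits:
        if event_id not in first_hit:
            first_hit[event_id] = cycle
    return first_hit

def ftpt_per_cycle(hits):
    """Number of events covered for the first time, per cycle."""
    return Counter(ftpt(hits).values())
```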
Slide 7

HTH Coverage in Time and FTPT

A LARX_STCX Test
We come back to our real problem, the hard-to-hit events, and try to understand where they happen. We plot the hard-to-hit events against all events, first overall and then restricted to the FTPT events. We already know that, according to our methodology's definition of a hard-to-hit event, approximately half of all events are classified as such; this is visible in both comparisons. What we also notice is that the FTPT coverage keeps the same shape for hard-to-hit events as for all events. This means we can again state that the hard-to-hit events are reached in "waves" throughout the test.
Slide 8
FTPT Gamma Distribution

[Figure: FTPT event counts over cycles for a LARX_STCX test, with a fitted gamma distribution.]
What we have learned so far is that the FTPT hard-to-hit events follow the shape of waves. We identify the waves as gamma distributions. This reinforces the shape we generally use to show how coverage is achieved. So far we know that there are "coverage waves" that arrive throughout a test. We can explain them as areas being "opened" by certain activities. For example, after a few memory operations with a given relation between addresses, a cache operation is triggered. That cache activity is new to the test, hence it "triggers" a new wave. (A fitting sketch follows.)
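As an illustration, a gamma distribution can be fitted to the FTPT hit cycles of one wave with SciPy's maximum-likelihood fit. The synthetic data below stands in for cycles extracted from a real trace; nothing here reproduces the exact fit from the slide.

```python
# Sketch: fit a gamma distribution to the cycles at which FTPT
# events occur within one coverage wave.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Stand-in data: one wave of FTPT hit cycles (replace with trace data).
wave_cycles = rng.gamma(shape=4.0, scale=50.0, size=400)

shape, loc, scale = stats.gamma.fit(wave_cycles)
print(f"gamma fit: shape={shape:.2f}, loc={loc:.1f}, scale={scale:.1f}")
```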
Slide 9
Mixture of Coverage Waves

- Expectation-Maximization (EM) algorithm to identify the mixture of Gaussians
- Waves show the exercising of a new area in the design
- We do not target coverage, we target coverage waves

[Figure: number of FTPT events per cycle, decomposed into a mixture of coverage waves.]
We checked whether this holds for the rest of the test, not only the obvious first wave. We used a model identification algorithm and, for ease, approximated the gamma distributions with Gaussians. We learned that the rest of the FTPT coverage also fits waves, throughout the whole test, no matter how far from the "initial" wave they are. (A mixture-fitting sketch follows.)
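A sketch of this step using scikit-learn's GaussianMixture, which implements the EM algorithm. The three-component choice and the synthetic three-wave data are illustrative assumptions; in practice the number of components could be selected with a criterion such as BIC.

```python
# Sketch: model the FTPT-vs-cycle profile as a mixture of Gaussians
# fitted with EM (scikit-learn's GaussianMixture).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stand-in data: three "waves" of FTPT first-hit cycles.
ftpt_cycles = np.concatenate([
    rng.normal(200, 30, 300),
    rng.normal(900, 60, 120),
    rng.normal(1600, 80, 60),
]).reshape(-1, 1)

gmm = GaussianMixture(n_components=3, random_state=0).fit(ftpt_cycles)
for mean, weight in zip(gmm.means_.ravel(), gmm.weights_):
    print(f"wave centered at cycle {mean:.0f}, weight {weight:.2f}")
```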
Slide 10
Different Scenarios
Four tests, two different scenarios
(DSI_EAO 456 and 163; ATOMIC 58 and 20)
What we have learned so far is that coverage happens in waves. What we show now is that the waves differ from one scenario to another but are consistent within a given scenario file. Intuitively, this is easy to explain. If a scenario stresses the caches, it will eventually trigger the same activity, opening a similar "wave" throughout the hardware. A different scenario, going after a different activity, will reach it in a different way, and that activity can open a larger area of new events. Here we compare two scenarios, with two tests each.
Slide 11
Scenarios to Generate Certain Waves
Particular wave(s) targeted by each scenario => focus on the hard-to-hit waves for each scenario
We notice that, looking at the "wave" pattern, there is a huge difference between these two scenarios, yet the pattern is rather consistent across tests generated from the same scenario.
Slide 12

HTH Coverage Wave Windows

- For each scenario
  – Identify which hard-to-hit wave it targets
  – Identify the conditions under which it succeeds in reaching it

[Figure: wave onset cycles for tests T1-T12 of the same scenario; the cycle window in which a given wave is likely to be seen.]
If we look at the tests generated with the same scenario, some reach the desired functionality, and hence trigger the activity wave, earlier, some later, and some never. If we run enough such tests we can define a "window" in which that wave is more likely to happen. This means there is a cycle window that we should target with our tests. Longer tests would be a waste because we have already reached the targeted wave; shorter tests would be a waste because we decrease the chances of reaching that wave at all. This is why we look at hard-to-hit events and the probability distribution of those events being hit in a given cycle. (A window-estimation sketch follows.)
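One simple way to derive such a window, sketched below: collect the wave onset cycle for each test of a scenario (None when the wave never appears) and take an inner percentile range. The 10th/90th percentile rule is an illustrative heuristic, not the method from the slides.

```python
# Sketch: estimate the cycle window in which a targeted wave is
# likely to appear, from observed onsets across many tests.
import numpy as np

def wave_window(onset_cycles, lo=10, hi=90):
    """onset_cycles: onset cycle per test, or None if the wave never appeared.
    Returns (window_start, window_end) as percentiles of observed onsets."""
    observed = np.array([c for c in onset_cycles if c is not None])
    return np.percentile(observed, lo), np.percentile(observed, hi)

# Example: onsets from 12 tests, two of which never reached the wave.
onsets = [410, 388, 460, None, 402, 455, 430, None, 395, 470, 415, 440]
print(wave_window(onsets))  # -> (394.3, 461.0)
```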
Slide 13
Probability Mass Function
[Figure: probability mass functions over cycles for events ID93352, ID41930, ID126982, and ID206127, plus the overall distribution.]

Probability => identifies the hard-to-hit cycle windows
Probability for an event $e$ to be hit during a test:

$$P(e) = \frac{N_{\mathrm{ftpt}}(e)}{N_{\mathrm{tests}}}$$

where $N_{\mathrm{ftpt}}(e)$ is the number of tests that contain event $e$ and $N_{\mathrm{tests}}$ is the total number of tests.

Probability mass function $p : \mathit{Cycles} \to [0,1]$: the probability distribution of an event $C$ happening at cycle $c \in \mathit{Cycles}$, where $\mathit{Cycles}$ is the set of cycles in simulation, $E$ is the sample space (the set of all possible outcomes), and $e \in E$:

$$p(c) = \Pr(C = c) = \Pr(\{e \in E : C(e) = c\}), \qquad \sum_{c \in \mathit{Cycles}} p(c) = 1.$$
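Empirical versions of these two quantities can be computed directly from the FTPT data, as in this sketch; the data layout (one {event: first-hit cycle} dict per test) is an assumption.

```python
# Sketch: empirical hit probability P(e) and per-cycle PMF p(c)
# computed from FTPT data across many tests.
from collections import Counter

def hit_probability(event, ftpt_by_test):
    """P(e) = N_ftpt(e) / N_tests: fraction of tests that hit `event`.
    ftpt_by_test: list of {event_id: first-hit cycle} dicts, one per test."""
    n_hit = sum(1 for ftpt in ftpt_by_test if event in ftpt)
    return n_hit / len(ftpt_by_test)

def cycle_pmf(event, ftpt_by_test):
    """p(c): probability that `event` is first hit at cycle c,
    normalized over the tests that hit it (values sum to 1)."""
    cycles = [ftpt[event] for ftpt in ftpt_by_test if event in ftpt]
    counts = Counter(cycles)
    total = len(cycles)
    return {c: n / total for c, n in counts.items()}
```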
Slide 14
HTH-FTPT Load Dependency
[Figure: FTPT events over cycles for two test loads, one running to roughly 9,000 cycles and one to roughly 29,000 cycles.]
What we have learned so far is that, if we identify the window of opportunity for the wave we are interested in, we can decide how long or short the tests should be to increase our likelihood of hitting that wave without wasting resources. We can control a single variable in the test case generation process that determines how many instructions the test will have. The higher the number, the more loaded with instructions the test will be, even if it follows the same pattern as another test with fewer instructions.
Slide 15
Experimental Test Size to HTH Coverage
[Figure: HTH coverage as a function of test size; 40 tests of a transactional-memory (TM) scenario, original size vs. 40,000+ cycles.]
We extracted experimentally, for a complicated transactional-memory scenario, its coverage as a function of the size of the test (as in how many cycles it runs). We easily notice that the original load was not large enough to reach two important waves of activity; only by increasing the load did we enable the tests to reach those areas. We learned that with fewer tests we increased our coverage, each test being a hit, where we used to run many tests with no result. For some of the scenario files we needed to increase the load to find the "sweet spot", for others to decrease it. This analysis allowed us to suggest a load that should enable more efficient coverage. We implemented a feedback mechanism that uses a minimalistic implementation of such an analysis and uses the dynamic values to adjust the test case generation accordingly. (A sketch of such a feedback rule follows.)
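A sketch of the kind of feedback rule such a mechanism might apply; the adjustment factors, thresholds, and names are illustrative assumptions, not the production implementation.

```python
# Sketch: after each batch of tests, compare the observed wave window
# against the test length and nudge the generator's instruction-count knob.
def adjust_instruction_count(current_count, test_end_cycle, window):
    """window: (start_cycle, end_cycle) where the targeted HTH wave is likely.
    Returns a suggested instruction count for the next batch of tests."""
    win_start, win_end = window
    if test_end_cycle < win_end:
        return int(current_count * 1.2)   # too short: tests end before the wave
    if test_end_cycle > 1.5 * win_end:
        return int(current_count * 0.8)   # too long: cycles past the wave are waste
    return current_count                  # inside the sweet spot

# Example: 5,000-instruction tests end near cycle 30,000, but the wave
# window is (34,000, 42,000) -> suggest a larger load.
print(adjust_instruction_count(5_000, 30_000, (34_000, 42_000)))  # -> 6000
```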
Slide 16
- Coverage Efficiency
- Observe
– Coverage in Time – FTPT Coverage in Time – HTH – Coverage waves Mixture – Model Fitting – Probability distributions
- Control
– Test case number of instructions
- Industry results
Summary
16
Slide 17

Results
- Decreased hard-to-hit by 12%
– 73,000 to 64,000
- Never-hit-before events decreased by 13%
  – 15,000 to 13,000, saving 18 person-months
- Less redundancy on easy-to-hit coverage.
- Shifted manual work to the automatic process
- Decreased time to achieve targeted coverage =>
enabled finding bugs earlier.
The results we got were impressive. With a simple tuning of our default values per scenario, we achieved a 12% decrease in hard-to-hit events and a 13% decrease in never-hit events. This means many events that nobody will have to go hunt down independently. This is on top of removing the useless redundancy of covering the same events over and over again.
Slide 18
Acknowledgments
University of Texas
Adnan Aziz
IBM
Wolfgang Roesner
Slide 19

References

- Adir, A., Almog, E., Fournier, L. & Eitan, M., 2004. Genesys-Pro: innovations in test program generation for functional processor verification. IEEE Design & Test of Computers, 21(2), pp. 84-93.
- Benjamin, M., Geist, D., Hartman, A. & Wolfsthal, Y., 1999. A study in coverage-driven test generation. New Orleans, IEEE, pp. 970-975.
- Bergeron, J., Nightingale, A., Cerny, E. & Hunter, A., 2006. Coverage-Driven Verification. In: Verification Methodology Manual for SystemVerilog. Springer, pp. 259-280.
- Bishop, C. M., 2006. Pattern Recognition and Machine Learning. Springer.
- Wile, B., Goss, J. C. & Roesner, W., 2005. Comprehensive Functional Verification: The Complete Industry Cycle. Amsterdam: Elsevier/Morgan Kaufmann.
- Dit, B., Revelle, M., Gethers, M. & Poshyvanyk, D., 2013. Feature location in source code: a taxonomy and survey. Journal of Software: Evolution and Process, 25(1), pp. 53-59.
- Fayyad, U., Piatetsky-Shapiro, G. & Smyth, P., 1996. From Data Mining to Knowledge Discovery in Databases. AI Magazine, 17(3).
- Foster, H., 2013. Wilson Research Group 2012 Functional Verification Study. [Online] Available at: http://testandverification.com/DVClub/08_Apr_2013/2013AprWRGStudyatDVClubUK.pdf [Accessed 20 May 2014].
- IBM Research, 2014. Coverage Directed Test Generation. [Online] Available at: http://www.research.ibm.com/haifa/projects/verification/ml_cdg/cdg_sbfv.html [Accessed 20 May 2014].
- Piziali, A., 2004. Coverage Driven Verification. In: Functional Verification Coverage Measurement and Analysis. Kluwer, pp. 109-136.
- Sane, M. Solving Modern Verification Challenges for Today's Industry Leaders. [Online] Available at: http://chipdesignmag.com/display.php?articleId=4503 [Accessed 20 May 2014].