Comparing the Effectiveness of Testing Techniques



SLIDE 1

Comparison relations Probabilistic measures Limitations Empirical comparison Conclusions

Comparing the Effectiveness of Testing Techniques

a paper by Elaine J. Weyuker
presented by Matthias Kegele
June 8, 2011

SLIDE 2

Agenda

◮ Comparison relations
◮ Probabilistic measures
◮ Limitations of formal analysis
◮ Empirical comparison of criteria
◮ Conclusions

SLIDE 3

On comparing effectiveness of testing techniques

◮ ultimate goal: higher dependability of the system under test (SUT)
◮ definition of "effective"
◮ finding faults: quality over quantity?
◮ real life versus in vitro

SLIDE 4

Comparison relations

◮ subsumption relation
◮ power relation
◮ BETTER relation

SLIDE 5

Subsumption relation (1)

◮ the natural way to compare criteria
◮ program P, test set TS, test criteria C1 and C2
◮ C1 subsumes C2
◮ ∀P, ∀TS for P: satisfies(TS, C1) ⇒ satisfies(TS, C2)
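
The subsumption definition can be checked by brute force on a small finite input domain. The following is a minimal sketch, assuming criteria are modelled as predicates over test sets; the names `subsumes`, `c_strong`, and `c_weak` and the toy domain are hypothetical and not taken from the paper. It checks one fixed program only, whereas the relation itself quantifies over all programs.

```python
from itertools import combinations


def all_test_sets(domain):
    """Yield every non-empty subset of a finite input domain as a frozenset."""
    items = sorted(domain)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def subsumes(c1, c2, domain):
    """True if every test set over `domain` that satisfies c1 also satisfies c2."""
    return all(c2(ts) for ts in all_test_sets(domain) if c1(ts))


# Hypothetical toy criteria over the inputs 0..4 (assumptions for illustration).
DOMAIN = {0, 1, 2, 3, 4}


def c_strong(ts):
    """Needs at least one input from {0, 1, 2} and at least one from {3, 4}."""
    return bool(ts & {0, 1, 2}) and bool(ts & {3, 4})


def c_weak(ts):
    """Needs at least one input from {3, 4}."""
    return bool(ts & {3, 4})


strong_subsumes_weak = subsumes(c_strong, c_weak, DOMAIN)
weak_subsumes_strong = subsumes(c_weak, c_strong, DOMAIN)
```

Here `c_strong` subsumes `c_weak`, but not the other way round, because the test set {3} satisfies `c_weak` without satisfying `c_strong`.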

SLIDE 6

Subsumption relation (2)

◮ many testing criteria are incomparable
◮ can be misleading
◮ wide spectrum of test suites that satisfy a given criterion
◮ little or no guidance on how to choose a test suite

SLIDE 7

Power relation (1)

◮ detection of failure
◮ C1 is at least as powerful as C2
◮ detects(C2, failure) ⇒ detects(C1, failure)

SLIDE 8

Power relation (2)

◮ incomparability is still a problem
◮ even if C1 is at least as powerful as C2:
◮ C2 may expose some failures more often than C1
◮ there may be failures that neither C1 nor C2 finds

SLIDE 9

BETTER relation (1)

◮ test case tc required by criterion C to test P:

requires(C, tc) ⇔ ∀ts. satisfies(ts, C) ⇒ tc ∈ ts

◮ C1 is BETTER than C2
◮ ∀tc. requires(C2, tc) ⇒ requires(C1, tc)
◮ relevant test sets (monotonic)
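
For a finite domain, the required test cases of a criterion are the inputs shared by all of its satisfying test sets, so the BETTER check reduces to a subset test. A minimal sketch under that reading; the function names, the toy domain, and both criteria are hypothetical:

```python
from itertools import combinations


def satisfying_test_sets(criterion, domain):
    """All non-empty subsets of `domain` that satisfy `criterion`."""
    items = sorted(domain)
    return [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
        if criterion(frozenset(combo))
    ]


def required_test_cases(criterion, domain):
    """Inputs contained in every test set that satisfies `criterion`."""
    sets = satisfying_test_sets(criterion, domain)
    if not sets:
        # No satisfying test set: the universal quantifier holds vacuously.
        return frozenset(domain)
    return frozenset.intersection(*sets)


def better(c1, c2, domain):
    """True if every test case required by c2 is also required by c1."""
    return required_test_cases(c2, domain) <= required_test_cases(c1, domain)


# Hypothetical criteria over the inputs 0..3 (assumptions for illustration).
DOMAIN = {0, 1, 2, 3}


def c_both_bounds(ts):
    """Demands both boundary inputs 0 and 3."""
    return {0, 3} <= ts


def c_lower_bound(ts):
    """Demands only the lower boundary input 0."""
    return 0 in ts


required_both = required_test_cases(c_both_bounds, DOMAIN)
required_lower = required_test_cases(c_lower_bound, DOMAIN)
```

A criterion such as "one input from each subdomain" requires no specific test case at all, which is why the relation rarely separates practical criteria.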

SLIDE 10

BETTER relation (2)

◮ incomparability is still a problem
◮ very few criteria require a specific test case

SLIDE 11

Probabilistic measures

◮ covers relation
◮ properly covers relation
◮ expected number of failures detected

SLIDE 12

Input domains (1)

◮ program P with domain D (possible inputs)
◮ division of D into subsets: subdomains Di
◮ test set: one test case per subdomain

SLIDE 13

Input domains (2)

◮ program P has domain D = {0, 1, 2, 3, 4}
◮ failure-causing input: 0
◮ C1 requires one test case from the subdomain {0, 1, 2} and one from {3, 4}
◮ C2 requires one test case from the subdomain {0, 1, 2} and one from {0, 3, 4}
◮ test sets that satisfy C1: (0, 3), (0, 4), (1, 3), (1, 4), (2, 3), (2, 4)
◮ test sets that satisfy C2: (0, 0), (0, 3), (0, 4), (1, 0), (1, 3), (1, 4), (2, 0), (2, 3), (2, 4)
◮ P(find failure with C1) = 1/3
◮ P(find failure with C2) = 5/9
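
The two probabilities can be reproduced by enumerating the test sets. This sketch assumes each test case is drawn uniformly and independently from its subdomain, so that all listed test sets are equally likely; the function name `failure_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product

FAILING = {0}  # the single failure-causing input from the example


def failure_probability(subdomains, failing=FAILING):
    """Fraction of test sets (one input per subdomain) that hit a failing input."""
    test_sets = list(product(*subdomains))
    hits = sum(1 for ts in test_sets if any(tc in failing for tc in ts))
    return Fraction(hits, len(test_sets))


C1_SUBDOMAINS = [(0, 1, 2), (3, 4)]
C2_SUBDOMAINS = [(0, 1, 2), (0, 3, 4)]

p_c1 = failure_probability(C1_SUBDOMAINS)  # 2 of 6 test sets contain 0
p_c2 = failure_probability(C2_SUBDOMAINS)  # 5 of 9 test sets contain 0
```

C2 does better here only because its second subdomain also contains the failure-causing input 0.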

SLIDE 14

Covers relation (1)

◮ C1 covers C2
◮ $SD_C(P, S)$: subdomains used for generating test sets satisfying criterion C
◮ $\forall D \in SD_{C_2}(P, S)$ there exist $D_1, \dots, D_n \in SD_{C_1}(P, S)$ such that $D_1 \cup \dots \cup D_n = D$
◮ universally covers: ∀(P, S): C1 covers C2
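
For explicit finite subdomains, the covers condition can be tested directly. A minimal sketch for one fixed (P, S): it uses the fact that a set D is a union of some C1 subdomains exactly when the union of all C1 subdomains contained in D equals D. The subdomains of C1 and C2 are those of the input-domain example; the finer criterion C3 and all function names are hypothetical.

```python
def is_union_of(target, candidates):
    """True if `target` equals the union of some non-empty selection of `candidates`."""
    # Only candidates lying entirely inside `target` can take part in the union.
    inside = [c for c in candidates if c <= target]
    if not inside:
        return False
    return frozenset().union(*inside) == target


def covers(sd_c1, sd_c2):
    """True if every subdomain of C2 is a union of subdomains of C1."""
    return all(is_union_of(d, sd_c1) for d in sd_c2)


SD_C1 = [frozenset({0, 1, 2}), frozenset({3, 4})]
SD_C2 = [frozenset({0, 1, 2}), frozenset({0, 3, 4})]
# Hypothetical finer criterion, added only for illustration.
SD_C3 = [frozenset({0}), frozenset({1, 2}), frozenset({3, 4})]

c1_covers_c2 = covers(SD_C1, SD_C2)
c2_covers_c1 = covers(SD_C2, SD_C1)
c3_covers_c1 = covers(SD_C3, SD_C1)
c3_covers_c2 = covers(SD_C3, SD_C2)
```

In the example neither C1 nor C2 covers the other, while the finer C3 covers both.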

SLIDE 15

Covers relation (2)

◮ M = P("test set exposes at least one fault")
◮ specification S; $d_i$ = size of subdomain $D_i$; $m_i$ = number of failure-causing inputs (test cases) in $D_i$
◮ $M(C, P, S) = 1 - \prod_{i=1}^{n} \left(1 - \frac{m_i}{d_i}\right)$
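
Applied to the subdomains of the input-domain example (failure-causing input 0), the formula gives the same values as counting test sets. A minimal sketch, assuming one test case is drawn uniformly and independently from each subdomain; the function name `m_measure` is hypothetical.

```python
from fractions import Fraction


def m_measure(subdomains, failing):
    """M = 1 - prod(1 - m_i / d_i) over all subdomains."""
    prob_no_failure = Fraction(1)
    for subdomain in subdomains:
        m_i = len(set(subdomain) & failing)  # failure-causing inputs in D_i
        d_i = len(subdomain)                 # size of D_i
        prob_no_failure *= 1 - Fraction(m_i, d_i)
    return 1 - prob_no_failure


FAILING = {0}
m_c1 = m_measure([{0, 1, 2}, {3, 4}], FAILING)     # 1 - (2/3) * 1
m_c2 = m_measure([{0, 1, 2}, {0, 3, 4}], FAILING)  # 1 - (2/3) * (2/3)
```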

SLIDE 16

Covers relation (3)

◮ deficit: covering alone does not guarantee a higher M
◮ ∃C1, C2, P, S: C1 covers C2, yet M(C1, P, S) < M(C2, P, S)

SLIDE 17

Properly covers relation

◮ C1 properly covers C2 ⇒ M(C1, P, S) ≥ M(C2, P, S)
◮ each subdomain of C2 is covered by a union of C1 subdomains
◮ no C1 subdomain occurs more often in the covering than it does in $SD_{C_1}$ (cf. example)
◮ properly universally covers: ∀(P, S): C1 properly covers C2

SLIDE 18

Expected number of failures detected

◮ $E(C, P, S) = \sum_{i=1}^{n} \frac{m_i}{d_i}$
◮ C1 properly covers C2 ⇒ E(C1, P, S) ≥ E(C2, P, S)
◮ ability to rank criteria (next slide)
◮ properly covers ≠ "will find more faults", only "is more likely to"
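
As a worked instance, E can be evaluated for the two criteria of the input-domain example (failure-causing input 0). This is a minimal sketch under the same uniform-selection assumption; the function name `expected_failures` is hypothetical.

```python
from fractions import Fraction


def expected_failures(subdomains, failing):
    """E = sum(m_i / d_i): expected number of failure-exposing test cases."""
    return sum(
        (Fraction(len(set(d) & failing), len(d)) for d in subdomains),
        Fraction(0),
    )


FAILING = {0}
e_c1 = expected_failures([{0, 1, 2}, {3, 4}], FAILING)     # 1/3 + 0
e_c2 = expected_failures([{0, 1, 2}, {0, 3, 4}], FAILING)  # 1/3 + 1/3
```

Unlike M, the value of E can exceed 1 when several subdomains contain failure-causing inputs, because it counts expected failures rather than the probability of at least one.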

SLIDE 19

[Figure: ranking of criteria; not preserved in this transcript]

SLIDE 20

Limitations of formal analysis

◮ relations compare idealized versions of testing strategies, not practical ones
◮ missing risk analysis: high-consequence faults, trivial faults
◮ no provision for human variability: experience, expertise, acquired intuition
◮ cost of testing: is it beneficial to use certain criteria?

SLIDE 21

Empirical comparison of criteria (1)

◮ formal scientific experiment:
  ◮ synthetic faults seeded into a program
  ◮ small programs
  ◮ not representative
◮ case study:
  ◮ large industrial software system
  ◮ containing real faults
  ◮ need for modelling the system (expensive)
  ◮ may not be representative of a wider class of programs

SLIDE 22

Empirical comparison of criteria (2)

◮ repeating similar case studies ⇒ body of knowledge
◮ different sorts of systems, development environments, test personnel, languages
◮ severity of faults in case studies?

SLIDE 23

Conclusions (1)

◮ formal comparison relations
  ◮ intuitive but profound problems
  ◮ increase of dependability?
  ◮ C1 is better than C2, but is C1 good?

SLIDE 24

Conclusions (2)

◮ probabilistic measures
  ◮ more appropriate, still serious flaws
  ◮ absolute instead of relative comparison possible?
◮ empirical studies
  ◮ how it works in practice
  ◮ test the ultimate goal of achieving higher dependability

SLIDE 25

Thank you for your attention! Questions?
