SLIDE 1

1

Rostra: A Framework for Detecting Redundant Object-Oriented Unit Tests

Tao Xie (1), Darko Marinov (2), David Notkin (1)

(1) Dept. of Computer Science & Engineering, University of Washington

(2) MIT Computer Science and Artificial Intelligence Laboratory (UIUC)

23 Sept. 2004

ASE 2004, Linz, Austria

SLIDE 2

2

Motivation

Tool-generated test cases
Many test cases
Important to reduce by eliminating “redundant” test cases
Need automation
Common approach
Identify “similar” test cases and eliminate
Without reducing “quality” of test suite*
Object-oriented programs
Test case is a sequence of method calls on an object
Note: Unit tests only

*Some reduction in fault detection may be tolerated!

SLIDE 3

3

Example Code

public class IntStack {
    private int[] store;
    private int size;
    public IntStack() { … }
    public void push(int value) { … }
    public int pop() { … }
    public boolean isEmpty() { … }
    public boolean equals(Object o) { … }
}

[Henkel&Diwan 03]

SLIDE 4

4

Example Tests

Test 1 (T1):
IntStack s1 = new IntStack();
s1.isEmpty();
s1.push(3);
s1.push(2);
s1.pop();
s1.push(5);

Test 2 (T2):
IntStack s2 = new IntStack();
s2.push(3);
s2.push(5);

Test 3 (T3):
IntStack s3 = new IntStack();
s3.push(3);
s3.push(2);
s3.pop();

SLIDE 5

5

Same inputs ⇒ Same behavior

Method Execution

Input = Object state @entry + Method arguments
Output = Object state @exit + Method return

Testing a method with the same inputs is unnecessary
Assumption: deterministic method

How to represent object states?

SLIDE 6

6

Redundant Test Cases Defined

Equivalent method executions
The same method names, signatures, and inputs (equivalent object states @entry and arguments)

Redundant test case:
A test case is redundant for a test suite if the test suite has exercised method executions equivalent to all method executions exercised by the test case
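
The definition above can be sketched as a set-membership check. This is a minimal illustration, not the Rostra implementation: the class name `RedundancySketch`, the string keys, and the symbolic state labels `S0` to `S3` are assumptions made for the example. Each key stands for one method execution (method name, arguments, and an entry-state representation).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: a test is a list of method-execution keys.
class RedundancySketch {

    // Returns the indices of tests whose method executions were all
    // exercised already by earlier, non-redundant tests in the suite.
    static List<Integer> redundantTests(List<List<String>> suite) {
        Set<String> seen = new HashSet<>();
        List<Integer> redundant = new ArrayList<>();
        for (int i = 0; i < suite.size(); i++) {
            List<String> executions = suite.get(i);
            if (seen.containsAll(executions)) {
                redundant.add(i);          // nothing new is exercised
            } else {
                seen.addAll(executions);   // keep the test, remember its executions
            }
        }
        return redundant;
    }

    public static void main(String[] args) {
        // Keys use made-up labels S0..S3 for entry states (an assumption).
        List<String> t1 = Arrays.asList("<init>()", "isEmpty()@S0", "push(3)@S0",
                                        "push(2)@S1", "pop()@S2", "push(5)@S3");
        List<String> t2 = Arrays.asList("<init>()", "push(3)@S0", "push(5)@S1");
        List<String> t3 = Arrays.asList("<init>()", "push(3)@S0", "push(2)@S1", "pop()@S2");
        System.out.println(redundantTests(Arrays.asList(t1, t2, t3))); // prints [2]
    }
}
```

With these keys, only the third test (index 2) is reported, because every one of its method executions already appears in the first test. How the entry-state part of a key is computed is exactly what the five state-representation techniques vary.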

SLIDE 7

7

Related Work

State equivalence using observational equivalence [Bernot et al. 91, Doong&Frankl 94, Henkel&Diwan 03]
for verifying or inferring algebraic specifications
Expensive because of the number of sequences
State equivalence based on user-defined abstraction functions [Grieskamp et al. 02]
AsmLT tool for conformance testing
Need to define the function

SLIDE 8

8

Five State-Representation Techniques

Method-sequence representations
WholeSeq
The entire sequence
ModifyingSeq
Ignore methods that don’t modify the state
Concrete-state representations
WholeState
The full concrete state
MonitorEquals
Relevant parts of the concrete state
PairwiseEquals
equals() method used to compare pairs of states

SLIDE 9

9

WholeSeq Representation

Notation: methodName(entryState, methodArgs).state [Henkel&Diwan 03]

Test 1 (T1): IntStack s1 = new IntStack(); s1.isEmpty(); s1.push(3); s1.push(2); s1.pop(); s1.push(5);
Test 3 (T3): IntStack s3 = new IntStack(); s3.push(3); s3.push(2); s3.pop();

Method sequences that create objects

SLIDE 13

13

WholeSeq Representation

Notation: methodName(entryState, methodArgs).state [Henkel&Diwan 03]

Test 1 (T1): IntStack s1 = new IntStack(); s1.isEmpty(); s1.push(3); s1.push(2); s1.pop(); s1.push(5);
Test 3 (T3): IntStack s3 = new IntStack(); s3.push(3); s3.push(2); s3.pop();

Method sequences that create objects

Entry state of s1.push(2): push(isEmpty(<init>( ).state).state, 3).state
Entry state of s3.push(2): push(<init>( ).state, 3).state
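
The nesting in the notation can be reproduced with a few lines of string building. This is a sketch under assumptions: `WholeSeqSketch` and `apply` are names invented here, and the expressions are plain strings rather than whatever structure the tool itself uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the methodName(entryState, methodArgs).state notation.
class WholeSeqSketch {

    // Wraps the entry-state expression in one more method call.
    // A null entry state marks a constructor call, which has no receiver state yet.
    static String apply(String entryState, String method, String... args) {
        List<String> parts = new ArrayList<>();
        if (entryState != null) {
            parts.add(entryState);
        }
        parts.addAll(Arrays.asList(args));
        return method + "(" + String.join(", ", parts) + ").state";
    }

    public static void main(String[] unused) {
        // Entry state of s1.push(2) in T1: new IntStack(); isEmpty(); push(3)
        String s1 = apply(apply(apply(null, "<init>"), "isEmpty"), "push", "3");
        // Entry state of s3.push(2) in T3: new IntStack(); push(3)
        String s3 = apply(apply(null, "<init>"), "push", "3");
        System.out.println(s1); // push(isEmpty(<init>().state).state, 3).state
        System.out.println(s3); // push(<init>().state, 3).state
        System.out.println(s1.equals(s3)); // false
    }
}
```

Because the whole call history is part of the representation, the isEmpty() call alone is enough to make the two entry states differ, so WholeSeq does not treat these two push(2) executions as equivalent.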

SLIDE 14

14

ModifyingSeq Representation

State-modifying method sequences that create objects

Test 1 (T1): IntStack s1 = new IntStack(); s1.isEmpty(); s1.push(3); s1.push(2); s1.pop(); s1.push(5);
Test 3 (T3): IntStack s3 = new IntStack(); s3.push(3); s3.push(2); s3.pop();

Entry state of s1.push(2):
WholeSeq: push(isEmpty(<init>( ).state).state, 3).state
ModifyingSeq: push(<init>( ).state, 3).state (isEmpty() does not modify the state, so it is dropped)
Entry state of s3.push(2): push(<init>( ).state, 3).state
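
Dropping non-modifying calls can be sketched as a filter applied while the expression is built. The `Call` type and its `modifies` flag are assumptions for illustration; the flag is set by hand here, and the slides do not say how the tool decides whether an execution modified the state.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: build a state expression from state-modifying calls only.
class ModifyingSeqSketch {

    // One method call on the receiver; `modifies` is supplied by hand in this sketch.
    static final class Call {
        final String method;
        final String arg;
        final boolean modifies;

        Call(String method, String arg, boolean modifies) {
            this.method = method;
            this.arg = arg;
            this.modifies = modifies;
        }
    }

    static String represent(List<Call> calls) {
        String state = "";
        for (Call c : calls) {
            if (!c.modifies) {
                continue; // calls that leave the state unchanged are ignored
            }
            String sep = (state.isEmpty() || c.arg.isEmpty()) ? "" : ", ";
            state = c.method + "(" + state + sep + c.arg + ").state";
        }
        return state;
    }

    public static void main(String[] unused) {
        // Calls before s1.push(2) in T1 and before s3.push(2) in T3.
        List<Call> t1 = Arrays.asList(new Call("<init>", "", true),
                                      new Call("isEmpty", "", false),
                                      new Call("push", "3", true));
        List<Call> t3 = Arrays.asList(new Call("<init>", "", true),
                                      new Call("push", "3", true));
        System.out.println(represent(t1)); // push(<init>().state, 3).state
        System.out.println(represent(t1).equals(represent(t3))); // true
    }
}
```

With isEmpty() filtered out, both entry states reduce to the same expression, so the two push(2) executions are treated as equivalent.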

SLIDE 15

15

WholeState Representation

The entire concrete state reachable from the object

Test 1 (T1): IntStack s1 = new IntStack(); s1.isEmpty(); s1.push(3); s1.push(2); s1.pop(); s1.push(5);
Test 2 (T2): IntStack s2 = new IntStack(); s2.push(3); s2.push(5);

Entry state of s1.push(5): store.length = 3, store[0] = 3, store[1] = 2, store[2] = 0, size = 1
Entry state of s2.push(5): store.length = 3, store[0] = 3, store[1] = 0, store[2] = 0, size = 1
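
A small sketch shows how two stacks with the same logical content can still differ in their whole concrete state. The nested `Stack` class is a hypothetical stand-in for IntStack: the slides elide the method bodies, so the initial capacity of 3 and a pop() that leaves the old value in the array are assumptions chosen to match the states listed above.

```java
import java.util.Arrays;

// Hypothetical sketch of a WholeState-style comparison.
class WholeStateSketch {

    // Stand-in for IntStack; the bodies are assumptions, not the original code.
    static final class Stack {
        int[] store = new int[3];
        int size = 0;

        void push(int value) {
            if (size == store.length) {
                store = Arrays.copyOf(store, 2 * size);
            }
            store[size++] = value;
        }

        int pop() {
            return store[--size]; // the popped value stays in the array
        }
    }

    // Linearizes every field reachable from the object.
    static String wholeState(Stack s) {
        return "store=" + Arrays.toString(s.store) + ", size=" + s.size;
    }

    public static void main(String[] unused) {
        Stack s1 = new Stack();
        s1.push(3);
        s1.push(2);
        s1.pop();
        Stack s2 = new Stack();
        s2.push(3);
        System.out.println(wholeState(s1)); // store=[3, 2, 0], size=1
        System.out.println(wholeState(s2)); // store=[3, 0, 0], size=1
        System.out.println(wholeState(s1).equals(wholeState(s2))); // false
    }
}
```

The leftover 2 in store[1] makes the two states unequal even though both stacks hold the single element 3.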

SLIDE 16

16

MonitorEquals Representation

The relevant part of the concrete state defined by equals (invoking obj.equals(obj) and monitoring field accesses)

Test 1 (T1): IntStack s1 = new IntStack(); s1.isEmpty(); s1.push(3); s1.push(2); s1.pop(); s1.push(5);
Test 2 (T2): IntStack s2 = new IntStack(); s2.push(3); s2.push(5);

Entry state of s1.push(5): store.length = 3, store[0] = 3, store[1] = 2, store[2] = 0, size = 1
Entry state of s2.push(5): store.length = 3, store[0] = 3, store[1] = 0, store[2] = 0, size = 1
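
The effect of restricting the state to what equals reads can be sketched as follows. This is a simplification: the real technique monitors field accesses at run time, whereas this sketch hard-codes the assumption that equals() reads only `size` and the first `size` elements of `store`. The name `relevantState` is invented for the example.

```java
import java.util.Arrays;

// Hypothetical sketch: keep only the fields that equals() is assumed to read.
class MonitorEqualsSketch {

    // Assumption: equals() reads size and store[0..size-1], nothing else.
    static String relevantState(int[] store, int size) {
        int[] accessed = Arrays.copyOfRange(store, 0, size);
        return "size=" + size + ", elements=" + Arrays.toString(accessed);
    }

    public static void main(String[] unused) {
        // Concrete entry states of s1.push(5) and s2.push(5) from the slide.
        String r1 = relevantState(new int[] {3, 2, 0}, 1);
        String r2 = relevantState(new int[] {3, 0, 0}, 1);
        System.out.println(r1); // size=1, elements=[3]
        System.out.println(r1.equals(r2)); // true
    }
}
```

Under that assumption the stale value in store[1] is never part of the representation, so the two entry states match.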

SLIDE 17

17

PairwiseEquals Representation

The results of equals invoked to compare pairs of states

Test 1 (T1): IntStack s1 = new IntStack(); s1.isEmpty(); s1.push(3); s1.push(2); s1.pop(); s1.push(5);
Test 2 (T2): IntStack s2 = new IntStack(); s2.push(3); s2.push(5);

At the entry of s1.push(5) and s2.push(5): s1.equals(s2) == true

Fundamental difference between MonitorEquals and PairwiseEquals
MonitorEquals monitors field accesses during execution of the equals() method and compares the monitored parts
PairwiseEquals relies only on the output of the equals() method
Example of sets
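
A pairwise comparison can be sketched as a linear scan that calls equals() against every state collected so far. The nested `Stack` class, its equals() body, and the helper `seenBefore` are assumptions for illustration, since the slides elide the original method bodies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a PairwiseEquals-style state comparison.
class PairwiseEqualsSketch {

    // Stand-in for IntStack with a user-defined equals(); bodies are assumptions.
    static final class Stack {
        private int[] store = new int[3];
        private int size = 0;

        void push(int value) {
            if (size == store.length) {
                store = java.util.Arrays.copyOf(store, 2 * size);
            }
            store[size++] = value;
        }

        int pop() {
            return store[--size];
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Stack)) {
                return false;
            }
            Stack other = (Stack) o;
            if (size != other.size) {
                return false;
            }
            for (int i = 0; i < size; i++) {
                if (store[i] != other.store[i]) {
                    return false;
                }
            }
            return true;
        }

        @Override
        public int hashCode() {
            int h = size;
            for (int i = 0; i < size; i++) {
                h = 31 * h + store[i];
            }
            return h;
        }
    }

    // Returns true if equals() matches any collected state; otherwise records the state.
    static boolean seenBefore(List<Stack> collected, Stack state) {
        for (Stack s : collected) {
            if (s.equals(state)) {
                return true;
            }
        }
        collected.add(state);
        return false;
    }

    public static void main(String[] unused) {
        Stack s1 = new Stack();
        s1.push(3);
        s1.push(2);
        s1.pop();
        Stack s2 = new Stack();
        s2.push(3);
        List<Stack> collected = new ArrayList<>();
        System.out.println(seenBefore(collected, s1)); // false
        System.out.println(seenBefore(collected, s2)); // true
    }
}
```

The sketch stores live objects for brevity; a real collector would need to keep copies, because later calls in the test mutate the receiver.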

SLIDE 18

18

Detected Redundant Tests

technique         detected redundant tests w.r.t. T1
WholeSeq          (none)
ModifyingSeq      T3
WholeState        T3
MonitorEquals     T3, T2
PairwiseEquals    T3, T2

Test 1 (T1): IntStack s1 = new IntStack(); s1.isEmpty(); s1.push(3); s1.push(2); s1.pop(); s1.push(5);
Test 2 (T2): IntStack s2 = new IntStack(); s2.push(3); s2.push(5);
Test 3 (T3): IntStack s3 = new IntStack(); s3.push(3); s3.push(2); s3.pop();

SLIDE 19

19

Experiment: Evaluated Test Generation Tools

ParaSoft Jtest 4.5
A commercial Java testing tool
Generates tests with method-call lengths up to three
JCrasher 0.2.7
An academic robustness testing tool
Generates tests with method-call lengths of one

SLIDE 20

20

Questions to Be Answered

How much do we benefit after applying

Rostra on tests generated by Jtest and JCrasher?

Does redundant-test removal decrease test

suite quality?

SLIDE 21

21

Experimental Subjects

class            methods   public methods   ncnb loc   Jtest tests   JCrasher tests
IntStack         5         5                44         94            6
UBStack          11        11               106        1423          14
ShoppingCart     9         8                70         470           31
BankAccount      7         7                34         519           135
BinSearchTree    13        8                246        277           56
BinomialHeap     22        17               535        6205          438
DisjSet          10        7                166        779           64
FibonacciHeap    24        14               468        3743          150
HashMap          27        19               597        5186          47
LinkedList       38        32               398        3028          86
TreeMap          61        25               949        931           1000

SLIDE 22

22

Assumptions About Subjects

Method-sequence representations assume that each method does not modify argument state

MonitorEquals and PairwiseEquals representations assume a user-defined equals()

SLIDE 23

23

Quality of Original Test Suites

measurement                               Jtest-generated tests   JCrasher-generated tests
Avg num uncaught exceptions               4                       2
Avg branch cov                            77%                     52%
Avg mutant killing ratio (600 mutants)    53%                     30%

SLIDE 24

24

Elapsed Real Time in Minimizing Jtest-Generated Tests (in secs)

[Bar chart: elapsed real time in seconds (axis ticks 50 to 300) for each subject (IntStack, UBStack, ShoppingCart, BankAccount, BinSearchTree, BinomialHeap, DisjSet, FibonacciHeap, HashMap, LinkedList, TreeMap) under WholeSeq, ModifyingSeq, WholeState, MonitorEquals, and PairwiseEquals]

SLIDE 25

25

Elapsed Real Time in Minimizing JCrasher-Generated Tests (in secs)

[Bar chart: elapsed real time in seconds (axis ticks 1 to 8) for each subject (IntStack, UBStack, ShoppingCart, BankAccount, BinSearchTree, BinomialHeap, DisjSet, FibonacciHeap, HashMap, LinkedList, TreeMap) under WholeSeq, ModifyingSeq, WholeState, MonitorEquals, and PairwiseEquals]

SLIDE 26

26

Redundancy among Jtest-generated Tests

The last three techniques detect around 90% redundant tests
Detected redundancy in increasing order for five techniques

[Bar chart: percentage of redundant tests (axis 0% to 100%) for each subject (IntStack, UBStack, ShoppingCart, BankAccount, BinSearchTree, BinomialHeap, DisjSet, FibonacciHeap, HashMap, LinkedList, TreeMap) under WholeSeq, ModifyingSeq, WholeState, MonitorEquals, and PairwiseEquals]

SLIDE 27

27

Redundancy among JCrasher-generated Tests

The last three techniques detect over 50% redundant tests on half of the subjects
JCrasher generates fewer tests and shorter tests

[Bar chart: fraction of redundant tests (axis ticks 0.1 to 0.9) for each subject (IntStack, UBStack, ShoppingCart, BankAccount, BinSearchTree, BinomialHeap, DisjSet, FibonacciHeap, HashMap, LinkedList, TreeMap) under WholeSeq, ModifyingSeq, WholeState, MonitorEquals, and PairwiseEquals]

SLIDE 28

28

Quality of Minimized Test Suites

All five techniques on JCrasher preserve all measurements

The first three techniques on Jtest preserve all measurements

The two equals-based techniques on Jtest cause a decrease (only a small loss, in 2 programs)
in branch cov %
in mutant killing %

SLIDE 29

29

Comparison of Five Techniques

Time and space taken to find redundant tests
from a couple of seconds to several minutes across subjects
in roughly increasing order except for PairwiseEquals (being the least expensive)

The number of redundant tests found
in increasing order

SLIDE 30

30

Conclusions

Redundant tests add cost without any benefit
Existing test generation tools can potentially be improved (by incorporating the Rostra framework)

The experimental results have shown
High redundancy among their generated tests
Removing them does not decrease test suite quality
Rostra framework useful in test minimization, assessment, selection, and generation

SLIDE 31

31

Evolution of ParaSoft Jtest

Version 4.5 (released in March 2002) allows method-call lengths of 1 to 3 [studied in this work]

Version 5.0 (released in Feb 2004) allows method-call length of only 1

ParaSoft notified the authors last week that Version 6.0 (internal version, not yet released) has addressed the test redundancy issue identified by the authors and added back the option to generate long call sequences

SLIDE 32

32

Questions?

SLIDE 33

33

Threats to Validity

Representative of true practice?
Subject programs, third-party test generation tools

Instrumentation effects that bias the results
Faults in tools (Rostra, Jtest, JCrasher, measurement tools)

SLIDE 34

34

Applications

Assessment: compare the quality of different test suites.

Selection: select a subset of automatically generated tests to augment an existing test suite.

Minimization: minimize an automatically generated test suite for correctness inspection and regression executions.

Generation: avoid generating and executing redundant tests.

SLIDE 35

35

Related Work

Test selection or minimization with some loss in the quality of test suites [Rothermel et al. 98, Chang&Richardson 99, Harder et al. 03, Xie&Notkin 03]

for regression testing or test inspection