Equivalence Modulo States Bo Wang, Yingfei Xiong , Yangqingwei Shi, - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Faster Mutation Analysis via Equivalence Modulo States

Bo Wang, Yingfei Xiong, Yangqingwei Shi, Lu Zhang, Dan Hao
Peking University
July 12, 2017

slide-2
SLIDE 2

Mutation Analysis

  • Mutation analysis is a fundamental software analysis technique
  • Mutation Testing [DeMillo & Lipton, 1970]
  • Mutation-Based Test Generation [Fraser & Zeller, 2012]
  • Determining Mutant Utility [Just et al., 2017]
  • Mutation-based Fault Localization [Papadakis & Traon, 2012]
  • Generate-Validate Program Repair [Weimer et al., 2013]
  • Testing Software Product Lines [Devroey et al., 2014]

[Figure: Program → Mutate → Mutants → Compile & Test → Test Results]

slide-3
SLIDE 3

Scalability: A Key Limiting Issue

  • The testing time of a single program is amplified N times
  • N is the number of mutants
  • N is usually large
  • N grows with the size of the program
  • Plain mutation analysis scales only to programs of fewer than 10k lines of code

[Figure: Program → Mutate → Mutants → Compile & Test → Test Results]

slide-4
SLIDE 4

Redundant Computations

  • Many computation steps in mutation analysis are equivalent
  • Reusing them could enhance scalability
slide-5
SLIDE 5

Example

Program:   p(): 1: a=x(); 2: a=a/2; 3: y(a);
Mutant 1:  p(): 1: a=x(); 2: a=a-2; 3: y(a);
Mutant 2:  p(): 1: a=x(); 2: a=a+2; 3: y(a);
Mutant 3:  p(): 1: a=x(); 2: a=a*2; 3: y(a);
Test:      test: p(); assert(…);

[Figure: Mutate → Compile → Execute. Each mutant i is compiled into Binary i and executed to produce Result i, for i = 1, 2, 3]

slide-6
SLIDE 6

Existing work 1: Mutation Schemata [Untch, Offutt, Harrold, 1993]

  • Procedures x() and y() are the same in all three mutants, but they are compiled three times
  • Redundancy in compilation

Mutant 1:  p(): 1: a=x(); 2: a=a-2; 3: y(a);   x(): …   y(): …
Mutant 2:  p(): 1: a=x(); 2: a=a+2; 3: y(a);   x(): …   y(): …
Mutant 3:  p(): 1: a=x(); 2: a=a*2; 3: y(a);   x(): …   y(): …

slide-7
SLIDE 7

Existing work 1: Mutation Schemata [Untch, Offutt, Harrold, 1993]

  • Generate one big program that compiles once
  • Mutants are selected dynamically through input parameters

Mutant 1:  p(): 1: a=x(); 2: a=a-2; 3: y(a);
Mutant 2:  p(): 1: a=x(); 2: a=a+2; 3: y(a);
Mutant 3:  p(): 1: a=x(); 2: a=a*2; 3: y(a);

Combined program:
p(): 1: a=x();
     2: if (mut==1) a=a-2 else if (mut==2) a=a+2 else a=a*2;
     3: y(a);
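
The schemata idea above can be sketched in a few lines of Python. This is only an illustration, not the original tool: the stubs x() and y() and the selector argument mut are assumptions made for the example.

```python
# Hypothetical stand-ins for the unmutated procedures x() and y().
def x():
    return 2


def y(a):
    return a


def p(mut):
    """One program containing all three mutants of line 2, selected by `mut`."""
    a = x()              # line 1: shared by every mutant
    if mut == 1:         # line 2: the mutated statement
        a = a - 2
    elif mut == 2:
        a = a + 2
    else:
        a = a * 2
    return y(a)          # line 3: shared by every mutant


def run_all_mutants():
    # The program is built once and then executed once per mutant id.
    return {mut: p(mut) for mut in (1, 2, 3)}
```

Calling run_all_mutants() executes each mutant in turn without rebuilding the program, which is where the compilation redundancy is removed.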

slide-8
SLIDE 8

Existing work 2: Split-Stream Execution [King, Offutt, 1991][Tokumoto et al., 2016][Gopinath, Jensen, Groce, 2016]

  • The computations before the first mutated statement are redundant

[Figure: the three mutants (a=a-2, a=a+2, a=a*2 at line 2) executed separately; every run repeats a=x() before its own mutated statement and y(a)]

slide-9
SLIDE 9

Existing work 2: Split-Stream Execution

  • Start with one process
  • Fork processes when mutated statements are encountered

[Figure: a=x() runs once in a single process; fork() is called twice at the mutated statement, so a=a-2, a=a+2, and a=a*2 each continue to y(a) in their own process]
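
A rough simulation of split-stream execution, under stated assumptions: copying a state dictionary stands in for fork(), x() is a hypothetical stub, and y(a) is modeled as simply observing a. The step counter shows the saving from running the shared prefix once.

```python
import copy


def x():
    return 2  # hypothetical stub for the shared prefix


# The three mutated versions of line 2, keyed by an illustrative mutant name.
MUTATED_LINE_2 = {
    "m1": lambda s: s.update(a=s["a"] - 2),
    "m2": lambda s: s.update(a=s["a"] + 2),
    "m3": lambda s: s.update(a=s["a"] * 2),
}


def split_stream():
    steps = 0
    state = {"a": x()}                 # line 1 runs once for all mutants
    steps += 1
    results = {}
    for name, stmt in MUTATED_LINE_2.items():
        child = copy.deepcopy(state)   # stands in for fork()
        stmt(child)                    # line 2: mutant-specific statement
        steps += 1
        results[name] = child["a"]     # line 3: y(a), modeled as observing a
        steps += 1
    return results, steps
```

Running the three mutants independently would execute 9 statements; this sketch executes 7, because only line 1 is shared.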

slide-10
SLIDE 10

Redundancy After the First Mutated Statement

[Figure: states around the mutated statement at line 2. All three mutants enter it with a==2; a=a-2 leaves a==0, while a=a+2 and a=a*2 both leave a==4]

slide-11
SLIDE 11

Our Contribution

  • Equivalence Modulo States
  • Two statements are equivalent modulo the current state if executing them from the current state leads to the same state
  • The statements a = a * 2 and a = a + 2 are equivalent modulo state 2, where a == 2
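
The definition can be checked directly on a toy state. The sketch below assumes a state is a dictionary of variables and a statement is a function that updates it in place; both representations are chosen for illustration only.

```python
import copy


def equivalent_modulo(state, stmt1, stmt2):
    """True if both statements map `state` to the same successor state."""
    s1, s2 = copy.deepcopy(state), copy.deepcopy(state)
    stmt1(s1)
    stmt2(s2)
    return s1 == s2


def times_two(s):
    s["a"] = s["a"] * 2


def plus_two(s):
    s["a"] = s["a"] + 2
```

With a == 2 the two statements are equivalent (both give 4); with a == 3 they are not (6 versus 5), so the equivalence depends on the state and not on the statements alone.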
slide-12
SLIDE 12

Mutation Analysis via Equivalence Modulo States

  • Start with a process representing all mutants
  • At each state, group the next statements into equivalence classes modulo the current state
  • Fork processes and execute each group in one process

[Figure: execution starts as one process representing m1, m2, m3; at statement 2, m1 (a=a-2) is split off while m2 and m3 (a=a+2, a=a*2) continue together, giving two processes]
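
The three steps above can be sketched as a small recursive interpreter. It is a simplification under several assumptions: states are flat dictionaries of hashable values, a recursive call stands in for a forked process, and the program encoding (one dictionary per line, with key None for the statement shared by all mutants) is invented for this example.

```python
import copy


def run(program, mutants, state, line=0, log=None):
    """Execute `mutants` together from `state`, splitting only when states differ.

    Returns ({mutant: final state}, log), where log has one entry per
    statement execution: (line index, mutants sharing that execution).
    """
    if log is None:
        log = []
    if line == len(program):
        return {m: state for m in mutants}, log
    # Trial-execute the next statement of each mutant and group the
    # mutants by the successor state they reach.
    groups = {}
    for m in mutants:
        stmt = program[line].get(m, program[line].get(None))
        nxt = copy.deepcopy(state)
        stmt(nxt)
        key = tuple(sorted(nxt.items()))
        groups.setdefault(key, (nxt, []))[1].append(m)
    results = {}
    for nxt, members in groups.values():   # one "process" per group
        log.append((line, tuple(members)))
        sub, _ = run(program, members, nxt, line + 1, log)
        results.update(sub)
    return results, log


PROGRAM = [
    {None: lambda s: s.update(a=2)},              # 1: a = x(), x() stubbed to 2
    {"m1": lambda s: s.update(a=s["a"] - 2),      # 2: mutated statement
     "m2": lambda s: s.update(a=s["a"] + 2),
     "m3": lambda s: s.update(a=s["a"] * 2)},
    {None: lambda s: s.update(out=s["a"])},       # 3: y(a), modeled as recording a
]
```

On this program, run(PROGRAM, ["m1", "m2", "m3"], {}) performs five statement executions: one for line 1, then two each for lines 2 and 3, because m2 and m3 reach the same state and stay in one group.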

slide-13
SLIDE 13

Challenges

  • Objective: Overheads << Benefits
  • Challenge 1: How to efficiently determine equivalences between statements?
  • Challenge 2: How to efficiently fork executions?
  • Challenge 3: How to efficiently classify the mutants?

[Figure: the two-process execution of m1, m2, m3 from the previous slide]

slide-14
SLIDE 14

Challenge 1: Determine Statement Equivalence

  • Perform trial executions of statements and record their changes to the state
  • State: a==2
  • a=a+2 ⟹ a → 4
  • a=a*2 ⟹ a → 4
  • Compare their changes to determine equivalence
  • Does not work for statements that make many changes
  • f(x, y), f(y, x)
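
A minimal sketch of trial execution, assuming a state is a dictionary of immutable values and a statement is a function that updates it in place. The helper name trial_changes is hypothetical.

```python
def trial_changes(state, stmt):
    """Run `stmt` on a scratch copy and return only the variables it changed."""
    scratch = dict(state)   # a shallow copy is enough for immutable values
    stmt(scratch)
    return {k: v for k, v in scratch.items()
            if k not in state or state[k] != v}


def plus_two(s):
    s["a"] = s["a"] + 2


def times_two(s):
    s["a"] = s["a"] * 2


def minus_two(s):
    s["a"] = s["a"] - 2
```

In the state a == 2, plus_two and times_two both record the change {"a": 4} and fall into one class, while minus_two records {"a": 0}. The cost of recording and comparing grows with the number of changed locations, which is why this direct approach breaks down for calls such as f(x, y).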
slide-15
SLIDE 15

Challenge 1: Determine Statement Equivalence

  • Record abstract changes that can be efficiently compared
  • Ensuring $c(s_1) \neq c(s_2) \Rightarrow a(s_1) \neq a(s_2)$
  • $s_1, s_2$: statements
  • $c(s)$: concrete changes made by $s$
  • $a(s)$: abstract changes made by $s$
  • Abstract changes of a method call: the values of its arguments
  • State: x = 2, y = 2
  • f(x, y) ⟹ <2,2>
  • f(y, x) ⟹ <2,2>
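
The method-call case can be illustrated as follows. The sketch assumes both calls target the same deterministic callee, and it encodes a call simply as a tuple of argument variable names; that encoding is made up for the example.

```python
def abstract_change(state, arg_names):
    """Abstract change of a call: the tuple of its argument values."""
    return tuple(state[name] for name in arg_names)


def calls_equivalent(state, args1, args2):
    # Equal abstract changes imply equal concrete changes, so the callee
    # does not have to be executed to decide equivalence.
    return abstract_change(state, args1) == abstract_change(state, args2)
```

With x = 2 and y = 2, calls_equivalent(state, ("x", "y"), ("y", "x")) is true because both calls receive <2,2>. The check may miss some equivalences (two different argument tuples could still lead to the same state), but it never reports a false one under the assumptions above.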
slide-16
SLIDE 16

Challenge 2: Fork Execution

  • Memory: the POSIX system call “fork()”
  • Implements the copy-on-write mechanism
  • Integrated with POSIX virtual memory management
  • Other resources: files, network accesses, databases
  • Solution 1: implement the copy-on-write mechanism
  • Solution 2: map them into memory
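
The memory part can be demonstrated with Python's os.fork, which wraps the POSIX call (so this sketch runs only on POSIX systems). The helper name and the pipe used to return each child's result are illustrative assumptions, not the tool's design.

```python
import os


def fork_per_statement(state, statements):
    """Run each statement in its own forked child and collect the resulting a."""
    results = {}
    for name, stmt in statements.items():
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: works on a private copy-on-write snapshot of memory.
            try:
                os.close(read_fd)
                stmt(state)
                os.write(write_fd, str(state["a"]).encode())
            finally:
                os._exit(0)            # never fall through into parent code
        # Parent: its own `state` is untouched by the child's writes.
        os.close(write_fd)
        with os.fdopen(read_fd) as reader:
            results[name] = int(reader.read())
        os.waitpid(pid, 0)
    return results


STATEMENTS = {
    "m1": lambda s: s.update(a=s["a"] - 2),
    "m2,m3": lambda s: s.update(a=s["a"] + 2),   # one child for the whole group
}
```

After fork_per_statement({"a": 2}, STATEMENTS) the parent's state still has a == 2, while the children report 0 and 4. Files, sockets, and databases are not isolated this way, which is the gap the two listed solutions address.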
slide-17
SLIDE 17

Experiments - Mutation Operators

  • Defined on LLVM IR
  • Mimicking Javalanche and Major
slide-18
SLIDE 18

Experiments - Dataset

slide-19
SLIDE 19

Experiments - Results

[Figure: time (hours) for Our Approach, Split-Stream Execution (SSE), and Mutation Schemata (MS)]

2.56X speedup over SSE, and 8.95X speedup over MS

slide-20
SLIDE 20

Experiments - Results

[Figure: per-subject results for Our Approach, Split-Stream Execution, and Mutation Schemata. Panel 1: flex, gzip, grep, printtokens, printtokens2, vim7.4. Panel 2: replace, schedule, schedule2, tcas, totinfo]

slide-21
SLIDE 21

Discussion: Why Did It Work?

  • Overheads: the overhead for each instruction is small
  • Not related to the size of the program, effectively O(1)
  • Benefits: equivalences between statements modulo the current state are common in mutation analysis
  • a > b ⇒ a ≥ b, a > b + 1, a > c, c > b
  • See paper for a detailed study on overheads/benefits
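
A small experiment makes the benefit concrete. It assumes the condition a > b and four typical mutants of it (a >= b, a > b + 1, a > c, c > b), and counts how many distinct outcomes they produce in a given state; the value range used for averaging is arbitrary.

```python
from itertools import product

# The original condition and four assumed mutants of it.
CONDITIONS = {
    "a > b":     lambda a, b, c: a > b,
    "a >= b":    lambda a, b, c: a >= b,
    "a > b + 1": lambda a, b, c: a > b + 1,
    "a > c":     lambda a, b, c: a > c,
    "c > b":     lambda a, b, c: c > b,
}


def classes_in_state(a, b, c):
    """Number of equivalence classes among the five conditions in one state."""
    return len({cond(a, b, c) for cond in CONDITIONS.values()})


def average_classes(values=range(4)):
    states = list(product(values, repeat=3))
    return sum(classes_in_state(*s) for s in states) / len(states)
```

Because a condition can only evaluate to true or false, the five variants never need more than two processes in any state, and in states such as a = 5, b = 1, c = 3 they all agree and need only one.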
slide-22
SLIDE 22

Discussion: Eliminating More Redundancies

  • Translating to a model checking problem
  • [Kästner et al., 2012]
  • [Kim, Khurshid, and Batory, 2012]
  • Record multiple states as a meta state at the variable level
  • [Kästner et al., 2012]
  • [Meinicke, 2014]
  • Overheads still need to be controlled
slide-23
SLIDE 23

Conclusion

  • Mutation analysis is useful
  • Scalability is a key challenge
  • Eliminating redundancy is a promising way to address scalability
  • Overhead and benefit must be balanced
  • Equivalence modulo states could achieve a 2.56X speedup over SSE

slide-24
SLIDE 24

Acknowledgments

  • We acknowledge Rene Just and Michael Ernst for fruitful discussions that helped scope the paper
  • and the ISSTA Program Committee for the recognition
  • and you for listening!