

SLIDE 1

Michael Ernst, page 1

Quickly Detecting Relevant Program Invariants

Michael Ernst, Adam Czeisler, Bill Griswold (UCSD), and David Notkin

University of Washington

http://www.cs.washington.edu/homes/mernst/daikon

SLIDE 2

Overview

Goal: improve dynamic invariant detection

[ICSE 99, TSE]

Relevance improvements:

  • add desired invariants (2 techniques)
  • eliminate undesired ones (3 techniques)

Experiments validate the success

SLIDE 3

Program invariants

Detect invariants (as in asserts or specifications)

  • x > abs(y)
  • x = 16*y + 4*z + 3
  • array a contains no duplicates
  • for each node n, n = n.child.parent
  • graph g is acyclic

SLIDE 4

Uses for invariants

  • Write better programs [Gries 81, Liskov 86]
  • Document code
  • Check assumptions: convert to assert
  • Maintain invariants to avoid introducing bugs
  • Locate unusual conditions
  • Validate test suite: value coverage
  • Provide hints for higher-level profile-directed compilation [Calder 98]

  • Bootstrap proofs [Wegbreit 74, Bensalem 96]

SLIDE 5

Dynamic invariant detection is accurate

Recovered formal specifications, found bugs

Target programs:

  • The Science of Programming [Gries 81]
  • Program checkers [Detlefs 98, Xi 98]
  • MIT 6.170 student programs
  • Data Structures and Algorithm Analysis in Java [Weiss 99]

SLIDE 6

Dynamic invariant detection is useful

563-line C program: regexp search & replace [Hutchins 94, Rothermel 98]

  • Explicated data structures
  • Contradicted expectations, preventing bugs
  • Revealed bugs
  • Showed limited use of procedures
  • Improved test suite
  • Validated program changes

SLIDE 7

Dynamic invariant detection

Look for patterns in values the program computes:

  • Instrument the program to write data trace files
  • Run the program on a test suite
  • Invariant engine reads data traces, generates potential invariants, and checks them

[Figure: pipeline diagram. Original program → Instrument → Instrumented program → Run (on test suite) → Data trace database → Detect invariants → Invariants]

SLIDE 8

Checking invariants

For each potential invariant:

  • instantiate (determine constants like a and b in y = ax + b)

  • check for each set of variable values
  • stop checking when falsified

This is inexpensive: many invariants, each cheap
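
The instantiate-then-check loop above can be sketched for a single candidate, y = ax + b. This is a minimal illustration and not Daikon's implementation: the class name, the method name, and the restriction to integer constants are all assumptions made for the example.

```java
/** Hypothetical sketch of instantiating and checking the candidate y = a*x + b. */
final class LinearInvariantSketch {
    /**
     * Each sample is {x, y}. Returns {a, b} if integer constants can be
     * determined and every sample satisfies y = a*x + b, otherwise null.
     */
    static long[] detect(long[][] samples) {
        if (samples.length < 2) return null;
        long x0 = samples[0][0], y0 = samples[0][1];
        // Instantiate: the first sample with a different x fixes a and b.
        long[] ab = null;
        for (long[] s : samples) {
            if (s[0] != x0) {
                long dx = s[0] - x0, dy = s[1] - y0;
                if (dy % dx != 0) return null; // slope is not an integer
                long a = dy / dx;
                ab = new long[] {a, y0 - a * x0};
                break;
            }
        }
        if (ab == null) return null; // x never varied, so a and b are undetermined
        // Check: one cheap test per sample; stop as soon as the candidate is falsified.
        for (long[] s : samples) {
            if (s[1] != ab[0] * s[0] + ab[1]) return null;
        }
        return ab;
    }

    public static void main(String[] args) {
        long[] ab = detect(new long[][] {{1, 5}, {2, 7}, {4, 11}});
        System.out.println(ab == null ? "falsified" : "y = " + ab[0] + "*x + " + ab[1]);
    }
}
```

Each candidate costs one comparison per sample and is dropped at its first violation, which is why a large pool of candidates stays affordable.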

SLIDE 9

Relevance

Usefulness to a programmer for a task

Contingent on task and programmer

We manually classified invariants

Perfect output is unnecessary (and impossible)

SLIDE 10

Improved invariant relevance

Add desired invariants:

  • 1. Implicit values
  • 2. Unused polymorphism

Eliminate undesired invariants (and improve performance):

  • 3. Unjustified properties
  • 4. Redundant invariants
  • 5. Incomparable variables

SLIDE 11

1. Implicit values

Goal: relationships over non-variables

Examples:

  • for array a: length(a), sum(a), min(a), max(a)
  • for array a and scalar i: a[i], a[0..i]
  • for procedure p: #calls(p)
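
As a rough illustration of the array examples above, the sketch below computes derived values for one array a and one index i so that an invariant detector could treat them like ordinary variables. The class and method names are hypothetical, and a[0..i] is assumed to include a[i].

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: derive extra "variables" from an array a and a scalar i. */
final class DerivedVariablesSketch {
    static Map<String, Object> derive(int[] a, int i) {
        Map<String, Object> vars = new LinkedHashMap<>();
        vars.put("length(a)", a.length);
        vars.put("sum(a)", Arrays.stream(a).sum());
        if (a.length > 0) { // min and max are undefined for an empty array
            vars.put("min(a)", Arrays.stream(a).min().getAsInt());
            vars.put("max(a)", Arrays.stream(a).max().getAsInt());
        }
        if (0 <= i && i < a.length) { // skip a[i] when the index is out of range
            vars.put("a[i]", a[i]);
            vars.put("a[0..i]", Arrays.copyOfRange(a, 0, i + 1)); // assumed inclusive
        }
        return vars;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Object> e : derive(new int[] {3, 1, 4}, 1).entrySet()) {
            Object v = e.getValue();
            String shown = (v instanceof int[]) ? Arrays.toString((int[]) v) : v.toString();
            System.out.println(e.getKey() + " = " + shown);
        }
    }
}
```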

SLIDE 12

Derived variables

Successfully produces desired invariants

Adds many new variables

Potential problems:

  • slowdown: interleave derivation and inference
  • irrelevant invariants: techniques 3–5, later in talk

SLIDE 13

2. Unused polymorphism

Variables declared with general type, used with more specific type

Example: given a generic list that contains only integers, report that the contents are sorted

Also applicable to subtype polymorphism

SLIDE 14

Unused polymorphism example

class MyInteger { int value; … }
class Link { Object element; Link next; … }
class List { Link header; … }

List myList = new List();
for (int i=0; i<10; i++)
  myList.add(new MyInteger(i));

Desired invariant: in class List,

header.closure(next) is sorted by ≤ over key .element.value

SLIDE 15

Polymorphism elimination

Daikon respects declared types

Pass 1: front end outputs object ID, runtime type, and all known fields

Pass 2: given refined type, front end outputs more fields

Sound for deterministic programs

Effective for programs tested so far
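
The two passes can be illustrated with the MyInteger list from the earlier example. This is only a sketch under simplifying assumptions: header is taken to be the first link rather than a sentinel, elements are non-null, and the helper names (runtimeTypes, elementValues, sortedNonDecreasing, listOfIntegers) are invented here and are not Daikon front-end functions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of refining a declared type (Object) to the observed runtime type. */
final class PolymorphismSketch {
    static final class MyInteger { final int value; MyInteger(int v) { value = v; } }
    static final class Link { Object element; Link next; Link(Object e) { element = e; } }

    /** Pass 1: walk header.closure(next) and record each element's runtime type. */
    static Set<Class<?>> runtimeTypes(Link header) {
        Set<Class<?>> types = new HashSet<>();
        for (Link l = header; l != null; l = l.next) types.add(l.element.getClass());
        return types;
    }

    /** Pass 2: given the refined type MyInteger, output the extra field .element.value. */
    static List<Integer> elementValues(Link header) {
        List<Integer> values = new ArrayList<>();
        for (Link l = header; l != null; l = l.next) values.add(((MyInteger) l.element).value);
        return values;
    }

    /** Checks the candidate invariant "sorted by <=" over the extracted values. */
    static boolean sortedNonDecreasing(List<Integer> v) {
        for (int i = 1; i < v.size(); i++) if (v.get(i - 1) > v.get(i)) return false;
        return true;
    }

    /** Builds a list holding MyInteger(0) .. MyInteger(n-1), appended in order. */
    static Link listOfIntegers(int n) {
        Link header = null, tail = null;
        for (int i = 0; i < n; i++) {
            Link l = new Link(new MyInteger(i));
            if (header == null) header = l; else tail.next = l;
            tail = l;
        }
        return header;
    }

    public static void main(String[] args) {
        Link header = listOfIntegers(10);
        if (runtimeTypes(header).size() == 1) { // only one runtime type was ever observed
            System.out.println("sorted: " + sortedNonDecreasing(elementValues(header)));
        }
    }
}
```

The cast in pass 2 is safe only because pass 1 saw a single runtime type on the same run, which matches the slide's caveat that the approach is sound for deterministic programs.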

SLIDE 16

3. Unjustified properties

Given three samples for x:

x = 7
x = –42
x = 22

Potential invariants:

x ≠ 0
x ≤ 22
x ≥ –42

SLIDE 17

Statistical checks

Check hypothesized distribution

To show x ≠ 0 for v values of x in a range of size r, the probability of no zeroes is $\left(1 - \frac{1}{r}\right)^{v}$

Range limits (e.g., x ≤ 22):

  • same number of samples as neighbors (uniform)
  • more samples than neighbors (clipped)

[Figure: two histograms, each plotting # of samples against variable value]
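
A small numeric sketch of the x ≠ 0 check follows. It assumes values are drawn uniformly from the range, and the cutoff of 0.01 is an arbitrary value chosen for illustration, not one stated on the slide. The three samples 7, –42, and 22 span a range of 65 values.

```java
/** Hypothetical sketch of the statistical justification test for "x != 0". */
final class JustificationSketch {
    /** Probability that v uniform samples from a range of r values never hit one given value. */
    static double chanceOfNoOccurrence(long v, long r) {
        return Math.pow(1.0 - 1.0 / r, v);
    }

    /** Reports the invariant only if its absence of zeroes is unlikely to be coincidence. */
    static boolean justified(long v, long r, double cutoff) {
        return chanceOfNoOccurrence(v, r) < cutoff;
    }

    public static void main(String[] args) {
        double cutoff = 0.01; // assumed threshold, for illustration only
        System.out.println("3 samples:    " + justified(3, 65, cutoff));
        System.out.println("1000 samples: " + justified(1000, 65, cutoff));
    }
}
```

With only three samples the chance of seeing no zero by accident is about 0.95, so x ≠ 0 is not justified; with a thousand samples over the same range it would be.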

SLIDE 18

Duplicate values

Array sum program:

// Sum array b of length n into variable s.
i := 0; s := 0;
while i ≠ n do
  { s := s+b[i]; i := i+1 }

b is unchanged inside loop

Problem: at loop head,

–88 ≤ b[n – 1] ≤ 99
–556 ≤ sum(b) ≤ 539

Reason: more samples inside loop

SLIDE 19

Disregard duplicate values

Idea: count a value if its var was just modified

Front end outputs modification bit per value

  • compared techniques for eliminating duplicates

Result: eliminates undesired invariants
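
The effect of the modification bit can be sketched on the array sum loop. The code below is an assumed model rather than the actual front end: it treats b as modified once on loop entry and never in the body, and compares how many samples of b reach the statistical checks with and without the bit.

```java
/** Hypothetical sketch: counting samples of b at the loop head with and without modification bits. */
final class ModificationBitSketch {
    /** Returns {samples of b counted naively, samples of b counted only when modified}. */
    static int[] loopHeadSamples(int[][] arrays) {
        int naive = 0, counted = 0;
        for (int[] b : arrays) {
            boolean bModified = true;             // b is freshly set when the loop is entered
            for (int i = 0; i <= b.length; i++) { // the loop head is reached n+1 times
                naive++;
                if (bModified) counted++;
                bModified = false;                // the loop body never assigns b
            }
        }
        return new int[] {naive, counted};
    }

    public static void main(String[] args) {
        int[] c = loopHeadSamples(new int[][] {{1, 2, 3}, {4}});
        System.out.println("naive = " + c[0] + ", with modification bit = " + c[1]);
    }
}
```

Without the bit, each array contributes n+1 identical samples of b, which inflates the sample count and makes accidental bounds look statistically justified.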

SLIDE 20

4. Redundant invariants

Given: 0 ≤ i ≤ j

Redundant:

a[i] ∈ a[0..j]
max(a[0..i]) ≤ max(a[0..j])

Redundant invariants are logically implied

Implementation contains many such tests

SLIDE 21

Suppress redundancies

Avoid deriving variables: suppress 25-50%

  • equal to another variable
  • nonsensical (a[i] when i < 0)

Avoid checking invariants:

  • false invariants: trivial improvement
  • true invariants: suppress 90%

Avoid reporting trivial invariants: suppress 25%

SLIDE 22

5. Unrelated variables

Problem: the following are of no interest

bool b; int *p;
b < p

int myweight, mybirthyear;
myweight < mybirthyear

SLIDE 23

Limit comparisons

Check relations only over comparable variables

  • declared program types
  • Lackwit [O’Callahan 97]: value flow analysis based on polymorphic type inference
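
A minimal sketch of limiting comparisons: each variable carries a comparability label, and a pair becomes a candidate only when the labels match. The labels in the second call ("weight", "year") are invented stand-ins for the classes a value flow analysis might assign; they are not Lackwit output.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: generate binary comparisons only over comparable variables. */
final class ComparabilitySketch {
    /** names[k] has comparability label kinds[k]; only same-label pairs are returned. */
    static List<String> candidatePairs(String[] names, String[] kinds) {
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            for (int j = i + 1; j < names.length; j++) {
                if (kinds[i].equals(kinds[j])) pairs.add(names[i] + " ? " + names[j]);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        String[] names = {"b", "p", "myweight", "mybirthyear"};
        // Declared types still allow myweight and mybirthyear to be compared.
        System.out.println(candidatePairs(names, new String[] {"bool", "int*", "int", "int"}));
        // Finer, flow-based labels (assumed here) separate the two ints as well.
        System.out.println(candidatePairs(names, new String[] {"bool", "ptr", "weight", "year"}));
    }
}
```

Declared types already rule out b < p, while the finer labels also rule out myweight < mybirthyear.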

SLIDE 24

Comparability results

Comparisons:

  • declared types: 60% as many comparisons
  • Lackwit: 5% as many comparisons; scales well

Runtime: 40-70% improvement

Few differences in reported invariants

SLIDE 25

Future work

Online inference

Proving invariants

Characterize good test suites

New invariants: temporal, existential

User interface

  • control over instrumentation
  • display and manipulation of invariants

Further experimental evaluation

  • apply to more and bigger programs
  • apply to a variety of tasks

SLIDE 26

Related work

Dynamic inference

  • inductive logic programming [Bratko 93, Cypher 93]
  • program spectra [Reps 97, Harrold 98]
  • finite state machines [Boigelot 97, Cook 98]

Static inference

  • checking specifications [Detlefs 96, Evans 96, Jacobs 98]
  • specification extension [Givan 96, Hendren 92]
  • other [Jeffords 98, Henry 90, Ward 96]

SLIDE 27

Conclusions

Naive implementation is infeasible

Relevance improvements: accuracy, performance

  • add desired invariants
  • eliminate undesired invariants

Experimental validation

Dynamic invariant detection is promising for research and practice

SLIDE 28

Questions?

SLIDE 29

Ways to obtain invariants

  • Programmer-supplied
  • Static analysis: examine the program text [Cousot 77, Gannod 96]
      • properties are guaranteed to be true
      • pointers are intractable in practice
  • Dynamic analysis: run the program
      • complementary to static techniques

SLIDE 30

Unused polymorphism example

class MyInteger { int value; … }
class Link { Object element; Link next; … }
class List { Link header; … }

List myList = new List();
for (int i=0; i<10; i++)
  myList.add(new MyInteger(i));

Desired invariant: in class List,

header.closure(next).element.value: sorted by ≤

SLIDE 31

Comparison with AI

Dynamic invariant detection:

Can be formulated as an AI problem

Cannot be solved by current AI techniques

  • not classification or clustering
  • no noise
  • no negative examples; many positive examples
  • intelligible output

SLIDE 32

Is implication obvious?

Want:

size(topOfStack.closure(next)) = size(orig(topOfStack.closure(next))) + 1

Get:

size(topOfStack.next.closure(next)) = size(topOfStack.closure(next)) – 1

topOfStack.next.closure(next) = orig(topOfStack.closure(next))

Solution: interactive UI, queries on variables