1. Automatic Generation of Program Specifications
Jeremy Nimmer, MIT Lab for Computer Science
http://pag.lcs.mit.edu/
Joint work with Michael Ernst

2. Synopsis
• Specifications are useful for many tasks
  • Use of specifications has practical difficulties
• Dynamic analysis can capture specifications
  • Recover from existing code
  • Infer from traces
• Results are accurate (90%+)
  • Specification matches implementation

3. Outline
• Motivation
• Approach: Generate and check specifications
• Evaluation: Accuracy experiment
• Conclusion

4. Advantages of specifications
• Describe behavior precisely
• Permit reasoning using summaries
• Can be verified automatically

5. Problems with specifications
• Describe behavior precisely
  • Tedious and difficult to write and maintain
• Permit reasoning using summaries
  • Must be accurate if used in lieu of code
• Can be verified automatically
  • Verification may require uninteresting annotations

6. Solution
Automatically generate and check specifications from the code.
[diagram: code (myStack.push(elt);) → specification generator → specification (myStack.isEmpty() = false) → proof checker → Q.E.D.]

7. Solution scope
• Generate and check “complete” specifications
  • Very difficult
• Generate and check partial specifications
  • Nullness, types, bounds, modification targets, ...
• Need not operate in isolation
  • User might have some interaction
  • Goal: decrease overall effort

8. Outline
• Motivation
• Approach: Generate and check specifications
• Evaluation: Accuracy experiment
• Conclusion

9. Previous approaches
[diagram: code (myStack.push(elt);) → specification generator → specification (myStack.isEmpty() = false) → proof checker → Q.E.D.]
Generation:
• By hand
• Static analysis
Checking:
• By hand
• Non-executable models

10. Our approach
[diagram: code (myStack.push(elt);) → specification generator → specification (myStack.isEmpty() = false) → proof checker → Q.E.D.]
• Dynamic detection proposes likely properties
• Static checking verifies properties
• Combining the techniques overcomes the weaknesses of each
  • Ease annotation
  • Guarantee soundness

11. Daikon: Dynamic invariant detection
[diagram: original program → instrument → instrumented program → run (test suite) → data trace database → detect invariants → invariants]
Look for patterns in values the program computes:
• Instrument the program to write data trace files
• Run the program on a test suite
• Invariant detector reads data traces, generates potential invariants, and checks them
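As a concrete illustration of what the detector reports (my example, not from the talk; the method max and the test-suite assumptions are hypothetical):

    // Hypothetical Java example (not from the talk). Given traces from
    // a test suite that only passes non-empty arrays, Daikon might
    // report at entry:  a != null;  a.length >= 1
    // and at exit:      \result >= a[0]
    public static int max(int[] a) {
        int m = a[0];
        for (int i = 1; i < a.length; i++) {
            if (a[i] > m) {
                m = a[i];  // track the largest element seen so far
            }
        }
        return m;
    }

Each reported property is only a generalization over the observed traces; a different test suite could yield weaker or stronger invariants.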

12. ESC/Java: Invariant checking
• ESC/Java: Extended Static Checker for Java
• Lightweight technology: intermediate between type-checker and theorem-prover; unsound
• Intended to detect array bounds and null dereference errors, and annotation violations

    /*@ requires x != null */
    /*@ ensures this.a[this.top] == x */
    void push(Object x);

• Modular: checks, and relies on, specifications
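For instance (a hypothetical caller, not from the talk), modular checking means ESC/Java verifies each call site against push's annotations rather than re-analyzing push's body:

    // Hypothetical caller illustrating modular checking: each call is
    // checked against push's specification, not its implementation.
    void client(StackAr myStack) {
        myStack.push("hello");  // fine: the argument is non-null
        myStack.push(null);     // ESC/Java would warn here: violates
                                // the precondition "requires x != null"
    }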

13. Integration approach
[diagram: code (myStack.push(elt);) → Daikon → specification (myStack.isEmpty() = false) → ESC/Java → Q.E.D.]
• Run Daikon over the target program
• Insert its results into the program as annotations
• Run ESC/Java on the annotated program
All steps are automatic.

14. Stack object invariants
[diagram: theArray holds A, E, I, O, U, Y; topOfStack indexes the last used slot, Y]

    public class StackAr {
        Object[] theArray;
        int topOfStack;
        /*@ invariant theArray != null;
            invariant \typeof(theArray) == \type(Object[]);
            invariant topOfStack >= -1;
            invariant topOfStack < theArray.length;
            invariant theArray[0..topOfStack] != null;
            invariant theArray[topOfStack+1..] == null; */
        ...
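For context, a minimal constructor sketch (assumed, not shown on the slide) under which these invariants hold for a newly created, empty stack:

    // Sketch of a StackAr constructor (assumed, not shown on the
    // slide): it establishes the object invariants above.
    public StackAr(int capacity) {
        theArray = new Object[capacity];  // theArray != null; every slot null,
                                          // so theArray[topOfStack+1..] == null
        topOfStack = -1;                  // empty stack: topOfStack >= -1
    }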

15. Stack push method
[diagram: theArray after push(W) holds A, E, I, O, U, Y, W; topOfStack advances to index W]

    /*@ requires x != null;
        requires topOfStack < theArray.length - 1;
        modifies topOfStack, theArray[*];
        ensures topOfStack == \old(topOfStack) + 1;
        ensures x == theArray[topOfStack];
        ensures theArray[0..\old(topOfStack)]
                == \old(theArray[0..topOfStack]); */
    public void push( Object x ) { ... }
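A minimal body consistent with this specification (a sketch; the original source may differ, e.g., by checking for overflow):

    // Sketch of a push body satisfying the specification above; the
    // precondition rules out overflow, so no bounds check is needed here.
    public void push(Object x) {
        theArray[++topOfStack] = x;  // topOfStack == \old(topOfStack) + 1,
                                     // theArray[topOfStack] == x, and
                                     // earlier elements are untouched
    }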

16. Stack summary
• ESC/Java verified all 25 Daikon invariants
• Invariants reveal properties of the implementation (e.g., garbage collection of popped elements)
• No runtime errors if callers satisfy preconditions
• Implementation meets the generated specification

17. Outline
• Motivation
• Approach: Generate and check specifications
• Evaluation: Accuracy experiment
• Conclusion

18. Accuracy experiment
• Dynamic generation is potentially unsound
  • How accurate are its results in practice?
• Combining static and dynamic analyses should produce benefits
  • But perhaps their domains are too dissimilar?

19. Programs studied
• 11 programs from libraries, assignments, and texts
• 2449 NCNB (non-comment, non-blank) lines of code in 273 methods total
• Test suites
  • Used the program's test suite if provided (9 did)
  • If just example calls, spent <30 min. enhancing
  • ~70% statement coverage

20. Accuracy measurement
• Compare the generated specification to a verifiable specification, e.g.:

    invariant theArray != null;
    invariant topOfStack >= -1;
    invariant topOfStack < theArray.length;
    invariant theArray[0..length-1] == null;
    invariant theArray[0..topOfStack] != null;
    invariant theArray[topOfStack+1..] == null;

• Standard measures from information retrieval [Sal68, vR79]
  • Precision (correctness): 3 / 4 = 75%
  • Recall (completeness): 3 / 5 = 60%
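In these terms (the standard information-retrieval definitions; the set names are mine):

    \text{precision} = \frac{|\text{reported} \cap \text{verifiable}|}{|\text{reported}|}
    \qquad
    \text{recall} = \frac{|\text{reported} \cap \text{verifiable}|}{|\text{verifiable}|}

Reading the slide's example this way: 4 invariants were reported, 5 make up the verifiable specification, and 3 appear in both, giving 3/4 precision and 3/5 recall.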

21. Experiment results
• Daikon reported 554 invariants
• Precision: 96% of reported invariants verified
• Recall: 91% of necessary invariants were reported

22. Causes of inaccuracy
• Limits on tool grammars
  • Daikon: may not propose a relevant property
  • ESC/Java: may not allow statement of a relevant property
• Incompleteness in ESC/Java
  • Always need programmer judgment
• Insufficient test suite
  • Shows up as an overly-strong specification
  • Verification failure highlights the problem; helpful in fixing it
  • System tests fared better than unit tests

23. Experiment conclusions
• Our dynamic analysis is accurate
  • Recovered partial specifications
  • Even with limited test suites
  • Enabled verifying the absence of runtime exceptions
  • Specification matches the code
• Results should scale
  • Larger programs dominated the results
  • Approach is class- and method-centric

24. Value to programmers
Generated specifications are accurate, but:
• Are the specifications useful?
• How much does accuracy matter?
• How does Daikon compare with other annotation assistants?
Answers at FSE'02

25. Outline
• Motivation
• Approach: Generate and check specifications
• Evaluation: Accuracy experiment
• Conclusion

26. Conclusion
• Specifications via dynamic analysis
  • Accurately produced from limited test suites
  • Automatically verifiable (with minor edits)
  • Specification characterizes the code
• Unsound techniques are useful in program development

27. Questions?

28. Formal specifications
• Precise, mathematical descriptions of behavior [LG01]
  • (Another type of spec: requirements documents)
• Standard definition; novel use
  • Generated after implementation
  • Still useful to produce [PC86]
• Many specifications exist for one program
  • Which is appropriate depends on the task, e.g., runtime performance

29. Effect of bugs
• Case 1: Bug is exercised by the test suite
  • Falsifies one or more invariants
  • Weaker specification
  • May cause verification to fail
• Case 2: Bug is not exercised by the test suite
  • Not reflected in the specification
  • Code and specification disagree
  • Verifier points out the inconsistency
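A hypothetical illustration of case 1 (my example, not the talk's): an off-by-one bug exercised by the tests weakens the inferred specification.

    // Hypothetical off-by-one bug exercised by the tests. On a suite
    // of non-empty arrays, instead of the intended postcondition
    // "\result == a.length", Daikon would only observe (and report)
    // the weaker "\result == a.length - 1"; a caller relying on the
    // intended property would then fail to verify.
    static int count(int[] a) {
        int n = 0;
        for (int i = 1; i < a.length; i++) {  // bug: should start at i = 0
            n++;
        }
        return n;
    }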
