Principles of Software Construction: Objects, Design, and Concurrency
(Part 2: Designing (Sub-)Systems)
More Analysis for Functional Correctness
Christian Kästner, Charlie Garrod
School of Computer Science
15-214

Learning Goals
• Integrating unit testing into the development process
• Understanding and applying coverage metrics to approximate test suite quality; awareness of the limitations
• Basic understanding of the mechanisms and limitations of static analysis tools
• Characterizing assurance techniques in terms of soundness and completeness

Correctness?

Software Errors
• Functional errors
• Performance errors
• Deadlock
• Race conditions
• Boundary errors
• Buffer overflow
• Integration errors
• Usability errors
• Robustness errors
• Load errors
• Design defects
• Versioning and configuration errors
• Hardware errors
• State management errors
• Metadata errors
• Error-handling errors
• User interface errors
• API usage errors
• …

Reminder: Functional Correctness
• The compiler ensures that the types are correct (type checking)
  – Prevents “Method Not Found” and “Cannot add Boolean to Int” errors at runtime
• Static analysis tools (e.g., FindBugs) recognize certain common problems
  – Warns on possible NullPointerExceptions or forgetting to close files
• How to ensure functional correctness of contracts beyond?

Formal Verification
• Proving the correctness of an implementation with respect to a formal specification, using formal methods of mathematics.
• Formally prove that all possible executions of an implementation fulfill the specification
• Manual effort; partial automation; not automatically decidable
Testing
• Executing the program with selected inputs in a controlled environment (dynamic analysis)
• Goals:
  – Reveal bugs (main goal)
  – Assess quality (hard to quantify)
  – Clarify the specification, documentation
  – Verify contracts
"Testing shows the presence, not the absence of bugs" (Edsger W. Dijkstra, 1969)

Testing Decisions
• Who tests?
  – Developers
  – Other Developers
  – Separate Quality Assurance Team
  – Customers
• When to test?
  – Before development
  – During development
  – After milestones
  – Before shipping
• When to stop testing?
(More in 15-313)

TEST-DRIVEN DEVELOPMENT

Test Driven Development
• Tests first!
• Popular agile technique
• Write tests as specifications before code
• Never write code without a failing test
• Claims:
  • Design approach toward testable design
  • Think about interfaces first
  • Avoid writing unneeded code
  • Higher product quality (e.g. better code, fewer defects)
  • Higher test suite quality
  • Higher overall productivity
[Figure; image credit: Excirial, CC BY-SA 3.0]

Discussion: Testing in Practice

TEST COVERAGE
How much testing?
• Cannot test all inputs
  – too many, usually infinite
• What makes a good test suite?
• When to stop testing?
• How much to invest in testing?

Blackbox: Random Inputs
• Try random inputs, many of them
  – Observe whether system crashes (exceptions, assertions)
  – Try more random inputs, many more
• Successful in certain domains (parsers, network issues, …)
  – But, many tests execute similar paths
  – But, often finds only superficial errors
  – Can be improved by guiding random selection with additional information (domain knowledge or extracted from source)

Blackbox: Covering Specifications
• Looking at specifications, not code:
• Test representative case
• Test boundary condition
• Test exception conditions
• (Test invalid case)

Textual Specification
public int read(byte[] b, int off, int len) throws IOException

Reads up to len bytes of data from the input stream into an array of bytes. An attempt is made to read as many as len bytes, but a smaller number may be read. The number of bytes actually read is returned as an integer. This method blocks until input data is available, end of file is detected, or an exception is thrown.

If len is zero, then no bytes are read and 0 is returned; otherwise, there is an attempt to read at least one byte. If no byte is available because the stream is at end of file, the value -1 is returned; otherwise, at least one byte is read and stored into b.

The first byte read is stored into element b[off], the next one into b[off+1], and so on. The number of bytes read is, at most, equal to len. Let k be the number of bytes actually read; these bytes will be stored in elements b[off] through b[off+k-1], leaving elements b[off+k] through b[off+len-1] unaffected.

In every case, elements b[0] through b[off] and elements b[off+len] through b[b.length-1] are unaffected.

Throws:
IOException - If the first byte cannot be read for any reason other than end of file, or if the input stream has been closed, or if some other I/O error occurs.
NullPointerException - If b is null.
IndexOutOfBoundsException - If off is negative, len is negative, or len is greater than b.length - off

Structural Analysis of System under Test
– Organized according to program decision structure

    public static int binsrch (int[] a, int key) {
        int low = 0;
        int high = a.length - 1;
        while (true) {
            if ( low > high ) return -(low+1);
            int mid = (low+high) / 2;
            if ( a[mid] < key ) low = mid + 1;
            else if ( a[mid] > key ) high = mid - 1;
            else return mid;
        }
    }

• Will this statement get executed in a test?
• Does it return the correct result?
• Could this array index be out of bounds?
• Does this return statement ever get reached?

Method Coverage
• Trying to execute each method as part of at least one test
• Does this guarantee correctness?
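
The blackbox slides name what to cover: representative, boundary, and exception cases, plus many random inputs. A minimal sketch of how that might look for binsrch follows. It assumes, since the slides do not state a contract, that binsrch behaves like java.util.Arrays.binarySearch on a sorted array: the index of the key if present, a negative value otherwise. The class name, the check helper, and the chosen inputs are illustrative only.

```java
import java.util.Arrays;
import java.util.Random;

class BinsrchBlackboxTests {
    // Binary search copied from the slide.
    static int binsrch(int[] a, int key) {
        int low = 0;
        int high = a.length - 1;
        while (true) {
            if (low > high) return -(low + 1);
            int mid = (low + high) / 2;
            if (a[mid] < key) low = mid + 1;
            else if (a[mid] > key) high = mid - 1;
            else return mid;
        }
    }

    // Hypothetical helper: fails loudly even when JVM assertions are disabled.
    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    // Cases derived from the assumed specification, not from the code.
    static void specificationCases() {
        int[] a = {1, 3, 5, 7};
        check(binsrch(a, 5) == 2, "representative case: key in the middle");
        check(binsrch(a, 1) == 0, "boundary: first element");
        check(binsrch(a, 7) == 3, "boundary: last element");
        check(binsrch(a, 4) < 0, "absent key between elements");
        check(binsrch(a, 0) < 0, "absent key below the smallest element");
        check(binsrch(a, 9) < 0, "absent key above the largest element");
        check(binsrch(new int[0], 1) < 0, "boundary: empty array");
        try {
            binsrch(null, 1);
            check(false, "exception case: null array should not return normally");
        } catch (NullPointerException expected) {
            // Assumed behavior: a null array is rejected with an exception.
        }
    }

    // Random inputs, checked against a plain linear scan as the oracle.
    static int randomCases(long seed, int runs) {
        Random random = new Random(seed);
        for (int run = 0; run < runs; run++) {
            int[] a = new int[random.nextInt(20)];
            for (int i = 0; i < a.length; i++) a[i] = random.nextInt(100) - 50;
            Arrays.sort(a);
            int key = random.nextInt(100) - 50;

            boolean present = false;
            for (int value : a) {
                if (value == key) present = true;
            }

            int result = binsrch(a, key);
            if (present) check(result >= 0 && a[result] == key, "key present but not found");
            else check(result < 0, "key absent but reported as found");
        }
        return runs;
    }

    public static void main(String[] args) {
        specificationCases();
        int runs = randomCases(42L, 1000);
        System.out.println("all checks passed, random runs: " + runs);
    }
}
```

The random runs mostly repeat paths the handful of specification cases already execute, which is the caveat the random-inputs slide raises. Neither group uses arrays large enough to make low+high overflow.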
Statement Coverage
• Trying to test all parts of the implementation
• Execute every statement in at least one test
• Does this guarantee correctness?

Structure of Code Fragment to Test
[Figure: flow chart diagram for junit.samples.money.Money.equals]

Statement Coverage
• Statement coverage
  – What portion of program statements (nodes) are touched by test cases
• Advantages
  – Test suite size linear in size of code
  – Coverage easily assessed
• Issues
  – Dead code is not reached
  – May require some sophistication to select input sets
  – Fault-tolerant error-handling code may be difficult to “touch”
  – Metric: Could create incentive to remove error handlers!

Branch Coverage
• Branch coverage
  – What portion of condition branches are covered by test cases?
  – Or: What portion of relational expressions and values are covered by test cases?
    • Condition testing (Tai)
  – Multicondition coverage – all boolean combinations of tests are covered
• Advantages
  – Test suite size and content derived from structure of boolean expressions
  – Coverage easily assessed
• Issues
  – Dead code is not reached
  – Fault-tolerant error-handling code may be difficult to “touch”

Path Coverage
• Path coverage
  – What portion of all possible paths through the program are covered by tests?
  – Loop testing: Consider representative and edge cases:
    • Zero, one, two iterations
    • If there is a bound n: n-1, n, n+1 iterations
    • Nested loops/conditionals from inside out
• Advantages
  – Better coverage of logical flows
• Disadvantages
  – Infinite number of paths
  – Not all paths are possible, or necessary
    • What are the significant paths?
  – Combinatorial explosion in cases unless careful choices are made
    • E.g., sequence of n if tests can yield up to 2^n possible paths
  – Assumption that program structure is basically sound

Test Coverage Tooling
• Coverage assessment tools
  – Track execution of code by test cases
• Count visits to statements
  – Develop reports with respect to specific coverage criteria
  – Instruction coverage, line coverage, branch coverage
• Example: Cobertura and EclEmma for JUnit tests
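
To make the gap between statement and branch coverage concrete, and to illustrate the "count visits" idea behind coverage tools, here is a hand-instrumented sketch. It does not describe how Cobertura or EclEmma are implemented; the method clampToZero, the two counter arrays, and the class name are all hypothetical.

```java
import java.util.Arrays;

class CoverageProbeSketch {
    // Visit counters: one per statement, one per outcome of the single `if`.
    static final int[] statementVisits = new int[3];
    static final int[] branchVisits = new int[2];

    // Hypothetical method under test, with hand-inserted probes.
    static int clampToZero(int x) {
        statementVisits[0]++;               // statement 0: the condition check
        boolean negative = x < 0;
        branchVisits[negative ? 0 : 1]++;   // branch 0: condition true, branch 1: condition false
        if (negative) {
            statementVisits[1]++;           // statement 1: the assignment
            x = 0;
        }
        statementVisits[2]++;               // statement 2: the return
        return x;
    }

    // Clears all counters before a new measurement.
    static void reset() {
        Arrays.fill(statementVisits, 0);
        Arrays.fill(branchVisits, 0);
    }

    // Fraction of counters that were visited at least once.
    static double covered(int[] visits) {
        int hit = 0;
        for (int count : visits) {
            if (count > 0) hit++;
        }
        return (double) hit / visits.length;
    }

    public static void main(String[] args) {
        reset();
        clampToZero(-3);   // a single negative input reaches every statement
        System.out.println("statements: " + covered(statementVisits)
                + ", branches: " + covered(branchVisits));
        clampToZero(5);    // a non-negative input adds the missing branch outcome
        System.out.println("statements: " + covered(statementVisits)
                + ", branches: " + covered(branchVisits));
    }
}
```

With only the negative input, every statement has been visited but the false outcome of the condition has not, so full statement coverage does not imply full branch coverage.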
“Coverage” is useful but also dangerous
• Examples of what coverage analysis could miss
  – Unusual paths
  – Missing code
  – Incorrect boundary values
  – Timing problems
  – Configuration issues
  – Data/memory corruption bugs
  – Usability problems
  – Customer requirements issues
• Coverage is not a good adequacy criterion
  – Instead, use to find places where testing is inadequate

Check your understanding
• Write test cases to achieve 100% line coverage but not 100% branch coverage

    int foo(int a, int b) {
        if (a == b)
            a = a * 2;
        if (a + b > 10)
            return a - b;
        return a + b;
    }

Test coverage – Ideal and Real
• An Ideal Test Suite
  – Uncovers all errors in code
  – Uncovers all errors that requirements capture
    • All scenarios covered
    • Non-functional attributes: performance, code safety, security, etc.
  – Minimum size and complexity
  – Uncovers errors early in the process
• A Real Test Suite
  – Uncovers some portion of errors in code
  – Has errors of its own
  – Assists in exploratory testing for validation
  – Does not help very much with respect to non-functional attributes
  – Includes many tests inserted after errors are repaired to ensure they won't reappear

STATIC ANALYSIS

Stupid Bugs

    public class CartesianPoint {
        private int x, y;
        int getX() { return this.x; }
        int getY() { return this.y; }
        public boolean equals(CartesianPoint that) {
            return (this.getX() == that.getX()) &&
                   (this.getY() == that.getY());
        }
    }
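
The slide does not say which bug it has in mind. One likely reading is that equals(CartesianPoint) overloads Object.equals(Object) instead of overriding it. The sketch below shows what follows under that reading. The constructor, the wrapper class, and the three helper methods are additions for illustration and are not on the slide.

```java
import java.util.ArrayList;
import java.util.List;

class EqualsOverloadDemo {
    // Same shape as the slide's class; the constructor is an assumption added here.
    static class CartesianPoint {
        private final int x, y;

        CartesianPoint(int x, int y) {
            this.x = x;
            this.y = y;
        }

        int getX() { return this.x; }
        int getY() { return this.y; }

        // Parameter type is CartesianPoint, so this overloads equals(Object)
        // rather than overriding it.
        public boolean equals(CartesianPoint that) {
            return (this.getX() == that.getX()) && (this.getY() == that.getY());
        }
    }

    // Both static types are CartesianPoint: the overload is selected.
    static boolean directCall() {
        return new CartesianPoint(1, 2).equals(new CartesianPoint(1, 2));
    }

    // Argument has static type Object: Object.equals (identity) is selected.
    static boolean viaObjectReference() {
        Object other = new CartesianPoint(1, 2);
        return new CartesianPoint(1, 2).equals(other);
    }

    // Collections call equals(Object), so the overload is never used.
    static boolean viaCollection() {
        List<CartesianPoint> points = new ArrayList<>();
        points.add(new CartesianPoint(1, 2));
        return points.contains(new CartesianPoint(1, 2));
    }

    public static void main(String[] args) {
        System.out.println("direct call:          " + directCall());
        System.out.println("via Object reference: " + viaObjectReference());
        System.out.println("via List.contains:    " + viaCollection());
    }
}
```

A test that only compares two CartesianPoint variables directly would pass and would cover every line of the class, so this kind of defect is a natural target for a static check rather than for coverage-driven testing.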