SLIDE 1 SE320 Software Verification and Validation
Requirements Testing, Code Testing Frameworks
Fall 2018, Week 2 (10/01–05)
SLIDE 2
- Prof. Gordon’s Office Hours
Tuesdays 2-4pm
SLIDE 3
The Problem of Requirements: An Example
SLIDE 4 The Triangle Program
- Assume that I have been asked to write a program
satisfying the following requirements:
    The program will accept as input three integers: a, b, and c. These integers represent the sides of a triangle. The program will print out the type of triangle as determined by the three sides: Equilateral (i.e., all 3 sides are equal), Isosceles (i.e., 2 sides are equal), Scalene (i.e., the 3 sides are all different), or NotATriangle.
- I have written the Triangle program, and I’d like you to test it.
- How would you verify this program?
SLIDE 5 Possible Test Cases
id   (a, b, c)   Expected Outcome
T1   5, 5, 5     Equilateral
T2   2, 2, 3     Isosceles
T3   3, 4, 5     Scalene
T4   3, 4, 20    NotATriangle
Question: Any other test cases?
- Other values for Equilateral (e.g., (1,1,1) or (100,100,100))?
- What about Isosceles with b=c (e.g., (3,7,7)) and a=c (e.g.,
(10,5,10))?
- What about Scalene with a>b>c (e.g., (5,4,3))?
- What about (5,1,1)?
SLIDE 6 Possible Test Cases
- What about negative values?
- What should the expected output be for negative values?
- What should we expect for (-10,-10,-10)? Equilateral, or
something else?
SLIDE 7 Questions and Observations
- Can we claim that we have “verified” the program by using
test cases {T1, T2, T3, T4}?
- At what point are we satisfied that we have adequately
tested the program?
- Notice there are infinitely many possible input values
- Okay, strictly finite: (2^|bits in integer|)^3 input triples, which is
2^96, or about 7.9 × 10^28, for 32-bit integers; still far too many to test exhaustively
SLIDE 8 Questions and Observations (cont.)
The requirements are quite weak.
- What happens with non-positive inputs?
- Should (0,5,5) be considered Isosceles?
- If not, what is it?
- What about (0,0,0)?
- What if the user provides only 2 input values?
- Might depend on the language, but this is easy to do in JavaScript,
command-line applications, etc.
- There is no explicit upper limit on the input values
- Is a 32-bit integer big enough?
- There is no clear specification of what constitutes
“NotATriangle”
- What assumptions did we make for “NotATriangle”?
SLIDE 9
Improved Requirements
The program will accept as input three integers: a, b, and c. These integers represent the sides of a triangle. The integers a, b, and c must satisfy the following conditions:
C1  1 ≤ a ≤ 200
C2  1 ≤ b ≤ 200
C3  1 ≤ c ≤ 200
C4  a < b + c
C5  b < a + c
C6  c < a + b
The output of the program is the type of triangle determined by the three sides: Equilateral, Isosceles, Scalene, or NotATriangle.
SLIDE 10 Improved Requirements (cont.)
If an input value fails any of conditions C1, C2, or C3, the program notes this with a message, for example, "Value of b is not in the range of permitted values".
If values of a, b, and c satisfy conditions C1, C2, and C3, one of the four mutually exclusive outputs is given:
- 1. If all three sides are equal, the program output is
Equilateral.
- 2. If exactly one pair of sides is equal, the program
outputs Isosceles.
- 3. If no pair of sides is equal, the program output is
Scalene.
- 4. If any of conditions C4, C5, and C6 fails, the
program output is NotATriangle.
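To make the improved requirements concrete, here is a minimal Java sketch of one way they could be implemented. The class name, method names, and the choice to return a string are assumptions for illustration, not the program under test. The sketch also assumes that the triangle-inequality check (C4 to C6) takes priority over the equality checks, so that the four outputs stay mutually exclusive.

```java
/** Hypothetical sketch of the improved Triangle requirements (C1 to C6). */
final class TriangleClassifier {
    static final int MIN_SIDE = 1;
    static final int MAX_SIDE = 200;

    /** Returns the triangle type, or a range message if C1, C2, or C3 fails. */
    static String classify(int a, int b, int c) {
        // C1 to C3: each side must lie in [1, 200]; report the first offender.
        if (!inRange(a)) return rangeMessage("a");
        if (!inRange(b)) return rangeMessage("b");
        if (!inRange(c)) return rangeMessage("c");
        // C4 to C6: each side must be strictly shorter than the sum of the others.
        // Sides are at most 200 here, so the sums cannot overflow.
        if (a >= b + c || b >= a + c || c >= a + b) return "NotATriangle";
        if (a == b && b == c) return "Equilateral";
        if (a == b || b == c || a == c) return "Isosceles";
        return "Scalene";
    }

    private static boolean inRange(int side) {
        return MIN_SIDE <= side && side <= MAX_SIDE;
    }

    private static String rangeMessage(String name) {
        return "Value of " + name + " is not in the range of permitted values";
    }

    public static void main(String[] args) {
        System.out.println(classify(5, 5, 5));   // Equilateral
        System.out.println(classify(3, 4, 20));  // NotATriangle
        System.out.println(classify(0, 5, 5));   // range message for a
    }
}
```

Under these assumptions the earlier open questions get definite answers: (5,1,1) is NotATriangle, and (0,5,5) is rejected with a range message rather than classified.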
SLIDE 11 Observations
Obviously this version of the requirements is much better.
- There is a clear definition of the acceptable range of input
values
- The developer does not have to make wild assumptions
about NotATriangle
- The tester has more information to work with
SLIDE 12 Observations (cont.)
- The quality of the requirements matters a great deal even
for very simple programs!
- It matters even more for complex software systems!!!
- How do we know that the requirements are good?
- Even with good requirements we still have a testing
problem. . .
- How do we know something is still not missing or is
incorrectly specified?
- What if the program implements some function which is not
even specified in the requirements?
- For example, attempts to determine a “right triangle”. . .
SLIDE 13 Important Questions
- Two important technical questions related to verification:
- 1. How do we go about generating test cases?
- 2. What criteria do we use to ensure that we have adequately
tested the program?
- One important process question:
- 1. What is the recommended process?
SLIDE 14 Specification-based vs. Program-based Testing
- There are two major testing approaches:
- Specification-based Testing
- Program-based Testing
- Specification-based testing uses the requirements as the
point of reference for generating test cases and for determining adequacy of testing.
- i.e., Blackbox Testing
- Program-based testing uses the code as the point of
reference.
- i.e., Whitebox Testing
- As we’ll see, each approach has its advantages and
disadvantages.
- As you can imagine, it is possible to combine approaches –
we’ll see how.
SLIDE 15
Specification Testing
SLIDE 16 “Testing” the Specification [1]
- You do not need code in order to start testing
- “Testing” the specification can save time and cost later on
- What mistakes would you have made in the Triangle
program example, with the old vs. new spec?
- Mistakes in the specifications account for half of all bugs
- The specification is typically written using prose and
pictures to describe functional and non-functional aspects of the software
[1] Based on material from Software Testing, 2nd Ed., by Ron Patton
SLIDE 17 Motivation
- Consider a spec that is vague, inconsistent, incomplete, or
otherwise inadequate for describing a useful software
system
- Any software “correctly” implementing this broken
specification is inherently broken
- Therefore finding problems with the specification effectively
rules out certain bugs in the implementation
- Or at least, greatly reduces their odds and clarifies how to fix them
SLIDE 18 Requirements Specification: An Overview
- Basic goal: To understand the problem as perceived by the
user.
- Activities of specification are problem oriented.
- Focus on what, not how (the how is design)
- Don’t cloud the specification with unnecessary detail.
- Don’t pre-constrain design in the specification.
- After specification is done, do software design:
- solution oriented
- how to implement the what
- Key to specification is good communication between
customer and developers.
- Work from specification document as guide.
SLIDE 19 Requirements Specification
- Basically, it’s a process of setting clear and precise
customer expectations for the software
SLIDE 20 The Purpose of Specification
- Raw user requirements are often:
- vague
- contradictory
- impractical or impossible to implement
- overly concrete
- just plain wrong
- The purpose of specification is to get a usable set of
requirements from which the system may be designed and implemented, with minimal “surprises”.
SLIDE 21 Two Kinds of Requirements
- Functional (what): The precise tasks or functions the
system is to perform.
- e.g., details of a flight reservation system
- Non-functional (how): Usually, a constraint of some kind on
the system or its construction
- e.g., expected performance and memory requirements,
process model used, implementation language and platform, compatibility with other tools, deadlines, . . .
SLIDE 22 The Specification Document
- The official statement of what is required of the system
developers.
- Includes system models, requirements definition, and
requirements specification.
- Not a design document.
- States functional and non-functional requirements.
- Serves as a reference document for maintenance.
SLIDE 23 Specification Document “Requirements”
- Should be easy to change as requirements evolve
- Must be kept up-to-date as system changes
SLIDE 24 Specification Should State. . .
- Foreseen problems
- “won’t support Windows 7”
- Expected evolution
- “will port to Linux and FreeBSD in next version”
- Response to unexpected events/usage
- “if input data is in the old format, will auto-convert”
SLIDE 25 Example Specification Structure
- Introduction (describe need for system)
- Functional Requirements
- Non-Functional Requirements
- System Evolution (describe anticipated changes)
- Glossary (technical and/or new jargon)
- Appendices
- Index
Why a glossary?
SLIDE 26 To summarize. . .
- Specification focuses on determining what the customer
wants, and not how it will be implemented.
- Specification is hard to get correct; it requires good
communication skills.
- Requirements may change over time.
- Requirements specification requires iteration.
- The customer often doesn’t have a good grasp of what they
want.
- Bugs created in the requirements stage are very expensive
to fix later.
SLIDE 27 Specification Reviews
- Goal: Discover any issues that prevent a specification from
meeting its goals (clarity, precision, etc.) before implementation
- Involve people examining the specification with the aim of
discovering anomalies and defects.
- Reviewers use domain knowledge so they are likely to have
seen the types of error that commonly arise.
- Does not require the execution of a system so may be used
before implementation.
- Effective technique for discovering errors.
SLIDE 28 Reviews and Testing
- Reviews and testing are complementary and not opposing
verification techniques.
- Both should be used during the V & V process.
- Caveat: Reviews cannot check non-functional
characteristics such as performance, usability, etc.
SLIDE 29 Spec Review Pre-Conditions
- A precise specification must be available.
- Team members must be familiar with the organization
standards.
- Management must accept that reviews will increase costs
early in the software process.
- Why?
- Management must not use reviews for staff appraisal.
- Why?
SLIDE 30 What Is A Specification Review?
- A process of identifying faults in the specification of a
software system.
- Review should uncover both:
- Errors made in producing specification documents
- Errors made earlier in the requirements engineering process.
SLIDE 31 The Naïve Approach to Specification Review
The Naïve Approach: Read the specification very carefully. Note problems.
- Does this seem good? Ideal?
- Is this the approach you use for code reviews? (those who
have done them)
SLIDE 32 Challenges in Specification Review
The natural “read carefully” approach has some issues:
- Too much information to go through, and not enough time to
do it thoroughly.
- Unfamiliarity of individual reviewers with the overall goals of
the design and problem domain.
- No single part of the specification may get a thorough and
complete evaluation.
- Burden is on reviewer to initiate action.
- One-on-one interaction between individual reviewers and
specification team is limited.
SLIDE 33 Better Method: Active Specification Review Process
- Also called “perspective-based reading”
- Change from “general” review to a set of more focused
reviews.
- Use questionnaires to engage the reviewer in using the
specification.
- Instead of “find problems,” “check or clarify X, Y, and Z”
- More opportunities for one-on-one discussion between
reviewer and specification team.
SLIDE 34 An Example
- We have been asked to review the specification for a
hospital’s order processing system.
- The order processing system allows users to order items for
patients, such as tests or medications.
What perspectives immediately come to mind? HIPAA compliance, security, reliability. . .
SLIDE 35 Active Specification Review Process
Step #1: Prepare the documentation for review
- Make assumptions explicit
- System can record the order pertaining to a patient.
- It is possible to obtain all the orders for a patient.
- System can determine and change the status of an order.
- The order always contains at least one item.
- The status of an order is always in one of two states, i.e.,
active or cancelled.
- Incorrect Usage Assumptions
- Cannot add or remove items once the order is placed.
- Once an order is cancelled, the status cannot be set to
active again.
- An item is always added with respect to an order.
SLIDE 36 Active Specification Review Process (cont.)
Step #2: Identify the specialized reviews
- Focus the reviewer’s attention on specific properties of the
specification (e.g., data access).
- Data access sufficiency.
- E.g., provides all data required by the other features of the
system.
- Assumption sufficiency.
- E.g., contains all of the assumptions needed to access the
feature’s data.
- Authentication properties
- E.g., describes who should have access to what
SLIDE 37 Active Specification Review Process (cont.)
Step #3: Identify the reviewers needed
- People with different perspectives and expertise are needed
as reviewers.
- Programmers and analysts who worked on the other
features of the order processing system.
- Programmers and analysts familiar with hospital information
systems in general (e.g., HIPAA).
SLIDE 38 Active Specification Review Process (cont.)
Step #4: Design the questionnaires
- Make reviewers take an active role
- Make reviewers use the documentation
- Phrase questions in an active way
- E.g., “Write down the exceptions that can occur” rather than
“Are exceptions defined for every program?”
SLIDE 39 Active Specification Review Process (cont.)
Step #5: Conduct the review.
- Assign reviews to the reviewers.
- Reviewers complete their reviews, meeting with the
specification authors as needed.
- Specification authors review completed questionnaires, and
meet with reviewers to resolve questions.
- Specification authors produce new version of the
specification.
SLIDE 40 Specification Attribute Checklist
- Completeness
- Accuracy
- Precision
- Consistency
- Relevance
- Feasibility
- Code/Design-free
- Testability
SLIDE 41 Specification Terminology Checklist
Things to look out for:
- Always, every, all, none, never, . . . (absolutely sure?)
- Certainly, therefore, clearly, obviously, evidently, . . .
(persuasion lingo)
- Some, sometimes, often, usually, ordinarily, customarily,
most, . . . (vague)
- Etc., and so forth, and so on, such as, . . . (not testable)
- Good, fast, cheap, efficient, small, stable, . . .
(unquantifiable)
- Handled, processed, rejected, skipped, eliminated, . . .
(may hide unspecified functionality)
- If . . . then . . . (missing else)
SLIDE 42 Conclusions
- Reviewers focus on those areas they are best suited to
evaluate
- Time is used more wisely for all participants
- More errors are likely to be found
- One-on-one communication with specification authors
makes it easier for people to speak up.
- Few errors found does not necessarily indicate that the
specification is good.
- E.g., Perhaps the review process was not effective.
SLIDE 43
Testing Frameworks
SLIDE 44 A Note on Topic Ordering
- You may expect us to talk about the details of choosing tests
before we talk about writing tests
- Most of you have already written tests despite not having
undertaken a thorough study of blackbox vs. whitebox testing techniques, etc.
- I’ll assume for now you can all come up with some
reasonable tests
- Let’s talk about testing infrastructure, skipping ahead to
Chapter 7 in the text
- Then we can be very concrete when discussing specific
testing techniques
- Also, your first assignment on blackbox testing goes out
next week
SLIDE 45 Unit Testing Frameworks
Most of you have already used unit testing frameworks, on co-op
So you’ve all written properly scoped, sized, etc. unit tests, right? Maybe.
- Unit testing frameworks aren’t specific to unit testing!
- Unit testing frameworks provide structured ways to check
arbitrary properties of code or programs.
SLIDE 46 Simplified xUnit Model
- The common model for unit testing frameworks (henceforth,
xUnit frameworks [2]) is a class full of test methods.
- But things are more subtle than that.
[2] See JUnit, NUnit, PHPUnit, ...
SLIDE 47 General Model
In general, xUnit frameworks:
- Use classes to group related tests
- Have a way to distinguish test case methods from helper
methods
- Some depend on annotations — @Test in JUnit 4.x, [Test]
in NUnit. . .
- Some depend on naming conventions — e.g., methods with
names beginning with “test” in earlier JUnit versions
- Have automatically managed setup and cleanup routines
(more shortly)
- Provide test runners that are responsible for orchestrating
setup, finding and running tests, and cleaning up
- Underappreciated: Can use this to customize the means of
providing test cases
SLIDE 48
General Model
SLIDE 49 Running a Test Suite
For each test class, the test runner:
- Runs the class-level test initializer
- e.g., start up external server
- For each test method in the class:
- Run the per-test initializer method
- e.g., put files on disk in known state
- Run the test method
- Run the per-test cleanup method
- Runs the class-level cleanup method
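The ordering above can be sketched as a tiny hand-rolled runner. This is only an illustration of the lifecycle: the `MiniRunner` class, the `TestClass` interface, and the hook names are hypothetical, and real xUnit runners discover tests through annotations or naming conventions rather than an explicit map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical mini runner showing the order of xUnit lifecycle calls. */
final class MiniRunner {
    /** A "test class" in this sketch: four lifecycle hooks plus named tests. */
    interface TestClass {
        void beforeClass();               // class-level initializer
        void beforeEach();                // per-test initializer
        void afterEach();                 // per-test cleanup
        void afterClass();                // class-level cleanup
        Map<String, Runnable> tests();    // test name -> test body
    }

    /** Runs every test in order and returns the names of the failing ones. */
    static List<String> run(TestClass testClass) {
        List<String> failures = new ArrayList<>();
        testClass.beforeClass();
        try {
            for (Map.Entry<String, Runnable> test : testClass.tests().entrySet()) {
                testClass.beforeEach();
                try {
                    test.getValue().run();
                } catch (AssertionError | RuntimeException e) {
                    failures.add(test.getKey());   // record and keep going
                } finally {
                    testClass.afterEach();         // cleanup runs even on failure
                }
            }
        } finally {
            testClass.afterClass();
        }
        return failures;
    }

    public static void main(String[] args) {
        Map<String, Runnable> tests = new LinkedHashMap<>();
        tests.put("passes", () -> System.out.println("  running passes"));
        tests.put("fails", () -> { throw new AssertionError("expected 4"); });
        List<String> failures = run(new TestClass() {
            public void beforeClass() { System.out.println("class setup"); }
            public void beforeEach() { System.out.println(" test setup"); }
            public void afterEach() { System.out.println(" test cleanup"); }
            public void afterClass() { System.out.println("class cleanup"); }
            public Map<String, Runnable> tests() { return tests; }
        });
        System.out.println("failed: " + failures);
    }
}
```

One design choice worth noting: a failing test is recorded rather than allowed to abort the run, so the per-test and class-level cleanups still execute.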
Wait! These initializers (and cleanups) sound too big to be unit tests!
SLIDE 50 Test Initializers
- Many tests depend on state: program state, filesystem
state, computer state
- To be repeatable, a test must be executed in the same state
every time
- Otherwise tests pass and fail randomly without changing the
program. . .
- Technically they depend on only part of the state
- Test initializers configure this state — the test fixture
- Class-level initializers set up reusable resources
- Servers, etc.
- Test-level initializers encapsulate common setup work for
different tests
- E.g., clear static caches
SLIDE 51 Test Teardown
- Many test frameworks also support per-test and class-level
teardown methods
- At the class level, this releases otherwise-persistent
resources
- e.g., shut off server used for testing
- At the per-test level, typically releases memory or closes
files
- In a language with garbage collection, this is a code smell
unless you’re working with files
- Most tests shouldn’t need cleanup — subsequent initializers
should handle everything
- In some languages, destructors are used instead
SLIDE 52 Maintenance of Test Suites with Initializers and Teardown
- Using common initializer and teardown routines reduces
code duplication, and makes tests easier to maintain
- They also distribute the code for a test across several
sources, making reading a bit more complicated
- Sometimes half a test class’s methods require setup, and
the other half don’t
- Refactor!
- You can have as many test classes as needed
- Better to have one class where all tests require
setup/teardown, and one where none do
SLIDE 53 Roll Your Own Testing Framework
Doesn’t this sound like you could make a list of tests, and write your own code to run each test? Why not just do this?
You could! The frameworks aren’t magic. But. . .
- Test frameworks have many additional features
- Nice reports
- We’ll see parameterized tests and matchers shortly
- Don’t need them? Ignore them. Picking up these features
later is easier than implementing them, too.
- Other tools already know about them
- IDEs (e.g., Eclipse) and build tools (ant, gradle, maven)
already know about popular test frameworks
- You don’t really lose flexibility
- For example, the Checker Framework turns the source files
to be checked into unit tests. . .
SLIDE 54 Structuring Tests
How do you write a test?
- Three pieces: Arrange-Act-Assert
- Or Build-Operate-Check, Given-When-Then, . . .
- Arrange to test the relevant behavior
- Set up state
- Act: Trigger the behavior being tested
- Assert that the behavior was correct
SLIDE 55
Structuring Tests (cont.)
    @Test
    public void MagicHatConvertsRedScarfIntoWhiteRabbit() {
        // Arrange
        MagicHat magicHat = new MagicHat();
        magicHat.PutInto(new Scarf(Color.Red));

        // Act
        magicHat.TapWithMagicWand();
        Item itemFromHat = magicHat.PullOut();

        // Assert
        Item expectedItem = new Rabbit(Color.White);
        assertEquals(expectedItem, itemFromHat);
    }
SLIDE 56
Motivation for Test Structure
Why these three sections? Understandability.
Arrange-Act-Assert discourages doing some setup, then checking something, doing a little more work, then checking something, doing a little more, over and over. . . .
Interleaving behavior and checks makes it difficult to identify what is being tested, which makes it harder to evaluate the quality of testing.
SLIDE 57 So What Can You Check?
(with JUnit)
- assertEquals (.equals() equality)
- assertSame (==)
- assertNotSame (!=)
- assertTrue
- assertFalse
- assertNull
- assertNotNull
- fail (just fails the test)
SLIDE 58 Why Special Asserts?
Java includes an assert. C#/.NET has Debug.Assert and
friends. Why not just use these?
- They are disabled in certain configurations.
- Java requires -ea to be passed to the JVM
- .NET requires DEBUG symbol to be defined during
compilation (/d:DEBUG)
- Harder to distinguish test failure from other internal failures
- More when we discuss Design by Contract
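As a rough illustration of both points, the Java sketch below contrasts the built-in `assert` with a check that always runs and throws a dedicated failure type. The `Checks` class, the `TestFailure` type, and `checkTrue` are hypothetical names invented for this sketch; they are not JUnit's API.

```java
/** Hypothetical sketch: a check that fails regardless of the JVM's -ea flag. */
final class Checks {
    /** Dedicated failure type, so test failures are easy to tell apart. */
    static final class TestFailure extends AssertionError {
        TestFailure(String message) {
            super(message);
        }
    }

    /** Always evaluated, unlike the built-in assert statement. */
    static void checkTrue(String message, boolean condition) {
        if (!condition) {
            throw new TestFailure(message);
        }
    }

    public static void main(String[] args) {
        // The assignment inside assert only happens when assertions are enabled.
        boolean builtInAssertsEnabled = false;
        assert builtInAssertsEnabled = true;
        System.out.println("built-in asserts enabled: " + builtInAssertsEnabled);

        try {
            checkTrue("1 + 1 should be 3", 1 + 1 == 3);
        } catch (TestFailure e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Running `main` with and without `-ea` shows the built-in assert switching on and off, while `checkTrue` fails in both configurations.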
SLIDE 59 How Many Assertions?
- Ideally one
- Sometimes that doesn’t work:
- Checking multiple fields
- Checking size of a collection and its contents
- . . .
- Most important is grouping assertions together at the end,
and checking everything you need to!
SLIDE 60 Matchers
- JUnit has a lot of assertion constructs.
- But they’re all limited to asserting (in)equality, booleans, or
nullness
- If what you’re checking doesn’t fit neatly into that, you need
to shoe-horn it in
- OR, you could use matchers
- Called constraints in some .NET frameworks
- JUnit permits you to define your own test predicates
- And customize their error messages
- E.g., a predicate for checking that a Person is an adult
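To give a feel for the idea, here is a hand-rolled Java sketch of a matcher for the "Person is an adult" example. Everything in it is an assumption for illustration: the `Person` fields, the age threshold of 18, and the simplified `Matcher` and `assertThat` shapes, which only mimic the style of real matcher libraries.

```java
import java.util.function.Predicate;

/** Hypothetical hand-rolled matcher; real matcher libraries offer richer APIs. */
final class MatcherSketch {
    /** Minimal stand-in for the Person type mentioned on the slide. */
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public String toString() {
            return name + " (age " + age + ")";
        }
    }

    /** A named predicate, so a failure can say what was expected. */
    static final class Matcher<T> {
        final String description;
        final Predicate<T> predicate;

        Matcher(String description, Predicate<T> predicate) {
            this.description = description;
            this.predicate = predicate;
        }
    }

    /** Assumes adulthood means age 18 or older. */
    static Matcher<Person> isAdult() {
        return new Matcher<>("an adult (age >= 18)", p -> p.age >= 18);
    }

    /** Fails with a message built from the matcher's description. */
    static <T> void assertThat(T actual, Matcher<T> matcher) {
        if (!matcher.predicate.test(actual)) {
            throw new AssertionError(
                "Expected: " + matcher.description + " but: was " + actual);
        }
    }
}
```

A call such as `assertThat(new Person("Tim", 12), isAdult())` then fails with a message naming both the expectation and the offending value, which is the main payoff over a bare `assertTrue`.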
SLIDE 61 Are Matchers Overkill?
This might sound like overkill
- Strictly speaking, again, yes, but you get some nice bonuses:
- Nice custom error messages
- Shorter tests
- No extra maintenance costs on do-it-yourself solutions
- Third parties ship matchers
- You can use them without writing them!
- JUnit includes combinators like everyItem, either, and
both
- There are specialized matchers for checking properties of
JSON, etc.
SLIDE 62
Constraints in Action
    public class MinMatchers {
        @Test
        public void matchStuff() {
            assertThat(Min.min(0,4), is(0));
            // min(3,4) is 3, so this assertion fails and yields the output below
            assertThat(Min.min(3,4), anyOf(is(9), is(4)));
        }
    }

Test Output
    java.lang.AssertionError:
    Expected: (is <9> or is <4>)
         but: was <3>
    ...
SLIDE 63 Parameterized Tests / Theories
- Sometimes you want to run the same code with multiple
different inputs.
- You could write a helper method with the main test logic,
then separate tests for each input
- But that’s tedious, error prone, and you need to do it again
for every new test
- Many frameworks support parameterized tests, called
theories in JUnit
- These let you specify a range of possible inputs for some
type, and run the same check for all inputs
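Roughly speaking, a parameterized-test runner automates a loop like the one in this plain Java sketch, which checks one property of a minimum function over every ordered pair of candidate inputs. The `MinPairs` class and its local `min` are stand-ins assumed for illustration; a framework would additionally report each failing input as its own test result.

```java
/** Hypothetical sketch of what a parameterized-test runner automates. */
final class MinPairs {
    /** Stand-in for the method under test (assumed behavior: the smaller value). */
    static int min(int a, int b) {
        return a <= b ? a : b;
    }

    /** Candidate inputs, including the extremes of the int range. */
    static final int[] BOUNDS = {Integer.MIN_VALUE, -4, 0, 6, Integer.MAX_VALUE};

    /** Checks min(a, b) <= a and <= b for every ordered pair; returns pairs checked. */
    static int checkAllPairs() {
        int checked = 0;
        for (int a : BOUNDS) {
            for (int b : BOUNDS) {
                int c = min(a, b);
                if (c > a || c > b) {
                    throw new AssertionError("min(" + a + ", " + b + ") returned " + c);
                }
                checked++;
            }
        }
        return checked;
    }

    public static void main(String[] args) {
        System.out.println("pairs checked: " + checkAllPairs());
    }
}
```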
SLIDE 64
JUnit Theory Example
    @RunWith(Theories.class)
    public class TestMinTheory {
        @DataPoints
        public static int[] bounds = {Integer.MIN_VALUE, -4, 0, 6, Integer.MAX_VALUE};

        // Runs once for each combination of data points assigned to (a, b)
        @Theory
        public void testMinLessThanArgs(int a, int b) {
            int c = Min.min(a,b);
            assertTrue("min is at most the first argument", c <= a);
            assertTrue("min is at most the second argument", c <= b);
        }
    }
SLIDE 65 Exceptions
- Exceptions abort your program’s execution abnormally
- They either terminate your program, or transfer control to an
appropriate dynamically enclosing catch block for the appropriate exception type
- i.e., the catch block may not be locally visible in the source —
an exception can be caught outside the method that threw it.
- They are used for two purposes:
- Indicating unexpected failing conditions (e.g.,
NullPointerException)
- Managing error handling for expected error conditions (e.g.,
IOExceptions from missing files, lost network connectivity)
- These expected error conditions must be tested as well
SLIDE 66
Testing Exceptions, the Old Way
    @Test
    public void testFooFailsForNull() {
        try {
            foo(null); // should yield IllegalArgumentException
            fail("foo(null) didn't yield IllegalArgumentException");
        } catch (IllegalArgumentException e) {
            // Do nothing, or something! (check error message, etc.)
        }
    }
Advantages: Straightforward, no special framework knowledge
Disadvantages: Verbose, requires repeating the pattern correctly
There are two alternatives.
SLIDE 67
Testing Exceptions, the Short Way
    @Test(expected = IllegalArgumentException.class)
    public void testExceptionsWithBrevity() {
        throw new IllegalArgumentException("No good reason for this.");
    }
JUnit’s @Test accepts an optional expected exception.
Advantages: Concise
Disadvantages: No way to check further properties of the exception, like message or reason.
SLIDE 68
Testing Exceptions, the Robust Way
    @Rule
    public ExpectedException thrown = ExpectedException.none();

    @Test
    public void testExceptionsWithStyle() {
        thrown.expect(IllegalArgumentException.class);
        thrown.expectMessage("Bad Argument");
        throw new IllegalArgumentException("Bad Argument");
    }
Advantages: Can perform extra checks on the exception
Disadvantages: A bit more verbose if you don’t need the power
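For comparison, the try/catch pattern from the old way can be written once as a helper that returns the caught exception for further checks. The Java sketch below is hand-rolled for illustration: `ExceptionChecks`, `expectThrows`, and the stand-in `foo` are hypothetical names. Later JUnit releases provide a similar built-in, `assertThrows`.

```java
/** Hypothetical helper packaging the try/fail/catch pattern in one place. */
final class ExceptionChecks {
    /** Runs the action and returns the expected exception, or fails the check. */
    static <T extends Throwable> T expectThrows(Class<T> expected, Runnable action) {
        try {
            action.run();
        } catch (Throwable thrown) {
            if (expected.isInstance(thrown)) {
                return expected.cast(thrown);   // caller can inspect message, cause, ...
            }
            throw new AssertionError(
                "expected " + expected.getSimpleName() + " but got " + thrown, thrown);
        }
        throw new AssertionError(
            "expected " + expected.getSimpleName() + " but nothing was thrown");
    }

    /** Stand-in for the method under test; assumed to reject null arguments. */
    static void foo(Object arg) {
        if (arg == null) {
            throw new IllegalArgumentException("Bad Argument");
        }
    }

    public static void main(String[] args) {
        IllegalArgumentException e =
            expectThrows(IllegalArgumentException.class, () -> foo(null));
        System.out.println("caught: " + e.getMessage());
    }
}
```

Returning the exception keeps the test in Arrange-Act-Assert shape: the call is the act step, and any checks on the message come afterwards.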
SLIDE 69
Regression Testing
Who remembers what regression testing is from last week?
Regression Testing: Running tests again to ensure previously-working functionality, performance, etc. continues to work as expected.
This is the primary use of most tests, and in particular automated tests.
SLIDE 70 How Do You Do Regression Testing?
There are several options:
- Manually
- Works, when you remember
- If there are multiple sets of tests, you might forget some
relevant ones
- Via QA
- Some tests may be specialized, or require special hardware,
etc., and may need to be run by specialists
- Having only QA run tests has fallen out of favor
- Automatically
- via continuous integration
SLIDE 71 Continuous Integration
- Merge “often”
- Automate testing
- Needed emphasis in
1994
- Server automatically runs
tests for every checkin
archived
- Originally just a part of
CI, now the most prominent part
[Figure: cover of a popular book on CI]