Testing Database-Driven Applications: Challenges and Solutions - - PowerPoint PPT Presentation

testing database driven applications challenges and
SMART_READER_LITE
LIVE PREVIEW

Testing Database-Driven Applications: Challenges and Solutions - - PowerPoint PPT Presentation

Testing Database-Driven Applications: Challenges and Solutions Gregory M. Kapfhammer Department of Computer Science Mary Lou Soffa University of Pittsburgh Department of Computer Science University of Pittsburgh Department of Computer


slide-1
SLIDE 1

Testing Database-Driven Applications: Challenges and Solutions

Gregory M. Kapfhammer Department of Computer Science University of Pittsburgh Department of Computer Science Allegheny College Mary Lou Soffa Department of Computer Science University of Pittsburgh

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 1/32

slide-2
SLIDE 2

Outline

Introduction and Motivation Testing Challenges Database-Driven Applications A Unified Representation Test Adequacy Criteria Test Suite Execution Test Coverage Monitoring Conclusions and Resources

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 2/32

slide-3
SLIDE 3

Motivation

The Risks Digest, Volume 22, Issue 64, 2003 Jeppesen reports airspace boundary problems

About 350 airspace boundaries contained in Jeppesen NavData are incorrect, the FAA has warned. The error

  • ccurred at Jeppesen after a software upgrade when

information was pulled from a database containing 20,000 airspace boundaries worldwide for the March NavData update, which takes effect March 20.

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 3/32

slide-4
SLIDE 4

Testing Challenges

Should consider the environment in which software applications execute Must test a program and its interaction with a database Database-driven application’s state space is well-structured, but infinite (Chays et al.) Need to show program does not violate database integrity, where integrity = consistency + validity (Motro) Must locate program and database coupling points that vary in granularity

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 4/32

slide-5
SLIDE 5

Testing Challenges

The structured query language’s (SQL) data manipulation language (DML) and data definition language (DDL) have different interaction characteristics Database state changes cause modifications to the program representation Different kinds of test suites require different techniques for managing database state during testing

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 5/32

slide-6
SLIDE 6

Testing Challenges

The many testing challenges include, but are not limited to, the following: Unified program representation Family of test adequacy criteria Efficient test coverage monitoring techinques Intelligent approaches to test suite execution

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 6/32

slide-7
SLIDE 7

Database-Driven Applications

P m i m j Dl Dk

R R2

1

E F G H A B C D I

R3

J K L

Program P interacts with two relational databases

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 7/32

slide-8
SLIDE 8

Database Interaction Levels

Database Level

D1

P

Dn

A program can interact with a database at different levels of granularity

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 8/32

slide-9
SLIDE 9

Database Interaction Levels

UserInfo

user_name

4

acct_lock

1 Brian Zorman 2 Robert Roos 3

card_number pin_number

Geoffrey Arnold 32142 41601 45322 56471 Marcus Bittman

Relation Level

P

D1 Dn

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 8/32

slide-10
SLIDE 10

Database Interaction Levels

UserInfo

user_name

4

acct_lock

1 Brian Zorman 2 Robert Roos 3

card_number pin_number

Geoffrey Arnold 32142 41601 45322 56471 Marcus Bittman

Record Level

P

n

D1 D

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 8/32

slide-11
SLIDE 11

Database Interaction Levels

UserInfo

4

acct_lock

1 Brian Zorman 2 Robert Roos 3

card_number pin_number

32142 41601 45322 56471

user_name

Attribute Level

Marcus Bittman Geoffrey Arnold

P

D1 Dn

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 8/32

slide-12
SLIDE 12

Database Interaction Levels

UserInfo

4

acct_lock

1 Brian Zorman 2 Robert Roos 3

card_number pin_number

Geoffrey Arnold 32142 41601 45322 56471

user_name

Attribute Value Level

Marcus Bittman

P

D1

n

D

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 8/32

slide-13
SLIDE 13

Database Interaction Points

Database interaction point Ir ∈ I corresponds to the execution of a SQL DML statement Consider a simplified version of SQL and ignore SQL DDL statements (for the moment) Interaction points are normally encoded within Java programs as dynamically constructed Strings

select uses Dk, delete defines Dk, insert defines Dk, update defines and/or uses Dk

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 9/32

slide-14
SLIDE 14

Database Interaction Points (DML)

select A1, A2, . . . , Aq from r1, r2, . . . , rm where Q delete from r where Q insert into r(A1, A2, . . . , Aq) values(v1, v2, . . . , vq) update r set Al = F(Al) where Q

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 10/32

slide-15
SLIDE 15

Refined Database-Driven Application

P m i m j

R R2

1

E F G H A B C D

l

D

k

D

where set J = 500 update L < 1000 R3 select

1

* from R R2 from ) select where A < ( avg(G)

I

R3

J K L

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 11/32

slide-16
SLIDE 16

Test Adequacy Criteria

P violates a database Dk’s validity when it:

(1-v) inserts entities into Dk that do not reflect real

world P violates a database Dk’s completeness when it:

(1-c) deletes entities from Dk that still reflect real

world In order to verify (1-v) and (1-c), T must cause P to define and then use entities within D1, . . . , Dn!

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 12/32

slide-17
SLIDE 17

Data Flow Information

Interaction point: ‘‘UPDATE UserInfo SET acct lock=1 WHERE card number=’’ + card number + ‘‘;’’;

Database Level: define(BankDB) Attribute Level: define(acct_lock) and use(card_number)

Data fl

  • w information varies with respect to

the granularity of the database interaction

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 13/32

slide-18
SLIDE 18

Database Entities

UserInfo

user_name

4

acct_lock

1 Brian Zorman 2 Robert Roos 3

card_number pin_number

Marcus Bittman Geoffrey Arnold 41601 45322 56471 32142 v r

A (I ) = {

32142

}

1 Geoffrey Arnold

, , . . . , ,

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 14/32

slide-19
SLIDE 19

The DICFG: A Unified Representation

entry lockAccount

temp1 = parameter0:c_n temp2 = LocalDatabaseEntity0:Bank temp3 = LocalDatabaseEntity1:acct_lock temp4 = LocalDatabaseEntity2:card_number

“Database-enhanced” CFG for lockAccount Define temporaries to represent the program’s interaction at the levels of database and attribute

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 15/32

slide-20
SLIDE 20

The DICFG: A Unified Representation

exit G G G G

r r2 r 2 r 1

1

entry entry exit lockAccount update_lock = m_connect.createStatement() if( result_lock == 1) completed = true exit

D

qu_lck = "UPDATE UserInfo ..." + temp1 + ";" use(temp4) result_lock = update_lock.executeUpdate(qu_lck) define(temp2)

A Ir

define(temp3)

Database interaction graphs (DIGs) are placed before interaction point Ir Multiple DIGs can be integrated into a single CFG String at Ir is determined in a control-fl

  • w sensitive

fashion

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 15/32

slide-21
SLIDE 21

Test Adequacy Criteria

all−attribute−value−DUs all−record−DUs all−attribute−DUs all−relation−DUs all−database−DUs

Database interaction association (DIA) involves the def and use of a database entity DIAs can be located in the DICFG with data flow analysis all-database-DUs requires tests to exercise all DIAs for all

  • f the accessed databases

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 16/32

slide-22
SLIDE 22

Generating Test Requirements

Database Seeder

Database

(P)

Test Adequacy Criterion

(C)

System Under Test

Test Case Specification Relational Schema Requirements Test

Measured time and space overhead when computing family of test adequacy criteria Modifi ed ATM and mp3cd to contain appropriate database interaction points Soot 1.2.5 to calculate intraprocedural associations GNU/Linux workstation with kernel 2.4.18-smp and dual 1 GHz Pentium III Xeon processors

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 17/32

slide-23
SLIDE 23

Counting Associations and Definitions

D Rc Rl A Av Database Granularity 250 500 750 1000 1250 1500 1750 Assoc & Def Count D Rc Rl A Av

mp3cd HD ATM HD mp3cd DB ATM DB

DIAs at attribute value level represent 16.8% of mp3cd’s and 9.6% of ATM’s total number of intraprocedural associations

– p. 18/32

slide-24
SLIDE 24

Measuring Time Overhead

None D Rc Rl A Av Database Granularity 22.5 25 27.5 30 32.5 35 37.5 System Time sec None D Rc Rl A Av Time Overhead

mp3cd ATM

Computing DIAs at the attribute value level incurs no more than a 5 second time overhead

– p. 19/32

slide-25
SLIDE 25

Measuring Average Space Overhead

None D Rc Rl A Av Database Granularity 16 18 20 22 24 26 28 30 Node & Edge Count None D Rc Rl A Av

Emp3 Eatm Nmp3 Natm

mp3cd shows a more marked increase in the average number of nodes and edges than ATM

– p. 20/32

slide-26
SLIDE 26

Measuring Maximum Space Overhead

None D Rc Rl A Av Database Granularity 200 400 600 800 1000 1200 1400 Node & Edge Count None D Rc Rl A Av

Emp3 Eatm Nmp3 Natm

mp3cd shows a signifi cantly greater maximum space

  • verhead than ATM

– p. 21/32

slide-27
SLIDE 27

Automatic Representation Construction

Manual construction of DICFGs is not practical Use extension of BRICS Java String Analyzer (JSA) to determine content of String at Ir Per-class analysis is inter-procedural and control flow sensitive Conservative analysis might determine that all database entities are accessed Include coverage monitoring instrumentation to track DIGs that are covered during test suite execution

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 22/32

slide-28
SLIDE 28

Tracking Covered DIGs and DIAs

DB

P

DIGr m i m j DIGs

DIG # 1 DEF USE { ... } { ... } q { ... } { ... } 1 1 2 2 TEST { ... } { ... } COV?

DIG Coverage Table

DIA coverage can be tracked by recording which DIGs within a DICFG were executed during testing

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 23/32

slide-29
SLIDE 29

Types of Test Suites

T T ∆

1 e

m1 ∆1

e

∆ 0 me ∆e−1

Independent

∆ m T1

1

∆1

ε−1

∆ Tε

ε

m

ε

e

e

m Te ∆e

Partially Independent

T T ∆

1 e

m1 ∆1

e

∆ 0 me ∆e−1

Non-restricted

– p. 24/32

slide-30
SLIDE 30

Test Suite Execution

Independent test suites can be executed by using provided setup code to ensure that all ∆γ = ∆0 Non-restricted test suites simply allow state to accrue Partially independent test suites must return to ∆ε after Tε is executed by :

  • 1. Re-executing all SQL statements that resulted in

the creation of ∆ε

  • 2. Creating a compensating transaction to undo the

SQL statements executed by each test after Tε

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 25/32

slide-31
SLIDE 31

Representation Extension

The execution of a SQL insert during testing requires the re-creation of DICFG(s) The SQL delete does not require re-creation because we must still determine if deleted entity is ever used DICFG re-creation only needed when database interactions are viewed at the record or attribute-value level Representation extension ripples to other methods DICFGs can be re-constructed after test suite has executed, thus incurring smaller time overhead

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 26/32

slide-32
SLIDE 32

Test Coverage Monitoring

For each tested method mi that interacts with a database and each interaction point Ir that involves an

insert we must:

  • 1. Update the DICFG
  • 2. Re-compute the test requirements

We can compute the set of covered DIAs by consulting the DIG coverage table Test adequacy is : # covered DIAs / # total DIAs

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 27/32

slide-33
SLIDE 33

Calculating Adequacy

mi mj

mi

DIA <def(e1), use(e1)> <def(e2), use(e2)> COV? <def(e3), use(e3)> <def(e4), use(e4)>

Test Requirements

DIA COV? <def(e5), use(e5)> <def(e6), use(e6)> <def(e7), use(e7)> <def(e8), use(e8)> <def(e9), use(e9)> <def(e10), use(e10)>

Test Requirements

mj

Tf

cov(mi) = 2

4

cov(mj) = 4

6

cov(Tf) = 6

10

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 28/32

slide-34
SLIDE 34

Related Work

Jin and Offutt and Whittaker and Voas have suggested that the environment of a software system is important Chan and Cheung transform SQL statements into C code segments Chays et al. and Chays and Deng have created the category-partition inspired AGENDA tool suite Neufeld et al. and Zhang et al. have proposed techniques for database state generation Dauo et al. focused on the regression testing of database-driven applications

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 29/32

slide-35
SLIDE 35

Conclusions

Must test the program’s interaction with the database Many challenges associated with (1) unifi ed program representation, (2) test adequacy criteria, (3) test coverage monitoring, (4) test suite execution The DICFG shows database interactions at varying levels of granularity Unique family of test adequacy criteria to detect type (1) violations of database validity and completeness Intraprocedural database interactions can be computed from a DICFG with minimal time and space overhead

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 30/32

slide-36
SLIDE 36

Conclusions

Test coverage monitoring instrumentation supports the tracking of DIAs executed during testing Three types of test suites require different techniques to manage the state of the database SQL insert statement causes the re-creation of the representation and re-computation of test requirements Data fl

  • w-based test adequacy criteria can serve as the

foundation for automatically generating test cases and supporting regression testing

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 31/32

slide-37
SLIDE 37

Resources

Gregory M. Kapfhammer and Mary Lou Soffa. A Family of Test Adequacy Criteria for Database-Driven Applications. In FSE 2003. Gregory M. Kapfhammer. Software Testing. CRC Press Computer Science Handbook. June, 2004.

http://cs.allegheny.edu/˜gkapfham/research/diatoms/

Database drIven Application T esting tOol ModuleS, IBM T.J. Watson Research Center, May 14, 2004 – p. 32/32