Virtual Mutation Analysis of Relational Database Schemas University - - PowerPoint PPT Presentation

virtual mutation analysis of relational database schemas
SMART_READER_LITE
LIVE PREVIEW

Virtual Mutation Analysis of Relational Database Schemas University - - PowerPoint PPT Presentation

Virtual Mutation Analysis of Relational Database Schemas University of Sheffield Phil McMinn Allegheny College Gregory M. Kapfhammer University of Sheffield Chris J. Wright Relational Databases Why Should We (Still) Care?


slide-1
SLIDE 1

Virtual Mutation Analysis of Relational Database Schemas

Phil McMinn
 Gregory M. Kapfhammer
 Chris J. Wright University of Sheffield
 Allegheny College
 University of Sheffield

slide-2
SLIDE 2

Relational Databases – 
 Why Should We (Still) Care?

A vital component of many software systems Despite the wave of interest in “NoSQL” technologies, Relational Databases are still popular (and faster) For developers: schemas provide self-documentation

slide-3
SLIDE 3

Relational Databases – 
 Why Should We (Still) Care?

Relational Databases are still 
 important, popular and relevant

A vital component of many software systems Despite the wave of interest in “NoSQL” technologies, Relational Databases are still popular (and faster) For developers: schemas provide self-documentation

slide-4
SLIDE 4

A Relational Database Schema

slide-5
SLIDE 5

A Relational Database Schema

Table

slide-6
SLIDE 6

A Relational Database Schema

Table Column and
 data type

slide-7
SLIDE 7

Integrity Constraints

Prevent invalid data being entered into the database Encode domain logic

slide-8
SLIDE 8

Integrity Constraints

Prevent invalid data being entered into the database Encode domain logic

slide-9
SLIDE 9

Testing the Schema

Correctly accepted by the schema Correctly rejected by the schema Correctly rejected by the schema

slide-10
SLIDE 10

Testing the Schema

Correctly accepted by the schema Correctly rejected by the schema Correctly rejected by the schema

slide-11
SLIDE 11

Testing the Schema

Correctly accepted by the schema Correctly rejected by the schema Correctly rejected by the schema

slide-12
SLIDE 12

Testing the Schema

Correctly accepted by the schema Correctly rejected by the schema Correctly rejected by the schema

slide-13
SLIDE 13

Why Do We Need to Do This?

slide-14
SLIDE 14

Why Do We Need to Do This?

To trap common errors when designing a schema


For example: lack of uniqueness property on usernames, out of range values

slide-15
SLIDE 15

Why Do We Need to Do This?

To trap common errors when designing a schema


For example: lack of uniqueness property on usernames, out of range values

To test development behaviour vs deployment


DBMSs have subtly different behaviors

slide-16
SLIDE 16

Why Do We Need to Do This?

To trap common errors when designing a schema


For example: lack of uniqueness property on usernames, out of range values

To test development behaviour vs deployment


DBMSs have subtly different behaviors

Nobody throws away a database of data


To test the success of database migrations

slide-17
SLIDE 17

Why Do We Need to Do This?

To trap common errors when designing a schema


For example: lack of uniqueness property on usernames, out of range values

To test development behaviour vs deployment


DBMSs have subtly different behaviors

Nobody throws away a database of data


To test the success of database migrations

Industry advice


Destroying database consistency can have huge cost implications

slide-18
SLIDE 18

Mutation Analysis

Once a test suite has been created, its fault finding capability can be estimated with mutation analysis.

For relational database schema testing, mutants are created by making small changes to the schema

slide-19
SLIDE 19

Mutation Analysis

Once a test suite has been created, its fault finding capability can be estimated with mutation analysis.

For relational database schema testing, mutants are created by making small changes to the schema

slide-20
SLIDE 20

Mutation Analysis

Once a test suite has been created, its fault finding capability can be estimated with mutation analysis.

For relational database schema testing, mutants are created by making small changes to the schema

slide-21
SLIDE 21

Mutation Analysis

Once a test suite has been created, its fault finding capability can be estimated with mutation analysis.

For relational database schema testing, mutants are created by making small changes to the schema

slide-22
SLIDE 22

Mutation Analysis is Costly

schema mutants

slide-23
SLIDE 23

Mutation Analysis is Costly

schema mutants mutant test suite database

Mutant killed / alive

+

slide-24
SLIDE 24

Mutation Analysis is Costly

schema mutants mutant test suite database

Mutant killed / alive

+

SchemaAnalyst

slide-25
SLIDE 25

Mutation Analysis is Costly

slide-26
SLIDE 26

Mutation Analysis is Costly

DO FEWER

slide-27
SLIDE 27

Mutation Analysis is Costly

DO FEWER DO SMARTER

slide-28
SLIDE 28

Mutation Analysis is Costly

DO FEWER DO SMARTER DO FASTER

slide-29
SLIDE 29

Mutation Analysis is Costly

schema mutants mutant test suite database

Mutant killed / alive

+

SchemaAnalyst

slide-30
SLIDE 30

Mutation Analysis is Costly

schema mutants mutant test suite database

Mutant killed / alive

+

SchemaAnalyst

slide-31
SLIDE 31

Mutation Analysis is Costly

schema mutants mutant test suite database

Mutant killed / alive

+

SchemaAnalyst

H i g h c

  • s

t

  • f

c

  • m

m u n i c a t i n g w i t h t h e D B M S a n d e x e c u t i n g S Q L q u e r i e s

  • n

i t

slide-32
SLIDE 32

Reducing the Cost

schema mutants mutant test suite database

Mutant killed / alive

+

SchemaAnalyst

slide-33
SLIDE 33

Reducing the Cost

schema mutants mutant test suite database

Mutant killed / alive

+

SchemaAnalyst

L

  • c

a l , n

  • c
  • m

m u n i c a t i

  • n
  • v

e r h e a d

slide-34
SLIDE 34

Reducing the Cost

schema mutants mutant test suite database

Mutant killed / alive

+

SchemaAnalyst

model of

slide-35
SLIDE 35

Reducing the Cost

schema mutants mutant test suite database

Mutant killed / alive

+

SchemaAnalyst

model of

Virtual Mutation
 Analysis

slide-36
SLIDE 36

Reducing the Cost

schema mutants mutant test suite database

Mutant killed / alive

+

SchemaAnalyst

model of

Virtual Mutation
 Analysis

L

  • w

e r e x e c u t i

  • n
  • v

e r h e a d

slide-37
SLIDE 37

The Model

slide-38
SLIDE 38

The Model

Integrity constraint predicate icp1

slide-39
SLIDE 39

The Model

icp2 icp3

Integrity constraint predicate icp1

slide-40
SLIDE 40

The Model

icp2 icp3 icp4 icp5

Integrity constraint predicate icp1

slide-41
SLIDE 41

The Model

Form an acceptance predicate for the table:

icp2 icp3 icp4 icp5

ap = icp1 /\ icp2 /\ icp3 /\ icp4 /\ icp5 Integrity constraint predicate icp1

slide-42
SLIDE 42

The Model

Form an acceptance predicate for the table:

icp2 icp3 icp4 icp5

ap = icp1 /\ icp2 /\ icp3 /\ icp4 /\ icp5 Integrity constraint predicate icp1

True when DBMS would accept the data False otherwise

slide-43
SLIDE 43

Virtual DBMS Models

slide-44
SLIDE 44

Empirical Study

  • RQ1. What is the relative efficiency of the virtual approach?
  • RQ2. What are the time savings?
  • RQ3. How do mutation scores compare when the standard

approach is run for as long as the virtual one?

slide-45
SLIDE 45

Subject Schemas

slide-46
SLIDE 46

RQ1: Efficiency

Standard Virtual 1 1 1 1 1 1 HyperSQL PostgreSQL SQLite C

  • f

f e e O r d e r s E m p l

  • y

e e I n v e n t

  • r

y I s

  • 3

1 6 6 J W h

  • i

s S e r v e r M

  • z

i l l a P e r m i s s i

  • n

s N i s t W e a t h e r P e r s

  • n

P r

  • d

u c t s C

  • f

f e e O r d e r s E m p l

  • y

e e I n v e n t

  • r

y I s

  • 3

1 6 6 J W h

  • i

s S e r v e r M

  • z

i l l a P e r m i s s i

  • n

s N i s t W e a t h e r P e r s

  • n

P r

  • d

u c t s Database Schema Mutation Analysis Time (Log Transformed)

slide-47
SLIDE 47

RQ1: Efficiency

Standard Virtual 1 1 1 1 1 1 HyperSQL PostgreSQL SQLite C

  • f

f e e O r d e r s E m p l

  • y

e e I n v e n t

  • r

y I s

  • 3

1 6 6 J W h

  • i

s S e r v e r M

  • z

i l l a P e r m i s s i

  • n

s N i s t W e a t h e r P e r s

  • n

P r

  • d

u c t s C

  • f

f e e O r d e r s E m p l

  • y

e e I n v e n t

  • r

y I s

  • 3

1 6 6 J W h

  • i

s S e r v e r M

  • z

i l l a P e r m i s s i

  • n

s N i s t W e a t h e r P e r s

  • n

P r

  • d

u c t s Database Schema Mutation Analysis Time (Log Transformed)

Virtual Mutation Analysis is significantly more efficient for Postgres and HyperSQL, but not SQLite

slide-48
SLIDE 48

RQ2: Time Savings

−100 −50 50 100 50 100 150 Number of Mutants Percentage of Mean Time Saved HyperSQL PostgreSQL SQLite

slide-49
SLIDE 49

RQ2: Time Savings

−100 −50 50 100 50 100 150 Number of Mutants Percentage of Mean Time Saved HyperSQL PostgreSQL SQLite

Virtual Mutation Analysis yields large time savings for Postgres and HyperSQL but not always with SQLite, leading to an average time saving of 51% overall

slide-50
SLIDE 50

HyperSQL PostgreSQL SQLite 5 1 1 5 C

  • f

f e e O r d e r s E m p l

  • y

e e I n v e n t

  • r

y I s

  • 3

1 6 6 J W h

  • i

s S e r v e r M

  • z

i l l a P e r m i s s i

  • n

s N i s t W e a t h e r P e r s

  • n

P r

  • d

u c t s C

  • f

f e e O r d e r s E m p l

  • y

e e I n v e n t

  • r

y I s

  • 3

1 6 6 J W h

  • i

s S e r v e r M

  • z

i l l a P e r m i s s i

  • n

s N i s t W e a t h e r P e r s

  • n

P r

  • d

u c t s C

  • f

f e e O r d e r s E m p l

  • y

e e I n v e n t

  • r

y I s

  • 3

1 6 6 J W h

  • i

s S e r v e r M

  • z

i l l a P e r m i s s i

  • n

s N i s t W e a t h e r P e r s

  • n

P r

  • d

u c t s Database Schema Total Number of Mutants Selective Virtual

RQ3: Comparison

slide-51
SLIDE 51

HyperSQL PostgreSQL SQLite 5 1 1 5 C

  • f

f e e O r d e r s E m p l

  • y

e e I n v e n t

  • r

y I s

  • 3

1 6 6 J W h

  • i

s S e r v e r M

  • z

i l l a P e r m i s s i

  • n

s N i s t W e a t h e r P e r s

  • n

P r

  • d

u c t s C

  • f

f e e O r d e r s E m p l

  • y

e e I n v e n t

  • r

y I s

  • 3

1 6 6 J W h

  • i

s S e r v e r M

  • z

i l l a P e r m i s s i

  • n

s N i s t W e a t h e r P e r s

  • n

P r

  • d

u c t s C

  • f

f e e O r d e r s E m p l

  • y

e e I n v e n t

  • r

y I s

  • 3

1 6 6 J W h

  • i

s S e r v e r M

  • z

i l l a P e r m i s s i

  • n

s N i s t W e a t h e r P e r s

  • n

P r

  • d

u c t s Database Schema Total Number of Mutants Selective Virtual

RQ3: Comparison

Virtual Mutation Analysis evaluates more mutants

slide-52
SLIDE 52

RQ3: Comparison

Selective Virtual 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 HyperSQL PostgreSQL SQLite CoffeeOrders Employee Inventory Iso3166 JWhoisServer MozillaPermissions NistWeather Person Products CoffeeOrders Employee Inventory Iso3166 JWhoisServer MozillaPermissions NistWeather Person Products Database Schema Mutation Score

slide-53
SLIDE 53

RQ3: Comparison

Selective Virtual 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 HyperSQL PostgreSQL SQLite CoffeeOrders Employee Inventory Iso3166 JWhoisServer MozillaPermissions NistWeather Person Products CoffeeOrders Employee Inventory Iso3166 JWhoisServer MozillaPermissions NistWeather Person Products Database Schema Mutation Score

Virtual Mutation Analysis is the best option when highly accurate scores are needed under a time constraint

slide-54
SLIDE 54

Conclusions

Virtual Mutation Analysis Technique:

Removes the need to use a real DBMS for relational database schema mutation testing More cost-effective while still being accurate:

  • More efficient for 22 of 27 configurations studied
  • Yields time savings of 13 to 99%