What Is Test Driven Development? - PowerPoint PPT Presentation


A Comparative Case Study on the Impact of Test-Driven Development on Program Design and Test Coverage. Maria Siniaalto and Pekka Abrahamsson, ESEM, 2007. Irena Pletikosa Cvijikj, Software Engineering Seminar.


SLIDE 1

A COMPARATIVE CASE STUDY ON THE IMPACT OF TEST-DRIVEN DEVELOPMENT ON PROGRAM DESIGN AND TEST COVERAGE

MARIA SINIAALTO AND PEKKA ABRAHAMSSON

ESEM, 2007

Irena Pletikosa Cvijikj
Software Engineering Seminar
ETH Zurich, 30.03.2010

SLIDE 2

CONTENT

  • Introduction
  • Empirical Body of Evidence
  • Empirical Results from a Comparative Case Study
  • Conclusions

SLIDE 3

INTRODUCTION

“...TDD is one of the most fundamental practices enabling the development of software in an agile and iterative manner…”

SLIDE 4

WHAT IS TEST DRIVEN DEVELOPMENT?

TDD is a software development technique based on the repetition of a short development cycle:

  • Write a test
  • Write the code to pass the test
  • Refactor the code for better quality.

[Figure: flowchart of the TDD cycle with the nodes "Write a test", "Pass?", "Write some code", "Pass?" and "Refactor", connected by Yes/No branches]
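The cycle on this slide can be sketched in a few lines of Python. This is only an illustration, not material from the paper: the `is_leap_year` function, the `LeapYearTest` class and the `run_cycle_tests` helper are hypothetical names chosen for the example. The tests are written first, then just enough code is added to make them pass, and the final one-line form is what remains after refactoring.

```python
import unittest


# Step 1 (red): the tests come first and pin down the wanted behavior.
# They fail until is_leap_year exists and behaves as described.
class LeapYearTest(unittest.TestCase):
    def test_year_divisible_by_four_is_leap(self):
        self.assertTrue(is_leap_year(2024))

    def test_century_is_not_leap(self):
        self.assertFalse(is_leap_year(1900))

    def test_year_divisible_by_400_is_leap(self):
        self.assertTrue(is_leap_year(2000))


# Step 2 (green): write just enough code to pass the tests.
# Step 3 (refactor): a first version with nested if statements was
# condensed into this single expression while the tests stayed green.
def is_leap_year(year: int) -> bool:
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def run_cycle_tests() -> unittest.TestResult:
    """Run the test case and return the collected result (hypothetical helper)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LeapYearTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Calling `run_cycle_tests().wasSuccessful()` after each small change gives the "Pass?" decision of the cycle.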

SLIDE 5

WHY TEST DRIVEN DEVELOPMENT?

  • Improves test coverage.
  • Leads to modularized, flexible, extensible code.
  • Enhances developers' job satisfaction/confidence.
  • Enables simultaneous work on same code.
  • Limits number of defects.
  • Increases productivity.

SLIDE 6

MOTIVATION FOR THIS PAPER

Confirm existing claims by:

  • Organizing existing body of evidence
  • Conducting comparative case study

H1: TDD leads to increased test coverage

H2: TDD improves design quality

SLIDE 7

EMPIRICAL BODY OF EVIDENCE

“…there are a few studies addressing the impact of TDD on program design and are currently very scarce…”

SLIDE 8

CLASSIFICATION OF TDD STUDIES

Type            Subjects                Context             #Studies
Industry        Industrial developers   Real project        4
Semi-industry   Industrial developers   Experimental task   4
Semi-industry   Student developers      Real project        1
Academic        Student developers      Experimental task   7
Total                                                       16

SLIDE 9

SUMMARY OF RESULTS

Positive Results

  • Increased code quality
  • High test coverage
  • Increased productivity
  • Good acceptance
  • Reduced late integration problems

Negative Results

  • Reduced code quality
  • Low test coverage
  • Reduced productivity
  • Strong reluctance to adopt
  • Greater incidence of failures at the acceptance level

SLIDE 10

CONCLUSION?

Code Quality, Test coverage, Productivity, Acceptance

No meaningful conclusion!

SLIDE 11

EMPIRICAL RESULTS FROM A COMPARATIVE CASE STUDY

“…the main goal of this comparative empirical evaluation of TDD is to explore the impact of TDD on program design and test coverage…”

SLIDE 12

STUDY DESIGN

  • Controlled case study
  • 3 projects - real products for real customer
  • Project duration: 9 weeks (not simultaneous)
  • Participants: senior undergraduate students
  • Programming language: Java
  • Agile development method: Mobile-D

[Ihme, T. and Abrahamsson, P. 2004]

SLIDE 13

CASE PROJECTS SUMMARY

                     Project 1                  Project 2              Project 3
# of developers      4*                         5*                     4**
Dev. technique       Test-last                  Test-last              TDD
Application type     Web                        Mobile                 Web
Product concept      Research data management   Stock market browser   PM tool
Product size (LOC)   7700                       7000                   5800

* All participants had some industrial experience
** Only one participant worked in industry before

SLIDE 14

RESULTS

  • Coupling Between Object Classes
  • Lack of Cohesion in Methods
  • Test Coverage

SLIDE 15

OO METRICS DEFINITIONS

Coupling Between Objects is…

  • …the count of the classes to which this class is coupled. Two classes are coupled when methods declared in one class use methods or instance variables of the other class.

Lack of Cohesion in Methods is…

  • …the number of different methods within a class that reference a given instance variable.

[Chidamber and Kemerer 1994, Henderson-Sellers 1996]

SLIDE 16

CBO RESULTS

SLIDE 17

LCOM RESULTS

SLIDE 18

TEST COVERAGE

SLIDE 19

SUMMARY OF RESULTS

  • WMC, DIT, NOC, RFC: no significant difference
  • CBO: no conclusion can be made
  • LCOM: experience in TDD usage is required
  • Test coverage: significant increase
  • Productivity: no effect

SLIDE 20

THREATS TO VALIDITY

Differences in:

  • programming experience,
  • level of complexity for projects,
  • size of the project, and
  • distribution of the work.

Non-TDD developers were encouraged to write tests, but…

  • …not informed about the test coverage measurement.

Use of students as study subjects.

SLIDE 21

CONCLUSIONS

“…the results presented in this paper are important as they contribute to the gradual build up of empirical evidence on software engineering innovations…”

SLIDE 22

CONCLUSIONS

  • Claims that TDD leads to more loosely coupled objects cannot be confirmed.
  • TDD does not automatically result in highly cohesive code.
  • TDD leads to significantly increased test coverage.

SLIDE 23

APPENDIX

SLIDE 24

CRITICISMS FOR TDD

  • Does not compensate for the lack of up-front design.
  • Not suitable for security and multithreaded applications.
  • Rapid changes cause expensive breakage in tests.
  • Lack of skills produces inadequate test coverage.
  • Tests become part of the maintenance overhead.
  • May bring a false sense of security.

SLIDE 25

EXISTING BODY OF KNOWLEDGE

Details on Previous Studies

SLIDE 26

TDD IN INDUSTRIAL SETTINGS

Reference                                     # of subjects                  Study focus/Comparison
Bhat and Nagappan                             2 projects, 11-14 developers   Non-TDD
Lui and Chan                                  2 teams                        Traditional test-last
Williams et al. and Maximilien and Williams   9 developers                   Ad-hoc unit testing
Damm et al.                                   2 projects                     Specialized testing tool

SLIDE 27

RESULTS IN INDUSTRIAL SETTINGS

Positive Results

  • Increased code quality
  • Increased test coverage
  • Tests used as auto documentation
  • Improved task estimation
  • Improved process tracking
  • Defect rate reduction
  • Decreased defect repair time
  • Decreased project lead time
  • Daily integration reduces late integration problems

Negative Results

  • Increased development time/reduced productivity

SLIDE 28

TDD IN SEMI-INDUSTRIAL SETTINGS

Reference             # of subjects                       Study focus/Comparison
Canfora et al.        28 developers                       Traditional test-last
Müller                5 TDD and 3 conventional projects   Traditional projects
George and Williams   24 developers                       Traditional test-last
Geras et al.          14 developers                       Traditional test-last
Abrahamsson et al.    4 developers                        Exploratory data

SLIDE 29

RESULTS IN SEMI-INDUSTRIAL SETTINGS

Positive Results

  • Better performance predictability
  • Improved code quality
  • Increased test coverage
  • Increased number of tests
  • More frequent test running

Negative Results

  • Increased development time/reduced productivity
  • Greater incidence of failures at the acceptance level
  • Strong reluctance to adopt TDD
  • Developers didn’t see the benefits

SLIDE 30

TDD IN ACADEMIC SETTINGS

Reference             # of subjects   Study focus/Comparison
Janzen and Saiedian   3 teams         Iterative test-last, no tests
Kaufmann and Janzen   8 developers    Test-last
Müller and Hagner     19 developers   Traditional test-last
Pancur et al.         38 developers   Iterative test-last
Erdogmus et al.       24 developers   Iterative test-last
Steinberg             /               Exploratory data
Edwards               59 developers   Automated tool, no tests

SLIDE 31

RESULTS IN ACADEMIC SETTINGS

Positive Results

  • Increased productivity
  • TDD accepted after trying
  • Improved code quality
  • Increased confidence
  • Better program/requirements understanding
  • Fast and correct use of methods
  • Reduced debugging and refactoring effort
  • Fewer faults and easier correction
  • Increased cohesion / looser coupling

Negative Results

  • Low test coverage
  • Concerns regarding complexity and coupling
  • Lower final reliability on acceptance test
  • Reduced code quality
  • Adopting TDD was difficult and it was found to be not very effective

SLIDE 32

OBJECT-ORIENTED METRICS

SLIDE 33

OBJECT-ORIENTED METRICS

  • WMC – Weighted Methods per Class
  • DIT – Depth of Inheritance Tree
  • NOC – Number Of Children
  • CBO – Coupling Between Objects
  • RFC – Response for a Class
  • LCOM – Lack of Cohesion in Methods

SLIDE 34

WEIGHTED METHODS PER CLASS

  • WMC is defined as the sum of the complexities of all methods of a class.
  • The number of methods and the complexity of methods involved is a predictor of how much time and effort is required to develop and maintain the class.
  • The larger the number of methods in a class, the greater the potential impact on children, since children will inherit all the methods defined in the class.
  • Classes with large numbers of methods are likely to be more application specific, limiting the possibility of reuse.

[Chidamber and Kemerer 1994]
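As a rough illustration of the WMC definition, the sketch below sums a per-method complexity over the methods of one class. It rests on assumptions that are not in the slides: Python source is analysed instead of Java, and the complexity of a method is approximated as 1 plus the number of branching constructs it contains. The names `method_complexity`, `wmc` and the `Account` example are hypothetical.

```python
import ast

# Constructs counted as one extra decision point each (a simplification).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def method_complexity(func: ast.FunctionDef) -> int:
    """Approximate complexity: 1 plus the number of branching nodes."""
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(func))


def wmc(source: str, class_name: str) -> int:
    """Sum the complexities of all methods defined directly in the class."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return sum(
                method_complexity(item)
                for item in node.body
                if isinstance(item, ast.FunctionDef)
            )
    raise ValueError(f"class {class_name!r} not found")


EXAMPLE = '''
class Account:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

    def apply_fees(self, fees):
        for fee in fees:
            if fee > 0:
                self.balance -= fee
'''

# __init__ contributes 1, withdraw 2, apply_fees 3.
print(wmc(EXAMPLE, "Account"))  # prints 6
```

If every method is given complexity 1 instead, WMC reduces to a plain method count.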

SLIDE 35

DEPTH OF INHERITANCE TREE

  • DIT is defined as the maximum length from the node to the root of the tree.
  • The deeper a class is in the hierarchy, the greater the number of methods it is likely to inherit, making it more complex to predict its behavior.
  • Deeper trees constitute greater design complexity, since more methods and classes are involved.
  • The deeper a particular class is in the hierarchy, the greater the potential reuse of inherited methods.

[Chidamber and Kemerer 1994]
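The DIT definition can be tried out on live Python classes. The sketch below is an illustration under two assumptions: Python's built-in `object` is treated as the root of the tree with depth 0, and with multiple inheritance the longest path is taken, in line with "maximum length". The function name `dit` and the example classes are hypothetical.

```python
def dit(cls: type) -> int:
    """Longest inheritance path from cls up to the root class `object`."""
    if cls is object:
        return 0
    return 1 + max(dit(base) for base in cls.__bases__)


class Shape: ...
class Polygon(Shape): ...
class Square(Polygon): ...
class Loggable: ...
class LoggedSquare(Square, Loggable): ...  # two parents of different depth


print(dit(Shape))         # prints 1
print(dit(Square))        # prints 3
print(dit(LoggedSquare))  # prints 4, via the longer Square branch
```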

SLIDE 36

NUMBER OF CHILDREN

  • NOC is defined as the number of immediate subclasses.
  • The greater the number of children, the greater the reuse, since inheritance is a form of reuse.
  • The greater the number of children, the greater the likelihood of improper abstraction of the parent class. If a class has a large number of children, it may be a case of misuse of subclassing.
  • The number of children gives an idea of the potential influence a class has on the design. If a class has a large number of children, it may require more testing of the methods in that class.

[Chidamber and Kemerer 1994]
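For classes that are loaded in a running Python program, NOC can be read directly from the interpreter, because `type.__subclasses__()` returns the immediate subclasses only. The sketch is an illustration with hypothetical class names; it counts classes that are currently defined and still referenced, which is an assumption about how the program is loaded rather than a static analysis of source code.

```python
def noc(cls: type) -> int:
    """Number of immediate subclasses of cls that are currently defined."""
    return len(cls.__subclasses__())


class Vehicle: ...
class Car(Vehicle): ...
class Truck(Vehicle): ...
class SportsCar(Car): ...  # a grandchild of Vehicle, not counted for Vehicle


print(noc(Vehicle))    # prints 2
print(noc(Car))        # prints 1
print(noc(SportsCar))  # prints 0
```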

SLIDE 37

COUPLING BETWEEN OBJECTS

  • CBO is defined as the count of the classes to which this class is coupled. Two classes are coupled when methods declared in one class use methods or instance variables of the other class.
  • Excessive coupling between object classes is detrimental to modular design and prevents reuse. The more independent a class is, the easier it is to reuse it in another application.
  • In order to improve modularity and promote encapsulation, inter-object class couples should be kept to a minimum. The larger the number of couples, the higher the sensitivity to changes in other parts of the design, and therefore maintenance is more difficult.
  • A measure of coupling is useful to determine how complex the testing of various parts of a design are likely to be. The higher the inter-object class coupling, the more rigorous the testing needs to be.

[Chidamber and Kemerer 1994]
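A minimal sketch of the CBO count follows. It assumes the "uses" relation has already been extracted by some other tool and is passed in as a mapping from each class name to the set of classes whose methods or instance variables it uses. Because the definition says two classes are coupled when either one uses the other, the count includes both outgoing and incoming uses. The function `cbo` and the class names in `USES` are hypothetical.

```python
from typing import Dict, Set


def cbo(uses: Dict[str, Set[str]], cls: str) -> int:
    """Count the distinct classes coupled to cls in either direction."""
    outgoing = uses.get(cls, set())
    incoming = {other for other, targets in uses.items() if cls in targets}
    # A class is never counted as coupled to itself.
    return len((outgoing | incoming) - {cls})


# Hypothetical dependency map: class -> classes whose members it uses.
USES = {
    "OrderService": {"Order", "PaymentGateway", "Mailer"},
    "Order": {"Customer"},
    "PaymentGateway": set(),
    "Mailer": {"Customer"},
    "Customer": set(),
}

print(cbo(USES, "OrderService"))  # prints 3
print(cbo(USES, "Customer"))      # prints 2 (used by Order and Mailer)
```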

SLIDE 38

RESPONSE FOR A CLASS

  • RFC is defined as number of methods in the set of all methods that can be invoked in response to a message sent to an object of a class.
  • If a large number of methods can be invoked in response to a message, the testing and debugging of the class becomes more complicated since it requires a greater level of understanding on the part of the tester.
  • The larger the number of methods that can be invoked from a class, the greater the complexity of the class.
  • A worst case value for possible responses will assist in appropriate allocation of testing time.

[Chidamber and Kemerer 1994]
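The response set can be approximated as the methods of the class plus every method or function those methods call directly. The sketch below is a simplification and not the tool used in the study: it analyses Python source and identifies callees by name only, so two different methods with the same name are counted once. The function `rfc` and the `Report` example are hypothetical.

```python
import ast


def rfc(source: str, class_name: str) -> int:
    """Size of the set: own methods plus names called from the class body."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            own = {
                item.name for item in node.body
                if isinstance(item, ast.FunctionDef)
            }
            called = set()
            for sub in ast.walk(node):
                if isinstance(sub, ast.Call):
                    if isinstance(sub.func, ast.Attribute):
                        called.add(sub.func.attr)  # e.g. printer.write(...)
                    elif isinstance(sub.func, ast.Name):
                        called.add(sub.func.id)    # e.g. sum(...)
            return len(own | called)
    raise ValueError(f"class {class_name!r} not found")


EXAMPLE = '''
class Report:
    def __init__(self, rows):
        self.rows = rows

    def total(self):
        return sum(self.rows)

    def render(self, printer):
        printer.write(format(self.total(), ".2f"))
'''

# Own methods: __init__, total, render. Additional callees: sum, write, format.
print(rfc(EXAMPLE, "Report"))  # prints 6
```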

SLIDE 39

LACK OF COHESION IN METHODS (1)

  • LCOM is defined as the number of different methods within a class that reference a given instance variable.
  • Cohesiveness of methods within a class is desirable, since it promotes encapsulation.
  • Lack of cohesion implies classes should probably be split into two or more subclasses.
  • Any measure of disparateness of methods helps identify flaws in the design of classes.
  • Low cohesion increases complexity, thereby increasing the likelihood of errors during the development process.

[Chidamber and Kemerer 1994]

SLIDE 40

LACK OF COHESION IN METHODS (2)

  • LCOM is defined as the number of different methods within a class that reference a given instance variable.
  • Henderson-Sellers algorithm produces answers in the range 0 to 1, with the value zero representing perfect cohesion and the value one representing extreme lack of cohesion.
  • LCOM = (m - sum(mA)/a) / (m-1)
  • m: number of methods in a class.
  • a: number of attributes in a class.
  • mA: number of methods that access attribute A.
  • sum(mA): sum of all mA over all the attributes in the class.

[Henderson-Sellers 1996]
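The Henderson-Sellers formula can be checked with a small calculation. The sketch below implements LCOM = (m - sum(mA)/a) / (m-1) as given on the slide. The input format, a mapping from each attribute to the set of methods that access it, is an assumption made for the example, and `lcom_hs` is a hypothetical name.

```python
from typing import Dict, Set


def lcom_hs(attribute_access: Dict[str, Set[str]], method_count: int) -> float:
    """LCOM = (m - sum(mA)/a) / (m - 1), with m methods and a attributes."""
    a = len(attribute_access)
    m = method_count
    if a == 0 or m < 2:
        raise ValueError("the formula needs at least 1 attribute and 2 methods")
    sum_ma = sum(len(methods) for methods in attribute_access.values())
    return (m - sum_ma / a) / (m - 1)


# Every method touches every attribute: perfect cohesion.
print(lcom_hs({"x": {"f", "g", "h"}, "y": {"f", "g", "h"}}, 3))  # prints 0.0

# Each attribute is touched by exactly one method: extreme lack of cohesion.
print(lcom_hs({"x": {"f"}, "y": {"g"}, "z": {"h"}}, 3))  # prints 1.0

# In between: (3 - 3/2) / 2.
print(lcom_hs({"x": {"f", "g"}, "y": {"h"}}, 3))  # prints 0.75
```

The two boundary cases match the range stated on the slide, with zero for perfect cohesion and one for extreme lack of cohesion.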