Sub-clones: Considering the Part Rather than the Whole

SLIDE 1

Sub-clones: Considering the Part Rather than the Whole

Robert Tairas

Department of Computer and Information Sciences University of Alabama at Birmingham

Jeff Gray

Department of Computer Science University of Alabama

This research is supported by NSF grant CPA-0702764.

Software Composition and Modeling Lab, University of Alabama at Birmingham
University of Alabama

SLIDE 2

Cloning in Software

• Code Clones:
  - A section of code that is duplicated in multiple locations in a program
• Different granularity levels:
  - Statements, Block, Method, Class, Program
• Clone Group:
  - Clones of the same duplication

[Figure labels: Source Code; Cloned Code]

SLIDE 3

Maintaining Clones

[Figure labels: After a period of time; A new programmer]

SLIDE 4

Removing Clones through Refactoring

• Modularizing the code represented by clones through appropriate abstractions may improve code quality
  - Less duplicated code to maintain
  - Ease of future maintenance efforts
• Refactoring is one means of improving the quality of code
  - The goal of refactoring is to preserve the external behavior of code while improving its internal structure

[Figure labels: Clone 1; Clone 2; Modularized Clone]

SLIDE 5

Refactoring Clones

Extract-Method Refactoring

Before refactoring:

    public class A {
      public void method() {
        {cloned statements}
        {cloned statements}
        {cloned statements}
        ...
        {cloned statements}
        {cloned statements}
        {cloned statements}
      }
    }

After refactoring:

    public class A {
      public void method() {
        newMethod();
        ...
        newMethod();
      }
      public void newMethod() {
        {cloned statements}
        {cloned statements}
        {cloned statements}
      }
    }
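As a concrete illustration of the schematic on this slide, the following minimal Java sketch shows two cloned statement sequences replaced by calls to one extracted method. All names (`ReportBefore`, `ReportAfter`, `normalize`, `ExtractMethodDemo`) are invented for this example and do not come from the systems studied in the talk.

```java
// Hypothetical example: class and method names are illustrative only.
class ReportBefore {
    static String format(String first, String last) {
        // Clone 1: trim and capitalize the first name
        String f = first.trim();
        f = f.isEmpty() ? f : Character.toUpperCase(f.charAt(0)) + f.substring(1);
        // Clone 2: the same statements applied to the last name
        String l = last.trim();
        l = l.isEmpty() ? l : Character.toUpperCase(l.charAt(0)) + l.substring(1);
        return f + " " + l;
    }
}

class ReportAfter {
    static String format(String first, String last) {
        // Both clones are replaced by calls to the extracted method
        return normalize(first) + " " + normalize(last);
    }

    // Extracted method holding the formerly cloned statements
    static String normalize(String s) {
        String t = s.trim();
        return t.isEmpty() ? t : Character.toUpperCase(t.charAt(0)) + t.substring(1);
    }
}

class ExtractMethodDemo {
    public static void main(String[] args) {
        // External behavior is preserved: both versions produce the same string
        System.out.println(ReportBefore.format(" ada ", "lovelace"));
        System.out.println(ReportAfter.format(" ada ", "lovelace"));
    }
}
```

Both versions return the same result for the same input, which is the behavior-preservation property that refactoring is meant to guarantee.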

SLIDE 6

Clone Refactoring Process

[Figure labels: Manually Detect Clones; Determine Clones For Refactoring; Refactoring Clones]

• Changes between two versions
  - First version contains original code
  - Second version contains refactored code

SLIDE 7

Clone Refactoring Process

[Figure labels: Automated Clone Detection Tool; Manually Detect Clones; Determine Clones For Refactoring; Refactoring Clones]

• What are the refactoring characteristics of clones detected by a clone detection tool, if such a tool were used in the clone maintenance process?

SLIDE 8

Approach: Observing Refactorings

• Observing actual clone-related refactorings in multiple release versions of JBoss (v2.2.0–4.2.3)
  - Used Simian clone detection tool

[Figure labels: Source Code (Version 1); Clone Detection; Code Clones; Source Code (Version 2); Compare]

    @@ -2471,13 +2469,7 @@
         scan_position.current_slot = Page.INVALID_SLOT_NUMBER;
         // release the scan lock now that we have saved away the row.
    -    if (scan_position.current_scan_pageno != 0)
    -    {
    -        this.getLockingPolicy().unlockScan(
    -            scan_position.current_scan_pageno);
    -        scan_position.current_scan_pageno = 0;
    -    }
    +    unlockCurrentScan(scan_position);
         }
       }
     }

[Figure labels: Clone c; diff region; Version 1; Version 2]
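A minimal sketch of how such a comparison could be automated, assuming clone ranges are given as inclusive line numbers in Version 1: parse the hunk header of a unified diff and test whether the clone's lines intersect the changed region. The names `HunkOverlap`, `oldRange`, and `overlaps`, and the clone line numbers used in `main`, are hypothetical; the slides do not describe the tooling actually used for the comparison.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the authors' tooling.
class HunkOverlap {
    // Matches a unified-diff hunk header such as "@@ -2471,13 +2469,7 @@"
    private static final Pattern HUNK =
        Pattern.compile("^@@ -(\\d+)(?:,(\\d+))? \\+(\\d+)(?:,(\\d+))? @@");

    /** Returns {firstLine, lastLine} of the hunk in the old (Version 1) file. */
    static int[] oldRange(String header) {
        Matcher m = HUNK.matcher(header);
        if (!m.find()) {
            throw new IllegalArgumentException("not a hunk header: " + header);
        }
        int start = Integer.parseInt(m.group(1));
        // A missing count means the hunk covers exactly one line
        int count = m.group(2) == null ? 1 : Integer.parseInt(m.group(2));
        return new int[] { start, start + count - 1 };
    }

    /** True if the inclusive clone range shares at least one line with the hunk. */
    static boolean overlaps(int cloneStart, int cloneEnd, int[] hunk) {
        boolean hunkNonEmpty = hunk[1] >= hunk[0];
        return hunkNonEmpty && cloneStart <= hunk[1] && hunk[0] <= cloneEnd;
    }

    public static void main(String[] args) {
        int[] hunk = oldRange("@@ -2471,13 +2469,7 @@");
        // Assumed clone location: lines 2473-2478 of Version 1
        System.out.println(overlaps(2473, 2478, hunk));
    }
}
```

Because a hunk also contains unchanged context lines, an overlap found this way is only a coarse filter; each candidate still has to be inspected to confirm that the change is a clone-related refactoring.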

SLIDE 9

Refactoring of Simian Clones

• Observations
  - 21 Extract Method-type refactorings
  - Range of refactored code not equal to the range reported as a clone

Type                                  Total
Extract Method                        14
Extract Method with Pull-up Method    1
Extract Method to utility class       6
Total                                 21

SLIDE 10

Observing with Other Tools

• Consider clones reported by other tools
  - CCFinder, CloneDR, Deckard, and SimScan
• Run these tools on source files associated with the 21 Extract Method-type refactorings from Simian clones

[Figure: Source Code (Version 1) → Simian → Code Clones → Source Files; CCFinder, CloneDR, Deckard, and SimScan each produce Code Clones that are compared with Source Code (Version 2)]

SLIDE 11

Evaluation: Tool Coverage

• Coverage of 21 Extract Method-type refactorings in JBoss
  - Initially detected by using Simian clones
• For every tool, fewer than half of the 21 refactorings were exactly covered by a reported clone

Tool          Exact Coverage   Larger Coverage
1. CCFinder   4 (19%)          8 (38%)
2. CloneDR    6 (28%)          9 (42%)
3. Deckard    8 (38%)          3 (14%)
4. Simian     2 (9%)           0 (0%)
5. SimScan    6 (28%)          12 (57%)
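One way to read the two coverage columns is as a comparison of line ranges. The sketch below assumes that "exact" means the reported clone range equals the refactored range and "larger" means it strictly contains it; the slides do not give the precise criteria, and the names `CoverageTally`, `classify`, and `percent` are hypothetical. The listed percentages are consistent with truncating each count divided by 21 (for example, 6/21 ≈ 28.57% is listed as 28%), so `percent` uses integer division.

```java
// Hypothetical helper: the exact/larger definitions are assumptions.
class CoverageTally {
    enum Coverage { EXACT, LARGER, OTHER }

    /** Classifies a reported clone range against a refactored range (inclusive line numbers). */
    static Coverage classify(int cloneStart, int cloneEnd, int refStart, int refEnd) {
        if (cloneStart == refStart && cloneEnd == refEnd) {
            return Coverage.EXACT;
        }
        if (cloneStart <= refStart && cloneEnd >= refEnd) {
            return Coverage.LARGER; // clone contains the refactored code plus extra lines
        }
        return Coverage.OTHER;      // clone misses part of the refactored code
    }

    /** Percentage of count out of total, truncated toward zero. */
    static int percent(int count, int total) {
        return count * 100 / total;
    }

    public static void main(String[] args) {
        // Assumed line numbers, for illustration only
        System.out.println(classify(8, 22, 10, 20));
        System.out.println(percent(8, 21) + "%");
    }
}
```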

SLIDE 12

Refactoring in Clone Ranges

(Leading numbers presumably identify the tools that reported the line as part of a clone, numbered as in the Tool Coverage table.)

    1 2 4 5        protected String getValue(String name, String value) {
    1 2 4 5          if (value.startsWith("${") && value.endsWith("}")) {
    1 2 3 4 5  -       try {
    1 2 3 4 5  -         String propertyName = value.substring(2, value.length() - 1);
    1 2 3 4 5  -         ObjectName propertyServiceON = new ObjectName("...");
    1 2 3 4 5  -         KernelAbstraction kernelAbstraction = KernelAbstractionFactory.getInstance();
    1 2 3 4 5  -         String propertyValue = (String) kernelAbstraction.invoke(...);
    1 2 3 5    -         log.debug("Replaced ejb-jar.xml element " + name + " with value " + propertyValue);
    1 2 3 5    -         return propertyValue;
    1 2 3 5    -       } catch (Exception e) {
    1 2 3 5    -         log.warn("Unable to look up property service for ejb-jar.xml element " + ...);
    1 2 3 5    -       }
               +       String replacement = StringPropertyReplacer.replaceProperties(value);
               +       if (replacement != null)
               +         value = replacement;
    1 2 5            }
    1 2 5            return value;
    1 2 5          }

      if (edge instanceof MTransition) {
        MTransition tr = (MTransition) edge;
    -   FigTrans trFig = new FigTrans(tr);
    -   // set source and dest
    -   // set any arrowheads, labels, or colors
    -   MStateVertex sourceSV = tr.getSource();
    -   MStateVertex destSV = tr.getTarget();
    -   FigNode sourceFN = (FigNode) lay...
    -   FigNode destFN = (FigNode) lay...
    -   trFig.setSourcePortFig(sourceFN);
    -   trFig.setSourceFigNode(sourceFN);
    -   trFig.setDestPortFig(destFN);
    -   trFig.setDestFigNode(destFN);
    +   FigTrans trFig = new FigTrans(tr, lay);
        return trFig;
      }

• Refactoring performed on only part of the reported clone range
  - Sub-clone refactoring

SLIDE 13

Evaluation: Focus on Deckard

• Deckard selected due to tree-based tool performance
  - JBoss re-evaluated
  - Additional artifacts: ArgoUML (v0.10.1–0.26) and Apache Derby (v10.1.1.0–10.5.3.0)

Property                 JBoss   ArgoUML   Derby
Refactoring Coverage
  Exact coverage         19      17        12
  Sub-clone coverage     14      9         15
Coverage Levels
  Same level             4       4         6
  1 level above          9       2         8
  > 1 level above        1       3         1
Clone Differences
  Refactorable           7       4         8
  Not Refactorable       7       5         7

SLIDE 14

Evaluation: Focus on Deckard

• Reported clone range was mainly at the same syntactic level as the actual refactored code, or one level above it
  - Possibly to keep some logic in the original location
• Programmers refactored only a sub-clone even when the entire clone was refactorable
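The notion of a clone sitting some number of syntactic levels above the refactored code can be sketched with a small tree. This is a hypothetical illustration, assuming each range is identified by the smallest syntax-tree node that encloses it and that a level is one parent hop; the slides do not state how levels were measured, and `SyntaxNode`, `levelsAbove`, and `category` are invented names. The example tree (a method containing an `if` containing a `try`) only loosely mirrors the `getValue` listing.

```java
// Hypothetical model of syntactic levels; not Deckard's representation.
class SyntaxNode {
    final String kind;
    final SyntaxNode parent;

    SyntaxNode(String kind, SyntaxNode parent) {
        this.kind = kind;
        this.parent = parent;
    }

    /** Parent hops from inner up to outer, or -1 if outer does not enclose inner. */
    static int levelsAbove(SyntaxNode outer, SyntaxNode inner) {
        int hops = 0;
        for (SyntaxNode n = inner; n != null; n = n.parent, hops++) {
            if (n == outer) {
                return hops;
            }
        }
        return -1;
    }

    /** Maps a hop count to the row labels used in the Coverage Levels table. */
    static String category(int levels) {
        if (levels < 0) return "not enclosing";
        if (levels == 0) return "Same level";
        return levels == 1 ? "1 level above" : "> 1 level above";
    }

    public static void main(String[] args) {
        SyntaxNode method = new SyntaxNode("method", null);
        SyntaxNode ifStmt = new SyntaxNode("if", method);
        SyntaxNode tryStmt = new SyntaxNode("try", ifStmt);
        // Clone reported at the if statement, refactoring applied to the try statement
        System.out.println(category(levelsAbove(ifStmt, tryStmt)));
    }
}
```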


SLIDE 15

Conclusion

• We observed the actual refactoring of clones by evaluating source code changes between multiple versions
• In various instances only part of the reported clone (i.e., sub-clone) was refactored
• We conclude that sub-clone refactoring should be included in the clone maintenance process
• Future Work
  - Individual evaluation of other clone detection tools
  - Provide support for sub-clone refactoring in an IDE

SLIDE 16

CeDAR plug-in


SLIDE 17

Sub-clones in CeDAR


SLIDE 18

Thank you

• Personal:
  - http://www.cis.uab.edu/tairasr
• Code Clones Literature:
  - http://www.cis.uab.edu/tairasr/clones/literature
• SoftCom Laboratory:
  - http://www.cis.uab.edu/softcom