Aiko Yamashita
CWI, Netherlands Oslo and Akershus University College, Norway 2016-12-01
Studies on Developers, Refactoring and Code Smells
A SHORT INTRO ABOUT ME
Oslo, Norway; Gothenburg, Sweden; Amsterdam, Netherlands; Sapporo, Japan
Sapporo, Japan San Francisco, USA mom dad Oslo, Norway San Jose, Costa Rica Gothenburg, Sweden Berkeley, USA
Amsterdam, Netherlands
Something happens here...
Maybe the absence/presence
Researcher Maintenance Project
TAXONOMY FOR INCREMENTAL CHANGE
The wider environment The containing system The system (operators and processes) The software product Code
User knowledge, programmer effectiveness, product quality, programmer time availability, machine requirements, and system reliability
Maintenance management, personnel factors, and system characteristics.
Maintenance tasks
Software Process Improvement scenario
Configuration Management
Alexander, Ian F. "A taxonomy of stakeholders: Human roles in system development." International Journal of Technology and Human Interaction (IJTHI) 1.1 (2005): 23-59.
inspired by
Program comprehension difficulties
Risk factors/programming difficulties
METHODOLOGY
2) Confusion and erroneous hypothesis generation
3) Slow acquisition of overview/general understanding of the system
4) Time-consuming information quests
5) Time-consuming changes
Class A Class B Change Class C Change Change
ICSME 2013
Concept Extraction, Concept Location, Impact Analysis, Actualization, Incorporation, Change Propagation
Refactoring
More concise, accurate categorizations
Use the model of the ‘Incremental change’ process by Rajlich & Gosavi
Software Engineers, Software Engineering Researchers, Smell detector
WCRE 2013
Smell/Anti-Pattern Points: 19.53, 9.78, 8.32, 7.09, 3.04, 2.70, 2.33, 2.33, 2.31, 2.25, 1.50, 1.50, 1.20, 1.12, 1.12
Feature Points: 10.00, 4.08, 3.50, 2.67, 2.50, 2.50, 2.33, 2.33, 2.25
Where do you read about code smells?
Yslow and Pingdom
do I really need to check all
12 anti-patterns and code smells covered: Definition, Synonyms, Description, Argumentation for existence of FP, Examples (code), Concrete instances of FP, Relevant contextual factors, Empirical studies, Theory, Practical experiences, Factor, Implication
Preliminary Catalogue of Anti-pattern and Code Smell False Positives Technical Report RA-5/2015
Poznań University of Technology
SANER 2016
FIRST ATTEMPT TO BUILD A TAXONOMY
defect taxonomy
construction; modelling/design; requirements; requirement captured incorrectly; requirement not implemented; requirement misunderstood while implementing; design flaw / code smell / anti-pattern; logic; incorrect expression; data-flow (incl. variable assignment / initialization); algorithm (control-flow); data; database; OR mapping; interface; user interaction; API (technical interface); resources; memory; concurrency; performance; case completeness; exceptions; validation; language; environment; OS; 3rd Party; hardware; build environment; build tools (make/rake etc); configuration management system; QA; standards; documentation; testing
Building metadata for improving sampling of projects from GitHub
TAXONOMY FOR DEFECT CLASSIFICATION
Programming (code-related) activities during Maintenance: Reading, Searching, Navigating, Editing, Others
Developer System
Task 3. New Reporting functionality Task 1. Replacing external data source
Task 2. New authentication mechanism
System System!
System Project context Tasks
Source code Daily interviews Audio files/notes Subversion database
Programming Skill
Defects*
Development Technology
Change Size** Effort** Maintenance outcomes
Think aloud Video files/notes Task progress sheets Eclipse activity logs Trac (Issue tracker), Acceptance test reports Open interviews Audio files/notes
Variables
Data sources Moderator variables
Code smells
(num. smells**, smell density**)
** System and file level
* Only at system level
Maintainability perception*
Maintenance problems**
Think aloud Video files/notes Study diary
fact-sheet
Activity logs
➡ Most frequently performed activities: Navigating (58.72%), Reading (28.27%), Editing (10.18%) and Searching (2.47%)
➡ Distribution is consistent with Ko et al. 2006 (top four)
➡ Reading is the most time-consuming activity in Ko et al. 2006
an activity
For our analysis, we only consider: Editing, Navigating, Searching and Reading
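A minimal sketch of how activity shares like the ones above could be computed from an activity log. It assumes each log record is an (activity, seconds) pair; that record layout, the function name `activity_shares` and the toy log are hypothetical and not taken from the study.

```python
from collections import defaultdict

# Activities kept for the analysis (as stated on the slide); "Others" is dropped.
ANALYSED = ("Editing", "Navigating", "Searching", "Reading")


def activity_shares(events):
    """Return each analysed activity's share of the total effort, in percent.

    `events` is an iterable of (activity, seconds) pairs. This layout is a
    hypothetical stand-in for the real Eclipse activity logs.
    """
    totals = defaultdict(float)
    for activity, seconds in events:
        if activity in ANALYSED:
            totals[activity] += seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {name: 0.0 for name in ANALYSED}
    return {name: 100.0 * totals[name] / grand_total for name in ANALYSED}


# Made-up toy log for illustration only; these are not data from the study.
toy_log = [
    ("Navigating", 60), ("Reading", 30), ("Editing", 8),
    ("Searching", 2), ("Others", 15),
]
shares = activity_shares(toy_log)
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {pct:.1f}%")
```

Filtering before normalising means the four shares always sum to 100%, regardless of how much time was logged under "Others".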
DISTRIBUTION OF ACTIVITY EFFORT
Smells explain Editing and Navigating effort better than file size, but not for Reading and Searching
Maintenance problems in previous work related to increased effort for editing, navigating and reading
SUMMARY OF RESULTS
Developer Round Effort (time) System Main variables Refactorings (types, no.)
Research questions:
Is there an individual refactoring style?
What are the general tendencies w.r.t. refactorings?
Is there a learning when refactoring? (diff between rounds)
Refactored files and directories
Files and directories before maintenance Files and directories after maintenance
Dataset 1
Code smells (initial) Type of task Associated to bugfix? Code smells (final) Main variables Size in LOC (initial) Total effort Total churn Size in LOC (final) Other variables Original file? Refactored files and directories Java files
Dataset 2
Research questions:
Does the initial design** affect the choice of refactorings and smell evolution?
Does the type of task affect the types/effort spent on refactoring?
ANALYSING LOGS TO UNDERSTAND REFACTORING CHOICES
To be submitted somewhere...
Only for developer 3 was there an effect of the system, which we suspect is due to a more “ad-hoc” refactoring strategy
Figure: number of refactoring actions per refactoring type, one panel per developer (developer 1 to developer 6). Series labels in the panels: 1 A / 2 D, 1 D / 2 C, 1 B / 2 A, 1 C / 2 B, 1 C / 2 D, 1 A / 2 B.
Refactoring types: Rename element, Select refactoring menu, Change method signature, Generate constructors using fields, Generate constructors from superclass, Extract local variable, Organize the imports, Extract method, Extract Super Class, Pull Up, Override or Implement method, Rename information, Infer generic type, Generalize declared type, Push Down, Move method/class, Convert nested to top action, Introduce Parameter Action, Generate Delegate Methods, Extract Constant, Extract Interface.
Model: -2 Log likelihood; Cox & Snell R Square; Nagelkerke R Square
Extract method: 256,678; ,103; ,209
Change method signature: 72,718; ,081; ,371
Organize imports: 473,871; ,102; ,149
Variables contributing: Data Clump (Beta = 4,093, p < 0,05); Data Class (Beta = 5,568, p < 0,05); Feature Envy (Beta = 4,303, p < 0,05); Shotgun Surgery (Beta = 3,657, p < 0,001); Total no. smells (Beta = 3,414, p < 0,05); Lines of code (Beta = 0,197, p < 0,01)
Figure: number of refactoring actions per refactoring type (Rename element through Extract Interface), for Task 1, Task 2 and Task 3.
For tasks, it was the task size, rather than the type of task, that could explain the frequency
Only a few code smells could explain some refactorings (some with negative coefficients)
It appears that Extract Method could increase the odds of defects!! (results from a series of binary logistic regression models)
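To make "increase the odds" concrete, here is a small sketch of how a coefficient from a binary logistic regression maps to an odds ratio. The intercept and coefficient are hypothetical values chosen for illustration and are not results from the study; the only assumption is the standard logistic model, in which log-odds = intercept + beta * x.

```python
import math


def probability(intercept, beta, x):
    """Predicted probability from a one-predictor binary logistic model."""
    log_odds = intercept + beta * x
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def odds_ratio(beta):
    """Multiplicative change in the odds for a one-unit increase in the predictor."""
    return math.exp(beta)


# Hypothetical numbers, NOT taken from the slides.
INTERCEPT = -2.0
BETA_EXTRACT_METHOD = 0.7  # a positive coefficient raises the odds of a defect

p_without = probability(INTERCEPT, BETA_EXTRACT_METHOD, 0)  # no Extract Method
p_with = probability(INTERCEPT, BETA_EXTRACT_METHOD, 1)     # Extract Method applied

print(f"P(defect) without refactoring: {p_without:.3f}")
print(f"P(defect) with refactoring:    {p_with:.3f}")
print(f"Odds ratio exp(beta):          {odds_ratio(BETA_EXTRACT_METHOD):.3f}")
print(f"Odds ratio from probabilities: {odds(p_with) / odds(p_without):.3f}")
```

A positive coefficient gives an odds ratio above 1 (higher odds of the outcome) and a negative coefficient gives one below 1. This is the sense in which some smells are reported with negative coefficients.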
Feature Envy God Class God Method Temporal variable used for several purposes Usage of implementation instead of interface Extract Method Bugfix (after refactoring) Bugfix (at some point) Generate constructors from super class Override or implement method Move method/class
decrease
decrease increase increase
increase
decrease
Churn