SLIDE 1

FUTURE CHALLENGES IN SOFTWARE EVOLUTION AND QUALITY ANALYSIS FROM INDUSTRIAL & ACADEMIC PERSPECTIVES

Aiko Yamashita, PhD.
Centrum Wiskunde & Informatica, Netherlands
Oslo and Akershus University College of Applied Sciences, Norway
2016-12-01

SLIDE 2

A SHORT INTRO ABOUT ME..

(Map: Sapporo, Japan; San Jose, Costa Rica; San Francisco and Berkeley, USA; Gothenburg, Sweden; Oslo, Norway; Amsterdam, Netherlands; with "mom" and "dad" marked.)

SLIDE 3

Career trajectory...

SLIDE 4

High-level research focus/question: How can we use different** metrics to attain better maintainability assessments and to support software evolution-related activities?

**product-related and project-related (empirical) metrics

A METRICS APPROACH FOR SOFTWARE EVOLUTION AND MAINTENANCE

(Diagram: unstructured data, reports, code/design attributes, and historical data are distilled into metrics and heuristics that support design restructuring, heuristics for programmers, and evaluation/estimation.)

SLIDE 5

METRICS FOCUS: CODE SMELLS

A hint about suboptimal implementation choices that can negatively affect future maintenance and evolution.

SLIDE 6

METRICS FOCUS: CODE SMELLS

A change leads to another change, to another, to another..

Shotgun Surgery

Reduce the coupling between components

Move method refactoring
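A minimal Java sketch of the smell and its standard fix (an invented example, not from the study systems): the discount rule below is duplicated, so one conceptual change forces edits in several classes; consolidating it via Extract/Move Method reduces the coupling.

```java
// Hypothetical sketch (names invented). Before refactoring, the 10% discount
// rule was duplicated, so changing it meant touching several classes:
// the Shotgun Surgery smell.
class Order {
    private final double subtotal;
    Order(double subtotal) { this.subtotal = subtotal; }
    double subtotal() { return subtotal; }

    // After the Move/Extract Method refactoring the rule lives here,
    // giving every caller a single point of change.
    double discountedTotal() { return subtotal * 0.90; }
}

class OrderService {
    // Before: return order.subtotal() * 0.90;   (rule copy #1)
    double total(Order order) { return order.discountedTotal(); }
}

class InvoicePrinter {
    // Before: "Total: " + order.subtotal() * 0.90   (rule copy #2)
    String totalLine(Order order) { return "Total: " + order.discountedTotal(); }
}
```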

SLIDE 7

METRICS FOCUS: CODE SMELLS

  • Refactoring ROI is not clear: eliminating a code smell implies a cost (refactoring, rework) and a risk (introducing defects).
  • It is not clear in which contexts (e.g., activities) smell-based analysis performs best, nor which preconditions (e.g., additional data) are required.

State of the art in detection tooling: JDeodorant, InFusion/InCode, NDepend, Analyze4J, PTIDEJ/DECOR.

SLIDE 8

Goal: Assess whether code smells can be used effectively for assessing the maintainability of software. Research method: Longitudinal, in-vivo case study investigating a maintenance project involving 4 Java systems and 6 software professionals. Research techniques: case replication, cross-case synthesis, explanatory models (e.g., regression), grounded theory.

LET'S GET EMPIRICAL!

SLIDE 9

EMPIRICAL STUDY

Study Design (systems A, B, C, D; developers assigned per system)

  • 4 Java applications
  • Same functionality
  • Different design/code
  • Size: 7 KLOC to 14 KLOC

Context: maintenance tasks. Task 1: replacing the external data source; Task 2: new authentication mechanism; Task 3: new reporting functionality.

SLIDE 10

INSIDE THE BELLY OF THE MONSTER

Variables of interest:
  • Code smells (no. of smells**, smell density**)
  • Maintenance outcomes: change size**, effort**, defects*, maintainability perception*, maintenance problems**
    (** system and file level; * system level only)

Moderator variables: programming skill, development technology, project context, tasks.

Data sources: source code (Subversion database), daily interviews (audio files/notes), think-aloud sessions (video files/notes), task progress sheets, Eclipse activity logs, Trac (issue tracker), acceptance test reports, open interviews, study diary.

Fact sheet:
  • 50,000 Euros
  • Sep-Dec, 2008
  • 7 weeks
  • 6 developers
  • 2 companies

SLIDE 11

SLIDE 12

LET’S GET SOME “PERSPECTIVES”

Perspectives: Aggregated, Individual, Interconnected

SLIDE 13

Can code smells indicate system-level maintainability?

Maintainability assessment: systems were ranked according to their no. of code smells and their smell density (no. smells/KLOC).

Actual maintainability: systems were ranked according to their maintainability, measured by effort (time) and defects introduced.

Do they correspond?
Do they correspomd?

Standardized scores were calculated for the ranking
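The slides do not spell out the standardization; assuming the usual z-score, the ranking input for system i on metric m would be:

```latex
z_{i,m} = \frac{x_{i,m} - \bar{x}_m}{s_m}
```

where x_{i,m} is, e.g., the number of smells in system i, and the mean and standard deviation are taken over the four systems. Standardizing puts smell counts, effort, and defect counts on comparable scales before ranking.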

In addition, smell-based assessment was compared to two previous assessments (CK metrics and Expert judgment) on the systems

SLIDE 14

Can code smells Indicate system-level maintainability?

SLIDE 15

Can code smells Indicate system-level maintainability?

Number of code smells displayed the highest correspondence to actual maintainability.

Expert judgment was considered the most flexible approach, because it considers both the effects of system size and potential maintenance scenarios.

Number of code smells is correlated with system size!

SLIDE 16

Do code smells cover important maintainability attributes?

(Analysis chain per developer and system (A-D): audio file, transcript, coded statements, maintainability factors, cross-case matrix.)

Not covered: appropriate technical platform, coherent naming, design suited to the problem domain, initial defects, architecture.

Partially covered: encapsulation, inheritance, libraries, simplicity, use of components, design consistency, logic spread.

Covered: duplicated code.

SLIDE 17

LET’S GET SOME “PERSPECTIVES”

Perspectives: Aggregated, Individual, Interconnected

SLIDE 18

Can code smells explain maintenance effort or problems?

Explanatory model for Problems. Dependent variable: Problematic? Independent variables: 12 smells. Control variables:

  • File size (LOC)
  • Churn
  • System

Analysis: logistic regression model.

Explanatory model for Effort. Dependent variable: Effort (time). Independent variables: 12 smells. Control variables:

  • File size (LOC)
  • Number of revisions on a file
  • System
  • Developer
  • Round

Analysis: multiple regression model.

In addition, principal component analysis (PCA) on the code smell distribution and a qualitative analysis were performed.
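For reference when reading the Exp(B) values on the next slide, a logistic model of this shape is presumably what was fitted (a sketch; the exact covariate coding is not given in the slides):

```latex
\log\frac{p_i}{1-p_i}
  = \beta_0 + \sum_{k=1}^{12} \beta_k\,\mathrm{smell}_{ik}
  + \gamma_1\,\mathrm{LOC}_i + \gamma_2\,\mathrm{churn}_i + \gamma_{s(i)},
\qquad \mathrm{Exp}(B_k) = e^{\beta_k}
```

where p_i is the probability that file i is problematic and γ_{s(i)} is a system effect. Exp(B_k) is the multiplicative change in the odds per unit increase of predictor k; e.g., Exp(B) = 7.610 for ISPV means files with that smell had roughly 7.6 times the odds of being problematic, with the controls held fixed.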

SLIDE 19

Can code smells explain maintenance effort or problems?

Explanatory model for Problems:

  • Interface Segregation Principle Violation (ISPV) was able to explain problems [Exp(B) = 7.610, p = 0.032].
  • Data Clump was a significant contributor to the model [Exp(B) = 0.053, p = 0.029], but associated with fewer problems!
  • PCA indicated that ISPV tends not to be associated with code smells that are related to size.
  • Qualitative data suggest that ISPV is related to error/change propagation and to difficult concept location.

Explanatory model for Effort:

  • A model that includes file size, number of changes, and code smells displayed a fit of R2 = 0.58.
  • Removing the smells from that model did not decrease the fit!! (R2 = 0.58)
  • The only smell that remained significant was Refused Bequest, which registered a decrease in effort (α < 0.01).
  • File size and number of changes remain the most significant predictors of effort (α < 0.001).

Code smells are no better at explaining sheer effort at file level than size and number of revisions. Some code smells can potentially explain the occurrence of problems during maintenance. Also, not all smells seem to be problematic…

SLIDE 20

LET’S GET SOME “PERSPECTIVES”

Perspectives: Aggregated, Individual, Interconnected

SLIDE 21

To what extent can code smells explain maintenance problems?

Data: observational study, daily interviews, think-aloud protocol.

Maintenance difficulties broken down into: % non-source-code-related difficulties, % source-code-related difficulties, % code-smell-related difficulties, % non-code-smell-related difficulties.

Analyses: principal component analysis, dependency analysis.

SLIDE 22

To what extent can code smells explain maintenance problems?

Distribution of maintenance problems according to source

SLIDE 23

Some patterns were identified..

Groups from the diagram: Hoarders (Feature Envy, God Class, God Method); Data Containers (Data Clump, Data Class); Wide Interfaces (ISP Violation, Shotgun Surgery); Confounders (temporal variable used for several purposes, duplicated code in conditional branches). Edge labels between groups: "have dependencies", "could become", "are often found together with".

How do Code Smells interact with one another?

SLIDE 24

How do Code Smells interact with one another?

(Table: problematic files with at least one God Method, by system, with counts per smell type (DC, CL, DUP, FE, GC, GM, ISPV, MC, RB, SS, Temp, Imp): StudyDatabase (A), PrivilegesManageAction (B), StudiesEditAction (B), StudiesSearchAction (B), StudySearch (B), DB (C), StudyDAO (D).)

Hoarders in System B and how they are distributed across two components:

  • StudySearch.java (GC, GM, FE) coupled with ObjectStatementImpl.java (ISPV, SS): coupled smells FE, GM, ISPV, GC, SS
  • MemoryCachingSimula.java (GC, TMP) coupled with Simula.java (ISPV, SS): coupled smells ISPV, GC, SS, TMP

(Diagram: in System B, a God Class with Feature Envy in component A is coupled to a God Class with Feature Envy in component B.)

Coupled smells can have similar implications as collocated smells!

SLIDE 25

An interesting example of the interaction effect between code smells and other design flaws…

Typed getters and setters

How do Code Smells interact with one another?
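A hypothetical Java sketch of the typed getters-and-setters case mentioned above (invented names, not from the study systems): a recurring clump of fields is exposed through per-field typed accessors, so a type change ripples through every caller, much like Shotgun Surgery.

```java
// Hypothetical illustration: the same address fields recur as a Data Clump,
// and the typed getters/setters pin callers to each field's concrete type,
// so changing e.g. zipCode from int to String propagates widely.
class Customer {
    private String street;
    private String city;
    private int zipCode;   // a type change here ripples through all callers

    String getStreet() { return street; }
    void setStreet(String street) { this.street = street; }
    String getCity() { return city; }
    void setCity(String city) { this.city = city; }
    int getZipCode() { return zipCode; }
    void setZipCode(int zipCode) { this.zipCode = zipCode; }
}

// Wrapping the clump in one type confines such changes to a single class:
class Address {
    private final String street;
    private final String city;
    private final String zipCode;

    Address(String street, String city, String zipCode) {
        this.street = street;
        this.city = city;
        this.zipCode = zipCode;
    }

    String street() { return street; }
    String city() { return city; }
    String zipCode() { return zipCode; }
}
```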

SLIDE 26

OVERALL FINDINGS

  • Aggregated and individual code smell analyses are insufficient to understand the role of code smells in maintenance
  • Code smells interact! (collocated and coupled code smells)
  • A (more promising?) approach is to incorporate dependency analysis into the study of individual code smells
  • There may be other smells not yet discovered…
  • The role of code smells depends on the maintenance context (e.g., Data Clumps)

SLIDE 27

FOLLOW-UP STUDIES?

SLIDE 28

Open source systems:
  • ElasticSearch: Java, 2951 files; total size 253 KLOC; history: 102 minor releases and 22 major releases since 2010. Search engine platform.
  • Mahout: Java, 935 files; Scala, 12 files; total size 92 KLOC; history: 10 releases since 2010. Machine learning library.

Industrial system:
  • Ebehandling: Java, 5300 files; total 5840 files; total size 601 KLOC; history: 40 major releases and 15 patch releases since 2009. Grant application system.

Analyses: principal component analysis, dependency analysis.

Can we find the same inter-smells in larger systems?

SLIDE 29

IGNORING COUPLED SMELLS INCREASES CHANCES OF INTER-SMELL FALSE NEGATIVES

Co-occurring smell pairs, with the systems and number of components where each was found:

  • Data Class + Feature Envy: Ebehandling (9), Mahout (10), ElasticSearch (8) [Pattern 1]
  • Sibling Duplication + Message Chains: Ebehandling (7), ElasticSearch (14) [Pattern 3]
  • God Class + Schizo Class: Mahout (5), ElasticSearch (1)
  • God Class + Data Class + Message Chains: Ebehandling (1), Mahout (5)
  • God Class + Feature Envy: Mahout (9), ElasticSearch (7); coupled variant [Pattern 5]: Ebehandling (1), ElasticSearch (1)
  • Feature Envy + Message Chains: Ebehandling (4), ElasticSearch (13)
  • Internal Duplication + External Duplication: Ebehandling (5), Mahout (2) [Pattern 7; coupled variant in Mahout (2) and Ebehandling (6)]

(Figure residue also lists coupled combinations such as God Class + External Duplication + Feature Envy, and God Class + Feature Envy + Intensive Coupling.)

SLIDE 30

INDUSTRIAL SYSTEMS ARE DIFFERENT FROM OPEN SOURCE

(Figure, panels a-c: Data Clumps having dependencies on other Data Clumps, in Mahout, Ebehandling, and ElasticSearch.)

Pattern 9: Data Clumps depending on Data Clumps: Ebehandling (8), Mahout (6), ElasticSearch (5)
Pattern 5 (coupled): God Class + Feature Envy: Ebehandling (1), ElasticSearch (1)

SLIDE 31

(Figure, panels a-c: classes having incoming dependencies from God Classes and Feature Envy methods, in Ebehandling, Mahout, and ElasticSearch.)

Pattern 9: Data Clumps: ElasticSearch (5)
Pattern 5 (coupled): God Class + Feature Envy: Ebehandling (1), ElasticSearch (1)

INDUSTRIAL SYSTEMS ARE DIFFERENT FROM OPEN SOURCE

SLIDE 32

Programming (code-related) activities during maintenance: Reading, Searching, Navigating, Editing, Others.

How do Code Smells affect Maintenance Activities?

- MINING IDE EVENT LOGS -
SLIDE 33

MINING EVENT LOGS TO UNDERSTAND EFFORT

  • Selection of artifacts in the package explorer
  • Selection of Java elements in the editor window
  • Selection of Java elements in the file outline
  • Editing source files (Java files)
  • Scrolling the source code window
  • Switching between open files
  • Running Eclipse “commands” (copy, paste, go to line)

Eclipse activity logs
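A sketch of how such raw events can be mapped onto activity categories (the event-type names below are invented; the real Eclipse log schema differs):

```java
// Minimal sketch (assumed event names, not the actual Eclipse log schema):
// mapping raw IDE events onto the activity categories analyzed here.
enum Activity { READING, NAVIGATING, SEARCHING, EDITING, OTHER }

final class EventClassifier {
    static Activity classify(String eventType) {
        switch (eventType) {
            case "editor.scroll":              // scrolling the source window
            case "editor.selection":           // selecting Java elements in the editor
                return Activity.READING;
            case "packageExplorer.selection":  // selecting artifacts
            case "outline.selection":          // file outline navigation
            case "editor.switchFile":          // switching between open files
                return Activity.NAVIGATING;
            case "search.query":
                return Activity.SEARCHING;
            case "editor.edit":                // editing Java source files
                return Activity.EDITING;
            default:
                return Activity.OTHER;         // commands like copy, paste, go to line
        }
    }
}
```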

SLIDE 34

DISTRIBUTION OF ACTIVITY EFFORT

➡ Most performed activities: Navigating (58.72%), Reading (28.27%), Editing (10.18%), and Searching (2.47%)

➡ The distribution is consistent with Ko et al. 2006 (top four)

➡ Reading was the most time-consuming activity in Ko et al. 2006

  • Definition of which events/actions belong to an activity

For our analysis, we only consider: Editing, Navigating, Searching, and Reading.

SLIDE 35

CODE SMELLS AFFECT SOME TASKS...

Smells explain Editing and Navigating effort better than file size does, but not Reading and Searching effort.

Maintenance problems in previous work were related to increased effort for editing, navigating, and reading.

SLIDE 36

Main variables: developer, round, effort (time), system, refactorings (types, no.).

Dataset 1: refactored files and directories; files and directories before and after maintenance.
Research questions:
  • What are the general tendencies w.r.t. refactorings?
  • Is there an individual refactoring style?
  • Is there learning when refactoring? (difference between rounds)

Dataset 2: refactored files and directories, Java files.
Main variables: code smells (initial), type of task, associated to bugfix?, code smells (final), size in LOC (initial), total effort, total churn, size in LOC (final). Other variables: original file?
Research questions:
  • Does the initial design affect the choice of refactorings and smell evolution?
  • Does the type of task affect the types of, and effort spent on, refactoring?

How do Code Smells affect Refactoring?

- MINING IDE EVENT LOGS -
SLIDE 37

Only for developer 3 was there an effect from system; we suspect this is due to a more "ad-hoc" refactoring strategy.

0" 20" 40" 60" 80" 100" 120" 140" Rename"element" Select"refactoring"menu" Change"method"signature" Generate"constructors"using" Generate"constructors"from" Extract"local"variable" Organize"the"imports" Extract"method" Extract"Super"Class" Pull"Up" Override"or"Implement" Rename"informaFon" Infer"generic"type" Generalize"declared"type" Push"Down" Move"method/class" Convert"nested"to"top"acFon" Introduce"Parameter"AcFon" Generate"Delegate"Methods" Extract"Constant" Extract"Interface" 1"A" 2"D" 0" 2" 4" 6" 8" 10" 12" Rename"element" Select"refactoring"menu" Change"method"signature" Generate"constructors"using" Generate"constructors"from" Extract"local"variable" Organize"the"imports" Extract"method" Extract"Super"Class" Pull"Up" Override"or"Implement" Rename"informaFon" Infer"generic"type" Generalize"declared"type" Push"Down" Move"method/class" Convert"nested"to"top"acFon" Introduce"Parameter"AcFon" Generate"Delegate"Methods" Extract"Constant" Extract"Interface" 1"D" 2"C" 0" 1" 2" 3" 4" 5" 6" 7" 8" 9" Rename"element" Select"refactoring"menu" Change"method"signature" Generate"constructors"using" Generate"constructors"from" Extract"local"variable" Organize"the"imports" Extract"method" Extract"Super"Class" Pull"Up" Override"or"Implement" Rename"informaFon" Infer"generic"type" Generalize"declared"type" Push"Down" Move"method/class" Convert"nested"to"top"acFon" Introduce"Parameter"AcFon" Generate"Delegate"Methods" Extract"Constant" Extract"Interface" 1"B" 2"A" 0" 10" 20" 30" 40" 50" 60" Rename"element" Select"refactoring"menu" Change"method"signature" Generate"constructors"using" Generate"constructors"from" Extract"local"variable" Organize"the"imports" Extract"method" Extract"Super"Class" Pull"Up" Override"or"Implement" Rename"informaFon" Infer"generic"type" Generalize"declared"type" Push"Down" Move"method/class" Convert"nested"to"top"acFon" Introduce"Parameter"AcFon" Generate"Delegate"Methods" Extract"Constant" Extract"Interface" 1"C" 2"B" 0" 0.5" 1" 1.5" 2" 2.5" Rename"element" Select"refactoring"menu" Change"method"signature" Generate"constructors"using" Generate"constructors"from" Extract"local"variable" Organize"the"imports" Extract"method" Extract"Super"Class" Pull"Up" Override"or"Implement" Rename"informaFon" Infer"generic"type" Generalize"declared"type" Push"Down" Move"method/class" Convert"nested"to"top"acFon" Introduce"Parameter"AcFon" Generate"Delegate"Methods" Extract"Constant" Extract"Interface" 1"C" 2"D" 0" 0.5" 1" 1.5" 2" 2.5" 3" 3.5" Rename"element" Select"refactoring"menu" Change"method"signature" Generate"constructors"using" Generate"constructors"from" Extract"local"variable" Organize"the"imports" Extract"method" Extract"Super"Class" Pull"Up" Override"or"Implement" Rename"informaFon" Infer"generic"type" Generalize"declared"type" Push"Down" Move"method/class" Convert"nested"to"top"acFon" Introduce"Parameter"AcFon" Generate"Delegate"Methods" Extract"Constant" Extract"Interface" 1"A" 2"B"

INDIVIDUAL STYLE FOR REFACTORING?

developer 1 developer 4 developer 2 developer 3 developer 5 developer 6

DISCLAIMER: VERY PRELIMINARY!! (small sample)

SLIDE 38

Model &2(Log( likelihood Cox(&(Snell(R( Square Nagelkerke(R( Square Variables((contributing Data(Clump!(Beta!=(4,093!p!<!0,05) Data(Class!(Beta!=!5,568!p!<!0,05) Feature(Envy!(Beta!=4,303!p!<!0,05) Shotgun(Surgery!(Beta!=!(3,657!p!<!0,001) Total(no.(smells!(Beta!=!(3,414!p<0,05) Extract!method 256,678 ,103 ,209 Lines(of(code!(Beta!=!0,197!p!<!0,01) ,371 ,081 72,718 Change!method!signature ,149 ,102 473,871 Organize!imports

0" 20" 40" 60" 80" 100" 120" 140" 160" R e n a m e " e l e m e n t " C h a n g e " m e t h

  • d

" s i g n a t u r e " G e n e r a t e " c

  • n

s t r u c t

  • r

s " u s i n g " fi e l d s " G e n e r a t e " c

  • n

s t r u c t

  • r

s " f r

  • m

" s u p e r c l a s s " E x t r a c t " l

  • c

a l " v a r i a b l e " O r g a n i z e " t h e " i m p

  • r

t s " E x t r a c t " m e t h

  • d

" E x t r a c t " S u p e r " C l a s s " P u l l " U p " O v e r r i d e "

  • r

" I m p l e m e n t " m e t h

  • d

" R e n a m e " i n f

  • r

m a G

  • n

" I n f e r " g e n e r i c " t y p e " G e n e r a l i z e " d e c l a r e d " t y p e " P u s h " D

  • w

n " M

  • v

e " m e t h

  • d

/ c l a s s " C

  • n

v e r t " n e s t e d " t

  • "

t

  • p

" a c G

  • n

" I n t r

  • d

u c e " P a r a m e t e r " A c G

  • n

" G e n e r a t e " D e l e g a t e " M e t h

  • d

s " E x t r a c t " C

  • n

s t a n t " E x t r a c t " I n t e r f a c e " Task"1" Task"2" Task"3"

STILL... EFFECTS FROM TASK TYPE AND CODE SMELLS WERE MINOR...

For tasks, it was task size much more than task type that explained refactoring frequency. Only a few code smells could explain some refactorings (some with negative coefficients).

SLIDE 39

WE ARE DEALING WITH AN EVEN MESSIER PROBLEM (THAN WE THOUGHT)

CONTEXTUAL VARIABILITY (activity types, FLOSS vs. industrial); UNCLEAR CATEGORIZATION and EFFECTS from TASKS; ARTIFICIAL (USELESS?) FORMALISMS

SLIDE 40

SLIDE 41

WE NEED TO DEVELOP A NEW GENERATION OF INSTRUMENTATION & MEASUREMENT TECHNIQUES

(software measurements, human measurements, and their interaction)

SLIDE 42

Heuristics + think-aloud sessions + IDE logs + defect data + change history:

  • Adaptable mining-based capability that can "read" the situation based on code evolution, defects, and behavioural patterns from developers
  • Identification of problematic cases via think-alouds
  • Event patterns can help identify problematic artifacts
  • Examination of artifacts can hopefully lead to measurable attributes of the artifacts

An inductive approach to identifying harmful design/code attributes, plus a historical, adaptable data-mining capability for software evolution.

SLIDE 43

Process mining of events: visualization, theory building, recommender systems; guiding focus on features/research.

Inputs: problems experienced + interaction traces.

Outputs: validation of theoretical models, deviations from the normal flow, general bottlenecks.

Behavioral model: search, hypothesize, execute action, validate, consolidate.

SLIDE 44

BUT... HOW DO WE GO ABOUT INDUCTIVE ANALYSIS?

Challenge (1 out of N): Badly integrated empirical data (defect data, issues data, change history)

  • Routines/tooling are not integrated into the workflow...
  • No culture for registering the underlying cause of a code change...
  • No "reflective routines" powered by toolsets that are easy to use...

(Process sketch: iteration planning, a 2-week (normal) iteration, and a retrospective, with mini-iterations in between; continuous process monitoring creates, uses, and updates a Refactoring Backlog.)

Knowledge base (conceptual model): Code Smell [detection effort, maintainability risk] *..* Refactoring [implementation effort, side-effects].
SLIDE 45

IDEA: TOOLSET AND ROUTINES FOR SEAMLESS INTEGRATION OF CODE AND REPOSITORY ANALYSIS (+ API??)

1. Technology should be available
2. Which can be tested in case studies via feasible routines

(Same process sketch as above: iteration planning, 2-week iterations with mini-iterations and retrospectives, and continuous process monitoring maintaining the Refactoring Backlog and the Code Smell *..* Refactoring knowledge base.)

BUT better "packaging" is needed!
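The knowledge-base sketch above suggests a small domain model. A hypothetical Java rendering (all names, fields, and values assumed for illustration) of the Code Smell *..* Refactoring association:

```java
import java.util.List;

// Hypothetical sketch of the knowledge base on this slide (names assumed):
// code smells and refactorings form a many-to-many (*..*) association,
// annotated with the costs a team needs when planning a refactoring backlog.
record CodeSmell(String name, double detectionEffort, double maintainabilityRisk) {}

record Refactoring(String name, double implementationEffort, String sideEffects,
                   List<CodeSmell> addresses) {}  // the *..* association

class KnowledgeBaseDemo {
    public static void main(String[] args) {
        CodeSmell shotgunSurgery = new CodeSmell("Shotgun Surgery", 2.0, 0.8);
        Refactoring moveMethod = new Refactoring("Move Method", 4.0,
                "may shift coupling elsewhere", List.of(shotgunSurgery));
        System.out.println(moveMethod);  // entries like this would feed iteration planning
    }
}
```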

SLIDE 46

Like!

Where do you read about code smells?

Are we even concerned about the same smells?

- EXPLORATORY SURVEY -
SLIDE 47

Cross and Hargadon. “Critical Connections: Driving Rapid Innovation with a Network Perspective”

SLIDE 48

Know-what Know-how Implications!

SLIDE 49
(Figure: survey responses; labels unreadable in this export.)

Yamashita & Moonen, 2013 - “Do Developers Care About Code Smells? An Exploratory Survey”

Challenge 1: Difficulties understanding it, its usage or implications

Accessibility issues?

In the top list of Reddit (for a day)!

SLIDE 50

“If you don't make it to air, there is nothingness. You're dead.”
  • Jack Welch

“We like low-hanging fruit!”
  • industrial dude

SLIDE 51

Prejudices on results coming from academia:

“too complicated”, “too theoretical”, “boring”, “not as cool as..”

The wave of technology adoption may obey a different set of rules than our current understanding suggests... Maybe we need to better understand the phenomenon of adoption from a sociological perspective (e.g., perception).

SLIDE 52

There has been an ‘evolution’ in the “perception” of FLOSS: the concept of a sandbox, platform support, tutorials, etc.

Now many OSS projects are “sleek”... and why not?

SLIDE 53

BUT... HOW DO WE GO ABOUT INDUCTIVE ANALYSIS?

Challenge (2 out of N): No clear classification schemas (defect data = issues data?)

  • No clear/useful/easy-to-use taxonomy for classifying faults..
  • No clear taxonomy for classifying the underlying reasons for changes
  • Unclear description of contexts/domains of given projects (comparability may be a challenge, even if the data is correctly classified)

SLIDE 54

IDEA: TAXONOMY FOR DEFECT CATEGORIZATION THAT IS EASY TO USE BY DEVELOPERS

Draft defect taxonomy:
  • requirements: requirement captured incorrectly; requirement not implemented; requirement misunderstood while implementing
  • modelling/design: design flaw / code smell / anti-pattern
  • construction, logic: incorrect expression; data-flow (incl. variable assignment/initialization); algorithm (control-flow)
  • data: database, OR mapping
  • interface: user interaction; API (technical interface)
  • resources: memory, concurrency, performance
  • case completeness: exceptions, validation, language
  • operating environment: OS, 3rd party, hardware
  • build environment: build tools (make/rake etc.), configuration management system
  • QA: standards, documentation, testing
  • other

Also: building metadata for improving the sampling of projects from GitHub.
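One way to keep such a taxonomy cheap for developers to apply is a dozen flat categories; a sketch in Java (category set condensed from the list above):

```java
// One possible encoding of the proposed taxonomy's top-level categories
// (a sketch; names condensed from the draft taxonomy on this slide).
enum DefectCategory {
    REQUIREMENTS,   // captured incorrectly, not implemented, misunderstood
    DESIGN,         // design flaw / code smell / anti-pattern
    LOGIC,          // incorrect expression, data flow, algorithm
    DATA,           // database, OR mapping
    INTERFACE,      // user interaction, API
    RESOURCES,      // memory, concurrency, performance
    COMPLETENESS,   // exceptions, validation, language
    ENVIRONMENT,    // OS, third party, hardware
    BUILD,          // build tools, configuration management
    QA,             // standards, documentation, testing
    OTHER
}
```

A flat enum like this could be attached to issue-tracker entries with a single click, which is the point of the "easy to use by developers" idea.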

SLIDE 55

THE FALSE POSITIVES PROBLEM

“Do I really need to check all of them?”

SLIDE 56

IDEA: TAXONOMY FOR DESCRIBING PROJECT CONTEXTS AND DOMAINS (ANOTHER “GRAND” IDEA)

Mainly for generating more adaptable quality models based on empirical data from a given project repository.

Machine learning + crowdsourcing: via crowdsourcing, we ask people what domain a project belongs to, and we train a model to guess the domain when analyzing projects on GitHub.

SLIDE 57

BUT... HOW DO WE GO ABOUT INDUCTIVE ANALYSIS?

Challenge (3 out of N): No (empirical) data available (defect data, issues data, change history)

Often there are only "millennial" files of code after code, complemented with some other artefacts... (like outdated documentation). Enter the software archeologist.

SLIDE 58

IDEA: USE OF BENCHMARKS, THRESHOLD ANALYSIS (AND OTHERS...)

The concept has been proposed by Alves, Ferreira, Baggen, Oliveira...

(Pipeline: systems are cloned and analyzed via a REST API; results feed a benchmark, a set of .csv files, on which percentile analysis is performed.)

How reliable is the benchmark?
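A minimal sketch of the percentile-analysis step in the spirit of Alves et al.'s benchmark-based thresholds (simplified to plain percentiles over pooled metric values; the published method also weights entities by size, and the sample numbers below are invented):

```java
import java.util.Arrays;

// Sketch of benchmark-based threshold derivation: pool a metric's values
// across the benchmark systems, then read risk thresholds off the
// 70th/80th/90th percentiles.
final class ThresholdAnalysis {
    /** Returns the value at the given fraction of the sorted sample. */
    static double percentile(double[] values, double fraction) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int index = (int) Math.ceil(fraction * sorted.length) - 1;
        return sorted[Math.max(index, 0)];
    }

    public static void main(String[] args) {
        // e.g. method-length values pooled from the benchmark's .csv files
        double[] methodLoc = {3, 5, 5, 8, 12, 15, 20, 31, 44, 90};
        System.out.printf("moderate/high/very-high risk thresholds: %.0f / %.0f / %.0f%n",
                percentile(methodLoc, 0.70),   // 20
                percentile(methodLoc, 0.80),   // 31
                percentile(methodLoc, 0.90));  // 44
    }
}
```

The reliability question on this slide is exactly about this step: thresholds are only as trustworthy as the benchmark sample they are derived from.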

SLIDE 59

IDEA: WHAT ABOUT A META-DATA CATALOGUE?

SLIDE 60

LAST (BUT NOT LEAST) IDEA: MORE FOCUS ON DEVELOPERS?

(Cartoon of software engineers, software engineering researchers, and a smell detector; this is the picture we want to avoid...)

SLIDE 61

What is going on there?

SLIDE 62

.. or here?

SLIDE 63

Perceptual and cognitive processes during software engineering activities are not yet well understood

SLIDE 64

Human/social factors play a central role in software development

SLIDE 65

SLIDE 66

SLIDE 67

Theories on program comprehension date from the '90s (20+ years ago). But... theories on program comprehension are outdated.

SLIDE 68

We are undergoing a dramatic paradigm shift

The Tabulating Era (1900s-1940s), the Programming Era (1950s-present), the Cognitive Era (2011-)

Source: IBM

SLIDE 69

And many assumptions need to be reassessed or updated...

SLIDE 70

Modern software engineering research: empirical methods; tools, languages, and methods; real-life, industrial-scale validation; study of human and social factors; study of contextual factors; theory, hypothesis, data.

Software Engineering Research Platform (proposal)

Example detectors:
  • Analyzer4J (http://www.codeswat.com): Blob Classes, Swiss Knife, Complex Class
  • InCode (http://www.intooitus.com/inCode.html): Feature Envy, Data Class, God Class
SLIDE 71

Research on Software Quality and Evolution

Techniques: software analysis, software transformation, software generation, tools for software development, formal methods, machine learning, language engineering.

Applied to complex, large artifacts: source code, processes, logs/data, teams, systems of systems, contexts/domains.

Do they help? How much? How can it be better?

SLIDE 72

“We choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard, because that goal will serve to organize and measure the best of our energies and skills, because that challenge is one that we are willing to accept, one we are unwilling to postpone, and one which we intend to win.”

  • John F. Kennedy
SLIDE 73

ありがとうございます! (thank you!)

Contact: aiko.fallas@gmail.com