Cohesion and Coupling Optimisation: Balancing Improvement and Disruption

SLIDE 1

Cohesion and Coupling Optimisation

BALANCING IMPROVEMENT AND DISRUPTION

MATHEUS PAIXAO, MARK HARMAN, YUANYUAN ZHANG, YIJUN YU

SLIDE 2

What I have (not) done

  • The ultimate solution for software modularisation
  • Full understanding of developers’ behaviour
  • Identification of the best metrics for software modularisation

matheus.paixao.14@ucl.ac.uk 2

SLIDE 3

What I have done

  • Thorough empirical study of structural cohesion and coupling optimisation
  • Identification of disruption as an important overlooked issue
  • Multi-objective approach to balance structural improvement and disruption

SLIDE 4

Structural Dependencies

[Figure: example system with packages p1 to p3 and classes c1 to c9, linked by structural dependencies.]

Types of structural dependency:

  • Function call
  • Data access
  • Inheritance
  • Interface implementation

SLIDE 5

Modularisation drivers

Semantic and structural cohesion/coupling metrics better describe developers’ implementations [1][2]

[1] Bavota, G., Dit, B., Oliveto, R., Di Penta, M., Poshyvanyk, D., & De Lucia, A. (2013). An empirical study on the developers’ perception of software coupling. In 2013 35th International Conference on Software Engineering (ICSE) (pp. 692–701). San Francisco: IEEE.
[2] Candela, I., Bavota, G., Russo, B., & Oliveto, R. (2016). Using Cohesion and Coupling for Software Remodularization: Is It Enough? ACM Transactions on Software Engineering and Methodology, 25(3), 1–28.

SLIDE 6

Literature Review

  • Metrics validation
  • Longitudinal evaluation
  • Disruption analysis

Largest empirical study of automated software re-modularisation to date.

SLIDE 7

Software Systems Under Study

SLIDE 8

Modularisation Quality - MQ

[Figure: example system with packages p1 to p3 and classes c1 to c9.]

$MF(p_1) = 0.72$, $MF(p_2) = 0.66$, $MF(p_3) = 0.00$

$MQ = MF(p_1) + MF(p_2) + MF(p_3) = 1.38$
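The three per-package values on this slide add up to the total (0.72 + 0.66 + 0.00 = 1.38), which is how MQ is built: one modularisation factor per package, summed. The slide does not give the factor's formula. The sketch below assumes the Bunch formulation, i / (i + j/2), where i counts dependencies inside a package and j counts dependencies crossing its boundary, with a factor of 0 when i is 0. The toy graph and the function names are hypothetical and do not reproduce the slide's example.

```python
from collections import defaultdict

def modularisation_factors(packages, dependencies):
    """Return {package: factor}, assuming factor = i / (i + j/2), and 0 when i == 0."""
    owner = {cls: pkg for pkg, classes in packages.items() for cls in classes}
    intra = defaultdict(int)  # i: dependencies with both ends in the package
    inter = defaultdict(int)  # j: dependencies entering or leaving the package
    for src, dst in dependencies:
        if owner[src] == owner[dst]:
            intra[owner[src]] += 1
        else:
            inter[owner[src]] += 1
            inter[owner[dst]] += 1
    factors = {}
    for pkg in packages:
        i, j = intra[pkg], inter[pkg]
        factors[pkg] = 0.0 if i == 0 else i / (i + j / 2)
    return factors

def mq(packages, dependencies):
    """MQ as the sum of the per-package factors."""
    return sum(modularisation_factors(packages, dependencies).values())

# Hypothetical toy system (not the one drawn on the slide).
packages = {"p1": ["c1", "c2", "c3"], "p2": ["c4", "c5"], "p3": ["c6"]}
dependencies = [("c1", "c2"), ("c2", "c3"), ("c1", "c3"),
                ("c3", "c4"), ("c4", "c5"), ("c5", "c6")]
factors = modularisation_factors(packages, dependencies)
total = mq(packages, dependencies)
```

Because MQ is a sum over packages, its upper limit grows with the number of packages, so values are only comparable between modularisations of the same system.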

SLIDE 9

RQ1: Is there any evidence that open source software systems respect structural measurements of cohesion and coupling?

  • RQ1.1: Purely Random Distribution
  • RQ1.2: k-Random Neighbourhood Search
  • RQ1.3: Systematic 1-Neighbourhood Search

Cohesion    99.99%    0.1001%
MQ          99.99%    0.0866%
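One way to read the systematic 1-neighbourhood search in RQ1.3: evaluate every modularisation that differs from the original by moving a single class to another existing package, then report the share of those neighbours that score better than the original. The following is a minimal sketch under that assumption. Raw cohesion is simplified here to the count of intra-package dependencies, and all names and data are hypothetical.

```python
def cohesion(assignment, dependencies):
    """Simplified raw cohesion: number of dependencies that stay inside a package."""
    return sum(1 for a, b in dependencies if assignment[a] == assignment[b])

def one_neighbourhood(assignment):
    """Yield every modularisation that moves exactly one class to another existing package."""
    packages = sorted(set(assignment.values()))
    for cls, current in assignment.items():
        for target in packages:
            if target != current:
                neighbour = dict(assignment)
                neighbour[cls] = target
                yield neighbour

def share_of_better_neighbours(assignment, dependencies, fitness=cohesion):
    """Fraction of 1-neighbours whose fitness is strictly higher than the original's."""
    baseline = fitness(assignment, dependencies)
    scores = [fitness(n, dependencies) for n in one_neighbourhood(assignment)]
    return sum(1 for s in scores if s > baseline) / len(scores)

# Hypothetical system in which c3 sits in the "wrong" package.
original = {"c1": "p1", "c2": "p1", "c3": "p2", "c4": "p2", "c5": "p2"}
dependencies = [("c1", "c2"), ("c2", "c3"), ("c1", "c3"), ("c4", "c5")]
share = share_of_better_neighbours(original, dependencies)
```

A small share means the original modularisation is close to a local optimum for the chosen metric. Passing a different `fitness` function applies the same procedure to MQ.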

SLIDE 10

RQ2: What is the relationship between raw cohesion and the MQ metric?

[Figure: Bunch search solutions. Series: MQ difference and cohesion difference, plotted per Bunch solution on an axis running from -100% to 400%. Callouts: Pivot 2.0.2, 13 packages, 58; all systems, 493.11%.]

SLIDE 11

RQ2: What is the relationship between raw cohesion and the MQ metric?

[Figure: package-constrained search solutions. Series: MQ difference and cohesion difference, plotted per package-constrained solution on an axis running from 0% to 70%.]

SLIDE 12

SLIDE 13

DisMoJo

Disruption metric based on MoJoFM [3]

[3] Wen, Z., & Tzerpos, V. (2004). An effectiveness measure for software clustering algorithms. In Proceedings. 12th IEEE International Workshop on Program Comprehension, 2004 (pp. 194–203).
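For reference, MoJoFM as defined by Wen and Tzerpos compares a modularisation A against a reference B through mno(A, B), the minimum number of move and join operations needed to turn A into B. The slide only states that DisMoJo is based on MoJoFM. Reading it as the complement of MoJoFM, as in the second line, is an assumption made here.

```latex
\[
\mathrm{MoJoFM}(A, B) = \left(1 - \frac{\mathrm{mno}(A, B)}{\max\bigl(\mathrm{mno}(\forall A, B)\bigr)}\right) \times 100\%
\]
\[
\mathrm{DisMoJo}(A, B) = 100\% - \mathrm{MoJoFM}(A, B) \qquad \text{(assumed form)}
\]
```

Under this reading, a DisMoJo of 0% means the two modularisations are identical, and higher values mean more classes have to be moved or packages joined.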
SLIDE 14

Disruption Analysis Workflow

Releases → Re-modularisation approach (Bunch or Package-constrained, 30 executions) → Improved modularisations → DisMoJo → Disruption values

SLIDE 15

RQ3: What is the disruption caused by search-based approaches for optimising software modularisation?

        Bunch     Package-constrained
Mean    80.39%    57.82%

SLIDE 16

Longitudinal Disruption Analysis

[Figure: two consecutive releases of an example system. Release 1.1: packages p1 to p3, classes c1 to c9. Release 1.2: packages p1 to p4, classes c1 to c12.]

SLIDE 17

Longitudinal Disruption Analysis

        Lower Bound    Upper Bound
Mean    4.32%          30.99%

SLIDE 18

Balancing modularity improvement and disruption

SLIDE 19

RQ5: What is the modularity improvement provided by the multiobjective search for acceptable disruption levels?

                       Bound          MQ         Cohesion
Package-free           Lower Bound    1.66%      0.13%
                       Upper Bound    150.38%    2.38%
Package-constrained    Lower Bound    3.36%      0.72%
                       Upper Bound    59.25%     23.49%
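The bounds can be read as a filter on the trade-off front produced by the multiobjective search: keep the non-dominated solutions, then report the best improvement whose disruption stays within the bound. Below is a minimal sketch of that reading. The (disruption, improvement) pairs are made up for illustration. Only the two bound values, 4.32% and 30.99%, come from the longitudinal disruption analysis.

```python
def pareto_front(solutions):
    """Keep (disruption, improvement) pairs not dominated by any other pair.

    A pair dominates another when it has no more disruption and no less
    improvement, and is strictly better on at least one of the two.
    """
    front = []
    for d, m in solutions:
        dominated = any(
            (d2 <= d and m2 >= m) and (d2 < d or m2 > m)
            for d2, m2 in solutions
        )
        if not dominated:
            front.append((d, m))
    return sorted(front)

def best_within(front, max_disruption):
    """Best improvement among front members with disruption <= max_disruption, else None."""
    feasible = [m for d, m in front if d <= max_disruption]
    return max(feasible) if feasible else None

# Hypothetical (disruption %, MQ improvement %) pairs, not results from the study.
solutions = [(2.0, 1.0), (4.0, 3.0), (10.0, 2.5),
             (25.0, 40.0), (30.0, 38.0), (60.0, 90.0)]
front = pareto_front(solutions)
lower = best_within(front, 4.32)    # bound taken from the slides
upper = best_within(front, 30.99)   # bound taken from the slides
```

The solution with 60% disruption stays on the front but is excluded by both bounds, which is the point of constraining the search to disruption levels developers already tolerate.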

SLIDE 20

Conclusion

  • Software systems respect modularity measurements
  • Search-based approaches for re-modularisation cause large disruption
  • Multiobjective search can be used to find a clear and constant trade-off between modularity improvement and disruption
  • Modularity can be improved within the lower and upper bounds of acceptable disruption, as observed in developers’ own releases

SLIDE 21

Backup – System Selection Criteria

  • At least 10 subsequent official releases
  • No general libraries and APIs
  • Java systems

SLIDE 22

Backup - Software Systems Under Study

SLIDE 23

Backup - Modularisation Metrics

MQ

  • it’s an ordinal metric [4]
  • its value has no meaning
  • it’s not normalised
  • avoids god packages

Cohesion

  • it’s an interval metric [4]
  • easy to understand
  • it’s normalised
  • leads to god packages

inflation effect

[4] Stevens, S. (1946). On the Theory of Scales of Measurement. Science, 103(2684), 677–680.

SLIDE 24

Backup – RQ1 table of results

SLIDE 25

Backup – RQ2 table of results

SLIDE 26

Backup – Classes distribution

RQ2.3: Package-constrained Search Solutions

30.06% cohesion improvement

[Figure: biggest and smallest package in the original implementations and in the package-constrained solutions. Values shown: 34.15%, 0.32%, 1.33%, 19.80%.]

SLIDE 27

Backup – RQ3 table of results

SLIDE 28

Backup - RQ3: Best MQ and best cohesion disruption

                 Bunch     Package-constrained
Mean             80.39%    57.82%
Best MQ          79.67%    55.30%
Best Cohesion    78.77%    54.33%

SLIDE 29

Backup – RQ4 Two-Archive GA parameters

  • Population size: number of classes (N)
  • Single-point crossover (0.8 if N < 100; 1.0 otherwise)
  • Swap mutation (0.004 × log2(N))
  • Tournament selection (size 2)
  • 50N generations
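All of these settings scale with the number of classes N, so they can be derived from a single input. The following is a small sketch of that derivation. It assumes the crossover and mutation figures are probabilities, which the slide does not state explicitly. The `GAParameters` and `ga_parameters` names are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class GAParameters:
    population_size: int
    crossover_probability: float  # assumed to be a probability
    mutation_probability: float   # assumed to be a probability
    tournament_size: int
    generations: int

def ga_parameters(n_classes: int) -> GAParameters:
    """Derive the settings listed on the slide from the number of classes N."""
    if n_classes < 2:
        raise ValueError("expected at least two classes")
    return GAParameters(
        population_size=n_classes,
        crossover_probability=0.8 if n_classes < 100 else 1.0,
        mutation_probability=0.004 * math.log2(n_classes),
        tournament_size=2,
        generations=50 * n_classes,
    )

small = ga_parameters(64)    # below the N < 100 threshold
large = ga_parameters(128)   # at or above the threshold
```

With these rules, the number of fitness evaluations grows roughly with N squared (population N times 50N generations), which matters when planning runs on larger systems.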

SLIDE 30

Backup – RQ5 natural disruption

SLIDE 31

Backup – RQ5 modularity improvement within acceptable disruption
