Riaan Cornelius Using forensic techniques for targeted refactoring - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Riaan Cornelius

Using forensic techniques for targeted refactoring

Crafting Code

slide-2
SLIDE 2

Who am I

> More than a decade of software dev experience > Mobile app developer by day > Purveyor of strange topics by night > I’ve dabbled in AI, computer vision, robotics and even cooking > Please remember to rate my talk: http://www.devconf.co.za/rate

slide-3
SLIDE 3

Why do we refactor?

> As a developer, what is your job?

slide-4
SLIDE 4

Why do we refactor?

slide-5
SLIDE 5

Why do we refactor?

slide-6
SLIDE 6

Why do we refactor?

slide-7
SLIDE 7

Why do we refactor?

slide-8
SLIDE 8

Why do we refactor?

> Maintenance is expensive

slide-9
SLIDE 9

The enemy of change

> Complexity > If our job is to understand code, how do we make that job easier?

slide-10
SLIDE 10

Some (potentially) useful tools

> Static analysis > Complexity metrics > Code reviews > Tests

slide-11
SLIDE 11

Tools I used

> Git (specifically git log) > Code Maat > Python > D3.js (JavaScript library)

slide-12
SLIDE 12

Forget the tools

> It’s not about the tools, but rather the techniques > These tools simplify some parsing, processing or visualisation > You can write your own scripts for any of these functions

slide-13
SLIDE 13

Problems of scale

> In large systems, how do you prioritise improvements?

slide-14
SLIDE 14

The problem with complexity metrics

> Complexity is only a problem if you need to deal with it

slide-15
SLIDE 15

Offender profiling

> You probably know something about offender profiling. > Hollywood loves it:

  • The Silence of the Lambs
  • Numb3rs
  • Criminal Minds
  • NCIS
  • Many more…
slide-16
SLIDE 16

Offender profiling

> There is one serious limitation: it only works in Hollywood

slide-17
SLIDE 17

Geographic profiling

> Based on statistics and psychology. > Same principle as a police officer sticking pins in a map

slide-18
SLIDE 18

Geographic profiling

slide-19
SLIDE 19

Applying geographical profiling to code

> What if a hotspot analysis could narrow down areas of bad code?

slide-20
SLIDE 20

Exploring the geography of code

slide-21
SLIDE 21

Add a spatial component

> Hopefully you all use a VCS. > We need to focus on areas with high developer activity

slide-22
SLIDE 22

Add a spatial component

> git log --pretty=format:'[%h] %an %ad %s' --date=short --numstat > git.log
> maat.bat -l git.log -c git -a revisions > metric_data.csv

slide-23
SLIDE 23

Add a spatial component

slide-24
SLIDE 24

Combine complexity and effort

slide-25
SLIDE 25

Profiling your codebase

> Choose a timespan for your analysis > Get frequency data > Add complexity data > Merge complexity and effort > Visualise this data

slide-26
SLIDE 26

Profiling your codebase

> We’ll look at the Hibernate ORM
> git clone https://github.com/hibernate/hibernate-orm.git

slide-27
SLIDE 27

Profiling your codebase

> Choosing a timeframe > Don’t look at the whole life of the project > Which timeframe you use depends on your development methodology:

  • Between releases
  • Over iterations
  • Around significant events (reorganisation of code or teams)
slide-28
SLIDE 28

Profiling your codebase

> Generate a log:
> git log --pretty=format:'[%h] %an %ad %s' --date=short --numstat --before=2013-09-05 --after=2012-01-01 > hib_evo.log

slide-29
SLIDE 29

Profiling your codebase

> A summary of the changes shows some interesting things:
prompt> maat -l hib_evo.log -c git -a summary

statistic,value
number-of-commits,1346
number-of-entities,10193
number-of-entities-changed,18258
number-of-authors,89
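To make the summary less of a black box, here is a minimal sketch of how such figures could be derived from the log directly. It assumes the log was produced with the `[%h] %an %ad %s` format, `--date=short` and `--numstat` shown earlier, and it assumes one reading of the statistic names (for example that "entities changed" counts every file touch across all commits). The function name `summarise` is hypothetical; this is not Code Maat's implementation.

```python
import re

# Header lines look like "[abc1234] Author Name 2013-01-02 Subject".
HEADER = re.compile(r"^\[([0-9a-f]+)\] (.+?) (\d{4}-\d{2}-\d{2}) (.*)$")


def summarise(log_text):
    """Derive summary counts from a git log written with --numstat."""
    commits = 0
    authors = set()
    entities = set()
    changes = 0
    for line in log_text.splitlines():
        header = HEADER.match(line)
        if header:
            commits += 1
            authors.add(header.group(2))
            continue
        # numstat lines look like "<added>\t<deleted>\t<path>".
        parts = line.split("\t")
        if len(parts) == 3:
            entities.add(parts[2])
            changes += 1
    return {
        "number-of-commits": commits,
        "number-of-entities": len(entities),
        "number-of-entities-changed": changes,  # assumed meaning
        "number-of-authors": len(authors),
    }


if __name__ == "__main__":
    sample = (
        "[abc1234] Alice Smith 2013-01-02 Fix bug\n"
        "3\t1\tsrc/A.java\n"
        "10\t0\tsrc/B.java\n"
        "\n"
        "[def5678] Bob 2013-01-03 Tweak\n"
        "1\t1\tsrc/A.java\n"
    )
    for name, value in summarise(sample).items():
        print(f"{name},{value}")
```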

slide-30
SLIDE 30

Profiling your codebase

> Analyzing change frequencies:
> maat -l hib_evo.log -c git -a revisions > hib_freqs.csv
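As the earlier slide says, you can write your own scripts for any of these functions. The following is a rough sketch of a change-frequency count in plain Python, assuming a log generated with `--numstat` as above. The function name `count_revisions` is made up for illustration and the result is not guaranteed to match Code Maat's output exactly (renames, for instance, are not handled).

```python
from collections import Counter


def count_revisions(log_text):
    """Count how many commits touched each file in a --numstat git log."""
    revisions = Counter()
    for line in log_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # commit header or blank separator line
        added, deleted, path = parts
        # Text files report line counts; binary files report "-".
        if added.isdigit() or added == "-":
            revisions[path] += 1
    # Most frequently changed files first.
    return revisions.most_common()


if __name__ == "__main__":
    sample = (
        "[abc1234] Alice 2013-01-02 Fix bug\n"
        "3\t1\tsrc/A.java\n"
        "10\t0\tsrc/B.java\n"
        "\n"
        "[def5678] Bob 2013-01-03 Tweak\n"
        "1\t1\tsrc/A.java\n"
    )
    for path, count in count_revisions(sample):
        print(f"{path},{count}")
```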

slide-31
SLIDE 31

Profiling your codebase

> Calculate complexity > Complexity by lines of code? > Bad metric, but no worse than others…
> cloc ./ --by-file --csv --quiet --report-file=hib_lines.csv

slide-32
SLIDE 32

Profiling your codebase

> Combine complexity and effort: > python scripts/merge_comp_freqs.py hib_freqs.csv hib_lines.csv

module,revisions,code
build.gradle,79,402
hibernate-core/.../persister/entity/AbstractEntityPersister.java,44,3983
hibernate-core/.../cfg/Configuration.java,40,2673
hibernate-core/.../internal/SessionImpl.java,39,2097
hibernate-core/.../internal/SessionFactoryImpl.java,34,1384
…
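The merge step is essentially a join on the file path. Below is a minimal sketch of that idea, not the actual `merge_comp_freqs.py`. It assumes both inputs have already been read into rows with the hypothetical column names `module`, `revisions` and `code`, and that paths may carry a leading `./` that needs stripping before they match.

```python
def normalise(path):
    """Strip a leading './' so paths from both tools can match (assumption)."""
    return path[2:] if path.startswith("./") else path


def merge_revisions_and_lines(freq_rows, line_rows):
    """Join change frequency and lines of code on the module path.

    freq_rows: iterable of dicts with keys "module" and "revisions".
    line_rows: iterable of dicts with keys "module" and "code".
    Returns (module, revisions, code) tuples, most revised first.
    """
    lines_by_module = {
        normalise(row["module"]): int(row["code"]) for row in line_rows
    }
    merged = []
    for row in freq_rows:
        module = normalise(row["module"])
        # Files deleted since the analysed period have no line count; skip them.
        if module in lines_by_module:
            merged.append((module, int(row["revisions"]), lines_by_module[module]))
    merged.sort(key=lambda item: (-item[1], -item[2]))
    return merged


if __name__ == "__main__":
    freqs = [
        {"module": "a.java", "revisions": "5"},
        {"module": "b.java", "revisions": "9"},
        {"module": "gone.java", "revisions": "3"},
    ]
    lines = [
        {"module": "./a.java", "code": "100"},
        {"module": "./b.java", "code": "40"},
    ]
    print("module,revisions,code")
    for module, revisions, code in merge_revisions_and_lines(freqs, lines):
        print(f"{module},{revisions},{code}")
```

Rows from `csv.DictReader` can be passed in directly once the headers have been mapped to the names assumed here.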

slide-33
SLIDE 33

Profiling your codebase

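> Now we can finally get to the fun part: visualisation > I’m using a sample D3.js circle-packing algorithm > Due to security restrictions in modern browsers, the page must be served from a local web server rather than opened as a file:
> python -m SimpleHTTPServer 8888 (on Python 3: python -m http.server 8888)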
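Circle-packing layouts in D3.js work on hierarchical data, so the flat module list has to become a tree first. The sketch below shows one way to do that under stated assumptions: the field names `name`, `children`, `size` (lines of code) and `revisions` are illustrative choices, and the visualisation used in the talk may expect a different shape.

```python
import json


def build_hierarchy(rows):
    """Turn (module, revisions, code) rows into a nested directory tree."""
    root = {"name": "root", "children": []}
    for module, revisions, code in rows:
        node = root
        parts = module.split("/")
        # Walk or create one directory node per path segment.
        for part in parts[:-1]:
            child = next(
                (c for c in node["children"]
                 if c["name"] == part and "children" in c),
                None,
            )
            if child is None:
                child = {"name": part, "children": []}
                node["children"].append(child)
            node = child
        # Leaf: circle size from lines of code, colour could use revisions.
        node["children"].append(
            {"name": parts[-1], "size": code, "revisions": revisions}
        )
    return root


if __name__ == "__main__":
    rows = [
        ("core/A.java", 44, 3983),
        ("core/B.java", 40, 2673),
        ("build.gradle", 79, 402),
    ]
    print(json.dumps(build_hierarchy(rows), indent=2))
```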

slide-34
SLIDE 34

Profiling your codebase

slide-35
SLIDE 35

Profiling your codebase

slide-36
SLIDE 36

Measuring complexity

> Is there a simple option that is better than lines of code?

slide-37
SLIDE 37

Measuring complexity

slide-38
SLIDE 38

Measuring complexity

> python scripts/complexity_analysis.py hibernate-core/src/main/java/org/hibernate/cfg/Configuration.java

n, total, mean, sd, max
3335, 8072, 2.42, 1.63, 14
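The columns above can be read as statistics over per-line indentation depth. Here is a minimal sketch of such a measure, not the actual `complexity_analysis.py`. It assumes that one logical indentation level is four spaces or one tab, that blank lines are ignored, and that `sd` is a population standard deviation; all three are assumptions.

```python
import statistics


def indentation_levels(source_text, spaces_per_level=4):
    """Return the logical indentation depth of every non-blank line."""
    levels = []
    for line in source_text.splitlines():
        if not line.strip():
            continue  # blank lines carry no indentation information
        expanded = line.replace("\t", " " * spaces_per_level)
        leading = len(expanded) - len(expanded.lstrip(" "))
        levels.append(leading // spaces_per_level)
    return levels


def complexity_stats(source_text):
    """Summarise indentation depth as n, total, mean, sd and max."""
    levels = indentation_levels(source_text)
    if not levels:
        return {"n": 0, "total": 0, "mean": 0.0, "sd": 0.0, "max": 0}
    return {
        "n": len(levels),
        "total": sum(levels),
        "mean": round(statistics.mean(levels), 2),
        "sd": round(statistics.pstdev(levels), 2),
        "max": max(levels),
    }


if __name__ == "__main__":
    sample = (
        "class A {\n"
        "    void f() {\n"
        "        if (x) {\n"
        "\t\t\ty();\n"
        "        }\n"
        "    }\n"
        "}\n"
    )
    stats = complexity_stats(sample)
    print(", ".join(stats))
    print(", ".join(str(value) for value in stats.values()))
```

The appeal of this measure is that it is language-neutral and needs no parser: deep nesting tends to show up as deep indentation.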

slide-39
SLIDE 39

Measuring complexity

> You’ve already seen how to analyze a single revision. Now we want to:

  • 1. Take a range of revisions for a specific module.
  • 2. Calculate the indentation complexity of the module as it occurred in each revision.
  • 3. Output the results revision by revision for further analysis.
slide-40
SLIDE 40

Measuring complexity

> python scripts/git_complexity_trend.py --start ccc087b --end 46c962e --file hibernate-core/src/main/java/org/hibernate/cfg/Configuration.java

rev, n, total, mean, sd
e75b8a7, 3080, 7610, 2.47, 1.76
23a6280, 3092, 7649, 2.47, 1.76
8991100, 3100, 7658, 2.47, 1.76
8373871, 3101, 7658, 2.47, 1.76
…
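A trend like this can be produced by listing the revisions that touched the file and measuring each historical version. The sketch below is one possible approach, not the actual `git_complexity_trend.py`. It uses `git log` and `git show <rev>:<path>`, assumes four spaces or one tab per indentation level, and all function names are hypothetical.

```python
import subprocess


def git_output(args, repo="."):
    """Run a git command in the given repository and return its stdout."""
    result = subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout


def revisions_touching(path, start, end, repo="."):
    """List abbreviated hashes in start..end that changed path, oldest first."""
    out = git_output(
        ["log", "--reverse", "--format=%h", f"{start}..{end}", "--", path], repo
    )
    return out.split()


def indentation_summary(source_text, spaces_per_level=4):
    """Return (n, total, mean) of indentation depth over non-blank lines."""
    levels = []
    for line in source_text.splitlines():
        if line.strip():
            expanded = line.replace("\t", " " * spaces_per_level)
            leading = len(expanded) - len(expanded.lstrip(" "))
            levels.append(leading // spaces_per_level)
    n = len(levels)
    total = sum(levels)
    return n, total, round(total / n, 2) if n else 0.0


def complexity_trend(revs, read_file):
    """Measure one file at each revision; read_file maps rev -> file content."""
    return [(rev, *indentation_summary(read_file(rev))) for rev in revs]


def trend_for_file(path, start, end, repo="."):
    """Convenience wrapper that reads each historical version from git."""
    revs = revisions_touching(path, start, end, repo)
    return complexity_trend(
        revs, lambda rev: git_output(["show", f"{rev}:{path}"], repo)
    )
```

Passing the file reader in as a function keeps the measurement separate from git, so the same code can be tried on in-memory samples before pointing `trend_for_file` at a real repository.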

slide-41
SLIDE 41

Visualising complexity trends

slide-42
SLIDE 42

Visualising complexity trends

slide-43
SLIDE 43

Visualising complexity trends

slide-44
SLIDE 44

Going further

slide-45
SLIDE 45

Resources

> http://riaan.me/dc16 > Twitter: @riaancornelius > Please remember to rate my talk: http://www.devconf.co.za/rate

slide-46
SLIDE 46

/* THANK YOU*/

Riaan Cornelius Entelect Software Riaan.Cornelius@Entelect.co.za 084 755 1866

http://www.devconf.co.za/