Best practices in scientific programming: Software Carpentry, Part I


Best practices in scientific programming: Software Carpentry, Part I. Valentin Hänel (valentin.haenel@bccn-berlin.de), Technische Universität Berlin, Bernstein Center for Computational Neuroscience Berlin. Python Winterschool Warsaw, Feb 2010.


SLIDE 1

Best practices in scientific programming Software Carpentry, Part I

Valentin Hänel
valentin.haenel@bccn-berlin.de

Technische Universität Berlin
Bernstein Center for Computational Neuroscience Berlin

Python Winterschool Warsaw, Feb 2010
Slides based on material by Pietro Berkes

SLIDE 2

Today's Schedule

Morning

Valentin
- Agile Methods
- Unit Testing
- Version Control

Rike
- Unit Testing Examples
- Subversion
- Debugging
- Profiling

SLIDE 3

Today's Schedule

Afternoon

Niko
- General Design Principles
- Object Oriented Programming in Python
- Object Oriented Design Principles
- Design Patterns

SLIDE 4

Motivation

- Many scientists write code regularly, but few have been formally trained to do so
- Best practices can make a lot of difference
- Development methodologies are established in the software engineering industry
- We can learn a lot from them to improve our coding skills

SLIDE 5

Scenarios

- Lone student/scientist
- Small team of scientists, working on a common library
- Speed of development more important than execution speed
- Often need to try out different ideas quickly:
  - rapid prototyping of a proposed algorithm
  - re-use/modify existing code

SLIDE 6

Outline

1. Introduction
2. Agile methods
3. Unit Testing
4. Version Control
5. Additional techniques

SLIDE 7

What is a Development Methodology?

Consists of:
- A philosophy that governs the style and approach towards development
- A set of tools and models to support the particular approach

Helps answer the following questions:
- How far ahead should I plan?
- What should I prioritize?
- When do I write tests and documentation?

SLIDE 8

The Waterfall Model, Royce 1970

Requirements → Design → Implementation → Testing → Maintenance

SLIDE 9

Agile Methods

- Agile methods emerged during the late 1990s
- Generic name for a set of more specific paradigms
- Set of best practices
- Particularly suited for:
  - small teams (fewer than 10 people)
  - unpredictable or rapidly changing requirements

SLIDE 10

Prominent Features of Agile methods

- Minimal planning
- Small development iterations
- Rely heavily on testing
- Promote collaboration and teamwork
- Very adaptive

SLIDE 11

The Basic Agile Workflow

1. Define Test
2. Write Simplest Version of Code
3. Ensure Test Passes
4. Write Better Version of Code

SLIDE 12

Example

Define Test

The function my_sum should return the sum of the elements of a list.

SLIDE 13

Example

Write Simplest Version of Code

    def my_sum(my_list):
        """Compute sum of list elements."""
        answer = 0
        for item in my_list:
            answer = answer + item
        return answer

SLIDE 14

Example

Ensure Test Passes

    >>> my_sum([1, 2, 3])
    6

SLIDE 15

Example

Write Better Version of Code

    def my_sum(my_list):
        """Compute sum of list elements."""
        return sum(my_list)

SLIDE 16

Agile methods

SLIDE 17

What's Next

- Look at tools to support the agile workflow
- Better testing with Unit Tests
- Keeping track of changes and collaborating with Version Control
- Additional techniques

SLIDE 18

Outline

1. Introduction
2. Agile methods
3. Unit Testing
4. Version Control
5. Additional techniques

SLIDE 19

Unit Tests

Definition of a Unit

- The smallest testable piece of code
- Example: my_sum

- We wish to automate testing of our units
- In Python we use the unittest package

SLIDE 20

Example

    import unittest


    def my_sum(my_list):
        """Compute sum of list elements."""
        return sum(my_list)


    class Test(unittest.TestCase):
        def test_my_sum(self):
            self.assertEqual(my_sum([1, 2, 3]), 6)


    if __name__ == "__main__":
        unittest.main()

SLIDE 21

Running the Example

    % python example-test2.py
    .
    ----------------------------------------------------------------------
    Ran 1 test in 0.000s

    OK

SLIDE 22

The Basic Agile Workflow - Reloaded

1. Define Unit Test
2. Write Simplest Version of Unit
3. Ensure Unit Test Passes
4. Write Better Version of Unit

SLIDE 23

Goals

- check code works
- check design works
- catch regression

SLIDE 24

Benefits

- Easier to test the whole, if the units work
- Can modify parts, and be sure the rest still works
- Provide examples of how to use code

SLIDE 25

How to Test?

- Test with simple cases, using hard-coded solutions
    my_sum([1, 2, 3]) == 6
- Test special or boundary cases
    my_sum([]) == 0
- Test that meaningful error messages are raised upon corrupt input
    my_sum(['1', 'a']) → TypeError: unsupported operand type(s) for +: 'int' and 'str'
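The three kinds of test listed here could be written with unittest roughly as follows. This is a minimal sketch that assumes my_sum is the sum-based version from the earlier slides; the class and method names are illustrative and not taken from the original deck.

```python
import unittest


def my_sum(my_list):
    """Compute sum of list elements."""
    return sum(my_list)


class TestMySum(unittest.TestCase):
    def test_simple_case(self):
        # Simple case with a hard-coded solution
        self.assertEqual(my_sum([1, 2, 3]), 6)

    def test_empty_list(self):
        # Boundary case: the sum of an empty list is 0
        self.assertEqual(my_sum([]), 0)

    def test_corrupt_input(self):
        # Corrupt input: adding a string to the integer start value raises TypeError
        with self.assertRaises(TypeError):
            my_sum(["1", "a"])
```

Saved in a file, these tests can be run with `python -m unittest <filename>`.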

SLIDE 26

What Makes a Good Test?

- independent (of each other, and of user input)
- repeatable (i.e. deterministic)
- self-contained
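One common way to keep tests independent of each other is to build the test data afresh before every test. The sketch below uses unittest's setUp hook for this; the class name and the data are made up for illustration.

```python
import unittest


class TestIndependent(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so each test gets its own fresh list
        self.data = [3, 1, 2]

    def test_sort(self):
        self.data.sort()
        self.assertEqual(self.data, [1, 2, 3])

    def test_append(self):
        # Does not depend on whether test_sort ran first
        self.data.append(4)
        self.assertEqual(self.data, [3, 1, 2, 4])
```

Because neither test reads user input or shared state, both give the same result in any order.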

SLIDE 27

Stuff That's Harder to Test

- Probabilistic code
  - Use toy examples as validation
  - Consider fixing the seed for your pseudo-random number generator
- Hardware
  - Use mock-up software that behaves like the hardware should
- Plots (any creative ideas welcome)
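As a rough illustration of the first two points, the sketch below fixes the seed of a pseudo-random number generator and replaces a piece of hardware with a mock object from the standard library's unittest.mock. The functions noisy_mean and read_temperature, and the sensor's read_raw method, are hypothetical examples invented for this sketch.

```python
import random
import unittest
from unittest import mock


def noisy_mean(n_samples, rng):
    """Estimate the mean of a standard normal from n_samples draws (hypothetical)."""
    return sum(rng.gauss(0.0, 1.0) for _ in range(n_samples)) / n_samples


def read_temperature(sensor):
    """Convert a raw reading from a (hypothetical) sensor into degrees Celsius."""
    return sensor.read_raw() / 10.0


class TestHardToTest(unittest.TestCase):
    def test_seeded_run_is_repeatable(self):
        # Two generators with the same seed produce the same sequence of draws
        first = noisy_mean(100, random.Random(42))
        second = noisy_mean(100, random.Random(42))
        self.assertEqual(first, second)

    def test_toy_example(self):
        # Toy validation: with many samples the estimate is close to the true mean 0
        estimate = noisy_mean(10000, random.Random(0))
        self.assertAlmostEqual(estimate, 0.0, delta=0.1)

    def test_mock_hardware(self):
        # The mock stands in for the real device and returns a fixed raw value
        sensor = mock.Mock()
        sensor.read_raw.return_value = 215
        self.assertEqual(read_temperature(sensor), 21.5)
        sensor.read_raw.assert_called_once_with()
```

Passing the generator in as an argument, rather than using the global one, is a design choice that makes the seed easy to control from the test.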

SLIDE 28

Test Suites

- All unit tests are collected into a test suite
- Execute the entire test suite with a single command
- Can be used to provide reports and statistics
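A small sketch of how a suite might be assembled and run with unittest follows. The two test classes are placeholders; build_suite and run_suite are illustrative helper names, not part of unittest.

```python
import unittest


class TestAddition(unittest.TestCase):
    def test_add(self):
        self.assertEqual(1 + 2, 3)


class TestMultiplication(unittest.TestCase):
    def test_multiply(self):
        self.assertEqual(2 * 3, 6)


def build_suite():
    """Collect the tests from both (illustrative) test cases into one suite."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(TestAddition))
    suite.addTests(loader.loadTestsFromTestCase(TestMultiplication))
    return suite


def run_suite():
    """Run the whole suite and return the result object with its statistics."""
    runner = unittest.TextTestRunner(verbosity=2)
    return runner.run(build_suite())
```

The returned result object carries simple statistics such as `testsRun`, `failures` and `errors`. In practice the suite is often not built by hand: `python -m unittest discover` collects and runs the tests it finds under the current directory.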

SLIDE 29

Refactoring

- This is what it's called when you write a better version of your code
- Re-organisation of your code without changing its function:
  - remove duplicates by creating functions and methods
  - increase modularity by breaking large code blocks into units
  - rename and restructure code to increase readability and reveal intention
- Always refactor one step at a time, and use the unit tests to check that the code still works
- Learn how to use automatic refactoring tools to make your life easier
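To make "remove duplicates by creating functions" concrete, here is a hedged before/after sketch. The function names and the temperature/pressure data are invented for illustration; the point is only that both versions return the same result.

```python
def report_before(temperatures, pressures):
    """Original version: the averaging logic is written out twice."""
    t_total = 0
    for t in temperatures:
        t_total = t_total + t
    t_mean = t_total / len(temperatures)
    p_total = 0
    for p in pressures:
        p_total = p_total + p
    p_mean = p_total / len(pressures)
    return t_mean, p_mean


def mean(values):
    """Return the arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def report_after(temperatures, pressures):
    """Refactored version: same result, duplication moved into mean()."""
    return mean(temperatures), mean(pressures)
```

A unit test that compares the two versions on the same input is one way to check that the refactoring step did not change behaviour.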

SLIDE 30

Dealing with Bugs

1. Isolate the bug (using a debugger)
2. Write a unit test to expose the bug
3. Fix the code, and ensure the test passes
4. Use the test to catch the bug should it reappear

Debugger

A program that runs your code one step at a time and lets you inspect its current state.
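The steps for dealing with a bug can be sketched as a regression test. The function clip and its bug are hypothetical: assume an earlier version forgot to apply the upper bound.

```python
import unittest


def clip(value, low, high):
    """Limit value to the closed interval [low, high]."""
    # The hypothetical buggy version was: return max(value, low)
    # which never applied the upper bound.
    return min(max(value, low), high)


class TestClipRegression(unittest.TestCase):
    def test_upper_bound_is_applied(self):
        # Written to expose the bug: fails on the buggy version, passes after the fix
        self.assertEqual(clip(15, 0, 10), 10)

    def test_lower_bound_still_works(self):
        # Guards against breaking the part that already worked
        self.assertEqual(clip(-5, 0, 10), 0)
```

The test stays in the suite after the fix, so the bug is caught if it reappears. For the isolation step, Python's built-in debugger can be started with `python -m pdb script.py`.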

SLIDE 31

Dealing with Bugs

SLIDE 32

Introducing New Features

- Split feature into units
- Use the agile workflow
- Tests drive the development
- Keep the iterations small

SLIDE 33

Some Last Thoughts

- Tests increase the confidence that your code works correctly, not only for yourself but also for your reviewers
- Tests are the only way to trust your code
- It might take you a while to get used to the idea, but it will pay off quite rapidly
- Questions?

SLIDE 34

Outline

1. Introduction
2. Agile methods
3. Unit Testing
4. Version Control
5. Additional techniques

SLIDE 35

What is Version Control?

Problem 1

"Help! My code worked yesterday, but I can't recall what I changed!"

Problem 2

"We would like to work together, but we don't know how!"

- Version control is a method to track changes in source code
- Concurrent editing is possible via merging

SLIDE 36

Features

- Revert to previous versions
- Document developer effort
  - Who changed what, when and why?
- Easy collaboration across the globe

SLIDE 37

Where Are the Versions Stored?

[Figure: a central Repository accessed by the developers Zaza, Yarik and Xenia]

- The repository is located on a server
- Developers must connect to this server

SLIDE 38

Contents of the Repository

Version 22, Version 23, Version 24

Version: 23
Author: Valentin
Date: 07.02.2010
Message: Improve my_sum
Changes: [...]

SLIDE 39

Basic Version Control Workflow

SLIDE 40

What Will We Use?

- Many different systems available
- We will use the de-facto standard:

SLIDE 41

Some Last Thoughts

- Use version control for anything that's text
  - Code
  - Thesis
  - Letters
- We will be using centralised version control; note that decentralised version control also exists
- Again, it might take a while to get used to the idea, but it will pay off rapidly
- Questions?

SLIDE 42

Outline

1. Introduction
2. Agile methods
3. Unit Testing
4. Version Control
5. Additional techniques

SLIDE 43

Pair Programming

- Two developers, one computer
- Two roles: driver and navigator
- Driver sits at keyboard
- Navigator observes and instructs
- Switch roles every so often

SLIDE 44

Optimization for Speed

- Readable code is usually better than fast code
- Only optimize if it's absolutely necessary
- Only optimize your bottlenecks
- ...and identify these using a profiler, for example cProfile

Profiler

A tool to measure and provide statistics on the execution time of code.
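A minimal sketch of profiling a single call with the standard library's cProfile and pstats modules follows. The function slow_sum_of_squares and the helper profile_call are illustrative names made up for this example.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop, used here only as something to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under the profiler; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and print only the top entries
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()
```

Calling `profile_call(slow_sum_of_squares, 100000)` returns the computed value together with a report listing where the time was spent. A whole script can also be profiled from the command line with `python -m cProfile script.py`.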

SLIDE 45

Prototyping

- If you are unsure how to implement something, write a prototype
- Hack together a proof of concept quickly
- No tests, no documentation
- Use this to explore the feasibility of your idea
- When you are ready, scrap the prototype and start with the unit tests

SLIDE 46

Coding Style

- Give your variables meaningful names
- Adhere to coding conventions OR use a consistent style
- Use automated tools to ensure adherence: pylint

SLIDE 47

Documentation

- Minimum requirement: at least a docstring
- For a library, document arguments and return objects
- Use tools to automatically generate a website from code: pydoc
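As an illustration, the my_sum function from the earlier slides might be documented like this. The docstring layout is one possible convention chosen for this sketch, not one prescribed by the slides.

```python
def my_sum(my_list):
    """Compute the sum of list elements.

    Arguments:
        my_list: a list (or other iterable) of numbers

    Returns:
        The sum of the elements; 0 for an empty list.
    """
    return sum(my_list)
```

With the function saved in a module, `python -m pydoc <module_name>` displays this documentation, and `python -m pydoc -w <module_name>` writes it to an HTML file.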

SLIDE 48

Results

- Every scientific result (especially if important) should be independently reproduced at least internally before publication. (German Research Council 1999)
- Increasing pressure to make the source used in publications available
- With unit tested code you need not be embarrassed to publish your code
- Using version control allows you to share and collaborate easily

SLIDE 49

The Last Slide

Open source tools used to make this presentation:

- wiki2beamer
- LaTeX beamer
- dia

Questions?
