Open-source without headaches, Edwin Dalmaijer (@esdalmaijer)

SLIDE 1

Open-source without headaches

Edwin Dalmaijer

20 November 2018

@esdalmaijer

SLIDE 3

Wait, isn’t open source a Good Thing?

  • To science, open-source is unequivocally good
  • Tools are free and open to public scrutiny

PyGaze, PsychToolbox, EEGLAB, SPM

SLIDE 6

So what about those headaches?

  • To a scientist, open-source is a distraction
  • Publishing open code requires additional time and effort
  • Open code is not rewarded in systematic ways
  • More open code => fewer papers => lower grant chances
  • But what if you publish a paper on your code?
  • Unlike a paper, software requires continued effort
  • Unlike authors, people join and leave development teams

Psychophysics Toolbox (Matlab)
  • Brainard (1997): 11578 citations
  • Kleiner et al. (2007): 1954 citations

PsychoPy (Python)
  • Peirce (2007, 2009): 2471 citations
  • Peirce et al.: under review!

SLIDE 7

So what about those headaches?

  • To a scientist, open-source is a distraction
  • Publishing open code requires additional time and effort
  • Open code is not rewarded in systematic ways
  • More open code => fewer papers => lower grant chances
  • But what if you publish a paper on your code?
  • Unlike a paper, software requires continued effort
  • Unlike authors, people join and leave development teams
  • But doesn’t your toolbox get you exposure?
  • Important for early career, but doesn’t get you fellowships
  • How many PIs are ‘methods people’?
SLIDE 8

Does it really take up that much time?

“What kind of data quality would be achievable with my webcam? (See attached image of my face.)”

“When I try to run your Python script in OpenSesame / Unity / [other non-Python tool], it doesn’t work!”

“Your code didn’t work, what should I do?”

“Hi, I need help with this other software you didn’t develop!”

SLIDE 9

Does it really take up that much time?

  • Continuous work on development
  • Bug fixes, new features, dependencies change
  • Continuous work on support
  • If people use your tools, they’ll ask questions
  • Communities are hard to build, and require critical mass that most science projects just don’t have

SLIDE 10

Is supporting open developers important?

Kelle Cruz, AstroPy May 2018

  • Three omnipresent packages
  • About 90 million downloads
  • Estimated cost over $21 million
  • Just 15 active maintainers!
SLIDE 13

We need to reward software contributions

  • Science relies on crucial open software
  • Without these, most of us couldn’t do our jobs
  • The current system punishes developers
  • Matthew effect: less time for papers => fewer grants
  • Low pay, even lower job security
  • We need to adjust academic reward structures
  • Citations to associated papers are not enough
  • More stable positions for open-source developers?
  • Include software overhead in grants?
SLIDE 14

Post-soapbox usefulness

SLIDE 15

Two types of code among researchers

  • Script: analysis pipeline
  • Usually written in one long file
  • Pretty specific to one project
  • Usually not particularly useful to other people
  • For example: analysis_final2-October 2018.m
  • Libraries: set of more general functions
  • Importable to scripts from a central place
  • Combine functions for particular purposes
  • Tend to be useful to other people
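
A minimal Python sketch of the difference, assuming a hypothetical library file called custom_functions.py. The function name, the cutoff parameter, and the example numbers are made up for illustration and do not come from the talk. The general function can be imported by any script, while the part under the `__main__` check only runs when the file itself is executed.

```python
# custom_functions.py (hypothetical library file)

def mean_reaction_time(reaction_times, max_rt=2000.0):
    """Returns the mean of all reaction times at or below max_rt.

    reaction_times - list of floats, in milliseconds
    max_rt - float, slowest response that still counts (assumed cutoff)
    """
    # Keep only the responses that were fast enough to count.
    valid = [rt for rt in reaction_times if rt <= max_rt]
    # Avoid dividing by zero when no response survives the cutoff.
    if not valid:
        return None
    return sum(valid) / len(valid)


if __name__ == "__main__":
    # This part only runs when the file is executed as a script,
    # not when another script imports the function above.
    example = [450.0, 520.0, 3100.0, 610.0]
    print(mean_reaction_time(example))
```

An analysis script in the same folder could then start with `from custom_functions import mean_reaction_time` instead of copying the function.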
SLIDE 16

What do you hate in other people’s code?

  • No README
  • No docstrings
  • Unhelpful commenting
  • Unclear variable names
  • All files reference each other
SLIDE 17

What is a good open-source project?

  • Clearly documented
  • README, function descriptions, and EXCESSIVE comments
  • Sensible structure
  • File structure and folders neatly organised
  • Sensible file names
  • Easy to find and to download
  • For example through GitHub, GitLab, BitBucket, or OSF
  • Not dependent on hidden code
  • Sensible dependencies; don’t use obscure homebrew
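
As a rough illustration of what "clearly documented" could look like in practice, the sketch below shows the same computation twice: once without a docstring and with unclear names, and once documented. Both functions and their names are hypothetical examples, not code from the talk.

```python
# Hard to read: no docstring, unclear names, no comments.
def f(a, b):
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


# Easier to read: same computation, documented and clearly named.
def proportion_correct(responses, correct_answers):
    """Returns the proportion of trials with a correct response.

    responses - list of responses given by the participant
    correct_answers - list of expected responses, in the same order
    """
    # Count the trials on which the response matched the answer.
    n_correct = 0
    for response, answer in zip(responses, correct_answers):
        if response == answer:
            n_correct += 1
    # Divide by the total number of trials.
    return n_correct / len(responses)
```

Both versions return the same number; only the second tells a stranger what that number means.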
SLIDE 18

Start with a sensible folder structure

  • 2018 Super Amazing Study
      • analysis
          • data
              • pp01.tsv
              • pp01.cnt
              • ...
          • analysis_script_v3.py
          • eeg_functions.py
          • motion_tracking.py
      • experiment
          • constants.py
          • experiment_v4.py
          • custom_functions.py
      • literature
      • writing
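
The folder layout above could be set up by hand, or with a few lines of Python. This is only a sketch: the function name is hypothetical, it assumes that the data folder sits inside the analysis folder, and it runs in a temporary directory so that nothing real is touched.

```python
import tempfile
from pathlib import Path


def create_project_folders(root):
    """Creates the example project folders under root.

    root - path of the project folder (created if it does not exist)
    """
    root = Path(root)
    # Sub-folders from the slide; data is assumed to live inside analysis.
    for sub in ["analysis/data", "experiment", "literature", "writing"]:
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root


# Try it out in a temporary directory, so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    project = create_project_folders(Path(tmp) / "2018 Super Amazing Study")
    print(sorted(p.name for p in project.iterdir()))
```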
SLIDE 22

Add a README to every project

  • 2018 Super Amazing Study
      • README.md
      • analysis
          • data
              • pp01.tsv
              • pp01.cnt
              • ...
          • analysis_script_v3.py
          • eeg_functions.py
          • motion_tracking.py
      • experiment
          • constants.py
          • experiment_v4.py
          • custom_functions.py
SLIDE 23

Creating a new repository on GitHub

SLIDE 27

Open folder in terminal / command prompt

  • 2018 Super Amazing Study
      • README.md
      • analysis
          • data
              • pp01.tsv
              • pp01.cnt
              • ...
          • analysis_script_v3.py
          • eeg_functions.py
          • motion_tracking.py
      • experiment
          • constants.py
          • experiment_v4.py
          • custom_functions.py

cd "/home/documents/2018 Super Amazing Study"

SLIDE 28

Initialise a Git repository

git init
git add .
git commit -m "first commit"
git remote add origin https://github.com/esdalmaijer/2018_Super_Amazing_Study.git
git push origin master

SLIDE 29

Add all current files to the repository

git init
git add .
git commit -m "first commit"
git remote add origin https://github.com/esdalmaijer/2018_Super_Amazing_Study.git
git push origin master

SLIDE 30

Schedule files to be uploaded

git init
git add .
git commit -m "first commit"
git remote add origin https://github.com/esdalmaijer/2018_Super_Amazing_Study.git
git push origin master

SLIDE 31

Connect the GitHub repository

git init
git add .
git commit -m "first commit"
git remote add origin https://github.com/esdalmaijer/2018_Super_Amazing_Study.git
git push origin master

SLIDE 32

Upload committed files to GitHub repo!

git init
git add .
git commit -m "first commit"
git remote add origin https://github.com/esdalmaijer/2018_Super_Amazing_Study.git
git push origin master
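
For those who would rather script these steps, here is a hedged Python sketch that collects the five Git commands from the slides as argument lists. The two function names are hypothetical. The example only prints the commands; `run_commands` would actually run them and assumes that Git is installed and that the GitHub repository already exists.

```python
import shlex
import subprocess


def first_upload_commands(remote_url, message="first commit"):
    """Returns the Git commands from the slides as argument lists.

    remote_url - address of the (already created) GitHub repository
    message - commit message for the first commit
    """
    return [
        ["git", "init"],                                 # initialise a repository
        ["git", "add", "."],                             # add all current files
        ["git", "commit", "-m", message],                # schedule files for upload
        ["git", "remote", "add", "origin", remote_url],  # connect the GitHub repo
        ["git", "push", "origin", "master"],             # upload committed files
    ]


def run_commands(commands, folder):
    """Runs each command in folder, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=folder, check=True)


# Only print the commands here; call run_commands to actually run them.
url = "https://github.com/esdalmaijer/2018_Super_Amazing_Study.git"
for command in first_upload_commands(url):
    print(shlex.join(command))
```

If the repository's default branch is called main rather than master, the last command would need that name instead.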

SLIDE 33

Edit, add, commit, push; repeat!

  • 2018 Super Amazing Study
      • README.md
      • analysis
          • data
              • pp01.tsv
              • pp01.cnt
              • ...
          • analysis_script_v3.py
          • eeg_functions.py
          • motion_tracking.py
      • experiment
          • constants.py
          • experiment_v4.py
          • custom_functions.py

Change something here...

git add .
git commit -m "description"
git push origin master

SLIDE 34

Edit, add, commit, push; repeat!

  • 2018 Super Amazing Study
      • README.md
      • analysis
          • data
              • pp01.tsv
              • pp01.cnt
              • ...
          • analysis_script_v3.py
          • eeg_functions.py
          • motion_tracking.py
      • experiment
          • constants.py
          • experiment_v4.py
          • custom_functions.py

git add .
git commit -m "description"
git push origin master

Then run the magic words!

SLIDE 35

GitHub Desktop has a GUI instead

  • Some people don’t like the command line
  • Everyone has their preferences, don’t be embarrassed
  • GitHub Desktop is a graphical alternative
  • Available on Windows and on OS X
SLIDE 38

Principles of Object-Oriented Programming

Class (blueprint)
Instance (realised object)
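
A small Python sketch of the blueprint idea, using a hypothetical `Participant` class that is not part of the talk. The class describes what every participant has; each instance is one realised participant with its own values.

```python
class Participant:
    """Blueprint: what every participant has and can do."""

    def __init__(self, number, age):
        # Each instance stores its own values.
        self.number = number
        self.age = age

    def label(self):
        """Returns a file-friendly label, such as 'pp01'."""
        return "pp{:02d}".format(self.number)


# Two instances (realised objects) made from the same blueprint.
first = Participant(number=1, age=23)
second = Participant(number=2, age=31)
print(first.label(), second.label())
```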

SLIDE 41

Principles of Object-Oriented Programming

  • Hide specific implementation in functions
  • Classes and functions do their own thing
  • Internal variables don’t need to be exposed
  • Functions return required output
  • Compartmentalise where possible
  • General functions can be reused more easily!
  • Stuff your functions in a library
  • Import it when you need it; leave it alone when you don’t
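
One way these principles might look in Python, as a sketch rather than a recipe. The `RunningMean` class is a hypothetical example: its internal variables stay inside the class, and the outside world only calls `add` and `mean`.

```python
class RunningMean:
    """Keeps track of the mean of all numbers added so far."""

    def __init__(self):
        # Internal variables; the leading underscore marks them as
        # not intended for use outside the class.
        self._total = 0.0
        self._count = 0

    def add(self, number):
        """Adds one number to the running mean."""
        self._total += number
        self._count += 1

    def mean(self):
        """Returns the mean so far, or None if nothing was added."""
        if self._count == 0:
            return None
        return self._total / self._count


# Code that uses the class never touches the internal variables.
tracker = RunningMean()
tracker.add(2.0)
tracker.add(4.0)
print(tracker.mean())
```

Because callers only rely on `add` and `mean`, the bookkeeping inside the class can change later without breaking any script that imports it.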
SLIDE 42

Class definition example

class Car:
    def __init__(self, colour, engine):
        """Initialises the car.

        colour - tuple with 8-bit ints indicating the colour
        engine - instance of the Engine class
        """
        # Define the number of wheels
        self.n_wheels = 4
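
The slide's `Car` expects an instance of an `Engine` class that the slides do not show. The sketch below fills that gap with a hypothetical `Engine` (its `horsepower` attribute is invented for illustration) and also stores the passed values on the car, which the slide's version leaves out.

```python
class Engine:
    """Hypothetical engine; the slides do not show this class."""

    def __init__(self, horsepower):
        self.horsepower = horsepower


class Car:
    def __init__(self, colour, engine):
        """Initialises the car.

        colour - tuple with 8-bit ints indicating the colour
        engine - instance of the Engine class
        """
        # Remember what was passed in (an addition to the slide's code).
        self.colour = colour
        self.engine = engine
        # Define the number of wheels
        self.n_wheels = 4


# Create an Engine instance first, then pass it to the Car.
red_car = Car(colour=(255, 0, 0), engine=Engine(horsepower=90))
print(red_car.n_wheels, red_car.engine.horsepower)
```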

SLIDE 43

Function definition example

# Note: this name shadows Python's built-in sum().
def sum(numbers):
    """Computes the sum of passed numbers

    numbers - list of floats
    """
    # Start at 0.
    s = 0.0
    # Loop through the numbers
    for num in numbers:
        # Add the current number to total
        s += num
    # Return the sum
    return s

SLIDE 44

Best open-source practices

  • Start straight away!
  • Don’t wait until publication to get organised
  • Sensible structure of files and code
  • Structure folders in an organised way
  • Compartmentalise code where possible
  • Re-using code? NO COPY-PASTING! Write a function!
  • Sharing is easy via GitHub
  • Or others: GitLab, BitBucket, Open Science Framework, etc
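
To make the "no copy-pasting" advice concrete, here is a sketch in which one function is applied to every dataset instead of pasting the same lines once per dataset. The `zscore` function, the condition names, and the numbers are all hypothetical.

```python
import statistics


def zscore(values):
    """Returns values rescaled to a mean of 0 and a standard deviation of 1.

    values - list of floats with at least two different numbers
    """
    m = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - m) / sd for v in values]


# One function call per dataset instead of one pasted copy per dataset.
datasets = {
    "congruent": [410.0, 430.0, 450.0],
    "incongruent": [480.0, 520.0, 560.0],
}
standardised = {name: zscore(values) for name, values in datasets.items()}
```

If the computation ever needs to change, it changes in one place rather than in every pasted copy.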
SLIDE 45

Useful resources

@esdalmaijer