MPAGS AS1 An introduction to scientific computing with - - PowerPoint PPT Presentation

mpags as1 an introduction to scientific computing with
SMART_READER_LITE
LIVE PREVIEW

MPAGS AS1 An introduction to scientific computing with - - PowerPoint PPT Presentation

MPAGS AS1 An introduction to scientific computing with http://stevenbamford.com/python_mpags_2018/ Steven Bamford Course prerequisites To make the most of this course, you should have: Some programming experience (in any language)


slide-1
SLIDE 1

Steven Bamford MPAGS AS1 An introduction to scientific computing with

http://stevenbamford.com/python_mpags_2018/

slide-2
SLIDE 2

Course prerequisites

  • To make the most of this course, you should have:
  • Some programming experience (in any language)
  • Access to a computer with Python installed
  • Preferably a laptop so you can try things out during the sessions
  • You will need (at least) numpy, matplotlib and scipy modules installed
  • See your sys. admin. and the course webpage for installation help
  • Anaconda recommended
  • Ideally you should also have:
  • Some current or upcoming need of a scripting language
  • A piece of real or toy analysis on which you can try out using Python
slide-3
SLIDE 3

Course aims

  • To give you…
  • experience of using a modern scripting language (specifically Python)
  • practical advice about scientific programming
  • knowledge of the main scientific modules for Python
  • the ability to do basic data analysis tasks in Python

(e.g. data manipulation, plotting, …)

  • knowledge of some specific tools for scientific computing

(e.g. signal processing, optimisation, …)

  • an overview of Python's full capabilities
  • Not to…
  • teach programming in general
  • cover every aspect of Python
slide-4
SLIDE 4

Course structure

  • Five sessions of ≤ 2 hours each:
  • Thursday 3:00 – 5:00 pm
  • ~1.5 hour lecture
  • Optionally some time working on exercises with support
  • Lecture notes will go online before each session
  • Caveats, pros and cons
  • Questions:
  • During lectures/exercises
  • communication with remote locations tricky
  • tweet me @thebamf
  • Email to arrange a meeting (in my office or over skype)
  • steven.bamford@nottingham.ac.uk
  • Formative assessment by coursework
  • Development of a working Python program related to your studies
slide-5
SLIDE 5

Coursework

  • A Python program relevant to your research
  • put course material into practice
  • pportunity to become familiar with Python
  • get feedback on your coding
  • requirement to qualify for credits
  • Your program should…
  • be written as an importable module (.py file) or Jupyter notebook (.ipynb)
  • do something meaningful: analyse real data or perform a simulation
  • use at least two user functions
  • use at least one of the modules introduced in Sessions 3 – 5
  • produce at least one informative plot
  • comprise >~ 50 lines of actual code (excluding comments and imports)
slide-6
SLIDE 6

Coursework

  • Submit as source code (.py/.ipynb file) and pdf/png files of the output plot(s)
  • Email me (steven.bamford@nottingham.ac.uk) with subject “MPAGS coursework”
  • Link to a GitHub repository, etc. or attach code
  • if private add me (bamford) as a collaborator
  • make sure any requirements are clear
  • Three stages:
  • One (optional)
  • README describing what you intend your code to do
  • Rough outline of the code (classes, functions, snippets, comments, pseudocode)
  • If submitted by 9 Nov, I will provide feedback
  • Two (optional)
  • Roughly working version of your code, although may be incomplete, have bugs,

although try to make it reasonable and easy to understand!

  • If submitted by 23 Nov, I will provide feedback
  • Three (for MPAGS credits)
  • Full (ideally) working version of your code (doesn’t have to be amazing!)
  • 14 Dec deadline
slide-7
SLIDE 7

Outline

  • Session1: Introduction to Python
  • Why Python is (mostly) awesome
  • Introduction to the language:
  • start-up, syntax, constructs, functions, classes, getting help
  • Python 2.x versus 3.x
  • Good programming practice versus 'thinking aloud'
  • Session 2: Numerical Python and plotting
  • Numpy
  • Using arrays wisely
  • Plotting:
  • matplotlib and others
  • IPython notebooks
  • Session 3: Scientific Python
  • Scipy
  • ptimisation, interpolation, statistics, filtering, integration, …
  • Other tools
  • scikit, astroML, GNU Scientific Library, R, Theano, …
slide-8
SLIDE 8

Outline

  • Session 4: Python for specialists
  • Python for observers
  • Astropy
  • Handling FITS files
  • PyRAF – scripting IRAF with Python
  • Python for theorists
  • Symbolic algebra (sympy and Sage)
  • Bayesian inference and Deep Learning in Python
  • MCMC with emcee
  • ANNs with keras
  • Session 5: Extreme Python
  • Efficiently storing and processing large amounts of data
  • PyTables, Pandas
  • Multiprocessing
  • Timing and testing
  • Wrapping external libraries and creating the fastest code
  • Cython
  • Numba and NumbaPro
  • Web applications
  • Django and others
slide-9
SLIDE 9

Session 1: Introduction to Python

An introduction to scientific programming with

slide-10
SLIDE 10

Why use a scripting language?

  • Modern scripting languages:
  • Python, IDL, R, Perl, Ruby, …
  • High-level
  • Interactive interpreter
  • Ease of use
  • Speed of development
  • Encourages scripting, rather than one-off analysis
  • Permanent record
  • Repeatability
slide-11
SLIDE 11

Why not?

  • If you want fastest possible performance
  • at the expense of everything else
  • You need highly parallel code
  • Need low-level control
  • Unless you are working on a supercomputer or developing operating

systems components, these probably don't apply to you

  • Even then, Python could be useful in places (glue, tests, etc.)
slide-12
SLIDE 12

Why Python is awesome

  • Designed to be easy to learn and use – clear syntax
  • Well documented
  • Powerful, flexible, fully-featured programming language
  • Multi-paradigm
  • 'Batteries included'
  • Comprehensive scientific tools
  • Fast, efficient
  • Interpreter, introspection
  • Runs everywhere, completely free
  • You probably already have it
slide-13
SLIDE 13

Why learn Python?

  • Less stress
  • Get more science done
  • Widely used and growing popularity
  • Throughout academia and industry
  • NASA, AstraZeneca, Google, Industrial Light & Magic, Philips,…
  • Web services, engineering, science, air traffic control, quantitative

finance, games, education, data management, …

  • Python programmers in demand
  • Easy introduction to general programming concepts

Why not?

  • Existing code for your project in another language, but still…
slide-14
SLIDE 14

Starting the interpreter

  • IPython (interactive Python)
  • $ ipython

/ $ ipython qtconsole

  • Spyder
  • integrated development environment
slide-15
SLIDE 15

Jupyter (IPython) notebooks

  • Mathematica/Maple-style notebooks
  • Store code and output together in one file
  • Blend interactive prompt and scripts
  • Good for demonstrations / trying things out
  • Keep reproducable record of interactive analyses
  • To start, in terminal:

jupyter notebook

  • Opens notebook interface in web browser
  • If you put them online (especially on GitHub) can easily display with

nbviewer.ipython.org

  • jupyter nbconvert can be used to convert to python/html/slides, etc.
slide-16
SLIDE 16

Basics

>>> 2+2 4 >>> # This is a comment ... 2+2 4 >>> 2+2.0 # and a comment on the same line as code 4.0 >>> (50-5*6)/4 5 >>> width = 20 # assignment, no type declaration >>> height = 5*9 >>> width * height 900 >>> x = y = z = 0 # zero x, y and z >>> y >>> n Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'n' is not defined

slide-17
SLIDE 17

Scripts

2+2 # This is a comment 2+2 2+2.0 # and a comment on the same line as code (50-5*6)/4 width = 20 # assignment, no type declaration height = 5*9 width * height x = y = z = 0 # zero x, y and z print(y)

  • Write code in a text editor (ideally one Python aware!)
  • Save in a file and execute…

from command line: $ python test.py from the IPython prompt: In [1]: %run test.py

  • Save and use interactively in future sessions (>>> import test)
  • Create executable files ($ ./test.py)
slide-18
SLIDE 18

Strings

>>> 'spam and eggs' 'spam and eggs' >>> 'doesn\'t' "doesn't" >>> "doesn't" "doesn't" >>> '"Yes," he said.' '"Yes," he said.' >>> hello = 'Greetings!' >>> hello 'Greetings!' >>> print(hello) Greetings! >>> print(hello + ' How do you do?') Greetings! How do you do? >>> print(hello, 'How do you do?') ('Greetings!', 'How do you do?') OR Greetings! How do you do? >>> howdo = 'How do you do?' >>> print(hello+' '+howdo) Greetings! How do you do?

slide-19
SLIDE 19

Numbers

>>> 10 + 3 13 >>> 10 - 3 7 >>> 10 * 3 30 >>> 10 / 3 3 OR 3.3333333333333335 >>> 10 // 3 3 >>> 10 % 3 1 >>> 10**3 1000 >>> 10 + 3 * 5 # *,/ then +,- 25 >>> (10 + 3) * 5 65 >>> -1**2 # Note: -(1**2)

  • 1

>>> 10.0 + 3.0 13.0 >>> 10.0 - 3.0 7.0 >>> 10.0 * 3 30.0 >>> 10.0 / 3 3.3333333333333335 >>> 10.0 // 3 3.0 >>> 10.0 % 3.0 1.0 >>> 10.0**3 1000.0 >>> 4.2 + 3.14 7.3399999999999999 >>> 4.2 * 3.14 13.188000000000001

slide-20
SLIDE 20

Numbers

Augmented assignment:

>>> a = 20 >>> a += 8 >>> a 28 >>> a /= 8.0 >>> a 3.5

Functions:

>>> abs(-5.2) 5.2 >>> sqrt(25) 5.0

Comparisons:

>>> 5 * 2 == 4 + 6 True >>> 0.12 * 2 == 0.1 + 0.14 False >>> a = 0.12 * 2; b = 0.1 + 0.14 >>> eps = 0.0001 >>> a - eps < b < a + eps True

slide-21
SLIDE 21

Python 2.x versus 3.x

  • The 3.x branch is intentionally backwards incompatible
  • Various improvements, removal of obsolete code, but annoying!
  • print a ➞ print(a)
  • Note that print(a, b, c) does not behave same in 2.x and 3.x
  • 1/3 == 1//3 == 0 ➞ 1/3 == 1.0/3.0 == 1.333…, 1//3 == 0
  • … various others …
  • Versions >=2.6 contains most backward compatible changes, and can

warn of usage (-3 switch) which would be an error in 3.x

  • 2to3 (semi-)automated code translator
  • A very small number of modules still not compatible with 3.x
  • You should probably start with Python 3, but 2.7 is ok
slide-22
SLIDE 22

More about Python 2 vs 3

  • It is possible to use some Python 3 features in Python 2
  • from __future__ import print_function
  • makes more advanced features of print() available
  • from __future__ import division

>>> print ('one', 'two') >>> print ('one', 'two') ('one', 'two') ('one', 'two') >>> from __future__ import >>> from __future__ import print_function print_function >>> print ('one', 'two') >>> print ('one', 'two')

  • ne two
  • ne two

>>> print ('one', 'two', sep=' : ') >>> print ('one', 'two', sep=' : ')

  • ne : two
  • ne : two

>>> 5/3 >>> 5/3 1 >>> from __future__ import division >>> from __future__ import division >>> 5/3 >>> 5/3 1.6666666666666667 1.6666666666666667

If you really need code that is Python 2 and 3 compatible, try Six

slide-23
SLIDE 23

Containers

Lists:

>>> a = [1, 2, 4, 8, 16] # list of ints >>> c = [4, 'candles', 4.0, 'handles'] # can mix types >>> c[1] 'candles' >>> c[2] = 'knife'

>>> c[-1] # negative indices count from end 'handles' >>> c[1:3] # slicing

['candles', 'knife']

>>> c[2:] # omitting defaults to start or end

['knife', 'handles']

>>> c[0:4:2] # variable stride (could just write c[::2])

[4, 'knife']

>>> a + c # concatenate [1, 2, 4, 8, 16, 4, 'candles', 'knife', 'handles'] >>> len(a) 5

slide-24
SLIDE 24

Containers

Tuples:

>>> q = (1, 2, 4, 8, 16) # tuple of ints >>> r = (4, 'candles', 4.0, 'handles') # can mix types >>> s = ('lonely',) # singleton >>> t = () # empty >>> r[1] 'candles' >>> r[2] = 'knife' # cannot change tuples Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: 'tuple' object does not support item assignment >>> u = 3, 2, 1 # parentheses not necessary >>> v, w = 'this', 'that' >>> v 'this' >>> w 'that'

slide-25
SLIDE 25

Containers

Dictionaries:

>>> a = {'eyecolour': 'blue', 'height': 152.0, 42: 'the answer'} >>> a['age'] = 28 >>> a {42: 'the answer', 'age': 28, 'eyecolour': 'blue', 'height': 152.0} >>> del(a['height']) >>> a {42: 'the answer', 'age': 28, 'eyecolour': 'blue'} >>> b = {} >>> b['hello'] = 'Hi!' >>> a.keys() [42, 'age', 'eyecolour'] >>> a.values() ['the answer', 28, 'blue']

slide-26
SLIDE 26

Conditionals

>>> a = 4; b = 3 >>> if a > b: ... result = 'bigger' ... c = a - b ... >>> print(result, c) bigger 1 >>> a = 1; b = 3 >>> if a > b: ... result = 'bigger' ... elif a == b: ... result = 'same' ... else: # i.e. a < b ... result = 'smaller' ... >>> print(result) smaller >>> if a < b: print 'ok'

  • k

Comparison operators:

== != > < >= <= is is not in not in

Boolean operators:

and

  • r

not

  • Indentation is important!
  • be consistent
  • use four spaces
  • do not use tabs
  • Colon indicates the start of

an indented block

slide-27
SLIDE 27

Conditionals

>>> if 'Steven' in ['Bob', 'Amy', 'Steven', 'Fred']: ... print 'Here!' ... Here! >>> if 'Carol' not in ['Bob', 'Amy', 'Steven', 'Fred']: ... print 'Away!' ... Away! >>> test = a == b >>> if test: print 'Equal' 'Equal'

slide-28
SLIDE 28

Loops

>>> a = b = 0 >>> while a < 10: ... a += 3 ... print(a) ...

3 6 9 12

>>> while True: ... b += 3 ... if b >= 10: break ... print(b) 3 6 9 >>> for i in [2, 5, 3]: ... print(i**2) 4 25 9 >>> for j in range(5): print(j) 1 2 3 4 >>> range(3, 10, 2) [3,5,7,9]

slide-29
SLIDE 29

Loops

>>> d = {'this': 2, 'that': 7} >>> for k, v in d.items(): ... print('%s is %i'%(k, v)) this is 2 that is 7 >>> numbers = ['none', 'one', 'two', 'lots'] >>> for i, j in enumerate(numbers): ... print('%i: %s' % (i, j)) 0: none 1: one 2: two 3: lots

slide-30
SLIDE 30

Functions

>>> def my_func(x, y=0.0, z=1.0): ... a = x + y ... b = a * z ... return b ... >>> my_func(1.0, 3.0, 2.0) 8.0 >>> my_func(1.0, 3.0) 4.0 >>> my_func(1.0, y=3.0) 4.0 >>> my_func(5.0) 5.0 >>> my_func(2.0, z=3.0) 6.0 >>> my_func(x=2.0, z=3.0) 6.0

slide-31
SLIDE 31

Methods

>>> a = [2, 5, 3, 6, 5] >>> a.sort() >>> print(a) [2, 3, 5, 5, 6] >>> a.count(5) 2 >>> a.reverse() >>> print(a) [6, 5, 5, 3, 2] >>> d = {'black': 100, 'grey': 50, 'white': 0} >>> d.values() [0, 50, 100] >>> s = '-'.join(('2009', '07', '07')) >>> print(s) 2009-07-07 >>> a.__contains__(3) # leading underscores indicate True # not intended for general use

slide-32
SLIDE 32

Help

>>> help(math) >>> help(math.cos) >>> a = [1, 2, 3] >>> help(a)

  • Powerful help tools
  • Most objects, functions, modules, … can be inspected
  • If in doubt, hit 'tab'
  • If impatient, hit 'tab'

In [1]: math.cos? In [2]: a?

In IPython: (ignore things starting with _ _ )

slide-33
SLIDE 33

Lots of support online

  • python.org/doc
  • Language documentation
  • Library documentation
  • Beginner's Guide and Tutorials
  • ipython.org/documentation.html
  • www.codecademy.com/en/tracks/python
  • google.com
  • stackoverflow.com
  • etc. …
slide-34
SLIDE 34

Getting a bit more advanced…

slide-35
SLIDE 35

List comprehensions

  • A neat way of creating lists (and arrays) without writing a loop

my_fav_num = [3, 17, 22, 46, 71, 8] even_squared = [] for n in my_fav_num: if n%2 == 0: even_squared.append(n**2) # in one line: even_better = [n**2 for n in my_fav_num if n%2 == 0] # both produce [484, 2116, 64] freshfruit = [' banana', ' loganberry ', 'passion fruit '] stripped = [weapon.strip() for weapon in freshfruit] print(stripped) ['banana', 'loganberry', 'passion fruit']

slide-36
SLIDE 36

File I/O

>>> fname = 'myfile.dat' >>> f = open(fname) >>> lines = f.readlines() >>> f.close() f = open(fname) >>> firstline = f.readline() >>> secondline = f.readline() >>> f = open(fname) >>> for l in f: >>> print l.split()[1] >>> f.close() >>> outfname = 'myoutput' >>> outf = open(outfname, 'w') # second argument denotes writable >>> outf.write('My very own file\n') >>> outf.close()

slide-37
SLIDE 37

String formatting for output

>>> name = 'Steven'; day = 'Wednesday' >>> print('Hello {}. It is {}.'.format(name, day)) Hello Steven. It is Wednesday. >>> # Same effect: >>> print('Hello {1}. It is {0}'.format(day, name)) >>> print('Hello {n}. It is {d}'.format(d=day, n=name)) >>> d = {'Bob': 1.87, 'Fred': 1.768} >>> for name, height in d.items(): ... print('{who} is {height:.2f}m tall'.format(who=name, ... height=height)) ... Bob is 1.87m tall Fred is 1.77m tall >>> # older alternative uses '%' >>> for name, height in d.items(): ... print('%s is %.2f metres tall'%(name, height))

slide-38
SLIDE 38

Classes

>>> class MyClass: ... def __init__(self, x, y=1.0) ... self.my_x = x ... self.my_y = y ...

... def product(): ... return x*y

... >>> m = MyClass(2, 4) >>> m.product() 8 >>> p = MyClass(5) >>> p.product() 5 >>> m.product() 8

slide-39
SLIDE 39

Modules

>>> import math >>> math.cos(math.pi)

  • 1.0

>>> math.cos(pi) Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'pi' is not defined >>> from math import cos, pi >>> cos(pi)

  • 1.0

>>> from math import *

  • Modules can contain any code
  • Classes, functions, definitions, immediately executed code
  • Can be imported in own namespace, or into the global namespace
slide-40
SLIDE 40

Your own modules

  • Simply put code in a file with extension .py
  • Import it in interpreter
  • Use it!
  • Reloading
  • IPython autoreloading

>>> import mymod >>> mymod.dostuff() >>> from importlib import reload # required for Python 3 >>> reload(mymod) # if you change the module code >>> %load_ext autoreload >>> %autoreload

slide-41
SLIDE 41

Your own modules

  • Can also collect files together in folders

myphdmodule/ __init__.py contains code run upon import backgroundreader.py datareducer.py resultsinterpreter.py thesiswriter/ submodule introwriter.py bulkadder.py concluder.py >>> from myphdmodule import thesiswriter >>> thesiswriter.concluder()

slide-42
SLIDE 42

Executable scripts

#! /usr/bin/env python def countsheep(n): sheep = 0 for i in range(n): sheep += 1 return sheep if __name__ == "__main__”: import sys if len(sys.argv) > 1: n = int(sys.argv[1]) else: n = 100 s = countsheep(n) print(s)

slide-43
SLIDE 43

Executable scripts (better)

#! /usr/bin/env python def countsheep(n): sheep = 0 for i in range(n): sheep += 1 return sheep def main(): if len(sys.argv) > 1: n = int(sys.argv[1]) else: n = 100 s = countsheep(n) print(s) if __name__ == "__main__”: main() You will also need to make the file executable in Linux or OSX: $ $ chmod chmod u+x u+x countsheep.py countsheep.py $ ./ $ ./countsheep.py countsheep.py

slide-44
SLIDE 44

Documentation and tests

# My totally wicked function def my_func(x, y=0.0, z=1.0): ”””This does some stuff. For example: >>> my_func(1.0, 3.0, 2.0) 8.0 Yes, it really is that good! ””” a = (x + y) * z return a

  • Comments before function, class, etc. are used to generate help
  • “Docstrings”
  • preferred way of documenting code
  • can contain examples, which are automatically turned into tests!
  • see doctest module (also unittest and nose for more testing)
slide-45
SLIDE 45

Good programming practice versus 'thinking aloud'

  • Compromise between:
  • producing results quickly

versus

  • easy reusability and adaptation of code
  • code that can be quickly understood by others
  • Comment clearly
  • Use functions
  • Use modules
  • Consider 'refactoring' before code gets too messy
  • Look into version control (git is best!)
  • Science, not software development
slide-46
SLIDE 46

Version control – do it!

free private and public code hosting academic accounts unlimited free public code hosting, free private hosting for education and research GUI for Mac OSX GUI for Windows, Linux, OSX

slide-47
SLIDE 47

Exercises 1

1) Start your python interpreter and check the version. 2) Use python as a calculator (use variables and the math module). 3) Look at help for the math module. 4) Create a list of five numbers on the range (0, 10) and check the identity cosh2(x) – sinh2(x) = 1 holds true for them, using (a) a for loop, and (b) a list comprehension. 5) Write a function f(x, n), where x is a list of numbers and n is a single number, which returns a list of the indices of x where the value is exactly divisible by n. Check it works! 6) Put the above function in a script, with documentation. Import and test it. 7) Adapt your previous solution to take a filename instead of a list, and read the numbers to be tested from that file, considering one line at a time. 8) Adapt your solution to use a class with an attribute n and a method that returns which elements of a provided list are divisible by n.

slide-48
SLIDE 48

Any questions?

  • shout and wave (now)
  • tweet @thebamf (now)
  • email steven.bamford@nottingham.ac.uk (later)

Exercises? Feel free to get started on them now. I'll be around for support for a while. Solutions are online and will be reviewed at the start of next session.

Questions and exercises