CS207: Systems Development for Computational Science



slide-1
SLIDE 1

CS207: Systems Development for Computational Science

https://harvard-iacs.github.io/2019-CS207/

Instructor: David Sondak
TFs: Lindsey Brown, Feiyu Chen, Aditya Karan, Bhaven Patel

Harvard University
Institute for Applied Computational Science

9/3/2019

slide-5
SLIDE 5

Motivation: Thermal Convection and the Geodynamo

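Thermal convection drives most fluid flows in the universe

[Figure: fluid between a COLD and a HOT boundary under gravity g. Cold fluid falls, hot fluid rises.]

[Media: Plate Tectonics Video; DESY]

\[ \frac{\partial T}{\partial t} + \nabla \cdot (\mathbf{u} T) = k \nabla^2 T \]

  • Ignoring ∇ · (uT) gives the usual heat conduction equation!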
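
As a rough illustration of the conduction-only limit, the sketch below advances the one-dimensional heat equation ∂T/∂t = k ∂²T/∂x² with an explicit finite-difference update. The function name `diffuse`, the grid spacing, the time step, and the fixed hot and cold end temperatures are all assumptions chosen for illustration; they do not come from the lecture.

```python
def diffuse(temps, k, dx, dt, steps):
    """Explicit finite-difference update for dT/dt = k * d2T/dx2.

    The first and last entries of `temps` are held fixed (boundary values).
    """
    temps = list(temps)
    r = k * dt / dx ** 2
    if r > 0.5:
        # The explicit scheme is only stable for k*dt/dx**2 <= 1/2.
        raise ValueError("time step too large for the explicit scheme")
    for _ in range(steps):
        new = temps[:]
        for i in range(1, len(temps) - 1):
            # Discrete Laplacian: T[i+1] - 2*T[i] + T[i-1]
            new[i] = temps[i] + r * (temps[i + 1] - 2 * temps[i] + temps[i - 1])
        temps = new
    return temps


# Hypothetical setup: hot wall (T = 1) on the left, cold wall (T = 0) on the right.
initial = [1.0] + [0.0] * 10
profile = diffuse(initial, k=1.0, dx=0.1, dt=0.004, steps=2000)
print([round(t, 3) for t in profile])
```

With no flow term, the temperature relaxes toward a straight line between the two wall temperatures, which is the steady state of pure conduction.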

1 / 25

slide-6
SLIDE 6

Motivation: The Pillars of Science

2 / 25

slide-8
SLIDE 8

Computational Science

[Figure: diagram combining Mathematics, Computer Science, and a Scientific Discipline into Computational Science]

3 / 25

slide-12
SLIDE 12

Why take this class?

  • Scientific software is complex
  • Your code needs to be:
      • Reusable
      • Portable
      • Robust
  • Must go beyond “scripting”

CS207 Objectives: To give students who may not have a traditional computer science background the knowledge and tools to develop and maintain effective software for computational science applications.

4 / 25

slide-13
SLIDE 13

Who should take this class?

  • Any kind of scientist is welcome to take this class!
  • This course is computer science for people who aren’t computer scientists:
      • Data scientists
      • Biologists
      • Chemists
      • Engineers
      • Physicists
      • Mathematicians
      • Economists
      • ...
  • It is also for computer scientists who want to develop scientific software
  • CS207 is for students who need to know effective and modern software practices for their career

5 / 25

slide-15
SLIDE 15

Sample Topics

A few selected topics to be covered:

  • Unix and Linux
  • Version control
  • Python
  • Software documentation
  • Software testing
  • Object-oriented programming
  • Data structures
  • Databases

Other potential topics (not guaranteed):

  • Debuggers and debugging
  • Build systems (Makefiles, autotools, ...)
  • Compiled languages

6 / 25

slide-16
SLIDE 16

Course Structure

  • CS207 is an application-driven course
  • Two 75-minute lectures per week
  • Lectures centered around group programming exercises
  • Programming assignments for homework
  • Primary deliverable is a software development project
  • All course content hosted on GitHub

Course Website: https://harvard-iacs.github.io/2019-CS207/

7 / 25

slide-17
SLIDE 17

Course Project: Overview

  • You will work in groups of 3 to 4 people (assigned by teaching staff)
  • You will add to your library throughout the semester
  • The project consists of two milestones
  • For the final project, you will add a non-trivial feature to your library
  • A portion of your grade will come from peer-assessment
  • Exact details on website

8 / 25

slide-24
SLIDE 24

Course Project: The Topic

Automatic differentiation

What is Automatic Differentiation?

  • A way to evaluate derivatives of functions and computer programs
  • Computes derivatives to machine precision!
  • Can be very efficient and accurate
  • Also known as “algorithmic differentiation”

We will have four lectures on automatic differentiation this semester to cover the main points.

9 / 25

slide-26
SLIDE 26

Why Automatic Differentiation?

  • Encapsulates many ideas in software design
      • Object-oriented programming
      • Operator overloading
      • Data structures
  • Pervasive throughout science and gaining steam
      • Neural networks and backpropagation
      • Hamiltonian Monte Carlo methods
      • Full Jacobian calculations
      • Jacobian-free calculations
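
To make the link between automatic differentiation, object-oriented programming, and operator overloading concrete, here is a minimal forward-mode sketch built on dual numbers. The class name `Dual`, the helper `sin`, and the test function `f` are hypothetical names chosen for this illustration; this is one common way to implement the idea, not necessarily the design used in the course project.

```python
import math


class Dual:
    """A value paired with its derivative (a 'dual number')."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants with zero derivative.
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def sin(x):
    """Sine that propagates derivatives through the chain rule."""
    x = Dual._lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def f(x):
    return x * sin(x) + 3 * x


# Seed the input with derivative 1 to obtain df/dx at x = 2.
result = f(Dual(2.0, 1.0))
exact = math.sin(2.0) + 2.0 * math.cos(2.0) + 3.0
print(result.value, result.deriv, exact)
```

Because every overloaded operator applies an exact differentiation rule, the computed derivative agrees with the hand-derived one up to floating-point rounding, with no step size to tune.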

10 / 25

slide-29
SLIDE 29

AD Teaser

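Suppose we have a function like

\[ y = \exp\left(-\sqrt{x + \cos^2(x)}\right) \sin\left(x \ln\left(1 + x^2\right)\right). \]

The symbolic derivative is

\[ y' = \exp\left(-\sqrt{x + \cos^2(x)}\right) \cos\left(x \ln\left(1 + x^2\right)\right) \left[\frac{2x^2}{1 + x^2} + \ln\left(1 + x^2\right)\right] - \exp\left(-\sqrt{x + \cos^2(x)}\right) \frac{1 - 2\cos(x)\sin(x)}{2\sqrt{x + \cos^2(x)}} \sin\left(x \ln\left(1 + x^2\right)\right). \]

And that’s only the first derivative!

Demo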
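
One way to check a hand-derived expression like this is to compare it with a finite-difference estimate. The sketch below takes the teaser function to be y = exp(−√(x + cos²x)) · sin(x ln(1 + x²)), which is an assumed reading of the slide, and the function names, evaluation point, and step size are likewise chosen only for illustration.

```python
import math


def y(x):
    # Assumed form of the teaser function.
    return (math.exp(-math.sqrt(x + math.cos(x) ** 2))
            * math.sin(x * math.log(1 + x ** 2)))


def dy_symbolic(x):
    u = x + math.cos(x) ** 2                     # argument of the square root
    v = x * math.log(1 + x ** 2)                 # argument of the sine
    du = 1 - 2 * math.cos(x) * math.sin(x)       # du/dx
    dv = 2 * x ** 2 / (1 + x ** 2) + math.log(1 + x ** 2)  # dv/dx
    return math.exp(-math.sqrt(u)) * (
        math.cos(v) * dv - du / (2 * math.sqrt(u)) * math.sin(v)
    )


def dy_central(x, h=1e-5):
    # Central finite difference: approximate, with error that depends on h.
    return (y(x + h) - y(x - h)) / (2 * h)


x0 = 1.0
error = abs(dy_symbolic(x0) - dy_central(x0))
print(dy_symbolic(x0), dy_central(x0), error)
```

The two values agree to many digits but not to machine precision, since the finite-difference estimate trades truncation error against rounding error through the choice of `h`.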

11 / 25

slide-30
SLIDE 30

Next Steps

Go to https://harvard-iacs.github.io/2019-CS207/lectures/lecture0/.

12 / 25

slide-31
SLIDE 31

Unix and Linux

Portions of this lecture taken from the lecture notes of Dr. Chris Simmons.

slide-32
SLIDE 32

Why Unix / Linux?

https://www.top500.org/lists/2019/06/ https://www.top500.org/statistics/list/

14 / 25

slide-33
SLIDE 33

What is Unix?

  • Unix is a multi-user, preemptive, multitasking operating system
  • It provides several facilities:
      • Management of hardware resources
      • Directories and file systems
      • Loading, execution, and suspension of programs
  • There are many versions of Unix:
      • Solaris
      • AIX
      • BSD
      • Linux (not Unix, but pretty close)
      • ...

15 / 25

slide-34
SLIDE 34

What is Linux?

  • Linux is a clone of Unix
  • Written by Linus Torvalds
  • First version dates to September 1991
  • Linux has been further developed by people around the world
  • Developed under the GNU General Public License
  • Source code for Linux is freely available

16 / 25

slide-35
SLIDE 35

How Does Unix Work?

  • Unix has a kernel and one or more shells
  • The kernel is the core of the OS
  • It receives tasks from the shell and executes them
  • Users interact with the shell!

[Figure: Kernel and Shell]

17 / 25

slide-36
SLIDE 36

How Does Unix Work?

  • Everything in Unix is a process or a file
  • A process
      • Is an executing program (has a unique PID)
      • May be short or run indefinitely
  • A file
      • Is a collection of data
      • Created by users
  • The Unix kernel is responsible for organizing processes and interacting with files

[Figure: Kernel and Shell]
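
The idea that each executing program has its own PID can be observed from Python. The following is a small sketch using only the standard library; the helper name `spawn_child` is hypothetical, and the child here is simply another Python interpreter that prints its own PID.

```python
import os
import subprocess
import sys


def spawn_child():
    """Start a child process and return (PID seen by parent, PID reported by child)."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import os; print(os.getpid())"],
        stdout=subprocess.PIPE,
        text=True,
    )
    reported, _ = child.communicate()  # wait for the child to finish
    return child.pid, int(reported)


parent_pid = os.getpid()
child_pid, reported_pid = spawn_child()
print("parent PID:", parent_pid)
print("child PID: ", child_pid)
```

The parent and child report different PIDs, and the PID the parent records for the child matches what the child sees for itself.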

18 / 25

slide-37
SLIDE 37

The Shell

  • The Unix interface is called the shell
  • The shell basically does four things repeatedly:
  • Display prompt
  • Read command
  • Process command
  • Execute command
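
The four steps above can be sketched as a toy loop in Python. This is only an illustration of the cycle, not how bash or tcsh are implemented; the function names `process_command`, `execute_command`, and `shell_loop`, the `exit` keyword, and the use of status 127 for a missing command are assumptions of this sketch.

```python
import shlex
import subprocess
import sys


def process_command(line):
    """Split a raw command line into a list of arguments."""
    return shlex.split(line)


def execute_command(argv):
    """Run the command and return its exit status."""
    try:
        return subprocess.run(argv).returncode
    except FileNotFoundError:
        print(f"command not found: {argv[0]}", file=sys.stderr)
        return 127


def shell_loop(lines, prompt="$ "):
    """Repeat: display prompt, read command, process command, execute command."""
    statuses = []
    lines = iter(lines)
    while True:
        print(prompt, end="", flush=True)            # 1. display prompt
        line = next(lines, None)                     # 2. read command
        if line is None or line.strip() == "exit":
            break
        argv = process_command(line)                 # 3. process command
        if argv:
            statuses.append(execute_command(argv))   # 4. execute command
    return statuses
```

Calling `shell_loop(sys.stdin)` would read commands interactively until end of input or `exit`; passing a list of strings instead makes the loop easy to try out in a script.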

19 / 25

slide-38
SLIDE 38

How to Interact with Unix

  • The user interacts with Unix via a shell
  • Different kinds of shells
  • Graphical, e.g. X-Windows
  • Text-based (command-line), e.g. bash and tcsh
  • To remotely access a shell session, use ssh (secure shell)

20 / 25

slide-39
SLIDE 39

Some Common Unix Terminology

  • Unix has the notion of accounts, which include:
      • a username/password
      • userid/groupid
      • home directory
      • a shell preference
  • userids are called UIDs
  • Unix has the notion of groups:
      • A Unix group can share files and active processes
      • Each account is assigned a primary group
      • The groupid corresponds to this primary group
  • groupids are called GIDs
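
On a Unix-like system, the account fields listed above can be inspected from Python with the standard `os`, `pwd`, and `grp` modules. The sketch below is a minimal example; the function name `describe_account` is hypothetical, and the lookups are wrapped in `try` blocks because some environments (for example, certain containers) run under a UID or GID with no entry in the account database.

```python
import grp
import os
import pwd


def describe_account():
    """Collect the UID, GID, and related account details of the current process."""
    uid = os.getuid()
    gid = os.getgid()
    info = {"uid": uid, "gid": gid}
    try:
        entry = pwd.getpwuid(uid)          # look up the account by UID
        info["username"] = entry.pw_name
        info["home"] = entry.pw_dir        # home directory
        info["shell"] = entry.pw_shell     # shell preference
    except KeyError:
        pass                               # no account entry for this UID
    try:
        info["group"] = grp.getgrgid(gid).gr_name
    except KeyError:
        pass                               # no group entry for this GID
    return info


account = describe_account()
for key, value in account.items():
    print(f"{key}: {value}")
```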

21 / 25

slide-40
SLIDE 40

Unix Files and Directories

  • A file is a basic unit of storage
  • Every file must have a name
  • Unix is case-sensitive
  • A directory is a special kind of file
  • Directories hold information about other files
  • We often think of a directory as a container that holds other files
  • e.g. folders for Mac or Windows users

22 / 25

slide-41
SLIDE 41

Comments on the Unix Filesystem

  • The filesystem is a hierarchical system of files and directories
  • The top level in the hierarchy is called the root
  • The full pathname of a file includes the filename and all directories up to the root
      • e.g. /Users/dsondak/Teaching/Harvard/CS207/2019-CS207/
  • Absolute and relative pathnames:
      • Absolute pathnames start at the root
      • Relative pathnames are specified in relation to the current working directory
      • e.g. Harvard/CS207/2019-CS207/
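
The relationship between the two example pathnames can be checked with Python's `pathlib`. In this sketch the current working directory is assumed to be /Users/dsondak/Teaching, which is the directory that makes the relative example resolve to the absolute one; `PurePosixPath` only manipulates path strings, so none of these directories needs to exist.

```python
from pathlib import PurePosixPath

absolute = PurePosixPath("/Users/dsondak/Teaching/Harvard/CS207/2019-CS207/")
relative = PurePosixPath("Harvard/CS207/2019-CS207/")

# Assumed current working directory for this illustration.
cwd = PurePosixPath("/Users/dsondak/Teaching")

print(absolute.is_absolute())   # absolute pathnames start at the root "/"
print(relative.is_absolute())   # relative pathnames do not

# A relative pathname is interpreted by joining it to the working directory.
resolved = cwd / relative
print(resolved)
```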

23 / 25

slide-42
SLIDE 42

Special Directory Names

  • There is a special relative pathname for the current working directory
      • .
      • Just a dot
  • There is a special relative pathname for the parent directory
      • ..
      • Pronounced dot-dot
  • There is a special symbol for the home directory
      • ~
      • Just a tilde

These commands will become second nature to you.
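
A short sketch of how these special names behave, using Python's standard `posixpath` and `os.path` modules. The working directory below is a made-up example, and `normpath` simplifies paths purely as text, so it ignores symbolic links that could make `..` point somewhere else on a real filesystem.

```python
import os.path
import posixpath

cwd = "/Users/dsondak/Teaching"  # hypothetical working directory

# "." refers to the current directory itself.
same = posixpath.normpath(posixpath.join(cwd, "."))

# ".." refers to the parent directory.
parent = posixpath.normpath(posixpath.join(cwd, ".."))

# "~" is expanded to the current user's home directory when one is known.
home = os.path.expanduser("~")

print(same)
print(parent)
print(home)
```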

24 / 25

slide-43
SLIDE 43

Next Steps

Go to https://harvard-iacs.github.io/2019-CS207/lectures/lecture0/.

25 / 25