How to build a large-scale biological simulator: CERN openlab summer student lecture



slide-1
SLIDE 1

How to build a large-scale biological simulator

CERN openlab summer student lecture Lukas Breitwieser

slide-2
SLIDE 2

Help life scientists understand (patho)physiological processes

Source: Kaiser, University of Newcastle, UK; www.dynamic-connectome.org

slide-3
SLIDE 3

From atoms to organisms

slide-4
SLIDE 4

Agent-based simulations

slide-5
SLIDE 5

Platform

slide-6
SLIDE 6

BioDynaMo design goals

  • Modular system that supports different fields (e.g. neuroscience, oncology, immunology, ...)
  • Support large-scale biological simulations
  • Hide complexity of parallel and distributed computing
  • Promote reproducibility of results
slide-7
SLIDE 7

Is this a good research idea?

  • Why is this question important?
    – Most ideas fail
    – Good ones take a long time to implement
    → Terminate bad ideas quickly
  • How to determine which ideas to pursue?
    – Heilmeier catechism
      • What are you trying to do? Articulate your objectives using absolutely no jargon.
      • How is it done today, and what are the limits of current practice?
      • What is new in your approach and why do you think it will be successful?

Slide credit: Bill Dally; https://www.darpa.mil/work-with-us/heilmeier-catechism George H. Heilmeier

slide-8
SLIDE 8

How to answer all these questions?

[Figure: levels of transformation (Electrons → Hardware → System Software → BioDynaMo → Computational Model → Research Question) and a timeline marked "Now"]

  • Look back
  • Look forward
  • Look up
  • Look down
slide-9
SLIDE 9

Look back

  • Literature review
  • How to review a research paper?
    – Summary
      • What is the problem the paper is trying to solve? What are the key ideas of the paper? Key insights?
      • What is the key contribution to literature at the time it was written?
      • What are the most important things you take out from it?
    – Strengths
    – Weaknesses
    – Can you do better?
    – What have you learned/enjoyed/disliked in the

Slide credit: Onur Mutlu

slide-10
SLIDE 10

Look forward

  • Research is a moving target

Aim here

Inspired by: Bill Dally, Moving the needle

slide-11
SLIDE 11

Look up

  • What do users expect from the system?
  • Which workflows are they used to?
  • Which technologies are they familiar with?
  • What kind of models will they run?

[Figure: levels of transformation (Electrons → Hardware → System Software → BioDynaMo → Computational Model → Research Question)]

slide-12
SLIDE 12

Look down

Source: Onur Mutlu; Andrzej Nowak; http://www.iue.tuwien.ac.at/phd/weinbub/dissertationsu16.html

  • Abstraction: A higher level only needs to know about the interface to the lower level, not how the lower level is implemented
  • Then, why would you want to know what goes on underneath?
    – The program you wrote is running slow?
    – The program you wrote does not run correctly?
    – ...

slide-13
SLIDE 13

Design tradeoffs

slide-14
SLIDE 14

Software Engineering Best Practices

slide-15
SLIDE 15

Testing & Continuous Integration

  • Essential to keep code base maintainable
    – Refactoring
      • Reduces the risk of “touching” others' code
  • Protect reputation
    – Ensure that software installs fine on supported systems and demos work
  • Continuous integration
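
A minimal sketch of how an automated test supports the points above, written in Python with the standard `unittest` module. The function `cell_volume` and the test class are hypothetical illustrations and are not part of BioDynaMo; the idea is only that a test pins down expected behavior, so a later refactoring that changes the result is caught automatically.

```python
import math
import unittest


def cell_volume(diameter):
    """Volume of a spherical cell (hypothetical helper, not BioDynaMo API)."""
    radius = diameter / 2.0
    return 4.0 / 3.0 * math.pi * radius ** 3


class CellVolumeTest(unittest.TestCase):
    def test_unit_diameter(self):
        # A sphere with diameter 1 has volume pi / 6.
        self.assertAlmostEqual(cell_volume(1.0), math.pi / 6.0)

    def test_volume_scales_with_cube_of_diameter(self):
        # Doubling the diameter must multiply the volume by 8.
        self.assertAlmostEqual(cell_volume(2.0), 8.0 * cell_volume(1.0))


def run_tests():
    """Run the suite programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CellVolumeTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A continuous integration job would typically run such a suite on every commit, on each supported system, and reject changes that make it fail.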
slide-16
SLIDE 16

Follow a styleguide

  • Set of guidelines and best practices which improve readability and maintainability of a code base
  • Code is more often read than (re)written → Important that a developer quickly understands a piece of code
  • Use automation

Avoid

Source: https://www.reddit.com/r/badcode/comments/bjsdyc/my_teach_kees_getting_mad_that_i_never_properly

slide-17
SLIDE 17

Use existing libraries

  • Instead of copy-pasting code from a textbook or Stack Overflow
    – Correctness
    – Development effort
    – Maintenance effort
  • Questions to answer before adopting a library
    – Is the license compatible?
    – Is it actively maintained?
    – Does it have an active user community?
    – How big is the library?
    – How many dependencies does it have?

slide-18
SLIDE 18

Manage scope

  • Lifecycle costs of applications over 10 years

Slide credit: Dr. Marc Brandis

slide-19
SLIDE 19

Advice on debugging

  • Remove complexity
  • Isolate the issue
  • Avoid ad-hoc solutions; find the root cause

5 whys example from Uber:

    – Why did the issue happen? → A bug was committed as part of the code.
    – Why did the bug not get caught by someone else? → The code reviewer did not notice that the code change could cause such an issue.
    – Why did we depend on only a code reviewer catching this bug? → Because we don't have an automated test for this use case.

Source: https://blog.pragmaticengineer.com/operating-a-high-scale-distributed-system/

slide-20
SLIDE 20

Refactor

  • Simplify program while running all the tests
  • Because
    – We all violate our own best practices from time to time.
    – A reliable, maintainable system is not built overnight.
  • Enabled by testing and continuous integration
slide-21
SLIDE 21

Some examples that need refactoring

Source: https://www.reddit.com/r/badcode

slide-22
SLIDE 22

BioDynaMo Implementation

slide-23
SLIDE 23

BioDynaMo overview

slide-24
SLIDE 24

BioDynaMo core concepts

[Figure: core concepts]

Simulation Objects: Cell, NeuronSoma, NeuriteElement
Simulation Algorithm
Local Neighborhood
Biology Modules: Grow, Move, Divide, Secrete substance into extracellular matrix
Event: cell division event (UID 123 → 1st daughter UID 123, 2nd daughter UID 456); per biology module, the options "Copy to new" and "Remove from existing"
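
The interplay of simulation objects, biology modules, and events can be illustrated with a small Python sketch. All class, attribute, and method names here (`Cell`, `GrowDivide`, `Simulation`, `copy_to_new`, `remove_from_existing`) are hypothetical and are not BioDynaMo's actual API; the sketch assumes that on division the existing cell keeps its UID and the new daughter receives a fresh one, and it halves the diameter purely for simplicity.

```python
import copy
import itertools


class GrowDivide:
    """Hypothetical biology module: grow each step, divide above a threshold."""

    copy_to_new = True            # copy this module to the new daughter
    remove_from_existing = False  # keep this module on the existing cell

    def __init__(self, rate=1.0, threshold=10.0):
        self.rate = rate
        self.threshold = threshold

    def run(self, cell, sim):
        cell.diameter += self.rate
        if cell.diameter >= self.threshold:
            sim.divide(cell)


class Cell:
    """Hypothetical simulation object with a unique id and attached modules."""

    def __init__(self, uid, diameter, modules=()):
        self.uid = uid
        self.diameter = diameter
        self.modules = list(modules)


class Simulation:
    def __init__(self):
        self.cells = []
        self._uids = itertools.count()
        self._new_cells = []  # daughters become visible after the current step

    def add_cell(self, diameter, modules=()):
        cell = Cell(next(self._uids), diameter, modules)
        self.cells.append(cell)
        return cell

    def divide(self, mother):
        """Cell division event: handle module copy/remove, create a daughter."""
        mother.diameter /= 2.0  # simplification, not volume-conserving
        copied = [copy.copy(m) for m in mother.modules if m.copy_to_new]
        mother.modules = [m for m in mother.modules
                          if not m.remove_from_existing]
        daughter = Cell(next(self._uids), mother.diameter, copied)
        self._new_cells.append(daughter)
        return daughter

    def step(self):
        """Run every biology module of every cell once."""
        for cell in list(self.cells):
            for module in list(cell.modules):
                module.run(cell, self)
        self.cells.extend(self._new_cells)
        self._new_cells = []
```

Deferring the insertion of new cells to the end of `step` is one possible design choice; it keeps the set of objects stable while the modules of the current step are still running.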

slide-25
SLIDE 25

Simulation objects

slide-26
SLIDE 26

Spatial organization

Source: Ahmad Hesam

slide-27
SLIDE 27

Biological behavior

slide-28
SLIDE 28

Physical processes

  • Mechanical interactions
  • Diffusion
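
As an illustration of what a diffusion step can look like numerically, here is a minimal Python sketch of an explicit finite-difference update on a 1D grid with zero-flux boundaries. This is one common textbook scheme chosen for illustration; the slides do not state which scheme BioDynaMo uses, and the function name `diffuse_step` and its parameters are hypothetical.

```python
def diffuse_step(conc, diffusion_coeff, dx, dt):
    """One explicit Euler step of dc/dt = D * d2c/dx2 on a 1D grid.

    conc is a list of concentrations, one per grid cell. Boundaries are
    zero-flux (no substance leaves the domain). The explicit scheme is only
    stable for D * dt / dx**2 <= 0.5, which is checked up front.
    """
    alpha = diffusion_coeff * dt / dx ** 2
    if alpha > 0.5:
        raise ValueError("unstable time step: D*dt/dx^2 must be <= 0.5")
    n = len(conc)
    new_conc = [0.0] * n
    for i in range(n):
        # At the boundaries, mirror the cell's own value (zero flux).
        left = conc[i - 1] if i > 0 else conc[i]
        right = conc[i + 1] if i < n - 1 else conc[i]
        new_conc[i] = conc[i] + alpha * (left - 2.0 * conc[i] + right)
    return new_conc
```

Starting from a single peak such as `[0.0, 0.0, 1.0, 0.0, 0.0]`, repeated calls spread the substance to neighboring grid cells while the total amount stays constant.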
slide-29
SLIDE 29

Performance

  • Minimize serial part of the application
    – Amdahl’s law: https://en.wikipedia.org/wiki/Amdahl%27s_law
  • Load balance
  • Optimize data access patterns
  • Avoid unnecessary data movement
  • Minimize synchronization
  • Use caches
  • Pitfalls when measuring performance
    – http://htor.inf.ethz.ch/publications/img/hoefler-scientific-benchmarking_slides.pdf
    – From: Scalability! But at what COST?
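
Amdahl's law makes the first bullet concrete: if a fraction p of the runtime can be parallelized across n workers, the speedup is 1 / ((1 - p) + p / n), so the serial part (1 - p) bounds the achievable speedup at 1 / (1 - p). A small Python sketch, with hypothetical function names:

```python
def amdahl_speedup(parallel_fraction, workers):
    """Speedup predicted by Amdahl's law.

    parallel_fraction: share of the runtime that can run in parallel (0..1).
    workers: number of parallel workers (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def max_speedup(parallel_fraction):
    """Upper bound on the speedup as the number of workers grows without limit."""
    if not 0.0 <= parallel_fraction < 1.0:
        raise ValueError("parallel_fraction must be in [0, 1)")
    return 1.0 / (1.0 - parallel_fraction)
```

For example, with 95% of the runtime parallelized, the bound is a factor of about 20 no matter how many cores are added, which is why shrinking the serial part comes first in the list.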

slide-30
SLIDE 30

Current status

  • Modular simulation engine
  • Fully parallelized with OpenMP
  • GPU & FPGA implementation for mechanical interactions using CUDA and OpenCL
  • First version of distributed runtime based on the framework Ray
  • ROOT I/O for storage of simulation results and snapshots
  • Visualization using ParaView and
slide-31
SLIDE 31

Demos

slide-32
SLIDE 32

“Hello World” Simulation

slide-33
SLIDE 33

Chemotaxis

slide-34
SLIDE 34

Soma clustering 1/2

  • Simulation at timestep 0
  • Cells are color coded by their type
  • Simulation at the end
  • As expected, cells form clusters based on their type

slide-35
SLIDE 35

Soma clustering 2/2

slide-36
SLIDE 36

Tumor concept 1/2

Slide credit: Jean De Montigny

slide-37
SLIDE 37

Tumor concept 2/2

Slide credit: Jean De Montigny

slide-38
SLIDE 38

Neuroscience Demo

slide-39
SLIDE 39

Overview

Image: https://en.wikipedia.org/wiki/File:Brainmaps-macaque-hippocampus.jpg used under CC Attribution 3.0

slide-40
SLIDE 40

Model

NeuriteElement NeuronSoma

slide-41
SLIDE 41

Simulation

  • Single pyramidal cell
  • Neurite elements are colored based on their diameter

Simulation: Jean De Montigny

slide-42
SLIDE 42

Animation

Simulation: Jean De Montigny

slide-43
SLIDE 43

Comparison with real neurons

Simulation and Analysis: Jean De Montigny

slide-44
SLIDE 44

Large-Scale Simulation

  • 80k Neurons
  • ~2M simulation objects
slide-45
SLIDE 45

Questions?

Lukas.Breitwieser@cern.ch