SLIDE 1 How to build a large-scale biological simulator
CERN openlab summer student lecture Lukas Breitwieser
SLIDE 2 Help life scientists understand (patho)physiological processes
Source: Kaiser, University of Newcastle, UK; www.dynamic-connectome.org
SLIDE 3
From atoms to organisms
SLIDE 4
Agent-based simulations
SLIDE 5
Platform
SLIDE 6 BioDynaMo design goals
- Modular system that supports different fields
(e.g. neuroscience, oncology, immunology, ...)
- Support large-scale biological simulations
- Hide complexity of parallel and distributed
computing
- Promote reproducibility of results
SLIDE 7 Is this a good research idea?
- Why is this question important?
– Most ideas fail
– Good ones take a long time to implement
→ Terminate bad ideas quickly
- How to determine which ideas to pursue?
– Heilmeier catechism
- What are you trying to do? Articulate your objectives
using absolutely no jargon.
- How is it done today, and what are the limits of current
practice?
- What is new in your approach and why do you think it will
be successful?
Slide credit: Bill Dally; https://www.darpa.mil/work-with-us/heilmeier-catechism George H. Heilmeier
SLIDE 8 How to answer all these questions?
Levels of transformation: Electrons, Hardware, System Software, BioDynaMo, Computational Model, Research Question
Timeline: Now
- Look back
- Look forward
- Look up
- Look down
SLIDE 9 Look back
- Literature review
- How to review a research paper?
– Summary
- What is the problem the paper is trying to solve? What
are the key ideas of the paper? Key insights?
- What is the key contribution to literature at the time it was
written?
- What are the most important things you take out from it?
– Strengths
– Weaknesses
– Can you do better?
– What have you learned/enjoyed/disliked in the
Slide credit: Onur Mutlu
SLIDE 10 Look forward
- Research is a moving target
Aim here
Inspired by: Bill Dally, Moving the needle
SLIDE 11 Look up
- What do users expect from the
system?
- Which workflows are they used to?
- Which technologies are they
familiar with?
- What kind of models will they run?
Levels of transformation: Electrons, Hardware, System Software, BioDynaMo, Computational Model, Research Question
SLIDE 12 Look down
Source: Onur Mutlu; Andrzej Nowak; http://www.iue.tuwien.ac.at/phd/weinbub/dissertationsu16.html
- Abstraction: A higher level only needs to know
about the interface to the lower level, not how the lower level is implemented
- Then, why would you want to know what goes
on underneath?
– The program you wrote is running slowly?
– The program you wrote does not run correctly?
– ...
SLIDE 13
Design tradeoffs
SLIDE 14
Software Engineering Best Practices
SLIDE 15 Testing & Continuous Integration
- Essential to keep code base
maintainable
– Refactoring
- Reduces the risk of “touching”
others' code
- Protect reputation
– Ensure that software installs
fine on supported systems and demos work
SLIDE 16 Follow a styleguide
- Best practices which improve readability and maintainability of a code base
- Code is more often read than (re)written → Important that a developer quickly understands a piece of code
Avoid
Source: https://www.reddit.com/r/badcode/comments/bjsdyc/my_teach_kees_getting_mad_that_i_never_properly
SLIDE 17 Use existing libraries
- Instead of copy-pasting code from a textbook
or Stack Overflow
– Correctness
– Development effort
– Maintenance effort
- Questions to answer before adopting a library
– Is the license compatible?
– Is it actively maintained?
– Does it have an active user community?
– How big is the library?
– How many dependencies does it have?
SLIDE 18 Manage scope
- Lifecycle costs of applications over 10 years
Slide credit: Dr. Marc Brandis
SLIDE 19 Advice on debugging
- Remove complexity
- Isolate the issue
- Avoid ad-hoc solutions; find the root cause
5 Whys example from Uber:
– Why did the issue happen? → A bug was
committed as part of the code.
– Why did the bug not get caught by someone else?
→ The code reviewer did not notice that the code
change could cause such an issue.
– Why did we depend on only a code reviewer
catching this bug? → Because we don't have an automated test for this use case.
Source: https://blog.pragmaticengineer.com/operating-a-high-scale-distributed-system/
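The last "why" ends at a missing automated test. A minimal sketch of what such a regression test could look like is shown here. The function `mean_diameter` and the bug it once had (dividing by zero on an empty population) are hypothetical and are not taken from the Uber example or from BioDynaMo.

```python
import unittest


def mean_diameter(diameters):
    """Return the mean cell diameter, or 0.0 for an empty population."""
    if not diameters:
        # The fix: a (hypothetical) earlier version divided by zero here.
        return 0.0
    return sum(diameters) / len(diameters)


class MeanDiameterRegressionTest(unittest.TestCase):
    def test_typical_population(self):
        self.assertAlmostEqual(mean_diameter([10.0, 20.0]), 15.0)

    def test_empty_population_does_not_crash(self):
        # Pins the use case that slipped through code review,
        # so the same bug cannot be committed again unnoticed.
        self.assertEqual(mean_diameter([]), 0.0)
```

Saved as a test module, this runs with `python -m unittest`; wiring that command into continuous integration removes the dependency on a reviewer spotting the problem.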
SLIDE 20 Refactor
- Simplify program while running all the tests
- Because
– We all violate our own best practices from time to
time.
– A reliable, maintainable system is not built
overnight.
- Enabled by testing and continuous integration
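A small sketch of a refactoring step guarded by a check, assuming a hypothetical helper that counts cells per type. Both function names are illustrative; the point is only that the simplified version must pass the same checks as the original.

```python
from collections import Counter


def count_by_type_before(cell_types):
    """Original version: correct, but longer than it needs to be."""
    counts = {}
    for cell_type in cell_types:
        if cell_type in counts:
            counts[cell_type] = counts[cell_type] + 1
        else:
            counts[cell_type] = 1
    return counts


def count_by_type_after(cell_types):
    """Refactored version: same behavior, delegated to the standard library."""
    return dict(Counter(cell_types))


def check_same_behavior():
    """Run both versions on the same inputs and compare the results."""
    samples = ([], ["neuron"], ["neuron", "glia", "neuron"])
    for sample in samples:
        assert count_by_type_before(sample) == count_by_type_after(sample)


check_same_behavior()
```

Once the check passes, the old version can be deleted and the check stays in the test suite.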
SLIDE 21 Some examples that need refactoring
Source: https://www.reddit.com/r/badcode
SLIDE 22
BioDynaMo Implementation
SLIDE 23
BioDynaMo overview
SLIDE 24 BioDynaMo core concepts
Figure labels:
- Simulation objects: Cell, NeuronSoma, NeuriteElement
- Simulation algorithm
- Local neighborhood
- Biology modules: Grow, Divide, Move, Secrete substance into extracellular matrix
- Event: cell division event (UID 123 → 1st daughter UID 123, 2nd daughter UID 456); per biology module: copy to new, remove from existing
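The concepts on this slide can be illustrated with a short sketch. It is written in Python for brevity and is not the BioDynaMo API: every class and function name (`Cell`, `Grow`, `Divide`, `simulate`), the daughter placement, and the rule that division halves the volume are assumptions made for illustration.

```python
import itertools
from dataclasses import dataclass, field

_uid_counter = itertools.count(1)  # hypothetical UID generator


@dataclass
class Cell:
    """A simulation object: state plus the biology modules attached to it."""
    position: tuple
    diameter: float
    behaviors: list = field(default_factory=list)
    uid: int = field(default_factory=lambda: next(_uid_counter))


class Grow:
    """Biology module: increase the diameter by a fixed amount per timestep."""
    def __init__(self, rate):
        self.rate = rate

    def run(self, cell, new_cells):
        cell.diameter += self.rate


class Divide:
    """Biology module: divide once the diameter reaches a threshold."""
    def __init__(self, threshold):
        self.threshold = threshold

    def run(self, cell, new_cells):
        if cell.diameter < self.threshold:
            return
        # Cell division event (assumption: each daughter gets half the
        # volume, so the diameter shrinks by a factor of 2^(1/3)).
        cell.diameter /= 2 ** (1 / 3)
        x, y, z = cell.position
        daughter = Cell(
            position=(x + cell.diameter, y, z),  # arbitrary placement
            diameter=cell.diameter,
            behaviors=list(cell.behaviors),  # "copy to new" (shallow copy)
        )
        new_cells.append(daughter)  # the daughter gets a fresh UID


def simulate(cells, timesteps):
    """Simulation algorithm: run every module of every object per timestep."""
    for _ in range(timesteps):
        new_cells = []
        for cell in cells:
            for behavior in cell.behaviors:
                behavior.run(cell, new_cells)
        cells.extend(new_cells)  # daughters become active in the next step
    return cells
```

For example, `simulate([Cell((0.0, 0.0, 0.0), 10.0, [Grow(1.0), Divide(12.0)])], 3)` returns two cells with distinct UIDs. The first daughter keeps the mother's UID and the second receives a new one, as in the figure.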
SLIDE 25
Simulation objects
SLIDE 26 Spatial organization
Source: Ahmad Hesam
SLIDE 27
Biological behavior
SLIDE 28 Physical processes
interactions
SLIDE 29 Performance
- Minimize serial part of the application
– Amdahl’s law
https://en.wikipedia.org/wiki/Amdahl%27s_law
- Load balance
- Optimize data access patterns
- Avoid unnecessary data movement
- Minimize synchronization
- Use caches
- Pitfalls when measuring performance
http://htor.inf.ethz.ch/publications/img/hoefler-scientific-benchmarking_slides.pdf
From: Scalability! But at what COST?
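Amdahl's law from the first bullet can be made concrete with a few lines of code. This is a minimal sketch; the function and parameter names are illustrative. With a parallel fraction p and n workers, the speedup is bounded by 1 / ((1 - p) + p / n).

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the runtime scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def speedup_limit(parallel_fraction: float) -> float:
    """Speedup bound as the number of workers goes to infinity."""
    serial_fraction = 1.0 - parallel_fraction
    if serial_fraction == 0.0:
        return float("inf")
    return 1.0 / serial_fraction
```

If 95% of the runtime is parallel, the speedup can never exceed 20x, no matter how many cores are added. This is why minimizing the serial part comes first in the list.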
SLIDE 30 Current status
- Modular simulation engine
- Fully parallelized with OpenMP
- GPU & FPGA implementation for
mechanical interactions using CUDA and OpenCL
- First version of distributed
runtime based on the framework Ray
- ROOT I/O for storage of simulation
results and snapshots
- Visualization using ParaView and
SLIDE 31
Demos
SLIDE 32
“Hello World” Simulation
SLIDE 33
Chemotaxis
SLIDE 34
Soma clustering 1/2
- Simulation at timestep 0
- Cells are color-coded by their
type
- Simulation at the end
- As expected, cells form
clusters based on their type
SLIDE 35
Soma clustering 2/2
SLIDE 36 Tumor concept 1/2
Slide credit: Jean De Montigny
SLIDE 37 Tumor concept 2/2
Slide credit: Jean De Montigny
SLIDE 38
Neuroscience Demo
SLIDE 39 Overview
Image: https://en.wikipedia.org/wiki/File:Brainmaps-macaque-hippocampus.jpg used under CC Attribution 3.0
SLIDE 40 Model
NeuriteElement NeuronSoma
SLIDE 41 Simulation
- Single pyramidal cell
- Neurite elements are colored
based on their diameter
Simulation: Jean De Montigny
SLIDE 42 Animation
Simulation: Jean De Montigny
SLIDE 43 Comparison with real neurons
Simulation and Analysis: Jean De Montigny
SLIDE 44 Large-Scale Simulation
- 80k Neurons
- ~2M simulation objects
SLIDE 45
Questions?
Lukas.Breitwieser@cern.ch