CoreNeuron: Morphologically Detailed Neuron Simulations



slide-1
SLIDE 1

CoreNeuron : Morphologically Detailed Neuron Simulations

Building, Simulating and Optimizing Large Neuron Networks on GPUs

Pramod Kumbhar, Michael Hines & Blue Brain HPC Team

7th April 2016, GTC

slide-2
SLIDE 2

Understanding Brain..

  • 2013: BRAIN Initiative
  • 2013: Human Brain Project
  • 2014: Brain/MINDS

Better understanding of brain

  • 2007: Izhikevich: brain scale simulation on cluster, 1 million cells
  • 2009: IBM: cat's brain scale simulation on BG-P, 1.5 billion cells

See source1

slide-3
SLIDE 3

Brain Simulations

  • Point Neurons
  • Morphologically Detailed Neurons
  • Molecular Level

See source3 See source4

Blue Brain Project, EPFL

slide-4
SLIDE 4

Reverse Engineering Brain (10000 feet view)

See source5 See source6

slide-5
SLIDE 5

Reconstruction Workflow

  • 15+ years of experiments
  • 14,000 neurons recorded and labeled
  • 2,052 classified neurons
  • 1,009 reconstructed neurons
  • 2,000+ ion channel recordings
  • 4,000+ electrical recordings of single neurons
  • 5,000+ synaptic recordings of pairs of neurons

Markram et al. 2015, Cell

slide-6
SLIDE 6

Modeling Neuron

See source8 See source7 HH, 1952

slide-7
SLIDE 7
Ion Channels

  • Prominent components of nervous system
  • More than 300 ion channels

Biologist view: compartment model

Every channel is a compute kernel, no single hotspot!
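As a rough illustration of what "every channel is a compute kernel" can mean, the sketch below advances the gating variables of one Hodgkin-Huxley style channel for every compartment in a single loop. The rate functions are the textbook HH (1952) forms with the resting potential near -65 mV. The names `vtrap`, `hh_rates` and `hh_state_kernel`, the exponential-Euler update and the flat-list layout are assumptions made for this example, not code taken from CoreNeuron.

```python
import math

def vtrap(x, y):
    """x / (exp(x / y) - 1), with the removable singularity at x == 0 handled."""
    if abs(x / y) < 1e-6:
        return y * (1.0 - x / y / 2.0)
    return x / (math.exp(x / y) - 1.0)

def hh_rates(v):
    """Return (steady state, time constant in ms) for m, h and n at v (mV)."""
    am = 0.1 * vtrap(-(v + 40.0), 10.0)
    bm = 4.0 * math.exp(-(v + 65.0) / 18.0)
    ah = 0.07 * math.exp(-(v + 65.0) / 20.0)
    bh = 1.0 / (math.exp(-(v + 35.0) / 10.0) + 1.0)
    an = 0.01 * vtrap(-(v + 55.0), 10.0)
    bn = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return ((am / (am + bm), 1.0 / (am + bm)),
            (ah / (ah + bh), 1.0 / (ah + bh)),
            (an / (an + bn), 1.0 / (an + bn)))

def hh_state_kernel(v, m, h, n, dt):
    """Advance the gating variables of every compartment by dt (ms), in place."""
    for i in range(len(v)):  # iterations are independent: one GPU thread each
        (minf, mtau), (hinf, htau), (ninf, ntau) = hh_rates(v[i])
        m[i] += (1.0 - math.exp(-dt / mtau)) * (minf - m[i])
        h[i] += (1.0 - math.exp(-dt / htau)) * (hinf - h[i])
        n[i] += (1.0 - math.exp(-dt / ntau)) * (ninf - n[i])
```

Each channel type would contribute a loop of this shape over its own instances, which is consistent with the slide's remark that there is no single hotspot.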

slide-8
SLIDE 8

NMODL: Source to Source Compiler

[Figure: compiler pipeline. NMODL (DSL) → Lexical Analyzer → tokens → Syntax Analyzer → parse tree → Semantic Analyzer → parse tree → ~ Intermediate Representation → backends]

Backends:

  • OpenACC
  • C
  • Cuda
  • Cyme: SIMD

NMODL: Portable Performance
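The pipeline named on this slide (lexical analysis, syntax analysis, then one of several backends) can be sketched in a few lines. The input mini-language below (a single `name = expression` statement without function calls) is hypothetical and is not NMODL grammar; the function names, the choice of `#pragma omp simd` for the SIMD backend and the treatment of `dt` as the only scalar are likewise assumptions. Semantic analysis is skipped entirely.

```python
import re

TOKEN = re.compile(r"\s*(?:(\d+\.?\d*)|([A-Za-z_]\w*)|(.))")

def tokenize(src):
    """Lexical analysis: split a statement into (kind, text) tokens."""
    tokens = []
    for num, name, op in TOKEN.findall(src):
        if num:
            tokens.append(("num", num))
        elif name:
            tokens.append(("name", name))
        elif op.strip():
            tokens.append(("op", op))
    return tokens

def parse(tokens):
    """Syntax analysis: accept `name = expr` and return a small tree."""
    if len(tokens) < 3 or tokens[0][0] != "name" or tokens[1] != ("op", "="):
        raise SyntaxError("expected: name = expression")
    return {"target": tokens[0][1], "expr": tokens[2:]}

PRAGMAS = {  # backend -> loop annotation (assumed choices for this sketch)
    "c": "",
    "openacc": "#pragma acc parallel loop\n",
    "simd": "#pragma omp simd\n",
}

def emit(tree, backend, scalars=("dt",)):
    """Code generation: one loop over instances; non-scalar names become arrays."""
    def ref(tok):
        kind, text = tok
        return f"{text}[i]" if kind == "name" and text not in scalars else text
    rhs = " ".join(ref(t) for t in tree["expr"])
    body = f"for (int i = 0; i < n; i++) {{\n    {tree['target']}[i] = {rhs};\n}}"
    return PRAGMAS[backend] + body

def translate(src, backend="c"):
    """Run the whole pipeline on one statement and return C source text."""
    return emit(parse(tokenize(src)), backend)
```

Calling `translate("m = m + dt * (minf - m) / mtau", "openacc")` returns the same C loop as the plain `"c"` backend, prefixed with an OpenACC directive. Only the annotation differs between backends here, which is the simplest form of the "portable performance" idea.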

slide-9
SLIDE 9

OpenACC Kernels

  • Auto-generated kernels are wrapped in OpenACC and vectorisation-hint pragmas
  • OpenACC APIs are used to copy the complex data structures
  • User-defined data structures are a major challenge for many applications: AoS/SoA, vectorisation, memory coalescing etc.
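The AoS/SoA distinction mentioned here can be made concrete with a small sketch. It converts a list of per-instance records (array of structures) into one contiguous array per field (structure of arrays) and shows the resulting index arithmetic. All names (`aos_to_soa`, `soa_to_aos`, `aos_index`, `soa_index`) are illustrative assumptions, not part of any CoreNeuron API.

```python
from array import array

def aos_to_soa(instances, fields):
    """Turn a list of per-instance records into one contiguous array per field."""
    return {f: array("d", (inst[f] for inst in instances)) for f in fields}

def soa_to_aos(soa):
    """Inverse conversion: rebuild one record per instance."""
    fields = list(soa)
    n = len(soa[fields[0]])
    return [{f: soa[f][i] for f in fields} for i in range(n)]

def aos_index(instance, field, n_fields):
    """Flat offset of (instance, field) when records are stored back to back."""
    return instance * n_fields + field

def soa_index(instance, field, n_instances):
    """Flat offset of (instance, field) when each field is stored contiguously."""
    return field * n_instances + instance
```

With SoA, neighbouring instances of the same field sit at neighbouring offsets (stride 1), which is the access pattern that vector units and coalesced GPU loads favour. With AoS the stride equals the number of fields.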

slide-10
SLIDE 10

bksub:

    for (i = x; i < nodes; i++) {
        rhs[i] -= b[i] * rhs[parent[i]];
        rhs[i] /= d[i];
    }

GPU: Cell level Parallelism (some kernels)

[Figure: each thread of a warp walks the nodes of one cell, one node per step (step 0 to step 3).]

  • thread 0: node 1 2 3 4, parent[i] 0 1 2 3
  • thread 1: node 6 7 8 9, parent[i] 5 6 7 8
  • thread 31: node 156 157 158 159, parent[i] 155 156 157 158

Memory addresses accessed by a warp: stride = depth of tree

  • node_index 1 2 3 4, parent_index -1 1 2 3
  • node_index 5 6 7 8 9, parent_index -1 5 6 7 8
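The bksub loop on this slide, applied cell by cell, can be sketched as follows. The assumptions are that each cell occupies a contiguous index range with its root first, that every parent precedes its children, and that the system has already been triangularized so that `d[i] * x[i] + b[i] * x[parent[i]] = rhs[i]` holds for non-root nodes. The function names and the `cell_ranges` argument are made up for this example.

```python
def bksub(rhs, b, d, parent, first, last):
    """Back substitution for nodes first..last-1; their ancestors must be solved."""
    for i in range(first, last):
        rhs[i] -= b[i] * rhs[parent[i]]
        rhs[i] /= d[i]

def bksub_cells(rhs, b, d, parent, cell_ranges):
    """Cell-level parallelism: every (root, end) range is independent work,
    i.e. what a single GPU thread would own. Executed sequentially here."""
    for root, end in cell_ranges:
        rhs[root] /= d[root]
        bksub(rhs, b, d, parent, root + 1, end)
```

Within one cell the loop is inherently sequential, because node `i` needs the finished value of `parent[i]`. The parallelism comes only from running many cells side by side, which is why the slide restricts it to "some kernels".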

slide-11
SLIDE 11

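Cell Interleaving

[Figure: node renumbering for two cells, Cell 0 and Cell 1.]

  • node_index 2 4 6 8, parent_index -1 1 2 3
  • node_index 1 3 5 7 9, parent_index -1 5 6 7 8

Interleaved layout:

  • Nodes: 1 2 3 4 5 6 7 8 9 (all cells' roots first, then Cell 0 and Cell 1 nodes interleaved)
  • Parents: -1 -1 1 2 3 4 5 6 7 (roots' parents, then Cell 0 parent, Cell 1 parent, ...)

Memory addresses accessed by a warp

Permutations: ions, synapses, areas, point processes..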
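A minimal sketch of the cell interleaving idea on this slide: take one node from each cell in turn (all roots first, then the next node of every cell, and so on) and renumber the parent array to match. It assumes each cell is stored contiguously with parents before children; `interleave` and `permute_parents` are hypothetical helper names, and indices are 0-based.

```python
def interleave(cell_sizes):
    """Return order[new_index] = old_index, taking one node per cell in turn."""
    starts, total = [], 0
    for size in cell_sizes:
        starts.append(total)
        total += size
    order = []
    for position in range(max(cell_sizes)):
        for start, size in zip(starts, cell_sizes):
            if position < size:  # shorter cells simply drop out
                order.append(start + position)
    return order

def permute_parents(parent, order):
    """Renumber a parent array to match the new node order (-1 marks a root)."""
    new_of_old = {old: new for new, old in enumerate(order)}
    return [-1 if parent[old] < 0 else new_of_old[parent[old]] for old in order]
```

For two unbranched cells with 5 and 4 nodes, `interleave([5, 4])` gives `[0, 5, 1, 6, 2, 7, 3, 8, 4]`, and the permuted parents become `[-1, -1, 0, 1, 2, 3, 4, 5, 6]`: every node's parent sits exactly two slots earlier. The same permutation would have to be applied to every per-node array, which is presumably what the slide's list of ions, synapses, areas and point processes refers to.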

slide-12
SLIDE 12

Spike Exchange

[Figure, panels A to C: e-types, me-types and me-combinations per m-type, with L23 NBC as the highlighted example. e-types shown in panels B and C: bAC, cIR, cAC, cNAC, bNAC, dNAC. Shares shown: 37%, 31%, 5%, 25%, 1.5%, 1.5%.]

e-types: bAC (burst Accommodating), dNAC (delayed Non-accommodating), cSTUT (continuous Stuttering), bSTUT (burst Stuttering), cIR (continuous Irregular), bIR (burst Irregular), cAD (continuous Adapting), cAC (continuous Accommodating), dSTUT (delayed Stuttering), cNAC (continuous Non-accommodating), bNAC (burst Non-accommodating)

m-types: L1DAC L1NGC-DA L1NGC-SA L1HAC L1LAC L1SAC L23PC L23MC L23BTC L23DBC L23BP L23NGC L23LBC L23NBC L23SBC L23ChC L4PC L4SP L4SS L4MC L4BTC L4DBC L4BP L4NGC L4LBC L4NBC L4SBC L4ChC L5TTPC1 L5TTPC2 L5UTPC L5STPC L5MC L5BTC L5DBC L5BP L5NGC L5LBC L5NBC L5SBC L5ChC L6TPCL1 L6TPCL4 L6UTPC L6IPC L6BPC L6MC L6BTC L6DBC L6BP L6NGC L6LBC

  • Electrical diversity: 11 e-types; 207 me-types
  • Number of connections increases exponentially
  • Different types of events

NetCon List

Buffering Mechanism
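The slide only names a NetCon list and a buffering mechanism, so the following is a sketch under assumptions rather than a description of CoreNeuron: spikes are buffered locally, flushed at an exchange point, fanned out through a per-source connection list, and delivered from a time-ordered queue. The class `SpikeRouter` and all of its methods are hypothetical.

```python
import heapq

class SpikeRouter:
    """Per-source connection lists plus a time-ordered delivery queue."""

    def __init__(self):
        self.netcons = {}   # source gid -> [(target, delay_ms, weight), ...]
        self.outgoing = []  # spikes buffered since the last exchange
        self.queue = []     # heap of (delivery_time, seq, target, weight)
        self._seq = 0       # tie-breaker so heap entries never compare targets

    def connect(self, gid, target, delay, weight):
        self.netcons.setdefault(gid, []).append((target, delay, weight))

    def record_spike(self, gid, t):
        self.outgoing.append((gid, t))

    def exchange(self):
        """Flush the buffer; an MPI run would gather spikes from all ranks here."""
        spikes, self.outgoing = self.outgoing, []
        for gid, t in spikes:
            for target, delay, weight in self.netcons.get(gid, ()):
                heapq.heappush(self.queue, (t + delay, self._seq, target, weight))
                self._seq += 1
        return len(spikes)

    def deliver(self, t_now):
        """Pop every event due at or before t_now, earliest first."""
        due = []
        while self.queue and self.queue[0][0] <= t_now:
            when, _, target, weight = heapq.heappop(self.queue)
            due.append((when, target, weight))
        return due
```

Buffering keeps communication out of the per-timestep path: a spike costs only a list append until the next exchange, and the fan-out to targets happens once per exchange rather than once per spike.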

slide-13
SLIDE 13

CoreNeuron

  • Compute engine of NEURON simulator
  • Being developed for large scale simulations (28 racks BG-Q)
  • Ion Channels: ~85% time
  • Linear Algebra: ~5-7%
  • Spike Exchange: 7-10%

slide-14
SLIDE 14

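Timeline

[Figure: timeline of CPU and GPU activity. Phases labeled: load, setup, copy, initialize; within each dt: current, solve, state, threshold; spike handling: queue, MPI, queue, once per mindelay.]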
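One plausible reading of the timeline labels is a fixed-step loop that runs the per-dt phases in order and exchanges spikes once per mindelay interval. The sketch below encodes only that control flow. The function `run`, its arguments and the returned log are assumptions for illustration; the phases themselves are passed in as callbacks.

```python
def run(t_stop, dt, mindelay, phases, exchange):
    """Fixed-step loop: call every phase each dt, exchange spikes every mindelay."""
    steps_per_exchange = max(1, int(round(mindelay / dt)))
    n_steps = int(round(t_stop / dt))
    log = []
    for step in range(1, n_steps + 1):
        t = step * dt
        for name, fn in phases:  # e.g. current, solve, state, threshold
            fn(t)
            log.append(name)
        if step % steps_per_exchange == 0:
            exchange(t)
            log.append("exchange")
    return log
```

Exchanging only once per mindelay is safe because a spike emitted at time t cannot reach any target before t + mindelay, so nothing that happens inside one interval can influence another rank within that same interval.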

slide-15
SLIDE 15

Toolchain

slide-16
SLIDE 16

Simple model…

[Figure: a cell drawn as a soma plus compartments, with mechanism labels hh, pas, pas, pas, pas, pas.]

slide-17
SLIDE 17

Performance

  • K20x vs 8-core Xeon
  • Cray (OpenACC)
  • Cuda 7
  • No “hand” tuning yet
  • Optimized CPU code with vectorization

[Chart: Model A, Model B, Model C, Model D, Model E (larger cells). Varying #: Rings, Cells, Branches, Compartments. Values shown: 65536, 3072, 1024.]

Real World Models

Ion channels are 4-8x faster in all models! Kernels with cell level parallelism, low occupancy!
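To put the 4-8x figure in context, Amdahl's law gives the whole-simulation speedup when only the ion channel kernels get faster. The calculation below assumes that the "Ion Channels: ~85% time" share quoted on the CoreNeuron slide also applies to these models and that everything else runs at unchanged speed. Both are assumptions, not measurements from the talk.

```python
def overall_speedup(fraction, kernel_speedup):
    """Amdahl's law: only `fraction` of the runtime is accelerated."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)

low = overall_speedup(0.85, 4.0)   # ion channels 4x faster
high = overall_speedup(0.85, 8.0)  # ion channels 8x faster
ceiling = 1.0 / (1.0 - 0.85)       # limit for infinitely fast ion channels
```

Under those assumptions the overall speedup lands between about 2.8x and 3.9x, with a ceiling near 6.7x, so the remaining 15% (linear algebra and spike exchange) quickly becomes the limiting part.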

slide-18
SLIDE 18

Performance

Varying # : Rings, Cells, Branches, Compartments (values shown: 393216, 65536, 4096, 3072)

  • 8-core Xeon / 8 MPI ranks
  • BG-Q Node / 32-64 threads
  • Xeon Phi 61 core @ 1.23 GHz, 180-240 threads
  • K20X GPU
  • Optimized Xeon/MIC code with vectorization (XLC issue)

slide-19
SLIDE 19

Cell Interleaving and Exposing Parallelism

  • Homogeneous: ideal
  • Heterogeneous: ill-suited

How much parallelism? How much imbalance?
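One way to put a number on "how much imbalance" is to count useful lane-steps when each lane of a warp walks one cell in lockstep, so the warp runs as long as its largest cell. The metric, the function name `warp_efficiency` and the default warp size of 32 are assumptions made for this sketch.

```python
def warp_efficiency(cell_sizes, warp_size=32):
    """Fraction of useful lane-steps when each lane walks one cell in lockstep."""
    useful = total = 0
    for start in range(0, len(cell_sizes), warp_size):
        warp = cell_sizes[start:start + warp_size]
        useful += sum(warp)                # node updates that do real work
        total += warp_size * max(warp)     # lanes x steps the warp occupies
    return useful / total
```

A homogeneous population such as 64 cells of 100 nodes scores 1.0. A warp holding 31 cells of 10 nodes and one cell of 1000 nodes scores about 0.04, since 31 lanes idle for most of the 1000 steps.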

slide-20
SLIDE 20

Morphological diversity challenge

“Morphology Aware Scheduling of Kernels using Isomorphic Subtrees”
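The transcript gives only the title, so the referenced scheduling method itself is not described here. As background, the sketch below shows one standard way to detect isomorphic rooted subtrees: assign bottom-up canonical labels, so that two nodes get the same label exactly when their subtrees have the same shape. It assumes `parent[i] < i` for every non-root node; `subtree_groups` is a hypothetical name and this is not claimed to be the method behind the quoted title.

```python
from collections import defaultdict

def subtree_groups(parent):
    """Group nodes whose subtrees have the same shape.

    parent[i] is the parent of node i (-1 for a root), with parent[i] < i.
    """
    children = defaultdict(list)
    for node, p in enumerate(parent):
        if p >= 0:
            children[p].append(node)
    shape = [None] * len(parent)
    ids = {}  # canonical child-shape tuple -> small integer label
    for node in range(len(parent) - 1, -1, -1):  # children before parents
        key = tuple(sorted(shape[c] for c in children[node]))
        shape[node] = ids.setdefault(key, len(ids))
    groups = defaultdict(list)
    for node, label in enumerate(shape):
        groups[label].append(node)
    return list(groups.values())
```

Nodes in the same group root identically shaped subtrees, so in principle they could be processed with the same step pattern, which is the kind of regularity a lockstep warp needs.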

slide-21
SLIDE 21

Resource

Reconstruction and Simulation of Neocortical Microcircuitry

Henry Markram,1,2,19,* Eilif Muller,1,19 Srikanth Ramaswamy,1,19 Michael W. Reimann,1,19 Marwan Abdellah,1 Carlos Aguado Sanchez,1 Anastasia Ailamaki,16 Lidia Alonso-Nanclares,6,7 Nicolas Antille,1 Selim Arsever,1 Guy Antoine Atenekeng Kahou,1 Thomas K. Berger,2 Ahmet Bilgili,1 Nenad Buncic,1 Athanassia Chalimourda,1 Giuseppe Chindemi,1 Jean-Denis Courcol,1 Fabien Delalondre,1 Vincent Delattre,2 Shaul Druckmann,4,5 Raphael Dumusc,1 James Dynes,1 Stefan Eilemann,1 Eyal Gal,4 Michael Emiel Gevaert,1 Jean-Pierre Ghobril,2 Albert Gidon,3 Joe W. Graham,1 Anirudh Gupta,2 Valentin Haenel,1 Etay Hay,3,4 Thomas Heinis,1,16,17 Juan B. Hernando,8 Michael Hines,12 Lida Kanari,1 Daniel Keller,1 John Kenyon,1 Georges Khazen,1 Yihwa Kim,1 James G. King,1 Zoltan Kisvarday,13 Pramod Kumbhar,1 Sébastien Lasserre,1,15 Jean-Vincent Le Bé,2 Bruno R.C. Magalhães,1 Angel Merchán-Pérez,6,7 Julie Meystre,2 Benjamin Roy Morrice,1 Jeffrey Muller,1 Alberto Muñoz-Céspedes,6,7 Shruti Muralidhar,2 Keerthan Muthurasa,1 Daniel Nachbaur,1 Taylor H. Newton,1 Max Nolte,1 Aleksandr Ovcharenko,1 Juan Palacios,1 Luis Pastor,9 Rodrigo Perin,2 Rajnish Ranjan,1,2 Imad Riachi,1 José-Rodrigo Rodríguez,6,7 Juan Luis Riquelme,1 Christian Rössert,1 Konstantinos Sfyrakis,1 Ying Shi,1,2 Julian C. Shillcock,1 Gilad Silberberg,18 Ricardo Silva,1 Farhan Tauheed,1,16 Martin Telefont,1 Maria Toledo-Rodriguez,14 Thomas Tränkler,1 Werner Van Geit,1 Jafet Villafranca Díaz,1 Richard Walker,1 Yun Wang,10,11 Stefano M. Zaninetta,1 Javier DeFelipe,6,7,20 Sean L. Hill,1,20 Idan Segev,3,4,20 and Felix Schürmann1,20

1Blue Brain Project, École polytechnique fédérale de Lausanne (EPFL) Biotech Campus, 1202 Geneva, Switzerland
2Laboratory of Neural Microcircuitry, Brain Mind Institute, EPFL, 1015 Lausanne, Switzerland
3Department of Neurobiology, Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
4The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
5Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
6Laboratorio Cajal de Circuitos Corticales, Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, 28223 Madrid, Spain
7Instituto Cajal (CSIC) and CIBERNED, 28002 Madrid, Spain
8CeSViMa, Centro de Supercomputación y Visualización de Madrid, Universidad Politécnica de Madrid, 28223 Madrid, Spain
9Modeling and Virtual Reality Group, Universidad Rey Juan Carlos, 28933 Móstoles, Madrid, Spain
10Key Laboratory of Visual Science and National Ministry of Health, School of Optometry and Ophthalmology, Wenzhou Medical College, Wenzhou 325003, China
11Caritas St. Elizabeth’s Medical Center, Genesys Research Institute, Tufts University, Boston, MA 02111, USA
12Department of Neurobiology, Yale University, New Haven, CT 06510 USA
13MTA-Debreceni Egyetem, Neuroscience Research Group, 4032 Debrecen, Hungary
14School of Life Sciences, University of Nottingham, Nottingham NG7 2UH, United Kingdom
15Laboratoire d’informatique et de visualisation, EPFL, 1015 Lausanne, Switzerland
16Data-Intensive Applications and Systems Lab, EPFL, 1015 Lausanne, Switzerland
17Imperial College London, London SW7 2AZ, UK
18Department of Neuroscience, Karolinska Institutet, Stockholm 17177, Sweden
19Co-first author
20Co-senior author

*Correspondence: henry.markram@epfl.ch
http://dx.doi.org/10.1016/j.cell.2015.09.029

October 8, 2015 ©2015 Elsevier Inc.

slide-22
SLIDE 22

THANK YOU!

slide-23
SLIDE 23

Explanatory Graphic Sources

  • source 1: dgallery.s3.amazonaws.com
  • source 2: clipartbest.com/cliparts, scaryforkids.com, geniusawakening.com
  • source 3: developer.humanbrainproject.eu
  • source 4: nature.com
  • source 5: deviantart.net, squarespace.com
  • source 6: lcn.epfl.ch
  • source 7: nature.com
  • source 8: genesis-sim.org
slide-24
SLIDE 24

Backup

[Chart: Model A, Model B, Model C, Model D, Model E (larger cells). Values shown: 65536, 2048, 1024. Annotation: low occupancy!]