Tadpole on FPGA: mapping floating-point equations into integers



slide-1
SLIDE 1


Tadpole on FPGA

mapping floating-point equations into integers using code-generation

Mike Hull

Computer Laboratory University of Cambridge

19th Dec. 2013

() Tadpole on FPGA 19th Dec. 2013 1 / 20

slide-2
SLIDE 2

The tadpole

- Ph.D. in modelling with the experimental lab of Alan Roberts (Bristol)
- Brief touch -> sustained swimming
- “How does a tadpole wire up a neuronal network within 48 hours that can generate behaviour?”


slide-3
SLIDE 3

Why the tadpole?

“While not widely regarded as such, it has arguably become the best-understood spinal cord locomotor network in terms of network organization and functional properties.” [1]

[1] Parker, J Physiol 2009


slide-4
SLIDE 4

Why the tadpole?

‘Simple’, well-characterised nervous system:

- direct in situ electrophysiological recordings
- constrained anatomical layout
- behavioural studies

Computational model of swimming:

- Hodgkin-Huxley type biophysical models (≈ 1500 neurons)
- Sodium, two potassium and calcium voltage-gated channels
- Electrical coupling via gap junctions
- ≈ 100,000 AMPA, NMDA and glycinergic synaptic connections (growth model)


slide-5
SLIDE 5

Equations and units

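$$
\underbrace{i_{ca}}_{\mathrm{pA}}
= \underbrace{A}_{\mu\mathrm{m}^2}
\cdot \underbrace{P_{Ca}}_{\mathrm{cm/s}}
\cdot \underbrace{2\nu F}_{\mathrm{C/mol}}
\cdot \underbrace{\frac{[\mathrm{Ca}^{2+}]_i - [\mathrm{Ca}^{2+}]_o\, e^{-\nu}}{1 - e^{-\nu}}}_{\mathrm{mmol/liter}}
$$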

m2
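The mixed units in this equation are the reason a units-aware description is attractive. As a rough illustration, the sketch below evaluates the same expression by hand in SI units. The function name `i_ca_pA`, the default temperature, and the definition ν = 2FV/(RT) are assumptions made for this example; the slide does not define ν or give parameter values.

```python
import math

FARADAY = 96485.33212        # C/mol
GAS_CONSTANT = 8.314462618   # J/(mol K)


def i_ca_pA(area_um2, p_ca_cm_per_s, v_volts, ca_in_mM, ca_out_mM,
            temp_kelvin=298.15):
    """Calcium current in pA, converting every input to SI by hand."""
    area_m2 = area_um2 * 1e-12            # um^2 -> m^2
    p_ca_m_per_s = p_ca_cm_per_s * 1e-2   # cm/s -> m/s
    # mmol/liter is numerically equal to mol/m^3, so no factor is needed.

    # Assumed definition of nu for a divalent ion (z = 2).
    nu = 2.0 * FARADAY * v_volts / (GAS_CONSTANT * temp_kelvin)

    if abs(nu) < 1e-9:
        # nu / (1 - e^-nu) tends to 1 as nu tends to 0.
        driving = ca_in_mM - ca_out_mM
    else:
        driving = (nu * (ca_in_mM - ca_out_mM * math.exp(-nu))
                   / -math.expm1(-nu))

    amps = area_m2 * p_ca_m_per_s * 2.0 * FARADAY * driving
    return amps * 1e12                    # A -> pA
```

Every conversion factor here is a place where a hand-written model can silently go wrong, which is the "units problem" that a units-aware model description is meant to remove.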


slide-6
SLIDE 6

Mapping to Integers

- Existing C-model: improve performance by running on an FPGA?
- Sequential + floating point (software emulation) → poor performance
- Mapping to integers:
  - Floating-point hardware is expensive in chip area, power and latency
  - Take advantage of BlueVec on FPGA
  - Other platforms without a floating-point unit (SpiNNaker)


slide-7
SLIDE 7

Mapping to Integers

- Time-consuming to manually convert complex equations
- Use a high-level description and map this to C++ simulation code
- [Bonus: take care of the units problem?]
- Used an experimental NineML-based library (‘neurounits’).


slide-8
SLIDE 8

Overview of mapping


slide-10
SLIDE 10

Example neurounits

    define_component gap_junction {
        <=> PARAMETER g_gj:(S)
        <=> INPUT V1:(V), V2:(V)
        i1 = g_gj * (V2 - V1)
        i2 = -i1
    }

    define_component passive_cell {
        <=> TIME t
        <=> PARAMETER g_leak
        C = 10pF
        V' = (i_leak + i_inj + i_syn) / C

        # Leak Channel:
        i_leak = {2nS} * ({-64mV} - V)

        # Injected Current:
        i_inj = [30pA] if [50ms < t < 100ms] else [0pA]

        # Synapse:
        i_syn = {300pS} * ({0mV} - V) * (B - A)
        A' = -A / {1.5ms}
        B' = -B / {5ms}

        on ampa_input {
            A = A + 1
            B = B + 1
        }
    }
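To make the dynamics of `passive_cell` concrete, here is a minimal floating-point sketch that integrates the same equations in SI units. The forward-Euler scheme, the 0.1 ms step, the initial voltage of -64 mV and the function name `simulate_passive_cell` are all assumptions for illustration; the slide does not say how the generated code integrates the equations.

```python
def simulate_passive_cell(t_stop=0.2, dt=1e-4, ampa_events=()):
    """Forward-Euler integration of the passive_cell equations (SI units).

    ampa_events is a sequence of times (seconds) at which the
    'on ampa_input' block fires. Returns a list of (t, V) samples.
    """
    C = 10e-12                      # 10 pF
    V, A, B = -64e-3, 0.0, 0.0      # assumed initial state
    event_steps = {round(t / dt) for t in ampa_events}
    trace = []

    for k in range(round(t_stop / dt)):
        t = k * dt
        if k in event_steps:        # on ampa_input { A = A + 1; B = B + 1 }
            A += 1.0
            B += 1.0

        i_leak = 2e-9 * (-64e-3 - V)
        i_inj = 30e-12 if 0.05 < t < 0.1 else 0.0
        i_syn = 300e-12 * (0.0 - V) * (B - A)

        # Evaluate all derivatives from the old state, then update.
        dV = (i_leak + i_inj + i_syn) / C
        dA = -A / 1.5e-3
        dB = -B / 5e-3
        V += dt * dV
        A += dt * dA
        B += dt * dB
        trace.append((t, V))

    return trace
```

With no synaptic input the voltage should rest at -64 mV and relax towards -64 mV + 30 pA / 2 nS = -49 mV during the current injection, with time constant C / g = 5 ms.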

slide-11
SLIDE 11

Example AST


slide-13
SLIDE 13

Encoding the fixed-point format of the nodes

In N bits, we can store integers in the range:

$$ -2^{N-1} < \mathrm{value\_int} < 2^{N-1} - 1 $$

Based on the range of each node in the AST, choose a suitable upscale factor, U:

$$ \mathrm{value\_float} = \frac{\mathrm{value\_int}}{2^{N-1}} \times 2^{U} $$

    U     2^U      Min-Value   Max-Value
    -8    3.9e-3   -3.9e-3     3.9e-3
    ...
    -4    0.0625   -0.0625     0.0625
    -3    0.125    -0.125      0.125
    -2    0.25     -0.25       0.25
    -1    0.5      -0.5        0.5
     0    1        -1.0        1.0
     1    2        -2.0        2.0
     2    4        -4.0        4.0
    ...
     8    256      -256        256

For example: −100 mV < V < 50 mV, i.e. −0.1 < V < 0.05 (in volts) → U = −3
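The encoding rule can be sketched in a few lines. The helper names below are hypothetical, and N = 24 is an inference rather than something the slide states: it is the width for which 2.5e-9 at U = -28 encodes to 5629500 and -0.05 at U = -4 encodes to -6710886, the constants that appear in the generated C++ later in the talk.

```python
import math

N_BITS = 24  # assumed word width, inferred from the generated constants


def choose_upscale(lo, hi):
    """Smallest U such that the range [lo, hi] fits inside +/- 2**U."""
    largest = max(abs(lo), abs(hi))
    return math.ceil(math.log2(largest))


def encode(value_float, upscale, n_bits=N_BITS):
    """value_int = value_float / 2**U * 2**(N-1), rounded to nearest."""
    return round(value_float / 2.0 ** upscale * 2 ** (n_bits - 1))


def decode(value_int, upscale, n_bits=N_BITS):
    """value_float = value_int / 2**(N-1) * 2**U."""
    return value_int / 2 ** (n_bits - 1) * 2.0 ** upscale
```

For the voltage example, `choose_upscale(-0.1, 0.05)` gives -3, matching the slide. Choosing the smallest U that still covers the range keeps as many bits as possible for resolution.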


slide-15
SLIDE 15

Example output C++ code

$$
i_{leak} = \underbrace{\underbrace{2.5\,\mathrm{nS}}_{U:-28} \times
\underbrace{\Bigl(\underbrace{-50\,\mathrm{mV}}_{U:-4} - \underbrace{V}_{U:-3}\Bigr)}_{U:-3}}_{U:-31}
$$

    for (int i = 0; i < NrnPopData::size; i++) {
        // ...
        d.i_leak[i] = (ScalarOp<-31>::mul(
            ScalarType<-28>(5629500),        // [Constant 2.5e-9]
            ScalarOp<-3>::sub(
                ScalarType<-4>(-6710886),    // [Constant -0.05]
                d.V[i])));
        // ...
    }
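The slide does not show how `ScalarOp` and `ScalarType` are implemented, so the following is only a guess at the integer arithmetic behind them: operands are aligned to the output scale with shifts, and a product is renormalised by 2^(N-1). All function names are hypothetical, and N = 24 is inferred from the constants above.

```python
N_BITS = 24  # assumed word width, inferred from the constants on the slide


def shift(x, k):
    """Shift left for k >= 0, arithmetic shift right otherwise."""
    return x << k if k >= 0 else x >> -k


def encode(value, u):
    return round(value / 2.0 ** u * 2 ** (N_BITS - 1))


def decode(value_int, u):
    return value_int / 2 ** (N_BITS - 1) * 2.0 ** u


def fx_sub(a, ua, b, ub, u_out):
    """a - b, with both operands first rescaled to the output scale."""
    return shift(a, ua - u_out) - shift(b, ub - u_out)


def fx_mul(a, ua, b, ub, u_out):
    """a * b; the raw product carries scale ua + ub and an extra 2**(N-1)."""
    return shift(a * b, ua + ub - u_out - (N_BITS - 1))


# Reproduce the i_leak expression for V = -64 mV.
v_int = encode(-0.064, -3)
diff = fx_sub(-6710886, -4, v_int, -3, -3)      # (-50 mV) - V
i_leak_int = fx_mul(5629500, -28, diff, -3, -31)
i_leak = decode(i_leak_int, -31)                # close to 2.5e-9 * 0.014
```

The output scale of the product is the sum of the operand scales (-28 + -3 = -31), so in this case the multiply needs only a single renormalising shift.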

slide-16
SLIDE 16

Exponentials

- Exponentials are common in biophysical models
- Implemented as linear-interpolated lookup tables.
- To maintain accuracy, the value of U varies to encode different values of exp(x).
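One way such a table could work is sketched below. The table range, the spacing of 1/16, the 24-bit width and the idea of storing a separate U with every entry are assumptions for this example; the talk only states that U varies across the range of exp(x). The input `x` is taken as a float for brevity.

```python
import math

N_BITS = 24          # assumed word width
STEP_LOG2 = 4        # assumed table spacing of 1/16
X_MIN, X_MAX = -8, 8


def encode(value, u):
    return round(value / 2.0 ** u * 2 ** (N_BITS - 1))


def decode(value_int, u):
    return value_int / 2 ** (N_BITS - 1) * 2.0 ** u


def build_table():
    """One (value_int, U) pair per sample, with U chosen per entry."""
    table = []
    for k in range(((X_MAX - X_MIN) << STEP_LOG2) + 1):
        y = math.exp(X_MIN + k / 2 ** STEP_LOG2)
        u = math.floor(math.log2(y)) + 1     # smallest U with y < 2**U
        table.append((encode(y, u), u))
    return table


EXP_TABLE = build_table()


def lut_exp(x, table=EXP_TABLE):
    """Linear interpolation between neighbouring entries, in integers."""
    pos = (x - X_MIN) * 2 ** STEP_LOG2
    k = min(int(pos), len(table) - 2)
    frac = round((pos - k) * 2 ** (N_BITS - 1))   # weight in [0, 1]
    (y0, u0), (y1, u1) = table[k], table[k + 1]
    y0 >>= (u1 - u0)      # exp is increasing, so u1 >= u0: align to u1
    y = y0 + (((y1 - y0) * frac) >> (N_BITS - 1))
    return decode(y, u1)
```

Letting U follow the magnitude of each entry keeps the relative precision roughly constant, whereas a single U for the whole table would leave almost no bits for the small values.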


slide-17
SLIDE 17

Example simulation construction

    network = Network()
    dIN_comp = neurounits.ComponentLibrary.instantiate_component('dIN')
    dINs = network.create_population(
        name='dINs',
        component=dIN_comp,
        size=30,
        parameters={
            'nmda_multiplier': 1.0,
            'ampa_multiplier': 1.0,
            'inj_current': '20 pA',
        })
    network.create_eventportconnector(
        name="dIN_dIN_NMDA",
        src_population=dINs,
        dst_population=dINs,
        src_port_name='spike',
        dst_port_name='recv_nmda_spike',
        delay='1 ms',
        connector=AllToAllConnector(0.2),
        parameter_map={'weight': "150 pS"})
    network.record_traces(dINs, 'V')
    results = CBasedEqnWriterFixedNetwork(
        network,
        output_c_filename='simulation1.cpp',
        CPPFLAGS='-DON_NIOS=false -DPC_DEBUG=false -DUSE_BLUEVEC=false',
        step_size=0.1e-3,
        run_until=1.0,
        as_float=False).results

slide-18
SLIDE 18

Results

On PC:

- Comparing traces of float/fixed-point simulations for a single Hodgkin-Huxley neuron
- Small 30-neuron single-sided model and full tadpole model give behaviourally similar outputs

Mapped full tadpole to take advantage of the BlueVec unit on the FPGA


slide-20
SLIDE 20

Results - Tadpole on FPGA

    Loop-phase                Time without BlueVec    Time with BlueVec
    Electrical coupling       4.5 ms (0.1%)           4.7 ms (15.0%)
    Solving state equations   4486.2 ms (99.4%)       4.2 ms (13.3%)
    Spike delivery            24.3 ms (0.5%)          22.4 ms (71.6%)
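The speedups implied by these timings can be worked out directly. The short calculation below uses only the figures from the table; the variable names are illustrative.

```python
# (time without BlueVec, time with BlueVec), in milliseconds
timings_ms = {
    "Electrical coupling": (4.5, 4.7),
    "Solving state equations": (4486.2, 4.2),
    "Spike delivery": (24.3, 22.4),
}

total_without = sum(without for without, _ in timings_ms.values())
total_with = sum(with_bv for _, with_bv in timings_ms.values())

# Whole loop: about 4515.0 ms down to about 31.3 ms, roughly 144x.
overall_speedup = total_without / total_with

# State equations alone: 4486.2 ms down to 4.2 ms, roughly 1068x.
state_without, state_with = timings_ms["Solving state equations"]
state_speedup = state_without / state_with
```

Once the state equations are vectorised, spike delivery becomes the dominant phase, which is why the overall gain is much smaller than the gain on the state equations alone.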


slide-21
SLIDE 21

Results

[Tadpole swimming video]


slide-24
SLIDE 24

Issues encountered

- Use a bounded optimiser to find the range of each node
- Division by zero & vector processing:

$$
\alpha_m(V) = \frac{25 - V}{10\left(e^{(25-V)/10} - 1\right)}
= \begin{cases}
100\,e^{(25-V)/10} & \text{if } |V - 25| < 10^{-5} \\[4pt]
\dfrac{25 - V}{10\left(e^{(25-V)/10} - 1\right)} & \text{otherwise}
\end{cases}
$$

- Introduce IfThenElse nodes and consider short-circuiting semantics.
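The interaction between division by zero and vector processing can be illustrated with plain Python lists standing in for vector lanes. This is a sketch under stated assumptions, not the generated code: the function names are hypothetical, and the near-singular branch returns the analytic limit x / (e^x - 1) → 1 as x → 0 rather than the expression printed on the slide.

```python
import math

EPS = 1e-5


def alpha_m_scalar(v):
    """Short-circuiting IfThenElse: only the selected branch is evaluated."""
    if abs(v - 25.0) < EPS:
        return 1.0                     # assumed limit value near V = 25
    x = (25.0 - v) / 10.0
    return x / (math.exp(x) - 1.0)


def alpha_m_lanes_naive(vs):
    """Vector-style select: both branches are computed for every lane,
    so a lane at V = 25 divides by zero even though its result would be
    discarded by the select."""
    xs = [(25.0 - v) / 10.0 for v in vs]
    far = [x / (math.exp(x) - 1.0) for x in xs]
    return [1.0 if abs(v - 25.0) < EPS else f for v, f in zip(vs, far)]


def alpha_m_lanes_safe(vs):
    """Patch the denominator in masked lanes before dividing."""
    mask = [abs(v - 25.0) < EPS for v in vs]
    xs = [(25.0 - v) / 10.0 for v in vs]
    raw = [math.exp(x) - 1.0 for x in xs]
    denoms = [1.0 if m else r for m, r in zip(mask, raw)]
    far = [x / d for x, d in zip(xs, denoms)]
    return [1.0 if m else f for m, f in zip(mask, far)]
```

Calling `alpha_m_lanes_naive([0.0, 25.0])` raises `ZeroDivisionError`, while `alpha_m_lanes_safe` returns the same values as the scalar version. In vector code the guard has to be applied to the operand as well as to the result.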


slide-25
SLIDE 25

Conclusions

- It is possible to simulate a model neuronal network, including complexities such as electrical coupling, calcium channels and NMDA voltage dependence, in fixed point.
- A minimal, human-readable language can be used to unambiguously specify complex dynamics of components, including inline units.
- Code-generation can produce efficient simulation code.
- Vector processing dramatically improved performance for this model: solving the state equations fell from 4486.2 ms to 4.2 ms.


slide-26
SLIDE 26

Acknowledgements

- Bob Merrison
- Robert Cannon
- Matt Naylor
- Simon Moore & the rest of the group

Thanks for listening - any questions?
