Scalable Multi-Precision Simulation of Spiking Neural Networks on GPU with OpenCL
Dmitri Yudanov
(Advanced Micro Devices, USA)
Leon Reznik
(Rochester Institute of Technology, USA)
WCCI 2012, IJCNN, June 12
Agenda
Motivation
OpenCL
SNN Simulation Platform
GPU Device Architecture
SNN Simulation Architecture
Results: Verification and Performance
Next Simulator Architecture
Conclusion
Q&A
Open Computing Language. Open standard maintained by Khronos Group
Four models:
Platform model
Memory model
Programming model
Execution model
Based on B Gaster et al., Heterogeneous Computing with OpenCL. Morgan Kaufmann Pub, 2011.
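To make the execution model concrete, the sketch below emulates in plain Python how a one-dimensional OpenCL index space (NDRange) is divided into work-groups and how each work-item obtains its global ID. The function name and the sizes are illustrative assumptions, not part of the presentation; a zero global offset is assumed, and the sketch requires the global size to be a multiple of the work-group size.

```python
def enumerate_work_items(global_size, local_size):
    """Yield (group_id, local_id, global_id) for a 1-D NDRange with zero offset."""
    if global_size % local_size != 0:
        raise ValueError("this sketch needs global_size to be a multiple of local_size")
    for group_id in range(global_size // local_size):
        for local_id in range(local_size):
            # Relation between the IDs a kernel sees through get_group_id(0),
            # get_local_id(0) and get_global_id(0) when the offset is zero
            global_id = group_id * local_size + local_id
            yield group_id, local_id, global_id


# Hypothetical launch: 256 work-items in work-groups of 64
items = list(enumerate_work_items(global_size=256, local_size=64))
print(len(items), items[0], items[-1])
```

On a real device the work-items of a work-group run concurrently; the nested loops here only show the index arithmetic.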
Based on AFDS11 presentation: M Houston and M Mantor. (2011, June) Fusion Developer Summit: AMD Graphics Core Next.
PS solver is based on sequential implementation of R Stewart and W Bair, "Spiking neural network simulation: numerical integration with the Parker-Sochacki method," Journal of Computational Neuroscience, vol. 27, no. 1, pp. 115-133, August 2009.
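The idea behind the Parker-Sochacki (PS) method can be sketched as follows, assuming the Izhikevich neuron model listed in the references and a constant input current. The solution is expanded as a power series in time, the coefficients are generated by a recurrence (the quadratic term becomes a Cauchy product), and terms are added until they fall below a tolerance, which is what makes the precision adjustable. This is a simplified illustration, not the simulator's solver: synaptic conductances and spike detection are omitted, and the function name, parameter values, and tolerance are assumptions.

```python
def ps_izhikevich_step(v0, u0, dt, current, a=0.02, b=0.2, tol=1e-12, max_order=50):
    """Advance (v, u) by dt (ms) with a power series built term by term.

    Model: v' = 0.04 v^2 + 5 v + 140 - u + I,  u' = a (b v - u).
    Returns the new v, the new u, and the series order that was used.
    """
    v = [v0]  # series coefficients of v(t)
    u = [u0]  # series coefficients of u(t)
    v_new, u_new = v0, u0
    dt_pow = 1.0
    for k in range(max_order):
        # Cauchy product: k-th series coefficient of v(t)**2
        v_sq_k = sum(v[j] * v[k - j] for j in range(k + 1))
        const = 140.0 + current if k == 0 else 0.0
        v_next = (0.04 * v_sq_k + 5.0 * v[k] - u[k] + const) / (k + 1)
        u_next = a * (b * v[k] - u[k]) / (k + 1)
        v.append(v_next)
        u.append(u_next)
        dt_pow *= dt
        dv, du = v_next * dt_pow, u_next * dt_pow
        v_new += dv
        u_new += du
        # Adaptive order: stop once the latest terms are below the tolerance
        if max(abs(dv), abs(du)) < tol:
            return v_new, u_new, k + 1
    return v_new, u_new, max_order


# Hypothetical neuron at rest, 0.25 ms step, constant current of 10
v1, u1, order_used = ps_izhikevich_step(-65.0, -13.0, 0.25, 10.0)
print(v1, u1, order_used)
```

Loosening `tol` lowers the number of terms needed per step, so accuracy can be traded for speed.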
Modified from T Harada and L Howes. (2011, Dec.) Heterogeneous Compute. [Online]. http://www.heterogeneouscompute.org/wordpress/wpcontent/uploads/2011/06/RadixSort.pdf
Radix sort example: 1-bit radix, LSD (least significant digit) sort.
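A 1-bit LSD radix sort can be sketched in a few lines. Each pass looks at one bit, starting from the least significant, and performs a stable split: keys with a 0 bit keep their relative order at the front, keys with a 1 bit follow. The destination of every key comes from a prefix sum, which is the step a GPU implementation runs in parallel. This sequential Python version is an illustration only; the function names and the 8-bit key width are assumptions.

```python
from itertools import accumulate


def split_by_bit(keys, bit):
    """Stable split: keys whose `bit` is 0 first, then keys whose `bit` is 1."""
    flags = [(k >> bit) & 1 for k in keys]
    is_zero = [1 - f for f in flags]
    # Exclusive prefix sum: number of 0-bit keys before each position
    zeros_before = [0] + list(accumulate(is_zero))[:-1]
    total_zeros = sum(is_zero)
    out = [None] * len(keys)
    for i, key in enumerate(keys):
        if flags[i] == 0:
            dest = zeros_before[i]
        else:
            # Number of 1-bit keys before position i is i - zeros_before[i]
            dest = total_zeros + (i - zeros_before[i])
        out[dest] = key
    return out


def lsd_radix_sort_1bit(keys, key_bits=8):
    """Sort non-negative integers below 2**key_bits, one bit per pass."""
    keys = list(keys)
    for bit in range(key_bits):
        keys = split_by_bit(keys, bit)
    return keys


data = [170, 45, 75, 90, 2, 24, 66, 45]
sorted_data = lsd_radix_sort_1bit(data)
print(sorted_data)
```

Stability of each pass is what lets the later, more significant bits preserve the order established by the earlier ones.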
A unit test for each kernel
A unified integration test with complete host-device verification
A variety of compilation modes
C++ preprocessor-driven
XML-driven search script for the best performing variant.
User Interface:
Perl script + XML
Microsoft VS
Network Size  Average Synapses  Average Events  Average Spikes  Total Synapse     GPU Time per  CPU Time per  Time Ratio
(neurons)     per Neuron        per Step        per Step        Count (millions)  Step (ms)     Step (ms)
2,100,000     90                230,000         2,522           190               13.5          659           48
131,000       1,458             370,000         257             191               5.7           279           48
16,000        11,677            300,000         25              191               3.2           283           88
Size-connection scalability in multi-precision networks with per-wavefront (WF) precision allocation.
1000 iterations, 250 µs step
Randomly-connected SNN with only AMPA synapses.
GPU: Radeon™ HD 7970, CPU: AMD Phenom™ II, 3.2 GHz (single thread)
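As a quick consistency check on the results above, the time ratio can be recomputed as CPU time per step divided by GPU time per step, and the synapse count as neurons times average synapses per neuron. The figures in the script are copied from the results; the variable and function names are illustrative. The reported values appear to be rounded, so agreement is only expected to within rounding.

```python
# (neurons, synapses per neuron, GPU ms per step, CPU ms per step,
#  reported synapse count in millions, reported time ratio)
rows = [
    (2_100_000, 90, 13.5, 659.0, 190, 48),
    (131_000, 1_458, 5.7, 279.0, 191, 48),
    (16_000, 11_677, 3.2, 283.0, 191, 88),
]


def time_ratio(gpu_ms, cpu_ms):
    """CPU time per step divided by GPU time per step."""
    return cpu_ms / gpu_ms


def synapses_millions(neurons, synapses_per_neuron):
    """Total synapse count in millions."""
    return neurons * synapses_per_neuron / 1e6


for neurons, syn, gpu_ms, cpu_ms, rep_syn, rep_ratio in rows:
    print(neurons,
          round(synapses_millions(neurons, syn), 1), rep_syn,
          round(time_ratio(gpu_ms, cpu_ms), 1), rep_ratio)
```

The three configurations hold the total synapse count roughly constant while trading network size against connectivity, and the ratio is largest for the smallest, most densely connected network.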
Out-of-order flow with event-based synchronization
Target-oriented synaptic matrix partitioning
Mixed hybrid and time-driven simulation flows
Variety of neuron models
STDP
Just-in-time spike-to-event expansion
Object-oriented design
Out-of-order execution flows
STDP feature
Linux support
Application examples
User interface (possibly a library with extensions to PyNN)
APU support
Other: root-cause Newton-Raphson divergence, just-in-time spike-to-event expansion, sort radix scalability.
Multi-precision, scalable (neurons, connections, precision) SNN simulation
OpenCL, Tahiti architecture.
Fully verified against the original CPU implementation.
Up to 90x faster than a single CPU thread.
R. Stewart and W. Bair, "Spiking neural network simulation: numerical integration with the Parker-Sochacki method," Journal of Computational Neuroscience, vol. 27, no. 1, pp. 115-133, Aug. 2009.
E. M. Izhikevich, "Simple model of spiking neurons," IEEE Transactions on Neural Networks, vol. 14, pp. 1569-1572, 2003.
B. Gaster, D. R. Kaeli, L. Howes, and P. Mistry, Heterogeneous Computing with OpenCL. Morgan Kaufmann Pub, 2011.
T. Harada and L. Howes. (2011, Dec.) "Introduction to GPU Radix Sort." Heterogeneous Compute. [Online]. http://www.heterogeneouscompute.org/wordpress/wpcontent/uploads/2011/06/RadixSort.pdf
M. Houston and M. Mantor. (2011, June) Fusion Developer Summit: AMD Graphics Core Next. [Online]. http://developer.amd.com/afds
D. Yudanov, M. Shaaban, R. Melton, and L. Reznik, "GPU-based simulation of spiking neural networks with real-time performance & high accuracy," in The 2010 International Joint Conference on Neural Networks (IJCNN), 2010, pp. 1-8.
Lee Howes, Dr. Wu-Chun Feng