A GPU-Accelerated 3D Kinematic Modeling Platform for Behavioral - PowerPoint PPT Presentation

A GPU-Accelerated 3D Kinematic Modeling Platform for Behavioral Neuroscience John Long, PhD Buzsáki Laboratory Neuroscience Institute New York University Langone Medical Center 03.17.2015

A little about me… Jose Carmena György Buzsáki

…and my previous work. Venkatraman et al. 2009 Koralek et al. 2012 Long and Carmena 2011 Long and Carmena 2013

Why am I at a GPU conference? • As a neuroscientist, I find small form factor, massively parallel computing machines intriguing. • For ease of interface and visualization, I often program in Matlab or Python, and I suffer agonizing computational bottlenecks, which has led me to GPUs. • I’ve had a fair amount of success applying GPU computing to my scientific work.

What I have in store for you… • An introduction to the work I do in behavioral neuroscience that led me to GPU computing. • A detailed description of one of the CUDA programs I have implemented in the context of my research. • Throughout, I’ll mention a workflow I’ve found useful for porting CUDA code into Matlab and Python.

Who reads the maps in the brain? Hippocampal “ place ” cells Sensory receptive fields (Hubel and Wiesel 1959) (O ’ Keefe and Nadel, 1978; O ’ Keefe and Recce, 1993) Geisler, Sirota, Zugaro, Robbe, Buzsaki, PNAS 2007 Lurilli et al. 2012

The State of the Art in Behavioral Neuroscience More and more neural data!

The State of the Art in Behavioral Neuroscience The behaving rat…

Advances in Motion Capture Corazza et al. 2006

Environment Construction 6 5 4 1 2 3

Multiple Camera Synchronization Lines to cameras Line to Amplipex system

Image Segmentation

Camera Calibration 3D to 2D perspective transformation Svoboda et al. 2005

Visual Hull Construction Visual Hull Algorithm modified from Forbes et al. 2006

Kinematic Model Design

Kinematic Model Manipulation Murray et al. 1994

Kinematic Model Fitting Generate Candidate Poses Compute Cost Components n j n j n i D j d ij M i d ij = ||M i – D j || 2 α ij = dot(n i ,n j ) Update Posterior Estimate Score Each Pose

Open Chain Kinematics P1 = t1a * t1b * t1c * t1d * t1e * P1; N1 = t1a * t1b * t1c * t1d * t1e * N1; N1 = N1-P1; P4 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * P4; N4 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * N4; N4 = N4-P4; P5 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * t5a * t5b * t5c * P5; N5 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * t5a * t5b * t5c * N5; N5 = N5-P5; P7 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * t5a * t5b * t5c * t6a * t6b * t6c * t7a * t7b * t7c * P7; N7 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * t5a * t5b * t5c * t6a * t6b * t6c * t7a * t7b * t7c * N7; N7 = N7-P7; “A mathematical introduction to robotic manipulation” by Murray, Li, and Sastry 1994

Open Chain Kinematics: On the GPU Exposing Parallelism //Thread parameters unsigned int hWID = threadIdx.x / halfWarpSz; unsigned int hWoff = threadIdx.x % halfWarpSz; • All 4x4 transformation matrices t i can be unsigned int x = hWoff % DimXY; unsigned int y = hWoff / DimXY; computed in parallel. • There are many shared computational //MATRIX REDUCTION: across temporary variables over twists float sum[2]; //1st reduction from 16, 4x4 matrices to 8, 4x4 matrices blocks. if(hWID < 8) { sum[0] = 0.0f; #pragma unroll for(int k = 0; k < 4; k++) { P1 = t1a * t1b * t1c * t1d * t1e * P1; sum[0] += Stwists[4*(2*hWID) + y][k]* N1 = t1a * t1b * t1c * t1d * t1e * N1; Stwists[4*(2*hWID+1) + k][x]; } N1 = N1-P1; Transtmp0[4*hWID + y][x] = sum[0]; }; P4 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * P4; __syncthreads(); N4 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * N4; N4 = N4-P4; P5 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * t5a * t5b * t5c * P5; N5 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * t5a * t5b * t5c * N5; N5 = N5-P5; P7 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * t5a * t5b * t5c * t6a * t6b * t6c * t7a * t7b * t7c * P7; N7 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * t5a * t5b * t5c * t6a * t6b * t6c * t7a * t7b * t7c * N7; N7 = N7-P7;

Open Chain Kinematics: On the GPU: Results Compute Time Comparison CUDA ported into Matlab via Mex single Matlab: mean = 12.6 sec Per frame compute time (seconds) • x22.5 speedup relative to single Matlab process parfor Matlab: mean = 8.2 sec • x14.6 speedup relative to parallel Matlab process (6 CPUs) • Qualitative speedup allowed for parameter tuning resulting in an average 50% reduction in per frame model fit error i.e. better model CUDA in Matlab: mean = 0.55 sec fits! • Promising approach to open chain kinematic Model Fit Comparison CUDA where you need it Per frame model fit error (a.u.) errors prior to tuning • Qualitative speedups mean more efficient science. • Work where you need to and let user friendly languages like Matlab and Python do the rest. errors after tuning Frame number

Putting it all together

Promising Directions Berman et al. 2014

Label Clusters Kinematic Modeling Behavioral Classification forward gaze Parameterize rearing Dynamics Wavelet Analysis tight scan 1 st principal component Cluster Embedded Dynamics T-SNE 2 nd principal component Map Dynamics 3 rd principal component Time (seconds)

Conclusion • Fitting kinematic models to 3D visual hull data is greatly accelerated by GPUs. • The framework I’ve presented can be applied to open chain kinematics models in general. • In science, Big Data too often means sitting around waiting to find out you need to run your analysis again. GPUs are a game changer. • Interfaces like mex (Matlab) and ctypes (Python) allow you to tackle the hard parts and be lazy about the easy parts. – You can incrementally deal with bottlenecks of decreasing priority.

Acknowledgements György Buzsáki The entire Buzsáki lab Andres Grosmark Antal Berenyi Thank you!

Kinematic Model Design

A GPU-Accelerated 3D Kinematic Modeling Platform for Behavioral - PowerPoint PPT Presentation

A GPU-Accelerated 3D Kinematic Modeling Platform for Behavioral Neuroscience John Long, PhD Buzski Laboratory Neuroscience Institute New York University Langone Medical Center 03.17.2015 A little about me Jose Carmena Gyrgy Buzski

Industrial Robots Industrial Robots Kinematic chains Kinematic chains Kinematic chains Kinematic

6. Kinematic GPS and Applications Tectonic Geodesy GEOS 655 Kinematic GPS Development of

NVGRAPH,FIREHOSE,PAGERANK GPU ACCELERATED ANALYTICS NOV 2016 Joe Eaton Ph.D. Accelerated

GPU-Accelerated GPU-Accelerated Large Vocabulary Continuous Speech Recognition Large

Status of GPU offloading on Wayland Axel Davy FOSDEM 2014 Status of GPU offloading on Wayland

Motivation to Learn GPGPU Julius Parulek Why to Learn About GPU? Computational power of GPU vs.

03 - kinematic equations - large deformations and growth 03 - kinematic equations where are

Kinematic Redundancy Robert Platt Northeastern University Kinematic Redundancy A

Introduction to the Kinematic source inversion (Example from Delouis et al., 2002) Kinematic

Picture This! Visualization on GPU Accelerated Supercomputers Peter Messmer, 11/15/2016 NVIDIA

GPU-accelerated similarity searching in a database of short DNA sequences Richard Wilton

Accelerated Reader What is Accelerated Reader? Accelerated Reader is the number one software

UNIFIED MEMORY ON PASCAL AND VOLTA Nikolay Sakharnykh - May 10, 2017 1 HETEROGENEOUS

Advancements in V-Ray RT GPU Vlado Koylazov, CTO & Co-founder Blagovest Taskov, RT GPU Team

Using a CUDA-Accelerated PGAS Model on a GPU Cluster for Bioinformatics Jorge

GPU-accelerated Data Management Data Processing on Modern Hardware Sebastian Bre TU Dortmund

Parent Student Orientation Meeting June 10, 2013 6:30 7:30 DeBakey High School BCM Academy

Hi, we re Disclaimer For personal use only This presentation is not a prospectus nor an offer

BRAINSTORM: What ideas do adults have about middle schoolers and teenagers? (Talk with your

Neuroscientists and the public 110, avenue des Pins ouest Montral (Qubec) Canada H2W 1R7

W3: Big Data: The Next Playground for HR Maria Zahra Nicole Caruana Managing Director of

Lsambassadren Swedish reading ambassador Kids do their internetdebute 2 yo 89 % of all

PUZZLED? Helping learners make the most of puzzles Clare Harris: for the QATESOL/QCAL mini

Getting the Most Out of OOH Lara Menzies Marketing Who is COMMB? Canadian OOH Marketing &

A GPU-Accelerated 3D Kinematic Modeling Platform for Behavioral - PowerPoint PPT Presentation

A GPU-Accelerated 3D Kinematic Modeling Platform for Behavioral Neuroscience John Long, PhD Buzski Laboratory Neuroscience Institute New York University Langone Medical Center 03.17.2015 A little about me Jose Carmena Gyrgy Buzski

Industrial Robots Industrial Robots Kinematic chains Kinematic chains Kinematic chains Kinematic

6. Kinematic GPS and Applications Tectonic Geodesy GEOS 655 Kinematic GPS Development of

NVGRAPH,FIREHOSE,PAGERANK GPU ACCELERATED ANALYTICS NOV 2016 Joe Eaton Ph.D. Accelerated

GPU-Accelerated GPU-Accelerated Large Vocabulary Continuous Speech Recognition Large

Status of GPU offloading on Wayland Axel Davy FOSDEM 2014 Status of GPU offloading on Wayland

Motivation to Learn GPGPU Julius Parulek Why to Learn About GPU? Computational power of GPU vs.

03 - kinematic equations - large deformations and growth 03 - kinematic equations where are

Kinematic Redundancy Robert Platt Northeastern University Kinematic Redundancy A

Introduction to the Kinematic source inversion (Example from Delouis et al., 2002) Kinematic

Picture This! Visualization on GPU Accelerated Supercomputers Peter Messmer, 11/15/2016 NVIDIA

GPU-accelerated similarity searching in a database of short DNA sequences Richard Wilton

Accelerated Reader What is Accelerated Reader? Accelerated Reader is the number one software

UNIFIED MEMORY ON PASCAL AND VOLTA Nikolay Sakharnykh - May 10, 2017 1 HETEROGENEOUS

Advancements in V-Ray RT GPU Vlado Koylazov, CTO &amp; Co-founder Blagovest Taskov, RT GPU Team

Using a CUDA-Accelerated PGAS Model on a GPU Cluster for Bioinformatics Jorge

GPU-accelerated Data Management Data Processing on Modern Hardware Sebastian Bre TU Dortmund

Parent Student Orientation Meeting June 10, 2013 6:30 7:30 DeBakey High School BCM Academy

Hi, we re Disclaimer For personal use only This presentation is not a prospectus nor an offer

BRAINSTORM: What ideas do adults have about middle schoolers and teenagers? (Talk with your

Neuroscientists and the public 110, avenue des Pins ouest Montral (Qubec) Canada H2W 1R7

W3: Big Data: The Next Playground for HR Maria Zahra Nicole Caruana Managing Director of

Lsambassadren Swedish reading ambassador Kids do their internetdebute 2 yo 89 % of all

PUZZLED? Helping learners make the most of puzzles Clare Harris: for the QATESOL/QCAL mini

Getting the Most Out of OOH Lara Menzies Marketing Who is COMMB? Canadian OOH Marketing &amp;

Advancements in V-Ray RT GPU Vlado Koylazov, CTO & Co-founder Blagovest Taskov, RT GPU Team

Getting the Most Out of OOH Lara Menzies Marketing Who is COMMB? Canadian OOH Marketing &