A GPU-Accelerated 3D Kinematic Modeling Platform for Behavioral Neuroscience
John Long, PhD Buzsáki Laboratory Neuroscience Institute New York University Langone Medical Center 03.17.2015
A little about me
György Buzsáki Jose Carmena
Venkatraman et al. 2009 Long and Carmena 2013 Long and Carmena 2011 Koralek et al. 2012
Iurilli et al. 2012 Geisler, Sirota, Zugaro, Robbe, Buzsáki, PNAS 2007
(O’Keefe and Nadel, 1978; O’Keefe and Recce, 1993)
(Hubel and Wiesel 1959)
Corazza et al. 2006
Svoboda et al. 2005
3D to 2D perspective transformation
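As a minimal sketch of a 3D to 2D perspective transformation, assuming an ideal pinhole camera with no lens distortion. The function name `project_point` and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are illustrative choices, not taken from the talk:

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to 2D pixel coordinates (ideal pinhole model)."""
    # World -> camera coordinates: X_cam = R * X_world + t
    x_cam = [
        sum(rotation[r][c] * point_world[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    if x_cam[2] <= 0:
        raise ValueError("point is behind the camera")
    # Perspective divide, then scale by focal length and shift by principal point
    u = fx * x_cam[0] / x_cam[2] + cx
    v = fy * x_cam[1] / x_cam[2] + cy
    return (u, v)


# Hypothetical camera at the world origin looking down +z
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_point([0.1, -0.2, 2.0], identity, [0.0, 0.0, 0.0],
                      fx=800.0, fy=800.0, cx=320.0, cy=240.0)
print(pixel)
```

The division by depth is what makes the mapping a perspective one: doubling the distance of a point from the camera halves its offset from the principal point.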
Visual Hull Algorithm modified from Forbes et al. 2006
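The basic idea behind a visual hull can be sketched as silhouette intersection: a voxel is kept only if it projects inside the silhouette seen by every camera. This is a generic voxel-carving sketch, not the modified Forbes et al. algorithm used in the talk; the function name `carve_visual_hull` and the toy orthographic cameras are assumptions for illustration:

```python
def carve_visual_hull(voxel_centers, cameras):
    """Keep voxels whose projection lands inside the silhouette of every camera.

    cameras: list of (project, silhouette) pairs, where project maps a 3D point
    to integer (row, col) pixel indices and silhouette is a 2D list of 0/1.
    """
    hull = []
    for voxel in voxel_centers:
        inside_all = True
        for project, silhouette in cameras:
            row, col = project(voxel)
            in_bounds = 0 <= row < len(silhouette) and 0 <= col < len(silhouette[0])
            if not in_bounds or not silhouette[row][col]:
                inside_all = False  # carved away by this view
                break
        if inside_all:
            hull.append(voxel)
    return hull


# Toy setup: two orthographic "cameras", one looking along z and one along y
top_view = (lambda p: (p[1], p[0]), [[0, 1, 0], [1, 1, 1], [0, 1, 0]])
side_view = (lambda p: (p[2], p[0]), [[0, 0, 0], [1, 1, 1], [0, 0, 0]])
grid = [(x, y, z) for x in range(3) for y in range(3) for z in range(3)]
hull = carve_visual_hull(grid, [top_view, side_view])
print(hull)
```

In a real rig, each `project` would be a calibrated perspective projection for that camera rather than an orthographic toy.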
Murray et al. 1994
Generate Candidate Poses
Score Each Pose
Compute Cost Components
Update Posterior Estimate
Cost components: d_ij = ||M_i - D_j||_2, α_ij = dot(n_i, n_j)
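The two cost components on this slide can be sketched as follows, reading M_i and D_j as 3D points with normals n_i and n_j, and reading the distance as a Euclidean norm. Both readings are assumptions, as is the function name `cost_components`. The talk does not say how the components are combined into a pose score, so the sketch stops at the components:

```python
import math


def cost_components(model_points, model_normals, data_points, data_normals):
    """Return d[i][j] = ||M_i - D_j|| and alpha[i][j] = dot(n_i, n_j)."""
    d = [[math.dist(m, p) for p in data_points] for m in model_points]
    alpha = [
        [sum(a * b for a, b in zip(ni, nj)) for nj in data_normals]
        for ni in model_normals
    ]
    return d, alpha


# One hypothetical model point compared against two data points
d, alpha = cost_components(
    model_points=[(0.0, 0.0, 0.0)],
    model_normals=[(0.0, 0.0, 1.0)],
    data_points=[(3.0, 4.0, 0.0), (0.0, 0.0, 0.0)],
    data_normals=[(0.0, 0.0, 1.0), (0.0, 1.0, 0.0)],
)
print(d, alpha)
```

A small distance together with a normal dot product near 1 indicates a pair that agrees in both position and surface orientation.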
P1 = t1a * t1b * t1c * t1d * t1e * P1;
N1 = t1a * t1b * t1c * t1d * t1e * N1;
N1 = N1-P1;
P4 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * P4;
N4 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * N4;
N4 = N4-P4;
P5 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * t5a * t5b * t5c * P5;
N5 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * t5a * t5b * t5c * N5;
N5 = N5-P5;
P7 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * t5a * t5b * t5c * t6a * t6b * t6c * t7a * t7b * t7c * P7;
N7 = t1a * t1b * t1c * t1d * t1e * t4a * t4b * t4c * t5a * t5b * t5c * t6a * t6b * t6c * t7a * t7b * t7c * N7;
N7 = N7-P7;
“A mathematical introduction to robotic manipulation” by Murray, Li, and Sastry 1994
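The chained products above can be sketched with 4x4 homogeneous transforms: each segment's transform is the parent's accumulated transform times its own joint transforms, and a normal is recovered by transforming its tip and subtracting the transformed point, as in `N4 = N4-P4`. The helper names (`matmul4`, `chain`, `place`) and the toy rotation and translation are assumptions; the talk's actual twist matrices are not reproduced here:

```python
import math


def matmul4(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def transform(t, p):
    """Apply a 4x4 homogeneous transform to a point [x, y, z, 1]."""
    return [sum(t[r][k] * p[k] for k in range(4)) for r in range(4)]


def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


def translate(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]


def chain(transforms):
    """Left-to-right product of a list of 4x4 transforms."""
    result = translate(0.0, 0.0, 0.0)  # identity
    for t in transforms:
        result = matmul4(result, t)
    return result


def place(chain_t, point, normal_tip):
    """Transform a point and its normal tip, then recover the normal direction."""
    p = transform(chain_t, point)
    n = transform(chain_t, normal_tip)
    return p, [n[i] - p[i] for i in range(3)]


joint1 = chain([rot_z(math.pi / 2)])              # stands in for t1a * ... * t1e
joint4 = matmul4(joint1, chain([translate(1.0, 0.0, 0.0)]))  # reuse joint1, append t4a..t4c
p4, n4 = place(joint4, [1.0, 0.0, 0.0, 1.0], [1.0, 1.0, 0.0, 1.0])
print(p4, n4)
```

Reusing `joint1` when building `joint4` avoids recomputing the shared prefix `t1a * ... * t1e` that appears in every line of the slide's code.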
//Thread parameters
unsigned int hWID = threadIdx.x / halfWarpSz;   //half-warp index: one 4x4 product per half-warp
unsigned int hWoff = threadIdx.x % halfWarpSz;  //thread offset within the half-warp
unsigned int x = hWoff % DimXY;                 //column of the output element
unsigned int y = hWoff / DimXY;                 //row of the output element

//MATRIX REDUCTION: across temporary variables over twists
float sum[2];

//1st reduction from 16, 4x4 matrices to 8, 4x4 matrices
if(hWID < 8)
{
    sum[0] = 0.0f;
    #pragma unroll
    for(int k = 0; k < 4; k++)
    {
        //row y of matrix 2*hWID times column x of matrix 2*hWID+1
        sum[0] += Stwists[4*(2*hWID) + y][k] * Stwists[4*(2*hWID+1) + k][x];
    }
    Transtmp0[4*hWID + y][x] = sum[0];
}
__syncthreads();
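The kernel excerpt shows the first step (16 matrices down to 8) of what its comment describes as a matrix reduction. The general pattern can be sketched on the CPU as a pairwise (tree) reduction in which neighbouring matrices are multiplied in order, so every product within a level is independent and could run in parallel. The names `tree_reduce` and `example_matrix` are illustrative, and the later reduction levels are an assumption since the slide shows only the first:

```python
from functools import reduce


def matmul4(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def tree_reduce(matrices):
    """Multiply an ordered list of 4x4 matrices by pairwise (tree) reduction."""
    level = list(matrices)
    while len(level) > 1:
        # Each product in this level depends only on the previous level
        pairs = [matmul4(level[2 * i], level[2 * i + 1])
                 for i in range(len(level) // 2)]
        if len(level) % 2:  # odd count: carry the last matrix forward unchanged
            pairs.append(level[-1])
        level = pairs
    return level[0]


def example_matrix(k):
    """Small integer matrix so the comparison below is exact."""
    m = [[int(r == c) for c in range(4)] for r in range(4)]
    m[0][3] = k
    m[1][0] = k % 3
    return m


matrices = [example_matrix(k) for k in range(16)]
tree = tree_reduce(matrices)            # 16 -> 8 -> 4 -> 2 -> 1
sequential = reduce(matmul4, matrices)  # left-to-right reference product
print(tree == sequential)
```

Because matrix multiplication is associative, regrouping the products this way leaves the result unchanged as long as the left-to-right order of the matrices is preserved.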
computed in parallel.
blocks.
process
process (6 CPUs)
tuning, resulting in an average 50% reduction in per-frame model fit error, i.e. better model fits!
CUDA ported into Matlab via Mex
Figure panels: Compute Time Comparison (per frame compute time, seconds); Model Fit Comparison (per frame model fit error, a.u.); x-axis: frame number
single Matlab: mean = 12.6 sec
parfor Matlab: mean = 8.2 sec
CUDA in Matlab: mean = 0.55 sec
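The speedups implied by these means can be worked out directly. The sketch below uses only the three mean per-frame times reported on the slide; the variable names are illustrative:

```python
# Mean per-frame compute times from the slide, in seconds
mean_seconds = {
    "single Matlab": 12.6,
    "parfor Matlab": 8.2,
    "CUDA in Matlab": 0.55,
}

baseline = mean_seconds["CUDA in Matlab"]
speedup = {name: t / baseline for name, t in mean_seconds.items()}

for name, factor in speedup.items():
    print(f"{name}: {factor:.1f}x the CUDA per-frame time")
```

This puts the CUDA version at roughly 23 times faster than single-threaded Matlab and roughly 15 times faster than the parfor version.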
CUDA where you need it
Figure legend: errors prior to tuning; errors after tuning
science.
friendly languages like Matlab and Python do the rest.
Berman et al. 2014
Figure labels: Kinematic Modeling; Behavioral Classification; Wavelet Analysis (1st, 2nd, and 3rd principal components vs. time in seconds); Parameterize Dynamics; Cluster Embedded Dynamics; t-SNE; Map Dynamics; Label Clusters (rearing, forward gaze, tight scan)
– You can incrementally deal with bottlenecks of decreasing priority.
György Buzsáki Antal Berenyi Andres Grosmark