
From brain research to high-energy physics: GPU-accelerated applications in Jülich



  1. From brain research to high-energy physics: GPU-accelerated applications in Jülich
     Dirk Pleiter | Jülich Supercomputing Centre (JSC) | SC13
     Member of the Helmholtz Association

  2. NVIDIA Application Lab at Jülich
     Collaboration between JSC and NVIDIA since July 2012
     Andrew Adinetz, Jiri Kraus
     - Enable scientific applications for GPU-based architectures
     - Provide support for their optimization
     - Investigate performance and scaling
     Work focus
     - Application requirements analysis
     - Current GPU architecture and CUDA feature analysis
     - Parallelization on many GPUs
     - Collaboration with performance tools developers
     - Training
     21.11.2013 Dirk Pleiter | NVIDIA Application Lab at Jülich

  3. HPC at Jülich Supercomputing Centre
     - Technology
     - Applications
     - Algorithms, tools, …

  4. Human Brain Project Application: JuBrain
     Katrin Amunts, Markus Axer, Marcel Huysegoms
     Research goal
     - Accurate, highly detailed computer model of the human brain
     Computational challenge
     - Registration of high-resolution images
     - Algorithm, e.g., rigid registration → 3 parameters
     - Computation of a metric based on Shannon entropy

  5. JuBrain Registration Workflow
     Workflow components: fixed image, moving image, transformation, interpolator, metric, optimizer
     Metric computation → computing joint histograms for 2 images:

         for(int y = 0; y < fixed_sz_y; y++)
           for(int x = 0; x < fixed_sz_x; x++) {
             int i = bin(fixed[x, y]);
             float x1 = transform_x(x, y);
             float y1 = transform_y(x, y);
             int j = bin(interpolate(moving, x1, y1));
             histogram[i, j]++; // atomic on GPU
           }

     L2 atomics performance is relevant when computing the metric
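
     The loop on this slide can be sketched on the CPU as follows. This is a minimal illustration, not the JuBrain code: the names `joint_histogram`, `entropy` and `mutual_information` are hypothetical, the transformation is assumed to be a pure rotation about the image centre, interpolation is nearest-neighbour, and mutual information is assumed as one common metric based on Shannon entropy (the slides do not name the exact metric).

```python
import math

def joint_histogram(fixed, moving, angle, bins=4, max_val=256):
    """Joint intensity histogram of `fixed` and a rotated view of `moving`.

    Both images are lists of rows with integer intensities in [0, max_val)
    and are assumed to have the same size.
    """
    h, w = len(fixed), len(fixed[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    c, s = math.cos(angle), math.sin(angle)
    hist = [[0] * bins for _ in range(bins)]
    for y in range(h):
        for x in range(w):
            # Assumed transformation: rotation about the image centre.
            x1 = c * (x - cx) - s * (y - cy) + cx
            y1 = s * (x - cx) + c * (y - cy) + cy
            xi, yi = int(round(x1)), int(round(y1))  # nearest neighbour
            if 0 <= xi < w and 0 <= yi < h:
                i = fixed[y][x] * bins // max_val
                j = moving[yi][xi] * bins // max_val
                hist[i][j] += 1  # this increment is the atomic on a GPU
    return hist

def entropy(counts):
    """Shannon entropy in bits of a list of non-negative counts."""
    total = sum(counts)
    return -sum(n / total * math.log2(n / total) for n in counts if n > 0)

def mutual_information(hist):
    """H(fixed) + H(moving) - H(fixed, moving) from a joint histogram."""
    rows = [sum(r) for r in hist]
    cols = [sum(c) for c in zip(*hist)]
    joint = [n for r in hist for n in r]
    return entropy(rows) + entropy(cols) - entropy(joint)
```

     An optimizer would call `mutual_information(joint_histogram(fixed, moving, angle))` for different angles and keep the angle with the largest value.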

  6. JuBrain Parallelization Strategies
     Simple test bench
     - Only rotation
     Remote access
     System memory replication
     - Device holds local part of fixed image
     - Host memory holds full copy of moving image
     List update
     - Send local fixed image data and moving image coordinates
     [Figure: fixed image and mask, axes x and y with origin (0,0)]

  7. Parallel JuBrain Performance Results
     Fermi
     - Reasonable scaling for small angles α
     - System memory replication faster
     - Strong performance degradation for intermediate α ← system memory latency
     Kepler
     - List update strategy faster due to faster L2 atomics
     Fine-grained multi-GPU communication potentially tricky

  8. B-CALM: Belgium-California Light Machine
     Pierre Wahl
     Research goal
     - Simulate electromagnetic fields in matter
     - Applications
       - Nano-photonics for optical interconnect
       - Optimized photovoltaics
     Finite-difference time-domain (FDTD) method
     - 3d grid of E and H fields
     Apply method to large systems
     - 4000² × 400 grid points → O(250) GBytes
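
     The memory figure on this slide can be checked with a short back-of-the-envelope sketch. The helper `fdtd_memory_gbytes` is hypothetical, single precision (4 bytes per value) is an assumption, and the count of 10 stored values per grid point (6 field components plus 4 auxiliary or material values) is an assumption chosen only to show how O(250) GBytes could arise; the slides do not give B-CALM's actual storage layout.

```python
def fdtd_memory_gbytes(nx, ny, nz, values_per_point, bytes_per_value=4):
    """Memory in GBytes (10^9 bytes) for a 3d grid storing a fixed
    number of values per grid point."""
    return nx * ny * nz * values_per_point * bytes_per_value / 1e9

grid_points = 4000 * 4000 * 400  # 6.4e9 grid points, from the slide

# E and H fields only: 3 + 3 components per grid point.
fields_only = fdtd_memory_gbytes(4000, 4000, 400, values_per_point=6)

# Assumption: 4 further values per point (e.g. material coefficients).
with_extras = fdtd_memory_gbytes(4000, 4000, 400, values_per_point=10)

print(f"{grid_points:.1e} points, fields only: {fields_only:.1f} GB, "
      f"with assumed extras: {with_extras:.1f} GB")
```

     Either way the data set exceeds the device memory of a single GPU of that generation by a wide margin, which is why the grid has to be distributed over many GPUs.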

  9. Parallel B-CALM Performance Model
     Parallelisation strategies
     - 1d domain decomposition (z-direction)
     - Higher-dimension decompositions
     Simple model ansatz
     - Information flow analysis
     - Latency-bandwidth model
     Comparison of model and measurement
     - Good agreement for 1d domain decomposition
     - No need for higher-dimension decomposition
     Performance models help in fixing the parallelization strategy
     [Plot: model vs. measurement for 1 MPI rank and 8 MPI ranks; P. Wahl, 2013]
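
     A latency-bandwidth model for a 1d domain decomposition can be sketched as below. This is a generic textbook form, not the model from [P. Wahl, 2013]: the function names, the number of values exchanged per boundary point, the absence of overlap between computation and communication, and all numbers in the usage note are assumptions for illustration.

```python
def halo_exchange_time(face_points, values_per_point, bytes_per_value,
                       latency_s, bandwidth_bytes_per_s, neighbours=2):
    """Time to exchange boundary planes: each message costs a fixed
    latency plus its size divided by the bandwidth."""
    message_bytes = face_points * values_per_point * bytes_per_value
    return neighbours * (latency_s + message_bytes / bandwidth_bytes_per_s)

def time_per_step(nx, ny, nz, ranks, updates_per_s, latency_s,
                  bandwidth_bytes_per_s, values_per_point=2,
                  bytes_per_value=4):
    """Modelled time of one time step with the grid split along z.

    Assumes an interior rank with two neighbours and no overlap of
    computation and communication.
    """
    compute = nx * ny * (nz / ranks) / updates_per_s
    if ranks == 1:
        return compute  # no halo exchange on a single rank
    return compute + halo_exchange_time(
        nx * ny, values_per_point, bytes_per_value,
        latency_s, bandwidth_bytes_per_s)
```

     With the hypothetical values `updates_per_s=1e9`, `latency_s=1e-5` and `bandwidth_bytes_per_s=5e9`, a 4000 × 4000 × 400 grid on 8 ranks gives 0.8 s of computation plus about 0.05 s of communication per step, so the boundary exchange stays a small fraction of the step time in this example.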

  10. GPUMAFIA: Data analysis on GPUs
      Sub-space density clustering
      - Analysis of high-dimensional data sets
      - Find clusters which exist in subsets of dimensions
      Applications
      - Monte Carlo simulations of protein folding
      - Data mining in marketing, bio-informatics, medical imaging

  11. MAFIA = Merging of Adaptive Finite IntervAls
      Sub-space clustering
      - If a collection of points S is a cluster in a k-dimensional space, then S is also a part of a cluster in any (k-1)-dimensional projection of the space
      - Start from constructing histograms in each dimension
      Adaptive grid
      - Combine bins with similar histogram values
      Gradually form higher-dimensional clusters
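
      The first two steps on this slide (per-dimension histograms, then combining bins with similar values) can be sketched as follows. This is a simplified illustration, not the GPUMAFIA implementation: the names `histogram` and `merge_similar_bins` are hypothetical, and the merge rule (adjacent bins whose counts differ by at most a fixed `tolerance`) is an assumption standing in for the algorithm's actual similarity criterion.

```python
def histogram(values, lo, hi, nbins):
    """Fixed-width histogram of one dimension; values assumed in [lo, hi]."""
    counts = [0] * nbins
    width = (hi - lo) / nbins
    for v in values:
        k = min(int((v - lo) / width), nbins - 1)  # put v == hi in last bin
        counts[k] += 1
    return counts

def merge_similar_bins(counts, tolerance):
    """Merge runs of adjacent bins whose counts differ by at most
    `tolerance`. Returns (first_bin, last_bin, mean_count) per window."""
    windows = []
    start = 0
    for k in range(1, len(counts) + 1):
        at_end = k == len(counts)
        if at_end or abs(counts[k] - counts[k - 1]) > tolerance:
            run = counts[start:k]
            windows.append((start, k - 1, sum(run) / len(run)))
            start = k
    return windows
```

      For example, `merge_similar_bins([1, 2, 1, 9, 10, 1], tolerance=2)` yields three windows covering bins 0 to 2, 3 to 4 and 5; the dense middle window would be a candidate one-dimensional cluster from which higher-dimensional candidates are built.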

  12. GPUMAFIA Performance Results
      Test setup
      - Dual 6-core Xeon
      - Single core Xeon + K20x
      Synthetic dataset
      - 30 dimensions
      - 10^5 data points
      Observe O(10) speed-up
      - Realistic data sets can be processed in O(1) minutes
      GPUs help getting data analysis to “interactive speed”

  13. PANDA Track Reconstruction
      Andreas Herten, Marius Mertens, Tobias Stockmanns et al.
      PANDA = Next generation hadron physics experiment
      - Part of FAIR accelerator in Darmstadt (Germany)
      Scientific goal and requirements
      - Triggerless track reconstruction
      - Sustain data rate of 20 million events/s → 200 GBytes/s
      - Achieve O(1000) times data reduction
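
      The requirements on this slide imply a few derived numbers, worked out in the sketch below. The variable names are hypothetical, 1 GByte is taken as 10^9 bytes, and the reduction factor is taken as exactly 1000 although the slide only states its order of magnitude.

```python
EVENT_RATE = 20_000_000               # events per second, from the slide
INPUT_RATE = 200_000_000_000          # bytes per second (200 GBytes/s)
REDUCTION = 1000                      # assumed exact; slide says O(1000)

bytes_per_event = INPUT_RATE / EVENT_RATE      # average event size
output_rate = INPUT_RATE / REDUCTION           # bytes/s after reduction
time_per_event_ns = 1e9 / EVENT_RATE           # average spacing of events

print(f"{bytes_per_event / 1e3:.0f} kB per event, "
      f"{output_rate / 1e6:.0f} MB/s after reduction, "
      f"{time_per_event_ns:.0f} ns per event")
```

      In other words, an average event carries about 10 kBytes, a new event arrives every 50 ns on average, and the reconstructed output stream would be on the order of 200 MBytes/s.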

  14. PANDA Track Reconstruction
      Why use GPUs?
      - Easier to program compared to, e.g., FPGAs
      - Latencies more predictable than for CPUs
      Algorithms
      - Hough transformation
      - Triplet finder
      - Riemann tracker
      Initial results
      - Triplet finder running at a rate of <1 μs per hit
      Close to proof-of-concept for high event-rate processing
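
      As an illustration of the first algorithm in the list, the sketch below shows the textbook Hough transformation for straight lines in the plane. It is not the PANDA implementation, which the slides do not describe; the function names, the parametrisation rho = x cos(theta) + y sin(theta), and the grid sizes are assumptions.

```python
import math

def hough_lines(points, n_theta=180, rho_max=10.0, n_rho=200):
    """Accumulate votes in (theta, rho) space for lines through points.

    Each hit (x, y) votes for every line rho = x*cos(theta) + y*sin(theta)
    that passes through it. On a GPU the increment would be an atomic add.
    """
    acc = [[0] * n_rho for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            r = int((rho + rho_max) / (2 * rho_max) * n_rho)
            if 0 <= r < n_rho:
                acc[t][r] += 1
    return acc

def strongest_line(acc, rho_max=10.0):
    """Return (votes, theta, rho) of the first accumulator cell with the
    most votes; rho is the centre of the winning bin."""
    n_theta, n_rho = len(acc), len(acc[0])
    best = (0, 0, 0)
    for t in range(n_theta):
        for r in range(n_rho):
            if acc[t][r] > best[0]:
                best = (acc[t][r], t, r)
    votes, t, r = best
    theta = math.pi * t / n_theta
    rho = (r + 0.5) * 2 * rho_max / n_rho - rho_max
    return votes, theta, rho
```

      Hits lying on a common line all vote for the same accumulator cell, so a peak in the accumulator identifies a track candidate. The per-hit voting is independent across hits, which is what makes the method map naturally onto GPU threads.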

  15. Summary
      NVIDIA Application Lab at Jülich
      - Fruitful model for collaboration
      Multi-GPU parallelization
      - Required, e.g., due to device memory limitations
      - Applications: JuBrain image registration, B-CALM FDTD application
      Data-intensive applications on GPUs
      - Strongly benefit from improved support of L2 atomics
      - Applications: GPUMAFIA clustering, PANDA track reconstruction
