
Spiral: Computer Generation of Performance Libraries - PowerPoint PPT Presentation



  1. Carnegie Mellon. Spiral: Computer Generation of Performance Libraries. José M. F. Moura, Markus Püschel, Franz Franchetti & the Spiral Team. [Diagram labels: Performance, Applications, Platforms.]

  2. What is Spiral? Traditionally: a high-performance library optimized for a given platform. Spiral approach: Spiral produces a high-performance library optimized for a given platform, with comparable performance.

  3. Main Idea: Program Generation. Model: common abstraction = spaces of matching formulas. Architectural parameters: vector length, #processors, … Kernel: problem size, algorithm choice. [Diagram labels: architecture space and algorithm space, each mapped by abstraction into the model (symbols ν, p, μ); rewriting: defines; search: pick; optimization.]

  4. How Spiral Works. Complete automation of the implementation and optimization task. Basic ideas: • Declarative representation of algorithms • Rewriting systems to generate and optimize algorithms at a high level of abstraction. [Flow diagram: problem specification (“DFT 1024” or “DFT”) → Algorithm Generation and Algorithm Optimization → algorithm → Implementation and Code Optimization → C code → Compilation and Compiler Optimizations → fast executable. A Search block receives the measured performance and controls the algorithm and implementation stages.]
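
The rewriting idea on this slide can be illustrated with a small sketch. It assumes the Cooley-Tukey FFT rule, DFT_km → (DFT_k ⊗ I_m) T (I_k ⊗ DFT_m) L, as the only rule. The names `expand_dft`, `smallest_split`, and `largest_split` and the string notation are hypothetical and chosen for illustration; this is not Spiral's actual language or implementation.

```python
def expand_dft(n, choose_split):
    """Rewrite the nonterminal DFT_n into a formula string.

    Applies the Cooley-Tukey rule recursively until only the 2-point
    butterfly F2 remains. `choose_split` plays the role of the search:
    it picks the factor k in n = k * m at every step.
    """
    if n == 2:
        return "F2"  # terminal: 2-point butterfly
    k = choose_split(n)
    if not (1 < k < n and n % k == 0):
        raise ValueError(f"invalid split {k} for size {n}")
    m = n // k
    left = expand_dft(k, choose_split)
    right = expand_dft(m, choose_split)
    # T{n}_{m}: twiddle-factor diagonal, L{n}_{k}: stride permutation
    return f"compose(tensor({left}, I{m}), T{n}_{m}, tensor(I{k}, {right}), L{n}_{k})"


def smallest_split(n):
    """Always split off a factor of 2 on the left."""
    return 2


def largest_split(n):
    """Always split off a factor of 2 on the right."""
    return n // 2


# Two different choices yield two different algorithms for the same transform.
formula_a = expand_dft(8, smallest_split)
formula_b = expand_dft(8, largest_split)
print(formula_a)
print(formula_b)
```

Each choice of split gives a different formula for the same transform, which is the kind of space a search could explore by timing the resulting code.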

  5. Algorithms: Rules in Domain Specific Language. [Figure panels: Linear Transforms; Viterbi Decoding (bit string 010001 through a convolutional encoder and a Viterbi decoder); Matrix-Matrix Multiplication; Synthetic Aperture Radar (SAR): preprocessing, matched filtering, interpolation, 2D iFFT.]

  6. One Approach for all Types of Parallelism • Multithreading (Multicore) • Vector SIMD (SSE, VMX/Altivec, …) • Message Passing (Clusters, MPP) • Streaming/multibuffering (Cell) • Graphics Processors (GPUs) • Gate-level parallelism (FPGA) • HW/SW partitioning (CPU + FPGA)

  7. Example: Code Generation for Multicore CPUs • Hardware abstraction: shared cache with cache lines • Tensor product: embarrassingly parallel operator [diagram: Processors 0 to 3 each apply A to their own block of x to produce their block of y] • Permutation: problematic; may produce false sharing [diagram: x permuted into y]
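
The tensor-product bullet can be made concrete with a minimal sketch of y = (I_p ⊗ A) x, in which each worker applies A to one contiguous block of x. The helper names `matvec` and `apply_block_diagonal` are hypothetical. Python threads are used only to show the data decomposition: under CPython's global interpreter lock, pure-Python arithmetic like this gains no speedup from threads.

```python
from concurrent.futures import ThreadPoolExecutor


def matvec(a, v):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(r * s for r, s in zip(row, v)) for row in a]


def apply_block_diagonal(a, x, workers):
    """Compute y = (I_p tensor A) x, one contiguous block per task.

    Each task reads only its own block of x and produces only its own
    block of y, so the tasks are independent of one another.
    """
    m = len(a)
    if len(x) % m != 0:
        raise ValueError("len(x) must be a multiple of the size of A")
    blocks = [x[i:i + m] for i in range(0, len(x), m)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda blk: matvec(a, blk), blocks))
    return [value for blk in results for value in blk]


A = [[1, 1], [1, -1]]          # 2-point butterfly
x = [1, 2, 3, 4, 5, 6, 7, 8]   # four blocks of length 2
y = apply_block_diagonal(A, x, workers=4)
print(y)  # [3, -1, 7, -1, 11, -1, 15, -1]
```

A stride permutation, by contrast, would have neighbouring output elements written by different workers, which is where false sharing of cache lines can arise on real hardware.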

  8. Spiral: Meta-Tool to Autotuning Libraries. Input: • Transform: [formula not preserved] • Algorithms: [formulas not preserved] • Vectorization: 2-way SSE • Threading: Yes. Spiral Library Generator. Output: • Optimized library (10,000 lines of C++) • For general input size (not a collection of fixed sizes) • Vectorized • Multithreaded • With runtime adaptation mechanism • Performance competitive with hand-written code. Result: high-performance library (“FFTW-like”).

  9. Verification and Testing. Each check is an equality test (“= ?”): • Verify algorithms symbolically • Check rules through verification of instances • Check code empirically, e.g. evaluate DFT4([0,1,0,0]) against the expected result, or test DFT4([0.1,1.77,2.28,-55.3]) = ? DFT4_rnd([0.1,1.77,2.28,-55.3])
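
The empirical check can be sketched as follows, using the two input vectors from the slide. `dft_reference` evaluates the DFT by definition, assuming the sign convention exp(-2πi·kn/N). `dft4_candidate` is a hand-written radix-2 kernel standing in for generated code. Both names are hypothetical, and this is not Spiral's own test harness.

```python
import cmath


def dft_reference(x):
    """DFT by definition: y[k] = sum_j x[j] * exp(-2*pi*i*k*j/n)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]


def dft4_candidate(x):
    """Size-4 DFT as two stages of butterflies (stand-in for generated code)."""
    a, b = x[0] + x[2], x[0] - x[2]
    c, d = x[1] + x[3], x[1] - x[3]
    return [a + c, b - 1j * d, a - c, b + 1j * d]


def max_error(u, v):
    """Largest absolute difference between two vectors."""
    return max(abs(p - q) for p, q in zip(u, v))


basis_in = [0, 1, 0, 0]                 # picks out one column of the DFT matrix
sample_in = [0.1, 1.77, 2.28, -55.3]    # arbitrary test vector from the slide

basis_err = max_error(dft_reference(basis_in), dft4_candidate(basis_in))
sample_err = max_error(dft_reference(sample_in), dft4_candidate(sample_in))
print(basis_err, sample_err)
```

Feeding in all four basis vectors would check every column of the matrix, while a random vector gives a cheap single-shot test that is very likely to expose an error.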

  10. Range: Cell Phone To Supercomputer. Samsung i9100 Galaxy S II: dual-core ARM at 1.2 GHz with NEON ISA; SIMD vectorization + multi-threading. BlueGene/P at Argonne National Laboratory: 128k cores (quad-core CPUs) at 850 MHz; SIMD vectorization + multi-threading + MPI. [Plot: Global FFT (1D FFT, HPC Challenge), performance in Gflop/s; 6.4 Tflop/s on BlueGene/P.] G. Almási, B. Dalton, L. L. Hu, F. Franchetti, Y. Liu, A. Sidelnik, T. Spelce, I. G. Tānase, E. Tiotto, Y. Voronenko, X. Xue: 2010 IBM HPC Challenge Class II Submission. Winner of the 2010 HPC Challenge Class II Award (Most Productive System).

  11. More Results: Spiral Outperforms Humans. [Plot panels: FFT on Multicore; SAR; SDR (axis label: improvement); FFT on FPGA.]

  12. More Information: www.spiral.net, www.spiralgen.com
