
Concise parallelism - PowerPoint PPT Presentation


  1. Concise parallelism

  2. Natural C/C++ Parallelism
     - A single operator to control multiple parallel programming paradigms
     - Natural C/C++ semantics and variable visibility rules and scopes
     - A single operator to control parallel synchronization
     - Clear means of parallel identification and interaction

         void salute()
         {
             parallel()
             {
                 int idx = pix();
                 serial()
                 {
                     parallel(3)
                     {
                         printf("Hello, world, from task %d-%d\n", idx, pix());
                     }
                 }
             }
         }
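The nesting on this slide can be approximated in standard C++, which has no `parallel`, `serial` or `pix` operators. The following is a minimal sketch under stated assumptions, not the C= implementation: threads stand in for `parallel()`, a mutex stands in for `serial()`, and loop indices stand in for `pix()`. The function signature, the `outer` parameter and the returned vector are hypothetical and exist only so the result can be inspected.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the slide's salute(). `outer` threads emulate
// parallel(), a mutex emulates serial(), three inner threads emulate parallel(3).
std::vector<std::string> salute(int outer) {
    assert(outer >= 0);
    std::vector<std::string> lines;  // greetings collected instead of printed
    std::mutex serial_gate;          // emulates serial(): one outer task at a time
    std::mutex lines_gate;           // protects `lines` among the inner tasks

    std::vector<std::thread> outer_tasks;
    for (int idx = 0; idx < outer; ++idx) {
        outer_tasks.emplace_back([&, idx] {
            // Only one outer task may run its inner region at a time.
            std::lock_guard<std::mutex> serial(serial_gate);
            std::vector<std::thread> inner_tasks;
            for (int inner = 0; inner < 3; ++inner) {
                inner_tasks.emplace_back([&, idx, inner] {
                    std::lock_guard<std::mutex> guard(lines_gate);
                    lines.push_back("Hello, world, from task " +
                                    std::to_string(idx) + "-" +
                                    std::to_string(inner));
                });
            }
            for (auto& t : inner_tasks) t.join();
        });
    }
    for (auto& t : outer_tasks) t.join();
    return lines;
}
```

Calling `salute(4)` yields twelve greetings in an unspecified order. Unlike the slide's version, this sketch needs two explicit synchronization objects, which is the overhead the slide claims the operators remove.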

  3. Elegant Multitasking
     Synchronized access to any data element without introducing synchronization objects:

         std::vector<Data> data;
         parallel(5000000)
         {
             int i = pix();
             serial(&data[i])
             {
                 data[i].process();
             }
         }

     Each thread from a pool decrements the task counter and “creates” a job to execute from a single execution state:

         Single Execution State
         {
             Task No. = 5000000;
             Code pointer;
             Registers;
         }

     Diagram labels: Stack; Single Execution State.
     • No CPU oversubscription
     • Dynamic work balancing
     • Minimal memory footprint
     • No task queue management overhead
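The "pool decrements a task counter" idea can be illustrated in standard C++. This is a minimal sketch under assumptions, not the C= runtime: the `Data` type, the `process_all` function and the `workers` parameter are hypothetical, and the shared state is reduced to a single atomic counter. Each worker claims the next task by decrementing that counter, so no task queue is built.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical element type; process() just counts how often it was called.
struct Data {
    int processed = 0;
    void process() { ++processed; }
};

// A fixed pool of `workers` threads shares one counter of remaining tasks.
// Each decrement claims exactly one index, so no two threads touch the same
// element and no per-element lock is needed in this sketch.
void process_all(std::vector<Data>& data, unsigned workers) {
    assert(workers > 0);
    std::atomic<std::ptrdiff_t> remaining(
        static_cast<std::ptrdiff_t>(data.size()));

    auto worker = [&] {
        for (;;) {
            // fetch_sub returns the value held before the decrement.
            std::ptrdiff_t before = remaining.fetch_sub(1);
            if (before <= 0) break;  // nothing left to claim
            data[static_cast<std::size_t>(before - 1)].process();
        }
    };

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) pool.emplace_back(worker);
    for (auto& t : pool) t.join();
}
```

Because the pool size is fixed by the caller, the thread count does not grow with the task count, and faster threads simply claim more indices, which corresponds to the dynamic work balancing listed on the slide.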

  4. Language-Friendly Multithreading
     - A single operator to control multi-threading and multitasking
     - A real independent thread in a class constructor!
     - Getting a global ID promotes a task to an independent thread
     - Thread-0 returns, thread-1 waits until woken up by another thread/task
     - Reaching the break demotes a thread to a task

         class X
         {
             void* volatile id;
             X()
             {
                 parallel(2)
                 {
                     void* pid = pid();
                     if(pix())
                     {
                         id = pid;
                         while(id)
                         {
                             wait();
                             getMoreData();
                         }
                     }
                     break;
                 }
             }
         };

         void X::read()
         {
             wake(id);
             processData();
         }
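The wait/wake pattern on this slide resembles a worker thread parked on a condition variable. The following is a minimal sketch in standard C++ under assumptions, not the C= mechanism: `std::condition_variable` replaces `wait()` and `wake(id)`, a counter replaces `getMoreData()`, and the `wait_for_batches` and `batches_fetched` helpers are hypothetical additions that make the behaviour observable.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

class X {
public:
    // The constructor starts the background thread; it returns immediately,
    // like "thread-0" on the slide, while the worker parks itself.
    X() : worker_([this] { loop(); }) {}

    ~X() {
        {
            std::lock_guard<std::mutex> guard(gate_);
            running_ = false;
        }
        cv_.notify_all();
        worker_.join();
    }

    // Plays the role of wake(id): asks the worker to fetch one more batch.
    void read() {
        {
            std::lock_guard<std::mutex> guard(gate_);
            wake_requested_ = true;
        }
        cv_.notify_all();
    }

    // Hypothetical helper: blocks until at least n batches were fetched.
    void wait_for_batches(int n) {
        std::unique_lock<std::mutex> lock(gate_);
        cv_.wait(lock, [this, n] { return batches_ >= n; });
    }

    int batches_fetched() {
        std::lock_guard<std::mutex> guard(gate_);
        return batches_;
    }

private:
    void loop() {
        std::unique_lock<std::mutex> lock(gate_);
        while (running_) {
            // Plays the role of wait(): sleep until woken or shut down.
            cv_.wait(lock, [this] { return wake_requested_ || !running_; });
            if (wake_requested_) {
                wake_requested_ = false;
                ++batches_;        // stands in for getMoreData()
                cv_.notify_all();  // let wait_for_batches() re-check
            }
        }
    }

    std::mutex gate_;
    std::condition_variable cv_;
    bool wake_requested_ = false;
    bool running_ = true;
    int batches_ = 0;
    std::thread worker_;  // declared last so the state above exists first
};
```

Two calls to `read()` that arrive before the worker runs are merged into one batch in this sketch; a counter of pending requests would be needed to avoid that.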

  5. Easy Software Analysis
     Use the same compiler, debugger and profiler tools as for sequential software.

         std::vector<Data> data;
         void f(int n)
         {
             parallel(data.size)
             { /// Timing: 5 sec; Parallelism = 95%; Time per CPU: CPU0 = 30%, CPU1 = 30%...
                 for(int i = 0; i < n; i++)
                 { /// Average iterations = 100
                     int j = pix();
                     parallel()
                     { /// Timing: 4.5 sec; Parallelism = 80%; Time per CPU: CPU0 = 30%, CPU1 = 30%...
                         data[j].process();
                         serial()
                         { /// Timing: 4.5 sec; Contention = 30%;
                             data[j].reduce();
                         }
                     }
                 }
             }
         }

     C= source code is a perfect performance model by itself: a C= profiler can annotate each parallel, sequential and cyclic region with timings, contention, iterations, balance, etc., exactly in alignment with the corresponding operator.
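Per-region annotations of this kind can be imitated in standard C++ with a scoped timer. This is a minimal sketch under assumptions and not the C= profiler: the `Profile` and `ScopedRegion` classes and the region labels are hypothetical, and only entry counts and elapsed time are recorded, not contention or per-CPU balance.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Collects how often each labelled region ran and for how long in total.
class Profile {
public:
    void record(const std::string& label, std::chrono::nanoseconds elapsed) {
        assert(elapsed.count() >= 0);
        std::lock_guard<std::mutex> guard(gate_);
        Stats& s = stats_[label];
        ++s.entries;
        s.total += elapsed;
    }
    long entries(const std::string& label) const {
        std::lock_guard<std::mutex> guard(gate_);
        auto it = stats_.find(label);
        return it == stats_.end() ? 0 : it->second.entries;
    }
    std::chrono::nanoseconds total(const std::string& label) const {
        std::lock_guard<std::mutex> guard(gate_);
        auto it = stats_.find(label);
        return it == stats_.end() ? std::chrono::nanoseconds(0)
                                  : it->second.total;
    }

private:
    struct Stats {
        long entries = 0;
        std::chrono::nanoseconds total{0};
    };
    mutable std::mutex gate_;
    std::map<std::string, Stats> stats_;
};

// Times the enclosing scope and reports it to the profile on destruction,
// so the measurement lines up with the braces of the region.
class ScopedRegion {
public:
    ScopedRegion(Profile& profile, std::string label)
        : profile_(profile),
          label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}
    ~ScopedRegion() {
        auto elapsed = std::chrono::steady_clock::now() - start_;
        profile_.record(
            label_,
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed));
    }
    ScopedRegion(const ScopedRegion&) = delete;
    ScopedRegion& operator=(const ScopedRegion&) = delete;

private:
    Profile& profile_;
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};
```

Placing a `ScopedRegion` as the first statement of a block ties the timing to that block, which mirrors how the slide attaches each `///` annotation to the opening brace of an operator.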

  6. Software Implications
     A powerful parallel programming language … and a unified parallel runtime.
     Re-writing parallel runtimes in C= will eliminate CPU oversubscription and guarantee efficient resource management, especially in complex, multi-module applications using several parallel runtimes simultaneously.
     Runtimes shown on top of C=: OpenMP, TBB, Cilk, CRT, PPL, AMP @CPU, OpenCL @CPU.

  7. Hardware Implications
     Slide a tablet into an accelerator box and get faster software, vivid graphics, detailed scenes and real-time video encoding, right away!

         std::vector<Data> data;
         parallel(data.size)
         {
             data[pix()].process();
         }

         Single Execution State
         {
             Task No. = data.size;
             Code pointer;
             Registers;
         }

     Diagram labels: Memory; CPU; PU (repeated).
     Co-processors fetch the state transparently to the CPU and OS and smoothly accelerate execution of existing programs.
     C= programs are designed for massive parallelism without incurring extra overhead, by forming a single execution state for any number of parallel tasks.
     Truly mobile, data-consistent, cheap and powerful architecture!

  8. One Program Fits All

         std::vector<Data> data;
         parallel(data.size)
         {
             coload()
             {
                 data[pix()].process();
             }
         }

         Single Execution State
         {
             Task No. = data.size;
             Code pointer;
             Registers;
         }

     Diagram labels: Memory; CPU (three); GPU (three).
     Remote agents may concurrently “steal” the work from C= execution states and utilize their CPUs and GPUs.
     C= programs are executed concurrently by CPUs and GPUs.
     Unified Semantic Concept of Parallelism enables distributed heterogeneous programming with a single parallel operator.
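Agents of different speed taking work from one shared state can be illustrated in standard C++ on a single machine. This is a minimal sketch under assumptions, not the C= `coload()` mechanism: the `Agent` type, its `chunk` size and the `run_agents` function are hypothetical, ordinary threads stand in for CPUs and GPUs, and a larger chunk size stands in for a wider device.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical agent: claims `chunk` tasks at a time and counts its work.
struct Agent {
    explicit Agent(std::size_t c) : chunk(c), done(0) {}
    std::size_t chunk;
    std::size_t done;
};

// All agents claim index ranges from one shared counter. Returns how many
// times each of the n tasks was executed (each should be exactly once).
std::vector<int> run_agents(std::size_t n, std::vector<Agent>& agents) {
    std::vector<int> hits(n, 0);
    std::atomic<std::size_t> next(0);  // the shared "Task No." of the sketch

    std::vector<std::thread> threads;
    for (Agent& a : agents) {
        assert(a.chunk > 0);
        threads.emplace_back([&hits, &next, &a, n] {
            for (;;) {
                // Claim the range [begin, begin + chunk) in one atomic step.
                std::size_t begin = next.fetch_add(a.chunk);
                if (begin >= n) break;
                std::size_t end = std::min(begin + a.chunk, n);
                for (std::size_t i = begin; i < end; ++i) ++hits[i];
                a.done += end - begin;  // only this thread writes a.done
            }
        });
    }
    for (auto& t : threads) t.join();
    return hits;
}
```

How the work splits between agents varies from run to run, but the claimed ranges never overlap, so every task runs once regardless of which agent takes it.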
