parallelfx parallelism made easy

ParallelFX : parallelism made easy Jérémie Laval (Garuma) - PowerPoint PPT Presentation



  1. ParallelFX : parallelism made easy Jérémie Laval (Garuma) jeremie.laval@gmail.com http://garuma.wordpress.com

  2. What ?
     ● Ease the development of parallel (multi-threaded) applications that take advantage of multi-core processors.
     ● Written by Microsoft and running only on .NET.
     ● 3 main components :
       - A Task API similar in usage to the classic Thread
       - Parallel loops : for, foreach...
       - PLINQ (Parallel LINQ) : allows LINQ queries to run in parallel
     The goal : create an open-source, cross-platform implementation of ParallelFX running on Mono.
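The slides show no code, so here is a minimal sketch of the three usage styles listed above (a task, a parallel loop, a parallel query). It is written in Java purely for illustration and uses Java's own standard-library analogues (`CompletableFuture`, parallel `IntStream`), not the ParallelFX API; the class and method names (`ThreeStyles`, `taskStyle`, ...) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Illustration only: Java analogues of the three styles, NOT the ParallelFX API.
class ThreeStyles {
    // Task style: start a unit of work now, collect its result later.
    static int taskStyle() {
        CompletableFuture<Integer> task = CompletableFuture.supplyAsync(() -> 6 * 7);
        return task.join(); // blocks until the task has finished
    }

    // Parallel-loop style: iterations are independent, each writes its own slot.
    static int[] parallelLoopStyle(int n) {
        int[] squares = new int[n];
        IntStream.range(0, n).parallel().forEach(i -> squares[i] = i * i);
        return squares;
    }

    // Parallel-query style: a declarative filter/map query evaluated in parallel.
    static List<Integer> parallelQueryStyle(int n) {
        return IntStream.range(0, n).parallel()
                .filter(x -> x % 2 == 0)
                .map(x -> x * 10)
                .boxed()
                .collect(Collectors.toList()); // keeps the source order
    }

    public static void main(String[] args) {
        System.out.println(taskStyle());                          // 42
        System.out.println(Arrays.toString(parallelLoopStyle(5))); // [0, 1, 4, 9, 16]
        System.out.println(parallelQueryStyle(6));                 // [0, 20, 40]
    }
}
```

In all three cases the caller describes the work and leaves thread management to the runtime, which is the point the slide makes about ParallelFX.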

  3. Why ?
     ● Today's trend is to increase the number of cores in CPUs, not their individual speed.
     ● In theory this should give a ×n performance boost (n being the number of cores), but in practice it does not.
     ● Reason : applications are designed to be single-threaded.
     ● Second reason : doing multi-threading "by hand" is usually hard and not efficient.

  4. How ? (current design)
     [Diagram. Labels : Mono Application ; ParallelFX library (Scheduler) ; Shared work pool ; two ThreadWorkers, each with a Local work pool and running on an OS thread. Arrow labels : Manage, Retrieve, Steal.]
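The diagram's "Retrieve" and "Steal" arrows can be sketched as the lookup a worker performs when it needs something to do. This is a sketch under assumptions: the lookup order (own local pool, then shared pool, then another worker's pool) is not stated on the slide, the JDK's `ConcurrentLinkedDeque` stands in for the real pools, and `SchedulerSketch`, `Worker` and `nextWork` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical sketch of the worker lookup; not the actual ParallelFX scheduler.
class SchedulerSketch {
    static class Worker {
        final ConcurrentLinkedDeque<String> localPool = new ConcurrentLinkedDeque<>();
        final ConcurrentLinkedDeque<String> sharedPool;

        Worker(ConcurrentLinkedDeque<String> sharedPool) { this.sharedPool = sharedPool; }

        // Assumed order: own local pool, then the shared pool, then steal.
        String nextWork(List<Worker> all) {
            String work = localPool.pollLast();          // newest local item first
            if (work != null) return work;
            work = sharedPool.poll();                    // "Retrieve" from the shared pool
            if (work != null) return work;
            for (Worker other : all) {
                if (other == this) continue;
                work = other.localPool.pollFirst();      // "Steal" the oldest item
                if (work != null) return work;
            }
            return null;                                 // nothing to do right now
        }
    }

    static List<String> demo() {
        ConcurrentLinkedDeque<String> shared = new ConcurrentLinkedDeque<>();
        Worker a = new Worker(shared), b = new Worker(shared);
        List<Worker> all = Arrays.asList(a, b);
        a.localPool.addLast("a1");
        a.localPool.addLast("a2");
        shared.push("s1");
        // b: shared item, then a steal from a; a: its own newest item, then nothing.
        return Arrays.asList(b.nextWork(all), b.nextWork(all), a.nextWork(all), a.nextWork(all));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [s1, a1, a2, null]
    }
}
```

The owner and the stealer work on opposite ends of the local pool, so they only compete for the same item when the pool is nearly empty.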

  5. Peeking into the "magic"
     Shared work pool :
     ● Stack with a back-off layer.
     ● Uses CAS for atomic operations, completely lock-free.
     ● The back-off layer keeps performance acceptable under high load.
     Local work pool :
     ● Deque-like (3 operations : pushBottom, popBottom, popTop).
     ● The worker uses pushBottom & popBottom (LIFO-style).
     ● popTop is used by stealers (FIFO-style).
     ● Minimizes CAS and uses no lock.
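A CAS-based stack of the kind described for the shared work pool can be sketched as follows. The slide does not say how its back-off layer works, so this sketch assumes the simplest variant: a classic Treiber stack that sleeps for a random, exponentially growing interval after a failed CAS. `BackoffStack` and its constants are hypothetical and are not taken from the Mono implementation.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

// Hypothetical lock-free stack: CAS on the head, exponential back-off on contention.
class BackoffStack<T> {
    private static final class Node<T> {
        final T value;
        final Node<T> next;
        Node(T value, Node<T> next) { this.value = value; this.next = next; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    // Pause for a random time up to limitNanos, then return a doubled (capped) limit.
    private static int backoff(int limitNanos) {
        LockSupport.parkNanos(ThreadLocalRandom.current().nextInt(1, limitNanos + 1));
        return Math.min(limitNanos * 2, 1 << 16);
    }

    public void push(T value) {
        int limit = 64;
        while (true) {
            Node<T> old = head.get();
            // Succeeds only if no other thread changed the head in the meantime.
            if (head.compareAndSet(old, new Node<>(value, old))) return;
            limit = backoff(limit);
        }
    }

    // Returns null when the stack is empty.
    public T pop() {
        int limit = 64;
        while (true) {
            Node<T> old = head.get();
            if (old == null) return null;
            if (head.compareAndSet(old, old.next)) return old.value;
            limit = backoff(limit);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BackoffStack<Integer> stack = new BackoffStack<>();
        Thread[] producers = new Thread[4];
        for (int t = 0; t < producers.length; t++) {
            producers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) stack.push(i);
            });
            producers[t].start();
        }
        for (Thread p : producers) p.join();
        int count = 0;
        while (stack.pop() != null) count++;
        System.out.println(count); // 4000: no push was lost
    }
}
```

No thread ever holds a lock: a thread that loses the race on `compareAndSet` simply re-reads the head and retries, and the back-off spreads those retries out when many threads collide.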

  6. Example
     ● A classic ray-tracer implementation written with no multithreading in mind.
     ● Processing time on my computer : ~31s.

  7. Example (next)
     ● Using the current implementation of ParallelFX. The color mask shows which ThreadWorker did the work.
     ● Processing time on my computer : ~18s, almost 42% less time.
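For reference, the arithmetic behind the two timings (~31s single-threaded, ~18s with ParallelFX) works out as below. Only these two figures come from the slides; the number of cores used is not given, so no per-core efficiency can be derived.

```latex
\text{time saved} = \frac{31 - 18}{31} \approx 0.42 \quad (42\%),
\qquad
\text{speed-up factor} = \frac{31}{18} \approx 1.72
```

The 42% figure is therefore the reduction in run time; expressed as a ratio, the parallel version is roughly 1.7 times faster.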

  8. Thanks for your attention Questions ?
