
Optimization Coaching for Fork/Join Applications on the Java Virtual Machine

Eduardo Rosales. Advisor: Prof. Walter Binder. Research area: parallel applications, performance analysis.


  1. Optimization Coaching for Fork/Join Applications on the Java Virtual Machine
     Eduardo Rosales
     Advisor: Prof. Walter Binder
     Research area: Parallel applications, performance analysis
     PhD stage: Planner
     EuroDW 2018, April 23, 2018, Porto, Portugal

  2. Optimization Coaching for Fork/Join Applications on the Java Virtual Machine
     § The problem: despite the complexities associated with developing and tuning fork/join applications, little work has focused on assisting developers in optimizing such applications on the JVM.
     § Relevance: fork/join parallelism is increasingly popular among developers targeting the JVM. It has been integrated to support parallel processing in the Java library, thread management in JVM languages, and a variety of parallel applications based on Actors, MapReduce, etc.
     § Our proposal: coaching developers towards optimizing fork/join applications by diagnosing performance issues in such applications and suggesting concrete code refactorings to solve them.
     § Expected outcome: in contrast to the manual experimentation often required to tune fork/join applications on the JVM, we devise a tool able to automatically assist developers in optimizing a fork/join application.

  3. Fork/join Application
     § What is a fork/join application?
         solve(Problem problem) {
           if (problem is small)
             directly solve problem sequentially
           else {
             recursively split problem into independent parts:
               fork new tasks to solve each part
               join all forked tasks
           }
         }
     [Figure: tree of tasks in which each task forks subtasks and later joins them.]
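The pseudocode above can be sketched in Java with `RecursiveTask` from `java.util.concurrent`. This is a minimal illustration, not code from the presentation: the class name `SumTask`, the helper `parallelSum`, and the cutoff `THRESHOLD` are assumptions chosen for the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sums a slice of a long[] following the solve() pattern: small problems are
// solved sequentially, larger ones are split, forked, and joined.
final class SumTask extends RecursiveTask<Long> {
    // Hypothetical cutoff below which the problem counts as "small".
    static final int THRESHOLD = 1_000;

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Problem is small: solve it directly and sequentially.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Split into two independent parts.
        int mid = from + (to - from) / 2;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                     // fork a new task for the left part
        long rightSum = right.compute(); // solve the right part in this thread
        long leftSum = left.join();      // join the forked task
        return leftSum + rightSum;
    }

    // Convenience entry point that runs the task on the common pool.
    static long parallelSum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }
}
```

Calling `SumTask.parallelSum(array)` from any thread submits the root task and blocks until the result is available.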

  4. The Java Fork/Join Framework
     § The Java fork/join framework [1] is the implementation enabling fork/join applications on the JVM.
     § It implements the work-stealing [2] scheduling strategy.
     [Figure: Worker thread 1 and Worker thread 2, each with its own deque (Deque 1, Deque 2) and each running on a CPU core. Labels: Submission, Task, Push, Pop, Take, Steal task.]
     [1] D. Lea. A Java Fork/Join Framework. JAVA 2000.
     [2] Burton et al. Executing Functional Programs on a Virtual Tree of Processors. FPCA 1981.

  5. The Java Fork/Join Framework
     § The Java fork/join framework [1] is the implementation enabling fork/join applications on the JVM.
     § It implements the work-stealing [2] scheduling strategy.
     [Figure: Worker thread 1 and Worker thread 2, each with its own deque (Deque 1, Deque 2) and each running on a CPU core. Labels: Submission, Task, Push, Pop, Take.]
     [1] D. Lea. A Java Fork/Join Framework. JAVA 2000.
     [2] Burton et al. Executing Functional Programs on a Virtual Tree of Processors. FPCA 1981.
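To make the work-stealing behaviour observable, the sketch below runs a deliberately fine-grained task on a private `ForkJoinPool` and reads the pool's `getStealCount()` estimate afterwards. The names `StealDemo`, `Fib`, and `run` are illustrative assumptions, and the steal count varies from run to run, so no particular value should be expected.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

final class StealDemo {
    // Naive recursive Fibonacci: deliberately creates many small tasks.
    static final class Fib extends RecursiveTask<Integer> {
        private final int n;

        Fib(int n) {
            this.n = n;
        }

        @Override
        protected Integer compute() {
            if (n <= 1) {
                return n;
            }
            Fib f1 = new Fib(n - 1);
            f1.fork();                         // push onto this worker's deque
            int f2 = new Fib(n - 2).compute(); // keep working in this thread
            return f2 + f1.join();             // f1 may have been stolen meanwhile
        }
    }

    // Runs Fib(n) on a private pool and returns { result, estimated steal count }.
    static long[] run(int n, int workers) {
        ForkJoinPool pool = new ForkJoinPool(workers);
        try {
            int result = pool.invoke(new Fib(n));
            return new long[] { result, pool.getStealCount() };
        } finally {
            pool.shutdown();
        }
    }
}
```

For example, `StealDemo.run(20, 4)` computes Fib(20) with four workers; the second array element is only an estimate reported by the pool.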

  6. The Java Fork/Join Framework
     § Supports parallel processing in the Java library:
       • java.util.Arrays
       • java.util.stream (package)
       • java.util.concurrent.CompletableFuture<T>
     § Supports thread management for other JVM languages:
       • Scala
       • Apache Groovy
       • Clojure
     § Supports diverse fork/join parallelism, including applications based on Actors and MapReduce
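The three Java library entry points listed above can be exercised as follows. This is a small sketch with hypothetical helper names (`LibraryExamples`, `sorted`, `sumOfSquares`, `asyncAnswer`); by default these calls are typically served by the common fork/join pool, although the exact scheduling depends on the JDK and the machine.

```java
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;
import java.util.stream.IntStream;

final class LibraryExamples {
    // java.util.Arrays: parallel sort of a defensive copy of the input.
    static int[] sorted(int[] input) {
        int[] copy = Arrays.copyOf(input, input.length);
        Arrays.parallelSort(copy);
        return copy;
    }

    // java.util.stream: parallel stream computing 1^2 + 2^2 + ... + n^2.
    static long sumOfSquares(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .mapToLong(i -> (long) i * i)
                .sum();
    }

    // CompletableFuture: asynchronous computation without an explicit executor.
    static int asyncAnswer() {
        return CompletableFuture.supplyAsync(() -> 6 * 7).join();
    }
}
```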

  7. The Java Fork/Join Framework
     § Many of the design forces encountered when implementing fork/join designs surround task granularity at four levels [3]:
       • Maximizing parallelism
       • Minimizing overheads
       • Minimizing contention
       • Maximizing locality
     [3] D. Lea. Concurrent Programming in Java, Second Edition: Design Principles and Patterns. Addison-Wesley Professional, 2nd edition, 1999.

  8. Example of common performance issues 1/4
     Suboptimal forking
     § Excessive forking: too fine-grained tasks
     ✗ Parallelization overheads due to excessive:
       • Deque accesses
       • Object creation/reclaiming
     [Figure: a CPU with four cores; each worker's deque shows repeated Push, Pop, and Take operations.]

  9. Example of common performance issues 2/4
     Suboptimal forking
     § Sparse forking: few coarse-grained tasks
     ✗ Missed parallelization opportunities:
       • Low CPU utilization
       • Load imbalance
     [Figure: a CPU with four cores; deques show Push, Pop, Take, and Steal operations, and some cores are marked idle.]
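The two forking issues above both come down to where the sequential cutoff sits. The sketch below counts how many task objects a simple binary split creates for a given cutoff; `GranularityDemo`, `Work`, and `tasksCreated` are hypothetical names, and the leaf tasks do no real work, so only the task count is meaningful.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.atomic.AtomicLong;

final class GranularityDemo {
    // A task that only splits; leaves are where real work would go.
    static final class Work extends RecursiveAction {
        private final int size;
        private final int threshold;
        private final AtomicLong created;

        Work(int size, int threshold, AtomicLong created) {
            this.size = size;
            this.threshold = threshold;
            this.created = created;
            created.incrementAndGet(); // one more task object allocated
        }

        @Override
        protected void compute() {
            if (size <= threshold) {
                return; // leaf: small enough to process sequentially
            }
            int half = size / 2;
            invokeAll(new Work(half, threshold, created),
                      new Work(size - half, threshold, created));
        }
    }

    // Returns the number of tasks created for a problem of the given size.
    static long tasksCreated(int size, int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        AtomicLong created = new AtomicLong();
        ForkJoinPool.commonPool().invoke(new Work(size, threshold, created));
        return created.get();
    }
}
```

With a problem of size 1024, a cutoff of 1 creates 2047 tasks (excessive forking), while a cutoff of 512 creates only 3, which leaves at most two leaves to share among the cores (sparse forking).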

  10. The problem
     Despite the complexities associated with developing and tuning fork/join applications, little work has focused on assisting developers towards optimizing such applications on the JVM.
     The scope: fork/join applications running in a single JVM on a single shared-memory multicore.
     [Figure: four CPUs with four cores each, sharing a single memory.]

  11. Our Approach
     In contrast to manual experimentation used to tune a fork/join application, we propose an approach based on:
       • Profiling techniques
       • Optimization coaching

  12. Our Approach
     In contrast to manual experimentation often used to tune a fork/join application, we propose an approach based on:
       • Profiling techniques
       • Optimization coaching
     Profiling techniques: static and dynamic analysis to automatically diagnose performance issues.

  13. Our Approach
     In contrast to manual experimentation often used to tune a fork/join application, we propose an approach based on:
       • Profiling techniques
       • Optimization coaching
     § Static analysis: to automatically inspect the source code to detect fork/join anti-patterns.
     § Dynamic analysis: to automatically diagnose performance issues noticeable at runtime (e.g., suboptimal forking, excessive garbage collection, low CPU usage, contention).
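As an illustration of what a runtime diagnosis of suboptimal forking might look like, the sketch below classifies average task granularity. It is not the presentation's method: the class `GranularityDiagnosis`, the "steps" metric, and the `diagnose` helper are hypothetical. The two bounds follow the rough rule of thumb in the `ForkJoinTask` Javadoc that a task should perform more than 100 and less than 10000 basic computational steps.

```java
final class GranularityDiagnosis {
    enum Finding { TOO_FINE_GRAINED, TOO_COARSE_GRAINED, OK }

    // Bounds from the ForkJoinTask Javadoc rule of thumb (rough guidance only).
    static final long MIN_STEPS = 100;
    static final long MAX_STEPS = 10_000;

    // totalSteps: hypothetical profiler estimate of work done by all tasks.
    // taskCount:  number of tasks the application spawned.
    static Finding diagnose(long totalSteps, long taskCount) {
        if (taskCount <= 0) {
            throw new IllegalArgumentException("taskCount must be positive");
        }
        long averageSteps = totalSteps / taskCount;
        if (averageSteps <= MIN_STEPS) {
            return Finding.TOO_FINE_GRAINED;   // points to excessive forking
        }
        if (averageSteps >= MAX_STEPS) {
            return Finding.TOO_COARSE_GRAINED; // points to sparse forking
        }
        return Finding.OK;
    }
}
```

An average is a coarse summary; a real diagnosis would probably also need the distribution of task sizes and the CPU utilization observed while the tasks ran.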

  14. Our Approach
     In contrast to manual experimentation often used to tune a fork/join application, we propose an approach based on:
       • Profiling techniques
       • Optimization coaching
     Optimization coaching [4]: processing the output generated by the compiler's optimizer to suggest concrete code modifications that may enable the compiler to achieve missed optimizations.
     [4] St-Amour et al. Optimization Coaching: Optimizers Learn to Communicate with Programmers. OOPSLA 2012.

  15. Our Approach
     In contrast to manual experimentation often used to tune a fork/join application, we propose an approach based on:
       • Profiling techniques
       • Optimization coaching
     Inspired by optimization coaching, the goal is to automatically suggest concrete code modifications that solve the detected issues.
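A concrete code modification of the kind described above could look like the following before/after pair. The example is an assumption for illustration, not a suggestion taken from the presentation: `ForkBoth` forks both halves of every split, while `ForkOne` forks one half and computes the other in the current thread, saving one task push per split. All names and the cutoff `THRESHOLD` are hypothetical.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

final class RefactoringSketch {
    static final int THRESHOLD = 1_000; // hypothetical sequential cutoff

    // Sequential base case: sum of the integers in [from, to).
    static long sequentialSum(int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += i;
        }
        return sum;
    }

    // Before: both halves are forked at every split.
    static final class ForkBoth extends RecursiveTask<Long> {
        private final int from;
        private final int to;

        ForkBoth(int from, int to) {
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                return sequentialSum(from, to);
            }
            int mid = from + (to - from) / 2;
            ForkBoth left = new ForkBoth(from, mid);
            ForkBoth right = new ForkBoth(mid, to);
            left.fork();
            right.fork();
            return right.join() + left.join(); // join the most recent fork first
        }
    }

    // After: one half is forked, the other is computed directly.
    static final class ForkOne extends RecursiveTask<Long> {
        private final int from;
        private final int to;

        ForkOne(int from, int to) {
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                return sequentialSum(from, to);
            }
            int mid = from + (to - from) / 2;
            ForkOne left = new ForkOne(from, mid);
            ForkOne right = new ForkOne(mid, to);
            left.fork();
            long rightSum = right.compute(); // no extra task for this half
            return rightSum + left.join();
        }
    }

    static long before(int n) {
        return ForkJoinPool.commonPool().invoke(new ForkBoth(0, n));
    }

    static long after(int n) {
        return ForkJoinPool.commonPool().invoke(new ForkOne(0, n));
    }
}
```

Both variants return the same result; whether the refactoring pays off in practice would have to be confirmed by measurement on the target workload.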

  16. Future Work
     Methodology for the automatic diagnosis of performance issues:
     § Define a model to characterize fork/join tasks
     § Characterize all tasks spawned by a fork/join application
     § Determine the metrics and entities worth considering to automatically diagnose performance issues
     Methodology for the automatic suggestion of optimizations:
     § Automatic recognition of fork/join anti-patterns and matching to concrete suggestions to avoid them
     Validation of the results:
     § Discover fork/join workloads suitable for validating both aforementioned methodologies

  17. BACKUP SLIDES

  18. Related Work
     § Analysis of parallel applications on the JVM
     § A number of parallelism profilers focus on the JVM [9][10]:
       • YourKit Java Profiler
       • JProfiler
       • Intel VTune
       • Java Mission Control
     The goal: characterizing processes or threads over time.
     Limitations: none of the existing tools targets fork/join applications.
     [9] Adhianto et al. HPCTOOLKIT: Tools for Performance Analysis of Optimized Parallel Programs. Concurr. Comput.: Pract. Exper., 22(6): pp. 685–701, 2010.
     [10] Teng et al. THOR: a Performance Analysis Tool for Java Applications Running on Multicore Systems. IBM Journal of Research and Development, 54(5):4:1–4:17, 2010.
