SLIDE 1

Dynamic Task Scheduling for the Uintah Framework

Qingyu Meng, Justin Luitjens, and Martin Berzins

Thanks to DOE for funding since 1997, NSF since 2008

SLIDE 2

Uintah Applications

Virtual Soldier Angiogenesis Micropin Flow Shaped Charges Sandstone Compaction Foam Compaction Industrial Flares Plume Fires Explosions

SLIDE 3

Uintah Development

  • Uintah has been developed over a decade, based on a far-sighted design by Steve Parker
  • Complete separation of user code and parallelism

                      Tuning Expert (CS)               Domain Expert (Engineering)
Goal                  Performance, Scalability         Problem, Methods
Responsibility        Infrastructure Components        Simulation Components
Major Contributions   Load balancing, AMR,             Arches, ICE, MPM, MPM-ICE, etc.
                      task-graph based scheduling,
                      asynchronous communication
View of Program       Parallel infrastructure:         Serial code written for a patch
                      MPI, threads

SLIDE 4

How Does Uintah Work?

Task-Graph Specification

  • Computes & Requires
  • No processor or domain information

Patch-Based Domain Decomposition

SLIDE 5

Patch-Based Domain Decomposition

[Figure: a mesh of cells (holding particles) is grouped into patches; patches are assigned to processors. The adaptive mesh is periodically regridded and load balanced.]

SLIDE 6

Uintah Scalability

  • Currently scales up to 98K cores on Kraken
  • Preparing for future machines
  • Petascale
  • Exascale, 2018~2020

e.g. the Aggressive Strawman design: 742 cores per socket, 166,113,024 cores in total
(DARPA hardware report, 2009)

SLIDE 7

Software Model for Exascale

  • The Silver model for exascale software, which must support:
  • Directed dynamic graph execution
  • Latency hiding
  • Minimized synchronization and overheads
  • Adaptive resource scheduling
  • Heterogeneous processing
  • A graph-based asynchronous-task work queue model

(DARPA software report, 2009)

SLIDE 8

Graph Based Applications

[Figure: example task dependency graph]

  • Charm++: object-based virtualization
  • Intel CnC: a new language for graph-based parallelism
  • PLASMA (Dongarra): DAG-based parallel linear algebra software

SLIDE 9

Uintah Distributed Task Graph

  • Up to 2 million tasks per timestep globally
  • Tasks on local and neighboring patches
  • Callback by each patch
  • Variables in data warehouse (DW)
  • Read: get() from OldDW and NewDW
  • Write: put() to NewDW
  • Communication on cutting edges

[Figure: task graph for 4 patches, single-level ICE]
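The get()/put() pattern above can be sketched as a toy data warehouse. This is an illustrative sketch, not the real Uintah API: the class layout, the scalar-per-patch storage, and the toy velocity formula are all assumptions.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Minimal sketch of the data-warehouse idea: a task reads its "requires"
// from the old warehouse and publishes its "computes" to the new one.
class DataWarehouse {
public:
    void put(const std::string& name, int patch, double value) {
        vars_[{name, patch}] = value;            // write a per-patch variable
    }
    double get(const std::string& name, int patch) const {
        auto it = vars_.find({name, patch});
        if (it == vars_.end())
            throw std::runtime_error("unsatisfied requires: " + name);
        return it->second;
    }
private:
    std::map<std::pair<std::string, int>, double> vars_;
};

// A "task" callback: requires pressure from OldDW, computes a velocity
// into NewDW (the formula is a placeholder, not ICE physics).
void computeVelocityTask(const DataWarehouse& oldDW, DataWarehouse& newDW,
                         int patch) {
    double p = oldDW.get("press", patch);        // requires
    newDW.put("u_vel", patch, p * 0.5);          // computes
}
```

Because tasks only name what they read and write, the infrastructure can see every dependency without the task knowing which processor produced its inputs.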

SLIDE 10

Example Uintah Task from the ICE Algorithm

Compute face-centered velocities:

[Figure: task code listing its input variables and output variables, including boundary conditions]
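A task declaration of this shape can be sketched as follows. The TaskSpec type and the variable names (press_CC, uvel_FC, ...) are illustrative assumptions in the style of ICE's cell-centered/face-centered naming, not the actual Uintah interface.

```cpp
#include <string>
#include <vector>

// Sketch of a task declaration: a task lists what it requires and what it
// computes, with no processor or domain information attached. The scheduler
// derives all communication from these declarations.
struct TaskSpec {
    std::string name;
    std::vector<std::string> requiresList;  // input variables
    std::vector<std::string> computesList;  // output variables
};

// The face-centered velocity task of the ICE algorithm, as a declaration.
TaskSpec makeFaceVelocityTask() {
    TaskSpec t;
    t.name = "ICE::computeFaceCenteredVelocities";
    t.requiresList = {"press_CC", "rho_CC", "vel_CC"};   // assumed inputs
    t.computesList = {"uvel_FC", "vvel_FC", "wvel_FC"};  // assumed outputs
    return t;
}
```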

SLIDE 11

Task Graph Compiling

SLIDE 12

Uintah Static Task Scheduler

  • Task list
  • Static analysis
  • In-order execution, in the same order for each patch
  • Task status: Running -> Finished -> Next Task
SLIDE 13

Static Task Graph Execution

1) Static: predetermined order

  • Tasks are synchronized
  • Higher waiting times

[Figure: task dependencies and execution order]

SLIDE 14

Dynamic Task Graph Execution


2) Dynamic: Execute when ready

  • Tasks are Asynchronous
  • Lower waiting times

SLIDE 15

Dynamic Multi-threaded Task Graph Execution


3) Dynamic multi-threaded (future):

  • Task-level parallelism
  • Decreases communication
  • Decreases load imbalance
  • Multicore friendly
  • Supports GPU tasks

SLIDE 16

Uintah Dynamic Task Scheduler

  • Task queues
  • Internal ready (tasks waiting on MPI)
  • External ready (tasks ready for concurrent execution)
  • Task status: Not Scheduled -> Internal Ready -> External Ready -> Running -> Finished -> new task(s) satisfied
  • Multi-threaded execution
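The two-queue design can be sketched as a toy scheduler. All names here are illustrative, and the "all MPI receives completed" check is reduced to a flag; the real scheduler would test outstanding MPI requests and feed worker threads from the external-ready queue.

```cpp
#include <deque>
#include <string>
#include <vector>

// A task whose task-graph dependencies are already satisfied.
struct Task {
    std::string name;
    bool mpiDone;   // stand-in for "all MPI receives have completed"
};

// Toy two-queue dynamic scheduler: internal-ready holds tasks still waiting
// on communication; external-ready holds tasks ready to execute.
class DynamicScheduler {
public:
    void schedule(const Task& t) { internalReady_.push_back(t); }

    // Poll internal-ready tasks; promote those whose communication finished.
    void checkMPI() {
        for (auto it = internalReady_.begin(); it != internalReady_.end();) {
            if (it->mpiDone) {
                externalReady_.push_back(*it);
                it = internalReady_.erase(it);
            } else {
                ++it;
            }
        }
    }

    // Run everything that is externally ready; return executed task names.
    std::vector<std::string> runReady() {
        std::vector<std::string> done;
        while (!externalReady_.empty()) {
            done.push_back(externalReady_.front().name);
            externalReady_.pop_front();
        }
        return done;
    }
private:
    std::deque<Task> internalReady_;   // waiting on MPI
    std::deque<Task> externalReady_;   // ready for concurrent execution
};
```

A task whose messages have not arrived never blocks the queue: other ready tasks run ahead of it, which is exactly the out-of-order behavior the slides describe.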

SLIDE 17

On-demand Data Warehouse

  • Directory-based hash map, keyed by <name, type, patchid>, for fixed-order execution
  • Variable versioning (v1, v2, v3, ...) for out-of-order execution

Example entries: del_T (Global, n/a), press (CC, patch 1), press (CC, patch 2), u_vel (FC, patch 1), ...
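Variable versioning keyed by <name, type, patchid> might look like this minimal sketch. The types and method names are assumptions, not the real Uintah implementation, which manages grid variables rather than scalars.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Variables are keyed by <name, type, patchid>; each put() appends a new
// version, so a task executing out of order can still read the exact
// version it depends on instead of whatever happens to be newest.
using VarKey = std::tuple<std::string, std::string, int>;

class VersionedDW {
public:
    // Store a new version; returns the 1-based version number just written.
    int put(const VarKey& key, double value) {
        versions_[key].push_back(value);
        return static_cast<int>(versions_[key].size());
    }
    // Read a specific version (1-based), as recorded in the task graph.
    double get(const VarKey& key, int version) const {
        return versions_.at(key).at(version - 1);
    }
private:
    std::map<VarKey, std::vector<double>> versions_;
};
```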

SLIDE 18

Schedule Global Sync Task

  • Synchronization tasks
  • Update global variables, e.g. MPI Allreduce
  • Call third-party libraries, e.g. PETSc
  • Out-of-order issues: deadlock, load imbalance
  • Task phases
  • One global task per phase
  • The global task runs last in its phase
  • Out-of-order execution allowed within a phase

[Figure: reductions R1 and R2 issued in different orders on different processors, causing deadlock or load imbalance]
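The task-phase rule can be sketched as an ordering constraint. This is an illustrative model with assumed names: within a phase, local tasks may run in any order, but the phase's single global task (e.g. an Allreduce) runs last, and no task of a later phase starts before it.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct PhasedTask {
    std::string name;
    int phase;
    bool global;  // true for the one synchronization task of the phase
};

// Produce a legal execution order: phases in sequence; within each phase,
// local tasks first (their relative order is free), then the global task.
std::vector<std::string> orderByPhase(std::vector<PhasedTask> tasks) {
    std::stable_sort(tasks.begin(), tasks.end(),
        [](const PhasedTask& a, const PhasedTask& b) {
            if (a.phase != b.phase) return a.phase < b.phase;
            return a.global < b.global;  // locals (false) before global (true)
        });
    std::vector<std::string> order;
    for (const auto& t : tasks) order.push_back(t.name);
    return order;
}
```

Because every processor runs the global task of phase p before anything in phase p+1, mismatched collective orderings, the deadlock scenario in the figure, cannot arise.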

SLIDE 19

Dynamic Scheduler Performance Improvements

[Figures: strong scaling (fixed problem size) and weak scaling (fixed problem size per core)]

SLIDE 20

Task prioritization algorithms

Algorithm      Queue Length   Wait Time   Overall Time
Random         3.11           18.9        315.35
FCFS           3.16           18.0        308.73
PatchOrder     4.05            7.0        187.19
MostMsg.       4.29            2.6        139.39

[Figure: executed vs. not-yet-executed tasks and their MPI sends]
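The winning MostMsg. policy can be sketched as a priority queue over the external-ready tasks. The struct fields and names are illustrative assumptions; the idea is simply to run first the tasks that satisfy the most external dependencies, so their MPI sends are posted as early as possible.

```cpp
#include <queue>
#include <string>
#include <vector>

struct ReadyTask {
    std::string name;
    int externalDeps;  // external (off-node) dependencies this task satisfies
};

// Comparator for a max-heap on externalDeps: std::priority_queue pops the
// element that compares greatest, so "less messages" sorts lower.
struct FewerMessages {
    bool operator()(const ReadyTask& a, const ReadyTask& b) const {
        return a.externalDeps < b.externalDeps;
    }
};

using PriorityReadyQueue =
    std::priority_queue<ReadyTask, std::vector<ReadyTask>, FewerMessages>;
```

With this policy, boundary tasks that unblock other nodes jump ahead of purely interior work, which matches the wait-time drop from 18.9 to 2.6 in the table.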

SLIDE 21

Granularity Effect

  • Decreasing patch size:
  • (+) Increases queue length
  • (+) More overlap, lower task wait time
  • (+) More patches, better load balance
  • (-) More MPI messages
  • (-) More regridding overhead
  • Other factors
  • Problem size
  • Implied task-level parallelism
  • Interconnect bandwidth and latency
  • CPU cache size
  • Solution: self-tuning?
SLIDE 22

Summary

  • Dynamic task scheduling
  • Supports out-of-order execution
  • Two task queues
  • Variable versioning
  • Task phases
  • Task prioritization algorithms
  • Ready-queue length and task wait time
  • Granularity effects
  • Multi-threading and self-tuning (future work)
SLIDE 23

Questions

SLIDE 24

BACKUP SLIDES

SLIDE 25

Uintah Components

[Figure: Uintah component diagram. The Simulation Controller reads an XML problem specification and drives a Simulation component (Arches, ICE, MPM, MPMICE, MPMArches, ...). The Scheduler executes tasks over MPI; the Data Archiver writes checkpoints and data I/O; the Load Balancer and Regridder interact through callbacks; Models (EoS, constitutive, ...) plug into the simulation. The components divide between the Domain Expert and the Tuning Expert.]

SLIDE 26

Uintah: Task Based Application

  • Automatic dependency analysis
  • Automatic message generation
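Automatic message generation can be sketched by matching declarations across tasks. Everything here is an illustrative model, not the Uintah implementation: by pairing one task's "computes" with another task's "requires" on a different processor, the infrastructure derives the MPI traffic itself, so application code never posts a send or receive.

```cpp
#include <string>
#include <vector>

struct DeclaredTask {
    std::string name;
    int rank;  // processor owning the task's patch
    std::vector<std::string> computesList;
    std::vector<std::string> requiresList;
};

struct Message { std::string var; int from, to; };

// Generate one message for every cross-rank computes -> requires match.
std::vector<Message> generateMessages(const std::vector<DeclaredTask>& tasks) {
    std::vector<Message> msgs;
    for (const auto& producer : tasks)
        for (const auto& var : producer.computesList)
            for (const auto& consumer : tasks)
                if (consumer.rank != producer.rank)
                    for (const auto& req : consumer.requiresList)
                        if (req == var)
                            msgs.push_back({var, producer.rank, consumer.rank});
    return msgs;
}
```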
SLIDE 27

Priority: Most Messages

  • Priority external task queue
  • Gives priority to tasks that satisfy external dependencies first

[Figure: patches on a single core, annotated with external-dependency counts]

SLIDE 28

External Dependency Counter

SLIDE 29

MPM-ICE Algorithm

Uintah originally designed for simulation of fires and explosions, e.g. metal containers embedded in large hydrocarbon fires.

  • ICE is a cell-centered finite volume method for the Navier-Stokes equations
  • ICE now handles both fast and slow flows (2009)
  • MPM is a novel method that uses particles and nodes
  • A Cartesian grid is used as a common frame of reference
  • MPM (solids) and ICE (fluids) exchange data several times per timestep, not just a boundary-condition exchange

[Figure: container with PBX explosive]