Multilevel domain decomposition at extreme scales
- S. Badia, A. Martin, J. Principe
Universitat Politècnica de Catalunya & CIMNE
Jeju, July 7th, 2015
Outline
1 Motivation
2 Multilevel framework
3 Multilevel linear solvers
4 Conclusions
Scalable algorithms
If we increase the number of processors X times, we can solve an X times larger problem.
This enables more complex problems / increased accuracy.
Sources: Dey et al. 2010; parFE project
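This weak-scaling statement can be checked numerically: with the work per core held fixed, efficiency is the reference run time divided by the run time at P cores. The timings below are illustrative placeholders, not measurements from the talk:

```python
# Weak-scaling efficiency: with work per core fixed, an ideally scalable
# algorithm keeps the run time constant as cores (and problem size) grow.
# E(P) = T(P_ref) / T(P); values near 1.0 mean the solver weak-scales.

def weak_scaling_efficiency(t_ref: float, t_p: float) -> float:
    """Efficiency of a run with time t_p relative to the reference time t_ref."""
    return t_ref / t_p

# Made-up timings: 64x more cores, 64x larger problem, near-constant time.
timings = {512: 10.0, 4096: 10.4, 32768: 11.1}
t_ref = timings[512]
efficiencies = {p: weak_scaling_efficiency(t_ref, t) for p, t in timings.items()}
```

By this measure the 32768-core run above retains about 90% efficiency relative to the 512-core reference.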
Existing multilevel solver libraries (... [..., Tuminaro, ...], Hypre [Falgout, Yang, ...], ...): densification of coarser problems, ...
Multilevel BDDC (MLBDDC [Mandel et al.'08])
Figure: FE mesh, subdomains (L1), subdomains (L2)
1 Motivation I: Develop a multilevel framework suitable for extremely scalable implementations
2 Motivation II: Apply the multilevel framework for scalable linear algebra (MLBDDC)
* Funded by Proof of Concept Grant 640957 - FEXFEM: On a free open source extreme scale finite element software
Mesh hierarchy: FE mesh T_h; subdomain partitions T_h^1, T_h^2, T_h^3; ˜T_h^1 (subdomains)
Coarse DoFs via functionals on the interface objects: u_α^0 = F_α(u_1), with F_α(u_1) = (1/|E_α|) Σ_{p∈E_α} u_1(p)
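A common concrete choice for the coarse DoF functionals F_α, consistent with the Σ_{p∈E_α} fragment above but an assumption here (the node sets and values below are made up for illustration), is the average of nodal values over the object E_α:

```python
import numpy as np

# Hypothetical sketch: a coarse DoF as the average of a function's nodal
# values over a geometric object E_alpha (a corner, edge, or face),
# i.e. F_alpha(u) = (1/|E_alpha|) * sum_{p in E_alpha} u[p].
def coarse_dof(u: np.ndarray, e_alpha: list) -> float:
    """Average of the nodal values of u over the node set e_alpha."""
    return float(np.mean(u[e_alpha]))

# Toy nodal vector; nodes 1 and 2 form an "edge" with average value 1.5.
u = np.array([0.0, 1.0, 2.0, 3.0])
```

A corner object is the degenerate case where E_α contains a single node, so F_α(u) is just the nodal value there.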
V_0 = {v ∈ ˜V_0 | F_1(v) continuous}
V_0 is a multiscale space
V_0 correction as preconditioner (multilevel preconditioner)
Coarse objects for V_0: corners/edges/faces
The under-assembled space ¯V_0 can be decomposed as [Dohrmann'03]:
V_0^b = {v ∈ ¯V_0 | F(v) = 0}
V_1 = {v ∈ ¯V_0 | v ⊥_˜A V_0^b}
F(u_0) = 0
Figure: circle domain partitioned into 9 subdomains; V_1 corner basis function
Figure: circle domain partitioned into 9 subdomains; V_1 edge basis function
The problem in ¯V_0 = V_1 ⊕ V_0^b:
¯u_0 ∈ ¯V_0 : a(¯u_0, ¯v_0) = (f, ¯v_0) ∀¯v_0 ∈ ¯V_0
can be decomposed as ¯u_0 = u_0^b + u_1 (by the ˜A-orthogonality V_1 ⊥_˜A V_0^b):
u_0^b ∈ V_0^b : a(u_0^b, v_0^b) = (f_0, v_0^b) ∀v_0^b ∈ V_0^b
u_1 ∈ V_1 : a(u_1, v_1) = (f_1, v_1) ∀v_1 ∈ V_1
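The coarse basis functions behind this decomposition are energy-minimal subject to the coarse constraints, and the task table later in the talk computes the coarse matrix as A_C ← Φ^t(−C^T Λ). A dense numpy sketch of that saddle-point construction (the toy matrices, dense solve, and single averaging constraint are assumptions for illustration, not the authors' code):

```python
import numpy as np

# Sketch: per-subdomain coarse basis Phi, obtained by minimizing the
# energy v^T A v subject to the coarse constraints C v = I
# (one row of C per coarse DoF). KKT system:
#   [ A   C^T ] [Phi]   [0]
#   [ C   0   ] [Lam] = [I]
def coarse_basis(A: np.ndarray, C: np.ndarray):
    n, nc = A.shape[0], C.shape[0]
    K = np.block([[A, C.T], [C, np.zeros((nc, nc))]])
    rhs = np.vstack([np.zeros((n, nc)), np.eye(nc)])
    sol = np.linalg.solve(K, rhs)
    Phi, Lam = sol[:n], sol[n:]
    A_C = Phi.T @ (-C.T @ Lam)  # coarse matrix, as in the task table
    return Phi, A_C

# Toy SPD subdomain matrix and one averaging constraint:
A = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
C = np.array([[1 / 3, 1 / 3, 1 / 3]])
Phi, A_C = coarse_basis(A, C)
```

Because A Φ = −C^T Λ at the solution, Φ^t(−C^T Λ) equals the Galerkin coarse matrix Φ^t A Φ, so the cheap formula and the direct projection agree.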
Figure: task timeline across levels. Cores e_1, ..., e_P1 in the 1st-level MPI communicator; cores e_1, ..., e_P2 in the 2nd-level MPI communicator; core e_1 in the 3rd-level MPI communicator. Parallel (distributed) computation is overlapped with global communication over time.
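The hierarchy in the figure requires each MPI rank to know which level of the hierarchy it serves. A minimal sketch of one possible assignment, assuming a consecutive rank layout (the actual layout in the talk's implementation is not specified here):

```python
# Assumed layout: with P1 first-level tasks, P2 second-level tasks and one
# third-level task, ranks are assigned consecutively:
#   [0, P1) -> level 1, [P1, P1 + P2) -> level 2, the last rank -> level 3.
def level_of(rank: int, tasks_per_level: list) -> int:
    """Return the hierarchy level (1-based) that a given rank serves."""
    offset = 0
    for level, p in enumerate(tasks_per_level, start=1):
        if rank < offset + p:
            return level
        offset += p
    raise ValueError("rank outside the communicator")

tasks = [8, 2, 1]  # e.g. 8 L1 tasks, 2 L2 tasks, 1 L3 task
```

In an MPI code this mapping would typically be realized by splitting MPI_COMM_WORLD into one sub-communicator per level, with only the assigned ranks participating in each.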
BDDC preconditioner [Dohrmann'03, ...]
Under-assembled space ¯V_0 (reduced continuity)
Transfer operators I, I^t between V and ¯V_0 (weight, communicate and add)
Given a residual r, solve ¯u_0 ∈ ¯V_0 : a(¯u_0, ¯v_0) = (f, ¯v_0) ∀¯v_0 ∈ ¯V_0, and obtain u = M_BDDC r = E I ¯u_0, where E is the harmonic extension operator (correction in the interior of subdomains).
Solve Ax = b with BDDC-PCG:
Preconditioner set-up (M_BDDC), then call PCG(A, M_BDDC, b, x_0)
PCG:
r_0 := b − A x_0
z_0 := M_BDDC^{-1} r_0
p_0 := z_0
for j = 0, ..., until convergence do
  s_{j+1} := A p_j
  ...
  z_{j+1} := M_BDDC^{-1} r_{j+1}
  ...
end for
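The loop above can be sketched with the preconditioner abstracted as a callable; a Jacobi (diagonal) preconditioner stands in for M_BDDC^{-1} here, which is an assumption for illustration only:

```python
import numpy as np

# Minimal preconditioned CG mirroring the loop on the slide; the BDDC
# preconditioner is abstracted as a callable M_inv(r).
def pcg(A, b, M_inv, x0, tol=1e-10, maxit=200):
    x = x0.copy()
    r = b - A @ x
    z = M_inv(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxit):
        s = A @ p                      # s_{j+1} := A p_j
        alpha = rz / (p @ s)
        x += alpha * p
        r -= alpha * s
        if np.linalg.norm(r) < tol:
            break
        z = M_inv(r)                   # z_{j+1} := M^{-1} r_{j+1}
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Toy SPD system with a Jacobi preconditioner standing in for M_BDDC^{-1}.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
jacobi = lambda r: r / np.diag(A)
x = pcg(A, b, jacobi, np.zeros(2))
```

Any symmetric positive definite preconditioner can be dropped in as M_inv; in the talk's setting that callable hides all the multilevel coarse-solve machinery of M_BDDC.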
PCG-BDDC tasks across the three MPI task levels:

L1 MPI tasks: identify local coarse DoFs; gather coarse-grid DoFs; Algorithm 1 (k ≡ iL1); Algorithm 2 (k ≡ iL1), computing Φ_iL1 and A_C^(iL1) ← Φ_iL1^t (−C_iL1^T Λ_iL1); gather A_C^(iL1); Algorithm 3 (k ≡ iL1); Algorithm 4 (k ≡ iL1); gather r_C^(iL1); Algorithm 5 (k ≡ iL1); Algorithm 6 (k ≡ iL1).

L2 MPI tasks: build G_{A_C^(jL2)}; A_C^(jL2) := assemb(A_C^(iL1)); Algorithm 1 (k ≡ iL2); Algorithm 2 (k ≡ iL2), computing Φ_iL2 and A_C^(iL2) ← Φ_iL2^t (−C_iL2^T Λ_iL2); gather A_C^(iL2); r_C^(jL2) := assemb(r_C^(iL1)); Algorithm 3 (k ≡ iL2); Algorithm 4 (k ≡ iL2); Algorithm 5 (k ≡ iL2); Algorithm 6 (k ≡ iL2); scatter z_C^(jL2) into z_C^(iL1).

L3 MPI task: build G_{A_C}; Re+Sy fact(G_{A_C}); A_C := assemb(A_C^(iL2)); Num fact(A_C); r_C := assemb(r_C^(iL2)); solve A_C z_C = r_C; scatter z_C into z_C^(iL2).

Algorithm 1: Re+Sy fact(G_{A_F^(k)}); Re+Sy fact(G_{A_II^(k)})
Algorithm 2: Num fact((A_0^b)^(k))
Algorithm 3: Num fact(A_II^(k))
Algorithm 4: δ_I^(k) ← (A_II^(k))^{-1} r_I^(k); r_Γ^(k) ← r_Γ^(k) − A_ΓI^(k) δ_I^(k); r^(k) ← I_k^t r
Algorithm 5: solve (A_0^b)^(k) for (s_F^(k), λ); s_C^(k) ← Φ_i z_C^(k); z^(k) ← I_i (s_F^(k) + s_C^(k))
Algorithm 6: z_I^(k) ← −(A_II^(k))^{-1} A_IΓ^(k) z_Γ^(k); z_I^(k) ← z_I^(k) + δ_I^(k)
Goal: strike a balance such that blue/red areas are kept below green ones!
Figure: weak scaling for the MLBDDC(cef) solver with 1K FEs/core, from 512 to 46.6K cores, comparing 3-lev BDDC (heavy 3rd level), 3-lev BDDC (heavy 2nd level) and 4-lev BDDC (well balanced).
3D Laplacian problem on IBM BG/Q (JUQUEEN@JSC) 16 MPI tasks/compute node, 1 OpenMP thread/MPI task
Figure: weak scaling for the MLBDDC(ce) solver (#PCG iterations and total time in secs. vs. #cores, 2.7K to 458K), comparing 3-lev and 4-lev configurations with H1/h1 ∈ {20, 25, 30, 40}, H2/h2 ∈ {3, 7}, H3/h3 = 3.
Experiment set-up (Lev. / # MPI tasks / FEs per core):
1st: 42.8K, 74.1K, 117.6K, 175.6K, 250K, 343K, 456.5K / 20^3, 25^3, 30^3, 40^3
2nd: 125, 216, 343, 512, 729, 1000, 1331 / 7^3
3rd: 1 in all cases / n/a
3D Linear Elasticity problem on IBM BG/Q (JUQUEEN@JSC) 16 MPI tasks/compute node, 1 OpenMP thread/MPI task
Figure: weak scaling for the MLBDDC(ce) solver (#PCG iterations and total time in secs. vs. #cores, 2.7K to 458K), comparing 3-lev configurations with H1/h1 ∈ {15, 20, 25}, H2/h2 = 7.
Experiment set-up (Lev. / # MPI tasks / FEs per core):
1st: 42.8K, 74.1K, 117.6K, 175.6K, 250K, 343K, 456.5K / 15^3, 20^3, 25^3
2nd: 125, 216, 343, 512, 729, 1000, 1331 / 7^3
3rd: 1 in all cases / n/a
3D Laplacian problem on IBM BG/Q (JUQUEEN@JSC) 64 MPI tasks/compute node, 1 OpenMP thread/MPI task
Figure: weak scaling for the 4-level BDDC(ce) solver with H2/h2 = 4, H3/h3 = 3 (#PCG iterations and total time in secs. vs. #subdomains 46.6K to 1.73M, i.e. #cores 12.1K to 448.3K), for H1/h1 ∈ {10, 20, 25}.
Experiment set-up (Lev. / # MPI tasks / FEs per core):
1st: 46.7K, 110.6K, 216K, 373.2K, 592.7K, 884.7K, 1.26M / 10^3, 20^3, 25^3
2nd: 729, 1.73K, 3.38K, 5.83K, 9.26K, 13.8K, 19.7K / 4^3
3rd: 27, 64, 125, 216, 343, 512, 729 / 3^3
4th: 1 in all cases / n/a
Conclusions
Weak scalability demonstrated up to the full JUQUEEN machine (458K cores).
Future work:
... generation
... hydrodynamics
S. Badia, A. F. Martín and J. Principe. Multilevel Balancing Domain Decomposition at Extreme Scales. Submitted, 2015.
Work funded by the European Research Council under:
... for Fusion Technology
Proof of Concept Grant 640957 - FEXFEM: On a free open source extreme scale finite element software