
V. Bakhtin, A. Kolganov, V. Krukov, N. Podderyugina, M. Pritula, O. Savitskaya Keldysh Institute of Applied Mathematics Russian Academy of Sciences http://dvm-system.org

• Graph problems;
• Sparse matrices;
• Scientific and engineering computations on irregular grids.

These problem classes can use the same data format, for example, CSR.
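To make the CSR remark concrete, here is a minimal sketch of a sparse matrix-vector product over CSR data. The type `csr_matrix` and the function `csr_spmv` are hypothetical names chosen for this illustration and do not come from the slides. The same three arrays could equally hold a graph's adjacency lists or the neighbour lists of an irregular grid.

```c
#include <assert.h>
#include <stddef.h>

/* Sparse matrix in CSR form: row r owns entries row_ptr[r] .. row_ptr[r + 1] - 1. */
typedef struct {
    size_t n_rows;
    const size_t *row_ptr;  /* n_rows + 1 offsets into col_idx / values */
    const size_t *col_idx;  /* column (or neighbour) index of each entry */
    const double *values;   /* coefficient (or edge weight) of each entry */
} csr_matrix;

/* Computes y = A * x. Note the indirect access x[col_idx[k]]. */
static void csr_spmv(const csr_matrix *a, const double *x, double *y)
{
    assert(a != NULL && x != NULL && y != NULL);
    for (size_t r = 0; r < a->n_rows; r++) {
        double sum = 0.0;
        for (size_t k = a->row_ptr[r]; k < a->row_ptr[r + 1]; k++)
            sum += a->values[k] * x[a->col_idx[k]];
        y[r] = sum;
    }
}
```

The inner loop reads `x` through `col_idx`, which is the kind of indirect indexing that makes such codes harder to optimize and parallelize than loops with constant offsets.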

Problems:
◦ A single grid step across the whole computational domain: no flexibility, and prohibitively high demands on memory and processing power when the grid is refined;
◦ Implementations of numerical methods are often tied to the form of the grid (two-dimensional, three-dimensional, Cartesian, cylindrical, etc.), so the geometry cannot be replaced.
Positive sides:
◦ Neighborhood relations and spatial coordinates are not stored explicitly, which saves memory;
◦ Array accesses are simple, with constant offsets, which leaves room for compiler optimizations and makes parallelization (including automatic parallelization) clear.

Positive sides:
◦ Any mesh refinement can be chosen, keeping the required degree of refinement in each part of the domain;
◦ Good opportunities for reusing computational code, and freedom in choosing the shape of the computational domain.
Problems:
◦ Neighborhood relations and spatial coordinates must be stored explicitly;
◦ Array accesses use indirect indexing, which is a barrier to compiler optimizations and complicates parallelization (particularly automatic parallelization).

Jacobi algorithm

    #include <stdio.h>

    /* L (grid size) and ITMAX (iteration count) are assumed to be defined elsewhere. */
    double A[L][L];
    double B[L][L];

    int main(int argc, char *argv[])
    {
        for (int it = 0; it < ITMAX; it++) {
            {
                /* Copy the previous iteration into A. */
                for (int i = 1; i < L - 1; i++)
                    for (int j = 1; j < L - 1; j++)
                        A[i][j] = B[i][j];
                /* Average the four neighbours of every interior point. */
                for (int i = 1; i < L - 1; i++)
                    for (int j = 1; j < L - 1; j++)
                        B[i][j] = (A[i - 1][j] + A[i + 1][j] + A[i][j - 1] + A[i][j + 1]) / 4.;
            }
        }
        FILE *f = fopen("jacobi.dat", "wb");
        fwrite(B, sizeof(double), L * L, f);
        fclose(f);
        return 0;
    }

Jacobi algorithm in the DVMH model

    #include <stdio.h>

    /* L (grid size) and ITMAX (iteration count) are assumed to be defined elsewhere. */
    /* Distribute A in blocks along both dimensions, with shadow edges of width 1. */
    #pragma dvm array distribute[block][block], shadow[1:1][1:1]
    double A[L][L];
    /* Align B with A so that B[i][j] is stored where A[i][j] is. */
    #pragma dvm array align([i][j] with A[i][j])
    double B[L][L];

    int main(int argc, char *argv[])
    {
        for (int it = 0; it < ITMAX; it++) {
            {
                for (int i = 1; i < L - 1; i++)
                    for (int j = 1; j < L - 1; j++)
                        A[i][j] = B[i][j];
                for (int i = 1; i < L - 1; i++)
                    for (int j = 1; j < L - 1; j++)
                        B[i][j] = (A[i - 1][j] + A[i + 1][j] + A[i][j - 1] + A[i][j + 1]) / 4.;
            }
        }
        FILE *f = fopen("jacobi.dat", "wb");
        fwrite(B, sizeof(double), L * L, f);
        fclose(f);
        return 0;
    }

Jacobi algorithm in the DVMH model

    #include <stdio.h>

    /* L (grid size) and ITMAX (iteration count) are assumed to be defined elsewhere. */
    #pragma dvm array distribute[block][block], shadow[1:1][1:1]
    double A[L][L];
    #pragma dvm array align([i][j] with A[i][j])
    double B[L][L];

    int main(int argc, char *argv[])
    {
        for (int it = 0; it < ITMAX; it++) {
            {
                /* Iteration (i, j) runs where A[i][j] is stored. */
                #pragma dvm parallel([i][j] on A[i][j])
                for (int i = 1; i < L - 1; i++)
                    for (int j = 1; j < L - 1; j++)
                        A[i][j] = B[i][j];
                /* Refresh the shadow edges of A before reading neighbours. */
                #pragma dvm parallel([i][j] on B[i][j]), shadow_renew(A)
                for (int i = 1; i < L - 1; i++)
                    for (int j = 1; j < L - 1; j++)
                        B[i][j] = (A[i - 1][j] + A[i + 1][j] + A[i][j - 1] + A[i][j + 1]) / 4.;
            }
        }
        FILE *f = fopen("jacobi.dat", "wb");
        fwrite(B, sizeof(double), L * L, f);
        fclose(f);
        return 0;
    }

Jacobi algorithm in the DVMH model

    #include <stdio.h>

    /* L (grid size) and ITMAX (iteration count) are assumed to be defined elsewhere. */
    #pragma dvm array distribute[block][block], shadow[1:1][1:1]
    double A[L][L];
    #pragma dvm array align([i][j] with A[i][j])
    double B[L][L];

    int main(int argc, char *argv[])
    {
        for (int it = 0; it < ITMAX; it++) {
            /* Region that may be executed on an accelerator or a multi-core CPU. */
            #pragma dvm region inout(A, B)
            {
                #pragma dvm parallel([i][j] on A[i][j])
                for (int i = 1; i < L - 1; i++)
                    for (int j = 1; j < L - 1; j++)
                        A[i][j] = B[i][j];
                #pragma dvm parallel([i][j] on B[i][j]), shadow_renew(A)
                for (int i = 1; i < L - 1; i++)
                    for (int j = 1; j < L - 1; j++)
                        B[i][j] = (A[i - 1][j] + A[i + 1][j] + A[i][j - 1] + A[i][j + 1]) / 4.;
            }
        }
        FILE *f = fopen("jacobi.dat", "wb");
        /* Bring the up-to-date copy of B back to CPU memory before writing it. */
        #pragma dvm get_actual(B)
        fwrite(B, sizeof(double), L * L, f);
        fclose(f);
        return 0;
    }
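As an illustration of what a block distribution with shadow edges of width 1 implies along one dimension, the sketch below computes the index range a process owns and the wider range it reads. The even split used here, and the names `block_range` and `block_with_shadow`, are assumptions made for this example; the slides do not describe how the DVMH run-time system actually divides the blocks.

```c
#include <assert.h>

typedef struct {
    int lo, hi;            /* owned index range [lo, hi) */
    int read_lo, read_hi;  /* owned range widened by the shadow width, clipped to [0, n) */
} block_range;

/* Splits n indices into nprocs nearly equal blocks (assumed scheme) and
 * returns the block of process `rank` together with its shadow-extended range. */
static block_range block_with_shadow(int n, int nprocs, int rank, int shadow_width)
{
    assert(n > 0 && nprocs > 0 && rank >= 0 && rank < nprocs && shadow_width >= 0);
    int base = n / nprocs;
    int extra = n % nprocs;  /* the first `extra` processes get one more index */
    block_range b;
    b.lo = rank * base + (rank < extra ? rank : extra);
    b.hi = b.lo + base + (rank < extra ? 1 : 0);
    b.read_lo = b.lo - shadow_width < 0 ? 0 : b.lo - shadow_width;
    b.read_hi = b.hi + shadow_width > n ? n : b.hi + shadow_width;
    return b;
}
```

For 10 rows on 4 processes with a shadow width of 1, process 1 would own rows 3 to 5 and read rows 2 to 6. The rows it reads but does not own are the ones a shadow renewal has to fetch from neighbouring processes.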

C-DVMH = C language + pragmas
Fortran-DVMH = Fortran 95 + pragmas
• Pragmas are a high-level specification of parallelism in terms of the sequential program;
• The program code contains no low-level data transfers or synchronization;
• Sequential programming style;
• Pragmas are "invisible" to standard compilers;
• There is a single version of the program for both sequential and parallel execution.

• Distribution of arrays between processors (distribute / align directives);
• Distribution of loop iterations between computing devices (parallel directive);
• Specification of parallel tasks and their mapping onto processors (task directive);
• Efficient remote access to data located on other computing devices (shadow / across / remote specifications).

• Efficient execution of reduction operations (reduction specification: max/min/sum/maxloc/minloc/…);
• Definition of program fragments (regions) to be executed on accelerators and multi-core CPUs (region directive);
• Control of data movement between CPU memory and GPU memory (actual / get_actual directives).
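The reduction specification can be pictured with a small sketch in the style of the Jacobi slides. The clause spelling `reduction(max(eps))`, the grid size `N`, and the function name `copy_and_max_change` are assumptions made for this illustration; only the existence of a reduction specification with `max` is stated above. A standard C compiler ignores the pragmas, so the code also runs sequentially.

```c
#include <assert.h>

#define N 8  /* hypothetical grid size for this sketch */

#pragma dvm array distribute[block][block]
double A[N][N];
#pragma dvm array align([i][j] with A[i][j])
double B[N][N];

/* Copies the interior of B into A and returns the largest absolute change. */
static double copy_and_max_change(void)
{
    double eps = 0.0;
    /* Assumed clause syntax: each process reduces its own part, then the
     * partial maxima are combined into eps. */
    #pragma dvm parallel([i][j] on A[i][j]), reduction(max(eps))
    for (int i = 1; i < N - 1; i++)
        for (int j = 1; j < N - 1; j++) {
            double d = B[i][j] - A[i][j];
            if (d < 0.0)
                d = -d;
            if (d > eps)
                eps = d;
            A[i][j] = B[i][j];
        }
    assert(eps >= 0.0);
    return eps;
}
```

A value such as `eps` is typically compared with a tolerance to decide when to stop iterating, which is why it has to be combined across all devices rather than kept per process.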

• Fortran-DVMH compiler;
• C-DVMH compiler;
• DVMH run-time system;
• Debugger for DVMH programs;
• Performance analyzer.

• There is a large existing base of parallel programs for clusters, and much experience in writing them;
• The DVMH model assumes that a sequential program is being parallelized;
• Users do not want to give up their existing parallel programs;
• The DVMH model cannot be applied to some programs (e.g., those with random memory access).

• A new mode was added in which the DVM system operates locally within each process;
• A construct for parallel loops that are not distributed was added;
• Incremental parallelization and a quick evaluation of the DVMH model on CPU threads and GPUs became available;
• DVMH parallelization can now be used within a cluster node inside MPI programs.

• A solver with an explicit scheme is part of a large, established suite of computational programs:
◦ C++, 39 000 LOC, templates, polymorphism, etc.;
• Local modifications were made to one module (~3000 lines), amounting to the addition of about 10 DVMH directives;
• The following speedups were obtained:
◦ 2 Intel Xeon X5670 CPUs (6 cores per CPU): 9.8x;
◦ NVIDIA GTX Titan GPU (Kepler): 18x.

• Indirect distribution: distribute A[indirect(B)]
• Derived distribution: distribute A[derived([cells[i][0]: cells[i][2]] with cells[@i])]

• Shadow edges are the set of elements that are not owned by the current process but are accessed by it;
• New directive for indirect distributions: shadow_add (nodes[neigh[i][0]:neigh[i][numneigh[i]-1] with nodes[@i]] = neighbours)
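The idea behind shadow edges for an indirect distribution can be sketched as follows, assuming an `owner` array that maps every node to a process (in the spirit of `indirect(B)`) and neighbour lists like `neigh` and `numneigh` above. The function `collect_shadow` and the bound `MAX_NEIGH` are hypothetical and do not describe how the DVMH run-time system builds its shadow edges.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NEIGH 4  /* hypothetical upper bound on neighbours per node */

/* Collects, for process `rank`, the nodes it does not own but reads through
 * the neighbour lists of the nodes it owns. Writes their global indices to
 * `shadow` (capacity n_nodes) and returns how many were found. */
static int collect_shadow(int n_nodes, const int *owner, const int *numneigh,
                          const int neigh[][MAX_NEIGH], int rank, int *shadow)
{
    assert(n_nodes > 0 && owner && numneigh && neigh && shadow);
    int count = 0;
    for (int i = 0; i < n_nodes; i++) {
        if (owner[i] != rank)
            continue;                      /* only owned nodes are processed here */
        for (int k = 0; k < numneigh[i]; k++) {
            int g = neigh[i][k];
            if (owner[g] == rank)
                continue;                  /* neighbour is owned locally */
            bool seen = false;             /* linear duplicate check, fine for a sketch */
            for (int s = 0; s < count; s++)
                if (shadow[s] == g) {
                    seen = true;
                    break;
                }
            if (!seen)
                shadow[count++] = g;
        }
    }
    return count;
}
```

For a chain of six nodes split as {0, 1, 2} and {3, 4, 5}, process 0 ends up with node 3 as its only shadow element, and process 1 with node 2.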

• Converting a global (original) index to a local one (used for direct memory access) takes too long;
• For regular distributions the global and local indexes are the same;
• An executable directive was introduced to localize array indexes for indirect distributions: localize (neigh => nodes[:])
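One way to picture index localization is a single pass that rewrites the stored global indices into local ones, so that the computational loops can index local memory directly. This is only a sketch of the idea: the function `localize_refs` and its arguments are hypothetical and are not the implementation behind the `localize` directive.

```c
#include <assert.h>

/* Rewrites `refs` (n_refs global indices) into local indices in place.
 * `local_to_global` lists the n_local elements stored by this process
 * (owned elements followed by shadow elements). `global_to_local` is
 * scratch space of size n_global. Returns 0 on success, or -1 if a
 * reference points to an element that is not stored locally (entries
 * before the failing one have already been rewritten). */
static int localize_refs(int *refs, int n_refs,
                         const int *local_to_global, int n_local,
                         int *global_to_local, int n_global)
{
    assert(refs && local_to_global && global_to_local);
    for (int g = 0; g < n_global; g++)
        global_to_local[g] = -1;           /* -1 marks "not stored here" */
    for (int l = 0; l < n_local; l++) {
        assert(local_to_global[l] >= 0 && local_to_global[l] < n_global);
        global_to_local[local_to_global[l]] = l;
    }
    for (int r = 0; r < n_refs; r++) {
        assert(refs[r] >= 0 && refs[r] < n_global);
        int l = global_to_local[refs[r]];
        if (l < 0)
            return -1;
        refs[r] = l;
    }
    return 0;
}
```

The conversion cost is paid once per reference array instead of on every access inside the time-stepping loop.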

• Two-dimensional heat conduction problem in a hexagon, with a constant but discontinuous coefficient.
• The domain consists of two materials with different thermal conductivity coefficients.

• Arrays are one-dimensional: tt1, tt2
• Variable number of "neighbors": ii
• Links are specified by the array jj

    do i = 1, np2
      nn = ii(i)
      nb = npa(i)
      if (nb.ge.0) then
        s1 = FS(xp2(i),yp2(i),tv)
        s2 = 0d0
        do j = 1, nn
          j1 = jj(j,i)
          s2 = s2 + aa(j,i) * tt1(j1)
        enddo
        s0 = s1 + s2
        tt2(i) = tt1(i) + tau * s0
      else if (nb.eq.-1) then
        tt2(i) = vtemp1
      else if (nb.eq.-2) then
        tt2(i) = vtemp2
      endif
      s0 = (tt2(i) - tt1(i)) / tau
      gt = DMAX1(gt,DABS(s0))
    enddo
    do i = 1, np2
      tt1(i) = tt2(i)
    enddo

[Figure: speedup on Intel Xeon X5670 CPUs for the explicit and implicit schemes, on 1 to 4 nodes. X axis: number of cores, 2 to 96 (2 CPUs with 6 cores each per node). Y axis: speedup.]

[Figure: speedup on Nvidia Tesla C2050 GPUs for the explicit and implicit schemes, on 1 to 4 nodes. X axis: number of GPUs, 1 to 24 (3 per node). Y axis: speedup.]

Site: http://dvm-system.org
E-mail: dvm@keldysh.ru
