The Integrative Role of COW's and Supercomputers in Research and Education Activities

Don Morton, Ganesh Prabu, Daniel Sandholdt, Lee Slater
Department of Computer Science, The University of Montana
CUG 1999


SLIDE 1

The Integrative Role of COW's and Supercomputers in Research and Education Activities

Don Morton, Ganesh Prabu, Daniel Sandholdt, Lee Slater
Department of Computer Science, The University of Montana

SLIDE 2

Introduction

• Thesis - workstation clusters and supercomputers can be used together in environments that benefit everybody
• COW's (e.g. Beowulf) - training and development activities in HPC
• Supercomputers (e.g. Cray T3E) - large-scale production runs

SLIDE 3

Acknowledgements

• Arctic Region Supercomputing Center
• SGI/CRI
• National Science Foundation
• Pallas

SLIDE 4

Outline

• Background
• Current Computing Environments
• Case Study - Parallel Programming Course
• Research and Development Activities
• COW/Supercomputer Integration Issues
• Conclusions

SLIDE 5

Background

• 1991 - 80486, Linux
• 1993-94 - PVM, RS6000, T3D
• 1994-97 - Cameron University, ARSC
• 1997-Present - U. Montana, ARSC

SLIDE 6

Current Computing Environments

UM Scientific Computing Lab

[Diagram: UM Scientific Computing Lab network - 100BaseT hub, 10BaseT hub, nodes p1-p8, frontend.scinet.prairie.edu, scinet.prairie.edu, Internet, LittleHouse.prairie.edu, elk.prairie.edu]

SLIDE 7

Case Study - Parallel Programming Course

• Graduate (masters) course
• Goals
  – Hands-on experience using common, portable programming tools
  – Explore the concept of training on COW's, then moving to supercomputers

SLIDE 8

Parallel Programming Course Outline

• Discuss basic concepts of parallel programming
• Implement a solution to the n-body problem with PVM, then MPI, then HPF (a minimal MPI sketch follows this list)
• Introduce performance analysis tools
• Lab session based on Linux/T3E portability issues
• Special projects
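As a rough illustration of the n-body bullet above, here is a minimal MPI sketch of one force-evaluation step for a naive O(N^2) code. The 1-D positions, problem size, force formula, and decomposition are assumptions made for illustration; this is not the course's actual program.

/* Sketch: each rank owns N/nprocs bodies, gathers all positions with
   MPI_Allgather, then computes forces on its own bodies only. */
#include <mpi.h>
#include <stdlib.h>

#define N 1024                /* total bodies; assumed divisible by nprocs */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int nlocal = N / nprocs;
    double *x_local = malloc(nlocal * sizeof(double));  /* my positions */
    double *x_all   = malloc(N * sizeof(double));       /* everyone's   */
    double *f_local = malloc(nlocal * sizeof(double));  /* my forces    */

    for (int i = 0; i < nlocal; i++)
        x_local[i] = rank * nlocal + i;                 /* placeholder data */

    /* every rank obtains every body's position */
    MPI_Allgather(x_local, nlocal, MPI_DOUBLE,
                  x_all, nlocal, MPI_DOUBLE, MPI_COMM_WORLD);

    /* pairwise force accumulation for locally owned bodies */
    for (int i = 0; i < nlocal; i++) {
        int gi = rank * nlocal + i;
        f_local[i] = 0.0;
        for (int j = 0; j < N; j++)
            if (j != gi) {
                double d = x_all[gi] - x_all[j];
                f_local[i] += (d > 0.0 ? -1.0 : 1.0) / (d * d + 1e-9);
            }
    }

    free(x_local); free(x_all); free(f_local);
    MPI_Finalize();
    return 0;
}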

SLIDE 9

Lab Session - Linux/T3E

• Port Linux PVM n-body code to T3E PVM
• Port Linux MPI n-body code to T3E MPI
• Vampir analysis of the MPI n-body code
• Performance modeling and analysis of an MPI Jacobi program on the T3E (a minimal Jacobi sketch follows this list)
• Analysis and improvement of an MPI code
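For reference, a minimal sketch of the kind of MPI Jacobi iteration such a lab exercise might model: a 1-D domain decomposition with halo exchange via MPI_Sendrecv. The problem size, iteration count, and variable names are illustrative assumptions, not the lab's actual program.

/* Sketch: each rank holds NLOCAL interior points plus two ghost cells,
   exchanges boundary values with its neighbours, then relaxes. */
#include <mpi.h>

#define NLOCAL 100            /* interior points per process (illustrative) */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    double u[NLOCAL + 2] = {0.0}, unew[NLOCAL + 2] = {0.0};
    int left  = (rank == 0)          ? MPI_PROC_NULL : rank - 1;
    int right = (rank == nprocs - 1) ? MPI_PROC_NULL : rank + 1;

    for (int iter = 0; iter < 100; iter++) {
        /* halo exchange: first point goes left, last point goes right */
        MPI_Sendrecv(&u[1], 1, MPI_DOUBLE, left, 0,
                     &u[NLOCAL + 1], 1, MPI_DOUBLE, right, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(&u[NLOCAL], 1, MPI_DOUBLE, right, 1,
                     &u[0], 1, MPI_DOUBLE, left, 1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* Jacobi relaxation on interior points */
        for (int i = 1; i <= NLOCAL; i++)
            unew[i] = 0.5 * (u[i - 1] + u[i + 1]);
        for (int i = 1; i <= NLOCAL; i++)
            u[i] = unew[i];
    }

    MPI_Finalize();
    return 0;
}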

SLIDE 10

Linux PVM to T3E PVM

• Network PVM and Cray MPP PVM have significant differences
  – Heterogeneous vs. homogeneous SPMD
  – Dynamic vs. static task allocation
  – Cray-specific PVM calls
  – Need to be aware of different-size datatypes
• Portable codes must be written in SPMD style, with conditional compilation

SLIDE 11

Conditional Compilation for Portable PVM

#ifdef _CRAYMPP
// In Cray MPP, the "global" group is indicated by null pointer
#define GROUPNAME (char *) 0
#else
#define GROUPNAME "alltasks"
#endif
......
#ifdef _CRAYMPP
// Cray MPP does not support joining a "global" group, so we simply
// use the Cray-specific routine for getting the PE number
mype = pvm_get_PE(mytid);
#else
mype = pvm_joingroup(GROUPNAME);
#endif
........
#ifndef _CRAYMPP
// This is not executed for Cray MPP PVM - pvm_spawn() is not
// implemented - all tasks startup SPMD at beginning
if (mype == 0)  // I'm the master, spawn the others
    info = pvm_spawn(argv[0], (char **) 0, PvmTaskDefault,
                     (char *) 0, ntasks - 1, &tid_list[1]);
#endif
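Note: the Cray compilers on the T3D/T3E predefine the _CRAYMPP macro, so the same source should compile unchanged on both platforms; under Linux the macro is simply undefined and the network-PVM branches are taken.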

SLIDE 12

Comments on Porting PVM and MPI Codes

• PVM is difficult to port until the network vs. Cray MPP differences are understood
• MPI ports easily
• Cray MPP is less forgiving of programmer errors than other systems
• In general, experienced students found the transition from Linux to the T3E straightforward

SLIDE 13

Performance Analysis

• Use of Vampir as a common tool (a generic tracing sketch follows this list)
  – Vampirtrace - library of routines for generating tracefiles
  – Vampir - viewer for examining tracefiles
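For context, tracing libraries of this kind typically hook MPI through the standard PMPI profiling interface. The fragment below is a generic illustration of that mechanism, not Vampirtrace source: the library defines its own MPI_Send, records an event, and forwards to the real implementation via PMPI_Send.

/* Generic PMPI wrapper sketch: compiled into a library and linked ahead
   of the MPI library, it instruments an application with no source changes. */
#include <mpi.h>
#include <stdio.h>

int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    double t0 = MPI_Wtime();                 /* timestamp the event */
    int rc = PMPI_Send(buf, count, datatype, dest, tag, comm);
    /* a real tracer writes a binary tracefile record; we just log it */
    fprintf(stderr, "MPI_Send to %d (%d items) took %.6f s\n",
            dest, count, MPI_Wtime() - t0);
    return rc;
}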

SLIDE 14

Vampir

SLIDE 15

Special Projects

• Conversion of C++ MPI Jacobi program to Fortran
• Conversion of C++ MPI Jacobi program to C++ PVM
• Porting of Linux C++ parallel finite element code to T3E

SLIDE 16

Porting of Linux C++ Parallel Finite Element Code to T3E

[Figure: "Linear Diffusion - SCINET" and "Linear Diffusion - Cray T3E". Wall time (sec) vs. number of processors P = 1, 2, 4, 6, 8 for N = 120, 240, 360, 480, 600 elements; the time axis runs to 600 s on SCINET and to 120 s on the T3E]

SLIDE 17

Research and Development Activities

• Parallel, adaptive finite element methods
• Parallelisation of hydrologic model for arctic ecosystems
• Coupling of parallel thermal and hydrologic models

SLIDE 18

Parallel, Adaptive Finite Element Methods

[Figures: homogeneous absolute permeabilities; heterogeneous absolute permeabilities]

SLIDE 19

3D Isosurface (Oil/Water Interface)

SLIDE 20

Timings - Linux Cluster

[Chart: wall time (log scale, 1-10000 seconds) on 4 and 8 processors, split into Mesh Distribution, Mesh Modification, and Distributed Solution; labeled totals 1130.5 s on 4 processors and 209.7 s on 8]

Wall time (seconds) required for single timestep with 4548 unknowns.

  • 100 MHz Pentiums
  • 100 Mb/s Fast Ethernet

SLIDE 21

Timings - Cray T3E

[Chart: wall time (log scale, 1-1000 seconds) on 4, 8, 16, and 32 processors, split into Mesh Distribution, Mesh Modification, and Distributed Solution; labeled values 165.7 s, 26.1 s, 4.2 s, and 4.5 s]

Wall time (seconds) required for single timestep with 4548 unknowns.

SLIDE 22

Parallelisation of Hydrologic Model

SLIDE 23

Time Measurements

• 6448 elements
• Use of MPI+METIS+Shmem on the Cray, MPI+METIS on Linux (a METIS partitioning sketch follows)
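As a hedged sketch of the MPI+METIS combination, the fragment below partitions a tiny connectivity graph with METIS so that each piece can be assigned to an MPI process. It uses the modern METIS 5 API purely for illustration (the original work predates it), and the four-vertex ring graph is made up.

/* Sketch: partition a 4-vertex ring graph (CSR format) into 2 parts. */
#include <metis.h>
#include <stdio.h>

int main(void)
{
    idx_t nvtxs = 4, ncon = 1, nparts = 2, objval;
    idx_t xadj[5]   = {0, 2, 4, 6, 8};          /* CSR row pointers */
    idx_t adjncy[8] = {1, 3, 0, 2, 1, 3, 0, 2}; /* neighbour lists  */
    idx_t part[4];                              /* output partition of each vertex */

    METIS_PartGraphKway(&nvtxs, &ncon, xadj, adjncy,
                        NULL, NULL, NULL, &nparts,
                        NULL, NULL, NULL, &objval, part);

    for (idx_t i = 0; i < nvtxs; i++)
        printf("vertex %d -> partition %d\n", (int)i, (int)part[i]);
    return 0;
}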

[Chart: "Wall Time for Single Timestep" in seconds (axis up to 20) on the T3D, T3E, and Linux cluster for 2, 4, 8, and 16 processors]

SLIDE 24

Coupling of Thermal and Hydro Models

• Background - previously existing hydro and thermal models
• Benefits of coupling - increased detail, capture of feedback loops inherent in arctic ecosystems

SLIDE 25

Coupled Models

SLIDE 26

MPI Inter-communicators

[Diagram: process groups within MPI_COMM_WORLD joined by an inter-communicator]
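A minimal sketch of how two coupled models might be wired together with an MPI inter-communicator. The half-and-half split of MPI_COMM_WORLD, the tag, and the leader ranks are illustrative assumptions, not the authors' actual setup; run with at least two ranks.

/* Sketch: split MPI_COMM_WORLD into two model groups, then connect them. */
#include <mpi.h>

int main(int argc, char **argv)
{
    int world_rank, world_size;
    MPI_Comm model_comm;   /* intra-communicator for my model       */
    MPI_Comm inter_comm;   /* inter-communicator to the other model */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    /* lower half of the ranks = one model, upper half = the other */
    int color = (world_rank < world_size / 2) ? 0 : 1;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &model_comm);

    /* remote leader = rank 0 of the other half, in MPI_COMM_WORLD terms */
    int remote_leader = (color == 0) ? world_size / 2 : 0;
    MPI_Intercomm_create(model_comm, 0, MPI_COMM_WORLD,
                         remote_leader, 99, &inter_comm);

    /* coupled fields can now be exchanged with ordinary sends/receives
       on inter_comm, where ranks refer to the remote group */

    MPI_Comm_free(&inter_comm);
    MPI_Comm_free(&model_comm);
    MPI_Finalize();
    return 0;
}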

SLIDE 27

Non-coupled vs. Coupled Simulation

SLIDE 28

COW/Supercomputer Integration Issues

• Code written on COW's should run on the T3E, and vice versa
• Integration should focus on creating similar programming environments
  – Users should be able to run programs identically on COW's and supercomputers
  – Scripts (mostly on the COW side) can aid in this

SLIDE 29

COW/Supercomputer Integration Issues (continued)

• Portable analysis tools (e.g. Vampir, pgprof)
• Affordable, portable, integrated debuggers (TotalView?)

SLIDE 30

Conclusions

• COW's and supercomputers have complementary roles in HPC
• Local COW's are an ideal training and development platform
• Supercomputers will always be needed
• Increased usage of COW's for training and development should result in more HPC experts, and greater demand for supercomputers