
Outline The Sixth International Conference on Parallel Processing - PDF document



  1. The Sixth International Conference on Parallel Processing and Applied Mathematics (PPAM 2005)

     SILC: a Flexible and Environment Independent Interface to Matrix Computation Libraries

     Tamito KAJIYAMA 1, 2
     Akira NUKADA 1, 2
     Hidehiko HASEGAWA 3, 1
     Reiji SUDA 2, 1
     Akira NISHIDA 2, 1

     1. CREST, Japan Science and Technology Agency (JST), Japan
     2. The University of Tokyo, Japan
     3. University of Tsukuba, Japan

     Outline
     • Background
       – Matrix computation libraries
       – Traditional programming style based on function calls
     • Proposal of SILC
       – Simple Interface for Library Collections
       – How SILC works
     • Design and Implementation of SILC
     • Experimental Results
     • Future work

     Background
     • Matrix computations
       – Fundamental components in large-scale scientific applications
         • Taking a major proportion of execution time and memory resources
         • Long computation time with relatively small data
       – Matrix computation libraries
         • Facilitating rapid development of user programs
         • A few examples of libraries: LAPACK, IMSL, NAG

     The traditional way of using libraries
     1. Preparation of matrices and vectors using library-specific data structures
     2. Function calls with a function's name and its arguments in a prescribed order
     As a result...
     • User programs will depend on a specific library
       – Not easy to replace the library by another

     You need to use other libraries
     • When user programs need to be ported to other computing environments
       – Required to use environment-specific libraries
     • When solvers and matrix storage formats in other libraries are necessary
       – The best solver and matrix storage format depend on:
         • The problem to be solved
         • The computing environment in use

     An example in the traditional way

         SSI_MATRIX A;
         SSI_SCALAR *b, *x, work[N*6], params[2];
         int options[6], status;
         /* Create matrix A and vector b, allocate buffer for x */
         status = ssi_cg (b, x, work, params, options, &A, NULL);

     • A user program to solve A x = b
     • Using a library-specific function and data structures
     • A source-level dependency upon the library
       – Switch of libraries requires a number of modifications to the user program
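The source-level dependency described above can be made concrete with a small sketch. Both "libraries" below are hypothetical stand-ins invented for illustration (`liba_solve`, `libb_linsolve` and `libb_matrix` do not come from the slides or from any real package). They solve the same dense system A x = b but differ in storage layout, argument order and data structures, which is exactly what forces edits in the user program when the library is swapped. The shared solver does no pivoting and so assumes nonzero pivots.

```c
#include <assert.h>
#include <stdlib.h>

/* Shared core: solve a row-major n x n system in place.
   No pivoting: assumes nonzero pivots (e.g. a diagonally dominant matrix). */
static void solve_in_place(int n, double *a, double *rhs)
{
    assert(n > 0);
    for (int k = 0; k < n; k++)
        for (int i = k + 1; i < n; i++) {
            double f = a[i * n + k] / a[k * n + k];
            for (int j = k; j < n; j++)
                a[i * n + j] -= f * a[k * n + j];
            rhs[i] -= f * rhs[k];
        }
    for (int i = n - 1; i >= 0; i--) {
        for (int j = i + 1; j < n; j++)
            rhs[i] -= a[i * n + j] * rhs[j];
        rhs[i] /= a[i * n + i];
    }
}

/* Hypothetical library A: plain row-major array, output argument last. */
int liba_solve(int n, const double *a, const double *b, double *x)
{
    double *work = malloc(sizeof(double) * (size_t)n * (size_t)n);
    if (work == NULL)
        return -1;
    for (int i = 0; i < n * n; i++)
        work[i] = a[i];
    for (int i = 0; i < n; i++)
        x[i] = b[i];
    solve_in_place(n, work, x);
    free(work);
    return 0;
}

/* Hypothetical library B: its own matrix struct, column-major storage,
   output argument first. */
typedef struct {
    int n;
    const double *col_major;
} libb_matrix;

int libb_linsolve(double *x, const libb_matrix *A, const double *rhs)
{
    int n = A->n;
    double *work = malloc(sizeof(double) * (size_t)n * (size_t)n);
    if (work == NULL)
        return -1;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            work[i * n + j] = A->col_major[j * n + i]; /* transpose to row-major */
    for (int i = 0; i < n; i++)
        x[i] = rhs[i];
    solve_in_place(n, work, x);
    free(work);
    return 0;
}
```

A caller moving from `liba_solve(n, a, b, x)` to `libb_linsolve(x, &A, b)` has to re-lay-out the matrix, wrap it in a new struct and reorder the arguments, even though the mathematical request is unchanged.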

  2. Proposal of SILC
     • Simple Interface for Library Collections
       – Separating a function call into data transfer and a request of computation
       – Requesting the computation by means of mathematical expressions in the form of text
       – Using separate memory space to carry out the requested computation
     [Diagram: the user program sends A, b and the request "Solve Ax = b" to the memory space for computation, which returns x]

     An example in SILC

         silc_envelope_t A, b, x; /* as in A x = b */
         /* Create matrix A and vector b, allocate buffer for x */
         SILC_PUT ("A", &A);
         SILC_PUT ("b", &b);
         SILC_EXEC ("x = A \ b"); /* Call a solver (e.g., ssi_cg) */
         SILC_GET (&x, "x");

     • Data transfer and a request of computation
     • Mathematical expressions in the form of text
     • Computation in separate memory space
     → Independent of any specific library and environment

     Main benefits of using SILC
     • User programs are independent of libraries
       – Allowing users to change environments easily
     • Only the smallest amount of data is needed
       – Temporary buffers for computation are automatically allocated in separate memory space
     • Mathematical expressions are well-defined and language-independent
       – Fit for use in many computing environments with various programming languages (C, Fortran, Python)
     [Diagram: PCs and an SMP machine]

     Functionalities
     • Data types: scalar, vector, matrix, cubic array
     • Precisions: integer, real, complex (single/double)
     • Matrix storage formats: dense, band, CRS
     • Mathematical expressions
       – Statements: assignments, procedure calls
       – Components of a statement
         • Binary arithmetic operators (+, -, *, /, %)
         • Solution of systems of linear equations (A \ b)
         • Transposition (A'), complex conjugate (A~)
         • Functions (e.g., "sqrt(b' * b)" is the 2-norm of vector b)
         • Subscript (e.g., "A[1:5, 1:5]" is a 5 × 5 submatrix of A)

     How to use alternative solvers
     • Alternative solvers as separate modules
       – One module for each solver
       – The "prefer" statement to specify a preferred module
     • An example: a comparison of two solvers

         SILC_EXEC ("prefer leq_lu");
         SILC_EXEC ("x1 = A \ b"); /* solved by LU decomposition */
         SILC_EXEC ("prefer leq_cg");
         SILC_EXEC ("x2 = A \ b"); /* solved by the CG method */
         SILC_EXEC ("d = b - A * x1; norm1 = sqrt(d' * d)"); /* ||b - A x1|| */
         SILC_EXEC ("d = b - A * x2; norm2 = sqrt(d' * d)"); /* ||b - A x2|| */

     Implementation
     • User program (client)
       – Connects to a SILC server
       – Issues PUT, EXEC and GET requests
     • Interface thread
       – For communications
       – Puts EXEC requests into the request queue
     • Execution thread
       – For computation
       – Handles EXEC requests asynchronously
     [Diagram: the user program (client: main program and SILC client routines) communicates with the SILC server (interface thread, request queue, execution thread); pluggable modules: linear equation solvers, eigenvalue solvers, FFT]
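The PUT / EXEC / GET split described above can be sketched as a single-process toy. This is not the SILC API or its implementation: every `toy_*` name is hypothetical, the "server" is just a static table of named arrays, and `toy_exec` accepts only statements of the form `x = A \ b`, which it solves by dense Gaussian elimination without pivoting (so it assumes nonzero pivots). The point is only that the caller names its data, sends a textual request, and never sees the work buffers.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_VARS 16
#define TOY_NAME_LEN 16

/* One named row-major array in the store that stands in for the server. */
typedef struct { char name[TOY_NAME_LEN]; int rows, cols; double *data; } toy_var_t;

static toy_var_t toy_store[TOY_MAX_VARS];
static int toy_count = 0;

static toy_var_t *toy_find(const char *name)
{
    for (int i = 0; i < toy_count; i++)
        if (strcmp(toy_store[i].name, name) == 0)
            return &toy_store[i];
    return NULL;
}

/* PUT: copy the caller's data into the store under a name. */
int toy_put(const char *name, int rows, int cols, const double *data)
{
    assert(rows > 0 && cols > 0);
    size_t bytes = sizeof(double) * (size_t)rows * (size_t)cols;
    double *copy = malloc(bytes);
    if (copy == NULL) return -1;
    memcpy(copy, data, bytes);
    toy_var_t *v = toy_find(name);
    if (v == NULL) {
        if (toy_count == TOY_MAX_VARS || strlen(name) >= TOY_NAME_LEN) {
            free(copy);
            return -1;
        }
        v = &toy_store[toy_count++];
        strcpy(v->name, name);
    }
    free(v->data); /* NULL for a new entry: static storage is zero-initialized */
    v->rows = rows; v->cols = cols; v->data = copy;
    return 0;
}

/* GET: copy a stored array back into the caller's buffer. */
int toy_get(const char *name, double *out, int max_len)
{
    const toy_var_t *v = toy_find(name);
    if (v == NULL || v->rows * v->cols > max_len) return -1;
    memcpy(out, v->data, sizeof(double) * (size_t)(v->rows * v->cols));
    return 0;
}

/* EXEC: parse "x = A \ b" (the only supported statement) and store the
   solution under the name on the left-hand side. */
int toy_exec(const char *stmt)
{
    char xn[TOY_NAME_LEN], an[TOY_NAME_LEN], bn[TOY_NAME_LEN];
    if (sscanf(stmt, " %15[A-Za-z0-9_] = %15[A-Za-z0-9_] \\ %15[A-Za-z0-9_]",
               xn, an, bn) != 3)
        return -1;
    const toy_var_t *A = toy_find(an), *b = toy_find(bn);
    if (!A || !b || A->rows != A->cols || b->rows != A->rows || b->cols != 1)
        return -1;
    int n = A->rows, w = n + 1;
    /* Work buffer (augmented matrix [A | b]) lives on the "server" side. */
    double *m = malloc(sizeof(double) * (size_t)n * (size_t)w);
    if (m == NULL) return -1;
    for (int i = 0; i < n; i++) {
        memcpy(&m[i * w], &A->data[i * n], sizeof(double) * (size_t)n);
        m[i * w + n] = b->data[i];
    }
    for (int k = 0; k < n; k++) {          /* forward elimination */
        if (m[k * w + k] == 0.0) { free(m); return -1; }
        for (int i = k + 1; i < n; i++) {
            double f = m[i * w + k] / m[k * w + k];
            for (int j = k; j < w; j++)
                m[i * w + j] -= f * m[k * w + j];
        }
    }
    for (int i = n - 1; i >= 0; i--) {     /* back substitution */
        double s = m[i * w + n];
        for (int j = i + 1; j < n; j++)
            s -= m[i * w + j] * m[j * w + n];
        m[i * w + n] = s / m[i * w + i];
    }
    for (int i = 0; i < n; i++)            /* compact x into m[0..n-1] */
        m[i] = m[i * w + n];
    int rc = toy_put(xn, n, 1, m);
    free(m);
    return rc;
}
```

A caller would issue `toy_put("A", ...)`, `toy_put("b", ...)`, `toy_exec("x = A \\ b")` and `toy_get("x", ...)`. Note the doubled backslash: inside a C string literal a single backslash followed by a space is not a valid escape sequence.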

  3. Implementation (continued)
     • User programs
       – Sequential programs (at the moment)
       – Written in C, Fortran and Python
     • SILC servers
       – Run in sequential and shared-memory (SMP) parallel computing environments
       – OpenMP is used for parallel computation in the execution thread

     Experiments with 4 SILC servers in different computing environments
     • A user program (client) that solves A x = b
       – Where A is a tridiagonal matrix in the CRS format
       – Run in the notebook PC of Environment (a)
       – In a 100-Base TX local-area network

     Environment                    Specification                                                                         OpenMP
     (a) A notebook PC              Intel Pentium M 733 1.1GHz, 768MB memory, Fedora Core 3                               N/A
     (b) SGI Altix3700              Intel Itanium2 1.3GHz × 32, 32GB memory, Red Hat Linux Advanced Server 2.1            1 thread
     (c) IBM eServer OpenPower 710  IBM Power5 1.65GHz × 2 (4 logical CPUs), 1GB memory, SuSE Linux Enterprise Server 9   4 threads
     (d) SGI Altix3700              Same as (b)                                                                           16 threads

     Experimental results
     • About 0.1 second of data communications over the LAN
       – Data size: 0.46MB (N=10,000) to 4.27MB (N=80,000)
     • SILC servers in (c) and (d) achieved better performance because of parallel computation
     [Figure: execution time (in seconds, axis ticks from 1 to 10,000) versus dimension N (10,000; 20,000; 40,000; 80,000) for (a) Notebook PC, (b) Altix3700 (1 thread), (c) OpenPower 710 (4 threads), (d) Altix3700 (16 threads)]

     Observations
     • Performance of SILC is not bad
       – Speedups by parallel computation even with a time loss due to data communications
     • Communication time will have less impact as dimension N increases
       – Communication time is of O(N)
       – Computation time is of O(N^2)
     • Faster networks and computing environments also reduce communication time in SILC

     Future work
     • Ready-made modules for existing matrix computation libraries
     • MPI-based SILC for distributed-memory parallel computing environments
     • Just-in-time (dynamic) optimizations based on mathematical expressions
     • Extension of mathematical expressions to an interactive scripting language

     For your information
     • The first public release of SILC (version 1.0) will be made on September 20
     • Please visit our project home page at http://ssi.is.s.u-tokyo.ac.jp/silc/
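The experiments above use a tridiagonal matrix stored in CRS (compressed row storage). As a minimal sketch of what that layout looks like, the code below builds an n × n tridiagonal matrix in CRS and multiplies it by a vector. The type and function names (`crs_t`, `crs_tridiag`, `crs_matvec`) are hypothetical, and the slides do not state the matrix entries, so the constant lower/diagonal/upper values are an assumption. A tridiagonal matrix has 3n - 2 stored entries, so the data to be transferred grows linearly with n, in line with the O(N) communication time noted in the observations.

```c
#include <assert.h>
#include <stdlib.h>

/* Compressed Row Storage: row i owns entries row_ptr[i] .. row_ptr[i+1]-1. */
typedef struct {
    int n;        /* matrix dimension */
    int nnz;      /* number of stored entries */
    int *row_ptr; /* n + 1 offsets into col_idx / val */
    int *col_idx; /* column index of each stored entry */
    double *val;  /* value of each stored entry */
} crs_t;

void crs_free(crs_t *m)
{
    free(m->row_ptr);
    free(m->col_idx);
    free(m->val);
    m->row_ptr = NULL;
    m->col_idx = NULL;
    m->val = NULL;
}

/* Build an n x n tridiagonal matrix with constant sub-, main and
   super-diagonal values. Returns 0 on success, -1 on allocation failure. */
int crs_tridiag(crs_t *m, int n, double lower, double diag, double upper)
{
    assert(n >= 1);
    m->n = n;
    m->nnz = 3 * n - 2;
    m->row_ptr = malloc(sizeof(int) * (size_t)(n + 1));
    m->col_idx = malloc(sizeof(int) * (size_t)m->nnz);
    m->val = malloc(sizeof(double) * (size_t)m->nnz);
    if (!m->row_ptr || !m->col_idx || !m->val) {
        crs_free(m);
        return -1;
    }
    int k = 0;
    for (int i = 0; i < n; i++) {
        m->row_ptr[i] = k;
        if (i > 0) {
            m->col_idx[k] = i - 1;
            m->val[k++] = lower;
        }
        m->col_idx[k] = i;
        m->val[k++] = diag;
        if (i < n - 1) {
            m->col_idx[k] = i + 1;
            m->val[k++] = upper;
        }
    }
    m->row_ptr[n] = k;
    return 0;
}

/* y = A * x for a CRS matrix A. */
void crs_matvec(const crs_t *m, const double *x, double *y)
{
    for (int i = 0; i < m->n; i++) {
        double sum = 0.0;
        for (int k = m->row_ptr[i]; k < m->row_ptr[i + 1]; k++)
            sum += m->val[k] * x[m->col_idx[k]];
        y[i] = sum;
    }
}
```

For example, `crs_tridiag(&A, 4, -1.0, 2.0, -1.0)` stores 10 entries, and multiplying by a vector of ones gives 1 in the first and last rows and 0 in the interior rows.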
