High Performance Computing, Computational Grid, and Numerical Libraries

Outline
[Figure labels: SYSTEM (computer); Different Algorithms, Segment Sizes; Best Algorithm]
ScaLAPACK PDGESV, using collapsed NWS query from UCSB. 42 machines available, using mainly the torc, cypher, and msc clusters at UTK [Jan 2002].

[Figure: time (seconds, axis 0.0 to 6000.0) versus matrix size - Nproc, with series PDGESV, spawn, NWS, and MDS.]

Time (seconds) by matrix size - Nproc:

Series       5000-10  10000-12  15000-14  20000-14  25000-18  30000-18  35000-27
(unlabeled)      4.7       5.3       1.0       2.3       7.6       1.1       4.7
PDGESV          58.7     394.7     749.4    1686.7    2747.1    4472.7    5020.4
spawn           92.2     105.9     154.1     124.7     135.6     181.0     264.5
NWS              5.5       7.4      12.3      12.0       4.2      10.2       8.5
MDS             13.0      11.0      10.0      11.0      14.0      73.0      12.0
Today: Collaboration
[Figure: client, agent, and server diagram, shown in four build steps. Components: Client (Matlab, Mathematica, C, Fortran, Java, Excel); AGENT(s) with Schedule Database; servers S1, S2, S3, S4; IBP Depot. Labels added across the steps: "A, B"; "Handle back"; "Request"; "S2"; "Op(C, A, B)"; "OP, handle"; "Answer (C)".]

No knowledge of the grid required, RPC like.
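The labels in the figure suggest an RPC-like exchange: the client stores its inputs A and B, gets a handle back, and a server picked by the agent runs the operation and returns the answer C. The Python sketch below illustrates that pattern only. Every name in it (`Depot`, `Server`, `Agent`, `call`), the load values, and the "least loaded server wins" rule are hypothetical assumptions for illustration, not the interface of any real system.

```python
class Depot:
    """Hypothetical storage depot: stores data and hands back an integer handle."""

    def __init__(self):
        self._store = {}
        self._next_handle = 0

    def put(self, data):
        handle = self._next_handle
        self._next_handle += 1
        self._store[handle] = data
        return handle

    def get(self, handle):
        return self._store[handle]


class Server:
    """Hypothetical compute server offering a set of named operations."""

    def __init__(self, name, load, ops):
        self.name = name
        self.load = load      # assumed load figure, lower is better
        self.ops = ops        # mapping: operation name -> function(a, b)

    def run(self, op, depot, handle):
        a, b = depot.get(handle)   # fetch the inputs by handle
        return self.ops[op](a, b)


class Agent:
    """Hypothetical agent: picks the least loaded server that offers the operation."""

    def __init__(self, servers):
        self.servers = servers

    def choose(self, op):
        candidates = [s for s in self.servers if op in s.ops]
        return min(candidates, key=lambda s: s.load)


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def call(agent, depot, op, a, b):
    """RPC-like call: the caller never names a server."""
    handle = depot.put((a, b))          # client stores A, B and gets a handle back
    server = agent.choose(op)           # agent selects a server
    return server.name, server.run(op, depot, handle)   # answer C comes back


servers = [
    Server("S1", 0.9, {"matmul": matmul}),
    Server("S2", 0.2, {"matmul": matmul}),
    Server("S3", 0.5, {"matmul": matmul}),
    Server("S4", 0.7, {}),
]
chosen, c = call(Agent(servers), Depot(), "matmul", [[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The caller passes only the operation name and its arguments, which is the sense in which no knowledge of the grid is required.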
An OC-3 ATM link routes IP traffic between the Knoxville campus, the National Transportation Research Center, and Oak Ridge National Laboratory. UT participates in several national networking initiatives, including Internet2 (I2), Abilene, the federal Next Generation Internet (NGI) initiative, the Southeastern Universities Research Association (SURA) Regional Information Infrastructure (RII), and Southern Crossroads (SoX). The UT campus network is a meshed ATM OC-12 that is being migrated to switched Gigabit by early 2002.
[Figure: program preparation and execution diagram. Labels: Source Application; Program Compiler; Libraries; Software Components; Configurable Object Program; Binder; Scheduler; Resource Negotiator; Grid Runtime System; Real-time Performance Monitor; Performance Problem; Performance Feedback; Negotiation.]
[Figure: library and middleware diagram, shown in build steps. The user supplies natural data (A, b). Middleware turns it into structured data (A', b') for an application library (e.g. LAPACK, ScaLAPACK, PETSc, …), and the structured answer (x') comes back to the user as the natural answer (x). Labels added across the steps: "Stage data to disk"; "NWS"; "Autopilot"; "MDS"; "Resource Selection"; "Time function minimization"; "0,0 0,1 …".]
Can use Grid infrastructure (i.e., Globus/NWS), but doesn't have to.
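Resource selection by time function minimization can be pictured as follows: estimate the run time for each candidate set of machines and keep the set with the smallest estimate. The Python sketch below is a minimal illustration under stated assumptions. The cost model (2/3 n^3 floating-point operations for a dense solve, paced by the slowest chosen machine, plus a linear communication penalty), the machine names, and the speeds are all hypothetical and are not taken from the slides.

```python
def estimate_time(n, speeds, comm_cost):
    """Assumed time model for a dense solve of order n on the given machines.

    speeds:    flop/s of each chosen machine (hypothetical values)
    comm_cost: assumed penalty in seconds per matrix row per extra machine
    """
    p = len(speeds)
    flops = (2.0 / 3.0) * n ** 3
    compute = flops / (p * min(speeds))   # slowest machine sets the pace
    comm = comm_cost * n * (p - 1)        # grows with every machine added
    return compute + comm


def select_resources(n, machines, comm_cost):
    """Try the k fastest machines for k = 1..len(machines); keep the best estimate.

    machines: mapping of machine name -> flop/s
    Returns (estimated_seconds, [machine names]).
    """
    ranked = sorted(machines.items(), key=lambda kv: kv[1], reverse=True)
    best = None
    for k in range(1, len(ranked) + 1):
        subset = ranked[:k]
        t = estimate_time(n, [speed for _, speed in subset], comm_cost)
        if best is None or t < best[0]:
            best = (t, [name for name, _ in subset])
    return best


# Hypothetical pool: two fast machines and one slow one.
pool = {"a": 4e8, "b": 4e8, "c": 1e8}
t_free, chosen_free = select_resources(1000, pool, comm_cost=0.0)
t_costly, chosen_costly = select_resources(1000, pool, comm_cost=1.0)
```

With free communication the sketch picks both fast machines and leaves out the slow one, because the slow machine would pace the whole run. With a high communication cost it falls back to a single machine.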
             TORC                        CYPHER                       OPUS
Type         Cluster 8 Dual Pentium III  Cluster 16 Dual Pentium III  Cluster 8 Pentium II
OS           Red Hat Linux 2.2.15 SMP    Debian Linux 2.2.17 SMP      Red Hat Linux 2.2.16
Memory       512 MB                      512 MB                       128 or 256 MB
CPU speed    550 MHz                     500 MHz                      265 – 448 MHz

Network:
  TORC:   Fast Ethernet (100 Mbit/s) (3Com 3C905B) and switch (BayStack 350T) with 16 ports
  CYPHER: Gigabit Ethernet (SK-9843) and switch (Foundry FastIron II) with 24 ports
  OPUS:   Myrinet (LANai 4.3) with 16 ports each
MacroGrid Testbed
Independent components being put together and interacting
[Figure: time (seconds, ticks 500 to 3500) versus matrix size (5000, 10000, 15000, 20000) for three machine pools: OPUS; OPUS, CYPHER; OPUS, TORC, CYPHER. Machine selections shown: 5 OPUS; 8 OPUS; 8 OPUS; 8 OPUS, 6 CYPHER; 8 OPUS, 2 TORC, 6 CYPHER; 6 OPUS, 5 CYPHER; 2 OPUS, 4 TORC, 6 CYPHER; 8 OPUS, 4 TORC, 4 CYPHER.]
Scenario: start the application on 4 processors. Four minutes into the run, 4 additional processors become available. We want to reorganize the computation to take advantage of the extra computing capability. What is the additional cost?

[Figure: time (seconds, ticks 400 to 2000) for three cases, broken down into application execution, redistribute time, checkpoint read+write, restart time, start time, Grid overhead, and NWS.
No rescheduling: App: 13600, 4 cyphers.
No rescheduling: App: 13600, 4 cyphers+4 torcs.
With rescheduling: App1: 13600, 4 torcs; App2: 13600, 4 torcs+4 cyphers, introduced 4 mins. into the run.]
About 12% better performance even with redistribution and restart.
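The trade-off in this scenario can be written as a simple comparison: rescheduling pays off when the remaining time on the larger machine set, plus every overhead incurred by moving, is still shorter than the remaining time on the current set. The Python sketch below shows that comparison only. The function name and all numbers are hypothetical placeholders chosen for illustration. They are not the measurements behind the figure.

```python
def rescheduling_gain(remaining_current, remaining_new, overheads):
    """Compare staying put against rescheduling onto more processors.

    remaining_current: estimated seconds left on the current processors
    remaining_new:     estimated seconds left on the enlarged processor set
    overheads:         mapping of overhead name -> seconds (checkpoint, restart, ...)
    Returns (worth_it, relative_gain), where relative_gain is the fraction of
    the remaining time saved (negative if rescheduling is slower).
    """
    new_total = remaining_new + sum(overheads.values())
    gain = (remaining_current - new_total) / remaining_current
    return new_total < remaining_current, gain


# Hypothetical numbers, in seconds.
overheads = {
    "checkpoint read+write": 90.0,
    "redistribute": 60.0,
    "restart": 50.0,
}
worth_it, gain = rescheduling_gain(1000.0, 600.0, overheads)
```

In this made-up case the overheads add 200 seconds, so the rescheduled run needs 800 seconds instead of 1000 and the relative gain is 20%.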