Grid Computing: NetSolve and the GrADS Project
Innovative Computing Laboratory
Outline

Grid Computing

Why Grids?
Technology Trends: Microprocessor Capacity
2X transistors per chip every 1.5 years, called "Moore's Law."
Microprocessors have become smaller, denser, and more powerful. And it is not just processors: bandwidth, storage, and the rest have grown as well.
Gordon Moore (co-founder of Intel) predicted in 1965 that the transistor density of semiconductor chips would double roughly every 18 months.
Network Bandwidth Growth

Bandwidth Won't Be A Problem Soon

Bisection Bandwidth (BB) Across the US

Grid Possibilities
Some Grid Usage Models

Grid Usage Models

Example Application Projects

Some Grid Requirements – Systems/Deployment Perspective

Some Grid Requirements – User Perspective

The Systems Challenges: Resource Sharing Mechanisms That…

The Security Problem

The Resource Management Problem

Grid Systems Technologies

The Programming Problem

Examples of Grid Programming Technologies

MPICH-G2: A Grid-Enabled MPI

Grid Events

Useful References

Emergence of Grids
Grids Are Inevitable

Motivation for Grid Computing
In the past: Isolation
Today: Collaboration
What is Grid Computing?
Figure: imaging instruments, data acquisition and analysis, computational resources, large-scale databases, and advanced visualization connected by the grid.
The Grid Architecture Picture
High speed networks and routers

Globus Grid Services
Evolution of a Community Grid Model
Layers: applications; user-focused grid middleware, tools, and services; common infrastructure layer (NMI, GGF standards, OGSA, etc.); grid resources.
Maturation of Grid Computing

The Computational Grid is…

Computational Grids and Electric Power Grids

An Emerging Grid Community
Grids are Hot

Broad Acceptance of Grids as a Critical Platform for Computing
NSF's Cyberinfrastructure
NASA's Information Power Grid
DOE's Science Grid

On August 2, 2001, IBM announced a new corporate initiative to support and exploit Grid computing. AP reported that IBM was investing $4 billion in building 50 computer server farms around the world.
AVAKI
Grids Form the Basis of a National Information Infrastructure
August 9, 2001: NSF awarded $53,000,000 to SDSC/NPACI and NCSA/Alliance for the TeraGrid.
"Grids Meet Peer-to-Peer"

Peer to Peer Computing

Internet On Everything

Distributed Computing

Examples of Distributed Computing
SETI@home: Global Distributed Computing
Grid Computing - from ET to Anthrax

Distributed and Parallel Systems
Figure: a spectrum of systems running from distributed, heterogeneous (SETI@home, Entropia/UD, grid-based computing, networks of workstations) to massively parallel, homogeneous (Beowulf clusters, clusters with special interconnect, parallel distributed-memory machines such as the Earth Simulator and the ASCI Tflop/s machines).
Motivation for NetSolve

NetSolve Network Enabled Server
NetSolve: The Big Picture
Client (Matlab, Octave, Scilab, Mathematica, C, Fortran) -> Agent(s) with schedule database -> servers S1-S4, with data staged at an IBP depot. No knowledge of the grid required; RPC-like.
Sequence: the client sends a request Op(C, A, B) to the agent; the agent hands back a server choice (S2); the inputs A, B plus the operation and handle go to the chosen server, possibly via the IBP depot; the server returns the answer C to the client.
Hiding the Parallel Processing

Basic Usage Scenarios
NetSolve Agent

NetSolve Client
In Matlab:
A = netsolve('matmul', B, C);
Possible parallelisms are hidden from the user.
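From C or Fortran the same RPC-style call goes through the NetSolve client library. The sketch below is illustrative only: the header name, the service name "dgesv", and the argument order are assumptions made here, since the real argument list is dictated by the problem description file installed on the servers.

/* Illustrative NetSolve C client; "dgesv" and the argument order are
   placeholders, the real interface comes from the problem description
   file installed on the servers.                                      */
#include <stdio.h>
#include "netsolve.h"          /* NetSolve client API: netsl(), netslerr() */

#define N 100

int main(void)
{
    static double A[N * N];    /* dense matrix, column major            */
    static double b[N];        /* right-hand side, overwritten with x   */
    int status;

    /* ... fill A and b here ... */

    /* Blocking, RPC-like call: the agent picks a server, the inputs are
       shipped (possibly via an IBP depot), and the answer comes back.  */
    status = netsl("dgesv()", N, A, b);
    if (status < 0)
        netslerr(status);      /* print NetSolve's error message */
    return 0;
}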
Generating New Services in NetSolve
A new service is specified in a problem description file (written by hand or through a Java GUI); the NetSolve parser/compiler turns it into server code, and the new service is added to a running server. For example:

@PROBLEM dgesv
@DESCRIPTION This is a linear solver for dense matrices from the LAPACK Library. Solves Ax=b.
@INPUT 2
@OBJECT MATRIX DOUBLE A Double precision matrix
@OBJECT VECTOR DOUBLE b Right hand side
@OUTPUT 1
@OBJECT VECTOR DOUBLE x …
Task Farming - Multiple Requests To Single Problem

Data Persistence
Without sequences, each call is a separate client-server round trip:

netsl("command1", A, B, C);   /* command1(A, B) -> result C */
netsl("command2", A, C, D);   /* command2(A, C) -> result D */
netsl("command3", D, E, F);   /* command3(D, E) -> result F */

Data Persistence (cont'd)
Grouped into a sequence, the three calls are shipped to the servers as one unit, sequence(A, B, E): the client sends the inputs A and E, the intermediate outputs C and D stay on the server side, and only the final result F comes back.

netsl_begin_sequence( );
netsl("command1", A, B, C);
netsl("command2", A, C, D);
netsl("command3", D, E, F);
netsl_end_sequence(C, D);
NPACI Alpha Project - MCell: 3-D Monte-Carlo Simulation of Neurotransmitter Release Between Cells
UCSD (F. Berman, H. Casanova, M. Ellisman), Salk Institute (T. Bartol), CMU (J. Stiles), UTK (Dongarra, M. Miller, R. Wolski).
Study how neurotransmitters diffuse and activate receptors in synapses.
Blue: unbounded; red: singly bounded; green: doubly bounded closed; yellow: doubly bounded open.
NetSolve and IPARS
Figure: web interface -> web server acting as a NetSolve client -> IPARS-enabled servers.

NetSolve and SCIRun
SCIRun torso defibrillator application - Chris Johnson, U of Utah.
University of Tennessee Deployment: Scalable Intracampus Research Grid (SInRG)
The Knoxville campus has two DS-3 commodity Internet connections and one DS-3 Internet2/Abilene connection. An OC-3 ATM link routes IP traffic between the Knoxville campus, the National Transportation Research Center, and Oak Ridge National Laboratory. UT participates in several national networking initiatives, including Internet2 (I2), Abilene, the federal Next Generation Internet (NGI) initiative, the Southern Universities Research Association (SURA) Regional Information Infrastructure (RII), and Southern Crossroads (SoX). The UT campus consists of a meshed ATM OC-12 being migrated over to switched Gigabit by early 2002.
Resources: Grid Service Cluster

SInRG

UTK - SInRG
The Internet Backplane Protocol (IBP)
Network middleware which makes distributed network storage available as a flexibly allocated resource.
Storage buffers exposed to the network.
A simple mechanism for experimenting with allocation and scheduling.
IBP's Unit of Storage

IBP Servers
Strategy #1: Keep data close to the sender (lazy transmission)
Figure: sender -> IBP buffer near the sender -> network -> receiver.

Strategy #2: Place data close to the receiver
Figure: sender -> network -> IBP buffer near the receiver -> receiver.

Strategy #3: Utilize transient storage throughout
Figure: sender -> IBP -> IBP -> IBP -> receiver, with data staged at depots along the path.
Replicated Services
NetSolve: The Big Picture (recap)
Client (Matlab, Mathematica, C, Fortran, Java, Excel) -> Agent(s) with schedule database -> servers S1-S4, data staged at an IBP depot: the request goes to the agent, a handle comes back, A and B plus the operation and handle go to the chosen server (S2), Op(C, A, B) runs there, and the answer C is returned. No knowledge of the grid required; RPC-like.
State Management in NetSolve
For example:
X = F(A, B);
Y = G(X, B);
Without state management the client ships A, B to Server 1 for F, receives X, then ships X, B to Server 2 for G and receives Y, so the intermediate result X crosses the network twice. With caching (for instance in an IBP cache near the servers), or with the dependence flow handled directly between Server 1 and Server 2, X stays on the server side and only the final result Y returns to the client.
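A minimal sketch (not taken from the slides) of how this dependence could be expressed with the request-sequencing calls shown in the Data Persistence example, so that X never travels back to the client; the service names "F()" and "G()" are placeholders:

/* Hypothetical client fragment; assumes A, B, X, Y are already declared
   and that the servers export services named F and G.                  */
netsl_begin_sequence( );
netsl("F()", A, B, X);    /* X = F(A, B): X stays on the server side        */
netsl("G()", X, B, Y);    /* Y = G(X, B): X is consumed where it is cached  */
netsl_end_sequence(X);    /* X is intermediate only; Y returns to the client */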
Two Logistical Scheduling Strategies

Stage Data Close to Server

NetSolve - Things Not Touched On
NSF/NGS GrADS - GrADSoft Architecture
Components: whole-program compiler, libraries, binder, real-time performance monitor, performance problem detection, resource negotiator, scheduler, and grid runtime system. The source application becomes a configurable object program; the software components exchange performance feedback and negotiation information.
ScaLAPACK
To Use ScaLAPACK a User Must:
Download and install the package and its auxiliary packages (PBLAS, BLACS, BLAS, MPI) on the target machines.
Write an SPMD driver that sets up the logical 2-D process grid and places the data on that grid in a block-cyclic fashion.
Call the library routine in SPMD fashion with the correct parameters, and collect the results when it returns.
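As a rough sketch of what that hand-written SPMD driver looks like (my own minimal example, assuming MPI, the BLACS, and ScaLAPACK are installed and the job is launched on exactly nprow x npcol = 4 processes; matrix initialization and error checking are omitted):

/* Minimal hand-written ScaLAPACK driver: set up the 2-D process grid,
   describe the block-cyclic distribution, call pdgesv.  A sketch only. */
#include <stdlib.h>
#include <mpi.h>

extern void Cblacs_pinfo(int *mypnum, int *nprocs);
extern void Cblacs_get(int context, int what, int *val);
extern void Cblacs_gridinit(int *context, char *order, int nprow, int npcol);
extern void Cblacs_gridinfo(int context, int *nprow, int *npcol, int *myrow, int *mycol);
extern void Cblacs_gridexit(int context);
extern int  numroc_(int *n, int *nb, int *iproc, int *isrcproc, int *nprocs);
extern void descinit_(int *desc, int *m, int *n, int *mb, int *nb, int *irsrc,
                      int *icsrc, int *ictxt, int *lld, int *info);
extern void pdgesv_(int *n, int *nrhs, double *a, int *ia, int *ja, int *desca,
                    int *ipiv, double *b, int *ib, int *jb, int *descb, int *info);

int main(int argc, char **argv)
{
    int iam, nprocs, ictxt, nprow = 2, npcol = 2, myrow, mycol;
    int n = 1000, nrhs = 1, nb = 64, izero = 0, ione = 1, info;
    int locrows, loccols, locrhs, lld, descA[9], descB[9];
    double *A, *B;
    int *ipiv;

    MPI_Init(&argc, &argv);
    Cblacs_pinfo(&iam, &nprocs);              /* who am I in the BLACS?  */
    Cblacs_get(-1, 0, &ictxt);                /* default system context  */
    Cblacs_gridinit(&ictxt, "Row", nprow, npcol);
    Cblacs_gridinfo(ictxt, &nprow, &npcol, &myrow, &mycol);

    /* Local sizes of the block-cyclically distributed matrix and rhs.   */
    locrows = numroc_(&n, &nb, &myrow, &izero, &nprow);
    loccols = numroc_(&n, &nb, &mycol, &izero, &npcol);
    locrhs  = numroc_(&nrhs, &nb, &mycol, &izero, &npcol);
    lld = (locrows > 1) ? locrows : 1;

    descinit_(descA, &n, &n, &nb, &nb, &izero, &izero, &ictxt, &lld, &info);
    descinit_(descB, &n, &nrhs, &nb, &nb, &izero, &izero, &ictxt, &lld, &info);

    A    = malloc((size_t)locrows * loccols * sizeof(double));
    B    = malloc((size_t)locrows * (locrhs > 0 ? locrhs : 1) * sizeof(double));
    ipiv = malloc((locrows + nb) * sizeof(int));
    /* ... fill the local pieces of A and B according to the 2-D
           block-cyclic mapping (the part users must get right) ...      */

    pdgesv_(&n, &nrhs, A, &ione, &ione, descA,
            ipiv, B, &ione, &ione, descB, &info);

    free(A); free(B); free(ipiv);
    Cblacs_gridexit(ictxt);
    MPI_Finalize();
    return 0;
}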
ScaLAPACK Grid Enabled
GrADS Numerical Library

Big Picture…
The user has a problem to solve (e.g. Ax = b) and hands the natural data (A, b) to the middleware; the middleware passes structured data (A', b') to an application library (e.g. LAPACK, ScaLAPACK, PETSc, …), receives the structured answer (x'), and returns the natural answer (x) to the user.
- "
- ♦ ;=%
!%)
- $
- $"5
- "$
GrADS Library Sequence GrADS Library Sequence
- Resource Selector
Resource Selector
5 K %%%%)
E2!13E213 58B
7" 1 %%%%! )
- "
- "
- "
- "
- #
5
♦ #5%
%" % )
#%% %%%) %1% ) %) #% ) ;=11)
Performance Model Performance Model
- "
- "
- #
5
- Contract Development
Contract Development
♦ %%
)
♦ %!
%#5% !)
♦ 1
% )
- "
- "
- #
5
- %
; 9% 9 VVV9
Application Launcher Application Launcher
- Resource Selector Input
Resource Selector Input
♦ B ES1 1
- #%5/
6% %2B3 H!% %) # 5 ♦ 58
11 8)
♦ %%!
)
I!!%1% 1%! %%
x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
ScaLAPACK Performance Model

T(n, p) = C_f\, t_f + C_v\, t_v + C_m\, t_m

C_f = \frac{2 n^3}{3 p}, \qquad
C_v = \frac{n^2}{\sqrt{p}} \Bigl(3 + \tfrac{1}{4}\log_2 p\Bigr), \qquad
C_m = n \,(6 + \log_2 p)

where C_f counts floating-point operations, C_v the volume of data communicated, and C_m the number of messages, and t_f is the time per floating-point operation, t_v the time per data item communicated, and t_m the time per message.
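A small sketch (mine, not GrADS code) that simply evaluates this model for a candidate set of p processors, with t_f, t_v, and t_m taken, for example, from the MDS/NWS data gathered by the resource selector:

/* Evaluate the PDGESV model above: tf = time per flop, tv = time per
   data item communicated, tm = time per message (all in seconds).    */
#include <math.h>

double pdgesv_model_time(double n, double p, double tf, double tv, double tm)
{
    double Cf = 2.0 * n * n * n / (3.0 * p);                /* flops         */
    double Cv = n * n * (3.0 + 0.25 * log2(p)) / sqrt(p);   /* data volume   */
    double Cm = n * (6.0 + log2(p));                        /* message count */
    return Cf * tf + Cv * tv + Cm * tm;
}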
The performance model is supplied by the library writer; an optimizer uses it together with the MDS and NWS information about the coarse grid to choose resources.
Resource Selector/Performance Modeler
Performance Model Validation
Speed = 60% of the peak.

Machine characteristics:
            Opus14  Opus13  Opus16  Opus15  Torc4  Torc6  Torc7
mem (MB)    215     214     227     215     233    479    479
speed       270     270     270     270     330    330    330
load        1       0.99    1       0.99    1      1.04   0.87

Bandwidth in Mb/s:
          Opus14  Opus13  Opus16  Opus15  Torc4  Torc6  Torc7
Opus14    1       248.83  247.31  246.38  2.83   2.83   2.83
Opus13    248.83  1       244.54  240.94  2.83   2.83   2.83
Opus16    247.31  244.54  1       247.54  2.83   2.83   2.83
Opus15    246.38  240.94  247.54  1       2.83   2.83   2.83
Torc4     2.83    2.83    2.83    2.83    1      81.96  56.47
Torc6     2.83    2.83    2.83    2.83    81.96  1      50.9
Torc7     2.83    2.83    2.83    2.83    56.47  50.9   1

Latency in msec:
          Opus14  Opus13  Opus16  Opus15  Torc4  Torc6  Torc7
Opus14    1       0.24    0.29    0.26    83.78  83.78  83.78
Opus13    0.24    1       0.24    0.23    83.78  83.78  83.78
Opus16    0.29    0.24    1       0.23    83.78  83.78  83.78
Opus15    0.26    0.23    0.23    1       83.78  83.78  83.78
Torc4     83.78   83.78   83.78   83.78   1      0.31   0.31
Torc6     83.78   83.78   83.78   83.78   0.31   1      0.31
Torc7     83.78   83.78   83.78   83.78   0.31   0.31   1

This is for a refined grid.
Experimental Hardware / Software Grid

            TORC                            CYPHER                           OPUS
Type        Cluster: 8 Dual Pentium III     Cluster: 16 Dual Pentium III     Cluster: 8 Pentium II
OS          Red Hat Linux 2.2.15 SMP        Debian Linux 2.2.17 SMP          Red Hat Linux 2.2.16
Memory      512 MB                          512 MB                           128 or 256 MB
CPU speed   550 MHz                         500 MHz                          265 – 448 MHz
Network     Fast Ethernet (100 Mbit/s,      Gigabit Ethernet (SK-9843)       Myrinet (LANai 4.3)
            3Com 3C905B) and switch         and switch (Foundry              with 16 ports each
            (BayStack 350T) with 16 ports   FastIron II) with 24 ports

MacroGrid Testbed
Independent components being put together and interacting.
PDGESV Time Breakdown
ScaLAPACK PDGESV, using a collapsed NWS query from UCSB; 42 machines available, using mainly the torc, cypher, and msc clusters at UTK [Jan 2002]. Time (seconds) by matrix size and number of processors:

N - nproc    5000-10  10000-12  15000-14  20000-14  25000-18  30000-18  35000-27
PDGESV       58.7     394.7     749.4     1686.7    2747.1    4472.7    5020.4
spawn        92.2     105.9     154.1     124.7     135.6     181.0     264.5
NWS          5.5      7.4       12.3      12.0      4.2       10.2      8.5
MDS          13.0     11.0      10.0      11.0      14.0      73.0      12.0
other        4.7      5.3       1.0       2.3       7.6       1.1       4.7
ScaLAPACK across 3 Clusters
Figure: time (seconds, 500-3500) vs. matrix size (5000-20000) for runs on OPUS alone (5-8 OPUS), OPUS + CYPHER (e.g. 8 OPUS + 6 CYPHER; 6 OPUS + 5 CYPHER), and OPUS + TORC + CYPHER (e.g. 8 OPUS + 2 TORC + 6 CYPHER; 2 OPUS + 4 TORC + 6 CYPHER; 8 OPUS + 4 TORC + 4 CYPHER).
Largest Problem Solved
Matrix of size 30,000, about 7.2 GB of data, solved on 32 processors (machines with 512 MB and 128 MB of memory).
QR – Timing Breakdown
Figure: execution of QR factorization over the grid; time (seconds, 500-4000) broken down into application execution, startup time, performance modeling, NWS, and MDS, for matrix sizes 4000-15000 on configurations from 3 torcs + 3 mscs up to 4 torcs + 7 mscs.

PDSYEVX – Timing Breakdown
Figure: PDSYEVX on torcs and cyphers, N-nproc from 1000-1 to 10000-10; time broken down into tridiagonal reduction, computing eigenvalues, computing eigenvectors, back transformation, pdsyevx driver overhead, MDS, NWS, performance modeling time, and other grid overhead. The smaller runs use torcs only; the largest uses 5 torcs and 5 cyphers.
Adhoc vs Annealing Scheduling
Figure: estimated execution time for PDGESV using the ad hoc scheduler and the simulated annealing scheduler, given 57 possible hosts (all GrADS x86 machines), for matrix sizes N from 5000 to 35000 together with the processor count each scheduler selected.
Rescheduling/Redistribution Experimental Results
Scenario: start the application (N = 13600) on 4 processors. Four minutes into the run, 4 additional processors become available; we want to reorganize the computation to take advantage of the extra computing capability. What is the additional cost?
Figure: time (seconds, up to 2000) broken down into application execution, redistribution time, checkpoint read+write, restart time, start time, grid overhead, and NWS, comparing no-rescheduling runs (13600 on 4 cyphers; 13600 on 4 cyphers + 4 torcs) with a rescheduled run (App1: 13600 on 4 torcs; App2: 13600 on 4 torcs + 4 cyphers, introduced 4 minutes into the run).
About 12% better performance even with redistribution and restart.
Major Challenge - Adaptivity

Conclusion
Collaborators
Futures for Numerical Algorithms and Software on Clusters and Grids
Vinny's Bad Day