High Performance Computing, Computational Grid, and Numerical Libraries


SLIDE 1
  • High Performance Computing, Computational Grid, and Numerical Libraries

  • Outline


SLIDE 2
  • Software Technology & Performance


  • Software Issues

SLIDE 3
  • Self Adapting Software

  • Motivation: Self Adapting Numerical Software (SANS) Effort

♦ +1"

" 1.

  • 2

3

  • 4!

&. # * #*-1 .

SLIDE 4
  • Self Adapting Numerical Software - SANS Effort

  • Tuning system: tries different algorithms and segment sizes, then selects the best algorithm on a given computer

  • Software Generation Strategy
  • ATLAS BLAS

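The tuning system sketched above is the essence of the ATLAS approach: generate candidate kernels, time each one on the target machine, and keep the winner. A minimal C sketch of that empirical search, using an illustrative blocked matrix multiply and made-up candidate segment (block) sizes; real ATLAS explores many more parameters (unrolling, latency scheduling, copying):

```c
/* Minimal sketch of ATLAS-style empirical tuning: time a blocked
 * matrix multiply at several candidate block (segment) sizes and
 * keep the fastest on this machine. Sizes and kernel are illustrative. */
#include <stdio.h>
#include <time.h>

#define N 256
static double a[N][N], b[N][N], c[N][N];   /* zero-initialized; the
                                              values don't affect timing */

/* Blocked C += A*B with block size nb. */
static void gemm_blocked(int nb) {
    for (int ii = 0; ii < N; ii += nb)
        for (int jj = 0; jj < N; jj += nb)
            for (int kk = 0; kk < N; kk += nb)
                for (int i = ii; i < ii + nb && i < N; i++)
                    for (int j = jj; j < jj + nb && j < N; j++)
                        for (int k = kk; k < kk + nb && k < N; k++)
                            c[i][j] += a[i][k] * b[k][j];
}

int main(void) {
    int candidates[] = {8, 16, 32, 64, 128};
    int best_nb = candidates[0];
    double best_t = 1e30;
    for (int i = 0; i < 5; i++) {
        clock_t t0 = clock();
        gemm_blocked(candidates[i]);
        double t = (double)(clock() - t0) / CLOCKS_PER_SEC;
        printf("nb = %3d : %.3f s\n", candidates[i], t);
        if (t < best_t) { best_t = t; best_nb = candidates[i]; }
    }
    printf("best block size on this machine: %d\n", best_nb);
    return 0;
}
```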

SLIDE 5
  • ATLAS (DGEMM n = 500)


  • Some Automatic Tuning Projects


SLIDE 6
  • Motivation for Grid Computing

In the past: Isolation

Today: Collaboration

SLIDE 7
  • Grids are Hot

  • Motivation for NetSolve


SLIDE 8
  • NetSolve Network Enabled Server

  • NetSolve: The Big Picture

[Diagram: a client (Matlab, Mathematica, C, Fortran, Java, Excel) contacts the NetSolve AGENT(s), which use a schedule database to place requests on one of the servers S1-S4; data A, B, C moves through an IBP Depot. No knowledge of the grid required, RPC like.]

SLIDE 9
  • NetSolve: The Big Picture (continued)

[Same diagram, next animation steps: the inputs A, B are staged to the IBP Depot, and a handle is passed back to the client.]

SLIDE 10
  • NetSolve: The Big Picture (continued)

[Same diagram, final steps: the client issues the request Op(C, A, B); the agent hands OP and the handle to server S2, which computes and returns the answer (C) to the client.]
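From the client's side, the whole picture collapses to one remote call. A hedged sketch of a NetSolve C client using the blocking netsl() interface; the header name and the exact "dgesv()" argument list are assumptions that vary with the NetSolve release and the problem description files installed on the servers:

```c
/* Hedged sketch of a blocking NetSolve client call from C.
 * netsl() is the blocking RPC entry point and netslerr() prints the
 * corresponding error message; treat the header name and the
 * "dgesv()" signature below as illustrative. */
#include <stdio.h>
#include "netsolve.h"

#define N 100

int main(void) {
    int n = N, nrhs = 1, ipiv[N], info;
    static double A[N * N], B[N];
    /* ... fill A (column-major) and B with the system to solve ... */

    /* One call: the agent picks a server, ships A and B (possibly via
     * an IBP depot), runs the solve remotely, and returns x in B. */
    int status = netsl("dgesv()", n, nrhs, A, n, ipiv, B, n, &info);
    if (status < 0)
        netslerr(status);
    else
        printf("solved remotely: info = %d, x[0] = %g\n", info, B[0]);
    return 0;
}
```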

  • Basic Usage Scenarios


SLIDE 11
  • University of Tennessee Deployment: Scalable Intracampus Research Grid (SInRG)
  • The Knoxville Campus has two DS-3 commodity Internet connections and one DS-3 Internet2/Abilene connection.

An OC-3 ATM link routes IP traffic between the Knoxville campus, the National Transportation Research Center, and Oak Ridge National Laboratory. UT participates in several national networking initiatives, including Internet2 (I2), Abilene, the federal Next Generation Internet (NGI) initiative, the Southern Universities Research Association (SURA) Regional Information Infrastructure (RII), and Southern Crossroads (SoX). The UT campus backbone is a meshed ATM OC-12, being migrated to switched Gigabit by early 2002.

  • NetSolve: Things Not Touched On

SLIDE 12
  • NSF/NGS GrADS
  • GrADSoft Architecture

[Diagram: a source application passes through the whole-program compiler, libraries, and binder to produce a configurable object program; a real-time performance monitor, performance-problem detector, resource negotiator, and scheduler sit above the grid runtime system, linked by performance feedback and negotiation among the software components.]
SLIDE 13
  • ScaLAPACK


  • To Use ScaLAPACK a User Must:

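In outline, the user must initialize a BLACS process grid, build array descriptors for the 2-D block-cyclic data distribution, and call the driver routine. A minimal C sketch of that call sequence for PDGESV, assuming 4 MPI processes in a 2 x 2 grid; the local sizes are hard-coded here, where real code would compute them with numroc_:

```c
/* Minimal sketch of the hand-written ScaLAPACK call sequence:
 * BLACS grid setup, block-cyclic descriptors, PDGESV.
 * Run under MPI with 4 processes. */
#include <stdio.h>

extern void Cblacs_pinfo(int *mypnum, int *nprocs);
extern void Cblacs_get(int ctxt, int what, int *val);
extern void Cblacs_gridinit(int *ctxt, char *order, int nprow, int npcol);
extern void Cblacs_gridexit(int ctxt);
extern void descinit_(int *desc, int *m, int *n, int *mb, int *nb,
                      int *irsrc, int *icsrc, int *ctxt, int *lld, int *info);
extern void pdgesv_(int *n, int *nrhs, double *a, int *ia, int *ja,
                    int *desca, int *ipiv, double *b, int *ib, int *jb,
                    int *descb, int *info);

static double a_local[512 * 512], b_local[512];
static int ipiv[512 + 64];

int main(void) {
    int mypnum, nprocs, ctxt, info;
    int nprow = 2, npcol = 2;            /* 2 x 2 process grid */
    int n = 1000, nrhs = 1, nb = 64;     /* block (segment) size */
    int izero = 0, ione = 1, lld = 512;  /* local leading dimension */
    int desca[9], descb[9];

    /* 1. Set up the BLACS process grid. */
    Cblacs_pinfo(&mypnum, &nprocs);
    Cblacs_get(0, 0, &ctxt);
    Cblacs_gridinit(&ctxt, "Row", nprow, npcol);

    /* 2. Describe the 2-D block-cyclic distribution of A and b. */
    descinit_(desca, &n, &n, &nb, &nb, &izero, &izero, &ctxt, &lld, &info);
    descinit_(descb, &n, &nrhs, &nb, &nb, &izero, &izero, &ctxt, &lld, &info);

    /* 3. Fill the local pieces of A and b here, then solve. */
    pdgesv_(&n, &nrhs, a_local, &ione, &ione, desca,
            ipiv, b_local, &ione, &ione, descb, &info);
    if (mypnum == 0) printf("pdgesv info = %d\n", info);

    Cblacs_gridexit(ctxt);
    return 0;
}
```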

SLIDE 14
  • ScaLAPACK Grid Enabled

  • GrADS Numerical Library

SLIDE 15
  • Big Picture…
  • User has problem to solve (e.g. Ax = b)

[Diagram: natural data (A, b) goes through middleware to an application library (e.g. LAPACK, ScaLAPACK, PETSc, …) as structured data (A', b'); the structured answer (x') is mapped back to the natural answer (x).]
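On a single machine, the application-library box in this picture is one driver call. A minimal sketch, assuming the Fortran LAPACK routine DGESV is linked in (e.g. with -llapack):

```c
/* Minimal sketch: solve Ax = b with LAPACK's DGESV on one machine.
 * Column-major storage, Fortran calling convention. */
#include <stdio.h>

extern void dgesv_(int *n, int *nrhs, double *a, int *lda,
                   int *ipiv, double *b, int *ldb, int *info);

int main(void) {
    /* [3 1; 1 2] x = [9; 8]  =>  x = (2, 3) */
    int n = 2, nrhs = 1, ipiv[2], info;
    double A[4] = {3, 1, 1, 2};   /* column-major */
    double b[2] = {9, 8};
    dgesv_(&n, &nrhs, A, &n, ipiv, b, &n, &info);
    printf("info = %d, x = (%g, %g)\n", info, b[0], b[1]);
    return 0;
}
```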

  • Numerical Libraries for Clusters

[Diagram: the user supplies A and b; the data is staged to disk.]

SLIDE 16
  • Numerical Libraries for Clusters (continued)

[Diagram: the user hands A and b to the library middleware; the middleware consults NWS, Autopilot, and MDS to do resource selection by time function minimization.]

SLIDE 17
  • Numerical Libraries for Clusters (continued)

[Diagram: the middleware maps the selected resources onto a process grid (0,0, 0,1, …) chosen by time function minimization.]

Can use Grid infrastructure, i.e. Globus/NWS, but doesn't have to.
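The "time function minimization" step can be pictured as evaluating a predicted-run-time model over candidate processor sets and keeping the minimum. A hedged sketch with an illustrative model (computation over aggregate speed plus a dense-LU-style communication term, fed by NWS-like measurements); the real GrADS scheduler is considerably more elaborate:

```c
/* Hedged sketch of resource selection by time function minimization:
 * predict the run time of PDGESV-like work for each candidate
 * processor count from a crude model and keep the minimum. The model
 * and the NWS-style numbers are illustrative, not the GrADS scheduler. */
#include <stdio.h>
#include <math.h>

int main(void) {
    double n = 20000.0;                  /* matrix size */
    double flops = (2.0 / 3.0) * n * n * n;
    double speed_per_proc = 3.0e8;       /* flop/s, as an NWS CPU
                                            sensor might report */
    double bandwidth = 1.0e7;            /* bytes/s, NWS network sensor */

    int best_p = 1;
    double best_t = 1e300;
    for (int p = 1; p <= 32; p++) {
        double t_comp = flops / (p * speed_per_proc);
        /* O(n^2 sqrt(p)) words moved is a typical dense-LU estimate. */
        double t_comm = 8.0 * n * n * sqrt((double)p) / bandwidth;
        double t = t_comp + t_comm;
        if (t < best_t) { best_t = t; best_p = p; }
    }
    printf("predicted best: %d processors, %.0f s\n", best_p, best_t);
    return 0;
}
```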

  • Experimental Hardware / Software Grid

TORC:   cluster of 8 dual Pentium IIIs, 550 MHz, 512 MB; Red Hat Linux 2.2.15 SMP; Fast Ethernet (100 Mbit/s, 3Com 3C905B) with a 16-port BayStack 350T switch
CYPHER: cluster of 16 dual Pentium IIIs, 500 MHz, 512 MB; Debian Linux 2.2.17 SMP; Gigabit Ethernet (SK-9843) with a 24-port Foundry FastIron II switch
OPUS:   cluster of 8 Pentium IIs, 265-448 MHz, 128 or 256 MB; Red Hat Linux 2.2.16; Myrinet (LANai 4.3) with 16 ports each

MacroGrid Testbed

Independent components being put together and interacting

SLIDE 18
  • PDGESV Experiments: Time Breakdown

[Chart: ScaLAPACK PDGESV, using collapsed NWS query from UCSB; 42 machines available, using mainly torc, cypher, msc clusters at UTK, Jan 2002. X axis: Matrix Size - Nproc; Y axis: Time (seconds).]

Time breakdown in seconds (columns are Matrix Size - Nproc):

          5000-10  10000-12  15000-14  20000-14  25000-18  30000-18  35000-27
PDGESV       58.7     394.7     749.4    1686.7    2747.1    4472.7    5020.4
spawn        92.2     105.9     154.1     124.7     135.6     181.0     264.5
NWS           5.5       7.4      12.3      12.0       4.2      10.2       8.5
MDS          13.0      11.0      10.0      11.0      14.0      73.0      12.0
other         4.7       5.3       1.0       2.3       7.6       1.1       4.7

  • ScaLAPACK across 3 Clusters

[Chart: time (seconds) vs. matrix size (5000-20000) for runs on OPUS alone, on OPUS + CYPHER, and on OPUS + TORC + CYPHER; configurations range from 5 OPUS up to 8 OPUS, 4 TORC, 4 CYPHER.]

SLIDE 19
  • LAPACK For Clusters

  • Rescheduling/Redistribution Experimental Results

[Chart: time in seconds (200-2000) for App: 13600 on 4 cyphers, no rescheduling, and App: 13600 on 4 cyphers + 4 torcs, no rescheduling.]

Scenario: start the application on 4 processors. Four minutes into the run, 4 additional processors become available. We want to reorganize the computation to take advantage of the extra computing capability. What's the additional cost?

SLIDE 20
  • Rescheduling/Redistribution Experimental Results (continued)

[Chart: time in seconds (200-2000), broken down into application execution, redistribute time, checkpoint read+write, restart time, start time, grid overhead, and NWS, for: App: 13600, 4 cyphers, no rescheduling; App: 13600, 4 cyphers + 4 torcs, no rescheduling; App1: 13600, 4 torcs and App2: 13600, 4 torcs + 4 cyphers introduced 4 minutes into the run, with rescheduling.]

About 12% better performance even with redistribution and restart.

  • Fault Tolerance in the Message Passing Interface

SLIDE 21
  • Algorithmic Fault Tolerance
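One common form of algorithmic fault tolerance for dense linear algebra keeps an extra checksum column so that the data held by a single failed process can be rebuilt from the survivors. A minimal single-node sketch, with illustrative data; a distributed version would keep the checksum on a dedicated process:

```c
/* Hedged sketch of checksum-based algorithmic fault tolerance for a
 * dense matrix: store one extra column of row sums. If any single
 * column of A is lost with a failed process, rebuild it from the
 * surviving columns and the checksum. Illustrative, single-node. */
#include <stdio.h>

#define N 4

int main(void) {
    double a[N][N + 1];   /* last column holds the checksum */

    /* Fill A and compute the checksum column. */
    for (int i = 0; i < N; i++) {
        a[i][N] = 0.0;
        for (int j = 0; j < N; j++) {
            a[i][j] = i + j * 0.5;   /* sample data */
            a[i][N] += a[i][j];
        }
    }

    /* Simulate losing column 2, then recover it:
       a[i][2] = checksum - sum of the other columns. */
    int lost = 2;
    for (int i = 0; i < N; i++) {
        double rest = 0.0;
        for (int j = 0; j < N; j++)
            if (j != lost) rest += a[i][j];
        a[i][lost] = a[i][N] - rest;
    }
    printf("recovered a[0][%d] = %g (expected %g)\n",
           lost, a[0][lost], lost * 0.5);
    return 0;
}
```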

  • Research Directions

SLIDE 22
  • Futures for Numerical Algorithms and Software on Clusters and Grids

  • Conclusion


SLIDE 23
  • Collaborators