GridSolve: A Seamless Bridge Between the Standard Programming Interfaces and Remote Resources

Jack Dongarra
University of Tennessee and Oak Ridge National Laboratory

2/25/2006

Overview

♦ Grid/NetSolve
    Grid-enabled software
    Allows easy access to remote resources
♦ No magic
    Someone has to write a program
    The program may run on a parallel computer
♦ NetSolve is a tool for distributed computing

The Grid

What is NetSolve?

♦ Client-server RPC-like system
    Designed for ease of use
♦ Interactions mediated by an agent
    e.g. scheduling, tracking, fault tolerance
♦ Dynamic service bindings
    Client does not need to have stubs for the services that it wishes to use
♦ Multiple clients
    C, Fortran, Matlab, Java, Mathematica, Octave
♦ Extended to support GridRPC API
    Part of GGF working group defining a standard API

University of Tennessee's NetSolve Grid Enabled Server

♦ NetSolve is an example of a Grid-based hardware/software/data server.
♦ Based on a Remote Procedure Call model but with …
    resource discovery, dynamic problem solving capabilities, load balancing, fault tolerance, asynchronicity, security, …
♦ Ease of use paramount
♦ It's about providing transparent access to resources.
♦ Legacy codes easily wrapped into services

GridSolve Architecture

[Figure: GridSolve architecture. Labels: Client, Agent, server list, request, data, result, clusters. Agent functions: resource discovery, scheduling, load balancing, fault tolerance.]

Client call: [x,y,z,info] = gridsolve('dgesv', A, B)
Can be from Matlab, C, Fortran, Python, Java, Mathematica, Excel, …

NetSolve Client

♦ Function-based interface.
♦ Client program embeds calls from NetSolve's API to access additional resources.
♦ Interface available to C, Fortran, Matlab, and Mathematica.
♦ Opaque networking interactions.
♦ NetSolve can be invoked using a variety of methods: blocking, non-blocking, task farms, …

NetSolve Client

♦ Intuitive and easy to use.
♦ Matlab matrix multiply, e.g.:

    A = matmul(B, C);

    A = netsolve('matmul', B, C);

    Possible parallelisms hidden.
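
The Matlab example above replaces a direct call with a call that names the problem as a string, which is what lets the client do without a compiled stub for each service. A minimal Python sketch of that idea follows. It is not NetSolve's implementation: `SERVICES`, `matmul`, and `call_by_name` are hypothetical names, and the "remote" service is simulated by a local function.

```python
# Hypothetical stand-in for name-based dispatch; nothing here is NetSolve's API.

def matmul(b, c):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*c)] for row in b]

# Services registered by name; a real system would resolve these remotely.
SERVICES = {"matmul": matmul}

def call_by_name(problem, *args):
    """Look up a service by its string name and apply it to the arguments."""
    try:
        service = SERVICES[problem]
    except KeyError:
        raise ValueError(f"no service registered for {problem!r}") from None
    return service(*args)

B = [[1, 2], [3, 4]]
C = [[5, 6], [7, 8]]
A = call_by_name("matmul", B, C)
```

Because the caller only holds a name and its arguments, new services can be added to the registry without changing or relinking the client.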

NetSolve Client

  i. Client makes request to agent.
  ii. Agent returns list of servers.
  iii. Client tries first one to solve problem.
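
The three steps above can be sketched as a small simulation. Everything here is an assumption made for illustration: the `Server` and `Agent` classes, the `client_call` function, and the toy "sum" problem are hypothetical. Falling back to the next server when the first is unreachable is also an assumption, since the slide only says the client tries the first one.

```python
# Simulated client/agent/server interaction; all names are hypothetical.

class Server:
    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive

    def solve(self, problem, *args):
        if not self.alive:
            raise ConnectionError(f"{self.name} is unreachable")
        if problem == "sum":
            return sum(args)
        raise ValueError(f"unknown problem {problem!r}")

class Agent:
    def __init__(self, servers):
        self.servers = servers

    def servers_for(self, problem):
        """Step ii: return candidate servers, best first (order is given here)."""
        return list(self.servers)

def client_call(agent, problem, *args):
    # Step i: the client asks the agent which servers can handle the problem.
    candidates = agent.servers_for(problem)
    # Step iii: try the first server; move to the next one if it is unreachable.
    for server in candidates:
        try:
            return server.name, server.solve(problem, *args)
        except ConnectionError:
            continue
    raise RuntimeError("no server could solve the problem")

agent = Agent([Server("s1", alive=False), Server("s2")])
used, result = client_call(agent, "sum", 1, 2, 3)
```

In this run the first server is marked unreachable, so the request is served by the second entry in the agent's list.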

NetSolve Agent

♦ Name server for the NetSolve system.
♦ Information service
    Client users and administrators can query the hardware and software services available.
♦ Resource scheduler
    Maintains both static and dynamic information regarding the NetSolve server components to use for the allocation of resources.

NetSolve Agent

♦ Resource scheduling (cont'd):
    CPU performance.
    Network bandwidth, latency.
    Server workload.
    Problem size/algorithm complexity.
    Calculates a "Time to Compute" for each appropriate server.
    Notifies client of most appropriate server.
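
One way to combine the listed factors into a single "Time to Compute" figure is sketched below. The formula is an assumed model (transfer time plus compute time on the idle share of the CPU), not the agent's actual estimator, and `ServerInfo`, `time_to_compute`, `best_server`, and all numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ServerInfo:
    name: str
    flops_per_sec: float   # CPU performance
    load: float            # fraction of CPU already in use, 0.0 <= load < 1.0
    bandwidth: float       # bytes per second between client and server
    latency: float         # seconds

def time_to_compute(server, data_bytes, flop_count):
    """Assumed model: network transfer time plus time on the idle CPU share."""
    transfer = server.latency + data_bytes / server.bandwidth
    compute = flop_count / (server.flops_per_sec * (1.0 - server.load))
    return transfer + compute

def best_server(servers, data_bytes, flop_count):
    """Pick the server with the smallest estimated time to compute."""
    return min(servers, key=lambda s: time_to_compute(s, data_bytes, flop_count))

n = 1000
data_bytes = 8 * (n * n + n)     # an n-by-n matrix plus one vector of doubles
flop_count = (2 / 3) * n ** 3    # approximate cost of an LU-based dense solve

servers = [
    ServerInfo("fast-but-far", 2e9, 0.0, 1e6, 0.05),
    ServerInfo("slow-but-near", 5e8, 0.5, 1e8, 0.001),
]
choice = best_server(servers, data_bytes, flop_count)
```

With these made-up numbers the slower, busier machine is selected because moving about 8 MB over the slow link costs more than the extra compute time, which is why network and workload figures matter alongside raw CPU speed.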

Basic Usage Scenarios

♦ Grid-based numerical library routines
    User doesn't have to have software library on their machine: LAPACK, SuperLU, ScaLAPACK, PETSc, ARPACK, …
♦ Task farming applications
    "Pleasantly parallel" execution, e.g. parameter studies
    Scavenge cycles
♦ Remote application execution
    Complete applications with user specifying input parameters and receiving output
♦ "Blue Collar" Grid-based computing
    Does not require deep knowledge of network programming
    Level of expressiveness right for many users
    User can set things up, no "su" required
    In use today, up to 130 servers on the experimental grid
♦ Can plug into Globus, Condor, NINF, …

Task Farming - Multiple Requests To Single Problem

♦ A solution:
    Many calls to netslnb( ); /* non-blocking */
♦ Farming solution:
    Single call to netsolve_farm( );
♦ Request iterates over an "array of input parameters."
♦ Adaptive scheduling algorithm.
♦ Useful for parameter sweeping and independently parallel applications.
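
The contrast between many non-blocking calls and a single farm call can be illustrated with Python's standard thread pool. This is only an analogy under stated assumptions: `simulate` and `farm` are hypothetical names, local threads stand in for remote servers, and no adaptive scheduling is modeled.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(x):
    """Stand-in for one remote request; here it just squares its input."""
    return x * x

def farm(func, params, max_workers=4):
    """One call that iterates over an array of input parameters."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map returns results in the same order as the inputs.
        return list(pool.map(func, params))

params = [1, 2, 3, 4, 5]

# Many individual non-blocking requests, submitted and collected by hand.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(simulate, p) for p in params]
    manual = [f.result() for f in futures]

# The same parameter sweep expressed as a single farm call.
farmed = farm(simulate, params)
```

Both forms produce the same results; the farm form leaves submission, tracking, and collection to the library, which is where a scheduler can adapt to server availability.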

Server Proxies – Hide Parallelism

[Figure: server proxies. Labels: NetSolve Client, Agent, Servers, NetSolve System. Back ends: LFC (LAPACK for Clusters), Condor, ScaLAPACK, etc.]

User may be unaware of parallel processing

GridSolve Usage with VGrADS

♦ Simple-to-use access to complicated software libraries, with no knowledge of grid-based computing.
♦ Selection of best machines in your grid to service user request
♦ Portability
    Non-portable calls can be run from a client using RPC-like mechanisms as long as there is a server provisioned with the code
♦ Legacy codes easily wrapped into services
♦ Plug into VGrADS Framework
♦ Using the vgES for resource selection and launching of application:
    Integrated performance information
    Integrated monitoring
    Fault prediction
    Integrating the software and resource information repositories

Virtual Grid Execution System (vgES)

[Figure: vgES components. Labels: Application, vgES APIs, vgDL, vgFAB, vgMON, vgLAUNCH, DVCW, vgAgent, VG, Information Services, Resource Managers, Grid Resources. Inset labels: vgDL Description, Virtual Grid, Successfully Bound Candidates, Grid Resource Universe.]

♦ A Virtual Grid (VG) takes
    Shared heterogeneous resources
    Scalable information service
♦ and provides
    A hierarchy of application-defined aggregations (e.g. ClusterOf) with constraints (e.g. processor type) and rankings
♦ Virtual Grid Execution System (vgES) implements VG
    VG Definition Language (vgDL)
    VG Find And Bind (vgFAB)
    VG Monitor (vgMON)
    VG Application Launch (vgLAUNCH+DVCW)
    VG Resource Info (vgAgent)

VGrADS/GridSolve Architecture

[Figure: VGrADS/GridSolve architecture. Labels: Client, Agent, Service Catalog, Software Repository, Virtual Grid, vgDL, request, query software location, transfer, start server, register, server info, data sent & app started, result, process killed, process restarted.]

Client call: [x,y,z,info] = gridsolve('solver', A, B)

Data Persistence

♦ Chain together a sequence of NetSolve requests.
♦ Analyze parameters to determine data dependencies. Essentially a DAG is created where nodes represent computational modules and arcs represent data flow.
♦ Transmit superset of all input/output parameters and make persistent near server(s) for duration of sequence execution.
♦ Schedule individual request modules for execution.

Data Persistence (cont'd)

Separate requests, one client/server exchange each:

    netsl("command1", A, B, C);
    netsl("command2", A, C, D);
    netsl("command3", D, E, F);

    Client to server: command1(A, B); result C
    Client to server: command2(A, C); result D
    Client to server: command3(D, E); result F

Sequenced requests:

    netsl_begin_sequence( );
    netsl("command1", A, B, C);
    netsl("command2", A, C, D);
    netsl("command3", D, E, F);
    netsl_end_sequence(C, D);

    Client to server: sequence(A, B, E)
    Between servers: input A, intermediate output C; intermediate output D, input E
    Server to client: result F
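
The dependency analysis for the three-command sequence can be sketched as follows. This is an illustrative reconstruction, not NetSolve's code: `analyze` and the tuple layout of `requests` are hypothetical, and each request is assumed to list its input objects followed by one output object.

```python
# Each request: (name, inputs, outputs), mirroring the three netsl calls.
requests = [
    ("command1", ["A", "B"], ["C"]),
    ("command2", ["A", "C"], ["D"]),
    ("command3", ["D", "E"], ["F"]),
]

def analyze(requests):
    """Build the data-flow arcs and classify every object in the sequence."""
    produced_by = {}       # object name -> request that produces it
    edges = []             # (producer, consumer, object) arcs of the DAG
    external_inputs = []   # objects the client must send to the servers
    for name, inputs, outputs in requests:
        for obj in inputs:
            if obj in produced_by:
                edges.append((produced_by[obj], name, obj))
            elif obj not in external_inputs:
                external_inputs.append(obj)
        for obj in outputs:
            produced_by[obj] = name
    consumed = {obj for _, _, obj in edges}
    intermediates = [o for o in produced_by if o in consumed]
    finals = [o for o in produced_by if o not in consumed]
    return edges, external_inputs, intermediates, finals

edges, external_inputs, intermediates, finals = analyze(requests)
```

For this sequence the client needs to send only A, B, and E and receive only F, while C and D can stay near the servers, matching the `sequence(A, B, E)` and `result F` labels on the slide.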

Current SInRG Infrastructure

Industry Partners: Microsoft, Sun, Dell, Cisco, Foundry, Dolphin, Myricom

Federated Ownership: CS, Chem Eng., Medical School, Computational Ecology, El. Eng.

Real applications, middleware development, logistical networking

NetSolve - Things Not Touched On

♦ Integration with other NMI tools
    Globus, Condor, Network Weather Service
♦ Security
    Using Kerberos V5 for authentication.
♦ Monitor NetSolve Network
    Track and monitor usage
♦ Fault Tolerance
♦ Local / Global Configurations
♦ Dynamic Nature of Servers
♦ Automated Adaptive Algorithm Selection
    Dynamically determine the best algorithm based on system status and nature of user problem
♦ NetSolve evolving into GridRPC
    Being worked on under GGF jointly with NINF

NetSolve Team

♦ Sudesh Agrawal
♦ Don Fike
♦ Eric Meek
♦ Keith Seymour
♦ Zhiao Shi
♦ Asim YarKhan

Software at: http://icl.cs.utk.edu/netsolve/