slide-1
SLIDE 1

An Input/Output Library for cluster of SMP

Adrien Lebre, Yves Denneulin

{Adrien.Lebre,Yves.Denneulin}@imag.fr
ID-IMAG (UMR 5132) Laboratory, Grenoble, France
BULL - HPC, Échirolles, France

6th May 2005

Slide 1/17 Adrien Lebre © Bull-ID LIPS 2004

aIOLi - CCGRID05 - May 2005

slide-2
SLIDE 2

Plan

1 Introduction
  - Context
  - Parallel Input/Output

2 aIOLi system
  - Preamble
  - Principles
  - Technical aspects

3 Results
  - POSIX vs aIOLi
  - MPI I/O vs aIOLi

4 Conclusion

slide-3
SLIDE 3

Context

Environment
  - Cluster of SMPs
  - Linux
  - High Performance Computing

Intensive I/O applications
  - CPU-bound application ⇒ I/O-bound application
  - Remote hard drive I/O

Parallel I/O
  - Handling concurrent accesses to the same resource (a file)
  - Accesses differ in size and in offset
  - Example: matrix product

slide-5
SLIDE 5

Parallel I/O Example

Matrix product
  - Specific parts to fetch (according to data distribution: columns, rows, BLOCK/BLOCK, BLOCK/CYCLIC ...)
  - Several requests at the same time: disjoint/contiguous
  - "Lethal" behavior for the I/O subsystem

Processes P0, P1, P2 and P3 on one SMP client:

read(fd,buf,1024); //file position=0
read(fd,buf,1024); //file position=1024
read(fd,buf,1024); //file position=3072
read(fd,buf,1024); //file position=2048

From a global point of view: 4 independent requests, but contiguous.
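The global point of view on these four reads can be made concrete with a small check. The following is only a sketch under assumptions: `struct io_request` and `is_globally_contiguous` are hypothetical names introduced for illustration, not part of aIOLi. It sorts the pending requests by file position and tests whether they cover one gap-free byte range.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* One pending read as seen from the client node (hypothetical type). */
struct io_request {
    long offset;   /* file position of the request */
    long length;   /* number of bytes requested */
};

/* qsort comparator: ascending file offset. */
static int by_offset(const void *a, const void *b)
{
    const struct io_request *ra = a;
    const struct io_request *rb = b;
    return (ra->offset > rb->offset) - (ra->offset < rb->offset);
}

/* Returns true when the requests, once sorted by offset, form a single
 * gap-free, non-overlapping byte range. The array is sorted in place. */
bool is_globally_contiguous(struct io_request *reqs, size_t n)
{
    if (n == 0)
        return false;
    qsort(reqs, n, sizeof reqs[0], by_offset);
    for (size_t i = 1; i < n; i++) {
        if (reqs[i].offset != reqs[i - 1].offset + reqs[i - 1].length)
            return false;
    }
    return true;
}
```

Called on the four requests of the slide (positions 0, 1024, 3072 and 2048, each 1024 bytes long), the function reports a contiguous range even though the requests were issued out of order.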

slide-7
SLIDE 7

Parallel I/O Example

Matrix product (continued). Processes P0, P1, P2 and P3 on one SMP client:

read(fd,buf,1024); //file position=3072
read(fd,buf,1024); //file position=0
read(fd,buf,1024); //file position=4096
read(fd,buf,1024); //file position=5120
read(fd,buf,1024); //file position=7168
read(fd,buf,1024); //file position=1024
...
read(fd,buf,1024); //file position=2048
read(fd,buf,1024); //file position=6144

4 requests have been processed?
What about the new requests? Contiguous / disjoint?

slide-8
SLIDE 8

Parallel I/O

Requirements / constraints
  - Methods for disjoint data (readv) ⇒ complexity of the API
  - Collective operations ⇒ synchronization mechanisms
  - Logical view (the files) ⇒ physical placement (block devices)

Available solutions - related work
  - Many parallel file systems: more or less efficient, but hardware dependent
      - "Cluster compliant": PVFS, NFSparallel, GPFS, Lustre
      - Designed for "parallel I/O": PIOUS, VESTA ...
  - Libraries: focus on portability aspects
      - A lot! MPI I/O is the reference.
      - Sophisticated API ⇒ development overhead / language bindings
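To give a feel for the API complexity mentioned for readv, here is a minimal sketch using the standard POSIX `readv` call. The helper name `read_into_two_buffers` is an assumption made for illustration. Note that `readv` scatters one contiguous file region into disjoint memory buffers; it does not by itself fetch disjoint file regions.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Reads up to len_a + len_b bytes from fd with a single system call and
 * scatters them into two separate buffers (hypothetical helper).
 * Returns the byte count reported by readv(), or -1 on error. */
ssize_t read_into_two_buffers(int fd, void *buf_a, size_t len_a,
                              void *buf_b, size_t len_b)
{
    struct iovec iov[2];

    iov[0].iov_base = buf_a;   /* first destination buffer */
    iov[0].iov_len  = len_a;
    iov[1].iov_base = buf_b;   /* second destination buffer */
    iov[1].iov_len  = len_b;

    return readv(fd, iov, 2);
}
```

Compared with a plain `read(fd,buf,1024)`, the caller has to build and maintain the `iovec` array, which is the kind of extra bookkeeping the slide refers to.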

slide-10
SLIDE 10

Context summary

[Figure: many SMP clients, each running processes P1 ... Pn, access I/O servers 1, 2, ..., n.]

slide-15
SLIDE 15

aIOLi system

Objectives
  - Supply parallel I/O algorithms
      - scheduling policies
      - aggregating accesses ⇒ efficiency
      - overlapping accesses
  - Only through the use of the ubiquitous POSIX calls
      - open/creat/lseek/read/write/close ⇒ simplicity
  - Minimal overhead
      - avoid expensive synchronisation mechanisms (barrier, ...)

slide-16
SLIDE 16

Evaluation of "the Linux" I/O stack

1 GB file decomposition on an SMP (kernel 2.4.27, IDPOT cluster, NFS version 3, mpich 1.2.5)

[Figure: completion time (sec) versus access granularity (KBytes, from 4096 down to 1), with curves for 1, 2, 4 and 8 processes and for 1 process with randomized accesses.]

Observations
  - 1 process ⇒ sequential read (optimal)
  - More processes ⇒ lower performance
  - 1 process in random access ⇒ better performance for large accesses than the parallel approach

slide-17
SLIDE 17

Fundamental concepts

Define a "think time" window
  - Maximize the use of the I/O server (bandwidth)
      - At least one request should be in the queue on the server side
  - Apply parallel I/O algorithms in the queue on the client side
      - At most one request on the server side!

(1) A first request is sent to the file server.
(2) The server processes it on the attached disk.
(3) The reply is returned to the client.

[Figure: the SMP client and the I/O node each hold a waiting I/O queue; arrows (1), (2) and (3) mark the steps above.]

slide-18
SLIDE 18

Fundamental concepts

Aggregating example: basic decomposition including 4 processes (granularity = 10 bytes). The slide builds up step by step along the time axis t; each request is shown as either waiting or processing.

  1. P0 issues read(fd,dest,10) at fd offset 0.
  2. P1 issues read(fd,dest,10) at fd offset 10.
  3. P2 issues read(fd,dest,10) at fd offset 20, and P0 issues its next read(fd,dest,10) at fd offset 40.
  4. P3 issues read(fd,dest,10) at fd offset 30, and P1 issues read(fd,dest,10) at fd offset 50. The requests at offsets 30, 40 and 50 form a contiguous pattern.
  5. These three requests are aggregated into a single read(fd,dest,30) at fd offset 30 (P3/P0/P1). P2 issues read(fd,dest,10) at fd offset 60.
  6. Because an aggregation was discovered, the system waits during a "think time" period.
  7. During that period, P3, P0 and P1 issue read(fd,dest,10) at fd offsets 70, 80 and 90: a recurrent contiguous pattern on fd.
  8. The requests at offsets 60 to 90 are aggregated into a single read(fd,dest,40) at fd offset 60 (P2/P3/P0/P1). The access pattern has been discovered.
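The aggregation walked through in this example can be sketched in a few lines of C. This is an illustration under assumptions, not the aIOLi implementation: `struct io_request`, `request_offset` and `coalesce` are hypothetical names, the decomposition is assumed to be round-robin (which matches the offsets shown for P0 to P3), and the queue is assumed to be already sorted by offset.

```c
#include <stddef.h>

/* A read request waiting in the client-side queue (hypothetical type). */
struct io_request {
    long offset;   /* fd offset of the request */
    long length;   /* number of bytes requested */
};

/* File offset of the k-th access (k = 0, 1, ...) issued by process `rank`
 * in a round-robin decomposition over `nprocs` processes. */
long request_offset(int rank, long k, int nprocs, long granularity)
{
    return (k * nprocs + rank) * granularity;
}

/* Merges neighbouring entries of a queue sorted by offset whenever one
 * request ends exactly where the next one begins. Works in place and
 * returns the new number of requests in the queue. */
size_t coalesce(struct io_request *queue, size_t n)
{
    if (n == 0)
        return 0;

    size_t out = 0;   /* index of the request currently being grown */
    for (size_t i = 1; i < n; i++) {
        if (queue[i].offset == queue[out].offset + queue[out].length)
            queue[out].length += queue[i].length;   /* contiguous: extend */
        else
            queue[++out] = queue[i];                /* gap: start a new one */
    }
    return out + 1;
}
```

With 4 processes and a granularity of 10 bytes, `request_offset(0, 1, 4, 10)` gives 40, the second access of P0. Coalescing the three 10-byte requests at offsets 30, 40 and 50 yields one 30-byte request at offset 30, and the four requests at offsets 60 to 90 yield one 40-byte request at offset 60, as in steps 5 and 8.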

slide-29
SLIDE 29

aIOLi prototype

User library
  - A component overloads POSIX calls (linked to the HPC applications)

aIOLi daemon
  - "Multi-threaded"
  - Includes distinct improvements
  - Processes real I/O calls

IPC mechanisms and shared memory

[Figure: on the SMP client node, processes 0 to n each embed an aIOLi module in user space. The aIOLi service contains a receiver thread, waiting queues and I/O threads 0 to n, and reaches the remote file server through the kernel-space I/O stack, remote file system clients and network stack. Arrows are labelled (1), (2) and (3).]

slide-30
SLIDE 30

Evaluation: POSIX vs aIOLi

1 GB file decomposition on an SMP (kernel 2.4.27, IDPOT cluster, NFS version 3, mpich 1.2.5), compiled without aIOLi

[Figure: completion time (sec) versus access granularity (KBytes, from 4096 down to 1), with curves for 1, 2, 4 and 8 processes.]

slide-31
SLIDE 31

Evaluation: POSIX vs aIOLi

1 GB file decomposition on an SMP (kernel 2.4.27, IDPOT cluster, NFS version 3, mpich 1.2.5), compiled with aIOLi

[Figure: completion time (sec) versus access granularity (KBytes, from 4096 down to 1), with curves for 1, 2, 4 and 8 processes.]

slide-32
SLIDE 32

Evaluation: MPI I/O vs aIOLi

1 GB file decomposition including 8 MPI instances on an SMP (kernel 2.4.27, IDPOT cluster, NFS version 3, mpich 1.2.5, ROMIO)

[Figure: completion time (sec) versus access granularity (KBytes, from 4096 down to 1), with curves for posix, level0, level1, level2, level3 and aioli.]

Observations
  - MPI I/O: explicit access pattern
  - For all levels, aIOLi provided the best results

slide-33
SLIDE 33

Conclusion

Positive results
  - Efficient
  - Simplicity of the API ⇒ POSIX
  - No requirements for inter-process synchronization

Current constraints
  - Centralized distributed file system (such as NFS) vs parallel FS
  - Reduce overhead for single access
  - Kernel scheduler dependent

Current and future work
  - Add data striping considerations (stabilization phase)
  - Implement a patch for the VFS (summer 2005)
  - Evaluation on bigger SMPs and Lustre
  - Coordination between several SMPs, in progress (Master)
  - The grid ...

slide-34
SLIDE 34

Conclusion - summary

[Figure, built in four steps: (1) SMP clients, each running processes P1 ... Pn, access the I/O servers; (2) an aIOLi instance appears on each SMP client; (3) an aIOLi Master is added; (4) several clusters (Cluster X, Cluster Y), each with a Master aIOLi behind a gateway, are connected through the Internet / grid.]

slide-38
SLIDE 38

Questions?

http://aioli.imag.fr
LIPS Project: BULL - INRIA - ID Laboratory
Thanks

slide-39
SLIDE 39

MPI I/O improvements

MPI I/O: four levels [Thakur/Gropp/Lusk02], representing increasing amounts of data per request

  - Level 1: collective contiguous requests, like the aIOLi concept
  - Level 2: independent noncontiguous requests using derived data types (data sieving)
  - Level 3: collective noncontiguous requests using derived data types (two phases)

[Figure: portions of the file space accessed by processes 0 to 3 under each level.]