Remote Memory Access
A deeper look at RMA
Outline
Additional MPI RMA concepts
- Synchronization modes
- Types of epoch
- Memory model
The other two MPI synchronization calls
- Post-Start-Complete-Wait (PSCW)
- Locking and unlocking
Synchronisation modes, epoch types and memory model
Active target: both the origin and the target processes take part in the synchronisation.
Passive target: only the origin process takes part, with no calls made on the target process. For instance, two origin processes might communicate by accessing the same location in a target window, and the target process (which does not participate) might be distinct from the origin processes.
Fence is an example of active target, as each process issues the fence calls.
RMA communication calls on the origin must be issued within an access epoch. This is created by starting the epoch and completed by stopping the epoch.
An exposure epoch exposes memory on the target so it can be accessed by other processes’ RMA operations. Fences hide this from the programmer, as they complete and start both access and exposure epochs automatically as required.
exposed main memory)
which is only locally visible but elements from public memory might be stored.
reflected in private copy consistently
synchronised
process memory but also a distinct public copy for each window that contains it. The old model
synchronisation call (i.e. end of an epoch) must be issued to make these
synchronisation calls might be
model it follows
[Figure: illustration of the separate model. Labels: public window copy, process memory. Operations shown: Put, Get, local write, local read.]
global synchronisation with fences
not need to make any RMA calls
Post-start-complete-wait and locking & unlocking
epoch
epoch
calls to support synchronising over a subset of processes in the communicator. Assertions can also be supplied if required.
guarantees target RMA completion
have completed and guarantees origin RMA completion
These can overlap, i.e. a process can have an exposure epoch open and also create an access epoch.
Remember that wait blocks until its matching complete calls have been issued, so a process doing both must call complete and then wait.
[Figure: dashed arrows illustrate synchronisation, solid arrows data transfer]
/* Assumes exactly 3 processes, with rank, buf, win, group and comm_group declared beforehand */
int ranks[] = {0, 1, 2};
MPI_Comm_group(MPI_COMM_WORLD, &comm_group);
if (rank == 0) {
    MPI_Win_create(buf, sizeof(int)*3, sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &win);
} else {
    MPI_Win_create(NULL, 0, sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &win);
}
if (rank == 0) {
    MPI_Group_incl(comm_group, 2, ranks+1, &group);      /* group contains ranks 1 and 2 */
    MPI_Win_post(group, 0, win);                          /* start exposure epoch */
    MPI_Win_wait(win);                                    /* stop exposure epoch */
} else {
    MPI_Group_incl(comm_group, 1, ranks, &group);         /* group contains rank 0 */
    MPI_Win_start(group, 0, win);                         /* start access epoch */
    MPI_Put(buf, 1, MPI_INT, 0, rank, 1, MPI_INT, win);   /* write into slot 'rank' on rank 0 */
    MPI_Win_complete(win);                                /* stop access epoch */
}
MPI_Group_free(&group);
MPI_Group_free(&comm_group);
MPI_Win_free(&win);
Based on an example at cvw.cac.cornell.edu/MPIoneSided/pscw
the target must still explicitly create an exposure epoch
where only the origin takes part.
int MPI_Win_lock(int lock_type, int rank, int assert, MPI_Win win);   /* starts an access epoch */
int MPI_Win_unlock(int rank, MPI_Win win);                            /* stops the access epoch */
communication calls as normal, these complete for both the origin and target on the corresponding unlock.
target window at any one time
target window at any one time
which control access to all processes associated with a window.
target rank
memory model)
MPI_Win win;
if (rank == 0) {
    /* Origin: exposes no memory of its own */
    MPI_Win_create(NULL, 0, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &win);
    MPI_Win_lock(MPI_LOCK_SHARED, 1, 0, win);          /* start access epoch */
    MPI_Put(buf, 1, MPI_INT, 1, 0, 1, MPI_INT, win);   /* communication call */
    MPI_Win_unlock(1, win);                            /* stop access epoch */
    MPI_Win_free(&win);
} else {
    /* Target: exposes buf but makes no synchronisation calls */
    MPI_Win_create(buf, 2*sizeof(int), 1, MPI_INFO_NULL, MPI_COMM_WORLD, &win);
    MPI_Win_free(&win);
}
Based on an example at cvw.cac.cornell.edu/MPIoneSided/lul
Request handles, dynamic windows
which associate a request handle with the operation
request handle (you can call MPI_Test, MPI_Wait etc.)
attached at window creation
point?
additional synchronisation
int MPI_Win_create_dynamic(MPI_Info info, MPI_Comm comm, MPI_Win *win)
int MPI_Win_attach(MPI_Win win, void *base, MPI_Aint size)
int MPI_Win_detach(MPI_Win win, const void *base)
memory that is already attached
communication call is the address at the target
target process before the origin issues any RMA calls referencing it
window management functions.
synchronisation calls for unified windows. We have covered a “safe” approach here which works fine for both types.
RMA to your existing codes