SLIDE 1

Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One Sided

ROBERT GERSTENBERGER, MACIEJ BESTA, TORSTEN HOEFLER

spcl.inf.ethz.ch  @spcl_eth

SLIDE 5

MPI-3.0 REMOTE MEMORY ACCESS

  • MPI-3.0 supports RMA ("MPI One Sided")
  • Designed to react to hardware trends
  • Majority of HPC networks support RDMA
  • Communication is "one sided" (no involvement of the destination)
  • RMA decouples communication & synchronization
  • Different from message passing

[Figure: two sided (Proc A sends, Proc B receives; communication and synchronization are coupled) vs. one sided (Proc A puts into Proc B's memory; communication and synchronization are separate, explicit sync)]

[1] http://www.mpi-forum.org/docs/mpi-3.0/mpi30-report.pdf
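
To make the contrast concrete, here is a minimal sketch (not from the slides; it assumes at least two processes and standard MPI-3 C bindings) of the same transfer done two sided and one sided:

  /* Two-sided vs. one-sided transfer of a single int (illustrative sketch). */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv) {
      int rank, value = 0, newval = 43;
      MPI_Win win;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      /* Two sided: both processes take part in the transfer. */
      if (rank == 0)      MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
      else if (rank == 1) MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

      /* One sided: only the origin names the transfer; the destination is not involved.
         Synchronization (here a fence epoch) is decoupled from the communication call. */
      MPI_Win_create(&value, sizeof(int), sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &win);
      MPI_Win_fence(0, win);
      if (rank == 0) MPI_Put(&newval, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
      MPI_Win_fence(0, win);                     /* completes the put everywhere */

      if (rank == 1) printf("rank 1 now holds %d\n", value);
      MPI_Win_free(&win);
      MPI_Finalize();
      return 0;
  }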

SLIDE 6

PRESENTATION OVERVIEW

  • 1. Overview of three MPI-3 RMA concepts
  • 2. MPI window creation
  • 3. Communication
  • 4. Synchronization
  • 5. Application evaluation

SLIDE 7

MPI-3 RMA COMMUNICATION OVERVIEW

  • Non-atomic communication calls (Put, Get)
  • Atomic communication calls (Acc, Get & Acc, CAS, FAO)

[Figure: active processes B, C, D, … issue Put/Get/atomic operations into the MPI window exposed in the memory of passive process A]
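
As a concrete illustration (not from the slides; the ranks, window size, and the fence epoch are illustrative, and at least two processes are assumed), one non-atomic and one atomic call issued by an active process against another process's window:

  /* One non-atomic call (MPI_Get) and one atomic call (MPI_Fetch_and_op). */
  #include <mpi.h>
  #include <stdint.h>

  int main(int argc, char **argv) {
      int rank;
      int64_t *base, remote_val = 0, one = 1, old = 0;
      MPI_Win win;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Win_allocate(2 * sizeof(int64_t), sizeof(int64_t), MPI_INFO_NULL,
                       MPI_COMM_WORLD, &base, &win);
      base[0] = rank;                 /* exposed data    */
      base[1] = 0;                    /* exposed counter */

      MPI_Win_fence(0, win);
      if (rank == 1) {                /* an "active" process */
          MPI_Get(&remote_val, 1, MPI_INT64_T, 0, 0, 1, MPI_INT64_T, win);   /* non-atomic */
          MPI_Fetch_and_op(&one, &old, MPI_INT64_T, 0, 1, MPI_SUM, win);     /* atomic FAO */
      }
      MPI_Win_fence(0, win);          /* process 0 takes part only in the fences */

      MPI_Win_free(&win);
      MPI_Finalize();
      return 0;
  }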

SLIDE 12

MPI-3.0 RMA SYNCHRONIZATION OVERVIEW

  • Active target mode: Fence, Post/Start/Complete/Wait
  • Passive target mode: Lock, Lock All

[Figure: synchronization and communication between an active and a passive process]

SLIDE 20

SCALABLE PROTOCOLS & REFERENCE IMPLEMENTATION

  • Scalable & generic protocols
  • Can be used on any RDMA network (e.g., OFED/IB)
  • Window creation, communication and synchronization
  • foMPI, a fully functional MPI-3 RMA implementation
  • DMAPP: lowest-level networking API for Cray Gemini/Aries systems
  • XPMEM: a portable Linux kernel module

http://spcl.inf.ethz.ch/Research/Parallel_Programming/foMPI

SLIDE 23

PART 1: SCALABLE WINDOW CREATION

Traditional windows
  • Backwards compatible (MPI-2)
  • Every process may expose a different base address (e.g., 0x111, 0x123, 0x120)
  • Time bound: O(p), Memory bound: O(p)

p = total number of processes

SLIDE 24

PART 1: SCALABLE WINDOW CREATION

Allocated windows
  • Allows MPI to allocate the memory; the same base address (e.g., 0x123) can be used on every process
  • Time bound: O(log p) (whp), Memory bound: O(1)

p = total number of processes

SLIDE 25

PART 1: SCALABLE WINDOW CREATION

Dynamic windows
  • Local attach/detach, most flexible
  • Processes expose arbitrary local addresses (e.g., 0x111, 0x123, 0x129)
  • Time bound: O(p), Memory bound: O(p)

p = total number of processes
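
A compact sketch (not from the slides; buffer sizes are illustrative) showing how the three window flavors above are created:

  /* Traditional, allocated, and dynamic MPI-3 windows. */
  #include <mpi.h>
  #include <stdlib.h>

  int main(int argc, char **argv) {
      MPI_Init(&argc, &argv);

      /* Traditional (MPI-2 compatible): the user supplies the memory;
         every process may pass a different base address. */
      int *buf = malloc(1024 * sizeof(int));
      MPI_Win win_created;
      MPI_Win_create(buf, 1024 * sizeof(int), sizeof(int),
                     MPI_INFO_NULL, MPI_COMM_WORLD, &win_created);

      /* Allocated: MPI allocates the memory, which lets the implementation
         place it at the same address/offset on every process. */
      int *abuf;
      MPI_Win win_allocated;
      MPI_Win_allocate(1024 * sizeof(int), sizeof(int), MPI_INFO_NULL,
                       MPI_COMM_WORLD, &abuf, &win_allocated);

      /* Dynamic: the window starts empty; memory is attached and
         detached locally at any time. */
      MPI_Win win_dynamic;
      MPI_Win_create_dynamic(MPI_INFO_NULL, MPI_COMM_WORLD, &win_dynamic);
      int *dbuf = malloc(256 * sizeof(int));
      MPI_Win_attach(win_dynamic, dbuf, 256 * sizeof(int));
      /* ... communicate using addresses exchanged out of band ... */
      MPI_Win_detach(win_dynamic, dbuf);

      MPI_Win_free(&win_dynamic);
      MPI_Win_free(&win_allocated);
      MPI_Win_free(&win_created);
      free(dbuf); free(buf);
      MPI_Finalize();
      return 0;
  }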

SLIDE 26

PART 2: COMMUNICATION

  • Put and Get:
  • Direct DMAPP put and get operations or local (blocking) memcpy (XPMEM)
  • Accumulate:
  • DMAPP atomic operations for 64 bit types
  • ...or fall back to remote locking protocol
  • MPI datatype handling with MPITypes library [1]
  • Fast path for contiguous data transfers of common intrinsic datatypes (e.g., MPI_DOUBLE)

[Figure: for contiguous memory, MPI_Put maps to dmapp_put_nbi and MPI_Compare_and_swap maps to dmapp_acswap_qw_nbi at the remote process]

[1] Ross, Latham, Gropp, Lusk, Thakur. Processing MPI datatypes outside MPI. EuroMPI/PVM'09
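
A sketch (not from the slides; the target rank and displacements are illustrative, and at least two processes are assumed) of a non-atomic and an atomic call from the list above, inside a passive-target epoch and on a 64-bit type, the case for which foMPI can use hardware atomics:

  /* MPI_Put plus MPI_Compare_and_swap on an allocated window of 64-bit integers. */
  #include <mpi.h>
  #include <stdint.h>
  #include <stdio.h>

  int main(int argc, char **argv) {
      int rank;
      int64_t *base;
      MPI_Win win;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Win_allocate(8 * sizeof(int64_t), sizeof(int64_t), MPI_INFO_NULL,
                       MPI_COMM_WORLD, &base, &win);
      for (int i = 0; i < 8; i++) base[i] = 0;
      MPI_Barrier(MPI_COMM_WORLD);              /* everyone finished initializing */

      MPI_Win_lock_all(0, win);                 /* passive target: no action by the targets */
      if (rank == 0) {
          int64_t val = 7, compare = 0, swap = 1, result;
          MPI_Put(&val, 1, MPI_INT64_T, 1, 0, 1, MPI_INT64_T, win);
          MPI_Compare_and_swap(&swap, &compare, &result, MPI_INT64_T, 1, 1, win);
          MPI_Win_flush(1, win);                /* both operations are remotely complete here */
          printf("CAS read the previous value %lld\n", (long long)result);
      }
      MPI_Win_unlock_all(win);

      MPI_Win_free(&win);
      MPI_Finalize();
      return 0;
  }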

SLIDE 27

PERFORMANCE INTER-NODE: LATENCY

Put Inter-Node: 20% faster    Get Inter-Node: 80% faster

[Figure: half ping-pong benchmark — Proc 0 issues put + sync, Proc 1 only exposes memory]

SLIDE 28

PERFORMANCE INTRA-NODE: LATENCY

Put/Get Intra-Node: 3x faster

[Figure: half ping-pong benchmark — Proc 0 issues put + sync, Proc 1 only exposes memory]

SLIDE 29

PERFORMANCE: OVERLAP

Inter-node overlap (in %). Useful for, e.g., scientific codes: 3D FFT, MILC, AWM-Olsen seismic.

[Figure: Proc 0 overlaps computation with put + sync, Proc 1 only exposes memory]

SLIDE 30

PERFORMANCE: MESSAGE RATE

[Figure: intra-node and inter-node message rates — Proc 0 issues many puts followed by a sync, Proc 1 only exposes memory]

SLIDE 31

PERFORMANCE: ATOMICS

64 bit integers

[Figure: hardware-accelerated (proprietary) protocol: lower latency; fall-back protocol: higher bandwidth]

SLIDE 32

PART 3: SYNCHRONIZATION

  • Active target mode: Fence, Post/Start/Complete/Wait
  • Passive target mode: Lock, Lock All

[Figure: synchronization and communication between an active and a passive process]

SLIDE 33

SCALABLE FENCE IMPLEMENTATION

  • Collective call
  • Completes all outstanding memory operations

int MPI_Win_fence(...) {
  asm("mfence");
  dmapp_gsync_wait();
  MPI_Barrier(...);
  return MPI_SUCCESS;
}

[Figure: Node 0 (Proc 0, Proc 1) and Node 1 (Proc 2, Proc 3) with outstanding puts between them]

SLIDE 34

SCALABLE FENCE IMPLEMENTATION

  • Collective call
  • Completes all outstanding memory operations

int MPI_Win_fence(...) {
  asm("mfence");          /* local completion (XPMEM) */
  dmapp_gsync_wait();
  MPI_Barrier(...);
  return MPI_SUCCESS;
}

SLIDE 35

SCALABLE FENCE IMPLEMENTATION

  • Collective call
  • Completes all outstanding memory operations

int MPI_Win_fence(...) {
  asm("mfence");
  dmapp_gsync_wait();     /* local completion (DMAPP) */
  MPI_Barrier(...);
  return MPI_SUCCESS;
}

SLIDE 36

SCALABLE FENCE IMPLEMENTATION

  • Collective call
  • Completes all outstanding memory operations

int MPI_Win_fence(...) {
  asm("mfence");
  dmapp_gsync_wait();
  MPI_Barrier(...);       /* global completion (barrier) */
  return MPI_SUCCESS;
}
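
For reference, the user-level pattern served by this implementation is the familiar fence epoch; a minimal sketch (not from the slides; each rank writes its rank to its right neighbour):

  /* Active-target fence epochs around one-sided puts. */
  #include <mpi.h>

  int main(int argc, char **argv) {
      int rank, nprocs, *base;
      MPI_Win win;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
      MPI_Win_allocate(sizeof(int), sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &base, &win);
      *base = -1;

      MPI_Win_fence(0, win);                               /* opens the epoch (collective) */
      MPI_Put(&rank, 1, MPI_INT, (rank + 1) % nprocs, 0, 1, MPI_INT, win);
      MPI_Win_fence(0, win);                               /* completes all outstanding puts */

      /* *base now holds the rank of the left neighbour */
      MPI_Win_free(&win);
      MPI_Finalize();
      return 0;
  }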

SLIDE 37

SCALABLE FENCE PERFORMANCE

Time bound: O(log p), Memory bound: O(1)

90% faster

SLIDE 38

PSCW SYNCHRONIZATION

  • The posting process opens an exposure epoch (post … wait): it allows access from other processes
  • The starting process opens an access epoch (start … complete): it is allowed to access other processes and issues its puts in between
  • Post and start calls are paired by a matching algorithm

[Figure: Proc 0 and Proc 1 with their access and exposure epochs and the puts issued inside them]
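
A sketch of the user-level calls (not from the slides; rank 0 as the only starting process and rank 1 as the only posting process are illustrative, and at least two processes are assumed):

  /* General active target synchronization (PSCW) between two ranks. */
  #include <mpi.h>

  int main(int argc, char **argv) {
      int rank, partner, value = -1;
      MPI_Group world_group, peer_group;
      MPI_Win win;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_group(MPI_COMM_WORLD, &world_group);
      partner = (rank == 0) ? 1 : 0;
      MPI_Group_incl(world_group, 1, &partner, &peer_group);

      MPI_Win_create(&value, sizeof(int), sizeof(int), MPI_INFO_NULL,
                     MPI_COMM_WORLD, &win);

      if (rank == 1) {                    /* posting process: exposure epoch */
          MPI_Win_post(peer_group, 0, win);
          MPI_Win_wait(win);              /* returns once all starters completed */
      } else if (rank == 0) {             /* starting process: access epoch */
          int mine = 42;
          MPI_Win_start(peer_group, 0, win);
          MPI_Put(&mine, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
          MPI_Win_complete(win);          /* the put is done when this returns */
      }

      MPI_Group_free(&peer_group);
      MPI_Group_free(&world_group);
      MPI_Win_free(&win);
      MPI_Finalize();
      return 0;
  }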

SLIDE 39

PSCW SYNCHRONIZATION

[Figure: several concurrent post/wait and start/complete pairings among Proc 0–5]

SLIDE 40

PSCW SCALABLE POST/START MATCHING

  • In general, there can be n posting and m starting processes
  • In this example there is one posting process and 4 starting processes

[Figure: posting process i (opens its window) and starting processes j1–j4 (access the remote window)]

SLIDE 42

PSCW SCALABLE POST/START MATCHING

  • Each starting process has a local list

[Figure: each starting process j1–j4 holds a local list; posting process i]

SLIDE 43

PSCW SCALABLE POST/START MATCHING

  • Posting process i adds its rank i to a list at each starting process j1, ..., j4
  • Each starting process j waits until the rank of the posting process i is present in its local list

[Figure: rank i is written into the local lists of j1–j4]

SLIDE 54

PSCW SCALABLE COMPLETE/WAIT MATCHING

  • Each starting process increments a counter stored at the posting process
  • When the counter is equal to the number of starting processes, the posting process returns from wait

[Figure: starting processes j1–j4 increment the counter at posting process i until it reaches 4]
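
An illustrative-only sketch of this counter idea, written with MPI atomics rather than the DMAPP calls foMPI uses; the window layout, the busy-wait loop, and treating every rank other than 0 as a starting process are simplifications made for this sketch:

  /* "Starters" increment a counter at the "posting" process; the posting
     process spins until the counter equals the number of starters. */
  #include <mpi.h>
  #include <stdint.h>

  int main(int argc, char **argv) {
      int rank, nprocs;
      int64_t *counter;
      MPI_Win win;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
      MPI_Win_allocate(sizeof(int64_t), sizeof(int64_t), MPI_INFO_NULL,
                       MPI_COMM_WORLD, &counter, &win);
      *counter = 0;
      MPI_Barrier(MPI_COMM_WORLD);
      MPI_Win_lock_all(0, win);

      if (rank != 0) {                       /* starting processes signal completion */
          int64_t one = 1, before;
          MPI_Fetch_and_op(&one, &before, MPI_INT64_T, 0, 0, MPI_SUM, win);
          MPI_Win_flush(0, win);
      } else {                               /* posting process: "wait" */
          int64_t dummy = 0, seen = 0;
          while (seen < nprocs - 1) {        /* busy wait; foMPI's real protocol differs */
              MPI_Fetch_and_op(&dummy, &seen, MPI_INT64_T, rank, 0, MPI_NO_OP, win);
              MPI_Win_flush(rank, win);
          }
      }

      MPI_Win_unlock_all(win);
      MPI_Win_free(&win);
      MPI_Finalize();
      return 0;
  }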

SLIDE 55

PSCW PERFORMANCE

Time bound: start = wait = O(1); post = complete = O(log p)
Memory bound: O(log p) (for scalable programs)

Ring topology

SLIDE 56

SCALABLE LOCK SYNCHRONIZATION

  • Lock/Unlock (shared/exclusive), Lock All (always shared)
  • Two-level lock hierarchy:
  • Each process holds a local shared counter and an exclusive bit
  • A master process holds a global shared counter and a global exclusive counter

[Figure: local counters on processes 0 … P-1, global counters on the master process]
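
The user-level calls served by this hierarchy are the passive-target lock routines; a short sketch (not from the slides; at least three processes are assumed so that rank 2 can lock rank 1, mirroring the next slides):

  /* Exclusive lock on one target, then a shared lock on the whole window. */
  #include <mpi.h>

  int main(int argc, char **argv) {
      int rank, *base, one = 1;
      MPI_Win win;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Win_allocate(sizeof(int), sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &base, &win);
      *base = 0;
      MPI_Barrier(MPI_COMM_WORLD);

      if (rank == 2) {                                   /* exclusive lock on process 1 */
          MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
          MPI_Put(&one, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
          MPI_Win_unlock(1, win);                        /* completes the put */
      }
      MPI_Barrier(MPI_COMM_WORLD);

      MPI_Win_lock_all(0, win);                          /* shared lock on every process */
      /* ... any process may now issue puts/gets/atomics to any target ... */
      MPI_Win_unlock_all(win);

      MPI_Win_free(&win);
      MPI_Finalize();
      return 0;
  }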

SLIDE 58

EXCLUSIVE LOCAL LOCK: TWO PHASES

Proc 2 wants to lock Proc 1 exclusively: MPI_Win_lock( EXCL, 1 )

  • PHASE 1: increment the global exclusive counter with a fetch-add at the master process (Invariant 1: no global shared lock held concurrently)

SLIDE 59

EXCLUSIVE LOCAL LOCK: TWO PHASES

Proc 2 wants to lock Proc 1 exclusively: MPI_Win_lock( EXCL, 1 )

  • PHASE 2: set the exclusive bit at the target process with a compare & swap (Invariant 2: no local shared/exclusive lock held concurrently)

SLIDE 61

SHARED LOCAL LOCK: ONE PHASE

Proc 0 wants to lock Proc 1: MPI_Win_lock( SHRD, 1 )

  • Increment the local shared counter with a fetch-add (Invariant: no local exclusive lock on this process held concurrently)

SLIDE 63

SHARED GLOBAL LOCK: ONE PHASE

Proc 2 wants to lock the whole window: MPI_Win_lock_all()

  • Increment the global shared counter with a fetch-add (Invariant: no local exclusive lock is held concurrently)
  • Constant number of operations for p processes

SLIDE 64

FLUSH SYNCHRONIZATION

  • Guarantees remote completion
  • Issues a remote bulk synchronization and an x86 mfence
  • One of the most performance-critical functions; we add only 78 x86 CPU instructions to the critical path

Time bound: O(1), Memory bound: O(1)

[Figure: Process 0 repeatedly increments a counter at Process 1, then flushes]
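
A sketch matching the figure (not from the slides; the target rank and the number of increments are illustrative, and at least two processes are assumed): process 0 issues several accumulates to a counter on process 1 and uses MPI_Win_flush to wait for their remote completion without closing the epoch:

  /* Flush completes all pending operations at one target. */
  #include <mpi.h>
  #include <stdint.h>

  int main(int argc, char **argv) {
      int rank;
      int64_t *counter;
      MPI_Win win;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Win_allocate(sizeof(int64_t), sizeof(int64_t), MPI_INFO_NULL,
                       MPI_COMM_WORLD, &counter, &win);
      *counter = 0;
      MPI_Barrier(MPI_COMM_WORLD);

      MPI_Win_lock_all(0, win);
      if (rank == 0) {
          int64_t one = 1;
          for (int i = 0; i < 3; i++)      /* three increments at process 1 */
              MPI_Accumulate(&one, 1, MPI_INT64_T, 1, 0, 1, MPI_INT64_T, MPI_SUM, win);
          MPI_Win_flush(1, win);           /* returns once all three are remotely complete */
      }
      MPI_Win_unlock_all(win);

      MPI_Win_free(&win);
      MPI_Finalize();
      return 0;
  }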

SLIDE 66

PERFORMANCE

  • Evaluation on the Blue Waters system
  • 22,640 computing Cray XE6 nodes
  • 724,480 schedulable cores
  • All microbenchmarks
  • 4 applications
  • One nearly full-scale run

SLIDE 67

PERFORMANCE: MOTIF APPLICATIONS

Key/Value Store: Random Inserts per Second
Dynamic Sparse Data Exchange (DSDE) with 6 neighbors

SLIDE 68

PERFORMANCE: APPLICATIONS

NAS 3D FFT [1] performance    MILC [2] application execution time
(scaling to 512k and 65k procs)

Annotations represent the performance gain of foMPI over Cray MPI-1.

[1] Nishtala et al. Scaling communication-intensive applications on BlueGene/P using one-sided communication and overlap. IPDPS'09
[2] Shan et al. Accelerating applications at scale using one-sided communication. PGAS'12

SLIDE 74

CONCLUSIONS & SUMMARY

  • 1. MPI window creation routines
  • 2. Non-atomic & atomic communication
  • 3. Fence / PSCW
  • 4. Locks
  • 5. foMPI reference implementation

SLIDE 77

ACKNOWLEDGMENTS

Thanks to: Timo Schneider, Greg Bauer, Bill Kramer, Duncan Roweth, Nick Wright, Paul Hargrove (and the whole UPC team) and the MPI Forum RMA WG ... and the institutions.

SLIDE 78

Thank you for your attention

http://spcl.inf.ethz.ch/Research/Parallel_Programming/foMPI

SLIDE 79

Backup slides

SLIDE 83

DYNAMIC WINDOW CREATION

  • Each process keeps a list of attached memory regions (e.g., 0x111, 0x120, 0x129, 0x123) together with a counter (id) that changes on attach/detach
  • Before accessing a target, the origin fetches the target's id with a Get
  • If the fetched id matches the cached id, the window is accessed directly
  • If the cached id is stale, the origin first updates its cached list, then accesses the window

[Figure: Process A fetches Process B's id (2); cached id 2 matches → access the window; cached id 1 is stale → Update(list), then access the window]
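
For reference, the user-level pattern behind this machinery (not from the slides; the address exchange via MPI_Bcast and the single attached buffer are illustrative, and at least two processes are assumed): on a dynamic window the target attaches memory and publishes its address, which the origin then uses as the displacement:

  /* Communicating on a dynamic window using an explicitly exchanged address. */
  #include <mpi.h>
  #include <stdlib.h>
  #include <stdio.h>

  int main(int argc, char **argv) {
      int rank, *buf = NULL;
      MPI_Aint remote_addr = 0;
      MPI_Win win;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Win_create_dynamic(MPI_INFO_NULL, MPI_COMM_WORLD, &win);

      if (rank == 1) {                       /* target: attach local memory */
          buf = malloc(sizeof(int));
          *buf = 123;
          MPI_Win_attach(win, buf, sizeof(int));
          MPI_Get_address(buf, &remote_addr);
      }
      MPI_Bcast(&remote_addr, 1, MPI_AINT, 1, MPI_COMM_WORLD);

      MPI_Win_lock_all(0, win);
      if (rank == 0) {                       /* origin: displacement = absolute address */
          int value = 0;
          MPI_Get(&value, 1, MPI_INT, 1, remote_addr, 1, MPI_INT, win);
          MPI_Win_flush(1, win);
          printf("got %d\n", value);
      }
      MPI_Win_unlock_all(win);

      if (rank == 1) { MPI_Win_detach(win, buf); free(buf); }
      MPI_Win_free(&win);
      MPI_Finalize();
      return 0;
  }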

SLIDE 84

PERFORMANCE INTER-NODE: LATENCY (SHMEM)

Put Inter-Node, Get Inter-Node

[Figure: half ping-pong benchmark — Proc 0 issues put + sync, Proc 1 only exposes memory]

SLIDE 85

PERFORMANCE INTRA-NODE: LATENCY (SHMEM)

Put/Get Intra-Node

[Figure: half ping-pong benchmark — Proc 0 issues put + sync, Proc 1 only exposes memory]

SLIDE 86

EXCLUSIVE LOCAL LOCK: TWO PHASES

Proc 2 wants to lock Proc 1 exclusively

  • BACKOFF: decrement the global exclusive counter (Add(-1) at the master process)
  • ...then retry

SLIDE 87

PERFORMANCE MODELING

Performance functions for synchronization protocols:
  Fence:  T_fence = 2.9 μs · log2(p)
  PSCW:   T_start = 0.7 μs,  T_wait = 1.8 μs,  T_post = T_complete = 350 ns · k
  Locks:  T_lock,excl = 5.4 μs,  T_lock,shrd = T_lock_all = 2.7 μs,  T_unlock = T_unlock_all = 0.4 μs
          T_flush = 76 ns,  T_sync = 17 ns

Performance functions for communication protocols:
  Put/Get:  T_put = 0.16 ns · s + 1 μs,  T_get = 0.17 ns · s + 1.9 μs
  Atomics:  T_acc,sum = 28 ns · s + 2.4 μs,  T_acc,min = 0.8 ns · s + 7.3 μs

(p: number of processes, k: group size, s: message size)