Design of MPI Passive Target Synchronization for a Non-Cache-Coherent Many-Core Processor

  1. Design of MPI Passive Target Synchronization for a Non-Cache-Coherent Many-Core Processor
     27th PARS Workshop, Hagen, Germany, May 5, 2017
     Steffen Christgau, Bettina Schnor
     Operating Systems and Distributed Systems, Institute for Computer Science, University of Potsdam, Germany

  5. Motivation: Distributed Hash Table (DHT)
     • hash table as cache for computational results in MPI application
     • large amount of data → distribute across processes → DHT
     [Figure: the DHT is split into local DHT parts, one per process (rank 0, rank 1, ..., rank n − 1)]
     • accessing distributed data: hash function returns arbitrary process and address
       - difficult to program with two-sided message passing
       - MPI passive target one-sided communication to the rescue
       - synchronization required
     PARS 2017 S. Christgau (U Potsdam): MPI Passive Target Synchronization 1 / 14
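The slide only says that the hash function yields an arbitrary process and address. A minimal sketch of such a mapping is given below; the FNV-1a hash, the type `dht_location`, the function `dht_locate`, and the parameter `slots_per_rank` are assumptions made for illustration and do not come from the talk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: where a key lives in the distributed table. */
typedef struct {
    int    rank;   /* process that owns the bucket */
    size_t slot;   /* bucket index inside that process' local DHT part */
} dht_location;

/* FNV-1a, a simple 64-bit hash; any well-mixing hash would serve here. */
static uint64_t fnv1a(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;              /* FNV prime */
    }
    return h;
}

/* Map a key to (rank, slot): the low part of the hash picks the process,
 * the remaining part picks the bucket in that process' local table. */
static dht_location dht_locate(const void *key, size_t len,
                               int nprocs, size_t slots_per_rank)
{
    assert(nprocs > 0 && slots_per_rank > 0);
    uint64_t h = fnv1a(key, len);
    dht_location loc;
    loc.rank = (int)(h % (uint64_t)nprocs);
    loc.slot = (size_t)((h / (uint64_t)nprocs) % slots_per_rank);
    return loc;
}
```

Because the owning rank depends only on the key, the owner cannot know in advance when it will be addressed, which is why matching sends and receives is awkward for this access pattern.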

  7. Motivation: nCC (non-cache-coherent) Systems
     • Future many-cores may not provide (global) cache coherence.
       - Intel Knights Landing: coherent multi-socket systems not feasible
     [Figure: Intel Knights Landing; source: https://www.extremetech.com/wp-content/uploads/2016/04/KnightsLanding.png]

  8. Motivation: nCC Systems
     • Future many-cores may not provide (global) cache coherence.
       - Intel Knights Landing: coherent multi-socket systems not feasible
       - HPE "The Machine", EuroServer: coherence islands
     [Figure: HPE "The Machine"; source: https://regmedia.co.uk/2016/11/22/the_machine_universal_memory_pool_access.jpg]

  10. Research Platform
      • nCC many-core research system: Intel SCC
        - 48 Pentium cores with L1/L2 caches
        - no HW cache coherence
      [Figure: Intel SCC layout; labels: Tile (two Cores, each with L2$, plus R, MIU, MPB), memory controllers MC 0 to MC 3]
      • This talk: design of synchronization on nCC platform.

  11. Agenda
      • MPI Passive Target One-Sided Communication
      • Design for Passive Target Synchronization on the SCC
      • Data Structures and Algorithms
      • Data Placement
      • Outlook and Future Work

  19. MPI One-Sided Communication
      • process memory exposed via windows
      • access to windows with window object (handle)
      [Figure: each rank (rank 0, rank 1, ..., rank n − 1) holds a window object and exposes its local DHT part as a window in the process' address space; together the windows form the DHT]
      • key concept: only one communication partner issues communication operations
        - origin processes issue communication operations
        - target processes are addressed by operations
        - typical RMA operations: PUT, GET, ...
        - explicit synchronization required

  22. MPI Passive Target Synchronization
      • locks as means for synchronization, used by origins only
      • no participation of targets in synchronization (passive targets)
      • usage similar to shared memory locks
          WIN_LOCK(win, rank, ...)    1. acquire lock for target window
          PUT(win, rank, ...)         2. perform operations
          WIN_UNLOCK(win, rank)       3. release lock
      • MPI defines two lock types:
        - shared: concurrent accesses on target window allowed
        - exclusive: prevent concurrent accesses on same target window

  25. Distributed Hash Table with MPI OSC
      [Figure: each rank (rank 0, rank 1, ..., rank n − 1) holds a window object and exposes its local DHT part as a window in the process' address space; together the windows form the DHT]
      DHT_read
          LOCK(window_obj, target, SHARED)
          GET(window_obj, target, &data)
          UNLOCK(window_obj, target)
      DHT_write
          LOCK(window_obj, target, EXCLUSIVE)
          PUT(window_obj, target, data)
          UNLOCK(window_obj, target)
