Trace-based detection of lock contention in MPI one-sided communication
Marc-André Hermanns, Bernd Mohr, Felix Wolf
October 4, 2016, Parallel Tools Workshop, Stuttgart
Motivation
The Message Passing Interface (MPI) standard
- De-facto distributed-memory programming standard in HPC
- Defines multiple communication paradigms
MPI one-sided communication not often used (yet)
- Initial interface not well adopted by users
- Gaining traction since MPI-3
Tool support for one-sided communication is narrow
- Crucial for understanding complex synchronization behavior
- Required for supporting multi-paradigm applications
Lock contention in MPI one-sided communication (Hermanns et al.) | Oct 4, 2016 Slide 2
The Scalasca Toolkit
Toolkit for trace-based performance analysis
- Processes OTF2 traces created by Score-P
- Also processes legacy traces in EPILOG format
Parallel wait state detection
- Inter-process synchronization
- Inter-thread synchronization
Uses message replay to interchange local data just in time
- Synchronizing processes exchange data
- Uses recorded communication information
- Uses similar communication pattern
MPI one-sided communication
Separate communication from synchronization
- Multiple (logically concurrent) RMA operations
- Single synchronization to ensure completion of pending operations
Two different synchronization modes
Active-target synchronization
- Both origin and target call synchronization functions
- Target needs to know when to synchronize window
Passive-target synchronization
- Only origin calls synchronization functions
- Target is not explicitly involved in synchronization
- Mutual exclusion to window using locks (shared & exclusive)
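The shared and exclusive lock modes boil down to a simple compatibility rule. The following is a minimal sketch, not MPI code: `LockType` and `conflicts` are hypothetical names chosen for illustration, mirroring the MPI lock types MPI_LOCK_SHARED and MPI_LOCK_EXCLUSIVE.

```python
from enum import Enum


class LockType(Enum):
    """Hypothetical stand-in for MPI_LOCK_SHARED / MPI_LOCK_EXCLUSIVE."""
    SHARED = "shared"
    EXCLUSIVE = "exclusive"


def conflicts(a: LockType, b: LockType) -> bool:
    """Two locks on the same window conflict unless both are shared."""
    return a is LockType.EXCLUSIVE or b is LockType.EXCLUSIVE


# Print the full compatibility table for one window.
for a in LockType:
    for b in LockType:
        print(f"{a.value:9s} vs {b.value:9s}: conflict={conflicts(a, b)}")
```

Only the shared/shared combination lets two lock epochs on the same window overlap; every other pairing has to be serialized.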
Lock Contention
[Figure: timeline of processes A to E over time, showing Lock, Put, and Unlock regions]
With passive-target synchronization, RMA operations must be placed inside a lock epoch
Intuitively, competing lock epochs would be expected to be mutually exclusive
Waiting time occurs in the lock call on process B, which waits for process A to release the lock
MPI semantics allow the lock call to postpone the actual acquisition
Waiting time now occurs in the RMA operation, which waits for the lock
MPI semantics even allow lock acquisition and RMA operations to be postponed until the unlock call
Waiting time then occurs in the unlock operation
Lock epochs with shared locks may overlap
Waiting time can occur in the context of preceding conflicting locks
Trace-based message-replay
Difficulties with passive-target synchronization
Time of actual lock acquisition unknown
- Use a heuristic to compute the lock acquisition time
- Full epoch information needed
Target cannot record synchronization data via wrappers
No target-side events to trigger data exchange
Incomplete synchronization information at the origin
- Events contain target information
- Access conflict is among two or more origin processes
Locks may suffer contention and insufficient progress
Active-message replay
[Figure: trace timelines of the target and the origin]
Each process may be at an arbitrary point while processing its own trace
Origin packs and sends active message to target
Origin continues processing
Target processes active message upon arrival
Identifies corresponding event in O(log n)
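A logarithmic lookup of this kind can be done with a binary search over the target's local events. The sketch below is an illustration under stated assumptions, not the Scalasca implementation: it assumes the target keeps its events sorted by enter timestamp and that the active message carries a timestamp `t`; `event_at` is a hypothetical helper name.

```python
from bisect import bisect_right


def event_at(enter_times, t):
    """Return the index of the last event entered at or before time t.

    enter_times must be sorted in ascending order (assumption).
    Returns None if t lies before the first event.
    """
    i = bisect_right(enter_times, t) - 1   # O(log n) binary search
    return i if i >= 0 else None


# Hypothetical enter timestamps of the target's local events.
enter_times = [0.0, 1.5, 3.0, 7.25]
print(event_at(enter_times, 4.0))
```

Because the lookup does not depend on how far the target has advanced through its own trace, the origin can send its message at any point and continue processing.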
Analysis phase I
Collation of epoch data
[Figure: timeline of processes A, B, and C over time, showing Lock, Foo, Put, Bar, and Unlock regions]
For the lock epoch on A, then for the lock epoch on C:
- Begin tracking lock epoch
- Pack RMA operation data
- Send epoch data to target
On the target:
- Process epoch data from A
- Process epoch data from C
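The collation steps can be sketched as a single pass over one origin's trace. This is a minimal sketch under stated assumptions: the `Epoch` record, the tuple layout of trace entries, and the `collate_epochs` function are hypothetical, and a plain dictionary stands in for the active messages sent to each target.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Epoch:
    """Hypothetical record of one lock epoch as seen by the origin."""
    origin: str
    target: str
    exclusive: bool
    regions: list = field(default_factory=list)  # (name, enter, exit) tuples


def collate_epochs(origin, trace):
    """Walk one origin's trace and return {target: [Epoch, ...]}.

    Each trace entry is (name, enter, exit, target, exclusive); entries
    that are not RMA-related carry target=None (assumed layout).
    """
    outbox = defaultdict(list)   # stands in for active messages per target
    open_epochs = {}             # target -> epoch currently being tracked
    for name, enter, exit_, target, exclusive in trace:
        if name == "Lock":
            open_epochs[target] = Epoch(origin, target, exclusive)  # begin tracking
        epoch = open_epochs.get(target)
        if epoch is None:
            continue             # region outside RMA (e.g. foo, bar): not packed
        epoch.regions.append((name, enter, exit_))                  # pack data
        if name == "Unlock":
            outbox[target].append(open_epochs.pop(target))          # send to target
    return dict(outbox)


# Hypothetical trace of origin A, targeting B.
trace_a = [
    ("Lock", 0.0, 0.5, "B", True),
    ("foo", 0.5, 1.0, None, False),
    ("Put", 1.0, 1.5, "B", False),
    ("bar", 1.5, 2.0, None, False),
    ("Unlock", 2.0, 2.5, "B", False),
]
print(collate_epochs("A", trace_a))
```

Sending the epoch only at the unlock event means the target receives complete epochs, which the later contention analysis relies on.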
Analysis phase II
Detecting contention
- Epochs are sorted by latest unlock event
- Analysis starts with the last epoch and continues back in time
- For each lock epoch in the queue:
1. Find conflicting preceding epoch
2. Get unlock time from preceding epoch
3. Get local (target-side) progress region
4. Find location of wait state within current epoch
5. Send active message with wait state information to affected processes
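The loop above can be sketched on the target side as follows. This is a simplified illustration, not the authors' implementation: the `Epoch` record and `detect_contention` are hypothetical names, the wait state is assumed to sit in the region of the current epoch that spans the preceding epoch's unlock time, step 3 (target-side progress) is left out, and step 5 is replaced by returning the results.

```python
from dataclasses import dataclass


@dataclass
class Epoch:
    """Hypothetical lock epoch collected at the target."""
    origin: str
    exclusive: bool
    regions: list  # (name, enter, exit) in time order; the last one is Unlock

    @property
    def unlock_exit(self):
        return self.regions[-1][2]


def conflicts(a, b):
    """Epochs conflict unless both hold shared locks."""
    return a.exclusive or b.exclusive


def detect_contention(epochs):
    """Return (origin, region, waiting_time) tuples, earliest epoch first."""
    queue = sorted(epochs, key=lambda e: e.unlock_exit)   # sort by unlock event
    wait_states = []
    for i in range(len(queue) - 1, 0, -1):                # last epoch first
        current = queue[i]
        # Step 1: nearest preceding epoch that conflicts with the current one.
        prev = next((p for p in reversed(queue[:i]) if conflicts(current, p)), None)
        if prev is None:
            continue
        release = prev.unlock_exit                        # step 2
        # Step 4 (assumed heuristic): the region still running at release time.
        for name, enter, exit_ in current.regions:
            if enter < release <= exit_:
                # Step 5 would send this to the affected origin instead.
                wait_states.append((current.origin, name, release - enter))
                break
    wait_states.reverse()
    return wait_states


# Hypothetical epochs: B waits in Lock for A, C waits in Put for B.
epochs = [
    Epoch("A", True, [("Lock", 0.0, 1.0), ("Put", 1.0, 2.0), ("Unlock", 2.0, 3.0)]),
    Epoch("B", True, [("Lock", 0.5, 3.5), ("Put", 3.5, 4.0), ("Unlock", 4.0, 4.5)]),
    Epoch("C", True, [("Lock", 1.0, 1.5), ("Put", 1.5, 5.0), ("Unlock", 5.0, 5.5)]),
]
print(detect_contention(epochs))
```

In this sketch each wait state is attributed only to the immediately preceding conflicting epoch; a fuller analysis would also separate lock contention from waiting for target-side progress.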
Simple benchmark
- Single iteration
- Skewed start of the lock calls ensures the lock request order across processes
- Target blocks window until all processes have requested the lock
- Target releases lock to let origins complete their RMA operation
Simple benchmark
Vampir view: Lock phase
- All processes execute foo for a time depending on their rank
- Process 0 is the target for RMA operations
- Target locks window exclusively
- Target then executes bar for 2 s
- Remaining processes wait for access in the unlock operation
Simple benchmark
Vampir view: Target unlocks
- Target unlocks window after leaving bar
- First origin (process 1) gains access to window
- Target continues to execute foo again for 100 ms
- Second origin (process 2) is waiting for target progress
Simple benchmark
Vampir view: Unlock completion
- After foo completes, target enters barrier
- Barrier provides progress for remaining origin processes
- Remaining origins complete access
Simple benchmark
Vampir view: Benchmark completion
- foo completes on remaining processes
- Sequence completes with all processes entering barrier
Simple benchmark
Cube view: Lock contention & wait for progress
SOR benchmark
- Solves Poisson equation using successive over-relaxation
- 2D domain decomposition
- Ghost-cell exchange configured for
  - MPI one-sided communication
  - Passive-target synchronization
Configured for weak scaling (keeps local trace data constant)
SOR benchmark
Weak scaling
[Plot: analysis time [s] (y-axis, ticks at 20, 40, 60) over number of processes (x-axis, 2^9 to 2^16)]
Conclusion
Trace-based detection of lock contention in MPI
- Identifies lock acquisition time and order
- Differentiates between lock contention and insufficient progress
- Enables understanding of complex synchronization schemes
- Narrows support gap in wait-state detection for one-sided communication
Outlook
Port analysis prototype to Scalasca 2.x
Extend support to other one-sided libraries (OpenSHMEM, ARMCI, etc.)
Further improve analysis performance
- Target-side message handling
- Work distribution
Integrate contention wait states into higher-level analysis
- Root-cause detection
- Critical path detection
Enhance MPI interfaces
- Enable target-side event generation
- Provide more precise locking information on origin
Thank you.