
Review of Memory Models: A Case for Rethinking Parallel Languages and Hardware
by Sarita V. Adve and Hans-J. Boehm
Michelle Strout
September 23, 2010

What is the memory model problem?

Memory Model
– Interface for programmer to reason about what values could be returned when a read is performed in a shared memory parallel program.
– Necessary for understanding the semantics of a shared memory parallel program.

What is the problem?
– ISA memory models and programming language memory models have evolved separately and in ad hoc ways.
– Even with significant work in the last 30 years on these problems, current solutions still have bugs, are difficult to understand, and have performance issues. (why we should care)

9/23/2010 Example Review

How should we evaluate memory models?

Programmability
– Easy to explain and teach to undergraduates.
– Should enforce no data races.
– Should enable the expression of "important parallel algorithms and patterns".
– Can multiple programming languages implement the memory model?

Portable Performance
– After all, performance is one of the main reasons we do parallel computing in the first place.
– Is the model reasonably supported (efficiently and inexpensively) by various computer architecture paradigms?

Quick Review of Terminology

Sequential consistency as a memory model
– Possible parallel results can be determined by trying all possible sequential interleavings of the instructions in the threads. Instructions from a single thread must occur in the same order in the interleaving as they do in that thread.

Data Race
– When two instructions that could be executed in parallel read or write the same memory location and at least one of the instructions is a write.
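The definition of sequential consistency above can be tried out directly on a tiny program. The following is a minimal sketch, not code from the paper: the two thread bodies are a hypothetical store-then-load pair in the style of the core of Dekker's algorithm, and the names `interleavings`, `run`, `thread1`, and `thread2` are assumptions made for illustration. It enumerates every interleaving that preserves each thread's program order and collects the register values that result.

```python
from itertools import combinations

def interleavings(a, b):
    """Yield every merge of lists a and b that preserves each list's own order."""
    n, m = len(a), len(b)
    for positions in combinations(range(n + m), n):
        chosen = set(positions)          # slots taken by instructions of a
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in chosen else next(ib) for i in range(n + m)]

def run(schedule):
    """Execute one interleaving on fresh shared memory and return the registers."""
    mem = {"x": 0, "y": 0}               # shared locations, initially 0
    regs = {}
    for op, target, arg in schedule:
        if op == "store":
            mem[target] = arg            # write the constant arg to location target
        else:                            # "load"
            regs[target] = mem[arg]      # read location arg into register target
    return regs

# Hypothetical two-thread program (an assumption, not copied from the paper).
thread1 = [("store", "x", 1), ("load", "r1", "y")]
thread2 = [("store", "y", 1), ("load", "r2", "x")]

outcomes = set()
for schedule in interleavings(thread1, thread2):
    regs = run(schedule)
    outcomes.add((regs["r1"], regs["r2"]))

print(sorted(outcomes))  # (r1, r2) pairs reachable under sequential consistency
```

In this sketch the pair (0, 0) never appears, because in every interleaving at least one store precedes both loads. Hardware or compilers that reorder the store and the load within a thread could produce it, which is the kind of surprise a memory model has to either forbid or explain.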

Detailed Examples of the Problem

Figure 1, Dekker’s algorithm
Figure 2, non-determinism in the hardware
Figure 4a and 4b, speculation combined with control and data dependences
– They say 4a does not have a data race.
Figure 5, do we allow individual program optimizations?

Current Solutions and Remaining Problems

Ada
– Pro: high-level programming model with support for shared memory parallel programming
– Con: “did not fully formalize the notion of well-synchronized”

Java
– Pro: threads in the language
– Con: must deal with data races due to safety guarantees of the language
– Con: in the memory model a “future can determine whether the current access is legal”

C/C++ and POSIX threads
– Pro: simpler than the Java memory model, because data races can be left undefined
– Con: Atomic keyword can break sequential consistency

Data-race free
– Single-thread program optimizations must still be modified
– Does not deal with data races
– May not have efficient sequential consistency support in HW
– Does not "eliminate atomicity violations or non-deterministic behavior"
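The data-race-free idea, and the claim that a program like Figure 4a has no data race, can be illustrated with a toy checker. This is a sketch under stated assumptions, not the paper's formalism or its figure: it treats a program as data-race free when no sequentially consistent execution contains two adjacent accesses from different threads to the same location with at least one write, and the thread bodies `guarded1` and `guarded2` are hypothetical guarded writes chosen to resemble that style of example. All function names are invented for illustration.

```python
def guarded1():
    # Hypothetical thread: if (x == 1) y = 1
    if (yield ("read", "x")) == 1:
        yield ("write", "y", 1)

def guarded2():
    # Hypothetical thread: if (y == 1) x = 1
    if (yield ("read", "y")) == 1:
        yield ("write", "x", 1)

def racy_writer():
    yield ("write", "x", 1)              # unsynchronized write to x

def racy_reader():
    yield ("read", "x")                  # unsynchronized read of x

def replay(threads, schedule):
    """Run the threads in the order given by schedule (a tuple of thread indices).
    Return the access trace and the indices of threads that can still step."""
    mem = {"x": 0, "y": 0}
    gens = [t() for t in threads]
    pending = [next(g, None) for g in gens]   # first access of each thread
    trace = []
    for tid in schedule:
        acc = pending[tid]
        trace.append((tid, acc[0], acc[1]))
        try:
            if acc[0] == "read":
                pending[tid] = gens[tid].send(mem[acc[1]])  # deliver value read
            else:
                mem[acc[1]] = acc[2]
                pending[tid] = next(gens[tid])
        except StopIteration:
            pending[tid] = None                             # thread finished
    return trace, [i for i, p in enumerate(pending) if p is not None]

def sc_traces(threads, schedule=()):
    """Yield the trace of every complete sequentially consistent execution.
    Replays from scratch at each step, so this only suits tiny programs."""
    trace, runnable = replay(threads, schedule)
    if not runnable:
        yield trace
    for tid in runnable:
        yield from sc_traces(threads, schedule + (tid,))

def has_race(trace):
    """True if two adjacent accesses from different threads conflict."""
    return any(ta != tb and la == lb and "write" in (oa, ob)
               for (ta, oa, la), (tb, ob, lb) in zip(trace, trace[1:]))

def data_race_free(threads):
    return not any(has_race(t) for t in sc_traces(threads))

print(data_race_free([guarded1, guarded2]))        # True
print(data_race_free([racy_writer, racy_reader]))  # False
```

In the guarded program neither write ever executes under sequential consistency, since both locations start at 0, so no conflicting accesses can be adjacent. A transformation that performs either write speculatively would change that, which is the concern raised by the pairing of Figures 4a and 4b.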

Author Conclusions and Future Research

The shared memory programming model is important and should be supported with good memory models
– Hardware already supports it (e.g., cache coherence)
– Can pass references to complex data structures, which is much more efficient than copying
– Incremental parallelization is easier
– Do not have to explicitly distribute data structures

Memory model development should be more disciplined
– Should move away from development based only on test cases
– “disciplined shared-memory models”

System architecture and programming languages should enforce no data races

Need SW/HW co-design to successfully evolve and/or reinvent memory models

My Future Research Questions

Composition of programming models and memory models
– Do parallel programming models that we want to compose need to have the same underlying memory model?
– Can we develop memory models so they are composable?

Implementation details surrounding shared memory parallel programming
– Examples include synchronization constructs and the atomic, shared, and volatile keywords.
– Is this the way we should be expressing these implementation details?
– Can we make implementation details such as these more orthogonal to the algorithm specification?

