Transactional Memory: Companion slides for The Art of Multiprocessor Programming

Processor Issues Load Request
The processor issues load x on the bus. Memory answers ("Got it!") and the data is cached in state E (exclusive).

Processor Issues Load Request
A second processor issues load x while the first cache holds the data in state E.

Other Cache Responds
The cache holding the data answers ("Got it"). Both caches now hold the data in state S (shared).

Modify Cached Data
Both caches hold the data in state S; one processor now modifies its cached copy.

Invalidate
The writer broadcasts invalidate x on the bus. Its own line goes from S to M (modified); the other cache's line goes from S to I (invalid).

Invalidate
This cache acquires write permission.
Other caches lose read permission.

Invalidate
Memory provides data only if it is not present in any cache, so there is no need to change memory now (expensive).
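The load and invalidate steps above can be sketched as a small state machine. This is a minimal illustration, assuming the four MESI states named on the slides and two caches tracking a single line; the names mesi_t, mesi_load and mesi_store are hypothetical, and data movement (including write-back of a modified copy) is not modeled.

```c
#include <assert.h>

/* Per-line cache states as labeled on the slides. */
typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } mesi_t;

#define NCACHES 2  /* hypothetical: two caches tracking one line */

/* Cache `cpu` loads the line. */
static void mesi_load(mesi_t line[NCACHES], int cpu) {
    assert(cpu >= 0 && cpu < NCACHES);
    if (line[cpu] != INVALID)
        return;                      /* hit: no bus traffic needed */
    int supplied_by_cache = 0;
    for (int i = 0; i < NCACHES; i++) {
        if (i != cpu && line[i] != INVALID) {
            line[i] = SHARED;        /* the other cache answers and downgrades */
            supplied_by_cache = 1;
        }
    }
    /* Memory supplies the data only if no cache had it. */
    line[cpu] = supplied_by_cache ? SHARED : EXCLUSIVE;
}

/* Cache `cpu` writes the line: every other copy is invalidated. */
static void mesi_store(mesi_t line[NCACHES], int cpu) {
    assert(cpu >= 0 && cpu < NCACHES);
    for (int i = 0; i < NCACHES; i++) {
        if (i != cpu)
            line[i] = INVALID;       /* other caches lose read permission */
    }
    line[cpu] = MODIFIED;            /* this cache acquires write permission */
}
```

Starting from two invalid lines, mesi_load on cache 0, mesi_load on cache 1, then mesi_store on cache 0 walks through the E, S/S and M/I states shown on the slides.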

HW Transactional Memory
[Diagram: an active transaction reads; its cache line is tagged T. Caches connect to memory through the interconnect.]

Transactional Memory
[Diagram: two active transactions read; both cache lines are tagged T.]

Transactional Memory
[Diagram: one transaction committed, one active; cache lines tagged T.]

Transactional Memory
[Diagram: a write; one transaction committed, one active; cache lines tagged T and D.]

Rewind
[Diagram: a write; one transaction aborted, others active; cache lines tagged T, T and D.]

Transaction Commit
At the commit point:
  No cache conflicts? We win.
  Mark transactional cache entries:
    Was: read-only, Now: valid
    Was: modified, Now: dirty (will be written back)
That’s (almost) everything!
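The commit rule above can be sketched in a few lines of C. This is an illustrative model only, not how any particular processor implements it: the type cache_line_t and the functions tx_commit and tx_abort are hypothetical names, and the abort behavior (dropping every transactional entry) is an assumption added to contrast with commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { LINE_INVALID, LINE_VALID, LINE_DIRTY } line_state_t;

typedef struct {
    line_state_t state;
    bool transactional;  /* the "T" tag: touched by the running transaction */
    bool tx_modified;    /* written (not just read) inside the transaction  */
} cache_line_t;

/* Commit: read-only entries become valid, modified entries become dirty. */
static void tx_commit(cache_line_t *lines, size_t n) {
    assert(lines != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!lines[i].transactional)
            continue;                       /* untouched by the transaction */
        lines[i].state = lines[i].tx_modified ? LINE_DIRTY : LINE_VALID;
        lines[i].transactional = false;
        lines[i].tx_modified = false;
    }
}

/* Abort (assumed behavior): discard every transactional entry. */
static void tx_abort(cache_line_t *lines, size_t n) {
    assert(lines != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!lines[i].transactional)
            continue;
        lines[i].state = LINE_INVALID;
        lines[i].transactional = false;
        lines[i].tx_modified = false;
    }
}
```

Because both paths only flip per-line tags, commit and abort need no copying of data, which is why the slide can call this "(almost) everything".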

Road Map
  Transactional Memory
  Hardware Transactional Memory
  Hybrid Transactional Memory
  Software Transactional Memory
  Research Questions

Hardware Transactional Memory (HTM)
  IBM’s Blue Gene/Q, System z, and Power8
  Intel’s Haswell TSX extensions

Intel RTM
    if (_xbegin() == _XBEGIN_STARTED) {   // start a speculative transaction
        // if you see _XBEGIN_STARTED, you are inside a transaction
        speculative code
        _xend();
    } else {
        // if you see anything else, your transaction aborted;
        // you could retry the transaction, or take an alternative path
        abort handler
    }

Abort codes
    unsigned status = _xbegin();          // keep the status so the abort bits can be tested
    if (status == _XBEGIN_STARTED) {
        speculative code                  // speculative code can call _xabort()
        _xend();
    } else if (status & _XABORT_EXPLICIT) {
        aborted by user code
    } else if (status & _XABORT_CONFLICT) {
        read-write conflict               // synchronization conflict occurred (maybe retry)
    } else if (status & _XABORT_CAPACITY) {
        cache overflow                    // read/write set too big (maybe don’t retry)
    } else {
        …                                 // other abort codes
    }
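The "maybe retry" and "maybe don't retry" hints above can be turned into a small decision helper. The sketch below is an assumption-laden illustration: the TX_ABORT_* constants are local stand-ins chosen to mirror the bit positions of the _XABORT_* macros in GCC's and Clang's <immintrin.h> (real code should use the header's macros), and the function after_abort, its retry budget, and the choice to fall back on every other abort cause are hypothetical policy, not something the slides prescribe.

```c
#include <assert.h>

/* Local stand-ins mirroring the _XABORT_* status bits (assumption:
   use the macros from <immintrin.h> in real RTM code). */
#define TX_ABORT_EXPLICIT (1u << 0)  /* user code called _xabort()     */
#define TX_ABORT_CONFLICT (1u << 2)  /* memory conflict with a peer    */
#define TX_ABORT_CAPACITY (1u << 3)  /* read/write set overflowed      */

typedef enum { TX_RETRY, TX_FALLBACK } tx_action_t;

/* Decide what to do after an aborted transaction.
   attempts_left is a hypothetical retry budget kept by the caller. */
static tx_action_t after_abort(unsigned status, int attempts_left) {
    assert(attempts_left >= 0);
    if (status & TX_ABORT_CAPACITY)
        return TX_FALLBACK;          /* too big: retrying will likely overflow again */
    if (status & TX_ABORT_EXPLICIT)
        return TX_FALLBACK;          /* user code asked to give up */
    if ((status & TX_ABORT_CONFLICT) && attempts_left > 0)
        return TX_RETRY;             /* conflicts are often transient */
    return TX_FALLBACK;              /* any other cause: take the alternative path */
}
```

A caller would typically loop on _xbegin(), pass the returned status to a helper like this, and decrement its budget on each retry.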

Too Big
Transaction aborts if its data set overflows the caches or internal buffers.

Too Slow
Transaction aborts on a timer interrupt.

Just Not in the Mood
Many other reasons: TLB miss, illegal instruction, page fault …

Hybrid Transactional Memory

Non-Speculative Fallback
    if (_xbegin() == _XBEGIN_STARTED) {
        read lock state                  // reading the lock ensures that the transaction
                                         // will abort if another thread acquires the lock
        if (lock taken) _xabort(0xff);   // abort if another thread has acquired the lock
                                         // (_xabort takes an 8-bit constant abort code)
        work;
        _xend();
    } else {
        // on abort, acquire the lock and do the work
        // (aborting concurrent speculative transactions)
        lock->lock();
        work;
        lock->unlock();
    }
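The fallback pattern above might look like this in compilable C. It is a sketch under stated assumptions: the spin lock, the shared counter, and the function names (lock_acquire, lock_release, increment) are hypothetical, and the speculative path is compiled only when the compiler defines __RTM__ (for example with GCC or Clang and -mrtm), so on other builds every call simply takes the lock. Running the RTM build also assumes a CPU that actually supports TSX.

```c
#include <assert.h>
#include <stdatomic.h>
#if defined(__RTM__)
#include <immintrin.h>   /* _xbegin, _xend, _xabort */
#endif

static atomic_int fallback_lock = 0;  /* 0 = free, 1 = held (hypothetical spin lock) */
static long counter = 0;              /* shared data protected by the lock */

static void lock_acquire(void) {
    int expected = 0;
    while (!atomic_compare_exchange_weak(&fallback_lock, &expected, 1))
        expected = 0;                 /* CAS overwrote expected; reset and spin */
}

static void lock_release(void) {
    assert(atomic_load(&fallback_lock) == 1);  /* only release a held lock */
    atomic_store(&fallback_lock, 0);
}

static void increment(void) {
#if defined(__RTM__)
    if (_xbegin() == _XBEGIN_STARTED) {
        /* Reading the lock adds it to the read set, so a later
           acquisition by another thread aborts this transaction. */
        if (atomic_load(&fallback_lock) != 0)
            _xabort(0xff);            /* lock already held: give up speculation */
        counter++;                    /* the "work" */
        _xend();
        return;
    }
#endif
    /* Non-speculative path: taken after an abort, or always without RTM. */
    lock_acquire();
    counter++;
    lock_release();
}
```

This version falls back to the lock after a single failed attempt; a production variant would usually retry a few times before giving up.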

Lock Elision
    <HLE acquire prefix> lock();
    do work;
    <HLE release prefix> unlock()
First time around, read the lock and execute speculatively.
If speculation fails, no more Mr. Nice Guy: acquire the lock.

Conventional Locks
[Diagram: execution under locks, labeled "lock transfer latencies" and "serialized execution".]

Lock Elision
[Diagram: execution under locks compared with lock elision.]

Lock Teleportation

Hand-over-Hand locking
[Diagram: list nodes a, b, c.]

Transactional Memory
Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Moore’s Law
Transistor count still rising
Clock speed flattening sharply
