
Improving Commit Scalability in Lazy Hardware Transactional Memory
Anurag Negi*, Rubén Titos-Gil^, Manuel E. Acacio^, Jose M. Garcia^, Per Stenström*
*Chalmers University of Technology, Sweden
^Universidad de Murcia, Spain
Fourth Swedish Workshop on Multicore Computing (MCC) at Linköping University, 2011

Outline
• The importance of HTM
• The key challenges
• An approach to finding solutions
• Prior work and associated inefficiencies
• The π-TM approach

Where does HTM fit in the big picture?

HTM: Economy and Performance
[Figure: performance versus economy/productivity, positioning FGLocks, HTM and STM]
HTM Challenges
• Manage design complexity
  • Utilize existing mechanisms better
  • Minimize changes required
• Improve performance
  • Go lazy!!
  • Yet avoid bulk communication!!!

Managing complexity
Managing design complexity by utilizing existing mechanisms better
• Use the coherence protocol to detect conflicts early and track these at cache line granularity
Managing design complexity by minimizing changes
• No ad-hoc communication hardware for TM
• Piggy-back TM information on coherence messages

Improving performance
Improving performance by going lazy
• Optimistically run past conflicts
• Minimize abort overhead
• Utilize MLP better
Improving performance by avoiding bulk communication
• Lightweight commits using point-to-point messaging only between affected cores

Scalability of lazy commits
• Naïve: One at a time … the entire address space is one giant bank
• Better: Split address space into banks … lock all required banks prior to committing updates … ensure progress guarantees
• Ideal: Ensure conflicting transactions re-execute and prevent re-executions/new transactions from reading locations not yet updated
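The "Better" scheme above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the bank count, the address-to-bank mapping, the names `bank_of` and `commit`, and software locks standing in for hardware bank arbitration. Acquiring banks in ascending index order is one common way to avoid deadlock; the slides do not say which mechanism the cited designs use.

```python
import threading

NUM_BANKS = 4  # assumed bank count, purely illustrative
bank_locks = [threading.Lock() for _ in range(NUM_BANKS)]
memory = {}  # stands in for the shared address space


def bank_of(addr):
    """Hypothetical address-to-bank mapping (simple interleaving)."""
    return addr % NUM_BANKS


def commit(write_set):
    """Lock every bank the write set touches, then publish the updates."""
    # A fixed (ascending) acquisition order means two committers can never
    # each hold a bank the other is waiting for.
    banks = sorted({bank_of(addr) for addr in write_set})
    for b in banks:
        bank_locks[b].acquire()
    try:
        memory.update(write_set)
    finally:
        for b in reversed(banks):
            bank_locks[b].release()
    return banks
```

Transactions whose write sets map to disjoint banks can commit concurrently in this sketch, while the naïve scheme corresponds to `NUM_BANKS = 1`.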

Prior Work
EAZY-HTM [Micro2009]
• Detect early – Resolve late
• Ad-hoc communication channel for TM
• Relies on directory communication for correctness
The correctness concern
• Prevent other cores from accessing lines that are part of a committing transaction's write set but haven't yet been made globally visible

The correctness concern in more detail
[Figure: timeline of events at Core 1 and Core 2]
• L1@Core1: {X_old, Y_old}
• TCommit@Core2: {X_new, Y_new}
• INV(X)
• L1@Core1: {Y_old}
• Core1:TRead(X) returns X_new
• Core1:TRead(Y) returns Y_old
• INV(Y) (delayed)
• TCommit@Core1: {P, Q}
• L1@Core1: {}
Core 1 commits an inconsistent computation. Atomicity requires Core 1 to see either (X_old, Y_old) or (X_new, Y_new), but not (X_new, Y_old).

The EAZY-HTM Approach
• Every first TRead or TWrite to a cache line communicates with the directory
• Ensures correctness but causes severe performance degradation
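The interleaving behind the correctness concern can be replayed with a small Python sketch. It is a deliberately simplified model, not the protocol itself: dictionaries stand in for shared memory and Core 1's L1, the helper names `invalidate` and `tread` are hypothetical, and Core 2's new values are assumed to be already readable from memory.

```python
# Shared memory as seen once Core 2's commit {X_new, Y_new} is readable
# (a modelling assumption).
memory = {"X": "X_new", "Y": "Y_new"}
# Core 1's private L1 still holds the pre-commit values.
l1_core1 = {"X": "X_old", "Y": "Y_old"}


def invalidate(cache, line):
    """Model an incoming INV(line) message."""
    cache.pop(line, None)


def tread(cache, line):
    """Transactional read: hit in L1 if present, otherwise fetch from memory."""
    if line not in cache:
        cache[line] = memory[line]
    return cache[line]


invalidate(l1_core1, "X")  # INV(X) arrives; INV(Y) is still delayed
seen = (tread(l1_core1, "X"), tread(l1_core1, "Y"))
print(seen)  # ('X_new', 'Y_old'): a mix that atomicity forbids
```

The read of Y hits the stale line in L1 because its invalidation has not arrived yet, which is exactly the case a committing transaction's write set must be protected against.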

Reason for performance degradation
• Most cache lines accessed in a typical transaction are not contended
• Excessive communication with the directory causes congestion
The π-TM Approach
• Speed up the common case
• Do extra work only for contended lines

The π-TM Approach
Goals
• Speed up the common case
• Do extra work only for contended lines
Design changes
• Add a π-bit to track contended lines
• Pessimistically invalidate such lines on commit or abort
Other aspects
• No ad-hoc communication channel for TM
• TM info is piggy-backed on coherence messages
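The design changes listed above can be pictured with a minimal Python sketch of a cache that carries a π-bit per line. The class and method names (`Line`, `L1Cache`, `mark_contended`, `end_transaction`) are hypothetical, and the trigger for setting the π-bit is an assumption, since the slides do not spell out when a line is marked as contended.

```python
from dataclasses import dataclass


@dataclass
class Line:
    value: int
    pi: bool = False  # π-bit: set when the line is known to be contended


class L1Cache:
    def __init__(self):
        self.lines = {}

    def fill(self, addr, value):
        """Bring a line into the cache with its π-bit clear."""
        self.lines[addr] = Line(value)

    def mark_contended(self, addr):
        # Assumption: called when conflict information arrives piggy-backed
        # on a coherence message for this line.
        if addr in self.lines:
            self.lines[addr].pi = True

    def end_transaction(self):
        """On commit or abort, pessimistically invalidate π-marked lines."""
        dropped = sorted(a for a, line in self.lines.items() if line.pi)
        for addr in dropped:
            del self.lines[addr]
        return dropped


cache = L1Cache()
cache.fill(0x10, 1)
cache.fill(0x20, 2)
cache.mark_contended(0x20)
dropped = cache.end_transaction()  # only the contended line 0x20 is dropped
```

Uncontended lines stay cached across transactions in this sketch, which reflects the goal of doing extra work only for contended lines.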

Incorporating adaptability
Why?
• For short transactions with high contention, early conflict detection can increase transactional execution time
Lazy detection and resolution
• Has commit scalability problems, but works well when application scalability is the dominant limiting factor (high contention)
• We employ a global commit token (GCT) scheme in such scenarios
• Each thread decides locally whether to use π-mode or GCT-mode
• Both π-mode and GCT-mode transactions can coexist safely
• Most applications run in π-mode
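The global commit token idea used in GCT-mode can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a software lock plays the role of the token, the class name `GlobalCommitToken` and its fields are hypothetical, and the per-thread choice between π-mode and GCT-mode is not modelled because the slides do not describe how that decision is made.

```python
import threading


class GlobalCommitToken:
    """Serializes commits: only the token holder may publish its write set."""

    def __init__(self):
        self._token = threading.Lock()  # stands in for the hardware token
        self.memory = {}
        self.commit_order = []

    def commit(self, tx_id, write_set):
        with self._token:  # one transaction commits at a time
            self.memory.update(write_set)
            self.commit_order.append(tx_id)


gct = GlobalCommitToken()
threads = [
    threading.Thread(target=gct.commit, args=(i, {"shared": i, f"private{i}": i}))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each write set is published while the token is held, the final value of `"shared"` belongs to whichever transaction committed last, and no write set is ever interleaved with another.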

Estimating impact
Baseline
• Faithfully implements the EAZY-HTM information flow
• However, we use the NoC for communication (no ad-hoc communication)
• Coherence requests carry TM info as well
π-TM
• Implemented on top of this baseline
• Adaptability mechanisms are enabled
Other configurations evaluated
• EE: LogTM, an eager conflict resolution design
• LL-GCT: Global commit token (transactions commit one at a time)
• LL-STCC: A detailed scalable TCC implementation

Performance
[Figure: bar chart; 4 bars (left to right): π-TM, EE (LogTM), LL-GCT, STCC. Annotations: Baseline; Effect of adaptability; Best overall performance; Improved commit bandwidth]
16 threads on 16 cores, SIMICS+GEMS, STAMP applications

Conclusion
π-TM achieves the following:
• A fully decentralized scalable commit protocol
• Only conflicting threads/transactions get affected
• Low design cost
• Performs the best among evaluated design points


