Lowering the Overhead of Nonblocking Software Transactional Memory

Virendra J. Marathe, Michael F. Spear, Christopher Heriot, Athul Acharya, David Eisenstat, William N. Scherer III, Michael L. Scott

Background
• Hardware support for managed code STMs is a daunting task
• C/C++ users need a fast nonblocking STM library
• The larger community needs STM libraries that are free and unencumbered by license restrictions
• RSTM: a fast, free, pthreads STM library

Outline
• Reducing indirection
• Limiting heap use
• Fast, flexible conflict detection
• Performance
• Future work
• Conclusions

Indirection Costs
[Figure labels: TMObject, Locator (Owner, Old, New), Descriptor (State), Data (old), Data (new)]
• Basic DSTM / ASTM / SXM organization
  – Adds 2 levels of indirection
  – Adds 3 pointer dereferences to access data
• Up to 4 cache misses to determine valid version
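The locator-based organization named on this slide can be sketched roughly as follows. This is a minimal illustration, not code from DSTM, ASTM, or SXM: the type and function names (`Descriptor`, `Locator`, `TMObject`, `current_version`) are hypothetical, and contention management and validation are left out. It shows why a read may touch four separate memory locations (TMObject, Locator, Descriptor, and the data itself) before it reaches a valid version.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical names; a sketch of a DSTM-style layout only.
enum class TxState { Active, Committed, Aborted };

struct Descriptor {
    std::atomic<TxState> state{TxState::Active};
};

template <typename T>
struct Locator {
    Descriptor* owner;  // transaction that installed this locator
    T* old_version;     // valid unless the owner committed
    T* new_version;     // valid once the owner committed
};

template <typename T>
struct TMObject {
    std::atomic<Locator<T>*> loc;
    explicit TMObject(Locator<T>* l) : loc(l) {}
};

// Picks the valid version of an object.
template <typename T>
T* current_version(const TMObject<T>& obj) {
    // Touches the TMObject to find the Locator.
    Locator<T>* l = obj.loc.load(std::memory_order_acquire);
    // Touches the Locator, then the owner's Descriptor.
    TxState s = l->owner->state.load(std::memory_order_acquire);
    // The caller touches the data itself when it uses the result.
    return s == TxState::Committed ? l->new_version : l->old_version;
}
```

A caller would build a `Locator<int>` from a descriptor and two versions, wrap it in a `TMObject<int>`, and read through `*current_version(obj)`; flipping the descriptor's state from Active to Committed switches the result from the old version to the new one.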

Reducing Indirection
[Figure labels: Header (Clean Bit, readers), Transaction Descriptor (State), Data (new) with Owner and Old fields, Data (old) with Old and Owner fields marked "never accessed"]
• Adds up to 2 levels of indirection
• Adds up to 3 dereferences
  – Unacquired objects: 1 dereference
  – Committed owner: 2 dereferences
  – Aborted owner: 3 dereferences
• 4 cache misses only on dirty, aborted owner
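The three dereference counts on this slide can be illustrated with a small sketch. Everything here is an assumption made for illustration rather than RSTM's actual code: the names `Header`, `DataObject`, and `current_version` are hypothetical, the clean bit is assumed to live in the low bit of the header word, and any owner that has not committed is treated like an aborted one.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

enum class TxState { Active, Committed, Aborted };
struct Descriptor { std::atomic<TxState> state{TxState::Active}; };

struct DataObject {
    Descriptor* owner;  // transaction that created this version
    DataObject* old;    // previous version, needed if the owner aborts
    int payload;
};
static_assert(alignof(DataObject) >= 2, "low pointer bit must be free");

constexpr std::uintptr_t kDirtyBit = 1;  // assumed encoding: set means acquired

struct Header {
    std::atomic<std::uintptr_t> word;  // DataObject* plus the dirty flag
    Header(DataObject* d, bool dirty)
        : word(reinterpret_cast<std::uintptr_t>(d) | (dirty ? kDirtyBit : 0)) {}
};

struct Resolved { DataObject* version; int dereferences; };

// Finds the valid version and reports how many objects were reached.
Resolved current_version(const Header& h) {
    std::uintptr_t w = h.word.load(std::memory_order_acquire);
    DataObject* newest = reinterpret_cast<DataObject*>(w & ~kDirtyBit);
    if ((w & kDirtyBit) == 0) return {newest, 1};  // unacquired: data only
    TxState s = newest->owner->state.load(std::memory_order_acquire);
    if (s == TxState::Committed) return {newest, 2};  // data, then descriptor
    return {newest->old, 3};  // data, descriptor, then old data
}
```

With a clean header the reader goes straight to the data; the descriptor and the old version are consulted only when the header is dirty, which is the case the slide singles out as the expensive one.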

Reusing Heap Objects
[Figure labels: Header (readers), Descriptor (State), Data (new) with Owner and Old fields, Data (old)]
• Reference counting descriptors risks a cache miss on every decrement
• At transaction end, RSTM cleans up all pointers to the descriptor
  – If abort, install clean header pointing to old object
  – If commit, install clean header pointing to new object
  – Most headers will be in cache
  – Appropriate data objects marked for lazy reclamation

Preallocation
[Figure: a set with a size field and a 64 element array]
• Initial read/write sets are fields in descriptor
  – Dynamic allocation only if set > 64 items
• Sets optimized for iteration
  – Every method that may do a lookup also does a full validation
  – Predict result of lookup, then verify it during the validation
  – High locality during iteration
  – Similar to McRT’s Sequential Store Buffers [PPoPP 06]
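The preallocation idea can be sketched as a container whose first 64 entries live inside the object itself and spill to the heap only past that point. This is a hypothetical illustration: `InlineSet` and its members are invented names, not RSTM's interface, and the sketch is append-only with no lookup, on the assumption that lookups are resolved during the validation pass the slide describes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical append-only set: N entries are stored inline, so a small
// transaction never allocates; larger sets overflow into a vector.
template <typename T, std::size_t N = 64>
class InlineSet {
    T inline_[N]{};
    std::size_t inline_count_ = 0;
    std::vector<T> overflow_;  // grows only after the inline array is full

public:
    void insert(const T& v) {
        if (inline_count_ < N) inline_[inline_count_++] = v;
        else overflow_.push_back(v);
    }

    std::size_t size() const { return inline_count_ + overflow_.size(); }
    bool spilled() const { return !overflow_.empty(); }

    // Visits entries in insertion order; the inline part is one
    // contiguous array, which keeps iteration cache friendly.
    template <typename F>
    void for_each(F f) const {
        for (std::size_t i = 0; i < inline_count_; ++i) f(inline_[i]);
        for (const T& v : overflow_) f(v);
    }
};
```

Embedding such a set as a field of the transaction descriptor means the common case reuses the same memory from one transaction to the next.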

Conflict Detection
• “Eager” and “Lazy” acquire are straightforward
• What about “Visible” readers?
  – Saves validation overhead, allows writer-reader arbitration
  – Typical implementation is as field in locator; visible reader list is modified atomically as part of header
    • Increases heap use and takes time to get memory, construct locator, and CAS it in
• Simpler solution via bitmap
  – Limits # visible readers
  – Allows (rare) spurious aborts
  – No memory management required

RSTM Visible Readers
[Figure: transaction T1 claims reader ID 2 from the Read IDs table with a CAS, then uses a CAS on each object header to set its bit in the reader bitmap (00000000 to 00000100, 00100000 to 00100100, 11000000 to 11000100)]
1. Get ReaderID
2. On open_RO(), set bit
3. On commit/abort, clear read bits
• 2n CAS instrs to read n objects
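The three steps on this slide can be sketched with two atomic bitmaps: one shared table of reader IDs in use, and one reader bitmap per object. The names `claim_reader_id` and `ReaderBitmap` are hypothetical, the 32-bit width is an assumption (the slide draws 8 bits), and in RSTM itself the reader bits sit alongside other header metadata rather than in a separate word.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Step 1: claim the lowest free reader ID, or return -1 when all IDs
// are taken (the bitmap limits the number of visible readers).
int claim_reader_id(std::atomic<std::uint32_t>& in_use) {
    std::uint32_t cur = in_use.load(std::memory_order_relaxed);
    for (;;) {
        int id = 0;
        while (id < 32 && (cur & (1u << id))) ++id;
        if (id == 32) return -1;
        // On failure, cur is refreshed and the search is repeated.
        if (in_use.compare_exchange_weak(cur, cur | (1u << id))) return id;
    }
}

struct ReaderBitmap {
    std::atomic<std::uint32_t> bits{0};

    // Step 2: on opening the object for reading, set this reader's bit.
    void set_reader(unsigned id) {
        assert(id < 32);
        std::uint32_t old = bits.load(std::memory_order_relaxed);
        while (!bits.compare_exchange_weak(old, old | (1u << id))) {}
    }

    // Step 3: on commit or abort, clear this reader's bit.
    void clear_reader(unsigned id) {
        assert(id < 32);
        std::uint32_t old = bits.load(std::memory_order_relaxed);
        while (!bits.compare_exchange_weak(old, old & ~(1u << id))) {}
    }

    // A writer can check for visible readers without walking a list.
    bool has_readers() const { return bits.load() != 0; }
};
```

Each object read costs one CAS to set the bit and one to clear it, which is where the slide's count of 2n CAS instructions for n objects comes from.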

RSTM Performance
• Tests conducted on 16-processor SunFire 6800
• Always outperforms Java ASTM
• C++ ASTM implementation shows that language is less important than metadata and conflict detection policy
• No single conflict detection policy is best

HashTable (embarrassingly parallel)
[Chart; series: Java ASTM, C++ ASTM, RSTM VE, RSTM IE, RSTM IL, RSTM VL, CGL. V/I = visible/invisible readers, E/L = eager/lazy acquire, CGL = coarse-grained lock]
• Few conflicts == strategy doesn’t matter much
• Metadata is the only difference between C++ ASTM and RSTM
• Eager has slightly less overhead

RBTree (some conflicts)
[Chart; series: Java ASTM, C++ ASTM, RSTM VE, RSTM IE, RSTM IL, RSTM VL, CGL; annotation: "2500 @ 1 thread"]
• Visible reads force tree head to bounce between cache lines

LFUCache (no parallelism)
[Chart; series: Java ASTM, C++ ASTM, RSTM VE, RSTM IE, RSTM IL, RSTM VL, CGL; annotation: "4500 @ 1 thread"]
• No natural parallelism; Lazy conflicts don’t impede progress

RandomGraph (torture test)
[Chart, log scale, Tx/sec; series: Java ASTM, C++ ASTM, RSTM VE, RSTM IE, RSTM IL, RSTM VL, CGL]
• Visible reads dramatically reduce validation
• Eager acquire leads to livelock

Future Work
• Adaptation between lazy and eager, visible and invisible
  – Architectural implications… Intel Xeon, Sun Niagara have very different CAS overheads
• Avoiding validation with heuristics
• Mixed invalidation
• Hardware assistance

Summary
• Better metadata organization reduces cache misses
• Limiting dynamic memory management helps
• Conflict detection is workload dependent
• Download RSTM for SPARC/Solaris at http://www.cs.rochester.edu/research/synchronization/rstm/ (check back soon for x86/Linux version)

Supplemental Material

Linked List with Early Release
[Chart; series: Java ASTM, C++ ASTM, RSTM VE, RSTM IE, RSTM IL, RSTM VL, CGL, FGL]
• FGL: cache & preemption effects
• C++ ASTM is best (no writer cleanup)
• Visible reads: 2 CASes in rapid succession
