SLIDE 1

Inevitability Mechanisms for Software Transactional Memory

Michael Spear (Rochester), Maged Michael (IBM), Michael Scott (Rochester)

SLIDE 2

Inevitability Mechanisms for STM

Why Inevitability

  • Irreversible operations (I/O)
    – Especially “I after O”
    – Bufferable output when order matters or interleaving is forbidden
    – Preserves local reasoning about correctness
      • “atomic” means “all or nothing” and “all at once”
  • Non-transactional code
    – Precompiled libraries (if binary rewriting is not available)
    – Lock-based code
    – Syscalls that change kernel state
  • Speed
    – Turn off read/write instrumentation
    – e.g. matrix math

SLIDE 3

Caveats

  • Condition Synchronization
    – Inevitable code can synchronize up to first potentially irreversible operation
    – Or at any point before becoming inevitable
    – Or via a special (limited applicability) closed nested transaction
    – All but last option are statically checkable (future work)
  • Library code
    – Unpredictable read/write sets may dictate mechanism
    – May need inevitable prefetching
    – Indirection-based backends cause problems
  • I/O deadlocks remain
    – Inevitable blocking read from empty pipe by T1 before inevitable write to same pipe by T2

SLIDE 4

How to Achieve Inevitability

  • Only permit one inevitable transaction at a time
  • Don’t let it abort
    – No explicit aborts: use eager locking, in-place update, augmented CM
    – No self aborts
    – No implicit aborts
      • Concurrent writer cannot commit changes to locations read by active inevitable transaction
      • This is the hard part
      • Note: concurrent writer can’t commit if its read set overlaps with inevitable transaction’s write set
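
The two rules on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `InevitabilityToken` and `writer_may_commit` are hypothetical names, read and write sets are modeled as plain Python sets, and a real STM would track them through per-location metadata rather than set objects.

```python
import threading


class InevitabilityToken:
    """Hypothetical global token; at most one transaction holds it at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.owner = None

    def try_become_inevitable(self, tx_id):
        # Non-blocking: a second requester is refused rather than queued.
        if not self._lock.acquire(blocking=False):
            return False
        self.owner = tx_id
        return True

    def release(self, tx_id):
        if self.owner != tx_id:
            raise RuntimeError("only the inevitable transaction may release the token")
        self.owner = None
        self._lock.release()


def writer_may_commit(writer_reads, writer_writes, inev_reads, inev_writes):
    """Apply the two commit rules from the slide to a concurrent writer."""
    # The writer must not change a location the inevitable transaction has read ...
    if not writer_writes.isdisjoint(inev_reads):
        return False
    # ... and must not have read a location the inevitable transaction has written.
    return writer_reads.isdisjoint(inev_writes)
```

The mechanisms in the rest of the talk differ mainly in how cheaply they let a writer evaluate the first check.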

SLIDE 5

Inevitability Mechanisms

  • No concurrency
    – Global Read/Write Lock (GRL)
  • Concurrent readers
    – Global Write Lock (GWL)
    – Global Write Lock with Fence (GWL + Fence)
    – Drain
  • Concurrent writers
    – Inevitable Read Locks (IRL)
    – Inevitable Read Filter (Filter)
  • See the paper for implementation details

Note: Drain, GRL, and GWL+Fence may delay at inevitability point

SLIDE 6

Sources of Latency

[Figure: table of latency sources by mechanism. Mechanisms: GRL, GWL, GWL + Fence, Drain. Phases: Inev Read Instr, Inev Write Instr, Inev Read Logging, Inev Commit, Tx Begin, Non-Inev Commit. Entries: Wait, Acquire, Test, Store, WBR, CAS, 2 CASes (writers only).]

SLIDE 7

Suitability to Tasks

  • Library / Syscall with unpredictable write set
    – GRL
  • Library / Syscall with unpredictable read set
    – Drain, GWL+Fence, GRL
  • Short inevitable transactions with likely conflicts
    – GWL
  • Short inevitable transactions with few conflicts
    – IRL, Bloom
  • Long but infrequent inevitable transactions
    – GWL+Fence
  • Long, frequent inevitable transactions
    – Drain

SLIDE 8

Evaluation

  • In the paper: microbenchmarks
    – Only Drain increases latency of short non-inevitable transactions
    – GWL and “small” Filter flat-line a scalable benchmark
    – Drain starts higher, but dampens scaling
    – Fences are relatively fair, but don’t accelerate workloads with >1 thread
    – For big tasks, Drain is a good accelerator
  • In this talk: a new benchmark
    – Asynchronous OpenGL 3-D rendering
    – Joint work with Michael Silverman and Luke Dalessandro

SLIDE 9

Why Write a New Benchmark?

  • Today’s programs written by today’s programmers
    – Trained to think about critical sections, locking, deadlock, and mutual exclusion
  • Who writes tomorrow’s programs?
    – We hope they will think about transactions, rollback, conflicts, and atomicity
  • Social experiment: get a smart undergraduate to write code with (moderate) supervision
    – Takes a couple of iterations to get the code “right”
    – But the programmer has a different (more transaction-friendly) philosophy
    – The result will probably have some relation to a game

SLIDE 10

A 3-D OpenGL Scene Graph

  • Animated Multisegment Objects (AMOs)
    – Big transaction does physics, animation, collision detection
    – Not a “read then write” transaction
    – Collision detection with anything that is “close”
  • Gravity Emitting Objects (GEOs)
    – Not animated, don’t have initial velocity, but do collision detection
    – Attract AMOs
  • Game: rescue AMOs before they fall into a GEO (about 2 minutes)
  • Benchmark: nobody playing the game, 500 AMOs, 10 GEOs

SLIDE 11

Screenshot: Early in Simulation

SLIDE 12

Screenshot: AMOs Converging

SLIDE 13

Thread Configuration

  • One thread continuously renders
    – Read AMOs in transactions to render new frame, then make an OpenGL call
  • All other threads continuously update AMOs / GEOs
    – Simulate physics based on time
  • Inevitable rendering or inevitable AMO updates
    – Without inevitability, renderer must explicitly buffer reads to ensure consistency
    – With inevitability, can aggressively batch renderer’s reads
  • “Best” choice is a function of the number of cores
    – Frame rate is decoupled from update rate, so higher not always better
    – Ideally, update rate ≈ frame rate ≈ screen refresh rate
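
The "explicitly buffer reads" case can be pictured with a small Python sketch. It is an illustration only, not the benchmark's code: `TransactionAborted`, `render_frame`, `read_positions`, and `draw` are hypothetical names, and the transactional read phase is modeled as an iterator that may raise on a conflict.

```python
class TransactionAborted(Exception):
    """Hypothetical signal that the read transaction hit a conflict."""


def render_frame(read_positions, draw, max_retries=10):
    """Buffer all reads inside a retryable phase, then make the irreversible call."""
    for _ in range(max_retries):
        buffered = []
        try:
            for position in read_positions():
                buffered.append(position)
        except TransactionAborted:
            continue  # discard the partial buffer and retry the read phase
        # The irreversible call happens only after a consistent snapshot exists.
        draw(buffered)
        return buffered
    raise RuntimeError("read phase kept aborting")
```

With an inevitable renderer, the buffer and the retry loop would be unnecessary, because the read phase could not abort and drawing could be interleaved with reading.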

SLIDE 14

Environment

  • Code
    – Uses new RSTM v2 API for word-based STMs
    – TL2-like back-end
    – Open source (will release soon)
  • Platform
    – Visual C++ 2005, Windows Vista (32-bit)
      • Also OS X, Linux versions
    – 2.6 GHz Q6600 (quad core), 4 GB RAM
    – NVIDIA 8800 GTS

SLIDE 15

Rendering (FPS)

[Chart: Frames Per Second (axis 20 to 200) for None, IRL, Bloom (L), Bloom (M), Bloom (S), Drain, GWL, GWL + TFence, GRL; series: Inev Render 1, Inev Render 10, Inev Update 1]

  • 60 FPS is the refresh rate
  • Inevitable render overheads
  • Fences hurt
  • Do updaters impede renderer?

SLIDE 16

AMO Commits per Second

[Chart: AMO Updates Per Second (axis 2,000 to 22,000) for None, IRL, Bloom (L), Bloom (M), Bloom (S), Drain, GWL, GWL + TFence, GRL; series: Inev Render 1, Inev Render 10, Inev Update 1]

  • Desire 30,000 commits
  • Bloom filter precision
  • No writer commit != blocked
  • GWL starves, fences don’t

SLIDE 17

Conclusions

  • Mechanisms have benefits and drawbacks
    – We think the mechanisms can compose
    – Many are also applicable to HTM, HyTM
  • If transactions are for the masses, inevitability is crucial
    – Local (and simple) reasoning about correctness
    – Need not sacrifice concurrency
  • New open-source OpenGL benchmark to further our understanding of transactions and inevitability

SLIDE 18

Questions / Discussion

SLIDE 19

Supplemental Slides

SLIDE 20

Global Read/Write Lock

  • Acquire exclusive permission to read / write shared locations
    – Independent of orecs
  • Must wait for clean-up
    – Otherwise, would have to instrument reads and writes
  • Concurrent readers won’t detect conflicts, so they can’t run
  • State of the art for STM
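
A minimal Python model of this behavior, assuming a transaction requests inevitability before it starts (a transaction that upgrades itself midway would need extra handling that is not shown). The class and method names are hypothetical, and a condition variable stands in for whatever spinning or flag-checking a real STM runtime uses.

```python
import threading


class GlobalReadWriteLock:
    """Sketch of GRL: the inevitable transaction runs with no concurrency at all."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0          # ordinary transactions currently in flight
        self._inevitable = False  # True while an inevitable transaction runs

    def try_begin(self):
        """Ordinary transactions (readers included) may not start during inevitability."""
        with self._cond:
            if self._inevitable:
                return False
            self._active += 1
            return True

    def end(self):
        """Called when an ordinary transaction has committed or aborted and cleaned up."""
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    def become_inevitable(self):
        with self._cond:
            while self._inevitable:   # only one inevitable transaction at a time
                self._cond.wait()
            self._inevitable = True   # blocks new transactions from starting
            while self._active > 0:   # wait for in-flight transactions to clean up
                self._cond.wait()

    def end_inevitable(self):
        with self._cond:
            self._inevitable = False
            self._cond.notify_all()
```

Because every other transaction is excluded, the inevitable transaction in this model needs no read or write instrumentation, which matches the "Otherwise, would have to instrument reads and writes" bullet.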

SLIDE 21

Read-Only Concurrency (1/2)

  • Global Write Lock
    – Acquire exclusive permission to write shared locations
      • Update metadata when writing
      • With commit-time locking, writers can run up to commit point
    – No waiting, but instrument reads to handle delayed cleanup
    – Rapid succession of inevitable transactions can starve big concurrent writers
  • Global Write Lock + Fence
    – Wait for cleanup after becoming inevitable
      • No risk of delayed cleanup… no read instrumentation for inevitable transaction
      • Inevitable transaction acquires with stores, not CASes
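
The difference between GWL and GWL + Fence can be sketched as a single flag plus an optional wait. This is a simplified model with hypothetical names: it tracks only writers that have entered their commit phase, and it omits the orec metadata updates and the read instrumentation that plain GWL needs to cope with delayed cleanup.

```python
import threading


class GlobalWriteLock:
    """Sketch of GWL: read-only transactions never consult this object."""

    def __init__(self):
        self._cond = threading.Condition()
        self._inevitable_writer = None  # id of the inevitable transaction, if any
        self._committing = 0            # writers inside their commit/cleanup phase

    def become_inevitable(self, tx_id, fence=False):
        with self._cond:
            while self._inevitable_writer is not None:
                self._cond.wait()
            self._inevitable_writer = tx_id
            if fence:
                # GWL + Fence: wait until in-flight writers have finished cleanup.
                while self._committing > 0:
                    self._cond.wait()

    def end_inevitable(self):
        with self._cond:
            self._inevitable_writer = None
            self._cond.notify_all()

    def try_begin_commit(self):
        """With commit-time locking, a writer runs freely until it reaches this point."""
        with self._cond:
            if self._inevitable_writer is not None:
                return False  # caller retries later or aborts
            self._committing += 1
            return True

    def end_commit(self):
        with self._cond:
            self._committing -= 1
            self._cond.notify_all()
```

A writer that keeps finding `try_begin_commit` refused is the starvation case named on the slide.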

SLIDE 22

Read-Only Concurrency (2/2)

  • The Drain
    – Like a fair reader-writer lock
      • Inevitable transaction is “writer”
      • Concurrent writer transactions are “readers”
    – No inevitable read instrumentation, store to acquire inevitability
    – Serialization on single global
      • 2 CASes to commit any writer
      • CAS to release inevitability
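
One way to picture the single global word is a counter of committing writers with a flag bit for the inevitable transaction. The sketch below is an assumption-laden model, not the paper's layout: the bit encoding and all names are hypothetical, a Python lock emulates the hardware compare-and-swap, and the flag is set with a CAS loop here for simplicity even though the slide describes a store.

```python
import threading

INEV_BIT = 1     # low bit: an inevitable transaction is active or waiting
ONE_WRITER = 2   # each committing writer adds 2 to the word


class Drain:
    def __init__(self):
        self._word = 0
        self._guard = threading.Lock()  # emulates atomicity of load and CAS

    def _load(self):
        with self._guard:
            return self._word

    def _cas(self, expected, new):
        with self._guard:
            if self._word != expected:
                return False
            self._word = new
            return True

    def writer_enter(self):
        """First CAS of a writer commit; refused while the inevitable bit is set."""
        while True:
            word = self._load()
            if word & INEV_BIT:
                return False
            if self._cas(word, word + ONE_WRITER):
                return True

    def writer_exit(self):
        """Second CAS of a writer commit."""
        while True:
            word = self._load()
            if self._cas(word, word - ONE_WRITER):
                return

    def request_inevitability(self):
        while True:
            word = self._load()
            if word & INEV_BIT:
                return False
            if self._cas(word, word | INEV_BIT):
                return True

    def drained(self):
        """True once every writer that entered before the request has exited."""
        return self._load() == INEV_BIT

    def release_inevitability(self):
        while True:
            word = self._load()
            if self._cas(word, word & ~INEV_BIT):
                return
```

An inevitable transaction would call `request_inevitability` and then wait until `drained()` holds, which models the "may delay at inevitability point" behavior.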

SLIDE 23

Read/Write Concurrency

  • Inevitable Read Locks
    – Add an inevitable read bit to each orec
      • Noninevitable writers can’t acquire orec if bit is set
    – CAS on every inevitable read
      • Cache misses for concurrent readers
  • Inevitable Read Filter
    – Approximate IRL bits as a Bloom filter
      • Less precise, but no misses for concurrent readers
    – WBR ordering to update filter
      • Write filter before checking if orec is held
    – WBR ordering to coordinate concurrent writers
      • Acquire orec before checking filter (PPC only)
      • Favors commit-time locking
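
The filter idea can be sketched with an ordinary Bloom filter over addresses. This is a single-threaded illustration with hypothetical names and parameters; it uses SHA-256 only for convenience and ignores the memory-ordering (WBR) requirements that the slide lists for a real concurrent implementation.

```python
import hashlib


class InevitableReadFilter:
    """Bloom filter over the addresses read by the inevitable transaction."""

    def __init__(self, num_bits=64, num_hashes=2):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, address):
        # Derive num_hashes bit positions from the address.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{address}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def record_inevitable_read(self, address):
        for position in self._positions(address):
            self.bits |= 1 << position

    def may_have_been_read(self, address):
        """Writers check this before committing; True may be a false positive."""
        return all((self.bits >> position) & 1 for position in self._positions(address))

    def clear(self):
        """Reset when the inevitable transaction commits."""
        self.bits = 0
```

A smaller `num_bits` saves space but produces more false positives, so more writers are refused without a true conflict; this is the sense in which the filter is less precise than per-orec IRL bits.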

SLIDE 24

Concurrency Summary

Mechanism     Delay upon becoming inevitable   Concurrent read-only   Concurrent writes
GRL           Yes
GWL           No                               X
GWL + Fence   Yes                              X
Drain         Sometimes                        X
IRL           No                               X                      X
Bloom         No                               X                      X