
Performance Evaluation of Adaptivity in STM
Mathias Payer and Thomas R. Gross
Department of Computer Science, ETH Zürich

Motivation
● STM systems rely on many assumptions
● Often contradicting for different programs
● Statically tuned to a baseline
● Use self-optimizing systems
● Adapt to different workloads
● What parameters can be adapted?
● How to measure effectiveness?
ISPASS'11 / 2011-04-12 Mathias Payer / ETH Zürich

Outline
● Introduction
● STM System
● STM Baseline
● Adaptive Parameters
● Evaluation
● Related work
● Conclusion

Introduction
● Software Transactional Memory (STM) applies transactions to memory
● (Optimistic) concurrency control mechanism
● Alternative to lock-based synchronization
● Multiple concurrent threads run transactions
● Concurrent memory modifications

Introduction
● Concurrent transactions modify memory without synchronization
● Transaction is verified after completion
● Conflicts are detected and resolved
● Changes committed for conflict-free transactions
● Modifications only visible after commit

Introduction

    withdraw {                 deposit {
      tmp = balance;             tmp = balance;
      tmp = tmp - 100;           tmp = tmp + 100;
      balance = tmp;             balance = tmp;
    }                          }

Transaction stages annotated on the slide: TX starts, balance in read-set, balance in write-set, conflict detection, data committed.
● What happens when balance is accessed concurrently?
● Either locking or STM needed to ensure correct end balance
● STM system decides which tx is executed first
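The withdraw/deposit example can be sketched as a toy STM in Python. This is only an illustration under simplifying assumptions, not AdaptSTM's implementation: the names `TVar`, `atomically`, and `run_demo` are hypothetical, writes are buffered (write-back), and a single global commit lock replaces the per-word lock/version table.

```python
import threading

class TVar:
    """A transactional variable with a version counter (hypothetical helper)."""
    def __init__(self, value):
        self.value = value
        self.version = 0

_commit_lock = threading.Lock()  # assumption: one global commit lock for simplicity

def atomically(body):
    """Run body(read, write) repeatedly until it commits without a conflict."""
    while True:
        read_set = {}   # TVar -> version seen at first read
        write_set = {}  # TVar -> buffered new value (write-back)

        def read(tvar):
            if tvar in write_set:
                return write_set[tvar]
            read_set.setdefault(tvar, tvar.version)  # record version before value
            return tvar.value

        def write(tvar, value):
            write_set[tvar] = value

        body(read, write)

        with _commit_lock:
            # Conflict detection: every location read must be unchanged.
            if all(tvar.version == seen for tvar, seen in read_set.items()):
                for tvar, value in write_set.items():
                    tvar.value = value   # data committed
                    tvar.version += 1
                return
        # Conflict: discard the buffers and retry the transaction.

def run_demo(pairs=50, start=1000):
    """Run equally many withdraw and deposit transactions concurrently."""
    balance = TVar(start)

    def withdraw(read, write):
        tmp = read(balance)
        write(balance, tmp - 100)

    def deposit(read, write):
        tmp = read(balance)
        write(balance, tmp + 100)

    threads = [threading.Thread(target=atomically, args=(tx,))
               for tx in (withdraw, deposit) * pairs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return balance.value
```

Because each transaction validates its read-set before publishing its write-set, a transaction that read a stale balance retries instead of overwriting the other transaction's update, so the withdrawals and deposits cancel out.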

STM Baseline
● Many efficient STM implementations agree on important design decisions:
  – Word-based locking
  – Global locking / version table
  – Eager locking
  – (Almost) no contention management
  – Simple write-set and read-set implementations

STM Baseline
[Figure: two transactions share a combined global write lock / version array. Each transaction keeps a read list / buffer, a lock list, a write list / buffer, a write hash, and a read hash.]
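A word-based global lock / version table is usually indexed by hashing the accessed address. The sketch below assumes a shift-and-mask hash, which is a common choice but is not confirmed by the slides as AdaptSTM's exact function; `lock_index`, `shifts`, and `table_bits` are illustrative names.

```python
def lock_index(address, shifts, table_bits):
    """Map a memory address to a slot in a global lock/version table.

    shifts:     low address bits dropped before indexing (assumed parameter)
    table_bits: log2 of the table size, e.g. 16 for 2^16 entries
    """
    return (address >> shifts) & ((1 << table_bits) - 1)

# With 3 shifts, neighbouring 8-byte words get separate slots.
word_a = lock_index(0x1000, shifts=3, table_bits=16)
word_b = lock_index(0x1008, shifts=3, table_bits=16)

# With 5 shifts, both words share one slot, so locking one also locks the other.
coarse_a = lock_index(0x1000, shifts=5, table_bits=16)
coarse_b = lock_index(0x1008, shifts=5, table_bits=16)
```

More shifts mean fewer distinct locks per transaction but more over-locking of unrelated neighbouring words; a larger table reduces accidental collisions at the cost of locality.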

Adaptive STM Parameters
● Global adaptivity
  – Synchronization needed
  – Optimizes to global optimum
  – Averages over all concurrent transactions
● (Thread-) local adaptivity
  – No synchronization needed
  – Limits adaptable parameters
  – Best parameters for each thread/transaction

Adaptive STM Parameters
● Different adaptive parameters measured:
  – Size of global locking/version-table *G
  – Size of local hash-tables *L
  – Write strategy *L
  – Locality tuning for hash-functions *L
  – Contention management *L
*L: local, *G: global

Adaptive Hash-Table
● Global hash-table: trade-off between over-locking and locality
● Global strategy: coordinate lock collisions and over-locking between threads
● Adapt size based on global information
● Local hash-table: trade-off between reset cost and # hash-collisions
● Local strategy: sample moving average of unique write locations
● Adapt size based on trend
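The local strategy can be sketched as an exponential moving average over the number of unique write locations per transaction. The class name `WriteHashSizer`, the smoothing factor, and the grow/shrink thresholds are assumptions chosen for illustration; the slides do not give the actual values.

```python
class WriteHashSizer:
    """Adapt a thread-local write-hash size from a moving average (sketch)."""

    def __init__(self, bits=6, alpha=0.25, min_bits=4, max_bits=16):
        self.bits = bits          # table has 2**bits slots
        self.alpha = alpha        # smoothing factor (assumed)
        self.min_bits = min_bits
        self.max_bits = max_bits
        self.avg = 0.0            # moving average of unique writes per transaction

    def observe(self, unique_writes):
        """Feed the unique-write count of one finished transaction."""
        self.avg += self.alpha * (unique_writes - self.avg)
        capacity = 1 << self.bits
        if self.avg > 0.5 * capacity and self.bits < self.max_bits:
            self.bits += 1        # trend upward: fewer collisions with a larger table
        elif self.avg < 0.125 * capacity and self.bits > self.min_bits:
            self.bits -= 1        # trend downward: a smaller table is cheaper to reset
        return self.bits

sizer = WriteHashSizer()
for _ in range(10):
    sizer.observe(200)            # write-heavy phase
grown_bits = sizer.bits
for _ in range(50):
    sizer.observe(2)              # small transactions
shrunk_bits = sizer.bits
```

Because the state lives in one thread, no synchronization is needed to resize the table between transactions.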

Adaptive Write Strategy
● Different costs depending on strategy
  – Write-back: cheap abort, expensive commit
  – Write-through: expensive abort, cheap commit
● Adapt strategy to per-thread workload
● Measure abort rate
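One way to turn the measured abort rate into a strategy choice is a windowed counter with two thresholds. The sketch below is hypothetical: `WriteStrategySelector`, the window length, and the `high`/`low` thresholds are assumed values, not figures from the presentation.

```python
class WriteStrategySelector:
    """Pick write-back or write-through per thread from the abort rate (sketch)."""

    def __init__(self, window=100, high=0.30, low=0.05):
        self.window = window      # transactions per measurement window (assumed)
        self.high = high          # at or above this abort rate, prefer cheap aborts
        self.low = low            # at or below this abort rate, prefer cheap commits
        self.strategy = "write-back"
        self.total = 0
        self.aborts = 0

    def record(self, aborted):
        """Record one finished transaction and return the current strategy."""
        self.total += 1
        if aborted:
            self.aborts += 1
        if self.total >= self.window:
            rate = self.aborts / self.total
            if rate >= self.high:
                self.strategy = "write-back"     # cheap abort, expensive commit
            elif rate <= self.low:
                self.strategy = "write-through"  # expensive abort, cheap commit
            self.total = 0
            self.aborts = 0
        return self.strategy
```

The gap between the two thresholds keeps the selector from flipping back and forth when the abort rate hovers around a single cut-off.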

Adaptive Locality Tuning
● Different applications have different data access patterns
● No optimal hash function for all data accesses
● Measure number of hash collisions for thread-local hash tables
● Cycle through different hash functions
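The effect of the access pattern on hash collisions can be illustrated with a few candidate hash functions. The three functions in `HASHES`, the collision threshold, and the helper names are assumptions made for this sketch; the slides do not list the functions AdaptSTM rotates through.

```python
# Candidate hash functions (illustrative only); each maps an address to a slot.
HASHES = [
    lambda addr, mask: (addr >> 3) & mask,                   # consecutive 8-byte words
    lambda addr, mask: (addr >> 6) & mask,                   # one slot per 64-byte block
    lambda addr, mask: ((addr >> 3) ^ (addr >> 12)) & mask,  # fold in higher address bits
]

def count_collisions(addresses, hash_fn, bits):
    """Count accesses that land in a slot already held by a different address."""
    mask = (1 << bits) - 1
    occupied = {}  # slot -> first address stored there
    collisions = 0
    for addr in addresses:
        slot = hash_fn(addr, mask)
        if slot in occupied and occupied[slot] != addr:
            collisions += 1
        else:
            occupied[slot] = addr
    return collisions

def next_hash(current, collisions, threshold):
    """Move to the next hash function when collisions exceed the threshold."""
    return (current + 1) % len(HASHES) if collisions > threshold else current

# A page-strided access pattern defeats the simple shift hashes.
strided = [i * 4096 for i in range(16)]
simple = count_collisions(strided, HASHES[0], bits=6)
folded = count_collisions(strided, HASHES[2], bits=6)
```

For the strided pattern, every address lands in slot 0 under the first function, while the folding function spreads the same addresses over distinct slots.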

Adaptive Contention Management
● No single strategy works in all environments
● Measure contention and implement an adaptive back-off strategy
● Wait and retry
● Abort later
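An adaptive back-off can be sketched as a per-thread contention level that rises on aborts and falls on commits, with a randomized wait before the retry. This is one plausible reading of "wait and retry", not the documented AdaptSTM policy; `AdaptiveBackoff`, `base_us`, and `max_level` are hypothetical.

```python
import random

class AdaptiveBackoff:
    """Per-thread back-off that tracks recent contention (sketch)."""

    def __init__(self, base_us=1.0, max_level=10):
        self.base_us = base_us      # smallest back-off unit in microseconds (assumed)
        self.max_level = max_level  # cap so waits stay bounded
        self.level = 0              # current contention estimate

    def on_abort(self):
        self.level = min(self.level + 1, self.max_level)

    def on_commit(self):
        self.level = max(self.level - 1, 0)

    def max_delay_us(self):
        """Upper bound of the wait: doubles with every contention level."""
        return 0.0 if self.level == 0 else self.base_us * (1 << self.level)

    def delay_us(self, rng=random):
        """Randomized wait before retrying, to avoid lock-step retries."""
        return rng.uniform(0.0, self.max_delay_us())
```

A thread would call `on_abort()` after a failed transaction, sleep for `delay_us()` microseconds, and call `on_commit()` after a success so that the delay shrinks again once contention drops.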

Local Adaptive STM Parameters (for local hash-table)
[Figure: axis "# writes vs. hash-table space" starting at 0, with three regions from high to low: enlarge write-hash, no change, shrink write-hash.]

Local Adaptive STM Parameters (for local hash-table)
[Figure: axis "# hash collisions" starting at 0, with two regions from low to high: no change, change hash-function.]

Local Adaptive STM Parameters (for local hash-table)
[Figure: decision regions over "# writes vs. hash-table space" (vertical axis) and "# hash collisions" (horizontal axis), both starting at 0.]

# writes vs. space   few hash collisions    many hash collisions
high                 enlarge write-hash     enlarge write-hash & change hash-function
medium               no change              change hash-function
low                  shrink write-hash      shrink write-hash & change hash-function
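The two measurements combine into independent decisions, which can be sketched as a small function. The threshold values and the name `adapt` are assumptions; only the six action labels come from the slide.

```python
def adapt(writes, slots, collisions,
          grow_at=0.5, shrink_at=0.125, max_collisions=8):
    """Return the adaptation actions for one thread-local write-hash (sketch).

    writes:     unique write locations observed
    slots:      current number of hash-table slots
    collisions: hash collisions observed
    The three thresholds are illustrative, not values from the presentation.
    """
    actions = []
    load = writes / slots
    if load > grow_at:
        actions.append("enlarge write-hash")
    elif load < shrink_at:
        actions.append("shrink write-hash")
    if collisions > max_collisions:
        actions.append("change hash-function")
    return actions or ["no change"]
```

Since the size decision depends only on the write count and the hash-function decision only on the collision count, the six regions of the diagram fall out of two independent checks.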

AdaptSTM
● Adaptive STM system built on presented features
● Statically tuned competitive baseline
  – Static global hash function and hash table
● Mature and stable implementation
● Different local adaptive parameters
  – Write-set hash function and size of hash table
  – Write-through and write-back write strategy
  – Adaptive contention management

Evaluation
● Benchmark: STAMP 0.9.10
● ++ configuration (increased workload for kmeans)
● AdaptSTM version 0.5.1
● Intel 4-core Xeon E5520 CPU
● 8 cores @ 2.27GHz, 12GB RAM
● 64bit Ubuntu 9.04

Evaluation: Global Hash-Table
[Figure: two plots, kmeans and Genome, each with 4 threads. x-axis: # shifts (0 to 10); y-axis: time [s]; legend entries: 2^16, 2^18, 2^20, 2^22, 2^24, 2^26.]

Evaluation: Global Adaptivity
● Global optimizations have limited potential
● Small optimization potential
● High synchronization cost
● Reasonable baseline outperforms global optimization

Evaluation: Local Adaptivity
● Different configurations:
  – naWB: no adaptivity, use write-back
  – aWBT: adaptivity, adjust write-through / write-back
  – aWWH: aWBT plus an adaptive hash-table for the write-set
  – aWHH: aWWH plus different hash functions
  – aALL: all adaptive parameters plus Bloom filter for write-entries
● Adaptation system starts with best 'average' parameters, improves from there

Evaluation: Local Adaptivity
[Figure: two plots, kmeans and Labyrinth. x-axis: threads (1, 2, 4, 8, 16); y-axis: speedup to non adaptive (percent); series: aWBT, aWWH, aWHH, aALL.]
● aWBT: adaptive, write-back/-through
● aWWH: adaptive, write-back/-through, write-hash
● aWHH: adaptive, write-back/-through, write-hash, hash-function
● aALL: adaptive, write-back/-through, write-hash, hash-function, Bloom filter

Evaluation: Local Adaptivity
[Figure: two plots, Genome and Vacation. x-axis: threads (1, 2, 4, 8, 16); y-axis: speedup to non adaptive (percent); series: aWBT, aWWH, aWHH, aALL.]
● aWBT: adaptive, write-back/-through
● aWWH: adaptive, write-back/-through, write-hash
● aWHH: adaptive, write-back/-through, write-hash, hash-function
● aALL: adaptive, write-back/-through, write-hash, hash-function, Bloom filter

Evaluation: Local Adaptivity
● No single optimization works for all benchmarks
● Combination of all options leads to best performance
● Impressive speed-ups for individual benchmarks compared to the globally optimized case

Related Work
● TL2 (Dice et al.): baseline STM system
● Different related work on static tuning of global parameters (Harris, Dice, Ennals, Felber)
● Crucial for efficient baseline
● TinySTM (Felber et al.): adapts size and hash function of global locking table
● ASTM (Marathe et al.): adapts lazy-eager locking strategies and different meta-formats

Conclusions
● Adaptivity in STM is important for good performance
● Speedups up to 10% possible
● Global optimizations are limited
● Low potential, high synchronization cost
● Local optimizations tune thread-local parameters
● High correlation with workload

Questions?
● Contact: mathias.payer@nebelwelt.net
● Source: http://nebelwelt.net/projects/adaptSTM/
