Extending Hardware Transactional Memory Capacity via Rollback-Only Transactions and Suspend/Resume
Shady Issa, Pascal Felber, Alexander Matveev, Paolo Romano
Transactional memory implementation

    withdraw(account, value) {
        __transaction {
            if (account.balance > value) {
                account.balance -= value;
                return account.balance;
            } else {
                return -1;
            }
        }
    }
[Figure: HTM-SGL throughput (10^6 tx/s) and abort rate (%) as a function of transaction size. Abort categories in the legend: HTM tx, HTM non-tx, HTM capacity, lock aborts, ROT conflicts, ROT capacity. Callouts: capacity aborts; activation of the fallback path.]
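The HTM-SGL baseline named in the figure is commonly described as "try in hardware, then fall back to a single global lock". The sketch below is a toy, single-threaded simulation of that control flow, not the authors' implementation: `SimulatedHtm`, `HtmSgl`, `capacity_lines`, `footprint_lines` and `max_retries` are hypothetical names, and the only abort cause modeled is a footprint that exceeds the hardware capacity.

```python
import threading


class CapacityAbort(Exception):
    """Raised by the toy HTM when a transaction's footprint does not fit."""


class SimulatedHtm:
    """Hypothetical stand-in for hardware transactions (not a real HTM API)."""

    def __init__(self, capacity_lines):
        self.capacity_lines = capacity_lines

    def run(self, body, footprint_lines):
        # The toy model aborts before running the body, so nothing needs rollback.
        if footprint_lines > self.capacity_lines:
            raise CapacityAbort()
        return body()


class HtmSgl:
    """Try the hardware path a few times, then take a single global lock."""

    def __init__(self, htm, max_retries=3):
        self.htm = htm
        self.max_retries = max_retries
        self.global_lock = threading.Lock()
        self.stats = {"htm_commits": 0, "capacity_aborts": 0, "fallbacks": 0}

    def execute(self, body, footprint_lines):
        for _ in range(self.max_retries):
            try:
                result = self.htm.run(body, footprint_lines)
            except CapacityAbort:
                self.stats["capacity_aborts"] += 1
                continue
            self.stats["htm_commits"] += 1
            return result
        # Fallback path: serialize behind the global lock.
        with self.global_lock:
            self.stats["fallbacks"] += 1
            return body()


def withdraw(account, value):
    # Same logic as the withdraw example from the slides.
    if account["balance"] > value:
        account["balance"] -= value
        return account["balance"]
    return -1


account = {"balance": 100}
tm = HtmSgl(SimulatedHtm(capacity_lines=64))
small = tm.execute(lambda: withdraw(account, 30), footprint_lines=1)
large = tm.execute(lambda: withdraw(account, 30), footprint_lines=100)
```

In this model the large transaction exhausts its retry budget on capacity aborts and only then runs under the lock, which is the pattern the callouts "capacity aborts" and "activation of the fallback path" point at.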
Initially X = 0.
Thread 1: Begin ROT; read X (returns 0); read X (returns 1)
Thread 2: Begin ROT; X = 1; End ROT
WAR: Thread 1 observes an inconsistent value.
Initially X = 0.
Thread 1: Begin ROT; read X (returns 0); read X (returns 0)
Thread 2: Begin ROT; X = 1; End ROT
Annotations: RAW; "new value can ..."; consistent.
Initially X = 0.
Thread 1: Begin ROT; read X; read X
Thread 2: Begin ROT; X = 1; wait for concurrent ROTs non-transactionally; End ROT
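The two-thread schedule above can be replayed deterministically to see why waiting for concurrent ROTs matters. This is a minimal sketch under stated assumptions, not the authors' algorithm: the `Rot` class and `run_schedule` are hypothetical, reads are untracked, writes are buffered until commit, and "waiting" is modeled by `try_commit` returning `False` while any other ROT is still active.

```python
class Rot:
    """Toy rollback-only transaction: buffered writes, untracked reads."""

    def __init__(self, name, memory, active):
        self.name = name
        self.memory = memory
        self.active = active
        self.write_buffer = {}

    def begin(self):
        self.active.add(self.name)

    def read(self, key):
        # Reads are not tracked, so a later conflicting write goes unnoticed.
        return self.write_buffer.get(key, self.memory[key])

    def write(self, key, value):
        self.write_buffer[key] = value

    def try_commit(self, quiescence):
        # Mark this ROT as finished first, so it never blocks others.
        self.active.discard(self.name)
        if quiescence and self.active:
            return False  # must wait for concurrent ROTs
        self.memory.update(self.write_buffer)
        return True


def run_schedule(quiescence):
    memory = {"X": 0}
    active = set()
    t1 = Rot("T1", memory, active)
    t2 = Rot("T2", memory, active)
    t1.begin()
    t2.begin()
    first = t1.read("X")
    t2.write("X", 1)
    committed = t2.try_commit(quiescence)
    second = t1.read("X")
    t1.try_commit(quiescence)
    if not committed:
        t2.try_commit(quiescence)  # T1 is done, so T2 can now commit
    return (first, second), memory["X"]


without_wait = run_schedule(quiescence=False)
with_wait = run_schedule(quiescence=True)
```

Without the wait, Thread 1's two reads of X disagree; with it, both reads see the old value and Thread 2's write becomes visible only after Thread 1 has finished.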
Initially X = 0, Y = 0.
Thread 1: Begin ROT; read X; Y = 1; End ROT
Thread 2: Begin ROT; X = 1; read Y; End ROT
State labels on the slide: X = 0, Y = 1; X = 1, Y = 0.
Annotations: WAR (on both threads); serialisable.
Initially X = 0, Y = 0.
Thread 1 | Thread 2
Begin ROT | Begin ROT
read X | write X
write Y | read Y
Remaining steps, in slide order: End ROT; re-read X; re-read Y; End ROT
TMCAM: 64 entries (1 to 64).

HTM transaction:
    Begin HTM; read A; read B; read C; read D; write E; End HTM
    TMCAM entries used: &A, &B, &C, &D, &E

ROT with read addresses stored in software:
    Begin ROT; read A, store &A; read B, store &B; read C, store &C; read D, store &D; write E; End ROT
    TMCAM entries used: one 128-byte line holding the 8-byte addresses &A, &B, &C, &D, plus &E
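The capacity gain suggested by the TMCAM slides can be worked out with a little arithmetic. The sketch below takes the 64 entries, the 128-byte line and the 8-byte address from the slides; everything else is an assumption made for illustration: every read and write touches a distinct cache line, the address log starts at a line boundary, and the function names are hypothetical.

```python
TMCAM_ENTRIES = 64       # entries shown on the slide
CACHE_LINE_BYTES = 128   # size of one tracked line
ADDRESS_BYTES = 8        # size of one logged read address
ADDRESSES_PER_LINE = CACHE_LINE_BYTES // ADDRESS_BYTES


def entries_used_htm(reads, writes):
    # Plain HTM tracks every line it reads or writes (assumed distinct lines).
    return reads + writes


def entries_used_rot_with_log(reads, writes):
    # A ROT tracks only written lines; logged read addresses are writes to
    # the log, so they cost one entry per full or partial log line.
    log_lines = -(-reads // ADDRESSES_PER_LINE)  # ceiling division
    return log_lines + writes


def max_reads(entries_fn, writes):
    # Largest number of reads that still fits in the TMCAM.
    reads = 0
    while entries_fn(reads + 1, writes) <= TMCAM_ENTRIES:
        reads += 1
    return reads


slide_htm = entries_used_htm(reads=4, writes=1)
slide_rot = entries_used_rot_with_log(reads=4, writes=1)
htm_limit = max_reads(entries_used_htm, writes=1)
rot_limit = max_reads(entries_used_rot_with_log, writes=1)
```

For the slide's example (four reads, one write) the HTM transaction occupies five entries while the ROT occupies two. Under these assumptions, a transaction with one write can perform 16 times as many reads before hitting a capacity abort.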
Thread 1 (HTM) and Thread 2 (ROT), initially X = 0, Y = 0.
Operations, in slide order: Begin HTM; Begin ROT; read X; X = 1 (first builds only); Y = 1; read Y; End ROT; End HTM
HTM is protected by H/W.
read Y with T2V: inconsistent value (returns 0, returns 1).
read Y with S/R: consistent value (returns 0, returns 0).
update Tx read-only Tx
w/o instrumentation
[Figure: throughput (10^5 tx/s) vs. number of threads (2, 4, 8, 16, 32, 64) for P8TM, P8TM_UCB, HTM-SGL and HyNoRec, with the share of commits (%) per path: HTM, ROT, GL/STM, URO. Callout: 10 physical cores.]
[Figure: throughput (10^6 tx/s) vs. number of threads (2, 4, 8, 16, 32, 64) for P8TM, P8TM_UCB, HTM-SGL and HyNoRec, with the share of commits (%) per path: HTM, ROT, GL/STM, URO. Callout: UCB disables ROTs.]
suspend/resume to expand the capacity limitations
workloads
features that can be used in innovative techniques to mitigate hardware limitations
[Figure: abort rate (%) for SE++, SE and TE by bucket length (20, 50, 100, 266, 800, 1333, 2666). Abort categories: HTM non-tx, lock aborts, HTM capacity, ROT conflicts, ROT capacity. Second panel: speedup w.r.t. HTM-SGL for SE, SE++, TE and HTM-SGL, x-axis from 10^1 to 10^3. Callout: almost no contention.]
[Figure: throughput (10^5 tx/s) vs. number of threads (2, 4, 8, 16, 32, 64) for P8TM, HERWL, P8TM_UCB, HTM-SGL, HyNoRec and NoRec, with the share of commits (%) per path: HTM, ROT, GL/STM, URO. Callouts: 10 physical cores; committing in h/w.]
[Figure: throughput (10^6 tx/s) vs. number of threads (2, 4, 8, 16, 32, 64) for P8TM, HERWL, P8TM_UCB, HTM-SGL, HyNoRec and NoRec, with the share of commits (%) per path: HTM, ROT, GL/STM, URO. Callouts: small Txs; UCB disables ROTs.]