Time-Warp: Lightweight Abort Minimization in Transactional Memory
Nuno Diegues and Paolo Romano
ndiegues@gsd.inesc-id.pt
Nuno Diegues 1/27
Transactional Memory
◮ Powerful abstraction for synchronization in shared memory
◮ Typically, more aborts than needed
Linked List: head -> A -> D -> E -> ...
◮ T: insert B
◮ U: remove E
◮ RO: contains D?
◮ T (insert B): read-set: { head.next, A.next }, write-set: { A.next }
◮ U (remove E): read-set: { head.next, A.next, D.next }, write-set: { D.next }
◮ RO (contains D?): read-set: { head.next, A.next }
◮ T: read head.next: A, read A.next: D, write B.next = D, write A.next = B
◮ RO: read head.next: A, read A.next: D
◮ U: read head.next: A, read A.next: D, read D.next: E, write D.next = E.next
◮ RO -rw-> T and U -rw-> T: both read A.next: D, which T overwrites with write A.next = B
◮ X: U aborts
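The abort in this schedule can be reproduced with a minimal sketch of classic read-set validation. The `Tx` class, the `memory` and `version` dictionaries, and the per-location version counters are hypothetical names chosen for illustration, not the API of any STM in this deck. Letting read-only transactions commit without validation is an assumption that mimics a multi-version STM.

```python
# Shared state for the list head -> A -> D -> E (illustrative model only).
memory = {"head.next": "A", "A.next": "D", "D.next": "E"}
version = {loc: 0 for loc in memory}  # bumped on every committed write


class Tx:
    def __init__(self, name):
        self.name = name
        self.read_set = {}   # location -> version observed at first read
        self.write_set = {}  # location -> value to install at commit

    def read(self, loc):
        self.read_set.setdefault(loc, version.get(loc, 0))
        return memory[loc]

    def write(self, loc, value):
        self.write_set[loc] = value

    def commit(self):
        # Assumption: read-only transactions never abort (multi-versioning).
        if not self.write_set:
            return True
        # Classic validation: abort if anything we read was overwritten.
        if any(version.get(loc, 0) != seen for loc, seen in self.read_set.items()):
            return False
        for loc, value in self.write_set.items():
            memory[loc] = value
            version[loc] = version.get(loc, 0) + 1
        return True


t, u, ro = Tx("T"), Tx("U"), Tx("RO")

# All three transactions run their reads and buffer their writes first.
t.read("head.next"); t.read("A.next")
t.write("B.next", "D"); t.write("A.next", "B")            # insert B
u.read("head.next"); u.read("A.next"); u.read("D.next")
u.write("D.next", "...")                                  # remove E ("..." = rest of the list)
ro.read("head.next"); ro.read("A.next")                   # contains D?

t_ok = t.commit()    # T commits first and overwrites A.next
u_ok = u.commit()    # U read A.next before T's write, so validation fails
ro_ok = ro.commit()  # read-only, commits under the assumption above
```

U's write to D.next does not interfere with T's insertion, yet U is rejected only because A.next sits in its read-set. This is the kind of abort the deck describes as more than needed.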
◮ without being overly conservative (e.g., precluding all concurrency)
T: write A.next = B; U: read A.next: D; RO: read A.next: D (RO -rw-> T, U -rw-> T)
◮ first build: X on the schedule; final build: the X is gone
Figure: transactions T1 and T2 over x and y (operations shown: read x; read y, write y; read x, write x), linked by an rw edge
◮ axes: time and serialization; T2 and T1 are placed on both
◮ final build labels: t=4, t=3, time-warp=2
◮ a triad
◮ the pivot
◮ Completes a triad
◮ Whose pivot time-warp commits
Figure: transactions A, T, and B, one of them labelled Pivot (operations shown: write y, read y, write x, read x, read z, write z; edge labels: rw, rw, wr)
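A triad can be checked mechanically once the rw (anti-dependency) edges are known. The sketch below treats a pivot as a transaction that is both the target and the source of an rw edge. The function names and the transactions P, Q, R are hypothetical, and the concurrency requirement between the transactions is left out for brevity.

```python
def find_pivots(rw_edges):
    """Transactions with both an incoming and an outgoing rw edge."""
    sources = {src for src, _ in rw_edges}
    targets = {dst for _, dst in rw_edges}
    return sources & targets


def find_triads(rw_edges):
    """All (before, pivot, after) chains formed by two consecutive rw edges."""
    return sorted(
        (a, p, b)
        for a, p in rw_edges
        for q, b in rw_edges
        if p == q
    )


# Linked-list schedule from earlier in the deck: RO -rw-> T and U -rw-> T.
list_edges = [("RO", "T"), ("U", "T")]

# Hypothetical chain of two rw edges through Q.
chain_edges = [("P", "Q"), ("Q", "R")]

list_pivots = find_pivots(list_edges)    # T is only a target, so no pivot
chain_pivots = find_pivots(chain_edges)  # Q is both target and source
chain_triads = find_triads(chain_edges)
```

In the linked-list schedule no transaction is a pivot, which is consistent with U not needing to abort there.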
◮ Detect if some concurrent T′ read k
◮ If so, T′ witnessed that T did not exist
◮ We forbid T from time-warping
◮ Detect if some concurrent T′ committed a new version to k
◮ If so, T must time-warp
◮ Know that some transaction read, not which
◮ Write transactions amortize the cost during read validation
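The two checks above can be combined into a small commit-time decision. This is a sketch under stated assumptions, not TWM's actual implementation: the first check is applied to the items T writes and the second to the items T reads, and a transaction that must time-warp while being forbidden to do so is aborted. The names `decide`, `Outcome`, `read_by_concurrent`, and `overwritten_by_concurrent` are hypothetical. Following the bullet "some transaction read, not which", readers are tracked as a plain set of items, with no reader identities.

```python
from enum import Enum


class Outcome(Enum):
    COMMIT = "commit"        # ordinary commit
    TIME_WARP = "time-warp"  # commit serialized before a concurrent writer
    ABORT = "abort"


def decide(write_set, read_set, read_by_concurrent, overwritten_by_concurrent):
    # Some concurrent T' read an item T writes: T' witnessed that T did not
    # exist, so T may not time-warp.
    forbidden = any(k in read_by_concurrent for k in write_set)
    # Some concurrent T' committed a new version of an item T read: T missed
    # that write, so T must time-warp.
    must_warp = any(k in overwritten_by_concurrent for k in read_set)
    if must_warp and forbidden:
        return Outcome.ABORT  # assumption: conflicting requirements abort T
    return Outcome.TIME_WARP if must_warp else Outcome.COMMIT


# Linked-list schedule: T commits first, then U tries to commit.
t_outcome = decide(
    write_set={"A.next", "B.next"},
    read_set={"head.next", "A.next"},
    read_by_concurrent={"head.next", "A.next", "D.next"},  # reads by U and RO
    overwritten_by_concurrent=set(),
)
u_outcome = decide(
    write_set={"D.next"},
    read_set={"head.next", "A.next", "D.next"},
    read_by_concurrent={"head.next", "A.next"},            # reads by T and RO
    overwritten_by_concurrent={"A.next", "B.next"},        # T's committed writes
)
# Hypothetical transaction that writes x (read by a concurrent transaction)
# and read y (overwritten by a concurrent commit).
pivot_outcome = decide({"x"}, {"y"}, {"x"}, {"y"})
```

Under these assumptions U time-warps instead of aborting, while the hypothetical third transaction is the only one rejected.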
Figure: throughput (1000 * txs/s) vs. threads (1, 4, 8, 16, 32, 64) for JVSTM, TL2, NOrec, AVSTM, TWM
Figure: throughput (1000 * txs/s) vs. threads (1, 4, 8, 16, 32, 64) for JVSTM, TL2, NOrec, AVSTM, TWM
◮ 100 thousand elements
◮ 25% modifications
◮ 1.8× speedup over AVSTM
Figure: throughput (1000 * txs/s) vs. threads (1, 4, 8) for JVSTM, TL2, NOrec, AVSTM, TWM
◮ Overhead of instrumentation
◮ Time-warp benefits from concurrency
◮ AVSTM lags behind
Figure: aborted txs (%) vs. threads (1, 4, 8, 16, 32, 64) for JVSTM, TL2, NOrec, AVSTM, TWM
◮ Multi-version is not enough
◮ Time-Warp is similar to AVSTM
Figure: throughput (1000 * txs/s) vs. threads (1, 4, 8, 16, 32, 64) for JVSTM, TL2, NOrec, AVSTM, TWM
◮ Conflict-free workload
◮ Unveil overheads and contention points
Figure: time (microseconds) spent in read, commit, readSet-val, and writeSet-val vs. threads (1, 4, 8, 16, 32, 64) for TL2, JVSTM, TWM, AVSTM, NOrec
◮ Increased parallelism bottlenecks on NOrec commit
◮ TWM and AVSTM have most overheads
◮ TWM remains close to JVSTM, both lock-free and multi-versioned
Figure: speedup of TWM relative to each STM (JVSTM, TL2, NOrec, AVSTM) at 1, 4, 8, 16, 32, and 64 threads
◮ Always better than AVSTM
◮ But difference is smaller with more threads
◮ Takes some concurrency to improve
Figures (two slides): speedup vs. threads (1, 4, 8, 16, 32, 64) for JVSTM, TL2, NOrec, AVSTM, TWM
Benchmark
STM     genome  intruder  kmeans-l  kmeans-h  labyrinth  ssca2  vac-l  vac-h
TWM        3.8       3.8       1.4       4.2        8.8   10.5    6.4   17.8
JVSTM     15.4       3.2       1.6       4.9       12.3   11.3   12.1   41.1
TL2       12.1       4.8       3.8       3.4       13.8   11.7   10.0   41.4
NOrec     21.1       6.0       3.8       6.4       27.6   14.9   19.9   55.0
AVSTM     13.0       3.5       2.6       4.8       10.4   11.5    9.4   18.9
Threads
STM       4     8    16    32    64
TWM     1.2   4.4   6.6   9.9  15.7
JVSTM   1.8   7.0  10.2  15.7  21.2
TL2     2.6   6.5  11.4  16.1  20.9
NOrec   3.4   9.6  18.6  24.9  34.0
AVSTM   2.5   5.5   8.6  12.7  17.6