
SLIDE 1

iThreads: A Threading Library for Parallel Incremental Computation

Paper Reading Group

Pramod Bhatotia, Pedro Fonseca, Umut A. Acar, Björn B. Brandenburg, Rodrigo Rodrigues

Presented by: Maksym Planeta, 09.07.2015

SLIDE 2

Table of Contents

- Motivation
- Details
- Evaluation
- Conclusion

SLIDE 3

Goals

Make incremental computations easy to use:

- Convenient for user
- Legacy support
- Language independent
- No programmer intervention
- Multithreaded environment
- Use existing OS facilities
- Generic program model
- Low overhead

SLIDE 4

Workflow

1. Initial run
2. Build Concurrent Dynamic Dependence Graph (CDDG)
3. Specify input changes
4. Incremental run uses change propagation
5. Update CDDG

SLIDE 5

Workflow

1. Initial run
2. Build Concurrent Dynamic Dependence Graph (CDDG)
3. Specify input changes
4. Incremental run uses change propagation
5. Update CDDG

    $ LD_PRELOAD=iThreads.so                  // preload iThreads
    $ ./<program executable> <input-file>     // initial run
    $ emacs <input-file>                      // input modified
    $ echo "<off> <len>" >> changes.txt       // specify changes
    $ ./<program executable> <input-file>     // incremental run

Figure 1. How to run an executable using iThreads
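As a rough illustration of step 3, the sketch below parses `<off> <len>` entries like those appended to changes.txt and maps them to the pages they touch. It assumes the offsets are byte offsets into the input file and that pages are 4096 bytes; the slides state neither, and the function names are hypothetical, not part of iThreads.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; the slides do not state one


def parse_changes(text):
    """Parse lines of the form "<off> <len>" into (offset, length) pairs."""
    changes = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        off, length = line.split()
        changes.append((int(off), int(length)))
    return changes


def dirty_pages(changes, page_size=PAGE_SIZE):
    """Return the sorted indices of pages touched by the changed byte ranges."""
    pages = set()
    for off, length in changes:
        if length <= 0:
            continue  # an empty range touches no page
        first = off // page_size
        last = (off + length - 1) // page_size
        pages.update(range(first, last + 1))
    return sorted(pages)


if __name__ == "__main__":
    sample = "100 8\n4090 10\n"
    print(dirty_pages(parse_changes(sample)))  # prints [0, 1]
```

The second range (bytes 4090 to 4099) straddles a page boundary, so it marks two pages dirty even though only ten bytes changed.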


SLIDE 6

Table of Contents

- Motivation
- Details
- Evaluation
- Conclusion

SLIDE 7

System model

- Memory model
  - Release consistency
- Synchronization model
  - pthreads API
- Deterministic behavior

SLIDE 8

Thunk

- Unit of sequential execution
- Surrounded by synchronization operations
- State
- Read and write sets
- Causally ordered (vector clocks)
- Thunk recomputed ⇒ all subsequent thunks in the same thread are recomputed

[State diagram. Unresolved states: Pending, Enabled, Invalid. Resolved states: Resolved valid (reused and applied memoized effects), Resolved invalid (re-executed and modified dirty set). Transitions are numbered 1 to 5.]

Figure 4. State transition for thunks during incremental run

SLIDE 9

Example

                                            Sub-computations
Case  Input      Thread schedule            Reused             Recomputed
A     x, y*, z   T1.a → T2.a → T2.b         T2.a               T1.a, T2.b
B     x, y, z    (T2.a → T2.b → T1.a)*      T2.a               T1.a, T2.b
C     x, y, z    T1.a → T2.a → T2.b         T1.a, T1.b, T2.a   -

Figure 3. For the incremental run, some cases with changed input or thread schedule (changes are marked with *)

Thread 1 (T1)                        Thread 2 (T2)

/* T1.a */
lock();          read={y}
z = ++y;         write={y, z}
unlock();
                          ↘
                                     /* T2.a */
                                     lock();
                                     x++;             read={x}
                                     unlock();        write={x}
                                          ↓
                                     /* T2.b */
                                     lock();
                                     y = 2*x + z;     read={x, z}
                                     unlock();        write={y}

Figure 2. An example of shared-memory multithreading
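To make the reuse decision concrete, the following sketch replays the three thunks of Figure 2 in the recorded schedule and decides for each one whether it can be reused. It is a simplified model, not the iThreads algorithm. It assumes a thunk must be recomputed when its read set intersects the dirty set, that a recomputed thunk marks its whole write set dirty, and that once a thunk is recomputed every later thunk of the same thread is recomputed too. The names `Thunk` and `propagate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thunk:
    name: str
    thread: str
    reads: frozenset
    writes: frozenset


# Thunks of Figure 2, listed in the recorded schedule T1.a -> T2.a -> T2.b.
SCHEDULE = [
    Thunk("T1.a", "T1", frozenset({"y"}), frozenset({"y", "z"})),
    Thunk("T2.a", "T2", frozenset({"x"}), frozenset({"x"})),
    Thunk("T2.b", "T2", frozenset({"x", "z"}), frozenset({"y"})),
]


def propagate(schedule, changed_inputs):
    """Return (reused, recomputed) thunk names for one incremental run."""
    dirty = set(changed_inputs)   # locations whose values may have changed
    invalid_threads = set()       # threads that already recomputed a thunk
    reused, recomputed = [], []
    for thunk in schedule:
        if thunk.thread in invalid_threads or thunk.reads & dirty:
            recomputed.append(thunk.name)
            dirty |= thunk.writes            # conservative: all writes are dirty
            invalid_threads.add(thunk.thread)
        else:
            reused.append(thunk.name)        # memoized effects would be applied
    return reused, recomputed


if __name__ == "__main__":
    # Case A of Figure 3: only y changed.
    print(propagate(SCHEDULE, {"y"}))  # (['T2.a'], ['T1.a', 'T2.b'])
```

With y changed, T1.a is re-executed and dirties z, which in turn invalidates T2.b, while T2.a only reads x and is reused. This matches case A of Figure 3. Case B (a changed schedule) is outside what this sketch models.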


SLIDE 10

Architecture

[Diagram labels: Application; iThreads library; Recorder / Replayer; Memoizer; CDDG; OS; Memory subsystem; OS support.]

Figure 5. iThreads implementation architecture. Shaded boxes represent the main components of the system.

SLIDE 11

Implementation

- Dthreads
- Separate address spaces for threads
- Page read/write protection
- Byte-level delta

[Diagram labels: shared address space; Thread-1 private address space; Thread-2 private address space; thunk execution; write; sync; shared memory commit.]

Figure 6. Overview of the RC model implementation
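The "byte-level delta" bullet can be pictured with a small sketch. Each thread works on a private copy of a page, and at a synchronization point only the bytes that differ from the snapshot taken at the start of the thunk are committed to the shared copy. The snapshot is called a "twin" here, a term borrowed from software distributed shared memory and used as an assumption; the function names are hypothetical, and the real system works on protected memory pages rather than Python byte arrays.

```python
def byte_delta(twin, working):
    """Return {offset: new_byte} for every byte that differs from the twin."""
    if len(twin) != len(working):
        raise ValueError("twin and working copy must have the same length")
    return {i: b for i, (a, b) in enumerate(zip(twin, working)) if a != b}


def commit(shared, delta):
    """Apply a byte-level delta to the shared page in place."""
    for offset, value in delta.items():
        shared[offset] = value


if __name__ == "__main__":
    shared = bytearray(8)          # one tiny "page" of eight zero bytes
    twin = bytes(shared)           # snapshot taken when the thunks start

    t1 = bytearray(twin)           # Thread-1 private copy
    t1[0] = 1
    t2 = bytearray(twin)           # Thread-2 private copy
    t2[5] = 7

    # At each sync point, commit only the bytes the thread actually changed.
    commit(shared, byte_delta(twin, t1))
    commit(shared, byte_delta(twin, t2))
    print(list(shared))            # [1, 0, 0, 0, 0, 7, 0, 0]
```

Because only changed bytes are written back, two threads that modify different bytes of the same page do not overwrite each other's updates.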

SLIDE 12

Table of Contents

- Motivation
- Details
- Evaluation
- Conclusion

SLIDE 13

Metrics

Time: runtime of the slowest thread
Work: sum of the runtimes of all threads
Benchmarks: PARSEC and Phoenix
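The two metrics can be written down directly. The sketch below computes them from per-thread runtimes and derives a speedup as baseline divided by incremental run. The runtimes are made-up illustrative numbers, not measurements from the paper, and the function names are hypothetical.

```python
def time_metric(thread_runtimes):
    """Time: runtime of the slowest thread."""
    return max(thread_runtimes)


def work_metric(thread_runtimes):
    """Work: sum of the runtimes of all threads."""
    return sum(thread_runtimes)


def speedup(baseline, incremental):
    """Ratio of a baseline metric to the same metric for the incremental run."""
    return baseline / incremental


if __name__ == "__main__":
    baseline = [4.0, 4.0, 4.0, 4.0]      # illustrative per-thread seconds
    incremental = [0.5, 2.0, 0.5, 1.0]   # illustrative per-thread seconds

    print(speedup(time_metric(baseline), time_metric(incremental)))  # 2.0
    print(speedup(work_metric(baseline), work_metric(incremental)))  # 4.0
```

The example shows why the two speedups can differ: one slow thread bounds the time speedup even when most threads reuse nearly all of their work.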

SLIDE 14

[Plots: work speedup and time speedup (log scale) for Histogram, Linear_reg, Kmeans, Matrix_mul, Swaptions, Blackscholes, String_match, PCA, Canneal, Word_count, and Reverse_index with 12, 24, 48, and 64 threads.]

Figure 7. Performance gains of iThreads with respect to pthreads for the incremental run

SLIDE 15

Single modified page

[Plots: work speedup and time speedup (log scale) per benchmark with 12, 24, 48, and 64 threads.]

Figure 8. Performance gains of iThreads with respect to Dthreads for the incremental run

SLIDE 16

Single modified page, different input sizes

[Plots: work speedup and time speedup against normalized input size (S, M, L) for Histogram, Linear-reg., and String-match.]

Figure 9. Scalability with data (work and time speedups)

SLIDE 17

Single modified page, different work amount

[Plot: normalized total work against normalized computation size (1X, 2X, 4X, 8X, 16X) for pthreads and iThreads on Blackscholes and Swaptions.]

Figure 10. Scalability with work

SLIDE 18

Several modified pages

[Plots: work speedup and time speedup (log scale) per benchmark for 2, 4, 8, 16, 32, and 64 dirty pages.]

Figure 11. Scalability with input change compared to pthreads for 64 threads

SLIDE 19

Overhead of iThreads system data

Application    Input size   Memoized state        CDDG
               (pages)      (pages, % of input)   (pages, % of input)
Histogram      230400       347 (0.15%)           57 (0.02%)
Linear-reg.    132436       192 (0.14%)           33 (0.02%)
Kmeans         586          1145 (195.39%)        27 (4.61%)
Matrix-mul.    41609        4162 (10.00%)         64 (0.15%)
Swaptions      143          1473 (1030.07%)       1 (0.70%)
Blackscholes   155          201 (129.68%)         1 (0.65%)
String match   132436       128 (0.10%)           33 (0.02%)
PCA            140625       3777 (2.69%)          43 (0.03%)
Canneal        9            15381 (170900.00%)    4 (44.44%)
Word count     12811        10191 (79.55%)        24 (0.19%)
Rev-index      359          260679 (72612.53%)    64 (17.83%)

Table 1. Space overheads in pages and input percentage
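The percentages in Table 1 appear to be the page counts expressed relative to the input size. The short check below recomputes two rows under that assumption; the helper name is hypothetical.

```python
def percent_of_input(pages, input_pages):
    """Express a page count as a percentage of the input size in pages."""
    return round(100.0 * pages / input_pages, 2)


if __name__ == "__main__":
    # (input size, memoized state, CDDG), all in pages, copied from Table 1.
    rows = {
        "Kmeans": (586, 1145, 27),
        "Canneal": (9, 15381, 4),
    }
    for name, (input_pages, memo, cddg) in rows.items():
        print(name,
              percent_of_input(memo, input_pages),
              percent_of_input(cddg, input_pages))
```

For applications with very small inputs, such as Canneal with 9 input pages, even a modest amount of memoized state becomes a huge percentage, which is why the relative numbers span several orders of magnitude.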


SLIDE 20

Initial run overhead

[Plots: work overhead and time overhead (log scale) per benchmark with 12, 24, 48, and 64 threads.]

Figure 12. Performance overheads of iThreads with respect to pthreads for the initial run

SLIDE 21

Initial run overhead

[Plots: work overhead and time overhead (axis range 1 to 1.3) per benchmark with 12, 24, 48, and 64 threads. Histogram exceeds the axis: 3.58X maximum work overhead and 3.13X maximum time overhead.]

Figure 13. Performance overheads of iThreads with respect to Dthreads for the initial run

SLIDE 22

Case-study applications

[Plot: work and time speedup (log scale) for Parallel Gzip and Monte-carlo with 12, 24, 48, and 64 threads.]

Figure 15. Work & time speedups for case-studies

SLIDE 23

Table of Contents

- Motivation
- Details
- Evaluation
- Conclusion

SLIDE 24

Limitations

- No support for ad-hoc synchronization
  - No C++ atomics
- No support for small localized insertions
- Assumes a constant number of threads
- May have significant overhead
- Narrow application area

SLIDE 25

Outcome

- Nice idea
- Practical
- Transparent
- Efficient
- Works for some applications
- May significantly decrease required work

SLIDE 26

Discussion

- Units for scales are not specified: sometimes percentages, sometimes multiplicative factors (times)
- Interactive applications
- A vector clock for each thunk: is that not too much?
- I/O memory: can something be done about it, for instance for a frame buffer?
- Can it be combined with dynamic algorithms?

SLIDE 27

Explanation of Dthreads high overhead

[Plot: percentage breakdown of work overhead into read fault and memoization, together with total work overhead (axis range 1 to 1.3), per benchmark. Histogram exceeds the axis at 3.58X.]

Figure 14. Work overheads breakdown w.r.t. Dthreads

SLIDE 28

Release consistency

- Objects are acquired and released
- Critical section between acquire and release
- Guaranteed correctness and liveness for data-race-free programs

SLIDE 29

Vector clocks

- Used for invalidation propagation
- Maintained for:
  - Objects
  - Threads
  - Thunks
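A minimal sketch of the vector-clock operations, assuming the standard formulation: one counter per thread, a tick on the owning thread's component, and a component-wise maximum when a thread acquires an object that another thread released. The function names are hypothetical, and the way iThreads stores clocks for objects, threads, and thunks is not shown in the slides.

```python
def vc_tick(clock, thread_index):
    """Return a copy of `clock` with the given thread's component advanced."""
    ticked = list(clock)
    ticked[thread_index] += 1
    return tuple(ticked)


def vc_merge(a, b):
    """Component-wise maximum, used when acquiring a released object."""
    return tuple(max(x, y) for x, y in zip(a, b))


def happens_before(a, b):
    """True if clock `a` is causally before clock `b`."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a, b):
    """True if neither clock is causally before the other."""
    return a != b and not happens_before(a, b) and not happens_before(b, a)


if __name__ == "__main__":
    # Two threads (index 0 = T1, index 1 = T2) and one lock, as in Figure 2.
    t1_a = vc_tick((0, 0), 0)                    # T1.a runs: (1, 0)
    lock = t1_a                                  # T1 releases the lock
    t2_a = vc_tick(vc_merge((0, 0), lock), 1)    # T2.a acquires it: (1, 1)
    lock = t2_a                                  # T2 releases the lock
    t2_b = vc_tick(vc_merge(t2_a, lock), 1)      # T2.b acquires it: (1, 2)

    print(happens_before(t1_a, t2_a))   # True
    print(happens_before(t2_a, t2_b))   # True
    print(concurrent((2, 0), (1, 1)))   # True
```

Comparing thunk clocks this way gives the causal order along which an invalidation has to be propagated: only thunks that happen after a recomputed thunk can be affected by it.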
