
SLIDE 1

AutoSynch

An Automatic-Signal Monitor Based on Predicate Tagging

Wei-Lun Hung

wlhung@utexas.edu

Vijay K. Garg

garg@ece.utexas.edu

Parallel and Distributed Systems Laboratory
Department of Electrical & Computer Engineering

PLDI 2013

SLIDE 2

Outline

1. Introduction
2. Our Approach
   Evaluate Predicate: Closure
   Avoid signalAll Calls: Relay Signaling Rule
   Reduce Predicate Evaluations: Predicate Tagging
3. Results
4. Conclusions and Future Work

SLIDE 3

Bounded Buffer

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class BoundedBuffer {
    Object[] buff;
    int putPtr, takePtr, count;
    // for mutual exclusion and synchronization
    Lock mutex = new ReentrantLock();
    Condition full = mutex.newCondition();
    Condition empty = mutex.newCondition();

    public BoundedBuffer(int n) {
        buff = new Object[n];
        putPtr = takePtr = count = 0;
    }
}

[Figure: circular buffer with the takePtr and putPtr positions marked; count = 4]

SLIDE 4

Bounded Buffer

public Object take() throws InterruptedException {
    // lock before operations
    mutex.lock();
    // wait if the buffer is empty
    while (count == 0) {
        empty.await();
    }
    Object ret = buff[takePtr++];
    takePtr %= buff.length;
    count--;
    // signal other threads when the buffer
    // is no longer full
    if (count == buff.length - 1) {
        full.signalAll();
    }
    // unlock after operations
    mutex.unlock();
    return ret;
}


SLIDE 6

Bounded Buffer: Common Bugs

public Object take() throws InterruptedException {
    mutex.lock();
    if (count == 0) {
        empty.await();
    }
    Object ret = buff[takePtr++];
    takePtr %= buff.length;
    count--;
    if (count == buff.length - 1) {
        full.signal();
    }
    mutex.unlock();
    return ret;
}

SLIDE 7

Bounded Buffer: Common Bugs

public Object take() throws InterruptedException {
    mutex.lock();
    // fix: 'while' replaces 'if', so a woken thread re-checks the condition
    while (count == 0) {
        empty.await();
    }
    Object ret = buff[takePtr++];
    takePtr %= buff.length;
    count--;
    if (count == buff.length - 1) {
        full.signal();
    }
    mutex.unlock();
    return ret;
}

SLIDE 8

Bounded Buffer: Common Bugs

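public Object take() throws InterruptedException {
    mutex.lock();
    while (count == 0) {
        empty.await();
    }
    Object ret = buff[takePtr++];
    takePtr %= buff.length;
    count--;
    if (count == buff.length - 1) {
        // fix: signalAll() replaces signal(); the signal fires only on the
        // full-to-not-full transition, so every waiting producer must be woken
        full.signalAll();
    }
    mutex.unlock();
    return ret;
}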
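The two fixes above can be put together in a runnable form. The following is a minimal sketch, not the authors' code: the class names `ExplicitBoundedBuffer` and `ExplicitBoundedBufferDemo` are hypothetical, `put` is an assumed mirror image of `take` (the slides show only `take`), and the `try`/`finally` around the critical section is an addition so the lock is released if `await()` is interrupted.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical class name; fields and condition names follow the slides.
class ExplicitBoundedBuffer {
    private final Object[] buff;
    private int putPtr, takePtr, count;
    private final Lock mutex = new ReentrantLock();
    private final Condition full = mutex.newCondition();  // producers wait while the buffer is full
    private final Condition empty = mutex.newCondition(); // consumers wait while the buffer is empty

    ExplicitBoundedBuffer(int n) {
        buff = new Object[n];
    }

    // Assumed counterpart of take(); not shown on the slides.
    void put(Object item) throws InterruptedException {
        mutex.lock();
        try {
            while (count == buff.length) { // 'while', not 'if': re-check after every wake-up
                full.await();
            }
            buff[putPtr++] = item;
            putPtr %= buff.length;
            count++;
            if (count == 1) {              // buffer is no longer empty
                empty.signalAll();         // 'signalAll', not 'signal'
            }
        } finally {
            mutex.unlock();                // released even if await() is interrupted
        }
    }

    Object take() throws InterruptedException {
        mutex.lock();
        try {
            while (count == 0) {
                empty.await();
            }
            Object ret = buff[takePtr++];
            takePtr %= buff.length;
            count--;
            if (count == buff.length - 1) { // buffer is no longer full
                full.signalAll();
            }
            return ret;
        } finally {
            mutex.unlock();
        }
    }
}

class ExplicitBoundedBufferDemo {
    // Sends 1..n through a capacity-2 buffer from a producer thread and sums what arrives.
    static int sumThroughBuffer(int n) {
        ExplicitBoundedBuffer b = new ExplicitBoundedBuffer(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) b.put(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < n; i++) sum += (Integer) b.take();
            producer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return sum;
    }
}
```

Even in this small example the programmer has to pick the right condition variable, the right loop construct, and the right signal call in two places, which is the burden the deck argues against.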

SLIDE 9

AutoSynch Bounded Buffer

public AutoSynch class BoundedBuffer {
    Object[] buff;
    int putPtr, takePtr, count;

    public BoundedBuffer(int n) {
        buff = new Object[n];
        putPtr = takePtr = count = 0;
    }
}

SLIDE 10

AutoSynch Bounded Buffer

public Object take() {
    waituntil (count > 0);
    Object ret = buff[takePtr++];
    takePtr %= buff.length;
    count--;
    return ret;
}
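To make the meaning of `waituntil` concrete, here is a minimal plain-Java sketch of the simplest possible automatic-signal scheme: one condition variable, with every waiter woken on each monitor exit. This is not how the AutoSynch preprocessor translates `waituntil`; it corresponds to the naive one-condition-variable approach that the evaluation later calls Baseline. The names `NaiveAutoMonitor`, `NaiveBoundedBuffer`, `enter`, `exit`, and `waitUntil` are hypothetical.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BooleanSupplier;

// Hypothetical helper: automatic signaling with a single condition variable.
class NaiveAutoMonitor {
    private final ReentrantLock mutex = new ReentrantLock();
    private final Condition changed = mutex.newCondition();

    void enter() {
        mutex.lock();
    }

    // Wake every waiter so each can re-evaluate its own predicate, then release the lock.
    void exit() {
        changed.signalAll();
        mutex.unlock();
    }

    // Plain-Java stand-in for 'waituntil (predicate)'; call only between enter() and exit().
    void waitUntil(BooleanSupplier predicate) throws InterruptedException {
        while (!predicate.getAsBoolean()) {
            changed.await();
        }
    }
}

class NaiveBoundedBuffer {
    private final NaiveAutoMonitor monitor = new NaiveAutoMonitor();
    private final Object[] buff;
    private int putPtr, takePtr, count;

    NaiveBoundedBuffer(int n) {
        buff = new Object[n];
    }

    // Assumed counterpart of take(); the slides show only take().
    void put(Object item) throws InterruptedException {
        monitor.enter();
        try {
            monitor.waitUntil(() -> count < buff.length);
            buff[putPtr++] = item;
            putPtr %= buff.length;
            count++;
        } finally {
            monitor.exit();
        }
    }

    Object take() throws InterruptedException {
        monitor.enter();
        try {
            monitor.waitUntil(() -> count > 0);
            Object ret = buff[takePtr++];
            takePtr %= buff.length;
            count--;
            return ret;
        } finally {
            monitor.exit();
        }
    }
}

class NaiveBoundedBufferDemo {
    // Sends 1..n through a capacity-2 buffer from a producer thread and sums what arrives.
    static int sumThroughBuffer(int n) {
        NaiveBoundedBuffer b = new NaiveBoundedBuffer(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) b.put(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < n; i++) sum += (Integer) b.take();
            producer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return sum;
    }
}
```

The user code no longer mentions condition variables or signals, but every exit wakes every waiter, which is exactly the cost in context switches and predicate evaluations that the rest of the talk sets out to reduce.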

SLIDE 11

AutoSynch vs. Explicit Signaling

AutoSynch:

public AutoSynch class BoundedBuffer {
    Object[] buff;
    int putPtr, takePtr, count;

    public BoundedBuffer(int n) {
        buff = new Object[n];
        putPtr = takePtr = count = 0;
    }

    public Object take() {
        waituntil (count > 0);
        Object ret = buff[takePtr++];
        takePtr %= buff.length;
        count--;
        return ret;
    }
}

Explicit signaling:

public class BoundedBuffer {
    Object[] buff;
    int putPtr, takePtr, count;
    Lock mutex = new ReentrantLock();
    Condition full = mutex.newCondition();
    Condition empty = mutex.newCondition();

    public BoundedBuffer(int n) {
        buff = new Object[n];
        putPtr = takePtr = count = 0;
    }

    public Object take() throws InterruptedException {
        mutex.lock();
        while (count == 0) {
            empty.await();
        }
        Object ret = buff[takePtr++];
        takePtr %= buff.length;
        count--;
        if (count == buff.length - 1) {
            full.signalAll();
        }
        mutex.unlock();
        return ret;
    }
}

SLIDE 12

Related Work

The idea of automatic signaling was suggested by Hoare in [Hoa74], but rejected due to efficiency considerations.

The common belief: automatic signaling is extremely inefficient compared to explicit signaling [BFC95][BH05].

SLIDE 13

Design Principles

Reduce the number of context switches and predicate evaluations.

Predicate evaluation: essential in automatic signaling to decide which thread should be signaled.

Context switches: avoid signalAll calls. A signalAll call introduces redundant context switches, yet it is required in explicit signaling.

SLIDE 14

Parameterized Bounded Buffer

[Figure: monitor state for the parameterized bounded buffer. Shared variables: buff.length = 64, count = 24. Consumers wait on waituntil (count >= num_i); producers wait on waituntil (items_i.length + count <= buff.length). Local variables: num_1 = 32, num_2 = 40, items_0.length = 18. Threads shown: C1, C2, P0, C3, C4. Legend: local variable, shared variable, condition, waiting queue.]

A producer puts a batch of items into the buffer.
A consumer takes a number of items out of the buffer.

SLIDE 15

Parameterized Bounded Buffer

public Object[] take(int num) throws InterruptedException {
    mutex.lock();
    while (count < num) {
        insufficientItem.await();
    }
    Object[] ret = new Object[num];
    for (int i = 0; i < num; i++) {
        ret[i] = buff[takePtr++];
        takePtr %= buff.length;
    }
    count -= num;
    insufficientSpace.signalAll();
    mutex.unlock();
    return ret;
}
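For completeness, a runnable sketch of the whole explicit-signal parameterized buffer follows. The slides show only `take(int)`; the `put(Object[])` method here is an assumed mirror image, and the class names `ParamBoundedBuffer` and `ParamBoundedBufferDemo` are hypothetical. The `try`/`finally` blocks are an addition. Note that a `put` of more items than the buffer can ever hold would block forever in this sketch.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical class name; condition names follow the slide.
class ParamBoundedBuffer {
    private final Object[] buff;
    private int putPtr, takePtr, count;
    private final Lock mutex = new ReentrantLock();
    private final Condition insufficientSpace = mutex.newCondition(); // producers wait here
    private final Condition insufficientItem = mutex.newCondition();  // consumers wait here

    ParamBoundedBuffer(int n) {
        buff = new Object[n];
    }

    // Assumed counterpart of take(int); not shown on the slides.
    void put(Object[] items) throws InterruptedException {
        mutex.lock();
        try {
            while (items.length + count > buff.length) {
                insufficientSpace.await();
            }
            for (Object item : items) {
                buff[putPtr++] = item;
                putPtr %= buff.length;
            }
            count += items.length;
            // Each consumer needs a different amount, so the producer cannot
            // know which one to wake: it has to wake them all.
            insufficientItem.signalAll();
        } finally {
            mutex.unlock();
        }
    }

    Object[] take(int num) throws InterruptedException {
        mutex.lock();
        try {
            while (count < num) {
                insufficientItem.await();
            }
            Object[] ret = new Object[num];
            for (int i = 0; i < num; i++) {
                ret[i] = buff[takePtr++];
                takePtr %= buff.length;
            }
            count -= num;
            insufficientSpace.signalAll();
            return ret;
        } finally {
            mutex.unlock();
        }
    }
}

class ParamBoundedBufferDemo {
    // A consumer asks for 3 items while the producer delivers them in batches of 2 and 1.
    static Object[] takeThreeInTwoBatches() {
        ParamBoundedBuffer b = new ParamBoundedBuffer(4);
        Object[][] result = new Object[1][];
        Thread consumer = new Thread(() -> {
            try {
                result[0] = b.take(3);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            b.put(new Object[] {"a", "b"});
            b.put(new Object[] {"c"});
            consumer.join(); // join() makes the consumer's write to result[0] visible
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }
}
```

The `signalAll` calls are unavoidable here because the waiting condition depends on each thread's local `num` or `items.length`, which a signaling thread cannot see in the explicit-signal style.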


SLIDE 18

AutoSynch Framework

[Figure: AutoSynch toolchain. AutoSynch Code -> AutoSynch Preprocessor -> Java Code -> Standard Java Compiler -> Java Bytecode. The AutoSynch Java Library is the remaining component shown in the diagram.]


SLIDE 20

Closure

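[Figure: same monitor state as on the Parameterized Bounded Buffer slide. Shared variables: buff.length = 64, count = 24. Generic conditions: waituntil (count >= num_i) and waituntil (items_i.length + count <= buff.length). Local variables: num_1 = 32, num_2 = 40, items_0.length = 18. Threads: C1, C2, P0, C3, C4.]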

SLIDE 21

Closure

[Figure: buff.length = 64, count = 24, items_0.length = 18. Threads: C1, C2, P0, C3, C4. The waiting conditions are written per thread: waituntil (count >= num1) and waituntil (count >= num2).]

Replace local variables with their values at runtime.

SLIDE 22

Closure

[Figure: buff.length = 64, count = 24, items_0.length = 18. Threads: C1, C2, P0, C3, C4. The waiting conditions after substitution: waituntil (count >= 32) and waituntil (count >= 40).]

Replace local variables with their values at runtime.

SLIDE 23

Closure

[Figure: P0 puts 18 items, so count = 42 with buff.length = 64. Threads: C1, C2, P0, C3, C4. Waiting conditions: waituntil (count >= 32) and waituntil (count >= 40).]

The thread (P0) owning the monitor performs predicate evaluations for other threads (C1 and C2) before leaving.
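The closure idea can be illustrated in a few lines of Java. This is a sketch under stated assumptions, not the AutoSynch library's representation: the class `ClosureSketch` and the methods `atLeast` and `satisfied` are hypothetical, and the shared state is reduced to the single integer `count`. Each waiter's local `num` is frozen into a predicate object when the thread starts waiting, so any other thread holding the monitor can evaluate it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical illustration of predicate closures.
class ClosureSketch {
    // Closure for 'waituntil (count >= num)': the local value num is fixed at
    // creation time, leaving only the shared variable count as an input.
    static IntPredicate atLeast(int num) {
        return count -> count >= num;
    }

    // Evaluates every waiter's closure against the current shared state, as the
    // monitor owner would before leaving; returns the indices of true predicates.
    static List<Integer> satisfied(List<IntPredicate> waiting, int count) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < waiting.size(); i++) {
            if (waiting.get(i).test(count)) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // C1 waits with num = 32, C2 with num = 40, as on the slides.
        List<IntPredicate> waiting = List.of(atLeast(32), atLeast(40));
        System.out.println(satisfied(waiting, 24)); // before P0 puts 18 items
        System.out.println(satisfied(waiting, 42)); // after P0 puts 18 items
    }
}
```

With count = 24 neither predicate holds; after P0's put raises count to 42, both do, and P0 can see that without waking either consumer first.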


SLIDE 25

Relay Signaling Rule

[Figure: buff.length = 64, count = 42. Threads: C1, C2, P0, C3, C4. Waiting conditions: waituntil (count >= 32) and waituntil (count >= 40).]


SLIDE 27

Relay Signaling Rule

[Figure: C1 takes 32 items, leaving count = 10 with buff.length = 64. Threads: C1, C2, C3, C4.]

SLIDE 28

Relay Signaling Rule

[Figure: buff.length = 64, count = 10. Threads: C2, C3, C4. C2's condition does not hold: count < 40.]

SLIDE 29

Relay Signaling Rule

[Figure: buff.length = 64, count = 10. Threads: C2, C3, C4. C2 is back on its condition waituntil (count >= 40).]

SLIDE 30

Relay Signaling Rule

Before exiting a monitor: signal at most one thread waiting on a condition that has become true.
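One way to realize this rule in plain Java is sketched below. It is an illustration under stated assumptions, not the AutoSynch library: the names `RelayMonitor`, `RelayDemo`, `enter`, `exit`, and `waitUntil` are hypothetical, each waiter gets its own `Condition`, and waiters are scanned linearly (predicate tagging, covered later in the deck, is what avoids that scan). A thread also relays before it blocks, so a state change it made is not lost.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BooleanSupplier;

// Hypothetical monitor applying the relay signaling rule.
class RelayMonitor {
    private static final class Waiter {
        final BooleanSupplier predicate; // closure over the waiter's local values
        final Condition condition;       // private condition variable for this waiter
        Waiter(BooleanSupplier predicate, Condition condition) {
            this.predicate = predicate;
            this.condition = condition;
        }
    }

    private final ReentrantLock mutex = new ReentrantLock();
    private final List<Waiter> waiters = new ArrayList<>();

    void enter() {
        mutex.lock();
    }

    void exit() {
        relaySignal();
        mutex.unlock();
    }

    // Stand-in for 'waituntil'; call only between enter() and exit().
    void waitUntil(BooleanSupplier predicate) throws InterruptedException {
        if (predicate.getAsBoolean()) {
            return;
        }
        Waiter w = new Waiter(predicate, mutex.newCondition());
        waiters.add(w);
        try {
            do {
                relaySignal();       // hand over to another thread before blocking
                w.condition.await();
            } while (!predicate.getAsBoolean());
        } finally {
            waiters.remove(w);
        }
    }

    // Signals at most one waiter whose predicate is currently true.
    private void relaySignal() {
        for (Waiter w : waiters) {
            if (w.predicate.getAsBoolean()) {
                w.condition.signal();
                return;
            }
        }
    }
}

class RelayDemo {
    // Consumers needing 32 and 40 items share a counter that a producer raises
    // by 18 four times (72 in total); returns the final value of the counter.
    static int run() {
        RelayMonitor m = new RelayMonitor();
        int[] count = {0}; // shared state, touched only inside the monitor
        List<Thread> threads = new ArrayList<>();
        for (int num : new int[] {32, 40}) {
            threads.add(new Thread(() -> {
                m.enter();
                try {
                    m.waitUntil(() -> count[0] >= num);
                    count[0] -= num;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    m.exit();
                }
            }));
        }
        threads.add(new Thread(() -> {
            for (int i = 0; i < 4; i++) {
                m.enter();
                try {
                    count[0] += 18;
                } finally {
                    m.exit();
                }
            }
        }));
        for (Thread t : threads) t.start();
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }
}
```

Progress is preserved because the signaled thread relays in turn when it exits or blocks again, so a true predicate is never left without an active or signaled thread.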



SLIDE 33

Predicate Tagging

Three types of predicates:

1. Equivalence predicate: x = 5, y = a
2. Threshold predicate: x > 8, y < b, z ≥ c + 3
3. None of the above: x ≠ 3, assertion.isTrue()

SLIDE 34

Predicate Tagging

Three types of tags: equivalence, threshold, and none.
Convert every predicate into disjunctive normal form (DNF).
Assign a tag to every conjunction.
Assignment order: equivalence > threshold > none.

Example: ((x < 5) ∧ (y = 3)) ∨ ((x > 5) ∧ foo1()) ∨ foo2()

  ((x < 5) ∧ (y = 3)): equivalence tag, from y = 3
  ((x > 5) ∧ foo1()): threshold tag, from x > 5
  foo2(): none tag
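The assignment order can be sketched as a small function. This is a simplified illustration, not the AutoSynch preprocessor: the class `TaggingSketch` is hypothetical, and each atomic predicate is reduced to its comparison operator as a string, with anything that is not a comparison (such as a method call) encoded as `"call"`.

```java
import java.util.List;

// Hypothetical illustration of tag assignment for one DNF conjunction.
class TaggingSketch {
    // Declared in priority order: equivalence > threshold > none.
    enum Tag { EQUIVALENCE, THRESHOLD, NONE }

    // Tag of one atomic predicate, decided by its operator (assumed encoding).
    static Tag tagOfAtom(String operator) {
        switch (operator) {
            case "==":
                return Tag.EQUIVALENCE;
            case "<":
            case "<=":
            case ">":
            case ">=":
                return Tag.THRESHOLD;
            default:
                return Tag.NONE; // "!=", method calls, and anything else
        }
    }

    // A conjunction takes the highest-priority tag found among its atoms.
    static Tag tagOfConjunction(List<String> operators) {
        Tag best = Tag.NONE;
        for (String op : operators) {
            Tag t = tagOfAtom(op);
            if (t.compareTo(best) < 0) { // lower ordinal means higher priority
                best = t;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(tagOfConjunction(List.of("<", "==")));   // (x < 5) ∧ (y = 3)
        System.out.println(tagOfConjunction(List.of(">", "call"))); // (x > 5) ∧ foo1()
        System.out.println(tagOfConjunction(List.of("call")));      // foo2()
    }
}
```

Tagging by the most selective atom is sound because a conjunction can only be true when the tagged atom is true, so the remaining atoms need to be evaluated only after the tag matches.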

SLIDE 35

Predicate Tagging

[Figure: ten waiting predicates over the shared expression x: x = 9, x ≤ 7, x = 8, x < 10, x > 6, x = 5, x > 11, x ≥ 6, x = 3, x ≤ 4.]

SLIDE 36

Predicate Tagging

[Figure: the same ten predicates. Hashtable for the equivalence tag with shared expression x, holding the keys 3, 5, 8, 9 for the predicates x = 3, x = 5, x = 8, x = 9.]
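A hashtable keyed by the constant makes the equivalence lookup a single probe. The sketch below assumes integer constants and string waiter ids; the class `EquivalenceIndex` and its methods are hypothetical names, not the AutoSynch library's API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical index for predicates of the form 'x == k' on one shared expression x.
class EquivalenceIndex {
    // key: the constant k; value: ids of the threads waiting on 'x == k'
    private final Map<Integer, List<String>> byValue = new HashMap<>();

    void addWaiter(int k, String waiterId) {
        byValue.computeIfAbsent(k, unused -> new ArrayList<>()).add(waiterId);
    }

    // Waiters whose predicate holds for the current value of x: one hash lookup,
    // regardless of how many other equivalence predicates are waiting.
    List<String> satisfiedBy(int x) {
        return byValue.getOrDefault(x, List.of());
    }

    public static void main(String[] args) {
        EquivalenceIndex index = new EquivalenceIndex();
        index.addWaiter(9, "T1");
        index.addWaiter(8, "T2");
        index.addWaiter(5, "T3");
        index.addWaiter(3, "T4");
        System.out.println(index.satisfiedBy(8)); // the waiter on x = 8
        System.out.println(index.satisfiedBy(4)); // no equivalence predicate matches
    }
}
```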

SLIDE 37

Predicate Tagging

[Figure: the same predicates and hashtable (keys 3, 5, 8, 9). With x = 8, the lookup on key 8 finds the waiting predicate x = 8.]


SLIDE 39

Predicate Tagging

[Figure: the same ten predicates. Min-heap for the threshold tag with shared expression x, holding (6, >), (11, >), (6, ≥) for the predicates x > 6, x > 11, x ≥ 6. Current value: x = 4.]

SLIDE 40

Predicate Tagging

[Figure: the same ten predicates and min-heap holding (6, >), (11, >), (6, ≥). Current value: x = 6.]


SLIDE 42

Predicate Tagging

[Figure: the same ten predicates. Max-heap for the threshold tag with shared expression x, holding (7, ≤), (4, ≤), (10, <) for the predicates x ≤ 7, x ≤ 4, x < 10.]
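The point of the two heaps is that only the root needs to be examined: if the weakest bound is false, every other bound in that heap is false too. The sketch below is one plausible reading of the slides, not the AutoSynch library: the class `ThresholdIndex`, its methods, and the tie-breaking rule (inclusive bound before exclusive bound for the same constant) are assumptions.

```java
import java.util.PriorityQueue;

// Hypothetical index for threshold predicates on one shared expression x.
class ThresholdIndex {
    // One waiting predicate: 'x > k' or 'x >= k' (lower bound),
    // or 'x < k' or 'x <= k' (upper bound).
    static final class Bound {
        final int k;
        final boolean inclusive;
        Bound(int k, boolean inclusive) {
            this.k = k;
            this.inclusive = inclusive;
        }
    }

    // Min-heap: smallest k first; on ties 'x >= k' comes before 'x > k'.
    private final PriorityQueue<Bound> lower = new PriorityQueue<>(
        (a, b) -> a.k != b.k ? Integer.compare(a.k, b.k)
                             : Boolean.compare(b.inclusive, a.inclusive));

    // Max-heap: largest k first; on ties 'x <= k' comes before 'x < k'.
    private final PriorityQueue<Bound> upper = new PriorityQueue<>(
        (a, b) -> a.k != b.k ? Integer.compare(b.k, a.k)
                             : Boolean.compare(b.inclusive, a.inclusive));

    void addLower(int k, boolean inclusive) {
        lower.add(new Bound(k, inclusive));
    }

    void addUpper(int k, boolean inclusive) {
        upper.add(new Bound(k, inclusive));
    }

    // True if some waiting lower-bound predicate holds for x; only the root is checked.
    boolean anyLowerSatisfied(int x) {
        Bound root = lower.peek();
        return root != null && (root.inclusive ? x >= root.k : x > root.k);
    }

    // True if some waiting upper-bound predicate holds for x; only the root is checked.
    boolean anyUpperSatisfied(int x) {
        Bound root = upper.peek();
        return root != null && (root.inclusive ? x <= root.k : x < root.k);
    }

    public static void main(String[] args) {
        ThresholdIndex index = new ThresholdIndex();
        index.addLower(6, false);  // x > 6
        index.addLower(11, false); // x > 11
        index.addLower(6, true);   // x >= 6
        index.addUpper(7, true);   // x <= 7
        index.addUpper(4, true);   // x <= 4
        index.addUpper(10, false); // x < 10
        System.out.println(index.anyLowerSatisfied(4)); // root (6, >=) fails, so all fail
        System.out.println(index.anyLowerSatisfied(6)); // root (6, >=) holds
        System.out.println(index.anyUpperSatisfied(9)); // root (10, <) holds
    }
}
```

A fuller version would remove the root when its thread is signaled and re-check the new root, so the cost per signal is logarithmic in the number of waiting threshold predicates rather than linear.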

SLIDE 43

Evaluation

Four different signaling approaches:

Explicit: uses Java explicit signaling.
Baseline: automatic signaling relying on only one condition variable.
AutoSynch-T: uses closure and the relay signaling rule, but not predicate tagging.
AutoSynch: uses closure, the relay signaling rule, and predicate tagging.

SLIDE 44

Evaluation

Three types of problems:

Shared predicate: depends only on shared variables. Examples: bounded-buffer, H2O problem.
Complex predicate: depends on both shared and local variables. Examples: readers-writers, round-robin access pattern.
signalAll: requires signalAll calls. Example: parameterized bounded-buffer.

SLIDE 45

Evaluation: Shared Predicate

[Figure: Bounded Buffer Problem. Runtime (seconds, axis up to 30) against number of consumers/producers (2 to 256). Series: Explicit, AutoSynch, AutoSynch-T, Baseline.]

[Figure: H2O Problem. Runtime (seconds, axis up to 300) against number of consumers/producers (2 to 256). Series: Explicit, AutoSynch, AutoSynch-T, Baseline.]

SLIDE 46

Evaluation: Complex Predicate

[Figure: Round-Robin Access Pattern. Runtime (seconds, axis up to 30) against number of threads (2 to 256). Series: Explicit, AutoSynch, AutoSynch-T.]

[Figure: Readers-Writers Problem. Runtime (seconds, axis up to 30) against number of threads (2/10 to 64/320). Series: Explicit, AutoSynch, AutoSynch-T.]

SLIDE 47

Evaluation: signalAll

Parameterized Bounded Buffer (requires signalAll calls)

[Figure: runtime (seconds, axis up to 15) against number of consumers (2 to 256). Series: Explicit, AutoSynch.]

[Figure: number of context switches (× 1000, axis up to 3000) against number of consumers (2 to 256). Series: Explicit, AutoSynch.]

SLIDE 48

Evaluation: Workload

Workload simulation:

Evaluate the performance of AutoSynch under different workloads.
Perform other operations outside the monitor between every two monitor operations.
Report the runtime ratio with respect to Explicit.

SLIDE 49

Evaluation: Workload

[Figure: Round-Robin Access Pattern. Runtime (seconds, axis up to 4) against delay time (1000 to 5000 microseconds). Series: Explicit, AutoSynch, AutoSynch-T.]

[Figure: Readers-Writers Problem. Runtime (seconds, axis up to 3) against delay time (1000 to 5000 microseconds). Series: Explicit, AutoSynch, AutoSynch-T.]

SLIDE 50

Conclusions and Future Work

Conclusions
  We propose AutoSynch, which supports automatic signaling with simple syntax.
  AutoSynch is almost as efficient as explicit signaling, and in some cases more efficient.

Future work
  Use architecture information.
  Implement AutoSynch directly in the JVM.