[537] Concurrency Bugs Chapter 32 Tyler Harter 10/22/14 Review - - PowerPoint PPT Presentation



slide-1
SLIDE 1

[537] Concurrency Bugs

Chapter 32 Tyler Harter 10/22/14

slide-2
SLIDE 2

Review Semaphores

slide-3
SLIDE 3

CVs vs. Semaphores

CV rules of thumb:

  • Keep state in addition to CVs
  • Always do wait/signal with lock held
  • Whenever you acquire a lock, recheck state
  • How do semaphores eliminate these needs?
slide-4
SLIDE 4

[Animated diagram, slides 4-16: a Condition Variable (CV) with a thread queue, next to a Semaphore with a thread queue and a signal queue.]

  • Thread A calls wait(): A is placed on the thread queue of both the CV and the semaphore.
  • signal() removes A from the thread queue in both cases.
  • signal() with no thread waiting: the CV does nothing with it; the semaphore stores the signal in its signal queue.
  • Thread B calls wait(): the semaphore consumes the stored signal and B continues; on the CV, B stays on the thread queue and may wait forever (if not careful).
  • The semaphore does not need a real signal queue: just use a counter.

slide-17
SLIDE 17

Join w/ CV

int done = 0;
mutex_t m = MUTEX_INIT;
cond_t c = COND_INIT;

void *child(void *arg) {
    printf("child\n");
    Mutex_lock(&m);
    done = 1;
    Cond_signal(&c);
    Mutex_unlock(&m);
    return NULL;
}

int main(int argc, char *argv[]) {
    pthread_t p;    // named p so it does not shadow the condition variable c
    printf("parent: begin\n");
    Pthread_create(&p, NULL, child, NULL);
    Mutex_lock(&m);
    while (done == 0)
        Cond_wait(&c, &m);
    Mutex_unlock(&m);
    printf("parent: end\n");
    return 0;
}

slide-18
SLIDE 18

Join w/ CV

(Code as on slide 17.)

Annotations:

  • extra state and mutex
  • locks around state/signal
  • while loop for checking state

slide-20
SLIDE 20

Join w/ Semaphore

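sem_t s;

void *child(void *arg) {
    printf("child\n");
    sem_post(&s);       // signal the parent
    return NULL;
}

int main(int argc, char *argv[]) {
    sem_init(&s, 0);    // initial value 0 (simplified two-argument form used in these slides)
    pthread_t c;
    printf("parent: begin\n");
    Pthread_create(&c, NULL, child, NULL);
    sem_wait(&s);       // returns once the child has posted
    printf("parent: end\n");
    return 0;
}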
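The slide's join can be sketched against the real POSIX API, where sem_init takes three arguments (semaphore, pshared flag, initial value). This is a minimal sketch, assuming a platform with unnamed POSIX semaphores such as Linux; the names join_with_semaphore and child_ran are hypothetical and exist only so a caller can check that the parent really waited.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t s;
static int child_ran = 0;   /* hypothetical flag, used only to observe the ordering */

static void *child(void *arg) {
    (void)arg;
    printf("child\n");
    child_ran = 1;
    sem_post(&s);            /* value 0 -> 1; wakes the parent if it is already waiting */
    return NULL;
}

/* Returns the value of child_ran seen by the parent after sem_wait. */
int join_with_semaphore(void) {
    pthread_t t;
    child_ran = 0;
    sem_init(&s, 0, 0);      /* pshared = 0 (threads of one process), initial value = 0 */
    printf("parent: begin\n");
    pthread_create(&t, NULL, child, NULL);
    sem_wait(&s);            /* blocks until the child has posted */
    int seen = child_ran;
    printf("parent: end\n");
    pthread_join(t, NULL);   /* reclaim the thread before destroying the semaphore */
    sem_destroy(&s);
    return seen;
}
```

No done flag, mutex, or while loop is needed: if the child posts before the parent waits, the semaphore's value remembers the post.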

slide-21
SLIDE 21

Semaphore Uses

For the following init’s, what might the use be?

  • (a) sem_init(&s, 0);
  • (b) sem_init(&s, 1);
  • (c) sem_init(&s, N);
slide-23
SLIDE 23

Producer/Consumer

How many semaphores do we need?

Sem_init(&empty, max); // max are empty
Sem_init(&full, 0);    // 0 are full
Sem_init(&mutex, 1);   // mutex

slide-24
SLIDE 24

Producer/Consumer

void *producer(void *arg) {
    for (int i = 0; i < loops; i++) {
        Sem_wait(&empty);
        Sem_wait(&mutex);
        do_fill(i);
        Sem_post(&mutex);
        Sem_post(&full);
    }
    return NULL;
}

void *consumer(void *arg) {
    while (1) {
        Sem_wait(&full);
        Sem_wait(&mutex);
        int tmp = do_get();
        Sem_post(&mutex);
        Sem_post(&empty);
        printf("%d\n", tmp);
    }
}
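A self-contained version of the producer/consumer slide, written against POSIX semaphores. It is only a sketch: the buffer size MAX, the item count LOOPS, the circular-buffer bodies of do_fill and do_get, and the wrapper run_producer_consumer are assumptions added for illustration, and the consumer loop is finite (unlike the slide's while (1)) so that the program terminates.

```c
#include <pthread.h>
#include <semaphore.h>

#define MAX   4      /* buffer capacity (assumed) */
#define LOOPS 100    /* number of items produced and consumed (assumed) */

static int buffer[MAX];
static int fill_idx = 0, use_idx = 0;
static sem_t empty, full, mutex;

static void do_fill(int v) { buffer[fill_idx] = v; fill_idx = (fill_idx + 1) % MAX; }
static int  do_get(void)   { int v = buffer[use_idx]; use_idx = (use_idx + 1) % MAX; return v; }

static void *producer(void *arg) {
    (void)arg;
    for (int i = 0; i < LOOPS; i++) {
        sem_wait(&empty);      /* wait for a free slot */
        sem_wait(&mutex);      /* enter critical section */
        do_fill(i);
        sem_post(&mutex);
        sem_post(&full);       /* announce one filled slot */
    }
    return NULL;
}

static void *consumer(void *arg) {
    long sum = 0;
    for (int i = 0; i < LOOPS; i++) {
        sem_wait(&full);       /* wait for a filled slot */
        sem_wait(&mutex);
        int tmp = do_get();
        sem_post(&mutex);
        sem_post(&empty);      /* announce one free slot */
        sum += tmp;
    }
    *(long *)arg = sum;        /* report the total to the caller */
    return NULL;
}

/* Runs one producer and one consumer; returns the sum of consumed items. */
long run_producer_consumer(void) {
    pthread_t p, c;
    long sum = 0;
    fill_idx = use_idx = 0;
    sem_init(&empty, 0, MAX);  /* MAX slots are empty */
    sem_init(&full, 0, 0);     /* 0 slots are full */
    sem_init(&mutex, 0, 1);    /* binary semaphore used as a lock */
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, &sum);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    sem_destroy(&empty);
    sem_destroy(&full);
    sem_destroy(&mutex);
    return sum;
}
```

Waiting on empty or full before mutex matters: a thread that took mutex first and then blocked on empty or full would hold the lock while sleeping and stall the other side.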

slide-25
SLIDE 25

Producer/Consumer

(Code as on slide 24.)

Mutual Exclusion: the Sem_wait(&mutex) / Sem_post(&mutex) pairs around do_fill() and do_get().

slide-26
SLIDE 26

Producer/Consumer

(Code as on slide 24.)

Signaling: Sem_wait(&empty) and Sem_post(&full) in the producer; Sem_wait(&full) and Sem_post(&empty) in the consumer.

slide-27
SLIDE 27

Concurrency Bugs

slide-30
SLIDE 30

Concurrency in Medicine: Therac-25

“The accidents occurred when the high-power electron beam was activated instead of the intended low power beam, and without the beam spreader plate rotated into place. Previous models had hardware interlocks in place to prevent this, but Therac-25 had removed them, depending instead on software interlocks for safety. The software interlock could fail due to a race condition.”

  • “…in three cases, the injured patients later died.”
  • Getting concurrency right can sometimes save lives!

Source: http://en.wikipedia.org/wiki/Therac-25

slide-31
SLIDE 31

Concurrency Bugs are Common and Various

Lu et al. study:

  • For four major projects, search for concurrency bugs among >500K bug reports.
  • Analyze small sample to identify common types of concurrency bugs.

[Bar chart: number of bugs (axis ticks 15 to 60) for MySQL, Apache, Mozilla, and OpenOffice, broken down into Atomicity, Order, Deadlock, and Other.]

Source: http://pages.cs.wisc.edu/~shanlu/paper/asplos122-lu.pdf

slide-33
SLIDE 33

Atomicity: MySQL

Thread 1:

if (thd->proc_info) {
    …
    fputs(thd->proc_info, …);
    …
}

Thread 2:

thd->proc_info = NULL;

What’s wrong?
slide-34
SLIDE 34

Atomicity: MySQL

Thread 1:

pthread_mutex_lock(&lock);
if (thd->proc_info) {
    …
    fputs(thd->proc_info, …);
    …
}
pthread_mutex_unlock(&lock);

Thread 2:

pthread_mutex_lock(&lock);
thd->proc_info = NULL;
pthread_mutex_unlock(&lock);
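The fix can be sketched in compilable form as follows. The struct thd_info and the functions print_proc_info and clear_proc_info are hypothetical stand-ins, not MySQL's actual code; the point is only that the NULL check and the use happen under the same lock that the writer takes.

```c
#include <pthread.h>
#include <stdio.h>
#include <stddef.h>

struct thd_info {                    /* hypothetical stand-in for MySQL's thd */
    pthread_mutex_t lock;
    const char *proc_info;
};

/* Check and use under one lock, so proc_info cannot become NULL in between.
 * Returns the snprintf result, or 0 if proc_info was NULL. */
int print_proc_info(struct thd_info *thd, char *out, size_t n) {
    int written = 0;
    pthread_mutex_lock(&thd->lock);
    if (thd->proc_info)
        written = snprintf(out, n, "%s", thd->proc_info);
    pthread_mutex_unlock(&thd->lock);
    return written;
}

/* The writer takes the same lock before clearing the field. */
void clear_proc_info(struct thd_info *thd) {
    pthread_mutex_lock(&thd->lock);
    thd->proc_info = NULL;
    pthread_mutex_unlock(&thd->lock);
}
```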

slide-37
SLIDE 37

Ordering: Mozilla

Thread 1:

void init() {
    …
    mThread = PR_CreateThread(mMain, …);
    …
}

Thread 2:

void mMain(…) {
    …
    mState = mThread->State;
    …
}

slide-38
SLIDE 38

Ordering: Mozilla

Thread 1:

void init() {
    …
    mThread = PR_CreateThread(mMain, …);

    pthread_mutex_lock(&mtLock);
    mtInit = 1;
    pthread_cond_signal(&mtCond);
    pthread_mutex_unlock(&mtLock);
    …
}

Thread 2:

void mMain(…) {
    …
    Mutex_lock(&mtLock);
    while (mtInit == 0)
        Cond_wait(&mtCond, &mtLock);
    Mutex_unlock(&mtLock);

    mState = mThread->State;
    …
}
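The same ordering fix as a small program that can be compiled. The names worker, shared_state, observed, and run_ordered_init are hypothetical; shared_state stands in for mThread->State, which the new thread must not read until the creator has finished initializing it.

```c
#include <pthread.h>

static pthread_mutex_t mt_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  mt_cond = PTHREAD_COND_INITIALIZER;
static int mt_init = 0;        /* state variable paired with the CV */
static int shared_state = 0;   /* stands in for mThread->State */
static int observed = -1;      /* what the worker saw */

static void *worker(void *arg) {
    (void)arg;
    pthread_mutex_lock(&mt_lock);
    while (mt_init == 0)                       /* recheck state after every wakeup */
        pthread_cond_wait(&mt_cond, &mt_lock);
    pthread_mutex_unlock(&mt_lock);
    observed = shared_state;                   /* safe: initialization has finished */
    return NULL;
}

/* Starts the worker first, initializes afterwards, and returns what the worker read. */
int run_ordered_init(void) {
    pthread_t t;
    mt_init = 0;
    shared_state = 0;
    observed = -1;
    pthread_create(&t, NULL, worker, NULL);
    shared_state = 42;                         /* the initialization the worker depends on */
    pthread_mutex_lock(&mt_lock);
    mt_init = 1;
    pthread_cond_signal(&mt_cond);
    pthread_mutex_unlock(&mt_lock);
    pthread_join(t, NULL);
    return observed;
}
```

Whichever thread the scheduler runs first, the worker reads shared_state only after mt_init has been set under the lock.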

slide-41
SLIDE 41

Deadlock

Cooler name: the deadly embrace (Dijkstra).

[Animated diagram, slides 42-52: a four-way intersection with a STOP sign on each approach. Two cars, A and B, arrive: who goes? They take turns and the intersection clears. Then four cars, A, B, C, and D, arrive at once: who goes? Each waits for another. Deadlock!]

slide-53
SLIDE 53

Boring Code Example

Thread 1:

lock(&A);
lock(&B);

Thread 2:

lock(&B);
lock(&A);

[Animated trace, slides 53-61]

  • Thread 1 RUNNING, Thread 2 RUNNABLE: Thread 1 acquires A.
  • Thread 1 RUNNABLE, Thread 2 RUNNING: Thread 2 acquires B.
  • Thread 1 RUNNING, Thread 2 RUNNABLE: Thread 1 requests B, which Thread 2 holds.
  • Thread 1 SLEEPING, Thread 2 RUNNING: Thread 2 requests A, which Thread 1 holds.
  • Thread 1 SLEEPING, Thread 2 SLEEPING.

Deadlock!

slide-62
SLIDE 62

Circular Dependency

[Diagram: Thread 1 holds Lock A, which is wanted by Thread 2; Thread 2 holds Lock B, which is wanted by Thread 1. The arrows form a cycle.]

slide-63
SLIDE 63

Boring Code Example

Thread 1 [RUNNING]:

lock(&A);
lock(&B);

Thread 2 [RUNNABLE]:

lock(&A);
lock(&B);

Can’t deadlock.

slide-65
SLIDE 65

Non-circular Dependency (fine)

[Diagram: Lock A, Lock B, Thread 1, and Thread 2, connected by one "holds" arrow and two "wanted by" arrows that do not form a cycle.]

slide-66
SLIDE 66

What’s Wrong?

set_t *set_union(set_t *s1, set_t *s2) {
    set_t *rv = Malloc(sizeof(*rv));
    Mutex_lock(&s1->lock);
    Mutex_lock(&s2->lock);
    for (int i = 0; i < s1->len; i++) {
        if (set_contains(s2, s1->items[i]))
            set_add(rv, s1->items[i]);
    }
    Mutex_unlock(&s2->lock);
    Mutex_unlock(&s1->lock);
    return rv;
}

slide-68
SLIDE 68

Encapsulation

Modularity can make it harder to see deadlocks.

  • Solutions?

Thread 1:

  • rv = set_union(setA, setB);

Thread 2:

  • rv = set_union(setB, setA);
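One common answer is to have the function itself impose a lock order, for example by always locking the set at the lower address first, so that callers passing (setA, setB) and (setB, setA) acquire the locks in the same order. The sketch below assumes a hypothetical minimal set_t layout and, to stay short, counts the common items instead of building a result set.

```c
#include <pthread.h>
#include <stdint.h>

#define SET_CAP 16

typedef struct {               /* hypothetical minimal layout */
    pthread_mutex_t lock;
    int len;
    int items[SET_CAP];
} set_t;

/* Lock both sets in address order, regardless of argument order. */
static void lock_pair(set_t *s1, set_t *s2) {
    if (s1 == s2) {            /* same set twice: lock it only once */
        pthread_mutex_lock(&s1->lock);
        return;
    }
    set_t *first  = ((uintptr_t)s1 < (uintptr_t)s2) ? s1 : s2;
    set_t *second = (first == s1) ? s2 : s1;
    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);
}

static void unlock_pair(set_t *s1, set_t *s2) {
    pthread_mutex_unlock(&s1->lock);
    if (s1 != s2)
        pthread_mutex_unlock(&s2->lock);
}

/* Caller must hold s->lock. */
static int contains(const set_t *s, int v) {
    for (int i = 0; i < s->len; i++)
        if (s->items[i] == v)
            return 1;
    return 0;
}

/* Number of items of s1 that also appear in s2. */
int count_common(set_t *s1, set_t *s2) {
    int n = 0;
    lock_pair(s1, s2);
    for (int i = 0; i < s1->len; i++)
        if (contains(s2, s1->items[i]))
            n++;
    unlock_pair(s1, s2);
    return n;
}
```

The ordering rule lives inside the module, so callers do not need to know that any locks are involved.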
slide-69
SLIDE 69

Deadlock Theory

Deadlock can occur only when all four of these conditions hold:

  • mutual exclusion
  • hold-and-wait
  • no preemption
  • circular wait
slide-70
SLIDE 70

Deadlock Theory

Deadlock can occur only when all four of these conditions hold:

  • mutual exclusion
  • hold-and-wait
  • no preemption
  • circular wait
  • Eliminate deadlock by eliminating one condition.
slide-72
SLIDE 72

Mutual Exclusion

Def:

  • Threads claim exclusive control of resources that they require (e.g., thread grabs a lock).

slide-73
SLIDE 73

Wait-Free Algorithms

Strategy: eliminate lock use.

  • Assume we have:

int CompAndSwap(int *addr, int expected, int new)  // returns 0: fail, 1: success

void add_v1(int *val, int amt) {
    Mutex_lock(&m);
    *val += amt;
    Mutex_unlock(&m);
}

void add_v2(int *val, int amt) {
    int old;    // declared outside the loop so the while condition can see it
    do {
        old = *val;
    } while (!CompAndSwap(val, old, old + amt));
}
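For comparison, the same retry loop using C11's <stdatomic.h> in place of the assumed CompAndSwap. The function name add_atomic is hypothetical; atomic_compare_exchange_weak is the standard C11 operation, and on failure it writes the value it found back into old, so the loop does not need to reload explicitly.

```c
#include <stdatomic.h>

/* Atomically adds amt to *val without taking a lock. */
void add_atomic(atomic_int *val, int amt) {
    int old = atomic_load(val);
    /* On failure, old is refreshed with the current value and old + amt is
     * recomputed on the next iteration. */
    while (!atomic_compare_exchange_weak(val, &old, old + amt)) {
        /* retry */
    }
}
```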

slide-74
SLIDE 74

Wait-Free Algorithms

Strategy: eliminate lock use.

  • Assume we have:

int CompAndSwap(int *addr, int expected, int new)

void insert(int val) {
    node_t *n = Malloc(sizeof(*n));
    n->val = val;
    lock(&m);
    n->next = head;
    head = n;
    unlock(&m);
}

Eliminate the lock!

slide-75
SLIDE 75

Wait-Free Algorithms

Strategy: eliminate lock use.

  • Assume we have:

int CompAndSwap(int *addr, int expected, int new)

void insert(int val) {
    node_t *n = Malloc(sizeof(*n));
    n->val = val;
    do {
        n->next = head;
    } while (!CompAndSwap(&head, n->next, n));
}
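The slide's CompAndSwap is declared on int, while head is a pointer. A version that compiles can be sketched with C11 atomics, which provide compare-and-swap on pointer types. The node_t layout and the int return value are assumptions for illustration, and the sketch covers insertion only.

```c
#include <stdatomic.h>
#include <stdlib.h>

typedef struct node {
    int val;
    struct node *next;
} node_t;

static _Atomic(node_t *) head = NULL;

/* Pushes val onto the front of the list without a lock.
 * Returns 0 on success, -1 if allocation fails. */
int insert(int val) {
    node_t *n = malloc(sizeof(*n));
    if (n == NULL)
        return -1;
    n->val = val;
    n->next = atomic_load(&head);
    /* If head changed since it was read, the failed exchange stores the new
     * head into n->next and the loop tries again. */
    while (!atomic_compare_exchange_weak(&head, &n->next, n)) {
        /* retry */
    }
    return 0;
}
```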

slide-78
SLIDE 78

Hold-and-Wait

Def:

  • Threads hold resources allocated to them (e.g., locks they have already acquired) while waiting for additional resources (e.g., locks they wish to acquire).

slide-80
SLIDE 80

Eliminate Hold-and-Wait

Strategy: acquire all locks atomically once (cannot acquire again until all have been released).

  • For this, use a meta lock, like this:

lock(&meta);
lock(&L1);
lock(&L2);
…
unlock(&meta);

Discuss:

  • how should unlock work?
  • disadvantages?
slide-83
SLIDE 83

No preemption

Def:

  • Resources (e.g., locks) cannot be forcibly removed from threads that are holding them.

slide-86
SLIDE 86

Support Preemption

Strategy: if we can’t get what we want, release what we have.

top:
    lock(A);
    if (trylock(B) == -1) {
        unlock(A);
        goto top;
    }
    …

Discuss:

  • disadvantages? (livelock)
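The same strategy with the real pthreads call, as a sketch. pthread_mutex_trylock returns 0 on success and a nonzero error code (EBUSY) when the mutex is already locked, not -1 as in the slide's pseudocode. The function name lock_both is hypothetical. Yielding the CPU before retrying is one way to make repeated collisions less likely; a short random delay is another common choice.

```c
#include <pthread.h>
#include <sched.h>

/* Acquires a and then b; if b is busy, gives a back and starts over. */
void lock_both(pthread_mutex_t *a, pthread_mutex_t *b) {
    for (;;) {
        pthread_mutex_lock(a);
        if (pthread_mutex_trylock(b) == 0)
            return;                  /* holding both */
        pthread_mutex_unlock(a);     /* could not get b: release what we have */
        sched_yield();               /* let the other thread make progress */
    }
}
```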
slide-89
SLIDE 89

Circular Wait

Def:

  • There exists a circular chain of threads such that each thread holds a resource (e.g., lock) being requested by next thread in the chain.

slide-90
SLIDE 90

Eliminating Circular Wait

Strategy:

  • decide which locks should be acquired before others
  • if A before B, never acquire A if B is already held!
  • document this, and write code accordingly
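A documented order can also be checked at run time. The sketch below is a hypothetical helper, not an existing library: each lock carries a positive rank, and a thread may only take a lock whose rank is higher than any it already holds. It assumes locks are released in the reverse of the order in which they were acquired.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t m;
    int rank;    /* position in the documented order; lower ranks are taken first */
    int prev;    /* highest rank the owner held before taking this lock */
} ranked_lock_t;

static _Thread_local int highest_held = 0;   /* 0: this thread holds no ranked lock */

/* Returns 0 on success, or -1 without locking if the request breaks the order. */
int ranked_lock(ranked_lock_t *l) {
    if (l->rank <= highest_held)
        return -1;
    pthread_mutex_lock(&l->m);
    l->prev = highest_held;
    highest_held = l->rank;
    return 0;
}

/* Must be called in reverse acquisition order (assumption of this sketch). */
void ranked_unlock(ranked_lock_t *l) {
    highest_held = l->prev;
    pthread_mutex_unlock(&l->m);
}
```

With A at rank 1 and B at rank 2, taking A and then B succeeds, while a thread that holds B is refused when it asks for A.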
slide-91
SLIDE 91

Lock Ordering in Linux

In linux-3.2.51/include/linux/fs.h

/*
 * inode->i_mutex nesting subclasses for the lock
 * validator:
 *
 * 0: the object of the current VFS operation
 * 1: parent
 * 2: child/target
 * 3: quota file
 *
 * The locking order between these classes is
 * parent -> child -> normal -> xattr -> quota
 */

slide-93
SLIDE 93

Linux lockdep Module

Idea:

  • track order in which locks are acquired
  • give warning if circular
  • Extremely useful for debugging!
slide-94
SLIDE 94

Example Output

===========================================
[ INFO: possible circular locking dependency detected ]
3.1.0rc4test00131g9e79e3e #2
insmod/1357 is trying to acquire lock:
 (lockC){+.+...}, at: [<ffffffffa000d438>] pick_test+0x2a2/0x892 [lockdep_test]

but task is already holding lock:
 (lockB){+.+...}, at: [<ffffffffa000d42c>] pick_test+0x296/0x892 [lockdep_test]

Source: http://www.linuxplumbersconf.org/2011/ocw/sessions/153

slide-95
SLIDE 95

Summary

Concurrency is hard; encapsulation makes it harder!

  • Have a strategy to avoid deadlock and stick to it.
  • Choosing a lock order is probably most practical.
  • When possible, avoid concurrent solutions altogether!
slide-96
SLIDE 96

Announcements

Office hours: 1pm in office.

  • p3a due Friday.
  • Start p3b!
  • Thursday discussion: hand back and discuss test.