Deadlock - CS 450: Operating Systems - Michael Lee <lee@iit.edu>



slide-1
SLIDE 1

CS 450 : Operating Systems Michael Lee <lee@iit.edu>

Deadlock

slide-2
SLIDE 2
  • New Oxford American Dictionary

deadlock |ˈdedˌläk|

noun 1 [in sing. ] a situation, typically one involving opposing parties, in which no progress can be made : an attempt to break the deadlock.

slide-3
SLIDE 3

Traffic Gridlock

slide-4
SLIDE 4

Software Gridlock

Thread 1:
    mtx_A.lock()
    mtx_B.lock()
    # critical section
    mtx_B.unlock()
    mtx_A.unlock()

Thread 2:
    mtx_B.lock()
    mtx_A.lock()
    # critical section
    mtx_B.unlock()
    mtx_A.unlock()
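The opposite lock order can be reproduced in a few lines. The following is a minimal sketch, not part of the original slides: the names `worker`, `both_hold_first`, `both_tried` and `got_second` are made up for illustration, and a timeout stands in for the blocking `lock()` call so that the demo reports the stuck state instead of hanging.

```python
import threading

mtx_a = threading.Lock()
mtx_b = threading.Lock()
both_hold_first = threading.Barrier(2)  # released once each thread holds its first lock
both_tried = threading.Barrier(2)       # released once each thread has tried its second lock
got_second = {}

def worker(name, first, second):
    with first:
        both_hold_first.wait()
        # A plain second.acquire() would block forever at this point.
        ok = second.acquire(timeout=0.2)
        got_second[name] = ok
        both_tried.wait()
        if ok:
            second.release()

t1 = threading.Thread(target=worker, args=("T1", mtx_a, mtx_b))
t2 = threading.Thread(target=worker, args=("T2", mtx_b, mtx_a))
t1.start(); t2.start()
t1.join(); t2.join()
print(got_second)
```

Neither thread obtains its second lock, because each one is held by the other for the whole duration of the attempt.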

slide-5
SLIDE 5

§ Necessary conditions for Deadlock

slide-6
SLIDE 6

i.e., what conditions need to be true (of some system) so that deadlock is possible? (not the same as causing deadlock!)

slide-7
SLIDE 7
  • I. Mutual Exclusion
  • resources can be held by processes in a mutually exclusive manner

slide-8
SLIDE 8
  • II. Hold & Wait
  • while holding one resource (in mutex), a process can request another resource

slide-9
SLIDE 9
  • III. No Preemption
  • one process cannot force another to give up a resource; i.e., releasing is voluntary

slide-10
SLIDE 10
  • IV. Circular Wait
  • resource requests and allocations create a cycle in the resource allocation graph

slide-11
SLIDE 11

§ Resource Allocation Graphs

slide-12
SLIDE 12

[Legend: symbols for a process, a resource, a request edge (process → resource), and an allocation edge (resource → process)]

slide-13
SLIDE 13

[Figure: resource allocation graph with processes P1, P2, P3 and resources R1, R2, R3]

Circular wait is absent = no deadlock

slide-14
SLIDE 14

All 4 necessary conditions in place; Deadlock!

[Figure: resource allocation graph with processes P1, P2, P3 and resources R1, R2, R3]

slide-15
SLIDE 15

in a system with only single-instance resources, necessary conditions ⇔ deadlock
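With single-instance resources, checking for deadlock therefore amounts to checking the resource allocation graph for a cycle. A minimal sketch, assuming the graph is stored as a dictionary of successor lists (the representation, the function name `has_cycle` and the two example graphs are illustrative, not taken from the slides):

```python
def has_cycle(edges):
    """Return True if the directed graph {node: [successors]} contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current DFS path, fully explored
    color = {}

    def visit(node):
        color[node] = GREY
        for nxt in edges.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:              # back edge: we returned to the current path
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(edges))

# request edge: process -> resource; allocation edge: resource -> process
no_cycle = {"P1": ["R1"], "R1": ["P2"], "P2": ["R2"], "R2": ["P3"]}
with_cycle = {"P1": ["R1"], "R1": ["P2"], "P2": ["R2"], "R2": ["P1"]}
print(has_cycle(no_cycle), has_cycle(with_cycle))
```

In the second graph P1 waits for R1 (held by P2) while P2 waits for R2 (held by P1), so the cycle is a deadlock.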

slide-16
SLIDE 16

Cycle without Deadlock!

[Figure: resource allocation graph with processes P1, P2, P3, P4 and resources R1, R2]

slide-17
SLIDE 17

not always practical (or even possible) to detect deadlock using a graph, but convenient to help us reason about things

slide-18
SLIDE 18

§ Approaches to Dealing with Deadlock

slide-19
SLIDE 19
  • 1. Ostrich algorithm (ignore it and hope it never happens)

  • 2. Prevent it from occurring (avoidance)
  • 3. Detection & recovery
slide-20
SLIDE 20

§ Deadlock avoidance

slide-21
SLIDE 21

¶ Approach 1: eliminate necessary condition(s)

slide-22
SLIDE 22

Mutual exclusion?

  • eliminating mutex requires that all resources be shareable
  • when not possible (e.g., disk, printer), can sometimes use a spooler process
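The spooler idea can be sketched with a queue: clients hand jobs to a single process (here a thread) that alone touches the device, so the device itself never needs a client-visible lock. The names `jobs`, `printed` and `spooler` are assumptions for illustration, and a list stands in for the printer.

```python
import queue
import threading

jobs = queue.Queue()   # thread-safe hand-off between clients and the spooler
printed = []           # stands in for the physical printer

def spooler():
    # Only this thread ever touches the "printer", so clients never lock it.
    while True:
        job = jobs.get()
        if job is None:    # sentinel value: shut down
            break
        printed.append(job)

spool_thread = threading.Thread(target=spooler)
spool_thread.start()

clients = [threading.Thread(target=jobs.put, args=(f"job-{i}",)) for i in range(3)]
for c in clients:
    c.start()
for c in clients:
    c.join()

jobs.put(None)
spool_thread.join()
print(sorted(printed))
```

The order in which jobs reach the printer depends on scheduling, but every job is printed exactly once.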

slide-23
SLIDE 23

but what about semaphores, file locks, etc.?

  • not all resources are spoolable
  • cannot eliminate mutex in general
slide-24
SLIDE 24

Hold & Wait?

  • elimination requires resource requests to be an all-or-nothing affair
  • a process that is currently holding resources needs to release them all before requesting more
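An all-or-nothing request might look like the sketch below, which assumes resources are modelled as `threading.Lock` objects. The helpers `try_acquire_all` and `release_all` are hypothetical names; the caller is expected to retry on failure, which is exactly where the inefficiency and starvation risk come from.

```python
import threading

def try_acquire_all(locks):
    """Acquire every lock or none; never waits while holding a lock."""
    taken = []
    for lock in locks:
        if lock.acquire(blocking=False):
            taken.append(lock)
        else:
            # Back out: give up everything taken so far.
            for held in reversed(taken):
                held.release()
            return False
    return True

def release_all(locks):
    for lock in locks:
        lock.release()

disk, printer = threading.Lock(), threading.Lock()
first = try_acquire_all([disk, printer])    # both are free
second = try_acquire_all([printer, disk])   # printer is busy, so nothing is kept
release_all([disk, printer])
third = try_acquire_all([printer, disk])    # free again
release_all([printer, disk])
print(first, second, third)
```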

slide-25
SLIDE 25

in practice, very inefficient, and starvation is possible! So we cannot eliminate hold & wait

slide-26
SLIDE 26

No preemption?

  • alternative: allow processes to preempt each other and “steal” resources
  • mutex locks cannot be counted on to stay locked!
  • in practice, cannot eliminate this either!
slide-27
SLIDE 27

Circular Wait is where it’s at.

slide-28
SLIDE 28

simple mechanism to prevent wait cycles:

  • order all resources
  • require that processes request resources in order
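Applied to locks, the ordering rule could be sketched as follows. `OrderedLock` and `acquire_in_order` are hypothetical helpers, and the integer rank is just one possible way to impose a global order.

```python
import threading
from contextlib import ExitStack

class OrderedLock:
    """A lock tagged with a fixed global rank (hypothetical helper)."""
    def __init__(self, rank):
        self.rank = rank
        self.lock = threading.Lock()

def acquire_in_order(stack, *locks):
    # Always take lower-ranked locks first, whatever order the caller names them in.
    for ordered in sorted(locks, key=lambda ol: ol.rank):
        stack.enter_context(ordered.lock)

mtx_a, mtx_b = OrderedLock(1), OrderedLock(2)
counter = 0

def worker(first, second):
    global counter
    for _ in range(1000):
        with ExitStack() as stack:          # releases the locks on exit
            acquire_in_order(stack, first, second)
            counter += 1                    # critical section

t1 = threading.Thread(target=worker, args=(mtx_a, mtx_b))
t2 = threading.Thread(target=worker, args=(mtx_b, mtx_a))   # names them in the opposite order
t1.start(); t2.start()
t1.join(); t2.join()
print(counter)
```

Both threads end up locking `mtx_a` before `mtx_b`, so no wait cycle can form even though the second thread asks for them the other way around.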

slide-29
SLIDE 29

but impractical: we cannot count on processes to need resources in a certain order … and forcing a certain order can result in poor resource utilization

slide-30
SLIDE 30

¶ Approach 2: intelligently prevent circular wait

slide-31
SLIDE 31

possible to create a cycle (with one edge)?

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-32
SLIDE 32

possible to create a cycle (with one edge)?

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-33
SLIDE 33

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

it’s quite possible that P2 won’t need R2, or maybe P2 will release R1 before requesting R2, but we don’t know if/when…

slide-34
SLIDE 34

preventing circular wait means avoiding a state where a cycle is an imminent possibility

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-35
SLIDE 35

to predict deadlock, we can ask processes to “claim” all resources they need in advance

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-36
SLIDE 36

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

graph with “claim edges”

slide-37
SLIDE 37

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

P2 requests R1

slide-38
SLIDE 38

convert to allocation edge; no cycle

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-39
SLIDE 39

P1 requests R2

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-40
SLIDE 40

if we convert to an allocation edge ...

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-41
SLIDE 41

cycle involving claim edges!

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-42
SLIDE 42

means that if processes fulfill their claims, we cannot avoid deadlock!

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-43
SLIDE 43

i.e., P1 → R1, P2 → R2

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-44
SLIDE 44

P1 → R2 should be blocked by the kernel, even if it can be satisfied with available resources

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-45
SLIDE 45

this is a “safe” state … i.e., no way a process can cause deadlock directly (i.e., without OS alloc)

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-46
SLIDE 46

idea: if granting an incoming request would create a cycle in a graph with claim edges, deny that request (i.e., block the process) — approve later when no cycle would occur
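The claim-edge rule can be sketched for single-instance resources as below. It replays the P1/P2/R1/R2 walkthrough under the assumption that both processes claim both resources (the figures are not recoverable from the transcript, so that is a guess); `ClaimGraph`, `request`, `release` and `has_cycle` are illustrative names.

```python
def has_cycle(edges):
    """Cycle check on a set of directed (src, dst) edges using DFS."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    done, on_path = set(), set()

    def visit(node):
        on_path.add(node)
        for nxt in graph.get(node, []):
            if nxt in on_path or (nxt not in done and visit(nxt)):
                return True
        on_path.discard(node)
        done.add(node)
        return False

    return any(n not in done and visit(n) for n in list(graph))

class ClaimGraph:
    def __init__(self, claims):
        # claims: {process: set of resources it may ever request}
        self.claims = {(p, r) for p, rs in claims.items() for r in rs}
        self.allocs = set()   # allocation edges (resource, process)

    def request(self, proc, res):
        """Grant proc's request for res (assumed free) only if no cycle results."""
        trial_claims = self.claims - {(proc, res)}
        trial_allocs = self.allocs | {(res, proc)}
        if has_cycle(trial_claims | trial_allocs):
            return False      # deny: the caller would be blocked
        self.claims, self.allocs = trial_claims, trial_allocs
        return True

    def release(self, proc, res):
        self.allocs.discard((res, proc))
        self.claims.add((proc, res))   # the allocation edge reverts to a claim edge

g = ClaimGraph({"P1": {"R1", "R2"}, "P2": {"R1", "R2"}})
step1 = g.request("P2", "R1")   # allocation edge R1 -> P2, no cycle
step2 = g.request("P1", "R2")   # would close P1 -> R1 -> P2 -> R2 -> P1
g.release("P2", "R1")
step3 = g.request("P1", "R2")   # now safe to approve
print(step1, step2, step3)
```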

slide-47
SLIDE 47

P2 releases R1

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-48
SLIDE 48

now ok to approve P1 → R2 (unblock P1)

[Figure: resource allocation graph with processes P1, P2 and resources R1, R2]

slide-49
SLIDE 49

should we still deny P1 → R2?

[Figure: resource allocation graph with processes P1, P2, P3 and resources R1, R2]

slide-50
SLIDE 50

problem: this approach may incorrectly predict imminent deadlock when resources with multiple instances are involved

slide-51
SLIDE 51

requires a more general definition of “safe state”

[Figure: resource allocation graph with processes P1, P2, P3 and resources R1, R2]

slide-52
SLIDE 52

¶ Banker’s Algorithm (by Edsger Dijkstra)

slide-53
SLIDE 53

basic idea:

  • define how to recognize system “safety”
  • whenever a resource request arrives:
  • simulate allocation & check state
  • allocate iff simulated state is safe
slide-54
SLIDE 54

some assumptions we need to make:

  • 1. a non-blocked process holding a resource will eventually release it
  • 2. it is known a priori how many instances of each resource a given process needs

slide-55
SLIDE 55
Safe State

  • There exists a sequence <P1, P2, ..., Pn>, where each Pk can complete with:
  • currently available (free) resources
  • resources held by P1...Pk-1

slide-56
SLIDE 56

Data Structures

Processes P1…Pn, Resources R1…Rm:
  available[j] = num of Rj available
  max[i][j] = max num of Rj required by Pi
  allocated[i][j] = num of Rj allocated to Pi
  need[i][j] = max[i][j] - allocated[i][j]

slide-57
SLIDE 57
Safety Algorithm

  • 1. finish[i] ← false ∀ i ∈ 1…n; work ← available
  • 2. Find i : finish[i] = false & need[i][j] ≤ work[j] ∀ j. If none, go to 4.
  • 3. work ← work + allocated[i]; finish[i] ← true. Go to 2.
  • 4. Safe state iff finish[i] = true ∀ i
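The four steps translate almost directly into code. The sketch below uses 0-based process indices and plain lists; the function name `safe_sequence` and the small two-resource example are assumptions made for illustration.

```python
def safe_sequence(available, allocated, need):
    """Return a completion order if the state is safe, else None."""
    n, m = len(allocated), len(available)
    work = list(available)                 # step 1
    finish = [False] * n
    order = []
    progressed = True
    while progressed:                      # steps 2 and 3, repeated until no candidate is left
        progressed = False
        for i in range(n):
            if not finish[i] and all(need[i][j] <= work[j] for j in range(m)):
                for j in range(m):
                    work[j] += allocated[i][j]   # Pi finishes and returns what it holds
                finish[i] = True
                order.append(i)
                progressed = True
    return order if all(finish) else None  # step 4

# Hypothetical state with 3 processes and 2 resource types.
allocated = [[1, 0], [0, 1], [1, 1]]
need = [[0, 1], [1, 0], [2, 1]]
print(safe_sequence([1, 0], allocated, need))   # one free instance of the first resource
print(safe_sequence([0, 0], allocated, need))   # nothing free
```

With one free instance the order is P1, P0, P2 (0-based); with nothing free no process can finish and the state is unsafe.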

slide-58
SLIDE 58

incoming request represented by a request array: request[j] = num of resource Rj requested (a process can request multiple instances of more than one resource at a time)

slide-59
SLIDE 59
Processing Request from Pk:

  • 1. If request[j] ≤ need[k][j] ∀ j, continue, else error
  • 2. If request[j] ≤ available[j] ∀ j, continue, else block
  • 3. Run safety algorithm with:
      • available ← available - request
      • allocated[k] ← allocated[k] + request
      • need[k] ← need[k] - request

slide-60
SLIDE 60

if safety algorithm fails, do not allocate, even if resources are available!

— either deny request or block caller

slide-61
SLIDE 61

3 resources: A (10), B (5), C (7)

      Allocated     Max           Need
      A  B  C       A  B  C       A  B  C
P0    0  1  0       7  5  3       7  4  3
P1    2  0  0       3  2  2       1  2  2
P2    3  0  2       9  0  2       6  0  0
P3    2  1  1       2  2  2       0  1  1
P4    0  0  2       4  3  3       4  3  1

Available
A  B  C
3  3  2

  • Safe state: <P1, P3, P0, P2, P4>
  • P3 requests <0, 0, 1>
  • P0 requests <0, 3, 0>
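The two requests can be checked mechanically. This is a sketch under two assumptions: cells that are blank in the extracted table are read as 0 (which is consistent with the stated totals of 10, 5 and 7), and the requests are processed one after the other. `is_safe` and `handle_request` are illustrative names.

```python
def is_safe(available, allocated, need):
    work, pending = list(available), set(range(len(allocated)))
    while True:
        runnable = [i for i in pending
                    if all(n <= w for n, w in zip(need[i], work))]
        if not runnable:
            return not pending             # safe iff every process could finish
        i = runnable[0]
        work = [w + a for w, a in zip(work, allocated[i])]
        pending.remove(i)

def handle_request(k, request, available, allocated, need):
    """Return 'error', 'block', 'deny' or 'grant'; state changes only on 'grant'."""
    if any(r > n for r, n in zip(request, need[k])):
        return "error"                     # exceeds the declared maximum
    if any(r > a for r, a in zip(request, available)):
        return "block"                     # not enough free instances right now
    # Simulate the allocation on copies, then run the safety check.
    new_avail = [a - r for a, r in zip(available, request)]
    new_alloc = [row[:] for row in allocated]
    new_need = [row[:] for row in need]
    new_alloc[k] = [a + r for a, r in zip(allocated[k], request)]
    new_need[k] = [n - r for n, r in zip(need[k], request)]
    if not is_safe(new_avail, new_alloc, new_need):
        return "deny"
    available[:] = new_avail
    allocated[k] = new_alloc[k]
    need[k] = new_need[k]
    return "grant"

max_need  = [[7, 5, 3], [3, 2, 2], [9, 0, 2], [2, 2, 2], [4, 3, 3]]
allocated = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
available = [3, 3, 2]
need = [[m - a for m, a in zip(mr, ar)] for mr, ar in zip(max_need, allocated)]

r1 = handle_request(3, [0, 0, 1], available, allocated, need)
r2 = handle_request(0, [0, 3, 0], available, allocated, need)
print(r1, r2, available)
```

Under these assumptions P3's request is granted, while P0's request is refused even though three instances of B are free, because granting it would leave no process able to finish.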

slide-62
SLIDE 62

¶ Banker’s algorithm discussion

slide-63
SLIDE 63
  • 1. Efficiency?
  • how fast is it?
  • how often is it run?
slide-64
SLIDE 64
  • 1. finish[i] ← false ∀ i ∈ 1…n; work ← available
  • 2. Find i : finish[i] = false & need[i][j] ≤ work[j] ∀ j. If none, go to 4.
  • 3. work ← work + allocated[i]; finish[i] ← true. Go to 2.
  • 4. Safe state iff finish[i] = true ∀ i

step 2: for up to N processes, check M resources
steps 2-3: loop for N processes

O(N∙N∙M) = O(N²∙M)

slide-65
SLIDE 65

how often to run?

  • need to run on every resource request
  • can’t relax this, otherwise system might become unsafe!

slide-66
SLIDE 66
  • 2. Assumption #1: processes will eventually release resources

slide-67
SLIDE 67
  • assuming well-behaved processes
  • not 100% realistic, but what else to do?
slide-68
SLIDE 68
  • 3. Assumption #2: a priori knowledge of max resource requirements

slide-69
SLIDE 69
  • highly unrealistic
  • process resource needs are dynamic!
  • without this assumption, deadlock prevention becomes much harder…

slide-70
SLIDE 70

¶ Aside: decision problems, complexity theory & the halting problem

slide-71
SLIDE 71

a decision problem

input → decision algorithm → yes / no

slide-72
SLIDE 72

e.g., is X evenly divisible by Y? is N a prime number? does string S contain pattern P?

slide-73
SLIDE 73

a lot of important problems can be reworded as decision problems: e.g., traveling salesman problem (find the shortest tour through a graph) ⇒ is there a tour shorter than L?

slide-74
SLIDE 74

complexity theory classifies decision problems by their difficulty, and draws relationships between those problems & classes

slide-75
SLIDE 75

class P: solutions to these problems can be found in polynomial time (e.g., O(N²))

slide-76
SLIDE 76

class NP: solutions to these problems can be verified in polynomial time, but finding solutions may be harder (i.e., superpolynomial)!
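To make "verified in polynomial time" concrete, here is a sketch of a verifier for the decision version of the traveling salesman problem mentioned earlier: given a proposed tour, checking it takes time linear in the number of cities, even though finding such a tour may not. The function name `verify_tour` and the distance matrix are made up for illustration.

```python
def verify_tour(dist, tour, limit):
    """Check a claimed answer to 'is there a tour shorter than limit?'."""
    n = len(dist)
    if sorted(tour) != list(range(n)):     # must visit every city exactly once
        return False
    length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return length < limit

# Hypothetical symmetric distances between 4 cities.
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]

print(verify_tour(dist, [0, 1, 3, 2], 19))   # tour length is 2 + 4 + 3 + 9 = 18
print(verify_tour(dist, [0, 1, 2, 3], 19))   # tour length is 2 + 6 + 3 + 10 = 21
```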

slide-77
SLIDE 77

big open problem in CS: P = NP?

slide-78
SLIDE 78

why is this important?

slide-79
SLIDE 79

all problems in NP can be reduced to a problem in the NP-complete class, and all NP-complete problems can be reduced to each other

slide-80
SLIDE 80

if you can prove that any NP-complete problem is in P, then all NP problems are in P! (more motivation: you also win $1M)

slide-81
SLIDE 81

if you can prove P ≠ NP, we can stop looking for fast solutions to many hard problems (motivation: you still win $1M)

slide-82
SLIDE 82

a decision problem

input → decision algorithm → yes / no

slide-83
SLIDE 83

deadlock prevention

input: resources available, requests & allocations, running programs → will the system deadlock? → yes / no

slide-84
SLIDE 84

the halting problem

input: description of a program and its inputs → will the program halt (or run forever)? → yes / no

slide-85
SLIDE 85

e.g., write the function: halt(f) → bool

  • return true if f will halt
  • return false otherwise
slide-86
SLIDE 86

    def halt(f):
        # your code here

    def loop_forever():
        while True: pass

    def just_return():
        return True

    halt(loop_forever)  # => False
    halt(just_return)   # => True

slide-87
SLIDE 87

#$^%&#@!!!

    def halt(f):
        # your code here

    def gotcha():
        if halt(gotcha):
            loop_forever()
        else:
            just_return()

    halt(gotcha)

slide-88
SLIDE 88
slide-89
SLIDE 89

proof by contradiction: the halting problem is undecidable

slide-90
SLIDE 90

generally speaking, the halting problem can be reduced to deadlock prediction

slide-91
SLIDE 91

i.e., determining if a system will deadlock is, in general, provably impossible!!

slide-92
SLIDE 92

§ Deadlock Detection & Recovery

slide-93
SLIDE 93

¶ Basic approach: cycle detection

slide-94
SLIDE 94

e.g., Tarjan’s strongly connected components algorithm; O(|V|+|E|)

slide-95
SLIDE 95

need only run on mutex resources and “involved” processes … still, would be nice to reduce the size of the resource allocation graph

slide-96
SLIDE 96

actual resources involved are unimportant; we only care about relationships between processes
slide-97
SLIDE 97

[Figure: Resource Allocation Graph over processes P1, P2, P3, P4, P5]

slide-98
SLIDE 98

[Figure: “Wait-for” Graph over processes P1, P2, P3, P4, P5]

slide-99
SLIDE 99

Substantial optimization!

[Figure: the resource allocation graph next to its wait-for graph, both over processes P1, P2, P3, P4, P5]

slide-100
SLIDE 100

… but not very useful when we have multi-instance resources (false positives are likely)
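For single-instance resources, collapsing the resource allocation graph into a wait-for graph and finding the stuck processes can be sketched as follows. The figures are missing from the transcript, so the processes, resources and edges here are invented; `wait_for_graph` and `stuck_processes` are illustrative names.

```python
def wait_for_graph(requests, holders):
    """requests: {process: [resources it waits for]}; holders: {resource: holding process}."""
    graph = {}
    for proc, wanted in requests.items():
        for res in wanted:
            holder = holders.get(res)
            if holder is not None and holder != proc:
                graph.setdefault(proc, set()).add(holder)   # proc waits for holder
    return graph

def stuck_processes(graph):
    """Repeatedly drop processes whose holders are not waiting; the rest are stuck."""
    waits = {p: set(ws) for p, ws in graph.items()}
    while True:
        free = [p for p, ws in waits.items() if ws.isdisjoint(waits)]
        if not free:
            return set(waits)
        for p in free:
            del waits[p]

# Hypothetical single-instance system.
holders = {"R1": "P2", "R2": "P1", "R3": "P5"}
requests = {"P1": ["R1"], "P2": ["R2"], "P3": ["R1"], "P4": ["R3"]}

graph = wait_for_graph(requests, holders)
print(stuck_processes(graph))
```

P1 and P2 wait on each other, P3 waits on P2 and is stuck with them, and P4 only waits on P5, which is not waiting for anything and will eventually release R3.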

slide-101
SLIDE 101

¶ Deadlock detection algorithm

slide-102
SLIDE 102

important: do away with the requirement of a priori resource need declarations

slide-103
SLIDE 103

new assumption: processes can complete with current allocation + all pending requests, i.e., no future requests

unrealistic! (but we have no crystal ball)

slide-104
SLIDE 104

keep track of all pending requests in:

request[i][j] = num of Rj requested by Pi

slide-105
SLIDE 105
Detection algorithm

  • 1. finish[i] ← all_nil?(allocated[i]) ∀ i ∈ 1…n (ignore processes that aren’t allocated anything); work ← available
  • 2. Find i: finish[i] = false & request[i][j] ≤ work[j] ∀ j. If none, go to 4.
  • 3. work ← work + allocated[i]; finish[i] ← true. Go to 2.
  • 4. If finish[i] = false for some i, the system is deadlocked.

slide-106
SLIDE 106

3 resources: A (7), B (2), C (6)

      Allocated     Request
      A  B  C       A  B  C
P0    0  1  0       0  0  0
P1    2  0  0       2  0  2
P2    3  0  3       0  0  0
P3    2  1  1       1  0  0
P4    0  0  2       0  0  2

Available
A  B  C
0  0  0

  • Not deadlocked: <P0, P2, P1, P3, P4>
  • P2 requests <0, 0, 1>
slide-107
SLIDE 107

¶ Discussion

slide-108
SLIDE 108
  • 1. Speed?
slide-109
SLIDE 109
  • 1. finish[i] ← all_nil?(allocated[i]) ∀ i ∈ 1…n; work ← available
  • 2. Find i: finish[i] = false & request[i][j] ≤ work[j] ∀ j. If none, go to 4.
  • 3. work ← work + allocated[i]; finish[i] ← true. Go to 2.
  • 4. If finish[i] = false for some i, the system is deadlocked.

Still O(N∙N∙M) = O(N²∙M)

slide-110
SLIDE 110
  • 2. When to run?
slide-111
SLIDE 111

… as seldom as possible! tradeoff: the longer we wait between checks, the messier resulting deadlocks might be

slide-112
SLIDE 112
  • 3. Recovery?
slide-113
SLIDE 113

One or more processes must release resources:

  • via forced termination
  • resource preemption
  • system rollback

cool, but how?

slide-114
SLIDE 114

Resource preemption only possible with certain types of resources

  • no intermediate state
  • can be taken away and returned (while blocking process)

  • e.g., mapped VM page
slide-115
SLIDE 115

Rollback requires process checkpointing:

  • periodically autosave/reload process state
  • cost depends on process complexity
  • easier for special-purpose systems
slide-116
SLIDE 116

How many to terminate/preempt/rollback?

  • at least one for each disjoint cycle
  • non-trivial to determine how many cycles and which processes!

slide-117
SLIDE 117

Selection criteria (who to kill) = minimize cost

  • # processes
  • completed run-time
  • # resources held / needed
  • arbitrary priority (no killing system processes!)
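One way to picture "minimize cost" is a scoring function over the deadlocked processes. Everything in the sketch below is hypothetical: the `Candidate` fields, the weight of 10.0 and the cost formula are invented to illustrate the criteria listed above, not taken from any real kernel.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    pid: int
    run_time: float          # seconds of completed work lost if killed
    resources_held: int      # resources freed if killed
    is_system: bool = False  # system processes are never selected

def pick_victim(candidates):
    """Choose the cheapest non-system process to terminate (illustrative cost only)."""
    eligible = [c for c in candidates if not c.is_system]
    if not eligible:
        return None
    # Hypothetical cost: prefer losing little completed work and freeing many resources.
    return min(eligible, key=lambda c: c.run_time - 10.0 * c.resources_held)

deadlocked = [
    Candidate(pid=1, run_time=120.0, resources_held=2),
    Candidate(pid=2, run_time=5.0, resources_held=1),
    Candidate(pid=3, run_time=1.0, resources_held=4, is_system=True),
]
victim = pick_victim(deadlocked)
print(victim.pid)
```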

slide-118
SLIDE 118

Dealing with deadlock is hard!

slide-119
SLIDE 119

Moral of this and the concurrency material:

  • be careful with concurrent resource sharing
  • use concurrency mechanisms that avoid explicit locking whenever possible!