
Distributed Algorithms
Part III: Shared Memory
Bertinoro, March 2009

Shared memory

  • Shared memory distributed system
    – Set of processes
    – Set of shared objects (variables)
      • access to a shared variable is a single (indivisible) event
  • Communication through shared variables
    – No channels
  • We use a single automaton to model the entire system
    – using several automata and the composition operation leads to some (technical) difficulties


Shared memory

  [Figure: one automaton for the entire system, with Port 1, Port 2, …, Port n for processes p1, p2, …, pn]


Shared memory

  • However, we think of each pi as an automaton
    – statesi, starti, …
  • Each shared variable x
    – valuesx, initialx, …
      • binary values, arbitrary values, bounded, unbounded, …
    – single/multiple
      • single-writer or multiple-writer
      • single-reader or multiple-reader
    – type (read-write, read-modify-write, …)
  • The overall system automaton A
    – states: consists of a state for each pi and a value for each shared variable
    – start: consists of a start state for each pi and an initial value for each shared variable


Shared memory

  • act(A), set of actions
    – each action is “associated” with a pi
    – some actions are “associated” also with a (some) shared variable(s)
  • Input/output actions of pi
    – “port” of process i
  • Internal actions of pi
    – local computation


Shared memory

  • trans(A)
    – restricted so that only the state of the involved process and shared variables can be changed
    – (s,π,s’) is a valid step for A if (si,π,si’) is a valid step for pi
      • π cannot change the state of other processes
    – π can access only associated variables
  • Technicality
    – if π is associated with pi and with shared variable x, then whether or not π is enabled should depend only on the state of pi and not on x.




Shared memory

  • tasks(A)
    – task structure consistent with process structure
    – each equivalence class should include only actions associated with a single process
      • that is, an equivalence class cannot contain actions of pi and actions of pj

  Example
    Set of n processes, one shared variable x
    Consensus problem:
      port i: initi(v), decidei(v)


Shared memory

  SharedVarConsensus automaton

  States, for each i:
    statusi ∈ {idle, access, decide, done}, initially idle
    inputi ∈ V ∪ {⊥}, initially ⊥
    outputi ∈ V ∪ {⊥}, initially ⊥
  Shared variables:
    x ∈ V ∪ {⊥}, initially ⊥, accessible by all processes

  Transitions, for each i:

    input init(v)i
      Effect:
        inputi := v
        if statusi = idle then statusi := access

    internal accessi
      Pre:
        statusi = access
      Effect:
        if x = ⊥ then x := inputi
        outputi := x
        statusi := decide

    output decide(v)i
      Pre:
        statusi = decide
        outputi = v
      Effect:
        statusi := done

  Tasks: for every i, {accessi, decidei}
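The idea behind SharedVarConsensus is first-write-wins: the first process to access x writes its input, and everyone decides on whatever x then holds. The following Python sketch is illustrative only and is not part of the lecture notes; the lock stands in for the indivisibility of the access step, and all names are mine.

    import threading

    class SharedVarConsensus:
        """First-write-wins consensus on a single shared cell x (illustrative sketch)."""
        def __init__(self):
            self.x = None                 # plays the role of x, initially "bottom"
            self.lock = threading.Lock()  # models the single indivisible access event

        def access(self, input_value):
            # access_i: if x = bottom then x := input_i; output_i := x
            with self.lock:
                if self.x is None:
                    self.x = input_value
                return self.x

    def run_demo():
        obj = SharedVarConsensus()
        decisions = {}
        def process(i, v):
            decisions[i] = obj.access(v)          # decide_i(output_i)
        threads = [threading.Thread(target=process, args=(i, v))
                   for i, v in enumerate([10, 20, 30])]
        for t in threads: t.start()
        for t in threads: t.join()
        assert len(set(decisions.values())) == 1  # agreement, whatever the interleaving
        print(decisions)

    run_demo()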


Shared memory

  • We can model the environment that controls each process with an automaton
  • For the previous case we can have
    – an automaton Ui that controls initi and receives the decidei actions
    – initi executed only once
  • Exercise: write automaton code for Ui


Shared variables type

  • Variable type
    – a set V of values
    – an initial value v0 ∈ V
    – a set of invocations
    – a set of responses
    – a function f: invocations × V → responses × V
  • A variable type is not an automaton
    – Looks similar
    – Invocations and responses occur together as part of a function application


Shared variable types

  • Read/write shared variables (register)
    – invocations are: read and write(v)
    – responses are: v (a value) and ack
    – function (a small Python sketch follows this slide):
      • f(read, v) = (v, v)
      • f(write(v), w) = (ack, v)
  • Question
    – Is the shared variable used in the Consensus example a read-write variable?
  • Exercise: rewrite the code of the Consensus example using a read-write shared variable
    – Hint: instead of one single action that reads and writes x you need two actions (one for reading and another for writing)
    – Can you make it work? Show an example where it fails.
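The read/write type can be written down directly as the function f above. A minimal Python sketch (names are mine, not from the slides): f maps an invocation and the current value to a (response, new value) pair.

    def f_read_write(invocation, v):
        """f: invocations x V -> responses x V for a read/write register."""
        op = invocation[0]
        if op == "read":
            return (v, v)              # f(read, v) = (v, v)
        if op == "write":
            w = invocation[1]
            return ("ack", w)          # f(write(w), v) = (ack, w)
        raise ValueError("unknown invocation")

    # Example: start from value 0, write 5, then read it back.
    resp, v = f_read_write(("write", 5), 0)   # -> ("ack", 5)
    resp, v = f_read_write(("read",), v)      # -> (5, 5)
    print(resp, v)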


Shared variable types

  • Read-modify-write shared variables
    – the function f allows to
      • read the variable
      • perform some local computation
      • write a new value into the variable
    – all in one step
  • Difficult to implement
    – two accesses (a read and a write)




Shared variables type

  • Swap
      f(u,v) = (v,u)
  • Compare-and-swap
      f((u,v),w) = (w,v),  if u = w
                   (w,w),  otherwise
  • Test-and-set
      f(u,v) = (v,1)
  • Fetch-and-add
      f(u,v) = (v,u+v)

  (each of these f’s is sketched in Python after this slide)
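Each of these read-modify-write types is just a particular choice of f. An illustrative Python sketch (function names are mine); in every case the first argument is the invocation and the second is the current value, and the result is (response, new value).

    def swap(u, v):
        return (v, u)                        # f(u, v) = (v, u)

    def compare_and_swap(inv, w):
        u, v = inv                           # u: expected value, v: new value
        return (w, v) if u == w else (w, w)  # f((u,v), w)

    def test_and_set(u, v):
        return (v, 1)                        # f(u, v) = (v, 1)

    def fetch_and_add(u, v):
        return (v, u + v)                    # f(u, v) = (v, u+v)

    # Example: fetch-and-add of 3 on current value 10 returns the old value 10
    # and leaves 13 in the variable.
    resp, new = fetch_and_add(3, 10)
    print(resp, new)                         # 10 13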


MUTUAL EXCLUSION AND RESOURCE ALLOCATION


Mutual Exclusion

  • Resource allocation problem
    – a single resource that can support only one user at a time (e.g. a printer)
    – several users U1, U2, …, Un want to access the resource
  • In an “operating systems” setting
    – several processes have a portion of their code which is called the “critical region”
    – no two processes can concurrently execute their critical regions


Mutual Exclusion

  • Each process has 4 regions
    – Remainder
    – Trying
    – Critical
    – Exit
  • Execution loops through these regions

  [Figure: cycle Remainder → Trying → Critical → Exit]


Mutual Exclusion

  • User Ui issues
    – tryi actions (request to access the resource)
    – exiti actions (request to leave the resource)
  • Need to output
    – criti (to grant the resource)
    – remi (to return to the remainder region)
  • Ui is well-formed
    – if it follows the cycle of actions tryi - (criti) - exiti - (remi)


Mutual Exclusion

  [Figure: users U1, …, Un interacting with the system automaton (processes p1, …, pn) through the actions try1, crit1, rem1, exit1, …, tryn, critn, remn, exitn]



Mutual Exclusion

  • Problem definition
    – well-formedness: in any execution, and for any i, the subsequence of actions between Ui and A is well-formed
  • Mutual exclusion
    – there is no reachable state where more than one user is in its critical region
  • Progress: at any point in a fair execution
    – (Progress for the trying region): if at least one user is in the T region and no user is in the C region, then at some later point some user enters its C region
    – (Progress for the exit region): if at least one user is in its E region, then at some later point some user enters its R region


Dijkstra’s algorithm

  Shared read/write variables:
    turn ∈ {1,…,n}, initially arbitrary, writeable and readable by all processes
    for every i: flag(i) ∈ {0,1,2}, initially 0, writable by process i, readable by all processes

  Pseudocode for process i:
   1.  ** remainder region **
   2.
   3.  flag(i) := 1
   4.  while turn ≠ i do
   5.    if flag(turn) = 0 then turn := i          ** trying region **
   6.  flag(i) := 2
   7.  for j ≠ i do
   8.    if flag(j) = 2 then goto 3
   9.
  10.  ** critical region **
  11.
  12.  flag(i) := 0                                ** exit region **

  DijkstraME (pseudocode)
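A rough Python rendering of the pseudocode above, for concreteness. This is only an illustrative sketch: Python threads and the GIL are not the indivisible-read/write shared-memory model the slides assume, and the demo harness (worker loop, counters) is mine.

    import threading

    N = 3
    turn = 0              # shared: turn, 0-based here; initial value is arbitrary
    flag = [0] * N        # shared: flag(i) in {0,1,2}, initially 0
    in_critical = 0       # used only to check mutual exclusion in the demo

    def acquire(i):
        global turn
        while True:
            flag[i] = 1                                         # line 3
            while turn != i:                                    # line 4
                if flag[turn] == 0:                             # line 5
                    turn = i
            flag[i] = 2                                         # line 6
            if all(flag[j] != 2 for j in range(N) if j != i):   # lines 7-8
                return                                          # enter the critical region
            # some other flag equals 2: restart from line 3

    def release(i):
        flag[i] = 0                                             # line 12

    def worker(i):
        global in_critical
        for _ in range(50):
            acquire(i)
            in_critical += 1
            assert in_critical == 1        # mutual exclusion holds in this run
            in_critical -= 1
            release(i)

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(N)]
    for t in threads: t.start()
    for t in threads: t.join()
    print("done")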


Dijkstra’s algorithm

  DijkstraME automaton

  States, for each i:
    pci ∈ {rem, set-flag1, test-turn, test-flag(j), set-turn, set-flag2, check, leave-try, crit, reset, leave-exit}, initially rem
    Si, a set of process indices, initially {}
  Shared variables:
    turn ∈ {1,…,n}, initially arbitrary
    for every i: flag(i) ∈ {0,1,2}, initially 0

  Transitions, for each i:

    input tryi
      Effect:
        pci := set-flag1

    internal set-flag1i
      Pre:
        pci = set-flag1
      Effect:
        flag(i) := 1; pci := test-turn

    internal test-turni
      Pre:
        pci = test-turn
      Effect:
        if turn = i then pci := set-flag2
        else pci := test-flag(turn)

    internal test-flag(j)i
      Pre:
        pci = test-flag(j)
      Effect:
        if flag(j) = 0 then pci := set-turn
        else pci := test-turn

Dijkstra’s algorithm

  DijkstraME automaton (ctnd)

    internal set-turni
      Pre:
        pci = set-turn
      Effect:
        turn := i; pci := set-flag2

    internal set-flag2i
      Pre:
        pci = set-flag2
      Effect:
        flag(i) := 2
        Si := {i}
        pci := check

    internal check(j)i
      Pre:
        pci = check
        j ∉ Si
      Effect:
        if flag(j) = 2 then
          Si := {}
          pci := set-flag1
        else
          Si := Si ∪ {j}
          if |Si| = n then pci := leave-try

    output criti
      Pre:
        pci = leave-try
      Effect:
        pci := crit

    input exiti
      Effect:
        pci := reset

    internal reseti
      Pre:
        pci = reset
      Effect:
        flag(i) := 0
        Si := {}
        pci := leave-exit

    output remi
      Pre:
        pci = leave-exit
      Effect:
        pci := rem


Dijkstra’s algorithm

  A correctness argument (operational)

  Lemma: Dijkstra’s algorithm satisfies mutual exclusion.

  Proof: By contradiction. Assume that Ui and Uj, i ≠ j, are simultaneously in region C in some reachable state. Consider an execution that leads to such a state. By the code, both processes i and j set their flag to 2 before entering C. Assume, without loss of generality, that pi sets flag(i) to 2 before pj sets flag(j) to 2. Then flag(i) remains equal to 2 until i leaves C. But since j enters C before i leaves it, it follows that j sets flag(j) to 2 while flag(i) = 2, and j then checks the other flags and finds flag(i) = 2. By the code, j cannot enter C. Contradiction. ☐


Dijkstra’s algorithm

  • Assertional proof of Mutual Exclusion

  Assertion 1: In any reachable state, |{i : pci = crit}| ≤ 1.

  • Assertion 1 is a consequence of the following

  Assertion 2: In any reachable state, if pci ∈ {leave-try, crit, reset}, then |Si| = n.

  Assertion 3: In any reachable state, there do not exist i and j, i ≠ j, such that j ∈ Si and i ∈ Sj.




Dijkstra’s algorithm

  • To prove Assertion 3, we use other, simpler assertions

  Assertion 4: In any reachable state, if Si ≠ {}, then pci ∈ {check, leave-try, crit, reset}.

  Assertion 5: In any reachable state, if pci ∈ {check, leave-try, crit, reset}, then flag(i) = 2.

  Assertion 6: In any reachable state, if Si ≠ {}, then flag(i) = 2.

  Exercises: prove Assertions 2, 4, 5, 3.


Mutual Exclusion

  • The previous definition of the problem allows the following condition
    – a given process stays forever in the T region because other processes keep entering the C region
    – the process is locked out
  • Stronger conditions: lockout-freedom
    – if all users always return the resource, then any user that reaches T eventually enters C
    – any user that reaches E eventually enters R


Peterson’s algorithms

  • A first algorithm
    – works for n = 2 processes
  • Generalized to the case of n processes
    – A second algorithm
      • using a series of n-1 “competitions” based on the first algorithm
    – A third algorithm
      • using the first algorithm in a “tournament”
  • All three algorithms are lockout-free


Peterson’s algorithms

  Shared read/write variables:
    turn ∈ {0,1}, initially arbitrary, writeable and readable by the two processes
    for every i ∈ {0,1}: flag(i) ∈ {0,1}, initially 0, writable by process i, readable by process 1-i

  Pseudocode for process i:
   1.  ** remainder region **
   2.
   3.  flag(i) := 1
   4.  turn := i                                   ** trying region **
   5.  wait for flag(1-i) = 0 or turn ≠ i
   6.
   7.  ** critical region **
   8.
   9.  flag(i) := 0                                ** exit region **

  Peterson-2P (pseudocode)
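A direct Python transcription of Peterson-2P, for illustration only (the demo loop and counter are mine, and Python's threading model is only an approximation of the atomic-register model assumed on the slides).

    import threading

    flag = [0, 0]     # flag(i) in {0,1}, initially 0
    turn = 0          # turn in {0,1}, initially arbitrary
    counter = 0       # demo check only

    def acquire(i):
        global turn
        flag[i] = 1                                    # line 3
        turn = i                                       # line 4
        while not (flag[1 - i] == 0 or turn != i):     # line 5: wait for ...
            pass

    def release(i):
        flag[i] = 0                                    # line 9

    def worker(i):
        global counter
        for _ in range(1000):
            acquire(i)
            counter += 1          # critical region
            release(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in (0, 1)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(counter)    # 2000 if no critical-region update was lost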


Peterson’s algorithms

  Shared read/write variables:
    for every k ∈ {1,…,n-1}: turn(k) ∈ {1,…,n}, initially arbitrary, writeable and readable by all processes
    for every i ∈ {1,…,n}: flag(i) ∈ {0,…,n-1}, initially 0, writable by process i, readable by all j ≠ i

  Pseudocode for process i:
   1.  ** remainder region **
   2.
   3.  for k = 1 to n-1 do
   4.    flag(i) := k
   5.    turn(k) := i                                         ** trying region **
   6.    wait for [∀j ≠ i : flag(j) < k] or [turn(k) ≠ i]
   7.
   8.  ** critical region **
   9.
  10.  flag(i) := 0                                            ** exit region **

  Peterson-NP1 (pseudocode)


Peterson’s algorithms

  • We can also use Peterson-2P as a subroutine for a “tournament”-like algorithm
    – the winner gets the C region
  • For simplicity, assume n = 2^k for some k
    – number the processes from 0 through n-1
  • Arrange the processes in a complete tree
    – pi, i = 0,…,n-1, corresponds to the ith leaf
    – Label the ith leaf with the binary representation of i
    – Label internal nodes with common prefixes
    – The root has the null label




Peterson’s algorithms

  [Figure: complete binary tree for n = 8. Leaves 000, 001, …, 111 correspond to p0, p1, …, p7; level-1 nodes are labeled 00, 01, 10, 11; level-2 nodes are labeled 0, 1; the level-3 (log n) root has the null label]


Peterson’s algorithms

  • comp(i,k), the level-k competition of pi
    – is the string of the high-order (log n - k) bits of the binary representation of i
    – is equal to the label of the ancestor of i at level k
  • role(i,k) ∈ {0,1}, the “role” of pi
    – is the (log n - k + 1)st bit of the binary representation of i
  • opponents(i,k), the “opponents” of pi
    – the set of process indices with the same comp(i,k) but a different role
    – that is, the processes in the other subtree
  (a small Python sketch of these three definitions follows this slide)
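All three definitions are simple bit manipulations on i. A minimal Python sketch (function names are mine, assuming n is a power of two as on the previous slide):

    def comp(i, k, n):
        """High-order (log n - k) bits of i: the level-k competition label of p_i."""
        h = n.bit_length() - 1                 # h = log2(n)
        return format(i, "0%db" % h)[: h - k]

    def role(i, k, n):
        """The (log n - k + 1)st bit of i: which side of the level-k match p_i plays."""
        h = n.bit_length() - 1
        return format(i, "0%db" % h)[h - k]

    def opponents(i, k, n):
        """Processes with the same comp(i,k) but the other role: the other subtree."""
        return [j for j in range(n)
                if comp(j, k, n) == comp(i, k, n) and role(j, k, n) != role(i, k, n)]

    # Consistent with the example on the next slide: role(5,2) = 0, role(6,2) = 1.
    print(comp(5, 2, 8), role(5, 2, 8), role(6, 2, 8), opponents(5, 2, 8))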


Peterson’s algorithms

  [Figure: the same tree, annotated for the level-2 competition of p5: comp(4,2), comp(5,2), comp(6,2) and comp(7,2) all name the level-2 ancestor labeled 1; role(5,2) = 0, role(6,2) = 1; opponents(5,2) are the processes in the other subtree (p6, p7)]


Peterson’s algorithms

  Shared read/write variables:
    for every binary string x of length at most log n - 1:
      turn(x) ∈ {0,1}, initially arbitrary, writeable and readable by those processes i for which x is a prefix of the binary representation of i
    for every i ∈ {1,…,n-1}: flag(i) ∈ {0,…,log n}, initially 0, writable by process i, readable by all j ≠ i

  Pseudocode for process i:
   1.  ** remainder region **
   2.
   3.  for k = 1 to log n do
   4.    flag(i) := k
   5.    turn(comp(i,k)) := role(i,k)                            ** trying region **
   6.    wait for [∀j ∈ opponents(i,k): flag(j) < k]
   7.             or [turn(comp(i,k)) ≠ role(i,k)]
   8.
   9.  ** critical region **
  10.
  11.  flag(i) := 0                                              ** exit region **

  Peterson-NP2 (pseudocode)


Peterson’s algorithm

  • Exercises:
    – Write I/O automata code for Peterson2P
    – Write I/O automata code for PetersonNP1
    – Write I/O automata code for PetersonNP2
  • Exercises:
    – Write (or sketch) a correctness proof for Peterson2P using the pseudocode
    – Write (or sketch) a correctness proof for Peterson2P using the IOA code


Mutual Exclusion

  • The algorithms for Mutual Exclusion so far
    – use multi-writer shared registers
  • We now present two algorithms that use only single-writer registers
    – Burns’ algorithm
      • does not guarantee lockout-freedom
    – Bakery algorithm
      • uses unbounded-size (SAFE) registers
  • We will not present the proofs of correctness
    – Exercise: Prove that Burns’ algorithm is correct




Burns’ algorithm

  Shared variables:
    for every i ∈ {1,…,n}: flag(i) ∈ {0,1}, initially 0, writable by process i, readable by all j ≠ i

  Pseudocode for process i:
   1.  ** remainder region **
   2.
   3.  flag(i) := 0
   4.  for j = 1 to i-1 do
   5.    if flag(j) = 1 then goto 3
   6.  flag(i) := 1                                 ** trying region **
   7.  for j = 1 to i-1 do
   8.    if flag(j) = 1 then goto 3
   9.  for j = i+1 to n do
  10.    if flag(j) = 1 then goto 9
  11.
  12.  ** critical region **
  13.
  14.  flag(i) := 0                                 ** exit region **

  Burns (pseudocode)


Bakery algorithm

  Shared variables:
    for every i ∈ {1,…,n}:
      choosing(i) ∈ {0,1}, initially 0, writable by process i, readable by all j ≠ i
      number(i) ∈ Nat, initially 0, writable by process i, readable by all j ≠ i

  Pseudocode for process i:
   1.  ** remainder region **
   2.
   3.  choosing(i) := 1
   4.  number(i) := 1 + maxj≠i number(j)
   5.  choosing(i) := 0                                              ** trying region **
   6.  for j ≠ i do
   7.    wait for choosing(j) = 0
   8.    wait for number(j) = 0 or (number(i),i) < (number(j),j)
   9.
  10.  ** critical region **
  11.
  12.  number(i) := 0                                                ** exit region **

  Bakery (pseudocode)
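A compact Python sketch of the Bakery pseudocode, illustrative only (the demo harness is mine; CPython's interleaving is only an approximation of the single-writer register model, and the numbers grow without bound exactly as the slide notes).

    import threading

    N = 3
    choosing = [0] * N        # choosing(i), written by i, read by all
    number = [0] * N          # number(i), written by i, read by all (unbounded)
    counter = 0               # demo check only

    def acquire(i):
        choosing[i] = 1                                                # line 3
        number[i] = 1 + max(number[j] for j in range(N) if j != i)    # line 4
        choosing[i] = 0                                                # line 5
        for j in range(N):                                             # line 6
            if j == i:
                continue
            while choosing[j] != 0:                                    # line 7
                pass
            while not (number[j] == 0 or                               # line 8
                       (number[i], i) < (number[j], j)):
                pass

    def release(i):
        number[i] = 0                                                  # line 12

    def worker(i):
        global counter
        for _ in range(200):
            acquire(i)
            counter += 1        # critical region
            release(i)

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(N)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(counter)              # 600 if no critical-region update was lost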


Mutual Exclusion

  • Read-Modify-Write shared registers
  • Exercise: write a (trivial) algorithm (IOA) to solve Mutual Exclusion with read-modify-write registers
  • Number of bypasses:
    – Consider a pi in the T region; pj bypasses pi if it enters the C region.
    – If the number of bypasses, for any process that enters T, is bounded, then lockout-freedom is guaranteed
  • Exercise: write a (simple) algorithm (IOA) that guarantees bounded bypass (using read-modify-write registers).


Resource allocation

  • Mutual Exclusion
    – one shared resource, with single access
      • no two users can use the resource at the same time
  • Generalization
    – several shared resources, each with single access
    – each user might need more than one resource
  • Resources have to be granted so that
    – each user gets the set it needs
    – no single resource is given to two users at the same time


Resource allocation

  • Example
    – 4 users: U1, U2, U3, U4
    – 4 resources: r1, r2, r3, r4
  • User requirements
    – U1 needs r1, r2
    – U2 needs r1, r3
    – U3 needs r2, r4
    – U4 needs r3, r4
  • Exclusion specification
    – “forbidden” sets of users
    – E = {{1,2},{1,3},{2,4},{3,4}}

  [Figure: graph of users U1–U4 and resources r1–r4, each user adjacent to the resources it needs]


Dining Philosophers

  [Figure: six philosophers P1, …, P6 around a table with six forks f1, …, f6, one fork between each pair of neighbors]




Dining Philosophers

  • Symmetric algorithms
    – all processes are identical
    – can refer to the resources only as “left” and “right”

  Theorem: There is no symmetric algorithm for the Dining Philosophers problem.

  • Exercise: Prove the theorem.
    – Hint: Construct an infinite execution (remember that processes are identical).


Dining Philosophers

  [Figure: the same table of six philosophers P1, …, P6 and six forks f1, …, f6]


Dining Philosophers

  • Right-Left algorithm (a small Python sketch follows this slide)
    – processes are of two kinds
      • “right” processes
      • “left” processes
  • Odd-numbered philosophers are “right”
    – they try to grab the right fork first
  • Even-numbered philosophers are “left”
    – they try to grab the left fork first
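A sketch of the ordering idea in Python, illustrative only: forks are modelled as locks, the fork-naming convention and the demo harness are mine, and an even number of philosophers is assumed so that the right/left alternation is possible all around the ring.

    import threading

    N = 6                                            # even number of philosophers
    forks = [threading.Lock() for _ in range(N)]     # fork i sits between philosophers i and (i+1) % N

    def forks_in_order(i):
        right_fork, left_fork = i, (i - 1) % N       # one possible naming; the point is the order below
        if i % 2 == 1:
            return [right_fork, left_fork]           # odd philosophers grab the right fork first
        return [left_fork, right_fork]               # even philosophers grab the left fork first

    def philosopher(i):
        first, second = forks_in_order(i)
        for _ in range(100):
            with forks[first]:
                with forks[second]:
                    pass                             # critical region: eat

    threads = [threading.Thread(target=philosopher, args=(k,)) for k in range(N)]
    for t in threads: t.start()
    for t in threads: t.join()
    print("every philosopher ate")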


Dining Philosophers

  [Figure: the six philosophers and six forks again, illustrating the Right-Left orientation]


Randomized Dining Philosophers

  • Processes are all identical
    – can access their forks using
      • fork(left)
      • fork(right)
  • The first fork is chosen at random
  • Define
      ¬k = left,   if k = right
           right,  if k = left


Randomized Dining Philosophers

  Shared variables:
    for every i ∈ {1,…,n}: fork(i), a Boolean, initially false, accessible by processes i-1 and i

  Pseudocode for process i:
   1.  ** remainder region **
   2.
   3.  do forever
   4.    first := random()
   5.    second := ¬first
   6.    wait until fork(first) = false
   7.    fork(first) := true
   8.    if fork(second) = false then                 ** trying region **
   9.      fork(second) := true
  10.      goto 13
  11.    else fork(first) := false
  12.
  13.  ** critical region **
  14.
  15.  fork(first) := fork(second) := false            ** exit region **

  LehmannRabin (pseudocode)
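An illustrative Python sketch of the same idea (not from the lecture notes): each fork is a lock, the test-and-grab of a fork is collapsed into a single try-acquire (a simplification of the two slide steps), and the demo harness is mine. Progress is, as the following slides discuss, only guaranteed with probability 1.

    import random
    import threading

    N = 4                                            # philosophers 0..N-1
    forks = [threading.Lock() for _ in range(N)]     # fork i sits between philosophers i and (i+1) % N

    def left(i):  return i                           # local names for philosopher i's two forks
    def right(i): return (i + 1) % N

    def dine_once(i):
        while True:
            first = random.choice(["left", "right"])          # line 4
            second = "right" if first == "left" else "left"   # line 5
            f1 = left(i) if first == "left" else right(i)
            f2 = left(i) if second == "left" else right(i)
            forks[f1].acquire()                               # lines 6-7: wait for and take the first fork
            if forks[f2].acquire(blocking=False):             # line 8: is the second fork free?
                return (f1, f2)                               # both forks held: critical region
            forks[f1].release()                               # line 11: drop the first fork and retry

    def philosopher(i):
        for _ in range(50):
            f1, f2 = dine_once(i)
            # ** critical region: eat **
            forks[f1].release()                               # line 15
            forks[f2].release()

    threads = [threading.Thread(target=philosopher, args=(k,)) for k in range(N)]
    for t in threads: t.start()
    for t in threads: t.join()
    print("all philosophers finished")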




Randomized Dining Philosophers

  LehmannRabin automaton

  States, for each i:
    pci ∈ {rem, flip, wait, second, drop, leave-try, crit, reset-left, reset-right, leave-exit}, initially rem
    firsti ∈ {left, right}, initially arbitrary
  Shared variables:
    for every i: fork(i) ∈ {false, true}, initially false

  Transitions, for each i:

    input tryi
      Effect:
        pci := flip

    internal flipi
      Pre:
        pci = flip
      Effect:
        first := random(); pci := wait

    internal waiti
      Pre:
        pci = wait
      Effect:
        if fork(first) = false then
          fork(first) := true
          pci := second

    internal secondi
      Pre:
        pci = second
      Effect:
        if fork(¬first) = false then
          fork(¬first) := true
          pci := leave-try
        else pci := drop

    internal dropi
      Pre:
        pci = drop
      Effect:
        fork(first) := false
        pci := flip

Randomized Dining Philosophers

  LehmannRabin automaton (ctnd)

    output criti
      Pre:
        pci = leave-try
      Effect:
        pci := crit

    input exiti
      Effect:
        pci := reset-right

    internal reset-righti
      Pre:
        pci = reset-right
      Effect:
        fork(right) := false
        pci := reset-left

    internal reset-lefti
      Pre:
        pci = reset-left
      Effect:
        fork(left) := false
        pci := leave-exit

    output remi
      Pre:
        pci = leave-exit
      Effect:
        pci := rem


Randomized Dining Philosophers

  • Mutual exclusion
    – fairly easy to prove
  • What about progress?
  • There is an execution of LehmannRabin that does not make progress
    – all processes take steps in round-robin order and always make the same random choice
    – in such an execution, no process ever reaches C
    – the probability of such an execution is 0
  • So progress is guaranteed with probability 1


CONSENSUS


Consensus

  • The consensus problem with shared memory
  • Read/write variables
    – cannot be solved
    – proof similar to the FLP impossibility result
  • Easy to solve with more powerful shared variables
    – read-modify-write
    – compare-and-swap
  • Exercise: devise an algorithm for these types of shared variables


Consensus number

  • Consensus number
    – Given a (shared) object, the consensus number is the maximum number of processes for which we can solve consensus using the object
  • Wait-free hierarchy
    – Given two objects X and Y, does there exist a wait-free implementation of X using Y?
  • Consensus numbers allow us to build such a hierarchy




Consensus number

  • Read-write registers have consensus number 1
    – that is, we cannot solve consensus
  • Test-and-set, swap, fetch-and-add, FIFO queues, stacks have consensus number 2
    – Exercise: devise a consensus algorithm for 2 processes using one of these objects
  • Compare-and-swap and FIFO queues with a peek operation have infinite consensus number
    – Exercise: devise a consensus algorithm (for any number of processes) using one of these objects


Consensus, augmented FIFO queue

  AugmentedFIFOqueueConsensus automaton

  States, for each i:
    statusi ∈ {idle, enqueue, decide, done}, initially idle
    inputi ∈ V ∪ {⊥}, initially ⊥
    outputi ∈ V ∪ {⊥}, initially ⊥
  Shared variables:
    Q, an augmented FIFO queue, initially ⊥, accessible by all processes

  Transitions, for each i:

    input init(v)i
      Effect:
        inputi := v
        if statusi = idle then statusi := enqueue

    internal enqueuei
      Pre:
        statusi = enqueue
      Effect:
        enqueue(Q, inputi)
        outputi := peek(Q)
        statusi := decide

    output decide(v)i
      Pre:
        statusi = decide
        outputi = v
      Effect:
        statusi := done

  Tasks: for every i, {enqueuei, decidei}
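The reason this works is that the head of the queue never changes once the queue is non-empty, and peek reads the head without removing it, so every process decides the first value ever enqueued. A Python sketch of that idea (names and the lock-protected list are mine, illustrative only):

    import threading

    class AugmentedQueue:
        """FIFO queue with enqueue and peek (illustrative stand-in for the shared object)."""
        def __init__(self):
            self._items = []
            self._lock = threading.Lock()
        def enqueue(self, v):
            with self._lock:
                self._items.append(v)
        def peek(self):
            with self._lock:
                return self._items[0]

    def propose(q, i, v, decisions):
        q.enqueue(v)               # enqueue_i: enqueue my own input
        decisions[i] = q.peek()    # the head never changes once the queue is non-empty

    q = AugmentedQueue()
    decisions = {}
    threads = [threading.Thread(target=propose, args=(q, i, v, decisions))
               for i, v in enumerate([7, 8, 9])]
    for t in threads: t.start()
    for t in threads: t.join()
    assert len(set(decisions.values())) == 1   # agreement, whatever the interleaving
    print(decisions)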


ATOMIC OBJECTS


Atomic objects

  • Atomic object vs. shared variable
    – accesses
      • are concurrent for an atomic object
        – invocations and responses are split
      • occur indivisibly for shared variables
  • Although accesses are concurrent
    – responses to invocations are such that accesses look like they occur in some sequential order
  • Atomic objects are also called “linearizable” objects


Atomic objects

  • Useful building blocks for distributed systems
  • Basic atomic objects
    – single-writer/single-reader read/write variables
    – are provided by hardware
  • Starting from basic atomic objects one can build more powerful atomic objects
  • Atomic objects provide the user with something that looks like a centralized (coherent) shared memory


Atomic objects

  • Recall that a shared variable type is
    – a set V of values
    – an initial value v0 ∈ V
    – a set of invocations
    – a set of responses
    – a function f: invocations × V → responses × V
  • If T is a shared variable type, an atomic object of type T is an I/O automaton that satisfies some properties, informally
    – well-formedness
    – atomicity
    – liveness conditions




Atomic objects

  • External interface

  [Figure: processes p1, …, pi, …, pn each connected to the ATOMIC OBJECT through invocationi, responsei and stopi actions]


Atomic objects

  • Read/write object

  [Figure: the same interface for a read/write object: p1 issues write(3)1 and receives ack1, pn issues write(5)n and receives ackn, pi issues readi and receives value(3)i; each port also has a stop action]


Atomic objects

  • Consider a well-formed sequence β of external actions
  • Atomicity property for β:
    – for each completed operation π, we can insert a serialization point π★ somewhere between π’s invocation and response in β
    – for some of the incomplete operations π, we can select a corresponding response and insert a serialization point π★ after π’s invocation in β
  • Serialization points
    – if we “shrink” the invocations and the responses to the serialization points, we get a trace of the underlying variable type
      • the incomplete operations with no serialization point are ignored (that is, their invocations are removed)


Atomic objects

  [Figure: example executions of a read/write register, built from write(8)1 / ack1 and read2 operations returning 0 or 8, with serialization points ★ placed between invocations and responses as the atomicity property requires]


Atomic objects

  [Figure: two further example sequences: write(8)1, read2 returning 8, read2 returning 0; and write(8)1, read2 returning 0, read2 returning 0, ack1]


Atomic objects

  • Liveness properties
    – Failure-free termination: in any execution without failures, every invocation has a response
    – f-failure termination: in any execution with at most f failures, every invocation on a non-failing process has a response
    – Wait-free termination: in any execution, every invocation of a non-failing process has a response




Atomic objects

  • Single-writer single-reader
    – the simplest type of (shared) variable
  • Multi-writer multi-reader
    – more complex

  Theorem: It is possible to implement a wait-free m-writer p-reader atomic object using single-writer single-reader shared variables
    – where m + p = n
    – that is, among the n processes, m are writers and p are readers


Atomic objects

  • Implementing read-modify-write variables
    – in terms of read-write variables
  • Exercise: give an algorithm to do this in the failure-free case
    – Hint: use a mutual exclusion algorithm

  Theorem: There does not exist a shared memory system using read-write variables that implements a read-modify-write atomic object and guarantees 1-failure termination.


Atomic snapshot

  • Instantaneous snapshot of all shared variables

  Theorem: it is possible to implement a snapshot atomic object guaranteeing wait-free termination using single-writer multi-reader shared variables.

  [Figure: a snapshot object with components x1, …, xm: processes 1, …, m invoke update(v)i (a write(v)i on xi) and receive acki; processes m+1, …, n invoke snapshot and receive the vector (v1, v2, …, vm)]


TRANSFORMATION FROM SHARED MEMORY TO MESSAGE-PASSING


Shared variables

  • How can we simulate shared variables if we have a network?
  • Let’s start from the failure-free case and read-write variables
    – multi-reader/multi-writer
  • A simple algorithm (single copy)
    – each shared variable x is “owned” by a process
    – accesses to the variable
      • messages to the owner
      • the owner applies the invocation and sends back a response


Shared variables

  • A fault-tolerant algorithm: Majority Voting
    – Each process maintains a copy of x
      • a pair <value,tag>, initially <v0,0>
    – operations require an atomic transaction involving at least a majority of the copies
      • read
        – atomically read a majority and take the value associated with the largest tag
      • write(v)
        – perform an “embedded read” to determine the largest tag t, then write <v,t+1> to at least a majority of the processes; everything as an atomic transaction




Shared variables

  Theorem: It is possible to simulate a single-writer multi-reader shared memory register in a message-passing system guaranteeing f-failure termination, for n > 2f.

  • We next describe an algorithm that implements an atomic single-writer multi-reader object
    – Attiya, Bar-Noy and Dolev


The ABD algorithm

  • Each process maintains a copy of x
    – a pair <value,tag>, initially <v0,0>
  • When the writer pw wants to perform a write(v)
    – let t be the smallest tag not yet used
    – it sets its local copy to <v,t>
    – it sends (“write”,v,t) to all other processes
    – a process that receives (“write”,v,t) updates its copy if t is greater than its current tag; in any case it sends an ack to the writer
    – when the writer has received acks from a majority, the write operation is over


The ABD algorithm

  • When a reader pr wants to perform a read
    – it sends a “read” message to all processes
    – a process that receives “read”
      • responds with its current pair <v,t>
    – when pr has learned a majority of the <v,t> pairs (including its own), it does the following:
      • let <v,t> be the pair with the largest tag
      • it propagates <v,t> to a majority of the processes
      • once pr gets acks from a majority of the processes, the read operation is over
    – the value v is returned as the result of the read operation
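A condensed Python sketch of the two phases just described, over a toy in-memory set of replicas. It is illustrative only: real channels, concurrency, failures and quorum intersection reasoning are elided, so it shows only the tag discipline of the write and of the read (with its write-back phase), not a full implementation.

    class Replica:
        def __init__(self, v0):
            self.value, self.tag = v0, 0
        def on_write(self, v, t):
            # update only if the tag is newer; always acknowledge
            if t > self.tag:
                self.value, self.tag = v, t
            return "ack"
        def on_read(self):
            return (self.value, self.tag)

    def majority(replicas):
        # any majority would do; this toy just takes the first floor(n/2)+1 replicas
        return replicas[: len(replicas) // 2 + 1]

    def abd_write(replicas, writer_tag, v):
        t = writer_tag + 1                       # smallest tag not yet used by the single writer
        for r in majority(replicas):             # "wait" for acks from a majority
            r.on_write(v, t)
        return t

    def abd_read(replicas):
        pairs = [r.on_read() for r in majority(replicas)]   # collect a majority of <value,tag> pairs
        v, t = max(pairs, key=lambda p: p[1])               # the largest tag wins
        for r in majority(replicas):                        # propagate before returning (write-back)
            r.on_write(v, t)
        return v

    replicas = [Replica(0) for _ in range(5)]
    tag = abd_write(replicas, 0, 42)
    print(abd_read(replicas))    # 42: the read sees the completed write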


The ABD algorithm

  • ABD can be used to obtain distributed (message-passing) implementations of many shared memory algorithms based on single-writer multi-reader registers
  • For example
    – snapshot atomic objects
    – multi-writer multi-reader atomic objects
  • The fault tolerance is limited by the condition n > 2f


The ABD algorithm

  • Exercise: find a situation where, assuming that f = n/2 + 1, the ABD algorithm fails

  Theorem: for n < 2f it is not possible to obtain a message-passing implementation of read/write atomic objects guaranteeing f-failure termination.


TRANSFORMATION FROM MESSAGE-PASSING TO SHARED MEMORY




Shared memory

  • Transforming a network system into a shared memory system is easier
  • The transformation keeps fault tolerance properties
    – no special requirement (like n > 2f)
  • Channels are easier to implement
    – e.g., a channel can be simulated easily with a shared variable
  • Shared memory is more powerful than the network model