CS4513 Distributed Computer Systems

Synchronization (Ch 5)

Introduction

  • Communication not enough. Need cooperation → synchronization
  • Distributed synchronization needed for
    – transactions (bank account via ATM)
    – access to a shared resource (network printer)
    – ordering of events (network games where players have different ping times)

Outline

  • Intro                        (done)
  • Clock Synchronization       (next)
  • Global Time and State
  • Election Algorithms
  • Mutual Exclusion
  • Distributed Transactions

Clock Synchronization

  • When each machine has its own clock, an event that occurred after another event may nevertheless be assigned an earlier time
  • Consider make
    – The compiling machine compares time stamps
  • The same holds when using an NFS mount
  • Can we set all clocks in a distributed system to have the same time?

Physical Clocks

  • "Exact" time was computed by astronomers
    – Take "noon" for two days, divide by 24*60*60 → mean solar second
  • But…
    – Earth is slowing! (35 days over 300 million years)
    – Short-term fluctuations (magma core, and such)
    – Could take many days for an average, but still erroneous
  • Physicists take over (Jan 1, 1958)
    – Count transitions of the cesium 133 atom
      • 9,192,631,770 transitions == 1 solar second
    – 50 cesium 133 clocks averaged → International Atomic Time (TAI)
    – To stop the day from "shifting" (remember, Earth is slowing), translate TAI into Universal Coordinated Time (UTC)
  • UTC is broadcast (shortwave radio pulses)

Clock Synchronization Algorithms

  • Not every machine has a UTC receiver
    – If one does, keep the others synchronized to it
  • Computer timers go off H times/sec and increment a counter
  • Ideally, if H=60, 216,000 ticks per hour (dC/dt = 1)
  • But with typical errors of 10^-5: 215,998 to 216,002
  • Specs can give you a maximum drift rate (ρ)
  • Every ∆t seconds, two clocks will be at most 2ρ∆t apart
  • If we want drift of at most δ, re-synchronize every δ/2ρ seconds
  • Various algorithms (next)
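The resynchronization bound above can be checked in a few lines (a sketch; `resync_interval` is a made-up helper name):

```python
def resync_interval(max_drift_rate, max_skew):
    # Two clocks can each drift up to rho (seconds per second) from
    # real time, so they drift apart at up to 2*rho. Resynchronizing
    # every delta / (2*rho) seconds keeps them within delta.
    return max_skew / (2 * max_drift_rate)

# With rho = 10^-5 and a tolerated skew of 1 second:
interval = resync_interval(1e-5, 1.0)   # about 50,000 seconds
```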


Cristian's Algorithm

  • Every δ/2ρ seconds, ask the time server for the time
  • What are the problems?
  • Major
    – Client clock is fast – What to do?
  • Minor
    – Non-zero amount of time for the reply to reach the sender – What to do?

Cristian's Algorithm

  • Want the one-way delay: (T1 – T0)/2. Problems?
    – T0 != T1? Ignore it.
    – Variance? Take the average. Or the smallest.
    – Server handling time I? Can subtract it, but need to determine that time.
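The exchange can be sketched as follows, assuming hypothetical `local_clock` and `ask_server` callables:

```python
def cristian_offset(local_clock, ask_server):
    # One exchange of Cristian's algorithm: read T0, ask the server,
    # read T1, and assume the reply took (T1 - T0)/2 one way.
    t0 = local_clock()
    server_time = ask_server()            # round trip to the time server
    t1 = local_clock()
    # The server's clock should now read about server_time + (t1 - t0)/2,
    # so this is the correction the client should apply.
    return (server_time + (t1 - t0) / 2) - t1
```

If the returned offset is negative (the client clock is fast), the client should slow its clock down gradually rather than jump it backward.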

The Berkeley Algorithm

a) The time daemon asks all the other machines for their clock values
b) The machines answer
c) The time daemon tells everyone how to adjust their clock

  • Cristian's and Berkeley's are centralized. Problems?
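The daemon's averaging step can be sketched like this (fault handling and round-trip compensation omitted):

```python
def berkeley_adjustments(daemon_time, machine_times):
    # The time daemon polls all machines, averages every clock value
    # (its own included), and tells each clock how much to adjust.
    clocks = [daemon_time] + list(machine_times)
    average = sum(clocks) / len(clocks)
    return [average - c for c in clocks]   # daemon first, then each machine

# Daemon at 3:00, others at 2:50 and 3:25 (minutes past midnight):
# average is 3:05, so adjustments are +5, +15, and -20 minutes.
adjustments = berkeley_adjustments(180, [170, 205])
```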

Decentralized Algorithms

  • Periodically (every R seconds), each machine broadcasts its current time
  • Collect time samples for some time (S)
  • Take the average and set the time
  • Can discard the m highest and m lowest values so m faulty clocks don't hurt
  • Can improve by computing (T1 – T0)/2
    – Need probes to obtain it
  • Used by the Network Time Protocol (NTP)
    – Worldwide accuracy of 1-50 msec
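Discarding the m extreme samples can be sketched as:

```python
def fault_tolerant_average(samples, m):
    # Drop the m lowest and m highest time samples before averaging,
    # so up to m faulty clocks cannot skew the result.
    trimmed = sorted(samples)[m:len(samples) - m]
    return sum(trimmed) / len(trimmed)

# One wildly fast and one wildly slow clock are ignored:
t = fault_tolerant_average([10.0, 11.0, 12.0, 100.0, -50.0], m=1)  # 11.0
```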

Outline

  • Intro                        (done)
  • Clock Synchronization       (done)
  • Global Time and State       (next)
  • Election Algorithms
  • Mutual Exclusion
  • Distributed Transactions

Lamport Timestamps

a) Processes, each with its own clock running at a different rate
b) Lamport's algorithm corrects the clocks
c) Can add the machine ID to break ties

  • Often don't need the actual time (agreeing on it exactly is impossible), just the ordering a→b ("a happens before b")
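A minimal Lamport clock, using (time, machine ID) pairs so ties break by machine ID, might look like:

```python
class LamportClock:
    # Logical clock: timestamps are (counter, machine_id) tuples, so
    # ordinary tuple comparison breaks timestamp ties by machine ID.
    def __init__(self, machine_id):
        self.machine_id = machine_id
        self.time = 0

    def tick(self):
        # Local event or message send: advance the counter.
        self.time += 1
        return (self.time, self.machine_id)

    def receive(self, stamp):
        # On receipt, jump past the sender's timestamp.
        self.time = max(self.time, stamp[0]) + 1
        return (self.time, self.machine_id)
```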

slide-3
SLIDE 3

3

Use Example: Totally-Ordered Multicasting

  • San Fran customer adds $100, NY bank adds 1% interest
    – San Fran replica will have $1,111 and NY replica will have $1,110
  • Updating a replicated database this way leaves it in an inconsistent state
  • Can use Lamport's timestamps to totally order the updates

(Figure: the two updates, +$100 in San Francisco and +1% in New York, reach the two replicas in different orders.)
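The ordering rule can be sketched as below; a full protocol also multicasts acknowledgements and only delivers a message once it is at the head of every queue, which this sketch omits:

```python
def delivery_order(queued):
    # Every replica delivers queued updates in (Lamport timestamp,
    # sender id) order, so all replicas apply them identically.
    return sorted(queued, key=lambda m: (m["ts"], m["sender"]))

# Both replicas hold $1,000 and receive the same two updates:
updates = [{"ts": 2, "sender": 1, "apply": lambda b: b * 1.01},  # NY: +1%
           {"ts": 1, "sender": 0, "apply": lambda b: b + 100}]   # SF: +$100
balance = 1000.0
for m in delivery_order(updates):
    balance = m["apply"](balance)
# every replica computes (1000 + 100) * 1.01, i.e. $1,111
```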

Consistent Global State

a) A consistent cut
b) An inconsistent cut

  • How do we ensure we always get a consistent cut?
  • Need the state of the distributed system, say, for termination detection

Consistent Global State (2)

  • Processes are all connected. Any one can initiate a state message (M)

a) Organization of a process and channels for a distributed snapshot

Consistent Global State (3)

b) Process Q receives M for the first time and records its local state. Sends M on all outgoing links
c) Q records all incoming messages
d) Q receives M on its incoming channel and finishes recording the state of the incoming channel

  • Can then send the state to the initiating process
  • The system can still proceed normally
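The marker rules for one process can be sketched as follows (a simplification; sending markers on outgoing links is left as a comment):

```python
def handle(proc, channel, msg):
    # proc: {'state': ..., 'recorded': bool, 'snapshot': ...,
    #        'chan_state': {channel: [messages]},
    #        'open': set of incoming channels still being recorded}
    if msg == "MARKER":
        if not proc["recorded"]:
            proc["recorded"] = True
            proc["snapshot"] = proc["state"]       # record local state
            proc["open"] = set(proc["chan_state"]) - {channel}
            # ...the process would now send MARKER on all outgoing links
        else:
            proc["open"].discard(channel)          # this channel is finished
    elif proc["recorded"] and channel in proc["open"]:
        proc["chan_state"][channel].append(msg)    # message was in flight
```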

Outline

  • Intro                        (done)
  • Clock Synchronization       (done)
  • Global Time and State       (done)
  • Election Algorithms         (next)
  • Mutual Exclusion
  • Distributed Transactions

Election Algorithms

  • Often need one process to act as a coordinator
  • All processes in a distributed system may be equal
    – Assume each has some "ID" that is a number
  • Need a way to "elect" the process with the highest number as leader


The Bully Algorithm (1)

  • Process 4 notices 7 is down
  • Process 4 holds an election
  • Processes 5 and 6 respond, telling 4 to stop
  • Now 5 and 6 each hold an election

The Bully Algorithm (2)

d) Process 6 tells process 5 to stop
e) Process 6 wins and tells everyone

  • Eventually the "biggest" (bully) wins
  • If process 7 comes back up, it starts an election again
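The outcome can be sketched recursively; each challenged higher process "takes over" with an election of its own:

```python
def bully_winner(alive, initiator):
    # The initiator challenges every live process with a higher ID;
    # any that respond hold their own elections, so the highest live
    # ID always ends up as coordinator.
    higher = [p for p in alive if p > initiator]
    if not higher:
        return initiator                   # nobody bullied us: we win
    return max(bully_winner(alive, p) for p in higher)

# Process 7 is down; process 4 notices and starts the election:
leader = bully_winner({1, 2, 4, 5, 6}, 4)   # 6 wins
```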

A Ring Algorithm

  • Coordinator down, start ELECTION
    – Send a message around the ring, each process adding its ID
    – Once around, change it to COORDINATOR (biggest ID)
  • Even if two ELECTIONs are started at once, everyone will pick the same leader
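One circulation of the ELECTION message can be sketched as:

```python
def ring_election(ring, starter):
    # ring: live process IDs in ring order. The ELECTION message goes
    # once around, each process appending its ID; back at the starter
    # it becomes COORDINATOR(highest collected ID).
    collected = []
    start = ring.index(starter)
    for step in range(len(ring)):
        collected.append(ring[(start + step) % len(ring)])
    return max(collected)

# Two simultaneous elections still agree on the same leader:
assert ring_election([3, 1, 4, 2], 1) == ring_election([3, 1, 4, 2], 2) == 4
```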

Outline

  • Intro                        (done)
  • Clock Synchronization       (done)
  • Global Time and State       (done)
  • Election Algorithms         (done)
  • Mutual Exclusion            (next)
  • Distributed Transactions

Mutual Exclusion: A Centralized Algorithm

a) Process 1 asks the coordinator for permission to enter a critical region. Permission is granted
b) Process 2 then asks permission to enter the same critical region. The coordinator does not reply. (Or, can say "denied")
c) When process 1 exits the critical region, it tells the coordinator, which then replies to 2

  • But centralized → a single point of failure
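The coordinator's bookkeeping can be sketched as:

```python
from collections import deque

class MutexCoordinator:
    # Grant if the critical region is free; otherwise queue the
    # request (no reply) and grant it when the holder releases.
    def __init__(self):
        self.holder = None
        self.queue = deque()

    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "GRANTED"
        self.queue.append(pid)
        return None                  # silence (or an explicit "denied")

    def release(self, pid):
        assert pid == self.holder
        self.holder = self.queue.popleft() if self.queue else None
        return self.holder           # this process is granted next, if any
```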

A Distributed Algorithm

a) Processes 0 and 2 want to enter the same critical region at the same moment
b) Process 1 doesn't want to enter, so it says "OK". Process 0 has the lowest timestamp, so it wins; it queues up the "OK" for 2
c) When process 0 is done, it sends an OK to 2, which can now enter the critical region

  • (Again, can modify to say "denied")
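The reply rule each process applies in this (Ricart-Agrawala style) algorithm can be sketched as:

```python
def reply(state, my_stamp, incoming_stamp):
    # state: 'RELEASED' (don't want the region), 'WANTED', or 'HELD'.
    # Stamps are (Lamport time, process id) pairs; 'DEFER' means the
    # OK is queued until we leave the critical region.
    if state == "RELEASED":
        return "OK"
    if state == "HELD":
        return "DEFER"
    # Both want the region: the lower (timestamp, id) pair wins.
    return "OK" if incoming_stamp < my_stamp else "DEFER"
```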

A Token Ring Algorithm

a) An unordered group of processes on a network
b) A logical ring constructed in software

  • A process must hold the token to enter
  • If it doesn't want to enter, it passes the token along
  • If a host goes down, recover the ring. If the token is lost, regenerate the token. What if a process stays in the critical section a long time?
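One circuit of the token can be sketched as:

```python
def one_token_circuit(ring, token_holder, wants_entry):
    # The token makes one trip around the logical ring starting at its
    # holder; each process either enters the critical region (if it
    # wants to) and then passes the token, or just passes it on.
    entered = []
    start = ring.index(token_holder)
    for step in range(len(ring)):
        p = ring[(start + step) % len(ring)]
        if p in wants_entry:
            entered.append(p)
    return entered                   # order of entry during this circuit
```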

Mutual Exclusion Algorithm Comparison

  • Centralized is most efficient
  • Token ring is efficient when many want to use the critical region

Algorithm     Messages per entry/exit   Delay before entry (in message times)   Problems
Centralized   3                         2                                       Coordinator crash
Distributed   2(n – 1)                  2(n – 1)                                Process crash
Token ring    1 to ∞                    0 to n – 1                              Lost token, process crash

Outline

  • Intro                        (done)
  • Clock Synchronization       (done)
  • Global Time and State       (done)
  • Election Algorithms         (done)
  • Mutual Exclusion            (done)
  • Distributed Transactions    (next)

The Transaction Model

  • Gives you mutual exclusion plus…
  • Consider using a PC (Quicken) to:
    1) Withdraw $a from account 1
    2) Deposit $a to account 2
  • If interrupted between 1) and 2), $a is gone!
  • Want multiple items in a single, atomic action
    – It all happens, or none of it does
    – If the process backs out, it is as if it never started

Transaction Primitives

  • The primitives below may be system calls, libraries, or statements in a language (Structured Query Language, or SQL)

Primitive            Description
BEGIN_TRANSACTION    Mark the start of a transaction
END_TRANSACTION      Terminate the transaction and try to commit
ABORT_TRANSACTION    Kill the transaction and restore the old values
READ                 Read data from a file, a table, or otherwise
WRITE                Write data to a file, a table, or otherwise

Example: Reserving a Flight from White Plains to Nairobi

a) Transaction to reserve three flights commits
b) Transaction aborts when the third flight is unavailable

BEGIN_TRANSACTION
    reserve WP -> JFK;
    reserve JFK -> Nairobi;
    reserve Nairobi -> Malindi;
END_TRANSACTION
(a)

BEGIN_TRANSACTION
    reserve WP -> JFK;
    reserve JFK -> Nairobi;
    reserve Nairobi -> Malindi full => ABORT_TRANSACTION
(b)

  • The "all-or-nothing" behavior is one property. Others:


Transaction Properties

1) Atomic
  • Others don't see intermediate results, either
2) Consistent
  • System invariants are not violated
  • Ex: no money is lost after the operations
3) Isolated
  • Operations can happen in parallel, but as if they were done serially
4) Durable
  • Once it commits, move forward
  • (Ch 7, won't cover more)
  • Together: ACID

Classification of Transactions

  • Flat Transactions
    – Limited
    – Example: what if we want to keep the first part of the flight reservation? If we abort and then restart, those seats might be gone.
    – Example: what if we want to move a Web page? All links pointing to it would need to be updated, and a flat transaction could lock resources for a long time
  • Also Distributed and Nested Transactions

Distributed Transactions

  • A nested transaction gives you a hierarchy
    – Can distribute it (example: WP→JFK, JFK→Nairobi)
    – But may require multiple databases
  • A distributed transaction is "flat" but across distributed data (example: the JFK and Nairobi databases)

Outline

  • Intro                        (done)
  • Clock Synchronization       (done)
  • Global Time and State       (done)
  • Election Algorithms         (done)
  • Mutual Exclusion            (done)
  • Distributed Transactions
    – Overview                  (done)
    – Implementation            (next)

Private Workspace (1)

  • Consider a file system with a transaction across multiple files
    – Normally, updates are seen immediately + there is no way to undo them
  • Private workspace: copy the files
  • Only update the public workspace once done
  • If the transaction aborts, remove the private copy
  • But the copy can be expensive!
    – How to fix?

Private Workspace (2)

a) Original file index (descriptor) and disk blocks
b) Copy the descriptor only. Copy blocks only when written.
  • Here, block 0 is modified and block 3 is appended
c) Replace the original file (new blocks plus descriptor) after commit


Writeahead Log

a) A transaction
b) – d) The log before each statement is executed

  • If the transaction commits, there is nothing to do
  • If the transaction is aborted, use the log to roll back
  • Don't make copies. Instead, record each action plus the old and new values.

(a) x = 0;
    y = 0;
    BEGIN_TRANSACTION;
        x = x + 1;
        y = y + 2;
        x = y * y;
    END_TRANSACTION;

(b) Log: [x = 0/1]
(c) Log: [x = 0/1] [y = 0/2]
(d) Log: [x = 0/1] [y = 0/2] [x = 1/4]
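The log-then-write discipline can be sketched as follows (a sketch; a real log also lives on stable storage):

```python
class WriteAheadLog:
    # Record (var, old, new) in the log BEFORE applying each write;
    # abort replays the log backwards to restore the old values.
    def __init__(self, store):
        self.store = store
        self.log = []

    def write(self, var, value):
        self.log.append((var, self.store[var], value))  # log first...
        self.store[var] = value                         # ...then write

    def abort(self):
        for var, old, _new in reversed(self.log):
            self.store[var] = old                       # roll back
        self.log.clear()

# The slide's transaction, then an abort:
store = {"x": 0, "y": 0}
t = WriteAheadLog(store)
t.write("x", store["x"] + 1)            # log entry (x, 0, 1)
t.write("y", store["y"] + 2)            # log entry (y, 0, 2)
t.write("x", store["y"] * store["y"])   # log entry (x, 1, 4)
t.abort()                               # store is back to x = 0, y = 0
```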

Concurrency Control (1)

  • General organization of managers for handling transactions

Concurrency Control (2)

  • General organization of managers for handling distributed transactions

Serializability

a) – c) Three transactions T1, T2, and T3. The final answer could be 1, 2, or 3. All valid.

(a) BEGIN_TRANSACTION  x = 0;  x = x + 1;  END_TRANSACTION
(b) BEGIN_TRANSACTION  x = 0;  x = x + 2;  END_TRANSACTION
(c) BEGIN_TRANSACTION  x = 0;  x = x + 3;  END_TRANSACTION

Schedule 1:  x = 0; x = x + 1; x = 0; x = x + 2; x = 0; x = x + 3;   Legal
Schedule 2:  x = 0; x = 0; x = x + 1; x = x + 2; x = 0; x = x + 3;   Legal
Schedule 3:  x = 0; x = 0; x = x + 1; x = 0; x = x + 2; x = x + 3;   Illegal

  • Allow parallel execution, but the end result must be as if serial
  • If run in parallel, only some schedules are possible
  • Schedule 2 is serializable (equivalent to some serial order)
  • The concurrency controller needs to manage this

Two-Phase Locking

  • Acquire locks (ex: in the previous example), perform the update, then release
  • Can lead to deadlocks (use OS techniques to resolve)
  • Can prove: if used by all transactions, then all schedules will be serializable
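The two-phase property itself is easy to check for a single transaction's lock/unlock trace (a checker sketch, not a lock manager):

```python
def obeys_two_phase_rule(actions):
    # Growing phase: acquire locks. Shrinking phase: after the first
    # unlock, acquiring any further lock violates two-phase locking.
    shrinking = False
    for op, _item in actions:
        if op == "unlock":
            shrinking = True
        elif op == "lock" and shrinking:
            return False
    return True

# Acquire both locks before releasing either: two-phase.
assert obeys_two_phase_rule([("lock", "x"), ("lock", "y"),
                             ("unlock", "x"), ("unlock", "y")])
# Lock again after an unlock: not two-phase.
assert not obeys_two_phase_rule([("lock", "x"), ("unlock", "x"),
                                 ("lock", "y")])
```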

Timestamp Ordering

  • Pessimistic
    – Every read and write gets a timestamp (unique, using Lamport's algorithm)
    – If there is a conflict, abort the sub-operation and retry
  • Optimistic
    – Allow all operations, since the conflict rate is low
    – At the end, if there was a conflict, roll back
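The pessimistic write rule can be sketched as follows (basic timestamp ordering; the read rule is symmetric):

```python
def ts_write(item, txn_ts):
    # item tracks the largest transaction timestamps that have read or
    # written it. A write from a transaction older than either one
    # conflicts: abort the operation, and the transaction retries with
    # a new, larger timestamp.
    if txn_ts < item["read_ts"] or txn_ts < item["write_ts"]:
        return "ABORT"
    item["write_ts"] = txn_ts
    return "OK"
```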