
Slide 1: 18: Distributed Systems

Last Modified: 7/3/2004 1:49:01 PM

Slide 2: A Distributed System

Slide 3: Loosely Coupled Distributed Systems

Users are aware of the multiplicity of machines. Access to the resources of the various machines is done explicitly by:
  • Remote logging into the appropriate remote machine.
  • Transferring data from remote machines to local machines, via the File Transfer Protocol (FTP) mechanism.

Slide 4: Tightly Coupled Distributed Systems

Users are not aware of the multiplicity of machines. Access to remote resources is similar to access to local resources.

Examples:
  • Data Migration – transfer data by transferring the entire file, or by transferring only those portions of the file necessary for the immediate task.
  • Computation Migration – transfer the computation, rather than the data, across the system.

Slide 5: Distributed Operating Systems (Cont.)

Process Migration – execute an entire process, or parts of it, at different sites.
  • Load balancing – distribute processes across the network to even out the workload.
  • Computation speedup – subprocesses can run concurrently on different sites.
  • Hardware preference – process execution may require a specialized processor.
  • Software preference – required software may be available at only a particular site.
  • Data access – run the process remotely, rather than transferring all the data locally.

Slide 6: Why Distributed Systems?

  • Communication – dealt with this when we talked about networks
  • Resource sharing
  • Computational speedup
  • Reliability


Slide 7: Resource Sharing

Distributed systems offer access to the specialized resources of many systems.

Examples:
  • Some nodes may have special databases
  • Some nodes may have access to special hardware devices (e.g. tape drives, printers, etc.)

A DS offers the benefits of locating processing near the data or of sharing special devices.

Slide 8: OS Support for Resource Sharing

Resource Management?
  • A distributed OS can manage the diverse resources of the nodes in the system
  • Make resources visible on all nodes
  • Like VM, it can provide the functional illusion but rarely hides the performance cost

Scheduling?
  • A distributed OS could schedule processes to run near the needed resources
  • If a process needs to access data in a large database, it may be easier to ship the code there and the results back than to request that the data be shipped to the code
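The last point (ship the code to the data, not the data to the code) can be sketched as follows; `DataNode` and its `run` method are hypothetical stand-ins for a node holding a large database, not a real distributed-OS API:

```python
class DataNode:
    """Hypothetical node that holds a large data set locally."""
    def __init__(self, rows):
        self.rows = rows            # the big data lives on this node

    def run(self, query_fn):
        # the shipped code executes next to the data;
        # only the (small) result crosses the "network"
        return query_fn(self.rows)

node = DataNode(rows=list(range(1_000_000)))

# Ship a small function instead of requesting a million rows:
total_evens = node.run(lambda rows: sum(r for r in rows if r % 2 == 0))
print(total_evens)  # 249999500000
```

The trade-off is exactly the slide's: moving a closure and one integer is far cheaper than moving the million-row table.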

Slide 9: Design Issues

  • Transparency – the distributed system should appear as a conventional, centralized system to the user.
  • Fault tolerance – the distributed system should continue to function in the face of failure.
  • Scalability – as demands increase, the system should easily accept the addition of new resources to accommodate the increased demand.

Clusters vs. Client/Server
  • Clusters: a collection of semi-autonomous machines that act as a single system.

Slide 10: Why Distributed Systems?

  • Resource sharing
  • Computational speedup
  • Reliability

Slide 11: Computation Speedup

Some tasks are too large for even the fastest single computer:
  • Real-time weather/climate modeling, the Human Genome Project, fluid turbulence modeling, ocean circulation modeling, etc.
  • http://www.nersc.gov/research/GC/gcnersc.html

What to do?
  • Leave the problem unsolved?
  • Engineer a bigger/faster computer?
  • Harness the resources of many smaller (commodity?) machines in a distributed system?

Slide 12: Breaking Up the Problem

To harness computational speedup, we must first break up the big problem into many smaller problems. More art than science?

Sometimes break up by function:
  • Pipeline?
  • Job queue?

Sometimes break up by data:
  • Each node responsible for a portion of the data set?

Slide 13: Decomposition Examples

Decrypting a message:
  • Easily parallelizable – give each node a set of keys to try
  • Job queue – when a node has tried all its keys, it goes back for more?

Modeling ocean circulation:
  • Give each node a portion of the ocean to model (an N-square-foot region?)
  • Model flows within a region locally
  • Communicate with the nodes managing neighboring regions to model flows into other regions
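The key-search decomposition can be sketched with a job queue of key ranges handed out to workers; the one-byte XOR "cipher", the chunk size, and the known-plaintext check are illustrative choices, not from the slides:

```python
from multiprocessing import Pool

SECRET_KEY = 0xA7                                    # the key we are "attacking"
CIPHERTEXT = bytes(b ^ SECRET_KEY for b in b"attack at dawn")

def try_keys(key_range):
    """One job off the queue: try every key in this chunk; return the
    winning key, or None so the worker can go back for more."""
    for key in key_range:
        plain = bytes(b ^ key for b in CIPHERTEXT)
        if plain == b"attack at dawn":               # real code would test "looks like English"
            return key
    return None

if __name__ == "__main__":
    chunks = [range(k, k + 32) for k in range(0, 256, 32)]   # the job queue
    with Pool(4) as pool:                                    # stand-ins for the nodes
        hits = [k for k in pool.map(try_keys, chunks) if k is not None]
    print(hits)  # [167], i.e. 0xA7
```

Each chunk is independent, so this decomposition needs no communication between workers at all, which is why key search parallelizes so easily.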

Slide 14: Decomposition Examples (con't)

Barnes-Hut – calculating the effect of bodies in space on each other:
  • Could divide space into NxN regions? Some regions have many more bodies than others.
  • Instead, divide space so that each region has roughly the same number of bodies.
  • Within a region, bodies have lots of effect on each other (they are close together).
  • Abstract other regions as a single body to minimize communication.
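Abstracting a far-away region as a single body amounts to summarizing it by its total mass and center of mass. A minimal 1-D sketch (the masses and positions are made up for illustration):

```python
def center_of_mass(bodies):
    """Summarize a region's bodies as one (total_mass, position) pair,
    so a node can send a single summary instead of every body."""
    total_mass = sum(m for m, _ in bodies)
    position = sum(m * x for m, x in bodies) / total_mass
    return total_mass, position

# A remote region with many bodies...
far_region = [(1.0, 100.0), (1.0, 101.0), (1.0, 102.0)]

# ...is communicated as a single summary body:
print(center_of_mass(far_region))  # (3.0, 101.0)
```

This is why the Barnes-Hut decomposition scales: communication per node grows with the number of regions, not the number of bodies.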

Slide 15: Linear Speedup

Linear speedup is often the goal: allocate N nodes to the job and it goes N times as fast.

Once you've broken up the problem into N pieces, can you expect it to go N times as fast?
  • Are the pieces equal?
  • Is there a piece of the work that cannot be broken up (inherently sequential)?
  • Synchronization and communication overhead between the pieces?
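The inherently sequential piece puts a hard ceiling on speedup; this is Amdahl's law (not named on the slide, but it formalizes exactly this question):

```python
def amdahl_speedup(n_nodes, sequential_fraction):
    """Best-case speedup on n_nodes when a fixed fraction of the
    work cannot be broken up (Amdahl's law)."""
    s = sequential_fraction
    return 1.0 / (s + (1.0 - s) / n_nodes)

# A 10% sequential piece caps 100 nodes at well under 100x:
print(round(amdahl_speedup(100, 0.10), 2))   # 9.17
print(round(amdahl_speedup(100, 0.00), 2))   # 100.0 -- linear only if nothing is sequential
```

Note this ignores the slide's other two concerns (unequal pieces, communication overhead), which only push real speedup further below linear.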

Slide 16: Super-linear Speedup

Sometimes we can actually do better than linear speedup!
  • Especially if we divide up a big data set so that the piece needed at each node fits into main memory on that machine
  • The savings from avoiding disk I/O can outweigh the communication/synchronization costs

When splitting up a problem, there is a tension between duplicating processing at all nodes (for reliability and simplicity) and allowing nodes to specialize.
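A toy cost model makes the point concrete; every constant here (per-item costs, overhead, the RAM threshold) is invented purely for illustration:

```python
# Invented numbers: items cost 100x more to touch when the working set
# spills to disk than when a partition fits in main memory.
ITEMS = 1_000_000
DISK_COST = 100        # per-item cost when the data set spills to disk
RAM_COST = 1           # per-item cost when a partition fits in memory
COMM_COST = 500_000    # sync/communication overhead per extra node

def run_time(nodes):
    # assume the data set only fits in RAM once it is split 10+ ways
    per_item = RAM_COST if nodes >= 10 else DISK_COST
    return (ITEMS // nodes) * per_item + COMM_COST * (nodes - 1)

speedup = run_time(1) / run_time(10)
print(speedup > 10)   # True: super-linear, because the disk I/O disappeared
```

With 10 nodes the model gives roughly 22x, not 10x: crossing the fits-in-RAM threshold buys more than the added communication costs.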

Slide 17: OS Support for Parallel Jobs

Process Management?
  • The OS could manage all the pieces of a parallel job as one unit
  • Allow all the pieces to be created, managed, and destroyed with a single command line
  • Fork(process, machine)?

Scheduling?
  • The programmer could specify where the pieces should run, and/or the OS could decide
  • Process migration? Load balancing?
  • Try to schedule pieces together so that they can communicate effectively

Slide 18: OS Support for Parallel Jobs (con't)

Group Communication?
  • The OS could provide facilities for the pieces of a single job to communicate easily
  • Location-independent addressing? Shared memory? Distributed file system?

Synchronization?
  • Support for mutually exclusive access to data across multiple machines
  • Can't rely on HW atomic operations any more
  • Deadlock management?
  • We'll talk about clock synchronization and two-phase commit later
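Since remote nodes cannot share hardware atomic operations, one simple (if failure-prone) way to get mutual exclusion across machines is a central lock server. A single-process sketch, with a thread-safe object standing in for a server reached over the network; the design is an illustration, not a protocol from the slides:

```python
import threading

class LockServer:
    """Grants each named lock to one requester at a time."""
    def __init__(self):
        self._guard = threading.Lock()   # local atomicity on the server
        self._holders = {}               # lock name -> holder id

    def acquire(self, name, holder):
        with self._guard:
            if name in self._holders:
                return False             # someone else holds it: retry later
            self._holders[name] = holder
            return True

    def release(self, name, holder):
        with self._guard:
            if self._holders.get(name) == holder:
                del self._holders[name]

server = LockServer()
print(server.acquire("accounts.db", holder="siteA"))  # True
print(server.acquire("accounts.db", holder="siteB"))  # False: siteA holds it
server.release("accounts.db", holder="siteA")
print(server.acquire("accounts.db", holder="siteB"))  # True
```

The obvious weakness is that the server is a single point of failure and holders can crash without releasing, which is part of why distributed coordination needs the machinery (clocks, two-phase commit) covered later.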


Slide 19: Why Distributed Systems?

  • Resource sharing
  • Computational speedup
  • Reliability

Slide 20: Reliability

A distributed system offers the potential for increased reliability:
  • If one part of the system fails, the rest could take over
  • Redundancy, fail-over

!BUT! Often the reality is that distributed systems offer less reliability:
  • "A distributed system is one in which some machine I've never heard of fails and I can't do work!"
  • Hard to get rid of all hidden dependencies
  • No clean failure model
  • Nodes don't just fail – they can continue in a broken state
  • A network partition = many, many nodes fail at once! (Determine who you can still talk to; are you cut off, or are they?)
  • The network goes down and up and down again!
Slide 21: Robustness

Detect and recover from site failure, transfer function, and reintegrate the failed site:
  • Failure detection
  • Reconfiguration

Slide 22: Failure Detection

Detecting hardware failure is difficult. To detect a link failure, a handshaking protocol can be used:
  • Assume Site A and Site B have established a link. At fixed intervals, each site exchanges an I-am-up message indicating that it is up and running.
  • If Site A does not receive a message within the fixed interval, it assumes either (a) the other site is not up or (b) the message was lost.
  • Site A can then send an Are-you-up? message to Site B.
  • If Site A does not receive a reply, it can repeat the message or try an alternate route to Site B.
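The I-am-up exchange can be sketched as a heartbeat monitor on Site A's side; the class, interval, and direct method calls standing in for network delivery are all illustrative:

```python
import time

class HeartbeatMonitor:
    """Site A's view of Site B: expects an I-am-up message every interval."""
    def __init__(self, interval):
        self.interval = interval
        self.last_heard = time.monotonic()

    def i_am_up(self):
        # called when an I-am-up message arrives from B
        self.last_heard = time.monotonic()

    def suspect_failure(self):
        # True once no heartbeat has arrived within the fixed interval.
        # B may be down, or the message may merely have been lost (the
        # slide's case (b)), so A should follow up with an Are-you-up?
        # probe rather than conclude anything yet.
        return time.monotonic() - self.last_heard > self.interval

monitor = HeartbeatMonitor(interval=0.05)
monitor.i_am_up()
print(monitor.suspect_failure())   # False: just heard from B
time.sleep(0.1)
print(monitor.suspect_failure())   # True: interval elapsed, time to probe B
```

Note the asymmetry the next slide dwells on: a tripped monitor only licenses suspicion, never certainty about which of the four failure types occurred.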

Slide 23: Failure Detection (cont.)

If Site A does not ultimately receive a reply from Site B, it concludes that some type of failure has occurred.

Types of failures:
  • Site B is down
  • The direct link between A and B is down
  • The alternate link from A to B is down
  • The message has been lost

However, Site A cannot determine exactly why the failure occurred.
  • B may be assuming A is down at the same time
  • Can either one assume it can make decisions alone?

Slide 24: Reconfiguration

When Site A determines that a failure has occurred, it must reconfigure the system:
  1. If the link from A to B has failed, this must be broadcast to every site in the system.
  2. If a site has failed, every other site must also be notified that the services offered by the failed site are no longer available.

When the link or the site becomes available again, this information must again be broadcast to all other sites.