 
              Objectives/Outline (selected topics from chapter 16 & 18) Objectives Outline Provide a high-level overview Introduction Distributed Systems � � of distributed systems Types of Distributed � Describe various methods for Operating Systems � achieving mutual exclusion in Event Ordering � a distributed system Mutual Exclusion � Presented By: Dr. El-Sayed M. El-Alfy Present schemes for handling � Deadlock Handling � deadlocks in a distributed Election Algorithms � system Present algorithms used in � case of coordinator failure Note: Most of the slides are compiled from the textbook and its complementary resources April 08 1 April 08 2 16.1 Introduction Introduction (cont.) Distributed system (DS) is a collection of loosely coupled processors � that do not share memory or clock (i.e. each processor has its own memory and clock); communicate through a network Processors are referred to as sites, nodes, computers, machines, hosts � Processors in DS are most likely heterogeneous (i.e. vary in size and � function) Reasons for distributed systems � Resource sharing 1. sharing and printing files at remote sites � processing information in a distributed database � using remote specialized hardware devices � Computation speedup – load sharing 2. Reliability – detect and recover from site failure, function transfer, 3. reintegrate failed site Communication – message passing 4. Require mechanisms for process synchronization & communication, � dealing with deadlocks, handling failures not encountered in a A general structure of a distributed system centralized system 3 4 April 08 April 08
16.2 Types of Distributed Operating Systems Types of Distributed Operating Systems (cont.) Network Operating Systems � � Process Migration Users are aware of multiplicity of machines. Access to resources of � � Load balancing – distribute processes across network to even various machines is done explicitly by: the workload � Remote logging into the appropriate remote machine (telnet, ssh) � Transferring data from remote machines to local machines, via FTP � Computation speedup – subprocesses can run concurrently on Distributed Operating Systems different sites � Users access remote resources in the same way they access local � Hardware preference – process execution may require � resources (seamless manner) specialized processor Data migration – transfer data by transferring entire file, or transferring � � Software preference – required software may be available at only those portions of the file necessary for the immediate task only a particular site Computation migration – transfer the computation, rather than the data, � � Data access – run process remotely, rather than transfer all across the system data locally Process migration – execute an entire process, or parts of it, at different � sites April 08 5 April 08 6 18.1 Event Ordering Relative Time for Three Concurrent Processes � Happened-before relation (denoted by → ) � Space-time diagram � If A and B are events in the same process, and A was executed before B , then A → B Processes (space) � If A is the event of sending a message by one process and B is the event of receiving that message by another process, then A → B � If A → B and B → C then A → C (transitive) � Irreflexive relation: since an event can not be Time happened-before itself � If two events are not related by → relation, they are concurrent � If A → B then A can affect B 7 8 April 08 April 08
18.2 Implementation of → Mutual Exclusion (ME) in DS Associate a timestamp with each system event � � Assumptions Require that for every pair of events A and B, if A → B, then the � � The system consists of n processes; each process P i resides at a timestamp of A is less than the timestamp of B different processor Within each process Pi a logical clock (LCi) is associated � � Each process has a critical section that requires mutual exclusion The logical clock can be implemented as a simple counter that is � � Requirement incremented between any two successive events executed within a process � If P i is executing in its critical section, then no other process P j is � Logical clock is monotonically increasing executing in its critical section A process advances its logical clock when it receives a message � � We present three algorithms to ensure the mutual whose timestamp is greater than the current value of its logical clock exclusion execution of processes in their critical sections If the timestamps of two events A and B are the same, then the � events are concurrent We may use the process identity numbers to break ties and to create a � total ordering April 08 9 April 08 10 ME: Centralized Approach ME: Fully Distributed Approach One of the processes in the system is chosen to coordinate the entry to � the critical section (CS) � When process P i wants to enter its CS, it generates a A process that wants to enter its CS sends a request message to the � new timestamp ( TS ), and sends the message request coordinator ( P i , TS ) to all other processes in the system The coordinator decides which process can enter the CS next, and its � � When process P j receives a request message, it may sends that process a reply message reply immediately or it may defer sending a reply back When the process receives a reply message from the coordinator, it � enters its CS � When process P i receives a reply message from all other After exiting its CS, the process sends a release message to the processes in the system, it can enter its CS � coordinator and proceeds with its execution � After exiting its CS, the process sends reply messages to No starvation if the coordinator scheduling policy is fair (e.g. FCFS) � all its deferred requests Requires three messages per CS entry: request, reply, and release � Upon failure of the coordinating process, a new process must be � elected as a coordinator, poll all processes to construct request queue 11 12 April 08 April 08
Fully Distributed Approach (Cont.) Fully Distributed Approach (Cont.) � The decision whether process P j replies immediately to � Desirable Behavior a request ( P i , TS ) message or defers its reply is based � Mutual exclusion is obtained on three factors: � Freedom from deadlock is ensured � If P j is in its CS, then it defers its reply to P i � Freedom from starvation is ensured, since entry to the CS is scheduled according to the timestamp ordering � If P j does not want to enter its CS, then it sends a reply immediately to P i � Which ensures that processes are served in a FCFS order � If P j wants to enter its CS but has not yet entered it, then it � The number of messages per CS entry is compares its own request timestamp with the timestamp TS 2 x ( n – 1) the minimum number of required messages per CS entry when � If its own request timestamp is greater than TS , then it sends a processes act independently and concurrently reply immediately to P i ( P i asked first) � Otherwise, the reply is deferred April 08 13 April 08 14 Fully Distributed Approach (Cont.) ME: Token-Passing Approach � Circulate a token among processes in system � Three Undesirable Consequences � Token is special type of message � The processes need to know the identity of all other processes in � Possession of token entitles holder to enter critical section the system, which makes the dynamic addition and removal of � Processes are logically organized in a ring structure processes more complex � If one of the processes fails, then the entire scheme collapses � Unidirectional ring guarantees freedom from starvation � This can be dealt with by continuously monitoring the state of all the � Number of messages per CS entry may vary processes in the system; if one process fails, all others are notified � Two types of failures � Processes that have not entered their CS must pause frequently to assure other processes that they intend to enter the CS � Lost token – election must be called � This protocol is therefore suited for small, stable sets of � Failed processes – new logical ring established cooperating processes 15 16 April 08 April 08
Recommend
More recommend