A QoS-driven Resource Allocation Framework based on the Risk Incursion Function and its Incorporation into a Middleware Structure & Mechanisms Supporting Distributed Fault Tolerant Real-time Computing Applications For presentation at the


  1. Introduction
  • Assigning fixed priorities is a primitive and crude way of expressing the relative importance or urgency among different tasks or processes.
    - Fixed-priority assignment introduces complexity into distributed RT system design. The designer of distributed RT systems should concentrate on high-level concepts such as computing objects, instead of details such as "process", "thread", "priority", or communication protocols.
    - Fixed priorities are attributes that can be easily observed by the low-level node execution engine.
    - Timing requirements inherent in the target applications should be expressed in the simplest, most easily analyzable form in the high-level system design.
  UCI DREAM Lab

  2. Introduction
  [Diagram: a distributed real-time system producing system outputs 1 through N]
  • Ultimately, execution-resource requirements come from the need to produce acceptable-quality outputs of application functions.
  • The most meaningful purpose of any resource allocation is to meet the application requirements with the best quality of execution results and with minimal use of execution resources.
  • A real-time computing system is required to take every service action accurately not only in the "time dimension" but also in the "logical dimension".
  • System design engineers must understand not only the QoS requirements (e.g., output accuracy, fault tolerance) but also the impacts of QoS losses, i.e., of inaccurate outputs, on overall application success.

  3. The Risk Incursion Function (RIF)
  [Diagram: a distributed real-time system driving Actuator 1 and Actuator 2; System Output 1 is governed by RIF 1 and System Output 2 by RIF 2]
  • Risks: damaging impacts of QoS losses on the application mission.
  • RIF (a.k.a. Benefit Loss Function)
    := relation (loss in timed-value accuracy of each output action, potential application damage)
    := relation (QoS loss, risk)

  4. Risk Incursion Potential Function (RIPF)
  [Diagram: computing nodes 1 and 2 produce intermediate outputs 1 and 2 (governed by RIPF 1 and RIPF 2) on the way to System Outputs 1 and 2 (governed by RIF 1 and RIF 2)]
  • System-level RIF and derived RIF (= RIPF):
    Derived RIF = RIPF (Risk Incursion Potential Function) = relation (accuracy loss in intermediate output, potential risk)

  5. Risk Incursion Potential Function (RIPF)
  [Diagram: intermediate outputs O1 through O6 with derived functions RIPF 11-13 and RIPF 21-23 feeding the system-level RIF 1 (Actuator 1) and RIF 2 (Actuator 2)]

  6. Risk Incursion Potential Function (RIPF)
  [Diagram: the same RIPF/RIF network layered over the OS & support middleware, whose resource allocators are RIPF-based]

  7. The procedure of TMO-based application development
  • The whole application is started as one TMO, with its system outputs and their RIFs (RIF 1, RIF 2, ...).
  • Then the TMO is divided into multiple TMOs (TMO1, TMO2, TMO3, ...); at the same time, the RIPFs (RIPF11, RIPF12, RIPF21, ...) are derived from the RIFs.
  • Finally, the application is described as a TMO network; the basic scheduling unit is an SxM (SpM or SvM), supported by a thread.

  8. RIF (RIPF) examples
  • Type I (hard deadline): risk is zero from the earliest possible output time until the deadline, then jumps to a serious level.
  • Type II (soft deadline): past the deadline, risk grows as a concave function of the output action time (e.g., ax + b or sqrt(x) + c) up to a serious level.
  • Type III (soft deadline followed by a hard deadline): between the soft and the hard deadline, risk grows as a convex function (a polynomial, e.g., ax^3 + bx^2 + cx + d); past the hard deadline it jumps to a serious level.
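The three RIF shapes above can be sketched as simple Python functions. This is an illustrative sketch only: the coefficients and the serious-level value are made-up assumptions, not values from the framework.

```python
import math

SERIOUS = 100.0  # assumed "serious level" risk value (not specified on the slide)

def rif_hard(t, deadline):
    """Type I: hard deadline; risk jumps to the serious level at the deadline."""
    return 0.0 if t <= deadline else SERIOUS

def rif_soft(t, deadline, a=1.0, c=0.0):
    """Type II: soft deadline; risk grows as a concave function (here sqrt) of lateness."""
    if t <= deadline:
        return 0.0
    return a * math.sqrt(t - deadline) + c

def rif_soft_then_hard(t, soft_dl, hard_dl, a=0.001):
    """Type III: convex (here cubic) growth between the soft and hard deadlines,
    then a jump to the serious level past the hard deadline."""
    if t <= soft_dl:
        return 0.0
    if t <= hard_dl:
        return a * (t - soft_dl) ** 3
    return SERIOUS
```

Any monotonically non-decreasing shape with these three profiles would serve equally well; only the qualitative shape matters for the scheduling discussion that follows.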

  9. Case Study: CAMIN (Coordinated Anti-Missile Interceptor Network)
  [Diagram: alien reentry vehicles (RVs) descending through intercept altitudes toward the theater; defense target on land (Command Post) and defense target at sea (Command Ship), the latter in a safe area]

  10. [Diagram: stepwise decomposition (Step 2, Step 3) of the application into the Alien, FOT, RDQ, and IPDS TMOs, each containing SpMs and SvMs]

  11. CAMIN as a network of TMOs
  • Defense command-control system
  • 9 TMOs; 2 TMOs made fault-tolerant
  • Runs on a LAN of 3+ PCs
  • 25,000 lines of C++ code
  • Non-stop effective defense in the presence of:
    - application software faults
    - processor faults
    - communication link faults (involving both software and hardware)
    - interconnection network faults (involving both software and hardware)
  [Diagram: control computer system designs for use on land and at sea (Alien, FOT, RDQ, IPDS TMOs), driven by a real-time simulation]

  12. Case Study: CAMIN
  [Diagram: the Alien TMO and the Theater TMO exchanging Alien.SysOut1 and Theater.SysOut1]
  • Alien.SysOut1: Send reentry vehicles (missiles) and NTFOs (non-threatening flying objects) to the theater.
  • Theater.SysOut1: Send information about the defense targets to the alien; send the current statuses of missiles and commercial airplanes leaving the theater to the alien.

  13. [Diagram: the Theater (TH), Command Post (CP), and Command Ship (CS) TMOs and the system outputs exchanged among them]
  • TH.SysOut2: Send radar spot-check and scan-check data to CP.
  • TH.SysOut3: Send radar spot-check and scan-check data to CS.
  • TH.SysOut4: Send the status of the interceptors and launchers to CP.
  • TH.SysOut5: Send the status of the interceptors and launchers to CS.
  • CP.SysOut1: Send intercept request to TH.
  • CP.SysOut2: Send radar spot-check plan to TH.
  • CP.SysOut3: Send data on the status of suspicious items to CS.
  • CS.SysOut1: Send intercept request to TH.
  • CS.SysOut2: Send radar spot-check plan to TH.

  14. RIFs for the Command Post outputs
  • RIF_CP1 (CP.SysOut1, soft deadline D):
    y = 0 if x ≤ D; x − D if D < x ≤ D + 50; 50 if x > D + 50
  • RIF_CP2 (CP.SysOut2, hard deadline D):
    y = 0 if x ≤ D, or 400 if x > D
  • RIF_CP3 (CP.SysOut3, soft deadline D1 followed by hard deadline D2):
    y = 0 if x ≤ D1; 5(x − D1) if D1 < x ≤ D2; 200 if x > D2
  (x = output action time, measured from the earliest possible output time)
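The piecewise RIFs above translate directly into code. A minimal sketch, keeping the slide's constants; the deadlines D, D1, and D2 are left as parameters since the slide does not fix them:

```python
def rif_cp1(x, d):
    """CP.SysOut1: risk ramps linearly for 50 time units past D, then saturates at 50."""
    if x <= d:
        return 0
    if x <= d + 50:
        return x - d
    return 50

def rif_cp2(x, d):
    """CP.SysOut2: hard deadline; risk jumps to 400 past D."""
    return 0 if x <= d else 400

def rif_cp3(x, d1, d2):
    """CP.SysOut3: soft deadline D1 (5 risk units per time unit of lateness)
    followed by hard deadline D2 (risk saturates at 200)."""
    if x <= d1:
        return 0
    if x <= d2:
        return 5 * (x - d1)
    return 200
```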

  15. Constraints for the deadline of CP.SysOut1
  1. Spatial constraint: Deadline ≤ Time Interval 1 − Time Interval 2, where
     Time Interval 1 = (60000 − 2000) / max. speed of RV
     Time Interval 2 = Distance 1 / min. speed of launcher
  [Diagram: the RV descends from (0, 60000, 60000) to (0, 60000, 2000), i.e., from altitude 60000 to 2000; the launcher at (11000, 20000, 0) must cover Distance 1 to the intercept point]

  16. Constraints for the deadline of CP.SysOut1
  2. Temporal constraint
  • t0: The radar detection data arrives; CP receives the radar data and starts building the interception plan.
  • t1: CP sends out the interception plan. The position of the missile at t1 is extrapolated from the data of t0; as t1 − t0 grows, the accuracy of the extrapolation becomes worse.
  • t2: Hit or miss the target. If the missile is in the hitting range of the interceptor, the interception is successful; the success rate depends on the accuracy of the extrapolation at t1.

  17. Command Post (CP)
  [Diagram: inside CP, the RDQ, FOT, and IPDS TMOs carry derived functions CP.RIPF_RDQ, CP.RIPF_FOT1 through CP.RIPF_FOT4, and CP.RIPF_IPDS; inputs are TH.SysOut2 and TH.SysOut4, and the outputs toward TH and CS are governed by RIF_CP1, RIF_CP2, and RIF_CP3]

  18. Command Post
  [Diagram: TH.SysOut2 and TH.SysOut4 flow through RDQ, FOT, and IPDS (with RIPF_RDQ, RIPF_FOT1, RIPF_FOT2, RIPF_IPDS), separated by maximum communication delays, toward RIF_CP1, RIF_CP2, and RIF_CP3]
  • The derivation of a RIPF from a RIF is based on worst-case execution time (WCET) analysis and the importance of each task.
  • Assume the maximum inter-TMO (intra-node) communication delay is 5ms and the inter-node communication delay is 10ms. Also assume RDQ, FOT, and IPDS run on the same node.
  • In this design example, suppose we conclude that the deadline of CP.SysOut1 should be 200ms, and those of CP.SysOut2 and CP.SysOut3 should be 100ms. After analyzing the WCETs of RDQ, FOT, and IPDS, we allocate the 200ms as follows: RDQ (25ms), FOT (50ms), IPDS (90ms).
  • Since RDQ and FOT are related to all three system outputs, while IPDS is related only to system output 1, we set the risk thresholds for deadline violation as follows: RDQ (80), FOT (80), IPDS (40).
  • CP.SysOut1's RIF: y = 0 if x ≤ Deadline, or 100 if x > Deadline.

  19. Command Post
  [Diagram: the RDQ, FOT, and IPDS pipeline annotated with the communication delays (5ms intra-node, 10ms inter-node)]
  The derived step RIPFs (x = completion time):
  • CP.RIPF_RDQ:  y = 0 if x ≤ 25ms, or 80 if x > 25ms
  • CP.RIPF_FOT1: y = 0 if x ≤ 50ms, or 80 if x > 50ms
  • CP.RIPF_IPDS: y = 0 if x ≤ 90ms, or 40 if x > 90ms
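The step RIPFs above can be generated mechanically from the slide's allocation of the 200ms budget and the risk thresholds. A minimal sketch of that derivation table:

```python
def make_step_ripf(deadline_ms, risk):
    """Build a step RIPF: zero risk up to the allocated per-task deadline,
    a fixed potential-risk value beyond it (the shape used on this slide)."""
    return lambda completion_ms: 0 if completion_ms <= deadline_ms else risk

# Allocation of CP.SysOut1's 200ms budget, from the slide:
# (task, WCET-based allocated deadline in ms, deadline-violation risk threshold)
ALLOCATION = [("RDQ", 25, 80), ("FOT", 50, 80), ("IPDS", 90, 40)]

RIPFS = {name: make_step_ripf(dl, risk) for name, dl, risk in ALLOCATION}
```

For example, `RIPFS["RDQ"](30)` yields the potential risk 80, since RDQ has overrun its 25ms share of the budget.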

  20. Example RIPFs for each SxM
  [Diagram: RDQ, FOT, and IPDS decomposed into their SvMs and SpMs, connected through the ODSS with 1ms delays; TH.SysOut2 and TH.SysOut4 as inputs, RIF_CP1 through RIF_CP3 as outputs]
  • Suppose the maximum inter-SxM communication delay (through the ODSS) is 1ms.
  • After WCET analysis, we get the deadline and risk for each SxM (each an example step RIPF):
    RDQ.SvM1    5ms  40
    RDQ.SpM1   19ms  40
    FOT.SvM1   10ms  40
    FOT.SpM1   39ms  40
    IPDS.SvM1  10ms  10
    IPDS.SvM2  15ms  10
    IPDS.SpM1  79ms  20

  21. RIPF-driven CPU scheduling
  [Diagram: RIPF 1, RIPF 2, and RIPF 3 plotted as risk vs. execution completion time, from the current time onward]
  Theorem 1: Finding the optimal (lowest-total-risk) schedule based on the proposed RIPF set is NP-hard.

  22. RIPF-driven CPU scheduling
  Resource allocation problem 1: given the RIPF set (risk vs. execution completion time), find the lowest-total-risk schedule.
  Theorem 1: Finding the optimal (lowest-total-risk) schedule based on the proposed RIPF set is NP-hard.
  Proof:
  1. The inexact 0-1 knapsack problem is known to be NP-hard:
     maximize Σ_{i∈S} v_i subject to Σ_{i∈S} R_i ≤ R,
     where there are n objects, each with size R_i and value v_i, R is the size of the knapsack, and both R_i and v_i are real numbers.
  2. This problem equals a special case of problem 1, in which every RIPF is the step function
     F(x) = 0 if x ≤ R_i, or v_i if x > R_i.
  3. Therefore the complexity of problem 1 is NP-hard.
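The correspondence used in the proof can be seen on a toy instance: with step RIPFs, choosing which tasks to complete within a shared time budget R is exactly a 0-1 knapsack. A brute-force sketch (deliberately exponential, illustrating the problem the theorem says cannot be solved efficiently in general; the task names and numbers reuse this deck's CP example but are otherwise illustrative):

```python
from itertools import combinations

def best_subset(sizes, values, capacity):
    """Exhaustively pick the subset of tasks maximizing total avoided risk
    (value) under the shared time budget (capacity): the 0-1 knapsack that
    step-RIPF scheduling reduces to."""
    n = len(sizes)
    best_val, best_set = 0, ()
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            if sum(sizes[i] for i in subset) <= capacity:
                val = sum(values[i] for i in subset)
                if val > best_val:
                    best_val, best_set = val, subset
    return best_val, set(best_set)
```

With sizes (25, 50, 90), risks (80, 80, 40), and a 100ms budget, the best choice is to complete RDQ and FOT (indices 0 and 1), avoiding 160 units of risk.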

  23. RIPF-driven CPU scheduling
  • The original problem (NP-hard): optimal solution based on the original RIPF set.
  • Approximations of the original problem (polynomial time):
    - Sub-optimal solutions based on the original RIPF set:
      · based on the deadline only: Alg. 1 (LLF)
      · based on the risk only: Alg. 2 (Shifted-RIPF)
      · based on both: Alg. 3 (RIPF), Alg. 4 (RIPF/Laxity)
    - Optimal solution based on an approximation of the original RIPF set: Alg. 5 (Linear-RIPF)

  24. RIPF-driven CPU scheduling
  Sub-optimal solutions based on the original RIPF set:
  • Alg. 1 (LLF, Least Laxity First), O(n lg n).
  • Alg. 2 (Shifted-RIPF), O(n lg n): shift every RIPF's deadline to 0; compare the integrations of the RIPFs within the current timeslice and schedule the one with the highest value. If more than one is highest, pick one randomly.
  [Diagram: the integration of an RIPF within one timeslice, shown for RIPF 1 through RIPF 3]
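Alg. 1 can be sketched in a few lines. The slide only names LLF, so this assumes the standard definition of laxity (deadline minus current time minus remaining execution time); the task tuples are an illustrative representation, not the TMOSM's:

```python
def llf_pick(tasks, now):
    """Least-Laxity-First: pick the ready task with the smallest laxity.
    Each task is (name, deadline, remaining_exec_time)."""
    return min(tasks, key=lambda t: (t[1] - now) - t[2])
```

For example, among (A, deadline 10, 2 remaining) and (B, deadline 12, 8 remaining) at time 0, B has laxity 4 versus A's 8, so B runs first.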

  25. RIPF-driven CPU scheduling
  • Alg. 3 (RIPF), O(n lg n): run Alg. 1 first; if a zero-risk arrangement is found, use it and return. Otherwise, calculate the integrations of the RIPFs within the next N timeslices (the vision window) and schedule the one with the highest value.
  • Alg. 4 (RIPF/Laxity), O(n lg n): run Alg. 1 first; if a zero-risk arrangement is found, use it and return. Otherwise, calculate the integrations of the RIPFs within the next N timeslices (the vision window), divide each by its laxity, and schedule the one with the highest value.
  [Diagram: the vision window over RIPF 1 through RIPF 3, from the current time along the completion-time axis]
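The vision-window step shared by Alg. 3 and Alg. 4 can be sketched numerically. This treats RIPFs as plain callables and the "integration" as a sum over timeslice endpoints; it is an illustrative sketch under those assumptions, not the TMOSM implementation:

```python
def window_integral(ripf, start, n_slices, slice_len=1.0):
    """Integrate a RIPF over the next N timeslices (the 'vision window')
    by summing its value at the end of each slice."""
    return sum(ripf(start + (k + 1) * slice_len) for k in range(n_slices))

def pick_by_ripf(ripfs, now, n_slices):
    """Alg. 3, second phase: schedule the task whose RIPF accumulates the
    most potential risk within the vision window."""
    return max(ripfs, key=lambda name: window_integral(ripfs[name], now, n_slices))
```

With step RIPFs, the task whose deadline falls inside the window dominates: at time 20 with a 10-slice window, a task with deadline 25 and risk 80 accumulates risk while one with deadline 90 accumulates none.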

  26. RIPF-driven CPU scheduling
  Alg. 5 (Linear-RIPF): optimal solution based on an approximation of the original RIPF set. Each original RIPF is mathematically approximated by a function that is:
  - monotonically increasing (f′(x) > 0);
  - continuous.
  [Diagram: RIPF 1 through RIPF 3 and their continuous approximations, risk vs. completion time]

  27. RIPF-driven CPU scheduling
  Each original RIPF is mathematically approximated by a function that is monotonically increasing (f′(x) > 0) and continuous. The scheduling algorithm then is:
  • Compare the current values of the approximated RIPFs and pick the highest one to schedule.
  • If more than one RIPF has the highest value, compare the first derivatives RIPF′ and pick the highest one.
  [Diagram: RIPF 1 through RIPF 3 with slopes RIPF 1′ and RIPF 2′ at the current time]
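The selection rule above, highest current value with the first derivative breaking ties, can be sketched as follows. Each task supplies its approximated RIPF together with its derivative; the example functions are purely hypothetical:

```python
def pick_linear_ripf(ripfs, now):
    """Pick the task whose approximated RIPF has the highest current value;
    ties are broken by the higher first derivative RIPF'.
    ripfs maps name -> (f, f_prime)."""
    return max(ripfs, key=lambda n: (ripfs[n][0](now), ripfs[n][1](now)))
```

For instance, with f_A(t) = 2t and f_B(t) = t², both reach the value 4 at t = 2, but B's derivative (4) exceeds A's (2), so B is scheduled.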

  28. RIPF-driven CPU scheduling
  Alg. 5 (Linear-RIPF), O(n lg n) online, O(n²) offline: use a linear approximation for the original RIPFs.
  • Pick a set of equally-distanced dots (x_i, y_i) from each RIPF function.
  • Find the linear function y = ax through dot 0 at (0, 0) such that the sum of the distances from all the dots to this line is minimized:
    MIN Σ_{i=1}^{n} sqrt((x_i − x_j)² + (y_i − y_j)²)
    subject to y_j = a·x_j and (y_j − y_i)/(x_j − x_i) = −1/a,
    i.e., (x_j, y_j) is the foot of the perpendicular from (x_i, y_i) onto y = ax.
  [Diagram: dots sampled from an RIPF and the fitted line y = aX through (0, 0)]
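The minimization above (sum of perpendicular distances from the sampled dots to the line y = ax through the origin) can be checked with a crude numeric scan over candidate slopes. This is a stand-in sketch for the slide's offline O(n²) procedure, using the standard point-to-line distance formula:

```python
import math

def perp_dist_sum(a, dots):
    """Sum of perpendicular distances from the dots to the line y = a*x."""
    return sum(abs(a * x - y) / math.sqrt(a * a + 1) for x, y in dots)

def fit_linear_ripf(dots, candidates):
    """Pick the candidate slope minimizing the summed perpendicular distance."""
    return min(candidates, key=lambda a: perp_dist_sum(a, dots))
```

If the dots happen to lie exactly on y = 2x, the scan recovers slope 2 with zero residual.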

  29. The implementation of RIPF-driven CPU scheduling
  • Since the derived RIPF set also incorporates deadline information for each SpM and SvM, RIPF-driven resource schedulers can schedule various resources at least as efficiently as deadline-driven resource schedulers do.
  • Algorithm 3 above has been implemented and incorporated into the current version of TMOSM. The performance of the EDF and RIPF schedulers has been compared using the CAMIN application.
  • Our analysis and experiments show that:
    - if the deadlines of all tasks can be met, the EDF and RIPF schedulers perform equally efficiently;
    - when not all deadlines can be met under EDF, the RIPF scheduler does a better job by considering the potential risk values together with the deadline information in the RIPFs, which means that less important tasks are sacrificed first.

  30. A QoS-driven resource allocation framework based on RIF
  [Layered architecture diagram:]
  • Application: TMO-based, distributed, real-time, fault-tolerant applications.
  • Programming interface: TMO Programming Language (TMOSL).
  • RIPF-driven midterm resource allocation (reconfiguration); QoS support: PSTR, PPTR, FT deadline handling, SNS support.
  • RIPF-driven short-term resource allocation: VMST (RIPF-based CPU scheduler), MMCT (RIPF-based comm. resource scheduler), VLIIT (RIPF-based I/O resource scheduler), WTST; distributed computing support: Socket, COM, CORBA, TTP.
  • OS virtual machine (unintelligent, sub-millisecond-level resource allocation): Windows 2000, NT, CE, or a specialized RTOS.

  31. RIPF-driven midterm resource allocation (reconfiguration)
  - Two considerations in a reconfiguration decision:
    • the current maximum risk values returned by the RIPF-driven resource allocators;
    • the current node work-load.
  - Maximum risk value
    • If the maximum risk value returned by an RIPF-driven resource allocator is more than zero, some QoS guarantees might not be met. E.g., if the maximum risk value returned by the CPU scheduler is more than zero, some deadlines might be violated; if it comes from the communication bandwidth scheduler, some communication bandwidth requirements might not be satisfiable.
  - Node work-load
    TMO work-load = Σ (SpM-GCT / SpM-Interval) + Σ (SvM-GCT / SvM-MIR)
    where GCT = guaranteed completion time and MIR = maximum invocation rate.
    Similarly, a node's work-load: Node work-load = Σ (TMO work-load).
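The work-load formulas above compute directly. This sketch implements the ratios exactly as the slide writes them, taking each SpM as a (GCT, interval) pair and each SvM as a (GCT, MIR) pair in consistent units; how MIR is normalized is the slide's concern, not fixed here:

```python
def tmo_workload(spms, svms):
    """TMO work-load = sum(SpM-GCT / SpM-Interval) + sum(SvM-GCT / SvM-MIR),
    with GCT = guaranteed completion time, MIR = maximum invocation rate."""
    return (sum(gct / interval for gct, interval in spms) +
            sum(gct / mir for gct, mir in svms))

def node_workload(tmos):
    """Node work-load = sum of the work-loads of the TMOs the node hosts."""
    return sum(tmo_workload(spms, svms) for spms, svms in tmos)
```

E.g., a TMO with SpMs (5/50, 10/100) and one SvM (2/20) contributes a work-load of 0.3.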

  32. RIPF-driven midterm resource allocation (reconfiguration)
  - The reasons for system reconfiguration:
    • Case 1: a node crash occurs.
    • Case 2: a TMO crash occurs.
    • Case 3: in a certain computing node, the number of times the maximum risk value becomes positive exceeds a threshold within a certain period; the TNCM might then consider moving some tasks from this node to another node.
  - Case 1: a node crash occurs.
    • The TNCM examines the types of all TMOs hosted in the crashed node. The type of a TMO may be PSTR station, PPTR station, or Simplex.
    • Simplex TMOs should be moved immediately to other healthy computing node(s). Then the crashed node should be repaired and resurrected. All PSTR and PPTR TMOs hosted in the node may be restarted as shadow stations after the node is resurrected; if the resurrection fails, they may be moved to other healthy node(s).
    • The order of moving TMOs and the selection of the destination node(s) are based on the risk value incurred by the TMO movement.
    • The order of moving TMOs: examine the risk value incurred upon completion of the move, based on the estimated moving time; the TMO with the highest risk incursion may be moved first.
    • The selection of a destination node: the maximum risk value of the node should have been zero within a certain period, and the node's work-load should be lower than a threshold.

  33. RIPF-driven midterm resource allocation (reconfiguration)
  Case 1 (node crash): TNCM flowchart
  1. A node crash report arrives from the SNS subsystem.
  2. Identify all Simplex TMOs hosted in the crashed node; determine the order of moving and the ordered destination node list; move all Simplex TMOs to their destination node(s).
  3. Repair and resurrect the crashed node.
  4. If resurrection succeeds: restart all PSTR and PPTR TMOs as shadow stations.
  5. If resurrection fails: prepare to move all PSTR and PPTR TMOs to other healthy node(s); determine the order of moving and the ordered destination node list; move all PSTR and PPTR TMOs to their destination node(s).

  34. The Real-time Fault Tolerance Schemes Incorporated into the RIF-based Resource Allocation Framework

  35. The Supervisor-based Network Surveillance (SNS) Scheme

  36. The SNS scheme
  • Network surveillance (NS), i.e., a (partially or fully) decentralized mode of detecting the faulty and repaired statuses of distributed computing components, is a major part of real-time fault-tolerant distributed computing.
  • Only a small number of NS schemes yield to rigorous quantitative analysis of fault coverage, and the SNS scheme is one of them.
  • The SNS scheme is a semi-centralized real-time NS scheme, effective in a variety of point-to-point networks, and can also be adapted to broadcast networks.

  37. The SNS scheme – fault sources
  • processor
  • incoming communication handling unit (I-unit)
  • outgoing communication handling unit (O-unit)
  • point-to-point interconnection network
  [Diagram: a node containing a processor with internal I-unit and O-unit attached to network links; asterisks mark the fault sources]

  38. The SNS scheme – fault frequency assumptions
  (A1) The fault-source components in each node do not generate messages containing erroneous values, nor untimely messages.
  (A2) Each of the nodes performing store-and-forward functions (as well as the source node) transmits each stored message twice in succession. It is assumed that this makes negligible the probability of message losses caused by transient faults in the components of the two neighbor nodes or in the link between them.
  (A3) No second permanent hardware fault occurs in the system until either the first permanent hardware fault is detected or a fast re-election of the supervisor (which involves one message multicast) is done. Also, network partitioning does not occur during the lifetime of the application.
  (A4) The clocks in the nodes are kept synchronized sufficiently closely for practical purposes, i.e., for the given applications. GPS (global positioning system) based approaches, and other cheaper high-precision approaches that have become available in recent years, may be utilized.

  39. The SNS scheme architecture
  Basic duties of worker nodes:
  • exchange heartbeat messages with their neighbors;
  • monitor their neighbors' health status;
  • generate a fault suspicion report if necessary.
  [Diagram: a supervisor node and worker nodes (including the supervisor's neighbors) connected by the communication network]

  40. The SNS scheme architecture
  Additional duties of the supervisor node:
  • determine other nodes' health status based on the received suspicion reports;
  • after confirming a fault, inform all the related nodes.

  41. The SNS implementation on the TMOSM
  SNS message types:
  • Heartbeat Message
  • Fault Suspicion Message
  • Fault Announcement Message
  • Supervisor-Fault Suspicion Message
  • New Supervisor Announcement Message
  [Diagram: the MMCT moves heartbeat signals, fault suspicion reports, and fault announcements between the network, the incoming/outgoing message queues, and the NST request queue]
  Note:
  • message sending and receiving are done in the MMCT;
  • generation and analysis of messages are done in the NST (Network Surveillance Thread), which is a special SpM.

  42. Algorithm used by the worker's NST
  1. If this node is a LOCAL_SLAVE, return.
  2. If the number of HB (heartbeat) signals received is not smaller than the number of healthy neighbors, return.
  3. If no HB signal was received at all: mark the host node as PI-faulty, shut down the host node, and inform the supervisor. If the host node is a LOCAL_MASTER, mark all of its LOCAL_SLAVEs "faulty" and inform the supervisor.
  4. Otherwise, find the neighbor node Y from which an HB signal was not received.
     - If Y is already marked "possibly faulty": try to use information from the supervisor (this might change Y's status to permanently faulty; inform the supervisor; consider all links attached to Y unusable).
     - Otherwise, mark Y as "possibly faulty" and inform the supervisor about the anomaly.
  5. If HB signals were received on all attached healthy links, return; otherwise find the link K over which an HB was not received, mark K as "faulty", and inform the supervisor about the fault.
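The worker-side decision logic can be sketched as follows. This is a simplified rendering of the flowchart under stated assumptions: per-link heartbeat records, plain dictionaries for node status, and "informing the supervisor" reduced to collecting report tuples:

```python
def worker_nst_check(num_hb, healthy_neighbors, hb_by_link, status):
    """Simplified worker NST round check.
    num_hb: heartbeat signals received this round;
    healthy_neighbors: number of healthy neighbors;
    hb_by_link: dict (neighbor, link) -> bool, HB received over that link;
    status: dict neighbor -> 'healthy' | 'possibly faulty' (updated in place).
    Returns the reports to send to the supervisor."""
    if num_hb == 0 and healthy_neighbors > 0:
        return [("self", "PI-faulty")]        # total silence: suspect own I-unit
    reports = []
    if num_hb >= healthy_neighbors:
        return reports                         # everything heard from
    by_neighbor = {}
    for (nbr, link), ok in hb_by_link.items():
        by_neighbor.setdefault(nbr, []).append((link, ok))
    for nbr, links in by_neighbor.items():
        if not any(ok for _, ok in links):     # silent on every link: node suspicion
            if status.get(nbr) == "possibly faulty":
                reports.append((nbr, "escalate"))   # let the supervisor decide
            else:
                status[nbr] = "possibly faulty"
                reports.append((nbr, "possibly faulty"))
        else:
            for link, ok in links:             # silent on one link only: link suspicion
                if not ok:
                    reports.append((link, "link faulty"))
    return reports
```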

  43. Algorithm used by the supervisor NST
  For each worker node Y:
  • If there is a "spontaneous fault report" for Y: mark Y as "faulty" and multicast this message. If this change leaves some node X with only one neighbor Z, declare X to be Z's slave and multicast this message.
  • Otherwise, if there is a "fault suspicion" for Y: if the number of fault suspicions is greater than 1, or Y's status is already "possibly faulty", treat it as a confirmed fault as above; otherwise mark Y as "possibly faulty" and multicast this message.
  • Otherwise, continue.
  For each link L:
  • If there is a "fault report" for L, mark L as "faulty" and multicast this message; otherwise continue.

  44. Algorithm used by the supervisor NST
  For each link L:
  • If there is a "fault report" for L, mark L as "faulty" and multicast this message; otherwise continue.

  45. Fault detection time bound analysis – definitions
  1) MIT: maximum incoming-message turnaround time of the MMCT, i.e., the maximum time that elapses from the arrival of a message in the input queue of the MMCT in a node to the time at which the MMCT completes forwarding the item to its destination thread.
  2) MOT: maximum outgoing-message turnaround time of the MMCT, i.e., the maximum time that elapses from the arrival of an item at the input queue of the MMCT to the time at which the MMCT sends out the item.
  3) MNT: maximum NST turnaround time, i.e., the maximum time that elapses from the arrival of an item at the input queue of the NST to the time at which the NST completes processing the item.

  46. The SNS scheme – fault detection time bound analysis
  [Timing diagram: heartbeat messages h and receipts r exchanged between nodes X and Y across rounds i and i+1, with MD (message delay), MIT, and MNT intervals marked around the NST executions]
  • All messages initiated in round i are received in the same round.
  • When the NST starts to execute, all messages initiated in the previous round have been delivered to its input queue, and all messages in the input queue are processed before the NST execution completes.

  47. The SNS scheme – fault detection time bound analysis
  [Timing diagram, step 1: node X suffers a PO fault; its heartbeat to neighbor Y is omitted, and Y forms a local suspicion within L_PO_NEI, bounded by MD + MIT + MNT after the expected heartbeat time]

  48. The SNS scheme – fault detection time bound analysis
  [Timing diagram, step 2: in round i+2 the neighbor's fault suspicion report reaches the supervisor within L_PO_SUP, adding MOT, MD, MIT, and MNT]

  49. The SNS scheme – fault detection time bound analysis
  [Timing diagram, step 3: in round i+3 the supervisor multicasts the fault announcement (MCAST), completing detection within L_PO]
  The detection procedure of a PO fault in a worker node (node X).

  50. The SNS scheme – fault detection time bound analysis
  [Timing diagram: heartbeats from neighbors Y and Z are lost at node X; the local suspicion is formed within L_PI_LOC, reported to the supervisor within L_PI_SUP, and announced within L_PI]
  The detection procedure of a PI fault in a worker node (node X).

  51. The SNS scheme – fault detection time bound analysis
  [Timing diagram: the neighbor's suspicion is formed within L_PP_NEI, reported within L_PP_SUP, and announced within L_PP]
  The detection procedure of a permanent processor fault in a worker node (node X).

  52. The SNS scheme – fault detection time bound analysis
  [Timing diagram: the suspicion is formed within L_PLS at the sender, reported within L_PLS_SUP, and announced within L_PLS]
  The detection procedure of a permanent link fault by the sender node (node X).

  53. The SNS scheme – fault detection time bound analysis
  [Timing diagram: the suspicion is formed within L_PLR at the receiver, reported within L_PLR_SUP, and announced within L_PLR]
  The detection procedure of a permanent link fault by the receiver node (node Y).

  54. The SNS scheme – fault detection time bound analysis
  [Timing diagram: the supervisor's heartbeat is omitted; its neighbors suspect within L_SPO_NEI, elect a new supervisor via multicast within L_SPO_ELE, and announce within L_SPO]
  The detection procedure of a PO fault in the supervisor node.

  55. The SNS scheme – fault detection time bound analysis
  [Timing diagram: heartbeats to the supervisor are lost; suspicion within L_SPI_NEI, new-supervisor election within L_SPI_ELE, announcement within L_SPI]
  The detection procedure of a PI fault in the supervisor node.

  56. The SNS scheme – fault detection time bound analysis
  [Timing diagram: suspicion within L_SPP_NEI, new-supervisor election within L_SPP_ELE, announcement within L_SPP]
  The detection procedure of a permanent processor fault in the supervisor node.

  57. Experimental data
  Message delay:
  1) 400-byte package: 189us in an isolated network; 192us in an Internet environment.
  2) 600-byte package: 212us in an isolated network; 236us in an Internet environment.
  Maximum MMCT turnaround time: 82us. Maximum NST turnaround time: 28us.
  Selecting the NST execution period p = 12ms, both the fault detection and the new-supervisor election take about 3.5p, i.e., 42ms.
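The quoted bound can be reproduced from the figures on the slide: detection spans about 3.5 NST periods, and the measured per-hop message costs stay far below the 12ms period, so the period dominates. A back-of-the-envelope sketch (the "per-hop overhead" grouping of MD + MMCT + NST turnarounds is my reading of the analysis, not a quantity named on the slide):

```python
# Measured costs from the slide, in microseconds
MSG_DELAY_600B_INTERNET = 236   # worst measured message delay
MAX_MMCT_TURNAROUND = 82
MAX_NST_TURNAROUND = 28

NST_PERIOD_MS = 12
ROUNDS_TO_DETECT = 3.5          # detection / supervisor re-election span ~3.5 periods

detection_bound_ms = ROUNDS_TO_DETECT * NST_PERIOD_MS          # 42ms, as on the slide
per_hop_overhead_us = (MSG_DELAY_600B_INTERNET +
                       MAX_MMCT_TURNAROUND + MAX_NST_TURNAROUND)
```

The per-hop overhead (a few hundred microseconds) is roughly 3% of the 12ms period, which is why the bound is quoted simply as a multiple of p.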

  58. Adapting the SNS scheme to mixed networks
  The main issues of adaptation are:
  1) selecting an appropriate neighboring scheme, and
  2) establishing two independent communication paths between any two nodes in the system.
  [Diagram: a multi-campus point-to-point net joining local broadcast domains and local point-to-point domains (nodes node11..node1N, node21..node2N, node31..node3N)]

  59. PSTR – The Primary-Shadow TMO Replication Scheme

  60. The PSTR scheme
  • The PSTR scheme is the result of incorporating the primary-shadow active replication principle into the TMO object-structuring scheme.
  • A natural way to incorporate the active replication principle into the TMO structuring scheme is to replicate each TMO to form a pair of partner objects and host the partners in two different nodes.
  • The methods of the primary object alone produce all external outputs under normal circumstances.
  • Since each partner has the same external inputs and its own object data store (ODS), the methods of both objects perform the same execution and ODS updates.

  61. An SvM execution in PSTR – normal case
  [Diagram: primary SvM/SpM sections on node A and shadow sections on node B. The primary SvM saves the client request, sends an ack to the client, and notifies the shadow of the request ID; both replicas then run transaction 1 (for each external output, executing the listed actions, computing absolute deadlines, possibly acquiring ODSS locks), pass the acceptance test (AT), commit, and update the ODSSs and release locks. The primary notifies the shadow of AT success, performs the external output(s), and sends an output-success notice; then both begin transaction 2 and report completion. SRQs and initiation-condition checks gate the SpM sections.]
  Note: external outputs are sent by the MMCT, possibly through the VLIIT.

  62. Handling inputs to TMO replicas – service request [Diagram:] A client (TMO3 on Node3) issues a service request for TMO1, SvM2. Its TMOSM consults the SvMInfoLists: the primary-side list maps TMO1/SvM2 to Node1 and the shadow-side list maps it to Node2. The TMOSM then delivers two tagged copies of the request, "TMO1, SvM2, primary" to the SRQ of the TMO1 primary on Node1 and "TMO1, SvM2, shadow" to the SRQ of the TMO1 shadow on Node2.

  63. Handling inputs to TMO replicas – result return [Diagram:] A server (TMO1) returns a service result for TMO3, SvM1. The TMOSM consults the SvMInfoLists, which map TMO3/SvM1's primary and shadow replicas to their host nodes, and delivers two tagged copies, "TMO3, SvM1, primary" and "TMO3, SvM1, shadow", to the RRQs of the TMO3 primary and shadow replicas.
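The two delivery slides above share one mechanism: a per-(TMO, SvM) lookup table naming both hosts, and duplicated delivery of the tagged message. A minimal sketch, with an invented table layout (the real SvMInfoList format is not given in the slides):

```python
# Illustrative SvMInfoList: (TMO, SvM) -> host node of each replica role.
svm_info = {
    ("TMO1", "SvM2"): {"primary": "Node1", "shadow": "Node2"},
}

# Per-node service request queues (SRQs); result returns would use RRQs
# in exactly the same way.
srq = {"Node1": [], "Node2": [], "Node3": []}


def send_service_request(tmo, svm, payload):
    """Deliver one tagged copy of the request to each replica's SRQ.
    The role tag lets the receiving replica know whether it should
    produce external outputs."""
    hosts = svm_info[(tmo, svm)]
    for role in ("primary", "shadow"):
        srq[hosts[role]].append((tmo, svm, role, payload))


send_service_request("TMO1", "SvM2", {"x": 1})
```

Duplicated, role-tagged delivery is what lets both partners stay in lock-step without the client knowing (or caring) which replica is currently primary.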

  64. Types of faults & their symptoms
  • Hardware faults
  – Symptom 1.1: node crash
  – Symptom 1.2: process/thread gets corrupted – no progress
  – Symptom 1.3: process/thread gets corrupted – progress but with contaminated state (low probability)
  – Symptom 2.1: resource shortage -> process/thread lockup/stall
  • OS faults
  – Symptoms 1.1, 1.2, 1.3 (low probability), and 2.1, as above
  • Communication failures
  – Symptom 3.1: message loss
  – Symptom 3.2: duplicated messages
  • Application design faults
  – Symptom 1.2: process/thread gets corrupted – no progress
  – Symptom 1.3: process/thread gets corrupted – progress but with contaminated state (high probability)
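The taxonomy above can be encoded directly, which makes it easy to cross-reference against the detection-coverage tables that follow. The dictionary layout and helper are illustrative; the symptom codes are the ones used in the slides.

```python
# Fault class -> symptom codes it can exhibit (from the taxonomy above).
SYMPTOMS_BY_FAULT = {
    "hardware": ["Sym1.1", "Sym1.2", "Sym1.3", "Sym2.1"],
    "os":       ["Sym1.1", "Sym1.2", "Sym1.3", "Sym2.1"],
    "comm":     ["Sym3.1", "Sym3.2"],
    "app":      ["Sym1.2", "Sym1.3"],
}


def faults_with_symptom(sym):
    """Return the fault classes that can exhibit a given symptom.
    Useful because detection mechanisms observe symptoms, not fault
    classes: one symptom may stem from several classes."""
    return sorted(f for f, syms in SYMPTOMS_BY_FAULT.items() if sym in syms)
```

For example, a no-progress thread (Sym1.2) could be a hardware, OS, or application design fault, which is why the coverage tables list the same detection cases for all three.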

  65. PSTR fault detection mechanisms
  • Primary's AT – logic test (detection mechanism (DM) 1.1)
  • Primary's AT – timeout (DM 1.2)
  • Primary's sending of clientRequestID – timeout (DM 1.3)
  • Shadow's wait for clientRequestID – timeout (DM 2.1)
  • Shadow's AT – logic test (DM 2.2)
  • Shadow's AT – timeout (DM 2.3)
  • Shadow's wait for primary's AT result – timeout (DM 2.4)
  • Shadow's wait for primary's notice of external-output success – timeout (DM 2.5)
  • SNS's node-failure notice (DM 3.1)
  • Message-sequence check (messages are transmitted twice, over redundant links) (DM 4.1)
  • Absence of ack. (DM 4.2)
  – Server's ack of an SvM request (DM 4.2.1)
  – Server's return of the expected result (DM 4.2.2)
  • Unacceptable request to kernel/middleware (DM 5.1)
  Notes: 1. When a TMO changes its role between primary and shadow, it reports the change to the TMOSM, which in turn notifies the TNCM. The TNCM can thus detect primary-primary situations. 2. Every external output should be done in an independent manner.
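The timeout-style mechanisms (DM 2.1, 2.4, 2.5) all follow one pattern: the shadow waits for a message from the primary up to a deadline and, if the deadline passes, treats the primary as failed and takes over. A hedged sketch of that pattern, using a plain queue as an illustrative stand-in for the real message channel:

```python
import queue


def wait_or_takeover(msg_queue, deadline_s):
    """Return ("ok", msg) if the primary's message arrives before the
    deadline, otherwise ("takeover", None), meaning the shadow should
    change to primary and inform the TNCM and other SxMs."""
    try:
        return ("ok", msg_queue.get(timeout=deadline_s))
    except queue.Empty:
        return ("takeover", None)


# Normal case: the AT-success notice arrives in time.
q = queue.Queue()
q.put("AT success")
status, msg = wait_or_takeover(q, 0.1)
```

The derived deadlines (DM 2.4, 2.5) can be computed from the application-given ones plus bounded message delays, which is why the slides mark them "Derived" rather than programmer-supplied.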

  66. PSTR fault detection mechanisms (who provides each)
  • Primary's AT – logic test (DM 1.1): given by application programmers
  • Primary's AT – timeout (DM 1.2): given by application programmers or by the tools
  • Primary's sending of clientRequestID – timeout (DM 1.3): given by application programmers or by the tools
  • Shadow's wait for clientRequestID – timeout (DM 2.1): given by application programmers or by the tools
  • Shadow's AT – logic test (DM 2.2): given by application programmers
  • Shadow's AT – timeout (DM 2.3): given by application programmers or by the tools
  • Shadow's wait for primary's AT result – timeout (DM 2.4): derived
  • Shadow's wait for primary's notice of external-output success – timeout (DM 2.5): derived

  67. PSTR fault detection mechanisms (continued)
  • SNS's node-failure notice (DM 3.1)
  • Message-sequence check (double transmission over redundant links) (DM 4.1): provided by TMOSM
  • Absence of ack. (DM 4.2)
  – Server's ack of an SvM request (DM 4.2.1)
  – Server's return of the expected result (DM 4.2.2)
  • Unacceptable request to kernel/middleware (DM 5.1)
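The message-sequence check (DM 4.1) can be sketched as a small receiver-side filter. This is an illustration of the general technique only, under the assumption (stated on the slide) that each message is transmitted twice over redundant links with a sequence number: the first copy is delivered, the redundant copy is dropped as a duplicate, and a gap in the sequence signals a lost message.

```python
class SequenceChecker:
    """Receiver-side duplicate suppression and loss detection for
    doubly-transmitted, sequence-numbered messages (sketch of DM 4.1)."""

    def __init__(self):
        self.expected = 0     # next sequence number we have not yet seen
        self.seen = set()     # delivered sequence numbers

    def receive(self, seq):
        if seq in self.seen:
            return "duplicate"                     # copy from the redundant link
        self.seen.add(seq)
        if seq > self.expected:
            missing = list(range(self.expected, seq))
            self.expected = seq + 1
            return ("loss-detected", missing)      # gap => message(s) lost
        self.expected = max(self.expected, seq + 1)
        return "delivered"
```

Note how this one mechanism covers both communication symptoms from the taxonomy: Sym3.2 (duplicated messages) via the `seen` set, and Sym3.1 (message loss) via the gap check.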

  68. Typical cases of fault detection under PSTR + SNS – faults in the primary [Table: fault types and symptoms vs. detection mechanisms DM 1.1–5.1, grouped by where detection occurs (primary, shadow, SNS, messaging, kernel/middleware); cells name the detection cases.]
  • Hardware / OS faults:
  – Sym1.1: C1.1, C1.2, C1.3, C1.4, C1.5
  – Sym1.2: C1.6, C1.7, C1.8, C1.1, C1.2, C1.3, C1.5
  – Sym1.3: C1.6, C1.7, C1.8, C1.1, C1.2, C1.3, C1.5, C1.9
  – Sym2.1: C1.6, C1.7, C1.8, C1.1, C1.2, C1.3, C1.4, C1.5
  • Communication failures:
  – Sym3.1: C1.1, C1.2, C1.3, C1.5
  – Sym3.2: C1.5
  • Application design faults:
  – Sym1.2: C1.6, C1.7, C1.8, C1.1, C1.2, C1.3, C1.5
  – Sym1.3: C1.6, C1.7, C1.8, C1.1, C1.2, C1.3, C1.5, C1.9

  69. Typical cases of fault detection under PSTR + SNS – faults in the shadow [Table: fault types and symptoms vs. detection mechanisms DM 1.1–5.1, as on the previous slide; cells name the detection cases.]
  • Hardware / OS faults:
  – Sym1.1: C2.1, C2.4
  – Sym1.2: C2.2, C2.3, C2.4
  – Sym1.3: C2.2, C2.3, C2.4, C2.5
  – Sym2.1: C2.2, C2.3, C2.1, C2.4
  • Communication failures:
  – Sym3.1: C2.2, C2.3, C2.4
  – Sym3.2: C2.4
  • Application design faults:
  – Sym1.2: C2.2, C2.3, C2.4
  – Sym1.3: C2.2, C2.3, C2.4, C2.5

  70. Case C1.1A: node crash in the primary node during SvM initiation [Timing diagram:] The primary saves the client request and sends an ack. to the client, but a fatal error occurs and the node crashes before it can notify the shadow of the client request ID. The shadow, failing to receive the client ID from the primary within its timeout, changes to primary and informs the TNCM and the other SxMs, then carries out the service itself: Transaction 1 begins, the Acceptance Test passes, commit, notify AT success, external output(s), Transaction 2 begins, report completion.
  Legend: – wait; + compute absolute deadline; * may involve acquiring ODSS locks.
  Note: after this node crash, the TNCM in the master node detects it through the SNS and starts to relocate all the TMOs on this node to other healthy nodes. The relocated TMOs will be started as shadow TMOs and will collaborate with the active primary TMOs to catch up by receiving current status data from the primary TMOs.
  Note: external outputs are sent by MMCT, possibly through VLIIT.

  71. Case C1.1B: other failures in the primary node during SvM initiation [Timing diagram:] The primary saves the client request and sends an ack. to the client, but a transient failure occurs and it fails to notify the client request ID. The shadow, failing to receive the client ID from the primary within its timeout, changes to primary, informs the TNCM and the other SxMs, and completes the service (Transaction 1 begins, Acceptance Test passes, commit, notify AT success, external output(s), Transaction 2 begins, report completion). In the former primary, once the error is detected, it informs the TNCM and the shadow, informs the other SxMs in the same TMO, performs rollback & recovery, and changes its mode to shadow.
  Legend: – wait; + compute absolute deadline; * may involve acquiring ODSS locks. Note: external outputs are sent by MMCT, possibly through VLIIT.

  72. Case C1.2A: node crash in the primary node during a transaction [Timing diagram:] The primary saves the client request, sends an ack. to the client, notifies the client request ID, and begins Transaction 1, but a fatal error occurs and the node crashes before the AT result can be sent. The shadow runs its own Transaction 1 and passes its Acceptance Test, then fails to receive the primary's AT result within its timeout (DM 2.4); it changes to primary, informs the TNCM and the other SxMs, updates ODSS's and releases locks (if any), produces the external output(s), begins Transaction 2, and reports completion.
  Note: the TNCM in the master node detects the crash through the SNS and relocates the TMOs on this node to other healthy nodes, where they restart as shadow TMOs and catch up by receiving current status data from the primary TMOs. External outputs are sent by MMCT, possibly through VLIIT.

  73. Case C1.2B: AT timeout in the primary node [Timing diagram:] The primary saves the client request, sends an ack., notifies the client request ID, and begins Transaction 1, but its Acceptance Test times out (X). The primary informs the TNCM and the shadow, informs the other SxMs in the same TMO, performs rollback & recovery, and changes its mode to shadow. The shadow, on receiving the AT-timeout message, changes to primary, informs the TNCM and the other SxMs, updates ODSS's and releases locks (if any), produces the external output(s), begins Transaction 2, and reports completion.
  Legend: – wait; + compute absolute deadline; * may involve acquiring ODSS locks. Note: external outputs are sent by MMCT, possibly through VLIIT.

  74. Case C1.3A: node crash in the primary node during a transaction (after the AT) [Timing diagram:] The primary completes Transaction 1, passes its Acceptance Test, commits, and notifies AT success, but a fatal error occurs and the node crashes before it can report external-output success. The shadow receives the AT result notice but then fails to receive the output-success notice within its timeout (DM 2.5); it changes to primary, informs the TNCM and the other SxMs, produces the external output(s), begins Transaction 2, and reports completion.
  Legend: – wait; + compute absolute deadline; * may involve acquiring ODSS locks. Note: external outputs are sent by MMCT, possibly through VLIIT.

  75. Case C1.4A: node crash in the primary node during SvM initiation – detected by SNS [Timing diagram:] As in Case C1.1A, the primary crashes after saving the client request and sending an ack. to the client. Here the shadow receives the SNS fault report before its own timeout expires, so it need not wait for the primary: it changes to primary, informs the TNCM and the other SxMs, and completes the service (Transaction 1 begins, Acceptance Test passes, commit, notify AT success, external output(s), Transaction 2 begins, report completion).
  Note: after this node crash, the TNCM in the master node detects it through the SNS and relocates all the TMOs on this node to other healthy nodes, where they restart as shadow TMOs and catch up by receiving current status data from the primary TMOs. External outputs are sent by MMCT, possibly through VLIIT.

  76. Case C1.4B: node crash in the primary node during a transaction – detected by SNS [Timing diagram:] As in Case C1.2A, the primary crashes mid-transaction. The shadow, having passed its own Acceptance Test and committed, receives the SNS fault report, so it need not wait out its AT-result timeout: it changes to primary, informs the TNCM and the other SxMs, updates ODSS's and releases locks (if any), produces the external output(s), begins Transaction 2, and reports completion.
  Note: TNCM relocation through the SNS proceeds as in the other primary-crash cases; external outputs are sent by MMCT, possibly through VLIIT.

  77. Case C1.7: AT timeout in the primary (transient) [Timing diagram:] The primary's Acceptance Test times out (X), and the primary performs rollback & retry; the retried run then completes normally: Acceptance Test passes, commit, update ODSS's and release locks (if any), notify AT success, external output(s), output success, Transaction 2 begins, report completion. The shadow proceeds as in the normal case, receiving the AT result and the output-success notice, updating its ODSS's, and reporting completion.
  Legend: – wait; + compute absolute deadline; * may involve acquiring ODSS locks. Note: external outputs are sent by MMCT, possibly through VLIIT.

  78. Case 2.1: node crash in the shadow node during a transaction [Timing diagram:] The shadow saves the client request and begins Transaction 1, but a fatal error occurs and the shadow node crashes. The primary is unaffected: it completes Transaction 1, passes its Acceptance Test, commits, notifies AT success, updates ODSS's and releases locks (if any), produces the external output(s), begins Transaction 2, and reports completion.
  Note: after this node crash, the TNCM in the master node detects it through the SNS and relocates all the TMOs on this node to other healthy nodes, where they restart as shadow TMOs and catch up by receiving current status data from the primary TMOs. External outputs are sent by MMCT, possibly through VLIIT.

  79. Case 2.2: temporary failure in the shadow node [Timing diagram:] The shadow's Acceptance Test fails. The shadow informs the TNCM, informs the other SxMs in the same TMO, performs rollback & recovery, and resumes as shadow. The primary is unaffected and completes the service normally: Acceptance Test passes, commit, notify AT success, update ODSS's and release locks (if any), external output(s), output success, Transaction 2 begins, report completion.
  Legend: – wait; + compute absolute deadline; * may involve acquiring ODSS locks. Note: external outputs are sent by MMCT, possibly through VLIIT.

  80. The PSTR scheme – fault detection time bound analysis [PSTR timing chart – normal (fault-free) case.] The chart traces a client request from t0 through the primary and the shadow, showing the per-hop delay parameters MIT, MMPT, MOT, and MD, the clientID and AT-result messages, the execution segments (P exec) and Acceptance Tests (AT) on both replicas, the external output at t1, the output-success message to the shadow, and completion (COMPL).

  81. The PSTR scheme – fault detection time bound analysis [PSTR timing chart – primary clientID failure case, shown beside the fault-free case.] The primary fails to deliver the clientID message; the shadow's wait runs to the deadline DL clientID and times out, after which the shadow executes (P exec), runs its AT, and produces the external output at t2 (versus t1 in the fault-free case).

  82. The PSTR scheme – fault detection time bound analysis [PSTR timing chart – primary AT failure case, shown beside the fault-free case.] The clientID message is delivered normally, but the primary's AT result never arrives; the shadow's wait runs to the deadline DL AT and times out, after which the shadow takes over and produces the external output at t2 (versus t1 in the fault-free case).
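The point of these charts is that the takeover latency (t2 - t0) is a bounded sum of known quantities. The exact formula is not recoverable from this transcript, so the sketch below only illustrates the additive structure of such a bound, using the chart's parameter names and made-up example values.

```python
def takeover_bound(mit, mmpt, dl_timeout, mot):
    """Illustrative (not the slides' actual) composition of a shadow
    takeover bound: request delivery delay + middleware pickup delay
    + the full timeout wait (e.g. DL_clientID or DL_AT) + the time to
    emit the takeover output.  All arguments are assumed worst-case
    values of the chart parameters."""
    return mit + mmpt + dl_timeout + mot


# Example with hypothetical millisecond values:
bound = takeover_bound(mit=2, mmpt=1, dl_timeout=50, mot=2)
```

Because every term is either a middleware-measurable delay or an application-given deadline, the bound can be computed at design time, which is what makes PSTR's fault detection "real-time" rather than merely eventual.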
