Computing Optimal Self-Repair Actions: Damage Minimization versus Repair Time

University of Paderborn, Software Engineering Group, Prof. Dr. Wilhelm Schäfer
Computing Optimal Self-Repair Actions: Damage Minimization versus Repair Time
Matthias Tichy, Holger Giese, Daniela Schilling, Wladimir Pauls
Presented by Daniela Schilling, May 2005

Motivation
www.railcab.de

Motivation
- Redundant implementations of important software components
- Required: reconfiguration
- Given: automatism to detect failed components
- Self-Repair Actions: automatic calculation of redeployment for failed components
[Figure: deployment diagram with the components vot:Voter, pc1/pc2/pc3:Position Calculation, cc:Convoy, mul:Multiplier and gps:GPS-Controller on the nodes Taliesin, Avalon, Uther, Gareth, Gorlois and Arthur]

Initial Deployment
- Map deployment constraints given as extended UML Deployment Diagrams to inequalities over boolean and integer variables
- Use constraint solver to calculate initial deployment
[Figure: deployment diagram with pc1:Position Calculation, pc2:Position Calculation, Node1 and Node2, and the annotation pc1.mem=2.0Mb]
Reference: Matthias Tichy, Daniela Schilling, Holger Giese: Design of Self-Managing Dependable Systems with UML and Fault Tolerance Patterns. WOSS/FSE 2004.
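
The mapping from deployment constraints to inequalities can be illustrated with a small sketch. It is not the authors' encoding: the node capacities, component demands, and the separation rule are hypothetical example data, and a brute-force search stands in for a real constraint solver. Conceptually, a boolean variable x[c][n] is 1 if component c is placed on node n.

```python
from itertools import product

# Hypothetical example data (not taken from the slides).
node_mem = {"Node1": 2.0, "Node2": 2.0, "Node3": 1.0}   # capacity in Mb
comp_mem = {"pc1": 1.5, "pc2": 1.5, "vot": 0.5}         # demand in Mb
separate = [("pc1", "pc2")]  # redundant copies must not share a node


def satisfies(assign):
    """Check all inequalities for one candidate assignment component -> node."""
    # Capacity: for every node n, sum over c of mem[c] * x[c][n] <= mem[n].
    for node, cap in node_mem.items():
        used = sum(comp_mem[c] for c, placed in assign.items() if placed == node)
        if used > cap:
            return False
    # Separation: x[a][n] + x[b][n] <= 1 for every node n.
    return all(assign[a] != assign[b] for a, b in separate)


def solve():
    """Return the first feasible deployment, or None if none exists."""
    comps = sorted(comp_mem)
    # Choosing exactly one node per component encodes sum over n of x[c][n] == 1.
    for choice in product(sorted(node_mem), repeat=len(comps)):
        assign = dict(zip(comps, choice))
        if satisfies(assign):
            return assign
    return None


deployment = solve()
print(deployment)
```

Exhaustive search is only workable for toy sizes; for a model on the scale of the experiment a dedicated solver would be needed, which is the point of the slide.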

Online Redeployment
- Node crash failure ⇒ all components running on this node fail too
- Compute Self-Repair Action
  - Find suitable nodes to redeploy failed components
  - How to find suitable nodes?
  - What to do if there is no suitable node?
  - Redeploy further (still running) components
- Damage: negative effects of unavailable components
- Goal: minimize costs
  - Keep damage as low as possible
  - Reduce solving time
[Figure: costs (damage) over time, caused by failed components and by components to be migrated, across the phases "calculate redeployment" and "perform redeployment"]

Online Redeployment: 1. Solution
- Remove crashed nodes from constraint system
- Solve complete constraint system again
[Figure: damage over time]

Online Redeployment: 2. Solution
- Remove crashed nodes from constraint system
- Add objective function (minimize damage caused by migration of running components) to the constraint system
- Solve complete system again
[Figure: damage over time]

Online Redeployment: Our Approach
- Remove crashed nodes from constraint system
- Add objective function (minimize damage) to the constraint system
- Try to solve constraint systems for failed components only
- Until a solution is found: extend set of components that have to be redeployed/migrated
- Use constraint solver
- Heuristic approach
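
The loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `incremental_repair`, `try_solve`, and the toy memory model are hypothetical, the brute-force `try_solve` stands in for the constraint solver, and the objective function (minimize damage) is left out for brevity.

```python
from itertools import product

# Hypothetical toy model: two surviving nodes and their memory in Mb.
CAPACITY = {"A": 2.0, "B": 2.0}
DEMAND = {"f": 2.0, "x": 1.5, "y": 0.5}   # memory demand per component
CURRENT = {"x": "A", "y": "B"}            # placement of still-running components


def try_solve(movable):
    """Place the movable components; all others stay put. None if unsolvable."""
    movable = sorted(movable)
    for choice in product(sorted(CAPACITY), repeat=len(movable)):
        placement = {**CURRENT, **dict(zip(movable, choice))}
        load = {node: 0.0 for node in CAPACITY}
        for comp, node in placement.items():
            load[node] += DEMAND[comp]
        if all(load[node] <= CAPACITY[node] for node in CAPACITY):
            return placement
    return None


def incremental_repair(failed, candidates, solve):
    """Start with the failed components only and grow the submodel on failure.

    candidates lists running components in the order they may be added.
    Returns (deployment, migrated components), or (None, set()) if no repair exists.
    """
    movable = set(failed)
    remaining = list(candidates)
    while True:
        solution = solve(movable)
        if solution is not None:
            return solution, movable - set(failed)
        if not remaining:
            return None, set()
        movable.add(remaining.pop(0))  # extend the set of components to migrate


deployment, migrated = incremental_repair({"f"}, ["y", "x"], try_solve)
print(deployment, migrated)
```

In this toy run the failed component `f` fits on neither node at first, so the running component `y` is added to the submodel and moved to make room.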

Online Redeployment: Our Approach
[Figure: damage over time]

Choosing Components for Redeployment
- Example: 3 redundant copies of important components
- Algorithm:
  - Try to redeploy failed component
  - Until redeployment is possible:
    1. Choose components which are no redundant copies of failed components
    2. Choose components where only one of three redundant copies already failed
    3. Choose arbitrary components
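
One possible reading of the three selection steps is a priority class per running component, sketched below. The function names, the `group_of` mapping, and the component names in the example are hypothetical, and treating a component without a redundancy group as class 1 is an assumption.

```python
def priority_class(comp, group_of, failed):
    """Return 1, 2 or 3: lower classes are chosen for migration first.

    group_of maps a component to the name of its redundancy group;
    components missing from the mapping have no redundant copies.
    """
    group = group_of.get(comp)
    if group is None:
        return 1  # not a redundant copy of anything, hence of no failed component
    lost = sum(1 for f in failed if group_of.get(f) == group)
    if lost == 0:
        return 1  # no copy of this group has failed
    if lost == 1:
        return 2  # only one of the three copies has failed
    return 3      # arbitrary components: everything that is left


def expansion_order(running, group_of, failed):
    """Order running components by priority class, then by name for determinism."""
    return sorted(running, key=lambda c: (priority_class(c, group_of, failed), c))


# Hypothetical example: two groups of three redundant copies each.
group_of = {
    "pc1": "pc", "pc2": "pc", "pc3": "pc",
    "gps1": "gps", "gps2": "gps", "gps3": "gps",
}
failed = {"pc1", "gps1", "gps2"}
running = ["pc2", "pc3", "gps3", "cc", "vot"]
order = expansion_order(running, group_of, failed)
print(order)
```

Components whose group is already down to a single copy end up last, so migrating them is considered only when nothing cheaper makes the redeployment possible.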

Experiment
- Scenario:
  - 36 nodes with 114 links
  - 72 components with 99 connectors
  - 5 node-specific (CPU, OS, Memory, Utilization, HDD) and 2 link-specific (Bandwidth, Loss) deployment restrictions
  - Set of deployment constraints on components and connectors
- Experiment:
  - Randomly selected a node and let it fail

Experimental Results

Test   1. Solution          2. Solution          Our Algorithm
Nr.    Time (ms)  Damage    Time (ms)  Damage    Time (ms)  Damage
1      13630      773       > 1 h      N/A       50         7
2      14890      97        56060      29        30         30
3      13790      4         14920      1         10         5
4      13660      34        16430      31        50         34

[Figure: damage over time]

Conclusion & Future Work
- Algorithm to calculate optimal self-repair actions
- Deployment constraints solved by standard constraint solver
- Experiment showed that algorithm is nearly optimal in damage minimization and time consumption
- Not presented: pre-solving step
- Communication and monitoring framework
- Describe repair rules by graph transformation systems

Appendix

Simple Redeployment
[Figure: deployment diagram with the components vot:Voter, pc1/pc2/pc3:Position Calculation, cc:Convoy, mul:Multiplier and gps:GPS-Controller on the nodes Taliesin, Avalon, Uther, Gareth, Gorlois and Arthur]

Example
[Figure: the deployment diagram (vot:Voter, pc1/pc2/pc3:Position Calculation, cc:Convoy, mul:Multiplier, gps:GPS-Controller on the nodes Taliesin, Avalon, Uther, Gareth, Gorlois and Arthur) annotated with memory values between Mem=0.25Mb and Mem=2.5Mb for nodes and components]

Damage Calculation
[Figure: nodes n1 to n5 hosting components C1 to C5; two annotations read damage=13]
Damage values: all = 13, 2 of 3 = 4, 1 of 3 = 1
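
Assuming the three values are the damage of a redundancy group when one, two, or all three of its copies are unavailable, the total damage could be computed as in the sketch below. That reading, the zero damage for an intact group, and all names are assumptions; the slide only gives the three numbers.

```python
# Damage per redundancy group by number of unavailable copies.
# 1, 4 and 13 come from the slide; 0 for an intact group is an assumption.
DAMAGE_BY_LOST_COPIES = {0: 0, 1: 1, 2: 4, 3: 13}


def group_damage(copies, unavailable):
    """Damage of one group of three redundant copies."""
    lost = sum(1 for c in copies if c in unavailable)
    return DAMAGE_BY_LOST_COPIES[lost]


def total_damage(groups, unavailable):
    """Sum of the group damages; groups maps a group name to its copies."""
    return sum(group_damage(copies, unavailable) for copies in groups.values())


# Hypothetical example with two groups of three copies.
groups = {
    "pc": ["pc1", "pc2", "pc3"],
    "gps": ["gps1", "gps2", "gps3"],
}
damage = total_damage(groups, unavailable={"pc1", "pc2", "gps1"})
print(damage)
```

The values grow faster than linearly, so losing a further copy of an already weakened group costs more than weakening an intact one.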

Submodel Expansion
[Figure: stepwise expansion of the submodel over the components a to g, divided into failed and running components and sorted into the columns "Submodel", "Consider" and "Consider later". Step labels: initial situation; 1) submodel not solvable; 2) redundant copies; 3) not related; 4) submodel not solvable]

Submodel Expansion (2)
[Figure: continuation of the expansion over the components a to g, divided into failed and running components. Step labels: 4) submodel not solvable; 5) redundant copies; 6) (no label); 7) submodel solvable]

Pre-Solving

Foundations (TMR)
- Use fault tolerance techniques to ensure dependability
- Triple Modular Redundancy (TMR)
[Figure: TMR structure with :Provider, :Multiplier, :Component1, :Component2, :Component3, :Voter and :User]

Foundations (TMR)
- Deployment constraints for TMR
  - Avoid single point of failure of voter / multiplier -> Deploy voter and user to same node (if the user fails, the failure of the voter is no problem)
  - Avoid crash failures -> Deploy redundant components to distinct nodes
  - Heterogeneous hardware platform -> Require different CPU { Node3.CPU ≠ Node4.CPU ∧ Node4.CPU ≠ Node5.CPU ∧ Node3.CPU ≠ Node5.CPU }
[Figure: deployment diagram with the nodes Node1 to Node5 and the components :Provider, :Multiplier, :Voter, :User, :Component1, :Component2 and :Component3]
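
The three TMR deployment constraints can be checked against a concrete deployment as in the following sketch. The function `check_tmr`, the example placement, and the CPU types are hypothetical illustrations; only the three rules themselves come from the slide.

```python
from itertools import combinations

REPLICAS = ["Component1", "Component2", "Component3"]


def check_tmr(deployment, cpu):
    """Return a list of violated TMR deployment constraints (empty if all hold).

    deployment maps component name -> node, cpu maps node -> CPU type.
    """
    violations = []
    # Voter and user must share a node.
    if deployment["Voter"] != deployment["User"]:
        violations.append("Voter and User are on different nodes")
    # Redundant components need pairwise distinct nodes with distinct CPUs.
    for a, b in combinations(REPLICAS, 2):
        node_a, node_b = deployment[a], deployment[b]
        if node_a == node_b:
            violations.append(f"{a} and {b} share node {node_a}")
        elif cpu[node_a] == cpu[node_b]:
            violations.append(f"{a} and {b} run on the same CPU type {cpu[node_a]}")
    return violations


# Hypothetical placement and CPU types.
cpu = {"Node1": "x86", "Node2": "x86", "Node3": "PowerPC", "Node4": "x86", "Node5": "ARM"}
good = {
    "Provider": "Node1", "Multiplier": "Node1", "Voter": "Node2", "User": "Node2",
    "Component1": "Node3", "Component2": "Node4", "Component3": "Node5",
}
bad = dict(good, Component2="Node3")  # two replicas on the same node

print(check_tmr(good, cpu))
print(check_tmr(bad, cpu))
```

A checker of this kind validates a given deployment; computing one that satisfies the rules is left to the constraint solver.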

Questions?

Online Redeployment: Our Solution
- Compute Self-Repair Action
  - Find suitable nodes to redeploy failed components
  - How to find suitable nodes?
  - What to do if there is no suitable node?
  - Redeploy further (still running) components
- Goal: reduce costs
  - Redeployment should not decrease dependability (reduce damage)
  - Reduce solving time
