1/20/2014
ECE 753: FAULT-TOLERANT COMPUTING
Kewal K. Saluja
Department of Electrical and Computer Engineering
Motivation and Introduction
Lecture Set 1
Overview
- Motivation
- About the Course and the Instructor
– Conduct, Outline, Coursepack
- Introduction
- Terminology and definitions
– Sources, Overview and Comments
– System defined
- Dependability/Security and their attributes
- Threats to dependability and modeling the FEF (Fault-Error-Failure) chain
- Means to attain dependability
- Fundamental Principles
Motivation
- Informal Definition
- Key Attributes
- Who, What and Why Study
- Examples
Motivation
- What is Fault-Tolerance?
A “fault-tolerant system” is one that continues to perform at a desired level of service in spite of failures in some of the components that constitute the system.
Motivation (contd.)
- Key attributes
Fault - Error - Failure
Performance - Availability - Reliability
More recently, the concept of “survivability”
Including these constraints at the design stage is likely to be more cost-effective.
Motivation (contd.)
- Who is concerned about fault-tolerance?
– System users, irrespective of the application, though some are far more concerned than others
- Who is concerned at design stages?
– Universities
- R, d, and a (Research, development, applications)
– Industry
- r, D, and A (research, Development, Applications)
- Issues