Reliability Engineering - Discussions and Clarifications


SLIDE 1

Mission Success Starts with Safety

Reliability Engineering - Discussions and Clarifications

Reliability Engineering VS. Probabilistic Risk Assessment (PRA)
Reliability Prediction VS. Reliability Demonstration
Design Reliability VS. Process Reliability

Fayssal M. Safie, Ph.D.

NASA R&M Tech Fellow, Marshall Space Flight Center

SRE Meeting

March 11, 2014

  • F. Safie

SLIDE 2

Agenda

  • Reliability Engineering Overview

– Reliability Engineering Definitions
– The Reliability Engineering Case
– The Relationship to Safety, Mission Success, and Affordability

  • Discussions and Clarifications

– Reliability VS. Probabilistic Risk Assessment (PRA)
– Reliability Prediction VS. Reliability Demonstration
– Design Reliability VS. Process Reliability

  • Concluding Remarks

SLIDE 3

Reliability Engineering Overview

SLIDE 4

Reliability Engineering

  • Reliability Engineering as a Discipline:

– The application of engineering and scientific principles to the design and processing of products, both hardware and software, for the purpose of meeting product reliability requirements or goals.

  • Reliability as a Figure of Merit is:

– The probability that an item will perform its intended function for a specified mission profile.

  • Reliability is a very broad design-support discipline. It has important interfaces with most engineering disciplines.

  • Reliability analysis is critical for understanding component failure mechanisms and identifying reliability-critical design and process drivers.

SLIDE 5

The Reliability Engineering Case

[Diagram: the Reliability Case and its supporting elements. Labels as extracted:]

Main headings: Reliability Requirements; Design Reliability Drivers; Process Reliability; Reliability Program Management & Control

Other labels: Reliability Requirements Analysis; Reliability Requirements Allocation; Reliability Prediction; FMEA/CIL; Reliability Testing; Worst Case Analysis; Human Reliability Analysis; Root Cause Analysis; Sneak Circuit Analysis; Probabilistic Design Analysis; Stress Screening; Process Characterization; Critical Parameter; Process Parameter Design; Feedback Control; Statistical Process Control; Process Monitoring; Reliability Program Plan; Contractors and Suppliers Monitoring; Reliability Program Audits; Reliability Progress Reports; Failure Review Processes

SLIDE 6

The Relationship to Safety, Mission Success, and Affordability

[Diagram. Labels as extracted:]

RELIABILITY; MAINTAINABILITY; SUPPORTABILITY; AFFORDABILITY

Failure Identification and Analysis; Critical Items Identification; Design Mitigation and Critical Process Control

Failures; Redesigns; Preventive Maintenance; Corrective Maintenance; Level of Repair

Spares, Facilities, Maintenance Labor, Materials, Maintenance Support, etc.

Loss of Crew/Mission/Space System, Stand Down, Loss of Launch Opportunity, etc.

COST OF LOSS
COST OF DEVELOPMENT TESTING, CERTIFICATION, AND SUSTAINING ENGINEERING
COST OF PREVENTIVE MAINTENANCE
COST OF CORRECTIVE MAINTENANCE
COST OF LOGISTICS SUPPORT & INFRASTRUCTURE

SLIDE 7

Reliability Discussions and Clarifications

SLIDE 8

Probabilistic Risk Assessment (PRA)

  • Reliability: The probability that an item will perform its intended function for a specified mission profile.

  • Risk: The chance of occurrence of an undesired event and the severity of the resulting consequences.
  • Probabilistic Risk Assessment (PRA) is the systematic process of analyzing a system, a process, or an activity to answer three basic questions:

– What can go wrong that would lead to loss or degraded performance (i.e., scenarios involving undesired consequences of interest)?
– How likely is it (probabilities)?
– What is the severity of the degradation (consequences)?

Scenario    Likelihood (Probability)    Consequence
S1          p1                          C1
S2          p2                          C2
S3          p3                          C3
...         ...                         ...
SN          pN                          CN

$R \equiv \text{RISK} \equiv \{\langle S_i, p_i, C_i \rangle\}, \quad i = 1, \dots, N$

Risk assessment is the task of generating the triplet set.
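The triplet set can be sketched as a small data structure. This is a minimal illustration, not part of the original slides: the scenario names, the probabilities, the numeric consequence scale, and the choice to rank by the product of probability and consequence are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One risk triplet: scenario, likelihood, consequence."""
    name: str           # what can go wrong
    probability: float  # how likely it is
    consequence: float  # severity, on a hypothetical numeric scale


def rank_by_expected_consequence(triplets):
    """Return the scenarios sorted by probability * consequence, largest first."""
    return sorted(triplets, key=lambda s: s.probability * s.consequence, reverse=True)


# Hypothetical scenarios and numbers, for illustration only.
risk_set = [
    Scenario("S1: valve fails closed", 1e-3, 10.0),
    Scenario("S2: sensor drift", 1e-2, 2.0),
    Scenario("S3: structural failure", 1e-5, 500.0),
]

ranked = rank_by_expected_consequence(risk_set)
for s in ranked:
    product = s.probability * s.consequence
    print(f"{s.name}: p={s.probability:g}, C={s.consequence:g}, p*C={product:g}")
```

Ranking by a single product is only one way to order scenarios; a PRA normally reports the likelihoods and consequences separately as well.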

SLIDE 9

The PRA Process

[Figure: PRA process flow. The recoverable panel titles and captions are listed below; axis ticks and repeated panels are omitted.]

– Defining the PRA Study Scope and Objectives
– Initiating Events Identification
– Event Sequence Diagram (Inductive Logic): initiating event (IE), events A through E, and end states OK, LOM, ES2, and LOC
– Event Tree (ET) Modeling: initiating event (IE), events A through E, and end states 1: OK, 2: LOM, 3 to 6: LOC
– Fault Tree (FT) System Modeling: labels are Not A, logic gate, basic event, and link to another fault tree
– Mapping of ET-defined Scenarios to Causal Events: internal initiating events, external initiating events, hardware failure, human error, software error, common cause failure, environmental conditions, other (annotation: one of these events AND one or more of these elementary events)
– Probabilistic Treatment of Basic Events: the uncertainty in occurrence frequency of an event is characterized by a probability distribution. Examples (from left to right): probability that the hardware x fails when needed; probability that the crew fail to perform a task; probability that there would be a windy condition at the time of landing
– Model Logic and Data Analysis Review: domain experts ensure that system failure logic is correctly captured in the model and appropriate data is used in data analysis
– Model Integration and Quantification of Risk Scenarios: integration and quantification of logic structures (ETs and FTs) and propagation of epistemic uncertainties to obtain minimal cutsets (risk scenarios in terms of basic events), likelihood of risk scenarios, and uncertainty in the likelihood estimates
– Technical Review of Results and Interpretation
– Communicating & Documenting Risk Results and Insights to Decision-maker: displaying the results in tabular and graphical forms; ranking of risk scenarios; ranking of individual events (e.g., hardware failure, human errors, etc.); insights into how various systems interact; tabulation of all the assumptions; identification of key parameters that greatly influence the results; presenting results of sensitivity studies; proposing candidate mitigation strategies

Ref., ESD 10011, Cross Program Probabilistic Risk Assessment Methodology
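The quantification step (minimal cutsets, scenario likelihood, and uncertainty in the likelihood) can be sketched with a small Monte Carlo loop. Everything specific here is an assumption for illustration and is not taken from ESD 10011: the basic events, their medians and error factors, the lognormal uncertainty model, the two cut sets, the independence of basic events, and the use of the min-cut upper bound.

```python
import math
import random
import statistics

# Hypothetical basic events: (median probability, error factor).
# Error factor = 95th percentile / median of an assumed lognormal distribution.
BASIC_EVENTS = {
    "hardware_x_fails": (1e-3, 3.0),
    "crew_task_error": (5e-3, 5.0),
    "high_wind_at_landing": (2e-2, 2.0),
}

# Hypothetical minimal cut sets for one end state (for example, LOM).
MINIMAL_CUT_SETS = [
    {"hardware_x_fails", "crew_task_error"},
    {"hardware_x_fails", "high_wind_at_landing"},
]

Z95 = 1.645  # standard normal 95th percentile


def sample_probability(rng, median, error_factor):
    """Draw one basic-event probability from the assumed lognormal, capped at 1."""
    sigma = math.log(error_factor) / Z95
    return min(1.0, rng.lognormvariate(math.log(median), sigma))


def end_state_probability(event_probs, cut_sets):
    """Min-cut upper bound, 1 - prod(1 - P(cut set)), for independent basic events."""
    survive = 1.0
    for cut_set in cut_sets:
        p_cut = math.prod(event_probs[name] for name in cut_set)
        survive *= 1.0 - p_cut
    return 1.0 - survive


def propagate(n_samples=10_000, seed=0):
    """Propagate basic-event uncertainty to the end-state likelihood."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        probs = {
            name: sample_probability(rng, median, ef)
            for name, (median, ef) in BASIC_EVENTS.items()
        }
        results.append(end_state_probability(probs, MINIMAL_CUT_SETS))
    results.sort()
    return {
        "mean": statistics.fmean(results),
        "p05": results[int(0.05 * n_samples)],
        "p95": results[int(0.95 * n_samples)],
    }


summary = propagate()
print(summary)
```

A full PRA tool would derive the cut sets from the event trees and fault trees rather than listing them by hand; the sketch only shows how uncertainty in the inputs becomes a distribution on the result.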

SLIDE 10

The ET Foam Probabilistic Risk Assessment

[Flow diagram. Recoverable labels:]

Analysis chain:
– TPS Void Distributions
– TPS Debris Generation (divot/no divot, size/shape (mass), time and location of release, and pop-off velocity)
– TPS Transport Model (axial/lateral locations and velocities during ascent)
– Orbiter Impact Algorithms (impact/no impact, location, time, mass, velocity and angle)
– Orbiter Damage Analysis (tile/RCC panel damage)
– Probability of Orbiter Damage Exceeding Damage Tolerance

Stage labels: Process Control; TPS Reliability; System Risk

Input Data:
– ET TPS Dissections (ET Project)
– TPS Geometry Properties, Boundary Conditions (ET Project)
– Debris Transport and CFD Calculations (SE&I)
– Orbiter Geometric Models (Orbiter Project)
– Orbiter Impact / Damage Tolerances (Orbiter Project)

Validation Data:
– ET Dissection / Manufacturing Data
– Thermal-Vacuum and Flight Imagery Data
– Debris Transport Analysis
– Orbiter Post-Flight Data

SLIDE 11

Reliability Demonstration

  • Reliability Demonstration is the process of quantitatively estimating the reliability of a system using objective data at the level intended for demonstration.

  • Statistical formulas are used to calculate the demonstrated reliability at some confidence level.

  • Models and techniques used in reliability demonstration include the Binomial, Exponential, and Weibull models, among others.

  • Due to the high cost and schedule impact of reliability demonstration, programs have employed this method only to demonstrate a certain reliability comfort level. For example, a reliability goal of 0.99 at a 95% confidence level is demonstrated by conducting about 298 successful tests with no failures.

SLIDE 12

Statistical Confidence

SLIDE 13

Reliability Predictions

  • Reliability prediction is the process of quantitatively estimating the reliability of a system using both objective and subjective data.

  • Reliability prediction is performed to the lowest level for which data is available. The sub-level reliabilities are then combined to derive the system-level prediction.

  • Reliability prediction techniques depend on the degree of design definition and the availability of historical data. Examples are:

– Similarity analysis techniques: Reliability of a new design is predicted using the reliability of similar parts, with failure rates adjusted for the operating environment, geometry, material changes, etc.

– Physics-based techniques: Reliability is predicted using probabilistic engineering models expressed as loads and environment vs. capability.

– Techniques that utilize generic failure rates, such as MIL-HDBK-217, Reliability Prediction of Electronic Equipment.
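Combining sub-level reliabilities into a system-level prediction can be sketched with series and parallel (redundant) blocks. This is a minimal illustration under stated assumptions: constant failure rates (exponential model), independent components, and hypothetical component names, failure rates, and mission time that do not come from the slides or from any handbook.

```python
import math


def exponential_reliability(failure_rate_per_hour, mission_hours):
    """R(t) = exp(-lambda * t) for a constant failure rate."""
    return math.exp(-failure_rate_per_hour * mission_hours)


def series(reliabilities):
    """All blocks must work: the system reliability is the product."""
    return math.prod(reliabilities)


def parallel(reliabilities):
    """At least one block must work: 1 minus the product of unreliabilities."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


# Hypothetical components and failure rates, for illustration only.
mission_hours = 100.0
pump = exponential_reliability(2e-5, mission_hours)
valve = exponential_reliability(5e-6, mission_hours)
controller = exponential_reliability(1e-5, mission_hours)

# Two redundant pumps in parallel, in series with one valve and one controller.
system = series([parallel([pump, pump]), valve, controller])
print(f"pump={pump:.6f} valve={valve:.6f} controller={controller:.6f}")
print(f"system={system:.6f}")
```

Real systems often need more than series and parallel blocks (common cause failures, standby redundancy), so this only shows the basic roll-up.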

SLIDE 14

Design VS. Process Reliability: “Design it Right and Build it Right”

[Diagram. Labels as extracted:]

Design Reliability; Reliability; Process Reliability

Processing; Models; Process Characterization; Process Control; Manufacturing Models; Materials Properties; Material Models; Loads; Loads & Environments; Operating Conditions

SLIDE 15

Design Reliability

Failure Region
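The figure for this slide is not recoverable. Assuming it shows the usual overlap between a load distribution and a capability distribution, in line with the physics-based "loads and environment vs. capability" models mentioned earlier, the failure region can be sketched as the probability that load exceeds capability. The normal distributions, their independence, and the numbers below are assumptions for illustration only.

```python
from statistics import NormalDist


def failure_probability(load_mean, load_sd, capability_mean, capability_sd):
    """P(load > capability) for independent, normally distributed load and capability."""
    margin_mean = capability_mean - load_mean
    margin_sd = (capability_sd ** 2 + load_sd ** 2) ** 0.5
    # The margin (capability - load) is normal; failure occurs when it is negative.
    return NormalDist(margin_mean, margin_sd).cdf(0.0)


# Hypothetical stress and strength values in arbitrary consistent units.
p_fail = failure_probability(load_mean=300.0, load_sd=30.0,
                             capability_mean=450.0, capability_sd=40.0)
print(f"probability of failure: {p_fail:.6f}")
print(f"design reliability:     {1.0 - p_fail:.6f}")
```

With these numbers the margin has mean 150 and standard deviation 50, so the failure probability is the standard normal tail below -3.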

SLIDE 16

Design Reliability

The Challenger Accident

  • Causes and Contributing Factors

– The zinc chromate putty frequently failed and permitted the gas to erode the primary O-rings.

– The particular material used in the manufacture of the shuttle O-rings was the wrong material to use at low temperatures.

– Elastomers become brittle at low temperatures.

SLIDE 17

Process Reliability

SLIDE 18

Process Reliability

The Columbia Shuttle Accident

  • The ET thermal protection system (TPS) is a foam-type material applied to the external tank to maintain cryogenic propellant quality, minimize ice and frost formation, and protect the structure from ascent, plume, and re-entry heating.

  • The TPS is needed during re-entry because, after ET/Orbiter separation, structural overheating due to loss of TPS could cause a premature ET breakup, with debris landing outside the predicted footprint.

SLIDE 19

Process Reliability

The Quality, Reliability, and Risk Relationship

Process Reliability                            Component Reliability         System Risk
Process Uniformity, Capability, and Control    Capability vs. Performance    Failure Impact on System
High Material Capability                       Higher Reliability            Lower Risk and Higher Safety

SLIDE 20

Concluding Remarks

  • Reliability engineering is a discipline, while PRA is a process.
  • Reliability deals with failure analysis, focusing on understanding the failure mechanisms that could lead to loss of function, while PRA deals with system risk, focusing on understanding the system risk scenarios that could lead to loss of mission or loss of crew.

  • Reliability prediction, which is based on objective and subjective data, is intended to help the design process by identifying component, subsystem, and system reliability drivers, while demonstrated reliability, which is based on objective data, is intended to demonstrate a certain reliability comfort level in support of reliability prediction.

  • Physics-based design reliability and process reliability, which are performed on selected failure modes, are critical inputs to reliability prediction.

  • Both reliability prediction and reliability demonstration are critical data sources for PRA.
