TDDD82 Secure Mobile Systems Lecture 5: Dependability Mikael - - PowerPoint PPT Presentation



slide-1
SLIDE 1

TDDD82 Secure Mobile Systems Lecture 5: Dependability

Mikael Asplund, Real-time Systems Laboratory, Department of Computer and Information Science, Linköping University

Based on slides by Simin Nadjm-Tehrani

slide-2
SLIDE 2
slide-3
SLIDE 3
slide-4
SLIDE 4
slide-5
SLIDE 5
slide-6
SLIDE 6

Dependability

Property of a computing system which allows reliance to be justifiably placed on the service it delivers.

[Avizienis et al.] The ability to avoid service failures that are more frequent or more severe than is acceptable.

slide-7
SLIDE 7

Dependability taxonomy

slide-8
SLIDE 8

Fault-tolerant Distributed Systems

slide-9
SLIDE 9

Redundancy

  • Necessary for fault-tolerance!
  • Increases overall complexity
  • Static

– Error masking properties

  • Dynamic

– Error detecting properties
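
As a rough illustration of static (error-masking) redundancy, the sketch below takes a majority vote over the outputs of several replicas, in the style of triple modular redundancy. The function name `majority_vote` and the example values are assumptions made for illustration; they do not come from the lecture.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of the replicas.

    Raises ValueError when no value has a strict majority, i.e. when
    the error cannot be masked by voting alone.
    """
    if not outputs:
        raise ValueError("no replica outputs")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no majority among replica outputs")
    return value


# One of three replicas returns a wrong value; the voter masks it.
print(majority_vote([42, 42, 17]))
```

A dynamic scheme would instead detect the error (for example with a checksum or a timeout) and then switch to a spare, which is why it relies on error detection rather than masking.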

slide-10
SLIDE 10

Dependability & Distribution

  • Making systems fault-tolerant typically uses redundancy

– Redundancy in space leads to distribution
– But distributed systems are not necessarily fault-tolerant!

slide-11
SLIDE 11

Replication

  • Passive replication

– Primary-backup
– Cold/Warm/Hot

  • Active replication

– Group membership

slide-12
SLIDE 12

Implementing replica consistency

  • Message ordering

– Use the happened-before relation (e.g., by using Lamport clocks)

  • Agreement

– For passive replication

  • Controlled by the master
  • Still requires agreement on when the primary is down...

– Active replication

  • Agreement for every operation
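
The message-ordering bullet above mentions Lamport clocks as a way to respect the (happened-)before relation. A minimal sketch of such a logical clock follows; the class and method names are hypothetical, and message transport is left out entirely.

```python
class LamportClock:
    """Logical clock: one integer counter per process."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event or a message send."""
        self.time += 1
        return self.time

    def receive(self, msg_time):
        """Merge the timestamp carried by an incoming message."""
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
t_send = a.tick()            # process a sends a message stamped t_send
t_recv = b.receive(t_send)   # process b receives it
assert t_send < t_recv       # the send is ordered before the receive
```

The timestamps only guarantee that if one event happened before another, its timestamp is smaller. The converse does not hold, so ties and concurrent events need an extra rule, such as ordering by process identifier.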
slide-13
SLIDE 13

Agreement is not just for replication

slide-14
SLIDE 14

The consensus problem

  • Processes p1,…, pn take part in a decision

– Each pi proposes a value vi
– All correct processes decide on a common value v that is equal to one of the proposed values

  • Desired properties

– Termination: Every correct process eventually decides
– Agreement: No two (correct) processes decide differently
– Validity: If a process decides v then the value v was proposed by some process
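
To make the three properties concrete, the sketch below checks them against the outcome of one finished run. It is only a checker, not a consensus algorithm, and the function name and data layout (dictionaries keyed by process name) are assumptions for illustration. Termination is a statement about "eventually", so here it is approximated by asking whether every correct process had decided by the end of the run.

```python
def check_consensus(proposals, decisions):
    """Check one finished run against the consensus properties.

    proposals: dict mapping every process to the value it proposed.
    decisions: dict mapping each correct process to its decided value,
               or to None if it had not decided when the run ended.
    """
    decided = [v for v in decisions.values() if v is not None]
    proposed = set(proposals.values())
    return {
        "termination": all(v is not None for v in decisions.values()),
        "agreement": len(set(decided)) <= 1,
        "validity": all(v in proposed for v in decided),
    }


# p3 crashed, so only p1 and p2 count as correct processes.
proposals = {"p1": 3, "p2": 5, "p3": 5}
decisions = {"p1": 5, "p2": 5}
print(check_consensus(proposals, decisions))
```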

slide-15
SLIDE 15

Fault model

[Diagram labels: Normality, Tolerated faults, Non-tolerated faults]

slide-16
SLIDE 16

Recall from previous lecture

  • Node/Channel failures

– Crash
– Omission
– Timing
– Byzantine/arbitrary

  • System model

– Synchronous
– Asynchronous

slide-17
SLIDE 17

Basic impossibility result

[Fischer, Lynch and Paterson 1985]

  • There is no deterministic algorithm solving the consensus problem in an asynchronous distributed system with a single crash failure.

slide-18
SLIDE 18

Naïve approaches

  • Wait for all to agree

– Node crash

  • Wait for a majority to agree

– What about conflicts?

  • When to move on?
slide-19
SLIDE 19

Assume synchrony

  • If a node does not respond within time t, it will not respond at time t+d
  • Partial synchrony

– Bounds exist but are not known

  • Powerful abstraction:

– Unreliable failure detectors
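
One common way to build an unreliable failure detector is with heartbeats and a timeout. The sketch below is a simplified illustration under that assumption: the class name, the doubling policy and the explicit `now` parameter (used instead of a real clock so the example is deterministic) are all hypothetical. Because the timing bounds are not known, the detector may wrongly suspect a slow node; when a late heartbeat shows this, it increases its timeout.

```python
class TimeoutFailureDetector:
    """Suspects nodes whose last heartbeat is older than a timeout."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heard = {}

    def heartbeat(self, node, now):
        """Record a heartbeat from `node` received at time `now`."""
        last = self.last_heard.get(node)
        if last is not None and now - last > self.timeout:
            # The gap exceeded the timeout, so the node would have been
            # suspected although it was only slow: back off.
            self.timeout *= 2
        self.last_heard[node] = now

    def suspects(self, now):
        """Return the set of nodes currently suspected to have failed."""
        return {n for n, t in self.last_heard.items()
                if now - t > self.timeout}


fd = TimeoutFailureDetector(timeout=2.0)
fd.heartbeat("n1", now=0.0)
fd.heartbeat("n2", now=0.0)
fd.heartbeat("n1", now=1.5)
print(fd.suspects(now=3.0))   # n2 has been silent for longer than 2.0
fd.heartbeat("n2", now=3.5)   # n2 was only slow; the timeout is doubled
print(fd.suspects(now=4.0))
```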

slide-20
SLIDE 20

For the project

  • Passive replication
  • Need to think carefully about your fault model!

slide-21
SLIDE 21

Hints for your dependability analysis

  • 1. Model the system

– What is the logical structure?
– What are your assumptions?

  • 2. Consider what types of faults could occur (part of risk analysis)

– Nodes (crash, Byzantine, ...)
– Links

  • 3. If possible, measure parameters such as the frequency of faults and time to recovery, and combine with historical data

– https://doi.org/10.1109/TDSC.2009.4
– http://liu.diva-portal.org/smash/record.jsf?pid=diva2%3A1034002

  • 5. Derive estimates for dependability attributes of your system
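
As one possible way to turn measured parameters into an estimate, the sketch below computes steady-state availability from mean time to failure (MTTF) and mean time to repair (MTTR), and then the availability of a primary-backup pair. The numbers are invented for illustration, and the pair formula assumes independent failures and instantaneous, perfect failover, which a real passive-replication setup will not fully satisfy.

```python
def availability(mttf_hours, mttr_hours):
    """Steady-state availability of one node: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def parallel_availability(availabilities):
    """Availability of replicas where one working replica is enough.

    Assumes failures are independent and failover is perfect.
    """
    unavailable = 1.0
    for a in availabilities:
        unavailable *= 1.0 - a
    return 1.0 - unavailable


# Hypothetical measurements: a node fails about every 990 hours
# and takes about 10 hours to recover.
node = availability(mttf_hours=990.0, mttr_hours=10.0)
pair = parallel_availability([node, node])
print(f"single node: {node:.4f}, primary + backup: {pair:.6f}")
```

Comparing the two figures shows how much the estimate depends on the independence and failover assumptions, which is a reason to state the fault model explicitly.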
slide-22
SLIDE 22

For the future

  • Dependability is important!

– Take it seriously when building systems

  • Fault-tolerance is non-trivial
  • Create simple and easy-to-understand systems (at least the cores)