System Software Group
Software-based Fault Tolerance – Mission (Im)possible?
http://www4.cs.fau.de
Peter Ulbrich The 29th CREST Open Workshop on Software Redundancy November 18, 2013
Soft Errors – A Growing Problem
Peter Ulbrich – ulbrich@cs.fau.de 2
■ Induced by, e.g., radiation, glitches, or insufficient signal integrity
■ Affecting microcontroller logic
■ Future hardware designs: more performance and parallelism
→ At the price of being less and less reliable [3]
Toyota Acceleration Case
■ Electronic throttle control system (2005 Camry)
■ "Toyota claimed the 2005 Camry's main CPU had error detecting and correcting RAM. It didn't." (Investigation Report, EDN Network, Oct 28, 2013)
■ Unintended acceleration potentially involving 261 deaths (US News, Mar 17, 2010)
■ Experts identified soft errors as a possible cause (US News, Mar 17, 2010)
Software-Based Fault Tolerance
■ Software-based redundancy
  ■ Triple Modular Redundancy (e.g., recommended by ISO 26262)
  ■ Selective and adaptive
  ■ Resource efficient
■ Single points of failure
  ■ Interface and Majority Voter
  ■ Allowing for Silent Data Corruptions (SDC)
→ Replication is impossible!
[Figure: safety-critical system with sensors, interface, replicas 1-3 inside the sphere of redundancy (SOR, isolation domain), majority voter, and actuators; faults hit the interface and the majority voter]
Threats to Applicability – Mission failed?
■ Triple modular redundancy reliability
■ Voting on unreliable hardware?
  ■ Very small residual error probability
  ■ Risk analysis inherently complex (no random error distribution! [4])
→ Dealbreaker for software-based TMR!
Research Aims
■ Eliminate single points of failure
■ Constrain residual error probability
■ Dependability as a resource-efficient option
[Figure: safety-critical system with sensors, interface, replicas 1-3 inside the sphere of redundancy (SOR, isolation domain), majority voter, and actuators; annotated RV = 1 and RI = 1]
Agenda
■ Introduction
■ The Combined Redundancy approach (CoRed)
  ■ Holistic protection – eliminating single points of failure
  ■ Arithmetic coding
  ■ Dependable voting
■ Constraining residual error probability
  ■ From coding theory to application – lessons learned
  ■ Finding appropriate parameters
  ■ Circumvent implementation pitfalls
■ Evaluation
  ■ Use case
  ■ Experimental setup
  ■ Fault-injection results
■ Conclusion
CoRed Overview – Holistic Protection Approach
■ The Combined Redundancy Approach (CoRed)
  ■ Data-flow encoding
  ■ Dependable voters
■ Holistic protection approach for control applications
  ■ Input-to-output protection
  ■ (1) Reading inputs, (2) Processing, (3) Distributing outputs
[Figure: "TMR +" spanning the isolation domain and the sphere of redundancy (SOR), with stages 1, 2 and 3 marked]
Eliminating Input and Output Vulnerabilities
■ Arithmetic codes: ANBD code
  ■ Based on VCP [5]
  ■ Data integrity: key (A)
  ■ Address integrity: per-variable signature (B)
  ■ Outdated data: timestamp (D)
■ Set of arithmetic operators (+, -, *, =, …)
■ Checksum vs. arithmetic code (AN code)
  ■ AN code: encoded data operations
  ■ Enabler for dependable voter
[Figure: values X and Y are encoded at the SOR boundary; Z' is computed from the encoded values and decoded to Z]
v' = A⋅v + B + D
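The encoding rule v' = A⋅v + B + D can be illustrated with a minimal sketch. This is not the CoRed implementation: the function names, the 16-bit value width, the signature value B_X and the precondition B + D < A are assumptions made for the example. The key A = 58,659 is borrowed from the key table in this talk.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative parameters (assumptions, not taken from the CoRed code base).
constexpr std::uint64_t A   = 58659;  // key, protects data integrity
constexpr std::uint64_t B_X = 17;     // hypothetical per-variable signature

// v' = A*v + B + D, computed in 64 bits so a 16-bit value cannot overflow.
std::uint64_t encode(std::uint16_t v, std::uint64_t B, std::uint64_t D) {
    assert(B > 0 && B + D < A);  // assumed bound: signature and timestamp stay below A
    return A * v + B + D;
}

// Checks the code word against the expected signature B and timestamp D.
// Returns true and stores the value in `out` only if the check passes.
bool decode(std::uint64_t vc, std::uint64_t B, std::uint64_t D, std::uint16_t& out) {
    if (vc < B + D) return false;            // would underflow: cannot be valid
    const std::uint64_t diff = vc - B - D;
    if (diff % A != 0) return false;         // not a multiple of A: corrupted,
                                             // wrong variable, or outdated
    const std::uint64_t v = diff / A;
    if (v > 0xFFFF) return false;            // outside the assumed value range
    out = static_cast<std::uint16_t>(v);
    return true;
}
```

With these assumptions, a flipped bit, a mismatched signature or a stale timestamp each leave a remainder after subtracting B and D, so the check fails instead of returning a wrong value.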
CoRed Dependable Voter – Basics
■ CoRed Dependable Voter
  ■ Input: variants (X', Y', Z')
  ■ Output: equality set (E) and encoded winner (W)
  ■ No decoding necessary
■ Control-flow signatures
  ■ Static signature (expected value): compile time → used as return value E
  ■ Dynamic signature (actual value): runtime, computed from variants → applied to winner W
  ■ Validation: subsequent check (decode)
[Figure: provider, encoded voter, consumer. Replicas 1-3 encode X, Y, Z to X', Y', Z'; the encoded voter returns {E, W}; the consumer checks (decodes) the winner, e.g., X' is the winner]
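The idea of voting without decoding can be sketched as follows. This is a simplified illustration, not the CoRed voter itself: it omits the static and dynamic control-flow signatures, and the identifiers (EqualitySet, Vote, B_W, the signature values) are hypothetical. It relies only on one property of the code: two variants encoded with the same A and D hold the same value exactly when their difference equals the difference of their signatures.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t A = 58659;                     // shared key
constexpr std::int64_t B_X = 17, B_Y = 29, B_Z = 41;  // hypothetical signatures

// Hypothetical identifiers for the equality set E.
enum EqualitySet { E_NONE = 0, E_XY = 1, E_XZ = 2, E_YZ = 3, E_XYZ = 4 };

struct Vote {
    EqualitySet E;     // which variants agreed
    std::int64_t W;    // encoded winner, still carrying its own signature
    std::int64_t B_W;  // signature the consumer has to use when checking W
};

std::int64_t encode(std::int64_t v, std::int64_t B, std::int64_t D) {
    return A * v + B + D;
}

// Equal values <=> difference of code words equals difference of signatures.
bool same_value(std::int64_t a, std::int64_t Ba, std::int64_t b, std::int64_t Bb) {
    return a - b == Ba - Bb;
}

// Majority vote on encoded variants; no variant is decoded here.
Vote vote(std::int64_t xc, std::int64_t yc, std::int64_t zc) {
    const bool xy = same_value(xc, B_X, yc, B_Y);
    const bool xz = same_value(xc, B_X, zc, B_Z);
    const bool yz = same_value(yc, B_Y, zc, B_Z);
    if (xy && xz) return {E_XYZ, xc, B_X};
    if (xy)       return {E_XY,  xc, B_X};
    if (xz)       return {E_XZ,  xc, B_X};
    if (yz)       return {E_YZ,  yc, B_Y};
    return {E_NONE, 0, 0};
}

// Consumer-side validation: the winner must still be a valid code word.
bool check(const Vote& r, std::int64_t D) {
    assert(r.E != E_NONE);
    return (r.W - r.B_W - D) % A == 0;
}
```

Because the winner leaves the voter still encoded, a fault that corrupts it inside the voter is caught by the consumer's check rather than passing through silently.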
From Coding Theory to Application
[Figure: safety-critical system with CoRed Interface and CoRed Voter around replicas 1-3 in the sphere of redundancy (SOR, isolation domain); annotated RV = 1, RI = 1?]
■ Arithmetic coding operations: Mathematics, C / C++, Assembler
  v' = A⋅v + B + D
  Decoded_Static() { TAssert(_B > 0); assert(check()); return (vc - _B - D) / _A; };
■ Know your compiler & architecture
■ Think binary
Constraining residual error probability
■ Coding theory
  ■ Data word + redundant information = code word
  ■ Fault detection: distance between code words
■ Residual error probability
  ■ Chance for code-to-code word mutation
  ■ Fundamental property for fault tolerance mathematics
v' = A⋅v + B + D
$p_{sdc} = \frac{\text{valid code words}}{\text{possible code words}} \approx \frac{1}{A}$
[Figure: psdc (residual error probability, 10^-6 to 10^-3) over values of A (16-bit constant key); curve: ppred (1/A)]
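The 1/A estimate can be checked by counting. The sketch below assumes a pure AN code (B = D = 0) and that every word below some limit is an equally likely result of a fault; both are simplifications for illustration, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Fraction of the words in [0, limit) that are valid AN code words,
// i.e. multiples of A (0, A, 2A, ...).
double valid_fraction(std::uint64_t A, std::uint64_t limit) {
    assert(A > 0 && limit > 0);
    const std::uint64_t valid = (limit + A - 1) / A;  // ceil(limit / A)
    return static_cast<double>(valid) / static_cast<double>(limit);
}
```

For a 32-bit word space, valid_fraction(58659, 1ull << 32) differs from 1.0 / 58659 by less than 1e-9, which is the sense in which psdc ≈ 1/A holds under these assumptions.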
Choosing Keys and Signatures
■ Mathematics: prime numbers
  ■ Intuitively plausible
  ■ Literature: little help to find suitable As
■ Practitioner's approach: min. Hamming distance
  ■ Distance (d) between code words (# unequal bits)
  ■ d-1 bit error detection capabilities
■ Brute force
  ■ 1.4×10^14 experiments for all 16-bit As

  A         dmin    # errors detectable
  58,368    2       1
  58,831    3       2
  58,659    6       5

→ "The bigger the better" is misleading!
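The brute-force search for a single key can be sketched as below. The slides do not state the data width or the exact code variant used in the experiments, so the function name, the pure AN code (B = D = 0) and the `data_bits` parameter are assumptions; the sketch therefore makes no claim to reproduce the dmin values in the table.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cstdint>

// Minimum Hamming distance between any two AN code words A*v and A*w,
// with v != w and both in [0, 2^data_bits).
unsigned min_hamming_distance(std::uint32_t A, unsigned data_bits) {
    assert(A > 0 && data_bits >= 1 && data_bits <= 16);
    const std::uint64_t n = 1ull << data_bits;
    unsigned dmin = 64;
    for (std::uint64_t v = 0; v < n; ++v) {
        for (std::uint64_t w = v + 1; w < n; ++w) {
            // Number of differing bits between the two code words.
            const std::uint64_t diff = (A * v) ^ (A * w);
            const unsigned d = static_cast<unsigned>(std::bitset<64>(diff).count());
            dmin = std::min(dmin, d);
        }
    }
    return dmin;
}
```

A code with minimum distance d detects up to d-1 bit errors. The pairwise loop grows quadratically with the number of data words (about 2^31 pairs for 16-bit data), which gives a feeling for why covering all 16-bit keys takes on the order of 10^14 experiments.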
Consistency with Coding Theory – Mission Failed?
■ Fault simulation of the entire fault space
  ■ Each and every A, v and fault pattern
  ■ 6.5×10^16 experiments for 16-bit As and 1-8 bit soft errors
→ Excess of predicted residual error probability!
→ Violation of predicted fault-detection capabilities!
[Figure: psdc (residual error probability, 10^-6 to 10^-3) over values of A (16-bit constant key); curves: ppred (1/A) and pbrd (borderline bit errors)]
Think Binary
■ Binary representation of code words
  ■ Coding theory is unaware of machine word sizes
  → Dangerous over…
  ■ Extended AN code (EAN) implementation → Compliance with coding theory!
■ Improved code reliability (A = 251)
  ■ Predicted: 3×10^-3
  ■ Common implementation [4]: ≈ 1.3×10^-2
  ■ EAN implementation: ≈ 1.5×10^-5
→ Improvement by orders of magnitude!
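One way in which the machine word size interferes with the mathematics can be shown with a small sketch. It is an illustration under assumptions (pure AN code with B = D = 0, 32-bit code words, hypothetical function names) and not necessarily the exact pitfall this slide refers to: on paper the sum of two code words is again a multiple of A, but a 32-bit addition wraps modulo 2^32, and 2^32 is not a multiple of A.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t A = 58659;  // key from the key table in this talk

// Pure AN encoding of a 16-bit value; A * 65535 still fits into 32 bits.
std::uint32_t encode(std::uint16_t v) {
    assert(static_cast<std::uint64_t>(A) * v <= UINT32_MAX);
    return A * v;
}

// Encoded addition on a 32-bit machine word: wraps modulo 2^32.
std::uint32_t add32(std::uint32_t xc, std::uint32_t yc) {
    return xc + yc;
}

// The same addition in 64 bits: two 32-bit operands cannot wrap.
std::uint64_t add64(std::uint32_t xc, std::uint32_t yc) {
    return static_cast<std::uint64_t>(xc) + yc;
}

bool is_code_word(std::uint64_t w) {
    return w % A == 0;
}
```

Adding encode(40000) to itself gives a valid code word with add64 but an invalid one with add32, although no fault occurred. An implementation has to account for such word-size effects explicitly, since the arithmetic on paper does not show them.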
Know your Compiler and Architecture
■ On-target fault injection of the entire fault space
  ■ Each and every register, flag, instruction and execution path
  ■ FAIL* fault injection framework [6]
  → Violation of predicted fault-detection capabilities!
■ Architecture specifics
  ■ Absence of compound test-and-branch (e.g., IA32 architecture)
  ■ Control-flow information is stored in a single bit → Redundancy is lost → Additional range checks
■ Undefined execution environment
  ■ Zombie values leaking from caller to voter function
  ■ Compiler laziness leaves encoded values in registers → Isolation assumptions violated → Cleaning local storage restores isolation
→ Tight feedback loop with fault-injection experiments!
Evaluation – Experimental Setup
[Figure: system under test. Sensor system: sensors 1-3, each followed by EAN encode. Flight-control application: replicas 1-3, each with EAN decode and EAN encode, followed by the CoRed encoded tolerance voter and the network interface. Remote node: EAN decode, CoRed encoded (exact) voter, actuator. Host computer: hardware debugger, FAIL* campaign manager, fault DB, results DB]
■ Outcome: 401,592 experiments
■ Effective: 67,617 errors
■ Categories: Fail Silent, Masked, Hardware Detected, EAN-Code, Control-Flow, Silent Data Corruption
Evaluation – Experimental Results (1)
■ Redundant execution campaign (Interface)
  ■ Total: ~45,000 errors
  ■ Unprotected: 3,622 corruptions
  ■ TMR: suffers from 71 corruptions
  ■ CoRed: remaining corruptions are covered, 0 corruptions
[Figure: distribution of effective faults (0-90 %) for Unprotected, Plain TMR and CoRed TMR, each split into Masked, Hardware Detected, EAN-Code Detected and Silent Data Corruptions; legend: Data, Address]
Evaluation – Experimental Results (2)
■ Voter campaign
  ■ Plain voter: total ~11,000; 2,465 masked, 7,245 retry, 1,223 corruptions
  ■ CoRed voter: total ~26,000; 1,228 masked, 24,682 retry, 0 corruptions
[Figure: distribution of effective faults (0-90 %) for Plain Voter and CoRed Encoded Voter, each split into Masked, Control-flow Monitoring, Hardware Detected, EAN-Code Detected and Silent Data Corruptions; legend: Data, Address]
Evaluation – Overhead
■ Overhead analysis
  ■ I4Copter flight control: 7.1% overhead (compared to plain TMR)
■ Selectivity
  ■ I4Copter system CPU utilisation: 41% → Full replication impossible, CPU: 120%
  ■ Mission-critical replication of flight control possible with CoRed, CPU: 60%
Conclusion
■ Eliminate single points of failure [1]
  ■ TMR + encoding: Combined Redundancy approach
  ■ Key feature: CoRed Dependable Voter
■ Constrain residual error probability [2]
  ■ Parameterisation guidelines: choosing the right A
  ■ Binary-aware implementation: complying with coding theory
  ■ Factor 1000 improvement
■ Dependability as a resource-efficient option
  ■ Only 7.1% overhead (flight control example)
→ Bullet-proof software-based fault tolerance is possible!
[Figure: plain TMR (sensors, interface, replicas 1-3, majority voter, actuators) next to CoRed (replicas 1-3 with decode/encode, EAN coding, CoRed voter)]
(1) Ulbrich, Peter; Hoffmann, Martin; Kapitza, Rüdiger; Lohmann, Daniel; Schmid, Reiner; Schröder-Preikschat, Wolfgang: "Eliminating Single Points of Failure in Software-Based Redundancy", Proceedings of the 9th European Dependable Computing Conference (EDCC '12), 2012.
(2) Hoffmann, Martin; Ulbrich, Peter; Dietrich, Christian; Schirmeier, Horst; Lohmann, Daniel; Schröder-Preikschat, Wolfgang: "A Practitioner's Guide to Software-based Soft-Error Mitigation Using AN-Codes", Proceedings of the 15th IEEE International Symposium on High Assurance Systems Engineering (HASE '14), 2014.
http://www4.cs.fau.de/Research/CoRed
References
(3) P. Shivakumar, M. Kistler, S. W. Keckler, D. Burger, and L. Alvisi: "Modelling the effect of technology trends on the soft error rate of combinational logic", in DSN '02: Proceedings of the 2002 International Conference on Dependable Systems and Networks.
(4) Edmund B. Nightingale, John R. Douceur, and Vince Orgovan: "Cycles, Cells and Platters: An Empirical Analysis of Hardware Failures on a Million Consumer PCs", in Proceedings of EuroSys 2011.
(5) Forin: "Vital coded microprocessor principles and application for various transit systems", 1989.
(6) Schirmeier, Horst; Hoffmann, Martin; Kapitza, Rüdiger; Lohmann, Daniel; Spinczyk, Olaf: "FAIL*: Towards a Versatile Fault-Injection Experiment Framework", 25th International Conference on Architecture of Computing Systems, 2012.