
A New Approach to System-Level Single Event Survivability Prediction



  1. A New Approach to System-Level Single Event Survivability Prediction
     Melanie Berg (1), Kenneth LaBel (2), Michael Campola (2), Michael Xapsos (2)
     Melanie.D.Berg@NASA.gov
     (1) AS&D in support of NASA/GSFC; (2) NASA/GSFC
     Presented by Melanie Berg at the Microelectronics Reliability & Qualification Working Meeting (MRQW), El Segundo, CA, February 6-7, 2018

  2. Acronyms
     • Combinatorial logic (CL)
     • Commercial off the shelf (COTS)
     • Complementary metal-oxide semiconductor (CMOS)
     • Device under test (DUT)
     • Edge-triggered flip-flops (DFFs)
     • Electronic design automation (EDA)
     • Error rate ($\lambda$)
     • Error rate per bit ($\lambda_{bit}$)
     • Error rate per system ($\lambda_{system}$)
     • Field programmable gate array (FPGA)
     • Global triple modular redundancy (GTMR)
     • Hardware description language (HDL)
     • Input-output (I/O)
     • Intellectual Property (IP)
     • Linear energy transfer (LET)
     • Mean fluence to failure (MFTF)
     • Mean time to failure (MTTF)
     • Number of used bits (#UsedBits)
     • Operational frequency (fs)
     • Personal Computer (PC)
     • Probability of configuration upsets ($P_{configuration}$)
     • Probability of functional logic upsets ($P_{functionalLogic}$)
     • Probability of single event functional interrupt ($P_{SEFI}$)
     • Probability of system failure ($P_{system}$)
     • Processor (PC)
     • Radiation Effects and Analysis Group (REAG)
     • Reliability over time ($R(t)$)
     • Reliability over fluence ($R(\Phi)$)
     • Single event effect (SEE)
     • Single event functional interrupt (SEFI)
     • Single event latch-up (SEL)
     • Single event transient (SET)
     • Single event upset (SEU)
     • Single event upset cross-section ($\sigma_{SEU}$)
     • System on a chip (SoC)
     • Windowed Shift Register (WSR)
     • Xilinx Virtex 5 field programmable gate array (V5)
     • Xilinx Virtex 5 field programmable gate array, radiation hardened (V5QV)

  3. Problem Statement and Abstract
     • The process for applying single event upset (SEU) data to characterize system performance in radiation environments needs improvement.
     • We are investigating the application of classical reliability performance metrics combined with standard SEU analysis data to improve system survivability prediction.
     This presentation describes a simplified approach for extrapolating SEU data to complex systems. Future work will incorporate additional details.

  4. Background (1): FPGA SEU Susceptibility. SEU Cross Section ($\sigma_{SEU}$)
     • $\sigma_{SEU}$ values (per category) are calculated from SEU test and analysis.
     • $\sigma_{SEU}$ values are calculated per particle linear energy transfer (LET).
     • Most believe the dominant $\sigma_{SEU}$ values are per bit (configuration or flip-flops (DFFs)). However, global routes are significant (more than DFFs).
     [Figure: Design $\sigma_{SEU}$ broken down into Configuration $\sigma_{SEU}$, Functional logic $\sigma_{SEU}$ (sequential and combinatorial logic (CL) in data path), and SEFI $\sigma_{SEU}$ (global routes and hidden logic). Callouts: "$\sigma_{SEU}$s are measured by bit!", "$\sigma_{SEU}$s are measured by bit???", "For a system, should $\sigma_{SEU}$s be measured by bit????"]

  5. Windowed Shift Register (WSR) Microsemi $\sigma_{SEU}$s: Design and Stimulus Dependencies to SEUs
     • Add combinatorial logic, increase cross section.
     • Increase frequency may or may not change SEU data.
     • How and what you test make a big difference!
     • $\sigma_{SEU}$ = #errors/fluence; $\lambda_{system}$ = #errors/time.
     [Figure: $\sigma_{SEU}$ (cm²/DFF, 0 to 7.00E-09) versus LET (MeV·cm²/mg, 0 to 25) for WSR16, WSR8, WSR4, and WSR0, each with checkerboard, all 1's, and all 0's data patterns. LET: linear energy transfer.]
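
The definition $\sigma_{SEU}$ = #errors/fluence on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the error count, fluence, and DFF count below are hypothetical and are not taken from the presentation's test data. The per-DFF normalization mirrors the cm²/DFF axis of the plot.

```python
def cross_section(num_errors: int, fluence: float) -> float:
    """Return sigma_SEU in cm^2: observed errors divided by fluence (particles/cm^2)."""
    if fluence <= 0:
        raise ValueError("fluence must be positive")
    return num_errors / fluence


def per_dff(sigma_design: float, num_dffs: int) -> float:
    """Normalize a per-design cross section to cm^2/DFF."""
    if num_dffs <= 0:
        raise ValueError("num_dffs must be positive")
    return sigma_design / num_dffs


# Hypothetical test run at a single LET: 120 errors after 1e7 particles/cm^2
# on a shift register built from 4000 DFFs.
sigma_design = cross_section(num_errors=120, fluence=1.0e7)
sigma_dff = per_dff(sigma_design, num_dffs=4000)
print(f"sigma per design: {sigma_design:.2e} cm^2")
print(f"sigma per DFF:    {sigma_dff:.2e} cm^2/DFF")
```

Repeating the calculation for each LET and each data pattern gives one point per curve in a plot like the one on this slide.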

  6. Background (2): Conventional Conversion of SEU Cross-Sections to Error Rates for Complex Systems (Next Step)
     • Bottom-up approach (transistor level):
       – Given $\sigma_{SEU}$ (per bit), use an error rate calculator (such as CREME96) to obtain an error rate per bit ($\lambda_{bit}$).
       – Multiply $\lambda_{bit}$ by the number of used memory bits (#UsedBits; configuration bits and DFFs) in the target design to obtain a system error rate ($\lambda_{system}$).
     • Top-down approach (system level):
       – Given $\sigma_{SEU}$ (per system), use an error rate calculator (such as CREME96) to obtain an error rate per system ($\lambda_{system}$).

  7. Technical Problems with Current Methods of Error Rate Calculation
     • For submission to CREME96, $\sigma_{SEU}$ data (in log-linear form) are fitted to a Weibull curve.
       – During the curve fitting process, a large amount of error can be introduced.
       – Consequently, it is possible for resultant error rates (for the same design) to vary by decades.
     • Because of the error rate calculation process, $\sigma_{SEU}$ data are blended together and it is nearly impossible to home in on the problem spots. This can become important for mitigation insertion.
     [Figure: $\sigma_{SEU}$ (cm²/design, log scale from 1.00E-08 to 1.00E-01) versus LET (MeV·cm²/mg, 0 to 60).]
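
To illustrate why the fit matters, here is a minimal sketch of the four-parameter Weibull form commonly used for cross-section curves, $\sigma(L) = \sigma_{sat}\,(1 - e^{-((L - L_0)/W)^{s}})$ for $L > L_0$. The slide does not give the fit parameters, so every value below is hypothetical; the two "fits" differ only in the onset LET.

```python
import math


def weibull_sigma(let: float, sigma_sat: float, onset: float,
                  width: float, shape: float) -> float:
    """Four-parameter Weibull cross section (cm^2) at a given LET (MeV*cm^2/mg)."""
    if let <= onset:
        return 0.0  # no upsets below the onset (threshold) LET
    return sigma_sat * (1.0 - math.exp(-(((let - onset) / width) ** shape)))


# Two hypothetical fits to the same data set; only the onset LET differs.
fit_a = dict(sigma_sat=1.0e-2, onset=0.5, width=20.0, shape=1.5)
fit_b = dict(sigma_sat=1.0e-2, onset=1.5, width=20.0, shape=1.5)

let = 2.0  # a low LET, where particle flux is typically highest
sigma_a = weibull_sigma(let, **fit_a)
sigma_b = weibull_sigma(let, **fit_b)
ratio = sigma_a / sigma_b
print(f"fit A: {sigma_a:.2e} cm^2, fit B: {sigma_b:.2e} cm^2, ratio: {ratio:.1f}")
```

In this sketch a one-unit shift in the fitted onset changes the low-LET cross section several-fold, which is one way a fitting choice can propagate into the computed error rate.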

  8. Technical Problems with Bottom-Up Analysis Method
     • Multiplying each bit within a design by $\lambda_{bit}$ is not an efficient method of system error rate prediction.
       – Works well with memory structures, but complex systems do not operate or respond like memories.
       – If an SEU affects a bit, and the bit is either inactive, disabled, or masked, a system malfunction might not occur.
     • $\lambda_{system} < \lambda_{bit} \times \#UsedBits$
     • Using the same multiplication factor across DFFs will produce extreme over-estimates.
     Let's not reinvent the wheel: a proven solution can be found in classical reliability system-level analysis.
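
The inequality above can be made concrete with a small sketch. The bottom-up product is taken from the slide; the `critical_fraction` derating factor is an assumption introduced here purely to model bits that are inactive, disabled, or masked, and all numeric values are hypothetical.

```python
def bottom_up_rate(lambda_bit: float, used_bits: int) -> float:
    """Bottom-up estimate: per-bit error rate times the number of used bits."""
    return lambda_bit * used_bits


def derated_rate(lambda_bit: float, used_bits: int, critical_fraction: float) -> float:
    """Hypothetical derating: only a fraction of bit upsets cause a system malfunction."""
    if not 0.0 <= critical_fraction <= 1.0:
        raise ValueError("critical_fraction must be within [0, 1]")
    return bottom_up_rate(lambda_bit, used_bits) * critical_fraction


lambda_bit = 1.0e-9    # hypothetical errors per bit-day
used_bits = 2_000_000  # hypothetical count of configuration bits plus DFFs

upper_bound = bottom_up_rate(lambda_bit, used_bits)
estimate = derated_rate(lambda_bit, used_bits, critical_fraction=0.1)
print(f"bottom-up bound: {upper_bound:.1e} errors/day")
print(f"derated example: {estimate:.1e} errors/day")
```

The bottom-up product acts as an upper bound; how far the true $\lambda_{system}$ falls below it depends on the design and cannot be read off a per-bit number alone.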

  9. Mapping Classical Reliability Models from the Time Domain to the Fluence Domain
     • The exponential model that relates reliability to MTTF, $R(t) = e^{-t/MTTF}$ or $R(t) = e^{-\lambda t}$, assumes that during useful lifetime:
       – Failures are independent.
       – Error rate is constant (Weibull slope = 1, i.e. exponential).
       – $MTTF = 1/\lambda$.
     • For a given LET (across fluence), there is a parallel between time and fluence ($\sigma_{SEU}$ = #errors/fluence; $\lambda_{system}$ = #errors/time):
       – SEUs are independent.
       – $\sigma_{SEU}$ is constant.
       – $MFTF = 1/\sigma_{SEU}$.
     • Hence, mapping from the time domain to the fluence domain (per LET) is straightforward:
       – $t \rightarrow \Phi$
       – $MTTF \rightarrow MFTF$
       – $\lambda \rightarrow \sigma_{SEU}$
       – $R(t) = e^{-t/MTTF} \rightarrow R(\Phi) = e^{-\Phi/MFTF}$
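
The fluence-domain model can be sketched directly from the relations above. The function names and the example cross section are hypothetical; the formulas are $MFTF = 1/\sigma_{SEU}$ and $R(\Phi) = e^{-\Phi/MFTF}$, plus the inverse that gives the largest fluence meeting a target reliability.

```python
import math


def mftf(sigma_seu: float) -> float:
    """Mean fluence to failure (particles/cm^2) for a constant sigma_SEU (cm^2)."""
    return 1.0 / sigma_seu


def reliability(fluence: float, sigma_seu: float) -> float:
    """R(Phi) = exp(-Phi / MFTF), valid per LET under the stated assumptions."""
    return math.exp(-fluence / mftf(sigma_seu))


def max_fluence(target_reliability: float, sigma_seu: float) -> float:
    """Largest fluence for which R(Phi) stays at or above the target."""
    return -math.log(target_reliability) * mftf(sigma_seu)


sigma = 1.0e-5  # hypothetical per-design cross section at one LET, in cm^2
print(f"MFTF: {mftf(sigma):.1e} particles/cm^2")
print(f"R at Phi = MFTF: {reliability(mftf(sigma), sigma):.3f}")
print(f"Fluence limit for R = 0.999: {max_fluence(0.999, sigma):.1f} particles/cm^2")
```

As in the time domain, reliability at one mean fluence to failure is $e^{-1}$, so a high-reliability requirement corresponds to a fluence far below the MFTF.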

  10. Example of Proposed Methodology Application
     • Mission requirements:
       – Selection shall be made between a Xilinx V5QV (relatively expensive device) or a Xilinx V5 with embedded PowerPC (relatively cheap device).
       – FPGA operation shall have reliability of 3-nines (99.9%) within a 10-minute window at Geosynchronous Equatorial Orbit (GEO).
     • Proposed methodology:
       – Create a histogram of particle flux versus LET for a 10-minute window of time for your target environment.
       – Calculate MFTF per LET (obtain SEU data).
       – Graph $R(\Phi)$ for a variety of LET values and their associated MFTFs, where $R(\Phi) = e^{-\Phi/MFTF}$.
       – For selected ranges of LETs, use an upper bound of particle flux (number of particles/(cm²·10 minutes)) to determine if the system will meet the mission's reliability requirements.
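
The last step of the methodology can be sketched as a per-bin check against the 99.9% requirement. The LET bin labels follow the environment histogram in this presentation, but every fluence and cross-section value below is a hypothetical placeholder, not measured V5 or V5QV data, and the function names are illustrative only.

```python
import math

REQUIRED_RELIABILITY = 0.999  # 3-nines within the 10-minute window

# (LET bin in MeV*cm^2/mg,
#  upper-bound fluence in particles/cm^2 per 10 minutes,
#  sigma_SEU in cm^2/design). All numbers are hypothetical placeholders.
BINS = [
    ("0.1 to 1.8", 5.0e2, 1.0e-9),
    ("1.8 to 3.6", 1.0e1, 1.0e-3),
    ("3.6 to 20", 1.0e-1, 1.0e-4),
    ("20 to 40", 1.0e-3, 5.0e-2),
]


def bin_reliability(fluence: float, sigma_seu: float) -> float:
    """R(Phi) = exp(-Phi / MFTF) with MFTF = 1 / sigma_SEU."""
    mftf = 1.0 / sigma_seu
    return math.exp(-fluence / mftf)


def check_bins(bins, required):
    """Return {bin label: (reliability, meets requirement)} for each LET bin."""
    results = {}
    for label, fluence, sigma_seu in bins:
        r = bin_reliability(fluence, sigma_seu)
        results[label] = (r, r >= required)
    return results


results = check_bins(BINS, REQUIRED_RELIABILITY)
for label, (r, ok) in results.items():
    print(f"LET {label:>10}: R = {r:.6f} -> {'meets' if ok else 'misses'} requirement")
```

Running the same check with each candidate device's cross sections shows which LET bins, if any, keep that device from meeting the requirement.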

  11. Environment Data: Flux versus LET Histogram for a 10-Minute Window
     Geosynchronous Equatorial Orbit (GEO), 100-mils shielding
     • Bins are selected based on $\sigma_{SEU}$ data points.
     • We will analyze system reliability for each bin.
     [Figure: histogram of flux (particles/(cm²·10 minutes), log scale from 1.0E-10 to 1.0E+03) versus LET bins (MeV·cm²/mg): 0 to 0.07, 0.07 to 0.1, 0.1 to 1.8, 1.8 to 3.6, 3.6 to 20, 20 to 40, 40 and over.]
