NoCAlert: An On-Line and Real-Time Fault Detection Mechanism for Network-on-Chip Architectures




SLIDE 1

Andreas Prodromou, Andreas Panteli, Chrysostomos Nicopoulos, and Yiannakis Sazeides

University of Cyprus MICRO 2012

NoCAlert: An On-Line and Real-Time Fault Detection Mechanism for Network-on-Chip Architectures

Presenters: Leul Belayneh, Shibo Chen

SLIDE 2

Motivation

  • Moore's Law
  • Chip Multi-Processors
  • Networks-on-Chip
  • Fault types: permanent, transient, intermittent


SLIDE 3

Previous Works

  • Fault prevention: BulletProof [HPCA '06]
  • Fault recovery: Vicis [DAC '09], Relinoc [DATE '11]
  • Fault detection: ForEVeR [MICRO '11]

SLIDE 4

Goal

A lightweight, comprehensive, on-line and real-time fault detection mechanism for the NoC's control logic.

SLIDE 5

Background

Flit

  • A. Prodromou et al., "NoCAlert: An On-Line and Real-Time Fault Detection Mechanism for Network-on-Chip Architectures," in Proc. of the International Symposium on Microarchitecture (MICRO), 2012.

SLIDE 6

Background

Network invariance instances

SLIDE 7

Methodology

  • Identify invariances (forbidden behaviors) in the baseline router
  • Design an individual checker for each invariance
  • Keep the power and area overhead of the design low

SLIDE 8

Identifying Invariances

32 invariances identified.

By module:

  • RC unit: 3
  • Arbiter modules: 10
  • Crossbar: 3
  • Buffer state: 12
  • Port-level: 3
  • Network-level: 1

By network correctness rule:

  • No flit drop: 13
  • No new flit generated: 6
  • Bounded delivery: 13
  • No data corruption / packet missing: 16
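The two groups of counts on this slide can be cross-checked with a short tally. This is only a sketch: the dictionary names are made up for illustration, and reading the second group as "number of invariances that guard each correctness rule" is an assumption, not something the slide states.

```python
# Counts copied from the slide; the dictionary names are illustrative.
by_module = {
    "RC unit": 3,
    "Arbiter modules": 10,
    "Crossbar": 3,
    "Buffer state": 12,
    "Port-level": 3,
    "Network-level": 1,
}

# Assumption: each entry counts the invariances that guard that rule.
by_rule = {
    "No flit drop": 13,
    "No new flit generated": 6,
    "Bounded delivery": 13,
    "No data corruption / packet missing": 16,
}

total_invariances = sum(by_module.values())
total_rule_links = sum(by_rule.values())

# Average number of correctness rules guarded by one invariance.
links_per_invariance = total_rule_links / total_invariances

print(total_invariances, total_rule_links, links_per_invariance)
```

The per-module counts add up to the 32 invariances named on the slide. The per-rule counts add up to more than 32, which under the assumption above means that some invariances guard more than one correctness rule.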

SLIDE 9

Checkers

[Figure: checker diagram. Labels: INPUTS, OUTPUTS, Checker result]
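To make the idea of a checker concrete, here is a minimal software sketch of one for a single arbiter. It is not taken from the paper: the function name `arbiter_checker`, the bitmask encoding, and the three conditions are assumptions chosen for illustration, and a real checker would be a small combinational circuit rather than Python.

```python
def arbiter_checker(requests: int, grants: int) -> bool:
    """Return True if an illustrative arbiter invariance is violated.

    `requests` and `grants` are bitmasks with one bit per input port
    (hypothetical encoding, used only in this sketch).
    """
    # A grant was issued to an input that did not request it.
    grant_without_request = (grants & ~requests) != 0
    # More than one grant bit is set (grants must be one-hot or zero).
    multiple_grants = (grants & (grants - 1)) != 0
    # Assumed rule: a pending request must produce some grant.
    no_grant_despite_request = requests != 0 and grants == 0
    return grant_without_request or multiple_grants or no_grant_despite_request


# Inputs 1 and 2 request; input 1 is granted: legal behavior.
print(arbiter_checker(0b0110, 0b0010))
# Input 3 is granted although it never requested: violation.
print(arbiter_checker(0b0110, 0b1000))
```

The checker looks only at the inputs and outputs of the module it guards, which is what lets it raise its result in the same cycle in which the illegal output appears.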

SLIDE 10

Experimental Evaluation

Hardware Implementation

  • Baseline router + 32 invariance checkers implemented in 65 nm TSMC technology

Cycle-accurate simulator

  • Garnet NoC simulator for an 8×8 2D mesh network
  • Garnet extended with checker modules

Compared against the baseline, ForEVeR.

  • R. Parikh and V. Bertacco, "Formally Enhanced Runtime Verification to Ensure NoC Functional Correctness," in Proc. of the International Symposium on Microarchitecture (MICRO), 2011.

SLIDE 11

Fault Injection Model


  • 205 fault locations per 5-port router
  • 11,808 fault locations in the 8×8 mesh (fewer than 205 × 64 = 13,120, since edge and corner routers have fewer than 5 ports)
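Multiplying 205 by 64 gives 13,120, not 11,808. The 11,808 total is reproduced exactly if fault locations scale linearly with the number of router ports, i.e. 205 / 5 = 41 locations per port, and edge and corner routers are counted with 4 and 3 ports. That linear scaling is an assumption made for this sketch, and the names below are illustrative.

```python
def router_ports(x: int, y: int, width: int, height: int) -> int:
    """Ports of a 2D-mesh router: one local port plus one per neighbor."""
    neighbors = (x > 0) + (x < width - 1) + (y > 0) + (y < height - 1)
    return 1 + neighbors


WIDTH = HEIGHT = 8
LOCATIONS_PER_5_PORT_ROUTER = 205  # from the slide

# Assumption: fault locations scale linearly with the port count.
LOCATIONS_PER_PORT = LOCATIONS_PER_5_PORT_ROUTER // 5

total_ports = sum(
    router_ports(x, y, WIDTH, HEIGHT)
    for x in range(WIDTH)
    for y in range(HEIGHT)
)
total_locations = total_ports * LOCATIONS_PER_PORT

# What the total would be if every router had 5 ports.
naive_total = LOCATIONS_PER_5_PORT_ROUTER * WIDTH * HEIGHT

print(total_ports, total_locations, naive_total)
```

An 8×8 mesh has 36 interior routers with 5 ports, 24 edge routers with 4 ports, and 4 corner routers with 3 ports, for 288 ports in total.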

SLIDE 12

Results

SLIDE 13

Results

SLIDE 14

Results

SLIDE 15

Results

SLIDE 16

Results

SLIDE 17

Conclusion

NoCAlert has achieved:

  • 0% false negatives
  • Low detection latency: a 100× improvement over the baseline
  • 97% of true-positive faults are captured at the instant of injection
  • Minimal power (0.7%) and area (3%) overhead

SLIDE 18

Discussion Points

  • Is it worth having the 36% of false positives exhibited in the detection process? Delayed response vs. increased false positives.
  • Is it good to minimize the number of checkers to decrease area and power overhead?
  • Do you think it is feasible on other NoCs and router designs?