A Case for Clumsy Packet Processors
Arindam Mallik and Gokhan Memik
Electrical and Computer Engineering Dept., Northwestern University

Overview
• Faults
  - Correctness is overrated
  - What if the higher levels take care of it?
  - Processor can be even more aggressive/speculative
• Application-specific correctness
  - Networking applications
  - How do we measure?
• Tools for architects
  - Relation between overclocking and faults
Treat correctness as an objective, not a requirement
12/15/2004 International Symposium on Microarchitecture - MICRO 37

Outline
• Introduction
• Application description and error metrics
• Error models for overclocking a cache
• Processor configuration
• Measurement definitions
• Simulations

Motivation
• Performance, energy requirements
• Reliability / Probabilistic Circuits
• Circuit designers have to be conservative
  - Worst-case design

Introduction
• Inherent possibility of fault occurrence
  - Adverse environmental conditions
  - Aggressive scaling of supply voltage
  - Smaller manufacturing technologies
• Need for analysis
  - More transistors
  - Higher fault probability
• Effect on system integrity
  - Transient faults
  - Permanent faults

Application Errors
• For desktop processor or server
  - Capture and eliminate all faults
• Networking – Communication
  - A certain level of error is acceptable
• Nevertheless
  - The integrity of the system behavior must be maintained
• System impact
  - Excessive "resubmission"
  - Program output

Overview of Approach
[Diagram: Application Error Metrics; Overclocking vs. Fault Modeling; Simulator Configuration; Comparison Metric (Performance, Application Errors)]

Error Classification
• Fault vs. Error
• Effect or duration
• Volatile Error
  - Occurs mostly while processing a packet
  - Affects a unit data element
  - Error in a single packet
• Non-volatile Error
  - Occurs in the static data structures
  - Effects seen in many elements
  - Error in routing table

Error Metrics for Applications
• Categorization of NetBench applications
  - Low or micro-level: routines related to the lowest layers of the network stack
  - Routing-level: applications similar to traditional IP routing (Layers 3-4 of the network stack)
  - Application-level: traditional as well as emerging applications
• Common property of all applications
  - Control level tasks
  - Data level tasks

Error Measurement Procedure
• Mark data structures in NetBench apps
  - Important data structures: routing table entries, TTL value, …
  - Outputs of key function units: checksum value, NAT address
• Perform simulation
  - Introduce hardware faults
• Mark the change
  - Data values change
  - Application behavior changes
• Define the application error rate
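The measurement procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`flip_random_bit`, `application_error_rate`), the single-bit-flip fault model, and the TTL values are hypothetical and are not taken from the NetBench instrumentation or the simulator used in the talk. The idea shown is to record the marked values in a fault-free run, record them again with faults injected, and report the fraction that changed.

```python
import random


def flip_random_bit(value, width, rng):
    """Return `value` with one uniformly chosen bit (out of `width`) inverted."""
    return value ^ (1 << rng.randrange(width))


def application_error_rate(golden, observed):
    """Fraction of marked values that differ from the fault-free (golden) run."""
    if not golden or len(golden) != len(observed):
        raise ValueError("both runs must record the same, non-empty set of marked values")
    mismatches = sum(1 for g, o in zip(golden, observed) if g != o)
    return mismatches / len(golden)


# Hypothetical marked values: the TTL field of eight packets in a fault-free run.
rng = random.Random(0)
golden_ttl = [64, 63, 128, 255, 32, 17, 90, 201]

# Inject a single-bit hardware fault into two of the marked values.
faulty_ttl = list(golden_ttl)
for index in (2, 5):
    faulty_ttl[index] = flip_random_bit(faulty_ttl[index], 8, rng)

rate = application_error_rate(golden_ttl, faulty_ttl)
print(rate)
```

Because a bit flip always changes the stored value, two corrupted entries out of eight give an error rate of 0.25 for this marked field.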

A Sample Application - Route
• Route – one of the most common networking applications
  - Implements IPv4 routing
  - Receives each packet – table lookup – processes it to decide the next network hop
• Error Keys
  - Routing table initialization (IMPORTANT!!)
  - Checksum value
  - TTL value
  - Path traversed in routing table for each packet

Fault Models for Overclocking
• Overclock a component
  - Increased performance
  - Reduced energy
  - Increase in fault probability
• Goal
  - Find fault vs. overclocking aggressiveness
  - Particular circuit design
• Parameters
  - Voltage swing, noise

Opportunity for overclocking
[Figure: Voltage swing vs. time]
• Voltage swing
  - Rapid increase at first
  - Slow increase later

Not so fast, my friend!
• Noise (inductive and/or capacitive)
  - Signal deviation
• Overclocking
  - Reduced immunity

Approach
• Analyze each component separately
• 6-transistor SRAM cell
  - Input, clock, feedback loop

Finding fault probability
• Analyze the impact of noise on the feedback loop
• Noise immunity curves
• Different noise amplitude probabilities
• Check all switching combinations
Noise amplitude probability: $P(A_r) = 28.8\, e^{-28.8 A_r}$
[Figure: noise immunity curves for voltage swings of 0.39, 0.50, 0.56, 0.61, 0.67, 0.78, and 0.89 V_fs and full V_fs; noise amplitude for each switching combination]
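As a small worked example of the noise amplitude expression on this slide, the sketch below treats it as an exponential probability density over the relative noise amplitude A_r ≥ 0 (an assumption; the slide does not state the domain). The function names and the idea of a single fixed "margin" are hypothetical simplifications; the talk itself checks noise against immunity curves for every switching combination.

```python
import math

NOISE_RATE = 28.8  # coefficient taken from the slide's noise amplitude expression


def noise_density(a_r):
    """Density 28.8 * exp(-28.8 * a_r), assumed to apply for a_r >= 0."""
    if a_r < 0:
        return 0.0
    return NOISE_RATE * math.exp(-NOISE_RATE * a_r)


def prob_noise_exceeds(margin):
    """P(A_r > margin): the integral of the density from `margin` to infinity."""
    if margin < 0:
        raise ValueError("margin must be non-negative")
    return math.exp(-NOISE_RATE * margin)


# Hypothetical margins: a smaller margin is exceeded by noise far more often.
for margin in (0.5, 0.3, 0.1):
    print(margin, prob_noise_exceeds(margin))
```

Under this reading, shrinking the tolerated noise amplitude raises the chance that noise exceeds it exponentially, which is the qualitative trend the fault model relies on.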

Estimation model
• Fit distribution into immunity
• Combine it with voltage swing vs. time
[Figure: Fault probability versus relative voltage swing (V_rs)]
[Figure: Fault probability versus relative clock frequency, log scale; x-axis: relative cycle time (C_r); series: Data, Formula]
$P_E = 2.59 \times 10^{-7}\, e^{F_r^2 / 6} = 2.59 \times 10^{-7}\, e^{1 / (6 C_r^2)}$

Outline
• Introduction
• Application description and error metrics
• Error models for overclocking a cache
• Processor configuration
• Measurement definitions
• Simulations

Processor Configuration
• Fault detection
  - No detection
  - Parity: one-strike, two-strikes, three-strikes
• Overclocking
  - Static: 75%, 50%, and 25% of the original
  - Dynamic: the processor adapts according to the faults observed; the frequency is adjusted at the end of each epoch
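The dynamic scheme can be pictured as a small per-epoch controller. The sketch below is an assumption-laden illustration, not the policy used in the talk: the thresholds `speed_up_below` and `slow_down_above`, the one-step-at-a-time adjustment, and the function names are all hypothetical. Only the four relative cycle-time levels and the "adjust at the end of each epoch" rule come from the slide.

```python
# Relative cycle times from the slide: 1.00 is the original clock, 0.25 the most aggressive.
CYCLE_LEVELS = [1.00, 0.75, 0.50, 0.25]


def next_level(level, faults_in_epoch, speed_up_below=1, slow_down_above=4):
    """Choose the level index for the next epoch from the faults seen in this one.

    The two thresholds are hypothetical tuning knobs, not values from the talk.
    """
    if faults_in_epoch > slow_down_above and level > 0:
        return level - 1  # too many faults: back off toward the original clock
    if faults_in_epoch < speed_up_below and level < len(CYCLE_LEVELS) - 1:
        return level + 1  # quiet epoch: overclock more aggressively
    return level


def run_epochs(fault_counts, start_level=0):
    """Return the relative cycle time used in each epoch."""
    level = start_level
    history = []
    for faults in fault_counts:
        history.append(CYCLE_LEVELS[level])
        level = next_level(level, faults)  # adjustment happens at the end of the epoch
    return history


history = run_epochs([0, 0, 2, 9, 0])
print(history)
```

With these made-up fault counts the controller speeds up twice, holds, backs off once after the noisy epoch, and ends at a relative cycle time of 0.75.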

Measurement definitions
• Comparison between ideal and erroneous execution
  - Traditional parameters – unfair competition
  - Consider both performance and reliability
• Energy-Delay-Fallibility product
  - $\text{energy}^k \times \text{delay}^m \times \text{fallibility}^n$
  - Fallibility = unit error occurrence probability
  - Can adjust the importance of faults by changing n
  - In present work, k = 1; m = 2; n = 2
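The Energy-Delay-Fallibility product is simple to compute once the three quantities are available. The sketch below uses the exponents stated on the slide (k = 1, m = 2, n = 2) as defaults; the function name and the sample inputs are hypothetical, with all three inputs assumed to be normalized to a baseline configuration.

```python
def edf_product(energy, delay, fallibility, k=1, m=2, n=2):
    """Return energy^k * delay^m * fallibility^n (defaults follow the slide)."""
    if min(energy, delay, fallibility) < 0:
        raise ValueError("energy, delay, and fallibility must be non-negative")
    return (energy ** k) * (delay ** m) * (fallibility ** n)


# Hypothetical normalized inputs: the baseline is 1.0 in every dimension.
baseline = edf_product(1.0, 1.0, 1.0)
overclocked = edf_product(energy=0.9, delay=0.8, fallibility=1.1)
print(baseline, overclocked)
```

Raising n makes the metric penalize unreliable configurations more heavily, which is how the importance of faults can be tuned.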

Simulations
• SimpleScalar simulator for StrongARM 110
  - Roughly an execution core of a network processor
  - Separate 4 KB direct-mapped L1 data and instruction caches
  - 128 KB 4-way set-associative unified L2 cache
• Error probability
  - At normal clock frequency, error probability = $2.59 \times 10^{-7}$ per bit
  - Increased error probability at higher clock rate according to the fault model
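To give a feel for the per-bit figure, the sketch below converts it into the probability that an access of several bits contains at least one faulty bit. It assumes bit faults are independent, which the slide does not state; the function name and the 32-bit access width are likewise illustrative choices.

```python
import math

PER_BIT_FAULT_PROBABILITY = 2.59e-7  # per-bit value at normal clock frequency, from the slide


def prob_any_bit_fault(bits, p_bit=PER_BIT_FAULT_PROBABILITY):
    """P(at least one faulty bit) = 1 - (1 - p_bit)^bits, assuming independent bits."""
    if bits < 0 or not 0.0 <= p_bit < 1.0:
        raise ValueError("bits must be >= 0 and p_bit must lie in [0, 1)")
    # log1p/expm1 keep precision when p_bit is tiny.
    return -math.expm1(bits * math.log1p(-p_bit))


word_fault = prob_any_bit_fault(32)
print(word_fault)
```

For such a small per-bit probability the result is very close to 32 times the per-bit value, so at the normal clock frequency a faulty 32-bit access remains a rare event.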

Application Error Behavior
[Figure: three panels (data plane; control plane; error introduced in both control and data plane) plotting error probability against relative clock cycle (100%, 75%, 50%, 25%) for the categories Initialization Error, Interface Value, Destn Add, Radix Tree Entry, Translated IP Address, and Fatal Error]

Fatal Error Probability
• Curse on the system
  - Destroys integrity – unacceptable
• Increases with high clock frequency
• Observed on system with no error detection
[Figure: fatal error probability for route, drr, nat, tl, url, md5, crc, and their average at relative clock cycles of 100%, 75%, 50%, and 25%]

Energy-Delay-Fallibility Values
• High Energy-Delay-Fallibility
  - Higher fallibility rate
  - Increased execution cycle
  - Extra instructions due to errors
  - Erroneous load → cache miss
[Figure: Energy-Delay^2-Fallibility^2 by recovery scheme (no detection, one-strike, two-strikes, three-strikes) for relative clock cycles 1, 0.75, 0.5, 0.25, and dynamic]

Conclusions
• Release correctness constraint
  - Application-specific processors
• Utilizing released correctness
  - Application-specific error metrics
  - Overclocking
• Fault modeling for overclocking a data cache
• Error weighting – metrics
