Razor and ReCycle  Ameen Akel
Razor
Razor – Motivation  Power  Today’s designs are extremely power hungry  Power is now a limiting factor  Performance cannot be sacrificed (overall) to save power  Both power and performance must continue improving  Static Voltage Scaling  Not adaptive enough  Must rely on conservative estimates  Gives up potential power savings for little to no performance benefit  Make average silicon matter!
Razor – Approach  The Main Idea  Circuit delay is data dependent, so why should designers design only for the conservative case?  Aim for the “average” case, just like ReCycle  Lower the supply voltage (to sub-critical levels) to reduce power throughout the chip  What happens if execution encounters a “worst-case” path through a pipeline stage?  Wrong data can be latched and passed to the next stage  Razor combines ideas from a few previous designs to solve this.
Razor Design Goals  Razor hardware must not interfere with error-free operation of a pipeline  Nearly invisible to the common case  Razor hardware cannot fail; it must always be correct  Razor hardware must be minimal in both hardware size and power footprint
Razor – Approach  What’s unique to Razor?  Counterflow pipeline in the synchronous world  Handling metastability  Inducing error-prone state  What did it inherit?  Delayed latch idea (Triple Latch), but not implementation (Shadow Latch)  Error correction method (from DIVA)
Razor – Approach  Exploring the Shadow Latch  Method to detect and recover from errors in a minimal number of cycles  The shadow latch is clocked about 50% of a cycle behind the main flip-flop’s clock in order to catch any timing errors  A comparator (an XOR gate) quickly checks whether the data in the flip-flop matches the data in the shadow latch  Pipeline stages are designed so that, in the absolute worst case, the shadow latch’s setup time is still met  A detected error invalidates the data coming out of the flip-flop for that cycle  [Figure: a Razor FF (main flip-flop on clk, shadow latch on clk_delayed, comparator producing Error) placed between two logic stages L1 and L2, with a timing diagram of clk, clk_delayed, D, Q and Error over cycles 1 to 4 for Instr 1 and Instr 2.]
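The shadow-latch check described on this slide can be sketched as a small behavioral model. This is an illustrative sketch only, not the circuit from the Razor design: the function name, the `RazorSample` type, the 1-bit signal and the idealized timing (setup and hold times are ignored) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RazorSample:
    main_value: int    # value captured by the main flip-flop on clk
    shadow_value: int  # value captured by the shadow latch on clk_delayed
    error: bool        # comparator output: XOR of the two captured values


def razor_flip_flop(old_value: int, new_value: int, logic_delay: float,
                    clock_period: float,
                    shadow_delay_fraction: float = 0.5) -> RazorSample:
    """Model one cycle of a 1-bit Razor flip-flop (hypothetical model).

    The input switches from old_value to new_value logic_delay time units
    after the cycle starts. The main flip-flop samples at clock_period; the
    shadow latch samples shadow_delay_fraction of a cycle later.
    """
    main_edge = clock_period
    shadow_edge = clock_period * (1.0 + shadow_delay_fraction)
    if logic_delay > shadow_edge:
        # The design rule on the slide: the worst case must still meet
        # the shadow latch's timing, so this should never happen.
        raise ValueError("stage violates the shadow latch's worst-case timing")
    # The main flip-flop sees the new value only if the logic settled in time.
    main_value = new_value if logic_delay <= main_edge else old_value
    shadow_value = new_value  # correct by construction
    return RazorSample(main_value, shadow_value, bool(main_value ^ shadow_value))
```

For example, `razor_flip_flop(0, 1, 1.2, 1.0)` flags an error because the main flip-flop still holds the stale 0, while a late transition that does not change the value (`old_value == new_value`) raises no error, which reflects the data-dependent nature of timing failures.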
Razor – Approach  Metastability: the state in which a signal is neither 0 nor 1; it usually settles around V_dd/2.  The shadow latch can never be metastable, given its timing constraints.  If the main flip-flop becomes metastable, the metastability detector reports it (most of the time).  There is a small chance that the Error signal itself becomes metastable, which is claimed to be unavoidable. In this case, a panic signal is raised and the pipeline is flushed.  [Figure 2. Reduced overhead Razor flip-flop and metastability detection circuits.]
Razor Approach – Recovery  Clock Gating  [Figure: (a) pipeline organization: IF, ID, EX and MEM stages separated by Razor FFs, a stabilizer FF (ST) ahead of WB, and per-stage Error and Recover signals gating clk; (b) pipeline timing (time in cycles vs. instructions): after a failing EX* and MEM*, the pipeline stalls for one cycle while the Razor latch gets the correct EX value, which is then provided to MEM.]
Razor Approach – Recovery  Clock Gating  The pipeline stalls on any Razor error  Forward progress is guaranteed, because the correct value of the problematic input is always available in the previous stage’s shadow latch  Only a single-cycle stall is required to recompute the next stage’s value, after which the pipeline continues  Possibly long cycle time  The cycle time must be long enough for any stage in the pipeline to deliver a clock-gating signal to all the other flip-flops.
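The single-cycle stall can be illustrated with a simple cycle count. This is a minimal sketch under stated assumptions: an ideal in-order pipeline that retires one instruction per cycle, at most one Razor error per instruction, and no other stall sources. The function name and parameters are hypothetical.

```python
from typing import Sequence


def cycles_with_clock_gating(has_error: Sequence[bool], pipeline_depth: int) -> int:
    """Count cycles to run a sequence of instructions with clock-gating recovery.

    has_error[i] is True if instruction i triggers a Razor error in some stage.
    """
    cycles = pipeline_depth - 1  # cycles needed to fill the pipeline
    for error in has_error:
        cycles += 1              # normal issue slot for this instruction
        if error:
            cycles += 1          # the whole pipeline is gated for one cycle
    return cycles
```

With a 5-stage pipeline, four error-free instructions take 8 cycles in this model, and each Razor error adds exactly one more.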
Razor Approach – Recovery  Counterflow  [Figure: (a) pipeline organization for counterflow recovery: the Razor FF pipeline with per-stage Bubble, Error, Recover and FlushID signals and a flush control unit; Razor detects the fault, forwards a bubble toward WB and initiates a flush toward IF; (b) pipeline timing (time in cycles vs. instructions): the failing EX* is followed by a bubble, and later instructions are flushed stage by stage (Flush EX, Flush ID, Flush IF) before execution restarts at IF.]
Razor Approach – Recovery  Counterflow Pipelining  Uses an asynchronous-like design to propagate errors backwards  Error propagation is itself pipelined, so it has minimal effect on the cycle time of each stage  The tradeoff: clock gating resumes within one cycle, while counterflow pipelining allows a faster cycle time  The error signal travels back through each pipeline register until it reaches the PC, which then restarts execution.
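The tradeoff between the two recovery schemes can be made concrete with a back-of-the-envelope model. Every number used in the usage note is hypothetical and chosen only to show the shape of the tradeoff; the slides give no cycle times or penalty counts, and pipeline fill time is ignored.

```python
def execution_time_ns(n_instructions: int, error_rate: float,
                      penalty_cycles: float, cycle_time_ns: float) -> float:
    """Estimate run time for a recovery scheme (hypothetical model).

    error_rate is the fraction of instructions that trigger a Razor error,
    and penalty_cycles is the number of cycles lost per error.
    """
    expected_errors = n_instructions * error_rate
    total_cycles = n_instructions + expected_errors * penalty_cycles
    return total_cycles * cycle_time_ns


def faster_scheme(error_rate: float, n_instructions: int = 1_000_000) -> str:
    # Assumed values: clock gating loses 1 cycle per error but needs a
    # 10% longer cycle; counterflow loses 5 cycles per error at full speed.
    gating = execution_time_ns(n_instructions, error_rate, 1, 1.10)
    counterflow = execution_time_ns(n_instructions, error_rate, 5, 1.00)
    return "clock gating" if gating < counterflow else "counterflow"
```

Under these assumed numbers, counterflow pipelining comes out ahead at a 1% error rate and clock gating at a 5% error rate, so the better scheme depends on how often errors occur.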
Razor Approach – Dynamic Adjustments  Focus on a constant error rate (E_ref)  Change the supply voltage based on the measured error rate  Pros  Real dynamic changes based on runtime conditions  Cons  Voltage regulators are slow  Slow reaction causes overcompensation  [Figure 6. Supply Voltage Control System: the pipeline’s error signals are summed (Σ) into E_sample; E_diff = E_ref - E_sample drives the voltage control function, which sets V_dd through the voltage regulator; panic and reset signals are also shown.]
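One step of the control loop can be sketched as follows. The slide only gives E_diff = E_ref - E_sample; the proportional control law, the gain and the voltage limits below are assumptions made for illustration, not the control function used in Razor.

```python
def next_voltage(v_dd: float, e_sample: float, e_ref: float,
                 gain: float = 0.5, v_min: float = 0.6,
                 v_max: float = 1.8) -> float:
    """Return the next supply voltage from one error-rate sample.

    e_sample and e_ref are error rates expressed as fractions (0.015 = 1.5%).
    gain, v_min and v_max are hypothetical tuning values.
    """
    e_diff = e_ref - e_sample
    # Positive e_diff: fewer errors than the target, so the voltage can drop.
    # Negative e_diff: too many errors, so the voltage must rise.
    v_next = v_dd - gain * e_diff
    return min(v_max, max(v_min, v_next))
```

A real regulator responds slowly, so a loop like this applied to stale samples tends to overshoot, which is the overcompensation noted in the cons.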
Razor – Simulations/Data  Alpha-64 Simulation  Parameters:  In-order pipeline  8 KB I/D caches  192 of 2408 flip-flops were augmented with a shadow latch.  Important results:  3.1% total power overhead for Razor parts  1% of total power for recovery overhead
Razor – Simulations/Data  FPGA Multiplier Simulation  [Figure 9. Measured Error Rates for an 18x18-bit FPGA Multiplier Block at 90 MHz and 27 °C. Error rate (log scale) vs. supply voltage (V) from 1.78 down to 1.14. Marked points: environmental-margin @ 1.69 V, safety-margin @ 1.63 V, zero-margin @ 1.54 V. Annotations: 22% saving; 30% energy saving; 35% energy savings with 1.3% error; one error every ~20 seconds.]
Razor – Simulations/Data  Adder Simulation  Fixed voltage sweep  Goal: Reduce energy without sacrificing IPC  [Figure 11. The Qualitative Relationship Between Supply Voltage, Energy and Pipeline Throughput (for a fixed frequency). Curves against decreasing supply voltage: pipeline throughput (IPC), energy of adder operations (E_additions), energy of pipeline recovery (E_recovery), energy of adder without Razor support, and total adder energy E_adder = E_additions + E_recovery, with the optimal E_adder marked.]  [Figure 12. Relative Adder Energy and Pipeline Throughput for Simulated Benchmarks. Panels: BZIP and GCC; relative energy and relative performance (IPC) vs. voltage from 1.8 down to 0.6, with an error-rate point annotated in each panel.]
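The relationship E_adder = E_additions + E_recovery implies an energy-optimal voltage somewhere between the two ends of the sweep. The sketch below finds it for made-up curves: the quadratic energy model, the exponential error-rate model and the recovery cost are all hypothetical and are not fitted to the data in the figures.

```python
import math

V_NOMINAL = 1.8  # top of the voltage sweep


def error_rate(v: float) -> float:
    # Assumed model: errors grow exponentially as the voltage drops.
    return min(1.0, math.exp(-20.0 * (v - 1.0)))


def e_additions(v: float) -> float:
    # Assumed model: energy per operation scales with V^2, relative to nominal.
    return (v / V_NOMINAL) ** 2


def e_recovery(v: float, recovery_cost: float = 10.0) -> float:
    # Assumed model: each error costs recovery_cost times one addition.
    return error_rate(v) * recovery_cost * e_additions(v)


def e_adder(v: float) -> float:
    # Total adder energy, as defined on the slide.
    return e_additions(v) + e_recovery(v)


def optimal_voltage(voltages):
    # Pick the sweep point with the lowest total energy.
    return min(voltages, key=e_adder)


# Sweep from 1.8 down to 0.6 in 0.075 steps, like the figure's voltage axis.
SWEEP = [V_NOMINAL - 0.075 * i for i in range(17)]
```

Calling `optimal_voltage(SWEEP)` returns an interior point of the sweep: lowering the voltage first saves energy, then recovery energy dominates and the total rises again.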
Razor – Simulations/Data  Dynamic Scaling  Target error rate was 1.5%  Samples the error rate in 5000-cycle chunks  Uses those samples to dynamically scale the voltage  Slow reaction times  [Figure 13. Adder Error Rate and Voltage Controller Response. Panels: GCC and Gap; supply voltage and error rate vs. time.]
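The chunked sampling can be sketched as a windowed average over a per-cycle error trace. The trace format (one boolean per cycle) and the decision to drop a trailing partial chunk are assumptions for this example; only the 5000-cycle chunk size comes from the slide.

```python
from typing import List, Sequence


def windowed_error_rates(error_flags: Sequence[bool],
                         window: int = 5000) -> List[float]:
    """Return the error rate of each full window of cycles.

    error_flags[i] is True if a Razor error was flagged in cycle i.
    A trailing partial window is ignored.
    """
    rates = []
    for start in range(0, len(error_flags) - window + 1, window):
        chunk = error_flags[start:start + window]
        rates.append(sum(chunk) / window)
    return rates
```

Each returned value plays the role of one E_sample to compare against the 1.5% target; because a new sample only arrives every 5000 cycles, the controller necessarily reacts with some delay.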