
Statistical Analysis and Optimization of Asynchronous Digital Circuits

Tsung-Te Liu and Jan M. Rabaey, University of California, Berkeley


  1. Statistical Analysis and Optimization of Asynchronous Digital Circuits
     Tsung-Te Liu and Jan M. Rabaey, University of California, Berkeley

  2. Outline
     • Motivation
     • Variability model of CMOS digital circuit
     • Performance model for different timing schemes
     • Performance comparison
     • Conclusion

  3. Variability Continues to Increase as Technology and Voltage Scale Down
     [Figure: device variability vs. technology node]
     [Figure: delay spread due to process variations (count vs. normalized delay): -40% to +30% at 1 V; -80% to +110% at 0.3 V] [Cao, ASU]
     • Higher variability with finer design rules and larger wafers
     • Higher variability with lower supply voltages

  4. Circuit Performance Characteristics with Different Timing Schemes
     [Figure: probability vs. computation delay for the original circuit, the conventional synchronous circuit, and the self-timed circuit. A: protocol circuit delay; B: 3σ delay variation]
     • A self-timed circuit is itself a variation-monitoring circuit
     • It becomes advantageous when the variation is large (B > A)
     • A statistical analysis framework is necessary

  5. Statistical Analysis Framework
     Circuit Variability Model
     • Supply voltage
     • Logic depth
     • Width and length
     • Body bias
     Performance Model
     • Computation overhead
     • Communication overhead
     • Delay and energy performance
     Goal: determine the optimal timing strategy in the presence of variability
     [Figure: energy vs. delay plot with regions labeled processors, communications, and sensors]

  6. Outline
     • Motivation
     • Variability model of CMOS digital circuit
     • Performance model for different timing schemes
     • Performance comparison
     • Conclusion

  7. Delay Model of CMOS Digital Circuit
     Regional current models:
     $$I \propto \frac{(V_{DD} - V_{th})^2}{1 + \dfrac{V_{DD} - V_{th}}{E_{sat} L}}$$
     $$I \propto \exp\!\left(\frac{V_{DD} - V_{th}}{S}\right)$$
     Unified current model:
     $$I \propto \frac{\left[\ln\!\left(1 + \exp\!\left(\dfrac{V_{DD} - V_{th}}{2S}\right)\right)\right]^2}{1 + \dfrac{2S}{E_{sat} L}\,\ln\!\left(1 + \exp\!\left(\dfrac{V_{DD} - V_{th}}{2S}\right)\right)}$$
     [Figure: delay, in units of the FO4 delay at V_DD = 1 V, vs. supply voltage [V] for a 4-stage FO4 INV chain; simulation data and model, with model error [%] vs. supply voltage]
     • One unified current model across different operating regions
     • Model error <2% from 0.3 V to 1 V
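
The unified current model on slide 7 can be explored numerically with a short sketch. Everything below is illustrative rather than the authors' implementation: the function names, the parameter values in the usage note, the delay relation T_d ∝ C·V_DD / I, and the exact scaling of the velocity-saturation term (chosen here so that the expression reduces to the two regional models) are assumptions.

```python
import math


def unified_current(vdd, vth, s, esat_l):
    """Relative drain current (arbitrary units) from a unified model.

    vdd, vth : supply and threshold voltage [V]
    s        : subthreshold slope parameter S [V] (assumed value)
    esat_l   : velocity-saturation voltage E_sat * L [V] (assumed value)
    """
    x = (vdd - vth) / (2.0 * s)
    # Numerically stable evaluation of ln(1 + exp(x)).
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    # Effective overdrive: ~ (vdd - vth) above threshold, exponential below.
    v_eff = 2.0 * s * softplus
    return v_eff ** 2 / (1.0 + v_eff / esat_l)


def relative_delay(vdd, vth, s, esat_l):
    """Gate delay up to a constant factor, assuming T_d ~ C * V_DD / I."""
    return vdd / unified_current(vdd, vth, s, esat_l)
```

For example, with the hypothetical values `vth=0.3`, `s=0.04`, and `esat_l=1.0`, `relative_delay(0.3, 0.3, 0.04, 1.0) / relative_delay(1.0, 0.3, 0.04, 1.0)` gives the slowdown from 1 V to 0.3 V predicted by this sketch.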

  8. Delay Variability Model
     $$S^{T_d}_{V_{th}} = \frac{\partial T_d}{\partial V_{th}} \cdot \frac{V_{th}}{T_d}, \qquad S^{T_d}_{K} = \frac{\partial T_d}{\partial K} \cdot \frac{K}{T_d}$$
     $$\left(\frac{\sigma_{T_d}}{\mu_{T_d}}\right)^2 = \underbrace{\left(S^{T_d}_{V_{th}}\,\frac{\sigma_{V_{th}}}{\mu_{V_{th}}}\right)^2}_{\text{threshold voltage}} + \underbrace{\left(S^{T_d}_{K}\,\frac{\sigma_{K}}{\mu_{K}}\right)^2}_{\text{geometry}}$$
     [Figure: σ/μ [%] vs. supply voltage [V], two panels: die-to-die variation (DTD, "global variation") and within-die variation (WID, "local mismatch"); each shows simulation data, the full model, and the separate threshold-voltage and geometry contributions]
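
The sensitivity-based variability model on slide 8 can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, the sensitivities are estimated by a central finite difference rather than analytically, and the V_th and K contributions are treated as independent.

```python
import math


def sensitivity(f, x0, rel_step=1e-4):
    """Normalized sensitivity S = (df/dx) * (x0 / f(x0)), by central difference."""
    h = x0 * rel_step
    dfdx = (f(x0 + h) - f(x0 - h)) / (2.0 * h)
    return dfdx * x0 / f(x0)


def delay_variability(s_vth, cv_vth, s_k, cv_k):
    """sigma/mu of the delay from independent V_th and K contributions.

    s_vth, s_k   : normalized delay sensitivities to V_th and K
    cv_vth, cv_k : sigma/mu of V_th and K
    """
    return math.hypot(s_vth * cv_vth, s_k * cv_k)
```

As a usage example, for a toy square-law delay `lambda vth: 1.0 / (1.0 - vth) ** 2` at V_DD = 1 V (an assumed model, not the one from the slides), `sensitivity(..., 0.3)` returns roughly 2·V_th / (V_DD − V_th) ≈ 0.857.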

  9. Delay Variability Model
     $$\left(\frac{\sigma_{T_d,\text{total}}}{\mu_{T_d,\text{total}}}\right)^2 = \left(\frac{\sigma_{T_d,\text{DTD}}}{\mu_{T_d,\text{DTD}}}\right)^2 + \left(\frac{\sigma_{T_d,\text{WID}}}{\mu_{T_d,\text{WID}}}\right)^2$$
     [Figure: σ/μ [%] vs. supply voltage [V]; simulation data, model (total), model (DTD), and model (WID), with model error [%] vs. supply voltage]
     • Model error <8% from 0.3 V to 1 V
     • Local mismatch dominates at low supply voltages

  10. Delay Variability Model with Different Logic Depths
     $$\left(\frac{\sigma_{T_d,\text{total}\_n}}{\mu_{T_d,\text{total}\_n}}\right)^2 = \left(\frac{\sigma_{T_d,\text{DTD}\_4}}{\mu_{T_d,\text{DTD}\_4}}\right)^2 + \frac{4}{n}\left(\frac{\sigma_{T_d,\text{WID}\_4}}{\mu_{T_d,\text{WID}\_4}}\right)^2$$
     [Figure: σ/μ [%] vs. supply voltage [V] for n = 4, 8, and 24; simulation data and model, with model error [%] vs. supply voltage]
     • Use 4-stage inverter chain model as baseline model
     • Model error <13% for n=8 and <15% for n=24
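
The logic-depth scaling on slide 10 combines the 4-stage baseline values so that the die-to-die term is unchanged while the within-die term shrinks as 4/n. A minimal sketch, with a hypothetical function name and made-up input values:

```python
import math


def total_variability(cv_dtd_4, cv_wid_4, n=4):
    """sigma/mu of an n-stage chain from the 4-stage baseline values.

    cv_dtd_4 : die-to-die sigma/mu of the 4-stage chain
    cv_wid_4 : within-die sigma/mu of the 4-stage chain
    n        : logic depth (number of stages)
    """
    if n <= 0:
        raise ValueError("logic depth n must be positive")
    return math.sqrt(cv_dtd_4 ** 2 + (4.0 / n) * cv_wid_4 ** 2)
```

With illustrative inputs of 3% DTD and 4% WID variability at n = 4, the total is 5%; at n = 16 the WID contribution is halved in σ/μ terms and the total drops to about 3.6%, which is why local mismatch matters less on longer critical paths.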

  11. Outline
     • Motivation
     • Variability model of CMOS digital circuit
     • Performance model for different timing schemes
     • Performance comparison
     • Conclusion

  12. Delay Overhead Evaluation
     [Figure: probability vs. computation delay for the original circuit, synchronous timing, and dual-rail timing. A: protocol circuit delay; B: 3σ delay variation]
     • Assumption: process variation follows a Gaussian distribution
     • Dual-rail approach: protocol overhead only, no delay overhead
     • Synchronous approach: delay overhead only
     For 99.7% yield:
     $$D_{\text{sync}} = \frac{3\,\sigma_{\text{logic,total}}}{\mu_{\text{logic,total}}}$$

  13. Bundled-Data Self-Timed Approach
     [Figure: main data path with a replica delay line; probability vs. computation delay for both]
     $$f_{\text{logic}}(t) = N\!\left(\mu_{\text{logic}}, \sigma_{\text{logic}}^2\right), \qquad f_{\text{delay-line}}(t) = N\!\left(\mu_{\text{delay-line}}, \sigma_{\text{delay-line}}^2\right)$$
     Goal: $P\!\left(t_{\text{logic}} \le t_{\text{delay-line}}\right) \approx 1$
     Assume the main data path and the replica delay line exhibit similar statistics. For 99.7% yield:
     $$D_{\text{bundled-data}} = D_{\text{variation}}^2 \left(0.5 + \sqrt{0.25 + \frac{2}{D_{\text{variation}}^2}}\right)$$
     where
     $$D_{\text{bundled-data}} = \frac{\mu_{\text{delay-line}} - \mu_{\text{logic}}}{\mu_{\text{logic}}}, \qquad D_{\text{variation}} = \frac{3\,\sigma_{\text{logic,WID}}}{\mu_{\text{logic,WID}}}$$
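
The closed form on slide 13 can be reproduced with a short derivation. The two starting assumptions below are not stated on the slide and are one plausible reading of "similar statistics": the replica delay line needs a 3σ margin on the difference of the two delays, and its variance grows in proportion to its mean delay (independent stages). Writing D for D_bundled-data and D_var for D_variation:

```latex
\begin{align*}
\mu_{\text{delay-line}} - \mu_{\text{logic}}
  &= 3\sqrt{\sigma_{\text{logic}}^2 + \sigma_{\text{delay-line}}^2},
  \qquad
  \sigma_{\text{delay-line}}^2 = (1 + D)\,\sigma_{\text{logic}}^2
  \\
D &= \frac{3\,\sigma_{\text{logic}}}{\mu_{\text{logic}}}\sqrt{2 + D}
   = D_{\text{var}}\sqrt{2 + D}
  \\
0 &= D^2 - D_{\text{var}}^2\,D - 2\,D_{\text{var}}^2
  \\
D &= \frac{D_{\text{var}}^2}{2} + \sqrt{\frac{D_{\text{var}}^4}{4} + 2\,D_{\text{var}}^2}
   = D_{\text{var}}^2\left(0.5 + \sqrt{0.25 + \frac{2}{D_{\text{var}}^2}}\right)
\end{align*}
```

Only the positive root of the quadratic is kept, since the delay line must be slower than the logic it tracks.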

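  14. Bundled-Data Delay Overhead
     $$D_{\text{bundled-data}} \approx \begin{cases} \sqrt{2}\,D_{\text{variation}}, & \text{when } D_{\text{variation}} \to 0 \\ D_{\text{variation}}^2, & \text{when } D_{\text{variation}} \to \infty \end{cases}$$
     [Figure: delay overhead [%] vs. process variability [%] (0 to 200), with a linear O(n) region at low variability and a quadratic O(n²) region at high variability]
     • Delay overhead becomes much larger as process variability increases!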
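
The delay overheads of the synchronous and bundled-data schemes can be evaluated directly from the formulas on slides 12 to 14. The sketch below uses hypothetical function names and takes all quantities as fractions (0.3 means 30%).

```python
import math


def sync_overhead(cv_logic_total):
    """D_sync = 3 * sigma/mu of the total (DTD + WID) logic delay."""
    return 3.0 * cv_logic_total


def bundled_data_overhead(d_variation):
    """D_bundled-data as a function of D_variation = 3 * sigma/mu (WID only)."""
    if d_variation < 0:
        raise ValueError("d_variation must be non-negative")
    if d_variation == 0:
        return 0.0
    return d_variation ** 2 * (0.5 + math.sqrt(0.25 + 2.0 / d_variation ** 2))
```

For small arguments `bundled_data_overhead(v)` approaches `math.sqrt(2) * v`, and for large arguments it approaches `v ** 2`, matching the two limits on the slide.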

  15. Performance Model under Variations
     Original delay and energy model:
     $$T_{\text{comp}} = T_{\text{delay}}$$
     $$E_{\text{dynamic}} = \alpha C_{\text{switch}} V^2$$
     $$E_{\text{leakage}} = V I_{\text{leakage}} T_{\text{delay}}$$
     $$E_{\text{total}} = \alpha C_{\text{switch}} V^2 + V I_{\text{leakage}} T_{\text{delay}}$$
     Statistical delay and energy model:
     $$T_{\text{comp}} = T_{\text{delay}} (1 + P + D)$$
     $$E_{\text{dynamic}} = \alpha C_{\text{switch}} (1 + P) V^2$$
     $$E_{\text{leakage}} = V I_{\text{leakage}} (1 + P)\, T_{\text{delay}} (1 + P + D)$$
     $$E_{\text{total}} = \alpha C_{\text{switch}} (1 + P) V^2 + V I_{\text{leakage}} (1 + P)\, T_{\text{delay}} (1 + P + D)$$

     Timing scheme    Delay overhead (D)    Protocol overhead (P)
     Synchronous      D_sync                0
     Bundled-data     D_bundled-data        P_bundled-data
     Dual-rail        0                     P_dual-rail

     • Evaluate computation delay and energy under variations
     • Overhead changes with supply voltage and logic depth
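
The statistical delay and energy model on slide 15 translates almost line by line into code. This is a sketch, not the authors' tool: the names `Overheads` and `performance` are hypothetical, P and D are treated as dimensionless fractions of the nominal delay, and the numbers in the usage note are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Overheads:
    delay: float     # D: delay overhead (fraction of nominal delay)
    protocol: float  # P: protocol overhead (fraction of nominal delay)


def performance(t_delay, alpha, c_switch, vdd, i_leak, ov):
    """Return (T_comp, E_total) under the statistical model.

    t_delay  : nominal computation delay [s]
    alpha    : switching activity factor
    c_switch : switched capacitance [F]
    vdd      : supply voltage [V]
    i_leak   : leakage current [A]
    ov       : Overheads for the chosen timing scheme
    """
    t_comp = t_delay * (1.0 + ov.protocol + ov.delay)
    e_dynamic = alpha * c_switch * (1.0 + ov.protocol) * vdd ** 2
    e_leakage = vdd * i_leak * (1.0 + ov.protocol) * t_comp
    return t_comp, e_dynamic + e_leakage
```

Passing `Overheads(delay=0.0, protocol=0.0)` recovers the original model. A synchronous design would use `Overheads(delay=d_sync, protocol=0.0)` and a dual-rail design `Overheads(delay=0.0, protocol=p_dual_rail)`, so the two can be compared at each supply voltage and logic depth.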

  16. Outline
     • Motivation
     • Variability model of CMOS digital circuit
     • Performance model for different timing schemes
     • Performance comparison
     • Conclusion

  17. Delay Overhead Comparison
     [Figure: delay overhead [%] vs. supply voltage [V] for synchronous timing and bundled-data self-timing; panels: 4-stage FO4 INV chain and 24-stage FO4 INV chain]
     • Global variation affects only the synchronous approach
     • Local mismatch dominates at low supply voltages
     • Local mismatch has less impact on longer critical paths

  18. Speed Performance Comparison
     [Figure: normalized delay vs. supply voltage [V] for dual-rail self-timing and bundled-data self-timing; panels: 4-stage FO4 INV chain and 24-stage FO4 INV chain]
     • Assumption: $P_{\text{bundled-data}} = 1\,T_{FO4}$; $P_{\text{dual-rail}} = 2\,T_{FO4}$
     • Synchronous scheme is better for small critical paths at high supply voltages
     • Dual-rail scheme is better for large critical paths at low supply voltages

  19. Energy Performance Comparison
     [Figure: energy-delay plot, energy [fJ] vs. supply [V], for synchronous timing, dual-rail self-timing, and bundled-data self-timing on a 24-stage FO4 INV chain; panels: α = 0.1 and α = 0.01]
     • Synchronous scheme is better for high activity at high supply voltages
     • Dual-rail scheme is better for low activity at low supply voltages
     • Leakage dominates for low activity at low supply voltages

  20. Conclusion
     • A statistical analysis framework is proposed to evaluate the performance of CMOS digital circuits in the presence of process variations.
     • Designers can efficiently determine the optimal timing strategy, pipeline depth, and supply voltage based on the proposed variability and statistical performance models.
     • Asynchronous design exhibits better energy and delay characteristics for circuits with low activity and larger critical path delay under process variations.

  21. Acknowledgement
     • Berkeley Wireless Research Center
     • NSF Infrastructure Grant
     • STMicroelectronics
     • Multiscale System Center
     Thank you!
