DSENT – A Tool Connecting Emerging Photonics with Electronics for Opto-Electronic Networks-on-Chip Modeling
Chen Sun, Chia-Hsin Owen Chen, George Kurian, Lan Wei, Jason Miller, Anant Agarwal, Li-Shiuan Peh, Vladimir Stojanovic
5/19/2012 1
Photonics to DRAM [Beamer '10, Udipi '11]
Photonics on-chip [Vantrease '08, Kurian '10]
directing data
– Digital logic
– Consumes power

– Wire capacitance switching
– Repeaters

– Receivers, Modulators
– Laser
– Ring thermal tuning
– Serialize/Deserialize
All these costs need to be visible to the network architect!
[Joshi, NOCS 2009] [Pan, HPCA 2010]
Nothing currently models the interface between electronics and photonics
– Incomplete architectural models and timing for the router
– Scaling factors no longer valid for advanced processes
– Very low accuracies for modern technologies
– Very difficult to add technology or extend existing models
– A 10-year-old model that worked well, but insufficient now
– All links are optimized for min-delay
– Takes network parameters, queries, and technology; gives back area and power
[Figure: Technology File and Network Parameter File → Run DSENT → Results]
traces
– Keep relevant tech parameters, simplify technology entry
– ASIC-like modeling flow, generates primitives/standard cells
– Delay model, timing-constrained cell sizing, electrical links
– Able to model more generic digital, beyond just routers

[Figure: DSENT framework]
User Inputs: Technology Parameters (Process, VDD, Wmin, T, ...); Model Parameters (Nin, Nout, fclock, ...)
User-Defined Models: Mesh Network, Electrical Clos, Photonic Clos, Router, Repeated Link, Optical Link, Arbiter, Decoder, Buffers, Crossbar, Multiplexer
Support Models: Technology Characterization, Standard Cells, Expected Transitions, Optical Link Components
Tools: Timing Optimization, Optical Link Optimization
DSENT Outputs: Area, Delay, Non-Data-Dependent Power, Data-Dependent Energy
– Delay model, timing-constrained cell sizing, electrical links
– ASIC-like flow, standard cell based
– Keep relevant tech parameters, simplify technology entry
– Able to model more generic digital, beyond just routers
– Power/Area estimates accurate to ~20% of SPICE simulation
– Methodology targeted for 45 nm and below

Model           Reference Point      DSENT
Router (6x6)
  Buffer (mW)   SPICE – 6.93         7.55 (+9%)
  Xbar (mW)     SPICE – 2.14         2.06 (-4%)
  Control (mW)  SPICE – 0.75         0.83 (+11%)
  Clock (mW)    SPICE – 0.74         0.63 (-15%)
  Total (mW)    SPICE – 10.6         11.2 (+6%)
  Area (mm2)    Encounter – 0.070    0.062 (-11%)
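The signed percentages in the validation table can be recomputed as the relative difference between the DSENT estimate and the reference value. A minimal sketch of that arithmetic, using only the numbers from the table; the dictionary and function names are illustrative and not part of DSENT:

```python
# Values copied from the 6x6 router validation table (mW, except Area in mm2).
reference = {"Buffer": 6.93, "Xbar": 2.14, "Control": 0.75,
             "Clock": 0.74, "Total": 10.6, "Area": 0.070}
dsent = {"Buffer": 7.55, "Xbar": 2.06, "Control": 0.83,
         "Clock": 0.63, "Total": 11.2, "Area": 0.062}


def signed_error_percent(ref, est):
    """Relative difference of the estimate with respect to the reference, in percent."""
    return 100.0 * (est - ref) / ref


errors = {name: signed_error_percent(reference[name], dsent[name])
          for name in reference}

for name, err in errors.items():
    print(f"{name:8s} {err:+.1f}%")
```

Every entry stays within the ~20% bound quoted on the slide.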
– High data-rate
– Higher modulation depth (extinction ratio)
– Lower insertion loss
expensive with:
– High data-rate
– Lower modulation depth
– Higher bit error rate requirement
– Higher receiver sensitivity requirement
– Higher channel losses, e.g. higher modulator insertion loss
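The two bullets above map onto a first-order optical link budget: the laser must deliver the receiver's minimum optical power after all channel losses. The sketch below is a generic link-budget calculation, not DSENT's actual laser model; the function names, the loss categories, and every numeric value are hypothetical and chosen only for illustration.

```python
def channel_losses_db(waveguide_db_per_cm, length_cm, modulator_il_db, other_db):
    """Collect per-component losses in dB (component list is an assumption)."""
    return {
        "waveguide": waveguide_db_per_cm * length_cm,
        "modulator_insertion": modulator_il_db,
        "other": other_db,  # couplers, ring through-loss, etc. lumped together
    }


def required_laser_power_mw(sensitivity_mw, losses_db, laser_efficiency=1.0):
    """Electrical laser power needed so that sensitivity_mw reaches the receiver.

    sensitivity_mw: minimum optical power required at the photodetector.
    laser_efficiency: wall-plug efficiency (optical out / electrical in).
    """
    total_loss_db = sum(losses_db.values())
    optical_mw = sensitivity_mw * 10 ** (total_loss_db / 10.0)
    return optical_mw / laser_efficiency


# Hypothetical link: 4 cm waveguide, 1 dB modulator insertion loss, 3 dB other.
low_loss = required_laser_power_mw(
    0.01, channel_losses_db(1.0, 4.0, 1.0, 3.0), laser_efficiency=0.25)
high_loss = required_laser_power_mw(
    0.01, channel_losses_db(2.0, 4.0, 1.0, 3.0), laser_efficiency=0.25)
print(f"1 dB/cm: {low_loss:.3f} mW, 2 dB/cm: {high_loss:.3f} mW")
```

Because losses add in dB, laser power grows exponentially with waveguide loss per centimeter, which is why small changes in optical device parameters matter at the network level.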
temperature, active tuning is needed
– [Georgas CICC 2011, Nitta HPCA 2011]
DSENT models schemes for tuning, impact of process sigmas
Serializer/Deserializers are taken care of by electrical framework
– PClos, EClos normalized to same throughput/latency
routers, k, n, r = 16, 16, 16
Compare at
[Joshi, NOCS 2009]
Data-Dependent: Router data-path/control, Electrical links, Gated clocks, Receiver/Modulator, SerDes
Non-Data-Dependent: Leakage, Un-gated clocks, Laser, Thermal tuning, ring heating
[Figure annotations: crossover points; data-dependent energy dominant; non-data-dependent energy dominant]
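The split into data-dependent (DD) and non-data-dependent (NDD) terms explains why crossover points exist: NDD power is paid regardless of traffic, so its per-bit cost shrinks as throughput rises. A minimal sketch under that assumption follows. The DD/NDD numbers for the two networks are hypothetical, not results from the talk; only the 4.5 Tb/s and 33 Tb/s operating points come from the slides.

```python
def energy_per_bit_pj(dd_pj_per_bit, ndd_power_w, throughput_tbps):
    """Total energy per bit. Note 1 W / (1 Tb/s) = 1 pJ/bit."""
    return dd_pj_per_bit + ndd_power_w / throughput_tbps


def crossover_throughput_tbps(dd_a, ndd_a, dd_b, ndd_b):
    """Throughput at which networks A and B cost the same energy per bit.

    Solves dd_a + ndd_a / T = dd_b + ndd_b / T for T.
    Returns None when the curves never cross at positive throughput.
    """
    if dd_a == dd_b:
        return None
    t = (ndd_b - ndd_a) / (dd_a - dd_b)
    return t if t > 0 else None


# Hypothetical networks: electrical has low NDD power but high DD energy,
# photonic has high NDD power (laser, tuning) but low DD energy.
electrical = {"dd": 1.0, "ndd": 2.0}   # pJ/bit, W
photonic = {"dd": 0.4, "ndd": 8.0}     # pJ/bit, W

crossover = crossover_throughput_tbps(
    electrical["dd"], electrical["ndd"], photonic["dd"], photonic["ndd"])
print(f"crossover at {crossover:.1f} Tb/s")

for tput in (4.5, 33.0):
    e = energy_per_bit_pj(electrical["dd"], electrical["ndd"], tput)
    p = energy_per_bit_pj(photonic["dd"], photonic["ndd"], tput)
    print(f"{tput:5.1f} Tb/s: electrical {e:.2f} pJ/b, photonic {p:.2f} pJ/b")
```

With these illustrative numbers the electrical network wins below the crossover and the photonic network wins above it.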
Max Throughput Low Throughput
Energy Breakdown at Max Network Throughput (33 Tb/s)
[Figure legend: Electrical 45nm, Photonic 45nm, Electrical 11nm, Photonic 11nm]
Energy Breakdown at Low Network Throughput (4.5 Tb/s)
[Figure legend: Electrical 45nm, Photonic 45nm, Electrical 11nm, Photonic 11nm]
Significant non-data-dependent laser, tuning
– Very costly above 1.0 dB/cm
– Some gains going below 1.0 dB/cm, still can’t win at lower utilizations
– Modulator is data-dependent (DD), laser is non-data-dependent (NDD)
– Generalized methodology for digital components
– Moves beyond fixed number evaluations for photonics
– Includes power/area models for several networks
tradeoffs for an example photonic Clos network
– Utilization-dependent energy plots
– Data-dependent and non-data-dependent power
– Investigate network sensitivity to optical parameters
– Ease user model specification to aid microarchitecture studies
– Automatically form estimates for local interconnect
(we will make it downloadable following the conference)
– Integrated Photonics teams at MIT and University of Colorado, Boulder for models
– Prof. Dimitri Antoniadis’s group for their sub-45nm transistor models
– DARPA, NSF, FCRP IFC, SMART LEES, Trusted Foundry, Intel, APIC, MIT CICS, NSERC
– Uses old Cacti decoder sizing
– Though clock power of those is added
– Optimal delay H-tree
[Figure: wavelength-division-multiplexed photonic link. An External Laser Source is connected through a Single Mode Fiber and a Coupler to an On-chip Waveguide carrying λ1 + λ2. On the Chip, Sender A and Sender B use Modulator Drivers with Ring Modulators with λ1 and λ2 resonance; Receiver A and Receiver B use Ring Filters with λ1 and λ2 resonance, each followed by a Photodetector and Receiver Circuit.]
[Figure: delay model and timing optimization. Each standard cell (INV, NAND2) is replaced by an Equivalent Circuit with an on-resistance (Ron-INV, Ron-NAND2) and an input capacitance (Cin-INV, Cin-NAND2), giving per-arc delays (A-Y, B-Y) along a path from X to Z that drives a big capacitive load. Timing Optimization Iterations 1 to 3: "Timing not met! Size up!". Timing Optimization Iteration 4: "Timing met!".]
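The "timing not met, size up" loop in the figure can be sketched as a greedy heuristic over a chain of cells with an RC delay model. This is a simplified illustration, not DSENT's actual optimizer: the unit resistance and capacitances, the doubling step, and all function names are assumptions made for the example.

```python
def stage_delays(sizes, load_cap, r_unit=1.0, c_in_unit=1.0, c_par_unit=1.0):
    """RC delay of each stage in a chain (kOhm * fF = ps).

    A cell of size s has on-resistance r_unit / s, input cap c_in_unit * s,
    and output parasitic cap c_par_unit * s. The last stage drives load_cap.
    """
    delays = []
    for i, size in enumerate(sizes):
        if i + 1 < len(sizes):
            next_cap = c_in_unit * sizes[i + 1]
        else:
            next_cap = load_cap
        resistance = r_unit / size
        delays.append(resistance * (c_par_unit * size + next_cap))
    return delays


def size_for_timing(num_stages, load_cap, target_delay, max_size=64):
    """Start from minimum-size cells and double the slowest stage until timing is met.

    Returns (sizes, total_delay, timing_met).
    """
    sizes = [1] * num_stages
    while True:
        delays = stage_delays(sizes, load_cap)
        total = sum(delays)
        if total <= target_delay:
            return sizes, total, True
        candidates = [i for i in range(num_stages) if sizes[i] < max_size]
        if not candidates:
            return sizes, total, False  # cannot size up any further
        worst = max(candidates, key=lambda i: delays[i])
        sizes[worst] *= 2


sizes, delay, met = size_for_timing(num_stages=3, load_cap=50.0, target_delay=25.0)
print(sizes, delay, met)
```

Starting from minimum-size cells and growing only what is needed targets the timing constraint rather than minimum delay. Note that sizing up one stage also increases the load on the stage before it, which the delay recomputation in each iteration accounts for.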
[Figure: expected transitions and leakage for standard cells]
NAND2_X1 Standard Cell (inputs A, B; output Y), Equivalent Circuit: Input Gate Cap A, Input Gate Cap B, Output Drain Cap, Calculate Output Transition; Leakage: Leak(A=0, B=0), Leak(A=0, B=1), Leak(A=1, B=0), Leak(A=1, B=1)
INV_X1 Standard Cell, Equivalent Circuit: Input Gate Cap A, Output Drain Cap, Calculate Output Transition; Leakage: Leak(A=0), Leak(A=1)
Net A: P00 = 0.30, P01 = 0.20, P10 = 0.20, P11 = 0.30
Net B: P00 = 0.00, P01 = 0.50, P10 = 0.50, P11 = 0.00
Net Y: P00 = 0.00, P01 = 0.25, P10 = 0.25, P11 = 0.50
Net M: P00 = 0.30, P01 = 0.20, P10 = 0.20, P11 = 0.30
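The "Calculate Output Transition" step can be illustrated by propagating per-net transition probabilities through a gate. The sketch below assumes that Pxy means the probability that a net goes from value x to value y in one cycle, that nets A and B drive the NAND2 whose output is net Y, and that the inputs are statistically independent. These are readings of the figure, not confirmed details of DSENT's implementation, and the function names are illustrative.

```python
from itertools import product


def output_transitions(gate, input_transition_probs):
    """Output transition probabilities of a gate with independent inputs.

    input_transition_probs: one dict per input, mapping (prev, next) -> probability.
    Returns a dict mapping (prev_out, next_out) -> probability.
    """
    out = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.0}
    for combo in product(*(p.items() for p in input_transition_probs)):
        prob = 1.0
        prev_bits, next_bits = [], []
        for (prev, nxt), p in combo:
            prob *= p
            prev_bits.append(prev)
            next_bits.append(nxt)
        out[(gate(*prev_bits), gate(*next_bits))] += prob
    return out


def nand2(a, b):
    return 1 - (a & b)


def expected_switching_energy_fj(p01, cap_ff, vdd_v):
    """Expected supply energy per cycle: one C*VDD^2 per 0->1 transition (fF * V^2 = fJ)."""
    return p01 * cap_ff * vdd_v ** 2


# Transition probabilities for nets A and B as listed in the figure.
net_a = {(0, 0): 0.30, (0, 1): 0.20, (1, 0): 0.20, (1, 1): 0.30}
net_b = {(0, 0): 0.00, (0, 1): 0.50, (1, 0): 0.50, (1, 1): 0.00}

net_y = output_transitions(nand2, [net_a, net_b])
print({k: round(v, 2) for k, v in net_y.items()})

# The 2 fF load is a hypothetical value used only to show the energy step.
print(expected_switching_energy_fj(net_y[(0, 1)], cap_ff=2.0, vdd_v=1.0), "fJ per cycle")
```

Under these assumptions the computed values for net Y match the ones listed in the figure (P00 = 0.00, P01 = 0.25, P10 = 0.25, P11 = 0.50).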
roughly even with electronics
[Figure legend: Electrical 45nm, Photonic 45nm, Electrical 11nm, Photonic 11nm]
Model                          Reference Point       DSENT    Orion2.0+   Orion2.0 Mod*
Ring Modulator Driver (fJ/b)   50 (11 Gb/s)          60.87    N/A         N/A
Receiver (fJ/b)                52 (3.5 Gb/s 45nm)    43.02    N/A         N/A
Router (6x6)
  Buffer (mW)                  SPICE – 6.93          7.55     34.4        3.57
  Xbar (mW)                    SPICE – 2.14          2.06     14.5        1.26
  Control (mW)                 SPICE – 0.75          0.83     1.39        0.31
  Clock (mW)                   SPICE – 0.74          0.63     28.8        0.36
  Total (mW)                   SPICE – 10.6          11.2     91.3        5.56
  Area (mm2)                   Encounter – 0.070     0.062    0.129       0.067

+ Default Orion 2.0 technology parameters for 45nm
* Correctly specified 45nm tech params
Technology                      Value
Supply Voltage                  1.0 V
Gate Capacitance / width        1.0 fF/um
Effective on current / width    650 uA/um
Off-current / width             100 nA/um
DIBL                            150 mV/V
Sub-threshold Swing             100 mV/dec
Photodetector Responsivity      1.0 mA/mW
…                               …

Primitive Cells: NAND2, INVERTER, BUFFER, …, Receiver, Modulator, …
technology parameters
primitives for modeling
through ITRS projections and other roadmaps
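To show how a handful of technology parameters can feed cell-level estimates, the sketch below derives a few first-order quantities from the example values in the table. The formulas are textbook approximations (leakage as off-current times supply voltage, on-resistance as supply voltage over on-current, photocurrent as responsivity times optical power). They are assumptions for illustration, not DSENT's actual characterization equations, and the names are hypothetical.

```python
# Example values taken from the technology table above.
TECH = {
    "supply_voltage_v": 1.0,
    "gate_cap_ff_per_um": 1.0,
    "on_current_ua_per_um": 650.0,
    "off_current_na_per_um": 100.0,
    "responsivity_ma_per_mw": 1.0,
}


def leakage_power_nw(width_um, tech=TECH):
    """Static power of an off transistor: Ioff * W * VDD (nA * V = nW)."""
    return tech["off_current_na_per_um"] * width_um * tech["supply_voltage_v"]


def on_resistance_kohm(width_um, tech=TECH):
    """Crude effective on-resistance: VDD / Ion (V / uA = MOhm, so scale by 1e3)."""
    on_current_ua = tech["on_current_ua_per_um"] * width_um
    return tech["supply_voltage_v"] / on_current_ua * 1e3


def gate_cap_ff(width_um, tech=TECH):
    """Gate capacitance of a transistor of the given width."""
    return tech["gate_cap_ff_per_um"] * width_um


def intrinsic_rc_delay_ps(width_um, tech=TECH):
    """R * C of a transistor driving an identical gate (kOhm * fF = ps)."""
    return on_resistance_kohm(width_um, tech) * gate_cap_ff(width_um, tech)


def photocurrent_ua(optical_power_uw, tech=TECH):
    """Photodetector current: responsivity * power (mA/mW is the same as uA/uW)."""
    return tech["responsivity_ma_per_mw"] * optical_power_uw


print(f"leakage, 1 um:      {leakage_power_nw(1.0):.0f} nW")
print(f"Ron, 1 um:          {on_resistance_kohm(1.0):.2f} kOhm")
print(f"intrinsic RC:       {intrinsic_rc_delay_ps(1.0):.2f} ps")
print(f"photocurrent, 10uW: {photocurrent_ua(10.0):.0f} uA")
```

The intrinsic RC product is independent of width, since resistance falls and capacitance rises by the same factor as the device is sized up.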
terms other models and primitives
Example Models: Mesh Network, Clos Network, Routers, Optical links (SWSR, SWMR), Serializer/Deserializer, …
implementation, design can be optimized and evaluated
[Georgas, CICC 2011]
[S. Li, ICCAD 2011]