Comparison of Channel Protocols for Fast, Low Energy Communication over Transmission Lines
Shomit Das*, Kenneth S. Stevens The University of Utah
*now with AMD Research
Exascale Challenges
Cost of data movement relative to cost of a flop*
Data movement energy component**
** G. Kestor et al., Quantifying the energy cost of data movement in scientific applications, IISWC 2013
* Shekhar Borkar, Exascale Computing - Fact or Fiction?, IPDPS 2013
Exascale System Architecture Examples (proposed): AMD, NVIDIA
On-chip Transmission Lines
Repeated RC wire
SerDes-based TL interconnect
Transmission lines require thick top-level metals
They require carefully designed signal and return paths
Signal integrity depends on interconnect aspect ratio, among many other factors
Bandwidth per unit area suffers as a result
Analog signaling techniques such as differential signaling and current-mode signaling are applied
Higher frequencies can be used
SerDes means more timing and energy considerations
Transmission Line Interconnect Design Environment
* H.G. Rhew et al., A 22 Gb/s, 10 mm on-chip serial link over lossy Transmission Line, ESSCIRC 2012
7mm TL
Dual Rail (DualRail)
Bundled Data 4-phase (BD4)
Bundled Data 2-phase (BD2)
Source Asynchronous Signaling (SAS)
Clocked latched (Clock_l)
Clocked flopped (Clock_f)
Source Synchronous (SrcSync)
Source Asynchronous Signaling (uncoupling req and ack)
Cycle Time expressions
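The expressions from this slide did not survive extraction. As a rough illustration only, and not the authors' expressions, the sketch below uses a first-order model of the two bundled-data handshakes: a 4-phase handshake needs four request/acknowledge edges per transaction and a 2-phase handshake needs two, and each edge crosses the link once. The function name, the per-edge logic delay, and the example delays are assumptions made for this sketch.

```python
# First-order handshake cycle-time model (illustrative assumption,
# not the expressions from the original slide).

# Request/acknowledge edges per transaction; each edge crosses the link once.
#   BD4: req rises, ack rises, req falls, ack falls -> 4 traversals
#   BD2: req toggles, ack toggles                  -> 2 traversals
WIRE_TRAVERSALS = {"BD4": 4, "BD2": 2}


def handshake_cycle_time(protocol, wire_delay_ps, logic_delay_ps):
    """Return an estimated cycle time in picoseconds.

    Assumes every handshake edge pays one wire traversal plus a fixed
    controller logic delay (a hypothetical lumped parameter).
    """
    traversals = WIRE_TRAVERSALS[protocol]
    return traversals * (wire_delay_ps + logic_delay_ps)


if __name__ == "__main__":
    # Example delays are placeholders, not measured values.
    for name in ("BD4", "BD2"):
        print(name, handshake_cycle_time(name, wire_delay_ps=100, logic_delay_ps=20))
```

Under this model the 2-phase protocol halves the wire-dominated part of the cycle time, which is why the number of link traversals per transaction matters most on long links.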
Latency expressions
Energy per transaction expressions
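The energy expressions from this slide are also missing. As a hedged sketch, and not the authors' model, the code below counts wire transitions per transaction for the two bundled-data protocols and charges each transition the usual average switching energy of a capacitive wire, (1/2)CV^2. This applies to the RC-wire case only; transmission-line drivers need a different model. The function name, the activity factor, and the example values are assumptions made for this sketch.

```python
# Transition-count energy estimate for a capacitive (RC) link.
# Illustrative assumption, not the expressions from the original slide.

# Control-wire transitions (req + ack) per transaction.
CONTROL_TRANSITIONS = {"BD4": 4, "BD2": 2}


def energy_per_transaction_fj(protocol, data_bits, activity, wire_cap_ff, vdd):
    """Estimate energy per transaction in femtojoules.

    data_bits   : number of bundled data wires
    activity    : assumed fraction of data wires that toggle per transaction
    wire_cap_ff : capacitance of one wire in femtofarads (assumed equal for all wires)
    vdd         : supply voltage in volts
    """
    transitions = activity * data_bits + CONTROL_TRANSITIONS[protocol]
    # Average energy per transition on a capacitive wire is (1/2) * C * V^2.
    return transitions * 0.5 * wire_cap_ff * vdd ** 2


if __name__ == "__main__":
    # Example values are placeholders, not measured data.
    for name in ("BD4", "BD2"):
        print(name, energy_per_transaction_fj(name, 8, 0.5, 100.0, 1.0))
```

The two protocols differ here only in control-wire activity, so the gap between them narrows as the data bundle gets wider.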
Cycle Time Comparison
[Figure: bar chart over DualRail, BD4, BD2, SAS, Clock_l, Clock_f, SrcSync]
[Figure: RC vs TL, same protocols]
Latency Comparison
[Figure: bar chart over DualRail, BD4, BD2, SAS, Clock_l, Clock_f, SrcSync]
[Figure: RC vs TL, same protocols]
Energy Comparison
[Figure: bar chart over DualRail, BD4, BD2, SAS, Clock_l, Clock_f, SrcSync]
[Figure: RC vs TL, same protocols]
Percentage difference in Cycle Time
[Figure: bar chart over DualRail, BD4, BD2, SAS, Clock_l, Clock_f, SrcSync; series TL and RC]
characteristics
discontinuity-free requirement of TLs
energy overhead of clock distribution
Transmission Lines
metrics
throughput and wire latency
Effect of link length (7 mm vs 3 mm)