Hermes-A: An Asynchronous NoC Router with Distributed Routing


  1. Hermes-A: An Asynchronous NoC Router with Distributed Routing
     Julian Pontes, Matheus Moreira, Fernando Moraes, Ney Calazans

  2. Outline
     • Introduction
     • Related Work
     • Architecture
       – Input Port
         • Path Calculation
       – Output Port
         • Output Control
     • Results
     • Future Work
     • Conclusions

  3. Introduction
     • Dual Rail Encoding
     • Four Phase Protocol
     • DIMS Logic

     A B | S
     0 0 | SF
     0 1 | ST
     1 0 | ST
     1 1 | SF

     [Figure: pipeline of Reg and Logic blocks with CD elements]
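The three ingredients on this slide can be sketched together in a small behavioral model. The sketch below assumes that SF and ST in the truth table are the false and true rails of the dual-rail output S (so the table is an XOR), that the all-zero pair is the spacer of the four-phase protocol, and that each DIMS minterm is a two-input C-element. The class and function names are hypothetical and are not taken from the Hermes-A design.

```python
from typing import Optional, Tuple

# A dual-rail bit is a pair of wires (false_rail, true_rail).
# (0, 0) is the spacer (no data); (1, 0) encodes 0; (0, 1) encodes 1.
DualRail = Tuple[int, int]
SPACER: DualRail = (0, 0)


def encode(bit: Optional[int]) -> DualRail:
    """Encode 0, 1, or None (spacer) as a dual-rail pair."""
    if bit is None:
        return SPACER
    return (0, 1) if bit else (1, 0)


class CElement:
    """Muller C-element: output copies the inputs when they agree, else holds."""

    def __init__(self) -> None:
        self.out = 0

    def update(self, a: int, b: int) -> int:
        if a == b:
            self.out = a
        return self.out


class DimsXor:
    """DIMS XOR: one C-element per input minterm, OR gates merge them per rail."""

    def __init__(self) -> None:
        # Minterm order: (A=0,B=0), (A=0,B=1), (A=1,B=0), (A=1,B=1).
        self.minterms = [CElement() for _ in range(4)]

    def update(self, a: DualRail, b: DualRail) -> DualRail:
        a_f, a_t = a
        b_f, b_t = b
        m00 = self.minterms[0].update(a_f, b_f)
        m01 = self.minterms[1].update(a_f, b_t)
        m10 = self.minterms[2].update(a_t, b_f)
        m11 = self.minterms[3].update(a_t, b_t)
        s_f = m00 | m11  # S false rail fires when A == B
        s_t = m01 | m10  # S true rail fires when A != B
        return (s_f, s_t)


gate = DimsXor()
for a in (0, 1):
    for b in (0, 1):
        data = gate.update(encode(a), encode(b))   # evaluate phase
        spacer = gate.update(SPACER, SPACER)       # return-to-zero phase
        print(a, b, data, spacer)
```

In this model the output becomes valid only after both inputs are valid, and it returns to the spacer only after both inputs have returned to the spacer, which is the behavior a completion detector relies on.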

  4. Introduction
     • Asynchronous Circuits
       – Less Simultaneous Switching ☺
         • Less EMI
         • Less IR Drop (Slight PowerPlan)
         • Less Peak Power (No Decap Cells)
         • Less Crosstalk Problems in Data Links ??? (DI Codes - Four Phase) (Partial Shielding in data links)
       – Average Case Delay ☺
       – Reduced Dynamic Power (5 times, 65nm comparison) ☺

  5. Introduction
     • Asynchronous Circuits
       – Area and Leakage Overhead (~3-5 times more, 65nm) ☹
       – Lack of CAD Tools and Standards ☹
         • Synthesis Tools
           – Traditional Tools (~45 thousand loop breakers in a 3x3 NoC)
           – Asynchronous Synthesis Tools (Balsa, Teak)
             » Lack of traditional optimizations (Pin Swapping, Reordering, Retime, …)
         • STA
           – Liberty File Support (is_async_reg)
           – New Set of Constraints (Cycle Time Definition)

  6. Introduction
     • Networks on Chip
       – Offer large communication parallelism
       – Can provide alternate paths
     • Asynchronous Network on Chip
       – Enable the Design of Complex GALS Systems on Chip

  7. Objective
     • Design an asynchronous router architecture capable of supporting the design of GALS systems
       – High Throughput
       – Low Power
       – Permit the implementation of fine-grained power control
         » MVS
         » Power Shut-Off

  8. Related Work

     | NoC | Topology | Routing / Flow Control | Network Interface | Asynchronous Style | Links and encoding | Implementation |
     |---|---|---|---|---|---|---|
     | As. QNoC | 2D Mesh (Irreg/Reg) | Source / wormhole / credit-based with preemption, 8VCs | N.A | 4-phase bundled-data | 10-bit flits | 180 nm, 200Mflits/s, ASIC |
     | RasP | Framework / point-to-point (Irreg/Reg) | Source / bit serial | Ad hoc | QDI | Point-to-point pipelined serial links | 180nm, 700Mb/s |
     | ASPIN | 2D Mesh (Reg) | Distributed XY / wormhole / EOP | A2S, S2A FIFOs | Bundled-data/QDI | Dual-rail, 4-ph., 34-bit flits | 90nm, 714Mflits/s |
     | ANoC | 2D Mesh | Source / Adaptive, 2VCs | - | QDI | One of Four | 130nm / 5Gb/s (router) |
     | Hermes-A | 2D Mesh | Distributed XY / wormhole / BOP-EOP | Dual-Rail SCAFFI | QDI | Dual-Rail | 180nm, 727Mbits/s (454Mflits/s, 3.6Gb/s per router), ASIC |

  11. Related Work (same table as slide 8; this build step adds "Clock Stretching" to the Hermes-A row, apparently under Network Interface)

  15. Router Architecture
      • Distributed Routing
      • Independent Ports
      • Dual Rail Encoding
      • Weak Conditioned Half Buffer
      • DIMS Logic

  16. Input Port
      • Packet
        – First Flit contains the address
        – BOP and EOP delimiters
        – Three main paths
          • First Flit (1), Last Flit (3) and other Flits (2)
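The split into three paths can be illustrated with a small software sketch. The slides do not give the bit-level encoding of the delimiters, so the mapping below is an assumption: BOP marks the first flit, EOP marks the last flit, and a flit with neither is an intermediate flit. The names `Flit`, `Path`, and `select_path` are hypothetical.

```python
from enum import Enum
from typing import NamedTuple


class Path(Enum):
    FIRST = 1  # path (1): first flit, carries the address
    BODY = 2   # path (2): all other flits
    LAST = 3   # path (3): last flit


class Flit(NamedTuple):
    bop: int      # begin-of-packet delimiter (assumed active high)
    eop: int      # end-of-packet delimiter (assumed active high)
    payload: int


def select_path(flit: Flit) -> Path:
    """Pick one of the three input-port paths from the delimiters."""
    if flit.bop and flit.eop:
        # Assumption: BOP = EOP = 1 is not a regular flit. The S-Element
        # slide uses this combination to close the communication.
        raise ValueError("BOP = EOP = 1 is not handled as a regular flit here")
    if flit.bop:
        return Path.FIRST
    if flit.eop:
        return Path.LAST
    return Path.BODY


packet = [Flit(1, 0, 0x21), Flit(0, 0, 0xAA), Flit(0, 0, 0xBB), Flit(0, 1, 0xCC)]
print([select_path(f).name for f in packet])
```

In the router itself this selection is done by asynchronous logic on the delimiter rails, not by software; the sketch only shows the decision being made.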

  17. Input Port [figure: input port block diagram; visible label: 10]

  18. Path Calculation
      • All logic employs Delay Insensitive Minterm Synthesis
      • First Flit contains the XY address
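As a rough illustration of the decision that the path calculation makes, the sketch below implements generic XY routing in software: move along X until the column matches, then along Y, then deliver locally. The port names, the coordinate convention (NORTH increases Y), and the tuple representation of the address are assumptions made for the example; the router implements this decision in DIMS logic on the first flit, not in code.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]  # (x, y) position of a router in the 2D mesh


def xy_route(local: Coord, target: Coord) -> str:
    """Return the output port chosen by XY routing at router `local`."""
    lx, ly = local
    tx, ty = target
    if tx > lx:
        return "EAST"
    if tx < lx:
        return "WEST"
    if ty > ly:
        return "NORTH"
    if ty < ly:
        return "SOUTH"
    return "LOCAL"


# Displacement of one hop through each port (assumed convention).
STEP: Dict[str, Coord] = {
    "EAST": (1, 0),
    "WEST": (-1, 0),
    "NORTH": (0, 1),
    "SOUTH": (0, -1),
}


def trace_path(source: Coord, target: Coord) -> List[str]:
    """List the port taken at every router from source to target."""
    hops: List[str] = []
    here = source
    while True:
        port = xy_route(here, target)
        hops.append(port)
        if port == "LOCAL":
            return hops
        dx, dy = STEP[port]
        here = (here[0] + dx, here[1] + dy)


print(trace_path((0, 0), (2, 1)))  # ['EAST', 'EAST', 'NORTH', 'LOCAL']
```

Because each router repeats the same comparison on the address in the first flit, no routing table or source route is needed, which is what makes the routing distributed.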

  19-31. Input Port [figure sequence: animation of the input port datapath; visible labels: 14, 10, 4, K; the final frame (slide 31) is labeled S-Control]

  32. S-Element - Enclosure
      • Starts with a handshake at the input port
      • Performs two handshakes
        – First to send the last flit
        – Second to close the communication section at the output port (EOP = BOP = 1)
      • Speed Independent Design
        – Circuit generated with Petrify

  33. S-Control [figure: S-Control handshake ports]
      • Input: LAST FLIT / Input Ack
      • Output Port A: LAST FLIT / Ack A
      • Output Port B: BOP = EOP = 1 / AckB
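The enclosure behavior described on slides 32 and 33 can be written out as an event trace. This is a sketch under stated assumptions: every channel uses a four-phase return-to-zero handshake, and the two output handshakes complete in order (A, then B) before the input is acknowledged. The channel names `in`, `outA`, and `outB` are hypothetical labels for the three ports in the figure; the actual speed-independent circuit was generated with Petrify and may order some transitions differently.

```python
from typing import List


def four_phase(channel: str) -> List[str]:
    """One return-to-zero handshake on a channel: req+, ack+, req-, ack-."""
    return [
        f"{channel}.req+",
        f"{channel}.ack+",
        f"{channel}.req-",
        f"{channel}.ack-",
    ]


def s_control_trace() -> List[str]:
    """Event order for closing a packet, enclosed in the input handshake."""
    events = ["in.req+"]           # last flit arrives at the input port
    events += four_phase("outA")   # first handshake: forward the last flit
    events += four_phase("outB")   # second handshake: send BOP = EOP = 1
    events += ["in.ack+", "in.req-", "in.ack-"]  # only now release the input
    return events


for event in s_control_trace():
    print(event)
```

The point of the enclosure is visible in the trace: the input acknowledge is withheld until both output handshakes have finished, so the input port cannot accept a new packet while the previous one is still being closed.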

  34. Output Port
      • Arbitration
      • Kill Section

  35. Output Port [figure]
