
The SiLago Method: Next Generation VLSI Architectures and Design Automation



  1. The SiLago Method: Next Generation VLSI Architectures and Design Automation. Ahmed Hemani, Dept of Electronics and Embedded Systems, School of ICT, KTH. Acknowledgement: Nasim Farahini, Muhammad Asad, Li Shuo, Hassan Sohofi, Muhammad Ali Shami, Adeel Tajammul, Omer Malik, Anders Lansner, Christer Svensson

  2. The Core Ideas behind the SiLago Method: 1. Higher abstraction of the physical design platform; 2. A structured grid based physical design scheme. Cost figures shown: engineering cost 45 MUSD and manufacturing cost 5 MUSD, against << 45 MUSD and < 5 MUSD for the SiLago method.

  3. The Large Engineering and Manufacturing Cost: leads to software-centric, accelerator-rich, platform-based design; causes loss in silicon and computational efficiencies; blocks application categories; stifles innovation.

  4. Generality vs. Customisation: Generality comes at a huge cost of silicon, computational and engineering efficiencies. Custom solutions are orders of magnitude more efficient. Design styles shown: accelerator-rich, software-centric, platform-based SOC design; hardware-centric custom design; the SiLago method.

  5. Energy Breakdown in a GPP: arithmetic 6%, clock & control 24%, instruction supply 42%, data supply 28%. Source: William J. Dally, James Balfour, David Black-Shaffer, James Chen, R. Curtis Harting, Vishal Parikh, Jongsoo Park, and David Sheffield, Stanford University, "Efficient Embedded Computing", IEEE Computer, July 2008.
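The breakdown above implies that only a small share of the energy in a general-purpose processor goes to useful arithmetic. A minimal sketch of that arithmetic follows, using only the four percentages quoted on the slide. The variable names and the "ideal gain" figure are illustrative assumptions: the gain is an idealised upper bound that assumes all non-arithmetic energy could be removed, which no real design achieves.

```python
# Energy breakdown of a general-purpose processor in percent,
# as quoted on the slide (Dally et al., IEEE Computer, July 2008).
energy_breakdown = {
    "arithmetic": 6,
    "clock_and_control": 24,
    "instruction_supply": 42,
    "data_supply": 28,
}

total = sum(energy_breakdown.values())
overhead = total - energy_breakdown["arithmetic"]

# Idealised bound (an assumption, not a figure from the slide):
# energy gain if every non-arithmetic contribution were eliminated.
ideal_gain = total / energy_breakdown["arithmetic"]

print(f"total    = {total}%")
print(f"overhead = {overhead}%")
print(f"ideal gain = {ideal_gain:.1f}x")
```

The four shares sum to 100%, so 94% of the energy is spent on supplying instructions and data and on clocking and control rather than on the arithmetic itself.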

  6. The Impact of Customization. [Figure: GFlops/W on a logarithmic axis (10^0 to 10^2) for matrix-matrix multiplication and FFT 2048, compared across FPGA (LX760), CPU (Core i7), GPU (GTX255), SiLago and ASIC.]

  7. A Brief History of VLSI Design Automation, to explain why the path of customization has been abandoned

  8. The Mead Conway Era. The Design Space, by abstraction level and synthesis step: system level (system synthesis); application level (application synthesis); algorithms (high-level synthesis); RTL / architecture (RTL/logic synthesis); gates (physical synthesis); physical. Manual: stick diagram, Mead Conway, silicon compiler. The # of solutions increases exponentially with the abstraction gap.

  9. The Mead Conway Era survived as long as the complexity was on the order of 10K gates

  10. The Standard Cell Era. The Design Space, by abstraction level and synthesis step: system level (system synthesis); application level (application synthesis); algorithms (high-level synthesis); RTL / architecture (RTL/logic synthesis); gates (physical synthesis); physical. Labels on the slide: Manual; Automated; Standard Cells (one time). The # of solutions increases exponentially with the abstraction gap.

  11. What Standard Cells Did. Abstraction: Boolean-level abstraction; hides circuit and physical design details; enabled logic synthesis. Physical Design Discipline: standard pitch and row-based layout; enabled physical design automation. Improves efficiency of: 1. Synthesis from RTL to GDSII; 2. Verification at RTL; 3. System design.

  12. Standard cells as building blocks are not scalable for 10-100 million gate designs. Scale shown: standard cell; ~10-100 K gates; ~10-100 million gates.

  13. An Analogy

  14. So what happens when you try to build skyscrapers with bricks?

  15. Commercial HLS achieves local optimisation. Global area, energy and latency constraints are specified for the application. Global constraints are manually partitioned to local partitions. The synthesis tool does the local optimisation. Local optimisation: min (L); L is the # of algorithms in the application. [Diagram: application algorithms, with blocks ADC, FDEC, DEC, ↓2, RRC, Filter Comp, EQ, CR, SLICER, Clock Adaptation, EQ Adaptation, Carrier Adaptation, System Control.]

  16. Commercial HLS: no synthesis of inter-algorithm interfaces in an application. The user has to manually refine the interface between the synthesized algorithms. This manual refinement induces a functional verification step, because the correct-by-construction contract assured by machine translation is now violated. [Diagram: blocks ADC, FDEC, DEC, ↓2, RRC, Filter Comp, EQ, CR, SLICER, Clock Adaptation, EQ Adaptation, Carrier Adaptation, System Control.]

  17. The 45 MUSD State of the Art SOC Design Flow. System: multiple applications. System Architecting: 1. HW/SW partitioning; 2. Interface design; 3. Memory & interconnect hierarchy; 4. I/O design. Software Design. Functional Verification. Constraints Verification: timing/energy/power/area. Architecture definition in terms of pre-designed IPs. Stitch. Architecture: buy and assemble. Logic: algorithm + RTL + Boolean. Automatic: high-level, RTL/logic & physical synthesis. Chip.

  18. Solution: The SiLago Method. SiLago = Silicon Large Grain Object. Inspired by Lego.

  19. We shifted to pre-fabricated wall segments

  20. The First Proposition: raise abstraction to the architectural level. SiLago blocks are 4-5 orders of magnitude larger than standard cells. SiLago blocks are NOT IPs, soft or hard. Standard cell: characterised Boolean operations. SiLago block (register files, DPUs, switch boxes, processors, SRAM banks etc.): characterised micro-architectural operations.

  21. Solutions to VLSI Design Complexity: 1. Abstraction; 2. Physical design discipline / regularity. The VLSI community has largely forgotten the second component.

  22. London Manhattan

  23. A grid-based structured layout scheme. [Figure: a traditional SOC floorplan next to a SiLago fabric based SOC, with blocks including Flash CTRL, DRAM CTRL, Streaming Storage, Protocol Processing, Ethernet, PLL/CGU, PMC, Data Storage, System Ctrl, Inner Modem, Outer Modem, Flexilators and Program Storage.] Physical design regularity is the sword that can slay the demons of VLSI design complexity.

  24. The SiLago Method. Ahmed Hemani, Nasim Farahini, Syed M.A.H. Jafri, Hassan Sohofi, Shuo Li and Kolin Paul, "The SiLago Solution: Architecture and Design Methods for a Heterogeneous Dark Silicon Aware Coarse Grain Reconfigurable Fabric", Chapter 3 in the book "The Dark Side of the Silicon", Springer, DOI 10.1007/978-3-319-31596-6

  25. The SiLago Concepts
      • A virtual grid
        - All SiLago design objects are aligned with grid lines.
        - The grid has no pre-determined size; it is as big as the synthesis tool or the designer decides.
      • Regions
        - A grid is divided into regions.
        - Each region is specialized in a type of functionality.
        - Some regions are infrastructural while others are functional.
        - Regions are separated by corridors to accommodate NOCs to connect the regions.
        - Each region has its own internal interconnect scheme.
      • SiLago blocks
        - Each region is occupied by SiLago blocks that are region specific.
        - These SiLago blocks occupy one or more contiguous grid cells.
        - SiLago blocks are hardened and characterized with post-layout data.
        - SiLago blocks absorb global nets, including power grids and the clock grid, and connect to the neighbouring SiLago blocks by abutment.
      [Figure: example fabric with Flash CTRL, DRAM CTRL, Protocol Processing, Streaming Storage, Ethernet, PLL/CGU, PMC, Data Storage, System Ctrl, Inner Modem, Outer Modem, Flexilators and Program Storage.]
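The grid, region and block rules on this slide can be sketched as a small data model. This is only an illustration under stated assumptions, not the SiLago tool chain: the class names `Rect` and `Region`, the `kind` tags and the example block names are all hypothetical. Coordinates are whole grid cells, so every object is aligned with grid lines by construction.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rect:
    """Rectangle measured in whole grid cells (hypothetical helper)."""
    col: int
    row: int
    width: int
    height: int

    def cells(self):
        """Return the set of (col, row) grid cells covered by this rectangle."""
        return {(c, r)
                for c in range(self.col, self.col + self.width)
                for r in range(self.row, self.row + self.height)}


@dataclass
class Region:
    """A region specialised in one kind of functionality (hypothetical model)."""
    name: str
    kind: str
    area: Rect
    blocks: dict = field(default_factory=dict)  # block name -> Rect

    def occupied(self):
        """Cells already taken by placed blocks."""
        taken = set()
        for rect in self.blocks.values():
            taken |= rect.cells()
        return taken

    def fits(self, block_kind, rect):
        """A block fits if it is region specific, lies inside the region,
        and does not overlap any block already placed."""
        return (block_kind == self.kind
                and rect.cells() <= self.area.cells()
                and not (rect.cells() & self.occupied()))

    def place(self, block_name, block_kind, rect):
        if not self.fits(block_kind, rect):
            raise ValueError(f"cannot place {block_name} in region {self.name}")
        self.blocks[block_name] = rect


# Example: a 4x2-cell region holding two region-specific blocks.
modem = Region("inner_modem", "dsp", Rect(col=0, row=0, width=4, height=2))
modem.place("dpu0", "dsp", Rect(0, 0, 1, 1))
modem.place("regfile0", "dsp", Rect(1, 0, 2, 1))
print(sorted(modem.blocks))
```

Modelling a block as a rectangle of cells keeps the "one or more contiguous grid cells" rule implicit; corridors and NOCs between regions are left out of this sketch.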

  26. The SiLago Concepts. This is a SiLago design instance. It is automatically generated by the SiLago synthesis tool chain. The number, size and position of regions vary from one instance to another. [Figure: regions connected by NOCs, including Flash CTRL, DRAM CTRL, Streaming Storage, Protocol Processing, Ethernet, PLL/CGU, PMC, Data Storage, System Ctrl, Inner Modem, Outer Modem, Flexilators and Program Storage.]

  27. SiLago interconnects are also hardened. The SiLago interconnects are not just logical interconnect, i.e., soft. They are physical and electrical objects, hardened in a templatized or parametric manner.

  28. SiLago fabrics are composed by abutment. 1. SiLago blocks absorb: a) clock tree & power ring; b) regional and global interconnect; c) pins on the periphery at the right positions. 2. Fabric composition by abutment (Block 1, Block 2).
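Composition by abutment works only if the pins on the facing edges of neighbouring blocks sit at matching positions. A minimal sketch of that check is shown below. The representation (a dictionary of edge offsets to net names) and all pin and net names are hypothetical, chosen only to illustrate the idea.

```python
# Each block lists the pins on its west and east edges as
# {offset_along_edge: net_name}. All names and offsets are hypothetical.
block_1 = {
    "west": {0: "clk", 1: "vdd", 2: "data_in"},
    "east": {0: "clk", 1: "vdd", 2: "data_mid"},
}
block_2 = {
    "west": {0: "clk", 1: "vdd", 2: "data_mid"},
    "east": {0: "clk", 1: "vdd", 2: "data_out"},
}


def can_abut(left, right):
    """True if every east pin of `left` meets a west pin of `right`
    at the same offset and on the same net, with no pin left dangling."""
    return left["east"] == right["west"]


def compose_row(blocks):
    """Check that a left-to-right row of blocks composes purely by abutment."""
    return all(can_abut(a, b) for a, b in zip(blocks, blocks[1:]))


print(compose_row([block_1, block_2]))  # pins line up
print(compose_row([block_2, block_1]))  # data_out does not meet data_in
```

Because the pin positions are fixed when a block is hardened, a check of this kind needs no routing step between blocks.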

  29. SiLago Platform Cost Metrics are Space Invariant. The SiLago physical design discipline ensures that all wires are of exactly the same length. Contrasting observation on the slide: 1. The 16 global wires in each cell vary by about 70% from cell to cell; 2. This variation is proof that even with a hierarchical design, the cost metrics would vary. [Figure labels: SiLago blocks, power rings, power stripes.]

  30. Clocking & STA
      • Clock: three levels of clocking (local, regional and global)
        - Local: each SiLago block is hardened to be timing clean and synthesized with a certain margin for skew and latency. The local clock is synthesized using a standard EDA flow.
        - Regional: each region is a synchronous region. The regional clock is manually synthesized with sufficient buffers to maintain good edges, and with delays balanced to keep the skew and latency within the margins of the local clock.
        - Global: regions communicate with each other on a latency-insensitive basis using a previously developed GRLS scheme. For more details see http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6507330&tag=1
      • STA (Static Timing Analysis)
        - ILMs are created for each SiLago block.
        - Once the regional clocks are synthesized and inserted back into the database, a hierarchical STA script is run to ensure that the entire design is timing clean.
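The regional clocking rule above amounts to a budget check: the regional clock must keep skew and latency within the margins each block was hardened with. A minimal sketch of such a check follows. It is a simplification for illustration, not a real STA flow; the function name, the per-block insertion delays and the margin values are all hypothetical.

```python
def regional_clock_ok(insertion_delays_ps, skew_margin_ps, latency_margin_ps):
    """Check a regional clock against the margins the blocks were hardened with.

    insertion_delays_ps maps each SiLago block in the region to the delay,
    in picoseconds, from the regional clock root to that block's clock pin.
    """
    delays = list(insertion_delays_ps.values())
    skew = max(delays) - min(delays)  # worst-case arrival difference
    latency = max(delays)             # worst-case insertion delay
    return skew <= skew_margin_ps and latency <= latency_margin_ps


# Hypothetical numbers, chosen only to exercise the check.
balanced = {"dpu0": 210, "dpu1": 215, "regfile0": 205}
unbalanced = {"dpu0": 210, "dpu1": 290, "regfile0": 205}

print(regional_clock_ok(balanced, skew_margin_ps=20, latency_margin_ps=300))
print(regional_clock_ok(unbalanced, skew_margin_ps=20, latency_margin_ps=300))
```

In the balanced case the skew is 10 ps and the check passes; in the unbalanced case the skew of 85 ps exceeds the 20 ps margin, so the regional clock would need rebalancing.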

  31. Characterization
      1. Each SiLago block is hardened.
      2. Sufficiently exhaustive simulation is performed for molecules of SiLago blocks at gate level, with post-layout data back-annotated.
         - The SiLago blocks cannot be too large and complex.
         - The same pipeline cannot be used for multiple operations.
      3. Concurrent operations within a SiLago block and in neighbouring SiLago blocks couple weakly, and we model this coupling.
      4. The NOCs are parametrically hardened.
