
Physical-Aware System-Level Design for Tiled Hierarchical Chip Multiprocessors



  1. Physical-Aware System-Level Design for Tiled Hierarchical Chip Multiprocessors
     Jordi Cortadella, Javier de San Pedro, Nikita Nikitin and Jordi Petit
     Universitat Politècnica de Catalunya (Barcelona)
     Project funded by Intel Corp.

  2. Designing a Chip Multiprocessor
     [Figure: workloads (DSP, graphics, data mining, bioinformatics, …), a CMP, and off-chip memory]
     • How many cores?
     • How much L2/L3 on-chip cache?
     • Interconnect: mesh/ring/bus?
     • How many memory controllers?
     ISPD 2013 Tiled CMPs

  3. What is architectural exploration?
     First configuration (throughput = 85.71 IPC):
     – 6x4 mesh, 24 clusters
     – total 144 cores
     – 6 cores/cluster
     – 1 C1, 128K L1, 256K L2
     – 5 C2, 64K L1, 96K L2
     – 146 Mb total shared L3
     Second configuration (throughput = 103.26 IPC):
     – 5x5 mesh, 25 clusters
     – total 100 cores
     – 4 cores/cluster
     – 3 C1, 128K L1, 1M L2
     – 1 C2, 128K L1, 1M L2
     – 130 Mb total shared L3
     [Figure: cluster diagrams for both configurations; labels C1, C2, L2, L3, Bus, R, NI, MC]

  4. Library of models
     CMP configuration (parameter: value):
     – Chip area: 350 mm²
     – Mesh dimensions: 2x2 to 16x16
     – Mem. Cntrl. latency: 200 cycles
     – Interconnect: bus, uni-/bi-ring
     – Link width: 256–1024 bits
     – Workload MPI: 0.5
     – Workload MLP: 1.25
     Cache models (cache size: area, latency):
     – 64Kb: 0.063 mm², 2 cycles
     – 128Kb: 0.125 mm², 3 cycles
     – 256Kb: 0.25 mm², 4 cycles
     – …
     – 8Mb: 8.0 mm², 9 cycles
     [Figure: core library, IPC versus area (mm²) for C1 (IO), C2 (OoO) and C3 (OoO)]
     [Figure: miss rates for SPEC CPU2006, miss ratio versus cache size (Mb)]

  5. Physical planning for tiled CMPs
     [Figure labels: N, W, E, S]

  6. Outline
     • Architectural exploration
       – The cost of exploration
       – Exploring with metaheuristics
       – Analytical models
     • Physical planning for tiled CMPs
     • Current work: regular floorplanning

  7. Exploration engines
     Design space: 10^9 configurations
     [Figure: bar chart of exploration runtime (sec, log scale) per engine]
     – Simulation (full system): 300 centuries
     – Simulation (probabilistic): 300 years
     – Analytical (exhaustive): 100 days
     – Analytical (metaheuristic): 100 seconds
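
The runtimes on this slide can be turned into a rough per-configuration cost. The sketch below is only back-of-the-envelope arithmetic: it assumes the four durations map to the four engines in the order they are listed, that the first three engines visit all 10^9 configurations, and that a year has 365 days. The names `total_runtime_s` and `seconds_per_config` are illustrative, not from the talk.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY  # assumption: 365-day year
DESIGN_SPACE = 10**9                      # configurations, from the slide

# Total exploration time per engine, converted to seconds.
total_runtime_s = {
    "simulation (full system)": 300 * 100 * SECONDS_PER_YEAR,  # 300 centuries
    "simulation (probabilistic)": 300 * SECONDS_PER_YEAR,      # 300 years
    "analytical (exhaustive)": 100 * SECONDS_PER_DAY,          # 100 days
}

# Implied cost of evaluating a single configuration.
seconds_per_config = {
    engine: total / DESIGN_SPACE for engine, total in total_runtime_s.items()
}

# If a metaheuristic pays the same analytical cost per evaluation (assumption),
# this is how many configurations it can afford to evaluate in 100 seconds.
evaluations_in_100_s = 100 / seconds_per_config["analytical (exhaustive)"]

for engine, secs in seconds_per_config.items():
    print(f"{engine}: {secs:.3g} s per configuration")
print(f"metaheuristic budget: about {evaluations_in_100_s:.0f} evaluations")
```

Under these assumptions a metaheuristic samples on the order of ten thousand points out of a billion, which is why the quality of the search direction matters.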

  8. Scalable exploration
     [Figure: architectural configurations → analytical modeling → promising configurations → simulation]

  9. Automated exploration
     Inputs to the exploration tool:
     • Physical info: cores, caches, interconnects
     • Models (performance/power): cores, on-chip caches, off-chip memories, interconnect fabrics, cache protocol
     • Workloads
     • Constraints: area, throughput, power
     Output, the architectural configuration:
     • Number of cores
     • Cluster size
     • L2/L3 size
     • Intra-cluster interconnect
     • Inter-cluster interconnect
     • Memory controllers

  10. Exploration engine: metaheuristics
     • Explore huge design spaces efficiently
     • Our proposal:
       – Simulated Annealing (Kirkpatrick et al., 1983)
       – Extremal Optimization (Boettcher et al., 1999)
     [Figure labels: Models, Constraints, Exploration tool, Partial generation of configurations, Analytical modeling, search direction, Best configuration, Simulation]
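
A minimal sketch of a simulated-annealing loop of the kind named on this slide. It is not the authors' tool: the geometric cooling schedule, the parameter defaults, and the toy `score`/`feasible` functions (a made-up throughput proxy and area budget over a `(cores, cache_units)` pair) are all assumptions chosen only to make the example runnable.

```python
import math
import random


def simulated_annealing(initial, neighbor, score, feasible,
                        steps=2000, t_start=1.0, t_end=1e-3, seed=0):
    """Maximize score() over feasible states; returns (best_state, best_score)."""
    rng = random.Random(seed)
    current = best = initial
    current_score = best_score = score(initial)
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        candidate = neighbor(current, rng)
        if not feasible(candidate):
            continue  # constraints prune the move
        cand_score = score(candidate)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if cand_score > best_score:
                best, best_score = candidate, cand_score
    return best, best_score


# Toy problem (hypothetical numbers): state = (cores, cache_units).
def toy_score(state):
    cores, cache = state
    return cores * cache / (cache + 4)  # diminishing returns from cache


def toy_feasible(state):
    cores, cache = state
    return cores >= 1 and cache >= 1 and cores * 1.0 + cache * 0.5 <= 20.0


def toy_neighbor(state, rng):
    cores, cache = state
    if rng.random() < 0.5:
        return (cores + rng.choice((-1, 1)), cache)
    return (cores, cache + rng.choice((-1, 1)))


best_state, best_value = simulated_annealing((1, 1), toy_neighbor, toy_score, toy_feasible)
print(best_state, round(best_value, 3))
```

In the flow on the slide, `score` would be the analytical model and `feasible` the area, power and throughput constraints; only the best configurations would then go to simulation.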

  11. Generation of configurations
     • Generate neighbors by applying transformations
       – Increase/Decrease
         • mesh dimensions
         • core count per cluster
         • L1, L2 size
       – Change interconnect type (bus/uni-ring/bi-ring)
       – Complex updates (increase mesh/decrease core count)
     • Example: Increase_X(mesh 4x4) => mesh 5x4
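
The transformations listed above can be sketched as a neighbor generator. This is an illustrative reading, not the authors' code: the `Config` fields, the doubling/halving step for the L2 size, and the single complex move are assumptions; the mesh bounds (2x2 to 16x16) and cache range (64Kb to 8Mb) are taken from the library-of-models slide.

```python
from dataclasses import dataclass, replace

INTERCONNECTS = ("bus", "uni-ring", "bi-ring")
MIN_MESH, MAX_MESH = 2, 16     # mesh dimensions 2x2 to 16x16
MIN_L2_KB, MAX_L2_KB = 64, 8192  # 64Kb to 8Mb


@dataclass(frozen=True)
class Config:
    """Hypothetical, simplified CMP configuration."""
    mesh_x: int
    mesh_y: int
    cores_per_cluster: int
    l2_kb: int
    interconnect: str


def increase_x(cfg):
    """Increase_X from the slide: mesh 4x4 => mesh 5x4."""
    return replace(cfg, mesh_x=cfg.mesh_x + 1)


def neighbors(cfg):
    """All configurations reachable from cfg by one transformation."""
    result = []
    # Increase/decrease mesh dimensions.
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        x, y = cfg.mesh_x + dx, cfg.mesh_y + dy
        if MIN_MESH <= x <= MAX_MESH and MIN_MESH <= y <= MAX_MESH:
            result.append(replace(cfg, mesh_x=x, mesh_y=y))
    # Increase/decrease core count per cluster.
    for delta in (1, -1):
        if cfg.cores_per_cluster + delta >= 1:
            result.append(replace(cfg, cores_per_cluster=cfg.cores_per_cluster + delta))
    # Increase/decrease L2 size (assumed to move in powers of two).
    for size in (cfg.l2_kb * 2, cfg.l2_kb // 2):
        if MIN_L2_KB <= size <= MAX_L2_KB:
            result.append(replace(cfg, l2_kb=size))
    # Change interconnect type.
    for kind in INTERCONNECTS:
        if kind != cfg.interconnect:
            result.append(replace(cfg, interconnect=kind))
    # Complex update: increase mesh and decrease core count in one move.
    if cfg.mesh_x < MAX_MESH and cfg.cores_per_cluster > 1:
        result.append(replace(cfg, mesh_x=cfg.mesh_x + 1,
                              cores_per_cluster=cfg.cores_per_cluster - 1))
    return result


start = Config(mesh_x=4, mesh_y=4, cores_per_cluster=4, l2_kb=256, interconnect="bus")
for candidate in neighbors(start):
    print(candidate)
```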

  12. Analytical performance model for CMPs
     Nonlinear analytical models
     • Core
       – Throughput model
     • Memory subsystem
       – Traffic model
       – Latency model (hop-count latency)
       – Queueing model: … Umit Ogras et al., IEEE TCAD, Dec 2010
     • λ(L): characteristic of the cores/workload
     • L(λ): characteristic of the IC
     [Figure: L, average latency (cycles), versus λ, average traffic rate (flits/cycle), showing the curves λ(L) and L(λ); per-core labels λ_i and L_i]
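
The two curves on this slide, λ(L) for the cores and L(λ) for the interconnect, depend on each other, so one natural reading is that the model solves for the point where they meet. The sketch below illustrates that idea only. Both functions are assumed toy shapes (an M/M/1-like latency curve and a simple stall-per-miss throughput model), every constant is made up, and bisection is used here for robustness; none of this is the formulation from the talk or from Ogras et al.

```python
def latency(lam, base_latency=10.0, service=2.0, lam_max=0.2):
    """L(lambda): assumed interconnect latency (cycles), rising sharply near saturation."""
    return base_latency + service / (1.0 - lam / lam_max)


def traffic(lat, misses_per_instr=0.02, base_cpi=1.0):
    """lambda(L): assumed traffic (requests/cycle) of a core that stalls on each miss."""
    ipc = 1.0 / (base_cpi + misses_per_instr * lat)
    return misses_per_instr * ipc


def operating_point(lam_max=0.2, tol=1e-12):
    """Find lambda with traffic(latency(lambda)) == lambda by bisection.

    latency() increases with lambda and traffic() decreases with latency,
    so the residual below is strictly decreasing and has a single root.
    """
    lo, hi = 0.0, lam_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        residual = traffic(latency(mid, lam_max=lam_max)) - mid
        if residual > 0:
            lo = mid  # cores would inject more than mid: root is to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


lam_star = operating_point()
lat_star = latency(lam_star)
ipc_star = lam_star / 0.02  # throughput implied by the assumed miss rate
print(f"lambda = {lam_star:.5f}, L = {lat_star:.3f} cycles, IPC = {ipc_star:.3f}")
```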

  13. Analytical model vs. simulation
     [Figure: throughput (IPC) from analytical modeling and from simulation, for configurations sorted in descending order of throughput]

  14. Case Study: Power-performance exploration
     Power-performance trade-off (search space: 1.5 · 10^9 configurations)
     [Figure: throughput (IPC) versus power (W); labeled configurations, given as mesh, interconnect, cores per cluster:]
     – 6x5, Bi-Ring, 4 C2
     – 5x4, Bi-Ring, 6 C2
     – 5x3, Bi-Ring, 8 C2
     – 4x2, Bi-Ring, 15 C2
     – 4x3, Bi-Ring, 10 C2
     – 3x2, Bi-Ring, 20 C2
     – 6x5, Bus, 4 C2
     – 7x4, Bus, 3 C2 + 1 C3
     – 7x4, Bus, 4 C2
     – 6x5, Bus, 2 C1 + 2 C2
     – 6x4, Bus, 3 C1 + 2 C2

  15. Outline
     • Architectural exploration
     • Physical planning for tiled CMPs
       – Impact of physical planning
       – Floorplanning
       – Wire planning
     • Current work: regular floorplanning

  16. Physical planning
     [Figure: tile floorplan; labels C, L2, L3, R, r, and N/S/W/E]

  17. The impact of physical planning
     [Figure: tile floorplan; labels C, L2, L3, R, r, and N/S/W/E]

  18. Physical planning for tiles
     [Figure labels: N, W, E, S]

  19. Link width: how many wires?
     [Figure: link between two routers; each of the two channels carries Cntrl, Addr (64) and Cache line (512)]
     > 1K wires, ≈ 100 μm

  20. 3D Wire Planning
     [Figure: cross-section with FEOL and metal layers m1 to m6 over Core, Router and Memory]
     In systems where memory bandwidth is the bottleneck, the physical resources providing the bandwidth are critical.

  21. Exploration without physical planning
     [Figure labels: Architectural exploration (Models, Constraints, Generation of configurations, Analytical modeling, search direction, Best configuration); Validation (Simulation)]

  22. Exploration with physical planning
     [Figure labels: Architectural exploration (Models, Phys. Info, Constraints, Generation of configurations, Analytical modeling, search direction, Best configuration); Physical planning (Floorplanning, Wire planning); Validation (Simulation)]

  23. Physical planning
     [Figure labels: C, L2, Local IC, L3, R; Analytical Modeling, Physical Planning, Simulation]
     Estimations:
     • Area
     • Wirelength
     • Routability

  24. Physical planning technology
     • Floorplanning
       – Slicing structures & Simulated Annealing
       – Lightweight 3D maze router
       – Constraints:
         • Adjacency (Core ↔ L2)
         • Balanced links (rings)
     • Wire planning
       – SAT-based 3D global routing
       – Boolean constraints

  25. Slicing structures
     [Figure: slicing trees with H and V cut nodes over blocks 1 to 5, and the corresponding floorplans]
     D.F. Wong and C.L. Liu, "A New Algorithm for Floorplan Design", DAC, 1986, pages 101-107.
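
A slicing structure can be written as a postfix expression over block names and cut operators, and its bounding box computed with a stack. The sketch below assumes the convention that `V` is a vertical cut (children side by side) and `H` a horizontal cut (children stacked); the block dimensions and the expression are made up for illustration and are not the tree drawn on the slide.

```python
def bounding_box(postfix, sizes):
    """Return (width, height) of the slicing floorplan described by a postfix expression.

    postfix: sequence of block names and the operators "H" / "V".
    sizes:   mapping from block name to its fixed (width, height).
    """
    stack = []
    for token in postfix:
        if token in ("H", "V"):
            w2, h2 = stack.pop()
            w1, h1 = stack.pop()
            if token == "V":   # vertical cut: children placed side by side
                stack.append((w1 + w2, max(h1, h2)))
            else:              # horizontal cut: children stacked
                stack.append((max(w1, w2), h1 + h2))
        else:
            stack.append(sizes[token])
    if len(stack) != 1:
        raise ValueError("malformed slicing expression")
    return stack[0]


# Hypothetical block dimensions (width, height).
blocks = {1: (2, 3), 2: (2, 1), 3: (1, 2), 4: (3, 2), 5: (2, 2)}
expression = [1, 2, "H", 3, "V", 4, 5, "V", "H"]

width, height = bounding_box(expression, blocks)
print(width, height, width * height)
```

An annealer over slicing structures would perturb the expression (swap operands, flip operators) and re-evaluate the bounding box after each move.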

  26. Bounding curves
     [Figure label: Memory]
     L. Stockmeyer, 1983, "Optimal Orientation of Cells in Slicing Floorplan Designs"
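
A bounding (shape) curve lists the non-dominated (width, height) options of a sub-floorplan, so that orientations can be chosen bottom-up in a slicing tree. The sketch below combines two curves by brute force over all pairs and then prunes dominated shapes. It is a simplified stand-in, not Stockmeyer's linear-time merge, and the block sizes are made up.

```python
def prune(shapes):
    """Keep only non-dominated (width, height) pairs, sorted by increasing width."""
    front = []
    for w, h in sorted(set(shapes)):
        # A wider shape is only useful if it is strictly shorter than all kept so far.
        if not front or h < front[-1][1]:
            front.append((w, h))
    return front


def orientations(width, height):
    """Shape curve of a single block that may be rotated by 90 degrees."""
    return prune([(width, height), (height, width)])


def combine(curve_a, curve_b, cut):
    """Shape curve of two sub-floorplans joined by a "V" or "H" cut."""
    merged = []
    for w1, h1 in curve_a:
        for w2, h2 in curve_b:
            if cut == "V":   # side by side
                merged.append((w1 + w2, max(h1, h2)))
            else:            # stacked
                merged.append((max(w1, w2), h1 + h2))
    return prune(merged)


# Two hypothetical blocks, 2x3 and 1x4, placed side by side.
curve = combine(orientations(2, 3), orientations(1, 4), "V")
smallest = min(curve, key=lambda shape: shape[0] * shape[1])
print(curve, smallest)
```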

  27. Wire planner
     • SAT-based approach for gridded routing
     • Grid unit: link width (≈ 500-1000 wires)
     • Support for floating terminals
     • Customizable for any type of Boolean-encoded constraints (symmetry, 1D/2D routing, …)
     [Figures: top view, cross-section view]

  28. Wire planner
     [Figure labels: Router, W, E]
     • Concurrent routing: all nets simultaneously
     • Using Euler's theory to find legal routes
     • SAT: a route is always found if it exists
     • ILP-based route optimization

  29. Design space

  30. Design space
     [Figure axis: wire length (10^6 μm)]

  31. Filtering floorplans
     [Figure axis: area (mm²)]

  32. Filtering floorplans
     [Figure axis: area (mm²)]

  33. After physical planning

  34. After physical planning

  35. After physical planning

  36. Outline
     • Architectural exploration
     • Physical planning for tiled CMPs
     • Current work: regular floorplanning
       – Memory floorplanning
       – Regularity extraction

  37. Min-area floorplan
     [Figure: floorplan; labels C, L2, L3, R, r, and N/S/W/E]

  38. Integrated memory floorplanner
     [Figure: memory blocks of 1Mb, 512Kb and 256Kb and a router (R); L-shape and T-shape]

  39. Regular floorplan
     [Figure: floorplan; labels C, L2, L3, R, r, and N/S/W/E]

  40. Regular floorplan
     [Figure: floorplan; labels C, L2, L3, R, r]
     Regularity:
     • Smaller design effort
     • Efficient timing closure
     • Choppability

  41. Regular floorplan
     [Figure: floorplan; labels C, L2, L3, R, r]
     Regularity:
     • Smaller design effort
     • Efficient timing closure
     • Choppability
     Exploration:
     • Graph based knowledge discovery
     • Hierarchical slicing structures
     • Simulated Annealing
