
Large-Scale Circuit Placement: The Gap and Promise Jason Cong - PowerPoint PPT Presentation



  1. Large-Scale Circuit Placement: The Gap and Promise Jason Cong Computer Science Department University of California, Los Angeles cong@cs.ucla.edu Contributors: Chin-Chih Chang, Kenton Sze, Tim Kong, Michail Romesis, Joe Shinnerl, Min Xie, Xin Yuan

  2. Outline
     - Optimality and scalability study of the placement problem: gap analysis
     - Research on the multilevel large-scale placement problem
     - Our research plan

  3. Optimality and Scalability Study: Motivation
     - Lack of significant progress in wirelength reduction
       - Rate of reduction is about 5-10% every 2-3 years
       - Latest developments in placement differ mainly in runtime
     - Where do we stand?
       - How much room is there for further improvement?
       - Will existing placement engines scale well to 10+M gate designs?
     - Need to quantify the optimality and scalability of state-of-the-art placement engines

  4. Optimality and Scalability Study: Motivation (II)
     - Most work compares only with existing heuristics
       - Using benchmarks based on real designs
         - ISPD98 [C. Alpert 1998]
       - Using synthetic benchmarks
         - circ and gen [M. D. Hutton et al, 1998]
         - gnl [D. Stroobandt et al, 2000]
     - Little understanding of the gap from the optimal

  5. Optimality and Scalability Study: Related Work
     - Quantified Suboptimality of VLSI Layout Heuristics [L. Hagen et al, 1995]
       - Construct scaled instances with a known upper bound
       - Over 10% area suboptimality in TimberWolf
       - Notable wirelength suboptimality in GORDIAN-L
     - But the test cases are small: the largest netlist is less than 40K

  6. Our Contribution: Placement Example Construction with Known Optimal Wirelength
     - Optimality and Scalability Study of Existing Placement Algorithms [C. Chang et al, 2003]
       - Construct instances with a known optimal, using the characteristics of the original problem
       - Studied the optimality and scalability of existing algorithms on the constructed instances

  7. Construction of Placement Examples with Known Optimal Wirelength (PEKO Examples)
     - Input
       - Desired number of placeable modules t
       - Net Distribution Vector (NDV) D = (d_2, d_3, …, d_p), where d_k is the number of k-pin nets in the circuit
       - t and D are extracted from a real circuit
     - Output
       - Cell library L
       - Netlist N with known optimal wirelength
     - Constraint
       - N has D as its NDV
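As a rough illustration of the NDV input described on slide 7, the sketch below counts how many nets of each degree a netlist contains. The netlist representation (a list of tuples of module ids), the function name `net_distribution_vector`, and the toy data are assumptions made for this example, not part of the original slides.

```python
from collections import Counter


def net_distribution_vector(nets):
    """Return (d_2, d_3, ..., d_p), where d_k is the number of k-pin nets.

    `nets` is assumed to be an iterable of nets, each net an iterable of
    module ids. Nets touching fewer than two distinct modules are ignored.
    """
    degree_counts = Counter(len(set(net)) for net in nets)
    degrees = [k for k in degree_counts if k >= 2]
    if not degrees:
        return ()
    max_degree = max(degrees)  # this is p in the slide's notation
    return tuple(degree_counts.get(k, 0) for k in range(2, max_degree + 1))


# Hypothetical toy netlist: two 2-pin nets, one 3-pin net, one 4-pin net.
toy_nets = [(0, 1), (1, 2), (0, 2, 3), (1, 3, 4, 5)]
print(net_distribution_vector(toy_nets))
```

A tuple indexed from degree 2 mirrors the slide's D = (d_2, …, d_p); a dictionary keyed by degree would work equally well.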

  8. Our Algorithm for Constructing PEKO Examples
     - All the modules are of equal size, and there is no space between rows or between adjacent modules
     - For 2-pin nets, connect any two adjacent modules
     - For each n-pin net, connect the n modules in a rectangular region close to a square, i.e., the length of each side is close to sqrt(n)
     - The wirelength of each n-pin net is given by $\lceil \sqrt{n} \rceil + \lceil n / \lceil \sqrt{n} \rceil \rceil - 2$
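The per-net wirelength rule on slide 8 can be sketched in a few lines. This reads the formula as ceil(sqrt(n)) + ceil(n / ceil(sqrt(n))) - 2, which reproduces the per-net values in the illustration on slide 9; the function name `optimal_net_wirelength` and the use of unit-sized modules as the length unit are assumptions of this sketch.

```python
import math


def optimal_net_wirelength(n):
    """Wirelength of an n-pin net packed into a near-square block of unit modules."""
    if n < 2:
        raise ValueError("a net needs at least two pins")
    side = math.isqrt(n - 1) + 1   # exact integer ceil(sqrt(n)) for n >= 1
    other = -(-n // side)          # integer ceil(n / side)
    # Subtract 1 per dimension: distance is measured between module positions.
    return side + other - 2


for n in range(2, 8):
    print(n, optimal_net_wirelength(n))
```

Integer arithmetic (`math.isqrt` and floor division) is used instead of floating-point `sqrt` so that perfect squares are not affected by rounding.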

  9. Illustration: PEKO Example Construction
     Input: t = 64, D = {d_2 = 34, d_3 = 20, d_4 = 7, d_5 = 4, d_6 = 2, d_7 = 1}
     - #2-pin nets = 34, WL = 34
     - #3-pin nets = 20, WL = 40
     - #4-pin nets = 7, WL = 14
     - #5-pin nets = 4, WL = 12
     - #6-pin nets = 2, WL = 6
     - #7-pin nets = 1, WL = 4
     - Total WL = 110
     - Method first conceived by K. Boese (1995), but not implemented
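The totals in the slide 9 illustration can be checked by multiplying each net count by its per-net optimal wirelength. The sketch below assumes the per-net rule ceil(sqrt(n)) + ceil(n / ceil(sqrt(n))) - 2 and unit-sized modules; the names `ndv`, `per_degree`, and `total` are illustrative only.

```python
import math


def optimal_net_wirelength(n):
    """Per-net optimal wirelength, assuming unit modules in a near-square block."""
    side = math.isqrt(n - 1) + 1   # ceil(sqrt(n))
    other = -(-n // side)          # ceil(n / side)
    return side + other - 2


# NDV from the slide: degree -> number of nets of that degree.
ndv = {2: 34, 3: 20, 4: 7, 5: 4, 6: 2, 7: 1}

per_degree = {k: count * optimal_net_wirelength(k) for k, count in ndv.items()}
total = sum(per_degree.values())

print(per_degree)  # wirelength contributed by each net degree
print(total)       # total optimal wirelength of the constructed example
```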

  10. White Space Insertion
      - Need for white space
        - Mimic real designs
        - Ease legalization
      - Option 1: expanding one dimension of the chip
      - Option 2: removing some of the nets

  11. Four New Suites of Placement Examples with Known Optimal Wirelength
      - Module number t and NDV extracted from ISPD98 [C. Alpert, 1998]
      - Two suites without pads (suite1 and suite2)
        - suite2 is derived by scaling t and NDV by a factor of 10
      - Two suites with pads (suite3 and suite4)
        - suite4 is derived by scaling t and NDV by a factor of 10
      - 15% white space, obtained by expanding one dimension of the chip
      URL: http://ballade.cs.ucla.edu/~pubbench/peko.htm

  12. PEKO Characteristics

      PEKO Suite1 (12.5k – 210k cells)

      ckt        #cell     #net      #row   Optimal WL
      Peko01     12506     13865     113    8.14E+05
      Peko02     19342     19325     140    1.26E+06
      Peko03     22853     27118     152    1.50E+06
      Peko04     27220     31683     166    1.75E+06
      Peko05     28146     27777     169    1.91E+06
      Peko06     32332     34660     181    2.06E+06
      Peko07     45639     47830     215    2.88E+06
      Peko08     51023     50227     227    3.14E+06
      Peko09     53110     60617     231    3.64E+06
      Peko10     68685     74452     263    4.73E+06
      Peko11     70152     81048     266    4.71E+06
      Peko12     70439     76603     266    5.00E+06
      Peko13     83709     99176     290    5.87E+06
      Peko14     147088    152255    385    9.01E+06
      Peko15     161187    186225    402    1.15E+07
      Peko16     182980    189544    429    1.25E+07
      Peko17     184752    188838    431    1.34E+07
      Peko18     210341    201648    460    1.32E+07

      PEKO Suite2 (125k – 2.1M cells)

      ckt        #cell     #net      #row   Optimal WL
      Peko01x10  125060    138650    335    8.14E+06
      Peko02x10  193420    193250    441    1.26E+07
      Peko03x10  228530    271180    479    1.50E+07
      Peko04x10  272200    316830    523    1.75E+07
      Peko05x10  281460    277770    532    1.91E+07
      Peko06x10  323320    346600    570    2.06E+07
      Peko07x10  456390    478300    677    2.88E+07
      Peko08x10  510230    502270    715    3.14E+07
      Peko09x10  531100    606170    730    3.64E+07
      Peko10x10  686850    744520    830    4.73E+07
      Peko11x10  701520    810480    839    4.71E+07
      Peko12x10  704390    766030    840    5.00E+07
      Peko13x10  837090    991760    916    5.87E+07
      Peko14x10  1470880   1522550   1214   9.01E+07
      Peko15x10  1611870   1862250   1271   1.15E+08
      Peko16x10  1829800   1895440   1354   1.25E+08
      Peko17x10  1847520   1888380   1360   1.34E+08
      Peko18x10  2103410   2016480   1451   1.32E+08

  13. Tested Four State-of-the-Art Placers
      - Capo [A. E. Caldwell et al, 2000]
        - Based on a multilevel partitioner
        - Aims to enhance routability
      - Dragon [M. Wang et al, 2000]
        - Uses hMetis for initial partition
        - SA with bin-based swapping
      - mPL [T. Chan et al, 2000]
        - Nonlinear programming on the coarsest level
        - Goto-based relaxation
      - QPlace [Cadence Inc.]
        - Quadratic programming
        - Component of Silicon Ensemble

  14. Experiment with State-of-the-Art Placers Using PEKO Suite1
      [Figure: two plots against #cells (0 to 250,000) for Dragon v.2.20, Capo v.8.0, mPL v.1.2, and QPlace v.5.1.55: wirelength as a multiple of optimal (axis 1.00 to 2.80) and runtime in seconds (axis 0 to 45,000).]
      - Existing algorithms are 66-153% away from the optimal on PEKO
      - On examples with pads
        - mPL and QPlace show improvements of 12% and 10%, respectively
        - Dragon and Capo do not benefit much from the additional information
      - There is significant room for improvement in placement algorithms!

  15. Experiment with State-of-the-Art Placers Using PEKO Suite1 & Suite2
      [Figure: two plots against #cells (logarithmic axis, 10,000 to 10,000,000) for Dragon v.2.20, Capo v.8.0, mPL v.1.2, and QPlace v.5.1.55: wirelength as a multiple of optimal (axis 1.00 to 2.80) and runtime in seconds (axis 0 to 60,000).]
      - Capo, QPlace, and mPL scale well in runtime
      - Average solution quality of each tool deteriorates by an additional 4% to 25% when the problem size increases by a factor of 10
      - QoR of the existing placement algorithms can be 80%-180% away from the optimal for large designs


  17. Limitation of PEKO Examples
      - Optimal solution includes local nets only
        - Unlikely for real designs
      - Measures wirelength only
        - Timing and routability are important objectives for placement algorithms as well

  18. Impact of Global Connections in Real Examples
      - Produced by Dragon on ISPD98
      - The wirelength contribution from global connections can be significant!
      - Need to consider the impact of global connections

      circuit  height  width   WL of longest net  WL contribution of longest 10%
      ibm01    8158    4530    7148               51%
      ibm02    8158    6430    14224              46%
      ibm03    8158    6740    10624              58%
      ibm04    8158    9140    15171              53%
      ibm05    8158    11055   19064              47%
      ibm06    8158    8715    13966              61%
      ibm07    8158    14605   14051              51%
      ibm08    8158    15895   16142              60%
      ibm09    8158    16395   13780              55%
      ibm10    8158    27890   30755              53%
      ibm11    16350   10925   19234              59%
      ibm12    16350   15545   26748              52%
      ibm13    16350   12230   19539              59%
      ibm14    16350   25475   26370              61%
      ibm15    16350   23785   27284              63%
      ibm16    16350   34015   42860              59%
      ibm17    16283   38895   45686              56%
      ibm18    16350   37065   52846              64%

  19. Placement Examples with Known Upperbounds (PEKU)
      - Extend PEKO by introducing non-local nets to mimic global connections
      - All the modules are of equal size, and there is no space between rows or between adjacent modules
      - For nets of degree i, a subset is generated by randomly connecting i modules; the rest are generated optimally as in PEKO

  20. Placement Examples with Known Upperbounds (PEKU)
      Input: t = 64, D = {d_2 = 34, d_3 = 20, d_4 = 7, d_5 = 4, d_6 = 2, d_7 = 1}, α = 0.2
      - Generate 28 2-pin nets optimally
      - Generate 6 2-pin nets randomly
      - Generate 16 3-pin nets optimally
      - Generate 4 3-pin nets randomly
      - Generate 6 4-pin nets optimally
      - Generate 1 4-pin net randomly
      - Generate 4 5-pin nets optimally
      - Generate 2 6-pin nets optimally
      - Generate 1 7-pin net optimally
      - Total WL = 160
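The split between optimally and randomly generated nets in the slide 20 example is consistent with taking floor(α · d_k) random nets per degree. That rounding rule is inferred from the numbers on the slide rather than stated there, so treat the sketch below as an assumption; the function name `split_nets` is also hypothetical.

```python
import math
from fractions import Fraction


def split_nets(ndv, alpha):
    """Map each net degree to (optimal_count, random_count).

    Assumes floor(alpha * d_k) nets of each degree are generated randomly,
    which matches the slide's example but is not stated in the source.
    """
    ratio = Fraction(str(alpha))  # exact arithmetic avoids float rounding in floor()
    split = {}
    for degree, count in ndv.items():
        n_random = math.floor(ratio * count)
        split[degree] = (count - n_random, n_random)
    return split


ndv = {2: 34, 3: 20, 4: 7, 5: 4, 6: 2, 7: 1}
for degree, (n_optimal, n_random) in split_nets(ndv, 0.2).items():
    print(f"{degree}-pin: {n_optimal} optimal, {n_random} random")
```

The total wirelength of 160 on the slide depends on which modules the random nets happen to connect, so this sketch does not try to reproduce it.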

  21. Placement Examples with Global Connections only (G-PEKU)
      Input: t = 64
      - Each net connects either a row or a column
      - Obvious upper bound
        - Sum the length of each row and column
      - Similar to datapath examples
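A minimal sketch of the G-PEKU upper bound from slide 21, under assumptions the slide does not spell out: the t modules form a square grid, modules have unit pitch, there is one net per row and one per column, and each net's length is the span between its first and last module. The function name `gpeku_upper_bound` is hypothetical, and the resulting number is a consequence of these assumptions rather than a figure from the slides.

```python
import math


def gpeku_upper_bound(t):
    """Sum of row and column net lengths for a square grid of t unit modules."""
    side = math.isqrt(t)
    if side * side != t:
        raise ValueError("this sketch assumes t is a perfect square")
    nets = 2 * side        # one net per row plus one net per column
    span = side - 1        # distance between first and last module in a line
    return nets * span


print(gpeku_upper_bound(64))
```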

  22. PEKU Suite
      - Module numbers and NDVs extracted from ISPD98
      - Remove connections with pads
      - Vary α from 0 to 10%
      - 15% white space, obtained by expanding one dimension of the chip
