

  1. High Performance Graph Convolutional Networks with Applications in Testability Analysis
Yuzhe Ma (1), Haoxing Ren (2), Brucek Khailany (2), Harbinder Sikka (2), Lijuan Luo (2), Karthikeyan Natarajan (2), Bei Yu (1)
(1) The Chinese University of Hong Kong, (2) NVIDIA

  2. Learning for EDA
◮ Mask optimization [Yang et al., DAC'2018]
◮ Verification [Yang et al., TCAD'2018]
[Figure: generator and litho-simulator pipeline classifying layout patterns as hotspot (HS) or non-hotspot (Non-HS)]

  3. Learning for EDA
◮ Mask optimization [Yang et al., DAC'2018]
◮ Verification [Yang et al., TCAD'2018]
[Figure: generator and litho-simulator pipeline classifying layout patterns as hotspot (HS) or non-hotspot (Non-HS)]
More Considerations
◮ Existing attempts still rely on regular data formats, such as images;
◮ Netlists and layouts are naturally represented as graphs;
◮ Few DL solutions exist for graph-based problems in EDA.

  4. Test Point Insertion
◮ Fig. (a): Original circuit with poor testability. Module 1 is unobservable; Module 2 is uncontrollable;
◮ Fig. (b): Insert test points into the circuit;
◮ (CP1, CP2) = (0, 1) → line I = 0; (CP1, CP2) = (1, 1) → line I = 1;
◮ CP2 = 0 → normal operation mode.
[Figure: (a) original circuit with Module 1 and Module 2; (b) circuit after inserting observation point OP and control points CP1, CP2 on line I]

  5. Problem Overview
Problem: Given a netlist, identify where to insert test points, such that:
- Fault coverage is maximized;
- The number of test points and test patterns is minimized.
* (This work focuses on observation point insertion.)

  6. Problem Overview
Problem: Given a netlist, identify where to insert test points, such that:
- Fault coverage is maximized;
- The number of test points and test patterns is minimized.
* (This work focuses on observation point insertion.)
◮ From the perspective of a DL model, this is a binary classification problem;
◮ A classifier can be trained from historical data;
◮ Graph-structured data must be handled;
◮ Strong scalability is required for realistic designs.

  7. Node Classification
◮ Represent a netlist as a directed graph; each node represents a gate.
◮ Initial node attributes: SCOAP values [Goldstein et al., DAC'1980].
◮ Graph convolutional networks: compute node embeddings first, then perform classification.
[Figure: GCN pipeline with convolution layers 1 and 2 followed by fully-connected (FC) layers producing a 0/1 prediction per node]

  8. Node Classification
Node embedding: a two-step operation.
◮ Neighborhood feature aggregation: weighted sum of the neighborhood features.

    g_d(v) = e_{d-1}(v) + w_pr · Σ_{u ∈ PR(v)} e_{d-1}(u) + w_su · Σ_{u ∈ SU(v)} e_{d-1}(u)

◮ Projection: a non-linear transformation to a higher dimension.

    e_d = σ(g_d · W_d)

Classification: a series of fully-connected layers.
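The two-step embedding above can be sketched in a few lines. This is an illustrative reimplementation, not the authors' code: `PR(v)`/`SU(v)` are the predecessor/successor sets from the slide, `w_pr`/`w_su` are the scalar aggregation weights, and the ReLU non-linearity, dimensions, and data are assumptions.

```python
import numpy as np

def aggregate(v, embeddings, predecessors, successors, w_pr, w_su):
    """g_d(v) = e_{d-1}(v) + w_pr * sum over PR(v) + w_su * sum over SU(v)."""
    g = embeddings[v].copy()
    for u in predecessors.get(v, []):
        g += w_pr * embeddings[u]  # weighted predecessor features
    for u in successors.get(v, []):
        g += w_su * embeddings[u]  # weighted successor features
    return g

def project(g, W):
    """e_d = sigma(g_d . W_d), here using ReLU as the non-linearity."""
    return np.maximum(g @ W, 0.0)
```

Stacking several such layers lets a node's final embedding depend on its multi-hop neighborhood, which is what the two convolution layers in the pipeline figure provide.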

  9. Imbalance Issue
◮ High imbalance ratio: far more negative nodes than positive nodes in a design;
◮ Poor performance: bias towards the majority class.
Solution: multi-stage classification.
◮ Impose a large weight on positive points;
◮ Filter out only the negative points with high confidence at each stage.
[Figure: three-stage cascade; each stage's decision boundary removes confidently negative points while retaining the positive points]
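The cascade idea can be expressed as a simple filter: each stage discards only the samples it scores as negative with high confidence and forwards the rest. A minimal sketch, where the per-stage scorers and the confidence threshold are placeholders of my own choosing, not values from the paper:

```python
def multi_stage_filter(samples, stage_scorers, neg_threshold=0.05):
    """Cascade filtering for imbalanced classification.

    stage_scorers: one function per stage mapping a sample to P(positive).
    A sample is dropped only when a stage is highly confident it is
    negative (score below neg_threshold); survivors of all stages are
    the candidate positives.
    """
    remaining = list(samples)
    for score in stage_scorers:
        remaining = [s for s in remaining if score(s) >= neg_threshold]
    return remaining
```

Because each stage only removes easy negatives, the class ratio seen by later stages becomes progressively less skewed, which is the point of the multi-stage design.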

  10. Efficient Inference
◮ Neighborhood overlap leads to duplicated computation → poor scalability;
◮ Transform the weighted summation into a matrix multiplication;
◮ Potential issue: the adjacency matrix is too large;
◮ Fact: the adjacency matrix is highly sparse! It can be stored in a compressed format.

    G_d = A · E_{d-1} =

    | 1    w_1  w_1  w_1  0    0   |   | e_{d-1}(1) |
    | w_2  1    0    0    w_1  0   |   | e_{d-1}(2) |
    | w_2  0    1    0    0    w_2 | × | e_{d-1}(3) |
    | w_2  0    0    1    0    0   |   | e_{d-1}(4) |
    | 0    w_2  0    0    1    0   |   | e_{d-1}(5) |
    | 0    0    w_1  0    0    1   |   | e_{d-1}(6) |
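To make the compressed-format point concrete, here is a minimal sketch of a CSR (compressed sparse row) matrix-vector product, with the slide's 6-node matrix encoded under the assumption w_1 = w_2 = 0.5 (the actual weight values are learned and not given on the slide). In practice a library routine such as SciPy's CSR matmul would be used instead:

```python
def csr_matvec(data, indices, indptr, x):
    """y = A @ x for A stored in CSR form: only nonzeros are kept in
    `data`, their column positions in `indices`, and `indptr[r]:indptr[r+1]`
    delimits row r's entries."""
    y = []
    for row in range(len(indptr) - 1):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y.append(acc)
    return y

# CSR encoding of the slide's 6x6 matrix A, assuming w1 = w2 = 0.5.
A_DATA = [1.0, 0.5, 0.5, 0.5,  0.5, 1.0, 0.5,  0.5, 1.0, 0.5,
          0.5, 1.0,  0.5, 1.0,  0.5, 1.0]
A_INDICES = [0, 1, 2, 3,  0, 1, 4,  0, 2, 5,  0, 3,  1, 4,  2, 5]
A_INDPTR = [0, 4, 7, 10, 12, 14, 16]
```

Only 16 nonzeros are stored instead of 36 entries; for a netlist graph with millions of nodes but bounded fan-in/fan-out, this gap is what makes whole-graph inference feasible.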

  11. Efficient Training
◮ The adjacency matrix cannot be split in the conventional way;
◮ A variant of the conventional data-parallel scheme:
- Each GPU processes one graph instead of one "chunk";
- Gather all outputs to calculate the gradient.
[Figure: training data distributed across GPU1 and GPU2; per-graph outputs are gathered and evaluated to compute the gradient]
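The gather-then-update pattern can be illustrated with a toy model in place of the GCN. This is only a sketch of the scheme's shape, not the authors' training code: each "worker" holds one whole graph, computes its local gradient, and the gradients are averaged before a single shared-weight update. The scalar model `loss(w) = (w*x - y)^2` and all data values are invented for illustration.

```python
def per_graph_gradient(w, x, y):
    """Gradient of (w*x - y)**2 w.r.t. the shared weight w,
    computed on one worker's whole graph (here a single (x, y) pair)."""
    return 2.0 * x * (w * x - y)

def data_parallel_step(w, graphs, lr):
    """Each worker computes a gradient on its own graph; the gradients
    are gathered and averaged, then one update is applied to w."""
    grads = [per_graph_gradient(w, x, y) for x, y in graphs]
    return w - lr * sum(grads) / len(grads)
```

The key difference from chunk-based data parallelism is the unit of distribution: a whole graph per device, so no device ever needs to see a partial adjacency matrix.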

  12. Test Point Insertion Flow
◮ Not every difficult-to-observe node has the same impact on improving observability;
◮ Select the observation point locations with the largest impact to minimize the total count;
◮ Impact: the reduction in positive predictions in a local neighborhood after inserting an observation point;
◮ E.g., the impact of node a in the figure is 4.
[Figure: (c) predicted-0/predicted-1 nodes in the fan-in cone of node a before OP insertion; (d) after inserting an OP at a, the positive predictions in the cone drop by 4]
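One plausible way to compute this impact metric is to count predicted-positive nodes in a candidate's fan-in cone before and after insertion. The sketch below is my own encoding, not the paper's: the graph is a predecessor map, and `pred_before`/`pred_after` are hypothetical 0/1 prediction maps from two GCN inference passes.

```python
def fanin_cone(node, predecessors):
    """All nodes reachable backwards from `node`, including itself."""
    seen, stack = set(), [node]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        stack.extend(predecessors.get(v, []))
    return seen

def impact(node, predecessors, pred_before, pred_after):
    """Reduction in positive (predicted-1) nodes inside the fan-in cone
    of `node` after an observation point is inserted at it."""
    cone = fanin_cone(node, predecessors)
    before = sum(pred_before[v] for v in cone)
    after = sum(pred_after[v] for v in cone)
    return before - after
```

Restricting the count to the fan-in cone matches the slide's "local neighborhood": an OP at a node only improves observability for logic that drives it.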

  13. Test Point Insertion Flow
◮ Iterative prediction and OP insertion;
◮ Once an OP is inserted, the netlist is modified and node attributes are re-calculated;
◮ The sparse representation enables incremental updates to the adjacency matrix;
◮ Exit condition: no positive predictions left.
[Flowchart: netlist → prediction with the trained GCN model → if satisfied, END; otherwise impact evaluation → OP insertion → back to prediction]
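The flow reduces to a greedy loop, sketched below under stated assumptions: `predict`, `evaluate_impact`, and `insert_op` are placeholder callables standing in for GCN inference, the impact evaluation, and the incremental netlist edit, and the greedy one-OP-per-iteration policy is my simplification of whatever batching the tool actually uses.

```python
def op_insertion_flow(netlist, predict, evaluate_impact, insert_op,
                      max_iters=100):
    """Iteratively insert observation points until no positive
    (hard-to-observe) predictions remain, greedily picking the
    highest-impact candidate each round."""
    ops = []
    for _ in range(max_iters):
        positives = predict(netlist)
        if not positives:
            break  # exit condition: no positive predictions left
        best = max(positives, key=lambda v: evaluate_impact(netlist, v))
        netlist = insert_op(netlist, best)  # incremental adjacency update
        ops.append(best)
    return ops
```

Because insertion only touches the new OP's neighborhood, the sparse adjacency structure lets each iteration update the matrix incrementally instead of rebuilding it.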

  14. Benchmarks
◮ Industrial designs under a 12nm technology node;
◮ Each graph contains >1M nodes and >2M edges.

Design   #Nodes    #Edges    #POS   #NEG
B1       1384264   2102622   8894   1375370
B2       1456453   2182639   9755   1446698
B3       1416382   2137364   9043   1407338
B4       1397586   2124516   8978   1388608

  15. Classification Results Comparison
◮ Baselines: classical learning models with industrial feature engineering;
◮ GCN outperforms the other classical learning algorithms.
[Bar chart: accuracy (0.7 to 1.0) of LR, SVM, RF, MLP, and GCN on B1, B2, B3, B4, and on average]

  16. Multi-stage GCN Results
◮ Single-stage GCN (GCN-S) vs. multi-stage GCN (GCN-M);
◮ Scalability: 10^3× speedup on inference time for a design with >1 million cells.
[Left plot: F1-score (0 to 0.6) of GCN-S vs. GCN-M on benchmarks B1, B2, B3, B4]
[Right plot: inference time (10^-2 s to 10^4 s) vs. number of nodes (10^3 to 10^6), recursion vs. ours, log-log scale]

  17. Testability Results Comparison
◮ Without loss of fault coverage, an 11% reduction in inserted test points and a 6% reduction in test pattern count are achieved.

                  Industrial Tool           GCN-Flow
Design    #OPs   #PAs   Coverage    #OPs   #PAs   Coverage
B1        6063   1991   99.31%      5801   1687   99.31%
B2        6513   2009   99.39%      5736   2215   99.38%
B3        6063   2026   99.29%      4585   1845   99.29%
B4        6063   2083   99.30%      5896   1854   99.31%
Average   6176   2027   99.32%      5505   1900   99.32%
Ratio     1.00   1.00   1.00        0.89   0.94   1.00

  18. Thank You
