Graph Sparsification Approaches to Scalable Integrated Circuit Modeling and Simulations (PowerPoint PPT Presentation)


Slide 1

Graph Sparsification Approaches to Scalable Integrated Circuit Modeling and Simulations

Zhuo Feng

ICSICT, Oct, 2014 Design Automation Group

Acknowledgements: My PhD students Xueqian Zhao (MTU) and Lengfei Han (MTU)

Slide 2

Scalable SPICE-Accurate IC Simulations

Figure: original circuit with analog and digital blocks: a regulator stage (Vin, Mp, Vref, Rf1, Rf2, Cout, Vout, Iout, error amp, current amp, Cf) supplying digital circuit blocks through voltage regulators (VRs).

  • Motivation

– Integrated circuit (IC) systems involving billions of transistors and interconnect components must be modeled and analyzed accurately

  • Challenges in large-scale SPICE-accurate IC simulations

– Computational cost grows rapidly with traditional direct solution methods
– Iterative solution methods need to be robust and efficient for general tasks

Power Delivery Network (PDN) w/ Embedded Voltage Regulators (VRs)

Slide 3

Background of SPICE Simulation Algorithms

  • Standard SPICE simulators rely on Newton-Raphson (NR) method

– Step 1: linearize the nonlinear devices (transistors, diodes, etc.)
– Step 2: update the solution through NR iterations

  G(x^k) = ∂f(x)/∂x |_{x = x^k},   C(x^k) = ∂q(x)/∂x |_{x = x^k}

  F(x) = f(x(t)) + d/dt q(x(t)) + u(t) = 0

  • Problem formulation

– Nonlinear differential equations
– f(·) and q(·) denote the static and dynamic nonlinearities, respectively

G and C together form the Jacobian of F(x)
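The two NR steps can be sketched on a hypothetical one-unknown circuit: a diode (assumed parameters Is and Vt) in series with a resistor R to a supply Vdd; all values are made up for illustration, not taken from the deck:

```python
import math

# F(v) = (v - Vdd)/R + Is*(exp(v/Vt) - 1) = 0  (KCL at the diode node)
Vdd, R, Is, Vt = 1.0, 1e3, 1e-14, 0.025   # assumed example values

def F(v):
    return (v - Vdd) / R + Is * (math.exp(v / Vt) - 1.0)

def J(v):
    # Jacobian of F: linearized conductance of resistor + diode
    return 1.0 / R + Is / Vt * math.exp(v / Vt)

v = 0.5                         # initial guess
for _ in range(50):             # NR iteration: v <- v - F(v)/J(v)
    dv = -F(v) / J(v)
    v += dv
    if abs(dv) < 1e-12:
        break

print(round(v, 2))              # → 0.61 (converged diode voltage)
```

Each iteration linearizes the diode at the current solution point (Step 1) and solves the resulting linear system for the update (Step 2), exactly the structure a SPICE engine applies to the full MNA system.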

Slide 4

Prior Works

  • Direct and iterative solvers have been used in SPICE simulations

– Direct solver: LU decomposition (KLU [1])
– Expensive for large-scale post-layout IC problems due to rapidly growing memory and runtime cost
– Krylov-subspace iterative methods: GMRES [2]
– Pros: black-box solver, good memory efficiency, high parallelism
– Cons: problem-dependent convergence properties, potentially worse runtime

– ILU and domain-decomposition based preconditioners, etc

References:
[1] T. Davis, et al. Algorithm 907: KLU, a direct sparse solver for circuit simulation problems. ACM Trans. Math. Softw., 2010.
[2] Y. Saad, et al. GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 1986.
[3] D. A. Spielman, et al. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. ACM STOC, 2004.
[4] M. Bern, et al. Support-graph preconditioners. SIAM J. Matrix Anal. Appl., 2006.

  • Our contribution: a circuit-oriented preconditioning approach

– Novel circuit-oriented preconditioners (compared to matrix-oriented ones)
– Rigorous mathematical foundation: graph sparsification research [3-4]
– Consistent performance when solving transistor-level nonlinear circuits

Slide 5

Graph Sparsification Techniques

  • Graph sparsification basics

– Find a subgraph P approximating the original graph G in some measure (pairwise distances, cut values, graph Laplacian, etc.)
– Maintain the same set of vertices so that P can be used as a proxy for G in numerical computations without introducing much error
– A good graph sparsifier should keep very few edges to limit computation and storage cost

Figure source: I. Koutis, G. L. Miller and R. Peng. A fast solver for a class of linear systems. Commun. ACM, 2012.

Figure: the original graph G and its sparsifier P.
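As a minimal sketch (toy graph with assumed weights), the simplest such sparsifier is a maximum-weight spanning tree extracted with Kruskal's algorithm; edge weights play the role of conductances:

```python
def max_spanning_tree(n, edges):
    """edges: list of (u, v, w); returns the kept tree edges."""
    parent = list(range(n))

    def find(a):                            # union-find with path halving
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    tree = []
    for u, v, w in sorted(edges, key=lambda e: -e[2]):  # heaviest first
        ru, rv = find(u), find(v)
        if ru != rv:                                    # no cycle formed
            parent[ru] = rv
            tree.append((u, v, w))
    return tree

# 4-node example graph G with 5 weighted edges
G_edges = [(0, 1, 9), (0, 2, 1), (1, 2, 8), (2, 3, 3), (1, 3, 2)]
P = max_spanning_tree(4, G_edges)
print(len(P))     # → 3 (a tree on 4 vertices keeps n-1 edges)
```

Keeping the heaviest edges preserves the strongest couplings of G while reducing the edge count to n − 1, the extreme point of the sparsity/quality tradeoff discussed on the following slides.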

Slide 6

  • Support-graph preconditioner (SGP)

– Example: find a spanning tree of the original graph
– Compute matrix factors without introducing any fill-ins for the spanning tree

  • The condition number of P⁻¹G can be greatly reduced

Support-Graph Preconditioner

Figure: a 9-node weighted example graph G and its spanning-tree support graph P, with the corresponding Laplacian-type matrices (diagonal entries d_i, off-diagonal entries given by the edge weights).
Eigenvalues (1st-6th) and condition numbers:

Matrix | 1st    | 2nd    | 3rd    | 4th    | 5th   | 6th   | cond
G      | 26.170 | 23.182 | 17.572 | 11.514 | 9.373 | 6.673 | 135.948
P      | 25.239 | 23.540 | 17.579 | 10.909 | 9.865 | 6.822 | 16.752
P⁻¹G   | 1.431  | 1.204  | 1.062  | 1.000  | 1.000 | 1.000 | 17.442

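A small numerical illustration of the same effect (assumed 4-node toy graph, node 0 grounded so the reduced Laplacians are SPD; the data is made up, not the slide's 9-node example):

```python
import numpy as np

def grounded_laplacian(n, edges, ground=0):
    """Weighted graph Laplacian with the ground row/column removed."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    keep = [i for i in range(n) if i != ground]
    return L[np.ix_(keep, keep)]

G_edges = [(0, 1, 9.0), (1, 2, 8.0), (2, 3, 3.0), (1, 3, 2.0), (0, 2, 1.0)]
T_edges = G_edges[:3]                     # spanning-tree subset of G

G = grounded_laplacian(4, G_edges)
P = grounded_laplacian(4, T_edges)

eigs = np.linalg.eigvals(np.linalg.solve(P, G)).real
print(round(eigs.min(), 3))               # → 1.0 (P "supports" G)
print(np.linalg.cond(np.linalg.solve(P, G)) < np.linalg.cond(G))  # → True
```

Because P is a subgraph of G with the same weights, x^T P x ≤ x^T G x for every x, so all eigenvalues of P⁻¹G are at least 1 and the preconditioned system is much better conditioned than G itself.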

Slide 7

  • A naïve support-circuit preconditioner (SCP)

– Sparsifies the linear networks of the original circuit network
– Takes advantage of existing sparse matrix techniques (Cholesky, LU, etc.)
– Nearly-linear complexity for analyzing nanoscale (parasitics-dominated) ICs
– E.g. clock networks, power delivery networks, etc.

Support-Circuit Preconditioner

Figure: the support-circuit preconditioner is the support graph of the original network, retaining the digital circuit blocks and voltage regulators (VRs) on a sparsified interconnect.

Slide 8

  • General-purpose support-circuit preconditioner (GPSCP)

– Extracts a sparsified network from the linearized circuit of the original circuit
– Leverages existing sparse matrix solution techniques
– Nearly-linear complexity for analyzing more general nonlinear circuit systems

Support-Circuit Preconditioner (Cont.)

Figure: a nonlinear circuit (a MOSFET with terminals d/g/s and resistors R1-R5) is linearized into a small-signal circuit (elements gds, Cds, Cgs, Cgd, the controlled source gmVgs, and conductances g1-g5), from which the support circuit is extracted.

Slide 9

Figure: nonlinear circuit (a MOSFET with terminals d/g/s and resistors R1-R5).

Support-Circuit Preconditioner Extraction (1)

  • Directed weighted graph corresponding to a linearized circuit

– Can be obtained around a solution point during NR iterations
– Will be sparsified through graph decomposition and sparsification

Figure: preconditioner extraction flow. (1) The linearized circuit (gds, Cds, Cgs, Cgd, gmVgs, g1-g5) is mapped to a directed weighted graph, with capacitors entering as C/h for time step h; (2) the directed graph is symmetrized into an undirected weighted graph; (3) a support graph is extracted by sparsification.

Slide 10

Figure: the controlling sources (gmVgs) are kept alongside the support graph (gds, Cds/h, Cgd/h, g1-g3, g5).

Support-Circuit Preconditioner Extraction (2)

  • Support-circuit preconditioner extraction

– Combine the support graph with other components (e.g. controlling sources)
– Factor the Jacobian matrix of the support circuit to create the preconditioner

Figure: the resulting support circuit and the general-purpose support circuit, in which per-block support circuits (Spt-CKT) are stitched into the overall network.

Slide 11

Quality Quantification of Support Graph Preconditioners

  • Convergence of support-graph preconditioners

– Convergence depends on the condition number of the matrix pencil (G, P)
– The support of the pencil (G, P) is defined as:

  σ(G, P) = min{ τ | x^T (τP − G) x ≥ 0 for all x ∈ ℝ^n }

– The eigenvalues of the pencil (G, P) are bounded, giving the condition number

  κ(G, P) = λ_max(G, P) / λ_min(G, P)

– A smaller κ(G, P) means faster convergence

  • Spanning-tree support graph as a preconditioner

– May require many iterations to converge if the support σ (the mismatch) is too large
– σ can be estimated by comparing the Joule heating of the two resistive networks:

  power dissipated by G: x^T G x;  power dissipated by P: x^T P x
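A small numerical check of these definitions (toy grounded SPD Laplacians with made-up weights): the support is the largest generalized eigenvalue of G x = λ P x, and the Joule-heating ratio x^T G x / x^T P x of any test vector bounds it from below:

```python
import numpy as np

G = np.array([[19., -8., -2.],
              [-8., 12., -3.],
              [-2., -3.,  5.]])        # full resistive network
P = np.array([[17., -8.,  0.],
              [-8., 11., -3.],
              [ 0., -3.,  3.]])        # spanning-tree subnetwork

# support sigma(G, P): largest generalized eigenvalue of (G, P)
sigma = np.linalg.eigvals(np.linalg.solve(P, G)).real.max()

rng = np.random.default_rng(0)
ratios = []
for _ in range(2000):
    x = rng.standard_normal(3)
    ratios.append(x @ G @ x / (x @ P @ x))   # Joule heating in G vs. P

print(max(ratios) <= sigma + 1e-9)           # → True (Rayleigh bound)
print(min(ratios) >= 1.0 - 1e-9)             # → True (P is a subgraph of G)
```

Every random ratio lies in [1, σ]: at least 1 because removing edges can only lower the dissipated power, and at most σ by the generalized Rayleigh-quotient bound, which is why a small σ guarantees fast preconditioned convergence.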

Slide 12

Ultra-Sparsifier Support Graph (1)

  • Ultra-sparsifier (non-tree) support graphs

– An ultra-sparsifier contains at most n − 1 + k edges (a spanning tree plus k extra edges)
– It is k-ultra-sparse and approximates the original graph with high probability [1]
– Adding extra edges to the spanning tree better approximates the original graph (e.g. its eigenvalues and power dissipation)

Figure: a spanning tree vs. an ultra-sparsifier (spanning-tree edges plus a few extra edges).

[1] D. A. Spielman and S. Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In Proc. ACM STOC, 2004.

Slide 13

Ultra-Sparsifier Support Graph (2)

  • Sparsity control of an ultra-sparsifier support graph

– Provides a tradeoff between the quality and efficiency of preconditioners
– The weighted degree of a vertex v in a graph A is defined as:

– Example: for a 2D mesh grid, 1 ≤ wd(v) ≤ 4
– If wd(v) → 1: one dominant edge
– If wd(v) → 4: four evenly critical edges

  wd(v) = vol(v) / max_{u ∈ neighbor(v)} w(u, v)

vol(v): total edge weight incident to node v; w(u, v): the weight of the edge connecting nodes u and v
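The definition can be checked with a tiny helper (toy graphs; `weighted_degrees` is a hypothetical name, not from the deck):

```python
from collections import defaultdict

def weighted_degrees(edges):
    """edges: list of (u, v, w); returns {node: wd(node)}."""
    inc = defaultdict(list)                 # incident edge weights per node
    for u, v, w in edges:
        inc[u].append(w)
        inc[v].append(w)
    # wd(v) = vol(v) / max incident weight
    return {v: sum(ws) / max(ws) for v, ws in inc.items()}

# interior node 0 of a uniform 2D mesh: four unit edges -> wd = 4
mesh = [(0, 1, 1.0), (0, 2, 1.0), (0, 3, 1.0), (0, 4, 1.0)]
print(weighted_degrees(mesh)[0])            # → 4.0

# one dominant edge -> wd close to 1
dom = [(0, 1, 100.0), (0, 2, 0.1), (0, 3, 0.1)]
print(round(weighted_degrees(dom)[0], 3))   # → 1.002
```

The two cases reproduce the slide's extremes: a node with evenly critical edges has wd near its degree, while a node dominated by one edge has wd near 1 and can safely drop its weak edges.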

Slide 14

Ultra-Sparsifier Support Graph (3)

  • Iterative ultra-sparsifier support graph construction

– Define θ as the matching-factor threshold (0 < θ < 1) on the node weighted degree
– Step 1: compute the weighted degree wd of each node in the original graph A
– Step 2: compute the support graph A' with weighted degree wd'
– Step 3: recover edges into A' until wd'/wd > θ for each node in the support graph A'
– Step 4: return the final ultra-sparsifier support graph A' for support-circuit preconditioning

Figure: edges are recovered while wd'/wd < θ; once wd'/wd > θ everywhere, the spanning tree plus the extra edges forms the ultra-sparsifier.
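The four steps can be sketched as follows (toy graph; the tie-breaking rule, always recover the heaviest missing edge touching a violating node, is an assumption for illustration):

```python
from collections import defaultdict

def weighted_degree(edges):
    inc = defaultdict(list)
    for u, v, w in edges:
        inc[u].append(w); inc[v].append(w)
    return {v: sum(ws) / max(ws) for v, ws in inc.items()}

def max_spanning_tree(n, edges):
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]; a = parent[a]
        return a
    tree = []
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
    return tree

def ultra_sparsifier(n, edges, theta):
    wd = weighted_degree(edges)                  # Step 1: wd in A
    A1 = max_spanning_tree(n, edges)             # start from a spanning tree
    pool = sorted(set(edges) - set(A1), key=lambda e: -e[2])
    while True:
        wd1 = weighted_degree(A1)                # Step 2: wd' in A'
        bad = {v for v in wd if wd1.get(v, 0.0) / wd[v] <= theta}
        if not bad or not pool:
            break
        for i, (u, v, w) in enumerate(pool):     # Step 3: recover an edge
            if u in bad or v in bad:
                A1.append(pool.pop(i))
                break
        else:
            break
    return A1                                    # Step 4: ultra-sparsifier

edges = [(0, 1, 9.0), (0, 2, 1.0), (1, 2, 8.0), (2, 3, 3.0), (1, 3, 2.0)]
sp = ultra_sparsifier(4, edges, theta=0.7)
print(len(sp))   # → 4: spanning tree (3 edges) plus one recovered edge
```

With θ = 0.7 only node 3 violates the threshold (its tree wd'/wd is 0.6), so exactly one extra edge touching it is restored; raising θ would recover more edges and densify the preconditioner.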

Slide 15

Performance Model Guided Sparsification

  • Runtime performance model can help find the optimal θ

– Which is better: a denser or sparser support graph?

Total runtime:  T_tot = N · T_GMRES + T_LU

– Denser preconditioner: greater LU factorization time T_LU, but fewer GMRES iterations N
– Sparser preconditioner: less LU factorization time T_LU, but more GMRES iterations N

Goal: minimize T_tot by finding a proper matching-factor threshold θ.
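The tradeoff can be sketched with a hypothetical cost model (all constants are made up; the real flow uses measured or symbolically predicted costs):

```python
# T_tot(theta) = N(theta) * T_GMRES + T_LU(theta)
# A larger theta recovers more edges, so T_LU grows while N shrinks.

T_GMRES = 0.4                      # assumed cost per GMRES iteration (s)

def t_lu(theta):                   # denser preconditioner: costlier LU
    return 50.0 * theta ** 1.5

def n_iters(theta):                # denser preconditioner: fewer iterations
    return max(5.0, 200.0 * (1.0 - theta))

def t_tot(theta):
    return n_iters(theta) * T_GMRES + t_lu(theta)

thetas = [i / 100 for i in range(10, 100)]
best = min(thetas, key=t_tot)
# the optimum beats both extremes of the sweep
print(t_tot(best) <= t_tot(0.1) and t_tot(best) <= t_tot(0.99))  # → True
```

Under any model of this monotone shape the optimum sits strictly between the sparsest and densest settings, which is exactly what the performance-model-guided selection of θ exploits.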

Slide 16

Finding the Optimal Weighted Degree Threshold θ

  • Optimal weighted degree threshold θ

– Exploit symbolic matrix factorization results to quickly identify the optimal θ
– E.g. find the θ that maximizes the change in the flop count of Cholesky factorization

Slide 17

Performance Modeling Results

  • Experimental results on the IBM power grid benchmarks

Runtime and flops vs. weighted degree threshold θ

Runtime results of manual and automatic sparsification schemes

Slide 18

Test Cases for Experiments

CKT  | #nunk | #Mos | #R  | #C   | #L  | #I
ldo1 | 3M    | 84K  | 6M  | 250K | 7K  | 250K
ldo2 | 5M    | 71K  | 10M | 422K | 12K | 422K
pg1  | 3M    | 144  | 6M  | 250K | 7K  | 250K
pg2  | 6M    | 144  | 11M | 490K | 14K | 490K
clk1 | 3M    | 65K  | 6M  | 3M   | -   | -
clk2 | 6M    | 65K  | 11M | 6M   | -   | -

  • Circuit Design Parameters:
  • #nunk: number of unknowns in the circuits
  • #Mos: number of MOSFETs
  • #R: number of resistors
  • #L: number of inductors
  • #C: number of capacitors
  • #I: number of current sources

Three Circuit Design Types:

  • ldo: large PDNs with on-chip VRs
  • pg: large PDNs with power gating
  • clk: clock distribution network
Slide 19

Results of Performance Model Guided Sparsification

  • Experimental results for a large PDN with multiple VRs

– The performance-model-guided sparsification approach achieves nearly-optimal runtime

Figure: runtime of a single NR step using different θ.

Slide 20

Experimental Results

  • Runtime comparison for transient analysis (100 time steps):

CKT  | #NR | Direct Time(s) | GPSCP #GMRES | GPSCP Time(s) | Speedup
ldo1 | 237 | 279,629 | 4,130 | 15,368 | 18X
ldo2 | 314 | -       | 3,979 | 23,793 | -
pg1  | 222 | 108,784 | 3,381 | 10,204 | 11X
pg2  | 421 | 185,892 | 3,478 | 14,206 | 13X
clk1 | 132 | 50,688  | 1,452 | 3,493  | 14X
clk2 | 219 | 112,497 | 2,555 | 8,001  | 14X

  • Memory comparison:

CKT  | Direct | GPSCP (mem/reduction)
ldo1 | 4.2GB  | 0.8GB / 5X
ldo2 | -      | 1.1GB / -
pg1  | 3.2GB  | 0.8GB / 4X
pg2  | 7.8GB  | 1.6GB / 5X
clk1 | 4.3GB  | 0.8GB / 5X
clk2 | 10.0GB | 1.4GB / 7X

Slide 21

Experimental Results (2)

  • A large PDN with multiple embedded VRs
Slide 22

RF Simulation Methods

  • For nonlinear RF circuits, output is usually quasi-periodic

– SPICE may require simulating many periods to reach steady state
– Time-domain shooting methods cannot handle distributed devices

  • Harmonic Balance (HB) analysis for steady-state RF simulation

– HB analysis can capture the steady-state spectral response directly
– Harmonic balance also refers to balancing the currents between the linear and nonlinear portions at every harmonic frequency

Figure: a nonlinear circuit driven by cos(ωt) produces an output v that may contain frequencies other than ω (time-domain waveform in ps, frequency-domain spectrum in MHz/dB).

Slide 23

HB Analysis of RF Circuits

  • Non-autonomous circuit analysis [1]

  f(x(t)) + (d/dt) q(x(t)) + ∫ y(t − τ) x(τ) dτ + b(t) = 0

– x(t): state variables
– y(t): impulse response function of the linear circuit components
– q(·): dynamic nonlinearities
– f(·): static nonlinearities
– b(t): time-dependent excitation sources
– x(t), q(·), and f(·) are typically periodic functions

[1] K. S. Kundert and A. Sangiovanni-Vincentelli. Simulation of Nonlinear Circuits in the Frequency Domain, IEEE Trans. CAD, 1986

Slide 24

HB Analysis of RF Circuits (2)

  • HB Jacobian matrix (frequency domain)

– Γ and Γ⁻¹ represent the Fast Fourier Transform (FFT) and the Inverse Fast Fourier Transform (IFFT), respectively
– G and C denote the linearizations of f(·) and q(·) at s time-domain sample points (s = 2k + 1, where k is the number of positive frequencies)
– J_hb includes many dense blocks introduced by Γ G Γ⁻¹ and Γ C Γ⁻¹

  J_hb = Γ G Γ⁻¹ + j 2π f Ω Γ C Γ⁻¹ + Y

  G = diag(∂f/∂x|_{t_1}, ∂f/∂x|_{t_2}, …, ∂f/∂x|_{t_s}),   C = diag(∂q/∂x|_{t_1}, …, ∂q/∂x|_{t_s})

  Ω = diag(−kI, …, −I, 0, I, …, kI)
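Why those blocks are dense can be checked numerically: conjugating a diagonal time-domain linearization by the DFT matrix (a toy stand-in for Γ, made-up sample values) yields a dense circulant block:

```python
import numpy as np

s = 5                                   # s = 2k+1 sample points (k = 2)
Gamma = np.fft.fft(np.eye(s)) / s       # DFT matrix acting as Gamma
Gamma_inv = np.linalg.inv(Gamma)

g = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])  # time-varying conductance samples
Gf = Gamma @ g @ Gamma_inv              # frequency-domain block of J_hb

# circulant: each row is a cyclic shift of the previous one
first = Gf[0]
circ = np.array([np.roll(first, i) for i in range(s)])
print(np.allclose(Gf, circ))            # → True
```

Pointwise multiplication in time becomes circular convolution across harmonics in frequency, so a perfectly sparse time-domain diagonal turns into a fully dense circulant block, which is the structural reason HB Jacobians are expensive to factor directly.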

Slide 25

Challenges in Harmonic Balance (HB) Analysis

  • Direct Methods for RF HB circuit simulation (A. Mehrotra et al, DAC’09)

– Challenged by solving large yet non-sparse Jacobian matrices
– Cons: computational/memory cost grows quickly with circuit size

  • Traditional iterative methods for HB analysis (P. Feldmann et al, CICC’96; W. Dong et al, TCAD’09)

– Pros: black-box, matrix-oriented, memory-efficient
– E.g. ILU preconditioners, domain-decomposition preconditioners
– Cons: inefficient/unreliable for strongly nonlinear RF systems

  Γ · diag(g_{t_1}, g_{t_2}, …, g_{t_s}) · Γ⁻¹ = circulant(G_1, G_2, …, G_s)

  where [G_1, G_2, …, G_s]^T = FFT([g_{t_1}, g_{t_2}, …, g_{t_s}]^T)

Dense circulant matrices arise from the FFT/IFFT operations.

Slide 26

Graph Sparsification Approach to HB Analysis

  • From graph sparsification to Jacobian matrix sparsification

– Modified nodal analysis (MNA) matrix reduction: 20%-38% fewer entries
– Fill-ins during LU reduction: 60% fewer
– LU factorization speedup: 50X

Figure: before graph sparsification, the MNA matrix maps to an HB Jacobian whose LU factorization produces many (block) fill-ins; after sparsification, both the MNA matrix and the HB Jacobian factor with far fewer fill-ins.

Slide 27

Conclusion

  • Graph sparsification approaches to circuit simulations

– MNA matrix decomposition into Laplacian and complement matrices
– Performance-guided graph sparsification of the Laplacian matrix
– Support-circuit preconditioner construction

  • Our preliminary results

– Highly reliable convergence for time- and frequency-domain simulations
– Up to 18X (21X) speedup and 7X (6X) memory reduction for time-domain (frequency-domain) simulations

– Scalable to large post-layout integrated circuits

  • Future work

– Will explore spectral graph sparsification methods
– Will exploit heterogeneous CPU-GPU computing platforms

Slide 28

Nonlinear Devices Evaluation in HB

  • Evaluation of nonlinear devices

– Freq → Time: terminal voltage waveforms
– Time domain: evaluate current (and derivative) waveforms
– Time → Freq: currents (derivatives) in the frequency domain

Terminal voltage spectrum → IFFT/IAPDFT → terminal voltage samples → device evaluation → Ids samples → FFT/APDFT (Almost-Periodic DFT) → Ids spectrum

  • Terminal voltage samples

– Need samples at 2k + 1 time points (k is the number of positive frequencies) according to the Nyquist-Shannon sampling theorem.

Slide 29

Support-Circuit Preconditioner for HB Analysis

  • Step 1: MNA matrix decomposition of linearized RF circuit

– Laplacian Matrix (P): passive devices such as resistors, capacitors, etc

– Complement Matrix (A): active devices such as transconductances, etc

Figure: an RF circuit (M1, R1, R2, L1, L2, C1, C2) is linearized at sample times t_1 … t_s; each linearized circuit is split into a Laplacian matrix P_ti (passive devices) and a complement matrix A_ti (active devices such as gmVgs), where t_1 … t_s are the s sampled time points.

Slide 30

Support-Circuit Preconditioner for HB Analysis (2)

  • Step 2: Representative Laplacian matrix construction

– Different sampled time points have different entry values
– Normalize the scaled Laplacian matrices of all sampled time points, then average them

Figure: P_t1, P_t2, …, P_ts → normalize → average → representative Laplacian matrix.
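A minimal sketch of this step (toy matrices; the normalization choice, dividing by the largest diagonal entry, is an assumption for illustration):

```python
import numpy as np

def representative_laplacian(laplacians):
    """Normalize each sampled-time Laplacian, then average, so that no
    single time point dominates the representative matrix."""
    normed = [L / np.abs(np.diag(L)).max() for L in laplacians]
    return sum(normed) / len(normed)

# two sampled time points with the same topology but scaled entries
P_t = [np.array([[ 2.0,  -1.0], [ -1.0,  2.0]]),
       np.array([[20.0, -10.0], [-10.0, 20.0]])]

P_rep = representative_laplacian(P_t)
print(P_rep)   # both samples normalize to [[1, -0.5], [-0.5, 1]]
```

Because both samples share one connectivity pattern, the representative matrix preserves that pattern while averaging out the time-varying magnitudes, which is what the subsequent sparsification step needs.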

Slide 31

Support-Circuit Preconditioner for HB Analysis (3)

Figure: the representative Laplacian matrix is viewed as a weighted graph (edge weights such as g1 + C2/h, gds + Cds/h, C1/h, Cgd/h, g2 over nodes 1-5), sparsified into an ultra-sparsifier, and converted back into the sparsified representative Laplacian matrix; combining it with the complement matrix yields the sparsification-pattern matrix.

  • Step 3: sparsification pattern extraction

– Convert the matrix to a weighted graph
– Sparsify the weighted graph and convert it back to matrix form
– Combine with the complement matrix

Slide 32

Support-Circuit Preconditioner for HB Analysis (4)

  • Step 4: MNA matrix sparsification

Figure: each system MNA matrix at t_1, t_2, …, t_s is masked with the sparsification-pattern matrix to produce the corresponding sparsified system MNA matrix.
Slide 33

Figure: the support-circuit preconditioner and the permuted matrix.

Support-Circuit Preconditioner for HB Analysis (5)

  • Step 5: support-circuit block preconditioner generation

– Original matrix: all variables of a single harmonic grouped together
– Permuted matrix: all harmonics of a single variable grouped together
– The circulant matrix structure in HB comes from the FFT/IFFT operations

Figure: the sparsified MNA matrices are mapped through the FFT into the block-circulant structure Γ · diag(g_1, …, g_s) · Γ⁻¹ = circulant(G_1, …, G_s), with [G_1, …, G_s]^T = FFT([g_1, …, g_s]^T); a permutation then groups all harmonics of each variable together before factorization.

Slide 34

Case Study : Double-balanced Gilbert Mixer

  • MOSFET linearization model

Figure: double-balanced Gilbert mixer schematic (M1-M6, R1-R10, L0-L3, C0, C1, inputs Vlo+/Vlo−, Vrf+/Vrf−, supply VDD), with bracketed node indices [xx].

  • Linearized passive network (Laplacian matrix) extraction

Figure: MOSFET linearization model with terminals G, D, S, B and elements Rds, gmVgs, gnVbs, Cgd, Cgs; [xx] denotes the node index.

Slide 35

Case Study : Double-balanced Gilbert Mixer (cont.)

  • Ultra-sparsifier support graph construction

– Step 1: extract the maximum spanning tree
– Step 2: restore critical edges until reaching the desired approximation quality

Figure: the mixer's Laplacian graph (nodes 1-27), its maximum spanning tree, and the ultra-sparsifier obtained by restoring critical edges.

Slide 36

HB Simulation Engine on CPU-GPU Platform

Figure: HB simulation flow. Start → NR loop { device evaluation → support-circuit preconditioner construction → preconditioner factorization → GMRES iterations → convergence checking } → End.

  • Decompose the MNA matrix into passive and active matrices, then:
    1. performance-modeling-based sparsification configuration
    2. construct the representative passive matrix
    3. extract the sparsification pattern
    4. sparsify the MNA matrix
    5. generate the support-circuit preconditioner
  • GPU-based block LU decomposition
  • Matrix-free iterative solver
Slide 37

Runtime Performance Modeling

  • Lookup table (LUT) for runtime performance modeling

– 2D LUTs predict LU factorization runtime on the GPU
– Two LUTs are created: one for GPU matrix multiplications and one for matrix divisions

Figure: runtime performance lookup table for GPU-based matrix operations, indexed by matrix-operation batch size and matrix size, with bilinear interpolation between entries.
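A minimal LUT sketch (made-up table values and axes; `predict` is a hypothetical helper) showing bilinear interpolation between the measured grid points:

```python
import bisect

batch_axis = [1, 10, 100]            # measured batch sizes
size_axis  = [8, 16, 32]             # measured matrix sizes
# runtime_ms[i][j]: measured runtime for batch_axis[i], size_axis[j]
runtime_ms = [[0.1, 0.2, 0.6],
              [0.4, 0.9, 2.5],
              [3.0, 7.0, 20.0]]

def predict(batch, size):
    """Bilinear interpolation; assumes queries within the measured range."""
    i = max(1, min(bisect.bisect_right(batch_axis, batch), len(batch_axis) - 1))
    j = max(1, min(bisect.bisect_right(size_axis, size), len(size_axis) - 1))
    i0, j0 = i - 1, j - 1
    tx = (batch - batch_axis[i0]) / (batch_axis[i] - batch_axis[i0])
    ty = (size - size_axis[j0]) / (size_axis[j] - size_axis[j0])
    r = runtime_ms
    return ((1 - tx) * (1 - ty) * r[i0][j0] + tx * (1 - ty) * r[i][j0]
            + (1 - tx) * ty * r[i0][j] + tx * ty * r[i][j])

print(predict(10, 16))               # → 0.9 (exact grid point)
print(round(predict(55, 24), 4))     # → 7.6 (midpoint of the top cell)
```

A handful of offline measurements per operation type is enough to predict the factorization cost of any candidate preconditioner before committing to it.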

Slide 38

Parallel Sparse Block LU Factorization

  • Representative Sparsified MNA Matrix (test matrix)

– Approximates the properties of the block sparse matrices
– Created by averaging all sparsified MNA matrices
– Factorized symbolically to get the fill-in locations

Figure: the sparsified system MNA matrices at t_1 … t_s are averaged into a test matrix; its LU factorization reveals the fill-in locations in the L and U factors.

Slide 39

Parallel Sparse Block LU Factorization (cont.)

  • Data dependency graph

– Column k depends on column j when U(j, k) != 0 [1]
– The dependency graph can be derived from the U factor

Figure: a 9-column example matrix and its data dependency graph, levelized into Levels 0-4.

[1] J. Gilbert and T. Peierls. Sparse partial pivoting in time proportional to arithmetic operations. SIAM J. Sci. Stat. Comput., 9(5):862–873, 1988.
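A minimal sketch (toy sparsity pattern; `levelize` is a hypothetical helper) of deriving those levels from the U factor: level(k) is one more than the maximum level among the columns it depends on.

```python
def levelize(U_pattern, n):
    """U_pattern: set of (j, k) with j < k and U[j, k] != 0.
    Returns the dependency level of each column."""
    level = [0] * n
    for k in range(n):                  # columns in increasing order
        deps = [j for (j, kk) in U_pattern if kk == k]
        if deps:
            level[k] = 1 + max(level[j] for j in deps)
    return level

# hypothetical 6-column upper-triangular nonzero pattern
U = {(0, 2), (1, 2), (2, 4), (3, 4), (4, 5)}
lv = levelize(U, 6)
print(lv)   # → [0, 0, 1, 0, 2, 3]
```

All columns sharing a level are mutually independent, so they can be factorized simultaneously, which is what enables the batched GPU execution on the next slides.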

Slide 40

Parallel Sparse Block LU Factorization (cont.)

  • Modified data dependency graph

– Identify “fake” dependencies: U(j, k) != 0 but L(j+1:n, j) == 0, so column j contributes no update to column k
– Eliminating these fake dependencies exposes more parallelism

Figure: after eliminating fake dependencies, the original five-level dependency graph (Levels 0-4) collapses to three levels (Levels 0-2).

Slide 41

Parallel Sparse Block LU Factorization (cont.)

  • GPU-based block sparse matrix LU factorization

– Levelize the factorization according to the data dependency graph
– Each level contains only matrix multiplication and division operations
– Use the batched matrix multiplication and inversion functions provided by CUBLAS

Figure: levelized block LU factorization on the GPU; within each level, independent block columns are processed with batched matrix multiplications (×) and divisions (÷).

Slide 42

Experiment Setup

Note:

  • Freqs: Number of harmonics
  • Nunk: Number of unknowns

#  | CKT Name      | Nodes | Tones | Freqs | Nunk
1  | mixer 1       | 302   | 2     | 25    | 14,798
2  | mixer 2       | 1,988 | 2     | 41    | 161,028
3  | mixer 3       | 5,262 | 2     | 5     | 47,358
4  | mixer 4       | 7,532 | 2     | 13    | 188,300
5  | LNA + mixer 1 | 343   | 3     | 63    | 42,875
6  | LNA + mixer 2 | 5,303 | 3     | 14    | 143,181
7  | LNA + mixer 3 | 7,573 | 3     | 14    | 204,471

  • Widely used RF circuits serve as the benchmarks
Slide 43

  • Support-circuit preconditioned HB (SCPHB) method

– High robustness and efficiency
– Runtime speedup: up to 21X (compared with the direct solver in DAC’09)
– Memory reduction: up to 6X (compared with the direct solver in DAC’09)

Runtime and Memory Efficiency on CPU

CKT | Direct Time(s) | Direct Mem(GB) | BD Time(s) | BD K-Its | SCPHB Time(s) | SCPHB Mem(GB) | SCPHB K-Its | Speedup
1   | 471.9    | 0.23 | 24.9    | 821  | 145.5   | 0.10 | 204 | 3.24X
2   | 19,263.1 | 7.95 | 5,637.6 | 6731 | 1,408   | 1.72 | 383 | 13.7X
3   | 686.4    | 0.36 | 92.2    | 165  | 69.5    | 0.06 | 229 | 9.8X
4   | 14,153.5 | 4.26 | 1,072.3 | 273  | 1,035.6 | 0.73 | 355 | 21.3X
5   | 2,561.6  | 1.92 | DNF     | DNF  | 821.5   | 1    | 194 | 3.1X
6   | 4,040.9  | 3.34 | DNF     | DNF  | 414.7   | 0.67 | 328 | 9.74X
7   | 6,633.6  | 5.21 | DNF     | DNF  | 791     | 0.83 | 255 | 8.38X

K-Its: GMRES iteration count; DNF: did not finish within 1000 Newton iterations

Slide 44

  • Simulation runtime vs. input power of the LNA + mixer

– BD preconditioner: runtime increases exponentially with input power
– SCPHB preconditioner: runtime remains nearly constant

Runtime Efficiency for Strong Nonlinearities

Slide 45

Scalability

  • Nearly-linear runtime and memory scalability

(a) Runtime scalability (b) Memory scalability