 
Reconstruction of full rank algebraic branching programs
Vineet Nair
Joint work with: Neeraj Kayal, Chandan Saha, Sebastien Tavenas
Arithmetic circuits
Reconstruction problem
➢ f(X) ∈ Q[X] is an m-variate degree-d polynomial computable by a size-s circuit in a circuit class C.
➢ Input: blackbox access to f, that is, an oracle that returns f(α) for any query α ∈ F^m.
➢ Output: a small arithmetic circuit computing f.
➢ The algorithm should run in time poly(m, s, d, b), where b is the bit length of the coefficients of f.
Polynomial identity testing (PIT)
➢ Input: blackbox access to f (query α ∈ F^m, receive f(α)).
➢ Question: is f(X) = 0?
➢ A randomized algorithm for PIT follows easily from the Schwartz-Zippel lemma.
➢ Unlike PIT, no efficient randomized algorithm is known for reconstruction.
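The Schwartz-Zippel test mentioned above can be sketched in a few lines. This is a minimal illustration, not code from the talk: the function name `is_probably_zero`, the choice |S| = 2d, and the number of trials are all assumptions made here.

```python
import random

def is_probably_zero(blackbox, m, d, trials=20, rng=None):
    """Randomized identity test for an m-variate polynomial of degree <= d.

    `blackbox` is any callable that takes a list of m integers and returns
    the value of the polynomial at that point (hypothetical interface).
    """
    rng = rng or random.Random()
    # Assumption: sample from S = {0, ..., 2d-1}. By Schwartz-Zippel a nonzero
    # polynomial vanishes at a random point of S^m with probability <= d/|S| = 1/2.
    size = max(2 * d, 2)
    for _ in range(trials):
        point = [rng.randrange(size) for _ in range(m)]
        if blackbox(point) != 0:
            return False  # a nonzero evaluation certifies f != 0
    return True  # f = 0, except with probability at most 2**-trials

# (x + y)^2 - x^2 - 2xy - y^2 is identically zero; xy + 1 is not.
zero_poly = lambda p: (p[0] + p[1]) ** 2 - p[0] ** 2 - 2 * p[0] * p[1] - p[1] ** 2
nonzero_poly = lambda p: p[0] * p[1] + 1
```

The test has one-sided error: a "nonzero" answer is always correct, while a "zero" answer can be wrong with small probability.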
Previous works
➢ Over finite fields, [Shp07] and [KS09] gave quasi-polynomial time deterministic reconstruction algorithms for depth three circuits with a constant number of product gates.
➢ Over characteristic zero fields, [Sinha16] gave a polynomial time randomized algorithm for depth three circuits with two product gates.
➢ [GKL12] gave a polynomial time randomized algorithm for multilinear depth four circuits with two top-level product gates.
➢ [SV09] and [MV16] gave deterministic polynomial time reconstruction for read-once formulas.
➢ [KS03] and [FS13] gave deterministic quasi-polynomial time reconstruction for ROABPs, set-multilinear ABPs and non-commutative ABPs.
Average-case reconstruction
➢ Progress in reconstruction is slow.
➢ Can we do reconstruction for most circuits in a circuit class C?
[Figure: the circuit class C, with the efficiently reconstructed circuits marked]
Average-case reconstruction
➢ Problem definition: the input f is an m-variate degree-d polynomial picked according to a distribution D on the circuit class C.
➢ Goal: an efficient algorithm that reconstructs f with high probability over the choice of f.
➢ [GKL11] and [GKQ13] gave randomized polynomial time algorithms for average-case reconstruction of multilinear formulas and formulas.
Algebraic branching programs (ABP)
➢ Definition: consider the product of d matrices X_1 • X_2 • … • X_d, where X_1 is a row vector of length w, X_d is a column vector of length w, and X_2, …, X_{d-1} are w x w matrices.
➢ Each entry of X_i, i ∈ [d], is an affine form in the X variables, where |X| = m. Example: a_0 + a_1 x_1 + … + a_m x_m.
➢ The polynomial computed by the ABP is the entry of the 1 x 1 matrix computed as above. The width and length of the ABP are w and d respectively.
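To make the definition concrete, here is a small sketch that evaluates an ABP at a point. The encoding (an affine form as a list of m+1 coefficients, a layer as a list of rows) and the names `eval_affine` and `eval_abp` are assumptions chosen for this illustration.

```python
def eval_affine(form, point):
    """form = [a_0, a_1, ..., a_m] encodes a_0 + a_1*x_1 + ... + a_m*x_m."""
    return form[0] + sum(a * x for a, x in zip(form[1:], point))

def mat_mul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def eval_abp(layers, point):
    """layers[0] is 1 x w, layers[-1] is w x 1, all others are w x w."""
    # Substitute the point into every affine form, then multiply left to right.
    mats = [[[eval_affine(f, point) for f in row] for row in X] for X in layers]
    prod = mats[0]
    for M in mats[1:]:
        prod = mat_mul(prod, M)
    return prod[0][0]

# Width 2, length 2, m = 2: X_1 = (x_1, 1), X_2 = (x_2, x_1 + x_2)^T,
# so the ABP computes x_1*x_2 + x_1 + x_2.
example = [
    [[[0, 1, 0], [1, 0, 0]]],          # X_1: one row with forms x_1 and 1
    [[[0, 0, 1]], [[0, 1, 1]]],        # X_d: one column with forms x_2 and x_1 + x_2
]
value = eval_abp(example, [2, 3])      # 2*3 + 2 + 3 = 11
```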
Distribution on ABPs
➢ Random ABP: fix w, d and m. Pick the coefficients of the affine forms independently and uniformly at random from a large set S ⊆ Q.
➢ Average-case reconstruction: design a reconstruction algorithm for a random(m, w, d, S) ABP.
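The distribution above can be sampled as follows. This is a sketch under assumptions made here: S is passed as a Python list, an affine form is a list of m+1 coefficients, and the function name `random_abp` is hypothetical.

```python
import random

def random_abp(m, w, d, S, rng=None):
    """Sample a random(m, w, d, S) ABP: every coefficient is uniform in S.

    Returns d layers; layer 0 is 1 x w, layer d-1 is w x 1, the rest are
    w x w. Requires d >= 2.
    """
    rng = rng or random.Random()

    def form():
        # m + 1 coefficients: constant term plus one per variable
        return [rng.choice(S) for _ in range(m + 1)]

    def matrix(rows, cols):
        return [[form() for _ in range(cols)] for _ in range(rows)]

    shapes = [(1, w)] + [(w, w)] * (d - 2) + [(w, 1)]
    return [matrix(r, c) for r, c in shapes]

layers = random_abp(m=30, w=3, d=4, S=list(range(-100, 101)), rng=random.Random(0))
num_forms = sum(len(X) * len(X[0]) for X in layers)  # w^2 (d-2) + 2w affine forms
```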
Average-case reconstruction for ABPs
➢ Input: blackbox access to f(X) computable by a random(m, w, d, S) ABP (query α ∈ F^m, receive f(α)).
➢ Output: a small ABP computing f, with high probability.
➢ The algorithm should run in time poly(m, w, d, ρ), where ρ is the bit length of an element of S.
Pseudo-random family
➢ A distribution D on m-variate degree-d polynomials with seed length s = (md)^{O(1)} generates a pseudo-random family if every algorithm that distinguishes a polynomial drawn from D from a uniformly random m-variate polynomial with non-negligible bias runs in time exponential in s.
Candidate family
➢ [Aar08] conjectures that the family Det_n(A·X), where every entry of A ∈ F^{t x m} is chosen uniformly at random from a finite field and m ≪ t = n^2, is pseudo-random.
➢ Example with m = 2, n = 4: the determinant of
  [  x_1 +  x_2    6x_1 +  x_2     x_1 + 3x_2    5x_1 + 4x_2 ]
  [ 8x_1 +  x_2   10x_1 +  x_2    8x_1 + 3x_2    3x_1 + 2x_2 ]
  [ 8x_1 + 2x_2    5x_1 + 4x_2    7x_1 + 9x_2   11x_1 +  x_2 ]
  [ 4x_1 + 3x_2    9x_1 + 3x_2    5x_1 + 6x_2    9x_1 + 7x_2 ]
Iterated matrix multiplication
➢ Definition: consider the product of d matrices X_1 • X_2 • … • X_d, where X_1 is a row vector of length w, X_d is a column vector of length w, and X_2, …, X_{d-1} are w x w matrices.
➢ Each entry of X_i, i ∈ [d], is a distinct variable. The variables are disjoint across matrices.
➢ IMM_{w,d} is the entry of the 1 x 1 matrix computed as above.
Consequence
➢ Det_n and IMM_{w,d} are affine projections of each other [Mahajan, Vinay 97].
➢ Hence it makes sense to ask whether IMM_{w,d}(A·X) is pseudo-random, where the entries of A ∈ F^{t x m} are chosen uniformly at random from a finite set S ⊆ Q and m ≪ t = w^2(d-2) + 2w.
Our Contribution
Main result
Remarks
➢ Does not resolve Aaronson's conjecture:
  • For IMM_{w,d} the conjecture concerns the regime m ≪ w^2 d.
  • Our result holds when m ≥ w^2 d.
➢ Our result works even if the matrices are not of uniform width.
Full rank ABPs
➢ If m ≥ w^2 d then the affine forms in the ABP are Q-linearly independent with high probability.
➢ Full rank ABPs: the set of linear forms in X_1, X_2, …, X_d is Q-linearly independent.
➢ Example (w = 3, d = 3, 15 linear forms in 16 variables):
  X_1 = ( x_1 + x_2     x_2 + x_3     x_3 + x_4 )
  X_2 = [ x_4 + x_5     x_5 + x_6     x_6 + x_7   ]
        [ x_7 + x_8     x_8 + x_9     x_9 + x_10  ]
        [ x_10 + x_11   x_11 + x_12   x_12 + x_13 ]
  X_3 = ( x_13 + x_14   x_14 + x_15   x_15 + x_16 )^T
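Whether a given set of linear forms is Q-linearly independent can be checked by computing the rank of their coefficient matrix. The sketch below does this with exact rational arithmetic for the forms x_i + x_{i+1}, i = 1, …, 15, from the example; the helper names `rank` and `is_full_rank` are assumptions made for this illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank over Q of a matrix given as a list of rows (Gaussian elimination)."""
    mat = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(mat[0]) if mat else 0
    for c in range(ncols):
        # Find a pivot in column c at or below row r.
        pivot = next((i for i in range(r, len(mat)) if mat[i][c] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(r + 1, len(mat)):
            factor = mat[i][c] / mat[r][c]
            if factor:
                mat[i] = [a - factor * b for a, b in zip(mat[i], mat[r])]
        r += 1
    return r

def is_full_rank(forms):
    """forms[k] lists the coefficients of x_1, ..., x_m in the k-th linear form."""
    return rank(forms) == len(forms)

# The 15 forms x_i + x_{i+1} for i = 1..15 in 16 variables.
forms = [[1 if j in (i, i + 1) else 0 for j in range(16)] for i in range(15)]
independent = is_full_rank(forms)
# Repeating a form destroys independence.
dependent = is_full_rank(forms + [forms[0]])
```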
Full rank ABPs
➢ Main result: we design an efficient randomized algorithm to reconstruct full rank ABPs.
Equivalent polynomials
➢ An n-variate polynomial f is equivalent to an n-variate polynomial g if there exists an invertible A ∈ F^{n x n} such that f(X) = g(A·X).
➢ Equivalence test: given f(X) and g(X), is there an invertible A ∈ F^{n x n} such that f(X) = g(A·X)?
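Finding the matrix A is the hard part of an equivalence test. Checking a candidate A, by contrast, is easy with blackbox access and random evaluations. The sketch below covers only that verification step; the function names, the sampling range, and the example polynomials are assumptions made here, not part of the talk.

```python
import random

def apply_matrix(A, x):
    """Return the vector A*x for a list-of-rows matrix A."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def check_candidate(f, g, A, n, d, trials=20, rng=None):
    """Randomized check that f(X) = g(A*X) for n-variate degree-<=d blackboxes.

    Only verifies a given A (it does not search for one, and it does not
    check that A is invertible).
    """
    rng = rng or random.Random()
    size = max(2 * d, 2)  # assumption: sample coordinates from {0, ..., 2d-1}
    for _ in range(trials):
        x = [rng.randrange(size) for _ in range(n)]
        if f(x) != g(apply_matrix(A, x)):
            return False
    return True

# g(y) = y_1*y_2 and f(x) = x_1^2 - x_2^2 are equivalent via A = [[1, 1], [1, -1]].
g = lambda y: y[0] * y[1]
f = lambda x: x[0] ** 2 - x[1] ** 2
A = [[1, 1], [1, -1]]
identity = [[1, 0], [0, 1]]
```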
Equivalent polynomials
➢ Equivalence test for IMM: given f(X), is there an invertible A ∈ F^{n x n} such that f(X) = IMM(A·X)?
➢ Remark: computing a full rank ABP for f is the same as designing an efficient randomized equivalence test for IMM.
Group of symmetries of IMM
➢ Group of symmetries: for an n-variate polynomial g(X), it is the set of all invertible A ∈ F^{n x n} such that g(A·X) = g(X).
➢ Characterization by symmetries: g(X) is characterized by its group of symmetries if the groups of symmetries of f(X) and g(X) are equal if and only if f(X) is a constant multiple of g(X).
➢ Main theorem 2: IMM_{w,d} is characterized by its group of symmetries.
Proof Ideas
Template of the reconstruction algorithm
➢ Assume the input polynomial f is computable by a full rank ABP.
➢ Compute a full rank ABP:
  1. Find the layer spaces.
  2. Glue them together.
➢ Do a polynomial identity test to check whether the polynomial computed by the ABP is f.
  • Yes: output the full rank ABP computing f.
  • No: output 'f is not computable by a full rank ABP'.
Pre-processing
➢ Let an m-variate polynomial f be computed by a full rank ABP of width w and length d.
➢ The number of edges is n = w^2(d-2) + 2w, and m ≥ n.
➢ Two steps of pre-processing:
  • Variable reduction: at the end of this step we get an n-variate f computable by a full rank ABP.
  • Translation equivalence test: at the end of this step the entries in the matrices of the full rank ABP computing f are linear forms (constant term is 0).
Multiple full rank ABPs for f
➢ Suppose f is computable by a full rank ABP X_1 • X_2 • … • X_d.
➢ Then this full rank ABP for f is not unique.
➢ The following transformations still compute f:
  • Transposition
  • Left-right multiplication
  • Corner translations
Transposition
➢ Recall that X_1 and X_d are row and column vectors.
➢ Since the eventual product is a 1 x 1 matrix, the transpose of the product still computes f.
➢ Hence f is also computed by X_d^T • X_{d-1}^T • … • X_1^T.
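The transposition identity can be checked numerically on the matrices obtained by evaluating an ABP at a point. The sketch below uses random integer matrices as a stand-in for such an evaluation; the dimensions, seed and helper names are assumptions made for this illustration.

```python
import random

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def chain(mats):
    """Multiply a list of matrices from left to right."""
    prod = mats[0]
    for M in mats[1:]:
        prod = mat_mul(prod, M)
    return prod

rng = random.Random(1)
w, d = 3, 4
shapes = [(1, w)] + [(w, w)] * (d - 2) + [(w, 1)]
# Stand-in for X_1, ..., X_d evaluated at some point.
layers = [[[rng.randint(-9, 9) for _ in range(c)] for _ in range(r)]
          for r, c in shapes]

forward = chain(layers)                                      # X_1 ... X_d
backward = chain([transpose(M) for M in reversed(layers)])   # X_d^T ... X_1^T
```

Both products are 1 x 1 matrices, and a 1 x 1 matrix equals its own transpose, which is why the two values agree.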
Left-right multiplication
➢ Let A be a w x w invertible matrix with entries from Q.
➢ Replace X_2 with X'_2 = X_2 • A and X_3 with X'_3 = A^{-1} • X_3.
➢ Then f is computed by the product X_1 • X'_2 • X'_3 • … • X_d.
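A quick numeric sanity check of left-right multiplication, again on random integer matrices standing in for an ABP evaluated at a point. The particular matrix A (chosen with determinant 1 so that its inverse is integral), the dimensions and the helper names are assumptions made here.

```python
import random

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def chain(mats):
    """Multiply a list of matrices from left to right."""
    prod = mats[0]
    for M in mats[1:]:
        prod = mat_mul(prod, M)
    return prod

rng = random.Random(3)
w, d = 2, 4
shapes = [(1, w)] + [(w, w)] * (d - 2) + [(w, 1)]
X = [[[rng.randint(-9, 9) for _ in range(c)] for _ in range(r)]
     for r, c in shapes]                     # X[0] is X_1, X[1] is X_2, ...

A = [[2, 1], [1, 1]]                         # det = 1, so the inverse is integral
A_inv = [[1, -1], [-1, 2]]

# Replace X_2 by X_2 * A and X_3 by A^{-1} * X_3; A and A^{-1} cancel in the product.
X_new = [X[0], mat_mul(X[1], A), mat_mul(A_inv, X[2])] + X[3:]
```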
Corner translations
➢ Let B be an anti-symmetric w x w matrix; then X_1 • B • X_1^T = 0.
➢ Example:
                                        [  0   4   5 ]   [ x_1 + x_2 ]
  ( x_1 + x_2   x_2 + x_3   x_3 + x_4 ) • [ -4   0   8 ] • [ x_2 + x_3 ]  =  0
                                        [ -5  -8   0 ]   [ x_3 + x_4 ]
Corner translations
➢ Let B_1, B_2, …, B_w be anti-symmetric w x w matrices.
➢ Let Y be the matrix whose i-th column is B_i • X_1^T.
Corner translations
➢ Replace X_2 with X'_2 = X_2 + Y.
➢ Observe that X_1 • X'_2 = X_1 • X_2, because X_1 • Y = 0: its i-th entry is X_1 • B_i • X_1^T = 0.
➢ Hence f is computed by the product X_1 • X'_2 • X_3 • … • X_d = X_1 • (X_2 + Y) • X_3 • … • X_d.
➢ Similarly we can define corner translations for X_{d-1}.
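The corner translation argument can also be checked numerically. The sketch below builds Y from random anti-symmetric matrices B_1, …, B_w and a random integer row vector standing in for X_1 evaluated at a point; all names, ranges and the seed are assumptions made for this illustration.

```python
import random

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def random_antisymmetric(w, rng):
    """Random integer matrix B with B^T = -B (zero diagonal)."""
    B = [[0] * w for _ in range(w)]
    for i in range(w):
        for j in range(i + 1, w):
            B[i][j] = rng.randint(-9, 9)
            B[j][i] = -B[i][j]
    return B

rng = random.Random(2)
w = 3
X1 = [[rng.randint(-9, 9) for _ in range(w)]]                   # 1 x w row vector
X2 = [[rng.randint(-9, 9) for _ in range(w)] for _ in range(w)]
Bs = [random_antisymmetric(w, rng) for _ in range(w)]

# The i-th column of Y is B_i * X1^T.
cols = [[sum(B[r][k] * X1[0][k] for k in range(w)) for r in range(w)] for B in Bs]
Y = [[cols[j][r] for j in range(w)] for r in range(w)]

X2_shifted = [[X2[r][c] + Y[r][c] for c in range(w)] for r in range(w)]
```

Each entry of X1 • Y is a quadratic form v • B_i • v^T with B_i anti-symmetric, so it vanishes and the shift by Y does not change X1 • X2.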
Uniqueness of the layer spaces
➢ Suppose f is computable by a full rank ABP X_1 • X_2 • … • X_d.
➢ Let X_i also denote the Q-linear space spanned by the linear forms in the matrix X_i.
➢ Let X_{1,2} and X_{d-1,d} denote the Q-linear spaces spanned by the linear forms in X_1, X_2 and in X_{d-1}, X_d respectively.
Uniqueness of the layer spaces
➢ If X'_1 • X'_2 • … • X'_d is another full rank ABP computing f, then
  • either X'_i = X_i for i ∈ [d]\{2, d-1}, with X'_{1,2} = X_{1,2} and X'_{d-1,d} = X_{d-1,d},
  • or X'_i = X_{d+1-i} for i ∈ [d]\{2, d-1}, with X'_{1,2} = X_{d-1,d} and X'_{d-1,d} = X_{1,2}.
Uniqueness of the layer spaces
[Figure: the layer spaces in their original order: X_1 and X_{1,2}, then X_3, X_4, …, X_{d-2}, then X_{d-1,d} and X_d]
Uniqueness of the layer spaces
[Figure: the layer spaces in reversed order: X_d and X_{d-1,d}, then X_{d-2}, X_{d-3}, …, X_3, then X_{1,2} and X_1]