Reconstruction of full rank algebraic branching programs - Vineet Nair

SLIDE 1

Reconstruction of full rank algebraic branching programs

Vineet Nair

Joint work with: Neeraj Kayal, Chandan Saha, Sebastien Tavenas

SLIDE 2

Arithmetic circuits

SLIDE 3

Reconstruction problem

➢ f(X) ∈ Q[X] is an m-variate degree-d polynomial computable by a size-s circuit in a circuit class C.
➢ Input: Blackbox access to f: on a query point α ∈ F^m, the blackbox returns f(α).

SLIDE 4

Reconstruction problem

➢ Input: Blackbox access to f (query α ∈ F^m, receive f(α)).
➢ Output: A small arithmetic circuit computing f.
➢ The algorithm should run in time poly(m,s,d,b), where b is the bit length of the coefficients of f.

SLIDE 5

Polynomial identity testing (PIT):

➢ Input: Blackbox access to f (query α ∈ F^m, receive f(α)). Question: Is f(X) = 0?
➢ A randomized algorithm for PIT follows easily from the Schwartz-Zippel lemma.
➢ Unlike PIT, no efficient randomized algorithm is known for reconstruction.

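As a rough illustration of the randomized PIT mentioned above, here is a minimal sketch of a Schwartz-Zippel style test. The function name, the sampling set {0, …, 100d-1}, the number of trials and the two example blackboxes are all assumptions made for illustration; they are not taken from the talk.

```python
import random

def is_probably_zero(blackbox, m, degree, trials=20, seed=0):
    """Randomized blackbox identity test based on the Schwartz-Zippel lemma.

    blackbox: callable taking a list of m integers and returning f(alpha).
    degree:   an upper bound d on the total degree of f.
    """
    rng = random.Random(seed)
    # Sample each coordinate from a set of size 100*d, so a nonzero f
    # vanishes at a random point with probability at most d/(100*d) = 1/100.
    sample_size = 100 * max(degree, 1)
    for _ in range(trials):
        alpha = [rng.randrange(sample_size) for _ in range(m)]
        if blackbox(alpha) != 0:
            return False  # a nonzero evaluation certifies f != 0
    return True  # f is zero, or every trial was unlucky


# Hypothetical blackboxes for illustration only.
zero_poly = lambda a: (a[0] + a[1]) ** 2 - a[0] ** 2 - 2 * a[0] * a[1] - a[1] ** 2
nonzero_poly = lambda a: a[0] * a[1] - a[2]
```

A "False" answer is always correct; a "True" answer is wrong only with probability at most (1/100)^trials under the sampling assumption above.
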
SLIDE 6

Previous works

➢ Over finite fields, [Shp07] and [KS09] gave quasi-poly time deterministic reconstruction algorithms for depth three circuits with a constant number of product gates.

SLIDE 7

Previous works

➢ Over characteristic zero fields, [Sinha16] gave a poly time randomized algorithm for depth three circuits with two product gates.
➢ [GKL12] gave a poly time randomized algorithm for multilinear depth four circuits with two top-level product gates.

SLIDE 8

Previous works

➢ [SV09], [MV16] gave deterministic poly time reconstruction for read-once formulas.
➢ [KS03], [FS13] gave deterministic quasi-poly time reconstruction for ROABPs, set-multilinear ABPs and non-commutative ABPs.

SLIDE 9

Average-case reconstruction

➢ Progress in reconstruction is slow.
➢ Can we do reconstruction for most circuits in a circuit class C?

[Figure: the circuit class C, with a region labeled "Efficiently reconstructed"]

SLIDE 10

Average-case reconstruction

➢ Problem definition: The input f is an m-variate degree-d polynomial picked according to a distribution D on the circuit class C.
➢ Goal: Design an efficient reconstruction algorithm for f.
➢ [GKL11], [GKQ13] gave randomized poly time algorithms for average-case reconstruction of multilinear formulas and formulas.

SLIDE 11

Algebraic branching programs (ABP)

➢ Definition: Consider the product of d matrices X_1 • X_2 • … • X_d, where X_1 is a row vector of length w, X_d is a column vector of length w and X_2, …, X_{d-1} are w x w matrices.
➢ Each entry of X_i, i ∈ [d], is an affine form in the X variables, where |X| = m; for example, a_0 + a_1 x_1 + … + a_m x_m.
➢ The polynomial computed by the ABP is the entry in the 1x1 matrix computed as above. The width and length of the ABP are w and d respectively.

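To make the definition concrete, the following minimal sketch evaluates an ABP at a point by multiplying out its layers. The encoding of an affine form as a coefficient list [a0, a1, ..., am], the helper names and the small example ABP are assumptions chosen for illustration.

```python
def eval_affine(form, point):
    """form = [a0, a1, ..., am] encodes a0 + a1*x1 + ... + am*xm."""
    return form[0] + sum(a * x for a, x in zip(form[1:], point))

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def eval_abp(layers, point):
    """layers[i] is a matrix (list of rows) of affine forms with shapes
    1 x w, w x w, ..., w x 1, so the full product is a 1 x 1 matrix."""
    product = None
    for layer in layers:
        numeric = [[eval_affine(form, point) for form in row] for row in layer]
        product = numeric if product is None else mat_mul(product, numeric)
    return product[0][0]

# Width w = 2, length d = 3, m = 2 variables.
X1 = [[[0, 1, 0], [0, 0, 1]]]                        # row vector (x1, x2)
X2 = [[[1, 0, 0], [0, 1, 0]],                        # [[1,  x1],
      [[0, 0, 1], [1, 0, 0]]]                        #  [x2, 1 ]]
X3 = [[[0, 1, 0]], [[1, 0, 0]]]                      # column vector (x1, 1)

# f = 2*x1^2 + x1*x2^2 + x2, evaluated at (x1, x2) = (3, 5).
value = eval_abp([X1, X2, X3], [3, 5])
```
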
SLIDE 12

Distribution on ABPs

➢ Random ABP: Fix w, d and m. Pick the coefficients of the affine forms independently and uniformly at random from a large set S ⊆ Q.
➢ Average-case reconstruction: Design a reconstruction algorithm for a random(m,w,d,S) ABP.

SLIDE 13

Average-case reconstruction for ABPs

➢ Input: Blackbox access to f(X) computable by a random(m,w,d,S) ABP (query α ∈ F^m, receive f(α)).
➢ Output: With high probability, a small ABP computing f.
➢ The algorithm should run in time poly(m,w,d,ρ), where ρ is the bit length of an element in S.

SLIDE 14

Pseudo-random family

➢ A distribution D on m-variate degree-d polynomials with seed length s = (md)^{O(1)} generates a pseudo-random family if
➢ every algorithm that distinguishes a polynomial drawn from D from a uniformly random m-variate polynomial with a non-negligible bias runs in time exponential in s.

SLIDE 15

Candidate family

➢ [Aar08] conjectures that the family Det_n(AX), where every entry of A ∈ F^{t x m} is chosen uniformly at random from a finite field and m << t = n^2, is pseudo-random.
➢ Example (m = 2, n = 4):

    x1+x2    6x1+x2   x1+3x2   5x1+4x2
    8x1+x2   10x1+x2  8x1+3x2  3x1+2x2
    8x1+2x2  5x1+4x2  7x1+9x2  11x1+x2
    4x1+3x2  9x1+3x2  5x1+6x2  9x1+7x2

SLIDE 16

Iterated matrix multiplication

➢ Definition: Consider the product of d matrices X_1 • X_2 • … • X_d, where X_1 is a row vector of length w, X_d is a column vector of length w and X_2, …, X_{d-1} are w x w matrices.
➢ Each entry of X_i, i ∈ [d], is a distinct variable. The variables are disjoint across matrices.
➢ IMM_{w,d} is the entry in the 1x1 matrix computed as above.

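A small sketch of IMM_{w,d} as a function of its variables may help here. The flat, row-major ordering of the variables and the function names are assumptions for illustration; the variable count w^2(d-2) + 2w follows from the definition (w entries in each end vector, w^2 in each inner matrix).

```python
def imm_variable_count(w, d):
    """Number of variables of IMM_{w,d}: w for each of the two end
    vectors plus w*w for each of the d-2 inner matrices."""
    return w * w * (d - 2) + 2 * w

def eval_imm(w, d, values):
    """Evaluate IMM_{w,d} at a point given as a flat list of values,
    filled into X_1, ..., X_d in row-major order."""
    assert len(values) == imm_variable_count(w, d)
    it = iter(values)
    row = [next(it) for _ in range(w)]                      # X_1 (1 x w)
    for _ in range(d - 2):                                  # X_2 .. X_{d-1}
        M = [[next(it) for _ in range(w)] for _ in range(w)]
        row = [sum(row[k] * M[k][j] for k in range(w)) for j in range(w)]
    col = [next(it) for _ in range(w)]                      # X_d (w x 1)
    return sum(r * c for r, c in zip(row, col))

# (1, 2) * identity * (3, 4)^T = 1*3 + 2*4
example = eval_imm(2, 3, [1, 2, 1, 0, 0, 1, 3, 4])
```
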
SLIDE 17

Consequence

➢ Det_n and IMM_{w,d} are affine projections of each other [Mahajan, Vinay 97].
➢ Hence, it makes sense to ask whether IMM_{w,d}(AX), where every entry of A ∈ F^{t x m} is chosen uniformly at random from a finite S ⊆ Q and m << t = w^2(d-2) + 2w, is pseudo-random.

SLIDE 18

Our Contribution

SLIDE 19

Main result

SLIDE 20

Remarks

➢ Does not resolve Aaronson's conjecture:
  • For IMM_{w,d}, the conjecture applies when m << w^2 d.
  • Our result holds when m ≥ w^2 d.
➢ Our result works even if the matrices are not of uniform width.

SLIDE 21

Full rank ABPs

➢ If m ≥ w^2 d then the affine forms in the ABP are Q-linearly independent with high probability.
➢ Full rank ABP: the linear forms in X_1, X_2, …, X_d are Q-linearly independent.
➢ Example:

    X_1 = [ x1+x2   x2+x3   x3+x4 ]

          [ x4+x5    x5+x6    x6+x7   ]
    X_2 = [ x7+x8    x8+x9    x9+x10  ]
          [ x10+x11  x11+x12  x12+x13 ]

    X_3 = [ x13+x14   x14+x15   x15+x16 ]^T

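The full rank condition can be checked directly when the linear forms are known explicitly. The sketch below does this with exact Gaussian elimination over Q, applied to the fifteen forms x_i + x_{i+1} (i = 1..15) appearing in the example above; the function names and the coefficient-vector encoding are assumptions for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank over Q by Gaussian elimination with exact fractions."""
    M = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def is_full_rank(forms):
    """forms: coefficient vectors of all linear forms in the ABP."""
    return rank(forms) == len(forms)

# The 15 forms x_i + x_{i+1}, i = 1..15, over 16 variables.
forms = [[1 if j in (i, i + 1) else 0 for j in range(16)] for i in range(15)]

# Adding x1 + 2*x2 + x3 = (x1 + x2) + (x2 + x3) creates a dependency.
dependent = forms + [[1, 2, 1] + [0] * 13]
```
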
SLIDE 22

Full rank ABPs

➢ If m ≥ w^2 d then the affine forms in the ABP are Q-linearly independent with high probability.
➢ Full rank ABP: the linear forms in X_1, X_2, …, X_d are Q-linearly independent.
➢ Main result: We design an efficient randomized algorithm to reconstruct full rank ABPs.

SLIDE 23

Equivalent polynomials

➢ An n-variate polynomial f is equivalent to an n-variate polynomial g if there exists an invertible A ∈ F^{n x n} such that f(X) = g(AX).
➢ Equivalence test: Given g(X) and f(X), is there an invertible A ∈ F^{n x n} such that f(X) = g(AX)?

SLIDE 24

Equivalent polynomials

➢ Equivalence test: Given IMM(X) and f(X), is there an invertible A ∈ F^{n x n} such that f(X) = IMM(AX)?

Remark: Computing a full rank ABP for f is the same as designing an efficient randomized equivalence test for IMM.

SLIDE 25

Group of symmetries of IMM

➢ Group of symmetries: For an n-variate polynomial g(X), it is the set of all invertible A ∈ F^{n x n} such that g(AX) = g(X).
➢ Characterization by symmetries: g(X) is characterized by its group of symmetries if the following holds:
  • The groups of symmetries of f(X) and g(X) are equal if and only if f(X) is a constant multiple of g(X).
➢ Main theorem 2: IMM_{w,d} is characterized by its group of symmetries.

SLIDE 26

Proof Ideas

SLIDE 27

Template of the reconstruction algorithm

➢ Assume the input polynomial f is computable by a full rank ABP.
➢ Compute a full rank ABP:
  • 1. Find the layer spaces
  • 2. Glue them together
➢ Do a polynomial identity test to check whether the polynomial computed by the ABP is f.
  • yes: Output the full rank ABP computing f.
  • no: Output 'f is not computable by a full rank ABP'.

SLIDE 28

Pre-processing

➢ Let an m-variate polynomial f be computed by a width w and length d full rank ABP.
➢ The number of edges is n = w^2(d-2) + 2w, and m ≥ n.
➢ Two steps of pre-processing:
  • Variable reduction: At the end of this step we get an n-variate f computable by a full rank ABP.
  • Translation equivalence test: At the end of this step the entries in the matrices of the full rank ABP computing f are linear forms (constant term is 0).

SLIDE 29

Multiple full rank ABPs for f

➢ Suppose f is computable by a full rank ABP X_1 • X_2 • … • X_d.
➢ Then this full rank ABP for f is not unique.
➢ The following transformations still compute f:
  • Transposition
  • Left-right multiplication
  • Corner translations

SLIDE 30

Transposition

➢ Recall that X_1 and X_d are row and column vectors.
➢ Since the eventual product is a 1x1 matrix, the transpose of the product still computes f.
➢ Hence f is also computed by X_d^T • X_{d-1}^T • … • X_1^T.

SLIDE 31

Left-right multiplication

➢ Let A be a w x w invertible matrix with entries from Q.
➢ Replace X_2 with X'_2 = X_2 • A and X_3 with X'_3 = A^{-1} • X_3.
➢ f is computed by the product X_1 • X'_2 • X'_3 • … • X_d.

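The left-right multiplication can be sanity-checked numerically: inserting A • A^{-1} between two adjacent layers leaves the product unchanged. The sketch below uses a hypothetical instance with w = 2 and d = 4, where the matrices stand for the layers already evaluated at some point; all numbers are arbitrary.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Layers of a width-2, length-4 ABP evaluated at some point (arbitrary values).
X1 = [[1, 2]]
X2 = [[3, 1], [0, 2]]
X3 = [[1, 4], [2, 5]]
X4 = [[7], [1]]

# An invertible matrix A (determinant 1) and its inverse.
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(1)]]
A_inv = [[Fraction(1), Fraction(-1)], [Fraction(-1), Fraction(2)]]

original = mat_mul(mat_mul(mat_mul(X1, X2), X3), X4)
X2_new = mat_mul(X2, A)        # X'_2 = X_2 * A
X3_new = mat_mul(A_inv, X3)    # X'_3 = A^{-1} * X_3
transformed = mat_mul(mat_mul(mat_mul(X1, X2_new), X3_new), X4)
```
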
SLIDE 32

Corner translations

➢ Let B be an anti-symmetric w x w matrix, then X_1 • B • X_1^T = 0.
➢ Example:

                              [  0   4   5 ]
    [ x1+x2  x2+x3  x3+x4 ] • [ -4   0   8 ] • [ x1+x2  x2+x3  x3+x4 ]^T = 0
                              [ -5  -8   0 ]

SLIDE 33

Corner translations

➢ Let B_1, B_2, …, B_w be anti-symmetric w x w matrices.
➢ Let Y be the matrix such that the i-th column of Y is B_i • X_1^T.

SLIDE 34

Corner translations

➢ Replace X_2 with X'_2 = X_2 + Y.
➢ Observe that X_1 • X'_2 = X_1 • X_2, as X_1 • Y = 0 (its i-th entry is X_1 • B_i • X_1^T = 0).
➢ f is computed by the product X_1 • X'_2 • X_3 • … • X_d = X_1 • (X_2 + Y) • X_3 • … • X_d.
➢ Similarly we can define corner translations for X_{d-1}.

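The corner translation can also be checked numerically. The sketch below builds Y column by column from random anti-symmetric matrices and confirms that X_1 • Y vanishes; here X_1 and X_2 stand for the layers evaluated at a point, and the random ranges, seed and helper names are assumptions for illustration.

```python
import random

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def random_antisymmetric(w, rng):
    """A w x w integer matrix B with B^T = -B (zero diagonal)."""
    B = [[0] * w for _ in range(w)]
    for i in range(w):
        for j in range(i + 1, w):
            B[i][j] = rng.randint(-9, 9)
            B[j][i] = -B[i][j]
    return B

rng = random.Random(1)
w = 3
x1 = [[rng.randint(-9, 9) for _ in range(w)]]        # X_1 at a point (1 x w)
x1_T = [[v] for v in x1[0]]                          # its transpose (w x 1)
X2 = [[rng.randint(-9, 9) for _ in range(w)] for _ in range(w)]

# The i-th column of Y is B_i * X_1^T.
columns = [mat_mul(random_antisymmetric(w, rng), x1_T) for _ in range(w)]
Y = [[columns[j][i][0] for j in range(w)] for i in range(w)]

# X'_2 = X_2 + Y
X2_new = [[X2[i][j] + Y[i][j] for j in range(w)] for i in range(w)]
```

Since each entry of X_1 • Y has the form X_1 • B_i • X_1^T, the check passes for any choice of the B_i, not only for this seed.
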
SLIDE 35

Uniqueness of the layer spaces

➢ Suppose f is computable by a full rank ABP X_1 • X_2 • … • X_d.
➢ Let X_i also denote the Q-linear space spanned by the linear forms in the matrix X_i (the i-th layer space).
➢ X_{1,2} and X_{d-1,d} denote the Q-linear spaces spanned by the linear forms in X_1, X_2 and in X_{d-1}, X_d respectively.

SLIDE 36

Uniqueness of the layer spaces

➢ If X'_1 • X'_2 • … • X'_d computes f, then (as equalities of layer spaces)
➢ either
  • X'_i = X_i for i ∈ [d]\{2,d-1}
  • X'_{1,2} = X_{1,2} and X'_{d-1,d} = X_{d-1,d}
➢ or
  • X'_i = X_{d+1-i} for i ∈ [d]\{2,d-1}
  • X'_{1,2} = X_{d-1,d} and X'_{d-1,d} = X_{1,2}

SLIDE 37

Uniqueness of the layer spaces

[Figure: layer spaces labeled X_1, X_3, X_4, X_{d-2}, X_d, X_{1,2}, X_{d-1,d}]

SLIDE 38

Uniqueness of the layer spaces

[Figure: layer spaces labeled X_d, X_{d-2}, X_3, X_1, X_{d-1,d}, X_{d-3}, X_{1,2}]

SLIDE 39

Group of symmetries of IMM

➢ The set of all invertible A ∈ F^{n x n} such that IMM_{w,d}(AX) = IMM_{w,d}(X).
➢ We show that the group of symmetries is generated by the following subgroups:
  • T denotes the group corresponding to transpositions
  • M denotes the group corresponding to left-right multiplications
  • C denotes the group corresponding to corner translations

SLIDE 40

Group of symmetries of IMM

➢ Main Theorem: G_IMM = C ⋊ H, where H = M ⋊ T.
➢ C is a normal subgroup in G_IMM and M is a normal subgroup in H.
➢ We also show that IMM is characterized by its group of symmetries.
  • That is, any polynomial with the same symmetry group is a constant multiple of IMM.

SLIDE 41

Computing the full rank ABP: Step 1

➢ Computing the layer spaces:
  • Study the Lie algebra of the group of symmetries of IMM_{w,d}.
➢ [Kay12] The Lie algebra of the group of symmetries of f is the set of matrices A = (a_{i,j}), i,j ∈ [n], satisfying [defining equation missing in the extracted slide].
➢ We use only the vector space property of the algebra.

SLIDE 42

Invariant spaces

➢ Invariant space: Let M: Q^n → Q^n be a linear operator. U ⊆ Q^n is an invariant space if M(U) ⊆ U.
➢ The definition can be extended to a set of linear operators {M_1, M_2, …, M_n}.
➢ The layer spaces of an f computed by a full rank ABP are intimately connected to the invariant spaces of the Lie algebra of f.

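The invariance condition M(U) ⊆ U is a rank condition: adding the images of a basis of U must not increase the dimension. The sketch below tests this over Q for a set of operators; the function names and the small example operator are assumptions for illustration and are unrelated to the Lie algebra of IMM.

```python
from fractions import Fraction

def rank(rows):
    """Rank over Q by Gaussian elimination with exact fractions."""
    M = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def is_invariant(operators, basis):
    """basis: vectors spanning U. True iff M(U) is contained in U
    for every operator M (each M is a square matrix, list of rows)."""
    images = [[sum(M[i][k] * u[k] for k in range(len(u))) for i in range(len(M))]
              for M in operators for u in basis]
    return rank(basis + images) == rank(basis)

# Example operator on Q^3: it maps e1 to e1 and e2 to e1 + e2.
M = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 2]]
```
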
SLIDE 43

Computing the layer spaces

➢ Compute a basis of the Lie algebra of f.
  • Easy: involves solving a set of linear dependencies.
➢ Compute the irreducible invariant spaces of the Lie algebra of f.
  • Since f and IMM are equivalent, their Lie algebras are conjugates of each other.
➢ Compute the layer spaces from the irreducible invariant spaces.
  • We show that the layer spaces are in fact the irreducible invariant spaces in some sense.

SLIDE 44

Computing the full rank ABP: Step 2

➢ Ordering the layer spaces: We use evaluation dimension to order the layer spaces.
➢ Definition:
  • The evaluation dimension of a polynomial H(X) is defined with respect to a set of variables S ⊆ X.
  • Evaldim_S[H(X)] is equal to dim(span{ H(X) with x_j = α_j for all x_j ∈ S : α_j ∈ F }).

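A minimal sketch of how evaluation dimension can be estimated: substitute random values for the variables in S, collect the restricted polynomials as coefficient vectors, and take their rank. The dictionary encoding of polynomials, the sampling range and the sample count are assumptions for illustration; the rank of finitely many samples is in general only a lower bound on the true evaluation dimension.

```python
import random
from fractions import Fraction

def rank(rows):
    """Rank over Q by Gaussian elimination with exact fractions."""
    M = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def restrict(poly, S, alpha):
    """Substitute x_j = alpha[j] for j in S. poly maps exponent tuples to
    coefficients; the result uses the same encoding with S-exponents zeroed."""
    out = {}
    for mono, coeff in poly.items():
        value = coeff
        for j in S:
            value *= alpha[j] ** mono[j]
        rest = tuple(0 if j in S else e for j, e in enumerate(mono))
        out[rest] = out.get(rest, 0) + value
    return out

def evaldim(poly, S, samples=20, seed=0):
    """Estimate Evaldim_S[poly] from random substitutions."""
    rng = random.Random(seed)
    restricted = [restrict(poly, S, {j: rng.randint(1, 1000) for j in S})
                  for _ in range(samples)]
    monomials = sorted({mono for p in restricted for mono in p})
    return rank([[p.get(mono, 0) for mono in monomials] for p in restricted])

# H = x0*x2 + x1*x3 over variables x0..x3.
H = {(1, 0, 1, 0): 1, (0, 1, 0, 1): 1}
```

For this H and S = {x0, x1}, the restrictions are α0*x2 + α1*x3, which span a space of dimension 2.
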
SLIDE 45

Ordering the layer spaces

➢ We make the variables in distinct layers disjoint by mapping the basis vectors of the layer spaces to distinct variables.
➢ Then we find the ordering inductively.

SLIDE 46

Ordering the layer spaces

➢ Base Case

[Figure with labels "Evaluation dimension = w" and "Evaluation dimension = w^2"]

SLIDE 47

Ordering the layer spaces

➢ Inductive Step

[Figure with labels "Evaluation dimension = w" and "Evaluation dimension = w^2"]

SLIDE 48

Thank You