SLIDE 1

Analytic algorithms for the moment polytope

Cole Franks

Rutgers University

SLIDE 2

Based on joint work with

Peter Bürgisser, Ankit Garg, Rafael Oliveira, Michael Walter, Avi Wigderson

Mainly from “Towards a theory of non-commutative optimization: geodesic 1st and 2nd order methods for moment maps and polytopes”, FOCS 2019

SLIDE 3

Outline

1. Moment polytopes by example
2. Algorithms for the general problem

SLIDE 4

Moment polytopes

SLIDE 5

Motivating question

Horn’s problem: Are $\lambda_1, \lambda_2, \lambda_3 \in \mathbb{R}^n$ the spectra of three $n \times n$ Hermitian matrices $H_1, H_2, H_3$ such that $H_1 + H_2 = H_3$? If so, can one find the matrices efficiently?

SLIDE 7

Horn set

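Let $V = \mathbb{P}(\mathrm{Mat}(n)^2)$ and define $\mu : V \to \mathrm{Herm}(n)^3$ by

$$\mu : [A_1, A_2] \mapsto \frac{\left(A_1 A_1^\dagger,\ A_2 A_2^\dagger,\ A_1^\dagger A_1 + A_2^\dagger A_2\right)}{\|A_1\|^2 + \|A_2\|^2}.$$

Note $\mathrm{eigs}(A A^\dagger) = \mathrm{eigs}(A^\dagger A)$, so

$$\mathrm{eigs}(A_1 A_1^\dagger),\quad \mathrm{eigs}(A_2 A_2^\dagger),\quad \mathrm{eigs}(A_1^\dagger A_1 + A_2^\dagger A_2)$$

is a “yes” instance to Horn’s problem (in fact, all such instances take this form).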
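
The observation above can be checked numerically. The following is a minimal sketch, not part of the original slides: it assumes the norm in the denominator is the Frobenius norm, and the helper names `dagger` and `horn_moment_map` are made up for illustration. It builds Hermitian matrices $H_1 + H_2 = H_3$ from a random pair $(A_1, A_2)$ and compares their spectra with those of the three components of $\mu$.

```python
import numpy as np

def dagger(M):
    """Conjugate transpose."""
    return M.conj().T

def horn_moment_map(A1, A2):
    """mu([A1, A2]), normalised by ||A1||_F^2 + ||A2||_F^2 (Frobenius norm is an assumption)."""
    norm_sq = np.linalg.norm(A1) ** 2 + np.linalg.norm(A2) ** 2
    return (A1 @ dagger(A1) / norm_sq,
            A2 @ dagger(A2) / norm_sq,
            (dagger(A1) @ A1 + dagger(A2) @ A2) / norm_sq)

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 4
A1 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A2 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
M1, M2, M3 = horn_moment_map(A1, A2)

# Hermitian matrices satisfying H1 + H2 = H3 by construction.
norm_sq = np.linalg.norm(A1) ** 2 + np.linalg.norm(A2) ** 2
H1 = dagger(A1) @ A1 / norm_sq
H2 = dagger(A2) @ A2 / norm_sq
H3 = M3

# eigs(A A^dagger) = eigs(A^dagger A): the spectra of (H1, H2, H3) are
# exactly the spectra of the three components of mu([A1, A2]).
spectra = [np.linalg.eigvalsh(M) for M in (M1, M2, M3)]
for name, spec in zip(("lambda_1", "lambda_2", "lambda_3"), spectra):
    print(name, np.round(spec, 4))
```

Because of the normalisation, the third spectrum sums to 1 and the first two spectra together also sum to 1, which is the trace condition any “yes” instance must satisfy.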

SLIDE 9

Moment polytopes

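$G = \mathrm{GL}(n)$
$\pi : G \to \mathrm{GL}(\mathbb{C}^m)$ a representation of $G$ where $U(n)$ acts unitarily
$V \subset \mathbb{P}(\mathbb{C}^m)$ a projective variety fixed by $G$

The moment map is the map $\mu : V \to \{n \times n \text{ Hermitians}\} =: \mathrm{Herm}(n)$ given by

$$\mu : v \mapsto \nabla_{H \in \mathrm{Herm}(n)} \log \|e^{H} \cdot v\|, \quad \text{with the gradient taken at } H = 0.$$

$i\mu$ is a moment map for $U(n)$ in the physical sense! In particular:

Theorem (Kirwan). The image of the composition

$$V \xrightarrow{\ \mu\ } \mathrm{Herm}(n) \xrightarrow{\ \text{take eigs.}\ } \mathbb{R}^n$$

is a convex polytope in $\mathbb{R}^n$, known as the moment polytope and denoted $\Delta(V)$.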

SLIDE 13

Horn polytope

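$V = \mathbb{P}(\mathrm{Mat}(n)^2)$
$G = \mathrm{GL}(n)^3$
$\pi$ given by

$$(g_1, g_2, g_3) \cdot (A_1, A_2) = (g_1 A_1 g_3^\dagger,\ g_2 A_2 g_3^\dagger).$$

$\mu : V \to \mathrm{Herm}(n)^3$ given by

$$\mu : [A_1, A_2] \mapsto \frac{\left(A_1 A_1^\dagger,\ A_2 A_2^\dagger,\ A_1^\dagger A_1 + A_2^\dagger A_2\right)}{\|A_1\|^2 + \|A_2\|^2}.$$

Thus, the image of

$$V \xrightarrow{\ \mu\ } \mathrm{Herm}(n)^3 \xrightarrow{\ \text{take eigs.}\ } (\mathbb{R}^n)^3$$

is the* solution set of the Horn problem!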
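
As a consistency check (a sketch added here, not taken from the slides), the formula for $\mu$ can be derived from the gradient definition of the moment map. The derivation assumes that $\|\cdot\|$ is the Frobenius norm on pairs of matrices and that the gradient is taken with respect to the trace inner product $\langle H, M \rangle = \mathrm{tr}(HM)$.

```latex
% Assumptions: Frobenius norm, trace inner product, H_1, H_2, H_3 Hermitian.
\begin{align*}
\bigl\| (e^{tH_1}, e^{tH_2}, e^{tH_3}) \cdot (A_1, A_2) \bigr\|^2
  &= \operatorname{tr}\!\bigl(e^{2tH_1} A_1 e^{2tH_3} A_1^\dagger\bigr)
   + \operatorname{tr}\!\bigl(e^{2tH_2} A_2 e^{2tH_3} A_2^\dagger\bigr), \\
\frac{d}{dt}\Big|_{t=0} \bigl\| \cdots \bigr\|^2
  &= 2\operatorname{tr}(H_1 A_1 A_1^\dagger)
   + 2\operatorname{tr}(H_2 A_2 A_2^\dagger)
   + 2\operatorname{tr}\!\bigl(H_3 (A_1^\dagger A_1 + A_2^\dagger A_2)\bigr), \\
\frac{d}{dt}\Big|_{t=0} \log \bigl\| \cdots \bigr\|
  &= \frac{\operatorname{tr}(H_1 A_1 A_1^\dagger)
         + \operatorname{tr}(H_2 A_2 A_2^\dagger)
         + \operatorname{tr}\!\bigl(H_3 (A_1^\dagger A_1 + A_2^\dagger A_2)\bigr)}
          {\|A_1\|^2 + \|A_2\|^2}.
\end{align*}
```

The last line is the pairing of $(H_1, H_2, H_3)$ with the triple $\mu([A_1, A_2])$ stated above, so the two descriptions of $\mu$ agree under these assumptions.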

SLIDE 18

Link to algebra

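Why are moment polytopes interesting? They encode the asymptotic representation theory of the coordinate ring of $V$!

Theorem (Mumford, Ness ’84, Brion ’87). Let $V_{G,\lambda}$ denote the irrep of $G$ of type $\lambda$. Then

$$\bigcup_{k} \frac{1}{k} \left\{\lambda : V_{G,\lambda} \subset \mathbb{C}[V]_k\right\} = \Delta(V) \cap \mathbb{Q}^n.$$

Additional math (Schur-Weyl duality, Saturation [KT00]) $\implies$

$$\text{Horn polytope} \cap (\mathbb{Z}^n)^3 = \left\{(\lambda_1, \lambda_2, \lambda_3) : V_{\mathrm{GL}(n),\lambda_3} \subset V_{\mathrm{GL}(n),\lambda_1} \otimes V_{\mathrm{GL}(n),\lambda_2}\right\}$$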

SLIDE 21

Algorithmic tasks

Input $(V, \pi, \lambda)$:

Projective variety $V$, given as an arithmetic circuit parametrizing it
Representation $\pi$, given as its list of irreducible subrepresentations, encoded as elements of $\mathbb{Z}^n$
Target $\lambda \in \mathbb{Q}^n$

1. membership: determine whether $\lambda \in \Delta(V)$.
2. ε-search: given $\lambda \in \mathbb{R}^n$, either find an element $v \in V$ such that $\|\mu(v) - \mathrm{diag}(\lambda)\| < \varepsilon$, OR correctly declare $\lambda \notin \Delta(V)$.

i.e. find an approximate preimage under $\mu$! 1/exp(poly)-search suffices for membership!

SLIDE 26

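Algorithm for ε-search for Horn polytope (F18)

Input: $(\lambda_1, \lambda_2, \lambda_3) \in (\mathbb{R}^n)^3$ and $\varepsilon > 0$.

1. Choose $A_1, A_2$ at random. Define
   $\mu_1 = A_1 A_1^\dagger$, $\mu_2 = A_2 A_2^\dagger$, $\mu_3 = A_1^\dagger A_1 + A_2^\dagger A_2$.
   Want $\mu_i = \mathrm{diag}(\lambda_i)$.
2. while $\|\mu_3 - \mathrm{diag}(\lambda_3)\| > \varepsilon$, do:
   a. Choose $B$ upper triangular such that $B^\dagger \mu_3 B = \mathrm{diag}(\lambda_3)$. Set $A_i \leftarrow A_i B$.
   b. For $i \in \{1, 2\}$, choose $B_i$ upper triangular s.t. $B_i^\dagger \mu_i B_i = \mathrm{diag}(\lambda_i)$. Set $A_i \leftarrow B_i^\dagger A_i$.
3. output $A_1^\dagger A_1$, $A_2^\dagger A_2$.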
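
The steps above can be sketched in NumPy. This is an illustrative sketch under stated assumptions, not the author's implementation: it assumes all entries of each $\lambda_i$ are strictly positive, that the norm is the Frobenius norm, and that the upper triangular scalings are obtained from a Cholesky factorization (if $\mu = C C^\dagger$ with $C$ lower triangular, then $B = C^{-\dagger}\,\mathrm{diag}(\sqrt{\lambda})$ satisfies $B^\dagger \mu B = \mathrm{diag}(\lambda)$). The names `triangular_scaling` and `horn_scaling` and the `max_iters` cap are hypothetical additions.

```python
import numpy as np

def dagger(M):
    """Conjugate transpose."""
    return M.conj().T

def triangular_scaling(mu, lam):
    """Upper triangular B with B^dagger mu B = diag(lam); needs mu positive definite, lam > 0."""
    C = np.linalg.cholesky(mu)                       # mu = C C^dagger, C lower triangular
    return dagger(np.linalg.inv(C)) * np.sqrt(lam)   # C^{-dagger} diag(sqrt(lam))

def horn_scaling(lam1, lam2, lam3, eps=1e-9, max_iters=1000, seed=0):
    """Alternating triangular scaling; returns (A1^dagger A1, A2^dagger A2)."""
    lam1, lam2, lam3 = (np.asarray(l, dtype=float) for l in (lam1, lam2, lam3))
    n = lam1.size
    rng = np.random.default_rng(seed)
    # Step 1: random starting pair (complex Gaussian entries are an assumption).
    A1 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    A2 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    for _ in range(max_iters):
        mu3 = dagger(A1) @ A1 + dagger(A2) @ A2
        if np.linalg.norm(mu3 - np.diag(lam3)) <= eps:
            break
        # Step 2a: fix the third marginal.
        B = triangular_scaling(mu3, lam3)
        A1, A2 = A1 @ B, A2 @ B
        # Step 2b: fix the first two marginals.
        A1 = dagger(triangular_scaling(A1 @ dagger(A1), lam1)) @ A1
        A2 = dagger(triangular_scaling(A2 @ dagger(A2), lam2)) @ A2
    # Step 3: output the two Hermitian matrices.
    return dagger(A1) @ A1, dagger(A2) @ A2

# Uniform spectra: one round of scaling makes A1 and A2 unitary, so H1 + H2 = 2I.
H1, H2 = horn_scaling([1, 1, 1], [1, 1, 1], [2, 2, 2])
print(np.round(np.linalg.eigvalsh(H1 + H2), 6))
```

After step 2b the matrices $A_i A_i^\dagger$ equal $\mathrm{diag}(\lambda_i)$ exactly, so the outputs $A_i^\dagger A_i$ always have spectrum $\lambda_i$; only the spectrum of their sum is approximate. The sketch makes no claim about how many iterations are needed for general inputs.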

SLIDE 31

Complexity of moment polytope membership?

The case λ = 0 is the null-cone problem from Ankit’s talk!

1. Is membership in P?
For tori ($G = (\mathbb{C}^\times)^n$): folklore, [SV17]
For Horn polytope, by saturation conjecture [MNS12]
2. Is it in RP?
We think so in general, but no proof yet!
3. Is it in NP or coNP?
In NP ∩ coNP for V = P(Cm) [BCMW17]
Not known in general!

SLIDE 34

General algorithms

SLIDE 35

Convert ε-search to an optimization problem

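For $b \in B :=$ upper triangular matrices, define

$$\mathrm{cap}_\lambda(v) := \inf_{b \in B} \frac{\|b \cdot v\|}{\prod_i |b_{ii}|^{\lambda_i}}.$$

Kempf-Ness Theorem: $\lambda \in \Delta(V) \iff \mathrm{cap}_\lambda(v) > 0$ for generic $v \in V$.

ε-search reduces to finding an algorithm for the following:

Given $b$ with $\|\mu(b \cdot v) - \mathrm{diag}(\lambda)\| > \varepsilon$,
Output $b'$ with

$$\frac{\|b' \cdot v\|}{\prod_i |b'_{ii}|^{\lambda_i}} < (1 - \delta)\, \frac{\|b \cdot v\|}{\prod_i |b_{ii}|^{\lambda_i}}.$$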
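
To make the reduction concrete, here is a sketch of the commutative special case only, which is an assumption made for illustration and not the general algorithm. Take $b = \mathrm{diag}(e^{x_1}, \dots, e^{x_n})$ and $v = \sum_\omega c_\omega e_\omega$ with weight vectors $\omega$, so that $b \cdot e_\omega = e^{\omega \cdot x} e_\omega$. Then the logarithm of the capacity objective is $f(x) = \tfrac12 \log \sum_\omega |c_\omega|^2 e^{2\omega \cdot x} - \lambda \cdot x$, which is convex with gradient $\mu(b \cdot v) - \lambda$, and plain gradient descent decreases it whenever the moment map is off target. The function names and the example weights below are hypothetical.

```python
import numpy as np

def objective_and_moment(weights, sq_coeffs, lam, x):
    """Return f(x) = log ||b.v|| - lam.x and mu(b.v) for b = diag(exp(x))."""
    logits = np.log(sq_coeffs) + 2.0 * weights @ x
    shift = logits.max()                      # stabilise the log-sum-exp
    p = np.exp(logits - shift)
    total = p.sum()
    f = 0.5 * (shift + np.log(total)) - lam @ x
    mu = (p / total) @ weights                # weighted average of the weights
    return f, mu

def torus_capacity_descent(weights, coeffs, lam, steps=500):
    """Gradient descent on f; requires all coefficients to be nonzero."""
    weights = np.asarray(weights, dtype=float)
    sq_coeffs = np.abs(np.asarray(coeffs)) ** 2
    lam = np.asarray(lam, dtype=float)
    # The Hessian of f is 2 Cov(omega), bounded by 2 max ||omega||^2.
    step = 1.0 / (2.0 * np.max(np.sum(weights ** 2, axis=1)))
    x = np.zeros(weights.shape[1])
    history = []
    for _ in range(steps):
        f, mu = objective_and_moment(weights, sq_coeffs, lam, x)
        history.append(f)
        x = x - step * (mu - lam)             # gradient of f is mu - lam
    f, mu = objective_and_moment(weights, sq_coeffs, lam, x)
    history.append(f)
    return x, mu, history

# lam lies inside the convex hull of the weights, so the infimum is attained.
weights = [[2, 0], [0, 2], [1, 1]]
lam = np.array([1.2, 0.8])
x, mu, history = torus_capacity_descent(weights, [1, 1, 1], lam)
print(np.round(mu, 6), history[0], history[-1])
```

In this commutative setting the moment polytope is the convex hull of the weights in the support of $v$, and the objective stays bounded below exactly when $\lambda$ lies in it. The non-commutative case replaces Euclidean gradient steps with geodesic ones.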

SLIDE 39

Optimization algorithms

Alternating minimization: poly(1/ε) time [BFGOWW18]

Tensor products of easy reps, e.g. Horn, k-tensors

$\log \mathrm{cap}_\lambda(v)$ can be cast as a geodesically convex program! Domain is positive-semidefinite matrices; geodesics through $P$ take the form $\sqrt{P}\, e^{Ht} \sqrt{P}$.

Geodesic gradient descent: poly(1/ε) time [BFGOWW19]

Any representation, e.g. $V = \bigwedge^k \mathbb{C}^n$, $\mathrm{Sym}^k \mathbb{C}^n$, arbitrary quivers

Geodesic trust-regions: poly(log(1/ε), log κ) time [BFGOWW19]

κ is the smallest condition number of an ε-optimizer for $\mathrm{cap}_\lambda(v)$
polynomial for some interesting cases, e.g. arbitrary quivers with $\lambda = 0$
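
The geodesics mentioned above can be explored numerically. The following is a small sketch, added for illustration, that works with strictly positive definite matrices (an assumption, so that $\sqrt{P}$ is invertible). It uses only NumPy eigendecompositions, and the helper names are hypothetical. It checks two standard facts: $\log\det$ is linear along a geodesic, and $t \mapsto \log \mathrm{tr}\,(\sqrt{P} e^{Ht} \sqrt{P})$ is convex, which is the simplest instance of geodesic convexity for a log-norm objective.

```python
import numpy as np

def herm_fun(P, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, U = np.linalg.eigh(P)
    return (U * f(w)) @ U.conj().T

def geodesic(P, H, t):
    """Point sqrt(P) exp(tH) sqrt(P) on the geodesic through P in direction H."""
    root = herm_fun(P, np.sqrt)
    return root @ herm_fun(t * H, np.exp) @ root

def direction_to(P, Q):
    """Hermitian H with geodesic(P, H, 1) = Q, namely log(P^{-1/2} Q P^{-1/2})."""
    inv_root = herm_fun(P, lambda w: 1.0 / np.sqrt(w))
    M = inv_root @ Q @ inv_root
    M = (M + M.conj().T) / 2                  # remove rounding asymmetry
    return herm_fun(M, np.log)

def random_pd(rng, n):
    """Random positive definite matrix X X^dagger + I."""
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return X @ X.conj().T + np.eye(n)

rng = np.random.default_rng(1)  # arbitrary seed
P, Q = random_pd(rng, 3), random_pd(rng, 3)
H = direction_to(P, Q)

def log_det(t):
    return np.linalg.slogdet(geodesic(P, H, t))[1]

def log_trace(t):
    return np.log(np.trace(geodesic(P, H, t)).real)

print("log det at t = 0, 0.5, 1:", [round(log_det(t), 6) for t in (0.0, 0.5, 1.0)])
print("log tr  at t = 0, 0.5, 1:", [round(log_trace(t), 6) for t in (0.0, 0.5, 1.0)])
```

The printed $\log\det$ values lie on a straight line in $t$, while the midpoint value of $\log \mathrm{tr}$ is at most the average of its endpoint values.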

SLIDE 43

Open problems

1. Is moment polytope membership in NP ∩ coNP, or even RP or P?
2. Membership is in P for Horn’s problem. But how about exp(− poly)-search?
3. If $(A_1, A_2)$ is a random pair of matrices, does $\mathrm{cap}_\lambda(A_1, A_2)$ have an ε-minimizer with condition number at most exp(poly(log(1/ε), λ))?
