A Friendly Smoothed Analysis of the Simplex Method, Daniel Dadush (PowerPoint presentation)



SLIDE 1

A Friendly Smoothed Analysis of the Simplex Method

Daniel Dadush (CWI) Sophie Huiberts (CWI) Aussois, January 2019

SLIDE 2

Linear Programming (LP) and the Simplex Method

maximize cᵀx subject to Ax ≤ b

• d variables
• n constraints
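To make the setup concrete, here is a minimal sketch (a hypothetical toy instance, not from the talk) that solves a 2-variable LP by enumerating the vertices of the feasible polygon, since an LP optimum is always attained at a vertex:

```python
# Toy LP: maximize c^T x subject to Ax <= b on the triangle
# {x >= 0, y >= 0, x + y <= 2}. Vertices are intersections of
# constraint pairs; we keep the feasible ones and take the best.
from itertools import combinations

A = [(-1.0, 0.0), (0.0, -1.0), (1.0, 1.0)]
b = [0.0, 0.0, 2.0]
c = (1.0, 2.0)

best = None
for i, j in combinations(range(len(A)), 2):
    (a1, a2), (a3, a4) = A[i], A[j]
    det = a1 * a4 - a2 * a3
    if abs(det) < 1e-9:
        continue  # parallel constraints: no intersection vertex
    # Solve the 2x2 system by Cramer's rule.
    x = (b[i] * a4 - a2 * b[j]) / det
    y = (a1 * b[j] - b[i] * a3) / det
    if all(p * x + q * y <= bb + 1e-9 for (p, q), bb in zip(A, b)):
        val = c[0] * x + c[1] * y
        if best is None or val > best[0]:
            best = (val, (x, y))

print(best[0])  # 4.0, attained at the vertex (0, 2)
```

The simplex method avoids this brute-force enumeration by walking from vertex to neighboring vertex, improving the objective at each pivot.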

SLIDE 8

Simplex method: A short history

1947 Dantzig: simplex method.
1957 Hirsch: maximal distance n − d?
1970 Klee, Minty: exponential worst-case instance.
1977 Borgwardt: polynomial average-case complexity.
1992 Kalai, Kleitman: paths of length n^(log d + 2) exist. Kalai: sub-exponential pivot rule.
2004 Spielman, Teng: polynomial smoothed complexity.

SLIDE 11

Average-case analysis

maximize cᵀx subject to Ax ≤ b, x ≥ 0

• b = 1, rows of A sampled from a rotationally symmetric distribution (RSD): O(n^(1/d) d³). [Borgwardt '77, '82, '87, '99]
• Rows of A, b, c sampled independently from an RSD. [Smale '83, Megiddo '86]
• Fixed data; flip signs of constraints at random: O(min{n², d²}). [Adler '83, Haimovich '83, Adler, Megiddo '85, Todd '86, Adler, Karp, Shamir '87]


SLIDE 13

Random is Not Typical

Smoothed Complexity (Spielman, Teng '01)

• Worst case: σ = 0.
• Smoothed analysis: σ variable.
SLIDE 17

Defining polynomial smoothed complexity

• c ∈ ℝ^d, Ā ∈ ℝ^(n×d), b̄ ∈ ℝ^n; rows of (Ā, b̄) have norm at most 1.
• Â, b̂: entries i.i.d. N(0, σ²).
• A = Ā + Â, b = b̄ + b̂.
• Smoothed linear program: maximize cᵀx subject to Ax ≤ b.

Polynomial smoothed complexity: expected poly(n, d, σ⁻¹) pivots.
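The smoothed model above is easy to instantiate in code. Below is a small sketch (the base data is a hypothetical example) that builds a smoothed instance: adversarial (Ā, b̄) with rows of norm at most 1, plus i.i.d. N(0, σ²) noise on every entry:

```python
# Generate a smoothed LP instance: A = A_bar + A_hat, b = b_bar + b_hat,
# with every entry of (A_hat, b_hat) drawn i.i.d. from N(0, sigma^2).
import math
import random

def smoothed_instance(A_bar, b_bar, sigma, rng):
    """Perturb every entry of (A_bar, b_bar) with i.i.d. N(0, sigma^2) noise."""
    A = [[a + rng.gauss(0.0, sigma) for a in row] for row in A_bar]
    b = [bi + rng.gauss(0.0, sigma) for bi in b_bar]
    return A, b

# Adversarial base data (a made-up example): rows of (A_bar, b_bar)
# must have norm at most 1, as the model requires.
A_bar = [[1.0, 0.0], [0.0, 1.0], [-0.6, -0.6]]
b_bar = [0.0, 0.0, 0.5]
for row, bi in zip(A_bar, b_bar):
    assert math.sqrt(sum(a * a for a in row) + bi * bi) <= 1.0

A, b = smoothed_instance(A_bar, b_bar, sigma=0.01, rng=random.Random(0))
# The smoothed LP is then: maximize c^T x subject to A x <= b.
```

The running time of a pivot rule on (A, b) is a random variable; polynomial smoothed complexity asks that its expectation be poly(n, d, σ⁻¹) for every choice of base data.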

SLIDE 18

Results: smoothed complexity bounds

• d variables, n constraints, N(0, σ²) Gaussian noise.

Works: expected number of pivots
Spielman, Teng '04: O(n^86 d^55 σ^(−30) + n^86 d^70)
Vershynin '09: O(d³ ln³(n) σ⁻⁴ + d⁹ ln⁷ n)
Dadush, H. '18: O(d² √(ln n) σ⁻² + d³ ln^(3/2) n)

SLIDE 25

9 out of 10 Theoreticians recommend: Shadow Vertex rule

1. Start at a vertex x optimizing an auxiliary objective c′ ∈ ℝ^d.
2. Set c_λ := λc + (1 − λ)c′.
3. Increase λ from 0 to 1, tracking the optimal vertex for c_λ.

Gass, Saaty '55: shadow vertex rule.

Why work with the shadow vertex rule? One can locally determine whether a vertex is on the path. [Borgwardt '77]
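The three steps above can be sketched on a toy polygon (a hypothetical 2D instance; a real implementation pivots between bases rather than enumerating vertices and sweeping a λ grid):

```python
# Shadow vertex sketch on the unit square {x <= 1, y <= 1, x >= 0, y >= 0}:
# enumerate all vertices, then track the optimal vertex of
# c_lambda = lambda * c_target + (1 - lambda) * c_start as lambda grows.
from itertools import combinations

A = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
b = [1.0, 1.0, 0.0, 0.0]

def vertices(A, b, eps=1e-9):
    """Intersect pairs of constraint lines; keep the feasible intersections."""
    verts = []
    for i, j in combinations(range(len(A)), 2):
        (a1, a2), (a3, a4) = A[i], A[j]
        det = a1 * a4 - a2 * a3
        if abs(det) < eps:
            continue  # parallel constraints
        x = (b[i] * a4 - a2 * b[j]) / det
        y = (a1 * b[j] - b[i] * a3) / det
        if all(p * x + q * y <= bb + eps for (p, q), bb in zip(A, b)):
            verts.append((x, y))
    return verts

def shadow_path(A, b, c_start, c_target, steps=200):
    """Sequence of distinct optimal vertices as lambda sweeps from 0 to 1."""
    verts = vertices(A, b)
    path = []
    for k in range(steps + 1):
        lam = k / steps
        cl = (lam * c_target[0] + (1 - lam) * c_start[0],
              lam * c_target[1] + (1 - lam) * c_start[1])
        best = max(verts, key=lambda v: cl[0] * v[0] + cl[1] * v[1])
        if not path or path[-1] != best:
            path.append(best)
    return path

# c_start and c_target are chosen generically so no endpoint is tied.
path = shadow_path(A, b, c_start=(1.0, -0.5), c_target=(-0.5, 1.0))
print(len(path))  # 3: the path visits (1,0) -> (1,1) -> (0,1)
```

The "local" property from the slide is visible here: whether a vertex lies on the path depends only on the objectives c_λ that it optimizes, not on the rest of the polytope.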

SLIDE 26

Fundamental estimate: number of shadow edges

• P = {x : Ax ≤ 1}.
• A smoothed.
• RHS 1 fixed.
• W a fixed 2D plane.

Shadow bound := expected number of vertices in the projection of P onto W.

SLIDE 27

Results: shadow size

• Shadow of P = {x : Ax ≤ 1} on a fixed 2D plane W.
• d variables, n constraints, σ standard deviation.

Works: expected number of vertices
Spielman, Teng '04: O(d³ n σ⁻⁶ + d⁶ n ln³ n)
Deshpande, Spielman '05: O(d n² ln(n) σ⁻² + d² n² ln² n)
Vershynin '09: O(d³ σ⁻⁴ + d⁵ ln² n)
Dadush, H. '18: O(d² √(ln n) σ⁻² + d^(2.5) ln^(3/2) n (1 + σ⁻¹))
Borgwardt '87: Ω(d^(3/2) √(ln n)) lower bound (for E[A] = 0)

SLIDE 28

Polyhedral duality

P := {x : aᵢᵀx ≤ 1 for all i ≤ n},
Q := ConvexHull(a₁, …, aₙ).

Number of vertices in π_W(P) ≤ number of edges in Q ∩ W.

SLIDE 29

Counting polyhedron edges: comparing lengths

#edges ≤ perimeter / (minimum edge length)

SLIDE 32

Counting polyhedron edges: comparing lengths

E[#edges] ≤ E[perimeter] / min E[edge length]
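The deterministic counting bound behind this slide is elementary and easy to sanity-check numerically (a toy example, not from the talk): for any convex polygon, the number of edges is at most the perimeter divided by the shortest edge, with equality for a regular polygon.

```python
# Check #edges <= perimeter / (minimum edge length) on a regular 7-gon
# inscribed in the unit circle, where the bound is tight.
import math

def edge_lengths(verts):
    n = len(verts)
    return [math.dist(verts[i], verts[(i + 1) % n]) for i in range(n)]

n = 7
verts = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
         for k in range(n)]
lens = edge_lengths(verts)
ratio = sum(lens) / min(lens)  # perimeter over shortest edge
print(round(ratio, 6))  # 7.0: all edges of a regular polygon are equal
assert n <= ratio + 1e-9
```

The probabilistic version on the slide is not just this bound in expectation: the minimum is taken over conditional expected edge lengths, which is exactly what the lemma on the following slide justifies.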

SLIDE 36

Counting shadow edges: proof of lemma

Let E(B) be the event that conv(B) ∩ W forms an edge of Q ∩ W. Then

E[perimeter(Q ∩ W)]
  = Σ_{B ⊂ {a₁,…,aₙ}, |B| = d} E[length(conv(B) ∩ W) | E(B)] · Pr[E(B)]
  ≥ min_{|B| = d} E[length(conv(B) ∩ W) | E(B)] · Σ_{|B′| = d} Pr[E(B′)]
  = min_{|B| = d} E[length(conv(B) ∩ W) | E(B)] · E[#edges].

So E[#edges] ≤ E[perimeter(Q ∩ W)] / min_{|B| = d} E[length(conv(B) ∩ W) | E(B)].

SLIDE 39

High-level ideas

E[#edges(Q ∩ W)] ≤ E[perimeter(Q ∩ W)] / min E[edge length]

E[perimeter(Q ∩ W)] ≤ 2π E[max_{x ∈ Q ∩ W} ‖x‖]
  ≤ 2π E[maxᵢ ‖π_W(aᵢ)‖]
  ≤ O(1 + σ √(ln n))
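The first inequality uses a classical convexity fact: a convex curve contained in a disk of radius r has perimeter at most 2πr. A quick numerical illustration (a made-up check, not from the talk), using points on the unit circle so that the convex hull is simply the angularly sorted polygon:

```python
# A convex polygon inscribed in the unit disk has perimeter <= 2*pi:
# each chord is shorter than the circular arc it subtends.
import math
import random

rng = random.Random(1)
angles = sorted(rng.uniform(0.0, 2.0 * math.pi) for _ in range(50))
pts = [(math.cos(t), math.sin(t)) for t in angles]  # convex position

perim = sum(math.dist(pts[i], pts[(i + 1) % len(pts)])
            for i in range(len(pts)))
r = max(math.hypot(x, y) for x, y in pts)  # radius of enclosing disk (= 1)
print(perim <= 2 * math.pi * r)  # True
```

In the smoothed setting, r is replaced by E[maxᵢ ‖π_W(aᵢ)‖], which Gaussian tail bounds control by O(1 + σ √(ln n)).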

SLIDE 51

High-level ideas

E[#edges(Q ∩ W)] ≤ E[perimeter(Q ∩ W)] / min E[edge length]

(Figure: an edge of Q ∩ W spanned by a₁, a₂, a₃, intersected with a line ℓ in W.)

E[height of simplex along ℓ] ≥ Ω(τ/√d)

E[intersection length / height of simplex along ℓ] ≥ Ω(1/(d L (1 + R)))

Parameter | Description | Gaussian case
τ | aᵢ restricted to a line has variance at least τ² | σ
R | Pr[maxᵢ≤ₙ ‖aᵢ‖ ≥ 1 + R] ≤ n⁻ᵈ | O(σ √(d ln n))
L | probability density is L-log-Lipschitz within radius R | O(σ⁻¹ √(d ln n))

SLIDE 52

Parametrized Shadow Bound

Parameter | Description | Gaussian case
τ | aᵢ restricted to a line has variance at least τ² | σ
R | Pr[maxᵢ≤ₙ ‖aᵢ‖ ≥ 1 + R] ≤ n⁻ᵈ | O(σ √(d ln n))
L | probability density is L-log-Lipschitz within radius R | O(σ⁻¹ √(d ln n))
r | expected maximum perturbation size along W | O(σ √(ln n))

Theorem (Dadush, H. '18)

E[#edges(conv(a₁, …, aₙ) ∩ W)] ≤ O(d^(1.5) (L/τ) (1 + r)(1 + R)).

SLIDE 53

Issues in applying the shadow bound

The shadow of {x : Ax ≤ 1} onto a fixed 2D plane W has an expected O(d² √(ln n) σ⁻² + d^(2.5) ln^(3/2) n (1 + σ⁻¹)) number of edges. Issues:

• What if the RHS is not 1?
• Noise on the RHS.
• How to use this bound algorithmically?

SLIDE 54

Open Problems

1. Improve upper/lower bounds on the expected shadow size.
2. Sparse noise? Bounded noise?
3. Analyze sequences of related LPs.