Discrepancy and SDPs - Nikhil Bansal (TU Eindhoven, Netherlands)





SLIDE 1

Discrepancy and SDPs

Nikhil Bansal (TU Eindhoven, Netherlands )

SLIDE 2

Outline

Discrepancy Theory

  • What is it
  • Basic Results (non-constructive)

SDP connection

  • Algorithms for discrepancy
  • New methods in discrepancy (upper/lower bounds)
  • Approximation
SLIDE 3

Discrepancy: Example

Input: n points placed arbitrarily in a grid.

Color them red/blue so that every axis-parallel rectangle is colored as evenly as possible.

Discrepancy: max over rectangles R of | #red in R − #blue in R |.

A random coloring gives about O(n^{1/2} log^{1/2} n); one can achieve O(log^{2.5} n). Why do we care?

SLIDE 4

Combinatorial Discrepancy

Universe: U = {1, …, n}. Subsets: S1, S2, …, Sm.

Find χ: [n] → {−1, +1} to minimize disc = max_S | ∑_{i ∈ S} χ(i) |.


Equivalently, if A is the m × n incidence matrix of the set system:

disc(A) = min_{x ∈ {−1,1}^n} ‖Ax‖_∞
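On tiny instances the definition can be checked directly. A brute-force sketch (the helper `disc` and the example sets are ours, purely for illustration):

```python
from itertools import product

def disc(A):
    """Brute-force combinatorial discrepancy: min over all colorings
    x in {-1,+1}^n of max_i |sum_j A[i][j] * x[j]|.  Exponential in n,
    so only for tiny instances."""
    n = len(A[0])
    return min(
        max(abs(sum(a * xj for a, xj in zip(row, x))) for row in A)
        for x in product((-1, 1), repeat=n)
    )

# Incidence rows of S1={1,2}, S2={2,3}, S3={3,4}, S4={1,2,3} over U={1,2,3,4}.
A = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1],
     [1, 1, 1, 0]]
print(disc(A))  # 1: the first three rows force alternation, then S4 sums to +-1
```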

SLIDE 5

Applications

CS: Computational Geometry, Comb. Optimization, Monte-Carlo simulation, Machine Learning, Complexity, Pseudo-Randomness, …

Math: Dynamical Systems, Combinatorics, Mathematical Finance, Number Theory, Ramsey Theory, Algebra, Measure Theory, …

SLIDE 6

Hereditary Discrepancy

Discrepancy is a useful measure of the complexity of a set system.

Hereditary discrepancy: herdisc(U, S) = max_{U′ ⊆ U} disc(U′, S|_{U′}).

A robust version of discrepancy. (How do we certify herdisc < D? Is it in NP?)

[Figure: a set system A1, A2, … on elements 1, 2, …, n, together with a disjoint copy A′1, A′2, … on elements 1′, 2′, …, n′]

Plain discrepancy is not so robust: take T_j = S_j ∪ S′_j. Coloring the two copies oppositely gives discrepancy 0, yet restricting to the unprimed elements recovers the original system.
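The duplication trick, and why hereditary discrepancy survives it, can be verified by brute force on a toy system (the helpers `disc_on`/`herdisc` and the example system S are ours; exponential-time illustration only):

```python
from itertools import combinations, product

def disc_on(rows, cols):
    """Min over colorings of the given columns of the max row imbalance."""
    cols = list(cols)
    return min(
        max(abs(sum(row[j] * xj for j, xj in zip(cols, x))) for row in rows)
        for x in product((-1, 1), repeat=len(cols))
    )

def herdisc(rows, n):
    """Hereditary discrepancy: exhaust all restrictions U' of the universe."""
    return max(disc_on(rows, sub)
               for k in range(1, n + 1)
               for sub in combinations(range(n), k))

# A system S on {0,1,2} with disc(S) = 1 ...
S = [[1, 1, 0], [0, 1, 1], [1, 1, 1]]
# ... duplicated: T_j = S_j on columns 0-2 plus a copy S'_j on columns 3-5.
T = [row + row for row in S]
print(disc_on(T, range(6)), herdisc(T, 6))  # disc(T) = 0, but herdisc(T) >= disc(S) = 1
```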

SLIDE 7

Some Applications

SLIDE 8

Rounding

Lovász-Spencer-Vesztergombi'86: Given any matrix A and x ∈ R^n, we can round x to y ∈ Z^n such that ‖Ax − Ay‖_∞ < herdisc(A).

Proof: Round the bits of x one by one, least significant first.

x_1: blah .0101101
x_2: blah .1101010
…
x_n: blah .0111101

At each bit level l, a low-discrepancy coloring of the coordinates whose current bit is 1 decides whether that bit is rounded up or down. Total error ≤ herdisc(A) · (1/2^l + 1/2^{l−1} + … + 1/2).

Key point: a low-discrepancy coloring guides our updates to x!
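The bit-by-bit rounding can be made concrete. A minimal sketch, assuming an instance tiny enough that the low-discrepancy coloring at each bit level can be found by exhaustive search (the names `min_disc_coloring` and `lsv_round`, the interval matrix A, and the point x0 are all our own illustrative choices):

```python
from fractions import Fraction
from itertools import product

def min_disc_coloring(A, support):
    """Exhaustive minimum-discrepancy coloring of the columns in `support`
    (stands in for the existential coloring in the LSV argument; tiny n only)."""
    best, best_chi = None, None
    for chi in product((-1, 1), repeat=len(support)):
        worst = max(abs(sum(row[j] * c for j, c in zip(support, chi))) for row in A)
        if best is None or worst < best:
            best, best_chi = worst, chi
    return dict(zip(support, best_chi))

def lsv_round(A, x, bits):
    """Round fractional x (each entry a multiple of 2^-bits) to integers,
    clearing one bit level at a time as in the LSV'86 proof."""
    x = list(x)
    for l in range(bits, 0, -1):
        step = Fraction(1, 2 ** l)
        support = [j for j, v in enumerate(x) if (v / step) % 2 == 1]
        chi = min_disc_coloring(A, support)
        for j in support:
            x[j] += chi[j] * step        # the 2^-l bit becomes 0, up or down
    return x

# Interval system (totally unimodular, so herdisc = 1); made-up fractional point.
A = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [1, 1, 1, 1]]
x0 = [Fraction(3, 8), Fraction(5, 8), Fraction(1, 8), Fraction(7, 8)]
y = lsv_round(A, x0, 3)
err = max(abs(sum(r[j] * (y[j] - x0[j]) for j in range(4))) for r in A)
print(y, err < 1)   # integral y with ||A(y - x0)||_inf < herdisc(A) = 1
```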

SLIDE 9

Rounding

The LSV'86 result only guarantees that a good rounding exists. How to find it efficiently?

Thm [B'10]. Can round efficiently, with error ≤ O(√(log m log n)) · herdisc(A).

Uses SDPs; the basic method follows.

SLIDE 10

Refinements

Spencer'85: Any n × n 0-1 matrix has disc ≤ 6√n.
  • Non-constructive: the entropy method (a very powerful technique).
B'10: Algorithmic O(√n) (SDP + entropy method).
Lovett-Meka'12: Much simpler; a better variant of the entropy method that extends iterated rounding.

Bin packing: Rothvoss'13: Alg ≤ LP + O(log OPT · log log OPT). Karmarkar-Karp'82: Alg ≤ LP + O(log² OPT).

SLIDE 11

Dynamic Data Structures

N weighted points in a 2-d region; the weights are updated over time. Query: given an axis-parallel rectangle R, report the total weight of the points in R. Goal: preprocess into a data structure with 1) low query time and 2) low update time (when a weight changes).

SLIDE 12

Example

Line: interval queries.

  • Trivial: query time O(n), update time O(1).
  • Table of all interval sums W[a,b]: query time O(1), update time O(n²).
  • Balanced tree of partial sums: query time O(log n), update time O(log n).

Recursing on the two coordinates handles 2-d with (O(log² n), O(log² n)).
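The O(log n)/O(log n) point on the line can also be realized with a Fenwick (binary indexed) tree in place of the balanced tree of partial sums; a minimal sketch with made-up weights:

```python
class Fenwick:
    """Binary indexed tree over points 1..n: point update and interval-sum
    query both run in O(log n), the tradeoff quoted on the slide."""
    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)

    def update(self, i, delta):      # weight of point i changes by delta
        while i <= self.n:
            self.t[i] += delta
            i += i & -i

    def prefix(self, i):             # total weight of points 1..i
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i
        return s

    def query(self, a, b):           # total weight of points a..b
        return self.prefix(b) - self.prefix(a - 1)

w = [0, 5, 3, 7, 2]                  # 1-indexed weights of 4 points
f = Fenwick(4)
for i in range(1, 5):
    f.update(i, w[i])
print(f.query(2, 4))                 # 3 + 7 + 2 = 12
```

The same structure, nested coordinate-by-coordinate, gives the (O(log² n), O(log² n)) bound for 2-d.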

SLIDE 13

What about other queries?

Circles, arbitrary rectangles, axis-aligned triangles?

Turns out t_q · t_u ≥ n^{1/2} / log² n.

Reason: the set system S formed by the query sets and the points has large discrepancy (about n^{1/4}).

Larsen'11: t_q · t_u ≥ disc(S)² / log² n.

SLIDE 14

Lower Bounds

Various methods: spectral, Fourier-analytic, …

Determinant lower bound:
  • detlb(A) ≤ herdisc(A) [Lovasz et al.'86]
  • herdisc(A) ≤ polylog(n, m) · detlb(A) [Matousek'11] (SDP duality)
  • Polylog approximation for herdisc(A) [Nikolov, Talwar, Zhang'13]

SLIDE 15

SDP Connection

SLIDE 16

Vector Discrepancy

Exact program: min t such that
  −t ≤ ∑_j a_ij x_j ≤ t for all rows i,
  x_j ∈ {−1, 1} for each j.

SDP relaxation, vecdisc(A): min t such that
  ‖∑_j a_ij v_j‖₂ ≤ t for all rows i,
  ‖v_j‖² = 1 for each j.

SLIDE 17

Is vecdisc a good relaxation?

Not directly: vecdisc(A) = 0 is possible even when disc(A) is very large [Charikar, Newman, Nikolov'11]. Indeed it is NP-hard to distinguish disc(A) = 0 from disc(A) very large.

Let hervecdisc(A) = max_S vecdisc(A|_S).

Thm [B'10]: disc(A) ≤ O(√(log m log n)) · hervecdisc(A). Proof: an algorithm.

SLIDE 18

Algorithm (at high level)

Cube: {−1, +1}^n. Each dimension: an element. Each vertex: a coloring.

Algorithm: a "sticky" random walk from the center of the cube to a vertex. Each step is generated by rounding a suitable SDP; the moves in the various dimensions are correlated, e.g. δ_t^1 + δ_t^2 ≈ 0.

Analysis:
  • Few steps are needed to reach a vertex (the walk has high variance per coordinate).
  • disc(S_i) does a random walk (with low variance).

SLIDE 19

An SDP

Hereditary discrepancy ≤ λ ⇒ the following SDP is feasible.

SDP (low discrepancy):
  |∑_j a_ij v_j|² ≤ λ² for each row i,
  |v_j|² = 1 for each element j.

Solving it, we obtain vectors v_j ∈ R^n. Perhaps v_j can guide how we update the color of element j? Trouble: v_j is a vector, and we need a real number. So project onto a random vector g: η_j = g · v_j.

Seems promising: ∑_j a_ij η_j = g · (∑_j a_ij v_j).

SLIDE 20

Properties of Rounding

Lemma: If g ∈ R^n is a random Gaussian vector, then for any v ∈ R^n, g · v is distributed as N(0, |v|²).

So, using the SDP constraints |v_j|² = 1 and |∑_j a_ij v_j|² ≤ λ²:
  1. Each η_j ~ N(0, 1).
  2. For each row i, ∑_j a_ij η_j = g · (∑_j a_ij v_j) ~ N(0, ≤ λ²) (std deviation ≤ λ).

The η's will guide our updates to x.
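A quick numerical sanity check of the lemma (illustration only, not part of the algorithm; the vector v and the sample count are arbitrary choices of ours):

```python
import numpy as np

# For a fixed v, the projection g . v over random Gaussian g is N(0, |v|^2).
rng = np.random.default_rng(0)
v = np.array([3.0, 4.0])                 # |v|^2 = 25
g = rng.standard_normal((200_000, 2))    # 200,000 random Gaussian vectors g
eta = g @ v                              # one sample of g . v per row of g
print(eta.mean(), eta.var())             # close to 0 and 25
```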

SLIDE 21

Algorithm Overview

Construct the coloring iteratively.

Initially: start with the coloring x^0 = (0, 0, …, 0) at t = 0.

At time t: update the coloring as x^t = x^{t−1} + γ (η^t_1, …, η^t_n), with γ tiny, say 1/n.

Color of element j: x^t(j) = γ (η^1_j + η^2_j + … + η^t_j) does a random walk over time with step size ≈ γ N(0,1). It is fixed once it reaches −1 or +1.

Disc(row i): ∑_j a_ij x^t(j) does a random walk with step γ N(0, ≤ λ²).
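The walk's fixing dynamics can be simulated. A sketch with a deliberate simplification: we draw plain i.i.d. Gaussian increments instead of the SDP-rounded, correlated increments, so this illustrates only how coordinates get stuck at ±1, not the row-discrepancy control (the function `sticky_walk` and all parameters are ours):

```python
import numpy as np

def sticky_walk(n, gamma=0.05, seed=0):
    """Sticky random walk sketch: every coordinate of x does a random walk
    with step ~ gamma*N(0,1) and is frozen once it hits -1 or +1.  (The real
    algorithm draws correlated increments from an SDP rounding, which is what
    keeps row discrepancies small; i.i.d. increments shown for illustration.)"""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    alive = np.ones(n, dtype=bool)
    steps = 0
    while alive.any():
        x[alive] += gamma * rng.standard_normal(alive.sum())
        hit = np.abs(x) >= 1
        x[hit & alive] = np.sign(x[hit & alive])   # freeze at the boundary
        alive &= ~hit
        steps += 1
    return x, steps

coloring, steps = sticky_walk(64)
print(steps, np.all(np.abs(coloring) == 1))   # all 64 coordinates end at -1 or +1
```

Consistent with the analysis on the next slide, the number of steps observed is on the order of (log n)/γ².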

SLIDE 22

Analysis

At time T = O(1/γ²):
  1. With probability ½, an element reaches −1 or +1.
  2. Each row has discrepancy O(λ) in expectation.

At time T = O((log n)/γ²):
  1. Most likely all elements are fixed.
  2. Expected discrepancy of a row = O(λ √(log n)).
  (By Chernoff, all rows have discrepancy O(λ √(log n log m)).)

SLIDE 23

New Entropy Method

SLIDE 24

Entropy method

A very powerful method for proving discrepancy upper bounds [Beck, Spencer 80's]: Given an m × n matrix A, there is a partial coloring (±1 on a constant fraction of the elements) satisfying

  |a_i x| ≤ λ_i ‖a_i‖₂ for each row i,

provided ∑_i h(λ_i) ≤ n/5, where

  h(λ_i) ≈ ln(1/λ_i) if λ_i < 1,
  h(λ_i) ≈ e^{−λ_i²} if λ_i ≥ 1.

E.g. we can ask for a partial coloring with 0 discrepancy on n/log n of the rows, and a reasonable amount on the others.

SLIDE 25

Lovett Meka Algorithm

Do a sticky random walk. If some row a_i gets tight (disc(a_i) = λ_i ‖a_i‖₂), continue the walk inside the subspace a_i · x = 0.

Progress is made as long as the dimension of the allowed subspace is Ω(n); this holds provided ∑_i exp(−λ_i²) ≤ n/2 (better than the entropy method).

Guarantees a partial coloring even if n/4 of the λ_i's are 0.

[Figure: walk starting at the center of the cube]

SLIDE 26

Comparison with iterated rounding

Fact: With n variables but ≤ n/2 constraints Ax = b (the rest of the form 0 ≤ x ≤ 1), there exists a basic feasible solution with > n/2 variables at 0 or 1.

Iterated rounding: LP with m constraints; drop all but n/2 of them. There is no control on the dropped constraints (a_i · x ≤ b_i): the error can be up to ‖a_i‖₁.

Lovett-Meka lemma: Can find a solution with ≥ n/2 integral variables and error ≤ λ_i ‖a_i‖₂ on every row.

E.g. can set n/10 of the constraints to have 0 error, with controlled bounds on the others.

SLIDE 27

Lower Bounds

SLIDE 28

Discrepancy

If disc(A) > D, then ‖Ax‖_∞ ≥ D for all x ∈ {−1,1}^n (say A is an n × n matrix, for convenience).

Spectral bound: if σ_min(A) ≥ D, then ‖Ax‖₂ ≥ σ_min(A) ‖x‖₂, so disc(A) ≥ σ_min(A).

This can be a very weak bound. One can instead consider σ_min(PA) for a diagonal P with tr(P) = n.

SLIDE 29

Determinant Lower Bound

Thm (Lovasz Spencer Vesztergombi'86): herdisc(A) ≥ detlb(A), where

  detlb(A) = max_k max_{k×k submatrix B of A} |det(B)|^{1/k}

(simple geometric argument).

Conjecture (LSV'86): herdisc ≤ O(1) · detlb.
Remark: For TU matrices, herdisc(A) = 1 and detlb = 1 (every square submatrix has determinant −1, 0, or +1).
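For small matrices detlb can be computed straight from its definition by exhaustive enumeration (the function name `detlb` and the TU example are ours):

```python
from itertools import combinations
import numpy as np

def detlb(A):
    """Determinant lower bound: max over k and over k x k submatrices B of A
    of |det B|^(1/k).  Exhaustive enumeration, so small matrices only."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    best = 0.0
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                d = abs(np.linalg.det(A[np.ix_(rows, cols)]))
                best = max(best, d ** (1.0 / k))
    return best

# Interval incidence matrix: totally unimodular, so every subdeterminant
# is -1, 0, or +1 and detlb = 1, matching the remark above.
A = [[1, 1, 0], [0, 1, 1], [1, 1, 1]]
print(round(detlb(A), 6))   # 1.0
```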

SLIDE 30

Hoffman’s example

Hoffman: an example with detlb(A) ≤ 2 while herdisc(A) ≥ log n / log log n.

Palvolgyi'11: an Ω(log n) gap.

T: the k-ary tree of depth k, with ≈ k^k nodes.
S: all sets of edges out of a single node. S′: all leaf-to-root paths. Both S and S′ are TU.

Claim: detlb(S ∪ S′) ≤ 2 (expand the determinant), while herdisc(S ∪ S′) = k.

SLIDE 31

Matousek'11: herdisc(A) ≤ O(log n · √(log m)) · detlb(A).

Idea: SDP duality gives a dual witness for large herdisc(A); the dual witness yields a submatrix with large determinant.

Other implications: herdisc(A₁ ∪ ⋯ ∪ A_t) ≤ O(log n · √(log m)) · t · max_j herdisc(A_j).

SLIDE 32

Matousek’s result

Thm: herdisc(A) ≤ O(log n · √(log m)) · detlb(A).

Pf: Recall disc(A) ≤ O(√(log m log n)) · hervecdisc(A), so there is some S with vecdisc(A|_S) ≥ herdisc(A) / O(√(log m log n)).

We will show: vecdisc(A|_S) ≤ O(√(log n)) · detlb(A|_S). Below, write A for A|_S.

SLIDE 33

SDP Duality

If vecdisc(A) ≥ D, there exist row weights x₁, …, x_m ≥ 0 with ∑_i x_i ≤ 1, and column weights y₁, …, y_n ≥ 0 with ∑_j y_j ≥ D² ∑_i x_i, such that

  ∑_i x_i (∑_j a_ij v_j)² ≥ ∑_j y_j v_j²  for all v ∈ R^n.

Why is this a witness? For any unit vectors v₁, …, v_n, applying the inequality coordinate-wise gives ∑_i x_i ‖∑_j a_ij v_j‖² ≥ ∑_j y_j ≥ D², and since ∑_i x_i ≤ 1, some row i has ‖∑_j a_ij v_j‖ ≥ D.

SLIDE 34

Proof Sketch

As ∑_j y_j ≥ D², by bucketing there exists a subset U of the variables such that

  y_j ∈ [ D² / (4 |U| log n), D² / (2 |U| log n) ]  for all j ∈ U.

Then for all v ∈ R^n:

  ∑_i x_i (∑_{j∈U} a_ij v_j)² ≥ (D² / (4 |U| log n)) ∑_{j∈U} v_j².

So X^{1/2} A|_U has large σ_min; use Cauchy-Binet to show that A|_U has a large subdeterminant.

SLIDE 35

Concluding Remarks

The SDP view has been extremely useful in discrepancy. Various open problems remain:

  • Beck-Fiala conjecture
  • Discrepancy of points and rectangles
  • Constructive version of Banaszczyk's theorem?
  • Tightness of the detlb bound (log vs log^{3/2})
  • O(1) approximation for herdisc
  • …

SLIDE 36

Thanks!