Discrepancy and SDPs
Nikhil Bansal (TU Eindhoven, Netherlands)
Outline:
Discrepancy theory: what is it, basic results (non-constructive)
SDP connection
Algorithms for discrepancy
New methods in discrepancy (upper/lower bounds)
Discrepancy Theory
SDP connection
Input: n points placed arbitrarily in a grid. Color them red/blue so that each axis-parallel rectangle is colored as evenly as possible.
Discrepancy: max over rectangles R of |#red in R − #blue in R|.
A random coloring gives about O(n^{1/2} log^{1/2} n). One can achieve O(log^{2.5} n).
Why do we care?
Universe: U = {1, …, n}. Subsets: S_1, S_2, …, S_m.
Find χ: [n] → {−1, +1} to minimize disc(S) = max_S |∑_{i ∈ S} χ(i)|.
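As a sanity check of this definition, here is a brute-force sketch (my own illustration, not from the talk) that computes disc by trying all 2^n colorings; feasible only for toy n:

```python
from itertools import product

def disc(n, sets):
    """Brute-force disc(S) = min over colorings chi: [n] -> {-1,+1}
    of max_S |sum_{i in S} chi(i)|.  Exponential in n; toy sizes only."""
    best = float("inf")
    for chi in product((-1, 1), repeat=n):
        best = min(best, max(abs(sum(chi[i] for i in s)) for s in sets))
    return best

# Three sets on 4 elements; a perfectly balanced coloring exists here.
print(disc(4, [{0, 1}, {2, 3}, {0, 1, 2, 3}]))  # -> 0
```

A single odd-size set already forces discrepancy 1, matching the intuition that disc measures unavoidable imbalance.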
If A is an m × n matrix: disc(A) = min_{x ∈ {−1,+1}^n} ‖Ax‖_∞.
CS: Computational Geometry, Comb. Optimization, Monte-Carlo simulation, Machine learning, Complexity, Pseudo-Randomness, … Math: Dynamical Systems, Combinatorics, Mathematical Finance, Number Theory, Ramsey Theory, Algebra, Measure Theory, …
Discrepancy is a useful measure of the complexity of a set system.
Hereditary discrepancy: herdisc(U, S) = max_{U' ⊆ U} disc(U', S|_{U'}).
A robust version of discrepancy. (How to certify herdisc < D? Is it in NP?)
[Figure: a set system S_1, S_2, … on elements 1, 2, …, n, together with a disjoint copy S'_1, S'_2, … on copies 1', 2', …, n'.]
But discrepancy itself is not so robust: take T_j = S_j ∪ S'_j. Then discrepancy = 0 (color each element oppositely to its copy), even though the system is as complex as before.
Lovász-Spencer-Vesztergombi'86: Given any matrix A and x ∈ R^n, one can round x to x' ∈ Z^n such that ‖Ax − Ax'‖_∞ < herdisc(A).
Proof: Round the bits of x one by one, least significant first:
x_1 = .0101101
x_2 = .1101010
…
x_n = .0111101
At bit level l, apply a low-discrepancy coloring to the coordinates whose l-th bit is 1. Total error ≤ herdisc(A) · (1/2^l + 1/2^{l−1} + … + 1/2) < herdisc(A).
Key point: a low-discrepancy coloring guides our updates!
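The bit-by-bit rounding can be sketched as follows. This is an illustrative toy, not the talk's implementation: the low-discrepancy coloring oracle is stood in for by brute force (in the real argument it is any coloring of herdisc quality), and `low_disc_coloring`/`lsv_round` are names I made up:

```python
from itertools import product

def low_disc_coloring(A, support):
    # Toy stand-in for the oracle: brute-force a +/-1 coloring of the
    # coordinates in `support` minimizing max_i |sum_{j in support} a_ij chi_j|.
    best, best_chi = float("inf"), None
    for signs in product((-1, 1), repeat=len(support)):
        chi = dict(zip(support, signs))
        d = max(abs(sum(row[j] * chi[j] for j in support)) for row in A)
        if d < best:
            best, best_chi = d, chi
    return best_chi

def lsv_round(A, x, L=8):
    """Round x in [0,1]^n (truncated to L bits) to integers, clearing the
    least significant bit each pass with a low-discrepancy coloring, so the
    total error is at most herdisc(A) * (2^-L + ... + 1/2) < herdisc(A)."""
    m = [round(v * 2 ** L) for v in x]        # x_j ~ m_j / 2^L
    for _ in range(L):
        support = [j for j, mj in enumerate(m) if mj % 2 == 1]
        if support:
            chi = low_disc_coloring(A, support)
            for j in support:
                m[j] += chi[j]                # bit cleared: m_j now even
        m = [mj // 2 for mj in m]             # rescale to the next bit level
    return m
```

For example, rounding x = (1/2, 1/2, 1/2, 1/2) against the single row (1,1,1,1) must preserve the row sum 2 exactly, since a perfectly balanced coloring of the support exists at every level.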
The LSV'86 result only guarantees the existence of a good rounding. How to find it efficiently?
Thm [B'10]: Can round efficiently, with error ≤ O(√(log m · log n)) · herdisc(A).
Uses SDPs (basic method).
Spencer'85: Any n × n 0-1 matrix has disc ≤ 6√n. Non-constructive; entropy method (a very powerful technique).
B.'10: Algorithmic O(√n) (SDP + entropy method).
Lovett-Meka'12: Much simpler; a better variant of the entropy method; extends iterated rounding.
Bin packing:
Rothvoss'13: Alg ≤ LP + O(log OPT · log log OPT)
Karmarkar-Karp'82: Alg ≤ LP + O(log² OPT)
n weighted points in a 2-d region; the weights are updated over time. Query: given an axis-parallel rectangle R, determine the total weight of the points in R. Goal: preprocess into a data structure with (1) low query time and (2) low update time (upon a weight change).
Line: Interval queries
Trivial tradeoffs: query time O(n), update time O(1); or query time O(1), update time O(n²) (table of entries W[a,b]). Balanced: query O(log n), update O(log n).
Recursing for 2-d: query O(log² n), update O(log² n).
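The slides don't name a concrete structure for the balanced O(log n)/O(log n) point on the 1-d tradeoff; one standard choice is a Fenwick (binary indexed) tree, sketched here:

```python
class Fenwick:
    """Binary indexed tree: point update and prefix-sum query in O(log n),
    realizing the query = update = O(log n) tradeoff for interval sums."""

    def __init__(self, n):
        self.n = n
        self.t = [0.0] * (n + 1)

    def update(self, i, delta):
        """Add delta to the weight of point i (0-based)."""
        i += 1
        while i <= self.n:
            self.t[i] += delta
            i += i & (-i)          # climb to the next covering node

    def prefix(self, i):
        """Sum of the weights of points 0 .. i-1."""
        s = 0.0
        while i > 0:
            s += self.t[i]
            i -= i & (-i)          # strip the lowest set bit
        return s

    def query(self, a, b):
        """Total weight of points in the interval [a, b)."""
        return self.prefix(b) - self.prefix(a)
```

Applying the same structure over a tree of x-coordinates, with a Fenwick tree per node, gives the O(log² n)/O(log² n) 2-d bounds mentioned above.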
Similar lower bounds hold for richer query classes: circles, arbitrary rectangles, axis-aligned triangles. It turns out t_q · t_u ≥ n^{1/2}/log² n.
Reason: the set system S formed by the query sets and the points has large discrepancy (about n^{1/4}).
Larsen'11: t_q · t_u ≥ disc(S)²/log² n.
Various methods: spectral, Fourier-analytic, …
Determinant lower bound: detlb(A) ≤ herdisc(A) [Lovász et al. '86].
herdisc(A) ≤ polylog(m, n) · detlb(A) [Matoušek'11] (SDP duality).
Polylog approximation for herdisc(A) [Nikolov, Talwar, Zhang'13].
Exact: disc(A) = min t s.t. −t ≤ ∑_j a_ij x_j ≤ t for all rows i, and x_j ∈ {−1, +1} for each j.
SDP relaxation: vecdisc(A) = min t s.t. ‖∑_j a_ij v_j‖₂ ≤ t for all rows i, and ‖v_j‖₂ = 1 for each j.
Not directly: vecdisc(A) = 0 is possible even when disc(A) is very large [Charikar, Newman, Nikolov'11]. Indeed it is NP-hard to distinguish disc(A) = 0 from disc(A) very large.
Let hervecdisc(A) = max_S vecdisc(A|_S).
Thm [B'10]: disc(A) = O(√(log m · log n)) · hervecdisc(A). Proof: an algorithm.
Cube: {−1, +1}^n
Analysis: few steps are needed to reach a vertex (the walk has high variance), while disc(S_i) does a random walk with low variance.
Algorithm: a "sticky" random walk. Each step is generated by rounding a suitable SDP; the moves in the various dimensions are correlated, e.g. δ_1^t + δ_2^t ≈ 0.
Each dimension: an element. Each vertex: a coloring.
Hereditary disc ≤ λ ⇒ the following SDP is feasible:
SDP: Low discrepancy
‖∑_j a_ij v_j‖² ≤ λ² for each row i; ‖v_j‖² = 1 for each element j.
Perhaps v_j can guide how to update the color of element j? Trouble: v_j is a vector, and we need a real number. Project onto a random vector g: set η_j = g · v_j.
Seems promising: ∑_j a_ij η_j = g · (∑_j a_ij v_j).
Solving the SDP, we obtain vectors v_j ∈ R^n.
Lemma: If g ∈ R^n is a random Gaussian vector, then for any v ∈ R^n, g · v is distributed as N(0, ‖v‖²).
1. Each η_j ~ N(0, 1).
2. For each row i: ∑_j a_ij η_j = g · (∑_j a_ij v_j) ~ N(0, σ²) with σ² ≤ λ² (std deviation ≤ λ).
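The lemma is easy to check numerically; a small sketch (pure stdlib, fixed seed, my own illustration) sampling g · v for v = (3, 4), whose norm is 5:

```python
import math
import random

# Numeric check: for fixed v, the projection g . v of a standard Gaussian
# vector g is distributed as N(0, |v|^2), i.e. its std deviation is |v|.
random.seed(0)
v = (3.0, 4.0)                                   # |v| = 5
samples = [sum(vi * random.gauss(0, 1) for vi in v) for _ in range(200_000)]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
print(round(mean, 2), round(std, 2))             # close to 0 and 5
```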
SDP: ‖v_j‖² = 1, ‖∑_j a_ij v_j‖² ≤ λ².
η’s will guide our updates to x.
Construct the coloring iteratively.
Initially: start with the coloring x_0 = (0, 0, …, 0) at t = 0.
At time t: update the coloring as x_t = x_{t−1} + γ(η_1^t, …, η_n^t), with γ tiny, say 1/n.
So x_t(j) = γ(η_j^1 + η_j^2 + … + η_j^t).
The color of element j does a random walk; it is fixed once it reaches −1 or +1.
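A toy version of this walk, in a hedged sketch of my own: it uses independent Gaussian steps instead of the SDP-correlated steps η^t, so it illustrates only the sticky/freezing mechanics, not the discrepancy guarantee:

```python
import random

def sticky_walk(n, gamma=0.05, seed=1):
    """Toy sticky walk: every alive coordinate takes an independent
    N(0,1) step scaled by gamma (the real algorithm correlates steps via
    an SDP solution); a coordinate freezes on reaching -1 or +1."""
    random.seed(seed)
    x, alive = [0.0] * n, set(range(n))
    steps = 0
    while alive:
        steps += 1
        for j in list(alive):
            x[j] += gamma * random.gauss(0, 1)
            if abs(x[j]) >= 1:
                x[j] = 1.0 if x[j] > 0 else -1.0   # freeze at the boundary
                alive.discard(j)
    return x, steps

x, steps = sticky_walk(16)
print(sorted(set(x)), steps)   # every coordinate ends at -1 or +1
```

Consistent with the analysis below, each coordinate takes on the order of 1/γ² steps to reach ±1, and a log factor covers all coordinates.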
Disc(row i): ∑_j a_ij x_t(j) does a random walk with steps γ · N(0, σ²), σ² ≤ λ².
At time T = O(1/γ²): (1) with probability ½, an element reaches −1 or +1; (2) each row has discrepancy O(λ) in expectation.
At time T = O((log n)/γ²), all elements are fixed, and by Chernoff all rows have discrepancy O(λ √(log n · log m)).
A very powerful method to prove discrepancy upper bounds [Beck, Spencer '80s]: given an m × n matrix A, there is a partial coloring x (with a constant fraction of entries ±1) satisfying |⟨a_i, x⟩| ≤ Δ_i ‖a_i‖₂ for every row i, provided ∑_i g(Δ_i) ≤ n/5, where
g(Δ) ≈ ln(1/Δ) if Δ < 1, and g(Δ) ≈ e^{−Δ²} if Δ ≥ 1.
E.g., one can ask for a partial coloring with 0 discrepancy on n/log n rows, and a reasonable amount on the others.
Do a sticky random walk. If some row a_i gets tight (disc(a_i) = Δ_i ‖a_i‖₂), continue the walk inside the subspace a_i · x = 0, so that row's discrepancy stops growing. Progress is made as long as the dimension is Ω(n), which holds provided ∑_i exp(−Δ_i²) ≤ n/2 (better than the entropy method).
This guarantees a partial coloring even if n/4 of the Δ_i's are 0.
Fact: with n variables but ≤ n/2 constraints Ax = b (the rest: 0 ≤ x ≤ 1), there exists a basic feasible solution with > n/2 variables at 0 or 1.
Iterated rounding: LP with m constraints; drop all except n/2. No control on the dropped constraints (a_i · x ≤ b_i): the error can be up to ‖a_i‖₁.
Lovett-Meka lemma: can find a solution with ≥ n/2 integral variables and error ≤ Δ_i ‖a_i‖₂ on row i.
E.g., can require 0 error on n/10 of the constraints, with controlled bounds on the others.
If disc(A) > D, then ‖Ax‖_∞ ≥ D for all x ∈ {−1, +1}^n (say A is an n × n matrix for convenience).
Spectral bound: ‖Ax‖₂ ≥ σ_min(A) ‖x‖₂ = σ_min(A) √n, hence ‖Ax‖_∞ ≥ σ_min(A), so disc(A) ≥ σ_min(A).
This can be a very weak bound; one can instead consider σ_min(PA) for a diagonal P with tr(P) = n.
Thm (Lovász-Spencer-Vesztergombi'86): herdisc(A) ≥ detlb(A), where
detlb(A) = max_k max_{k×k submatrix B of A} |det(B)|^{1/k}
(simple geometric argument).
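detlb can be computed by brute force on toy matrices, a sketch of my own (exponential time, illustration only); the example is a totally unimodular interval matrix, where every subdeterminant lies in {−1, 0, 1} and hence detlb = 1:

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion along the first row
    (fine for the tiny matrices used here)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** c * M[0][c] * det([row[:c] + row[c + 1:] for row in M[1:]])
               for c in range(len(M)))

def detlb(A):
    """detlb(A) = max over k and k x k submatrices B of |det B|^(1/k)."""
    m, n = len(A), len(A[0])
    best = 0.0
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                B = [[A[i][j] for j in cols] for i in rows]
                best = max(best, abs(det(B)) ** (1.0 / k))
    return best

# Consecutive-ones (interval) matrix: totally unimodular, so detlb = 1.
A = [[1, 1, 0], [0, 1, 1], [1, 1, 1]]
print(detlb(A))  # -> 1.0
```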
Conjecture (LSV'86): herdisc ≤ O(1) · detlb.
Remark: for TU matrices, herdisc(A) = 1 and detlb = 1 (every square submatrix has det −1, 0, or +1).
Hoffman: an example with detlb(A) ≤ 2 but herdisc(A) ≥ log n / log log n.
Pálvölgyi'11: an Ω(log n) gap. T: a k-ary tree of depth k (≈ k^k nodes).
S: for each node, all edges out of it. S': all leaf-to-root paths. Both S and S' are TU.
Claim: detlb(S ∪ S') ≤ 2 (expand the determinant), while herdisc(S ∪ S') = k.
Matoušek'11: herdisc(A) ≤ O(√(log m) · log n) · detlb(A).
Idea: SDP duality → a dual witness for large herdisc(A); dual witness → a submatrix with large determinant.
Other implications: herdisc(A_1 ∪ ⋯ ∪ A_t) ≤ O(√(t · log m) · log n) · max_j herdisc(A_j).
Thm: herdisc(A) = O(√(log m) · log n) · detlb(A).
Pf: Recall disc(A) = O(√(log m · log n)) · hervecdisc(A). So there is some S with vecdisc(A|_S) ≥ herdisc(A)/O(√(log m · log n)).
We will show: vecdisc(A|_S) ≤ O(√(log n)) · detlb(A|_S). Write A for A|_S from now on.
If vecdisc(A) ≥ D, there exist row weights x_1, …, x_m ≥ 0 with ∑_i x_i ≤ 1, and column weights μ_1, …, μ_n ≥ 0 with ∑_j μ_j ≥ D², such that
∑_i x_i (∑_j a_ij y_j)² ≥ ∑_j μ_j y_j²  for all y ∈ R^n.
Why is this a witness? If all ‖v_j‖ = 1 and every row had ‖∑_j a_ij v_j‖ < D, then applying the inequality coordinate-wise to the vectors v_j would give D² > ∑_i x_i ‖∑_j a_ij v_j‖² ≥ ∑_j μ_j ‖v_j‖² = ∑_j μ_j ≥ D², a contradiction.
As ∑_j μ_j ≥ D², by dyadic bucketing there is a subset U of variables with μ_j ∈ [D²/(4|U| log n), D²/(2|U| log n)] for all j ∈ U. Restricting to U:
∑_i x_i (∑_{j∈U} a_ij y_j)² ≥ (D²/(4|U| log n)) ∑_{j∈U} y_j²  for all y ∈ R^n.
Hence X^{1/2} A|_U has large σ_min, where X = diag(x_i). Use Cauchy-Binet to show that A|_U has a large sub-determinant.
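The Cauchy-Binet identity used in this last step, det(A Aᵀ) = ∑_S det(A_S)² over all choices S of k columns of a k × n matrix A, can be verified on a small 2 × 3 example (my own illustration):

```python
from itertools import combinations

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    (a, b), (c, d) = M
    return a * d - b * c

A = [[1, 2, 0], [0, 1, 1]]                       # a 2 x 3 matrix

# Left side: det(A A^T).
AAT = [[sum(r1[j] * r2[j] for j in range(3)) for r2 in A] for r1 in A]
lhs = det2(AAT)

# Right side: sum over 2-column subsets S of det(A_S)^2.
rhs = sum(det2([[A[i][j] for j in S] for i in range(2)]) ** 2
          for S in combinations(range(3), 2))
print(lhs, rhs)  # -> 6 6
```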
The SDP view has been extremely useful in discrepancy. Various open problems remain:
Beck-Fiala conjecture
Discrepancy of points and rectangles
Constructive version of Banaszczyk's theorem?
Tightness of the detlb bound (log n vs log^{3/2} n)
O(1) approximation for herdisc
…