Discrepancy and SDPs
Nikhil Bansal (TU Eindhoven, Netherlands) - PowerPoint PPT Presentation
  1. Discrepancy and SDPs
Nikhil Bansal (TU Eindhoven, Netherlands)

  2. Outline
Discrepancy Theory
• What is it
• Basic results (non-constructive)
SDP connection
• Algorithms for discrepancy
• New methods in discrepancy (upper/lower bounds)
• Approximation

  3. Discrepancy: Example
Input: n points placed arbitrarily in a grid. Color them red/blue so that each axis-parallel rectangle is colored as evenly as possible.
Discrepancy: max over rectangles R of | #red in R - #blue in R |.
A random coloring gives about O(n^{1/2} log^{1/2} n); one can achieve O(log^{2.5} n).
Why do we care?
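The rectangle discrepancy of a given coloring can be checked by brute force on small instances; a minimal sketch (all names and parameters are mine, not from the talk), using the fact that only rectangles whose sides pass through point coordinates need to be examined:

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.integers(0, 20, size=(30, 2))   # 30 points in a 20x20 grid
chi = rng.choice([-1, 1], size=30)        # random red/blue coloring

# Only rectangles whose sides pass through point coordinates matter:
# any other rectangle contains exactly the same subset of points.
xs, ys = np.unique(pts[:, 0]), np.unique(pts[:, 1])
disc = 0
for x1 in xs:
    for x2 in xs[xs >= x1]:
        for y1 in ys:
            for y2 in ys[ys >= y1]:
                inside = ((pts[:, 0] >= x1) & (pts[:, 0] <= x2) &
                          (pts[:, 1] >= y1) & (pts[:, 1] <= y2))
                disc = max(disc, abs(int(chi[inside].sum())))
print(disc)   # max red/blue imbalance of this coloring over all rectangles
```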

  4. Combinatorial Discrepancy
Universe: U = {1, …, n}. Subsets: S_1, S_2, …, S_m.
Find χ: [n] → {-1,+1} to minimize max_S | ∑_{i ∈ S} χ(i) |.
If A is the m × n incidence matrix:
disc(A) = min_{x ∈ {-1,+1}^n} ‖Ax‖_∞
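For tiny set systems the definition can be evaluated directly; a brute-force sketch (the function name is mine):

```python
import itertools
import numpy as np

def disc(A):
    """disc(A) = min over x in {-1,+1}^n of ||Ax||_inf, by exhaustive search.
    Exponential in n, so only for tiny instances."""
    n = A.shape[1]
    return min(int(np.max(np.abs(A @ np.array(x))))
               for x in itertools.product([-1, 1], repeat=n))

# Three elements, sets {1,2}, {2,3}, {1,3}: two of the three elements must
# share a color, so some pair-set is monochromatic and the discrepancy is 2.
A = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]])
print(disc(A))   # 2
```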

  5. Applications
CS: Computational Geometry, Combinatorial Optimization, Monte-Carlo simulation, Machine Learning, Complexity, Pseudo-Randomness, …
Math: Dynamical Systems, Combinatorics, Mathematical Finance, Number Theory, Ramsey Theory, Algebra, Measure Theory, …

  6. Hereditary Discrepancy
Discrepancy is a useful measure of the complexity of a set system, but it is not robust: duplicate the universe to 1, 2, …, n, 1', 2', …, n' and replace each set A_j by T_j = A_j ∪ A'_j. Coloring each element +1 and its copy -1 gives discrepancy = 0.
Hereditary discrepancy: herdisc(U, S) = max_{U' ⊆ U} disc(U', S|_{U'})
A robust version of discrepancy. (How to certify herdisc < D? Is it in NP?)

  7. Some Applications

  8. Rounding
Lovasz-Spencer-Vesztergombi'86: Given any matrix A and x ∈ R^n, one can round x to x̃ ∈ Z^n such that ‖Ax - Ax̃‖_∞ < herdisc(A).
Proof: Round the bits of x one by one, e.g.
x_1: blah.0101101 (-1)
x_2: blah.1101010
…
x_n: blah.0111101 (+1)
Key point: a low-discrepancy coloring of the relevant columns of A guides each update.
Error = herdisc(A) · (1/2^k + 1/2^{k-1} + … + 1/2)
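The bit-by-bit rounding can be sketched as follows; this is my reconstruction, assuming a `coloring` subroutine that returns a low-discrepancy ±1 coloring of the given columns (here, brute force):

```python
import itertools
import numpy as np

def coloring(A):
    """Brute-force minimum-discrepancy +-1 coloring of A's columns."""
    n = A.shape[1]
    return min((np.array(x, dtype=float)
                for x in itertools.product([-1, 1], repeat=n)),
               key=lambda c: np.max(np.abs(A @ c)))

def lsv_round(A, x, bits=10):
    """Round x (truncated to `bits` binary digits) to an integer vector.
    At level b, a low-discrepancy coloring chi of the columns whose b-th
    bit is 1 clears that bit (chi = -1 removes it, chi = +1 carries up),
    changing Ax by at most 2^-b times the discrepancy; summing the
    geometric series bounds the total error by herdisc(A)."""
    x = np.round(x * 2.0**bits) / 2.0**bits
    for b in range(bits, 0, -1):            # lowest bit first; carries go up
        frac = np.where(np.round(x * 2.0**b) % 2 == 1)[0]
        if len(frac):
            x[frac] += coloring(A[:, frac]) * 2.0**(-b)
    return x

A = np.array([[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]])
x = np.array([0.3, 0.6, 0.25])
print(lsv_round(A, x))   # an integer vector, close to x under A
```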

  9. Rounding
The LSV'86 result guarantees the existence of a good rounding. How to find it efficiently?
Thm [B'10]: Can round efficiently, so that Error ≤ O(√(log m log n)) · herdisc(A).
Uses SDPs; the basic method follows.

  10. Refinements
Spencer'85: Any 0-1 matrix (n × n) has disc ≤ 6√n.
Non-constructive: entropy method (a very powerful technique).
B.'10: Algorithmic O(√n) (SDP + entropy method).
Lovett-Meka'12: Much simpler; a better variant of the entropy method; extends iterated rounding.
Bin packing:
Karmarkar-Karp'82: Alg ≤ LP + O(log² OPT)
Rothvoss'13: Alg ≤ LP + O(log OPT · log log OPT)

  11. Dynamic Data Structures
N weighted points in a 2-d region; weights are updated over time.
Query: Given an axis-parallel rectangle R, determine the total weight of the points in R.
Goal: Preprocess (into a data structure) so that 1) query time is low and 2) update time (upon a weight change) is low.

  12. Example
Line: interval queries.
Trivial: Query time = O(n), Update time = 1.
Or: Query time = 1, Update time = O(n²) (table of entries W[a,b]).
Balanced: Query = O(log n), Update = O(log n).
Recursively for 2-d: O(log² n), O(log² n).
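The O(log n)/O(log n) point on this tradeoff curve is achieved, for example, by a Fenwick (binary indexed) tree; a standard sketch, not code from the talk:

```python
class Fenwick:
    """Point update and interval-sum query, both O(log n) (1-indexed)."""

    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)

    def update(self, i, delta):
        """Add delta to the weight of point i."""
        while i <= self.n:
            self.t[i] += delta
            i += i & (-i)          # jump to the next responsible node

    def _prefix(self, i):
        """Total weight of points 1..i."""
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & (-i)          # strip the lowest set bit
        return s

    def query(self, a, b):
        """Total weight of points in the interval [a, b]."""
        return self._prefix(b) - self._prefix(a - 1)
```

Applying the same idea coordinate-wise (a tree of trees) gives the (O(log² n), O(log² n)) bound for 2-d mentioned on the slide.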

  13. What about other queries? Circles, arbitrary rectangles, aligned triangles.
Turns out t_q · t_u ≥ n^{1/2} / log² n.
Reason: the set system S formed by the query sets and the points has large discrepancy (about n^{1/4}).
Larsen'11: t_q · t_u ≥ disc(S)² / log² n.

  14. Lower Bounds
Various methods: spectral, Fourier analytic, …
Determinant lower bound:
detlb(A) ≤ herdisc(A) [Lovasz-Spencer-Vesztergombi'86]
herdisc(A) ≤ polylog(n, m) · detlb(A) [Matousek'11] (SDP duality)
Polylog approximation for herdisc(A) [Nikolov, Talwar, Zhang'13]

  15. SDP Connection

  16. Vector Discrepancy
Exact: min t s.t.
  -t ≤ ∑_j a_ij x_j ≤ t for all rows i
  x_j ∈ {-1,+1} for each j
SDP relaxation: vecdisc(A) = min t s.t.
  ‖∑_j a_ij v_j‖² ≤ t² for all rows i
  ‖v_j‖² = 1 for each j

  17. Is vecdisc a good relaxation?
Not directly: vecdisc(A) = 0 is possible even when disc(A) is very large.
[Charikar, Newman, Nikolov'11]: NP-hard to distinguish disc(A) = 0 from disc(A) very large.
Let hervecdisc(A) = max_S vecdisc(A|_S).
Thm [B'10]: disc(A) = O(√(log m log n)) · hervecdisc(A).
Pf: Algorithm.

  18. Algorithm (at high level)
Cube {-1,+1}^n: each dimension is an element, each vertex is a coloring. Start at the center, finish at a vertex.
Algorithm: a "sticky" random walk, each step generated by rounding a suitable SDP.
Moves in the various dimensions are correlated, e.g. δ_t^1 + δ_t^2 ≈ 0.
Analysis:
• Few steps to reach a vertex (the walk has high variance).
• disc(S_i) does a random walk (with low variance).

  19. An SDP
Hereditary disc ≤ λ ⇒ the following SDP is feasible:
  ‖∑_j a_ij v_j‖² ≤ λ² for each row i  (low discrepancy)
  ‖v_j‖² = 1 for each element j
Obtain v_j ∈ R^n. Perhaps v_j can guide how to update the color of element j?
Trouble: v_j is a vector; we need a real number.
Project onto a random vector g: η_j = g · v_j.
Seems promising: ∑_j a_ij η_j = g · ∑_j a_ij v_j

  20. Properties of Rounding
Lemma: If g ∈ R^n is a random Gaussian, then for any v ∈ R^n, g · v is distributed as N(0, ‖v‖²).
So from the SDP:
1. Each η_j ~ N(0, 1), since ‖v_j‖² = 1.
2. For each row i, ‖∑_j a_ij v_j‖² ≤ λ², so ∑_j a_ij η_j = g · (∑_j a_ij v_j) ~ N(0, ≤ λ²) (std deviation ≤ λ).
The η's will guide our updates to x.
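The projection lemma is easy to check empirically; a small sketch (the vector and sample count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
v = np.array([3.0, 4.0])                  # ||v||^2 = 25
g = rng.standard_normal((100_000, 2))     # 100k random Gaussian vectors g
samples = g @ v                           # g . v for each sample
print(samples.mean(), samples.var())      # close to 0 and to 25 = ||v||^2
```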

  21. Algorithm Overview
Construct the coloring iteratively.
Initially: start with coloring x_0 = (0, 0, …, 0) at t = 0.
At time t: update the coloring as x_t = x_{t-1} + γ(η_t^1, …, η_t^n) (γ tiny, say 1/n), so
  x_t(j) = γ(η_1^j + η_2^j + … + η_t^j)
Color of element j: does a random walk over time with step size ≈ γ · N(0, 1); fixed once it reaches -1 or +1.
Disc(row i): ∑_j a_ij x_t(j) does a random walk with step γ · N(0, ≤ λ²).

  22. Analysis
At time T = O(1/γ²):
1. With prob. ½, an element reaches -1 or +1.
2. Each row has discrepancy λ in expectation.
At time T = O((log n)/γ²):
1. Most likely all elements are fixed.
2. Expected discrepancy for a row = λ√(log n).
(By Chernoff, all rows have discrepancy O(λ√(log n log m)).)
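A simplified simulation of the walk (independent N(0,1) steps in place of the SDP-generated, correlated ones; all parameters are illustrative) shows the coordinates sticking at ±1 after roughly (log n)/γ² steps:

```python
import numpy as np

rng = np.random.default_rng(1)
n, gamma = 50, 0.05
x = np.zeros(n)
steps = 0
while (np.abs(x) < 1.0).any():
    alive = np.abs(x) < 1.0              # coordinates not yet fixed
    x[alive] += gamma * rng.standard_normal(alive.sum())
    x = np.clip(x, -1.0, 1.0)            # "sticky": once at +-1, stay there
    steps += 1
print(steps)   # on the order of (log n) / gamma^2
```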

  23. New Entropy Method

  24. Entropy Method
A very powerful method for proving discrepancy upper bounds [Beck, Spencer '80s]:
Given an m × n matrix A, there is a partial coloring x satisfying
  |a_i · x| ≤ λ_i ‖a_i‖₂ for each row i,
provided ∑_i g(λ_i) ≤ n/5, where
  g(λ) ≈ ln(1/λ) if λ < 1
  g(λ) ≈ e^{-λ²} if λ ≥ 1
E.g. one can ask for a partial coloring with 0 discrepancy on n/log n rows, and a reasonable amount on the others.
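In this form the condition is easy to evaluate; a sketch using the slide's (constant-free) penalty function, with names of my choosing:

```python
import math

def g(lam):
    """Entropy penalty for demanding |a_i . x| <= lam * ||a_i||_2 on a row:
    large for tiny lam (tight rows are expensive), tiny for generous lam."""
    return math.log(1.0 / lam) if lam < 1 else math.exp(-lam * lam)

def partial_coloring_exists(lams, n):
    """The slide's sufficient condition: total penalty at most n/5."""
    return sum(g(l) for l in lams) <= n / 5

# Generous bounds (lam = 2) on all n rows cost ~ n * e^{-4}: feasible.
# Demanding near-zero discrepancy (lam = 0.001) on all n rows is not.
print(partial_coloring_exists([2.0] * 100, 100),
      partial_coloring_exists([0.001] * 100, 100))
```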

  25. Lovett-Meka Algorithm
Do a sticky random walk starting at x = 0. If some row a_i gets tight (disc(a_i) = λ_i ‖a_i‖₂), move only in the subspace orthogonal to a_i.
Make progress as long as the dimension is Ω(n), i.e. provided ∑_i exp(-λ_i²) ≤ n/2 (better than the entropy method).
Guarantees a partial coloring, even if n/4 of the λ_i's are 0.

  26. Comparison with Iterated Rounding
Fact: If there are n variables but ≤ n/2 constraints Ax = b (rest: 0 ≤ x ≤ 1), there exists a basic feasible solution with ≥ n/2 variables at 0-1.
Iterated rounding: LP with m constraints; drop all except n/2. No control on the dropped constraints (a_i · x ≤ b_i): error up to ‖a_i‖₁.
Lovett-Meka lemma: can find a solution with ≥ n/2 integral variables and error ≤ λ_i ‖a_i‖₂ on row i.
E.g. can set n/10 constraints to have 0 error, and controlled bounds on the others.

  27. Lower Bounds

  28. Discrepancy
If disc(A) > D then ‖Ax‖_∞ ≥ D for some x-independent reason is needed; concretely, disc(A) > D means ‖Ax‖_∞ ≥ D for all x ∈ {-1,+1}^n (say A is n × n for convenience).
Spectral bound: ‖Ax‖₂ ≥ σ_min(A) ‖x‖₂ = σ_min(A) √n, so ‖Ax‖_∞ ≥ ‖Ax‖₂/√n ≥ σ_min(A), i.e. disc(A) ≥ σ_min(A).
Could be a very weak bound. Can consider σ_min(PA), P diagonal, tr(P) = n.
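The spectral bound can be sanity-checked by brute force on a tiny random matrix; a sketch, illustrative only:

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n))
sigma_min = np.linalg.svd(A, compute_uv=False)[-1]

# ||Ax||_2 >= sigma_min * ||x||_2 = sigma_min * sqrt(n), and
# ||Ax||_inf >= ||Ax||_2 / sqrt(n), so disc(A) >= sigma_min.
disc = min(np.max(np.abs(A @ np.array(x)))
           for x in itertools.product([-1, 1], repeat=n))
print(disc >= sigma_min)   # True
```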

  29. Determinant Lower Bound
detlb(A) = max_k max over k × k submatrices B of A of |det(B)|^{1/k}
Thm (Lovasz-Spencer-Vesztergombi'86): herdisc(A) ≥ detlb(A) (simple geometric argument).
Conjecture (LSV'86): herdisc ≤ O(1) · detlb.
Remark: for TU matrices, herdisc(A) = 1 and detlb = 1 (every square submatrix has determinant -1, 0 or +1).
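detlb can be computed by enumerating submatrices for small A (the function name is mine):

```python
import itertools
import numpy as np

def detlb(A):
    """max over k and k x k submatrices B of |det B|^(1/k), by enumeration.
    Exponentially many submatrices: only feasible for small matrices."""
    m, n = A.shape
    best = 0.0
    for k in range(1, min(m, n) + 1):
        for rows in itertools.combinations(range(m), k):
            for cols in itertools.combinations(range(n), k):
                d = abs(np.linalg.det(A[np.ix_(rows, cols)]))
                if d > 0:
                    best = max(best, d ** (1.0 / k))
    return best

# Interval matrices are totally unimodular: every square submatrix has
# determinant -1, 0 or +1, so detlb = 1 (matching the slide's remark).
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
print(detlb(A))   # 1.0
```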

  30. Hoffman's Example
Hoffman: detlb(A) ≤ 2 while herdisc(A) ≥ log n / log log n.
Palvolgyi'11: Ω(log n) gap.
T: k-ary tree of depth k (≈ k^k nodes).
S: all edges out of a node. S': all leaf-to-root paths. Both S and S' are TU.
Claim: detlb(S ∪ S') ≤ 2 (expand the determinant), but herdisc(S ∪ S') = k.

  31. Matousek'11: herdisc(A) ≤ O(√(log m) log n) · detlb(A).
Idea: SDP duality → dual witness for large herdisc(A); dual witness → submatrix with large determinant.
Other implications: herdisc(A_1 ∪ ⋯ ∪ A_t) ≤ O(√(log m) log n) · √t · max_i herdisc(A_i).

  32. Matousek's Result
Thm: herdisc(A) = O(√(log m) log n) · detlb(A).
Pf: Recall disc(A) = O(√(log m log n)) · hervecdisc(A).
So there is some S with vecdisc(A|_S) ≥ herdisc(A) / O(√(log m log n)).
Will show: vecdisc(A|_S) ≤ O(√(log n)) · detlb(A|_S).
Let us write A for A|_S.
