discrepancy and optimization

Discrepancy and Optimization Nikhil Bansal IPCO Summer School



  1. Discrepancy and Optimization Nikhil Bansal IPCO Summer School (lecture 2) www.win.tue.nl/~nikhil/ipco-slides.pdf (notes coming)

  2. Discrepancy
     Universe: $U = [1, \dots, n]$. Subsets: $S_1, S_2, \dots, S_m$.
     Color elements red/blue so each set is colored as evenly as possible.
     Given $\chi : [n] \to \{-1,+1\}$:
     $\mathrm{disc}(\chi) = \max_S \left|\sum_{i \in S} \chi(i)\right| = \max_S |\chi(S)|$
     $\mathrm{disc}(\text{set system}) = \min_\chi \max_S |\chi(S)|$

  3. Matrix Notation
     Rows: sets. Columns: elements.
     Given any matrix $A$, find a coloring $x \in \{-1,1\}^n$ to minimize $\|Ax\|_\infty$.
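The matrix formulation can be checked directly on tiny instances. The following is a minimal brute-force sketch, not an efficient method: it tries all $2^n$ colorings, and both the function name `discrepancy` and the example matrix are hypothetical illustrations rather than anything from the slides.

```python
from itertools import product

def discrepancy(A):
    """Return (min over x in {-1,+1}^n of max_i |(Ax)_i|, a minimizing x).

    Brute force over all 2^n colorings, so only usable for very small n.
    """
    n = len(A[0])
    best_val, best_x = None, None
    for x in product((-1, 1), repeat=n):
        # Infinity norm of Ax for this coloring.
        val = max(abs(sum(a * xi for a, xi in zip(row, x))) for row in A)
        if best_val is None or val < best_val:
            best_val, best_x = val, x
    return best_val, best_x

# Hypothetical example: three sets (rows) over four elements (columns).
A = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
]
value, coloring = discrepancy(A)
print(value, coloring)
```

Every set in this example has even size and an alternating coloring balances all three, so the discrepancy is 0. A single set of odd size can never do better than 1.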

  4. Applications CS: Computational Geometry, Approximation, Complexity, Differential Privacy, Pseudo-Randomness, … Math: Combinatorics, Optimization, Finance, Dynamical Systems, Number Theory, Ramsey Theory, Algebra, Measure Theory, …

  5. Hereditary Discrepancy
     Discrepancy is a useful measure of complexity of a set system, but not so robust.
     [Figure: elements 1, 2, …, n with copies 1', 2', …, n'; sets $A_1, A_2, \dots$ with copies $A'_1, A'_2, \dots$] Taking $S_i = A_i \cup A'_i$ gives discrepancy = 0.
     Hereditary discrepancy: $\mathrm{herdisc}(U, S) = \max_{U' \subseteq U} \mathrm{disc}(U', S|_{U'})$
     Robust version of discrepancy (99% of problems: bounding disc = bounding herdisc)

  6. Rounding
     Lovász-Spencer-Vesztergombi'86: Given any matrix $A$ and $x \in \mathbb{R}^n$, can round $x$ to $\tilde{x} \in \mathbb{Z}^n$ s.t. $\|Ax - A\tilde{x}\|_\infty < \mathrm{herdisc}(A)$.
     [Figure: fractional point $x$ in the region $Ax = b$]
     Intuition: Discrepancy is like rounding a ½-integral solution to 0 or 1.
     Can do dependent (correlated) rounding based on $A$.
     For approximation algorithms: need algorithms for discrepancy.
     Bin packing: OPT + $O(\log \mathrm{OPT})$ [Rothvoss'13]
     Herdisc(A) = 1 iff A is a TU matrix.

  7. Rounding
     Lovász-Spencer-Vesztergombi'86: Given any matrix $A$ and $x \in \mathbb{R}^n$, can round $x$ to $\tilde{x} \in \mathbb{Z}^n$ s.t. $\|Ax - A\tilde{x}\|_\infty < \mathrm{herdisc}(A)$.
     Proof: Round the bits of $x$ one by one.
     $x_1$: blah .0101101 (-1)
     $x_2$: blah .1101010
     …
     $x_n$: blah .0111101 (+1)
     Key Point: Low discrepancy coloring guides our updates!
     Error $= \mathrm{herdisc}(A)\,\left(\frac{1}{2^k} + \frac{1}{2^{k-1}} + \dots + \frac{1}{2}\right)$
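The strict inequality in the theorem follows from the error expression on this slide, because the bracket is a finite geometric sum. A short worked version, assuming $x$ is given with $k$ bits after the binary point and writing $\tilde{x}$ for the rounded vector:

```latex
\|Ax - A\tilde{x}\|_\infty
  \;\le\; \mathrm{herdisc}(A) \sum_{j=1}^{k} 2^{-j}
  \;=\; \mathrm{herdisc}(A)\,\bigl(1 - 2^{-k}\bigr)
  \;<\; \mathrm{herdisc}(A).
```

Removing the bit at position $j$ moves each affected coordinate by $\pm 2^{-j}$, with signs taken from a low-discrepancy coloring of the affected columns, so that round contributes at most $2^{-j}\,\mathrm{herdisc}(A)$ to every row.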

  8. Rounding
     Only shows existence of a good rounding. How to actually find it?
     Thm [B'10]: Error $= O\left(\sqrt{\log m \log n}\right) \cdot \mathrm{herdisc}(A)$

  9. Ordering with small prefix sums
     Vectors $v_1, \dots, v_n \in \mathbb{R}^d$ with $\|v_i\|_\infty \le 1$ and $\sum_i v_i = 0$.
     Find a permutation $\pi$ such that each prefix sum has small norm, i.e. $\max_k \|v_{\pi(1)} + \dots + v_{\pi(k)}\|_\infty$ is minimized.
     d=1: numbers in [-1,1], e.g. 0.7, -0.2, -0.9, 0.8, 0.7, …
     What would a random ordering give? 0.7, 0.8, -0.8, … Can we get $O(1)$?
     d=2: [Figure: example vectors with coordinates such as -0.4, 0.6, 0.5]
     (Posed by Riemann, solved by Steinitz in 1913, called the Steinitz problem)
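For d=1 the answer to "can we get O(1)" is yes, by a simple greedy rule: add a nonnegative number while the running sum is at most 0, and a negative number otherwise. The sketch below illustrates that rule. The function names and the input values are hypothetical, and the values are chosen to be exactly representable so that they sum to exactly 0.

```python
def greedy_order(values):
    """Order numbers in [-1, 1] with total 0 so every prefix sum stays in [-1, 1]."""
    nonneg = [v for v in values if v >= 0]
    neg = [v for v in values if v < 0]
    order, prefix = [], 0
    while nonneg or neg:
        # Running sum <= 0: take a nonnegative number (one must remain,
        # because the remaining numbers sum to -prefix >= 0).
        # Running sum > 0: take a negative number.
        if (prefix <= 0 and nonneg) or not neg:
            v = nonneg.pop()
        else:
            v = neg.pop()
        order.append(v)
        prefix += v
    return order

def max_prefix(order):
    """Largest absolute value of a prefix sum of the given ordering."""
    best, prefix = 0, 0
    for v in order:
        prefix += v
        best = max(best, abs(prefix))
    return best

# Hypothetical input: numbers in [-1, 1] that sum to exactly 0.
values = [0.75, -0.25, -1.0, 0.75, 0.75, -0.5, -0.5]
ordering = greedy_order(values)
print(ordering, max_prefix(ordering))
```

Each step moves the running sum toward 0 by at most 1, so it never leaves [-1, 1]. In higher dimensions no such sign rule is available, which is what makes the Steinitz problem interesting.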

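  10. Steinitz Problem
     Given $v_1, \dots, v_n \in \mathbb{R}^d$ with $\sum_i v_i = \mathbf{0}$, find a permutation $\pi$ to minimize the norm of prefix sums:
     $m_\pi = \max_k \|v_{\pi(1)} + \dots + v_{\pi(k)}\|$
     Discrepancy of prefix sums: Given an ordering, find signs to minimize the norm of signed prefix sums.
     [Figure: the ordering $v_1\, v_2\, v_3\, v_4\, v_5\, v_6\, v_7\, v_8$ with signs + - + + - - - + is rearranged to $v_1\, v_3\, v_4\, v_8\, v_7\, v_6\, v_5\, v_2$; labels $m_\pi$ and $(m_\pi + f(d))/2$]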

  11. Sparsification
     Original motivation: Numerical Integration / Sampling. How well can you approximate a region by discrete points?
     Discrepancy: Max over rectangles R of |(# points in R) - (Area of R)|
     Use this to sparsify.
     Quasi-Monte Carlo integration: Huge area (finance, …)
     Error: MC $\approx \frac{1}{\sqrt{n}}$, QMC $\approx \frac{\mathrm{disc}}{n}$
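The MC versus QMC comparison can be tried in one dimension, where intervals play the role of rectangles. This is only an illustrative sketch under assumptions not in the slides: the low-discrepancy point set is the base-2 van der Corput sequence, the integrand $x^2$ is arbitrary, and the function names are hypothetical.

```python
import random

def van_der_corput(i, base=2):
    """i-th point of the van der Corput sequence (radical inverse of i)."""
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x

def estimate(f, points):
    """Average of f over the sample points, an estimate of its integral on [0, 1]."""
    return sum(f(p) for p in points) / len(points)

def f(x):
    return x * x  # exact integral over [0, 1] is 1/3

n = 1024
qmc_points = [van_der_corput(i) for i in range(n)]
rng = random.Random(0)  # fixed seed only so that repeated runs agree
mc_points = [rng.random() for _ in range(n)]

qmc_error = abs(estimate(f, qmc_points) - 1 / 3)
mc_error = abs(estimate(f, mc_points) - 1 / 3)
print(qmc_error, mc_error)
```

With n = 1024 the van der Corput points are exactly the multiples of 1/1024, so the QMC error is about 1/(2n), while a random sample typically errs on the order of $1/\sqrt{n}$.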

  12. Tusnady's problem
     Input: n points placed arbitrarily in a grid. Sets = axis-parallel rectangles.
     Discrepancy: max over rectangles R of |# red in R - # blue in R|
     Random gives about $O(n^{1/2} \log^{1/2} n)$
     Very long line of work:
     $O(\log^4 n)$ [Beck 80's]
     ...
     $O(\log^{2.5} n)$ [Matousek'99]
     $O(\log^2 n)$ [B., Garg'16]
     $O(\log^{1.5} n)$ [Nikolov'17]

  13. Questions around Discrepancy bounds
     Combinatorial: Show good coloring exists
     Algorithmic: Find coloring in poly time
     Lower bounds on discrepancy
     Approximating discrepancy

  14. Combinatorial (3 generations)
     0) Linear Algebra (Iterated Rounding) [Steinitz, Beck-Fiala, Barany, …]
     1) Partial Coloring Method
        Beck/Spencer early 80's: Probabilistic Method + Pigeonhole
        Gluskin'87: Convex Geometric Approach
        Very versatile (black-box)
        Loss adds over $O(\log n)$ iterations
     2) Banaszczyk'98: Based on a deep convex geometric result
        Produces full coloring directly (also black-box)

  15. Brief History (combinatorial)

     | Method | Tusnady (rectangles) | Steinitz (prefix sums) | Beck-Fiala (low deg. system) |
     | --- | --- | --- | --- |
     | Linear Algebra | $\log^4 n$ | $d$ | $k$ |
     | Partial Coloring | $\log^{2.5} n$ [Matousek'99] | $d^{1/2} \log n$ | $k^{1/2} \log n$ |
     | Banaszczyk | $\log^{1.5} n$ [Nikolov'17] | $(d \log n)^{1/2}$ [Banaszczyk'12] | $(k \log n)^{1/2}$ [Banaszczyk'98] |
     | Lower bound | $\log n$ | $d^{1/2}$ | $k^{1/2}$ |

  16. Brief History (algorithmic)
     Partial Coloring now constructive:
     Bansal'10: SDP + Random walk
     Lovett Meka'12: Random walk + linear algebra
     Rothvoss'14: Sample and Project (geometric)
     Many others by now [Harvey, Schwartz, Singh], [Eldan, Singh]

     | Method | Tusnady (rectangles) | Steinitz (prefix sums) | Beck-Fiala (low deg. system) |
     | --- | --- | --- | --- |
     | Linear Algebra | $\log^4 n$ | $d$ | $k$ |
     | Partial Coloring | $\log^{2.5} n$ [Matousek'99] | $d^{1/2} \log n$ | $k^{1/2} \log n$ |
     | Banaszczyk | $\log^{1.5} n$ [Nikolov'17] | $(d \log n)^{1/2}$ [Banaszczyk'12] | $(k \log n)^{1/2}$ [Banaszczyk'98] |
     | Lower bound | $\log n$ | $d^{1/2}$ | $k^{1/2}$ |

  17. Algorithmic aspects (2)
     Beck-Fiala (B.-Dadush-Garg'16) (tailor made algorithm)
     General Banaszczyk (B.-Dadush-Garg-Lovett'18)

     | Method | Tusnady (rectangles) | Steinitz (prefix sums) | Beck-Fiala (low deg. system) |
     | --- | --- | --- | --- |
     | Linear Algebra | $\log^4 n$ | $d$ | $k$ |
     | Partial Coloring | $\log^{2.5} n$ [Matousek'99] | $d^{1/2} \log n$ | $k^{1/2} \log n$ |
     | Banaszczyk | $\log^{1.5} n$ [Nikolov'17]; $\log^2 n$ [BDG16] | $(d \log n)^{1/2}$ [Banaszczyk'12]; [BDGL] | $(k \log n)^{1/2}$ [Banaszczyk'98]; [BDG'16] |
     | Lower bound | $\log n$ | $d^{1/2}$ | $k^{1/2}$ |

  18. Linear Algebraic approach
     Start with the coloring x(0) = (0,…,0). Update at each step t:
     x(t) = x(t-1) + y(t)
     If a variable reaches -1 or 1, it is fixed forever.
     Update y(t) obtained by solving By(t) = 0, with B cleverly chosen.
     [Figure: the point x moving inside the $[-1,1]^n$ cube]
     Beck-Fiala: B = rows with size > k (on floating variables).
     A row has 0 discrepancy as long as it is big (no control once it becomes of size <= k).
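The update rule on this slide can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the lecture's implementation: it uses exact rational arithmetic (`fractions.Fraction`) so that variables hit -1 or 1 exactly, the helper names `null_vector` and `beck_fiala` are hypothetical, and k is taken to be the maximum number of ones in a column.

```python
from fractions import Fraction

def null_vector(rows, ncols):
    """Return a nonzero y with rows * y = 0, assuming rank(rows) < ncols."""
    M = [[Fraction(v) for v in r] for r in rows]
    pivots, r = [], 0
    for c in range(ncols):  # Gauss-Jordan elimination to reduced row echelon form
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [v / M[r][c] for v in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    free = next(c for c in range(ncols) if c not in pivots)
    y = [Fraction(0)] * ncols
    y[free] = Fraction(1)  # set one free variable to 1, solve for the pivots
    for row, c in zip(M, pivots):
        y[c] = -row[free]
    return y

def beck_fiala(A):
    """Color the columns of a 0/1 matrix; k = max number of ones in a column."""
    m, n = len(A), len(A[0])
    k = max(sum(A[i][j] for i in range(m)) for j in range(n))
    x = [Fraction(0)] * n
    while True:
        floating = [j for j in range(n) if abs(x[j]) < 1]
        if not floating:
            return [int(v) for v in x], k
        # B: rows with more than k ones on floating variables. There are fewer
        # such rows than floating variables, so a nonzero null vector exists.
        B = [[A[i][j] for j in floating] for i in range(m)
             if sum(A[i][j] for j in floating) > k]
        y = null_vector(B, len(floating))
        # Largest step t > 0 keeping every floating variable inside [-1, 1].
        t = min((1 - x[j]) / yj if yj > 0 else (-1 - x[j]) / yj
                for j, yj in zip(floating, y) if yj != 0)
        for j, yj in zip(floating, y):
            x[j] += t * yj

# Hypothetical example: every column has exactly k = 2 ones.
A = [
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1],
]
coloring, k = beck_fiala(A)
disc = max(abs(sum(a * c for a, c in zip(row, coloring))) for row in A)
print(coloring, disc)
```

Each step fixes at least one more variable, so the loop ends after at most n iterations. A row stays at discrepancy 0 while it is big, and afterwards each of its at most k floating variables moves by less than 2, which gives the classical bound disc < 2k.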

  19. Partial Coloring

  20. Spencer's problem
     Spencer Setting: Discrepancy of any set system on n elements and m sets?
     [Figure: elements 1, 2, …, n and sets $S_1, S_2, \dots, S_m$]
     [Spencer'85] (independently by Gluskin'87): For m = n, discrepancy $\le 6n^{1/2}$.
     Tight: Cannot beat $0.5\,n^{1/2}$ (Hadamard Matrix).
     Random coloring gives $O((n \log n)^{1/2})$.
     Proof: For a set S, $\Pr[\mathrm{disc}(S) \approx c|S|^{1/2}] \approx \exp(-c^2)$. Set $c = O((\log n)^{1/2})$ and apply the union bound.
     Tight: Random gives $\Omega((n \log n)^{1/2})$ with very high prob.
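The union-bound step can be written out with explicit constants. The version below uses the standard Hoeffding bound for a sum of $|S|$ independent $\pm 1$ signs, which is an assumption about which tail bound is intended; the slide suppresses constants in the exponent.

```latex
\Pr\bigl[\,|\chi(S)| \ge c\sqrt{|S|}\,\bigr] \;\le\; 2e^{-c^2/2}
\quad\Longrightarrow\quad
\Pr\bigl[\,\exists\, S:\ |\chi(S)| \ge c\sqrt{|S|}\,\bigr] \;\le\; 2m\,e^{-c^2/2} \;<\; 1
\quad\text{for } c > \sqrt{2\ln(2m)} .
```

Since $|S| \le n$, with $m = n$ sets some coloring has discrepancy at most $\sqrt{2n\ln(2n)} = O((n \log n)^{1/2})$.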

  21. Beating random coloring
     [Beck, Spencer 80's]: Given an m x n matrix A, there is a partial coloring satisfying
     $|a_i x| \le \lambda_i \|a_i\|_2$ provided $\sum_i g(\lambda_i) \le \frac{n}{5}$,
     where $g(\lambda_i) \approx \ln\frac{1}{\lambda_i}$ if $\lambda_i < 1$ and $g(\lambda_i) \approx e^{-\lambda_i^2}$ if $\lambda_i \ge 1$.
     Union bound: $\sum_i e^{-\lambda_i^2} < 1$
     n/5 vs 1: very powerful.
     Can demand discrepancy 0 for $\approx \Omega(n)$ rows (while still having control on other rows).
     Combines strengths of probability + linear algebra.

  22. Spencer's $O(n^{1/2})$ result
     Partial Coloring suffices: For any set system with m sets, there exists a coloring on $\ge n/2$ elements with discrepancy $\Delta = O(n^{1/2} \log^{1/2}(2m/n))$
     [For m = n, disc = $O(n^{1/2})$]
     Algorithm for total coloring: Repeatedly apply the partial coloring lemma.
     Total discrepancy
     $O(n^{1/2} \log^{1/2} 2)$ [Phase 1]
     $+\ O((n/2)^{1/2} \log^{1/2} 4)$ [Phase 2]
     $+\ O((n/4)^{1/2} \log^{1/2} 8)$ [Phase 3]
     $+\ \dots = O(n^{1/2})$
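The phase-by-phase total is a convergent series, which is why no logarithmic factor is lost. A short worked version, taking m = n sets and $n/2^j$ uncolored elements at the start of phase $j+1$ as on the slide:

```latex
\sum_{j \ge 0} \sqrt{\frac{n}{2^j}\,\log 2^{\,j+1}}
  \;=\; \sqrt{n \log 2}\;\sum_{j \ge 0} \sqrt{\frac{j+1}{2^j}}
  \;=\; O\bigl(\sqrt{n}\bigr),
```

since consecutive terms of the last series shrink by a factor tending to $1/\sqrt{2}$. The number of remaining elements decays geometrically, while the factor $\log^{1/2}(2m/n_j)$ grows only like $\sqrt{j}$.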

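  23. Beck Fiala
     Thm: Partial coloring $O(k^{1/2})$, so Full coloring $O(k^{1/2} \log n)$
     Total number of 1's in matrix $\le nk$
     Why can we set $\Delta = k^{1/2}$?
     Need $\sum_i g(\lambda_i) \le \frac{n}{5}$ with $\lambda_i = \frac{\Delta}{\sqrt{|S_i|}}$,
     where $g(\lambda_i) \approx \ln\frac{1}{\lambda_i}$ if $\lambda_i < 1/2$ and $g(\lambda_i) \approx e^{-\lambda_i^2}$ if $\lambda_i \ge 1/2$.
     n sets of size k: $n\,g(1) \approx n$
     n/t sets of size tk: $\frac{n}{t}\, g\left(\frac{1}{t^{1/2}}\right) \approx (n/t) \log t$
     tn sets of size k/t: $tn\, g(t^{1/2}) \approx tn\, e^{-t}$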

  24. Proving Partial Coloring Lemma

  25. A geometric view
     Spencer'85: Any 0-1 matrix (n x n) has disc $\le 6\sqrt{n}$
     Gluskin'87: Convex geometric approach
     Consider the polytope $P(t) = \{x : -t\mathbf{1} \le Ax \le t\mathbf{1}\}$, bounded by the constraints $a_i x \ge -t$ and $a_i x \le t$.
     P(t) contains a point in $\{-1,1\}^n$ for $t = 6\sqrt{n}$.
     Gluskin'87: If K is symmetric and convex with large (Gaussian) volume ($> 2^{-n/100}$), then K contains a point with many coordinates in {-1,+1}.
     d-dim Gaussian Measure: $\gamma_d(x) = \exp(-\|x\|^2/2)\,(2\pi)^{-d/2}$
     $\gamma_d(K) = \Pr[(y_1, \dots, y_d) \in K]$, each $y_i$ iid N(0,1)
     [Figure: the body K and the $[-1,1]^n$ cube]
     What is the Gaussian volume of the $[-1,1]^n$ cube?
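The closing question has a direct answer, because the Gaussian measure is a product measure: the volume of the cube is $\Pr[|y| \le 1]^n$ for a single standard normal $y$. A minimal sketch using only the standard library follows; the function names are hypothetical.

```python
import math

def gaussian_volume_of_cube(n):
    """gamma_n([-1, 1]^n) = Pr[|y_i| <= 1 for all i], with y_i iid N(0, 1)."""
    per_coordinate = math.erf(1 / math.sqrt(2))  # Pr[|N(0,1)| <= 1], about 0.6827
    return per_coordinate ** n

def volume_exponent():
    """The constant c for which gamma_n([-1, 1]^n) = 2^(-c * n)."""
    return -math.log2(math.erf(1 / math.sqrt(2)))

print(gaussian_volume_of_cube(1), volume_exponent())
```

The cube therefore has Gaussian volume about $2^{-0.55n}$, which is much smaller than the $2^{-n/100}$ threshold in Gluskin's statement.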

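  26. A geometric view
     Gluskin'87: If K is symmetric and convex with large (Gaussian) volume ($> 2^{-n/100}$), then K contains a point with many coordinates in {-1,+1}.
     Proof: Look at $K + x$ for all $x \in \{-1,1\}^n$.
     $\gamma_n(K + x) \ge \gamma_n(K) \exp(-\|x\|^2/2)$
     Total volume of shifts $= 2^{\Omega(n)}$
     Some point $z$ lies in $2^{\Omega(n)}$ copies: $z = k + x$ and $z = k' + x'$, where $x, x'$ have large Hamming distance.
     Gives $(x - x')/2 = (k' - k)/2 \in K$, using that K is symmetric and convex. This point is $\pm 1$ exactly on the coordinates where $x$ and $x'$ differ and 0 elsewhere.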
