  1. Beating Simplex for fractional packing and covering linear programs
  Christos Koufogiannakis and Neal E. Young
  University of California, Riverside
  March 9, 2009

  2. G&K’s sublinear-time algorithm for zero-sum games
  Theorem (Grigoriadis and Khachiyan, 1995). Given a two-player zero-sum m × n matrix game A with payoffs in [−1, 1], near-optimal mixed strategies can be computed in time O((m + n) log(mn)/ε²). Each strategy gives expected payoff within additive ε of optimal.
  The matrix has size m × n, so for fixed ε this is sublinear time.
  The algorithm can be viewed as fictitious play, where each player plays randomly from a distribution. The distribution gives more weight to pure strategies that are good responses to the opponent’s historical average play.
  It takes O(log(mn)/ε²) rounds, and each round takes O(m + n) time.
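The fictitious-play idea above can be sketched with a deterministic multiplicative-weights variant. This is a simplified illustration, not G&K's algorithm: it computes an exact best response each round (O(mn) per round), whereas G&K sample randomly to get the sublinear bound. The matrix, step size eta, and round count below are illustrative choices.

```python
import math

def mw_zero_sum(A, eta=0.1, rounds=2000):
    """Approximate optimal mixed strategies for a zero-sum game with
    payoffs A[i][j] in [-1, 1] (row player maximizes), via
    multiplicative weights for the row player against best responses."""
    m, n = len(A), len(A[0])
    w = [1.0] * m            # row player's weights over pure strategies
    p_avg = [0.0] * m        # running average of row strategies
    q_avg = [0.0] * n        # running average of column best responses
    for _ in range(rounds):
        total = sum(w)
        p = [wi / total for wi in w]
        # column player best-responds: minimize the row player's payoff
        payoffs = [sum(p[i] * A[i][j] for i in range(m)) for j in range(n)]
        j = min(range(n), key=lambda col: payoffs[col])
        # reweight toward rows that do well against column j, then normalize
        w = [wi * math.exp(eta * A[i][j]) for i, wi in enumerate(w)]
        z = sum(w)
        w = [wi / z for wi in w]
        for i in range(m):
            p_avg[i] += p[i] / rounds
        q_avg[j] += 1.0 / rounds
    return p_avg, q_avg
```

The averaged strategies converge to a near-equilibrium pair; the additive error shrinks like O(eta + log(m)/(eta · rounds)), matching the 1/ε² flavor of the theorem.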


  4. How do LP algorithms do in practice?
  Simplex, interior-point methods, ellipsoid method.
  An optimistic estimate of Simplex run time (# basic operations):
  (# pivots) × (time per pivot) ≈ 5 min(m, n) × mn    (m rows, n columns)
  Empirically, the ratio (observed time / this estimate) is in [0.3, 20].
  [Scatter plot: y = actual time / estimated time versus x = estimated time for Simplex, both on log scales.]

  5. How do LP algorithms do in practice?
  Simplex, interior-point methods, ellipsoid method.
  An optimistic estimate of Simplex run time (# basic operations):
  (# pivots) × (time per pivot) ≈ 5 min(m, n) × mn    (m rows, n columns)
  In terms of the number of non-zeroes N (m + n ≤ N ≤ mn):
  ◮ if the constraint matrix is dense: time Θ(N^1.5)
  ◮ if the constraint matrix is sparse: time Θ(N³)
  This is optimistic: Simplex can be slower if numerical issues arise.
  The time to find, say, a .95-approximate solution is comparable.
  The time for interior-point methods seems similar (within constant factors).

  6. We will extend G&K to LPs with non-negative coefficients:
  packing: maximize c·x such that Ax ≤ b; x ≥ 0
  covering: minimize b·y such that Aᵀy ≥ c; y ≥ 0
  ... solutions with relative error ε (harder to compute):
  ◮ a feasible x with cost ≥ (1 − ε) OPT,
  ◮ a feasible y with cost ≤ (1 + ε) OPT, or
  ◮ a primal-dual pair (x, y) with c·x ≥ b·y/(1 + ε).
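The primal-dual certificate in the last bullet rests on weak duality: any feasible packing x and covering y satisfy c·x ≤ b·y, so a pair with c·x ≥ b·y/(1 + ε) certifies that both are within relative error ε of OPT. A small sanity check on a toy instance (the matrix and vectors below are arbitrary illustrative data):

```python
# Toy instance: A, b, c chosen for illustration only.
A = [[1.0, 2.0],
     [3.0, 1.0]]
b = [1.0, 1.0]
c = [1.0, 1.0]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def is_packing_feasible(x):
    # Ax <= b, x >= 0
    return all(xi >= 0 for xi in x) and all(
        dot(row, x) <= bi + 1e-12 for row, bi in zip(A, b))

def is_covering_feasible(y):
    # A^T y >= c, y >= 0
    m, n = len(A), len(A[0])
    return all(yi >= 0 for yi in y) and all(
        sum(A[i][j] * y[i] for i in range(m)) >= c[j] - 1e-12
        for j in range(n))

x = [0.2, 0.4]   # feasible packing solution: Ax = (1.0, 1.0)
y = [0.4, 0.2]   # feasible covering solution: A^T y = (1.0, 1.0)
```

Here c·x = b·y = 0.6, so this particular pair is in fact exactly optimal (a zero duality gap certificate).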

  7. But... isn’t LP equivalent to solving a zero-sum game?
  canonical packing LP: maximize |x|₁ such that Ax ≤ 1; x ≥ 0
  equivalent game: minimize λ such that Az ≤ λ; z ≥ 0; |z|₁ = 1
  The correspondence:
  ◮ solution x* (can be large) ⇔ solution z* = x*/|x*|, with λ* = 1/|x*|
  ◮ relative error ε ⇔ additive error ε/|x*|
  ◮ So the straight G&K algorithm (given A_ij ∈ [0, 1]) requires time |x*|² (m + n) log(m + n)/ε² to achieve relative error ε.

  8. Run time it will take us to get relative error ε
  Worst-case time (n = rows, m = columns, N = non-zeros, n + m ≤ N ≤ nm):
  O(N + (n + m) log(nm)/ε²)
  ◮ This is O(N) (linear) for fixed ε and slightly dense matrices.
  ◮ Really? In practice 1/ε² is a “constant” that matters...
  ... for ε ≈ 1% down to 0.1%, the “constant” 1/ε² is 10⁴ to 10⁶.

  9. Run time it will take us to get relative error ε
  Worst-case time (n = rows, m = columns, N = non-zeros, n + m ≤ N ≤ nm):
  O(N + (n + m) log(nm)/ε²)
  Empirically: about 40N + 12(n + m) log(nm)/ε² basic ops.
  Empirically, the ratio (observed time / this estimate) is in [1, 2].
  [Scatter plot: y = actual time / estimated time (between 0.8 and 2.2) versus x = estimated time (log scale).]

  10. Estimated speedup versus Simplex (n × n matrix)
  estimated speedup ≈ (est. Simplex run time) / (est. algorithm run time) ≈ ε²n² / (12 ln n)
  Empirically, the ratio (observed speedup / this estimate) is in [0.4, 10].
  [Scatter plot: y = actual speedup / estimated speedup versus x = estimated algorithm time, both on log scales.]
  Slower than Simplex for small n, faster than Simplex for large n.
  “Hours instead of days, days instead of years.”

  11. Estimated speedup versus Simplex (n × n matrix)
  estimated speedup ≈ (est. Simplex run time) / (est. algorithm run time) ≈ ε²n² / (12 ln n)
  ◮ Slower than Simplex for small n, faster for large n.
  ◮ Break even at about 900 rows and columns (for ε = 1%).
  ◮ For larger problems, the speedup grows proportionally to n²/ln n.
  “Hours instead of days, days instead of years.” (with ε = 1% and a 1 GHz CPU)
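The break-even claim can be checked directly from the speedup estimate: find the smallest n at which ε²n²/(12 ln n) reaches 1 for ε = 1%. (This uses the empirical constant 12 from the previous slides; a different constant would shift the threshold.)

```python
import math

def break_even_n(eps=0.01):
    """Smallest n (n x n dense matrix) at which the estimated speedup
    eps^2 * n^2 / (12 * ln n) over Simplex reaches 1."""
    n = 2
    while eps**2 * n**2 / (12 * math.log(n)) < 1.0:
        n += 1
    return n
```

For ε = 1% this lands just above 900, matching the slide's "about 900 rows and columns".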

  12. Next (sketch of algorithm):
  ◮ canonical forms for packing and covering
  ◮ some smooth penalty functions
  ◮ simple gradient-based basic packing and covering algorithms
  ◮ coupling two algorithms (Grigoriadis & Khachiyan)
  ◮ non-uniform increments (Garg & Könemann)
  ◮ combining coupling and non-uniform increments (new)
  ◮ a random-sampling trick (new; won’t present today)

  13. packing and covering, canonical form
  maximize over x of |x|₁ / (max_i A_i x)  =  OPT  =  minimize over y of |y|₁ / (min_j Aᵀ_j y)
  (1 + ε)-approximate primal-dual pair: x ≥ 0, y ≥ 0 with
  |x|₁ / (max_i A_i x)  ≥  (1 − O(ε)) × |y|₁ / (min_j Aᵀ_j y)
  A – constraint matrix (rows i = 1..m, columns j = 1..n)
  |x| – size (1-norm), Σ_j x_j
  A_i x – left-hand side of the i-th packing constraint
  Aᵀ_j y – left-hand side of the j-th covering constraint

  14. smooth estimates of max and min
  Define smax(z₁, z₂, …, z_m) = ln Σ_i e^{z_i}.
  1. smax approximates max within an additive ln m:
     |smax(z₁, z₂, …, z_m) − max_i z_i| ≤ ln m.
  2. smax is (1 + ε)-smooth within an ε-neighborhood:
     if each d_i ≤ ε, then smax(z + d) ≤ smax(z) + (1 + ε) d · ∇smax(z).
  The analogous estimate of min:
  smin(z₁, z₂, …, z_n) = −ln Σ_i e^{−z_i} ≥ min_j z_j − ln n.
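Both properties are easy to check numerically. A small sketch (the gradient of smax is the softmax distribution, which the packing algorithm on the next slide uses as its vector p; the test values of z and d are arbitrary):

```python
import math

def smax(z):
    # ln sum_i e^{z_i}, computed in a numerically stable way
    t = max(z)
    return t + math.log(sum(math.exp(zi - t) for zi in z))

def grad_smax(z):
    # softmax: d smax / d z_i = e^{z_i} / sum_k e^{z_k}
    t = max(z)
    e = [math.exp(zi - t) for zi in z]
    s = sum(e)
    return [ei / s for ei in e]
```

Since smax(z) ≥ max_i z_i always, property 1 really says smax overshoots max by at most ln m, the value it takes when all z_i are equal.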

  15. Packing algorithm, assuming each A_ij ∈ [0, 1]
  1. x ← 0
  2. while max_i A_i x ≤ ln(m)/ε do:
  3.   let vector p = ∇smax(Ax)
  4.   choose j minimizing Aᵀ_j p (the derivative of smax(Ax) w.r.t. x_j)
  5.   increase x_j by ε
  6. return x (appropriately scaled)
  Theorem (e.g. GK, PST, Y, GK, …, 1990’s). The algorithm returns a (1 + O(ε))-approximate packing solution.
  Proof sketch. In each iteration, since A_ij ∈ [0, 1], each A_i x increases by at most ε. Using the smoothness of smax, show the invariant smax(Ax) ≤ ln m + (1 + O(ε)) |x| / OPT ...
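The six lines above transcribe directly into code. A sketch (the final scaling simply divides x by max_i A_i x to make it feasible; the instances in the usage check are illustrative):

```python
import math

def packing(A, eps):
    """Approximately maximize |x|_1 subject to Ax <= 1, x >= 0,
    assuming each A[i][j] is in [0, 1]."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    Ax = [0.0] * m                 # maintain Ax incrementally
    while max(Ax) <= math.log(m) / eps:
        # p = gradient of smax at Ax, i.e. the softmax of Ax
        t = max(Ax)
        e = [math.exp(v - t) for v in Ax]
        s = sum(e)
        p = [ei / s for ei in e]
        # pick the column whose increase raises the penalty least
        j = min(range(n),
                key=lambda col: sum(A[i][col] * p[i] for i in range(m)))
        x[j] += eps
        for i in range(m):
            Ax[i] += A[i][j] * eps
    scale = max(Ax)                # A(x/scale) <= 1, so x/scale is feasible
    return [xj / scale for xj in x]
```

On the 2 × 2 identity matrix (OPT = 2) the algorithm alternates between the two columns and returns a vector summing to nearly 2, as the theorem promises.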
