Improved Convergence for ℓ∞ and ℓ1 Regression via Iteratively Reweighted Least Squares
Alina Ene, Adrian Vladu - PowerPoint PPT Presentation



  1. Improved Convergence for ℓ∞ and ℓ1 Regression via Iteratively Reweighted Least Squares. Alina Ene, Adrian Vladu

  2. IRLS Method. Basic primitive: min ∑ r_i x_i^2 s.t. Ax = b.

  3. IRLS Method. Basic primitive: min ∑ r_i x_i^2 s.t. Ax = b. The solution is given by one linear system solve: x = R^(-1) A^T (A R^(-1) A^T)^(-1) b, where R = diag(r).
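
A minimal numpy sketch of this primitive (my own code, not from the slides; the pseudoinverse solve is a safety choice for rank-deficient A, such as a graph incidence matrix):

    import numpy as np

    def weighted_least_squares(A, b, r):
        """One linear system solve: argmin { sum_i r_i x_i^2 : Ax = b }."""
        Rinv_At = A.T / r[:, None]                  # R^(-1) A^T, with R = diag(r)
        M = A @ Rinv_At                             # A R^(-1) A^T
        lam = np.linalg.lstsq(M, b, rcond=None)[0]  # pseudoinverse solve of M lam = b
        return Rinv_At @ lam                        # x = R^(-1) A^T (A R^(-1) A^T)^+ b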

  4. IRLS Method. Basic primitive: min ∑ r_i x_i^2 s.t. Ax = b, with solution given by one linear system solve: x = R^(-1) A^T (A R^(-1) A^T)^(-1) b, R = diag(r). "Hard" problem: min |x|_p s.t. Ax = b, for p ∈ {1, ∞}.

  5. IRLS Method. Basic primitive: min ∑ r_i x_i^2 s.t. Ax = b, solved by one linear system. "Hard" problem: min |x|_p s.t. Ax = b, p ∈ {1, ∞}, equivalent to linear programming.
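
For concreteness, here are the standard LP reformulations behind that equivalence (spelled out by me; the slides only assert it):

    \min_{Ax=b} \|x\|_\infty = \min_{x,\,t} \{\, t : Ax = b,\ -t \le x_i \le t \ \forall i \,\}
    \min_{Ax=b} \|x\|_1      = \min_{x,\,t} \{\, \textstyle\sum_i t_i : Ax = b,\ -t_i \le x_i \le t_i \ \forall i \,\}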


  10. Benchmark: Optimization on Graphs. Minimize the congestion of a flow x: min |x|_∞ s.t. Ax = b, with the boundary condition that x routes the demand from s to t. This is maximum flow. [figure: s-t graph; the optimal flow puts 0.5 units on each edge of two disjoint s-t paths and 0 on the cross edge, for congestion 0.5]

  14. Benchmark: Optimization on Graphs. Minimize the cost of a flow x: min |x|_1 s.t. Ax = b, with the boundary condition that x routes the demand from the +1 vertices to the -1 vertices. This is minimum cost flow. [figure: graph with demands +1, +1 and -1, -1; the optimal flow uses edge values 1 and 0]

  18. Benchmark: Optimization on Graphs. min |x|_∞ s.t. Ax = b captures max flow; min |x|_1 s.t. Ax = b captures min cost flow.
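
As a sanity check, here is a hypothetical 4-node s-t instance solved through the LP reformulation above; the graph, variable names, and the use of scipy's linprog are my assumptions, not the talk's:

    import numpy as np
    from scipy.optimize import linprog

    # Vertex-edge incidence matrix for edges s->a, s->b, a->t, b->t, a->b.
    A = np.array([
        [ 1,  1,  0,  0,  0],   # s
        [-1,  0,  1,  0,  1],   # a
        [ 0, -1,  0,  1, -1],   # b
        [ 0,  0, -1, -1,  0],   # t
    ])
    b = np.array([1, 0, 0, -1])  # route one unit of demand from s to t
    m = A.shape[1]

    # Variables (x, t); minimize t subject to x_i - t <= 0 and -x_i - t <= 0.
    c = np.r_[np.zeros(m), 1.0]
    A_ub = np.block([[ np.eye(m), -np.ones((m, 1))],
                     [-np.eye(m), -np.ones((m, 1))]])
    A_eq = np.hstack([A, np.zeros((A.shape[0], 1))])
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * m),
                  A_eq=A_eq, b_eq=b, bounds=(None, None))
    print(res.x[-1])  # optimal congestion 0.5: the demand splits across the two disjoint paths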

  19. Benchmark: Optimization on Graphs. Q: Are these problems really that hard?
      First order methods (gradient descent) ➜ running time strongly depends on matrix structure ➜ in general, takes time at least Ω(m^(1.5)/poly(ε)).
      Second order methods (Newton's method, IRLS) ➜ interior point method: Õ(m^(1/2)) linear system solves ➜ can be made Õ(n^(1/2)) with a lot of work [Lee-Sidford '14].
      "Hybrid" method [Christiano-Kelner-Madry-Spielman-Teng, STOC '11] ➜ Õ(m^(1/3)/ε^(11/3)) linear system solves ➜ ~30 pages of description and proofs for a complicated method.

  23. This work. The natural IRLS method runs in Õ(m^(1/3)/ε^(2/3) + 1/ε^2) iterations, no matter what the structure of the underlying matrix is.

  25. This work. Example run for min |x|_∞ s.t. Ax = b on the s-t graph. Guess the OPT value (here 0.5); the goal is a flow x with |x|_∞ ≤ OPT.

  28. Initialize r = 1. [figure: every edge weight set to 1]

  29. Solve the least squares problem min ∑ r_i x_i^2 s.t. Ax = b. [figure: the resulting flow has edge values 0.6, 0.4, and 0.2; some edges exceed the guess OPT = 0.5]

  30. Update r: r_i ← r_i · max{(x_i/OPT)^2, 1}. [figure: weights on the overcongested edges increase from 1 to 1.44]

  33. Solve the least squares problem again with the new weights. [figure: the flow rebalances to edge values 0.55, 0.44, and 0.11]

  34. Update r again: r_i ← r_i · max{(x_i/OPT)^2, 1}. [figure: weights on the overcongested edges increase to 1.75]

  36. Repeat the solve and update steps. [figure: the weights on the congested edges grow to 2 and the flow converges to the optimal solution with edge values 0.5 and 0, i.e. congestion 0.5]
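
Putting the walkthrough together, a sketch of the whole loop, reusing weighted_least_squares and the hypothetical A, b from the earlier snippets; the fixed iteration count stands in for the paper's actual stopping rule:

    def irls_linf(A, b, OPT, iters=50):
        r = np.ones(A.shape[1])                   # Initialize r = 1
        for _ in range(iters):
            x = weighted_least_squares(A, b, r)   # least squares solve
            r *= np.maximum((x / OPT) ** 2, 1.0)  # r_i <- r_i * max{(x_i/OPT)^2, 1}
        return x

    x = irls_linf(A, b, OPT=0.5)
    print(np.abs(x).max())  # approaches the guessed optimum 0.5

The update only ever increases weights, and only on edges whose flow exceeds the guess OPT, which is what pushes flow off the congested edges in the slides' pictures.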

  39. Nonstandard Optimization Primitive
      ➜ The objective function is max_{r ≥ 0} min_{Ax = b} ∑ r_i x_i^2 / ∑ r_i
      ➜ Similar analysis to packing/covering LPs [Young '01]
      ➜ The ℓ1 version is a type of "slime mold dynamics" [Straszak-Vishnoi '16, '17]
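
One way to read this objective (my gloss; exchanging max and min requires the usual minimax conditions): for fixed x, the best normalized weights concentrate on the largest coordinate, so the saddle-point value is exactly the squared ℓ∞ optimum:

    \max_{r \ge 0} \min_{Ax=b} \frac{\sum_i r_i x_i^2}{\sum_i r_i}
      \;=\; \min_{Ax=b} \max_{r \ge 0} \frac{\sum_i r_i x_i^2}{\sum_i r_i}
      \;=\; \min_{Ax=b} \max_i x_i^2
      \;=\; \Bigl( \min_{Ax=b} \|x\|_\infty \Bigr)^{2}.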
