Minimax Rates for Memory-Constrained Sparse Linear Regression

Jacob Steinhardt, John Duchi
Stanford University
{jsteinha, jduchi}@stanford.edu
July 6, 2015


  1. Title: Minimax Rates for Memory-Constrained Sparse Linear Regression. Jacob Steinhardt, John Duchi (Stanford University), July 6, 2015. [slide 1 / 11]

  2. Resource-Constrained Learning
     How do we solve statistical problems with limited resources?
     - computation (Natarajan, 1995; Berthet & Rigollet, 2013; Zhang et al., 2014; Foster et al., 2015)
     - privacy (Kasiviswanathan et al., 2011; Duchi et al., 2013)
     - communication / memory (Zhang et al., 2013; Shamir, 2014; Garg et al., 2014; Braverman et al., 2015)

  3. Setting
     Sparse linear regression in R^d:
         Y^(i) = ⟨w∗, X^(i)⟩ + ε^(i),   ‖w∗‖₀ = k,   k ≪ d
     Memory constraint:
     - (X^(i), Y^(i)) observed as a read-only stream
     - only b bits of state Z^(i) are kept between successive observations
     [Figure: w∗ generates the stream (X^(1), Y^(1)), (X^(2), Y^(2)), (X^(3), Y^(3)), ...;
      the learner passes only a b-bit state Z^(1) → Z^(2) → Z^(3) → ... between observations.]
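The streaming protocol above is easy to mimic in a few lines. The sketch below is a toy illustration only, not the algorithm from the talk: it runs one-pass SGD on a synthetic k-sparse instance and stochastically rounds the state to a coarse grid after every update, so that only roughly d times a few bits are carried between observations. All names, constants, and the grid size are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy k-sparse instance: d features, k of them relevant, Gaussian design.
d, k, n = 50, 3, 5000
w_star = np.zeros(d)
w_star[:k] = 1.0

STEP = 2.0 ** -6  # grid spacing, i.e. a "few bits per coordinate" budget (illustrative)

def stochastic_round(v, step=STEP):
    """Unbiased rounding to the grid, so the quantization noise averages out."""
    lo = np.floor(v / step) * step
    prob_up = (v - lo) / step
    return lo + step * (rng.random(v.shape) < prob_up)

# One pass over the stream: the quantized vector z is the only state
# carried between observations (playing the role of Z^(i)).
z = np.zeros(d)
lr = 0.01
for _ in range(n):
    x = rng.standard_normal(d)
    y = w_star @ x + 0.1 * rng.standard_normal()
    grad = (z @ x - y) * x                 # squared-loss gradient at state z
    z = stochastic_round(z - lr * grad)    # update, then re-quantize the state

err = float(np.sum((z - w_star) ** 2))
print(f"||z - w*||_2^2 = {err:.3f}")
```

The unbiased (dithered) rounding matters: deterministic rounding would swallow any gradient step smaller than half the grid spacing and the iterate would stall.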

  4. Motivating Question
     If we have enough memory to represent the answer, can we also efficiently learn the answer?

  5. Problem Statement
     How much data n is needed to obtain an estimator ŵ with E[‖ŵ − w∗‖₂²] ≤ ε?

     Classical case (no memory constraint):
     Theorem (Wainwright, 2009): (k/ε) log(d) ≲ n ≲ (k/ε) log(d)
     Achievable with Õ(d) memory (Agarwal et al., 2012; S., Wager, & Liang, 2015).

     With a memory constraint of b bits:
     Theorem (S. & Duchi, 2015): kd/(εb) ≲ n ≲ kd/(ε²b)
     Exponential increase if b ≪ d!
     [Note: up to log factors; assumes k log(d) ≪ b ≤ d.]
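To get a feel for the gap between the two regimes, one can plug numbers into the two rates. The helper names and the choices of k, d, ε, and b below are illustrative assumptions, not values from the talk; constants and log factors are dropped.

```python
import math

def n_classical(k, d, eps):
    """Order of the unconstrained sample complexity: (k/eps) * log d."""
    return (k / eps) * math.log(d)

def n_memory_constrained(k, d, eps, b):
    """Order of the lower bound under a b-bit memory budget: k*d / (eps*b)."""
    return k * d / (eps * b)

# Illustrative numbers: d = 10^6 features, k = 100 relevant ones, accuracy eps = 0.1.
k, d, eps = 100, 10**6, 0.1

# Give the algorithm memory proportional to the answer itself, b ~ k log d bits:
# the required n still blows up by roughly a factor of d / (b log d).
b = int(k * math.log(d))
print(n_classical(k, d, eps))              # ~1.4e4
print(n_memory_constrained(k, d, eps, b))  # ~7.2e5
```

Even with memory comfortably above the k log d bits needed to write down w∗, the data requirement here grows by more than an order of magnitude, and the gap widens linearly in d.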

  6. Proof Overview
     Lower bound:
     - information-theoretic, via a strong data-processing inequality
       [Figure: Markov chain W∗ → (X, Y) → Z; passing through the b-bit state
        contracts the information carried about W∗ by a factor of order b/d.]
     - main challenge: dependence between X and Y
     Upper bound:
     - count-min sketch + ℓ₁-regularized dual averaging
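The first ingredient of the upper bound, the count-min sketch, is a standard streaming data structure and can be sketched on its own; the pairing with ℓ₁-regularized dual averaging in the talk is more involved and omitted here. The class below is a generic textbook implementation; its parameters and hash family are ordinary choices, not the ones from the paper.

```python
import numpy as np

class CountMinSketch:
    """Count-min sketch: a depth x width table of counters approximates a
    large vector of non-negative counts in O(depth * width) memory, with
    one-sided (over-)estimation error bounded with high probability."""

    def __init__(self, width=256, depth=4, seed=0):
        self.width, self.depth = width, depth
        self.table = np.zeros((depth, width))
        rng = np.random.default_rng(seed)
        # One (a*x + b) mod p hash per row; p is a Mersenne prime.
        self.p = 2_147_483_647
        self.a = rng.integers(1, self.p, size=depth)
        self.b = rng.integers(0, self.p, size=depth)

    def _hash(self, key):
        # One column index per row, as a length-`depth` array.
        return (self.a * key + self.b) % self.p % self.width

    def update(self, key, value=1.0):
        self.table[np.arange(self.depth), self._hash(key)] += value

    def query(self, key):
        # Min over rows limits the damage from hash collisions
        # (valid for non-negative updates).
        return self.table[np.arange(self.depth), self._hash(key)].min()

# A "heavy" coordinate (key 7) gets most of the mass; small noise mass
# is spread over many other keys, mimicking a sparse-plus-noise vector.
cms = CountMinSketch()
for _ in range(1000):
    cms.update(7, 1.0)
for key in range(1000, 2000):
    cms.update(key, 0.01)

print(cms.query(7))  # at least 1000, and close to it: collisions add little
```

The relevance to the talk's setting: heavy (i.e. large) coordinates of a k-sparse vector survive the sketch, so the structure stores the useful part of a d-dimensional object in memory far below d.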
