Elementary Estimators for High-Dimensional Linear Regression

Eunho Yang (eunho@cs.utexas.edu)
Department of Computer Science, The University of Texas, Austin, TX 78712, USA

Aurélie C. Lozano (aclozano@us.ibm.com)
IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA

Pradeep Ravikumar (pradeepr@cs.utexas.edu)
Department of Computer Science, The University of Texas, Austin, TX 78712, USA

Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 2014. JMLR: W&CP volume 32. Copyright 2014 by the author(s).

Abstract

We consider the problem of structurally constrained high-dimensional linear regression. This has attracted considerable attention over the last decade, with state-of-the-art statistical estimators based on solving regularized convex programs. While these typically non-smooth convex programs can be solved in polynomial time by state-of-the-art optimization methods, scaling them to very large problems is an ongoing and rich area of research. In this paper, we attempt to address this scaling issue at the source, by asking whether one can build simpler, possibly closed-form estimators that nonetheless come with statistical guarantees comparable to those of regularized likelihood estimators. We answer this question in the affirmative, with variants of the classical ridge and OLS (ordinary least squares) estimators for linear regression. We analyze our estimators in the high-dimensional setting, and moreover provide empirical corroboration of their performance on simulated as well as real-world microarray data.

1. Introduction

We consider the problem of high-dimensional linear regression, where the number of variables p could potentially be even larger than the number of observations n. Under such high-dimensional regimes, it is now well understood that consistent estimation is typically not possible unless one imposes low-dimensional structural constraints upon the regression parameter vector. Popular structural constraints include sparsity, where very few entries of the high-dimensional regression parameter are assumed to be nonzero; group-sparse constraints; and low-rank structure for matrix-structured parameters, among others.

The development of consistent estimators for such structurally constrained high-dimensional linear regression has attracted considerable recent attention. A key class of estimators is based on regularized maximum likelihood; in the case of linear regression with Gaussian noise, these take the form of regularized least squares estimators. For the case of sparsity, a popular instance is constrained basis pursuit, or the LASSO (Tibshirani, 1996), which solves an ℓ1-regularized (or, equivalently, ℓ1-constrained) least squares problem, and has been shown to have strong statistical guarantees, including prediction error consistency (van de Geer & Bühlmann, 2009), consistency of the parameter estimates in ℓ2 or some other norm (van de Geer & Bühlmann, 2009; Meinshausen & Yu, 2009; Candes & Tao, 2006), as well as variable selection consistency (Meinshausen & Bühlmann, 2006; Wainwright, 2009; Zhao & Yu, 2006). For the case of group-sparse structured linear regression, ℓ1/ℓq-regularized least squares (with q ≥ 2) has been proposed (Tropp et al., 2006; Zhao et al., 2009; Yuan & Lin, 2006; Jacob et al., 2009), and shown to have strong statistical guarantees, including convergence rates in ℓ2-norm (Lounici et al., 2009; Baraniuk et al., 2008) as well as model selection consistency (Obozinski et al., 2008; Negahban & Wainwright, 2009). For the matrix-structured least squares problem, nuclear norm regularized estimators have been studied, for instance, in (Recht et al., 2010; Bach, 2008). For other structurally constrained least squares problems, see (Huang et al., 2011; Bach et al., 2012; Negahban et al., 2012) and references therein. All of these estimators solve convex programs, though with non-smooth components due to the respective regularization functions. The state-of-the-art optimization methods for solving these programs are iterative, and can approach the optimal solution within any finite accuracy with computational complexity that scales polynomially with the number of variables.
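To make the iterative approach concrete, here is a minimal sketch of one such solver: ISTA (proximal gradient descent) applied to the ℓ1-regularized least squares problem that the LASSO solves. This is a standard textbook method, not code from the paper; the function names, step-size choice, and iteration count are our own illustration.

    import numpy as np

    def soft_threshold(v, t):
        # Elementwise soft-thresholding: the proximal operator of t * ||.||_1.
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def lasso_ista(X, y, lam, n_iter=500):
        # ISTA for: min_theta (1/(2n)) ||y - X theta||_2^2 + lam * ||theta||_1.
        n, p = X.shape
        theta = np.zeros(p)
        # Step size 1/L, where L = sigma_max(X)^2 / n is the Lipschitz
        # constant of the gradient of the smooth least squares term.
        step = n / (np.linalg.norm(X, 2) ** 2)
        for _ in range(n_iter):
            grad = X.T @ (X @ theta - y) / n
            theta = soft_threshold(theta - step * grad, step * lam)
        return theta

Each iteration costs O(np) for the matrix-vector products, and the number of iterations needed grows as the target accuracy shrinks; this per-instance iterative cost is the scaling burden the paper aims to remove.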
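By contrast, the closed-form estimators promised in the abstract require no iterations at all. As a rough illustration of the idea (a sketch under our own assumptions: we soft-threshold a ridge-type pilot estimate, with eps and lam as illustrative tuning parameters; the paper's actual construction and its parameter choices follow its theory), such a "variant of classical ridge" can be computed with one linear solve and one thresholding pass:

    import numpy as np

    def elementary_ridge(X, y, eps, lam):
        # Hypothetical closed-form estimator: compute a ridge-type pilot
        # estimate, then soft-threshold it to obtain a sparse parameter
        # vector. eps and lam are illustrative, not the paper's choices.
        p = X.shape[1]
        pilot = np.linalg.solve(X.T @ X + eps * np.eye(p), X.T @ y)
        return np.sign(pilot) * np.maximum(np.abs(pilot) - lam, 0.0)

The appeal is computational: the cost is a single linear solve, with no dependence on a target optimization accuracy, while the paper argues that such estimators can retain statistical guarantees comparable to those of the regularized programs above.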
