Sparse Nonparametric Density Estimation in High Dimensions Using the - PowerPoint PPT Presentation

Outline Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo Han Liu 1 , 2 John Lafferty 2 , 3 Larry Wasserman 1 , 2 1 Statistics Department, 2 Machine Learning Department, 3 Computer Science Department, Carnegie Mellon University July 1st, 2006 Liu, Lafferty, Wasserman Sparse Nonparametric Density Estimation

Outline Motivation Research background Rodeo is a general strategy for nonparametric inference. It has been successfully applied to solve sparse nonparametric regression problems in high dimensions by Lafferty & Wasserman, 2005. Our goal Trying to adapt the rodeo framework to nonparametric density estimation problems . So that we have a unified framework for both density estimation and regression problems which is computationally efficient and theoretically soundable Liu, Lafferty, Wasserman Sparse Nonparametric Density Estimation

Outline Outline 1 Background Nonparametric density estimation in high dimensions Sparsity assumptions for density estimation 2 Methodology and Algorithms The main idea The local rodeo algorithm for the kernel density estimator 3 Asymptotic Properties The asymptotic running time and minimax risk 4 Extension and Variations The global density rodeo and the reverse density rodeo Using other distributions as irrelevant dimensions 5 Experimental Results Empirical results on both synthetic and real-world datasets Liu, Lafferty, Wasserman Sparse Nonparametric Density Estimation

Background Methodology and Algorithms Nonparametric density estimation in high dimensions Asymptotic Properties Sparsity assumptions and the rodeo framework Extension and Variations Experimental Results Problem statement Problem To estimate the joint density of a continuous d -dimensional random vector X = ( X 1 , X 2 , ..., X d ) ∼ F , d ≫ 3 where F is the unknown distribution with density function f ( x ). This problem is essentially hard, since the high dimensionality causes both computational and theoretical problems. Liu, Lafferty, Wasserman Sparse Nonparametric Density Estimation

Background Methodology and Algorithms Nonparametric density estimation in high dimensions Asymptotic Properties Sparsity assumptions and the rodeo framework Extension and Variations Experimental Results Previous work From a frequentist perspective Kernel density estimation and the local likelihood method Projection pursuit method Log-spline models and the penalized likelihood method From a Bayesian perspective Mixture of normals with Dirichlet processes as prior Difficulties of current approaches Some methods only work well for low-dimensional problems Some heuristics lack the theoretical guarantees More importantly, they suffer from the curse of dimensionality Liu, Lafferty, Wasserman Sparse Nonparametric Density Estimation

Background Methodology and Algorithms Nonparametric density estimation in high dimensions Asymptotic Properties Sparsity assumptions and the rodeo framework Extension and Variations Experimental Results The curse of dimensionality Characterizing the curse In a Sobolev space of order k , minimax theory shows that the best convergence rate for the mean squared error is � n − 2 k/ (2 k + d ) � R opt = O which is practically slow when the dimension d is large. Combating the curse by some sparsity assumptions If the high-dimensional data has a low dimensional structure or a sparsity condition, we expect that some methods could be developed to combat the curse of dimensionality. This motivates the development of the rodeo framework Liu, Lafferty, Wasserman Sparse Nonparametric Density Estimation

Background Methodology and Algorithms Nonparametric density estimation in high dimensions Asymptotic Properties Sparsity assumptions and the rodeo framework Extension and Variations Experimental Results Rodeo for nonparametric regression (I) Rodeo ( r egularization o f d erivative e xpectation o perator) is a general strategy for nonparametric inference. Which has been used for nonparametric regression For a regression problem Y i = m ( X i ) + ǫ i , i = 1 , . . . , n where X i = ( X i 1 , ..., X id ) ∈ R d is a d -dimensional covariate. If m is in a d -dimensional Sobolev space of order 2, the best convergence rate for the risk is R ∗ = O � n − 4 / (4+ d ) � Which shows the curse of dimensionality in a regression setting. Liu, Lafferty, Wasserman Sparse Nonparametric Density Estimation

Background Methodology and Algorithms Nonparametric density estimation in high dimensions Asymptotic Properties Sparsity assumptions and the rodeo framework Extension and Variations Experimental Results Rodeo for nonparametric regression (II) Assume the true function only depends on r covariates ( r ≪ d ) m ( x ) = m ( x 1 , ..., x r ) for any ǫ > 0, the rodeo can simultaneously perform bandwidth selection and (implicitly) variable selection to achieve a better minimax convergence rate of � n − 4 / (4+ r )+ ǫ � R rodeo = O as if the r relevant variables were explicitly isolated in advance. Rodeo beats the curse of dimensionality in this sense. We expect to apply the same idea to solve density estimation problems. Liu, Lafferty, Wasserman Sparse Nonparametric Density Estimation

Background Methodology and Algorithms Nonparametric density estimation in high dimensions Asymptotic Properties Sparsity assumptions and the rodeo framework Extension and Variations Experimental Results Sparse density estimation For many applications, the true density function can be characterized by some low dimensional structure Sparsity assumption for density estimation problems Assume h jj ( x ) is the second partial derivative of h on the j -th varaible, there exists some r ≪ d , such that f ( x ) ∝ g ( x 1 , ..., x r ) h ( x ) where h jj ( x ) = 0 for j = 1 , ..., d. Where x R = { x 1 , ..., x r } are the relevant dimensions. This condition imposes that h ( · ) belongs to a family of very smooth functions (e.g. the uniform distribution). h ( · ) can be generalized to be any parametric distribution! Liu, Lafferty, Wasserman Sparse Nonparametric Density Estimation

Background Methodology and Algorithms Nonparametric density estimation in high dimensions Asymptotic Properties Sparsity assumptions and the rodeo framework Extension and Variations Experimental Results Generalized sparse density estimation We can generalize h ( · ) to other distributions (e.g. Gaussian). General sparsity assumption for density estimation problems Assume h ( · ) is any distribution (e.g. Gaussian) that we are not interested in f ( x ) ∝ g ( x 1 , ..., x r ) h ( x ) where r ≪ d. Thus, the density function f ( · ) can be factored into two parts: the relevant components g ( · ) and the irrelevant components h ( · ). Where x R = { x 1 , ..., x r } are the relevant dimensions. Under this framework, we can hope to achieve a better minimax rate � n − 4 / (4+ r ) � R ∗ rodeo = O Liu, Lafferty, Wasserman Sparse Nonparametric Density Estimation

Background Methodology and Algorithms Nonparametric density estimation in high dimensions Asymptotic Properties Sparsity assumptions and the rodeo framework Extension and Variations Experimental Results Related work Recent work that addressed this problem Minimum volume set (Scott& Nowak, JMLR06) Nongaussian component analysis; (Blanchard et al. JMLR06) Log-ANOVA model; (Lin & Joen, Statistical Sinica 2006) Advantages of our approach: Rodeo can utilize well-established nonparametric estimators A unified framework for different kinds of problems easy to implement and is amenable to theoretical analysis Liu, Lafferty, Wasserman Sparse Nonparametric Density Estimation

Background Methodology and Algorithms The main idea Asymptotic Properties The local rodeo algorithm for the kernel density estimato Extension and Variations Experimental Results Density rodeo: the main idea The key intuition: if a dimension is irrelevant, then changing the smoothing parameter of that dimension should only result in a small change in the whole estimator Basically, Rodeo is just a regularization strategy Use a kernel density estimator start with large bandwidths Calculate the gradient of the estimator w.r.t. the bandwidth Sequentially decrease the bandwidths in a greedy way, and try to freeze this decay process by some thresholding strategy to achieve a sparse solution Liu, Lafferty, Wasserman Sparse Nonparametric Density Estimation

Background Methodology and Algorithms The main idea Asymptotic Properties The local rodeo algorithm for the kernel density estimato Extension and Variations Experimental Results Density rodeo: the main idea Assuming a fixed point x and let ˆ f H ( x ) denote an estimator of f ( x ) based on smoothing parameter matrix H = diag( h 1 , ..., h d ) Let M ( h ) = E ( ˆ f h ( x )) denote the mean of ˆ f h ( x ), therefore, f ( x ) = M (0) = E ( ˆ f 0 ( x )). Assuming P = { h ( t ) : 0 ≤ t ≤ 1 } is a smooth path through the set of smoothing parameters with h (0) = 0 and h (1) = 1, then � 1 dM ( h ( s )) f ( x ) = M (1) − ( M (1) − M (0)) = M (1) − ds ds 0 � 1 � D ( s ) , ˙ M (1) − h ( s ) � ds = 0 � T � ∂h 1 , ..., ∂M ∂M where D ( h ) = ∇ M ( h ) = is the gradient of M ( h ) and ∂h d h ( s ) = dh ( s ) ˙ is the derivative of h ( s ) along the path. ds Liu, Lafferty, Wasserman Sparse Nonparametric Density Estimation

Sparse Nonparametric Density Estimation in High Dimensions Using the - PowerPoint PPT Presentation

Outline Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo Han Liu 1 , 2 John Lafferty 2 , 3 Larry Wasserman 1 , 2 1 Statistics Department, 2 Machine Learning Department, 3 Computer Science Department, Carnegie Mellon

Nonparametric density estimation Christopher F Baum EC 823: Applied Econometrics Boston College,

Nonparametric density estimation Christopher F Baum ECON 8823: Applied Econometrics Boston

Outline Density Estimation 1 Nonparametric Methods Bins Kernel Estimators k-Nearest Neighbor

Nonparametric Methods Steven J Zeil Old Dominion Univ. Fall 2010 1 Density Estimation

Nonparametric Density Estimation October 1, 2018 Introduction If we cant fit a

Nonparametric Minimax Estimation of the Estimation of the Volatility in High- Volatility in

Polyethylene Monomer: Ethylene High Density Polyethylene (HDPE) Low Density Polyethylene

Sparse Matrices Example Of Sparse Matrices diagonal tridiagonal sparse many elements are

Sparse Kernel Density Estimation Technique Based on Zero-Norm Constraint Xia Hong 1 , Sheng Chen 2

Nonparametric Methods Michael R. Roberts Department of Finance The Wharton School University of

Relative Density Chapters 3.5 Relative Density 1 2/5/2015 Minimum Density Pluviate soil from

Nonparametric Sequential Change Detection for High-Dimensional Problems Yasin Ylmaz Electrical

Nonparametric spectral-based estimation of latent structures Stphane Bonhomme (Chicago), Koen

Density Ratio Estimation Density Ratio Estimation in Machine Learning in Machine Learning

Non-parametric Methods Oliver Schulte - CMPT 726 Bishop PRML Ch. 2.5 Kernel Density Estimation

Sparse Matrices sparse many elements are zero dense few elements are zero Example Of

INDUCTION OF a few sentiments PREFECTS 20 0 4 w e r e expressed SCHOOL CAPTAINS by Sila

19 March 2018 Welcome Steering Committee Update Reminder to Vote! TMF RM Community

Rapid Response 1 5/25/2016 By the end of our time with you today We hope youll:

SEPECC Meeting 9:10 OCDEL & ELRC Updates 9:30 Funding Update Tuesday, October 13, 2020 9:40

Nonparametric Sparsity John Lafferty Larry Wasserman Computer Science Dept.

Generalized Significance in Scale Space: The GS3 Package Daniel V. Samarov Statistical

Safe R oute s to Sc hool Tuesday, September 1, 2020 12:151:15PM House ke e ping 1.

MARCH MEETING | THURSDAY, MARCH 28, 2019 WESTERN DAKOTA TECHNICAL INSTITUTE | RAPID CITY, SD

Sparse Nonparametric Density Estimation in High Dimensions Using the - PowerPoint PPT Presentation

Outline Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo Han Liu 1 , 2 John Lafferty 2 , 3 Larry Wasserman 1 , 2 1 Statistics Department, 2 Machine Learning Department, 3 Computer Science Department, Carnegie Mellon

Nonparametric density estimation Christopher F Baum EC 823: Applied Econometrics Boston College,

Nonparametric density estimation Christopher F Baum ECON 8823: Applied Econometrics Boston

Outline Density Estimation 1 Nonparametric Methods Bins Kernel Estimators k-Nearest Neighbor

Nonparametric Methods Steven J Zeil Old Dominion Univ. Fall 2010 1 Density Estimation

Nonparametric Density Estimation October 1, 2018 Introduction If we cant fit a

Nonparametric Minimax Estimation of the Estimation of the Volatility in High- Volatility in

Polyethylene Monomer: Ethylene High Density Polyethylene (HDPE) Low Density Polyethylene

Sparse Matrices Example Of Sparse Matrices diagonal tridiagonal sparse many elements are

Sparse Kernel Density Estimation Technique Based on Zero-Norm Constraint Xia Hong 1 , Sheng Chen 2

Nonparametric Methods Michael R. Roberts Department of Finance The Wharton School University of

Relative Density Chapters 3.5 Relative Density 1 2/5/2015 Minimum Density Pluviate soil from

Nonparametric Sequential Change Detection for High-Dimensional Problems Yasin Ylmaz Electrical

Nonparametric spectral-based estimation of latent structures Stphane Bonhomme (Chicago), Koen

Density Ratio Estimation Density Ratio Estimation in Machine Learning in Machine Learning

Non-parametric Methods Oliver Schulte - CMPT 726 Bishop PRML Ch. 2.5 Kernel Density Estimation

Sparse Matrices sparse many elements are zero dense few elements are zero Example Of

INDUCTION OF a few sentiments PREFECTS 20 0 4 w e r e expressed SCHOOL CAPTAINS by Sila

19 March 2018 Welcome Steering Committee Update Reminder to Vote! TMF RM Community

Rapid Response 1 5/25/2016 By the end of our time with you today We hope youll:

SEPECC Meeting 9:10 OCDEL &amp; ELRC Updates 9:30 Funding Update Tuesday, October 13, 2020 9:40

Nonparametric Sparsity John Lafferty Larry Wasserman Computer Science Dept.

Generalized Significance in Scale Space: The GS3 Package Daniel V. Samarov Statistical

Safe R oute s to Sc hool Tuesday, September 1, 2020 12:151:15PM House ke e ping 1.

MARCH MEETING | THURSDAY, MARCH 28, 2019 WESTERN DAKOTA TECHNICAL INSTITUTE | RAPID CITY, SD

SEPECC Meeting 9:10 OCDEL & ELRC Updates 9:30 Funding Update Tuesday, October 13, 2020 9:40