Model selection for fast density estimation

László (Laci) Györfi
Department of Computer Science and Information Theory
Budapest University of Technology and Economics
Budapest, Hungary

July 17, 2008

e-mail: gyorfi@szit.bme.hu
www.szit.bme.hu/~gyorfi

Density estimation

$\mathbb{R}^d$-valued i.i.d. random vectors $X_1, \dots, X_n$, distributed according to an unknown probability measure $\mu$ with density $f$.

The $L_1$ norm:

$$\|f - g\| := \int_{\mathbb{R}^d} |f(x) - g(x)|\,dx = 2 \sup_A \left| \int_A f(x)\,dx - \int_A g(x)\,dx \right|$$
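The identity between the $L_1$ distance and the supremum over sets can be checked numerically. The sketch below is only an illustration under stated assumptions: $d = 1$, two unit-variance normal densities chosen for the example, and a midpoint Riemann sum in place of the integrals. It uses the set $A = \{x : f(x) > g(x)\}$, on which the supremum is attained.

```python
import math

def normal_pdf(x, mean, std=1.0):
    """Density of a one-dimensional normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def l1_and_sup_form(f, g, lo, hi, steps):
    """Midpoint-rule approximations of int |f - g| and of
    2 * (int_A f - int_A g) with A = {f > g}."""
    dx = (hi - lo) / steps
    l1 = 0.0
    mass_on_a = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        diff = f(x) - g(x)
        l1 += abs(diff) * dx
        if diff > 0:          # x belongs to A = {f > g}
            mass_on_a += diff * dx
    return l1, 2.0 * mass_on_a

# Example densities (assumed for illustration): N(0, 1) and N(1, 1).
l1, sup_form = l1_and_sup_form(
    lambda x: normal_pdf(x, 0.0),
    lambda x: normal_pdf(x, 1.0),
    -10.0, 11.0, 21000,
)
print(l1, sup_form)
```

Both numbers agree up to the numerical error of the Riemann sum, because the two densities carry the same total mass.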

Kernel density estimate

For a kernel function $K$ and bandwidth $h > 0$, let $f_n$ be the kernel density estimate with sample size $n$:

$$f_n(x) = \frac{1}{n h^d} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right).$$
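A minimal sketch of this estimate follows. The standard Gaussian kernel, the representation of points as tuples, and the function names are assumptions made for the example; the slides do not fix a particular kernel.

```python
import math
import random

def gaussian_kernel(u):
    """Standard Gaussian kernel on R^d; u is a tuple of length d."""
    d = len(u)
    sq_norm = sum(c * c for c in u)
    return math.exp(-0.5 * sq_norm) / (2.0 * math.pi) ** (d / 2.0)

def kernel_density_estimate(x, sample, h, kernel=gaussian_kernel):
    """f_n(x) = 1 / (n h^d) * sum_i K((x - X_i) / h)."""
    n = len(sample)
    d = len(x)
    total = 0.0
    for xi in sample:
        u = tuple((a - b) / h for a, b in zip(x, xi))
        total += kernel(u)
    return total / (n * h ** d)

# Usage in d = 1 with a seeded standard normal sample of size 200.
rng = random.Random(0)
sample = [(rng.gauss(0.0, 1.0),) for _ in range(200)]
value_at_zero = kernel_density_estimate((0.0,), sample, h=0.5)

# The estimate is itself a density: a Riemann sum over a wide grid is close to 1.
step = 0.01
total_mass = sum(
    kernel_density_estimate((-10.0 + (i + 0.5) * step,), sample, h=0.5) * step
    for i in range(2000)
)
print(value_at_zero, total_mass)
```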

Density-free consistency

If

$$\lim_{n\to\infty} h_n = 0 \quad\text{and}\quad \lim_{n\to\infty} n h_n^d = \infty,$$

then, for any density $f$,

$$\lim_{n\to\infty} E\|f - f_n\| = 0$$

and

$$\lim_{n\to\infty} \|f - f_n\| = 0 \quad\text{a.s.}$$

Rate of convergence

If the density $f$ has compact support and is twice differentiable, then

$$E(\|f_n - f\|) \le \frac{c_1}{\sqrt{n h_n^d}} + c_2 h_n^2.$$

If $h_n = c\,n^{-1/(d+4)}$, then

$$E(\|f_n - f\|) \le C n^{-2/(d+4)}.$$

TOO SLOW.
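To see why this rate is slow, the sketch below evaluates the bound at the balancing bandwidth and compares the exponent $2/(d+4)$ with the parametric exponent $1/2$. The constants $c_1 = c_2 = c = 1$ and the function names are placeholders assumed for the illustration; the actual constants depend on $f$ and $K$.

```python
import math

def kde_l1_bound(n, d, h, c1=1.0, c2=1.0):
    """Right-hand side c1 / sqrt(n h^d) + c2 h^2 of the rate bound."""
    return c1 / math.sqrt(n * h ** d) + c2 * h ** 2

def balanced_bandwidth(n, d, c=1.0):
    """h_n = c n^(-1/(d+4)), which makes both terms of the bound the same order."""
    return c * n ** (-1.0 / (d + 4))

def rate_exponent(d):
    """Exponent a in the rate n^(-a) obtained with the balanced bandwidth."""
    return 2.0 / (d + 4)

def sample_factor_to_halve_error(d):
    """Factor by which n must grow to halve a bound of order n^(-2/(d+4))."""
    return 2.0 ** ((d + 4) / 2.0)

for d in (1, 2, 5, 10):
    print(d, rate_exponent(d), sample_factor_to_halve_error(d))
```

For comparison, a rate of order $n^{-1/2}$ needs only a fourfold increase of $n$ to halve the error, in every dimension.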

Model selection for density estimation

We wish to estimate a density $f$ on $\mathbb{R}^d$ that belongs to a parametric family $\mathcal{F}_k$, where $k$ is unknown, but $\mathcal{F}_k \subset \mathcal{F}_{k+1}$ for all $k$.

$$\mathcal{F} = \bigcup_{k \ge 1} \mathcal{F}_k.$$

The complexity associated with $f$ is defined as

$$k^* = \min\{k \ge 1 : f \in \mathcal{F}_k\}.$$

Example

$\mathcal{F}_k$ is the set of mixtures of $d$-dimensional normal densities, where the number of components is at most $k$.
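A small sketch of this example, restricted to $d = 1$ for brevity (an assumption of the illustration, as are the component values and function names). It also shows the nesting $\mathcal{F}_k \subset \mathcal{F}_{k+1}$: a mixture with two components can be written with three components by giving the extra one weight zero.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a one-dimensional normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, components):
    """components: list of (weight, mean, std); weights are nonnegative and sum to 1."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in components)

# A member of F_2 (values chosen only for illustration).
two_components = [(0.3, -1.0, 0.5), (0.7, 2.0, 1.0)]

# The same density written as a member of F_3, with a zero-weight third component.
three_components = two_components + [(0.0, 5.0, 1.0)]

print(mixture_pdf(0.4, two_components), mixture_pdf(0.4, three_components))
```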

Objective

We wish to introduce an estimate $k_n$ of the complexity $k^*$ and to pick a density estimate $\hat f_{k_n}$ in $\mathcal{F}$ with

1. $k_n \to k^*$ almost surely (i.e., $k_n = k^*$ almost surely, for all $n$ large enough), and

2. $$E\left\{\|\hat f_{k_n} - f\|\right\} = O\left(\frac{1}{\sqrt{n}}\right).$$

Biau, Devroye (2004): $k_n$ and $\hat f_{k_n}$ via projection of the empirical measure with respect to the Yatracos class. Too complex.

Testing homogeneity

Two mutually independent samples $X_1, \dots, X_n$ and $X'_1, \dots, X'_n$, distributed according to unknown probability distributions $\mu$ and $\mu'$ on $\mathbb{R}^d$.

We are interested in testing the null hypothesis that the two samples are homogeneous, that is,

$$H_0 : \mu = \mu'.$$

Let $\mu_n$ and $\mu'_n$ denote the empirical probability distributions of the two samples.

The test statistic

Based on a partition $\mathcal{P}_n = \{A_{n1}, \dots, A_{nm_n}\}$ of $\mathbb{R}^d$, we let the test statistic be defined as

$$T_n = \sum_{j=1}^{m_n} |\mu_n(A_{nj}) - \mu'_n(A_{nj})|.$$
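A minimal sketch of computing $T_n$ in $d = 1$ follows. The particular partition (equal-width interior cells on an interval plus two unbounded tail cells) and all names are assumptions made for the example; the slides leave the partition general.

```python
from collections import Counter

def cell_of(x, lo, hi, m):
    """Index in {0, ..., m-1} of the cell containing x.
    Cell 0 is (-inf, lo), cell m-1 is [hi, inf), and cells 1..m-2 split
    [lo, hi) into equal-width pieces. Requires m >= 3."""
    if x < lo:
        return 0
    if x >= hi:
        return m - 1
    width = (hi - lo) / (m - 2)
    return 1 + min(int((x - lo) / width), m - 3)

def l1_homogeneity_statistic(xs, ys, lo, hi, m):
    """T_n = sum_j |mu_n(A_j) - mu'_n(A_j)| over the m cells of the partition."""
    counts_x = Counter(cell_of(x, lo, hi, m) for x in xs)
    counts_y = Counter(cell_of(y, lo, hi, m) for y in ys)
    return sum(
        abs(counts_x[j] / len(xs) - counts_y[j] / len(ys)) for j in range(m)
    )

# Identical samples give 0; samples falling in disjoint cells give the maximum 2.
same = l1_homogeneity_statistic([0.1, 0.2, 0.7], [0.1, 0.2, 0.7], 0.0, 1.0, 4)
apart = l1_homogeneity_statistic([0.1, 0.2], [0.9, 0.95], 0.0, 1.0, 4)
print(same, apart)
```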

Asymptotic behavior of $T_n$

Theorem. Under $H_0$, for all $0 < \varepsilon < 2$,

$$P\{T_n > \varepsilon\} = e^{-n(g_T(\varepsilon) + o(1))}, \quad\text{as } n \to \infty,$$

where

$$g_T(\varepsilon) = (1 + \varepsilon/2)\ln(1 + \varepsilon/2) + (1 - \varepsilon/2)\ln(1 - \varepsilon/2) \approx \varepsilon^2/4.$$

(Biau, Györfi (2005))
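The quality of the approximation $g_T(\varepsilon) \approx \varepsilon^2/4$ can be inspected numerically. The sketch below simply evaluates both expressions; the function names are chosen for the illustration. The approximation is the leading term of the Taylor expansion around $\varepsilon = 0$, so it is accurate for small $\varepsilon$ and underestimates $g_T$ for larger $\varepsilon$.

```python
import math

def g_t(eps):
    """Rate function g_T(eps) for 0 < eps < 2."""
    half = eps / 2.0
    return (1.0 + half) * math.log(1.0 + half) + (1.0 - half) * math.log(1.0 - half)

def g_t_quadratic(eps):
    """Leading-order approximation eps^2 / 4."""
    return eps * eps / 4.0

for eps in (0.01, 0.1, 0.5, 1.0, 1.9):
    print(eps, g_t(eps), g_t_quadratic(eps))
```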

A strongly consistent test

Corollary. Consider the test which rejects $H_0$ when

$$T_n > 2\sqrt{\ln 2}\,\sqrt{\frac{m_n}{n}}.$$

Assume that

$$\lim_{n\to\infty} \frac{m_n}{n} = 0 \quad\text{and}\quad \lim_{n\to\infty} \frac{m_n}{\ln n} = \infty.$$

Then, under $H_0$, after a random sample size the test makes a.s. no error.
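The decision rule of the corollary can be sketched as follows. The choice $m_n = \lceil \sqrt{n} \rceil$ is only one hypothetical sequence satisfying both conditions ($m_n/n \to 0$ and $m_n/\ln n \to \infty$); it is not prescribed by the slides, and the function names are likewise illustrative.

```python
import math

def rejection_threshold(m, n):
    """Threshold 2 * sqrt(ln 2) * sqrt(m / n) for the statistic T_n."""
    return 2.0 * math.sqrt(math.log(2.0)) * math.sqrt(m / n)

def example_partition_size(n):
    """Hypothetical choice m_n = ceil(sqrt(n)); then m_n / n -> 0 and m_n / ln n -> infinity."""
    return math.ceil(math.sqrt(n))

def reject_homogeneity(t_n, m, n):
    """True if the test rejects H_0: mu = mu'."""
    return t_n > rejection_threshold(m, n)

# Usage: with n = 10000 and m_n = 100 cells the threshold is about 0.1665.
n = 10000
m = example_partition_size(n)
print(m, rejection_threshold(m, n))
print(reject_homogeneity(0.05, m, n), reject_homogeneity(0.5, m, n))
```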
