SLIDE 1

On the Theory and Practice of Variable Selection for Functional Data

José Luis Torrecilla

under the supervision of

José Ramón Berrendero and Antonio Cuevas

Departamento de Matemáticas, Universidad Autónoma de Madrid

Thesis defense (lectura de tesis), Madrid, December 3, 2015

SLIDE 2

Outline

1. Introduction: FDA; Variable Selection; Functional classification
2. RKHS: The RKHS approach; The absolutely continuous case; The singular case
3. Variable selection: Variable selection and RKHS; mRMR-RD; Maxima hunting
4. Experiments
5. Conclusions and future work


SLIDE 5

Functional Data Analysis

What are functional data?

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $I \subseteq \mathbb{R}$ an index set. A stochastic process is a collection of random variables $\{X(\omega, t) : \omega \in \Omega,\ t \in I\}$ where $X(\cdot, t)$ is an $\mathcal{F}$-measurable function on $\Omega$. A functional datum is just a realization (often called a "trajectory") of a stochastic process for all $t \in [0, T]$.


SLIDE 6

Difficulties and particularities

  • No obvious order structure (distribution functions), nor closeness or centrality notions (outliers, depth).
  • Representation problems.
  • Function spaces are "difficult to fill". No natural densities: no translation-invariant measure plays the role of the Lebesgue measure in $\mathbb{R}^n$.
  • Redundancy: close variables are closely related (continuity), which causes failures in linear models.
  • High dimension: the curse of dimensionality, overfitting, computational cost...


SLIDE 8

Variable selection

Idea: choose the most informative subset among the original variables.

Motivation
  ◮ Variable selection is a successful dimension-reduction technique in other fields.
  ◮ It was an almost unexplored topic in FDA classification.
  ◮ The dimension reduction is made in terms of the original variables (interpretability).

Goals
  ◮ Remove useless and redundant variables, improving time and storage performance.
  ◮ Improve the classification accuracy, decreasing the overfitting risk.
  ◮ Get theoretical and more interpretable models.

SLIDE 9

What do we mean by “variable selection” in FDA?

Given a sample of functions $X_1(t), \dots, X_n(t)$, $t \in [0, 1]$, our aim is to replace every sample function $X_j$ with a vector $(X_j(t_1), \dots, X_j(t_d))$, for suitably chosen points $t_1, \dots, t_d$. Then we apply multivariate methods (regression, classification, ...) to the "reduced" data. In our experience, the value of $d$ should typically be small (not much larger than 5, say).
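
As a minimal illustration of this reduction step (a sketch, assuming the curves are already discretized on a common grid; the function and argument names are ours):

```python
import numpy as np

def reduce_curves(X, grid, points):
    """Replace each discretized curve by its values at the chosen points.
    X: (n, p) array of curves sampled on `grid`; `points`: the t_1, ..., t_d."""
    cols = [int(np.argmin(np.abs(grid - t))) for t in points]  # nearest grid index
    return X[:, cols]                                          # (n, d) reduced data
```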



SLIDE 11

Relevance Vs. Redundancy

[Figure: variables selected by MaxRel vs. mRMR over the range 850–1050; err = 4.09% (MaxRel) vs. err = 1.86% (mRMR).]


SLIDE 13

Functional classification problem

[Figure: two samples of trajectories x(t), t ∈ [0, 1], one per class.]

SLIDE 14

Functional classification problem (II)

What is the class of this trajectory?

[Figure: an unlabelled trajectory x(t), t ∈ [0, 1].]


SLIDE 17

Statement of the problem

Independent observations: $(X_1, Y_1), \dots, (X_n, Y_n)$, with $X \in \mathcal{F}[0, T]$ and $Y \in \{0, 1\}$.

Optimal classification rule (Bayes rule): $g^*(X) = \mathbb{1}_{\{\eta(X) > 1/2\}}$, where $\eta(x) = \mathbb{E}(Y \mid X = x)$.

Bayes error: $L^* = \mathbb{P}(g^*(X) \neq Y)$.

$$g^*(X) = 1 \iff \frac{dP_1}{dP_0}(X) > \frac{1 - p}{p}, \quad \text{where } p = P(Y = 1).$$

See Baíllo et al., Scand. J. Stat. (2011), Theorem 1.

SLIDE 18

Our general approach

We consider functional data as trajectories drawn from a stochastic process, and we have tried to motivate our results and proposals in terms of this underlying process. This is somewhat in contrast with the mainstream research line in FDA, mostly centred on algorithmic aspects and real data analysis. "Curiously, despite a huge research activity in the field, few attempts have been made to connect the area of functional data analysis with the theory of stochastic processes" (Biau et al., 2015).

SLIDE 19

Contributions

a) A mathematical contribution to the functional classification problem (RKHS).
b) Functional variable selection: a theoretical motivation and three different proposals.
c) Large and replicable simulation studies.


SLIDE 22

RKHS approach

"It turns out, in my opinion, that reproducing kernel Hilbert spaces are the natural setting in which to solve problems of statistical inference on time processes." (Parzen, 1961)

Why natural?
  • RKHS provides an intrinsic inner product depending on the covariance structure.
  • Explicit expressions of the Bayes rule (equivalent distributions).
  • Approximate optimal rules under mutually singular distributions.
  • Insight into the "near perfect classification" phenomenon (Delaigle and Hall, 2012).
  • A natural setting to formalize variable selection problems (RK-VS and the associated classifier).

Berrendero, Cuevas and Torrecilla. On near perfect classification and functional Fisher rules via reproducing kernels. Manuscript. arXiv:1507.04398v2.

SLIDE 23

Some background

Definition: If $X = \{X_t,\ t \in [0, T]\}$ is an $L^2$-process with covariance function $K(s, t)$, define $(H_0(K), \langle \cdot, \cdot \rangle_K)$ by

$$H_0(K) := \Big\{ f : f(s) = \sum_{i=1}^{n} a_i K(s, t_i),\ a_i \in \mathbb{R},\ t_i \in [0, T],\ n \in \mathbb{N} \Big\},$$

$$\langle f, g \rangle_K = \sum_{i,j} \alpha_i \beta_j K(s_j, t_i), \quad \text{where } f(x) = \sum_i \alpha_i K(x, t_i) \text{ and } g(x) = \sum_j \beta_j K(x, s_j).$$

The RKHS associated with $K$, $H(K)$, is defined as the completion of $H_0(K)$. More precisely, $H(K)$ is the set of functions $f : [0, T] \to \mathbb{R}$ obtained as the pointwise limit of a Cauchy sequence $\{f_n\}$ in $H_0(K)$.
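
For concreteness, a standard worked example (a well-known fact, not stated on the slide): for the Brownian-motion kernel $K(s,t) = \min\{s,t\}$, which reappears in the example of Slide 48, $H(K)$ is the Cameron–Martin space:

```latex
\[
K(s,t)=\min\{s,t\}\;\Longrightarrow\;
H(K)=\Bigl\{\,f(t)=\textstyle\int_0^t f'(u)\,du \;:\; f'\in L^2[0,T]\Bigr\},
\qquad
\langle f,g\rangle_K=\int_0^T f'(u)\,g'(u)\,du .
\]
% Check of the reproducing property: since \partial_s K(s,t)=\mathbf{1}_{\{s<t\}},
\[
\langle f, K(\cdot,t)\rangle_K=\int_0^T f'(u)\,\mathbf{1}_{\{u<t\}}\,du=f(t).
\]
```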

SLIDE 24

Some background (II)

Reproducing property: $f(t) = \langle f, K(\cdot, t) \rangle_K$, for all $f \in H(K)$.

Natural congruence: if $\bar{L}(X)$ is the $L^2$-completion of the linear span of $X$, then $\Psi_X\big(\sum_i a_i X_{t_i}\big) = \sum_i a_i K(\cdot, t_i)$ defines a congruence between $\bar{L}(X)$ and $H(K)$.

SLIDE 25

The model

$$P_0 : X(t) = m_0(t) + \xi(t), \quad t \in [0, 1]$$
$$P_1 : X(t) = m_1(t) + \xi(t), \quad t \in [0, 1]$$

  • $\xi(t)$ Gaussian with $\mathbb{E}(\xi(t)) = 0$ and $K(s, t) = \mathbb{E}(\xi(s)\xi(t))$.
  • Prior probabilities: $P(Y = 0) = P(Y = 1) = 1/2$.
  • $m(t) = m_1(t) - m_0(t)$.

SLIDE 26

Parzen’s result

Theorem 7A (Parzen, Ann. Math. Stat. 1961)

Under this model, if $K$ is continuous, then $P_0 \sim P_1 \iff m \in H(K)$, and if $P_0 \sim P_1$,

$$\frac{dP_1}{dP_0}(X) = \exp\Big( \langle m, X \rangle_K - \tfrac{1}{2} \langle m, m \rangle_K \Big),$$

where $\langle m, X \rangle_K \equiv \Psi_X(m)$ and $\langle K(\cdot, t), X \rangle_K = X(t)$.



SLIDE 29

Equivalent measures: the new optimal rule

Bayes Rule (Theorem 2.2)

Under the given model, if $m \in H(K)$ then

$$g^*(X) = 1 \iff \eta^*(X) = \langle X, m \rangle_K - \tfrac{1}{2}\|m\|_K^2 > 0.$$

Bayes error

(1) $\eta^*(X) \mid Y = 0 \sim N\big({-\tfrac{1}{2}}\|m\|_K^2,\ \|m\|_K^2\big)$.

(2) $\eta^*(X) \mid Y = 1 \sim N\big(\tfrac{1}{2}\|m\|_K^2,\ \|m\|_K^2\big)$.

$$L^* = \mathbb{P}\{g^*(X) \neq Y\} = 1 - \Phi\big(\|m\|_K / 2\big).$$
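
A quick finite-dimensional sanity check of this error formula (a sketch under an assumed d-dimensional Gaussian surrogate with $m_0 = 0$; all names and parameter values are illustrative):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
d = 5
K = np.eye(d) + 0.5 * np.ones((d, d))          # an arbitrary covariance matrix
m = rng.normal(size=d)                          # mean difference m = m1 - m0
norm_K = np.sqrt(m @ np.linalg.solve(K, m))     # ||m||_K = (m^T K^{-1} m)^{1/2}

# Simulate from P0 (mean 0) and P1 (mean m) with priors 1/2, apply the rule
n = 200_000
y = rng.integers(0, 2, n)
X = rng.multivariate_normal(np.zeros(d), K, n) + np.outer(y, m)
eta = X @ np.linalg.solve(K, m) - 0.5 * norm_K ** 2   # eta*(X)
err_mc = np.mean((eta > 0).astype(int) != y)

print(err_mc, 1 - norm.cdf(norm_K / 2))         # empirical vs. theoretical L*
```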


SLIDE 31

The singular case

"We argue that those [functional classification] problems have unusual, and fascinating, properties that set them apart from their finite-dimensional counterparts. In particular we show that, in many quite standard settings, the performance of simple [linear] classifiers constructed from training samples becomes perfect as the sizes of those samples diverge [...]. That property never holds for finite-dimensional data, except in pathological cases." (Delaigle and Hall, J. R. Statist. Soc. B 2012)

SLIDE 32

The model

$$P_0 : X(t) = \xi(t), \qquad P_1 : X(t) = m(t) + \xi(t), \qquad t \in [0, 1]$$

  • $\xi(t)$ Gaussian with $\mathbb{E}(\xi(t)) = 0$ and $K(s, t) = \mathbb{E}(\xi(s)\xi(t)) = \sum_{j=1}^{\infty} \theta_j \phi_j(s) \phi_j(t)$, where $\theta_1 \ge \theta_2 \ge \dots$ and $K$ is strictly positive definite and uniformly bounded.
  • $m(t) = \sum_{j=1}^{\infty} \mu_j \phi_j(t)$.
  • Prior probabilities: $P(Y = 0) = P(Y = 1) = 1/2$.

SLIDE 33

Results

Theorem 1 (Delaigle and Hall, J. R. Statist. Soc. B 2012)

(a) When $\sum_{j \ge 1} \theta_j^{-1} \mu_j^2 < \infty$, the Bayes (minimal) error is

$$\mathrm{err}_0 = 1 - \Phi\Big( \tfrac{1}{2} \big( \textstyle\sum_{j \ge 1} \theta_j^{-1} \mu_j^2 \big)^{1/2} \Big) > 0,$$

and the optimal classifier (which achieves this error) is the rule $T^0(X) = 1$ if and only if

$$\big( \langle X, \psi \rangle_{L^2} - \langle m, \psi \rangle_{L^2} \big)^2 - \langle X, \psi \rangle_{L^2}^2 < 0,$$

with $\psi(t) = \sum_{j=1}^{\infty} \theta_j^{-1} \mu_j \phi_j(t)$.

(b) If $\sum_{j \ge 1} \theta_j^{-1} \mu_j^2 = \infty$, then the minimal misclassification probability is $\mathrm{err}_0 = 0$ and it is achieved, in the limit, by a sequence of classifiers constructed from $T^0$ by replacing the function $\psi$ with $\psi^{(r)} = \sum_{j=1}^{r} \theta_j^{-1} \mu_j \phi_j$, with $r = r_n \uparrow \infty$.


SLIDE 35

An unanswered question:

Why?

“The theoretical foundation for these findings is an intriguing dichotomy of properties and is as interesting as the findings themselves.” Delaigle and Hall, 2012

Because of the singularity

SLIDE 36

Our view of the “near perfect classification”

Theorem 2.4

(a) $\sum_{j \ge 1} \theta_j^{-1} \mu_j^2 < \infty$ if and only if $P_1 \sim P_0$. In that case, the Bayes rule $g^*$ is

$$g^*(X) = 1 \iff \langle X, m \rangle_K - \tfrac{1}{2}\|m\|_K^2 > 0.$$

This is a coordinate-free, equivalent expression of the optimal rule given by D. & H. The corresponding optimal (Bayes) classification error is $L^* = 1 - \Phi(\|m\|_K / 2)$.

(b) $\sum_{j \ge 1} \theta_j^{-1} \mu_j^2 = \infty$ if and only if $P_1 \perp P_0$. In this case the Bayes error is $L^* = 0$. Moreover, for any $\varepsilon > 0$ we can construct a classification rule whose misclassification probability is smaller than $\varepsilon$ (Theorem 2.5).
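
The bridge between Theorem 2.4 and Delaigle and Hall's Theorem 1 is the diagonal form of the RKHS norm in the basis $\{\phi_j\}$ (a standard identity, spelled out here):

```latex
\[
\|m\|_K^2=\sum_{j\ge 1}\theta_j^{-1}\mu_j^2,
\qquad\text{hence}\qquad
L^*=1-\Phi\Bigl(\tfrac12\|m\|_K\Bigr)
   =1-\Phi\Bigl(\tfrac12\bigl(\textstyle\sum_{j\ge1}\theta_j^{-1}\mu_j^2\bigr)^{1/2}\Bigr),
\]
% which matches err_0 in Theorem 1(a); the sum diverges exactly when
% m \notin H(K), i.e. when P_1 \perp P_0 and the Bayes error vanishes.
```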



SLIDE 40

Variable selection and RKHS

Variable selection methods are quite appealing when classifying functional data, since they help reduce noise and remove irrelevant information. RKHS theory also offers a natural setting to formalize variable selection problems: by the reproducing property, the elementary functions $K(\cdot, t)$ act as Dirac deltas.

Sparsity assumption [SA]: there exist scalars $\alpha_1^*, \dots, \alpha_d^*$ and points $t_1^*, \dots, t_d^*$ in $[0, T]$ such that

$$m(\cdot) = \sum_{i=1}^{d} \alpha_i^* K(\cdot, t_i^*).$$


SLIDE 42

The Bayes rule under the sparsity assumption

Under this assumption, the Bayes rule depends on the trajectory $x(t)$ only through the values $x(t_1^*), \dots, x(t_d^*)$:

$$\eta^*(x) = \sum_{i=1}^{d} \alpha_i^* \Big( x(t_i^*) - \frac{m_0(t_i^*) + m_1(t_i^*)}{2} \Big),$$

where $(\alpha_1^*, \dots, \alpha_d^*)^\top = K_{t_1^*,\dots,t_d^*}^{-1} m_{t_1^*,\dots,t_d^*}$, with $(K_{t_1^*,\dots,t_d^*})_{i,j} = K(t_i^*, t_j^*)$ and $m_{t_1^*,\dots,t_d^*} = (m(t_1^*), \dots, m(t_d^*))^\top$.

This shows that under [SA], the optimal rule coincides with the well-known Fisher linear rule based on the projections $x(t_1^*), \dots, x(t_d^*)$.
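
A direct transcription of this rule (a sketch; the function name and the plain linear solve are our choices, assuming $K_{t^*}$ is well conditioned):

```python
import numpy as np

def fisher_rule(x_vals, m0_vals, m1_vals, K_mat):
    """Evaluate the Bayes/Fisher rule under [SA] at the selected points.
    x_vals, m0_vals, m1_vals: values at t*_1, ..., t*_d; K_mat[i, j] = K(t*_i, t*_j)."""
    alpha = np.linalg.solve(K_mat, m1_vals - m0_vals)   # (alpha*_1, ..., alpha*_d)
    eta = alpha @ (x_vals - (m0_vals + m1_vals) / 2)    # eta*(x)
    return int(eta > 0)                                 # class 1 iff eta*(x) > 0
```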


SLIDE 45

RKHS-based variable selection

$$L^* = \mathbb{P}\{g^*(X) \neq Y\} = 1 - \Phi\big(\|m\|_K / 2\big),$$

where

$$\|m\|_K^2 = \sum_{i=1}^{d} \sum_{j=1}^{d} \alpha_i^* \alpha_j^* K(t_i^*, t_j^*) = m_{t_1^*,\dots,t_d^*}^\top K_{t_1^*,\dots,t_d^*}^{-1} m_{t_1^*,\dots,t_d^*}.$$

The criterion we suggest for variable selection is to choose points $\hat{t}_1, \dots, \hat{t}_d$ maximizing

$$\hat{\psi}(t_1, \dots, t_d) := \hat{m}_{t_1,\dots,t_d}^\top K_{t_1,\dots,t_d}^{-1} \hat{m}_{t_1,\dots,t_d}.$$

In practice we use a "greedy" algorithm to select the points.
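
A sketch of the greedy search (our illustrative implementation, not the thesis code; it assumes discretized curves, plugs in the empirical mean difference and pooled covariance, and adds a small ridge term for numerical stability):

```python
import numpy as np

def rk_vs(X0, X1, d, reg=1e-8):
    """Greedy RK-VS sketch. X0, X1: (n_k, p) arrays of discretized curves from
    each class. Returns indices of the d grid points maximizing m^T K^{-1} m."""
    m = X1.mean(axis=0) - X0.mean(axis=0)            # empirical mean difference
    Xc = np.vstack([X0 - X0.mean(axis=0), X1 - X1.mean(axis=0)])
    K = Xc.T @ Xc / (len(Xc) - 2)                    # pooled covariance estimate
    selected = []
    for _ in range(d):
        best_j, best_score = None, -np.inf
        for j in range(len(m)):                      # try adding each new point
            if j in selected:
                continue
            idx = selected + [j]
            Ks = K[np.ix_(idx, idx)] + reg * np.eye(len(idx))  # ridge for stability
            score = m[idx] @ np.linalg.solve(Ks, m[idx])       # psi-hat criterion
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected
```

The ridge term `reg` is our addition: for neighbouring grid points the submatrix of K is nearly singular, which is precisely the redundancy problem discussed earlier.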

SLIDE 46

An associated classifier

Theorem 2.6 (consistency)

Consider the framework and conditions of our first theorem [i.e., the expression of the optimal rule for the absolutely continuous case] and assume further that [SA] holds. Let $L^* = \mathbb{P}(g^*(X) \neq Y)$ be the optimal misclassification probability corresponding to the Bayes rule. Denote by $L_n = \mathbb{P}(\hat{g}(X) \neq Y \mid X_1, \dots, X_n)$ the misclassification probabilities of the estimated rules defined above (under the [SA] assumption of order $n$). Then $L_n \to L^*$ a.s., as $n \to \infty$.

SLIDE 47

Some observations

  • Two for one: a variable selection method and a classifier.
  • Theoretically motivated (consistent).
  • Greedy algorithm: no guarantee of convergence, but affordable. It shows good performance in practice.
  • Robust: the empirical results also show a remarkable robustness of the RK methodology against departures from the assumptions on which it is based.
  • Flexible: additional information can be incorporated easily.

SLIDE 48

An example

[Figure: sample trajectories from Class 1 and Class 0.]

$K(s, t) = \min\{s, t\}$; $t^* = \{1/4,\ 3/8,\ 1/2,\ 3/4,\ 1\}$; $L^* = 0.1587$.

SLIDE 49

An example (II)

[Figure: evolution of the classification error with the sample size (n = 30 to 1000) for RK-C, RKB-C, SVM and kNN, against the Bayes error.]

SLIDE 50

An example (III)

[Figure: histograms of the points selected by RK-VS at t = 1/4, 3/8, 1/2, 3/4, 1 for sample sizes n = 50, 200 and 1000.]

SLIDE 51

Some results

Table: Misclassification percentages (and standard deviations) for the classification methods considered in Table 2 of Delaigle and Hall (2012) and the new RK-C method.

Data      n    CENTPC1      CENTPLS      NP           CENTPCp      RK-C
Wheat     30   0.89 (2.49)  0.46 (1.24)  0.49 (1.29)  15.0 (1.25)  0.25 (1.58)
          50   0.22 (1.09)  0.06 (0.63)  0.01 (0.14)  14.4 (5.52)  0.02 (0.28)
Phoneme   30   22.5 (3.59)  24.2 (5.37)  24.4 (5.31)  23.7 (2.37)  22.5 (3.70)
          50   20.8 (2.08)  21.5 (3.02)  21.9 (2.91)  23.4 (1.80)  21.5 (2.36)
          100  20.0 (1.09)  20.1 (1.12)  20.1 (1.37)  23.4 (1.36)  20.1 (1.25)


SLIDE 53

Our second proposal

To use the minimum Redundancy Maximum Relevance (mRMR) method, replacing the Mutual Information dependence measure with the Distance Correlation. mRMR is a well-established filter method for variable selection, proposed by Ding and Peng (2005) and Peng et al. (2005).

Berrendero, Cuevas and Torrecilla. The mRMR variable selection method: a comparative study for functional data. Journal of Statistical Computation and Simulation (to appear).


SLIDE 56

The mRMR algorithm

Relevance measure $I(\cdot, \cdot)$: $\mathrm{Rel}(X_i) = I(X_i, Y)$, $\mathrm{Red}(X_i, X_j) = I(X_i, X_j)$.

For a set of variables $S$:

$$\mathrm{Rel}(S) = \frac{1}{|S|} \sum_{X_i \in S} I(X_i, Y), \qquad \mathrm{Red}(S) = \frac{1}{|S|^2} \sum_{X_i, X_j \in S} I(X_i, X_j).$$

The objective is to choose (greedily) the set $S$ which maximizes either

MID: $\mathrm{Rel}(S) - \mathrm{Red}(S)$, or MIQ: $\mathrm{Rel}(S) / \mathrm{Red}(S)$.
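
A sketch of the greedy mRMR search, with the dependence measure left as a parameter so the same code covers the MI-based original and the mRMR-RD variant (illustrative names; `dep` would be distance correlation for mRMR-RD):

```python
import numpy as np

def mrmr(dep, X, y, d, criterion="MID"):
    """Greedy mRMR sketch. dep(a, b): any dependence measure between 1-D samples
    (mutual information for classical mRMR, distance correlation for mRMR-RD).
    X: (n, p) data matrix; y: (n,) labels. Returns indices of d variables."""
    p = X.shape[1]
    relevance = np.array([dep(X[:, j], y) for j in range(p)])
    selected = [int(np.argmax(relevance))]           # start with the most relevant
    while len(selected) < d:
        best_j, best_score = None, -np.inf
        for j in range(p):
            if j in selected:
                continue
            red = np.mean([dep(X[:, j], X[:, s]) for s in selected])
            score = (relevance[j] - red if criterion == "MID"
                     else relevance[j] / max(red, 1e-12))      # MID or MIQ
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected
```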

SLIDE 57

Original mRMR relevance measure: MI

Mutual Information

A general measure of statistical dependence between two random variables, which captures nonlinear dependences. Given two continuous random variables $X$ and $Y$ with marginal density functions $p(x)$ and $p(y)$ and joint density function $p(x, y)$, the mutual information is defined by

$$I(X, Y) = \int\!\!\int p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)} \, dx \, dy.$$

$I(X, Y) \ge 0$, with $I(X, Y) = 0$ if and only if $X$ and $Y$ are independent, and $I(X, Y) = I(Y, X)$.

SLIDE 58

The new proposal

Distance correlation R

Distance correlation is a measure of dependence between random vectors proposed in Székely, Rizzo and Bakirov, Ann. Stat. (2007) and Székely and Rizzo (2009, 2012, 2013). For all distributions with finite first moments, R generalizes the idea of correlation in two fundamental ways:

  ◮ R(X, Y) is defined for X and Y in arbitrary dimensions.
  ◮ R(X, Y) = 0 characterizes independence of X and Y.

It can be estimated without tuning parameters or smoothing.
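
The natural sample version (a minimal O(n²) sketch of the estimator of Székely et al. for scalar samples, based on double-centered pairwise distance matrices):

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation R_n of two 1-D samples (Székely et al., 2007)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    a = np.abs(x[:, None] - x[None, :])              # pairwise distances in x
    b = np.abs(y[:, None] - y[None, :])
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()    # double centering
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    dcov2 = max((A * B).mean(), 0.0)                 # V_n^2(x, y), clipped at 0
    denom = np.sqrt((A * A).mean() * (B * B).mean()) # sqrt(V_n^2(x) V_n^2(y))
    return np.sqrt(dcov2 / denom) if denom > 0 else 0.0
```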

SLIDE 59

Experiments

Measures under comparison:
  • Distance covariance (V)
  • Distance correlation (R)
  • Mutual information (MI)
  • Fisher-Correlation criterion (FC)
  • Standard correlation (C)

Classifiers:
  • k nearest neighbours (k-NN)
  • Linear discriminant analysis (LDA)
  • Naive Bayes (NB)
  • Linear support vector machine (SVM)

Full outputs: www.uam.es/antonio.cuevas/exp/mRMR-outputs.xlsx


SLIDE 61

Our third proposal

Maxima-Hunting (MH) criterion: select the points $t_1, \dots, t_k$ according to the local maxima of the distance correlation $\mathcal{R}^2(X(t), Y)$ (alternatively, the local maxima of the distance covariance $\mathcal{V}^2(X(t), Y)$).

Berrendero, Cuevas and Torrecilla. Variable selection in functional data classification: a maxima-hunting proposal. Statistica Sinica (to appear).
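
A naive sketch of the criterion (reusing the `distance_correlation` sketch above; it evaluates the dependence curve on the grid and keeps strict interior local maxima, ignoring the non-trivial computational issues noted on the "Some comments" slide):

```python
import numpy as np

def maxima_hunting(X, y, grid, dep=distance_correlation):
    """Maxima-hunting sketch: evaluate t -> dep(X(t), Y) on the grid and keep
    the strict interior local maxima of the resulting dependence curve."""
    curve = np.array([dep(X[:, j], y) for j in range(X.shape[1])])
    interior = (curve[1:-1] > curve[:-2]) & (curve[1:-1] > curve[2:])
    idx = np.where(interior)[0] + 1                  # shift back to full-grid index
    return grid[idx], curve
```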

SLIDE 62

Some examples

[Figure: two examples of samples of trajectories X(t) plotted against time.]

SLIDE 63

Some comments

  • The MH method takes care, in a natural way, of the relevance-redundancy trade-off in the functional framework.
  • It is "really functional", with a clear population target.
  • There are some non-trivial computational problems in identifying the local maxima of $\mathcal{V}_n^2(X(t), Y)$.
  • The empirical results show a remarkably good performance of MH methods in comparison with other state-of-the-art alternatives: www.uam.es/antonio.cuevas/exp/outputs.xlsx
  • It is also theoretically supported.

SLIDE 64

Theoretical results

Some equivalent expressions for $\mathcal{V}^2(X(t), Y)$ in the binary case (Thm. 3.1). Several non-trivial examples where the relevant information is concentrated on the maxima of $\mathcal{V}^2(X(t), Y)$ (Props. 3.4, 3.5).

Theorem 3.2 (uniform convergence of $\mathcal{V}_n^2$)

Let $X = X_t$, with $t \in [0, 1]^d$, be a process with continuous trajectories almost surely such that $\mathbb{E}(\|X\|_\infty \log^+ \|X\|_\infty) < \infty$. Then $\mathcal{V}_n^2(X_t, Y)$ is continuous in $t$ and

$$\sup_{t \in [0, 1]^d} |\mathcal{V}_n^2(X_t, Y) - \mathcal{V}^2(X_t, Y)| \to 0 \quad \text{a.s., as } n \to \infty.$$

Hence, if we assume that $\mathcal{V}^2(X_t, Y)$ has exactly $m$ local maxima at $t_1, \dots, t_m$, then $\mathcal{V}_n^2(X_t, Y)$ eventually has at least $m$ local maxima at $t_{1n}, \dots, t_{mn}$ with $t_{jn} \to t_j$ a.s., as $n \to \infty$, for $j = 1, \dots, m$.


SLIDE 66

Empirical study

We have carried out exhaustive and reproducible experiments in order to assess the performance of our variable selection methods:

  • 8 dimension reduction methods (with some variants) and three benchmark procedures.
  • 4 different classifiers.
  • Data:
    ◮ 100 simulation models (with 4 different sample sizes).
    ◮ 4 real data sets.
    ◮ A real biomedical application: Barba et al. High fat diet and female sex induce metabolic changes and reduce oxidative stress in an additive manner in mice heart. Submitted.

Parameters are chosen by standard validation procedures. Full outputs:
www.uam.es/antonio.cuevas/exp/outputs.xlsx
www.uam.es/antonio.cuevas/exp/mRMR-outputs.xlsx

SLIDE 67

And the winner is...

  • There is no uniform winner: different approaches, different targets.
  • Good performance of the new proposals. PLS is the first competitor (but it is not interpretable).
  • On average, MHR and RK-VS are better (encouraging).
  • Stable results across different models and different classifiers.

Some recommendations:
  • Use mRMR-RD instead of other mRMR formulations.
  • Use RK-VS when the required assumptions are at least partially fulfilled.
  • Use MHR when we are far from the RK-VS hypotheses, or with very small sample sizes.


SLIDE 69

Summary

a) A general mathematical theory for the functional classification problem (via the RKHS associated with the covariance operator of the processes).
   a1) Explicit expressions for the Bayes (optimal) rule and error in the case of absolutely continuous Gaussian processes.
   a2) A complete mathematical treatment of the classification problem between two mutually singular processes (near perfect classification).
b) Functional variable selection.
   b1) A general theoretical motivation (expressed in terms of a sparsity assumption) for the problems of functional variable selection.
   b2) Three new variable selection methods: RK-VS (an RKHS-based selector), MH (a "maxima-hunting" method) and mRMR-RD (a modified version of the popular mRMR procedure).
c) Numerical experiments: we provide the largest simulation study on functional variable selection we are aware of. Some popular data examples are also analysed, together with a further real example with metabolic data.

SLIDE 70

General conclusions

  • Variable selection is extremely useful in terms of statistical efficiency in FDA. In our experience, we have not found any reason against the use of variable selection in functional classification.
  • Variable selection entails a gain in interpretability compared with other popular dimension reduction methods (e.g., PCA and PLS).
  • Our new proposals are competitive: MH and RK-VS are theoretically motivated and easy to interpret, and mRMR-RD also leads to an improvement in accuracy with respect to the original mRMR formulations.

SLIDE 71

General conclusions

A major aim in this thesis was to contribute to the mathematical foundation of FDA as a statistical counterpart of the theory of stochastic processes. The use of RKHS theory provides a convincing model for variable selection and for the calculation of Radon–Nikodym (RN) derivatives, which can be successfully used to define new plug-in classifiers; the expressions of many RN derivatives are not so difficult to handle. RKHS thus appears as an appealing alternative to the classical L2 setup for some problems. As a consequence, the near-perfect classification phenomenon can be explained in terms of the singularity of the measures.

SLIDE 72

Future work

  • Extension of our results in functional classification to different settings: non-Gaussian, non-homoscedastic, multiclass...
  • Explore the potential of the RKHS theory in other FDA problems (regression, clustering, visualization...).
  • The choice of d is still an open problem.
  • Open problems in our variable selection methods: non-differentiable points of $\mathcal{R}^2(X(t), Y)$, "parametric" variable selection, mRMR theory, two-stage algorithms...
  • Further applications of the distance correlation. Real applications.
  • An R package or a MATLAB toolbox.

SLIDE 73

Thank you

joseluis.torrecilla@uam.es
