

  1. Projection Predictive Model Selection for Gaussian Processes. Juho Piironen, Aki Vehtari. Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Finland. juho.piironen@aalto.fi, aki.vehtari@aalto.fi

  2. Contents
     - Introduction
     - Automatic relevance determination (ARD)
     - Projection predictive method
     - Examples
     - Summary

  5. Introduction
     - Model target y with several input variables x
     - Only some of the inputs x relevant
     - Bayesian approach: use a relevant prior and integrate over all uncertainties
     - Radford Neal won the NIPS 2003 feature selection competition using Bayesian methods with all the features (500 – 100 000)
     - Sometimes we want to select a minimal subset from x with a good predictive performance
       - improved model interpretability
       - reduced measurement costs in the future
       - reduced prediction time

  6. Gaussian process (GP) regression
     - GP prior: $f(x) \sim \mathrm{GP}\left(0, k(x, x')\right)$
     - Observation model: $y \mid f \sim \mathrm{N}\left(y \mid f, \sigma^2 I\right)$
     - Predictive distribution: $f_* \mid y \sim \mathrm{N}\left(f_* \mid \mu_*, \Sigma_*\right)$, with
       $$\mu_* = K_* (K + \sigma^2 I)^{-1} y,$$
       $$\Sigma_* = K_{**} - K_* (K + \sigma^2 I)^{-1} K_*^{\mathsf{T}}.$$
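The predictive equations on slide 6 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `se_kernel` and `gp_predict`, the isotropic squared exponential kernel, and the sine-shaped toy data are all assumptions made for the example. A Cholesky factorization replaces the explicit inverse of $K + \sigma^2 I$.

```python
import numpy as np


def se_kernel(xa, xb, sigma_f=1.0, length_scale=1.0):
    """Squared exponential kernel between rows of xa (n, d) and xb (m, d)."""
    sq_dist = np.sum((xa[:, None, :] - xb[None, :, :]) ** 2, axis=-1)
    return sigma_f**2 * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_predict(x_train, y_train, x_test, noise_var, kernel=se_kernel):
    """Return the mean mu_* and covariance Sigma_* of f_* given y."""
    K = kernel(x_train, x_train)
    K_star = kernel(x_test, x_train)
    K_star_star = kernel(x_test, x_test)
    # Cholesky factor of K + sigma^2 I (hypothetical helper names throughout).
    L = np.linalg.cholesky(K + noise_var * np.eye(len(x_train)))
    # alpha = (K + sigma^2 I)^{-1} y, obtained from two solves with L.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mu_star = K_star @ alpha
    # V = L^{-1} K_*^T, so V^T V = K_* (K + sigma^2 I)^{-1} K_*^T.
    V = np.linalg.solve(L, K_star.T)
    sigma_star = K_star_star - V.T @ V
    return mu_star, sigma_star


# Toy data (assumed for illustration only).
rng = np.random.default_rng(0)
x_train = rng.uniform(-1, 1, size=(30, 1))
y_train = np.sin(3 * x_train[:, 0]) + 0.3 * rng.standard_normal(30)
x_test = np.linspace(-1, 1, 5)[:, None]

mu_star, sigma_star = gp_predict(x_train, y_train, x_test, noise_var=0.3**2)
print("predictive mean:", mu_star)
print("predictive sd:  ", np.sqrt(np.clip(np.diag(sigma_star), 0.0, None)))
```

The diagonal of `sigma_star` is the predictive variance of the latent function at each test point; adding `noise_var` to it would give the variance for a new observation.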

  7. “Automatic relevance determination”
     - Squared exponential (SE) or exponentiated quadratic covariance function
       $$k_{\mathrm{SE}}(x, x') = \sigma_f^2 \exp\left( -\frac{1}{2} \sum_{j=1}^{D} \frac{(x_j - x'_j)^2}{\ell_j^2} \right).$$
     - Use of separate length-scales $\ell_j$ for each input referred to as automatic relevance determination (ARD)
     - Idea: Optimizing marginal likelihood will yield large values $\ell_j$ for irrelevant inputs
     - Problem: Large length-scale may simply mean linearity w.r.t. the input (not irrelevance)
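As a rough sketch of the SE covariance function with separate length-scales from slide 7, the following NumPy function evaluates the kernel matrix. The name `se_ard_kernel` and the example inputs are assumptions for illustration. The demo shows that a large length-scale makes the kernel nearly insensitive to differences in that input, which is the behaviour ARD interprets as irrelevance.

```python
import numpy as np


def se_ard_kernel(xa, xb, sigma_f, length_scales):
    """SE kernel with one length-scale per input dimension.

    xa has shape (n, D), xb has shape (m, D), length_scales has shape (D,).
    """
    length_scales = np.asarray(length_scales, dtype=float)
    # Scale each coordinate difference by its own length-scale.
    diff = (xa[:, None, :] - xb[None, :, :]) / length_scales
    return sigma_f**2 * np.exp(-0.5 * np.sum(diff**2, axis=-1))


# Three points: the second differs from the first only in input 1,
# the third differs from the first only in input 2.
x = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
k = se_ard_kernel(x, x, sigma_f=1.0, length_scales=[1.0, 100.0])

print("k(x1, x2), short length-scale on the differing input:", k[0, 1])
print("k(x1, x3), long length-scale on the differing input: ", k[0, 2])
```

With a length-scale of 100 on the second input, moving one unit along it barely changes the covariance, whereas the same move along the first input reduces it noticeably.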

  9. Toy example
     - $f(x) = f_1(x_1) + \cdots + f_8(x_8)$, $y \sim \mathrm{N}\left(f, 0.3^2\right)$
     - $\mathrm{Var}[f_j] = 1$ for all $j$ $\Rightarrow$ all inputs equally relevant
     - [Figure: eight panels showing $f_1(x_1), \ldots, f_8(x_8)$ for inputs in $[-1, 1]$; bar plot over inputs 1 to 8 comparing the true relevance with the optimized ARD-values, ARD-value $\mathrm{ARD}(j) = 1/\ell_j$ (averaged over 100 data realizations, n = 200)]

  11. How about estimating the predictive performance?
     - Cross-validation gives an (almost) unbiased estimate of the predictive performance
     - Fast LOO-CV approximations in Vehtari, Mononen, Tolvanen, Sivula, and Winther (2017). Bayesian leave-one-out cross-validation approximations for Gaussian latent variable models. JMLR 17(103):1-38.
     - But...
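For the Gaussian observation model of slide 6, leave-one-out predictive densities are available in closed form from the inverse of $K + \sigma^2 I$, without refitting the model n times. The sketch below uses that standard identity; it is not the approximation scheme of the cited paper, which targets non-Gaussian latent variable models. The function name `gp_loo_log_density` and the toy data are assumptions for illustration.

```python
import numpy as np


def gp_loo_log_density(K, y, noise_var):
    """Closed-form LOO log predictive densities for GP regression.

    K is the (n, n) prior covariance of the latent values, y the targets,
    and noise_var the Gaussian observation noise variance.
    """
    Ky_inv = np.linalg.inv(K + noise_var * np.eye(len(y)))
    diag = np.diag(Ky_inv)
    # LOO predictive mean and variance of y_i given all other observations.
    loo_mean = y - (Ky_inv @ y) / diag
    loo_var = 1.0 / diag
    return -0.5 * np.log(2 * np.pi * loo_var) - 0.5 * (y - loo_mean) ** 2 / loo_var


# Toy data (assumed for illustration only).
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=(20, 1))
y = np.sin(3 * x[:, 0]) + 0.3 * rng.standard_normal(20)
K = np.exp(-0.5 * (x - x.T) ** 2)  # SE kernel with unit scale and length-scale

loo = gp_loo_log_density(K, y, noise_var=0.3**2)
print("LOO log predictive density (sum):", loo.sum())
```

The sum of these terms is the LOO estimate of predictive performance for fixed hyperparameters; it does not account for re-estimating hyperparameters in each fold.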

  14. Selection induced bias in variable selection
     - Even if the model performance estimate is unbiased (like LOO-CV), if it is noisy (like LOO-CV), then using it for model selection introduces additional fitting to the data
     - Performance of the selection process itself can be assessed using two-level cross-validation, but this does not help in choosing better models
     - Bigger problem if there is a large number of models, as in covariate selection
     - Juho Piironen and Aki Vehtari (2017). Comparison of Bayesian predictive methods for model selection. Statistics and Computing, 27(3):711-735. doi:10.1007/s11222-016-9649-y. arXiv preprint arXiv:1503.08650.
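The effect described on slide 14 can be reproduced with a small simulation. In this sketch, every candidate model has the same true utility of 0, and each receives an unbiased but noisy performance estimate; picking the best-looking estimate then gives an optimistic value that grows with the number of candidates. The function name, noise level, and model counts are assumptions chosen for illustration, not values from the slides.

```python
import random
import statistics


def simulate_selection_bias(n_models, noise_sd, n_repeats, seed=0):
    """Mean estimated utility of the selected model when all true utilities are 0.

    Each model's estimate is drawn from N(0, noise_sd^2), so every individual
    estimate is unbiased. Selection takes the maximum of the estimates.
    """
    rng = random.Random(seed)
    selected = []
    for _ in range(n_repeats):
        estimates = [rng.gauss(0.0, noise_sd) for _ in range(n_models)]
        selected.append(max(estimates))
    return statistics.mean(selected)


for n_models in (1, 10, 100):
    bias = simulate_selection_bias(n_models, noise_sd=0.1, n_repeats=2000)
    print(f"{n_models:3d} candidate models: mean selected estimate = {bias:.3f}")
```

With a single candidate the average stays near the true value of 0, while with many candidates the selected estimate is systematically too high, even though no model is actually better than the others.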

  15. Selection induced bias in variable selection
     - [Figure: three panels for n = 20, n = 50, n = 100; horizontal axis from 0 to 50]

  16. Selection induced bias in variable selection
     - [Figure from Piironen & Vehtari (2017): grid of panels with columns n = 100, n = 200, n = 400 and rows CV-10, WAIC, DIC, MPP, BMA-ref, BMA-proj; horizontal axis from 0 to 100, vertical axis from −0.6 to 0.3]

  17. Selection induced bias in variable selection
     - [Figure from Piironen & Vehtari (2017): grid of panels with columns n = 100, n = 200, n = 400 and rows CV-10, WAIC, DIC, MPP, BMA-ref, BMA-proj; horizontal axis from 0 to 100, vertical axis from −0.6 to 0]
