Variational Bayesian Inference for Parametric and Non-Parametric - PowerPoint PPT Presentation

Variational Bayesian Inference for Parametric and Non-Parametric Regression with Missing Predictor Data Christel Faes, John Ormerod and Matt Wand August 23, 2010 Christel Faes, John Ormerod and Matt Wand Variational Bayesian Inference

Introduction: Bayesian inference
For parametric regression: long history (e.g. Box and Tiao, 1973; Gelman, Carlin, Stern and Rubin, 2004).
For non-parametric regression: e.g. mixed model representations of penalized splines (e.g. Ruppert, Wand and Carroll, 2003).
For dealing with missingness in data: allows incorporation of standard missing data models (e.g. Little and Rubin, 2004; Daniels and Hogan, 2008).
Easy via MCMC, but can be costly in processing time.

Introduction: Variational Bayes inference
Part of mainstream Computer Science methodology (e.g. Bishop, 2006).
Recently used in statistical problems (e.g. Teschendorff et al., 2005; McGrory & Titterington, 2007; Ormerod & Wand, 2010).
Deterministic approach that yields approximate inference.
Involves approximation of posterior densities by other densities for which inference is more tractable.
Faes, Ormerod and Wand (2010): develop and investigate variational Bayes for regression analysis with missing data.

Elements of Variational Bayes
Bayesian inference is based on the posterior density function
$$p(\theta \mid y) = \frac{p(y, \theta)}{p(y)}.$$
For an arbitrary density function $q$ over $\Theta$, the following inequality holds:
$$p(y) \ge p(y; q) \equiv \exp \int q(\theta) \log\left\{ \frac{p(y, \theta)}{q(\theta)} \right\} d\theta.$$
Variational Bayes relies on product density restrictions:
$$q(\theta) = \prod_{i=1}^{M} q_i(\theta_i) \quad \text{for some partition } \{\theta_1, \ldots, \theta_M\} \text{ of } \theta.$$
The optimal densities (with minimum Kullback-Leibler divergence from the posterior) can be shown to satisfy
$$q_i^*(\theta_i) \propto \exp\{ E_{-\theta_i} \log p(\theta_i \mid \text{rest}) \},$$
where $E_{-\theta_i}$ denotes expectation with respect to $\prod_{j \ne i} q_j(\theta_j)$.
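The lower bound can be checked numerically on a toy model. The sketch below is not from the presentation: it assumes a one-parameter model, $y_i \mid \theta \sim N(\theta, 1)$ with $\theta \sim N(0, \sigma^2_\theta)$, chosen only because its posterior is known in closed form. The function names, the grid and the value of `prior_var` are illustrative assumptions. It evaluates $\log p(y; q)$ for a Gaussian $q$ by a Riemann sum and compares it with $\log p(y)$.

```python
import numpy as np

def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * np.log(2.0 * np.pi * var) - 0.5 * (x - mean) ** 2 / var

def log_joint(theta, y, prior_var):
    """log p(y, theta) for y_i | theta ~ N(theta, 1), theta ~ N(0, prior_var)."""
    log_lik = log_normal_pdf(y[:, None], theta[None, :], 1.0).sum(axis=0)
    return log_lik + log_normal_pdf(theta, 0.0, prior_var)

def log_marginal(y, prior_var, grid):
    """log p(y), by a Riemann sum of p(y, theta) over a uniform grid."""
    lj = log_joint(grid, y, prior_var)
    dx = grid[1] - grid[0]
    return lj.max() + np.log(np.sum(np.exp(lj - lj.max())) * dx)

def log_lower_bound(q_mean, q_var, y, prior_var, grid):
    """log p(y; q) = integral of q(theta) log{p(y, theta) / q(theta)} d theta."""
    log_q = log_normal_pdf(grid, q_mean, q_var)
    dx = grid[1] - grid[0]
    return np.sum(np.exp(log_q) * (log_joint(grid, y, prior_var) - log_q)) * dx

# Illustrative data and settings (assumptions, not values from the slides).
rng = np.random.default_rng(0)
y = rng.normal(1.0, 1.0, size=20)
prior_var = 100.0
grid = np.linspace(-10.0, 10.0, 20001)

# Exact posterior for this conjugate toy model.
post_var = 1.0 / (y.size + 1.0 / prior_var)
post_mean = post_var * y.sum()

log_py = log_marginal(y, prior_var, grid)
lb_exact = log_lower_bound(post_mean, post_var, y, prior_var, grid)
lb_off = log_lower_bound(post_mean + 0.5, 2.0 * post_var, y, prior_var, grid)
print(log_py, lb_exact, lb_off)
```

When $q$ equals the posterior the bound is attained (up to numerical error); any other $q$ gives a strictly smaller value, and the gap is the Kullback-Leibler divergence from $q$ to the posterior.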

Simple Linear Regression with Missing Predictor Data
Assume the model
$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2_\epsilon).$$
Couch this in a Bayesian framework by taking $\beta_0, \beta_1 \sim N(0, \sigma^2_\beta)$ and $\sigma^2_\epsilon \sim IG(A_\epsilon, B_\epsilon)$.
Suppose that the predictors are susceptible to missingness and assume $x_i \sim N(\mu_x, \sigma^2_x)$ with hyperpriors $\mu_x \sim N(0, \sigma^2_{\mu_x})$ and $\sigma^2_x \sim IG(A_x, B_x)$.
Let $R_i$ be the missingness indicators and consider the missingness mechanisms:
1. MCAR: $P(R_i = 1) = p$
2. MAR: $P(R_i = 1) = \Phi(\phi_0 + \phi_1 y_i)$ for $\phi_0, \phi_1 \sim N(0, \sigma^2_\phi)$
3. MNAR: $P(R_i = 1) = \Phi(\phi_0 + \phi_1 x_i)$ for $\phi_0, \phi_1 \sim N(0, \sigma^2_\phi)$
Use auxiliary variables $a_i \mid \phi \sim N((Y\phi)_i, 1)$ (MAR) or $a_i \mid \phi \sim N((X\phi)_i, 1)$ (MNAR) for the probit regression components.
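A minimal simulation sketch of the three missingness mechanisms follows. It is not the authors' code. It assumes that $R_i = 1$ means $x_i$ is observed, and the function name and all default parameter values are illustrative. The probit mechanisms are generated through the auxiliary variables: drawing $a_i \sim N(\phi_0 + \phi_1 y_i, 1)$ and setting $R_i = 1$ when $a_i \ge 0$ gives $P(R_i = 1) = \Phi(\phi_0 + \phi_1 y_i)$.

```python
import numpy as np

def simulate_missing_predictor(n, mechanism, beta=(1.0, 1.0), sigma_eps=0.5,
                               mu_x=0.5, sigma_x=1.0, p=0.8, phi=(1.0, 1.0),
                               seed=0):
    """Simulate (y, x_obs, r); x_obs is NaN wherever r == 0 (x missing)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu_x, sigma_x, size=n)
    y = beta[0] + beta[1] * x + rng.normal(0.0, sigma_eps, size=n)

    if mechanism == "MCAR":
        # Observation probability is a constant p.
        r = rng.random(n) < p
    elif mechanism == "MAR":
        # Probit in the always-observed response y, via auxiliary a_i.
        a = rng.normal(phi[0] + phi[1] * y, 1.0)
        r = a >= 0.0
    elif mechanism == "MNAR":
        # Probit in the possibly-missing predictor x, via auxiliary a_i.
        a = rng.normal(phi[0] + phi[1] * x, 1.0)
        r = a >= 0.0
    else:
        raise ValueError("mechanism must be 'MCAR', 'MAR' or 'MNAR'")

    x_obs = np.where(r, x, np.nan)
    return y, x_obs, r.astype(int)

y, x_obs, r = simulate_missing_predictor(1000, "MNAR")
print("fraction of x observed:", r.mean())
```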

Approximate Inference via Variational Bayes
We impose the product density restrictions:
MCAR: $q(\beta, \sigma^2_\epsilon, x_{mis}, \mu_x, \sigma^2_x) = q(\beta, \mu_x)\, q(\sigma^2_\epsilon, \sigma^2_x)\, q(x_{mis})$
MAR: $q(\beta, \sigma^2_\epsilon, x_{mis}, \mu_x, \sigma^2_x, \phi, a) = q(\beta, \mu_x, \phi)\, q(\sigma^2_\epsilon, \sigma^2_x)\, q(x_{mis})\, q(a)$
MNAR: $q(\beta, \sigma^2_\epsilon, x_{mis}, \mu_x, \sigma^2_x, \phi, a) = q(\beta, \mu_x, \phi)\, q(\sigma^2_\epsilon, \sigma^2_x)\, q(x_{mis})\, q(a)$
For the MCAR case, this leads to optimal densities of the form:
$q^*(\beta)$ = bivariate normal density
$q^*(\mu_x)$ = univariate normal density
$q^*(\sigma^2_\epsilon)$ = inverse gamma density
$q^*(\sigma^2_x)$ = inverse gamma density
$q^*(x_{mis})$ = product of univariate normal densities
For the MAR and MNAR cases, the optimal densities for $\phi$ and $a$ have simple expressions as well.
Non-parametric regression gives rise to non-standard forms, and numerical integration is required (we use quadrature).
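The MCAR updates can be sketched as a coordinate ascent loop. The code below is a minimal illustration, not the authors' implementation. The update formulas were worked out here from the full conditionals of the model above under the MCAR factorization. The function name, the fixed iteration count and the hyperparameter defaults (`sigma2_beta`, `sigma2_mu`, `A_eps`, `B_eps`, `A_x`, `B_x`) are assumptions. Missing predictors are passed as NaN.

```python
import numpy as np

def vb_linreg_mcar(y, x, n_iter=200, sigma2_beta=1e8, sigma2_mu=1e8,
                   A_eps=0.01, B_eps=0.01, A_x=0.01, B_x=0.01):
    """Mean-field VB for simple linear regression with MCAR predictors."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = y.size
    mis = np.isnan(x)

    # E_q[x_i] and Var_q[x_i]; observed entries are fixed with zero variance.
    ex = np.where(mis, np.nanmean(x), x)
    vx = np.where(mis, np.nanvar(x), 0.0)
    e_inv_s2e, e_inv_s2x = 1.0, 1.0  # E_q[1/sigma2_eps], E_q[1/sigma2_x]

    for _ in range(n_iter):
        ex2 = ex ** 2 + vx  # E_q[x_i^2]
        XtX = np.array([[n, ex.sum()], [ex.sum(), ex2.sum()]])  # E_q[X'X]
        Xty = np.array([y.sum(), (ex * y).sum()])               # E_q[X]'y

        # q*(beta): bivariate normal.
        beta_cov = np.linalg.inv(e_inv_s2e * XtX + np.eye(2) / sigma2_beta)
        beta_mean = e_inv_s2e * beta_cov @ Xty
        Ebb = beta_cov + np.outer(beta_mean, beta_mean)  # E_q[beta beta']

        # q*(mu_x): univariate normal.
        mux_var = 1.0 / (n * e_inv_s2x + 1.0 / sigma2_mu)
        mux_mean = mux_var * e_inv_s2x * ex.sum()

        # q*(sigma2_eps): inverse gamma with shape A_eps + n/2, rate Bq_eps.
        rss = y @ y - 2.0 * Xty @ beta_mean + np.trace(XtX @ Ebb)
        Bq_eps = B_eps + 0.5 * rss
        e_inv_s2e = (A_eps + 0.5 * n) / Bq_eps

        # q*(sigma2_x): inverse gamma with shape A_x + n/2, rate Bq_x.
        ssx = ex2.sum() - 2.0 * mux_mean * ex.sum() + n * (mux_mean ** 2 + mux_var)
        Bq_x = B_x + 0.5 * ssx
        e_inv_s2x = (A_x + 0.5 * n) / Bq_x

        # q*(x_mis): independent univariate normals with a common variance.
        prec = e_inv_s2e * Ebb[1, 1] + e_inv_s2x
        mean = (e_inv_s2e * (y * beta_mean[1] - Ebb[0, 1])
                + e_inv_s2x * mux_mean) / prec
        ex = np.where(mis, mean, x)
        vx = np.where(mis, 1.0 / prec, 0.0)

    return {"beta_mean": beta_mean, "beta_cov": beta_cov,
            "mux_mean": mux_mean, "mux_var": mux_var,
            "sigma2_eps_shape_rate": (A_eps + 0.5 * n, Bq_eps),
            "sigma2_x_shape_rate": (A_x + 0.5 * n, Bq_x),
            "xmis_mean": ex[mis], "xmis_var": 1.0 / prec}
```

A fixed number of iterations keeps the sketch short; in practice one would monitor the lower bound $\log p(y; q)$ and stop when it ceases to increase.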

Simulation: Simple Linear Regression with predictor MCAR
Accuracy measure defined as
$$\text{accuracy}(q^*) = 1 - \frac{IAE(q^*)}{\sup_q IAE(q)} = 1 - \tfrac{1}{2}\, IAE(q^*),$$
with IAE the integrated absolute error of $q^*$.
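The accuracy measure is straightforward to compute on a grid. The sketch below takes the IAE to be $\int |q^*(\theta) - p(\theta \mid y)|\, d\theta$; since both densities integrate to one, this is at most 2, which gives the factor $\tfrac{1}{2}$. The function names and the Gaussian densities used for illustration are assumptions; in a simulation study the reference density would typically come from a long MCMC run.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def accuracy(q_density, p_density, grid):
    """1 - IAE/2, with the IAE approximated by a Riemann sum on a uniform grid."""
    dx = grid[1] - grid[0]
    iae = np.sum(np.abs(q_density - p_density)) * dx
    return 1.0 - 0.5 * iae

grid = np.linspace(-15.0, 15.0, 30001)
p = normal_pdf(grid, 0.0, 1.0)        # stand-in for the exact posterior
q_close = normal_pdf(grid, 0.1, 0.9)  # approximation close to p
q_far = normal_pdf(grid, 10.0, 0.5)   # approximation with almost no overlap

print(accuracy(p, p, grid), accuracy(q_close, p, grid), accuracy(q_far, p, grid))
```

Identical densities score 1, densities with essentially disjoint support score close to 0, and the score is often reported as a percentage.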

Simulation: Simple Linear Regression with predictor MNAR
Accuracy drops when the amount of missing data is large and when the data are noisy.
Accuracy for the missing covariates is high in all situations.
Poor performance for the missingness mechanism parameters (due to strong correlation between $\phi$ and $a$).

Nonparametric Regression with Missing Predictor Data
Good agreement between variational Bayes and MCMC in the fitted functions.
Time needed: 75 seconds for variational Bayes, 15.5 hours for MCMC.

Nonparametric Regression with Missing Predictor Data
Variational Bayes is able to handle the multimodality of the posteriors of the $x_{mis}$ (coming from the periodic nature of $f$).
Good to excellent performance for all parameters (except for the missingness mechanism parameters).

Conclusions
Variational Bayes inference achieves good to excellent accuracy for the main parameters of interest.
Accuracy is poor for the missing data mechanism parameters.
Where those parameters are of interest, better accuracy may be achieved with a more elaborate variational scheme.
Variational Bayes approximates multimodal posterior densities with a high degree of accuracy.
Speed-up on the order of several hundred-fold.

Contact Information
Christel Faes
I-BioStat, Center for Statistics
Hasselt University
Diepenbeek, Belgium
Link to paper: http://www.uow.edu.au/ mwand/papers.html
