Bayesian Linear Regression
Seung-Hoon Na, Chonbuk National University
Bayesian Linear Regression
• Compute the full posterior over w and σ²
• Case 1) the noise variance σ² is known → use a Gaussian prior
• Case 2) the noise variance σ² is unknown → use a normal inverse gamma (NIG) prior
Posterior: σ² is known
• The likelihood (the offset term is dropped by putting an improper prior on it; further assume that the output is centered):
  p(y | X, w, σ²) ∝ N(y | Xw, σ²I_N)
• The conjugate prior is a Gaussian: p(w) = N(w | w₀, V₀)
Posterior: σ² is known
• The posterior: p(w | X, y, σ²) = N(w | w_N, V_N), with
  V_N = σ²(σ²V₀⁻¹ + XᵀX)⁻¹,  w_N = V_N V₀⁻¹ w₀ + (1/σ²) V_N Xᵀy
• If w₀ = 0 and V₀ = τ²I, then the posterior mean reduces to the ridge estimate
  w_N = (XᵀX + λI)⁻¹ Xᵀy with λ = σ²/τ²
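The posterior update above can be sketched numerically; the data, prior scale, and noise level below are my own choices, but the check that the posterior mean equals the ridge estimate with λ = σ²/τ² follows directly from the formulas:

```python
import numpy as np

# Sketch (synthetic data): posterior over w when sigma^2 is known,
# assuming the zero-mean prior N(w | 0, tau2 * I) from the slide.
rng = np.random.default_rng(0)
N, D = 50, 3
X = rng.normal(size=(N, D))
w_true = np.array([1.0, -2.0, 0.5])
sigma2, tau2 = 0.25, 1.0
y = X @ w_true + rng.normal(scale=np.sqrt(sigma2), size=N)

# V_N = sigma2 * (sigma2/tau2 * I + X^T X)^{-1},  w_N = (1/sigma2) V_N X^T y
A = (sigma2 / tau2) * np.eye(D) + X.T @ X
V_N = sigma2 * np.linalg.inv(A)
w_N = V_N @ (X.T @ y) / sigma2

# The posterior mean equals the ridge estimate with lambda = sigma2/tau2
w_ridge = np.linalg.solve(X.T @ X + (sigma2 / tau2) * np.eye(D), X.T @ y)
assert np.allclose(w_N, w_ridge)
```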
Posterior: σ² is known
• 1D example: the true parameters are a₀ = −0.3, a₁ = 0.5
• Sequential Bayesian inference: compute the posterior given the first n data points, then use it as the prior for the next point
[Figure: posterior over (a₀, a₁) after n = 0, 1, 2, 20 data points; true parameters a₀ = −0.3, a₁ = 0.5]
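The sequential updating can be sketched as follows; only the true parameters a₀ = −0.3, a₁ = 0.5 come from the slides, while the prior, noise level, and data are my own choices. Processing points one at a time reproduces the batch posterior exactly:

```python
import numpy as np

# Sketch of sequential Bayesian inference for the 1D example:
# the posterior after n points is the prior for point n + 1.
rng = np.random.default_rng(1)
a0_true, a1_true, sigma2 = -0.3, 0.5, 0.04
x = rng.uniform(-1, 1, size=20)
X = np.column_stack([np.ones_like(x), x])      # design matrix [1, x]
y = a0_true + a1_true * x + rng.normal(scale=np.sqrt(sigma2), size=20)

w_n, V_n = np.zeros(2), np.eye(2) * 2.0        # prior N(0, 2I) (my choice)
for xi, yi in zip(X, y):                        # one point at a time
    V_inv = np.linalg.inv(V_n) + np.outer(xi, xi) / sigma2
    V_new = np.linalg.inv(V_inv)
    w_n = V_new @ (np.linalg.inv(V_n) @ w_n + xi * yi / sigma2)
    V_n = V_new

# Batch posterior on all 20 points gives the same answer
V_batch = np.linalg.inv(np.eye(2) / 2.0 + X.T @ X / sigma2)
w_batch = V_batch @ (X.T @ y / sigma2)
assert np.allclose(w_n, w_batch)
```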
Posterior Predictive: σ² is known
• The posterior predictive distribution at a test point x is Gaussian:
  p(y | x, D, σ²) = N(y | w_Nᵀx, σ²_N(x)), where σ²_N(x) = σ² + xᵀV_N x
• The plug-in approximation N(y | ŵᵀx, σ²) ignores parameter uncertainty, yielding a constant error bar
[Figure: 10 samples from the posterior predictive vs. 10 samples from the plug-in approximation to the posterior predictive]
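The difference between the two predictive distributions can be sketched as below (synthetic 1D data and test points are my own choices): the full predictive variance σ² + xᵀV_N x grows away from the training data, while the plug-in variance stays fixed at σ²:

```python
import numpy as np

# Sketch: full posterior predictive variance vs. the constant
# plug-in variance sigma2, on synthetic 1D data.
rng = np.random.default_rng(2)
N, sigma2, tau2 = 20, 0.04, 1.0
x = rng.uniform(-1, 1, size=N)
X = np.column_stack([np.ones_like(x), x])       # features [1, x]
y = -0.3 + 0.5 * x + rng.normal(scale=np.sqrt(sigma2), size=N)

V_N = sigma2 * np.linalg.inv((sigma2 / tau2) * np.eye(2) + X.T @ X)
w_N = V_N @ (X.T @ y) / sigma2

variances = {}
for x_test in [0.0, 3.0]:                       # inside vs. outside the data range
    phi = np.array([1.0, x_test])
    variances[x_test] = sigma2 + phi @ V_N @ phi
# The plug-in approximation would report sigma2 at both points.
print(variances)
```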
Bayesian linear regression: σ² is unknown
• The likelihood: p(y | X, w, σ²) = N(y | Xw, σ²I_N)
• The natural conjugate prior is normal inverse gamma (NIG):
  NIG(w, σ² | w₀, V₀, a₀, b₀) ≜ N(w | w₀, σ²V₀) IG(σ² | a₀, b₀)
Inverse Wishart Distribution
• If D = 1, the Wishart reduces to the Gamma distribution
• Similarly, if D = 1, the inverse Wishart reduces to the inverse Gamma distribution
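These D = 1 reductions can be checked numerically; the parameter values below are arbitrary, and the mapping to scipy's parameterization (Gamma with shape ν/2 and scale 2s, inverse Gamma with shape ν/2 and scale s/2) follows scipy's own documentation:

```python
import numpy as np
from scipy.stats import wishart, gamma, invwishart, invgamma

# Sketch: verify the D = 1 reductions pointwise (scipy parameterization).
x = np.linspace(0.5, 5.0, 10)
nu, s = 5.0, 2.0   # arbitrary degrees of freedom and scale

# Wishart with D = 1 equals Gamma(shape = nu/2, scale = 2s)
pw = wishart(df=nu, scale=s).pdf(x)
pg = gamma(a=nu / 2, scale=2 * s).pdf(x)
assert np.allclose(pw, pg)

# Inverse Wishart with D = 1 equals inverse Gamma(shape = nu/2, scale = s/2)
piw = invwishart(df=nu, scale=s).pdf(x)
pig = invgamma(a=nu / 2, scale=s / 2).pdf(x)
assert np.allclose(piw, pig)
```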
Bayesian linear regression: σ² is unknown
• The posterior: p(w, σ² | D) = NIG(w, σ² | w_N, V_N, a_N, b_N), with
  V_N = (V₀⁻¹ + XᵀX)⁻¹,  w_N = V_N(V₀⁻¹w₀ + Xᵀy),
  a_N = a₀ + N/2,  b_N = b₀ + ½(w₀ᵀV₀⁻¹w₀ + yᵀy − w_NᵀV_N⁻¹w_N)
• The posterior marginals:
  p(σ² | D) = IG(a_N, b_N),  p(w | D) = T(w_N, (b_N/a_N)V_N, 2a_N)
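A minimal sketch of these NIG updates, with prior parameters and data of my own choosing; the posterior mean of σ² under IG(a_N, b_N) is b_N/(a_N − 1):

```python
import numpy as np

# Sketch: NIG posterior updates when sigma^2 is unknown (synthetic data).
rng = np.random.default_rng(3)
N, D = 40, 2
X = rng.normal(size=(N, D))
y = X @ np.array([1.0, -1.0]) + rng.normal(scale=0.5, size=N)

w0, V0 = np.zeros(D), np.eye(D) * 10.0   # prior N(w | w0, sigma2 * V0)
a0, b0 = 1.0, 1.0                        # prior IG(sigma2 | a0, b0)

V0_inv = np.linalg.inv(V0)
V_N = np.linalg.inv(V0_inv + X.T @ X)
w_N = V_N @ (V0_inv @ w0 + X.T @ y)
a_N = a0 + N / 2
b_N = b0 + 0.5 * (w0 @ V0_inv @ w0 + y @ y - w_N @ np.linalg.inv(V_N) @ w_N)

# Posterior mean of sigma^2 under the IG(a_N, b_N) marginal
sigma2_post_mean = b_N / (a_N - 1)
print(w_N, sigma2_post_mean)
```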
Bayesian linear regression: σ² is unknown
• The posterior predictive is a Student T distribution
• Given new test inputs X̃:
  p(ỹ | X̃, D) = T(ỹ | X̃w_N, (b_N/a_N)(I + X̃V_NX̃ᵀ), 2a_N)
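The Student T predictive can be sketched end to end as below; the vague prior and synthetic data are my own choices, and the three returned quantities are the predictive mean, scale matrix, and degrees of freedom from the formula above:

```python
import numpy as np

# Sketch: Student T posterior predictive for the NIG model (synthetic data).
rng = np.random.default_rng(4)
N, D = 30, 2
X = rng.normal(size=(N, D))
y = X @ np.array([0.5, 2.0]) + rng.normal(scale=0.3, size=N)

# NIG posterior under a vague prior: w0 = 0, V0 = 100 I, a0 = b0 = 0.1
V0_inv = np.eye(D) / 100.0
prec = V0_inv + X.T @ X
V_N = np.linalg.inv(prec)
w_N = V_N @ (X.T @ y)
a_N = 0.1 + N / 2
b_N = 0.1 + 0.5 * (y @ y - w_N @ prec @ w_N)

# Predictive at X_test: T(X_test w_N, (b_N/a_N)(I + X_test V_N X_test^T), 2 a_N)
X_test = rng.normal(size=(5, D))
mean = X_test @ w_N
scale = (b_N / a_N) * (np.eye(5) + X_test @ V_N @ X_test.T)
dof = 2 * a_N
print(mean, np.diag(scale), dof)
```

Note that each diagonal entry of the scale matrix exceeds b_N/a_N: the `X̃V_NX̃ᵀ` term adds the parameter uncertainty on top of the observation noise.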
Bayesian linear regression: σ² is unknown — Uninformative prior
• It is common to set a₀ = b₀ = 0, corresponding to an uninformative prior for σ², and to set w₀ = 0 and V₀ = g(XᵀX)⁻¹ for some positive value g (the g-prior)
• The unit information prior corresponds to g = N, i.e., a prior carrying as much information as one observation
Bayesian linear regression: σ² is unknown — Uninformative prior
• A fully uninformative prior: use the uninformative limit of the conjugate g-prior, which corresponds to setting g = ∞
Bayesian linear regression: σ² is unknown — Uninformative prior
• The marginal distribution of the weights is a Student T:
  p(w | D) = T(w | ŵ, (s²/(N − D)) C, N − D), where C = (XᵀX)⁻¹, ŵ is the MLE, and s² is the sum of squared residuals
Bayesian linear regression: Evidence procedure
• Evidence procedure: an empirical Bayes procedure for picking the hyperparameters
  – Choose α and β to maximize the marginal likelihood, where α = 1/σ² is the precision of the observation noise and β is the precision of the prior
  – Provides an alternative to using cross validation
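One standard way to maximize the marginal likelihood is the fixed-point re-estimation scheme (as in Bishop's PRML); the synthetic data and initial values below are my own choices, and the variables are named `noise_prec` and `prior_prec` to avoid fixing a particular α/β convention:

```python
import numpy as np

# Sketch of the evidence procedure via fixed-point updates:
# alternate between the posterior over w and hyperparameter re-estimation.
rng = np.random.default_rng(5)
N, D = 100, 3
X = rng.normal(size=(N, D))
y = X @ np.array([1.0, 0.0, -2.0]) + rng.normal(scale=0.5, size=N)

noise_prec, prior_prec = 1.0, 1.0               # 1/sigma^2 and prior precision (initial guesses)
for _ in range(100):
    # Posterior over w given the current hyperparameters
    A = prior_prec * np.eye(D) + noise_prec * X.T @ X
    m_N = noise_prec * np.linalg.solve(A, X.T @ y)
    # gamma = effective number of well-determined parameters
    eigvals = noise_prec * np.linalg.eigvalsh(X.T @ X)
    gamma = np.sum(eigvals / (prior_prec + eigvals))
    prior_prec = gamma / (m_N @ m_N)
    noise_prec = (N - gamma) / np.sum((y - X @ m_N) ** 2)

print(1.0 / noise_prec)   # estimate of sigma^2 (data generated with 0.25)
```

In contrast to cross validation, no held-out data is needed: both precisions are tuned on the training set through the marginal likelihood.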