I 4 - Bayesian parameter estimation in a normal model STAT 587 - PowerPoint PPT Presentation

I 4 - Bayesian parameter estimation in a normal model STAT 587 (Engineering) Iowa State University September 18, 2020

Bayesian parameter estimation Bayesian parameter estimation Recall that Bayesian parameter estimation involves p ( θ | y ) = p ( y | θ ) p ( θ ) p ( y | θ ) p ( θ ) = � p ( y ) p ( y | θ ) p ( θ ) dθ ) with posterior p ( θ | y ) , prior p ( θ ) , model p ( y | θ ) , and prior predictive p ( y ) . For this video, θ = ( µ, σ 2 ) and y | µ, σ 2 ∼ N ( µ, σ 2 ) .

Bayesian parameter estimation Normal model Bayesian parameter estimation in a normal model ind ∼ N ( µ, σ 2 ) and the default prior Let Y i p ( µ, σ 2 ) ∝ 1 σ 2 . Note: This “prior” is not a distribution since its integral is not finite. Nonetheless, we can still derive the following posterior , ( n − 1) s 2 � n − 1 � µ | y ∼ t n − 1 ( y, s 2 /n ) σ 2 | y ∼ IG and 2 2 where n is the sample size, � n y = 1 i =1 y i is the sample mean, and n s 2 = i =1 ( y i − y ) 2 is the sample variance. � n 1 n − 1

Bayesian parameter estimation Moments for the mean Posterior for the mean The posterior for the mean is µ | y ∼ t n − 1 ( y, s 2 /n ) and from properties of the generalized Student’s t distribution, we know E [ µ | y ] = y for n > 2 , � V ar [ µ | y ] = ( n − 1) s 2 n for n > 3 , ( n − 3) and µ − y s/ √ n ∼ t n − 1 .

Bayesian parameter estimation Credible intervals for the mean Credible intervals for µ Since µ − y s/ √ n ∼ t n − 1 a 100(1 − a ) % equal-tail credible interval is y ± t n − 1 ,a/ 2 s/ √ n where t n − 1 ,a/ 2 is a t critical value such that P ( T n − 1 < t n − 1 ,a/ 2 ) = 1 − a/ 2 when T n − 1 ∼ t n − 1 . For example, t 10 − 1 , 0 . 05 / 2 is n = 10 a = 0.05 # 95 \ % CI qt(1-a/2, df = n-1) [1] 2.262157

Bayesian parameter estimation Moments for the variance Posterior for the variance The posterior for the mean is , ( n − 1) s 2 � n − 1 � σ 2 | y ∼ IG 2 2 and from properties of the inverse Gamma distribution, we know E [ σ 2 | y ] = ( n − 1) s 2 for n > 3 , n − 3 and , ( n − 1) s 2 � 1 � n − 1 � � � y ∼ Ga � σ 2 2 2 where ( n − 1) s 2 / 2 is the rate parameter.

Bayesian parameter estimation Credible intervals for the variance Credible intervals for σ 2 For a 100(1 − a ) % credible interval, we need a/ 2 = P ( σ 2 < L | y ) = P ( σ 2 > U | y ) . To do this, we will find � 1 � 1 � � � � σ 2 > 1 σ 2 < 1 � � a/ 2 = P � y = P � y . � � L U Here is a function that performs this computation qinvgamma <- function(p, shape, scale = 1) 1/qgamma(1-p, shape = shape, rate = scale)

Bayesian parameter estimation Posterior for the standard deviation Posterior for the standard deviation, σ The variance is hard to interpret because its units are squared relative to Y i . In contrast, the √ σ 2 units are the same as Y i . standard deviation σ = For credible intervals (or any quantile), we can compute the square root of the endpoints since P ( σ 2 < c 2 ) = P ( σ < c ) . Find the pdf through transformations of random variables. In R code, dinvgamma <- function(x, shape, scale = 1) dgamma(1/x, shape = shape, rate = scale)/x^2 dsqrtinvgamma = function(x, shape, scale) dinvgamma(x^2, shape, scale)*2*x

Yield data analysis Yield data Suppose we have a random sample of 9 Iowa farms and we obtain corn yield in bushels per acre on those farms. Let Y i be the yield for farm i in bushels/acre and assume ind ∼ N ( µ, σ 2 ) . Y i We are interested in making statements about µ and σ 2 . yield_data <- read.csv("yield.csv") nrow(yield_data) [1] 9 yield_data farm yield 1 farm1 153.5451 2 farm2 205.6999 3 farm3 178.7548 4 farm4 170.1692 5 farm5 224.7723 6 farm6 184.0806 7 farm7 169.8615 8 farm8 201.2721 9 farm9 181.6356

Yield data analysis Histogram of yield Histogram of yield 2.0 1.5 count 1.0 0.5 0.0 150 170 190 210 230 yield

Yield data analysis Calculate sufficient statistics Calculate sufficient statistics n = length(yield_data$yield); n [1] 9 sample_mean = mean(yield_data$yield); sample_mean [1] 185.5323 sample_variance = var(yield_data$yield); sample_variance [1] 470.2817 Use these sufficient statistics to calculate: posterior densities posterior means credible intervals

Yield data analysis Posterior densities Posterior density for µ Posterior density for population mean 0.04 Posterior probability density function 0.03 0.02 0.01 0.00 160 180 200 220 Mean yield (bushels per acre)

Yield data analysis Posterior densities Posterior density for σ 2 Posterior density for population variance Posterior probability density function 0.0015 0.0010 0.0005 0.0000 0 500 1000 1500 2000 Variance of yield (bushels per acre)^2

Yield data analysis Posterior means Posterior means # Posterior mean of population yield mean, E[mu|y] sample_mean [1] 185.5323 Posterior mean for µ is E [ µ | y ] = 186 bushels/acre. # Posterior mean of population yield variance post_mean_var = (n-1)*sample_variance / (n-3) post_mean_var [1] 627.0422 Posterior mean for σ 2 is E [ σ 2 | y ] = 627 (bushels/acre) 2 .

Yield data analysis Credible intervals Credible intervals # 95% credible interval for the population mean a = 0.05 mean_ci = sample_mean + c(-1,1) * qt(1-a/2, df = n-1) * sqrt(sample_variance/n) mean_ci [1] 168.8630 202.2017 So a 95% credible interval for µ is (169,202) bushels/acre. # 95% credible interval for the population variance var_ci = qinvgamma(c(a/2, 1-a/2), shape = (n-1)/2, scale = (n-1)*sample_variance/2) var_ci [1] 214.5623 1726.0175 So a 95% credible interval for σ 2 is (215,1726) (bushels/acre) 2 .

Yield data analysis Credible intervals Posterior density for µ Posterior density for population mean 0.04 Posterior probability density function 0.03 0.02 0.01 0.00 160 180 200 220 Mean yield (bushels per acre)

Yield data analysis Credible intervals Posterior density for σ 2 Posterior density for population variance Posterior probability density function 0.0015 0.0010 0.0005 0.0000 0 500 1000 1500 2000 Variance of yield (bushels per acre)^2

Yield data analysis Credible intervals Posterior for the standard deviation, σ # Posterior median and 95% CI for population yield standard deviation sd_median = sqrt(qinvgamma(.5, shape = (n-1)/2, scale = (n-1)*sample_variance/2)) sd_median [1] 22.63362 So the posterior median for σ is 23 bushels/acre. # Posterior 95% CI for the population yield standard deviation sd_ci = sqrt(var_ci) sd_ci [1] 14.64795 41.54537 So a posterior 95% credible interval for σ is 15, 42 bushels/acre.

Yield data analysis Credible intervals Posterior for the standard deviation, σ Posterior density for population standard deviation Posterior probability density function 0.06 0.04 0.02 0.00 0 20 40 60 Standard deviation of yield (bushels per acre)

Summary Bayesian inference in a normal model Prior: p ( µ, σ 2 ) = 1 /σ 2 Posterior: , ( n − 1) s 2 � n − 1 � µ | y ∼ t n − 1 ( y, s 2 /n ) σ 2 | y ∼ IG and 2 2 # Sufficient statistics n = length(y) sample_mean = mean(y) sample_variance = var(y) # Posterior expectations sample_mean # mu (n-1)*sample_variance / (n-3) # sigma^2 # Posterior medians var_median = qinvgamma(.5, shape = (n-1)/2, scale = (n-1)*sample_variance/2) sd_median = sqrt(median_var) # Posterior credible intervals sample_mean + c(-1,1) * qt(1-a/2, df = n-1) * sqrt(sample_variance/n) var_ci = qinvgamma(c(a/2,1-a/2), shape = (n-1)/2, scale = (n-1)*sample_variance/2) sd_ci = sqrt(var_ci)

I 4 - Bayesian parameter estimation in a normal model STAT 587 - PowerPoint PPT Presentation

I 4 - Bayesian parameter estimation in a normal model STAT 587 (Engineering) Iowa State University September 18, 2020 Bayesian parameter estimation Bayesian parameter estimation Recall that Bayesian parameter estimation involves p ( | y ) =

Lecture 6. Bayesian estimation Lecture 6. Bayesian estimation 1 (172) 6. Bayesian estimation

Maximum-likelihood and Bayesian parameter estimation Andrea Passerini passerini@disi.unitn.it

The Normal-Normal Model Alicia Johnson Associate Professor, Macalester College DataCamp

Linear regression How to measure the accuracy of linear regression models Linear Regression

Bayesian Methods for Parameter Estimation Bayesian vs Frequentist Inference Frequentist Chris

4CSLL5 Parameter Estimation (Supervised and Unsupervised) Unsupervised Maximum Likelihood

4CSLL5 Parameter Estimation (Supervised and Unsupervised) Supervised Maximum Likelihood

Outline Intro to RL and Bayesian Learning History of Bayesian RL Model-based Bayesian

Maximum likelihood parameter estimation Maximum likelihood parameter estimation For an HMM

4CSLL5 Parameter Estimation (Supervised and Unsupervised) Martin Emms September 20, 2019 4CSLL5

6. Parameter Passing Parameter Passing CS 381 Spring 2016 Example (Formal) Parameter void

10/16/19 Parameter Control Genetic Algorithms Motivation Parameter setting Tuning

Overview Bayesian Methods for Parameter Estimation Introduction to Bayesian Statistics: Learning

Outline Introduction Knowledge Structures Parameter Estimation Maximum Likelihood Estimation

Being Bayesian About Being Bayesian About Net work St ruct ure Net work St ruct ure A Bayesian

Normal A Spectrum of Engineering Design Normal Radical A Spectrum of Engineering Design Normal

Work Sheet 1: Scales and Aggregation Scales Assign the measure examples to the correct scale

Lecture 2: Software Metrics 2017-04-27 Prof. Dr. Andreas Podelski, Dr. Bernd Westphal

Statistics I Chapter 1 What is Statistics? Ling-Chieh Kung Department of Information

Two-interval musical scales and binary structures in computer science and biology Soshinsky Ivan

Monte Carlo Methods Lecture notes for MAP001169 Based on Script by Martin Sk old adopted by

On classical and quantum scattering for field equations on the (De Sitter) Kerr metric Dietrich

About different kinds of Substitutions Matthieu Deneufch atel S eminaire CALIN, 18 Janvier

2. Lecture Fourier Transform Outline