Bayesian Analysis of Multivariate Normal Models when Dimensions are Absent Robert Zeithammer University of Chicago Peter Lenk University of Michigan http://webuser.bus.umich.edu/plenk/downloads.htm SBIES University of Iowa April 28–29, 2006 – p. 1

Outline
- Motivation
- Multivariate Regression
- HB Multivariate Regression
- HB Multinomial Probit Model
- Choice-Based Conjoint (CBC) Example

Motivation
- Absent dimensions occur in multivariate problems when one or more dimensions are completely unobserved for some sampling units
- This differs from the usual missing-data problem in that both the independent and dependent variables are unobserved
- The problem is so pervasive that researchers may not recognize that they have absent dimensions

Examples
- Not all stores carry all brands in every time period
  - Sales are missing for absent dimensions
  - Marketing mix is missing
- Not all choice sets include every brand in a CBC study
- Different schools offer different educational programs

So What?
- Imputing both independent and dependent observations for absent dimensions is an ill-posed problem in many contexts
- The likelihood function is well-defined, but
  - Multivariate observations have different lengths
  - The inverted Wishart is no longer conjugate for the error covariance matrix
- Could do it with Metropolis, but that is not fun

Common Kludge #1
- Restrict analysis to the subset of dimensions that are present across all units
- Example: brand demand study
  - Exclude small-share brands
  - Focus on national brands and store brand
  - Distorts market analysis
- Example: educational outcome study
  - Focus on common set of programs
  - Potentially biases outcomes

Common Kludge #2
- Ignore error correlations
- Example: CBC brand study
  - More brands in study than alternatives in choice sets
  - Distorts estimated heterogeneity
  - Misleading market share simulations
  - IIA worries

Common Kludge #3
- Pool absent dimensions into an “Other” dimension
- Keeps full covariance
- Meaning of “Other” is problematic
  - Demand for “Other”?
  - Marketing mix for “Other”?

Simple Solution
- In MCMC, impute the missing error term for the absent dimensions
- Continue as though you have the full data set
- Adds about three lines of code
- Adds an indicator for absent dimensions to the data structure

Multivariate Regression
- Model: for $i = 1, \dots, n$,
  $$Y_i = X_i \beta + \epsilon_i \quad \text{with} \quad \epsilon_i \sim N_m(0, \Sigma)$$
- Priors: $\beta \sim N_p(b_0, V_0)$ and $\Sigma \sim IW_m(f_0, S_0)$
- $A(i)$ is the set of indices for the absent dimensions, with $\#A(i) = m_i$
- $P(i)$ is the set of indices for the present dimensions, with $\#P(i) = m - m_i$

MCMC: Initial Assignment
- Initialization of absent dimensions:
  $$Y_{A(i)} \leftarrow 0, \qquad X_{A(i)} \leftarrow 0$$
- Setting $X_{A(i)}$ to zero facilitates draws of the regression coefficients from their full conditional distributions
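The initialization above can be sketched in a few lines of NumPy. This is only an illustration under assumed conventions that are not in the slides: the array layout (`Y` as `(n, m)`, `X` as `(n, m, p)`), the boolean indicator `present`, and the helper name `zero_fill_absent` are all hypothetical.

```python
import numpy as np

def zero_fill_absent(Y, X, present):
    """Return copies of Y and X with the absent dimensions set to zero.

    Y: (n, m) responses, X: (n, m, p) design rows, present: (n, m) bool
    indicator (True where a dimension is observed). Layout is an assumption.
    """
    Y0 = np.where(present, Y, 0.0)
    X0 = np.where(present[:, :, None], X, 0.0)
    return Y0, X0

# Hypothetical toy data: one absent dimension per observation.
rng = np.random.default_rng(0)
n, m, p = 6, 3, 2
Y = rng.normal(size=(n, m))
X = rng.normal(size=(n, m, p))
present = np.ones((n, m), dtype=bool)
present[np.arange(n), np.arange(n) % m] = False
Y[~present] = np.nan          # absent responses are unobserved
X[~present] = np.nan          # absent covariates are unobserved

Y0, X0 = zero_fill_absent(Y, X, present)
```

Keeping the indicator next to the data is what lets the sampler find the entries of `Y0` to overwrite at each iteration.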

MCMC: Absent Residuals
- Present residuals:
  $$R_{P(i)} = Y_{P(i)} - X_{P(i)} \beta$$
- Absent residuals are drawn from the conditional normal distribution, whose dimension is the number of absent dimensions, $m_i = \#A(i)$:
  $$R_{A(i)} \mid R_{P(i)}, \Sigma, \beta \sim N_{m_i}\left(\mu_{A(i)|P(i)}, \Sigma_{A(i)|P(i)}\right)$$
- Conditional mean:
  $$\mu_{A(i)|P(i)} = \Sigma_{A(i),P(i)} \, \Sigma_{P(i),P(i)}^{-1} \, R_{P(i)}$$
- Conditional covariance:
  $$\Sigma_{A(i)|P(i)} = \Sigma_{A(i),A(i)} - \Sigma_{A(i),P(i)} \, \Sigma_{P(i),P(i)}^{-1} \, \Sigma_{P(i),A(i)}$$
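A minimal sketch of the conditional draw above for a single observation, assuming NumPy and a boolean `present` mask of length m. The function name `draw_absent_residuals` and the residual values in the usage lines are hypothetical. The example covariance matrix uses the true error covariances listed for the two simulations in this deck.

```python
import numpy as np

def draw_absent_residuals(r_present, Sigma, present, rng):
    """Draw R_A | R_P ~ N(mu_{A|P}, Sigma_{A|P}) for one observation.

    r_present: residuals on the present dimensions, Sigma: (m, m) error
    covariance, present: length-m bool mask. Returns a vector of length #A.
    """
    A = np.flatnonzero(~present)
    P = np.flatnonzero(present)
    if A.size == 0:                      # nothing to impute
        return np.empty(0)
    S_AA = Sigma[np.ix_(A, A)]
    if P.size == 0:                      # everything absent: marginal draw
        return rng.multivariate_normal(np.zeros(A.size), S_AA)
    S_AP = Sigma[np.ix_(A, P)]
    S_PP = Sigma[np.ix_(P, P)]
    W = np.linalg.solve(S_PP, S_AP.T).T  # Sigma_AP Sigma_PP^{-1}
    mu = W @ r_present                   # conditional mean
    cov = S_AA - W @ S_AP.T              # conditional covariance
    cov = (cov + cov.T) / 2              # symmetrize against round-off
    return rng.multivariate_normal(mu, cov)

# Usage: dimension 3 is absent, dimensions 1 and 2 are present.
rng = np.random.default_rng(0)
Sigma = np.array([[1.0, 0.6, -0.5],
                  [0.6, 1.4, 0.0],
                  [-0.5, 0.0, 0.8]])
present = np.array([True, True, False])
r_P = np.array([0.5, -0.2])              # hypothetical present residuals
r_A = draw_absent_residuals(r_P, Sigma, present, rng)
```

Using `np.linalg.solve` rather than an explicit inverse is a numerical choice of this sketch, not something the slides prescribe.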

MCMC: Update Assignment
$$Y_{A(i)} \leftarrow R_{A(i)}, \qquad X_{A(i)} \leftarrow 0$$

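MCMC: β and Σ
- $\beta \mid \text{Rest} \sim N_p(b_n, V_n)$, with
  $$V_n = \left( V_0^{-1} + \sum_{i=1}^{n} X_i' \Sigma^{-1} X_i \right)^{-1}$$
  $$b_n = V_n \left( V_0^{-1} b_0 + \sum_{i=1}^{n} X_i' \Sigma^{-1} Y_i \right)$$
- $\Sigma \mid \text{Rest} \sim IW_m(f_n, S_n)$, with
  $$f_n = f_0 + n$$
  $$S_n = S_0 + \sum_{i=1}^{n} (Y_i - X_i \beta)(Y_i - X_i \beta)'$$
- Same code as though all dimensions are present because, with $X_{A(i)} = 0$ and $Y_{A(i)} = R_{A(i)}$, the residual $Y_i - X_i \beta$ equals the imputed residual on the absent dimensions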
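The two full conditionals above might be coded as follows. This is a sketch, not the authors' implementation. It assumes that SciPy's `invwishart` parameterization (`df`, `scale`) matches the $IW_m(f, S)$ used in the slides. The function names, the array layout (`Y` as `(n, m)`, `X` as `(n, m, p)`), and the prior values in the usage lines are hypothetical. The usage lines run on fully observed synthetic data; with absent dimensions, the same two functions would be called after zero-filling X and writing the imputed residuals into Y.

```python
import numpy as np
from scipy.stats import invwishart

def draw_beta(Y, X, Sigma, b0, V0, rng):
    """Draw beta | Rest ~ N_p(b_n, V_n)."""
    Sinv = np.linalg.inv(Sigma)
    V0inv = np.linalg.inv(V0)
    XtS = np.einsum('imp,mk->ipk', X, Sinv)           # X_i' Sigma^{-1}
    prec = V0inv + np.einsum('ipk,ikq->pq', XtS, X)   # V_n^{-1}
    rhs = V0inv @ b0 + np.einsum('ipk,ik->p', XtS, Y)
    Vn = np.linalg.inv(prec)
    Vn = (Vn + Vn.T) / 2                              # symmetrize
    return rng.multivariate_normal(Vn @ rhs, Vn)

def draw_sigma(Y, X, beta, f0, S0, rng):
    """Draw Sigma | Rest ~ IW_m(f_n, S_n)."""
    R = Y - X @ beta                                  # (n, m) residuals
    return invwishart.rvs(df=f0 + R.shape[0], scale=S0 + R.T @ R,
                          random_state=rng)

# Usage on synthetic, fully observed data (all values hypothetical).
rng = np.random.default_rng(1)
n, m, p = 400, 2, 2
beta_true = np.array([1.0, -1.0])
Sigma_true = np.array([[1.0, 0.6], [0.6, 1.4]])
X = rng.normal(size=(n, m, p))
Y = X @ beta_true + rng.multivariate_normal(np.zeros(m), Sigma_true, size=n)
b0, V0 = np.zeros(p), 100.0 * np.eye(p)
f0, S0 = m + 2, np.eye(m)
Sigma = np.eye(m)
for _ in range(200):
    beta = draw_beta(Y, X, Sigma, b0, V0, rng)
    Sigma = draw_sigma(Y, X, beta, f0, S0, rng)
```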

Two Simulations
- $m = 3$, $n = 500$, and $p = 2$
- One dimension is absent for each observation
- Simulation A
  - Observe all pairs of present dimensions: {1,2}, {1,3}, and {2,3}
- Simulation B
  - Only observe pairs {1,2} and {2,3}
  - No sample information about $\sigma_{1,3}$
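The difference between the two designs can be made concrete by counting how often each pair of dimensions is observed together. The sketch below assumes each observation is assigned one of the allowed pairs uniformly at random. The slides do not say how pairs were allocated, and the function names are hypothetical. Dimensions are 0-indexed in the code.

```python
import numpy as np

def presence_masks(pairs, n, m, rng):
    """Assign each of n observations one allowed pair of present dimensions."""
    present = np.zeros((n, m), dtype=bool)
    choice = rng.integers(len(pairs), size=n)
    for i, c in enumerate(choice):
        present[i, list(pairs[c])] = True
    return present

def co_observed_counts(present):
    """Entry (j, k) counts observations in which dimensions j and k are both present."""
    P = present.astype(int)
    return P.T @ P

rng = np.random.default_rng(0)
n, m = 500, 3
present_A = presence_masks([(0, 1), (0, 2), (1, 2)], n, m, rng)  # Simulation A
present_B = presence_masks([(0, 1), (1, 2)], n, m, rng)          # Simulation B
counts_A = co_observed_counts(present_A)
counts_B = co_observed_counts(present_B)
```

In design B the count for dimensions 1 and 3 (`counts_B[0, 2]`) is zero, which is why the data carry no information about $\sigma_{1,3}$.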

Regression Coefficients
- Recovers true values

                        Simulation A       Simulation B
Coefficient    True     Mean     STD       Mean     STD
$\beta_1$       1.0     1.057    0.036     1.062    0.042
$\beta_2$      -1.0    -0.958    0.033    -0.953    0.040

Error Variance
- Estimate of $\sigma_{1,3}$ for Simulation B is based on the prior, but other parameters are recovered

                           Simulation A       Simulation B
Covariance        True     Mean     STD       Mean     STD
$\sigma_{1,1}$     1.0     0.990    0.074     0.900    0.082
$\sigma_{1,2}$     0.6     0.622    0.078     0.586    0.076
$\sigma_{1,3}$    -0.5    -0.445    0.059     0.072    0.451
$\sigma_{2,2}$     1.4     1.358    0.105     1.517    0.096
$\sigma_{2,3}$     0.0     0.132    0.080     0.100    0.064
$\sigma_{3,3}$     0.8     0.809    0.062     0.724    0.065

Simulation A: Error Variance
[Figure: grid of nine panels for the error covariance matrix in Simulation A, each panel labeled Covariance or Correlation.]

Simulation B: Error Variance
[Figure: grid of nine panels for the error covariance matrix in Simulation B, each panel labeled Covariance or Correlation.]

Mixing
- Pay a small price in mixing of the MCMC chain
- Simulation
  - $n = 500$, $m = 3$, $p = 4$
  - Full data set
  - 1/3 of the dimensions were randomly deleted
- Posterior means are close for full and absent cases
- Posterior standard deviations are small for full case
- ACF on next slide

Full versus Absent ACF
[Figure: four panels of autocorrelation (ACF) against lag. A. ACF Coefficients Full Data; B. ACF Coefficients Missing Data; C. ACF Covariance Full Data; D. ACF Covariance Missing Data.]
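Autocorrelations like those compared above can be computed from stored MCMC draws with a few lines of NumPy. This is a generic sample-ACF sketch, not the authors' code. The function name `sample_acf` and the AR(1) test chain are hypothetical stand-ins for a real chain of draws.

```python
import numpy as np

def sample_acf(chain, max_lag=20):
    """Sample autocorrelation of a 1-D chain at lags 0..max_lag."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    denom = x @ x
    return np.array([1.0 if k == 0 else (x[:-k] @ x[k:]) / denom
                     for k in range(max_lag + 1)])

# Hypothetical stand-in for stored draws: an AR(1) series with coefficient 0.8.
rng = np.random.default_rng(0)
noise = rng.normal(size=20000)
chain = np.zeros(20000)
for t in range(1, chain.size):
    chain[t] = 0.8 * chain[t - 1] + noise[t]

acf = sample_acf(chain, max_lag=19)
```

Applying `sample_acf` to the same parameter's draws from a full-data run and an absent-dimension run gives the kind of side-by-side comparison the panels describe.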

HB Multivariate Regression
- Model: for $j = 1, \dots, n_i$ and $i = 1, \dots, N$,
  $$Y_{ij} = X_{ij} \beta_i + \epsilon_{ij} \quad \text{with} \quad \epsilon_{ij} \sim N_m(0, \Sigma)$$
  $$\beta_i = \Theta' z_i + \delta_i \quad \text{with} \quad \delta_i \sim N_p(0, \Lambda)$$
- Priors:
  $$\Sigma \sim IW_m(f_0, S_0), \qquad \Lambda \sim IW_p(g_0, T_0), \qquad \Theta' \sim N_{pq}(U_0, V_0)$$

Analysis
- The full conditional distribution of the residuals $R_{A(i,j)}$ for the absent dimensions is normal given $R_{P(i,j)}$
- Simulation
  - $m = 4$, $p = 5$, and $q = 3$ (covariate $z_i$)
  - $N = 500$ and $11 \le n_i \le 20$
  - One or two absent dimensions for each observation
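By analogy with the multivariate regression case, the conditional draw for the hierarchical model could be written as below. This is a sketch that assumes the same partition of $\Sigma$ into absent and present blocks, with the unit-level coefficient $\beta_i$ in place of $\beta$. The symbol $m_{ij}$ for the number of absent dimensions in observation $(i,j)$ is notation introduced here, not taken from the slides.

```latex
\begin{align*}
R_{P(i,j)} &= Y_{P(i,j)} - X_{P(i,j)} \beta_i \\
R_{A(i,j)} \mid R_{P(i,j)}, \Sigma, \beta_i
  &\sim N_{m_{ij}}\!\left( \mu_{A(i,j)|P(i,j)},\; \Sigma_{A(i,j)|P(i,j)} \right) \\
\mu_{A(i,j)|P(i,j)}
  &= \Sigma_{A(i,j),P(i,j)} \, \Sigma_{P(i,j),P(i,j)}^{-1} \, R_{P(i,j)} \\
\Sigma_{A(i,j)|P(i,j)}
  &= \Sigma_{A(i,j),A(i,j)}
   - \Sigma_{A(i,j),P(i,j)} \, \Sigma_{P(i,j),P(i,j)}^{-1} \, \Sigma_{P(i,j),A(i,j)}
\end{align*}
```

After this draw, setting $Y_{A(i,j)} \leftarrow R_{A(i,j)}$ and $X_{A(i,j)} \leftarrow 0$ would let the remaining updates proceed as for complete data.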

Fit Statistics for $\beta_i$

               Correlation    RMSE
Intercept 1    0.972          1.824
Intercept 2    0.732          1.970
Intercept 3    0.692          2.140
Intercept 4    0.864          2.319
X1             0.998          0.364
X2             0.969          0.662
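Statistics of this kind can be computed per coefficient across units. The sketch below assumes the table compares the true $\beta_i$ with their posterior means over the $N$ units, which the slide does not state explicitly. The function name and the toy arrays are hypothetical.

```python
import numpy as np

def fit_statistics(beta_true, beta_hat):
    """Per-coefficient correlation and RMSE across units.

    beta_true, beta_hat: (N, p) arrays of true and estimated coefficients.
    """
    bt = np.asarray(beta_true, dtype=float)
    bh = np.asarray(beta_hat, dtype=float)
    corr = np.array([np.corrcoef(bt[:, k], bh[:, k])[0, 1]
                     for k in range(bt.shape[1])])
    rmse = np.sqrt(np.mean((bh - bt) ** 2, axis=0))
    return corr, rmse

# Toy check: estimates shifted by a constant are perfectly correlated
# with the truth but have an RMSE equal to the shift.
beta_true = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]])
beta_hat = beta_true + 1.0
corr, rmse = fit_statistics(beta_true, beta_hat)
```

The toy check shows why the table reports both columns: a high correlation alone does not rule out a systematic offset, which the RMSE picks up.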

Error Variance

True     Y1       Y2       Y3       Y4
Y1       1.0      0.1      0.0      1.0
Y2       0.1      4.0      0.0      4.1
Y3       0.0      0.0      9.0      0.0
Y4       1.0      4.1      0.0      21.0

Bayes    Y1       Y2       Y3       Y4
Y1       1.004    0.068    0.154    0.935
Y2       0.068    4.052    0.180    4.111
Y3       0.154    0.180    9.131    0.166
Y4       0.935    4.111    0.166    21.529

Explained Heterogeneity Θ

True     CNST 1     CNST 2     CNST 3     CNST 4     X1        X2
CNST     -15.0      -5.0       5.0        20.0       -5.0      3.0
Z1       2.0        1.0        0.0        -2.0       1.0       -0.2
Z2       -1.0       -0.5       0.0        1.0        -0.2      0.5

Bayes    CNST 1     CNST 2     CNST 3     CNST 4     X1        X2
CNST     -14.778    -6.497     5.521      18.754     -4.168    -2.199
Z1       1.745      0.920      -0.203     -2.148     0.951     0.282
Z2       -0.798     -0.295     0.070      1.333      -0.186    0.530
