Hierarchical models (cont.)

Dr. Jarad Niemi (STAT544@ISU)
STAT 544 - Iowa State University
February 21, 2019

Outline

- Theoretical justification for hierarchical models
  - Exchangeability
  - de Finetti's theorem
  - Application to hierarchical models
- Normal hierarchical model
  - Posterior
  - Simulation study
  - Shrinkage

Theoretical justification for hierarchical models: Exchangeability

Definition. The set $Y_1, Y_2, \ldots, Y_n$ is exchangeable if the joint probability $p(y_1, \ldots, y_n)$ is invariant to permutation of the indices. That is, for any permutation $\pi$,

$$p(y_1, \ldots, y_n) = p(y_{\pi_1}, \ldots, y_{\pi_n}).$$

An exchangeable but not iid example: Consider an urn with one red ball and one blue ball, with probability 1/2 of drawing either. Draw without replacement from the urn. Let $Y_i = 1$ if the $i$th ball is red and $Y_i = 0$ otherwise. Since

$$P(Y_1 = 1, Y_2 = 0) = P(Y_1 = 0, Y_2 = 1) = 1/2,$$

$Y_1$ and $Y_2$ are exchangeable. But

$$0 = P(Y_2 = 1 \mid Y_1 = 1) \ne P(Y_2 = 1) = 1/2,$$

and thus $Y_1$ and $Y_2$ are not independent.
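The urn example can be checked by simulation. The following is a minimal R sketch, not part of the original slides; the seed, the number of simulations, and the variable names are arbitrary choices made for illustration.

```r
# Simulate the two-ball urn, drawing both balls without replacement.
set.seed(544)    # arbitrary seed
n_sims <- 10000  # arbitrary number of simulated urns

# Each column is one random draw order; 1 = red, 0 = blue.
draws <- replicate(n_sims, sample(c(1, 0)))
y1 <- draws[1, ]
y2 <- draws[2, ]

# Exchangeability: the two orderings should be (about) equally likely.
p10 <- mean(y1 == 1 & y2 == 0)
p01 <- mean(y1 == 0 & y2 == 1)

# Dependence: P(Y2 = 1 | Y1 = 1) is exactly 0, while P(Y2 = 1) is about 1/2.
p_y2_given_y1 <- mean(y2[y1 == 1] == 1)
p_y2 <- mean(y2)

print(c(p10 = p10, p01 = p01, p_y2_given_y1 = p_y2_given_y1, p_y2 = p_y2))
```

Both joint probabilities should come out close to 1/2, while the conditional probability is exactly zero because the single red ball cannot be drawn twice.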

Theoretical justification for hierarchical models: Exchangeability

Theorem. All independent and identically distributed random variables are exchangeable.

Proof. Let $y_i \stackrel{iid}{\sim} p(y)$, then

$$p(y_1, \ldots, y_n) = \prod_{i=1}^{n} p(y_i) = \prod_{i=1}^{n} p(y_{\pi_i}) = p(y_{\pi_1}, \ldots, y_{\pi_n}).$$

Definition. The sequence $Y_1, Y_2, \ldots$ is infinitely exchangeable if, for any $n$, $Y_1, Y_2, \ldots, Y_n$ are exchangeable.

Theoretical justification for hierarchical models: de Finetti's theorem

Theorem. A sequence of random variables $(y_1, y_2, \ldots)$ is infinitely exchangeable iff, for all $n$,

$$p(y_1, y_2, \ldots, y_n) = \int \prod_{i=1}^{n} p(y_i \mid \theta) \, P(d\theta),$$

for some measure $P$ on $\theta$. If the distribution on $\theta$ has a density, we can replace $P(d\theta)$ with $p(\theta)\, d\theta$.

This means that there must exist
- a parameter $\theta$,
- a likelihood $p(y \mid \theta)$ such that $y_i \stackrel{ind}{\sim} p(y \mid \theta)$, and
- a distribution $P$ on $\theta$.
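As an illustration of the mixture representation, the sketch below draws $\theta$ from a distribution $P$ and then draws binary $y_i$ independently given $\theta$. The Beta(1, 1) choice for $P$, the Bernoulli likelihood, and all variable names are assumptions made for this example only; they are not taken from the slides.

```r
# Simulate pairs (y1, y2) from a mixture of iid Bernoulli distributions.
set.seed(544)    # arbitrary seed
n_sims <- 20000  # arbitrary number of simulated pairs

theta <- rbeta(n_sims, 1, 1)    # theta ~ P, assumed here to be Beta(1, 1)
y1 <- rbinom(n_sims, 1, theta)  # y1 | theta ~ Bernoulli(theta)
y2 <- rbinom(n_sims, 1, theta)  # y2 | theta ~ Bernoulli(theta), independent of y1 given theta

# Exchangeability: P(1, 0) = P(0, 1) = E[theta (1 - theta)] = 1/6 under Beta(1, 1).
p10 <- mean(y1 == 1 & y2 == 0)
p01 <- mean(y1 == 0 & y2 == 1)

# Marginal dependence: P(1, 1) = E[theta^2] = 1/3, larger than P(Y1 = 1) P(Y2 = 1) = 1/4.
p11 <- mean(y1 == 1 & y2 == 1)
p1_times_p1 <- mean(y1) * mean(y2)

print(c(p10 = p10, p01 = p01, p11 = p11, product = p1_times_p1))
```

The draws are conditionally independent given $\theta$ but marginally dependent, which is the structure the theorem guarantees for any infinitely exchangeable sequence.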

Theoretical justification for hierarchical models: Application to hierarchical models

Assume $(y_1, y_2, \ldots)$ are infinitely exchangeable. Then by de Finetti's theorem, for the $(y_1, \ldots, y_n)$ that you actually observed, there exists
- a parameter $\theta$,
- a distribution $p(y \mid \theta)$ such that $y_i \stackrel{ind}{\sim} p(y \mid \theta)$, and
- a distribution $P$ on $\theta$.

Assume $\theta = (\theta_1, \theta_2, \ldots)$ with $\theta_i$ infinitely exchangeable. By de Finetti's theorem for $(\theta_1, \ldots, \theta_n)$, there exists
- a parameter $\phi$,
- a distribution $p(\theta \mid \phi)$ such that $\theta_i \stackrel{ind}{\sim} p(\theta \mid \phi)$, and
- a distribution $P$ on $\phi$.

Assume $\phi = \phi$ with $\phi \sim p(\phi)$.

Theoretical justification for hierarchical models: Exchangeability with covariates

Suppose we observe an observation $y_i$ and covariates $x_i$ for each unit $i$. Now we assume $(y_1, y_2, \ldots)$ are infinitely exchangeable given $x_i$. Then by de Finetti's theorem for the $(y_1, \ldots, y_n)$, there exists
- a parameter $\theta$,
- a distribution $p(y \mid \theta, x)$ such that $y_i \stackrel{ind}{\sim} p(y \mid \theta, x_i)$, and
- a distribution $P$ on $\theta$ given $x$.

Assume $\theta = (\theta_1, \theta_2, \ldots)$ with $\theta_i$ infinitely exchangeable given $x$. By de Finetti's theorem for $(\theta_1, \ldots, \theta_n)$, there exists
- a parameter $\phi$,
- a distribution $p(\theta \mid \phi, x)$ such that $\theta_i \stackrel{ind}{\sim} p(\theta \mid \phi, x_i)$, and
- a distribution $P$ on $\phi$ given $x$.

Assume $\phi = \phi$ with $\phi \sim p(\phi \mid x)$.

Summary

Hierarchical model:

$$y_i \stackrel{ind}{\sim} p(y \mid \theta_i), \quad \theta_i \stackrel{ind}{\sim} p(\theta \mid \phi), \quad \phi \sim p(\phi)$$

Hierarchical linear model:

$$y_i \stackrel{ind}{\sim} p(y \mid \theta_i, x_i), \quad \theta_i \stackrel{ind}{\sim} p(\theta \mid \phi, x_i), \quad \phi \sim p(\phi \mid x)$$

Although hierarchical models are typically written using the conditional independence notation above, the assumptions underlying the model are exchangeability and functional forms for the priors.

Normal hierarchical models

Suppose we have the following model

$$y_{ij} \stackrel{ind}{\sim} N(\theta_i, \sigma^2), \qquad \theta_i \stackrel{iid}{\sim} N(\mu, \tau^2)$$

with $j = 1, \ldots, n_i$, $i = 1, \ldots, I$, and $n = \sum_{i=1}^{I} n_i$. This is a normal hierarchical model.

Make the following assumptions for computational reasons:
- Let $\sigma^2 = s^2$ be known.
- Assume $p(\mu, \tau) \propto p(\mu \mid \tau)\, p(\tau) \propto p(\tau)$, i.e. assume an improper uniform prior on $\mu$.
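To make the two levels of the model concrete, the following R sketch generates one data set by first drawing the group means $\theta_i$ and then the observations $y_{ij}$. The values chosen for $\mu$, $\tau$, $I$, and $n_i$ are hypothetical and are used only so the example runs.

```r
# Generate data from the normal hierarchical model, top level first.
set.seed(544)      # arbitrary seed
I   <- 10          # number of groups (hypothetical)
n_i <- rep(9, I)   # observations per group (hypothetical)
s   <- 1           # known data standard deviation, sigma = s
mu  <- 0           # population mean (hypothetical value)
tau <- 2           # population standard deviation (hypothetical value)

theta <- rnorm(I, mean = mu, sd = tau)             # theta_i ~ N(mu, tau^2)
group <- rep(seq_len(I), times = n_i)              # group index i for each observation
y     <- rnorm(sum(n_i), mean = theta[group], sd = s)  # y_ij ~ N(theta_i, s^2)

sim <- data.frame(group = factor(group), y = y)
head(sim)
```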

Normal hierarchical models: Posterior distributions

The necessary conditional and marginal posteriors are presented in section 5.4 of BDA. Let

$$\bar{y}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij} \qquad \text{and} \qquad s_i^2 = s^2 / n_i.$$

Then

$$p(\tau \mid y) \propto p(\tau)\, V_\mu^{1/2} \prod_{i=1}^{I} (s_i^2 + \tau^2)^{-1/2} \exp\left( -\frac{(\bar{y}_{i\cdot} - \hat{\mu})^2}{2 (s_i^2 + \tau^2)} \right)$$

$$\mu \mid \tau, y \sim N(\hat{\mu}, V_\mu)$$

$$\theta_i \mid \mu, \tau, y \sim N(\hat{\theta}_i, V_i)$$

where

$$V_\mu^{-1} = \sum_{i=1}^{I} \frac{1}{s_i^2 + \tau^2}, \qquad \hat{\mu} = V_\mu \left( \sum_{i=1}^{I} \frac{\bar{y}_{i\cdot}}{s_i^2 + \tau^2} \right)$$

$$V_i^{-1} = \frac{1}{s_i^2} + \frac{1}{\tau^2}, \qquad \hat{\theta}_i = V_i \left( \frac{\bar{y}_{i\cdot}}{s_i^2} + \frac{\mu}{\tau^2} \right)$$
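The posterior quantities above translate fairly directly into code. The sketch below is one possible R transcription; the function names (`post_mu`, `log_post_tau`, `post_theta`), the default flat `log_prior`, and the example group means are all hypothetical and chosen for illustration.

```r
# mu | tau, y ~ N(mu_hat, V_mu)
post_mu <- function(tau, ybar, s2_i) {
  prec <- 1 / (s2_i + tau^2)   # 1 / (s_i^2 + tau^2)
  V_mu <- 1 / sum(prec)
  list(mu_hat = V_mu * sum(ybar * prec), V_mu = V_mu)
}

# log p(tau | y) up to an additive constant; log_prior is log p(tau)
log_post_tau <- function(tau, ybar, s2_i, log_prior = function(tau) 0) {
  m <- post_mu(tau, ybar, s2_i)
  log_prior(tau) + 0.5 * log(m$V_mu) -
    0.5 * sum(log(s2_i + tau^2)) -
    sum((ybar - m$mu_hat)^2 / (2 * (s2_i + tau^2)))
}

# theta_i | mu, tau, y ~ N(theta_hat_i, V_i)
post_theta <- function(mu, tau, ybar, s2_i) {
  V_i <- 1 / (1 / s2_i + 1 / tau^2)
  list(theta_hat = V_i * (ybar / s2_i + mu / tau^2), V_i = V_i)
}

# Made-up group means with n_i = 9 and s = 1, so s_i^2 = 1/9
ybar <- c(-1, 0, 2)
s2_i <- rep(1 / 9, 3)
m  <- post_mu(tau = 1, ybar, s2_i)
th <- post_theta(mu = m$mu_hat, tau = 1, ybar, s2_i)
lp <- log_post_tau(1, ybar, s2_i)
```

In this example each $\hat{\theta}_i$ is a weighted average of the group mean $\bar{y}_{i\cdot}$ and $\mu$, so it lies closer to $\mu$ than the group mean does. Working on the log scale for $p(\tau \mid y)$ is a design choice to avoid numerical underflow when the product has many terms.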

Normal hierarchical models: Simulation study

Common to both simulation scenarios:
- $I = 10$
- $n_i = 9$ for all $i$
- $s = 1$, thus $s_i = 1/3$ for all $i$

Scenarios:
1. Common mean: $\theta_i = 0$ for all $i$
2. Group-specific means: $\theta_i = i - (I/2 + 0.5)$

Use $\tau \sim Ca^+(0, 1)$.

Normal hierarchical models: Simulation study (R code)

    J = 10                               # number of groups (I in the slides)
    n_per_group = 9                      # observations per group
    n = rep(n_per_group, J)
    sigma = 1                            # data standard deviation (the rnorm default)
    N = sum(n)
    group = rep(1:J, each = n_per_group)
    set.seed(1)
    df = rbind(
      # All means are the same
      data.frame(group = factor(group), simulation = "common_mean",
                 y = rnorm(N)),
      # Each group has its own mean
      data.frame(group = factor(group), simulation = "group_specific_mean",
                 y = rnorm(N, group - (J/2 + .5))))

Normal hierarchical models: Simulation study

[Figure: y plotted against group (1 to 10, also shown in the legend) in two panels, common_mean and group_specific_mean.]

Normal hierarchical models: Simulation study, summary statistics

        simulation           group  n   mean    sd
     1  common_mean              1  9   0.18  0.81
     2  common_mean              2  9   0.09  1.11
     3  common_mean              3  9   0.18  0.91
     4  common_mean              4  9  -0.19  0.89
     5  common_mean              5  9   0.17  0.62
     6  common_mean              6  9   0.02  0.70
     7  common_mean              7  9   0.61  1.14
     8  common_mean              8  9   0.14  1.19
     9  common_mean              9  9  -0.31  0.60
    10  common_mean             10  9   0.20  0.81
    11  group_specific_mean      1  9  -4.32  1.10
    12  group_specific_mean      2  9  -3.40  0.88
    13  group_specific_mean      3  9  -2.41  0.89
    14  group_specific_mean      4  9  -1.38  0.60
    15  group_specific_mean      5  9  -0.76  0.61
    16  group_specific_mean      6  9  -0.16  0.95
    17  group_specific_mean      7  9   1.21  1.12
    18  group_specific_mean      8  9   2.23  1.15
    19  group_specific_mean      9  9   3.97  1.26
    20  group_specific_mean     10  9   5.08  0.77
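A table of this kind can be computed from the simulated data with base R. The sketch below regenerates the data with the same seed as the simulation code and then summarizes by simulation and group; the use of `aggregate` and the resulting column names (`y.n`, `y.mean`, `y.sd`) are choices made here and may differ from whatever produced the table in the slides.

```r
# Regenerate the simulated data (same settings and seed as the simulation study).
J <- 10
n_per_group <- 9
N <- J * n_per_group
group <- rep(1:J, each = n_per_group)
set.seed(1)
df <- rbind(
  data.frame(group = factor(group), simulation = "common_mean",
             y = rnorm(N)),
  data.frame(group = factor(group), simulation = "group_specific_mean",
             y = rnorm(N, group - (J / 2 + .5))))

# Per-group sample size, mean, and standard deviation.
sm <- do.call(data.frame,
              aggregate(y ~ simulation + group, data = df,
                        FUN = function(v) c(n = length(v), mean = mean(v), sd = sd(v))))
sm <- sm[order(sm$simulation, sm$group), ]  # one row per simulation and group
print(sm, digits = 2, row.names = FALSE)
```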

Normal hierarchical models: Sampling on a grid

Consider sampling from an arbitrary unnormalized density $f(\tau) \propto p(\tau \mid y)$ using the following approach:

1. Construct a step-function approximation to this density:
   a. Determine an interval $[L, U]$ such that outside this interval $f(\tau)$ is small.
   b. Set an interval half-width $h$ to generate a grid of $M$ points $(x_1, \ldots, x_M)$ in this interval, i.e. $x_1 = L + h$ and $x_m = x_{m-1} + 2h$ for all $1 < m \le M$.
   c. Evaluate the density on this grid, i.e. $f(x_m)$.
   d. Normalize the interval weights, i.e. $w_m = f(x_m) \big/ \sum_{i=1}^{M} f(x_i)$ (to construct a normalized density, divide each $w_m$ by $2h$).
2. Sample from this approximation:
   a. Sample an interval $m$ with probability $w_m$.
   b. Sample uniformly within this interval, i.e. $\tau \sim \mathrm{Unif}(x_m - h, x_m + h)$.
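The steps above can be sketched in R as follows. The function name `sample_grid`, its arguments, the choice of deriving $h$ from $M$ so the intervals tile $[L, U]$ exactly, and the illustrative target density are all assumptions of this example; `f` is assumed to accept a vector of values.

```r
# Sample n values from a step-function approximation to an unnormalized density f on [L, U].
sample_grid <- function(n, f, L, U, M = 1000) {
  h <- (U - L) / (2 * M)                  # half-width so that M intervals cover [L, U]
  x <- L + h + 2 * h * (seq_len(M) - 1)   # interval midpoints: x_1 = L + h, x_m = x_{m-1} + 2h
  w <- f(x)                               # step 1c: evaluate the density on the grid
  w <- w / sum(w)                         # step 1d: normalized interval weights
  m <- sample.int(M, size = n, replace = TRUE, prob = w)  # step 2a: pick intervals
  runif(n, min = x[m] - h, max = x[m] + h)                # step 2b: uniform within interval
}

# Illustration: a half-Cauchy-shaped target, truncated to [0, 20] (arbitrary choice).
set.seed(544)
f <- function(tau) 1 / (1 + tau^2)
tau_draws <- sample_grid(5000, f, L = 0, U = 20)
summary(tau_draws)
```

To apply this to the hierarchical model, `f` would be replaced by an evaluation of $p(\tau \mid y)$ on the grid; a finer grid (larger `M`) gives a closer approximation at the cost of more density evaluations.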
