SLIDE 1

Hierarchical Modeling

Hierarchical modeling has taken over the landscape in contemporary stochastic modeling. The intent here is to show a range of examples. In the development, we will also show the connections to Gibbs sampling, and why Gibbs sampling and MCMC are ideally suited to fitting these models.

We envision a three-stage specification:
First stage: [data | model, parameters]
Second stage: [model | parameters]
Third stage: [(hyper)parameters]

Hierarchical Modeling – p. 1/13

SLIDE 2

Standard hierarchical linear model

First stage: Y | X, β ∼ N(Xβ, Σ_Y)
Second stage: β | Z, α ∼ N(Zα, Σ_β)
Third stage: α ∼ N(α_0, Σ_α)

This assumes all Σ's are known. If not, use inverse-Gamma or Wishart priors. A standard Gibbs loop does the updating; conjugacy gives closed-form full conditionals throughout.
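The three-stage Gibbs loop above can be sketched numerically. This is a minimal illustration with all Σ's known and diagonal; the data, dimensions, second-stage design Z, and hyperparameter values are illustrative assumptions, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data for the three-stage normal linear model with all variances
# known (diagonal, for simplicity); sizes and values here are illustrative.
n, p, q = 50, 3, 1
X = rng.normal(size=(n, p))
Z = np.ones((p, q))                      # second-stage design (assumed)
beta_true = rng.normal(size=p)
Y = X @ beta_true + rng.normal(size=n)
s2_y, s2_b, s2_a = 1.0, 1.0, 10.0        # stand-ins for Sigma_Y, Sigma_beta, Sigma_alpha

beta, alpha = np.zeros(p), np.zeros(q)
draws = []
for it in range(2000):
    # beta | alpha, Y  ~  Normal (conjugate full conditional)
    V = np.linalg.inv(X.T @ X / s2_y + np.eye(p) / s2_b)
    m = V @ (X.T @ Y / s2_y + Z @ alpha / s2_b)
    beta = rng.multivariate_normal(m, V)
    # alpha | beta  ~  Normal (conjugate full conditional, alpha_0 = 0)
    Va = np.linalg.inv(Z.T @ Z / s2_b + np.eye(q) / s2_a)
    ma = Va @ (Z.T @ beta / s2_b)
    alpha = rng.multivariate_normal(ma, Va)
    if it >= 500:                        # discard burn-in
        draws.append(beta)

beta_hat = np.mean(draws, axis=0)        # posterior mean estimate of beta
```

Each full conditional is Gaussian exactly because every stage is Gaussian, which is the conjugacy claim on the slide.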


SLIDE 3

CIHM

Conditionally independent hierarchical model

∏_i [Y_i | θ_i] ∏_i [θ_i | η] [η]

Exchangeable θ_i. Shrinkage, i.e., borrowing strength, if η is unknown. Includes the hierarchical GLM, i.e., a non-Gaussian first stage.
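A minimal numerical illustration of the shrinkage idea, with normal stages, known variances, and η plugged in via the sample mean (an empirical-Bayes shortcut rather than a full third stage); all values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# CIHM with normal stages:  Y_i | theta_i ~ N(theta_i, s2),
# theta_i | eta ~ N(eta, tau2).  Variances known; eta plugged in
# via the grand mean (empirical-Bayes shortcut) for brevity.
m, s2, tau2 = 30, 1.0, 0.25
eta_true = 5.0
theta = rng.normal(eta_true, np.sqrt(tau2), size=m)   # latent group means
Y = rng.normal(theta, np.sqrt(s2))                    # one observation each

eta_hat = Y.mean()
w = tau2 / (tau2 + s2)                 # weight on the data (here 0.2)
theta_hat = w * Y + (1 - w) * eta_hat  # posterior means shrink toward eta_hat
```

Each θ̂_i lies between its Y_i and the grand mean, and collectively the shrunken estimates typically beat the raw Y_i in mean squared error — the "borrowing strength" on the slide.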


SLIDE 4

Random effects

Random effects are usually assumed to be normally distributed with an associated variance component. Typical linear version:

Y_ij = X_ij^T β + φ_i + ε_ij

β has a Gaussian prior; φ_i iid ∼ N(0, σ²_φ); ε_ij iid ∼ N(0, σ²_ε).

Priors go on the variance components σ²_φ and σ²_ε (with care).

Again, we can have a non-Gaussian first stage.


SLIDE 5

Missing data

Often we have missing data. The Gibbs sampler (MCMC) extends the E-M algorithm to provide full posterior inference rather than an MLE with an asymptotic variance.

Simple example: multivariate normal, Y_i ∼ N(µ, Σ), where some components of some of the Y_i are missing. The usual Gibbs loop: update the parameters given the missing data; update the missing data given the parameters.

Simple example: missing categorical counts with a multinomial model. Some categories are aggregated/collapsed, so counts for the disaggregated categories are missing. Again, the usual Gibbs loop: update the parameters given all counts; update the missing counts given the parameters.
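The multivariate-normal example can be sketched in a few lines. Here Σ is taken as known and µ gets a flat prior so both Gibbs steps are Gaussian; the dimensions, missingness rate, and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Bivariate normal Y_i ~ N(mu, Sigma) with Y_i2 missing for ~30% of cases.
# Sigma known, flat prior on mu, so both Gibbs steps are Gaussian.
n = 200
mu_true = np.array([1.0, 2.0])
Sigma = np.array([[1.0, 0.6], [0.6, 1.0]])
Y = rng.multivariate_normal(mu_true, Sigma, size=n)
miss = rng.random(n) < 0.3
Y[miss, 1] = np.nan                         # second component unobserved

Yc = np.where(np.isnan(Y), 0.0, Y)          # completed-data working copy
mu = np.zeros(2)
draws = []
for it in range(2000):
    # missing Y_i2 | Y_i1, mu : conditional of a bivariate normal
    cmean = mu[1] + Sigma[1, 0] / Sigma[0, 0] * (Yc[miss, 0] - mu[0])
    cvar = Sigma[1, 1] - Sigma[1, 0] ** 2 / Sigma[0, 0]
    Yc[miss, 1] = rng.normal(cmean, np.sqrt(cvar))
    # mu | completed data : flat prior gives N(Ybar, Sigma / n)
    mu = rng.multivariate_normal(Yc.mean(axis=0), Sigma / n)
    if it >= 500:
        draws.append(mu)

mu_hat = np.mean(draws, axis=0)
```

Unlike E-M, which would return only a point estimate of µ, the retained draws give the full posterior, including uncertainty about the imputed values.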


SLIDE 6

Binary data models

The usual binary response model is logit or probit; we illustrate for probit.

Y_i ∼ Bernoulli(p(X_i)) (can be Bi(n_i, p(X_i))), with Φ⁻¹(p(X_i)) = X_i β and a prior on β.

It is awkward to sample β in this form, so introduce latent Z_i ∼ N(X_i β, 1):

P(Y_i = 1) = Φ(X_i β) = 1 − Φ(−X_i β) = P(Z_i ≥ 0)

So the Gibbs loop is: update the Z's given β, y (truncated normal draws); update β given the Z's and y (the usual normal updating). This extends to ordinal categorical data, with multiple unknown cut points.
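A sketch of exactly this two-step loop (the Albert–Chib data-augmentation scheme). The data, the vague N(0, cI) prior, and the naive rejection sampler for the truncated normal are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Probit augmentation:  Z_i ~ N(x_i' beta, 1),  Y_i = 1{Z_i >= 0};
# beta gets a vague N(0, c I) prior.  Data and c are illustrative.
n, p = 300, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.3, 1.0])
Y = (X @ beta_true + rng.normal(size=n) >= 0).astype(int)

def rtrunc(mean, pos):
    # N(mean, 1) truncated to Z >= 0 where pos, Z < 0 otherwise, by naive
    # rejection -- adequate for the moderate means that occur here
    z = rng.normal(mean)
    bad = (z >= 0) != pos
    while bad.any():
        z[bad] = rng.normal(mean[bad])
        bad = (z >= 0) != pos
    return z

c = 100.0
V = np.linalg.inv(X.T @ X + np.eye(p) / c)   # the Z's have unit variance
beta = np.zeros(p)
draws = []
for it in range(1500):
    Z = rtrunc(X @ beta, Y == 1)                       # Z | beta, y
    beta = rng.multivariate_normal(V @ (X.T @ Z), V)   # beta | Z
    if it >= 500:
        draws.append(beta)

beta_hat = np.mean(draws, axis=0)
```

Given the Z's, the update for β is ordinary conjugate normal regression, which is the whole point of the augmentation.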


SLIDE 7

Growth Curves

Typically, individual-level curves are centered around a population-level curve. We need the population-level curve to see average behavior of the process, and the individual-level curves to prescribe individual-level treatment.

Model: if Y_ij is the jth measurement for the ith individual,

Y_ij = g(X_ij, Z_i, β_i) + ε_ij,  ε_ij ∼ N(0, σ²_i)

β_i = β + η_i (or replace β with a regression in the Z_i)


SLIDE 8

Mixture models

Y ∼ Σ_{l=1}^L p_l f_l(Y | θ_l), e.g., a normal mixture.

Also called the classification problem or discriminant analysis. L fixed or unknown?

We observe Y_i, i = 1, 2, ..., n; the label L_i for Y_i is not observed (latent). If L_i = l, then Y_i ∼ f_l(Y | θ_l). So the model is:

∏_i [Y_i | L_i, θ] ∏_i [L_i | α] [α, θ]

Gibbs loop: update θ, α given the L's and the data; update the L's given θ, α and the data (a discrete distribution). Covariates? In the θ_l's? In the p_l's?
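A sketch of this label/parameter loop for a two-component normal mixture. The weight p_1 and the common variance are held fixed so only the two slide steps remain; the component locations, prior, and sample size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-component normal mixture with latent labels L_i.  Weight and variance
# held fixed; component locations, priors, and sizes are illustrative.
n = 400
L_true = rng.random(n) < 0.4
Y = np.where(L_true, rng.normal(2.0, 1.0, n), rng.normal(-1.0, 1.0, n))

p1, s2, tau2 = 0.4, 1.0, 100.0     # known weight/variance, N(0, tau2) on means
mu = np.array([-3.0, 3.0])         # starting values for the component means
keep = []
for it in range(1500):
    # L_i | mu : Bernoulli with posterior odds of the two component densities
    d1 = p1 * np.exp(-0.5 * (Y - mu[1]) ** 2 / s2)
    d0 = (1 - p1) * np.exp(-0.5 * (Y - mu[0]) ** 2 / s2)
    L = rng.random(n) < d1 / (d0 + d1)
    # mu_l | L : normal-normal conjugate update within each component
    for l, mask in enumerate([~L, L]):
        v = 1.0 / (mask.sum() / s2 + 1.0 / tau2)
        mu[l] = rng.normal(v * Y[mask].sum() / s2, np.sqrt(v))
    if it >= 500:
        keep.append(mu.copy())

mu_hat = np.mean(keep, axis=0)     # posterior means of the component locations
```

Given the labels, each component update is an ordinary conjugate normal step; given the parameters, each label is a draw from a two-point discrete distribution, as on the slide.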


SLIDE 9

Errors in variables models

We seek the relationship between, say, Y and X. We observe, say, W, a surrogate for X, and perhaps Z, a surrogate for Y. Modeling W | X gives the measurement-error model; modeling X | W gives the Berkson model.

Model in the first case:

∏_i [Z_i | Y_i, γ] [Y_i | X_i, β] [W_i | X_i, γ] [X_i | α]

Model in the second case:

∏_i [Z_i | Y_i, γ] [Y_i | X_i, β] [X_i | W_i, γ]

Validation data: perhaps some (X, Y) pairs; perhaps some (X, W) pairs.


SLIDE 10

Change point models

Frequently there is interest in a change of regime. We need a notion of a "least" significant change.

Two sampling scenarios: (i) a full set of data, where we try to find, retrospectively, whether changes occurred and when; (ii) sequential data, where we try to identify changes as we collect.

Simple example of the first scenario: f_1(y | θ_1) is the density before the change point, and f_2(y | θ_2) the density after. With data Y_i, i = 1, 2, ..., n, let K be the change-point indicator, i.e., K = k means the change occurs at observation k + 1; k = n means "no change." Then the model is

L(θ_1, θ_2, k; y) = ∏_{i=1}^k f_1(y_i | θ_1) ∏_{i=k+1}^n f_2(y_i | θ_2)

With a prior on θ_1, θ_2, k, we have a full model. Again, the loop: update the θ's given k, y; update k given the θ's and y (a discrete distribution). The θ's can be dependent, and there can be order restrictions on the θ's.
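A sketch of this loop with Poisson segments as a concrete choice of f_1, f_2 (the slide leaves them generic), Gamma priors on the rates, and illustrative values throughout.

```python
import numpy as np

rng = np.random.default_rng(0)

# Change-point Gibbs with Poisson segments:  y_1..y_k ~ Poi(lam_1),
# y_{k+1}..y_n ~ Poi(lam_2), Gamma(a, b) priors on the rates.
# k = n means "no change," as on the slide.  Values illustrative.
n, k_true = 60, 25
y = np.concatenate([rng.poisson(2.0, k_true), rng.poisson(6.0, n - k_true)])

a, b = 1.0, 1.0
cs = np.concatenate([[0], np.cumsum(y)])      # cs[k] = y_1 + ... + y_k
ks = np.arange(1, n + 1)                      # support of k
k = n // 2
lam = np.ones(2)
k_draws = []
for it in range(3000):
    # lam | k : Gamma-Poisson conjugate updates on the two segments
    lam[0] = rng.gamma(a + cs[k], 1.0 / (b + k))
    lam[1] = rng.gamma(a + cs[n] - cs[k], 1.0 / (b + n - k))
    # k | lam : a discrete distribution built from segment log likelihoods
    loglik = (cs[ks] * np.log(lam[0]) - ks * lam[0]
              + (cs[n] - cs[ks]) * np.log(lam[1]) - (n - ks) * lam[1])
    w = np.exp(loglik - loglik.max())
    k = rng.choice(ks, p=w / w.sum())
    if it >= 500:
        k_draws.append(k)

k_hat = np.bincount(k_draws).argmax()         # posterior mode of k
```

The k-update is exactly the "discrete distribution" step on the slide: the full conditional of k is an n-point distribution with weights proportional to the two-segment likelihood.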


SLIDE 11

Concurrent time series

Dependent ARMA time series. Model:

Y_it = X_i^T β_i + Σ_j φ_ij Y_{i,t−j} + Σ_k θ_ik ε_{i,t−k} + ε_it

Exchangeable β_i, φ_i, θ_i. The usual prior on β; constrained priors on the φ's and θ's.

ε_t ∼ N(0, Σ)


SLIDE 12

Dynamic models

Two-stage form: an observational (or data) stage and an unobserved transition stage. Simple example: Y_ti = g(X_ti β_t) + ε_ti with iid ε's, giving conditional independence at the first stage, and

β_t = φ β_{t−1} + η_t

We can also have dynamics in the X_t's. This structure is then called a "hidden Markov model."
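A minimal sketch of this two-stage structure with g the identity and known φ and variances. Single-site Gibbs updating of the states (each β_t given its neighbors and y_t is normal) is one simple choice for illustration; the parameter values, and taking β_0's predecessor as 0, are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dynamic model:  Y_t = beta_t + eps_t,  beta_t = phi beta_{t-1} + eta_t,
# with known phi and variances (illustrative values).
# Single-site Gibbs: beta_t | beta_{t-1}, beta_{t+1}, y_t is normal.
T, phi, s2_y, s2_e = 100, 0.9, 1.0, 0.25
beta_true = np.zeros(T)
for t in range(1, T):
    beta_true[t] = phi * beta_true[t - 1] + rng.normal(scale=np.sqrt(s2_e))
y = beta_true + rng.normal(scale=np.sqrt(s2_y), size=T)

beta = np.zeros(T)
draws = []
for it in range(1000):
    for t in range(T):
        prev = beta[t - 1] if t > 0 else 0.0
        prec = 1.0 / s2_y + 1.0 / s2_e            # data + incoming transition
        num = y[t] / s2_y + phi * prev / s2_e
        if t < T - 1:                             # outgoing transition term
            prec += phi ** 2 / s2_e
            num += phi * beta[t + 1] / s2_e
        beta[t] = rng.normal(num / prec, np.sqrt(1.0 / prec))
    if it >= 300:
        draws.append(beta.copy())

beta_hat = np.mean(draws, axis=0)   # posterior mean of the latent state path
```

In practice one would usually block-update the whole state path (forward-filtering backward-sampling) for better mixing, but the one-at-a-time version shows the conditional structure most directly.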


SLIDE 13

Summary

So, the overall story is the following: there is a rich range of modeling possibilities. We introduce latent variables to facilitate writing the likelihood and prior and fitting the model. These latent variables can be labels, missing data, or other augmentations.

MCMC model fitting is natural; we build Gibbs loops to do the required updating.
