BAYESIAN MODEL SELECTION IN SPATIAL LATTICE MODELS Victor De - PowerPoint PPT Presentation

BAYESIAN MODEL SELECTION IN SPATIAL LATTICE MODELS
Victor De Oliveira
Department of Management Science and Statistics, The University of Texas at San Antonio, San Antonio, TX, USA
victor.deoliveira@utsa.edu
http://faculty.business.utsa.edu/vdeolive
Joint work with J.J. Song
The Fourth Erich L. Lehmann Symposium, May 9–12, 2011

Example 1: Phosphate Data
Raw phosphate concentrations (in mg P/100 g of soil) collected over a 16 by 16 regular lattice during several years in an archaeological region of Greece.
[Figure: phosphate concentration printed at each lattice site; axes s1 (meters) and s2 (meters), with ticks from 0 to 15.]

Example 2: Crime Data
Homicide rates per 100,000 inhabitants for 1980 in the south of the US, with n = 1412 counties.
[Figure: county map, longitude (−105 to −75) against latitude (25 to 40), shaded by homicide-rate class: 0.00−4.80, 4.80−7.85, 7.85−10.92, 10.92−15.03, 15.03−42.34.]

Models for Spatial Lattice Data
• Conditional Autoregressive (CAR) Models: mostly studied and applied in the statistical literature
• Simultaneously Autoregressive (SAR) Models: mostly studied and applied in the econometric/geography literature
All of these require specifying a neighborhood system.

Neighborhood Systems
Sites $\{1, \dots, n\}$ are endowed with a neighborhood system $\{N_i : i = 1, \dots, n\}$, where $N_i$ = neighbors of site $i$.
Examples:
• $N_i = \{ j : \text{site } j \text{ shares a boundary with site } i \}$
• $N_i = \{ j : 0 < d_{ij} < r \}$, with $r > 0$ and $d_{ij}$ the distance between sites $i$ and $j$

First and second order neighborhood systems
[Figure: two lattice diagrams, each with a site marked X.]
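As a rough illustration of the distance-based definition $N_i = \{ j : 0 < d_{ij} < r \}$, the sketch below builds a 0/1 neighborhood matrix for a regular lattice. It assumes unit spacing between sites and Euclidean distance, and it takes "first order" to mean the 4 nearest sites and "second order" to also include the 4 diagonal sites; the slides do not spell out these conventions, and the function name `lattice_adjacency` is hypothetical.

```python
import numpy as np

def lattice_adjacency(nrow, ncol, r):
    """0/1 neighborhood matrix for an nrow x ncol lattice with unit spacing.

    Entry (i, j) is 1 iff 0 < d_ij < r, with d_ij the Euclidean distance
    between sites i and j (sites are numbered row by row).
    """
    idx = np.arange(nrow * ncol)
    coords = np.column_stack(np.divmod(idx, ncol)).astype(float)  # (row, col)
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    return ((d > 0) & (d < r)).astype(int)

# Assumed conventions: r just above 1 keeps the 4 nearest sites,
# r just above sqrt(2) also keeps the 4 diagonal sites.
A_first = lattice_adjacency(16, 16, r=1.1)
A_second = lattice_adjacency(16, 16, r=1.5)
```

Boundary sites have fewer neighbors than interior sites, so the row sums of the matrix vary across the lattice.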

Goal
Model selection for spatial lattice data using a default Bayesian approach, where the competing models:
• Have the same mean structure
• Have different covariance structures

CAR MODELS
Conditional Specification: For $i = 1, \dots, n$
$$(Y_i \mid Y_{(i)}) \sim N\Big( x_i'\beta + \sum_{j=1}^{n} c_{ij}\,(Y_j - x_j'\beta),\ \tau_i^2 \Big)$$
• $Y_{(i)} = \{ Y_j,\ j \neq i \}$
• $x_j' = (x_{j1}, \dots, x_{jp})$
• $\beta \in \mathbb{R}^p$, $\tau_i > 0$
• $c_{ij} \geq 0$, and $c_{ij} > 0$ iff $i \sim j$

Let $M = \mathrm{diag}(\tau_1^2, \dots, \tau_n^2)$ and $C = (c_{ij})$ satisfy
• $M^{-1}C$ is symmetric, so $c_{ij}\tau_j^2 = c_{ji}\tau_i^2$
• $M^{-1}(I_n - C)$ is positive definite
Joint Specification:
$$Y \sim N_n\big( X\beta,\ (I_n - C)^{-1} M \big), \quad \text{where } X = (x_1, \dots, x_n)'$$

Parameterization
• $M = \sigma^2 G$, with $\sigma^2 > 0$ unknown and $G$ diagonal (known)
• $C = \phi W$, with $\phi$ the ‘spatial parameter’ and $W = (w_{ij})$ a known nonnegative “weight” matrix (not necessarily symmetric), with $w_{ij} > 0$ iff $i \sim j$
Let $A = (a_{ij})$ be the neighborhood matrix: $a_{ij} = 1$ if $i \sim j$, and $a_{ij} = 0$ otherwise.

Classes of CAR Models
• Homogeneous CAR (HCAR): $G = I_n$, $W = A$
• Weighted CAR (WCAR) (Besag et al. 1991): $G = \mathrm{diag}(|N_1|^{-1}, \dots, |N_n|^{-1})$, $W = GA$, with $|N_i| = \sum_{j=1}^{n} a_{ij}$
• Autocorrelation CAR (ACAR) (Cressie & Chang, 1989): $G = \mathrm{diag}(|N_1|^{-1}, \dots, |N_n|^{-1})$, $W = G^{1/2} A\, G^{-1/2}$
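The three parameterizations can be written down directly from a neighborhood matrix. The following is a minimal sketch, assuming every site has at least one neighbor (so that $|N_i|^{-1}$ is defined); the function name `car_matrices` and the 4-site path graph used as an example are illustrative and not from the slides.

```python
import numpy as np

def car_matrices(A, kind):
    """Return (G, W) for the HCAR, WCAR or ACAR parameterization.

    A is a symmetric 0/1 neighborhood matrix with no isolated sites.
    """
    A = np.asarray(A, dtype=float)
    if kind == "HCAR":
        return np.eye(A.shape[0]), A.copy()
    g = 1.0 / A.sum(axis=1)            # diagonal of G: |N_i|^{-1}
    G = np.diag(g)
    if kind == "WCAR":
        return G, G @ A                # W = G A (rows of W sum to 1)
    if kind == "ACAR":
        root = np.sqrt(g)
        return G, (root[:, None] * A) / root[None, :]   # W = G^{1/2} A G^{-1/2}
    raise ValueError("kind must be 'HCAR', 'WCAR' or 'ACAR'")

# Hypothetical example: 4 sites on a line, each linked to its immediate neighbors.
A_path = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]])

# M^{-1} C symmetric is equivalent to G^{-1} W symmetric; check it for each class.
symmetric = {}
for kind in ("HCAR", "WCAR", "ACAR"):
    G, W = car_matrices(A_path, kind)
    GinvW = np.linalg.inv(G) @ W
    symmetric[kind] = bool(np.allclose(GinvW, GinvW.T))
```

In all three cases $W$ itself need not be symmetric; only $G^{-1}W$ has to be.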

Facts
Assume the above conditions hold and $G^{-1}W$ is symmetric. Then:
(a) $G^{-1/2} W G^{1/2}$ is symmetric
(b) $G^{-1/2} W G^{1/2}$ and $W$ have the same nonzero eigenvalues, and all are real
(c) $M$ and $C$ determine a CAR model iff $\sigma^2 > 0$ and $\phi \in (\lambda_n^{-1}, \lambda_1^{-1})$, with $\lambda_1 \geq \dots \geq \lambda_n$ the ordered eigenvalues of $G^{-1/2} W G^{1/2}$
Parameter space: $\Omega = \mathbb{R}^p \times (0, \infty) \times (\lambda_n^{-1}, \lambda_1^{-1})$
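Fact (c) translates into a short computation: form $G^{-1/2} W G^{1/2}$, take its extreme eigenvalues, and invert them. The sketch below does this for a WCAR model on a hypothetical 4-site path graph; `phi_interval` is an illustrative name, and the code assumes $W$ has a zero diagonal and at least one positive entry, so that $\lambda_n < 0 < \lambda_1$.

```python
import numpy as np

def phi_interval(G, W):
    """Return (1/lambda_n, 1/lambda_1) from the eigenvalues of G^{-1/2} W G^{1/2}."""
    root = np.sqrt(np.diag(G))
    S = W * root[None, :] / root[:, None]    # entry (i, j): g_i^{-1/2} w_ij g_j^{1/2}
    S = (S + S.T) / 2.0                      # remove rounding asymmetry
    lam = np.linalg.eigvalsh(S)              # ascending order
    return 1.0 / lam[0], 1.0 / lam[-1]

# Hypothetical example: WCAR on 4 sites arranged on a line.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
G = np.diag(1.0 / A.sum(axis=1))
W = G @ A
lo, hi = phi_interval(G, W)

def min_eig_precision(phi):
    """Smallest eigenvalue of G^{-1}(I - phi W), which must be positive."""
    P = np.linalg.inv(G) @ (np.eye(4) - phi * W)
    return np.linalg.eigvalsh((P + P.T) / 2.0)[0]
```

Evaluating `min_eig_precision` just inside and just outside `(lo, hi)` is a quick way to confirm that the interval marks where positive definiteness is lost.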

SAR MODELS
Conditional Specification: For $i = 1, \dots, n$
$$Y_i = x_i'\beta + \sum_{j=1}^{n} b_{ij}\,(Y_j - x_j'\beta) + \epsilon_i$$
• $\epsilon_i \sim N(0, \xi_i^2)$, independent
• $\beta \in \mathbb{R}^p$, $\xi_i > 0$
• $b_{ij} \geq 0$, and $b_{ij} > 0$ iff $i \sim j$
Let $M = \mathrm{diag}(\xi_1^2, \dots, \xi_n^2)$ and $B = (b_{ij})$ satisfy that $I_n - B$ is nonsingular. Then
Joint Specification:
$$Y \sim N_n\big( X\beta,\ (I_n - B)^{-1} M (I_n - B')^{-1} \big)$$

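Particular Model:
• $M = \sigma^2 I_n$
• $B = \phi A$
so
$$Y \sim N_n\big( X\beta,\ \sigma^2 ((I_n - \phi A)^2)^{-1} \big)$$
Parameter space: $\Omega = \mathbb{R}^p \times (0, \infty) \times (\lambda_n^{-1}, \lambda_1^{-1})$, with $\lambda_1 \geq \dots \geq \lambda_n$ the ordered eigenvalues of $A$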
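To see how the particular model follows from the joint specification, the sketch below builds the SAR covariance both ways and compares them. It assumes a symmetric $A$ (here a hypothetical 4-site path graph) and an arbitrary value of $\phi$ inside the admissible interval; `sar_covariance` is an illustrative name.

```python
import numpy as np

def sar_covariance(A, phi, sigma2):
    """Covariance (I - B)^{-1} M (I - B')^{-1} with B = phi * A and M = sigma2 * I."""
    n = A.shape[0]
    IB_inv = np.linalg.inv(np.eye(n) - phi * A)
    M = sigma2 * np.eye(n)
    return IB_inv @ M @ IB_inv.T

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
phi, sigma2 = 0.3, 2.0      # illustrative values; 0.3 lies inside (1/lambda_n, 1/lambda_1)

cov_general = sar_covariance(A, phi, sigma2)

# Closed form for this particular model: sigma^2 ((I - phi A)^2)^{-1}
D = np.eye(4) - phi * A
cov_particular = sigma2 * np.linalg.inv(D @ D)
```

The two matrices agree because $A$ is symmetric, so $(I_n - \phi A)' = I_n - \phi A$.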

MODEL SELECTION
Let $M_1, M_2, \dots, M_k$ be the candidate models ($k \geq 2$).
$M_j$ is either HCAR, WCAR, ACAR or SAR, parameterized by $\eta_j = (\beta, \sigma_j^2, \phi_j) \in \Omega_j$, with covariance depending on $G_j$ and $A_j$.
$\phi_j \in (1/\lambda_n^{(j)},\ 1/\lambda_1^{(j)})$, with $\lambda_1^{(j)} \geq \lambda_2^{(j)} \geq \dots \geq \lambda_n^{(j)}$ the eigenvalues of:
• $A_j$ in case of HCAR, ACAR and SAR
• $G_j^{1/2} A_j G_j^{1/2}$ in case of WCAR
The approach proposed here assumes all models have the same mean structure.

Likelihood for $M_j$
$$L_j(\eta_j; y) = (2\pi\sigma_j^2)^{-n/2}\, |\Sigma_{\phi_j}^{-1}|^{1/2} \exp\Big\{ -\frac{1}{2\sigma_j^2} (y - X\beta)' \Sigma_{\phi_j}^{-1} (y - X\beta) \Big\}$$
where
$$\Sigma_{\phi_j}^{-1} = \begin{cases} I_n - \phi_j A_j & \text{for HCAR models} \\ G_j^{-1} - \phi_j A_j & \text{for WCAR models} \\ G_j^{-1} - \phi_j G_j^{-1/2} A_j G_j^{-1/2} & \text{for ACAR models} \\ (I_n - \phi_j A_j)^2 & \text{for SAR models} \end{cases}$$
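A minimal sketch of this likelihood on the log scale is given below. It assumes $G_j = \mathrm{diag}(|N_i|^{-1})$ for the WCAR and ACAR cases, as in the class definitions, and a symmetric neighborhood matrix with no isolated sites; the function names `precision` and `log_likelihood` and the toy data are illustrative only.

```python
import numpy as np

def precision(kind, phi, A):
    """Sigma_phi^{-1} for the HCAR, WCAR, ACAR and SAR models."""
    A = np.asarray(A, dtype=float)
    I = np.eye(A.shape[0])
    if kind == "HCAR":
        return I - phi * A
    if kind == "SAR":
        D = I - phi * A
        return D @ D
    n_nb = A.sum(axis=1)                     # |N_i|, the diagonal of G^{-1}
    if kind == "WCAR":
        return np.diag(n_nb) - phi * A
    if kind == "ACAR":
        s = np.sqrt(n_nb)                    # diagonal of G^{-1/2}
        return np.diag(n_nb) - phi * (s[:, None] * A * s[None, :])
    raise ValueError("unknown model kind")

def log_likelihood(beta, sigma2, phi, y, X, A, kind):
    """log L_j(eta_j; y), keeping every constant."""
    n = y.size
    P = precision(kind, phi, A)
    resid = y - X @ beta
    logdet = np.linalg.slogdet(P)[1]
    return (-0.5 * n * np.log(2 * np.pi * sigma2)
            + 0.5 * logdet
            - 0.5 * (resid @ P @ resid) / sigma2)

# Toy inputs (not from the slides): 4 sites on a line, intercept-only mean.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
y = np.array([1.0, 2.0, 3.0, 4.0])
X = np.ones((4, 1))
ll_indep = log_likelihood(np.array([2.5]), 1.0, 0.0, y, X, A, "HCAR")
```

With $\phi = 0$ the HCAR and SAR precisions reduce to the identity, so the value coincides with the log density of independent normal observations.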

Prior for $M_j$
$$\pi(\eta_j \mid M_j) \propto \frac{\pi(\phi_j \mid M_j)}{\sigma_j^2}\, 1_{\Omega_j}(\eta_j)$$
Two options for $\pi(\phi_j \mid M_j)$:
• Uniform:
$$\pi^{U}(\phi_j \mid M_j) = 1_{(1/\lambda_n^{(j)},\ 1/\lambda_1^{(j)})}(\phi_j)$$
• Independence Jeffreys:
$$\pi^{J1}(\phi_j \mid M_j) = \Bigg[ \sum_{i=1}^{n} \Big( \frac{\lambda_i^{(j)}}{1 - \phi_j \lambda_i^{(j)}} \Big)^2 - \frac{1}{n} \Big( \sum_{i=1}^{n} \frac{\lambda_i^{(j)}}{1 - \phi_j \lambda_i^{(j)}} \Big)^2 \Bigg]^{1/2} 1_{(1/\lambda_n^{(j)},\ 1/\lambda_1^{(j)})}(\phi_j)$$
(De Oliveira & Song, 2008; De Oliveira, 2011)

[Figure (a): prior densities $\pi(\phi)$ against $\phi$ (from −0.2 to 0.2) for the independence Jeffreys, Jeffreys-rule and uniform priors.]

Remarks
• Bayes factors and posterior model probabilities are, in general, undetermined when improper priors are used
• An important exception occurs when the competing models have the same invariance structure, up to individual model parameters that have proper priors (Berger et al., 1998)
• CAR and SAR models fit this exception when all the competing models have the same mean structure and $\pi(\phi_j \mid M_j)$ is proper

Fact
As $\phi_j \to 1/\lambda_i^{(j)}$, $i = 1$ or $n$,
$$\pi^{J1}(\phi_j \mid M_j) = O\big( (1 - \phi_j \lambda_i^{(j)})^{-1} \big)$$
so $\pi^{J1}(\phi_j \mid M_j)$ is not integrable (De Oliveira & Song, 2008). Instead we use $(\pi^{J1}(\phi_j \mid M_j))^r$, with $r < 1$, which is proper and has the same “shape”.
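A short check of why the power $r < 1$ restores integrability, written here only for the right endpoint and under the assumption that the stated order bound is the only source of divergence. Near $\phi_j = 1/\lambda_1^{(j)}$ the powered prior is $O\big((1 - \phi_j \lambda_1^{(j)})^{-r}\big)$, and for any $\phi_0$ in the interval (a symbol introduced here for illustration), the substitution $u = 1 - \phi_j \lambda_1^{(j)}$ gives

$$\int_{\phi_0}^{1/\lambda_1^{(j)}} \big(1 - \phi_j \lambda_1^{(j)}\big)^{-r}\, d\phi_j = \frac{1}{\lambda_1^{(j)}} \int_0^{1 - \phi_0 \lambda_1^{(j)}} u^{-r}\, du = \frac{\big(1 - \phi_0 \lambda_1^{(j)}\big)^{1-r}}{(1 - r)\, \lambda_1^{(j)}} < \infty \quad \text{for } r < 1.$$

For $r = 1$ the inner integral is $\int_0 u^{-1}\,du$, which diverges logarithmically. The same argument applies at the left endpoint $1/\lambda_n^{(j)}$.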

For $j = 1, \dots, k$:
$$m(y \mid M_j) = K c_j \int_{1/\lambda_n^{(j)}}^{1/\lambda_1^{(j)}} h(\phi_j, M_j, y)\, d\phi_j$$
where
$$h(\phi_j, M_j, y) = |\Sigma_{\phi_j}^{-1}|^{1/2}\, |X' \Sigma_{\phi_j}^{-1} X|^{-1/2}\, (S_{\phi_j}^2)^{-(n-p)/2}\, \pi(\phi_j \mid M_j)$$
$$S_{\phi_j}^2 = (y - X\hat\beta_{\phi_j})' \Sigma_{\phi_j}^{-1} (y - X\hat\beta_{\phi_j})$$
$$\hat\beta_{\phi_j} = (X' \Sigma_{\phi_j}^{-1} X)^{-1} X' \Sigma_{\phi_j}^{-1} y$$
$$K = \frac{\Gamma\big(\frac{n-p}{2}\big)}{\pi^{\frac{n-p}{2}}}, \qquad c_j = \Bigg( \int_{1/\lambda_n^{(j)}}^{1/\lambda_1^{(j)}} \pi(\phi_j \mid M_j)\, d\phi_j \Bigg)^{-1}$$
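The integrand is easiest to handle on the log scale. The sketch below evaluates $\log h(\phi_j, M_j, y)$ and $\log K$ as defined above; the precision matrix and the log prior are passed in as functions so that any of the four models can be plugged in. The names `log_h` and `log_K` and the toy inputs are illustrative and not taken from the slides.

```python
import math
import numpy as np

def log_h(phi, y, X, precision, log_prior):
    """log h(phi, M_j, y); precision(phi) returns Sigma_phi^{-1}."""
    n, p = X.shape
    P = precision(phi)
    XtP = X.T @ P
    XtPX = XtP @ X
    beta_hat = np.linalg.solve(XtPX, XtP @ y)     # generalized least squares
    resid = y - X @ beta_hat
    S2 = resid @ P @ resid
    logdet_P = np.linalg.slogdet(P)[1]
    logdet_XtPX = np.linalg.slogdet(XtPX)[1]
    return (0.5 * logdet_P - 0.5 * logdet_XtPX
            - 0.5 * (n - p) * math.log(S2) + log_prior(phi))

def log_K(n, p):
    """log of K = Gamma((n - p) / 2) / pi^{(n - p) / 2}."""
    return math.lgamma((n - p) / 2) - 0.5 * (n - p) * math.log(math.pi)

# Toy inputs: HCAR on 4 sites in a line, intercept-only mean, uniform prior on phi.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
y = np.array([1.0, 2.0, 3.0, 4.0])
X = np.ones((4, 1))
value_at_zero = log_h(0.0, y, X, lambda phi: np.eye(4) - phi * A, lambda phi: 0.0)
```

Working with logarithms of the determinants and of $S_{\phi_j}^2$ avoids underflow when $n$ is large, which matters for the integration step.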

Note
• For posterior model probabilities to be well defined and calibrated, the proportionality constants in the likelihoods and priors of all competing models should be retained
• Computation of $m(y \mid M_j)$ involves one-dimensional integration over a bounded interval

Computation
• Computation of $\hat c_j$ is straightforward: numerical quadrature or Monte Carlo
$$\hat c_j = \Bigg[ \Big( \frac{1}{\lambda_1^{(j)}} - \frac{1}{\lambda_n^{(j)}} \Big) \frac{1}{m} \sum_{l=1}^{m} \big( \pi^{J1}(\phi_j^{(l)} \mid M_j) \big)^{1/2} \Bigg]^{-1}$$
with $\phi_j^{(1)}, \dots, \phi_j^{(m)} \overset{iid}{\sim} \mathrm{unif}(1/\lambda_n^{(j)},\ 1/\lambda_1^{(j)})$
• Computation of $m(y \mid M_j)$ requires more care: $h(\phi_j, M_j, y)$ is highly peaked and concentrated near the right boundary for moderate or large sample sizes. Hence $h$ is almost constant and very close to zero over most of the integration region, and common numerical quadrature or Monte Carlo estimates are often zero.
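The two computations can be sketched as follows. The first function is a direct transcription of the Monte Carlo estimate of $c_j$ with power $1/2$. The second is one generic way to keep a peaked integrand from underflowing to zero, namely a trapezoid rule applied after subtracting the maximum of $\log h$; it is an assumption of this sketch and not necessarily the method used by the authors. All function names and the test integrand are illustrative.

```python
import numpy as np

def jeffreys_indep(phis, lam):
    """Unnormalized independence Jeffreys prior at each value in phis."""
    t = lam[None, :] / (1.0 - phis[:, None] * lam[None, :])
    inside = (t ** 2).sum(axis=1) - t.sum(axis=1) ** 2 / lam.size
    return np.sqrt(np.maximum(inside, 0.0))

def c_hat(lam, m=20000, r=0.5, seed=0):
    """Monte Carlo estimate of c_j for the prior (pi^{J1})^r with uniform draws."""
    rng = np.random.default_rng(seed)
    lo, hi = 1.0 / lam.min(), 1.0 / lam.max()
    phis = rng.uniform(lo, hi, size=m)
    return 1.0 / ((hi - lo) * np.mean(jeffreys_indep(phis, lam) ** r))

def log_integral(log_f, grid):
    """log of the trapezoid-rule integral of exp(log_f) over grid, rescaled by the maximum."""
    shift = log_f.max()
    f = np.exp(log_f - shift)
    return shift + np.log(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(grid)))

# Eigenvalues of the neighborhood matrix of a hypothetical 4-site path graph.
lam = 2.0 * np.cos(np.arange(1, 5) * np.pi / 5.0)
c_est = c_hat(lam)

# Illustrative peaked integrand whose values underflow to zero when exponentiated directly.
grid = np.linspace(-0.3, 0.3, 60001)
log_f = -0.5 * ((grid - 0.2) / 1e-3) ** 2 - 800.0
naive_sum = np.exp(log_f).sum()
log_area = log_integral(log_f, grid)
```

In this toy case direct exponentiation returns zero everywhere, while the rescaled version recovers the logarithm of the integral. In practice the grid must be fine enough near the peak, and a grid concentrated near the right boundary may be preferable.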

[Figure: panels showing the posterior density $\pi(\phi \mid y)$ against $\phi$, each over $\phi$ from −0.3 to 0.3.]
