Pseudo Bayesian inference for intensity-dependent point processes

Kasper K. Berthelsen, Department of Mathematical Sciences, Aalborg University
Joint work with Mari Myllymäki
Avignon, May 2012

Motivation
We are considering marked point patterns $\{(x_i, m_i)\}$, where $\{x_i\}$ denotes the locations of objects (trees) in a “window” $W$, and $\{m_i\}$ denotes the corresponding marks (stem diameter at breast height, DBH).
[Figure: left, the point pattern in the window; right, Stoyan’s $k_{mm}(r)$ plotted against $r$.]
◮ We want to construct a reasonable model for the marking (and distribution) of points.

Hainich data
Data: Locations of 650 trees marked by DBH in a 118.5 m × 93.75 m region. The trees belong to a mixed broad-leaved forest in Hainich in Western Thuringia (Germany), a so-called selection forest (Plenterwald).
[Figure: left, tree locations in the region; right, marks plotted against estimated intensity.]
◮ Left plot suggests inhomogeneous point distribution.
◮ Right plot suggests mark distribution depends on point intensity.

Intensity dependent marking
We consider a situation where there is a relation between the marks and the intensity of the point pattern. Two examples where this is relevant:
◮ Preferential sampling: One makes more measurements where the measured value (i.e. the mark) is high, e.g. pollution; see [Diggle et al., 2010].
◮ Density-dependence in plant ecology: In areas with relatively many trees the trees tend to be small, and vice versa; see [Myllymäki and Penttinen, 2009].

The model
We consider a model with density
\[
\pi(\{(x_i, m_i)\} \mid \beta) = \frac{1}{c(\beta, \theta_m, \theta_\varphi)} \prod_{i=1}^{n} \beta(x_i)\, \pi(m_i \mid \beta(x_i)) \times \prod_{i<j} \varphi((x_i, m_i), (x_j, m_j); \theta_\varphi) \tag{1}
\]
w.r.t. a Poisson process on $W \times \mathbb{R}_+$. Here $\beta : W \to \mathbb{R}_+$ is the first order term. Conditional on $\beta$ and $\{x_i\}$ the marks are distributed as
\[
m_i \mid x_i, \beta \sim \pi(m_i \mid \beta(x_i), \theta_m),
\]
i.e. the distribution of mark $m_i$ depends on $\beta$ evaluated at the location $x_i$ and on the parameters $\theta_m$. Finally, $\varphi : (W \times M) \times (W \times M) \to [0, 1]$ is the interaction function.

Specifying the interaction function $\varphi$
Specifically we choose
\[
\varphi((x_i, m_i), (x_j, m_j)) =
\begin{cases}
\gamma & \text{if } \|x_i - x_j\| \le R\,(m_i + m_j), \\
1 & \text{otherwise},
\end{cases}
\]
where $R \ge 0$ controls the interaction range and $\gamma \in [0, 1]$ controls the strength of the interaction.
Interpretation:
◮ Circular influence zones, where the diameter of the influence zone centred at $x_i$ is proportional to $m_i$ (DBH).
◮ The interaction parameter $\gamma$ specifies the degree of “penalty” on each pair of overlapping influence zones.
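The interaction term can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `phi` and `log_interaction` are hypothetical, and points are assumed to be coordinate tuples with marks in a parallel list.

```python
import math


def phi(xi, mi, xj, mj, R, gamma):
    """Pairwise interaction: gamma if the two influence zones overlap, else 1."""
    return gamma if math.dist(xi, xj) <= R * (mi + mj) else 1.0


def log_interaction(points, marks, R, gamma):
    """Log of the product of phi over all pairs i < j.

    Every overlapping pair contributes one factor gamma, so the log of the
    product is (number of overlapping pairs) * log(gamma).
    """
    n = len(points)
    overlaps = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) <= R * (marks[i] + marks[j])
    )
    if overlaps == 0:
        return 0.0
    # gamma = 0 forbids any overlap (hard-core case).
    return overlaps * math.log(gamma) if gamma > 0 else -math.inf
```

The double loop is quadratic in the number of points, which is unproblematic for a pattern of a few hundred trees; a spatial index would be a natural refinement for larger patterns.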

Mark distribution
Regarding the mark distribution, we assume
\[
m_i - m_0 \mid \theta, \beta(x_i) \sim \Gamma\!\left(c,\ \frac{1}{c}\left(a + \frac{b}{\sqrt{\beta(x_i)}}\right)\right),
\]
where $m_0 \ge 0$ is the minimum mark size, and $\Gamma(k, \theta)$ denotes the gamma distribution with shape parameter $k$ and scale parameter $\theta$. Hence
\[
E[m_i - m_0 \mid \theta, \beta(x_i)] = a + \frac{b}{\sqrt{\beta(x_i)}}
\quad\text{and}\quad
\frac{\operatorname{Var}[m_i - m_0 \mid \theta, \beta(x_i)]}{\left(E[m_i - m_0 \mid \theta, \beta(x_i)]\right)^2} = \frac{1}{c}.
\]
In the special case where $m_0 = 0$ and $a = 0$ we obtain a situation similar to the location dependent scaling considered by [Hahn et al., 2003].
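As a rough illustration of this mark model, the sketch below converts $(a, b, c)$ and a local intensity value into the shape and scale of the gamma distribution and draws a mark. The function names are hypothetical and any parameter values passed in are made up for illustration, not estimates from the Hainich data.

```python
import math
import random


def mark_shape_scale(beta_x, a, b, c):
    """Shape and scale of the gamma distribution of m - m0 at intensity beta_x."""
    mean = a + b / math.sqrt(beta_x)  # E[m - m0 | beta(x)]
    return c, mean / c                # shape k = c, scale theta = mean / c


def sample_mark(beta_x, a, b, c, m0=0.0, rng=random):
    """Draw one mark m = m0 + Gamma(shape, scale)."""
    shape, scale = mark_shape_scale(beta_x, a, b, c)
    # random.gammavariate takes (shape, scale), matching Gamma(k, theta) above.
    return m0 + rng.gammavariate(shape, scale)
```

Because the mean is shape times scale and the variance is shape times scale squared, the squared coefficient of variation is $1/c$ whatever the local intensity; only the mean mark size changes with $\beta(x_i)$.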

Bayesian inference
We perform Bayesian posterior inference for
◮ $\beta$, the first order term
◮ $a$, $b$, $c$, the parameters of the mark distribution
◮ $R$, $\gamma$, the interaction parameters
Priors
◮ For $a$, $b$, $c$, $R$ and $\gamma$ we assume uniform priors on bounded intervals.
◮ For $\beta$ we take a non-parametric approach.

Prior distribution for $\beta$
As a prior on $\beta$ we use a shot noise style prior
\[
\beta(x) = \lambda \sum_{c \in \mathcal{C}} K(x - c),
\]
where $\lambda > 0$, $\mathcal{C}$ is a Poisson process on $\mathbb{R}^2$ and $K$ is a kernel, i.e. a probability density on $\mathbb{R}^2$. This is the prior used by [Berthelsen and Møller, 2008] (in the 1-dimensional case).
One alternative is a log Gaussian random field. This is the prior considered by H&S (2008) and M&P (2009).

Approximate prior
For the remainder we focus on the shot-noise prior:
\[
\beta(x) = \lambda \sum_{c \in \mathcal{C}} K(x - c).
\]
For simulation purposes we replace the Poisson process $\mathcal{C}$ on $\mathbb{R}^2$ by a Poisson process $\mathcal{C}^+$ on an extended window
\[
W^+ = \{x \in \mathbb{R}^2 : \delta(x, W) \le \Delta\}, \quad \Delta \ge 0,
\]
where
\[
\delta(A, B) = \inf_{x \in A,\ y \in B} \|x - y\|, \quad A, B \subseteq \mathbb{R}^2.
\]
Further, we assume $\mathcal{C}^+$ has intensity $\beta_C$, and that $K$ is the density of a bivariate normal distribution with covariance matrix $\sigma^2 I$.
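A draw from this approximate prior could be simulated roughly as follows. The sketch assumes a rectangular window $W = [0, w] \times [0, h]$ and generates the centres by simulating a homogeneous Poisson process on the bounding box of $W^+$ and keeping the points within distance $\Delta$ of $W$. All function names are hypothetical, and the Poisson count is generated with a simple standard-library routine rather than a dedicated sampler.

```python
import math
import random


def poisson_count(mean, rng):
    """Poisson(mean) draw: count unit-rate exponential arrivals in [0, mean]."""
    n, t = 0, rng.expovariate(1.0)
    while t <= mean:
        n += 1
        t += rng.expovariate(1.0)
    return n


def dist_to_rect(p, w, h):
    """Euclidean distance from point p to the rectangle [0, w] x [0, h]."""
    dx = max(-p[0], 0.0, p[0] - w)
    dy = max(-p[1], 0.0, p[1] - h)
    return math.hypot(dx, dy)


def simulate_centres(w, h, delta, beta_c, rng):
    """Poisson process with intensity beta_c on W+ = {x : dist(x, W) <= delta}."""
    area = (w + 2 * delta) * (h + 2 * delta)  # bounding box of W+
    n = poisson_count(beta_c * area, rng)
    pts = [(rng.uniform(-delta, w + delta), rng.uniform(-delta, h + delta))
           for _ in range(n)]
    # Restricting a Poisson process to a subset gives a Poisson process there.
    return [p for p in pts if dist_to_rect(p, w, h) <= delta]


def beta_at(x, centres, lam, sigma):
    """Shot-noise intensity lam * sum_c K(x - c) with an isotropic Gaussian K."""
    s2 = sigma * sigma
    total = sum(
        math.exp(-((x[0] - c[0]) ** 2 + (x[1] - c[1]) ** 2) / (2 * s2))
        for c in centres
    )
    return lam * total / (2 * math.pi * s2)
```

For example, `simulate_centres(118.5, 93.75, 15.0, 0.01, random.Random(1))` would give one set of centres for a window of the Hainich dimensions; the values of $\Delta$, $\beta_C$, $\lambda$ and $\sigma$ used in such a call are illustrative choices, not values from the talk.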

How to choose $\Delta$
The prior mean of $\beta$ is $E[\beta(x)] = \lambda \beta_C$. When restricting $\mathcal{C}$ to $W^+$ the prior mean is (obviously) reduced. But by how much?
Let $D$ denote the (missed) contribution from kernels centred outside $W^+$, writing $K(x, c)$ for $K(x - c)$:
\[
D = \lambda \int_W \sum_{c \in \mathcal{C} \setminus W^+} K(x, c)\, \mathrm{d}x.
\]
Then the expected value of $D$ is
\[
E[D] = \lambda \beta_C \int_{\mathbb{R}^2 \setminus W^+} \int_W K(x, c)\, \mathrm{d}x\, \mathrm{d}c
\le \lambda \beta_C \int_{\mathbb{R}^2 \setminus W^+} \int_W k(x, c)\, \mathrm{d}x\, \mathrm{d}c,
\]
where $k(x, c) \ge K(x, c)$ for all $(x, c) \in W \times (\mathbb{R}^2 \setminus W^+)$ is chosen to make integration easier.

Choosing $k$
Following “B&M 2008”, the function $k(x, c)$ is chosen so that it is constant on $W$:
\[
k(x, c) = \frac{1}{2\pi\sigma^2} \exp\!\left(-\frac{\delta(c, W)^2}{2\sigma^2}\right).
\]
Since $\|x - c\| \ge \delta(c, W)$ for every $x \in W$, this $k$ dominates the Gaussian kernel $K$ on $W$.
[Figure: illustration of the 1-dimensional case, showing $K(x, c)$ and $k(x, c)$ as functions of $x$ over $W$, with $c$ outside the window extended by $\Delta$ on each side.]
Note: The 1-dimensional case is considered by B&M (2008), where the introduction of the bounding function $k$ is not needed.

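Bound on $E[D]$
\[
E[D] \le \lambda \beta_C \int_{\mathbb{R}^2 \setminus W^+} \int_W k(x, c)\, \mathrm{d}x\, \mathrm{d}c
= \lambda \beta_C |W| \int_\Delta^\infty \left(\frac{2(a + b)}{2\pi\sigma^2} + \frac{r}{\sigma^2}\right) e^{-r^2/(2\sigma^2)}\, \mathrm{d}r.
\]
Here $a$ and $b$ are the side lengths of the rectangular window $W$ (not the mark parameters), and $r$ is the distance from $W$.
[Figure: sketch of the window $W$ with labels $a$, $b$, $r$ and $\Delta$.]
The proportion of the contribution missed:
\[
\frac{E[D]}{\int_W E[\beta(x)]\, \mathrm{d}x} \le \int_\Delta^\infty \left(\frac{2(a + b)}{2\pi\sigma^2} + \frac{r}{\sigma^2}\right) e^{-r^2/(2\sigma^2)}\, \mathrm{d}r.
\]
Finally, $\Delta$ is determined using numerical methods.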
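One possible numerical route to $\Delta$ is sketched below. It reads $a$ and $b$ in the bound as the side lengths of a rectangular window, evaluates the integral in closed form using the Gaussian tail probability $1 - \Phi(\Delta/\sigma)$ and the identity $\int_\Delta^\infty (r/\sigma^2) e^{-r^2/(2\sigma^2)}\,\mathrm{d}r = e^{-\Delta^2/(2\sigma^2)}$, and then bisects on $\Delta$. The function names, the tolerance and the bisection itself are assumptions made for illustration; the talk does not say which numerical method was used.

```python
import math


def missed_proportion_bound(delta, a, b, sigma):
    """Closed form of int_delta^inf (2(a+b)/(2 pi s^2) + r/s^2) exp(-r^2/(2 s^2)) dr."""
    # Gaussian upper tail 1 - Phi(delta / sigma), via the complementary error function.
    tail = 0.5 * math.erfc(delta / (sigma * math.sqrt(2.0)))
    edge_term = 2.0 * (a + b) / (sigma * math.sqrt(2.0 * math.pi)) * tail
    corner_term = math.exp(-delta ** 2 / (2.0 * sigma ** 2))
    return edge_term + corner_term


def choose_delta(a, b, sigma, tol):
    """Smallest delta (up to bisection accuracy) with bound <= tol."""
    lo, hi = 0.0, sigma
    # The bound decreases in delta, so grow hi until it satisfies the tolerance.
    while missed_proportion_bound(hi, a, b, sigma) > tol:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if missed_proportion_bound(mid, a, b, sigma) > tol:
            lo = mid
        else:
            hi = mid
    return hi
```

A call such as `choose_delta(118.5, 93.75, 5.0, 0.01)` would return an extension width for a window of the Hainich dimensions, where the kernel standard deviation 5.0 and the 1% tolerance are made-up values.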

Posterior simulations
We want to explore the posterior distribution, $\pi(\theta, \beta \mid x) \propto \pi(x \mid \theta, \beta)\, \pi(\theta, \beta)$, using MCMC.
For convenience we write the likelihood as
\[
\pi((x, m) \mid \theta, \beta) = c^{-1}(\theta, \beta)\, f(x \mid \theta, \beta),
\]
where $c^{-1}(\theta, \beta)$ is the unknown normalising constant of $f(\cdot \mid \theta, \beta)$.
Using (conventional) Metropolis-Hastings updates involves evaluating the Hastings ratio:
\[
H(\theta, \theta') = \frac{c^{-1}(\theta', \beta')\, f(x; \theta', \beta')\, \pi(\theta', \beta')\, q(\theta', \beta'; \theta, \beta)}{c^{-1}(\theta, \beta)\, f(x; \theta, \beta)\, \pi(\theta, \beta)\, q(\theta, \beta; \theta', \beta')}.
\]
Notice this involves evaluating a ratio of unknown normalising constants.
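To make the difficulty concrete, the sketch below writes the Hastings ratio on the log scale. The function names and argument layout are hypothetical; the point is only that `log_c_new` and `log_c_old` would have to be supplied, which is exactly what is unavailable for this model.

```python
import math


def log_hastings_ratio(log_f_new, log_f_old,
                       log_prior_new, log_prior_old,
                       log_q_reverse, log_q_forward,
                       log_c_new, log_c_old):
    """Log Hastings ratio for a move from (theta, beta) to (theta', beta').

    log_q_reverse is log q(theta', beta'; theta, beta) and
    log_q_forward is log q(theta, beta; theta', beta').
    log_c_new and log_c_old are the log normalising constants, which are
    unknown in practice; they are arguments here only to show where they enter.
    """
    numerator = (log_f_new - log_c_new) + log_prior_new + log_q_reverse
    denominator = (log_f_old - log_c_old) + log_prior_old + log_q_forward
    return numerator - denominator


def accept_probability(log_h):
    """Metropolis-Hastings acceptance probability min(1, H)."""
    return 1.0 if log_h >= 0.0 else math.exp(log_h)
```

The normalising constants cancel only when the proposed and current parameter values give the same $c$, which is not the case for updates that change $\theta$ or $\beta$.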
