Bayesian cluster detection via adjacency modelling Craig Anderson - PowerPoint PPT Presentation

Bayesian cluster detection via adjacency modelling Craig Anderson University of Technology Sydney Bayes on the Beach 2015

Acknowledgements Co-authors Dr Duncan Lee (University of Glasgow). Dr Nema Dean (University of Glasgow). Funding Carnegie Trust for the Universities of Scotland. ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS). 2/26

Motivation Want to model respiratory admissions in Glasgow, Scotland. Glasgow is a city with many health inequalities. Need a model which accounts for these inequalities. Standard spatial modelling techniques are unsuitable. 3/26

Glasgow case study 4/26

The Glasgow effect Glasgow has the lowest life expectancy in the UK (73 for men, 78.5 for women). One in four children will not live beyond 65. Epidemiologists call this the ‘Glasgow effect’. Huge health inequalities within the city. Life expectancy ranges from 59 in Parkhead to 80 in Jordanhill & Kelvinside. 5/26

Why do we need spatial modelling? Disease risk often varies across a geographical region. Nearby areas tend to have more in common than those further apart. Model structure must account for this. Identifying high-risk areas is the first step towards addressing health issues. 6/26

Respiratory data Case study of respiratory hospital admissions in Greater Glasgow & Clyde Health Board. Region divided into 271 non-overlapping ‘Intermediate Geographies’ (IGs). Each IG has roughly the same population. Number of admissions in each IG is recorded ($Y_i$). Also compute expected number of admissions ($E_i$). 7/26

SIR ($Y_i / E_i$) for Glasgow respiratory admissions [Figure: colour scale from 0.2 to 2.0] 8/26
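As a minimal sketch of the quantity shown on this slide, the SIR for each area is the observed count divided by the expected count. The function name and the three-area counts are hypothetical and are not the Glasgow data.

```python
def standardised_incidence_ratios(observed, expected):
    """Return SIR_i = Y_i / E_i for every area i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [y / e for y, e in zip(observed, expected)]


# Hypothetical counts for three areas (illustration only).
sir = standardised_incidence_ratios([30, 12, 45], [25.0, 20.0, 30.0])
```

An SIR above 1 means more admissions were observed than expected; below 1 means fewer.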

Modelling risk Disease risk commonly modelled with a Poisson GLM. Random effect included to account for spatial variation.
$$Y_i \mid E_i, R_i \sim \text{Poisson}(E_i R_i), \quad i = 1, \ldots, n$$
$$\ln(R_i) = x_i^T \beta + \phi_i$$
$x_i^T \beta$ is covariate information. $\phi_i$ is the random effect term for area $i$. 9/26
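A minimal sketch of the log-likelihood implied by this Poisson model, assuming covariates are passed as one list per area. The function name and argument layout are illustrative choices, not taken from the talk.

```python
import math


def poisson_log_likelihood(y, e, x, beta, phi):
    """Log-likelihood of Y_i ~ Poisson(E_i * R_i), ln(R_i) = x_i^T beta + phi_i."""
    total = 0.0
    for y_i, e_i, x_i, phi_i in zip(y, e, x, phi):
        # Linear predictor on the log-risk scale.
        log_risk = sum(b * v for b, v in zip(beta, x_i)) + phi_i
        # Poisson mean is the expected count scaled by the risk.
        mu = e_i * math.exp(log_risk)
        # log of mu^y * exp(-mu) / y!
        total += y_i * math.log(mu) - mu - math.lgamma(y_i + 1)
    return total
```

With `beta` and `phi` all zero the risk is 1 in every area, so the mean count equals the expected count $E_i$.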

Conditional autoregressive model Simplest CAR model is the intrinsic model (Besag et al, 1991).
$$\phi_i \mid \phi_{-i} \sim N\left( \frac{\sum_{j=1}^n w_{ij}\phi_j}{\sum_{j=1}^n w_{ij}}, \frac{\tau^2}{\sum_{j=1}^n w_{ij}} \right)$$
$\phi_{-i}$ is a vector of all random effects except $\phi_i$. $w_{ij} = 1$ if $i$ and $j$ are neighbours, 0 otherwise. $\tau^2$ is a conditional variance term. 10/26
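The conditional distribution on this slide can be sketched as a function that returns its mean and variance for one area. The function name is hypothetical, and the matrix `W` is assumed to be a 0/1 list of lists with a zero diagonal.

```python
def icar_conditional(i, phi, W, tau2):
    """Mean and variance of phi_i given all other random effects (intrinsic CAR)."""
    n_neighbours = sum(W[i])
    if n_neighbours == 0:
        # The intrinsic model is undefined for an area with no neighbours.
        raise ValueError("area %d has no neighbours" % i)
    mean = sum(w * p for w, p in zip(W[i], phi)) / n_neighbours
    variance = tau2 / n_neighbours
    return mean, variance


# Three areas in a row: 0 - 1 - 2 (hypothetical layout).
W_line = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
mean_1, var_1 = icar_conditional(1, [1.0, 0.0, 3.0], W_line, tau2=0.5)
```

The mean is the average of the neighbouring random effects, and the variance shrinks as the number of neighbours grows.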

Fitted model for Glasgow respiratory admissions [Figure: colour scale from 0.2 to 2.0] 11/26

Drawbacks of standard CAR models Assumes constant spatial smoothness across the study region. In reality, the smoothness of the risk surface varies across the region. Extreme values are smoothed towards the mean, contrary to the aim of identifying high-risk areas. Prefer a method which allows more flexible smoothing. 12/26

Alternative smoothing approach One approach is to introduce ‘boundaries’ in the risk surface. Areas separated by boundaries are not smoothed. We want to identify ‘closed’ boundaries which fully enclose a group of areal units. This approach involves grouping together similar neighbouring areas, i.e. clustering. Allows identification of clusters of high (or low) disease risk. 13/26

Agglomerative Hierarchical Clustering
1. Initially consider every object to be a ‘singleton’ cluster of size one.
2. Evaluate a dissimilarity measure for each possible pair.
3. Merge together the two most similar clusters.
4. Return to step 2 and repeat until all clusters are merged.
14/26

Spatial Agglomerative Hierarchical Clustering
1. Initially consider every object to be a ‘singleton’ cluster of size one.
2. Evaluate a dissimilarity measure for each possible pair.
3. Merge together the two most similar neighbouring clusters.
4. Return to step 2 and repeat until all clusters are merged.
15/26
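The four steps above can be sketched as follows. The slides do not name the dissimilarity measure, so this sketch assumes the absolute difference between cluster means; the function names and the four-area example are also hypothetical.

```python
def cluster_mean(cluster, values):
    """Average of the area values in one cluster."""
    return sum(values[i] for i in cluster) / len(cluster)


def spatial_agglomerative(values, W):
    """Return every partition visited, from n singleton clusters down to one."""
    clusters = [{i} for i in range(len(values))]           # step 1
    history = [[set(c) for c in clusters]]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):                      # step 2
            for b in range(a + 1, len(clusters)):
                # Spatial constraint: the clusters must share a border.
                if not any(W[i][j] for i in clusters[a] for j in clusters[b]):
                    continue
                d = abs(cluster_mean(clusters[a], values)
                        - cluster_mean(clusters[b], values))
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best is None:
            break  # no neighbouring clusters remain (disconnected map)
        _, a, b = best
        merged = clusters[a] | clusters[b]                  # step 3
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        history.append([set(c) for c in clusters])          # step 4: repeat
    return history


# Four areas in a row: 0 - 1 - 2 - 3 (hypothetical values).
W_line = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
history = spatial_agglomerative([1.0, 1.1, 3.0, 1.2], W_line)
```

In this example area 3 is closest in value to areas 0 and 1 but does not border them, so it merges with its neighbour, area 2. Dropping the adjacency check gives the non-spatial version of the algorithm.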

Application to Glasgow SIR data [Figure: four panels titled 5 Clusters, 10 Clusters, 20 Clusters and 30 Clusters; colour scale from 0.2 to 2.0] 16/26

Selecting the best cluster structure The algorithm produces n possible cluster structures. Need to find a way to select the most suitable structure. Subjective methods could be used in some examples, but are not ideal in general. Need to find an objective method to select a structure. 17/26

Incorporating into a model Need an approach which incorporates the choice of cluster structure into the model. Our cluster structures have a natural ordering ($C_1, \ldots, C_n$). Use this ordering to include the number of clusters as a model parameter. Induce clustering via correlation structure of random effects. 18/26

Altering the correlation structure Change the neighbourhood matrix to account for clusters. $w_{ij} = 1$ if $i$, $j$ are neighbours AND lie in the same cluster, $w_{ij} = 0$ otherwise. But this could produce singleton clusters with no neighbours. Use localised CAR prior proposed by Lee et al (2014). Includes global random effect $\phi_*$. 19/26
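A minimal sketch of the altered neighbourhood matrix, assuming cluster membership is given as one label per area. The function name and the example labels are hypothetical.

```python
def cluster_adjacency(W, labels):
    """Keep a neighbour link only when both areas lie in the same cluster."""
    n = len(W)
    return [[1 if W[i][j] and labels[i] == labels[j] else 0 for j in range(n)]
            for i in range(n)]


# Four areas in a row: 0 - 1 - 2 - 3, with areas 2 and 3 as singleton clusters.
W_line = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
W_cluster = cluster_adjacency(W_line, labels=[0, 0, 1, 2])
isolated = [i for i, row in enumerate(W_cluster) if sum(row) == 0]
```

Areas 2 and 3 end up with no neighbours at all, which is the singleton problem noted on the slide.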

Lee et al (2014) Extend neighbourhood matrix as follows:
$$\tilde{W} = \begin{pmatrix} W & w_* \\ w_*^T & 0 \end{pmatrix}$$
$w_* = (w_{1*}, \ldots, w_{n*})$. $w_{i*} = 1$ if area $i$ has at least one neighbour in a different cluster. $w_{i*} = 0$ otherwise. 20/26
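The extension can be sketched as below. One assumption here is that the upper-left block is the cluster-restricted matrix (links only within a cluster), which is how the slides appear to use it; the function name and example labels are hypothetical.

```python
def extend_adjacency(W, labels):
    """Build the (n+1) x (n+1) matrix [[W_hat, w_star], [w_star^T, 0]]."""
    n = len(W)
    # Assumption: upper-left block keeps only within-cluster neighbour links.
    W_hat = [[1 if W[i][j] and labels[i] == labels[j] else 0 for j in range(n)]
             for i in range(n)]
    # w_star[i] = 1 if area i has at least one neighbour in a different cluster.
    w_star = [1 if any(W[i][j] and labels[i] != labels[j] for j in range(n)) else 0
              for i in range(n)]
    extended = [W_hat[i] + [w_star[i]] for i in range(n)]
    extended.append(w_star + [0])  # final row links the global effect to areas
    return extended


# Four areas in a row: 0 - 1 - 2 - 3, with areas 2 and 3 as singleton clusters.
W_line = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
W_tilde = extend_adjacency(W_line, labels=[0, 0, 1, 2])
```

In this example every row of the extended matrix has at least one non-zero entry, so the singleton areas are now linked to the global effect.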

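Lee et al (2014) Localised CAR (LCAR) prior takes the form:
$$\phi_i \mid \tilde{\phi}_{-i} \sim N\left( \frac{\sum_{j=1}^n \hat{w}_{ij}\phi_j + \hat{w}_{i*}\phi_*}{\sum_{j=1}^n \hat{w}_{ij} + \hat{w}_{i*} + \epsilon}, \frac{\tau^2}{\sum_{j=1}^n \hat{w}_{ij} + \hat{w}_{i*} + \epsilon} \right)$$
$$\phi_* \mid \tilde{\phi}_{-*} \sim N\left( \frac{\sum_{j=1}^n \hat{w}_{j*}\phi_j}{\sum_{j=1}^n \hat{w}_{j*} + \epsilon}, \frac{\tau^2}{\sum_{j=1}^n \hat{w}_{j*} + \epsilon} \right)$$
21/26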
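A minimal sketch of the two LCAR conditionals, returning the mean and variance of each. Function and argument names are hypothetical; `W_hat` is the area-by-area block and `w_star` the vector of links to the global effect.

```python
def lcar_conditional_area(i, phi, phi_star, W_hat, w_star, tau2, eps):
    """Mean and variance of phi_i given the other effects and phi_star."""
    denom = sum(W_hat[i]) + w_star[i] + eps
    numer = sum(w * p for w, p in zip(W_hat[i], phi)) + w_star[i] * phi_star
    return numer / denom, tau2 / denom


def lcar_conditional_global(phi, w_star, tau2, eps):
    """Mean and variance of the global effect phi_star given all phi_i."""
    denom = sum(w_star) + eps
    numer = sum(w * p for w, p in zip(w_star, phi))
    return numer / denom, tau2 / denom


# Hypothetical four-area example in which area 3 is a singleton cluster.
W_hat = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
w_star = [0, 1, 1, 1]
phi = [1.0, 2.0, 3.0, 4.0]
area_mean, area_var = lcar_conditional_area(3, phi, 5.0, W_hat, w_star,
                                            tau2=2.0, eps=0.5)
global_mean, global_var = lcar_conditional_global(phi, w_star, tau2=2.0, eps=0.5)
```

Area 3 has no within-cluster neighbours, so its conditional mean depends only on the global effect. The value `eps=0.5` is chosen for easy arithmetic, not as a recommended setting.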

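Model Random effect model:
$$Y_i \mid E_i, R_i \sim \text{Poisson}(E_i R_i)$$
$$\ln(R_i) = \beta_0 + \phi_i$$
$$\phi \sim \text{LCAR}(\hat{W})$$
$$\hat{W} \sim \text{Discrete}(\tilde{W}_1, \ldots, \tilde{W}_n; \pi_1, \ldots, \pi_n)$$
$$\pi_j = \frac{\exp(-j\theta)}{\sum_{i=1}^n \exp(-i\theta)}$$
$$\theta \sim \text{Uniform}(0, 1)$$
22/26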
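The prior weights $\pi_j$ can be computed directly from the formula on this slide. This is a small sketch with a hypothetical function name; entry `k` of the returned list holds $\pi_{k+1}$.

```python
import math


def cluster_prior(n, theta):
    """Return [pi_1, ..., pi_n] with pi_j proportional to exp(-j * theta)."""
    weights = [math.exp(-j * theta) for j in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


prior = cluster_prior(4, theta=0.5)
```

For any $\theta > 0$ the weights decrease in $j$, and at $\theta = 0$ all $n$ structures are equally likely.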

Posterior density of $\hat{W}$ [Figure: Density of Cluster Choices; x-axis: Number of Clusters (17 to 35); y-axis: Density (0.00 to 0.25)] 23/26

Glasgow application [Figure: colour scale from 0.2 to 2.0] 24/26

Discussion Presented a method which allows for more localised smoothing. Picks out geographical clusters of disease risk. Lots of applications in epidemiology and public health. Could help identify factors which are causing high-risk clusters. 25/26

Potential future work Single-stage model which doesn’t require prior clustering. Alternative spatial correlation structures. Consider applications to different forms of spatial data. Develop spatio-temporal disease clustering methodology. 26/26
