bayesian cluster detection via adjacency modelling

Bayesian cluster detection via adjacency modelling, Craig Anderson

Bayesian cluster detection via adjacency modelling Craig Anderson University of Technology Sydney Bayes on the Beach 2015 Acknowledgements Co-authors Dr Duncan Lee (University of Glasgow). Dr Nema Dean (University of Glasgow). Funding


  1. Bayesian cluster detection via adjacency modelling. Craig Anderson, University of Technology Sydney. Bayes on the Beach 2015.

  2. Acknowledgements: Co-authors Dr Duncan Lee (University of Glasgow) and Dr Nema Dean (University of Glasgow). Funding from the Carnegie Trust for the Universities of Scotland and the ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS).

  3. Motivation: Want to model respiratory admissions in Glasgow, Scotland. Glasgow is a city with many health inequalities. Need a model which accounts for these inequalities. Standard spatial modelling techniques are unsuitable.

  4. Glasgow case study

  5. The Glasgow effect: Glasgow has the lowest life expectancy in the UK (73 for men, 78.5 for women). One in four children will not live beyond 65. Epidemiologists call this the ‘Glasgow effect’. Huge health inequalities within the city. Life expectancy ranges from 59 in Parkhead to 80 in Jordanhill & Kelvinside.

  6. Why do we need spatial modelling? Disease risk often varies across a geographical region. Nearby areas tend to have more in common than those further apart. Model structure must account for this. Identifying high-risk areas is the first step to fixing health issues.

  7. Respiratory data: Case study of respiratory hospital admissions in Greater Glasgow & Clyde Health Board. Region divided into 271 non-overlapping ‘Intermediate Geographies’ (IGs). Each IG has roughly the same population. Number of admissions in each IG is recorded ($Y_i$). Also compute expected number of admissions ($E_i$).

  8. SIR ($Y_i / E_i$) for Glasgow respiratory admissions [Figure; legend scale from 0.2 to 2.0]
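
The SIR plotted on this slide is the ratio of observed to expected admissions in each area. A minimal sketch of that calculation follows; the function name and the counts are hypothetical illustrations, not the Glasgow data.

```python
def standardised_incidence_ratios(observed, expected):
    """Return SIR_i = Y_i / E_i for each area."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [y / e for y, e in zip(observed, expected)]


# Hypothetical counts for four areas (assumed values for illustration only).
observed = [30, 12, 45, 20]          # Y_i: recorded admissions
expected = [20.0, 24.0, 30.0, 20.0]  # E_i: expected admissions
sir = standardised_incidence_ratios(observed, expected)
print(sir)  # [1.5, 0.5, 1.5, 1.0]
```

An SIR above 1 means more admissions than expected, and below 1 means fewer.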

  9. Modelling risk: Disease risk commonly modelled with a Poisson GLM. Random effect included to account for spatial variation.
     $$Y_i \mid E_i, R_i \sim \text{Poisson}(E_i R_i), \quad i = 1, \ldots, n$$
     $$\ln(R_i) = x_i^T \beta + \phi_i$$
     $x_i^T \beta$ is covariate information. $\phi_i$ is the random effect term for area $i$.
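
As a rough illustration of the Poisson model on this slide, the sketch below evaluates the log-likelihood of observed counts given expected counts and log-risks. The function name and the inputs are assumptions made for illustration; the log-risk is passed in directly rather than built from covariates and random effects.

```python
import math


def poisson_log_likelihood(observed, expected, log_risk):
    """Sum of log Poisson(Y_i; E_i * R_i) terms, with R_i = exp(log_risk_i)."""
    total = 0.0
    for y, e, lr in zip(observed, expected, log_risk):
        mu = e * math.exp(lr)  # Poisson mean E_i * R_i (requires E_i > 0)
        total += y * math.log(mu) - mu - math.lgamma(y + 1)
    return total


# Hypothetical data for two areas, both with risk R_i = 1 (log-risk 0).
ll = poisson_log_likelihood([3, 0], [2.0, 1.0], [0.0, 0.0])
print(ll)
```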

  10. Conditional autoregressive model: Simplest CAR model is the intrinsic model (Besag et al, 1991).
      $$\phi_i \mid \phi_{-i} \sim N\left( \frac{\sum_{j=1}^{n} w_{ij} \phi_j}{\sum_{j=1}^{n} w_{ij}}, \; \frac{\tau^2}{\sum_{j=1}^{n} w_{ij}} \right)$$
      $\phi_{-i}$ is a vector of all random effects except $\phi_i$. $w_{ij} = 1$ if $i$ and $j$ are neighbours, 0 otherwise. $\tau^2$ is a conditional variance term.
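
To make the intrinsic CAR conditional concrete, here is a small sketch that computes its mean and variance for one area: the mean is the average of the neighbours' random effects, and the variance shrinks as the number of neighbours grows. The function name, the four-area map and all numeric values are hypothetical.

```python
def icar_conditional(i, phi, W, tau2):
    """Mean and variance of phi_i given all other random effects (intrinsic CAR)."""
    n_neighbours = sum(W[i])
    if n_neighbours == 0:
        raise ValueError("area has no neighbours; the conditional is undefined")
    mean = sum(W[i][j] * phi[j] for j in range(len(phi))) / n_neighbours
    variance = tau2 / n_neighbours
    return mean, variance


# Hypothetical map: four areas in a row, 0-1-2-3.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
phi = [0.5, -0.1, 1.5, 0.0]  # assumed random-effect values
mean, variance = icar_conditional(1, phi, W, tau2=0.5)
print(mean, variance)  # 1.0 0.25
```

Area 1 has neighbours 0 and 2, so its conditional mean is the average of 0.5 and 1.5.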

  11. Fitted model for Glasgow respiratory admissions [Figure; legend scale from 0.2 to 2.0]

  12. Drawbacks of standard CAR models: Assumes constant spatial smoothness across the study region. In reality, the smoothness of the risk surface varies across the region. Extreme values are smoothed towards the mean, contrary to the aim of identifying high-risk areas. Prefer a method which allows more flexible smoothing.

  13. Alternative smoothing approach: One approach is to introduce ‘boundaries’ in the risk surface. Areas separated by boundaries are not smoothed. We want to identify ‘closed’ boundaries which fully enclose a group of areal units. This approach involves grouping together similar neighbouring areas, i.e. clustering. Allows identification of clusters of high (or low) disease risk.

  14. Agglomerative Hierarchical Clustering:
      1. Initially consider every object to be a ‘singleton’ cluster of size one.
      2. Evaluate a dissimilarity measure for each possible pair.
      3. Merge together the two most similar clusters.
      4. Return to step 2 and repeat until all clusters are merged.

  15. Spatial Agglomerative Hierarchical Clustering:
      1. Initially consider every object to be a ‘singleton’ cluster of size one.
      2. Evaluate a dissimilarity measure for each possible pair.
      3. Merge together the two most similar neighbouring clusters.
      4. Return to step 2 and repeat until all clusters are merged.
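
The spatial variant of these steps can be sketched as below. The slides do not say which dissimilarity measure was used, so this sketch assumes the absolute difference between cluster means of a single value per area (for example the SIR); the function name and the data are hypothetical.

```python
def spatial_agglomerative(values, neighbours):
    """Merge neighbouring clusters until one remains.

    Returns every partition visited, from n singleton clusters down to one.
    Dissimilarity is assumed to be the absolute difference of cluster means.
    """
    n = len(values)
    clusters = {i: {i} for i in range(n)}            # label -> member areas
    adjacent = {i: set(neighbours[i]) for i in range(n)}  # label -> adjacent labels
    history = [sorted(sorted(c) for c in clusters.values())]

    def mean(label):
        members = clusters[label]
        return sum(values[m] for m in members) / len(members)

    while len(clusters) > 1:
        # Only pairs of clusters that share a border are candidates.
        candidates = [(abs(mean(a) - mean(b)), a, b)
                      for a in clusters for b in adjacent[a] if a < b]
        if not candidates:
            break  # the map is disconnected; nothing left to merge
        _, a, b = min(candidates)
        clusters[a] |= clusters.pop(b)
        adjacent[a] |= adjacent.pop(b)
        adjacent[a] -= {a, b}
        for c in adjacent[a]:  # relabel b as a in the neighbours' adjacency sets
            adjacent[c].discard(b)
            adjacent[c].add(a)
        history.append(sorted(sorted(c) for c in clusters.values()))
    return history


# Hypothetical map: four areas in a row, 0-1-2-3, with one value per area.
values = [1.0, 1.5, 1.05, 2.0]
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
history = spatial_agglomerative(values, neighbours)
for partition in history:
    print(len(partition), partition)
```

In this example areas 0 and 2 have the most similar values but do not share a border, so the first merge joins areas 1 and 2 instead. A connected map with n areas yields n partitions.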

  16. Application to Glasgow SIR data [Figure with four panels: 5 Clusters, 10 Clusters, 20 Clusters, 30 Clusters; legend scale from 0.2 to 2.0 in each panel]

  17. Selecting the best cluster structure: The algorithm produces $n$ possible cluster structures. Need to find a way to select the most suitable structure. Subjective methods could be used in some examples, but are not ideal in general. Need to find an objective method to select a structure.

  18. Incorporating into a model: Need an approach which incorporates the choice of cluster structure into the model. Our cluster structures have a natural ordering ($C_1, \ldots, C_n$). Use this ordering to include the number of clusters as a model parameter. Induce clustering via correlation structure of random effects.

  19. Altering the correlation structure: Change the neighbourhood matrix to account for clusters. $w_{ij} = 1$ if $i, j$ are neighbours AND lie in the same cluster, $w_{ij} = 0$ otherwise. But this could produce singleton clusters with no neighbours. Use localised CAR prior proposed by Lee et al (2014). Includes global random effect $\phi_*$.
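
The cluster-restricted neighbourhood matrix described on this slide, and the singleton problem it creates, can be sketched as follows. The function name, the four-area map and the cluster labels are hypothetical.

```python
def restrict_to_clusters(W, labels):
    """Keep w_ij = 1 only when i and j are neighbours in the same cluster."""
    n = len(W)
    return [[1 if W[i][j] == 1 and labels[i] == labels[j] else 0
             for j in range(n)]
            for i in range(n)]


# Hypothetical map: four areas in a row, 0-1-2-3.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
labels = [0, 0, 1, 2]  # assumed clustering: {0, 1}, {2}, {3}
W_clustered = restrict_to_clusters(W, labels)

# Areas left with no neighbours at all (singleton clusters).
isolated = [i for i, row in enumerate(W_clustered) if sum(row) == 0]
print(W_clustered)
print(isolated)  # [2, 3]
```

Areas 2 and 3 end up with empty rows, which is the case that motivates the extra global random effect.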

  20. Lee et al (2014): Extend neighbourhood matrix as follows:
      $$\tilde{W} = \begin{pmatrix} W & w_* \\ w_*^T & 0 \end{pmatrix}, \qquad w_* = (w_{1*}, \ldots, w_{n*})$$
      $w_{i*} = 1$ if area $i$ has at least one neighbour in a different cluster. $w_{i*} = 0$ otherwise.
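
A sketch of how the extended matrix might be assembled is given below. It assumes that the top-left block is the cluster-restricted neighbourhood matrix from the previous slide; the slides do not state this explicitly, and the function name, map and labels are hypothetical.

```python
def extend_neighbourhood(adjacency, labels):
    """Build the (n+1) x (n+1) extended matrix and the boundary vector w_star.

    Assumption: the top-left block keeps only within-cluster neighbour links.
    """
    n = len(adjacency)
    # w_star[i] = 1 if area i has at least one neighbour in a different cluster.
    w_star = [1 if any(adjacency[i][j] == 1 and labels[i] != labels[j]
                       for j in range(n)) else 0
              for i in range(n)]
    block = [[1 if adjacency[i][j] == 1 and labels[i] == labels[j] else 0
              for j in range(n)]
             for i in range(n)]
    extended = [block[i] + [w_star[i]] for i in range(n)]
    extended.append(w_star + [0])  # final row: w_star transposed, then 0
    return extended, w_star


# Hypothetical map: four areas in a row, 0-1-2-3, in two clusters.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
labels = [0, 0, 1, 1]
W_tilde, w_star = extend_neighbourhood(adjacency, labels)
print(w_star)  # [0, 1, 1, 0]
for row in W_tilde:
    print(row)
```

Only areas 1 and 2 sit on the cluster boundary, so only they are linked to the extra row and column.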

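  21. Lee et al (2014): Localised CAR (LCAR) prior takes the form:
      $$\phi_i \mid \tilde{\phi}_{-i} \sim N\left( \frac{\sum_{j=1}^{n} \hat{w}_{ij} \phi_j + \hat{w}_{i*} \phi_*}{\sum_{j=1}^{n} \hat{w}_{ij} + \hat{w}_{i*} + \epsilon}, \; \frac{\tau^2}{\sum_{j=1}^{n} \hat{w}_{ij} + \hat{w}_{i*} + \epsilon} \right)$$
      $$\phi_* \mid \tilde{\phi}_{-*} \sim N\left( \frac{\sum_{j=1}^{n} \hat{w}_{j*} \phi_j}{\sum_{j=1}^{n} \hat{w}_{j*} + \epsilon}, \; \frac{\tau^2}{\sum_{j=1}^{n} \hat{w}_{j*} + \epsilon} \right)$$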
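
The first LCAR conditional can be evaluated numerically as in the sketch below. The function name, the weights, the random-effect values and the value of the small constant (called `eps` here) are all assumptions for illustration; the slides do not give a value for it.

```python
def lcar_conditional(i, phi, phi_star, W_hat, w_star_hat, tau2, eps):
    """Mean and variance of phi_i given the other effects under the LCAR prior."""
    denominator = sum(W_hat[i]) + w_star_hat[i] + eps
    numerator = (sum(W_hat[i][j] * phi[j] for j in range(len(phi)))
                 + w_star_hat[i] * phi_star)
    return numerator / denominator, tau2 / denominator


# Hypothetical weights for four areas in two clusters, {0, 1} and {2, 3}.
W_hat = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
w_star_hat = [0, 1, 1, 0]    # areas 1 and 2 border another cluster
phi = [0.5, -0.1, 1.5, 0.0]  # assumed area-level random effects
phi_star = 0.3               # assumed global random effect

mean, variance = lcar_conditional(1, phi, phi_star, W_hat, w_star_hat,
                                  tau2=0.5, eps=0.001)
print(mean, variance)
```

Area 1 is smoothed towards its within-cluster neighbour (area 0) and the global effect, not towards area 2 across the boundary. Because `eps` is positive, the denominator stays nonzero even for an area with no within-cluster neighbours and no boundary link.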

  22. Model: Random effect model:
      $$Y_i \mid E_i, R_i \sim \text{Poisson}(E_i R_i)$$
      $$\ln(R_i) = \beta_0 + \phi_i$$
      $$\phi \sim \text{LCAR}(\hat{W})$$
      $$\hat{W} \sim \text{Discrete}(\tilde{W}_1, \ldots, \tilde{W}_n; \pi_1, \ldots, \pi_n)$$
      $$\pi_j = \frac{\exp(-j\theta)}{\sum_{i=1}^{n} \exp(-i\theta)}$$
      $$\theta \sim \text{Uniform}(0, 1)$$
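
The prior probabilities placed on the candidate neighbourhood matrices follow directly from the formula for the weights on this slide. A minimal sketch, with a hypothetical function name and assumed values for the number of candidates and for theta:

```python
import math


def cluster_prior_weights(n, theta):
    """Return [pi_1, ..., pi_n] with pi_j proportional to exp(-j * theta)."""
    unnormalised = [math.exp(-j * theta) for j in range(1, n + 1)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Assumed values for illustration: 5 candidate structures, theta = 0.5.
pi = cluster_prior_weights(5, 0.5)
print(pi)
print(sum(pi))
```

For theta above zero the weights decrease with the index j, and as theta approaches zero they approach a uniform distribution over the candidates.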

  23. Posterior density of $\hat{W}$ [Figure: ‘Density of Cluster Choices’; x-axis: Number of Clusters (17 to 35); y-axis: Density (0.00 to 0.25)]

  24. Glasgow application [Figure; legend scale from 0.2 to 2.0]

  25. Discussion: Presented a method which allows for more localised smoothing. Picks out geographical clusters of disease risk. Lots of applications in epidemiology and public health. Could help identify factors which are causing high-risk clusters.

  26. Potential future work: Single-stage model which doesn’t require prior clustering. Alternative spatial correlation structures. Consider applications to different forms of spatial data. Develop spatio-temporal disease clustering methodology.
