Dealing with unknown discontinuities in data and models
Kerry Gallagher John Stephenson Chris Holmes
Discontinuities occur in both data and processes in the Earth and Environmental Sciences
Spatial: faults, topography, lithology, phase, composition, …
Temporal: climate, seismicity, tectonics, …
[Figure: Nile discharge (m³ ×10⁸) against year, 1860 to 1980]
What was the significance of the opening of the Aswan Dam?
(data from Cobb 1978)
[Figure: Nile discharge (m³ ×10⁸) against year, 1860 to 1980, repeated with a step-function model]
$f(t) = \mu_1 I(t \le t_c) + \mu_2 I(t > t_c)$
(after Denison et al. 2002)
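The step function above has one change point t_c and one mean on each side. As a minimal sketch (not the Bayesian treatment of Denison et al.), the three parameters can be estimated by least squares, trying every possible split of the series. The function name `fit_single_changepoint` and the synthetic series are assumptions made for illustration; the series is not the Nile data.

```python
import numpy as np

def fit_single_changepoint(t, y):
    """Least-squares fit of f(t) = mu1*I(t <= tc) + mu2*I(t > tc)."""
    order = np.argsort(t)
    t = np.asarray(t, dtype=float)[order]
    y = np.asarray(y, dtype=float)[order]
    best = None
    # Try every split that leaves at least one point on each side.
    for k in range(1, len(t)):
        mu1, mu2 = y[:k].mean(), y[k:].mean()
        sse = ((y[:k] - mu1) ** 2).sum() + ((y[k:] - mu2) ** 2).sum()
        if best is None or sse < best[0]:
            # t[k - 1] is the last time still assigned to mu1.
            best = (sse, t[k - 1], mu1, mu2)
    _, tc, mu1, mu2 = best
    return tc, mu1, mu2

# Synthetic series with a drop after t = 40 (illustrative only).
rng = np.random.default_rng(0)
t = np.arange(100)
y = np.where(t <= 40, 10.0, 7.0) + rng.normal(0.0, 0.5, t.size)
tc, mu1, mu2 = fit_single_changepoint(t, y)
print(tc, mu1, mu2)
```

This gives a single best-fitting change point only; it says nothing about the uncertainty on t_c or about how many change points there are, which is what the sampling approach in the later slides addresses.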
Data interpolation and prediction with discontinuities
Standard methods may be too smooth
[Figure: Kriging model of synthetic step function; legend: realisation of true data, true function, kriging fit (Gaussian); axes X, Y; ƒ(s) against s]
[Figure: Kriging model of synthetic model; legend: realisation of true data, true function, kriging fit (Gaussian); axes X, Y; ƒ(s) against s]
Need a method that can deal with an unknown number of discontinuities in unknown locations
[Figure: Partition model of synthetic data; legend: true function, kriging fit (Gaussian); axes X, Y; ƒ(x) against x; panels labelled 1D and 2D]
Formulating a Partition Model
How many discontinuities, and where are they?
Space partitioned into discrete regions
Partitions defined by Voronoi tessellation
Regression function, ƒ, specified within each region
Parameters: θ = (c_1..N, ƒ_1..N, N, σ²)
[Figure: Voronoi centres c1 to c6; ƒ(X) against X]
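A minimal sketch of how a Voronoi partition model can be evaluated, assuming the simplest regression function (a constant mean in each region). Each point belongs to the region of its nearest centre, which is exactly the Voronoi cell membership. The helper names and the two-centre example are hypothetical.

```python
import numpy as np

def assign_to_centres(points, centres):
    """Index of the nearest centre for each point (Voronoi cell membership)."""
    d2 = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def fit_partition_means(points, values, centres):
    """Piecewise-constant regression: one mean per Voronoi cell."""
    labels = assign_to_centres(points, centres)
    means = np.full(len(centres), np.nan)  # NaN marks an empty cell
    for j in range(len(centres)):
        in_cell = labels == j
        if in_cell.any():
            means[j] = values[in_cell].mean()
    return labels, means

def predict(points, centres, means):
    """Evaluate the partition model at new points."""
    return means[assign_to_centres(points, centres)]

# Synthetic 2D data with a discontinuity along x = 0.5 (illustrative only).
rng = np.random.default_rng(1)
pts = rng.uniform(0.0, 1.0, size=(200, 2))
vals = np.where(pts[:, 0] < 0.5, 1.0, 3.0) + rng.normal(0.0, 0.1, 200)
centres = np.array([[0.25, 0.5], [0.75, 0.5]])  # cell boundary falls on x = 0.5
labels, means = fit_partition_means(pts, vals, centres)
print(means)
```

Here the centres are fixed by hand. In the partition model proper, the number of centres N and their locations are unknowns to be sampled.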
Generating Partition Models
Bayes’ Theorem
$p(\theta \mid D) \propto p(D \mid \theta)\, p(\theta)$
(posterior ∝ likelihood × prior)
D = observed data
θ = model parameters
y = value to be predicted
Use Markov chain Monte Carlo (MCMC) to sample the posterior distribution, p(θ|D)
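Prediction: Monte Carlo integration
$p(y \mid D) = \int p(y \mid \theta)\, p(\theta \mid D)\, d\theta \approx \frac{1}{N} \sum_{i=1}^{N} p(y \mid \theta_i)$
where θ_i, i = 1, …, N, are samples from the posterior distribution, p(θ|D)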
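As a sketch of Monte Carlo integration for prediction, the posterior predictive mean can be approximated by averaging the model output over posterior samples. The samples below are a hypothetical stand-in for MCMC output (drawn from assumed Gaussians around a step at 0.5), and `step_model` and `predictive_mean` are illustrative names.

```python
import numpy as np

def step_model(x, tc, mu1, mu2):
    """Single change-point model: mu1 for x <= tc, mu2 otherwise."""
    return np.where(x <= tc, mu1, mu2)

def predictive_mean(x, samples):
    """(1/N) * sum_i f(x; theta_i) over the rows theta_i of `samples`."""
    preds = np.array([step_model(x, *theta) for theta in samples])
    return preds.mean(axis=0)

# Hypothetical stand-in for posterior samples of (tc, mu1, mu2).
rng = np.random.default_rng(2)
n = 2000
samples = np.column_stack([
    rng.normal(0.5, 0.02, n),  # change-point location tc
    rng.normal(1.0, 0.05, n),  # mean before the change point
    rng.normal(3.0, 0.05, n),  # mean after the change point
])
x = np.array([0.1, 0.5, 0.9])
y_hat = predictive_mean(x, samples)
print(y_hat)
```

Far from the change point every sample agrees, so the average stays sharp. Near x = 0.5 the samples disagree about which side the point is on, so the average falls between the two levels, reflecting the uncertainty in the location of the discontinuity.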
Sampling with (transdimensional) MCMC
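Initialise θ
Iterate: propose θ’ and accept it with probability α(θ,θ’)
Acceptance criterion
$\alpha(\theta, \theta') = \min\left\{1,\ \frac{p(\theta')\, p(D \mid \theta')\, p(\theta \mid \theta')}{p(\theta)\, p(D \mid \theta)\, p(\theta' \mid \theta)}\, R\, |J| \right\}$
p(θ) = prior, p(D|θ) = likelihood, p(θ’|θ) = model proposal, R = jump proposal, |J| = Jacobian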
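A minimal sketch of the acceptance step, under simplifying assumptions: the dimension is fixed (a single change point with two means) and the proposal is a symmetric random walk, so the proposal ratio, R and |J| all equal 1 and only the prior and likelihood ratio remain. This is plain Metropolis sampling, not the transdimensional sampler of the slides; the uniform prior bounds, step sizes and known noise level σ are all assumed for illustration.

```python
import numpy as np

def log_likelihood(theta, x, y, sigma):
    """Gaussian log-likelihood (up to a constant) for the step model."""
    tc, mu1, mu2 = theta
    f = np.where(x <= tc, mu1, mu2)
    return -0.5 * np.sum((y - f) ** 2) / sigma**2

def log_prior(theta):
    """Uniform prior: tc in [0, 1], both means in [-10, 10] (assumed bounds)."""
    tc, mu1, mu2 = theta
    inside = 0.0 <= tc <= 1.0 and abs(mu1) <= 10.0 and abs(mu2) <= 10.0
    return 0.0 if inside else -np.inf

def metropolis(x, y, sigma, n_iter=20000, step=(0.02, 0.05, 0.05), seed=0):
    rng = np.random.default_rng(seed)
    theta = np.array([0.5, y.mean(), y.mean()])  # initialise theta
    logp = log_prior(theta) + log_likelihood(theta, x, y, sigma)
    chain = np.empty((n_iter, 3))
    for i in range(n_iter):
        prop = theta + rng.normal(0.0, step)     # symmetric random-walk proposal
        lp = log_prior(prop)
        if np.isfinite(lp):
            logp_prop = lp + log_likelihood(prop, x, y, sigma)
            # log(alpha): prior * likelihood ratio; other factors are 1 here.
            if np.log(rng.uniform()) < logp_prop - logp:
                theta, logp = prop, logp_prop
        chain[i] = theta                         # a rejected move repeats theta
    return chain

# Synthetic data with a step at x = 0.3 (illustrative only).
rng = np.random.default_rng(3)
x = rng.uniform(0.0, 1.0, 100)
y = np.where(x <= 0.3, 1.0, 3.0) + rng.normal(0.0, 0.3, 100)
chain = metropolis(x, y, sigma=0.3)
post = chain[5000:]                              # discard burn-in
print(post.mean(axis=0))
```

Allowing the number of change points to vary would add birth and death moves that change the dimension of θ, at which point the R and |J| factors in the acceptance criterion no longer cancel.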
Sampling Partition Models: natural parsimony
[Figure labels: better data fit; likelihood]
1D partition models for data interpolation
Atmospheric dust input to peat bogs
Looking for common signature in multiple systems
[Figure: mean ± 95% C.I.; annotations at 38,500 yr, 45,500 yr and 8,850 yr]
Partition Models – 2D example function
Partition Sampling – 2D single realisation
Multiple realisations … ensemble average (smooth, but maintain discontinuities)
Partition Model: Digital Elevation Model (DEM) example
[Figure: raw ERS sample image and contour plot of partition model (pixel value); both axes in pixels, 10 to 60]
Thermochronology: data are sensitive to the temperature history experienced by the host rock
e.g. apatite fission track analysis
Likelihood is a non-linear function of unknown parameters at each location within each partition
The problem is to find
(a) how to partition the samples in 2D
    (i) number of partitions
    (ii) location of the partitions
(b) the distribution of thermal histories in each partition
[Figure: inferred and true; axes X1, X2 (0.2 to 1); scale P (0.05 to 0.25)]
(Stephenson, Gallagher and Holmes 2006)
discontinuities with unknown geometry in variable dimensions
parameters, posterior predictions)
parameterisation
Sampling Partition Models: distribution on number of partitions
[Figure: p(k) for k = 1 to 6]
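As a sketch, the distribution on the number of partitions can be estimated as the relative frequency of each value of k among the sampled models. The list `k_samples` below is hypothetical and stands in for the value of k recorded at each MCMC iteration.

```python
import numpy as np

def partition_count_distribution(k_samples):
    """Relative frequency of each partition count; index k holds the estimate of p(k)."""
    counts = np.bincount(np.asarray(k_samples, dtype=int))
    return counts / counts.sum()

# Hypothetical record of the number of partitions at each iteration.
k_samples = [2, 3, 3, 2, 4, 3, 3, 2, 3, 5]
p_k = partition_count_distribution(k_samples)
print(p_k)
```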
Traditionally, each sample is modelled independently…
…ignores spatial relationships…
…ideally want to group samples with common thermal history…
…but the spatial relationships may be unknown…