Lecture 13: Gaussian Process Models - Part 2
Colin Rundel
03/01/2017

EDA and GPs

Variogram
From the spatial modeling literature, the typical approach is to examine an empirical variogram. First we'll look at the theoretical variogram before looking at its connection to the covariance.
Variogram:
$$2\gamma(t_i, t_j) = \mathrm{Var}\big(Y(t_i) - Y(t_j)\big) = E\Big(\big[(Y(t_i) - \mu(t_i)) - (Y(t_j) - \mu(t_j))\big]^2\Big)$$
where $\gamma(t_i, t_j)$ is called the semivariogram.
If the process has constant mean (e.g. $\mu(t_i) = \mu(t_j)$ for all $i$ and $j$) then we can simplify to
$$2\gamma(t_i, t_j) = E\big(\big[Y(t_i) - Y(t_j)\big]^2\big)$$

Some Properties of the theoretical Variogram / Semivariogram
• both are non-negative: $\gamma(t_i, t_j) \geq 0$
• both are 0 at distance 0: $\gamma(t_i, t_i) = 0$
• both are symmetric: $\gamma(t_i, t_j) = \gamma(t_j, t_i)$
• there is no dependence if $2\gamma(t_i, t_j) = \mathrm{Var}(Y(t_i)) + \mathrm{Var}(Y(t_j))$ for all $i \neq j$
• if the process is not stationary: $2\gamma(t_i, t_j) = \mathrm{Var}(Y(t_i)) + \mathrm{Var}(Y(t_j)) - 2\,\mathrm{Cov}(Y(t_i), Y(t_j))$
• if the process is stationary: $2\gamma(t_i, t_j) = 2\,\mathrm{Var}(Y(t_i)) - 2\,\mathrm{Cov}(Y(t_i), Y(t_j))$
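The covariance identities in the list above follow from expanding the variance of a difference. A short derivation, assuming that "stationary" here implies a common variance at every $t_i$:

```latex
\begin{aligned}
2\gamma(t_i, t_j)
  &= \mathrm{Var}\big(Y(t_i) - Y(t_j)\big) \\
  &= \mathrm{Var}\big(Y(t_i)\big) + \mathrm{Var}\big(Y(t_j)\big)
     - 2\,\mathrm{Cov}\big(Y(t_i), Y(t_j)\big) \\
  &= 2\,\mathrm{Var}\big(Y(t_i)\big) - 2\,\mathrm{Cov}\big(Y(t_i), Y(t_j)\big)
     \qquad \text{if } \mathrm{Var}\big(Y(t_i)\big) = \mathrm{Var}\big(Y(t_j)\big)
\end{aligned}
```

Setting $\mathrm{Cov}(Y(t_i), Y(t_j)) = 0$ for $i \neq j$ in the second line recovers the no-dependence case.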

Empirical Semivariogram
We will assume that our process of interest is stationary, in which case we will parameterize the semivariogram in terms of $h = |t_i - t_j|$.
Empirical Semivariogram:
$$\hat{\gamma}(h) = \frac{1}{2\,N(h)} \sum_{|t_i - t_j| \in (h - \epsilon,\, h + \epsilon)} \big(Y(t_i) - Y(t_j)\big)^2$$
Practically, for any data set with $n$ observations there are $\binom{n}{2} + n$ possible data pairs to examine. Each individually is not very informative, so we aggregate into bins and calculate the empirical semivariogram for each bin.
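A minimal sketch of the binned estimator, assuming half-open bins $[b\,w, (b+1)\,w)$ reported at their midpoints (so $\epsilon = w/2$) and skipping the $n$ self-pairs at distance 0. The function name, the bin convention, and the toy data are illustrative assumptions, not taken from the lecture.

```python
from itertools import combinations

def empirical_semivariogram(t, y, bin_width):
    """Binned empirical semivariogram for a 1-D process assumed stationary.

    Returns a list of (bin midpoint h, gamma_hat(h), pair count N(h)).
    """
    sums = {}    # bin index -> sum of squared differences
    counts = {}  # bin index -> number of pairs N(h)
    for (ti, yi), (tj, yj) in combinations(zip(t, y), 2):
        b = int(abs(ti - tj) // bin_width)  # bin holding this pair's distance
        sums[b] = sums.get(b, 0.0) + (yi - yj) ** 2
        counts[b] = counts.get(b, 0) + 1
    return [((b + 0.5) * bin_width, sums[b] / (2 * counts[b]), counts[b])
            for b in sorted(sums)]

# Toy data (hypothetical values, not the lecture data).
t = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 4.0, 3.0]
result = empirical_semivariogram(t, y, bin_width=2.0)
for h, gamma_hat, n_pairs in result:
    print(h, gamma_hat, n_pairs)
```

Returning the pair count alongside each estimate makes it easy to see which bins rest on only a few pairs.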

Connection to Covariance

Covariance vs Semivariogram - Exponential
[Figure: panels "exp cov" and "exp semivar"; y against d (0 to 1.5), one curve per l in 1, 1.7, 2.3, 3, 3.7, 4.3, 5, 5.7, 6.3, 7.]

Covariance vs Semivariogram - Square Exponential
[Figure: panels "sq exp cov" and "sq exp semivar"; y against d (0 to 1.5), one curve per l in 1, 1.7, 2.3, 3, 3.7, 4.3, 5, 5.7, 6.3, 7.]

From last time
[Figure: y against t (0 to 1).]

Empirical semivariogram - no bins / cloud
[Figure: gamma against h (0 to 1).]

Empirical semivariogram (binned)
[Figure: panels binwidth=0.05, binwidth=0.075, binwidth=0.1, binwidth=0.15; gamma against h (0 to 1).]

Empirical semivariogram (binned + n)
[Figure: panels binwidth=0.05, binwidth=0.075, binwidth=0.1, binwidth=0.15; gamma against h (0 to 1), with the pair count n (5 to 25) shown in the legend.]

Theoretical vs empirical semivariogram
After fitting the model last time we came up with a posterior median of $\sigma^2 = 1.89$ and $l = 5.86$ for a square exponential covariance.
$$\mathrm{Cov}(h) = \sigma^2 \exp\big(-(h\,l)^2\big)$$
$$\gamma(h) = \sigma^2 - \sigma^2 \exp\big(-(h\,l)^2\big) = 1.89 - 1.89 \exp\big(-(5.86\,h)^2\big)$$
[Figure: panels binwidth=0.05 and binwidth=0.1; gamma against h (0 to 1).]
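The theoretical curve can be evaluated directly from the two posterior medians quoted above. A small sketch; the function name and the grid of h values are illustrative choices, and only `1.89` and `5.86` come from the slide.

```python
import math

SIGMA2 = 1.89  # posterior median of sigma^2 quoted on the slide
L = 5.86       # posterior median of l quoted on the slide

def theoretical_semivariogram(h, sigma2=SIGMA2, l=L):
    """gamma(h) = sigma^2 - sigma^2 * exp(-(h * l)^2) for the square exponential covariance."""
    return sigma2 - sigma2 * math.exp(-(h * l) ** 2)

# Illustrative grid of distances on the same [0, 1] range as the plots.
hs = [0.0, 0.05, 0.1, 0.25, 0.5, 1.0]
gammas = [theoretical_semivariogram(h) for h in hs]
for h, g in zip(hs, gammas):
    print(f"h = {h:4.2f}  gamma = {g:.4f}")
```

The curve starts at 0 and levels off at $\sigma^2$ as h grows, which is the plateau to compare against the binned empirical values.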

Variogram features

PM2.5 Example

FRN Data
Measured PM2.5 data from an EPA monitoring station in Columbia, NJ.
[Figure: pm25 against date, Jan 2007 to Jan 2008.]

FRN Data
     site  latitude  longitude  pm25        date  day
230031011    46.682    -68.016   8.9  2007-01-03    3
230031011    46.682    -68.016  10.4  2007-01-06    6
230031011    46.682    -68.016   9.7  2007-01-15   15
230031011    46.682    -68.016   7.5  2007-01-18   18
230031011    46.682    -68.016   4.6  2007-01-21   21
230031011    46.682    -68.016   9.5  2007-01-24   24
230031011    46.682    -68.016   9.0  2007-01-27   27
230031011    46.682    -68.016  16.2  2007-01-30   30
230031011    46.682    -68.016   9.1  2007-02-05   36
230031011    46.682    -68.016  19.9  2007-02-11   42
230031011    46.682    -68.016  11.5  2007-02-14   45
230031011    46.682    -68.016   6.5  2007-02-17   48
230031011    46.682    -68.016  14.7  2007-02-23   54
230031011    46.682    -68.016  14.1  2007-02-26   57
230031011    46.682    -68.016  13.3  2007-03-01   60
230031011    46.682    -68.016   8.6  2007-03-04   63
230031011    46.682    -68.016   9.0  2007-03-07   66
230031011    46.682    -68.016  14.0  2007-03-10   69
230031011    46.682    -68.016   8.6  2007-03-13   72
230031011    46.682    -68.016  10.3  2007-03-16   75

Mean Model
##
## Call:
## lm(formula = pm25 ~ day + I(day^2), data = pm25)
##
## Coefficients:
## (Intercept)          day     I(day^2)
##  12.9644351   -0.0724639    0.0001751
[Figure: pm25 against day (0 to 300).]
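To see what detrending with this fit amounts to, the quadratic mean can be evaluated by hand and subtracted from an observation. A sketch in Python rather than R, with the coefficients copied from the `lm()` output; the function names are illustrative, and the example observation (day 3, pm25 8.9) is the first row of the FRN table.

```python
# Coefficients copied from the lm() output: (Intercept), day, I(day^2).
B0, B1, B2 = 12.9644351, -0.0724639, 0.0001751

def fitted_mean(day):
    """Quadratic mean model mu(d) = b0 + b1 * d + b2 * d^2."""
    return B0 + B1 * day + B2 * day ** 2

def residual(day, pm25):
    """Detrended residual: observed pm25 minus the fitted mean."""
    return pm25 - fitted_mean(day)

# First row of the FRN table: day 3, pm25 = 8.9.
mu_3 = fitted_mean(3)
r_3 = residual(3, 8.9)
print(mu_3, r_3)
```

The empirical variogram for this example is then computed on such residuals rather than on the raw pm25 values.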

Detrended Residuals
[Figure: "Residuals"; resid against day (0 to 300).]

Empirical Variogram
[Figure: panels binwidth=3 and binwidth=6; gamma against h (0 to 300), with the pair count n (50 to 200) shown in the legend.]

Empirical Variogram
[Figure: panels binwidth=6 and binwidth=9; gamma against h (0 to 150).]

Model
What does the model we are trying to fit actually look like?
$$y(d) = \mu(d) + w(d) + w$$
where
$$\mu(d) = \beta_0 + \beta_1\, d + \beta_2\, d^2$$
$$w(d) \sim \mathcal{GP}(0, \Sigma)$$
$$w \sim \mathcal{N}(0, \sigma^2_w)$$

JAGS Model
## model{
##   y ~ dmnorm(mu, inverse(Sigma))
##
##   for (i in 1:N) {
##     mu[i] <- beta[1]+ beta[2] * x[i] + beta[3] * x[i]^2
##   }
##
##   for (i in 1:(N-1)) {
##     for (j in (i+1):N) {
##       Sigma[i,j] <- sigma2 * exp(- pow(l*d[i,j],2))
##       Sigma[j,i] <- Sigma[i,j]
##     }
##   }
##
##   for (k in 1:N) {
##     Sigma[k,k] <- sigma2 + sigma2_w
##   }
##
##   for (i in 1:3) {
##     beta[i] ~ dt(0, 2.5, 1)
##   }
##
##   sigma2_w ~ dnorm(10, 1/25) T(0,)
##   sigma2 ~ dnorm(10, 1/25) T(0,)
##   l ~ dt(0, 2.5, 1) T(0,)
## }
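The covariance matrix that the JAGS model builds can be reproduced outside JAGS to check its structure. A sketch in plain Python, assuming `d[i,j]` is the absolute difference between observation days (the slide does not show how `d` is constructed); the parameter values below are hypothetical, not posterior estimates.

```python
import math

def build_sigma(x, sigma2, sigma2_w, l):
    """Mirror the Sigma construction in the JAGS model.

    Off-diagonal: sigma2 * exp(-(l * d_ij)^2); diagonal: sigma2 + sigma2_w.
    Assumes d_ij = |x_i - x_j| (an assumption about the data passed to JAGS).
    """
    n = len(x)
    Sigma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Sigma[i][i] = sigma2 + sigma2_w  # process variance plus nugget
        for j in range(i + 1, n):
            d_ij = abs(x[i] - x[j])
            Sigma[i][j] = sigma2 * math.exp(-(l * d_ij) ** 2)
            Sigma[j][i] = Sigma[i][j]    # keep the matrix symmetric
    return Sigma

days = [3, 6, 15]  # first three observation days in the FRN table
# Hypothetical parameter values chosen only to make the structure visible.
Sigma = build_sigma(days, sigma2=10.0, sigma2_w=5.0, l=0.1)
for row in Sigma:
    print([round(v, 4) for v in row])
```

Note that JAGS parameterizes `dnorm` by mean and precision, so `dnorm(10, 1/25)` is a normal prior with standard deviation 5, truncated to positive values by `T(0,)`.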

Posterior - Betas
[Figure: trace and density plots for beta[1], beta[2], beta[3]; iterations 15000 to 40000, N = 715.]

Posterior - Covariance Parameters
[Figure: trace and density plots for l, sigma2, sigma2_w; iterations 15000 to 40000, N = 715.]
