
BAYESIAN CALIBRATION OF COMPUTER MODELS
Bayesian inference & Markov chain Monte Carlo; Gaussian processes; computer model calibration and prediction
Dave Higdon, Virginia Tech; Brian Williams & Jim Gattiker, Statistical Sciences Group, Los Alamos National Laboratory


1. An Inverse Problem in Hydrology

[Figure: observed concentration histories over time at eight monitoring wells (NW, N, NE, W, E, SW, S, SE) surrounding the 60 × 60 site.]

Likelihood, Markov random field (MRF) prior, and posterior for the permeability field z:

L(y | η(z)) ∝ |Σ|^{−1/2} exp{ −½ (y − η(z))^T Σ^{−1} (y − η(z)) }

π(z | λ_z) ∝ λ_z^{m/2} exp{ −½ λ_z z^T W z }

π(λ_z) ∝ λ_z^{a_z − 1} exp{ −b_z λ_z }

π(z, λ_z | y) ∝ L(y | η(z)) × π(z | λ_z) × π(λ_z)

2. Posterior realizations of z with a MRF prior

[Figure: six 60 × 60 images: the true permeability field, four MCMC realizations, and the posterior mean permeability field.]

3. Bayesian inference: iid N(µ, 1) example

Observed data y are noisy versions of µ:

y(s_i) = µ + ε_i with ε_i iid ∼ N(0, 1), i = 1, …, n

sampling model: L(y | µ) ∝ ∏_{i=1}^n exp{ −½ (y_i − µ)² }
prior for µ: π(µ) ∝ N(0, 1/λ_µ), λ_µ small

posterior density for µ:

π(µ | y) ∝ L(y | µ) × π(µ)
         ∝ ∏_{i=1}^n exp{ −½ (y_i − µ)² } × exp{ −½ λ_µ µ² }
         ∝ exp{ −½ [ n(µ − ȳ_n)² + λ_µ (µ − 0)² ] } × f(y)

⇒ µ | y ∼ N( (n ȳ_n + 0·λ_µ)/(n + λ_µ), 1/(n + λ_µ) ) → N( ȳ_n, 1/n ) as λ_µ → 0

4. Bayesian inference: iid N(µ, λ_y^{−1}) example

Observed data y are noisy versions of µ:

y(s_i) = µ + ε_i with ε_i iid ∼ N(0, λ_y^{−1}), i = 1, …, n

sampling model: L(y | µ, λ_y) ∝ λ_y^{n/2} exp{ −½ λ_y ∑_{i=1}^n (y_i − µ)² }
prior for µ, λ_y: π(µ, λ_y) = π(µ) × π(λ_y), with π(µ) ∝ 1 and π(λ_y) ∝ λ_y^{a_y − 1} e^{−b_y λ_y}

posterior density for (µ, λ_y):

π(µ, λ_y | y) ∝ L(y | µ, λ_y) × π(µ) × π(λ_y)
             ∝ λ_y^{n/2} exp{ −½ λ_y ∑_{i=1}^n (y_i − µ)² } × λ_y^{a_y − 1} exp{ −b_y λ_y }

π(µ, λ_y | y) is not of a standard form; it can be explored numerically or via Monte Carlo.

5. Full conditional distributions for π(µ, λ_y | y)

π(µ, λ_y | y) ∝ λ_y^{n/2} exp{ −½ λ_y ∑_{i=1}^n (y_i − µ)² } × λ_y^{a_y − 1} exp{ −b_y λ_y }

Though π(µ, λ_y | y) is not of a simple form, its full conditional distributions are:

π(µ | λ_y, y) ∝ exp{ −½ λ_y ∑_{i=1}^n (y_i − µ)² }  ⇒  µ | λ_y, y ∼ N( ȳ_n, 1/(n λ_y) )

π(λ_y | µ, y) ∝ λ_y^{a_y + n/2 − 1} exp{ −λ_y [ b_y + ½ ∑_{i=1}^n (y_i − µ)² ] }
  ⇒  λ_y | µ, y ∼ Γ( a_y + n/2, b_y + ½ ∑_{i=1}^n (y_i − µ)² )

6. Markov chain Monte Carlo – Gibbs sampling

Given the full conditionals for π(µ, λ_y | y), one can use Markov chain Monte Carlo (MCMC) to obtain draws from the posterior. The Gibbs sampler is an MCMC scheme which iteratively replaces each parameter, in turn, by a draw from its full conditional:

initialize parameters at (µ, λ_y)^0
for t = 1, …, niter {
    set µ = a draw from N( ȳ_n, 1/(n λ_y) )
    set λ_y = a draw from Γ( a_y + n/2, b_y + ½ ∑_{i=1}^n (y_i − µ)² )
}
(Be sure to use the newly updated µ when updating λ_y.)

Draws (µ, λ_y)^1, …, (µ, λ_y)^{niter} are a dependent sample from π(µ, λ_y | y). In practice, the initial portion of the sample is discarded to remove the effect of the initialization values (µ, λ_y)^0.
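This scheme translates directly into a few lines of code. Below is a minimal Python sketch, assuming toy simulated data; the hyperparameter values and iteration counts are illustrative choices, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(1.0, 1.0, size=10)      # toy data: true mu = 1, lambda_y = 1
n, ybar = len(y), y.mean()
a_y, b_y = 1.0, 0.005                  # assumed Gamma(a_y, b_y) prior on lambda_y

mu, lam_y = 0.0, 1.0                   # initialize (mu, lambda_y)^0
niter, burn_in = 2000, 200
draws = np.empty((niter, 2))
for t in range(niter):
    # mu | lambda_y, y ~ N(ybar, 1/(n*lambda_y))   (flat prior on mu)
    mu = rng.normal(ybar, np.sqrt(1.0 / (n * lam_y)))
    # lambda_y | mu, y ~ Gamma(a_y + n/2, rate = b_y + .5*sum((y - mu)^2));
    # numpy's gamma takes scale = 1/rate, and we use the freshly updated mu
    rate = b_y + 0.5 * np.sum((y - mu) ** 2)
    lam_y = rng.gamma(a_y + n / 2, 1.0 / rate)
    draws[t] = mu, lam_y

posterior = draws[burn_in:]            # discard the initial portion
print(posterior.mean(axis=0))          # posterior means of (mu, lambda_y)
```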

7. Gibbs sampling for π(µ, λ_y | y)

[Figure: the Gibbs draws plotted in the (λ_y, µ) plane, tracing the sampler's path through the posterior.]

8. Posterior summary for π(µ, λ_y | y)

[Figure: the data y_i; the joint posterior scatter of (λ_y, µ); and MCMC trace plots of µ and λ_y over 1000 iterations.]

9. Gibbs sampler: intuition

Gibbs sampler for a bivariate normal density:

π(z) = π(z_1, z_2) ∝ exp{ −½ (z_1 z_2) [ 1 ρ; ρ 1 ]^{−1} (z_1 z_2)^T }

Full conditionals of π(z):

z_1 | z_2 ∼ N( ρ z_2, 1 − ρ² )
z_2 | z_1 ∼ N( ρ z_1, 1 − ρ² )

• initialize the chain with z^0 = (z_1^0, z_2^0)^T ∼ N( (0, 0)^T, [ 1 ρ; ρ 1 ] )
• draw z_1^1 ∼ N( ρ z_2^0, 1 − ρ² ); now (z_1^1, z_2^0)^T ∼ π(z)
• draw z_2^1 ∼ N( ρ z_1^1, 1 − ρ² ), giving z^1 = (z_1^1, z_2^1)^T ∼ π(z)

[Diagram: each Gibbs update moves the current point parallel to one coordinate axis, first in z_1, then in z_2.]

10. Gibbs sampler: intuition

The Gibbs sampler gives z^0, z^1, …, z^T, which can be treated as dependent draws from π(z). If z^0 is not a draw from π(z), then the initial realizations will not have the correct distribution. In practice, the first 100? 1000? realizations are discarded.

The draws can be used to make inference about π(z):

• The posterior mean of z is estimated by ( µ̂_1, µ̂_2 )^T = (1/T) ∑_{k=1}^T z^k
• Posterior probabilities:
  P(z_1 > 1) ≈ (1/T) ∑_{k=1}^T I[ z_1^k > 1 ]
  P(z_1 > z_2) ≈ (1/T) ∑_{k=1}^T I[ z_1^k > z_2^k ]
• 90% interval: ( z_1^{[5%]}, z_1^{[95%]} )

11. Sampling of π(µ, λ_y | y) via Metropolis

Initialize parameters at some setting (µ, λ_y)^0. For t = 1, …, T {

  update µ | λ_y, y:
  • generate proposal µ* ∼ U[ µ − r_µ, µ + r_µ ]
  • compute acceptance probability α = min{ 1, π(µ*, λ_y | y) / π(µ, λ_y | y) }
  • update µ to the new value: µ_new = µ* with probability α; µ with probability 1 − α

  update λ_y | µ, y analogously
}

Here we ran for T = 1000 scans, giving realizations (µ, λ_y)^1, …, (µ, λ_y)^T from the posterior; the first 100 were discarded as burn-in.

Note: the proposal width r_µ is tuned so that µ* is accepted about half the time; the proposal width r_{λ_y} is tuned so that λ_y* is accepted about half the time.
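A minimal Python sketch of this Metropolis scheme, assuming the same toy data as in the Gibbs example; the proposal widths r_mu and r_lam are illustrative tuning constants:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(1.0, 1.0, size=10)
n, a_y, b_y = len(y), 1.0, 0.005

def log_post(mu, lam_y):
    """Log of pi(mu, lam_y | y) up to an additive constant."""
    if lam_y <= 0:
        return -np.inf                          # reject proposals outside support
    return (n / 2 + a_y - 1) * np.log(lam_y) \
        - 0.5 * lam_y * np.sum((y - mu) ** 2) - b_y * lam_y

mu, lam_y = 0.0, 1.0
r_mu, r_lam = 1.0, 1.0                          # tune for ~50% acceptance
chain = []
for t in range(1000):
    mu_star = rng.uniform(mu - r_mu, mu + r_mu)          # update mu | lam_y, y
    if np.log(rng.uniform()) < log_post(mu_star, lam_y) - log_post(mu, lam_y):
        mu = mu_star
    lam_star = rng.uniform(lam_y - r_lam, lam_y + r_lam)  # update lam_y | mu, y
    if np.log(rng.uniform()) < log_post(mu, lam_star) - log_post(mu, lam_y):
        lam_y = lam_star
    chain.append((mu, lam_y))
chain = np.array(chain)[100:]                   # drop the first 100 for burn-in
```

Comparing log densities to a log-uniform draw implements the min{1, ratio} acceptance step without overflow.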

12. Metropolis sampling for π(µ, λ_y | y)

[Figure: the Metropolis draws plotted in the (λ_y, µ) plane.]

13. Posterior summary for π(µ, λ_y | y)

[Figure: the data y_i; the joint posterior scatter of (λ_y, µ); and MCMC trace plots of µ and λ_y over 1000 iterations.]

14. Sampling from non-standard multivariate distributions

Nick Metropolis, computing pioneer at Los Alamos National Laboratory:
• inventor of the Monte Carlo method
• inventor of Markov chain Monte Carlo: "Equation of State Calculations by Fast Computing Machines" (1953) by N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller and E. Teller, Journal of Chemical Physics; originally implemented on the MANIAC I computer at LANL

The algorithm constructs a Markov chain whose realizations are draws from the target (posterior) distribution. It constructs steps that maintain detailed balance.

15. Gibbs sampling and Metropolis for a bivariate normal density

π(z_1, z_2) ∝ exp{ −½ (z_1 z_2) [ 1 ρ; ρ 1 ]^{−1} (z_1 z_2)^T }

Gibbs updating (also called heat bath): sampling from the full conditionals

z_1 | z_2 ∼ N( ρ z_2, 1 − ρ² )
z_2 | z_1 ∼ N( ρ z_1, 1 − ρ² )

Metropolis updating:

generate z_1* ∼ U[ z_1 − r, z_1 + r ]
calculate α = min{ 1, π(z_1*, z_2) / π(z_1, z_2) } = min{ 1, π(z_1* | z_2) / π(z_1 | z_2) }
set z_1^new = z_1* with probability α; z_1 with probability 1 − α

16. Kernel basis representation for spatial processes z(s)

Define m basis functions k_1(s), …, k_m(s). Here k_j(s) is a normal density centered at spatial location ω_j:

k_j(s) = (1/√(2π)) exp{ −½ (s − ω_j)² }

Set z(s) = ∑_{j=1}^m k_j(s) x_j where x ∼ N(0, I_m).

One can represent z = (z(s_1), …, z(s_n))^T as z = Kx, where K_ij = k_j(s_i).

[Figure: the m kernel basis functions plotted over s.]
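A short Python sketch of this construction; the grid and knot locations are illustrative choices of mine, not the slide's:

```python
import numpy as np

rng = np.random.default_rng(2)
s = np.linspace(-2, 12, 200)                 # locations where z(s) is evaluated
omega = np.arange(0.0, 11.0, 2.0)            # m = 6 knot locations
# K_ij = k_j(s_i): normal density kernels centered at the knots
K = np.exp(-0.5 * (s[:, None] - omega[None, :]) ** 2) / np.sqrt(2 * np.pi)
x = rng.standard_normal(len(omega))          # knot weights x ~ N(0, I_m)
z = K @ x                                    # one realization of the process z(s)
```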

17. x and k(s) determine the spatial process z(s)

[Figure: the weighted kernels k_j(s) x_j and the resulting smooth process z(s).]

Continuous representation: z(s) = ∑_{j=1}^m k_j(s) x_j where x ∼ N(0, I_m).

Discrete representation: for z = (z(s_1), …, z(s_n))^T, z = Kx where K_ij = k_j(s_i).

18. Formulation for the 1-d example

Data y = (y(s_1), …, y(s_n))^T are observed at locations s_1, …, s_n. Once the knot locations ω_j, j = 1, …, m and the kernel choice k(s) are specified, the remaining model formulation is trivial:

Likelihood:

L(y | x, λ_y) ∝ λ_y^{n/2} exp{ −½ λ_y (y − Kx)^T (y − Kx) }, where K_ij = k(ω_j − s_i)

Priors:

π(x | λ_x) ∝ λ_x^{m/2} exp{ −½ λ_x x^T x }
π(λ_x) ∝ λ_x^{a_x − 1} exp{ −b_x λ_x }
π(λ_y) ∝ λ_y^{a_y − 1} exp{ −b_y λ_y }

Posterior:

π(x, λ_x, λ_y | y) ∝ λ_y^{a_y + n/2 − 1} exp{ −λ_y [ b_y + ½ (y − Kx)^T (y − Kx) ] } × λ_x^{a_x + m/2 − 1} exp{ −λ_x [ b_x + ½ x^T x ] }

19. Posterior and full conditionals

Posterior:

π(x, λ_x, λ_y | y) ∝ λ_y^{a_y + n/2 − 1} exp{ −λ_y [ b_y + ½ (y − Kx)^T (y − Kx) ] } × λ_x^{a_x + m/2 − 1} exp{ −λ_x [ b_x + ½ x^T x ] }

Full conditionals:

π(x | ···) ∝ exp{ −½ [ λ_y x^T K^T K x − 2 λ_y x^T K^T y + λ_x x^T x ] }
π(λ_x | ···) ∝ λ_x^{a_x + m/2 − 1} exp{ −λ_x [ b_x + ½ x^T x ] }
π(λ_y | ···) ∝ λ_y^{a_y + n/2 − 1} exp{ −λ_y [ b_y + ½ (y − Kx)^T (y − Kx) ] }

Gibbs sampler implementation:

x | ··· ∼ N( (λ_y K^T K + λ_x I_m)^{−1} λ_y K^T y, (λ_y K^T K + λ_x I_m)^{−1} )
λ_x | ··· ∼ Γ( a_x + m/2, b_x + ½ x^T x )
λ_y | ··· ∼ Γ( a_y + n/2, b_y + ½ (y − Kx)^T (y − Kx) )
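These three full conditionals give the whole Gibbs sampler in a few lines. A Python sketch, assuming y, K, and the hyperparameters have already been built as on slides 18 and 20:

```python
import numpy as np

def gibbs_kernel_model(y, K, a_x, b_x, a_y, b_y, niter=1000, seed=0):
    rng = np.random.default_rng(seed)
    n, m = K.shape
    x, lam_x, lam_y = np.zeros(m), 1.0, 1.0
    keep = []
    for t in range(niter):
        # x | ... ~ N((lam_y K'K + lam_x I)^{-1} lam_y K'y, (lam_y K'K + lam_x I)^{-1})
        V = np.linalg.inv(lam_y * K.T @ K + lam_x * np.eye(m))
        x = rng.multivariate_normal(V @ (lam_y * K.T @ y), V)
        # lam_x | ... ~ Gamma(a_x + m/2, rate = b_x + .5 x'x)
        lam_x = rng.gamma(a_x + m / 2, 1.0 / (b_x + 0.5 * x @ x))
        # lam_y | ... ~ Gamma(a_y + n/2, rate = b_y + .5 (y - Kx)'(y - Kx))
        r = y - K @ x
        lam_y = rng.gamma(a_y + n / 2, 1.0 / (b_y + 0.5 * r @ r))
        keep.append((x.copy(), lam_x, lam_y))
    return keep
```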

20. 1-d example

m = 6 knots evenly spaced between −.3 and 1.2; n = 5 data points at s = .05, .25, .52, .65, .91; k(s) is N(0, sd = .3).

a_y = 10, b_y = 10 · (.25²) ⇒ strong prior at λ_y = 1/.25²; a_x = 1, b_x = .001.

[Figure: the data and the basis functions k_j(s), with MCMC traces of the knot weights x_j and the precisions λ_x, λ_y over 1000 iterations.]

21. 1-d example

From posterior realizations of the knot weights x, one can construct posterior realizations of the smooth fitted function z(s) = ∑_{j=1}^m k_j(s) x_j. Note that the strong prior on λ_y is required since n is small.

[Figure: posterior realizations of z(s), and the posterior mean with pointwise 90% bounds.]

22. References

• D. Gamerman and H. F. Lopes (2006) Markov Chain Monte Carlo, Chapman & Hall.
• A. Gelman, J. B. Carlin, H. S. Stern and D. B. Rubin (1995) Bayesian Data Analysis, Chapman & Hall.
• M. Schervish (1995) Theory of Statistics, Springer-Verlag.
• J. Besag, P. J. Green, D. Higdon and K. Mengersen (1995) Bayesian computation and stochastic systems (with discussion), Statistical Science, 10, 3–66.
• D. Higdon (2002) Space and space-time modeling using process convolutions, in Quantitative Methods for Current Environmental Issues (C. Anderson, V. Barnett, P. C. Chatwin and A. H. El-Shaarawi, eds), 37–56.
• D. Higdon, H. Lee and C. Holloman (2003) Markov chain Monte Carlo approaches for inference in computationally intensive inverse problems, in Bayesian Statistics 7, Proceedings of the Seventh Valencia International Meeting (Bernardo, Bayarri, Berger, Dawid, Heckerman, Smith and West, eds).

  23. GAUSSIAN PROCESSES 1

24. Gaussian process models for spatial phenomena

[Figure: an example realization z(s) of a Gaussian process over s ∈ [0, 7].]

An example of a Gaussian process model for z(s) at locations s_1, …, s_n:

z = (z(s_1), …, z(s_n))^T ∼ N( 0, Σ ), with Σ_ij = exp{ −||s_i − s_j||² },

where ||s_i − s_j|| denotes the distance between locations s_i and s_j. z has density

π(z) = (2π)^{−n/2} |Σ|^{−1/2} exp{ −½ z^T Σ^{−1} z }.

25. Realizations from π(z) = (2π)^{−n/2} |Σ|^{−1/2} exp{ −½ z^T Σ^{−1} z }

[Figure: three independent realizations of z(s) over s ∈ [0, 7].]

The model for z(s) can be extended to continuous s.

26. Generating multivariate normal realizations

Independent normals are standard in any computer package: u ∼ N(0, I_n).

Well-known property of normals: if u ∼ N(µ, Σ), then z = Ku ∼ N(Kµ, K Σ K^T). Use this to construct correlated realizations from iid ones. Want z ∼ N(0, Σ):

1. compute a square root matrix L such that L L^T = Σ;
2. generate u ∼ N(0, I_n);
3. set z = Lu ∼ N(0, L I_n L^T = Σ).

• Any square root matrix L will do here.
• Columns of L are basis functions for representing realizations z.
• L need not be square; see over- or under-specified bases.
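A Python sketch of the recipe, using the Cholesky factor as the square root matrix; the small jitter term is an assumed numerical safeguard, not part of the slide's recipe:

```python
import numpy as np

rng = np.random.default_rng(3)
s = np.linspace(0, 10, 20)
Sigma = np.exp(-(s[:, None] - s[None, :]) ** 2)   # Sigma_ij = exp{-||si - sj||^2}
jitter = 1e-8 * np.eye(len(s))                    # keeps the Cholesky well posed
L = np.linalg.cholesky(Sigma + jitter)            # L L^T = Sigma
u = rng.standard_normal(len(s))                   # u ~ N(0, I_n)
z = L @ u                                         # z ~ N(0, Sigma)
```

Replacing the Cholesky factor with the pivoted Cholesky or SVD square roots of the next slides changes the basis columns of L but not the distribution of z.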

27. Standard Cholesky decomposition

z ∼ N(0, Σ), Σ = L L^T, z = Lu where u ∼ N(0, I_n), L lower triangular.

Σ_ij = exp{ −||s_i − s_j||² }, s_1, …, s_20 equally spaced between 0 and 10.

[Figure: the lower triangular pattern of L, and its columns plotted as basis functions over s.]

28. Cholesky decomposition with pivoting

z ∼ N(0, Σ), Σ = L L^T, z = Lu where u ∼ N(0, I_n), L permuted lower triangular.

Σ_ij = exp{ −||s_i − s_j||² }, s_1, …, s_20 equally spaced between 0 and 10.

[Figure: the pattern of the pivoted L, and its columns plotted as basis functions over s.]

29. Singular value decomposition

z ∼ N(0, Σ), Σ = U Λ U^T = L L^T, z = Lu where u ∼ N(0, I_n).

Σ_ij = exp{ −||s_i − s_j||² }, s_1, …, s_20 equally spaced between 0 and 10.

[Figure: the pattern of L from the SVD, and its columns plotted as basis functions over s.]

30. Conditioning on some observations of z(s)

[Figure: a realization of z(s) over s ∈ [0, 7], with observations marked at s_2 and s_5.]

We observe z(s_2) and z(s_5). What do we now know about { z(s_1), z(s_3), z(s_4), z(s_6), z(s_7), z(s_8) }?

The observed and unobserved values are jointly multivariate normal,

( z(s_2), z(s_5), z(s_1), z(s_3), …, z(s_8) )^T ∼ N( 0, Σ ),

where the entries of Σ follow from the pairwise distances, Σ_ij = exp{ −||s_i − s_j||² }: 1 on the diagonal, .3679 for unit separation, about .0001 for separation 3, and so on.

31. Conditioning on some observations of z(s)

( z_1; z_2 ) ∼ N( (0; 0), [ Σ_11 Σ_12; Σ_21 Σ_22 ] )  ⇒  z_2 | z_1 ∼ N( Σ_21 Σ_11^{−1} z_1, Σ_22 − Σ_21 Σ_11^{−1} Σ_12 )

[Figure: the conditional mean of z(s) given the observations, and conditional realizations passing through the observed points.]
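A Python sketch of this conditioning rule on the eight-location example of slide 30; the observed values in z1 are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
s = np.arange(1.0, 9.0)                       # s_1, ..., s_8
Sigma = np.exp(-(s[:, None] - s[None, :]) ** 2)
obs = [1, 4]                                  # indices of s_2 and s_5
pred = [i for i in range(len(s)) if i not in obs]
S11 = Sigma[np.ix_(obs, obs)]
S12 = Sigma[np.ix_(obs, pred)]
S22 = Sigma[np.ix_(pred, pred)]
z1 = np.array([0.5, -1.0])                    # assumed observed values
A = np.linalg.solve(S11, S12)                 # Sigma_11^{-1} Sigma_12
cond_mean = A.T @ z1                          # Sigma_21 Sigma_11^{-1} z1
cond_cov = S22 - S12.T @ A                    # Sigma_22 - Sigma_21 Sigma_11^{-1} Sigma_12
draw = rng.multivariate_normal(cond_mean, cond_cov)   # one conditional realization
```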

32. More examples with various covariance functions and spatial scales

Gaussian: Σ_ij = exp{ −(||s_i − s_j||/scale)² }; exponential: Σ_ij = exp{ −(||s_i − s_j||/scale)¹ }.

[Figure: a 3 × 3 grid of conditional realizations through six observed points over x ∈ [0, 15]: Gaussian covariance (scale = 2, 3, 5), exponential covariance (scale = 1, 10, 20), and a power covariance with p = 1.5, "Brownian motion" style (scale = 1, 3, 5).]

33. More examples with various covariance functions and spatial scales

Gaussian: Σ_ij = exp{ −(||s_i − s_j||/scale)² }; exponential: Σ_ij = exp{ −(||s_i − s_j||/scale)¹ }.

[Figure: the same 3 × 3 grid of covariance choices, now showing realizations extrapolating beyond the observed points over x ∈ [−5, 20].]

34. A 2-d example, conditioning on the edge

Σ_ij = exp{ −(||s_i − s_j||/5)² }

[Figure: four perspective surfaces over (X, Y): an unconditional realization; the mean conditional on the points along the Y = 1 edge; and two realizations conditional on those edge points.]

35. Soft Conditioning (Bayes Rule)

[Figure: a realization z(s) with noisy observations y over s ∈ [0, 7].]

Observed data y are a noisy version of z:

y(s_i) = z(s_i) + ε(s_i) with ε(s_i) iid ∼ N(0, σ_y²), i = 1, …, n

Data: Σ_y = σ_y² I_n. Spatial process prior for z(s): mean µ_z = 0, covariance Σ_z.

L(y | z) ∝ |Σ_y|^{−1/2} exp{ −½ (y − z)^T Σ_y^{−1} (y − z) }
π(z) ∝ |Σ_z|^{−1/2} exp{ −½ z^T Σ_z^{−1} z }

36. Soft Conditioning (Bayes Rule), continued

sampling model: L(y | z) ∝ |Σ_y|^{−1/2} exp{ −½ (y − z)^T Σ_y^{−1} (y − z) }
spatial prior: π(z) ∝ |Σ_z|^{−1/2} exp{ −½ z^T Σ_z^{−1} z }

⇒ π(z | y) ∝ L(y | z) × π(z)
⇒ π(z | y) ∝ exp{ −½ [ z^T (Σ_y^{−1} + Σ_z^{−1}) z − 2 z^T Σ_y^{−1} y + f(y) ] }
⇒ z | y ∼ N( V Σ_y^{−1} y, V ), where V = (Σ_y^{−1} + Σ_z^{−1})^{−1}

[Figure: conditional realizations of z(s) given the noisy data.]

π(z | y) describes the updated uncertainty about z given the observations.
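A Python sketch of the soft-conditioning computation; the covariance choice Σ_z,ij = exp{−||s_i − s_j||²} follows the earlier slides, and the jitter is an assumed numerical safeguard:

```python
import numpy as np

def soft_condition(y, s, sigma_y):
    """Posterior mean and covariance of z | y for the GP model on slide 36."""
    n = len(y)
    Sigma_z = np.exp(-(s[:, None] - s[None, :]) ** 2)   # GP prior covariance
    Pz = np.linalg.inv(Sigma_z + 1e-8 * np.eye(n))      # Sigma_z^{-1} (jittered)
    Py = np.eye(n) / sigma_y ** 2                       # Sigma_y^{-1}
    V = np.linalg.inv(Py + Pz)                          # posterior covariance
    return V @ Py @ y, V                                # posterior mean, covariance
```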

37. Updated predictions for unobserved z(s)'s

[Figure: conditional realizations of z(s) at both data and prediction locations.]

data locations: y_d = (y(s_1), …, y(s_n))^T, z_d = (z(s_1), …, z(s_n))^T
prediction locations: y* = (y(s*_1), …, y(s*_m))^T, z* = (z(s*_1), …, z(s*_m))^T
define y = (y_d; y*) and z = (z_d; z*)

Data: y = (y_d; 0_m), µ_z = (0_n; 0_m), Σ_y = [ σ_y² I_n 0; 0 ∞ I_m ], and Σ_z is the covariance rule applied to (s, s*).

Define the generalized inverse Σ_y^− = [ (1/σ_y²) I_n 0; 0 0 ].

Now the posterior distribution for z = (z_d, z*) is

z | y ∼ N( V Σ_y^− y, V ), where V = (Σ_y^− + Σ_z^{−1})^{−1}

38. Updated predictions for unobserved z(s)'s. Alternative: use the conditional normal rules

[Figure: conditional realizations of z(s).]

data locations: y = (y(s_1), …, y(s_n))^T = (z(s_1) + ε(s_1), …, z(s_n) + ε(s_n))^T
prediction locations: z* = (z(s*_1), …, z(s*_m))^T

Jointly,

( y; z* ) ∼ N( 0, [ σ_y² I_n 0; 0 0 ] + Σ_z ),

where the (n + m) × (n + m) matrix Σ_z = [ Σ_z(s, s) Σ_z(s, s*); Σ_z(s*, s) Σ_z(s*, s*) ] is the covariance rule applied to (s, s*). Therefore

z* | y ∼ N( µ*, Σ* ) where
µ* = Σ_z(s*, s) [ σ_y² I_n + Σ_z(s, s) ]^{−1} y
Σ* = Σ_z(s*, s*) − Σ_z(s*, s) [ σ_y² I_n + Σ_z(s, s) ]^{−1} Σ_z(s, s*)

39. Example: Dioxin concentration at Piazza Road Superfund Site

[Figure: three maps of the site: the posterior mean of z*, the sampling data locations, and the pointwise posterior sd.]

40. Bonus topic: constructing simultaneous intervals

• Generate a large sample of m-vectors z* from π(z* | y).
• Compute the m-vector ẑ* that is the mean of the generated z*'s.
• Compute the m-vector σ̂ that is the pointwise sd of the generated z*'s.
• Find the constant a such that 80% of the generated z*'s are completely contained within ẑ* ± a σ̂ (a code sketch follows below).

[Figure: posterior realizations and mean; the pointwise ± sd band; and the resulting simultaneous 80% credible interval.]
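A Python sketch of this recipe, assuming zs is an (N, m) array of posterior draws of z* produced by earlier sampling:

```python
import numpy as np

def simultaneous_band(zs, coverage=0.80):
    """Widen the pointwise +/- a*sd band until `coverage` of draws fit inside."""
    zhat = zs.mean(axis=0)                    # pointwise posterior mean
    sd = zs.std(axis=0)                       # pointwise posterior sd
    # worst standardized deviation of each draw from the mean curve
    dev = np.max(np.abs(zs - zhat) / sd, axis=1)
    a = np.quantile(dev, coverage)            # smallest a covering 80% of draws
    return zhat - a * sd, zhat + a * sd
```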

41. References

• B. Ripley (1989) Spatial Statistics, Wiley.
• N. Cressie (1992) Statistics for Spatial Data, Wiley.
• M. Stein (1999) Interpolation of Spatial Data: Some Theory for Kriging, Springer.

  42. GAUSSIAN PROCESSES 2

43. Gaussian process models revisited

Application: finding the resonance frequency of a rod of material.

• For various driving frequencies, the acceleration of the rod is recorded.
• The true frequency-acceleration curve is smooth.
• We have noisy measurements of acceleration.
• Goal: estimate the resonance frequency.
• Use a GP model for the frequency-acceleration curve.
• The smoothness of the GP model is important here.

[Figure: scaled acceleration versus scaled frequency, with noisy measurements around a smooth resonance curve.]

44. Gaussian process model formulation

Take the response y to be acceleration and the spatial value s to be frequency.

data: y = (y_1, …, y_n)^T at spatial locations s_1, …, s_n

z(s) is a mean-0 Gaussian process with covariance function

Cov( z(s), z(s′) ) = (1/λ_z) exp{ −β (s − s′)² },

where β controls the strength of dependence. Take z = (z(s_1), …, z(s_n))^T to be z(s) restricted to the data observations. Model the data as

y = z + ε, where ε ∼ N( 0, (1/λ_y) I_n ).

We want to find the posterior distribution for the frequency s⋆ where z(s) is maximal.

[Figure: the noisy acceleration measurements over scaled frequency.]

45. Reparameterizing the spatial dependence parameter β

It is convenient to reparameterize β as

ρ = exp{ −β (1/2)² } ⇔ β = −4 log(ρ),

so ρ is the correlation between two points on z(s) separated by 1/2. Hence z has spatial prior

z | ρ, λ_z ∼ N( 0, (1/λ_z) R(ρ; s) ),

where R(ρ; s) is the correlation matrix with elements R_ij = ρ^{4 (s_i − s_j)²}.

The prior specification for z(s) is completed by specifying priors for λ_z and ρ:

π(λ_z) ∝ λ_z^{a_z − 1} exp{ −b_z λ_z }: if y is standardized, encourage λ_z to be close to 1, e.g. a_z = b_z = 5
π(ρ) ∝ (1 − ρ)^{−.5}: encourages ρ to be large if possible

46. Bayesian model formulation

Likelihood:

L(y | z, λ_y) ∝ λ_y^{n/2} exp{ −½ λ_y (y − z)^T (y − z) }

Priors:

π(z | λ_z, ρ) ∝ λ_z^{n/2} |R(ρ; s)|^{−1/2} exp{ −½ λ_z z^T R(ρ; s)^{−1} z }
π(λ_y) ∝ λ_y^{a_y − 1} e^{−b_y λ_y}, uninformative here: a_y = 1, b_y = .005
π(λ_z) ∝ λ_z^{a_z − 1} e^{−b_z λ_z}, fairly informative: a_z = 5, b_z = 5
π(ρ) ∝ (1 − ρ)^{−.5}

Marginal likelihood (integrating out z):

L(y | λ_y, λ_z, ρ) ∝ |Λ|^{1/2} exp{ −½ y^T Λ y }, where Λ^{−1} = (1/λ_y) I_n + (1/λ_z) R(ρ; s)

Posterior:

π(λ_y, λ_z, ρ | y) ∝ |Λ|^{1/2} exp{ −½ y^T Λ y } × λ_y^{a_y − 1} e^{−b_y λ_y} × λ_z^{a_z − 1} e^{−b_z λ_z} × (1 − ρ)^{−.5}
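The marginal posterior involves only (λ_y, λ_z, ρ), so a Metropolis sampler needs just this log marginal likelihood. A Python sketch, with all arguments assumed given:

```python
import numpy as np

def log_marginal(y, s, lam_y, lam_z, rho):
    """log L(y | lam_y, lam_z, rho) up to a constant:
    |Lambda|^{1/2} exp{-.5 y' Lambda y}, Lambda^{-1} = I/lam_y + R(rho; s)/lam_z."""
    n = len(y)
    R = rho ** (4 * (s[:, None] - s[None, :]) ** 2)   # R_ij = rho^{4 (si - sj)^2}
    Lambda_inv = np.eye(n) / lam_y + R / lam_z
    sign, logdet = np.linalg.slogdet(Lambda_inv)      # log |Lambda^{-1}|
    alpha = np.linalg.solve(Lambda_inv, y)            # Lambda y
    return -0.5 * logdet - 0.5 * y @ alpha
```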

47. Posterior Simulation

Use Metropolis to simulate from the posterior

π(λ_y, λ_z, ρ | y) ∝ |Λ|^{1/2} exp{ −½ y^T Λ y } × λ_y^{a_y − 1} e^{−b_y λ_y} × λ_z^{a_z − 1} e^{−b_z λ_z} × (1 − ρ)^{−.5},

giving (after burn-in) (λ_y, λ_z, ρ)^1, …, (λ_y, λ_z, ρ)^T.

For any given realization (λ_y, λ_z, ρ)^t, one can generate z* = (z(s*_1), …, z(s*_m))^T for any set of prediction locations s*_1, …, s*_m. From the previous GP material, we know

( z; z* ) | ··· ∼ N( V Σ_y^− (y; 0_m), V ),

where Σ_y^− = [ λ_y I_n 0; 0 0 ] and V^{−1} = Σ_y^− + λ_z R(ρ, (s, s*))^{−1}.

Hence, one can generate corresponding z*'s for each posterior realization on a fine grid around the apparent resonance frequency s⋆.

48. Or use the conditional normal formula

( y; z* ) | ··· ∼ N( 0, [ λ_y^{−1} I_n 0; 0 0 ] + λ_z^{−1} R(ρ, (s, s*)) ),

where the (n + m) × (n + m) correlation matrix

R(ρ, (s, s*)) = [ R(ρ, (s, s)) R(ρ, (s, s*)); R(ρ, (s*, s)) R(ρ, (s*, s*)) ]

is the correlation rule applied to (s, s*). Therefore

z* | y ∼ N( µ*, Σ* ) where
µ* = λ_z^{−1} R(ρ, (s*, s)) [ λ_y^{−1} I_n + λ_z^{−1} R(ρ, (s, s)) ]^{−1} y
Σ* = λ_z^{−1} R(ρ, (s*, s*)) − λ_z^{−1} R(ρ, (s*, s)) [ λ_y^{−1} I_n + λ_z^{−1} R(ρ, (s, s)) ]^{−1} λ_z^{−1} R(ρ, (s, s*))

49. MCMC output for (λ_y, λ_z, ρ)

[Figure: trace plots of ρ, λ_z, and λ_y over 3000 iterations.]

50. Posterior realizations for z(s) near s⋆

[Figure: posterior realizations of the scaled frequency-acceleration curve near the resonance peak.]

51. Posterior for the resonance frequency s⋆

[Figure: posterior distribution for the scaled resonance frequency, concentrated roughly between 0.43 and 0.52.]

52. Gaussian Processes for modeling complex computer simulators

data: y = (y_1, …, y_n)^T
input settings (spatial locations): S, the n × p matrix with rows s_i = (s_i1, s_i2, …, s_ip)

Model the responses y as a (stochastic) function of s:

y(s) = z(s) + ε(s)

Vector form, restricting to the n data points: y = z + ε

53. Model the response as a Gaussian process

y(s) = z(s) + ε

Likelihood:

L(y | z, λ_ε) ∝ λ_ε^{n/2} exp{ −½ λ_ε (y − z)^T (y − z) }

Priors:

π(z | λ_z, β) ∝ λ_z^{n/2} |R(β)|^{−1/2} exp{ −½ λ_z z^T R(β)^{−1} z }
π(λ_ε) ∝ λ_ε^{a_ε − 1} e^{−b_ε λ_ε}, perhaps quite informative
π(λ_z) ∝ λ_z^{a_z − 1} e^{−b_z λ_z}, fairly informative if the data have been standardized
π(ρ) ∝ ∏_{k=1}^p (1 − ρ_k)^{−.5}

Marginal likelihood (integrating out z):

L(y | λ_ε, λ_z, β) ∝ |Λ|^{1/2} exp{ −½ y^T Λ y }, where Λ^{−1} = (1/λ_ε) I_n + (1/λ_z) R(β)

54. GASP Covariance model for z(s)

Cov( z(s_i), z(s_j) ) = (1/λ_z) R_ij(β) = (1/λ_z) ∏_{k=1}^p exp{ −β_k (s_ik − s_jk)^α }

• Typically α = 2 ⇒ z(s) is smooth.
• Separable covariance: a product of componentwise covariances.
• Can handle a large number of covariates/inputs p.
• Can allow for multiway interactions.
• β_k = 0 ⇒ input k is "inactive" ⇒ variable selection.
• Reparameterize: ρ_k = exp{ −β_k d_0^α }; typically d_0 is a halfwidth (see the sketch below).
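A Python sketch of the separable correlation in the ρ parameterization, taking α = 2 and d_0 = 1/2 so that R_ij = ∏_k ρ_k^{4 (s_ik − s_jk)²}; the function name is mine:

```python
import numpy as np

def gasp_corr(S1, S2, rho):
    """Separable GASP correlation between input matrices S1 (n1, p) and S2 (n2, p),
    with a length-p vector rho of correlations at distance d_0 = 1/2."""
    d2 = (S1[:, None, :] - S2[None, :, :]) ** 2   # squared componentwise gaps
    return np.prod(rho[None, None, :] ** (4 * d2), axis=2)
```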

55. Posterior Distribution and MCMC

π(λ_ε, λ_z, ρ | y) ∝ |Λ_{λ,ρ}|^{1/2} exp{ −½ y^T Λ_{λ,ρ} y } × λ_ε^{a_ε − 1} e^{−b_ε λ_ε} × λ_z^{a_z − 1} e^{−b_z λ_z} × ∏_{k=1}^p (1 − ρ_k)^{−.5}

• MCMC implementation requires Metropolis updates.
• Realizations of z(s) | λ, ρ, y can be obtained post hoc: define z* = (z(s*_1), …, z(s*_m))^T to be predictions at locations s*_1, …, s*_m; then

( z; z* ) | ··· ∼ N( V Σ_y^− (y; 0_m), V ),

where Σ_y^− = [ λ_ε I_n 0; 0 0 ] and V^{−1} = Σ_y^− + λ_z R(ρ, (s, s*))^{−1}.

56. Example: Solar collector Code (Schonlau, Hamada and Welch, 1995)

• n = 98 model runs, varying 6 independent variables.
• The response is the increase in heat-exchange effectiveness.
• A Latin hypercube (LHC) design was used with 2-d space filling.

[Figure: pairs plot of the six inputs (wind.vel, slot.width, Rey.num, admittance, plate.thickness, Nusselt.num) showing the space-filling design.]

57. Example: Solar collector Code

• Fit of the GASP model and predictions of 10 holdout points.
• The two most active covariates are shown here.

[Figure: MCMC traces of λ_ε, λ_z and β_{1:p}; predicted versus actual response for the holdout points; and the posterior mean response surface over the two most active inputs, x4 and x5.]

58. Example: Solar collector Code

• Visualizing a 6-d response surface is difficult.
• 1-d marginal effects are shown here.

[Figure: estimated 1-d marginal effects of each input x1, …, x6 on the response y.]

59. References

• J. Sacks, W. J. Welch, T. J. Mitchell and H. P. Wynn (1989) Design and analysis of computer experiments, Statistical Science, 4, 409–435.

  60. COMPUTER MODEL CALIBRATION 1

61. Inference combining a physics model with experimental data

Data generated from the model: d²z/dt² = −1 − .3 dz/dt + ε

Simulation model: d²z/dt² = −1
Statistical model: y(z) = η(z) + δ(z) + ε

Improved physics model: d²z/dt² = −1 − θ dz/dt
Statistical model: y(z) = η(z, θ) + δ(z) + ε

[Figure: drop time versus drop height (floors 1–10) under each model, with the data.]

62. Accounting for limited simulator runs

• Borrows from Kennedy and O'Hagan (2001).

Notation:
x: model or system inputs
θ: calibration parameters
ζ(x): true physical system response given inputs x
η(x, θ): simulator response at x and θ
y(x): experimental observation of the physical system
e(x): observation error of the experimental data

y(x) = ζ(x) + e(x)
y(x) = η(x, θ) + e(x)

The simulator is run at limited input settings: η = (η(x*_1, θ*_1), …, η(x*_m, θ*_m))^T. Treat η(·, ·) as a random function and use a GP prior for η(·, ·).

[Figure: data and simulations y(x), η(x, θ) over (x, θ), with the model runs at scattered input settings.]

63. OA designs for simulator runs

Example: N = 16, 3 factors each at 4 levels; an OA(16, 4³) design.

[Figure: the design in (x_1, x_2, x_3) and its 2-d projections.]

The OA design ensures that importance measures R² can be accurately estimated for low dimensions. One can spread out the design for building a response-surface emulator of η(x).

64. Gaussian Process models for combining field data and complex computer simulators

field data: y = (y(x_1), …, y(x_n))^T at input settings x_i = (x_i1, …, x_ip_x)
sim data: η = (η(x*_1, θ*_1), …, η(x*_m, θ*_m))^T at input settings x*_i with parameters θ*_i

Model the sim response η(x, θ) as a Gaussian process:

y(x) = η(x, θ) + ε
η(x, θ) ∼ GP( 0, C_η(x, θ) )
ε ∼ iid N( 0, 1/λ_ε )

C_η(x, θ) depends on the (p_x + p_θ)-vector ρ_η and λ_η.

65. Vector form, restricting to n field obs and m simulation runs

y = η(θ) + ε, η ∼ N_m( 0_m, C_η(ρ_η, λ_η) )

⇒ ( y; η ) ∼ N_{n+m}( 0, C_yη ), where C_yη = C_η + [ (1/λ_ε) I_n 0; 0 (1/λ_s) I_m ],

C_η = (1/λ_η) R_η( (x, θ; x*, θ*); ρ_η ), and the correlation matrix R_η is given by

R_η( (x, θ), (x′, θ′); ρ_η ) = ∏_{k=1}^{p_x} ρ_{ηk}^{4 (x_k − x′_k)²} × ∏_{k=1}^{p_θ} ρ_{η(k+p_x)}^{4 (θ_k − θ′_k)²}

λ_s is typically set to something large, like 10⁶, to stabilize matrix computations and allow for numerical fluctuation in η(x, θ).

Note: the covariance matrix C_yη depends on θ through its "distance"-based correlation function R_η( (x, θ), (x′, θ′); ρ_η ). We use a 0 mean for η(x, θ); an alternative is to use a linear regression mean model.
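A Python sketch assembling C_yη under these definitions: the field rows carry the common calibration value θ while the simulator rows carry the design values θ*. The helper repeats the separable correlation sketched after slide 54; names here are mine:

```python
import numpy as np

def gasp_corr(S1, S2, rho):
    # separable GASP correlation, as in the slide-54 sketch
    d2 = (S1[:, None, :] - S2[None, :, :]) ** 2
    return np.prod(rho[None, None, :] ** (4 * d2), axis=2)

def joint_cov(x, x_star, theta, theta_star, rho_eta, lam_eta, lam_eps, lam_s):
    """C_y_eta for (y, eta): x is (n, p_x), x_star is (m, p_x),
    theta is (p_theta,), theta_star is (m, p_theta)."""
    n, m = x.shape[0], x_star.shape[0]
    # stack field inputs [x, theta] over simulator inputs [x*, theta*]
    D = np.vstack([np.hstack([x, np.tile(theta, (n, 1))]),
                   np.hstack([x_star, theta_star])])
    C = gasp_corr(D, D, rho_eta) / lam_eta      # C_eta = (1/lam_eta) R_eta
    C[:n, :n] += np.eye(n) / lam_eps            # (1/lam_eps) I_n on the field block
    C[n:, n:] += np.eye(m) / lam_s              # (1/lam_s) I_m on the sim block
    return C
```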

66. Likelihood

L(y, η | λ_ε, ρ_η, λ_η, λ_s, θ) ∝ |C_yη|^{−1/2} exp{ −½ (y; η)^T C_yη^{−1} (y; η) }

Priors:

π(λ_ε) ∝ λ_ε^{a_ε − 1} e^{−b_ε λ_ε}, perhaps well known from the observation process
π(ρ_ηk) ∝ ∏_{k=1}^{p_x + p_θ} (1 − ρ_ηk)^{−.5}, where ρ_ηk = e^{−.5² β_ηk} is the correlation at distance .5; a β(1, .5) prior
π(λ_η) ∝ λ_η^{a_η − 1} e^{−b_η λ_η}
π(λ_s) ∝ λ_s^{a_s − 1} e^{−b_s λ_s}
π(θ) ∝ I[ θ ∈ C ]

• One could fix ρ_η, λ_η from a prior GASP run on the model output.
• Many prefer to reparameterize ρ as β = −log(ρ)/.5² in the likelihood term.

67. Posterior Density

π(λ_ε, ρ_η, λ_η, λ_s, θ | y, η) ∝ |C_yη|^{−1/2} exp{ −½ (y; η)^T C_yη^{−1} (y; η) } × ∏_{k=1}^{p_x + p_θ} (1 − ρ_ηk)^{−.5} × λ_η^{a_η − 1} e^{−b_η λ_η} × λ_s^{a_s − 1} e^{−b_s λ_s} × λ_ε^{a_ε − 1} e^{−b_ε λ_ε} × I[ θ ∈ C ]

If ρ_η, λ_η, and λ_s are fixed from a previous analysis of the simulator data, then

π(λ_ε, θ | y, η, ρ_η, λ_η, λ_s) ∝ |C_yη|^{−1/2} exp{ −½ (y; η)^T C_yη^{−1} (y; η) } × λ_ε^{a_ε − 1} e^{−b_ε λ_ε} × I[ θ ∈ C ]

68. Accounting for limited simulation runs

[Figure: posterior realizations of η(x, θ*), with prior and posterior uncertainty for y(x), η(x, θ) and θ*.]

Again, standard Bayesian estimation gives:

π(θ, η(·,·), λ_ε, ρ_η, λ_η | y(x)) ∝ L(y(x) | η(x, θ), λ_ε) × π(θ) × π(η(·,·) | λ_η, ρ_η) × π(λ_ε) × π(ρ_η) × π(λ_η)

• Posterior means and quantiles shown.
• Uncertainty in θ, η(·,·), and the nuisance parameters is incorporated into the forecast.
• Gaussian process models for η(·,·).

69. Predicting a new outcome: ζ = ζ(x′) = η(x′, θ)

Given a MCMC realization (θ, λ_ε, ρ_η, λ_η), a realization for ζ(x′) can be produced using Bayes rule. For v = (y; η; ζ), the GP prior for η(x, θ) gives

C_η = (1/λ_η) R_η( (x, θ; x*, θ*; x′, θ); ρ_η ), µ_v = 0, Σ_v^− = [ λ_ε I_n 0 0; 0 λ_s I_m 0; 0 0 0 ].

Now the posterior distribution for v = (y, η, ζ)^T is

v | y, η ∼ N( µ_{v|yη} = V Σ_v^− (y; η; 0), V ), where V = (Σ_v^− + C_η^{−1})^{−1}.

Restricting to ζ we have

ζ | y, η ∼ N( (µ_{v|yη})_{n+m+1}, V_{n+m+1, n+m+1} ).

Alternatively, one can apply the conditional normal formula to

( y; η; ζ ) ∼ N( 0, [ λ_ε^{−1} I_n 0 0; 0 λ_s^{−1} I_m 0; 0 0 0 ] + C_η ),

so that ζ | y, η ∼ N( Σ_21 Σ_11^{−1} (y; η), Σ_22 − Σ_21 Σ_11^{−1} Σ_12 ).

70. Accounting for model discrepancy

• Borrows from Kennedy and O'Hagan (2001).

Notation:
x: model or system inputs
θ: calibration parameters
ζ(x): true physical system response given inputs x
η(x, θ): simulator response at x and θ
y(x): experimental observation of the physical system
δ(x): discrepancy between ζ(x) and η(x, θ); may be decomposed into numerical error and bias
e(x): observation error of the experimental data

y(x) = ζ(x) + e(x)
y(x) = η(x, θ) + δ(x) + e(x)
y(x) = η(x, θ) + δ_n(x) + δ_b(x) + e(x)

[Figure: prior uncertainty for y(x), η(x, θ) over x and θ.]

71. Accounting for model discrepancy

Again, standard Bayesian estimation gives:

π(θ, η, δ | y(x)) ∝ L(y(x) | η(x, θ), δ(x)) × π(θ) × π(η) × π(δ)

• Posterior means and 90% CI's shown.
• The posterior prediction for ζ(x) is obtained by computing the posterior distribution for η(x, θ) + δ(x).
• Uncertainty in θ, η(x, t), and δ(x) is incorporated into the forecast.
• Gaussian process models for η(x, t) and δ(x).

[Figure: prior uncertainty, posterior model uncertainty, posterior model discrepancy δ(x), and the calibrated forecast for ζ(x).]

72. Gaussian Process models for combining field data and complex computer simulators

field data: y = (y(x_1), …, y(x_n))^T at input settings x_i = (x_i1, …, x_ip_x)
sim data: η = (η(x*_1, θ*_1), …, η(x*_m, θ*_m))^T at input settings x*_i with parameters θ*_i

Model the sim response η(x, θ) as a Gaussian process:

y(x) = η(x, θ) + δ(x) + ε
η(x, θ) ∼ GP( 0, C_η(x, θ) )
δ(x) ∼ GP( 0, C_δ(x) )
ε ∼ iid N( 0, 1/λ_ε )

C_η(x, θ) depends on the (p_x + p_θ)-vector ρ_η and λ_η; C_δ(x) depends on the p_x-vector ρ_δ and λ_δ.

73. Vector form, restricting to n field obs and m simulation runs

y = η(θ) + δ + ε, η ∼ N_m( 0_m, C_η(ρ_η, λ_η) )

⇒ ( y; η ) ∼ N_{n+m}( 0, C_yη ), where C_yη = C_η + [ C_δ 0; 0 0 ],

C_η = (1/λ_η) R_η( (x, θ; x*, θ*); ρ_η ) + (1/λ_s) I_{m+n}
C_δ = (1/λ_δ) R_δ( x; ρ_δ ) + (1/λ_ε) I_n

and the correlation matrices R_η and R_δ are given by

R_η( (x, θ), (x′, θ′); ρ_η ) = ∏_{k=1}^{p_x} ρ_{ηk}^{4 (x_k − x′_k)²} × ∏_{k=1}^{p_θ} ρ_{η(k+p_x)}^{4 (θ_k − θ′_k)²}
R_δ( x, x′; ρ_δ ) = ∏_{k=1}^{p_x} ρ_{δk}^{4 (x_k − x′_k)²}

λ_s is typically set to something large, like 10⁶, to stabilize matrix computations and allow for numerical fluctuation in η(x, θ). We use a 0 mean for η(x, θ); an alternative is to use a linear regression mean model.

74. Likelihood

L(y, η | λ_ε, ρ_η, λ_η, λ_s, ρ_δ, λ_δ, θ) ∝ |C_yη|^{−1/2} exp{ −½ (y; η)^T C_yη^{−1} (y; η) }

Priors:

π(λ_ε) ∝ λ_ε^{a_ε − 1} e^{−b_ε λ_ε}, perhaps well known from the observation process
π(ρ_ηk) ∝ ∏_{k=1}^{p_x + p_θ} (1 − ρ_ηk)^{−.5}, where ρ_ηk = e^{−.5² β_ηk} is the correlation at distance .5; a β(1, .5) prior
π(λ_η) ∝ λ_η^{a_η − 1} e^{−b_η λ_η}
π(λ_s) ∝ λ_s^{a_s − 1} e^{−b_s λ_s}
π(ρ_δk) ∝ ∏_{k=1}^{p_x} (1 − ρ_δk)^{−.5}, where ρ_δk = e^{−.5² β_δk}
π(λ_δ) ∝ λ_δ^{a_δ − 1} e^{−b_δ λ_δ}
π(θ) ∝ I[ θ ∈ C ]

• One could fix ρ_η, λ_η from a prior GASP run on the model output.
• Again, many choose to reparameterize the correlation parameters: β = −log(ρ)/.5² in the likelihood term.

75. Posterior Density

π(λ_ε, ρ_η, λ_η, λ_s, ρ_δ, λ_δ, θ | y, η) ∝ |C_yη|^{−1/2} exp{ −½ (y; η)^T C_yη^{−1} (y; η) } × ∏_{k=1}^{p_x + p_θ} (1 − ρ_ηk)^{−.5} × λ_η^{a_η − 1} e^{−b_η λ_η} × λ_s^{a_s − 1} e^{−b_s λ_s} × ∏_{k=1}^{p_x} (1 − ρ_δk)^{−.5} × λ_δ^{a_δ − 1} e^{−b_δ λ_δ} × λ_ε^{a_ε − 1} e^{−b_ε λ_ε} × I[ θ ∈ C ]

If ρ_η, λ_η, and λ_s are fixed from a previous analysis of the simulator data, then

π(λ_ε, ρ_δ, λ_δ, θ | y, η, ρ_η, λ_η, λ_s) ∝ |C_yη|^{−1/2} exp{ −½ (y; η)^T C_yη^{−1} (y; η) } × ∏_{k=1}^{p_x} (1 − ρ_δk)^{−.5} × λ_δ^{a_δ − 1} e^{−b_δ λ_δ} × λ_ε^{a_ε − 1} e^{−b_ε λ_ε} × I[ θ ∈ C ]

76. Predicting a new outcome: ζ = ζ(x′) = η(x′, θ) + δ(x′)

y = η(x, θ) + δ(x) + ε(x)
η = η(x*, θ*) + ε_s, with ε_s small or 0
ζ = η(x′, θ) + δ(x′), with x′ univariate or multivariate

⇒ ( y; η; ζ ) ∼ N_{n+m+1}( 0, [ λ_ε^{−1} I_n 0 0; 0 λ_s^{−1} I_m 0; 0 0 0 ] + C_η + C_δ )   (1)

where

C_η = (1/λ_η) R_η( (x, θ; x*, θ*; x′, θ); ρ_η )
C_δ = (1/λ_δ) R_δ( (x; x′); ρ_δ ) on indices 1, …, n, n + m + 1; zeros elsewhere

Given a MCMC realization (θ, λ_ε, ρ_η, λ_η, ρ_δ, λ_δ), a realization for ζ(x′) can be produced using (1) and the conditional normal formula:

ζ | y, η ∼ N( Σ_21 Σ_11^{−1} (y; η), Σ_22 − Σ_21 Σ_11^{−1} Σ_12 )

77. Accounting for model discrepancy

Again, standard Bayesian estimation gives:

π(θ, η, δ | y(x)) ∝ L(y(x) | η(x, θ), δ(x)) × π(θ) × π(η) × π(δ)

• Posterior means and 90% CI's shown.
• The posterior prediction for ζ(x) is obtained by computing the posterior distribution for η(x, θ) + δ(x).
• Uncertainty in θ, η(x, t), and δ(x) is incorporated into the forecast.
• Gaussian process models for η(x, t) and δ(x).

[Figure: prior uncertainty, posterior model uncertainty, posterior model discrepancy δ(x), and the calibrated forecast for ζ(x).]

78. References

• T. Santner, B. J. Williams and W. I. Notz (2003) The Design and Analysis of Computer Experiments, Springer.
• M. Kennedy and A. O'Hagan (2001) Bayesian calibration of computer models (with discussion), Journal of the Royal Statistical Society B, 63, 425–464.
• J. Sacks, W. J. Welch, T. J. Mitchell and H. P. Wynn (1989) Design and analysis of computer experiments (with discussion), Statistical Science, 4, 409–423.
• D. Higdon, M. Kennedy, J. Cavendish, J. Cafeo and R. D. Ryne (2004) Combining field observations and simulations for calibration and prediction, SIAM Journal on Scientific Computing, 26, 448–466.

79. COMPUTER MODEL CALIBRATION 2: DEALING WITH MULTIVARIATE OUTPUT

80. Carry out simulated implosions using Neddermeyer's model

A sequence of runs is carried out at m input settings (x*, θ*_1, θ*_2) = (m_e/m, s, u_0), varying

[ x*_1 θ*_11 θ*_12; …; x*_m θ*_m1 θ*_m2 ]

over predefined ranges using an OA(32, 4³)-based LH design.

Each simulation produces an n_η = 22 · 26 vector of radii: 22 times × 26 angles.

[Figure: simulated inner radius (cm) as a function of time, and as a function of time and angle φ.]
