Lecture 12: Gaussian Process Models
Colin Rundel
02/27/2017
Multivariate Normal
Multivariate Normal Distribution
An n-dimensional multivariate normal distribution with covariance $\Sigma$ (positive semidefinite) can be written as

$$\underset{n \times 1}{Y} \sim \mathcal{N}(\underset{n \times 1}{\mu},\; \underset{n \times n}{\Sigma}) \quad \text{where } \{\Sigma\}_{ij} = \sigma^2_{ij} = \rho_{ij}\,\sigma_i\,\sigma_j$$

$$\begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_n \end{pmatrix},\; \begin{pmatrix} \rho_{11}\sigma_1\sigma_1 & \cdots & \rho_{1n}\sigma_1\sigma_n \\ \vdots & \ddots & \vdots \\ \rho_{n1}\sigma_n\sigma_1 & \cdots & \rho_{nn}\sigma_n\sigma_n \end{pmatrix} \right)$$
Density
For the n-dimensional multivariate normal given on the last slide, the density is

$$(2\pi)^{-n/2}\,\det(\Sigma)^{-1/2}\,\exp\!\left(-\frac{1}{2}\,\underset{1 \times n}{(Y-\mu)'}\,\underset{n \times n}{\Sigma^{-1}}\,\underset{n \times 1}{(Y-\mu)}\right)$$

and the log density is

$$-\frac{n}{2}\log 2\pi \,-\, \frac{1}{2}\log\det(\Sigma) \,-\, \frac{1}{2}(Y-\mu)'\,\Sigma^{-1}\,(Y-\mu)$$
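As a check on this formula, here is a minimal base R sketch of the log density; the function name dmvnorm_log is made up for illustration, and mvtnorm::dmvnorm(y, mu, Sigma, log = TRUE) can be used for comparison.

dmvnorm_log <- function(y, mu, Sigma) {
  n <- length(y)
  U <- chol(Sigma)                   # upper triangular, Sigma = t(U) %*% U
  logdet <- 2 * sum(log(diag(U)))    # log det(Sigma) via the Cholesky factor
  z <- backsolve(U, y - mu, transpose = TRUE)  # solves t(U) z = y - mu
  -n / 2 * log(2 * pi) - logdet / 2 - sum(z^2) / 2
}

Using the Cholesky factor gives both the log determinant and the quadratic form without explicitly inverting $\Sigma$.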
Sampling

To generate draws from an n-dimensional multivariate normal with mean $\mu$ and covariance matrix $\Sigma$:

- Find a matrix $A$ such that $\Sigma = A\,A^t$; most often we use $A = \text{Chol}(\Sigma)$.
- Draw $n$ iid unit normals ($\mathcal{N}(0, 1)$) as $z$.
- Construct the multivariate normal draw using $Y = \mu + A\,z$, as in the sketch below.
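A minimal base R sketch of this recipe; the function name rmvnorm_chol and the example inputs are made up for illustration. Note that R's chol() returns the upper-triangular factor $U$ with $\Sigma = U^t U$, so the lower-triangular $A$ is t(chol(Sigma)).

rmvnorm_chol <- function(n_draws, mu, Sigma) {
  n <- length(mu)
  A <- t(chol(Sigma))                        # lower triangular, Sigma = A %*% t(A)
  z <- matrix(rnorm(n * n_draws), nrow = n)  # n iid N(0,1) values per draw
  mu + A %*% z                               # adds mu to each column; one draw per column
}

# Example: 1000 draws from a bivariate normal with correlation 0.9
draws <- rmvnorm_chol(1000, mu = c(0, 0),
                      Sigma = matrix(c(1, 0.9, 0.9, 1), nrow = 2))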
Bivariate Example

$$\mu = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \qquad \Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$$

[Figure: draws from this bivariate normal (x vs. y) for rho = -0.9, -0.7, -0.5, -0.1, 0.1, 0.5, 0.7, 0.9]
Marginal distributions

Proposition - For an n-dimensional multivariate normal with mean $\mu$ and covariance matrix $\Sigma$, all of the possible marginal distributions will also be (multivariate) normal.

For a univariate marginal distribution,
$$y_i \sim \mathcal{N}(\mu_i,\, \gamma_{ii})$$

For a bivariate marginal distribution,
$$y_{ij} \sim \mathcal{N}\left( \begin{pmatrix} \mu_i \\ \mu_j \end{pmatrix},\; \begin{pmatrix} \gamma_{ii} & \gamma_{ij} \\ \gamma_{ji} & \gamma_{jj} \end{pmatrix} \right)$$

For a k-dimensional marginal distribution,
$$y_{i_1, \ldots, i_k} \sim \mathcal{N}\left( \begin{pmatrix} \mu_{i_1} \\ \vdots \\ \mu_{i_k} \end{pmatrix},\; \begin{pmatrix} \gamma_{i_1 i_1} & \cdots & \gamma_{i_1 i_k} \\ \vdots & \ddots & \vdots \\ \gamma_{i_k i_1} & \cdots & \gamma_{i_k i_k} \end{pmatrix} \right)$$
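In code, marginalizing is just subsetting: keep the corresponding entries of $\mu$ and rows/columns of $\Sigma$. A minimal R sketch, with made-up example values:

mu    <- c(0, 1, 2)
Sigma <- matrix(0.5, nrow = 3, ncol = 3)
diag(Sigma) <- 1                          # unit variances, 0.5 covariances

idx <- c(1, 3)                            # marginal distribution of (y1, y3)
mu_marg    <- mu[idx]
Sigma_marg <- Sigma[idx, idx, drop = FALSE]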
Conditional Distributions

If we partition the n dimensions into two pieces such that $Y = (Y_1, Y_2)^t$ then

$$\underset{n \times 1}{Y} \sim \mathcal{N}\left( \underset{n \times 1}{\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}},\; \underset{n \times n}{\begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}} \right)$$

$$\underset{k \times 1}{Y_1} \sim \mathcal{N}(\underset{k \times 1}{\mu_1},\; \underset{k \times k}{\Sigma_{11}}) \qquad \underset{(n-k) \times 1}{Y_2} \sim \mathcal{N}(\underset{(n-k) \times 1}{\mu_2},\; \underset{(n-k) \times (n-k)}{\Sigma_{22}})$$

and the conditional distributions are given by

$$Y_1 \mid Y_2 = a \sim \mathcal{N}\big(\mu_1 + \Sigma_{12}\,\Sigma_{22}^{-1}(a - \mu_2),\; \Sigma_{11} - \Sigma_{12}\,\Sigma_{22}^{-1}\,\Sigma_{21}\big)$$
$$Y_2 \mid Y_1 = b \sim \mathcal{N}\big(\mu_2 + \Sigma_{21}\,\Sigma_{11}^{-1}(b - \mu_1),\; \Sigma_{22} - \Sigma_{21}\,\Sigma_{11}^{-1}\,\Sigma_{12}\big)$$
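A minimal base R sketch of the first conditional formula; the function name cond_mvn and the partition indices are made up for illustration, and $\Sigma_{21} = \Sigma_{12}^t$ by symmetry.

cond_mvn <- function(mu, Sigma, i1, i2, a) {
  S12 <- Sigma[i1, i2, drop = FALSE]
  S22_inv <- solve(Sigma[i2, i2, drop = FALSE])
  list(
    mean = mu[i1] + S12 %*% S22_inv %*% (a - mu[i2]),                # mu1 + S12 S22^-1 (a - mu2)
    cov  = Sigma[i1, i1, drop = FALSE] - S12 %*% S22_inv %*% t(S12)  # S11 - S12 S22^-1 S21
  )
}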
Gaussian Processes

From Shumway: A process, $Y = \{Y_t : t \in T\}$, is said to be a Gaussian process if all possible finite-dimensional vectors $y = (y_{t_1}, y_{t_2}, \ldots, y_{t_n})^t$, for every collection of time points $t_1, t_2, \ldots, t_n$ and every positive integer $n$, have a multivariate normal distribution.

So far we have only looked at examples of time series where $T$ is discrete (and evenly spaced and contiguous); things get a lot more interesting when we explore the case where $T$ is defined on a continuous space (e.g. $\mathbb{R}$ or some subset of $\mathbb{R}$).
Gaussian Process Regression
Parameterizing a Gaussian Process

Imagine we have a Gaussian process defined such that $Y = \{Y_t : t \in [0, 1]\}$:

- We now have an uncountably infinite set of possible $Y_t$s.
- We will only have a (small) finite number of observations $Y_1, \ldots, Y_n$ with which to say something useful about this infinite-dimensional process.
- The unconstrained covariance matrix for the observed data can have up to $n(n+1)/2$ unique values ($p \gg n$).
- It is therefore necessary to make some simplifying assumptions:
  - Stationarity
  - Simple parameterization of $\Sigma$
Covariance Functions

More on these next week, but for now some simple / common examples.

Exponential covariance:
$$\Sigma(y_t, y_{t'}) = \sigma^2 \exp\!\big(-|t - t'| \cdot l\big)$$

Squared exponential covariance:
$$\Sigma(y_t, y_{t'}) = \sigma^2 \exp\!\big(-(|t - t'| \cdot l)^2\big)$$

Powered exponential covariance ($p \in (0, 2]$):
$$\Sigma(y_t, y_{t'}) = \sigma^2 \exp\!\big(-(|t - t'| \cdot l)^p\big)$$
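These are straightforward to implement. A minimal R sketch (function names made up for illustration), each mapping a matrix of pairwise distances to a covariance matrix:

exp_cov     <- function(d, sigma2, l)    sigma2 * exp(-abs(d) * l)
sq_exp_cov  <- function(d, sigma2, l)    sigma2 * exp(-(abs(d) * l)^2)
pow_exp_cov <- function(d, sigma2, l, p) sigma2 * exp(-(abs(d) * l)^p)

# Example: covariance for a grid of time points on [0, 1]
t <- seq(0, 1, length.out = 5)
d <- abs(outer(t, t, "-"))            # matrix of pairwise distances |t - t'|
Sigma <- sq_exp_cov(d, sigma2 = 10, l = 10)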
Covariance Function Decay

[Figure: correlation vs. distance d for the exponential and squared exponential covariances, with l = 1, 2, ..., 10]
Example
[Figure: the 15 example observations, y vs. t on [0, 1]]
Prediction

Our example has 15 observations which we would like to use as the basis for predicting $Y_t$ at other values of $t$ (say a grid of values from 0 to 1). For now let's use a squared exponential covariance with $\sigma^2 = 10$ and $l = 10$.

We therefore want to sample from $Y_{pred} \mid Y_{obs}$:

$$Y_{pred} \mid Y_{obs} = y \sim \mathcal{N}\big(\Sigma_{po}\,\Sigma_{obs}^{-1}\,y,\;\; \Sigma_{pred} - \Sigma_{po}\,\Sigma_{obs}^{-1}\,\Sigma_{op}\big)$$

(No $\mu$ terms appear because the process is modeled with mean zero.)
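A minimal sketch of this conditional in R, reusing sq_exp_cov() and rmvnorm_chol() from the earlier sketches. Here t_obs and y_obs are made-up stand-ins for the lecture's 15 observations, and the small diagonal jitter is a standard numerical-stability assumption, not part of the slide.

t_obs  <- seq(0, 1, length.out = 15)
y_obs  <- sin(2 * pi * t_obs) + rnorm(15, sd = 0.1)   # made-up example data
t_pred <- seq(0, 1, length.out = 100)

S_obs  <- sq_exp_cov(abs(outer(t_obs,  t_obs,  "-")), 10, 10) + diag(1e-6, 15)
S_po   <- sq_exp_cov(abs(outer(t_pred, t_obs,  "-")), 10, 10)
S_pred <- sq_exp_cov(abs(outer(t_pred, t_pred, "-")), 10, 10)

cond_mean <- S_po %*% solve(S_obs, y_obs)             # Sigma_po Sigma_obs^-1 y
cond_cov  <- S_pred - S_po %*% solve(S_obs, t(S_po))  # Sigma_pred - Sigma_po Sigma_obs^-1 Sigma_op

# One conditional draw, via the Cholesky recipe from earlier:
y_draw <- rmvnorm_chol(1, cond_mean, cond_cov + diag(1e-6, 100))

Each call to rmvnorm_chol() here produces one curve like the draws shown on the following slides.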
Draw 1

[Figure: one draw from Y_pred | Y_obs, y vs. t]

Draw 2

[Figure: a second draw from Y_pred | Y_obs, y vs. t]

Draw 3

[Figure: a third draw from Y_pred | Y_obs, y vs. t]

Draw 4

[Figure: a fourth draw from Y_pred | Y_obs, y vs. t]

Draw 5

[Figure: a fifth draw from Y_pred | Y_obs, y vs. t]

Many draws later

[Figure: many draws from Y_pred | Y_obs overlaid, y vs. t]
Exponential Covariance

[Figure: draws from Y_pred | Y_obs under an exponential covariance, y vs. t]

Powered Exponential Covariance (p = 1.5)

[Figure: draws from Y_pred | Y_obs under a powered exponential covariance with p = 1.5, y vs. t]

Back to the squared exponential

[Figure: draws from Y_pred | Y_obs under the squared exponential covariance, y vs. t]
Changing the range (l)
[Figure: draws from Y_pred | Y_obs under the squared exponential covariance with sigma2 = 10 and l = 5, 7.5, 12.5, 15; y vs. t]
Effective Range
For the squared exponential covariance,

$$\text{Cov}(d) = \sigma^2 \exp\!\big(-(l \cdot d)^2\big) \qquad \text{Corr}(d) = \exp\!\big(-(l \cdot d)^2\big)$$

we would like to know, for a given value of $l$, how far apart must two observations be for their correlation to fall below 0.05?
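A quick worked step (not shown on the slide): setting the correlation equal to 0.05 and solving for $d$,

$$\exp\!\big(-(l \cdot d)^2\big) = 0.05 \;\Longrightarrow\; d = \frac{\sqrt{-\log 0.05}}{l} = \frac{\sqrt{\log 20}}{l} \approx \frac{\sqrt{3}}{l}$$

so, for example, $l = 10$ gives an effective range of roughly 0.17.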
Changing the scale (σ²)

[Figure: draws from Y_pred | Y_obs under the squared exponential covariance with sigma2 = 5, 15 and l = 5, 10; y vs. t]
Fitting
model{
  y ~ dmnorm(mu, inverse(Sigma))

  for (i in 1:N) {
    mu[i] <- 0
  }

  for (i in 1:(N-1)) {
    for (j in (i+1):N) {
      Sigma[i,j] <- sigma2 * exp(- pow(l*d[i,j], 2))
      Sigma[j,i] <- Sigma[i,j]
    }
  }

  for (k in 1:N) {
    Sigma[k,k] <- sigma2 + 0.01
  }

  sigma2 ~ dlnorm(0, 1)
  l ~ dt(0, 2.5, 1) T(0,)   # Half-Cauchy(0, 2.5)
}
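A minimal sketch of how a model like this would be fit from R with rjags; the data objects (y, d, N) stand in for the lecture's observations, and the chain and iteration settings are assumptions, not the lecture's actual values.

library(rjags)

gp_model <- "model{ ... }"   # the JAGS model string shown above

m <- jags.model(textConnection(gp_model),
                data = list(y = y, d = d, N = length(y)),
                n.chains = 1)
update(m, n.iter = 5000)     # burn-in
samp <- coda.samples(m, variable.names = c("sigma2", "l"), n.iter = 10000)
summary(samp)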
Trace plots
[Figure: trace and density plots for l and sigma2 over iterations 6000-16000 (1000 samples)]

param    post_mean  post_med  post_lower  post_upper
l         5.981289  5.833655   4.2669795    8.456006
sigma2    2.457979  2.032632   0.8173064    7.168197
Fitted models
[Figure: fits from the posterior mean model (sigma2 = 2.32, l = 6.03) and posterior median model (sigma2 = 1.89, l = 5.86), y vs. t]
Forecasting

[Figure: forecasts from the posterior mean model (sigma2 = 2.32, l = 6.03) and posterior median model (sigma2 = 1.89, l = 5.86) over t from 0 to 1.5, y vs. t]