Gaussian processes - Refresher and some more insights

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE GRAVIS 2016 | BASEL
Marcel Lüthi, Graphics and Vision Research Group, Department of Mathematics and Computer Science, University of Basel

Outline
• Gaussian process - refresher
• Vector-valued and scalar-valued Gaussian processes
• The space of samples
• Gaussian process regression

Gaussian process: Formal definition
$$p(u) = GP(\mu, k)$$
A Gaussian process is a probability distribution over functions $u : \mathcal{X} \to \mathbb{R}^d$ such that every finite restriction to function values $u_X = (u(x_1), \dots, u(x_n))$ is a multivariate normal distribution
$$p(u_X) = N(\mu_X, k_{XX}).$$

Gaussian process: Illustration
Restriction to values at points $X = \{x\}$:
$$u(x) = \begin{pmatrix} u_1(x) \\ u_2(x) \end{pmatrix} \sim N(\mu_X, k_{XX}) = N\left( \begin{pmatrix} \mu_1(x) \\ \mu_2(x) \end{pmatrix}, \begin{pmatrix} k_{11}(x, x) & k_{12}(x, x) \\ k_{21}(x, x) & k_{22}(x, x) \end{pmatrix} \right)$$

Gaussian process: Illustration
Restriction to values at points $X = \{x, x'\}$:
$$\begin{pmatrix} u(x) \\ u(x') \end{pmatrix} = \begin{pmatrix} u_1(x) \\ u_2(x) \\ u_1(x') \\ u_2(x') \end{pmatrix} \sim N(\mu_X, k_{XX}) = N\left( \begin{pmatrix} \mu_1(x) \\ \mu_2(x) \\ \mu_1(x') \\ \mu_2(x') \end{pmatrix}, \begin{pmatrix} k_{11}(x, x) & k_{12}(x, x) & k_{11}(x, x') & k_{12}(x, x') \\ k_{21}(x, x) & k_{22}(x, x) & k_{21}(x, x') & k_{22}(x, x') \\ k_{11}(x', x) & k_{12}(x', x) & k_{11}(x', x') & k_{12}(x', x') \\ k_{21}(x', x) & k_{22}(x', x) & k_{21}(x', x') & k_{22}(x', x') \end{pmatrix} \right)$$

Defining a Gaussian process
A Gaussian process $GP(\mu, k)$ is completely specified by a mean function $\mu$ and a covariance function (or kernel) $k$.
• $\mu : \mathcal{X} \to \mathbb{R}^d$ defines what the average deformation looks like
• $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}^{d \times d}$ defines how it can deviate from the mean
  • Must be positive semi-definite

Marginalization property
Let $X = (x_1, \dots, x_n)$ and $Y = (y_1, \dots, y_m)$ with
$$p(X, Y) = N\left( \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}, \begin{pmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{pmatrix} \right).$$
The marginal distribution $p(X) = \int p(X, Y) \, dY$ is given by $p(X) = N(\mu_X, \Sigma_{XX})$.
• Evaluating the Gaussian process $GP(\mu, k)$ defined on the domain $\mathcal{X}$ at the points $X = (x_1, \dots, x_n)$ amounts to marginalizing out (ignoring) all random variables in $\mathcal{X} \setminus X$.
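The marginalization property can be illustrated numerically: for a joint normal distribution, the marginal over a subset of variables is obtained by selecting the matching entries of the mean and the matching rows and columns of the covariance. The following is a minimal sketch assuming NumPy; the helper name `marginalize` and the numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def marginalize(mean, cov, keep):
    """Return mean and covariance of the marginal of N(mean, cov) over indices `keep`."""
    keep = np.asarray(keep)
    # Marginalizing a Gaussian simply drops the rows/columns of the ignored variables.
    return mean[keep], cov[np.ix_(keep, keep)]

# Hypothetical joint over three points: X = indices {0, 1}, Y = index {2}.
mean = np.array([0.0, 1.0, 2.0])
cov = np.array([[2.0, 0.5, 0.1],
                [0.5, 1.0, 0.3],
                [0.1, 0.3, 1.5]])

mu_X, Sigma_XX = marginalize(mean, cov, [0, 1])
```

No integral has to be computed explicitly, which is why evaluating a Gaussian process at finitely many points is cheap.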

From continuous to discrete
• Conceptual formulation, continuous: $GP(\mu, k)$
• Practical implementation, discrete: $N(\mu, K)$

The Karhunen-Loève expansion
We can write $u \sim GP(\mu, k)$ as
$$u \sim \mu + \sum_{i=1}^{\infty} \alpha_i \sqrt{\lambda_i} \, \varphi_i, \quad \alpha_i \sim N(0, 1)$$
• $\varphi_i$ is the eigenfunction with associated eigenvalue $\lambda_i$ of the linear operator
$$[T_k u](x) = \int k(x, s) \, u(s) \, ds$$

Eigenvalues and variance
• Eigenvalue $\lambda_i$
  • Interpretation: variance of $\alpha_i \sqrt{\lambda_i} \, \varphi_i$
• The total variance of the process $u \sim GP(\mu, k)$ is given by $\sum_{i=1}^{\infty} \lambda_i$.
• Observation: Most of the variance is explained by the first eigenfunctions.

Low-rank approximation
$$u = \mu + \sum_{i=1}^{r} \alpha_i \sqrt{\lambda_i} \, \varphi_i, \quad \alpha_i \sim N(0, 1)$$
Main idea: Represent the process using only the first $r$ components.
• We have a finite, parametric representation of the process.
• Any deformation $u$ is determined by the coefficients $\alpha = (\alpha_1, \dots, \alpha_r)$:
$$p(u) = p(\alpha) = \prod_{i=1}^{r} \frac{1}{\sqrt{2\pi}} \exp(-\alpha_i^2 / 2)$$
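On a discretized domain, the low-rank idea can be sketched with an eigendecomposition of the kernel matrix. This is only an illustration under stated assumptions: the eigenvectors of the discrete matrix `K` stand in for the eigenfunctions, the Gaussian kernel and the grid are arbitrary choices, and the name `low_rank_sample` is hypothetical.

```python
import numpy as np

def low_rank_sample(mu, K, r, rng):
    """Draw u = mu + sum_{i<=r} alpha_i * sqrt(lambda_i) * phi_i from a rank-r model."""
    lam, phi = np.linalg.eigh(K)            # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:r]       # indices of the r largest eigenvalues
    lam_r = np.clip(lam[order], 0.0, None)  # guard against tiny negative round-off
    phi_r = phi[:, order]
    alpha = rng.standard_normal(r)          # alpha_i ~ N(0, 1)
    return mu + phi_r @ (np.sqrt(lam_r) * alpha), alpha

# Assumed example: zero mean and a Gaussian kernel on a 1D grid.
x = np.linspace(0.0, 1.0, 50)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.1)
mu = np.zeros_like(x)

u, alpha = low_rank_sample(mu, K, r=5, rng=np.random.default_rng(0))
```

The sample `u` is fully determined by the five coefficients in `alpha`, which is the finite, parametric representation the slide refers to.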

Vector-valued and scalar-valued Gaussian processes

Scalar-valued Gaussian processes
Vector-valued (this course)
• Samples $u$ are deformation fields: $u : \mathbb{R}^n \to \mathbb{R}^d$
Scalar-valued (more common)
• Samples $f$ are real-valued functions: $f : \mathbb{R}^n \to \mathbb{R}$

Scalar-valued Gaussian processes
Vector-valued (this course)
• $u \sim GP(\vec{\mu}, \boldsymbol{k})$
• $\vec{\mu} : \mathcal{X} \to \mathbb{R}^d$
• $\boldsymbol{k} : \mathcal{X} \times \mathcal{X} \to \mathbb{R}^{d \times d}$
Scalar-valued (more common)
• $f \sim GP(\mu, k)$
• $\mu : \mathcal{X} \to \mathbb{R}$
• $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$

A connection
Matrix-valued kernels can be reinterpreted as scalar-valued kernels:
• Matrix-valued kernel: $\boldsymbol{k} : \mathcal{X} \times \mathcal{X} \to \mathbb{R}^{d \times d}$
• Scalar-valued kernel: $k : (\mathcal{X} \times \{1, \dots, d\}) \times (\mathcal{X} \times \{1, \dots, d\}) \to \mathbb{R}$
• Bijection: Define $k((x, i), (x', j)) = \boldsymbol{k}(x, x')_{i,j}$

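GP Regression - Vector-valued case
Kernel matrix built from the matrix-valued kernel:
$$\boldsymbol{K} = \begin{pmatrix} \begin{matrix} k_{11}(x_1, x_1) & k_{12}(x_1, x_1) \\ k_{21}(x_1, x_1) & k_{22}(x_1, x_1) \end{matrix} & \dots & \begin{matrix} k_{11}(x_1, x_n) & k_{12}(x_1, x_n) \\ k_{21}(x_1, x_n) & k_{22}(x_1, x_n) \end{matrix} \\ \vdots & & \vdots \\ \begin{matrix} k_{11}(x_n, x_1) & k_{12}(x_n, x_1) \\ k_{21}(x_n, x_1) & k_{22}(x_n, x_1) \end{matrix} & \dots & \begin{matrix} k_{11}(x_n, x_n) & k_{12}(x_n, x_n) \\ k_{21}(x_n, x_n) & k_{22}(x_n, x_n) \end{matrix} \end{pmatrix}$$
The same matrix written with the scalar-valued kernel:
$$K = \begin{pmatrix} \begin{matrix} k((x_1, 1), (x_1, 1)) & k((x_1, 1), (x_1, 2)) \\ k((x_1, 2), (x_1, 1)) & k((x_1, 2), (x_1, 2)) \end{matrix} & \dots & \begin{matrix} k((x_1, 1), (x_n, 1)) & k((x_1, 1), (x_n, 2)) \\ k((x_1, 2), (x_n, 1)) & k((x_1, 2), (x_n, 2)) \end{matrix} \\ \vdots & & \vdots \\ \begin{matrix} k((x_n, 1), (x_1, 1)) & k((x_n, 1), (x_1, 2)) \\ k((x_n, 2), (x_1, 1)) & k((x_n, 2), (x_1, 2)) \end{matrix} & \dots & \begin{matrix} k((x_n, 1), (x_n, 1)) & k((x_n, 1), (x_n, 2)) \\ k((x_n, 2), (x_n, 1)) & k((x_n, 2), (x_n, 2)) \end{matrix} \end{pmatrix}$$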
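The equivalence of the two matrices can be checked on a small example. The sketch below assumes NumPy and a hypothetical matrix-valued kernel (a Gaussian scalar kernel times a fixed 2×2 matrix `B`); none of these names come from the slides, and the component indices are 0-based in the code while the slides count from 1.

```python
import numpy as np

def matrix_kernel(x, y):
    """Hypothetical 2x2 matrix-valued kernel on 1D points."""
    B = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
    return np.exp(-(x - y) ** 2) * B

def scalar_kernel(xi, yj):
    """Scalar-valued view: k((x, i), (y, j)) = matrix_kernel(x, y)[i, j]."""
    (x, i), (y, j) = xi, yj
    return matrix_kernel(x, y)[i, j]

def block_matrix(points, d=2):
    """Assemble the (n*d) x (n*d) matrix from d x d kernel blocks."""
    n = len(points)
    K = np.zeros((n * d, n * d))
    for a, x in enumerate(points):
        for b, y in enumerate(points):
            K[a * d:(a + 1) * d, b * d:(b + 1) * d] = matrix_kernel(x, y)
    return K

def scalar_matrix(points, d=2):
    """Assemble the same matrix entry by entry over the index set X x {0..d-1}."""
    idx = [(x, i) for x in points for i in range(d)]
    return np.array([[scalar_kernel(p, q) for q in idx] for p in idx])

points = [0.0, 0.7, 1.5]
K_block = block_matrix(points)
K_scalar = scalar_matrix(points)
```

Both constructions give the same 6×6 matrix, so any routine written for scalar-valued kernel matrices can be reused unchanged.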

A connection
With the bijection $k((x, i), (x', j)) = \boldsymbol{k}(x, x')_{i,j}$ between matrix-valued and scalar-valued kernels, all the theory developed for scalar-valued GPs also holds for vector-valued GPs.

Sampling revisited

Finite views on infinite objects
Gaussian process: infinite dimensional, continuous domain
Finite-dimensional views:
• Finite domain (marginalization)
• Finite rank (KL expansion)

The space of samples
Sampling from $GP(\mu, k)$ is done using the corresponding normal distribution $N(\vec{\mu}, K)$.
Algorithm for sampling (slightly inefficient):
1. Do an SVD: $K = U D^2 U^T$
2. Draw a normal vector $\alpha \sim N(0, I_{n \times n})$
3. Compute $\vec{\mu} + U D \alpha$
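The three steps can be sketched directly with NumPy. This is a minimal illustration, not the course implementation: the Gaussian kernel, the grid, and the names `factor` and `sample_gp` are assumptions. For a symmetric positive semi-definite `K`, the singular values returned by the SVD correspond to the diagonal of $D^2$.

```python
import numpy as np

def factor(K):
    """Step 1: SVD of a symmetric PSD matrix, K = U D^2 U^T."""
    U, sv, _ = np.linalg.svd(K)
    return U, np.sqrt(sv)  # D is stored as a vector holding its diagonal

def sample_gp(mu, K, rng):
    """Steps 2 and 3: draw alpha ~ N(0, I) and return mu + U D alpha."""
    U, D = factor(K)
    alpha = rng.standard_normal(len(mu))
    return mu + U @ (D * alpha)

# Assumed example: zero mean and a Gaussian kernel on a 1D grid.
x = np.linspace(0.0, 1.0, 20)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.1)
mu = np.zeros_like(x)

sample = sample_gp(mu, K, np.random.default_rng(0))
```

A Cholesky factorization is a common cheaper alternative when `K` is strictly positive definite, but the SVD also handles rank-deficient matrices.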

The space of samples
• From $K = U D^2 U^T$ (using $U^T U = I$) we have $K U D^{-1} = U D$.
• Any sample
$$s = \vec{\mu} + U D \alpha = \vec{\mu} + K U D^{-1} \alpha = \vec{\mu} + K \beta, \quad \beta = U D^{-1} \alpha,$$
is a linear combination of the columns of $K$.
Two ways to represent a sample, where $u_i$ are the columns of $U$, $d_i$ the diagonal entries of $D$, and $k_j$ the columns of $K$:
1. KL-Expansion: $s = \vec{\mu} + \sum_i d_i \alpha_i u_i$
2. Linear combination of kernels: $s = \vec{\mu} + \sum_j \beta_j k_j$
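That the two representations give the same sample can be verified numerically. The sketch below assumes NumPy and a small Gaussian kernel matrix with a little diagonal jitter so that $D$ is invertible; the jitter and all variable names are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed example: 5 grid points, Gaussian kernel plus jitter to keep D invertible.
x = np.linspace(0.0, 1.0, 5)
K = np.exp(-(x[:, None] - x[None, :]) ** 2) + 1e-3 * np.eye(5)
mu = np.zeros(5)

U, sv, _ = np.linalg.svd(K)   # K = U D^2 U^T for symmetric positive definite K
D = np.sqrt(sv)
alpha = rng.standard_normal(5)

# 1. KL-expansion: s = mu + sum_i d_i * alpha_i * u_i
s_kl = mu + U @ (D * alpha)

# 2. Linear combination of the columns of K: s = mu + K beta, beta = U D^{-1} alpha
beta = U @ (alpha / D)
s_kernel = mu + K @ beta
```

Up to floating-point error, `s_kl` and `s_kernel` coincide, which shows that every sample lies in the span of the kernel columns.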

Four example covariance functions
• $k(x, x') = f(x) f(x')$, with $f(x) = (1 - s(x)) \, 2x^2 + s(x) \sin(x^2)$
• $k(x, x') = \sum_{i=1}^{3} f_i(x) f_i(x')$, with $f_1(x) = \sin(x)$, $f_2(x) = x$, $f_3(x) = \cos(x^2)$

Four example covariance functions
• $k(x, x') = \delta(x, x')$
• $k(x, x') = \exp\left(-\frac{\|x - x'\|^2}{9}\right)$
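The example kernels can be turned into kernel matrices on a grid to inspect their structure. This is a sketch assuming NumPy and 1D inputs; `np.sin` is used as a hypothetical stand-in for the function $f$ in the product kernel, and the helper names are not from the slides.

```python
import numpy as np

def delta_kernel(x, y):
    """k(x, x') = delta(x, x'): uncorrelated values at distinct points."""
    return 1.0 if x == y else 0.0

def gaussian_kernel(x, y, sigma2=9.0):
    """k(x, x') = exp(-(x - x')^2 / 9): smooth, correlated values."""
    return np.exp(-(x - y) ** 2 / sigma2)

def product_kernel(f):
    """k(x, x') = f(x) f(x'): all samples are multiples of f."""
    return lambda x, y: f(x) * f(y)

def gram(kernel, xs):
    """Kernel matrix K[a, b] = kernel(xs[a], xs[b])."""
    return np.array([[kernel(a, b) for b in xs] for a in xs])

xs = np.linspace(-3.0, 3.0, 7)
K_delta = gram(delta_kernel, xs)
K_gauss = gram(gaussian_kernel, xs)
K_rank1 = gram(product_kernel(np.sin), xs)  # np.sin is an assumed choice of f
```

The delta kernel yields the identity matrix, the product kernel a rank-one matrix, and the Gaussian kernel a full matrix whose eigenvalues are non-negative, as required of a covariance function.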

Example 1
$$k(x, x') = f(x) f(x')$$
