SLIDE 1
Introduction What is a kernel ? Kernel choice Making new from old linear operators Application Conclusion

Kernel Design

GP Summer School, Sheffield, September 2016 Nicolas Durrande, Mines St-Étienne, durrande@emse.fr

GP Summer School Kernel Design 1 / 60

SLIDE 2

Introduction What is a kernel ? Choosing the appropriate kernel Making new from old Effect of linear operators Application : Periodicity detection Conclusion

SLIDE 3

Introduction What is a kernel ? Choosing the appropriate kernel Making new from old Effect of linear operators Application : Periodicity detection Conclusion

SLIDE 4

We have seen during the introduction lectures that the distribution of a GP Z depends on two functions :

the mean m(x) = E (Z(x))
the covariance k(x, x′) = cov (Z(x), Z(x′))

In this talk, we will focus on the covariance function, which is often called the kernel.

SLIDE 5

We assume we have observed a function f for a limited number of time points x1, . . . , xn :

[Figure: the observed points (xi, f(xi)), with f(x) plotted against x]

The observations are denoted by fi = f (xi) (or F = f (X)).

SLIDE 6

Since f is unknown, we make the general assumption that it is a sample path of a Gaussian process Z :

[Figure: sample paths of Z(x) plotted against x]

SLIDE 7

Combining these two pieces of information means keeping only the sample paths that interpolate the data points :

[Figure: conditional sample paths of Z(x) | Z(X) = F, all interpolating the observations]

SLIDE 8

The conditional distribution is still Gaussian with moments :

m(x) = E (Z(x)|Z(X)=F) = k(x, X) k(X, X)−1 F
c(x, x′) = cov (Z(x), Z(x′)|Z(X)=F) = k(x, x′) − k(x, X) k(X, X)−1 k(X, x′)

It can be represented as a mean function with confidence intervals.

[Figure: the conditional mean of Z(x) | Z(X) = F with confidence intervals]
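These two formulas are all that is needed for prediction. As a quick illustration (not from the slides; the data, kernel and hyper-parameters below are made up), a minimal numpy sketch of the conditional moments:

```python
import numpy as np

def k(x, y, sigma2=1.0, theta=0.2):
    # squared-exponential kernel k(x, y) = sigma^2 exp(-(x - y)^2 / (2 theta^2))
    return sigma2 * np.exp(-(x[:, None] - y[None, :])**2 / (2 * theta**2))

# toy observations (hypothetical data, for illustration only)
X = np.array([0.1, 0.3, 0.6, 0.9])
F = np.array([0.2, 0.8, -0.4, 0.5])
x = np.linspace(0, 1, 101)

KXX = k(X, X) + 1e-10 * np.eye(len(X))   # small jitter for numerical stability
KxX = k(x, X)

# conditional moments: m(x) = k(x, X) k(X, X)^{-1} F
#                      c(x, x') = k(x, x') - k(x, X) k(X, X)^{-1} k(X, x')
m = KxX @ np.linalg.solve(KXX, F)
c = k(x, x) - KxX @ np.linalg.solve(KXX, KxX.T)
```

At the observed inputs the mean interpolates F and the conditional variance collapses to (almost) zero, which matches the figure above.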

SLIDE 9

Introduction What is a kernel ? Choosing the appropriate kernel Making new from old Effect of linear operators Application : Periodicity detection Conclusion

SLIDE 10

Let Z be a random process with kernel k. Some properties of kernels can be obtained directly from their definition.

Example

k(x, x) = cov (Z(x), Z(x)) = var (Z(x)) ≥ 0 ⇒ k(x, x) is positive.
k(x, y) = cov (Z(x), Z(y)) = cov (Z(y), Z(x)) = k(y, x) ⇒ k is symmetric.

We can obtain a sharper result...

SLIDE 11

We introduce the random variable T = Σ_{i=1}^{n} a_i Z(x_i), where n, the a_i and the x_i are arbitrary. Computing the variance of T gives :

var (T) = cov ( Σ_i a_i Z(x_i), Σ_j a_j Z(x_j) ) = Σ_i Σ_j a_i a_j cov (Z(x_i), Z(x_j)) = Σ_i Σ_j a_i a_j k(x_i, x_j)

Since a variance is positive, we have Σ_i Σ_j a_i a_j k(x_i, x_j) ≥ 0 for any arbitrary n, a_i and x_i.

Definition

The functions satisfying the above inequality for all n ∈ N, for all x_i ∈ D, for all a_i ∈ R are called positive semi-definite functions.
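The definition can be probed numerically for a given kernel: for random choices of n, a_i and x_i the quadratic form must stay non-negative, which is equivalent to the Gram matrix having no negative eigenvalues. A small sketch (the exponential kernel and all constants here are our illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def k_exp(x, y, sigma2=1.0, theta=0.3):
    # exponential kernel, known to be psd
    return sigma2 * np.exp(-np.abs(x[:, None] - y[None, :]) / theta)

# sample arbitrary x_i and a_i, and check the quadratic form is non-negative
x = rng.uniform(0, 1, size=50)
K = k_exp(x, x)
for _ in range(100):
    a = rng.normal(size=50)
    assert a @ K @ a >= -1e-10   # non-negative up to round-off

# equivalently, the symmetric matrix K has no negative eigenvalues
eigmin = np.linalg.eigvalsh(K).min()
```

This is only a sanity check at finitely many points, not a proof of psd-ness; the theorem below gives an actual proof technique for stationary kernels.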

SLIDE 12

We have just seen : k is a covariance ⇒ k is a positive semi-definite function. The reverse is also true :

Theorem (Loève)

k corresponds to the covariance of a GP ⇔ k is a symmetric positive semi-definite function.

SLIDE 13

Proving that a function is psd is often intractable. However, many functions have already been proven to be psd :

squared exp.   k(x, y) = σ² exp( −(x − y)² / (2θ²) )
Matérn 5/2     k(x, y) = σ² (1 + √5 |x − y| / θ + 5 |x − y|² / (3θ²)) exp( −√5 |x − y| / θ )
Matérn 3/2     k(x, y) = σ² (1 + √3 |x − y| / θ) exp( −√3 |x − y| / θ )
exponential    k(x, y) = σ² exp( −|x − y| / θ )
Brownian       k(x, y) = σ² min(x, y)
white noise    k(x, y) = σ² δ_{x,y}
constant       k(x, y) = σ²
linear         k(x, y) = σ² x y

When k is a function of x − y, the kernel is called stationary. σ² is called the variance and θ the lengthscale.
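For reference, a direct transcription of a few of these expressions (the function names are ours, not a library API; `d` denotes the distance x − y for the stationary ones):

```python
import numpy as np

def sq_exp(d, sigma2, theta):
    # squared exponential: sigma^2 exp(-d^2 / (2 theta^2))
    return sigma2 * np.exp(-d**2 / (2 * theta**2))

def matern32(d, sigma2, theta):
    # Matern 3/2: sigma^2 (1 + a) exp(-a) with a = sqrt(3)|d| / theta
    a = np.sqrt(3) * np.abs(d) / theta
    return sigma2 * (1 + a) * np.exp(-a)

def matern52(d, sigma2, theta):
    # Matern 5/2: sigma^2 (1 + a + a^2 / 3) exp(-a) with a = sqrt(5)|d| / theta
    a = np.sqrt(5) * np.abs(d) / theta
    return sigma2 * (1 + a + a**2 / 3) * np.exp(-a)

def brownian(x, y, sigma2):
    # non-stationary Brownian kernel: sigma^2 min(x, y)
    return sigma2 * np.minimum(x, y)
```

Each stationary kernel above is maximal at d = 0, where it equals the variance σ².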

SLIDE 15

If k is stationary, being psd implies further results :

Properties

If k̃ is n times differentiable in 0, then it is n times differentiable everywhere.
The maximum value of k̃(t) is reached at t = 0.

Example

The following functions are not valid covariance structures :

[Figure: three plots of invalid K(t) shapes against t]

SLIDE 16

For a few kernels, it is possible to prove they are psd directly from the definition : k(x, y) = δ_{x,y}, k(x, y) = 1. For most of them a direct proof from the definition is not possible. The following theorem is helpful for stationary kernels :

Theorem (Bochner)

A continuous stationary function k(x, y) = k̃(|x − y|) is positive definite if and only if k̃ is the Fourier transform of a finite positive measure :

k̃(t) = ∫_R e^{−iωt} dµ(ω)

SLIDE 17

Example

We consider the uniform measure on [−1, 1]. Its Fourier transform gives k̃(t) = sin(t) / t :

[Figure: the measure µ(ω) and its Fourier transform k̃(t) = sin(t)/t]

As a consequence, k(x, y) = sin(x − y) / (x − y) is a valid covariance function.
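This claim is easy to check numerically: the Gram matrix of the sinc kernel at arbitrary locations should have no negative eigenvalues. A sketch (locations and sizes are arbitrary choices):

```python
import numpy as np

def sinc_kernel(x, y):
    # k(x, y) = sin(x - y) / (x - y), equal to 1 on the diagonal
    d = x[:, None] - y[None, :]
    # np.sinc(t) = sin(pi t) / (pi t), so np.sinc(d / pi) = sin(d) / d
    return np.sinc(d / np.pi)

rng = np.random.default_rng(1)
x = rng.uniform(-10, 10, size=80)
K = sinc_kernel(x, x)
eigmin = np.linalg.eigvalsh(K).min()   # should be >= 0 up to round-off
```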

SLIDE 18

Usual kernels

Bochner's theorem can be used to prove the positive definiteness of many usual stationary kernels :

The Gaussian is the Fourier transform of itself ⇒ it is psd.
Matérn kernels are the Fourier transforms of 1 / (1 + ω²)^p ⇒ they are psd.

SLIDE 19

Unusual kernels

Taking the inverse Fourier transform of a (symmetrised) sum of Gaussians gives (A. Wilson, ICML 2013) :

[Figure: the spectral measure µ(ω) and the corresponding kernel k̃(t)]

The obtained kernel is parametrised by its spectrum.

SLIDE 20

Unusual kernels

The sample paths have the following shape :

[Figure: sample paths of the GP with the spectrum-parametrised kernel]

SLIDE 21

Introduction What is a kernel ? Choosing the appropriate kernel Making new from old Effect of linear operators Application : Periodicity detection Conclusion

SLIDE 22

Changing the kernel has a huge impact on the model :

[Figure: the model obtained with a Gaussian kernel vs. with an exponential kernel]

SLIDE 23

This is because changing the kernel implies changing the prior

[Figure: prior sample paths for a Gaussian kernel vs. an exponential kernel]

SLIDE 24

In order to choose a kernel, one should gather all possible information about the function to approximate :

Is it stationary ?
Is it differentiable ? What is its regularity ?
Do we expect particular trends ?
Do we expect particular patterns (periodicity, cycles, additivity) ?

Kernels often include rescaling parameters : θ for the x axis (the length-scale) and σ for the y axis (σ² often corresponds to the GP variance). They can be tuned by maximizing the likelihood or by minimizing the prediction error.
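As a sketch of the likelihood route (toy data and a simple grid search instead of a proper optimiser; all names and constants below are illustrative):

```python
import numpy as np

def log_marginal_likelihood(X, F, sigma2, theta, jitter=1e-8):
    # log p(F) for F ~ N(0, K) under a zero-mean squared-exponential prior
    K = sigma2 * np.exp(-(X[:, None] - X[None, :])**2 / (2 * theta**2))
    K += jitter * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, F))
    return (-0.5 * F @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * len(X) * np.log(2 * np.pi))

# hypothetical data: pick the lengthscale by maximising the likelihood on a grid
X = np.linspace(0, 1, 20)
F = np.sin(6 * X)
thetas = np.linspace(0.05, 1.0, 40)
best_theta = max(thetas, key=lambda t: log_marginal_likelihood(X, F, 1.0, t))
```

In practice one would optimise all parameters jointly with a gradient-based optimiser; the grid keeps the sketch self-contained.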

SLIDE 25

It is common to try various kernels and to assess the model accuracy. The idea is to compare some model predictions against actual values :

on a test set
using leave-one-out

Two (ideally three) things should be checked :

Is the mean accurate (MSE, Q2) ?
Do the confidence intervals make sense ?
Are the predicted covariances right ?

Furthermore, it is often interesting to try some input remappings such as x → log(x), x → exp(x), ...
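A minimal leave-one-out sketch for the mean and the Q2 score (brute-force refits for clarity; the kernel, data and helper names are our illustrative choices):

```python
import numpy as np

def loo_residuals(X, F, k, jitter=1e-8):
    # leave-one-out predictions of the GP mean, refitting n times (brute force)
    n = len(X)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        K = k(X[keep], X[keep]) + jitter * np.eye(n - 1)
        preds[i] = (k(X[i:i+1], X[keep]) @ np.linalg.solve(K, F[keep]))[0]
    return F - preds

def q2(F, residuals):
    # Q2 = 1 - (sum of squared LOO errors) / (total sum of squares); 1 is perfect
    return 1 - np.sum(residuals**2) / np.sum((F - F.mean())**2)

k_se = lambda x, y: np.exp(-(x[:, None] - y[None, :])**2 / (2 * 0.2**2))
X = np.linspace(0, 1, 15)
F = np.sin(4 * X)
res = loo_residuals(X, F, k_se)
score = q2(F, res)
```

For n observations there are closed-form LOO formulas that avoid the n refits, but the loop above makes the idea explicit.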

SLIDE 26

Introduction What is a kernel ? Choosing the appropriate kernel Making new from old Effect of linear operators Application : Periodicity detection Conclusion

SLIDE 27

Making new from old : Kernels can be : Summed together

◮ On the same space k(x, y) = k1(x, y) + k2(x, y) ◮ On the tensor space k(x, y) = k1(x1, y1) + k2(x2, y2)

Multiplied together

◮ On the same space k(x, y) = k1(x, y) × k2(x, y) ◮ On the tensor space k(x, y) = k1(x1, y1) × k2(x2, y2)

Composed with a function

◮ k(x, y) = k1(f (x), f (y))

All these operations will preserve the positive definiteness. How can this be useful ?
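These closure properties translate directly into code. The following sketch (our own helper names, not a library API) builds new kernel functions from old ones and checks psd-ness numerically:

```python
import numpy as np

def k_se(x, y, theta=0.2):
    # base squared-exponential kernel
    return np.exp(-np.subtract.outer(x, y)**2 / (2 * theta**2))

def k_sum(k1, k2):      # sum on the same space
    return lambda x, y: k1(x, y) + k2(x, y)

def k_prod(k1, k2):     # product on the same space
    return lambda x, y: k1(x, y) * k2(x, y)

def k_warp(k1, f):      # composition with a function f
    return lambda x, y: k1(f(x), f(y))

# each construction preserves positive semi-definiteness; quick numerical check
rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 40)
for k in (k_sum(k_se, k_se), k_prod(k_se, k_se), k_warp(k_se, np.log1p)):
    assert np.linalg.eigvalsh(k(x, x)).min() > -1e-10
```

Returning closures makes the constructions composable: `k_sum(k_prod(k_se, k_se), k_warp(k_se, np.log1p))` is again a valid kernel.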

SLIDE 28

Sum of kernels over the same space

Example (The Mauna Loa observatory dataset)

This famous dataset compiles the monthly CO2 concentration in Hawaii since 1958.

[Figure: monthly CO2 concentration (ppm) plotted against year]

Let’s try to predict the concentration for the next 20 years.

SLIDE 29

Sum of kernels over the same space

We first consider a squared-exponential kernel :

k(x, y) = σ² exp( −(x − y)² / θ² )

[Figure: the resulting prediction over 1950–2040]

The results are terrible !

SLIDE 30

Sum of kernels over the same space

What happens if we sum two squared-exponential kernels ? k(x, y) = krbf1(x, y) + krbf2(x, y)

[Figure: the model with the sum of two squared-exponential kernels]

SLIDE 31

Sum of kernels over the same space

What happens if we sum two squared-exponential kernels ? k(x, y) = krbf1(x, y) + krbf2(x, y)

[Figure: the model with the sum of two squared-exponential kernels]

The model is drastically improved !

SLIDE 32

Sum of kernels over the same space

We can try the following kernel : k(x, y) = σ0² x² y² + krbf1(x, y) + krbf2(x, y) + kper(x, y)

[Figure: the model with the combined kernel]

SLIDE 33

Sum of kernels over the same space

We can try the following kernel : k(x, y) = σ0² x² y² + krbf1(x, y) + krbf2(x, y) + kper(x, y)

[Figure: the model with the combined kernel]

Once again, the model is significantly improved.

SLIDE 34

Sum of kernels over tensor space

Property

k(x, y) = k1(x1, y1) + k2(x2, y2) is a valid covariance structure.

[Figure: a 1D kernel in x1 + a 1D kernel in x2 = a 2D additive kernel surface]

Remark : From a GP point of view, k is the kernel of Z(x) = Z1(x1) + Z2(x2)

SLIDE 35

Sum of kernels over tensor space

We can have a look at a few sample paths from Z :

[Figure: sample paths of Z(x) = Z1(x1) + Z2(x2)]

⇒ They are additive (up to a modification).

Additive kernels are very useful for :
approximating additive functions
building models over high-dimensional input spaces

SLIDE 36

Sum of kernels over tensor space

We consider the test function f(x) = sin(4πx1) + cos(4πx2) + 2x2 and a set of 20 observations in [0, 1]².

[Figure: the test function and the observation locations]

SLIDE 37

Sum of kernels over tensor space

We obtain the following models :

Gaussian kernel : RMSE is 1.06. [Figure: mean predictor]
Additive Gaussian kernel : RMSE is 0.12. [Figure: mean predictor]

SLIDE 38

Sum of kernels over tensor space

Remarks

It is straightforward to show that the mean predictor is additive :

m(x) = (k1(x1, X1) + k2(x2, X2)) k(X, X)−1 F
     = k1(x1, X1) k(X, X)−1 F + k2(x2, X2) k(X, X)−1 F
     = m1(x1) + m2(x2)

⇒ The model shares the additive behaviour of the prior.

The sub-models can be interpreted as GP regression models with observation noise :

m1(x1) = E ( Z1(x1) | Z1(X1) + Z2(X2) = F )
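A sketch of this decomposition (the design, kernels and data below are toy choices of ours, not the slide's experiment):

```python
import numpy as np

def k1(x, y, theta=0.2):   # kernel acting on the first input dimension
    return np.exp(-np.subtract.outer(x, y)**2 / (2 * theta**2))

k2 = k1                    # same kernel family on the second dimension

rng = np.random.default_rng(3)
X = rng.uniform(0, 1, size=(20, 2))                 # hypothetical design
F = np.sin(4 * np.pi * X[:, 0]) + 2 * X[:, 1]       # additive test function
x = rng.uniform(0, 1, size=(5, 2))                  # prediction points

# additive kernel: k(x, y) = k1(x1, y1) + k2(x2, y2)
KXX = k1(X[:, 0], X[:, 0]) + k2(X[:, 1], X[:, 1]) + 1e-8 * np.eye(20)
alpha = np.linalg.solve(KXX, F)

m1 = k1(x[:, 0], X[:, 0]) @ alpha    # sub-model depending only on x1
m2 = k2(x[:, 1], X[:, 1]) @ alpha    # sub-model depending only on x2
m = m1 + m2                          # full additive mean predictor
```

Each sub-model reuses the same weights alpha = k(X, X)⁻¹F, mirroring the decomposition of m(x) above.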

SLIDE 39

Sum of kernels over tensor space

Remark

The prediction variance has interesting features :

[Figure: prediction variance over (x1, x2) with a kernel product vs. with a kernel sum]

SLIDE 40

Sum of kernels over tensor space

This property can be used to construct a design of experiment that covers the space with only cst × d points.

[Figure: prediction variance over (x1, x2) for such a design]

SLIDE 41

Product over the same space

Property

k(x, y) = k1(x, y) × k2(x, y) is a valid covariance structure.

Example

We consider the product of a squared exponential with a cosine :

[Figure: the two kernels and their product]

SLIDE 42

Product over the tensor space

Property

k(x, y) = k1(x1, y1) × k2(x2, y2) is a valid covariance structure.

Example

We multiply two squared-exponential kernels :

[Figure: a 1D kernel in x1 × a 1D kernel in x2 = a 2D kernel surface]

Calculation shows we obtain the usual 2D squared exponential kernels.

SLIDE 43

Composition with a function

Property

Let k1 be a kernel over D1 × D1 and f be an arbitrary function D → D1, then k(x, y) = k1(f (x), f (y)) is a kernel over D × D.

Proof : Σ_i Σ_j a_i a_j k(x_i, x_j) = Σ_i Σ_j a_i a_j k1(f(x_i), f(x_j)) ≥ 0, since k1 is psd (take y_i = f(x_i) in the definition).

Remarks : k corresponds to the covariance of Z(x) = Z1(f(x)). This can be seen as a (nonlinear) rescaling of the input space.

SLIDE 44

Example

We consider f(x) = 1/x and a Matérn 3/2 kernel k1(x, y) = (1 + |x − y|) e−|x−y|. We obtain :

[Figure: the resulting kernel and associated sample paths]

SLIDE 45

All these transformations can be combined !

Example

k(x, y) = f(x) f(y) k1(x, y) is a valid kernel. This can be illustrated with f(x) = 1/x and k1(x, y) = (1 + |x − y|) e−|x−y| :

[Figure: the resulting kernel and associated sample paths]

SLIDE 46

Introduction What is a kernel ? Choosing the appropriate kernel Making new from old Effect of linear operators Application : Periodicity detection Conclusion

SLIDE 47

Effect of a linear operator

Property (Ginsbourger 2013)

Let L be a linear operator that commutes with the covariance, then k(x, y) = Lx(Ly(k1(x, y))) is a kernel.

Example

We want to approximate a function [0, 1] → R that is symmetric with respect to 0.5. We will consider two linear operators :

L1 : f(x) → f(x) if x < 0.5, and f(x) → f(1 − x) if x ≥ 0.5
L2 : f(x) → (f(x) + f(1 − x)) / 2

SLIDE 48

Effect of a linear operator

Example

Associated sample paths are :

k1 = L1(L1(k)) : [Figure: sample paths]
k2 = L2(L2(k)) : [Figure: sample paths]

The differentiability is not always respected !

SLIDE 49

Effect of a linear operator

These linear operators are projections onto a space Hsym of symmetric functions :

[Figure: f in H projected onto Hsym by L1 and L2]

What about the optimal projection ? ⇒ This can be difficult... but it raises interesting questions !

SLIDE 50

Introduction What is a kernel ? Choosing the appropriate kernel Making new from old Effect of linear operators Application : Periodicity detection Conclusion

SLIDE 51

Periodicity detection

We will now discuss the detection of periodicity : given a few observations, can we extract the periodic part of a signal ?

[Figure: the observed signal]

SLIDE 52

As previously, we will build a decomposition of the process into two independent GPs : Z = Zp + Za, where Zp is a GP in the span of the Fourier basis B(t) = (sin(t), cos(t), . . . , sin(nt), cos(nt))ᵗ.

Property

It can be proved that the kernels of Zp and Za are

kp(x, y) = B(x)ᵗ G−1 B(y)
ka(x, y) = k(x, y) − kp(x, y)

where G is the Gram matrix associated with B in the RKHS.

SLIDE 53

As previously, a decomposition of the model comes with a decomposition of the kernel :

m(x) = (kp(x, X) + ka(x, X)) k(X, X)−1 F
     = kp(x, X) k(X, X)−1 F   (periodic sub-model mp)
     + ka(x, X) k(X, X)−1 F   (aperiodic sub-model ma)

and we can associate a prediction variance to the sub-models :

vp(x) = kp(x, x) − kp(x, X) k(X, X)−1 kp(X, x)
va(x) = ka(x, x) − ka(x, X) k(X, X)−1 ka(X, x)

SLIDE 54

Example

For the observations shown previously we obtain :

[Figure: full model = periodic sub-model + aperiodic sub-model]

Can we do any better ?

SLIDE 55

Initially, the kernels are parametrised by two variables, k(x, y, σ², θ), but writing k as a sum allows the parameters of the sub-kernels to be tuned independently. Let k∗ be defined as

k∗(x, y, σp², σa², θp, θa) = kp(x, y, σp², θp) + ka(x, y, σa², θa)

Furthermore, we include a fifth parameter in k∗ accounting for the period, by changing the Fourier basis :

Bω(t) = (sin(ωt), cos(ωt), . . . , sin(nωt), cos(nωt))ᵗ

SLIDE 56

MLE of the 5 parameters of k∗ gives :

[Figure: full model = periodic sub-model + aperiodic sub-model]

We will now illustrate the use of these kernels for gene expression analysis.

SLIDE 57

We can apply this method to study the circadian rhythm in organisms. We used Arabidopsis data from Edward 2006.

The dimension of the data is : 22810 genes, 13 time points.

Edward 2006 gives a list of the 3504 most periodically expressed genes. The comparison with our approach gives :

21767 genes with the same label (2461 periodic and 19306 non-periodic)
1043 genes with different labels

SLIDE 58

Let’s look at genes with different labels :

[Figure: expression profiles of eight genes with conflicting labels: At1g60810, At4g10040, At1g06290, At5g48900, At5g41480, At3g08000, At3g03900, At2g36400; the legend distinguishes "periodic for Edward" from "periodic for our approach"]

SLIDE 59

Introduction What is a kernel ? Choosing the appropriate kernel Making new from old Effect of linear operators Application : Periodicity detection Conclusion

SLIDE 60

Small recap

We have seen that :

Kernels have a huge impact on the model.
They have to reflect the prior belief on the function to approximate.
Kernels can (and should) be tailored to the problem at hand.
Although a direct proof of the positive definiteness of a function is often intractable, Bochner's theorem allows building kernels from their power spectrum.

SLIDE 61

Various operations can be applied to kernels while keeping p.s.d.ness :

Making new from old

sum
product
composition with a function

Linear operator

If we have a linear application that transforms any function into a function satisfying the desired property, it is possible to build a GP fulfilling the requirements.

SLIDE 62

C. E. Rasmussen and C. Williams. Gaussian Processes for Machine Learning. The MIT Press, 2006.
A. Berlinet and C. Thomas-Agnan. RKHS in Probability and Statistics. Kluwer Academic, 2004.
N. Durrande, D. Ginsbourger and O. Roustant. Additive covariance kernels for high-dimensional Gaussian process modeling. AFST, 2012.
N. Durrande, D. Ginsbourger, O. Roustant and L. Carraro. ANOVA kernels and RKHS of zero mean functions for model-based sensitivity analysis. JMA, 2013.
N. Durrande, J. Hensman, M. Rattray and N. D. Lawrence. Detecting periodicities with Gaussian processes. PeerJ Computer Science, 2016.
D. Ginsbourger, X. Bay, L. Carraro and O. Roustant. Argumentwise invariant kernels for the approximation of invariant functions. AFST, 2012.
