DD2434 - Advanced Machine Learning: Gaussian Processes (Carl Henrik Ek)



SLIDE 1

Introduction Recap Kernels Gaussian Processes References

DD2434 - Advanced Machine Learning

Gaussian Processes

Carl Henrik Ek {chek}@csc.kth.se

Royal Institute of Technology

November 5th, 2015

Ek KTH DD2434 - Advanced Machine Learning

SLIDE 2

Last Lecture

  • General Probabilistic Modelling

▶ Probabilistic objects
▶ Marginalisation

  • Kernels

▶ Dual linear regression
▶ Implications for modelling

SLIDE 3

Introduction Recap Kernels Gaussian Processes

SLIDE 4

Regression

  • Two variates

▶ Input data xi ∈ Rq
▶ Output data yi ∈ RD

  • Relationship: f : X → Y

SLIDE 5

Regression

Uncertainty

  • We are uncertain in our data
  • This means we cannot trust

▶ our observations
▶ the mapping that we learn
▶ the predictions that we make under the mapping

SLIDE 6

Regression

Uncertainty

  • Uncertainty in outputs yi

▶ Additive noise: yi = Wxi + ϵ
▶ Gaussian distributed noise: ϵ ∼ N(0, σ²)

  • Likelihood

SLIDE 7

Regression

Uncertainty in prediction

  • Posterior

▶ conditional distribution
▶ after the relevant information has been taken into account

  • What is relevant?

▶ our belief: the prior p(W)
▶ the observations: the likelihood p(Y|W, X)

SLIDE 8

Regression

p(Y|W, X) = ∏_{i=1}^N p(yi|W, xi)   (1)

Structure

  • Do the variables co-vary?
  • Are there (in-)dependency structures that I can exploit?

SLIDE 9

Toolbox

  • 1. Formulate the prediction error (likelihood)

▶ Does the likelihood have structure?

  • 2. Formulate belief about the model in the prior

▶ Does the prior have structure?

  • 3. Reach the posterior by combining likelihood and prior
  • 4. Choose the model based on the evidence p(D|M)

SLIDE 10

p(W)

SLIDE 11

[Figure: prior p(W), posterior p(W|X, Y), and samples of W]

SLIDE 21

Conditional1

p(X|Y) = p(Y|X)p(X) / p(Y)   (2)

Conjugate Distributions

  • The posterior and the prior are in the same family
  • Relationship with all three terms

1Wikipedia, Bishop 2006, p. 2.4.2

SLIDE 22

Marginal

p(Y|X) = ∫ p(Y|W, X)p(W)dW (3)

  • Averages over our belief and how well the model fits the observations
  • "Pushes" uncertain belief in the parameters (in this case) through to the observations
  • Gaussian marginal is Gaussian

SLIDE 23

Dual Linear Regression2

[K]ij = xiᵀxj   (4)

J(a) = ½aᵀKKa − aᵀKy + ½yᵀy + (λ/2)aᵀKa   (5)

a = (K + λI)⁻¹y   (6)

2Bishop 2006, p. 6.1.

SLIDE 24

Dual Linear Regression2

[K]ij = xiᵀxj   (7)

J(a) = ½aᵀKKa − aᵀKy + ½yᵀy + (λ/2)aᵀKa   (8)

a = (K + λI)⁻¹y   (9)

y(xi) = wᵀxi = aᵀXxi = k(xi, X)ᵀ(K + λI)⁻¹y   (10)

2Bishop 2006, p. 6.1.
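Equations (7)-(10) translate into a few lines of linear algebra. A minimal sketch of dual (kernel) ridge regression with the linear kernel, assuming NumPy is available (variable names are illustrative, not the course code):

```python
import numpy as np

# A sketch of dual linear regression, Eqs. (7)-(10), with the linear
# kernel [K]_ij = x_i^T x_j; lam plays the role of lambda.
def fit_dual(X, y, lam):
    K = X @ X.T                                          # Gram matrix, Eq. (7)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)  # a = (K + lam I)^{-1} y, Eq. (9)

def predict_dual(X, a, x_star):
    k_star = X @ x_star                                  # k(x_*, X)
    return k_star @ a                                    # y(x_*), Eq. (10)

X = np.array([[0.0], [1.0], [2.0]])                      # toy data on the line y = x
y = np.array([0.0, 1.0, 2.0])
a = fit_dual(X, y, lam=1e-6)
print(predict_dual(X, a, np.array([1.5])))               # close to 1.5
```

Note that only inner products between data points are ever needed, which is what makes the swap to an arbitrary kernel possible.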

SLIDE 25

Kernels

Kernel Functions

  • A function such that

k(xi, xj) = ϕ(xi)ᵀϕ(xj)   (11)
          = ||ϕ(xi)|| ||ϕ(xj)|| cos(θ)   (12)

  • If we have k(·, ·) we never have to know the mapping ϕ(·)

SLIDE 26

The benefits of Kernels

  • Kernels allow for implicit feature mappings

▶ We do NOT need to know the feature space
▶ Example: the space can have infinite dimensionality
▶ The mapping can be non-linear but the problem remains linear!
▶ Allows for putting weird things like strings (DNA) in a vector space

SLIDE 30

This Lecture

  • Kernel Methods

▶ Implicit feature spaces
▶ Building kernels

  • Gaussian Processes

▶ Priors over the space of functions
▶ Learning parameters of kernels

SLIDE 31

Introduction Recap Kernels Gaussian Processes

SLIDE 32

Kernels

σ(X, Y) = E[(X − E[X])ᵀ(Y − E[Y])]
        = E[XᵀY] − E[X]ᵀE[Y]   {E[X] = E[Y] = 0}
        = E[XᵀY]   (13)

SLIDE 34

Kernels

σ(X, Y) = [ x11 x21 x31 ] [ y11 y12 ]
          [ x12 x22 x32 ] [ y21 y22 ]   (14)
                          [ y31 y32 ]

        = [ x11y11 + x21y21 + x31y31   x11y12 + x21y22 + x31y32 ]
          [ x12y11 + x22y21 + x32y31   x12y12 + x22y22 + x32y32 ]

SLIDE 35

Kernels

σ(X, Y) = [ x11 x21 x31 ] [ y11 y12 ]
          [ x12 x22 x32 ] [ y21 y22 ]   (15)
                          [ y31 y32 ]

        = [ x11y11 + x21y21 + x31y31   x11y12 + x21y22 + x31y32 ]
          [ x12y11 + x22y21 + x32y31   x12y12 + x22y22 + x32y32 ]

σ(Xᵀ, Yᵀ) = [ x11 x12 ] [ y11 y21 y31 ]
            [ x21 x22 ] [ y12 y22 y32 ]   (16)
            [ x31 x32 ]

          = [ x11y11 + x12y12   x11y21 + x12y22   x11y31 + x12y32 ]
            [ x21y11 + x22y12   x21y21 + x22y22   x21y31 + x22y32 ]
            [ x31y11 + x32y12   x31y21 + x32y22   x31y31 + x32y32 ]

SLIDE 36

Kernels

Kernels and covariances

  • Covariance between columns: XTY (data-dimensions)
  • Covariance between rows: XYT (data-points)
  • Kernels: k(x, y) = ϕ(x)Tϕ(y)

▶ Kernel functions are covariances between data-points

  • A kernel function describes the co-variance of the data points
  • Specific class of functions

SLIDE 42

Kernels

k(xi, xj) = σ² exp(−(xi − xj)ᵀ(xi − xj) / 2ℓ²)   (17)

Squared Exponential

  • How does the data vary along the dimensions spanned by the data
  • RBF, Squared Exponential, Exponentiated Quadratic
  • Co-variance smoothly decays with distance
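The squared-exponential co-variance of Eq. (17) is a one-liner in code; a small sketch, assuming NumPy (hyper-parameter names sigma2 and ell are illustrative):

```python
import numpy as np

# Squared-exponential (RBF) covariance, Eq. (17); sigma2 is the signal
# variance and ell the lengthscale.
def k_se(xi, xj, sigma2=1.0, ell=1.0):
    d = xi - xj
    return sigma2 * np.exp(-0.5 * float(np.dot(d, d)) / ell**2)

print(k_se(np.array([0.0]), np.array([0.0])))  # 1.0 at zero distance
print(k_se(np.array([0.0]), np.array([3.0])))  # ~0.011: decays smoothly with distance
```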

SLIDE 43

Building Kernels

Expression                             Conditions
k(x, z) = c k1(x, z)                   c - any non-negative real constant
k(x, z) = f(x)k1(x, z)f(z)             f - any real-valued function
k(x, z) = q(k1(x, z))                  q - any polynomial with non-negative coefficients
k(x, z) = exp(k1(x, z))
k(x, z) = k1(x, z) + k2(x, z)
k(x, z) = k1(x, z)k2(x, z)
k(x, z) = k3(ϕ(x), ϕ(z))               k3 - valid kernel in the space mapped by ϕ
k(x, z) = ⟨Ax, z⟩ = ⟨x, Az⟩            A - symmetric positive semi-definite matrix
k(x, z) = ka(xa, za) + kb(xb, zb)      xa and xb - not necessarily disjoint partitions of x
k(x, z) = ka(xa, za)kb(xb, zb)         ka and kb - valid kernels on their respective spaces
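The sum and product rules above can be checked numerically: a Gram matrix built from a valid kernel must stay positive semi-definite. A sketch, assuming NumPy (function names are illustrative), composing a linear and a squared-exponential kernel:

```python
import numpy as np

# Valid kernels are closed under sums (and products); compose a linear
# and a squared-exponential kernel and check the Gram matrix.
def k_lin(x, z):
    return float(np.dot(x, z))

def k_se(x, z, sigma2=1.0, ell=1.0):
    d = x - z
    return sigma2 * float(np.exp(-0.5 * np.dot(d, d) / ell**2))

def k_sum(x, z):
    return k_lin(x, z) + k_se(x, z)

# the Gram matrix of a valid kernel is positive semi-definite
X = np.random.default_rng(0).normal(size=(5, 2))
K = np.array([[k_sum(a, b) for b in X] for a in X])
print(np.linalg.eigvalsh(K).min() >= -1e-9)  # True
```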

SLIDE 44

Summary

  • Defines inner products in some space
  • We don't need to know the space, it's implicitly defined by the kernel function
  • Defines co-variance between data-points

SLIDE 47

Introduction Recap Kernels Gaussian Processes

SLIDE 48

What have you seen up till now?

  • Probabilistic modelling

▶ likelihood, prior, posterior ▶ marginalisation

  • Implicit feature spaces

▶ kernel functions

  • We have assumed the form of the mapping without uncertainty

SLIDE 50

Outline

  • General Regression
  • Introduce uncertainty in mapping
  • prior over the space of functions

SLIDE 53

Regression

Regression model,

yi = f(xi) + ϵ   (18)
ϵ ∼ N(0, σ²I)   (19)

Introduce fi as an instantiation of the function,

fi = f(xi),   (20)

as a new random variable.

SLIDE 54

Regression

Model,

p(Y, f, X, θ) = p(Y|f)p(f|X, θ)p(X)p(θ)   (21)

We want to "push" X through a mapping f of which we are uncertain,

p(f|X, θ),   (22)

a prior over instantiations of the function.

SLIDE 55

Priors over functions3

3Lecture7/gp basics.py

SLIDE 60

Gaussian Distribution

Joint distribution,

[ x1 ]     ( [ µ1 ]   [ σ(x1, x1)  σ(x1, x2) ] )
[ x2 ]  ∼ N( [ µ2 ] , [ σ(x2, x1)  σ(x2, x2) ] )   (23)

Conditional,

x2|x1 ∼ N( µ2 + σ(x2, x1)σ(x1, x1)⁻¹(x1 − µ1),
           σ(x2, x2) − σ(x2, x1)σ(x1, x1)⁻¹σ(x1, x2) )   (24)
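Equation (24) in code, for the zero-mean, unit-variance case used on the following slides; a minimal sketch (names are illustrative):

```python
# Conditioning x2 | x1 in a bivariate Gaussian, Eq. (24), with
# mu1 = mu2 = 0, unit variances, and correlation rho.
def condition(x1, rho):
    mean = rho * x1      # sigma(x2, x1) sigma(x1, x1)^{-1} (x1 - mu1)
    var = 1.0 - rho**2   # sigma(x2, x2) - sigma(x2, x1) sigma(x1, x1)^{-1} sigma(x1, x2)
    return mean, var

print(condition(1.0, rho=0.5))    # (0.5, 0.75): weak coupling, wide conditional
print(condition(1.0, rho=0.99))   # mean 0.99, variance ~0.02: x2 is pinned near x1
```

This is exactly the effect shown in the conditional-Gaussian figures below: the stronger the correlation, the narrower the conditional.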

SLIDE 61

The Gaussian Conditional4

N( [0; 0], [1  0.5; 0.5  1] )   (25)

4Lecture7/conditional gaussian.py

SLIDE 66

The Gaussian Conditional4

N( [0; 0], [1  0.99; 0.99  1] )   (30)

4Lecture7/conditional gaussian.py

SLIDE 71

The Gaussian Conditional4

N( [0; 0], [1  1; 1  1] )   (35)

4Lecture7/conditional gaussian.py

SLIDE 85

If all instantiations of the function are jointly Gaussian, such that the co-variance structure depends on how much information one observation provides about the others, we get the curve above.

SLIDE 86

Row space

  • Co-variance between each point!
  • Co-variance function is a kernel!
  • We can do all of this in the induced space, i.e. allow for any function!

SLIDE 89

Gaussian Processes5

p(f|X, θ) ∼ GP(µ(X), k(X, X))   (40)

Definition

A Gaussian Process is an infinite collection of random variables, any finite subset of which is jointly Gaussian. The process is specified by a mean function µ(·) and a co-variance function k(·, ·),

f ∼ GP(µ(·), k(·, ·))   (41)

5Bishop 2006, p. 6.4.2

SLIDE 90

Gaussian Processes5

p(f|X, θ) ∼ GP(µ(X), k(X, X))   (42)
yi = fi + ϵ   (43)
ϵ ∼ N(0, σ²I)   (44)

p(Y|X, θ) = ∫ p(Y|f)p(f|X, θ)df   (45)

Connection to Distribution

The GP is infinite, but we only observe a finite amount of data. This means that, conditioned on a finite subset of the data, the GP is just a Gaussian distribution, which is self-conjugate.

5Bishop 2006, p. 6.4.2

SLIDE 91

Gaussian Processes5

The mean function

  • Function of only the input location
  • What do I expect the function value to be, accounting only for the input location?
  • We will assume this to be constant

The co-variance function

  • Function of two input locations
  • How should information from other locations with known function values affect my estimate?
  • Encodes the behavior of the function

5Bishop 2006, p. 6.4.2

SLIDE 94

Gaussian Processes5

The Prior

p(f|X, θ) = GP(µ(x), k(x, x′))   (46)
µ(x) = 0   (47)
k(xi, xj) = σ² exp(−(xi − xj)ᵀ(xi − xj) / 2ℓ²)   (48)
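Drawing function realisations from this prior amounts to sampling a multivariate Gaussian whose covariance is the SE Gram matrix. A sketch in the spirit of the course's gp basics.py, assuming NumPy (not the actual course script):

```python
import numpy as np

# Sample realisations of f from the GP prior of Eqs. (46)-(48):
# zero mean, squared-exponential covariance.
def se_gram(X, sigma2=1.0, ell=1.0):
    d = X[:, None] - X[None, :]
    return sigma2 * np.exp(-0.5 * d**2 / ell**2)

X = np.linspace(-5.0, 5.0, 100)
K = se_gram(X) + 1e-8 * np.eye(len(X))   # jitter for numerical stability
rng = np.random.default_rng(0)
f = rng.multivariate_normal(np.zeros(len(X)), K, size=3)
print(f.shape)  # (3, 100): three function draws over 100 input locations
```

Plotting each row of f against X gives smooth random curves; shortening ell makes the draws wiggle faster.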

5Bishop 2006, p. 6.4.2

SLIDE 106

Gaussian Processes5

The (predictive) Posterior

[ f  ]     ( [ 0 ]   [ k(X, X)    k(X, x∗)  ] )
[ f∗ ]  ∼ N( [ 0 ] , [ k(x∗, X)   k(x∗, x∗) ] )   (49)

p(f∗|x∗, X, f, θ) = N( k(x∗, X)ᵀK(X, X)⁻¹f,
                       k(x∗, x∗) − k(x∗, X)ᵀK(X, X)⁻¹K(X, x∗) )   (50)
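Equation (50) is a few lines of linear algebra. A noise-free sketch, assuming NumPy and an SE covariance (names are illustrative):

```python
import numpy as np

# GP predictive posterior, Eq. (50), for noise-free observations f at
# inputs X, with a squared-exponential covariance.
def k_se(a, b, ell=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

def gp_posterior(X, f, x_star, jitter=1e-10):
    K = k_se(X, X) + jitter * np.eye(len(X))   # k(X, X), with jitter
    k_s = k_se(X, x_star)                      # k(X, x*)
    mean = k_s.T @ np.linalg.solve(K, f)       # k(x*, X)^T K^{-1} f
    cov = k_se(x_star, x_star) - k_s.T @ np.linalg.solve(K, k_s)
    return mean, np.diag(cov)

X = np.array([-2.0, 0.0, 2.0])
f = np.sin(X)
mean, var = gp_posterior(X, f, np.array([0.0]))
print(mean[0], var[0])  # mean ~ sin(0) = 0 and variance ~ 0 at an observed input
```

At an observed input the posterior collapses onto the data; between observations the variance grows back towards the prior, which is the behaviour shown on the following slides.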

5Bishop 2006, p. 6.4.2

SLIDE 107

Gaussian Processes5

k(x∗, X)ᵀK(X, X)⁻¹f   (51)

5Bishop 2006, p. 6.4.2

SLIDE 108

Gaussian Processes5

k(x∗, x∗) − k(x∗, X)ᵀK(X, X)⁻¹K(X, x∗)   (52)

5Bishop 2006, p. 6.4.2

SLIDE 122

Gaussian Processes5

Summary

  • A GP is a prior over function realisations
  • Introduce a new random variable as the output of the mapping
  • The joint distribution of any set of observations is Gaussian
  • The posterior (predictive) distribution is a conditional Gaussian

5Bishop 2006, p. 6.4.2

SLIDE 123

Co-variances in practice

[ y  ]     ( [ 0 ]   [ k(X, X) + σ²I   k(X, x∗)  ] )
[ f∗ ]  ∼ N( [ 0 ] , [ k(x∗, X)        k(x∗, x∗) ] )   (55)

  • The conditional distribution passes exactly through the data

▶ noise-free observations

  • Construct covariance functions by rules for building kernels

▶ k(xi, xj) = λ1kSE(xi, xj) + λ2klin(xi, xj) + λ3kwhite(xi, xj)
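The effect of the σ²I term in Eq. (55) is easy to see numerically: with it, the posterior mean no longer interpolates the data exactly. A sketch, assuming NumPy and an SE covariance (names are illustrative):

```python
import numpy as np

# Posterior mean with and without observation noise, cf. Eq. (55):
# the sigma^2 I term smooths the fit away from the data.
def k_se(a, b):
    return np.exp(-0.5 * (a[:, None] - b[None, :])**2)

def gp_mean(X, y, x_star, noise):
    K = k_se(X, X) + noise * np.eye(len(X))
    return k_se(X, x_star).T @ np.linalg.solve(K, y)

X = np.array([-1.0, 0.0, 1.0])
y = np.array([0.0, 1.0, 0.0])
print(gp_mean(X, y, np.array([0.0]), noise=0.0)[0])  # ~1.0: passes through the data
print(gp_mean(X, y, np.array([0.0]), noise=0.5)[0])  # ~0.52: smoothed by the noise
```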

SLIDE 124

Co-variances in practice

Periodic kernel,

k(xi, xj) = σ² exp(−(2/ℓ²) sin²(π|xi − xj| / p))   (56)

Periodic functions

  • ℓ lengthscale
  • p period of function
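Equation (56) as code; a small sketch, assuming NumPy (hyper-parameter names are illustrative):

```python
import numpy as np

# Periodic covariance, Eq. (56): ell is the lengthscale, p the period.
def k_periodic(xi, xj, sigma2=1.0, ell=1.0, p=2.0):
    return sigma2 * np.exp(-2.0 * np.sin(np.pi * abs(xi - xj) / p)**2 / ell**2)

print(k_periodic(0.0, 2.0))  # 1.0: one full period apart, perfectly correlated again
print(k_periodic(0.0, 1.0))  # ~0.135 = exp(-2): half a period apart
```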

SLIDE 125

Co-variances in practice

klin(xi, xj) = xiᵀxj   (57)

k(xi, xj) = (2/π) sin⁻¹( 2xiᵀΣxj / √((1 + 2xiᵀΣxi)(1 + 2xjᵀΣxj)) )   (58)

xi = [1, x1i, . . . , xqi]ᵀ   (59)

"Computation with Infinite Neural Networks", Williams

Non-stationary functions

  • Non-stationary co-variance
  • Functions that behave differently in different parts of the domain
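Williams' arcsin kernel, eq. (58) with the augmented input of eq. (59), can be sketched as follows (Σ and the example inputs are illustrative assumptions):

```python
import numpy as np

def k_arcsin(xi, xj, Sigma):
    # augment the inputs with a bias component, eq. (59)
    xi = np.concatenate(([1.0], xi))
    xj = np.concatenate(([1.0], xj))
    # arcsin covariance of an infinite neural network, eq. (58)
    num = 2.0 * xi @ Sigma @ xj
    den = np.sqrt((1.0 + 2.0 * xi @ Sigma @ xi) * (1.0 + 2.0 * xj @ Sigma @ xj))
    return (2.0 / np.pi) * np.arcsin(num / den)
```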


slide-126
SLIDE 126

Introduction Recap Kernels Gaussian Processes References

Co-variances in practice6

[K]ij = k(xi, xj)   (60)

6/Lecture7/covariance.py
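The script /Lecture7/covariance.py is not reproduced here, but eq. (60) amounts to evaluating the kernel on every pair of inputs; a minimal sketch (the squared-exponential kernel is an assumed choice):

```python
import numpy as np

def gram_matrix(k, X):
    # [K]_ij = k(x_i, x_j), eq. (60)
    N = X.shape[0]
    K = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            K[i, j] = k(X[i], X[j])
    return K

# example kernel: squared exponential
se = lambda a, b: np.exp(-0.5 * np.sum((a - b) ** 2))
```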

slide-127
SLIDE 127

Introduction Recap Kernels Gaussian Processes References

Co-variances in practice

Summary

  • Covariance functions encode your preferences about function behaviour
  • Choosing the right co-variance is very important
  • Ask yourself what you know about the variations in the data


slide-128
SLIDE 128

Introduction Recap Kernels Gaussian Processes References

Assignment

You should now be able to do Task 2.2 of the Assignment


slide-129
SLIDE 129

Introduction Recap Kernels Gaussian Processes References

Learning in Gaussian Processes6

Hyper-parameters

  • The prior has parameters

▶ referred to as hyper-parameters ▶ the SE kernel has a lengthscale and a variance

  • Learning in GPs means inferring the hyper-parameters of the model

6Bishop 2006, p. 6.4.3

slide-130
SLIDE 130

Introduction Recap Kernels Gaussian Processes References

Learning in Gaussian Processes6

p(Y|X, θ) = ∫ p(Y|f)p(f|X, θ)df (61)

Marginal Likelihood

  • We are not interested in f directly
  • Marginalise out f!
  • The marginal of a Gaussian is Gaussian

6Bishop 2006, p. 6.4.3

slide-133
SLIDE 133

Introduction Recap Kernels Gaussian Processes References

Learning in Gaussian Processes6

Learning

  • Type-II Maximum Likelihood

θ̂ = argmaxθ p(Y|X, θ)   (64)

  • How is this different from a normal ML estimate?
  • The many exponentials in the objective make it natural to work in log-space

▶ The logarithm is monotonic ⇒ it does not alter the location of the extreme points of a function

▶ Minimising the negative log() rather than maximising the log() is purely practical

6Bishop 2006, p. 6.4.3

slide-138
SLIDE 138

Introduction Recap Kernels Gaussian Processes References

Learning in Gaussian Processes6

argmaxθ p(Y|X, θ) = argminθ −log p(Y|X, θ) = argminθ L(θ)   (69)

L(θ) = ½yᵀK⁻¹y + ½log|K| + (N/2)log(2π)   (70)

  • Can be minimised using gradient-based methods
  • Data-fit term: ½yᵀK⁻¹y
  • Complexity term: ½log|K|

6Bishop 2006, p. 6.4.3
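In practice L(θ) is usually evaluated through a Cholesky factorisation for numerical stability; a sketch (function and variable names are mine, not the lecture's):

```python
import numpy as np

def neg_log_marginal_likelihood(K, y):
    # L(theta) = 1/2 y^T K^-1 y + 1/2 log|K| + N/2 log(2 pi), eq. (70)
    L = np.linalg.cholesky(K)                            # K = L L^T, K must be positive definite
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # alpha = K^-1 y
    data_fit = 0.5 * y @ alpha                           # rewards explaining the data
    complexity = np.sum(np.log(np.diag(L)))              # equals 1/2 log|K|, penalises complexity
    return data_fit + complexity + 0.5 * len(y) * np.log(2.0 * np.pi)
```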

slide-141
SLIDE 141

Introduction Recap Kernels Gaussian Processes References

Learning in Gaussian Processes6

L(θ) = ½yᵀK⁻¹y + ½log|K| + (N/2)log(2π)

6Bishop 2006, p. 6.4.3

slide-162
SLIDE 162

Introduction Recap Kernels Gaussian Processes References

Summary

  • Kernels are covariance functions of data-points
  • Gaussian processes are priors over functions
  • GPs allow us to average over all possible functions
  • Nothing different compared to Lecture 2, just a different prior!
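The whole pipeline, prior covariance, joint Gaussian, conditional prediction, fits in a few lines (a sketch with toy data and an assumed squared-exponential kernel):

```python
import numpy as np

def se(a, b, ell=1.0):
    # squared-exponential covariance between two 1-D input sets
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

# toy training data and a test input
X = np.array([0.0, 1.0, 2.0])
y = np.sin(X)
Xs = np.array([1.5])

# covariance blocks of the joint distribution, cf. eq. (55)
noise = 1e-4
Kxx = se(X, X) + noise * np.eye(len(X))
Kxs = se(X, Xs)
Kss = se(Xs, Xs)

# posterior (predictive) distribution is a conditional Gaussian
mean = Kxs.T @ np.linalg.solve(Kxx, y)
cov = Kss - Kxs.T @ np.linalg.solve(Kxx, Kxs)
```

The predictive variance shrinks near the observed inputs, exactly the behaviour the slides describe.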


slide-166
SLIDE 166

Introduction Recap Kernels Gaussian Processes References

Next Time

Practical 1

  • November 6th 15-17 V1
  • My best friend the Gaussian

▶ derive Gaussian identities

  • Complete assignment Tasks 2.1 and 2.2


slide-168
SLIDE 168

Introduction Recap Kernels Gaussian Processes References

e.o.f.


slide-169
SLIDE 169

Introduction Recap Kernels Gaussian Processes References

References I

Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.

Christopher K. I. Williams. “Computation with Infinite Neural Networks”. In: Neural Computation 10 (July 1998), pp. 1203–1216.

