DD2434 - Advanced Machine Learning: Gaussian Processes, Carl Henrik

Introduction Recap Kernels Gaussian Processes References

DD2434 - Advanced Machine Learning
Gaussian Processes
Carl Henrik Ek, {chek}@csc.kth.se
Royal Institute of Technology
November 5th, 2015

Ek KTH DD2434 - Advanced Machine Learning


  1. Kernels: kernels and covariances
     • Covariance between columns: $X^{T} Y$ (data dimensions)
     • Covariance between rows: $X Y^{T}$ (data points)
     • Kernels: $k(x, y) = \phi(x)^{T} \phi(y)$
       ▶ Kernel functions are covariances between data points
     • A kernel function describes the covariance of the data points
     • Specific class of functions
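
To make the identity $k(x, y) = \phi(x)^{T} \phi(y)$ concrete, here is a minimal sketch using the homogeneous quadratic kernel in two dimensions. The choice of kernel, the feature map `phi`, and the test vectors are assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np

def phi(x):
    """Explicit feature map of the 2-D homogeneous quadratic kernel (illustrative choice)."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

def quadratic_kernel(x, y):
    """k(x, y) = (x^T y)^2, evaluated without ever forming phi."""
    return float(np.dot(x, y)) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

k_implicit = quadratic_kernel(x, y)      # kernel evaluated in input space
k_explicit = float(phi(x) @ phi(y))      # inner product in feature space

# Gram matrix of the linear kernel: covariance between rows (data points).
X = np.stack([x, y])
K_linear = X @ X.T
```

Both `k_implicit` and `k_explicit` give the same number, which is the point of the kernel trick: the feature space never has to be constructed.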

  7. Kernels: Squared Exponential
     $k(x_i, x_j) = \sigma^2 e^{-\frac{1}{2\ell^2} (x_i - x_j)^{T} (x_i - x_j)}$  (17)
     • How does the data vary along the dimensions spanned by the data
     • RBF, Squared Exponential, Exponentiated Quadratic
     • Covariance smoothly decays with distance
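
The squared exponential kernel of eq. (17) can be sketched in NumPy as follows. The function name `squared_exponential`, its parameter names, and the example inputs are assumptions for illustration, not the course code.

```python
import numpy as np

def squared_exponential(X1, X2, sigma=1.0, lengthscale=1.0):
    """Kernel matrix between the rows of X1 (n, d) and X2 (m, d), following eq. (17)."""
    X1 = np.atleast_2d(X1)
    X2 = np.atleast_2d(X2)
    # Pairwise squared Euclidean distances (x_i - x_j)^T (x_i - x_j).
    diff = X1[:, None, :] - X2[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return sigma ** 2 * np.exp(-sq_dist / (2.0 * lengthscale ** 2))

# Three 1-D inputs: neighbours at distance 1, a far point at distance 3.
X = np.array([[0.0], [1.0], [3.0]])
K = squared_exponential(X, X, sigma=1.0, lengthscale=1.0)
```

The entries of `K` shrink as the inputs move apart, which is the smooth decay with distance that the slide describes.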

  8. Building Kernels

     Expression                                       Conditions
     $k(x, z) = c\,k_1(x, z)$                         $c$: any non-negative real constant
     $k(x, z) = f(x)\,k_1(x, z)\,f(z)$                $f$: any real-valued function
     $k(x, z) = q(k_1(x, z))$                         $q$: any polynomial with non-negative coefficients
     $k(x, z) = \exp(k_1(x, z))$
     $k(x, z) = k_1(x, z) + k_2(x, z)$
     $k(x, z) = k_1(x, z)\,k_2(x, z)$
     $k(x, z) = k_3(\phi(x), \phi(z))$                $k_3$: valid kernel in the space mapped by $\phi$
     $k(x, z) = \langle Ax, z \rangle = \langle x, Az \rangle$    $A$: symmetric psd matrix
     $k(x, z) = k_a(x_a, z_a) + k_b(x_b, z_b)$        $x_a$ and $x_b$: not necessarily disjoint partitions of $x$;
     $k(x, z) = k_a(x_a, z_a)\,k_b(x_b, z_b)$         $k_a$ and $k_b$: valid kernels on their respective spaces
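
A few of the construction rules above can be checked numerically: a valid kernel must give a positive semi-definite Gram matrix, so the smallest eigenvalue should be non-negative up to rounding. This is a sanity check on one random data set, not a proof; the base kernels, the data, and all names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))          # 20 random 2-D inputs (illustrative data)

def linear_gram(X):
    """Gram matrix of the linear kernel k1(x, z) = x^T z."""
    return X @ X.T

def se_gram(X, sigma=1.0, lengthscale=1.0):
    """Gram matrix of the squared exponential kernel k2."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return sigma ** 2 * np.exp(-sq / (2.0 * lengthscale ** 2))

K1, K2 = linear_gram(X), se_gram(X)

# Gram matrices of kernels built with rules from the table.
candidates = {
    "c * k1": 3.0 * K1,
    "k1 + k2": K1 + K2,
    "k1 * k2": K1 * K2,               # elementwise product of Gram matrices
    "exp(k1)": np.exp(K1),            # elementwise exponential
}

def min_eigenvalue(K):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(K).min())

min_eigs = {name: min_eigenvalue(K) for name, K in candidates.items()}
```

Each value in `min_eigs` should be non-negative or negative only at the level of floating-point error relative to the largest eigenvalue.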

  9. Summary
     • Defines inner products in some space
     • We don't need to know the space; it's implicitly defined by the kernel function
     • Defines covariance between data points

  12. • Introduction
      • Recap
      • Kernels
      • Gaussian Processes

  13. What have you seen up till now?
      • Probabilistic modelling
        ▶ likelihood, prior, posterior
        ▶ marginalisation
      • Implicit feature spaces
        ▶ kernel functions
      • We have assumed the form of the mapping without uncertainty

  15. Outline
      • General regression
      • Introduce uncertainty in the mapping
      • Prior over the space of functions

  18. Regression
      Regression model:
      $y_i = f(x_i) + \epsilon$  (18)
      $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$  (19)
      Introduce $f_i$ as an instantiation of the function,
      $f_i = f(x_i)$,  (20)
      as a new random variable.
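
The regression model of eqs. (18)-(20) can be simulated in a few lines. The latent function (a sine), the noise level, and the input grid are hypothetical choices made only to have something to generate; the slides do not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    """Hypothetical latent function, chosen only for illustration."""
    return np.sin(x)

noise_std = 0.1                                       # sigma in eq. (19), assumed value
x = np.linspace(0.0, 2.0 * np.pi, 50)                 # input locations x_i
f_values = f(x)                                       # f_i = f(x_i), eq. (20)
epsilon = rng.normal(0.0, noise_std, size=x.shape)    # eps ~ N(0, sigma^2 I), eq. (19)
y = f_values + epsilon                                # y_i = f(x_i) + eps, eq. (18)
```

In the lecture's setting only `x` and `y` are observed; `f_values` is the latent quantity that the Gaussian process prior is placed on.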

  19. Regression
      Model:
      $p(Y, f, X, \theta) = p(Y \mid f)\, p(f \mid X, \theta)\, p(X)\, p(\theta)$  (21)
      We want to "push" $X$ through a mapping $f$ about which we are uncertain,
      $p(f \mid X, \theta)$,  (22)
      a prior over instantiations of the function.

  20. Priors over functions [3]
      [3] Lecture7/gp basics.py

  25. Gaussian Distribution
      Joint distribution:
      $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \sigma(x_1, x_1) & \sigma(x_1, x_2) \\ \sigma(x_2, x_1) & \sigma(x_2, x_2) \end{bmatrix} \right)$  (23)
      $x_2 \mid x_1 \sim \mathcal{N}\left( \mu_2 + \sigma(x_2, x_1)\, \sigma(x_1, x_1)^{-1} (x_1 - \mu_1),\; \sigma(x_2, x_2) - \sigma(x_2, x_1)\, \sigma(x_1, x_1)^{-1} \sigma(x_1, x_2) \right)$  (24)
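
The conditional in eq. (24) translates directly into code. The sketch below is one possible implementation; the function name `gaussian_conditional` and its argument names are assumptions, and the observed value $x_1 = 1$ is an arbitrary example.

```python
import numpy as np

def gaussian_conditional(mu1, mu2, S11, S12, S22, x1):
    """Mean and covariance of x2 | x1 for a jointly Gaussian (x1, x2), as in eq. (24)."""
    mu1, mu2, x1 = (np.atleast_1d(np.asarray(v, dtype=float)) for v in (mu1, mu2, x1))
    S11, S12, S22 = (np.atleast_2d(np.asarray(v, dtype=float)) for v in (S11, S12, S22))
    S21 = S12.T                                   # sigma(x2, x1)
    # Solve linear systems instead of forming the inverse explicitly.
    mean = mu2 + S21 @ np.linalg.solve(S11, x1 - mu1)
    cov = S22 - S21 @ np.linalg.solve(S11, S12)
    return mean, cov

# Zero mean, unit variances, and three different correlations.
for rho in (0.5, 0.99, 0.0):
    mean, cov = gaussian_conditional(0.0, 0.0, 1.0, rho, 1.0, x1=1.0)
    print(f"rho={rho}: mean={mean[0]:.4f}, var={cov[0, 0]:.4f}")
```

With correlation 0.5 the conditional mean is 0.5 and the variance 0.75; with correlation 0.99 the observation of $x_1$ almost pins down $x_2$; with correlation 0 the conditional equals the marginal.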

  26. The Gaussian Conditional [4]
      $\mathcal{N}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix} \right)$  (25)
      [4] Lecture7/conditional gaussian.py

  31. The Gaussian Conditional [4]
      $\mathcal{N}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0.99 \\ 0.99 & 1 \end{bmatrix} \right)$  (30)
      [4] Lecture7/conditional gaussian.py

  36. The Gaussian Conditional [4]
      $\mathcal{N}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right)$  (35)
      [4] Lecture7/conditional gaussian.py

  50. If all instantiations of the function are jointly Gaussian, with a covariance structure that reflects how much information one observation provides about another, we get the curve above.

  51. Row space
      • Covariance between each point!
      • The covariance function is a kernel!
      • We can do all this in the induced space, i.e. allow for any function!

  54. Gaussian Processes [5]
      $p(f \mid X, \theta) \sim \mathcal{GP}(\mu(X), k(X, X))$  (40)
      Definition: A Gaussian process is an infinite collection of random variables, any finite subset of which is jointly Gaussian. The process is specified by a mean function $\mu(\cdot)$ and a covariance function $k(\cdot, \cdot)$:
      $f \sim \mathcal{GP}(\mu(\cdot), k(\cdot, \cdot))$  (41)
      [5] Bishop 2006, Sec. 6.4.2

  55. Gaussian Processes [5]
      $p(f \mid X, \theta) \sim \mathcal{GP}(\mu(X), k(X, X))$  (42)
      $y_i = f_i + \epsilon$  (43)
      $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$  (44)
      $p(Y \mid X, \theta) = \int p(Y \mid f)\, p(f \mid X, \theta)\, \mathrm{d}f$  (45)
      Connection to Distribution
      A GP is infinite, but we only observe a finite amount of data. This means that, evaluated at a finite subset of inputs, the GP is just a Gaussian distribution, which is self-conjugate.
      [5] Bishop 2006, Sec. 6.4.2
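
Because both factors in eq. (45) are Gaussian, the integral has a closed form: for a zero-mean prior, $Y \sim \mathcal{N}(0, K + \sigma^2 I)$ with $K = k(X, X)$. The sketch below evaluates the log of that density. The zero mean, the squared exponential kernel, the separate names `sigma_f` (kernel amplitude) and `noise_std` (noise level), and the toy data are all assumptions for illustration.

```python
import numpy as np

def se_kernel(X1, X2, sigma_f=1.0, lengthscale=1.0):
    """Squared exponential kernel matrix between rows of X1 and X2."""
    sq = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return sigma_f ** 2 * np.exp(-sq / (2.0 * lengthscale ** 2))

def log_marginal_likelihood(X, y, sigma_f=1.0, lengthscale=1.0, noise_std=0.1):
    """log N(y | 0, K + noise_std^2 I): eq. (45) with f integrated out (zero-mean prior assumed)."""
    n = X.shape[0]
    K = se_kernel(X, X, sigma_f, lengthscale) + noise_std ** 2 * np.eye(n)
    L = np.linalg.cholesky(K)                               # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))     # K^{-1} y
    return float(-0.5 * y @ alpha
                 - np.sum(np.log(np.diag(L)))               # equals -0.5 * log det K
                 - 0.5 * n * np.log(2.0 * np.pi))

# Toy data: noisy samples of a sine (hypothetical example).
rng = np.random.default_rng(0)
X = np.linspace(0.0, 5.0, 15)[:, None]
y = np.sin(X[:, 0]) + rng.normal(0.0, 0.1, size=15)
lml = log_marginal_likelihood(X, y)
```

The Cholesky factorisation is used here for numerical stability; a direct inverse and determinant give the same value on small, well-conditioned problems.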

  56. Gaussian Processes [5]
      The mean function
      • Function of only the input location
      • What do I expect the function value to be, accounting only for the input location?
      • We will assume this to be constant
      The covariance function
      • Function of two input locations
      • How should the information from other locations with known function values affect my estimate?
      • Encodes the behavior of the function
      [5] Bishop 2006, Sec. 6.4.2

  59. Gaussian Processes [5]
      The Prior
      $p(f \mid X, \theta) = \mathcal{GP}(\mu(x), k(x, x'))$  (46)
      $\mu(x) = 0$  (47)
      $k(x_i, x_j) = \sigma^2 e^{-\frac{1}{2\ell^2} (x_i - x_j)^{T} (x_i - x_j)}$  (48)
      [5] Bishop 2006, Sec. 6.4.2
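
Drawing functions from the prior in eqs. (46)-(48) amounts to sampling a multivariate Gaussian at a finite grid of inputs. The sketch below is one way to do it, not the course's own script; the function names, the grid, the number of samples, and the small `jitter` added to the diagonal for numerical stability are assumptions.

```python
import numpy as np

def se_kernel(X1, X2, sigma=1.0, lengthscale=1.0):
    """Squared exponential kernel of eq. (48) between rows of X1 and X2."""
    sq = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return sigma ** 2 * np.exp(-sq / (2.0 * lengthscale ** 2))

def sample_gp_prior(X, n_samples=5, sigma=1.0, lengthscale=1.0, jitter=1e-6, seed=0):
    """Draw n_samples functions from the zero-mean GP prior, evaluated at the rows of X."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Jitter keeps the Cholesky factorisation stable; it is not part of the model.
    K = se_kernel(X, X, sigma, lengthscale) + jitter * np.eye(n)
    L = np.linalg.cholesky(K)                     # K = L L^T
    z = rng.standard_normal(size=(n, n_samples))  # independent N(0, 1) draws
    return (L @ z).T                              # each row ~ N(0, K), since mu(x) = 0

X = np.linspace(-3.0, 3.0, 100)[:, None]
samples = sample_gp_prior(X, n_samples=5)
```

Each row of `samples` is one function drawn from the prior; a shorter `lengthscale` gives more rapidly varying functions, and a larger `sigma` gives larger amplitudes.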
