Gaussian Processes for Big Data


  1. Gaussian Processes for Big Data — James Hensman, joint work with Nicolò Fusi, Neil D. Lawrence

  2. Overview Motivation Sparse Gaussian Processes Stochastic Variational Inference Examples

  3. Overview Motivation Sparse Gaussian Processes Stochastic Variational Inference Examples

  4. Motivation
Inference in a GP has the following demands:
Complexity: $\mathcal{O}(n^3)$
Storage: $\mathcal{O}(n^2)$
Inference in a sparse GP has the following demands:
Complexity: $\mathcal{O}(nm^2)$
Storage: $\mathcal{O}(nm)$
where we get to pick m!

  5. Still not good enough!
Big Data
◮ In parametric models, stochastic optimisation is used.
◮ This allows for application to Big Data.
This work
◮ Show how to use Stochastic Variational Inference in GPs
◮ Stochastic optimisation scheme: each step requires only $\mathcal{O}(m^3)$

  6. Overview Motivation Sparse Gaussian Processes Stochastic Variational Inference Examples

  7. Computational savings
$$K_{nn} \approx Q_{nn} = K_{nm} K_{mm}^{-1} K_{mn}$$
Instead of inverting $K_{nn}$, we make a low-rank (or Nyström) approximation, and invert $K_{mm}$ instead.
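The saving can be checked numerically. The sketch below uses plain NumPy with a squared-exponential kernel and evenly spaced inducing inputs (illustrative choices, not the talk's setup): it builds the Nyström approximation $Q_{nn}$ while only ever factorising the $m \times m$ matrix $K_{mm}$.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    # Squared-exponential kernel k(a, b) = exp(-|a - b|^2 / (2 l^2))
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * lengthscale ** 2))

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, (500, 1))        # n = 500 training inputs
Z = np.linspace(0, 10, 30)[:, None]     # m = 30 inducing inputs

K_nn = rbf(X, X)
K_mm = rbf(Z, Z) + 1e-8 * np.eye(30)    # jitter for numerical stability
K_nm = rbf(X, Z)

# Nystrom / low-rank approximation: Q_nn = K_nm K_mm^{-1} K_mn.
# Only the m x m matrix K_mm is factorised; K_nn is never inverted.
Q_nn = K_nm @ np.linalg.solve(K_mm, K_nm.T)

print(np.abs(K_nn - Q_nn).max())        # small here: the low-rank fit is tight
```

The error $K_{nn} - Q_{nn}$ is positive semi-definite, which is what makes the trace penalty in the later slides well defined.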

  8. Information capture
Everything we want to do with a GP involves marginalising f:
◮ Predictions
◮ Marginal likelihood
◮ Estimating covariance parameters
The posterior of f is the central object. This means inverting $K_{nn}$.

  9. [Figure: function values over the input space (X)]

  10. [Figure: a function drawn from a GP, $f(x) \sim \mathcal{GP}$, over the input space (X)]

  11. [Figure: GP prior $f(x) \sim \mathcal{GP}$, $p(\mathbf{f}) = \mathcal{N}(\mathbf{0}, K_{nn})$, over the input space (X)]

  12. [Figure: GP prior $p(\mathbf{f}) = \mathcal{N}(\mathbf{0}, K_{nn})$ and posterior $p(\mathbf{f}\,|\,\mathbf{y}, X)$ over the input space (X)]

  13. Introducing u
Take an extra M points on the function, $\mathbf{u} = f(Z)$.
$$p(\mathbf{y}, \mathbf{f}, \mathbf{u}) = p(\mathbf{y}\,|\,\mathbf{f})\, p(\mathbf{f}\,|\,\mathbf{u})\, p(\mathbf{u})$$

  14. Introducing u

  15. Introducing u
Take an extra M points on the function, $\mathbf{u} = f(Z)$.
$$p(\mathbf{y}, \mathbf{f}, \mathbf{u}) = p(\mathbf{y}\,|\,\mathbf{f})\, p(\mathbf{f}\,|\,\mathbf{u})\, p(\mathbf{u})$$
$$p(\mathbf{y}\,|\,\mathbf{f}) = \mathcal{N}\big(\mathbf{y}\,|\,\mathbf{f}, \sigma^2 I\big)$$
$$p(\mathbf{f}\,|\,\mathbf{u}) = \mathcal{N}\big(\mathbf{f}\,|\,K_{nm}K_{mm}^{-1}\mathbf{u},\ \widetilde{K}\big)$$
$$p(\mathbf{u}) = \mathcal{N}\big(\mathbf{u}\,|\,\mathbf{0}, K_{mm}\big)$$
where $\widetilde{K} = K_{nn} - K_{nm}K_{mm}^{-1}K_{mn}$.
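A useful sanity check on this augmentation: marginalising $\mathbf{u}$ back out must recover the original prior $p(\mathbf{f}) = \mathcal{N}(\mathbf{0}, K_{nn})$ exactly. The NumPy sketch below (toy RBF kernel and arbitrary inducing inputs, illustrative assumptions) verifies the identity $\widetilde{K} + A K_{mm} A^\top = K_{nn}$ with $A = K_{nm}K_{mm}^{-1}$.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * lengthscale ** 2))

rng = np.random.default_rng(1)
X = rng.uniform(0, 5, (40, 1))          # n = 40 inputs
Z = np.linspace(0, 5, 8)[:, None]       # m = 8 inducing inputs

K_nn = rbf(X, X)
K_mm = rbf(Z, Z) + 1e-10 * np.eye(8)    # tiny jitter for stability
K_nm = rbf(X, Z)

A = np.linalg.solve(K_mm, K_nm.T).T     # A = K_nm K_mm^{-1}
K_tilde = K_nn - A @ K_nm.T             # conditional covariance of f | u

# Marginalising u ~ N(0, K_mm) out of f | u ~ N(A u, K_tilde) gives
# cov(f) = K_tilde + A K_mm A^T, which collapses back to K_nn exactly.
cov_f = K_tilde + A @ K_mm @ A.T
print(np.abs(cov_f - K_nn).max())       # ~0: the augmentation loses nothing
```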

  16. [Figure: as slide 12, with inducing inputs Z and inducing variables $\mathbf{u}$, prior $p(\mathbf{u}) = \mathcal{N}(\mathbf{0}, K_{mm})$]

  17. [Figure: as slide 16, with the posterior $p(\mathbf{u}\,|\,\mathbf{y}, X)$ over the inducing variables]

  18. The alternative posterior
Instead of computing
$$p(\mathbf{f}\,|\,\mathbf{y}, X) = \frac{p(\mathbf{y}\,|\,\mathbf{f})\, p(\mathbf{f}\,|\,X)}{\int p(\mathbf{y}\,|\,\mathbf{f})\, p(\mathbf{f}\,|\,X)\,\mathrm{d}\mathbf{f}}$$
we'll compute
$$p(\mathbf{u}\,|\,\mathbf{y}, Z) = \frac{p(\mathbf{y}\,|\,\mathbf{u})\, p(\mathbf{u}\,|\,Z)}{\int p(\mathbf{y}\,|\,\mathbf{u})\, p(\mathbf{u}\,|\,Z)\,\mathrm{d}\mathbf{u}}$$

  19. The alternative posterior
Instead of computing
$$p(\mathbf{f}\,|\,\mathbf{y}, X) = \frac{p(\mathbf{y}\,|\,\mathbf{f})\, p(\mathbf{f}\,|\,X)}{\int p(\mathbf{y}\,|\,\mathbf{f})\, p(\mathbf{f}\,|\,X)\,\mathrm{d}\mathbf{f}}$$
we'll compute
$$p(\mathbf{u}\,|\,\mathbf{y}, Z) = \frac{p(\mathbf{y}\,|\,\mathbf{u})\, p(\mathbf{u}\,|\,Z)}{\int p(\mathbf{y}\,|\,\mathbf{u})\, p(\mathbf{u}\,|\,Z)\,\mathrm{d}\mathbf{u}}$$
but $p(\mathbf{y}\,|\,\mathbf{u})$ involves inverting $K_{nn}$.

  20. Variational marginalisation of f
$$\ln p(\mathbf{y}\,|\,\mathbf{u}) = \ln \int p(\mathbf{y}\,|\,\mathbf{f})\, p(\mathbf{f}\,|\,\mathbf{u}, X)\,\mathrm{d}\mathbf{f}$$

  21. Variational marginalisation of f
$$\ln p(\mathbf{y}\,|\,\mathbf{u}) = \ln \int p(\mathbf{y}\,|\,\mathbf{f})\, p(\mathbf{f}\,|\,\mathbf{u}, X)\,\mathrm{d}\mathbf{f}$$
$$\ln p(\mathbf{y}\,|\,\mathbf{u}) = \ln \mathbb{E}_{p(\mathbf{f}|\mathbf{u},X)}\big[p(\mathbf{y}\,|\,\mathbf{f})\big]$$

  22. Variational marginalisation of f
$$\ln p(\mathbf{y}\,|\,\mathbf{u}) = \ln \int p(\mathbf{y}\,|\,\mathbf{f})\, p(\mathbf{f}\,|\,\mathbf{u}, X)\,\mathrm{d}\mathbf{f}$$
$$\ln p(\mathbf{y}\,|\,\mathbf{u}) = \ln \mathbb{E}_{p(\mathbf{f}|\mathbf{u},X)}\big[p(\mathbf{y}\,|\,\mathbf{f})\big]$$
$$\ln p(\mathbf{y}\,|\,\mathbf{u}) \geq \mathbb{E}_{p(\mathbf{f}|\mathbf{u},X)}\big[\ln p(\mathbf{y}\,|\,\mathbf{f})\big] \triangleq \ln \widetilde{p}(\mathbf{y}\,|\,\mathbf{u})$$

  23. Variational marginalisation of f
$$\ln p(\mathbf{y}\,|\,\mathbf{u}) = \ln \int p(\mathbf{y}\,|\,\mathbf{f})\, p(\mathbf{f}\,|\,\mathbf{u}, X)\,\mathrm{d}\mathbf{f}$$
$$\ln p(\mathbf{y}\,|\,\mathbf{u}) = \ln \mathbb{E}_{p(\mathbf{f}|\mathbf{u},X)}\big[p(\mathbf{y}\,|\,\mathbf{f})\big]$$
$$\ln p(\mathbf{y}\,|\,\mathbf{u}) \geq \mathbb{E}_{p(\mathbf{f}|\mathbf{u},X)}\big[\ln p(\mathbf{y}\,|\,\mathbf{f})\big] \triangleq \ln \widetilde{p}(\mathbf{y}\,|\,\mathbf{u})$$
No inversion of $K_{nn}$ required.

  24. An approximate likelihood
$$\widetilde{p}(\mathbf{y}\,|\,\mathbf{u}) = \prod_{i=1}^{n} \mathcal{N}\big(y_i\,|\,\mathbf{k}_{mn}^\top K_{mm}^{-1}\mathbf{u},\ \sigma^2\big)\, \exp\Big(-\frac{1}{2\sigma^2}\big(k_{nn} - \mathbf{k}_{mn}^\top K_{mm}^{-1}\mathbf{k}_{mn}\big)\Big)$$
A straightforward likelihood approximation, and a penalty term.
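By Jensen's inequality this approximate likelihood never exceeds the exact $p(\mathbf{y}\,|\,\mathbf{u}) = \mathcal{N}(\mathbf{y}\,|\,K_{nm}K_{mm}^{-1}\mathbf{u},\ \sigma^2 I + \widetilde{K})$. The NumPy sketch below checks the bound numerically on a toy kernel with random $\mathbf{y}$ and $\mathbf{u}$ (illustrative values, not the paper's experiment); the per-point penalties sum to the trace of $\widetilde{K}$.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * lengthscale ** 2))

rng = np.random.default_rng(3)
n, m, sigma2 = 50, 6, 0.1
X = rng.uniform(0, 5, (n, 1))
Z = np.linspace(0, 5, m)[:, None]

K_nn = rbf(X, X)
K_mm = rbf(Z, Z) + 1e-10 * np.eye(m)
K_nm = rbf(X, Z)
A = np.linalg.solve(K_mm, K_nm.T).T          # K_nm K_mm^{-1}
K_tilde = K_nn - A @ K_nm.T                  # conditional covariance

u = rng.normal(size=m)                       # arbitrary inducing values
y = rng.normal(size=n)                       # arbitrary observations
mu = A @ u                                   # mean of p(f | u)

# Lower bound: sum of per-point Gaussian log-densities plus the penalty,
# which totals -trace(K_tilde) / (2 sigma^2).
bound = (-0.5 * n * np.log(2 * np.pi * sigma2)
         - 0.5 * np.sum((y - mu) ** 2) / sigma2
         - 0.5 * np.trace(K_tilde) / sigma2)

# Exact log p(y | u) = log N(y | mu, sigma^2 I + K_tilde)
C = sigma2 * np.eye(n) + K_tilde
r = y - mu
exact = -0.5 * (n * np.log(2 * np.pi) + np.linalg.slogdet(C)[1]
                + r @ np.linalg.solve(C, r))

print(bound <= exact)  # prints True: Jensen's bound holds
```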

  25. Overview Motivation Sparse Gaussian Processes Stochastic Variational Inference Examples

  26.
$$\log p(\mathbf{y}\,|\,X) \geq \big\langle \mathcal{L}_1 + \log p(\mathbf{u}) - \log q(\mathbf{u}) \big\rangle_{q(\mathbf{u})} \triangleq \mathcal{L}_3. \quad (1)$$
$$\mathcal{L}_3 = \sum_{i=1}^{n}\Big\{ \log \mathcal{N}\big(y_i\,|\,\mathbf{k}_{mn}^\top K_{mm}^{-1}\mathbf{m},\ \beta^{-1}\big) - \tfrac{1}{2}\beta\,\widetilde{k}_{i,i} - \tfrac{1}{2}\mathrm{tr}\big(\mathbf{S}\Lambda_i\big) \Big\} - \mathrm{KL}\big(q(\mathbf{u})\,\|\,p(\mathbf{u})\big) \quad (2)$$
where $q(\mathbf{u}) = \mathcal{N}(\mathbf{u}\,|\,\mathbf{m}, \mathbf{S})$.

  27. Optimisation
The variational objective $\mathcal{L}_3$ is a function of
◮ the parameters of the covariance function
◮ the parameters of $q(\mathbf{u})$
◮ the inducing inputs, Z
Strategy: fix Z. Take the data in small minibatches; take stochastic gradient steps in the covariance function parameters, and stochastic natural gradient steps in the parameters of $q(\mathbf{u})$.
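The property behind this strategy is that a minibatch gradient, rescaled by $n/|\text{batch}|$, is an unbiased estimate of the full-data gradient, so each step's cost no longer depends on $n$. A minimal sketch on a toy quadratic objective (plain NumPy, not GPy; the objective and Robbins–Monro step sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, batch = 100_000, 100
y = rng.normal(2.0, 1.0, n)

# Toy objective: minimise sum_i (y_i - theta)^2 over theta. The full
# gradient touches all n points; a rescaled minibatch gradient is an
# unbiased estimate that touches only |batch| points per step -- the same
# idea that makes each SVI step O(m^3) regardless of n.
theta = 0.0
for t in range(1, 2001):
    idx = rng.integers(0, n, batch)           # sample a minibatch
    grad = 2 * np.mean(theta - y[idx])        # stochastic gradient estimate
    theta -= (1.0 / t) * grad                 # Robbins-Monro decaying steps

print(theta)  # converges towards the sample mean of y (about 2.0)
```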

  28. Overview Motivation Sparse Gaussian Processes Stochastic Variational Inference Examples

  29. UK apartment prices
◮ Monthly price paid data for February to October 2012 (England and Wales)
◮ from http://data.gov.uk/dataset/land-registry-monthly-price-paid-data/
◮ 75,000 entries
◮ Cross-referenced against a postcode database to get latitude and longitude
◮ Regressed the normalised logarithm of the apartment prices

  30. Airline data
◮ Flight delays for every commercial flight in the USA from January to April 2008.
◮ Average delay was 30 minutes.
◮ We randomly selected 800,000 datapoints (we have limited memory!)
◮ 700,000 train, 100,000 test
[Figure: learned inverse lengthscale per input — Month, DayOfMonth, DayOfWeek, DepTime, ArrTime, AirTime, Distance, PlaneAge]

  31. [Figure: RMSE against iteration for the SVI GP, compared with GPs trained on random subsets of size N=800, N=1000, N=1200]

  32. Download the code! github.com/SheffieldML/GPy
Cite our paper! Hensman, Fusi and Lawrence, Gaussian Processes for Big Data. Proceedings of UAI 2013.
