SLIDE 1

Prior Knowledge and Sparse Methods for Convolved Multiple Output Gaussian Processes

Mauricio A. Álvarez Joint work with Neil D. Lawrence, David Luengo and Michalis K. Titsias

School of Computer Science University of Manchester

SLIDE 2

Contents

Latent force models.

Sparse approximations for latent force models.

SLIDE 3

Data-driven paradigm

Traditionally, the main focus in machine learning has been model generation through a data-driven paradigm.

Combine a data set with a flexible class of models and, through regularization, make predictions on unseen data.

Problems

– Data is scarce relative to the complexity of the system.
– The model is forced to extrapolate.

SLIDE 4

Mechanistic models

Models inspired by the underlying knowledge of a physical system are common in many areas.

They describe a well-characterized physical process that underpins the system, typically represented by a set of differential equations.

Identifying and specifying all the interactions might not be feasible.

A mechanistic model can enable accurate prediction in regions where no training data are available.

SLIDE 5

Hybrid systems

We suggest a hybrid approach involving a mechanistic model of the system augmented through machine learning techniques.

Dynamical systems (e.g. incorporating first order and second order differential equations).

Partial differential equations for systems with multiple inputs.

SLIDE 6

Latent variable model: definition

Our approach can be seen as a type of latent variable model,
$$\mathbf{Y} = \mathbf{U}\mathbf{W} + \mathbf{E},$$
where $\mathbf{Y} \in \mathbb{R}^{N \times D}$, $\mathbf{U} \in \mathbb{R}^{N \times Q}$, $\mathbf{W} \in \mathbb{R}^{Q \times D}$ (with $Q < D$) and $\mathbf{E}$ is matrix-variate white Gaussian noise with columns $\mathbf{e}_{:,d} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})$.

In PCA and FA the common approach to deal with the unknowns is to integrate out U under a Gaussian prior and optimize with respect to W.
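As a toy illustration (not from the original slides; sizes and parameter values are made up), one can sample from this latent variable model directly:

```python
import numpy as np

rng = np.random.default_rng(0)
N, Q, D = 100, 2, 5                      # hypothetical sizes, with Q < D

U = rng.standard_normal((N, Q))          # latent matrix, Gaussian prior
W = rng.standard_normal((Q, D))          # loading matrix
Sigma = 0.1 * np.eye(N)                  # noise covariance of each column e_{:,d}
E = rng.multivariate_normal(np.zeros(N), Sigma, size=D).T
Y = U @ W + E                            # N x D observed data matrix
```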

SLIDE 7

Latent variable model: alternative view

Data with a temporal nature and a Gaussian (Markov) prior for the rows of U lead to the Kalman filter/smoother.

Consider a joint distribution for $p(\mathbf{U}|\mathbf{t})$, $\mathbf{t} = [t_1 \ldots t_N]^\top$, with the form of a Gaussian process (GP),
$$p(\mathbf{U}|\mathbf{t}) = \prod_{q=1}^{Q} \mathcal{N}\left(\mathbf{u}_{:,q} \mid \mathbf{0}, \mathbf{K}_{\mathbf{u}_{:,q},\mathbf{u}_{:,q}}\right).$$

The latent variables are random functions, $\{u_q(t)\}_{q=1}^{Q}$, with associated covariances $\mathbf{K}_{\mathbf{u}_{:,q},\mathbf{u}_{:,q}}$.

The GP for Y can be readily implemented. In [TSJ05] this is known as a semi-parametric latent factor model (SLFM).

SLIDE 8

Latent force model: mechanistic interpretation (1)

We include a further dynamical system with a mechanistic inspiration.

Reinterpret the equation $\mathbf{Y} = \mathbf{U}\mathbf{W} + \mathbf{E}$ as a force balance equation,
$$\mathbf{Y}\mathbf{B} = \mathbf{U}\mathbf{S} + \mathbf{E},$$
where $\mathbf{S} \in \mathbb{R}^{Q \times D}$ is a matrix of sensitivities, $\mathbf{B} \in \mathbb{R}^{D \times D}$ is a diagonal matrix of spring constants, $\mathbf{W} = \mathbf{S}\mathbf{B}^{-1}$ and $\mathbf{e}_{:,d} \sim \mathcal{N}\left(\mathbf{0}, \mathbf{B}^\top \boldsymbol{\Sigma} \mathbf{B}\right)$.

SLIDE 9

Latent force model: mechanistic interpretation (2)

[Diagram: each output $y_d(t)$ is the extension of a spring with constant $B_d$, driven by the latent forces $u_1(t), \ldots, u_Q(t)$ through sensitivities $S_{d1}, \ldots, S_{dQ}$, so that $\mathbf{Y}\mathbf{B} = \mathbf{U}\mathbf{S} + \mathbf{E}$.]

SLIDE 12

Latent force model: extension (1)

The model can be extended to include dampers and masses.

We can write
$$\mathbf{Y}\mathbf{B} + \dot{\mathbf{Y}}\mathbf{C} + \ddot{\mathbf{Y}}\mathbf{M} = \mathbf{U}\mathbf{S} + \mathbf{E},$$
where $\dot{\mathbf{Y}}$ and $\ddot{\mathbf{Y}}$ are the first and second derivatives of $\mathbf{Y}$ with respect to time, $\mathbf{C}$ is a diagonal matrix of damping coefficients, $\mathbf{M}$ is a diagonal matrix of masses, and $\mathbf{E}$ is matrix-variate white Gaussian noise.

SLIDE 13

Latent force model: extension (2)

[Diagram: each output $y_d(t)$ is now the displacement of a mass $m_d$ attached to a spring $B_d$ and a damper $C_d$, driven by the latent forces $u_1(t), \ldots, u_Q(t)$ through sensitivities $S_{d1}, \ldots, S_{dQ}$, so that $\mathbf{Y}\mathbf{B} + \dot{\mathbf{Y}}\mathbf{C} + \ddot{\mathbf{Y}}\mathbf{M} = \mathbf{U}\mathbf{S} + \mathbf{E}$.]

SLIDE 16

Latent force model: properties

This model allows us to include behaviours such as inertia and resonance.

We refer to these systems as latent force models (LFMs).

One way of thinking of our model is to consider puppetry.

SLIDE 17

Second Order Dynamical System

Using the system of second order differential equations,
$$m_d \frac{d^2 y_d(t)}{dt^2} + C_d \frac{d y_d(t)}{dt} + B_d y_d(t) = \sum_{q=1}^{Q} S_{dq} u_q(t),$$
where
$u_q(t)$: latent forces;
$y_d(t)$: displacements over time;
$C_d$: damper constant for the $d$-th output;
$B_d$: spring constant for the $d$-th output;
$m_d$: mass constant for the $d$-th output;
$S_{dq}$: sensitivity of the $d$-th output to the $q$-th input.

SLIDE 18

Second Order Dynamical System: solution

Solving for $y_d(t)$, we obtain
$$y_d(t) = \sum_{q=1}^{Q} \mathcal{L}_{dq}[u_q](t),$$
where the linear operator is given by a convolution,
$$\mathcal{L}_{dq}[u_q](t) = \frac{S_{dq}}{\omega_d} \int_0^t \exp\left(-\alpha_d(t - \tau)\right) \sin\left(\omega_d(t - \tau)\right) u_q(\tau)\, d\tau,$$
with $\omega_d = \sqrt{4 B_d - C_d^2}/2$ and $\alpha_d = C_d/2$.
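Since the convolution only involves past values of the force, it is easy to approximate numerically. A minimal sketch, assuming a latent force already sampled on a time grid and unit mass ($m_d = 1$); all parameter values are illustrative:

```python
import numpy as np

def second_order_response(u, t, S=1.0, B=2.0, C=0.5):
    """Riemann-sum approximation of
    L[u](t) = (S / w) int_0^t exp(-a (t - s)) sin(w (t - s)) u(s) ds,
    with a = C/2 and w = sqrt(4B - C^2)/2 (unit mass assumed)."""
    dt = t[1] - t[0]
    a = C / 2.0
    w = np.sqrt(4.0 * B - C**2) / 2.0    # real only in the underdamped case
    y = np.zeros_like(t)
    for i, ti in enumerate(t):
        g = np.exp(-a * (ti - t[: i + 1])) * np.sin(w * (ti - t[: i + 1]))
        y[i] = (S / w) * np.sum(g * u[: i + 1]) * dt
    return y

t = np.linspace(0.0, 20.0, 500)
u = np.sin(0.5 * t)                       # a stand-in latent force
y = second_order_response(u, t)
```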

SLIDE 19

Second Order Dynamical System: covariance matrix

Behaviour of the system is summarized by the damping ratio $\zeta_d = \tfrac{1}{2} C_d / \sqrt{B_d}$:

$\zeta_d > 1$: overdamped system
$\zeta_d = 1$: critically damped system
$\zeta_d < 1$: underdamped system
$\zeta_d = 0$: undamped system (no friction)

Example covariance matrix with $\zeta_1 = 0.125$ (underdamped), $\zeta_2 = 2$ (overdamped) and $\zeta_3 = 1$ (critically damped):

[Figure: covariance matrix over $f(t)$, $y_1(t)$, $y_2(t)$ and $y_3(t)$; values range from roughly $-0.4$ to $0.8$.]

SLIDE 20

Second Order Dynamical System: samples from GP

[Figure: joint samples from the ODE covariance over $t \in [0, 20]$. Cyan: $u(t)$; red: $y_1(t)$ (underdamped); green: $y_2(t)$ (overdamped); blue: $y_3(t)$ (critically damped).]

SLIDE 24

Motion Capture Data (1)

CMU motion capture data, motions 18, 19 and 20 from subject 49.

Motions 18 and 19 for training and 20 for testing.

SLIDE 25

Motion Capture Data (2)

The data were down-sampled by a factor of 32 (from 120 frames per second to 3.75).

We focused on the subject’s left arm.

For testing, we condition only on the observations of the shoulder's orientation (motion 20) to make predictions for the rest of the arm's angles.

SLIDE 26

Motion Capture Results

Root mean squared (RMS) angle error for prediction of the left arm's configuration in the motion capture data. Prediction with the latent force model outperforms prediction with regression for all angles apart from the radius's.

Angle              Latent Force Error   Regression Error
Radius             4.11                 4.02
Wrist              6.55                 6.65
Hand X rotation    1.82                 3.21
Hand Z rotation    2.76                 6.14
Thumb X rotation   1.77                 3.10
Thumb Z rotation   2.73                 6.09

SLIDE 27

Diffusion in the Swiss Jura

[Maps: concentrations of lead, cadmium and copper over the Swiss Jura region.]

SLIDE 31

Diffusion equation

A simplified version of the diffusion equation is
$$\frac{\partial y_d(\mathbf{x}, t)}{\partial t} = \sum_{j=1}^{p} \kappa_d \frac{\partial^2 y_d(\mathbf{x}, t)}{\partial x_j^2},$$
where $y_d(\mathbf{x}, t)$ are the concentrations of each pollutant.

The solution to the system is then given by
$$y_d(\mathbf{x}, t) = \sum_{q=1}^{Q} S_{dq} \int_{\mathbb{R}^p} G_d(\mathbf{x}, \mathbf{x}', t)\, u_q(\mathbf{x}')\, d\mathbf{x}',$$
where $u_q(\mathbf{x})$ represents the concentration of pollutants at time zero and $G_d(\mathbf{x}, \mathbf{x}', t)$ is the Green's function,
$$G_d(\mathbf{x}, \mathbf{x}', t) = \frac{1}{2^p \pi^{p/2} T_d^{p/2}} \exp\left(-\sum_{j=1}^{p} \frac{(x_j - x_j')^2}{4 T_d}\right),$$
with $T_d = \kappa_d t$.
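A small numerical sketch of evaluating this solution on a grid (the grid, diffusion constant and initial pollutant field below are illustrative assumptions):

```python
import numpy as np

def greens_function(x, xp, t, kappa):
    """G_d(x, x', t) = exp(-sum_j (x_j - x'_j)^2 / (4 T_d))
    / (2^p pi^{p/2} T_d^{p/2}), with T_d = kappa * t."""
    p = xp.shape[-1]
    T = kappa * t
    sq = np.sum((x - xp) ** 2, axis=-1)
    return np.exp(-sq / (4.0 * T)) / (2.0**p * np.pi ** (p / 2) * T ** (p / 2))

# y_d(x, t) ~ S * sum_k G_d(x, z_k, t) u(z_k) dz  on a regular 2-D grid
g = np.linspace(0.0, 1.0, 30)
grid = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)
dz = (g[1] - g[0]) ** 2
u0 = np.exp(-20.0 * np.sum((grid - 0.3) ** 2, axis=1))  # stand-in initial field
x_star = np.array([0.5, 0.5])
y = 1.0 * np.sum(greens_function(x_star, grid, t=0.1, kappa=0.5) * u0) * dz
```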

SLIDE 32

Prediction of Metal Concentrations

Prediction of a primary variable by conditioning on the values of some secondary variables.

Primary variable   Secondary variables
Cd                 Ni, Zn
Cu                 Pb, Ni, Zn
Pb                 Cu, Ni, Zn
Co                 Ni, Zn

Comparison between the diffusion kernel (GPDK), independent GPs (IGPs) and "ordinary co-kriging" (OCK):

Metal   IGPs               GPDK               OCK
Cd      0.5823 ± 0.0133    0.4505 ± 0.0126    0.5
Cu      15.9357 ± 0.0907   7.1677 ± 0.2266    7.8
Pb      22.9141 ± 0.6076   10.1097 ± 0.2842   10.7
Co      2.0735 ± 0.1070    1.7546 ± 0.0895    1.5

SLIDE 33

LFM in the context of convolution processes

Consider a set of functions $\{f_d(\mathbf{x})\}_{d=1}^{D}$.

Each function can be expressed as
$$f_d(\mathbf{x}) = \int_{\mathcal{X}} G_d(\mathbf{x} - \mathbf{z})\, u(\mathbf{z})\, d\mathbf{z} = G_d(\mathbf{x}) * u(\mathbf{x}).$$

Allowing the influence of more than one latent function, $\{u_q(\mathbf{z})\}_{q=1}^{Q}$, and including an independent process $w_d(\mathbf{x})$,
$$y_d(\mathbf{x}) = f_d(\mathbf{x}) + w_d(\mathbf{x}) = \sum_{q=1}^{Q} \int_{\mathcal{X}} G_{dq}(\mathbf{x} - \mathbf{z})\, u_q(\mathbf{z})\, d\mathbf{z} + w_d(\mathbf{x}).$$

SLIDE 34

A pictorial representation

[Diagram: the latent function $u(\mathbf{x})$ is convolved with smoothing kernels $G_1(\mathbf{x})$ and $G_2(\mathbf{x})$ to produce the output functions $f_1(\mathbf{x})$ and $f_2(\mathbf{x})$; adding the independent processes $w_1(\mathbf{x})$ and $w_2(\mathbf{x})$ gives the noisy output functions $y_1(\mathbf{x})$ and $y_2(\mathbf{x})$.]

$u(\mathbf{x})$: latent function. $G(\mathbf{x})$: smoothing kernel. $f(\mathbf{x})$: output function. $w(\mathbf{x})$: independent process. $y(\mathbf{x})$: noisy output function.

SLIDE 39

Covariance of the output functions.

The covariance between $y_d(\mathbf{x})$ and $y_{d'}(\mathbf{x}')$ is given by
$$\operatorname{cov}\left[y_d(\mathbf{x}), y_{d'}(\mathbf{x}')\right] = \operatorname{cov}\left[f_d(\mathbf{x}), f_{d'}(\mathbf{x}')\right] + \operatorname{cov}\left[w_d(\mathbf{x}), w_{d'}(\mathbf{x}')\right] \delta_{d,d'},$$
where
$$\operatorname{cov}\left[f_d(\mathbf{x}), f_{d'}(\mathbf{x}')\right] = \sum_{q=1}^{Q} \sum_{q'=1}^{Q} \int_{\mathcal{X}} G_{dq}(\mathbf{x} - \mathbf{z}) \int_{\mathcal{X}} G_{d'q'}(\mathbf{x}' - \mathbf{z}')\, \operatorname{cov}\left[u_q(\mathbf{z}), u_{q'}(\mathbf{z}')\right] d\mathbf{z}'\, d\mathbf{z}.$$
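For intuition, the double integral can be approximated with a double Riemann sum. A sketch with one latent RBF process ($Q = 1$), Gaussian smoothing kernels and 1-D inputs; all kernels and parameter values are assumptions of this sketch, not the talk's exact choices:

```python
import numpy as np

def rbf(z, zp, ell=0.1):
    """cov[u(z), u(z')] for the latent process."""
    return np.exp(-0.5 * (z - zp) ** 2 / ell**2)

def smoothing(tau, S=1.0, ell=0.05):
    """Gaussian smoothing kernel G_d(tau)."""
    return S * np.exp(-0.5 * tau**2 / ell**2)

def output_cross_cov(x, xp, Sd=1.0, Sdp=0.8, n=200):
    """cov[f_d(x), f_d'(x')] by a double Riemann sum (1-D inputs, Q = 1)."""
    z = np.linspace(-1.0, 2.0, n)
    dz = z[1] - z[0]
    Gd = smoothing(x - z, S=Sd)          # G_d(x - z) on the grid
    Gdp = smoothing(xp - z, S=Sdp)       # G_d'(x' - z') on the grid
    Kuu = rbf(z[:, None], z[None, :])    # cov[u(z), u(z')]
    return Gd @ Kuu @ Gdp * dz**2

print(output_cross_cov(0.3, 0.5))
```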

SLIDE 40

Likelihood of the full Gaussian process.

The likelihood of the model is given by
$$p(\mathbf{y}|\mathbf{X}, \boldsymbol{\phi}) = \mathcal{N}\left(\mathbf{y} \mid \mathbf{0}, \mathbf{K}_{\mathbf{f},\mathbf{f}} + \boldsymbol{\Sigma}\right),$$
where $\mathbf{y} = \left[\mathbf{y}_1^\top, \ldots, \mathbf{y}_D^\top\right]^\top$ is the stacked vector of outputs, $\mathbf{K}_{\mathbf{f},\mathbf{f}}$ is the covariance matrix with blocks $\operatorname{cov}[\mathbf{f}_d, \mathbf{f}_{d'}]$, $\boldsymbol{\Sigma}$ is the matrix of noise variances, $\boldsymbol{\phi}$ is the set of parameters of the covariance matrix and $\mathbf{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ is the set of input vectors.

Learning through the log-likelihood involves the inverse of $\mathbf{K}_{\mathbf{f},\mathbf{f}} + \boldsymbol{\Sigma}$, whose computation grows as $O(N^3 D^3)$.
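A standard way to evaluate this log-likelihood is through a Cholesky factorisation. A minimal sketch, assuming the full $ND \times ND$ matrix $\mathbf{K}_{\mathbf{f},\mathbf{f}}$ has already been built and taking isotropic noise for simplicity:

```python
import numpy as np

def log_marginal_likelihood(y, K, sigma2=0.1):
    """log N(y | 0, K + sigma2 * I) via a Cholesky factorisation; with the
    full ND x ND matrix K_{f,f} this is the O(N^3 D^3) cost quoted above."""
    n = y.size
    L = np.linalg.cholesky(K + sigma2 * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # (K + sigma2 I)^{-1} y
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * n * np.log(2.0 * np.pi))
```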

SLIDE 41

Predictive distribution of the full Gaussian process.

Predictive distribution at $\mathbf{X}_*$:
$$p(\mathbf{y}_*|\mathbf{y}, \mathbf{X}, \mathbf{X}_*, \boldsymbol{\phi}) = \mathcal{N}(\boldsymbol{\mu}_*, \boldsymbol{\Lambda}_*),$$
with
$$\boldsymbol{\mu}_* = \mathbf{K}_{\mathbf{f}_*,\mathbf{f}} \left(\mathbf{K}_{\mathbf{f},\mathbf{f}} + \boldsymbol{\Sigma}\right)^{-1} \mathbf{y}, \qquad \boldsymbol{\Lambda}_* = \mathbf{K}_{\mathbf{f}_*,\mathbf{f}_*} - \mathbf{K}_{\mathbf{f}_*,\mathbf{f}} \left(\mathbf{K}_{\mathbf{f},\mathbf{f}} + \boldsymbol{\Sigma}\right)^{-1} \mathbf{K}_{\mathbf{f},\mathbf{f}_*} + \boldsymbol{\Sigma}_*.$$

After the training pre-computations, prediction is $O(ND)$ for the mean and $O(N^2 D^2)$ for the variance.
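The same Cholesky factor gives the predictive equations. A minimal numpy sketch (it returns the noise-free predictive covariance; add $\boldsymbol{\Sigma}_*$ for noisy test outputs):

```python
import numpy as np

def gp_predict(y, Kff, Ksf, Kss, Sigma):
    """mu* = K_{f*,f} (K_{f,f} + Sigma)^{-1} y,
    Lam* = K_{f*,f*} - K_{f*,f} (K_{f,f} + Sigma)^{-1} K_{f,f*};
    add Sigma* to Lam* for noisy test outputs."""
    L = np.linalg.cholesky(Kff + Sigma)
    A = np.linalg.solve(L, Ksf.T)                 # L^{-1} K_{f,f*}
    mu = Ksf @ np.linalg.solve(L.T, np.linalg.solve(L, y))
    cov = Kss - A.T @ A
    return mu, cov
```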

SLIDE 42

Conditional prior distribution.

Sample from $p(u)$:
$$f_d(\mathbf{x}) = \int_{\mathcal{X}} G_d(\mathbf{x} - \mathbf{z})\, u(\mathbf{z})\, d\mathbf{z}.$$

Discretize $u$:
$$f_d(\mathbf{x}) \approx \sum_{\forall k} G_d(\mathbf{x} - \mathbf{z}_k)\, u(\mathbf{z}_k).$$

Sample from $p(u|\mathbf{u})$:
$$f_d(\mathbf{x}) \approx \int_{\mathcal{X}} G_d(\mathbf{x} - \mathbf{z})\, \mathbb{E}\left[u(\mathbf{z})|\mathbf{u}\right] d\mathbf{z}.$$
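The conditional expectation $\mathbb{E}[u(\mathbf{z})|\mathbf{u}] = \mathbf{K}_{\mathbf{z},\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{u}$ is cheap to evaluate. A 1-D sketch (the kernel and inducing inputs are illustrative):

```python
import numpy as np

def conditional_mean(z, Z, u, kern):
    """E[u(z) | u] = K_{z,u} K_{u,u}^{-1} u evaluated on a grid z."""
    Kzu = kern(z[:, None], Z[None, :])
    Kuu = kern(Z[:, None], Z[None, :]) + 1e-8 * np.eye(len(Z))  # jitter
    return Kzu @ np.linalg.solve(Kuu, u)

kern = lambda a, b: np.exp(-0.5 * (a - b) ** 2 / 0.1**2)
Z = np.linspace(0.0, 1.0, 10)            # inducing inputs
u = np.sin(2.0 * np.pi * Z)              # stand-in sample of u at Z
z = np.linspace(0.0, 1.0, 200)
Eu = conditional_mean(z, Z, u, kern)     # smooth surrogate for u(z)
# f_d(x) is then approximated by convolving Eu with G_d, as in the last equation.
```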

SLIDE 45

The conditional independence assumption I.

This form for $f_d(\mathbf{x})$ leads to the following likelihood,
$$p(\mathbf{f}|\mathbf{u}, \mathbf{Z}) = \mathcal{N}\left(\mathbf{f} \mid \mathbf{K}_{\mathbf{f},\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{u},\; \mathbf{K}_{\mathbf{f},\mathbf{f}} - \mathbf{K}_{\mathbf{f},\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}}\right),$$
where
$\mathbf{u}$: discrete sample from the latent function;
$\mathbf{Z}$: set of input vectors corresponding to $\mathbf{u}$;
$\mathbf{K}_{\mathbf{u},\mathbf{u}}$: covariance matrix of the latent function values;
$\mathbf{K}_{\mathbf{f},\mathbf{u}} = \mathbf{K}_{\mathbf{u},\mathbf{f}}^\top$: cross-covariance matrix between latent and output functions.

Even though we condition on $\mathbf{u}$, we still have dependencies between outputs due to the uncertainty in $p(u|\mathbf{u})$.

SLIDE 46

The conditional independence assumption II.

Our key assumption is that the outputs will be independent even if we have only observed $\mathbf{u}$ rather than the whole function $u(\cdot)$.

[Diagram: the conditional covariance of $p(\mathbf{f}|\mathbf{u})$ has blocks $\mathbf{K}_{\mathbf{f}_d,\mathbf{f}_{d'}} - \mathbf{K}_{\mathbf{f}_d,\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}_{d'}}$; the assumption discards the off-diagonal blocks, keeping only the diagonal blocks $\mathbf{K}_{\mathbf{f}_d,\mathbf{f}_d} - \mathbf{K}_{\mathbf{f}_d,\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}_d}$.]

Better approximations can be obtained when $\mathbb{E}[u|\mathbf{u}]$ approximates $u$ well.

SLIDE 48

Comparison of marginal likelihoods

Integrating out $\mathbf{u}$, the marginal likelihood is given by
$$p(\mathbf{y}|\mathbf{Z}, \mathbf{X}, \boldsymbol{\theta}) = \mathcal{N}\left(\mathbf{y} \mid \mathbf{0},\; \mathbf{K}_{\mathbf{f},\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}} + \operatorname{blockdiag}\left[\mathbf{K}_{\mathbf{f},\mathbf{f}} - \mathbf{K}_{\mathbf{f},\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}}\right] + \boldsymbol{\Sigma}\right).$$

[Diagram: the exact covariance $\mathbf{K}_{\mathbf{f},\mathbf{f}}$ is approximated by the low-rank terms $\mathbf{K}_{\mathbf{f}_d,\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}_{d'}}$ in the off-diagonal blocks, while the diagonal blocks $\mathbf{K}_{\mathbf{f}_d,\mathbf{f}_d}$ remain exact. In the discrete case, $[\mathbf{G}]_{i,k} = G_d(\mathbf{x}_i - \mathbf{z}_k)$.]
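A sketch of how this approximate covariance can be assembled (names and the block layout are illustrative; a practical implementation would exploit the block structure instead of forming dense $ND \times ND$ matrices):

```python
import numpy as np

def pitc_covariance(Kff, Kfu, Kuu, blocks, Sigma):
    """Q_{f,f} + blockdiag[K_{f,f} - Q_{f,f}] + Sigma,
    where Q_{f,f} = K_{f,u} K_{u,u}^{-1} K_{u,f} and `blocks` lists the
    (start, stop) index range of each output."""
    Qff = Kfu @ np.linalg.solve(Kuu, Kfu.T)
    K = Qff.copy()
    for lo, hi in blocks:
        K[lo:hi, lo:hi] = Kff[lo:hi, lo:hi]   # diagonal blocks stay exact
    return K + Sigma
```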

SLIDE 51

Predictive distribution for the sparse approximation

Predictive distribution:
$$p(\mathbf{y}_*|\mathbf{y}, \mathbf{X}, \mathbf{X}_*, \mathbf{Z}, \boldsymbol{\theta}) = \mathcal{N}(\boldsymbol{\mu}_*, \boldsymbol{\Lambda}_*),$$
with
$$\boldsymbol{\mu}_* = \mathbf{K}_{\mathbf{f}_*,\mathbf{u}} \mathbf{A}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}} (\mathbf{D} + \boldsymbol{\Sigma})^{-1} \mathbf{y},$$
$$\boldsymbol{\Lambda}_* = \mathbf{D}_* + \mathbf{K}_{\mathbf{f}_*,\mathbf{u}} \mathbf{A}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}_*} + \boldsymbol{\Sigma},$$
$$\mathbf{A} = \mathbf{K}_{\mathbf{u},\mathbf{u}} + \mathbf{K}_{\mathbf{u},\mathbf{f}} (\mathbf{D} + \boldsymbol{\Sigma})^{-1} \mathbf{K}_{\mathbf{f},\mathbf{u}},$$
where $\mathbf{D} = \operatorname{blockdiag}\left[\mathbf{K}_{\mathbf{f},\mathbf{f}} - \mathbf{K}_{\mathbf{f},\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}}\right]$ and $\mathbf{D}_* = \operatorname{blockdiag}\left[\mathbf{K}_{\mathbf{f}_*,\mathbf{f}_*} - \mathbf{K}_{\mathbf{f}_*,\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}_*}\right]$.
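A corresponding sketch of the sparse predictive mean (illustrative names; the dense solve against $\mathbf{D} + \boldsymbol{\Sigma}$ is shown for clarity, whereas an implementation would exploit its block-diagonal structure):

```python
import numpy as np

def sparse_predictive_mean(y, Kfu, Kuu, Ksu, D, Sigma):
    """mu* = K_{f*,u} A^{-1} K_{u,f} (D + Sigma)^{-1} y,
    A = K_{u,u} + K_{u,f} (D + Sigma)^{-1} K_{f,u}."""
    W = np.linalg.solve(D + Sigma, Kfu)       # (D + Sigma)^{-1} K_{f,u}
    A = Kuu + Kfu.T @ W
    return Ksu @ np.linalg.solve(A, W.T @ y)
```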

SLIDE 52

Remarks

For learning, the computational demand is in the calculation of the block-diagonal term, which grows as $O(N^3 D) + O(NDM^2)$ (with $Q = 1$). Storage is $O(N^2 D) + O(NDM)$.

For inference, after some pre-computations and for one test point, the computation of the mean grows as $O(DM)$ and the computation of the variance as $O(DM^2)$.

The functional form of the approximation is almost identical to that of the Partially Independent Training Conditional (PITC) approximation [QR05].

SLIDE 53

Additional conditional independencies

The $N^3$ term in the computational complexity and the $N^2$ term in storage of PITC are still expensive for larger data sets.

An additional assumption is independence over the data points.

[Graphical model: each output vector $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_D$ depends on the latent sample $\mathbf{u}$; individual data points are conditionally independent given $\mathbf{u}$.]

SLIDE 56

Comparison of marginal likelihoods

The marginal likelihood is given by
$$p(\mathbf{y}|\mathbf{Z}, \mathbf{X}, \boldsymbol{\theta}) = \mathcal{N}\left(\mathbf{y} \mid \mathbf{0},\; \mathbf{K}_{\mathbf{f},\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}} + \operatorname{diag}\left[\mathbf{K}_{\mathbf{f},\mathbf{f}} - \mathbf{K}_{\mathbf{f},\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}}\right] + \boldsymbol{\Sigma}\right).$$

[Diagram: relative to PITC, the correction is now restricted to the diagonal: within each output block, only the diagonal entries $K_{f_d,f_d}(\mathbf{x}_i, \mathbf{x}_i)$ remain exact, while everything else is replaced by the low-rank term $\mathbf{Q}_{\mathbf{f},\mathbf{f}} = \mathbf{K}_{\mathbf{f},\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}}$.]

SLIDE 62

Computational requirements

The computational demand is now $O(NDM^2)$. Storage is $O(NDM)$.

For inference, after some pre-computations and for one test point, the computation of the mean grows as $O(DM)$ and the computation of the variance as $O(DM^2)$.

This is similar to the Fully Independent Training Conditional (FITC) approximation [QR05, SG06].

SLIDE 63

Deterministic approximation

We could also assume that, given the latent functions, the outputs are deterministic.

The marginal likelihood is then given by
$$p(\mathbf{y}|\mathbf{Z}, \mathbf{X}, \boldsymbol{\theta}) = \mathcal{N}\left(\mathbf{y} \mid \mathbf{0},\; \mathbf{K}_{\mathbf{f},\mathbf{u}} \mathbf{K}_{\mathbf{u},\mathbf{u}}^{-1} \mathbf{K}_{\mathbf{u},\mathbf{f}} + \boldsymbol{\Sigma}\right).$$

Computational complexity is the same as for FITC.

This is the deterministic training conditional (DTC) approximation.
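Relative to the PITC sketch given earlier, FITC and DTC change only the correction term. A compact comparison under the same illustrative naming:

```python
import numpy as np

def approx_covariance(Kff, Kfu, Kuu, Sigma, kind="dtc"):
    """Q_{f,f} = K_{f,u} K_{u,u}^{-1} K_{u,f} plus the correction each
    approximation keeps (PITC's block-diagonal case is sketched earlier)."""
    Qff = Kfu @ np.linalg.solve(Kuu, Kfu.T)
    if kind == "dtc":
        corr = 0.0                              # deterministic: no correction
    elif kind == "fitc":
        corr = np.diag(np.diag(Kff - Qff))      # keep only the diagonal
    else:
        raise ValueError(kind)
    return Qff + corr + Sigma
```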

SLIDE 64

Examples

For all our experiments we considered squared exponential covariance functions for the latent process, of the form
$$k_{u,u}(\mathbf{x}, \mathbf{x}') = \exp\left(-\tfrac{1}{2} (\mathbf{x} - \mathbf{x}')^\top \mathbf{L} (\mathbf{x} - \mathbf{x}')\right),$$
where $\mathbf{L}$ is a diagonal matrix which allows for different length-scales along each dimension.

The smoothing kernel had the same form,
$$G_d(\boldsymbol{\tau}) = \frac{S_d |\mathbf{L}_d|^{1/2}}{(2\pi)^{p/2}} \exp\left(-\tfrac{1}{2} \boldsymbol{\tau}^\top \mathbf{L}_d \boldsymbol{\tau}\right),$$
where $S_d \in \mathbb{R}$ and $\mathbf{L}_d$ is a symmetric positive definite matrix.
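Both functions are straightforward to code. A minimal sketch (the parameter values below would be learned in practice):

```python
import numpy as np

def k_uu(x, xp, L):
    """Squared exponential covariance; L is a diagonal matrix of inverse
    squared length-scales, one per input dimension."""
    d = x - xp
    return np.exp(-0.5 * d @ L @ d)

def smoothing_kernel(tau, S, Ld):
    """G_d(tau) = S_d |L_d|^{1/2} / (2 pi)^{p/2} * exp(-0.5 tau' L_d tau)."""
    p = tau.size
    norm = S * np.sqrt(np.linalg.det(Ld)) / (2.0 * np.pi) ** (p / 2)
    return norm * np.exp(-0.5 * tau @ Ld @ tau)

L = np.diag([1.0 / 0.2**2, 1.0 / 0.5**2])       # illustrative length-scales
print(k_uu(np.zeros(2), np.ones(2), L))
print(smoothing_kernel(np.array([0.1, -0.2]), S=1.0, Ld=L))
```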

SLIDE 65

Examples: Artificial data 1D

Four outputs generated from the full GP (D = 4).

[Figures: predictions for $y_4(x)$, $x \in [-1, 1]$, using the full GP and the DTC, FITC and PITC approximations.]

SLIDE 66

Artificial example (cont.)

Standardized mean squared error (SMSE); all numbers are to be multiplied by 10^-2:

Method    SMSE y1(x)    SMSE y2(x)    SMSE y3(x)    SMSE y4(x)
Full GP   1.06 ± 0.08   0.99 ± 0.06   1.10 ± 0.09   1.05 ± 0.09
DTC       1.06 ± 0.08   0.99 ± 0.06   1.12 ± 0.09   1.05 ± 0.09
FITC      1.06 ± 0.08   0.99 ± 0.06   1.10 ± 0.08   1.05 ± 0.08
PITC      1.06 ± 0.08   0.99 ± 0.06   1.10 ± 0.09   1.05 ± 0.09

Mean standardized log loss (MSLL); more negative values indicate better models:

Method    MSLL y1(x)     MSLL y2(x)     MSLL y3(x)     MSLL y4(x)
Full GP   −2.27 ± 0.04   −2.30 ± 0.03   −2.25 ± 0.04   −2.27 ± 0.05
DTC       −0.98 ± 0.18   −0.98 ± 0.18   −1.25 ± 0.16   −1.25 ± 0.16
FITC      −2.26 ± 0.04   −2.29 ± 0.03   −2.16 ± 0.04   −2.23 ± 0.05
PITC      −2.27 ± 0.04   −2.30 ± 0.03   −2.23 ± 0.04   −2.26 ± 0.05

Training times per iteration: 1.97 ± 0.02 s for the full GP, 0.20 ± 0.01 s for DTC, 0.41 ± 0.03 s for FITC and 0.59 ± 0.05 s for PITC.

SLIDE 67

Predicting school examination scores

Multitask learning problem.

The goal is to predict the exam score obtained by a particular student, described by a set of 20 features, belonging to a specific school (task).

The data consist of examination records from 139 secondary schools in the years 1985, 1986 and 1987.

Features include the year of the exam, gender, VR band and ethnic group for each student, which are transformed to dummy variables.

The dataset consists of 4004 samples. Ten repetitions with 75% training and 25% testing.

Gaussian smoothing function.

SLIDE 68

Predicting school examination scores (cont.)

[Figure: percentage of explained variance for DTC (D), FITC (F) and PITC (P) with 5, 20 and 50 inducing points, compared with the intrinsic coregionalization model (ICM) and independent GPs (IND) [BCW08].]

SLIDE 69

A dynamic model for transcription regulation

Microarray studies have made the simultaneous measurement of mRNA from thousands of genes practical.

Transcription is governed by the presence or absence of transcription factor proteins that act as switches to turn on and off the expression of the genes.

The active concentration of these transcription factors is typically much more difficult to measure.

SLIDE 70

A dynamic model for transcription regulation (cont.)

There are $Q$ transcription factors $\{u_q(t)\}_{q=1}^{Q}$, each of them represented through a Gaussian process, $u_q(t) \sim \mathcal{GP}\left(0, k_{u_q u_q}(t, t')\right)$.

Our model is based on the following differential equation [ALL09],
$$\frac{d f_d(t)}{dt} = \gamma_d + \sum_{q=1}^{Q} S_{dq} u_q(t) - B_d f_d(t),$$
where $\gamma_d$ is the basal transcription rate of gene $d$, $S_{dq}$ is the sensitivity of gene $d$ to the transcription factor $u_q(t)$ and $B_d$ is the decay rate of the mRNA.

SLIDE 71

A dynamic model for transcription regulation (cont.)

Benchmark yeast cell cycle dataset of [SSZ+98].

Data is preprocessed as described in [SLR06] with a final dataset of 1975 genes and 104 transcription factors. There are 24 time points for each gene.

We optimize the marginal likelihood through scaled conjugate gradient.

SLIDE 72

A dynamic model for transcription regulation (cont.)

[Figures: gene expression profile and inferred protein concentration over time for ACE2; gene expression profile and inferred protein concentration over time for SWI5.]

SLIDE 73

A dynamic model for transcription regulation (cont.)

[Histograms: signal-to-noise ratio (SNR) $S_{d,q}/\sigma_{S_{d,q}}$ over genes, for ACE2 and for SWI5.]

– For ACE2, the highest SNR values are obtained for CTS1, SCW11, DSE1 and DSE2, while, for example, NCE4 appears to be repressed with a low SNR value ([SSZ+98, SLR06]).
– SWI5 appears to activate the genes AMN1 and PLC2 ([CLCB01]).

SLIDE 74

Swiss Jura example revisited

[Figure: mean absolute error for cadmium using DTC (D), FITC (F) and PITC (P) with 50, 100, 200 and 500 inducing points, compared with the full Gaussian process (FGP), co-kriging (CK) [Goo97] and independent GPs (IND) [BCW08].]

SLIDE 75

Conclusions

A hybrid approach for the use of simple mechanistic models with Gaussian processes.

Convolution processes as a way to augment data-driven models with characteristics of physical systems.

Gaussian processes as meaningful prior distributions.

Sparse approximations for multiple-output convolved GPs that exploit conditional independencies.

SLIDE 76

Acknowledgments

To Google, for a Google Research Award.

To EPSRC, for Grant No EP/F005687/1 “Gaussian Processes for Systems Identification with Applications in Systems Biology”.

SLIDE 77

References I

[ALL09] Mauricio Álvarez, David Luengo, and Neil D. Lawrence. Latent force models. In David van Dyk and Max Welling, editors, Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, pages 9–16, Clearwater Beach, Florida, 16–18 April 2009. JMLR W&CP 5.

[BCW08] Edwin V. Bonilla, Kian Ming Chai, and Christopher K. I. Williams. Multi-task Gaussian process prediction. In John C. Platt, Daphne Koller, Yoram Singer, and Sam Roweis, editors, NIPS, volume 20, Cambridge, MA, 2008. MIT Press.

[CLCB01] Alejandro Colman-Lerner, Tina E. Chin, and Roger Brent. Yeast Cbk1 and Mob2 activate daughter-specific genetic programs to induce asymmetric cell fates. Cell, 107:739–750, 2001.

[Goo97] Pierre Goovaerts. Geostatistics for Natural Resources Evaluation. Oxford University Press, USA, 1997.

[QR05] Joaquin Quiñonero-Candela and Carl Edward Rasmussen. A unifying view of sparse approximate Gaussian process regression. Journal of Machine Learning Research, 6:1939–1959, 2005.

[SG06] Edward Snelson and Zoubin Ghahramani. Sparse Gaussian processes using pseudo-inputs. In Yair Weiss, Bernhard Schölkopf, and John C. Platt, editors, NIPS, volume 18, Cambridge, MA, 2006. MIT Press.

[SLR06] Guido Sanguinetti, Neil D. Lawrence, and Magnus Rattray. Probabilistic inference of transcription factor concentrations and gene-specific regulatory activities. Bioinformatics, 22:2275–2281, 2006.

SLIDE 78

References II

[SSZ+98] Paul T. Spellman, Gavin Sherlock, Michael Q. Zhang, Vishwanath R. Iyer, Kirk Anders, Michael B. Eisen, Patrick O. Brown, David Botstein, and Bruce Futcher. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell, 9(12):3273–3297, 1998.

[TSJ05] Yee Whye Teh, Matthias Seeger, and Michael I. Jordan. Semiparametric latent factor models. In Robert G. Cowell and Zoubin Ghahramani, editors, AISTATS 10, pages 333–340, Barbados, 6–8 January 2005. Society for Artificial Intelligence and Statistics.