
State Space Expectation Propagation

Efficient Inference Schemes for Temporal Gaussian Processes

William Wilkinson∗, Paul Chang∗, Michael Riis Andersen†, Arno Solin∗

Aalto University∗, Technical University of Denmark†

ICML 2020



Motivation

  • We’re interested in long temporal and spatio-temporal data with interesting non-conjugate GP models (e.g. classification, log-Gaussian Cox processes).
  • Idea: We should treat the temporal dimension in a fundamentally different manner to other dimensions.


Approximate Inference in Temporal GPs


There exists a dual kernel / SDE form for most popular Gaussian process (GP) models:

Kernel form:
  f(t) ∼ GP(0, K_θ(t, t′)),   y_k ∼ p(y_k | f(t_k))

State-space form:
  f_k = A_{θ,k} f_{k−1} + q_k,   q_k ∼ N(0, Q_k)
  y_k = h(f_k, σ_k),   σ_k ∼ N(0, Σ_k)

Inference is O(n) via Kalman filtering and smoothing.
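For concreteness, here is a minimal sketch of this duality for the Matérn-3/2 kernel (our own illustration; the names are not the kalman-jax API). Its SDE representation has a two-dimensional state (f, df/dt), and the discrete-time A_k and Q_k follow from the matrix exponential of the SDE feedback matrix and the stationary covariance:

```python
import jax.numpy as jnp
from jax.scipy.linalg import expm

def matern32_state_space(lengthscale, variance, dt):
    """Discrete-time state-space form of the Matern-3/2 kernel.

    The state is (f, df/dt); A and Q are such that
    f_k = A f_{k-1} + q_k with q_k ~ N(0, Q).
    """
    lam = jnp.sqrt(3.0) / lengthscale
    F = jnp.array([[0.0, 1.0],
                   [-lam**2, -2.0 * lam]])        # SDE feedback matrix
    Pinf = jnp.array([[variance, 0.0],
                      [0.0, lam**2 * variance]])  # stationary state covariance
    A = expm(F * dt)                              # transition over a gap of size dt
    Q = Pinf - A @ Pinf @ A.T                     # exact discrete-time process noise
    return A, Q, Pinf

# f(t_k) is read off the first state component: f(t_k) = [1, 0] @ f_k.
A, Q, Pinf = matern32_state_space(lengthscale=1.0, variance=1.0, dt=0.1)
```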


Approximate Inference

[Figure: GP posterior over f(t) against time t, built up one observation at a time.]

Kalman filter update step:

  p(f_k | y_{1:k}) ∝ N(f_k; m_k^predict, P_k^predict) p(y_k | f(t_k))
                   ≈ N(f_k; m_k^predict, P_k^predict) N(f_k; m_k^site, P_k^site)

The second Gaussian is the “site”: a Gaussian approximation to the effect of the (generally non-conjugate) likelihood p(y_k | f(t_k)).

  • Approximate inference amounts to selecting the site parameters m_k^site, P_k^site.
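In code, one filter step with a Gaussian site in place of the likelihood is just the standard conjugate update. A minimal JAX sketch (our own notation, not the kalman-jax API):

```python
import jax.numpy as jnp

def filter_step(m_prev, P_prev, A, Q, H, m_site, P_site):
    """One Kalman filter step, with the likelihood p(y_k | f(t_k))
    replaced by a Gaussian site N(m_site, P_site) on H f_k."""
    m_pred = A @ m_prev                     # predict mean
    P_pred = A @ P_prev @ A.T + Q           # predict covariance
    S = H @ P_pred @ H.T + P_site           # innovation covariance
    K = jnp.linalg.solve(S, H @ P_pred).T   # gain K = P_pred Hᵀ S⁻¹
    m = m_pred + K @ (m_site - H @ m_pred)  # conjugate (Gaussian) update
    P = P_pred - K @ S @ K.T
    return m, P
```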



Smoothing:

  • Update the posterior with future observations:

  p(f_k | y_{1:N}) = N(f_k; m_k^post, P_k^post)

Our Contribution: Given the marginal posterior N(f_k; m_k^post, P_k^post), we show how approximate inference amounts to a simple site parameter update rule during smoothing. This encompasses:

  • Power Expectation Propagation
  • Variational Inference (with natural gradients)
  • Extended Kalman Smoothing
  • Unscented / Gauss-Hermite Kalman Smoothing
  • Posterior Linearisation
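Schematically, the whole scheme is a filter-smoother loop with a pluggable site refresh. Below is a minimal self-contained sketch for a scalar random-walk state (a toy illustration of the structure, not the paper's algorithm); `site_update` stands in for any of the rules on the next slide:

```python
def iterated_smoothing(y, sites, site_update, q=0.1, v0=10.0, num_iters=5):
    """Iterated inference for a scalar random walk f_k = f_{k-1} + N(0, q).

    `sites` is a list of (m_site, v_site) pairs; `site_update` maps a
    marginal posterior (m_post, v_post) and y_k to a new (m_site, v_site).
    """
    n = len(y)
    for _ in range(num_iters):
        # Forward pass: Kalman filter with Gaussian sites as pseudo-likelihoods.
        fm, fv, m, v = [], [], 0.0, v0
        for (ms, vs) in sites:
            v = v + q                               # predict
            k = v / (v + vs)                        # gain
            m, v = m + k * (ms - m), (1.0 - k) * v  # update with the site
            fm.append(m)
            fv.append(v)
        # Backward pass: Rauch-Tung-Striebel smoother for the marginals.
        sm, sv = [fm[-1]], [fv[-1]]
        for t in range(n - 2, -1, -1):
            g = fv[t] / (fv[t] + q)                 # smoother gain
            sm.insert(0, fm[t] + g * (sm[0] - fm[t]))
            sv.insert(0, fv[t] + g**2 * (sv[0] - (fv[t] + q)))
        # Site refresh: the method-specific update rule.
        sites = [site_update(sm[t], sv[t], y[t]) for t in range(n)]
    return sm, sv, sites
```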


Parameter Update Rules

Power Expectation Propagation:

  q_cavity(f_k) = q_post(f_k) / q_site^α(f_k)
  L_k = log E_{q_cavity}[ p^α(y_k | f_k) ]
  P_k^site = −α ( P_k^cavity + (∇²L_k)^{−1} )
  m_k^site = m_k^cavity − (∇²L_k)^{−1} ∇L_k,   for ∇L_k = dL_k / dm_k

Variational Inference:

  L_k = E_{q_post}[ log p(y_k | f_k) ]
  P_k^site = −(∇²L_k)^{−1}
  m_k^site = m_k^post − (∇²L_k)^{−1} ∇L_k

Extended Kalman Smoother:

  v_k = y_k − h(m_k^post, 0)
  S_k = H_f P_k^post H_f^⊤ + H_σ Σ_k H_σ^⊤
  P_k^site = ( H_f^⊤ (H_σ Σ_k H_σ^⊤)^{−1} H_f )^{−1}
  m_k^site = m_k^post + (P_k^site + P_k^post) H_f^⊤ S_k^{−1} v_k

for H_f = dh/df and H_σ = dh/dσ evaluated at the mean, with σ_k ∼ N(0, Σ_k).
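For a scalar latent these rules are a few lines of JAX. The following sketch (our own, assuming a user-supplied log_lik(y, f); it is not the kalman-jax API) computes L_k by Gauss-Hermite quadrature and the site parameters by automatic differentiation:

```python
import jax
import jax.numpy as jnp
import numpy as np

# Gauss-Hermite rule: E_{N(m,v)}[g(f)] ≈ sum_i (w_i/√π) g(m + sqrt(2v) x_i).
gh_x, gh_w = np.polynomial.hermite.hermgauss(20)

def log_Z(m_cav, v_cav, y, alpha, log_lik):
    """L_k = log E_{q_cavity}[ p^alpha(y_k | f_k) ] via quadrature."""
    f = m_cav + jnp.sqrt(2.0 * v_cav) * gh_x
    return jax.scipy.special.logsumexp(alpha * log_lik(y, f),
                                       b=gh_w / jnp.sqrt(jnp.pi))

def pep_site_update(m_cav, v_cav, y, alpha, log_lik):
    """Power EP site update (scalar case), following the rules above."""
    dL = jax.grad(log_Z)(m_cav, v_cav, y, alpha, log_lik)             # ∇L_k
    d2L = jax.grad(jax.grad(log_Z))(m_cav, v_cav, y, alpha, log_lik)  # ∇²L_k
    v_site = -alpha * (v_cav + 1.0 / d2L)
    m_site = m_cav - dL / d2L
    return m_site, v_site

# Example: Poisson likelihood with log link (log-Gaussian Cox process).
log_lik = lambda y, f: y * f - jnp.exp(f) - jax.scipy.special.gammaln(y + 1.0)
m_site, v_site = pep_site_update(0.0, 1.0, 3.0, 1.0, log_lik)
```

The VI rule has the same shape with the cavity replaced by the posterior and L_k = E_{q_post}[log p(y_k | f_k)]; only P_k^site = −(∇²L_k)^{−1} changes.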

A Unifying Perspective

  • For sequential data, the EKF / UKF / GHKF are equivalent to single-sweep EP where the moment matching is solved via linearisation.
  • The iterated Kalman smoothers (EKS / UKS / GHKS) can also be recovered under certain parameter choices, but note that they optimise a different objective to EP (see the paper for details).
  • We show how natural gradient VI updates are surprisingly similar to the EP updates (when using a similar parametrisation).

New Algorithms

  • We propose to mix the beneficial properties of EP with the efficiency of classical smoothers.
  • For example, using linearisation to speed up the updates, whilst also introducing the EP cavity and fractional updates.
  • We call this Extended Kalman Expectation Propagation (EK-EP); a minimal sketch follows this list.
  • It has clear computational benefits when the parameter updates are high-dimensional, e.g., in spatio-temporal problems.
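Our reading of the idea as code (a hedged sketch for the scalar case, not the paper's implementation, and omitting damping): form the cavity by removing a fraction α of the current site, linearise h about the cavity mean, and compute the new site from the linearised Gaussian likelihood.

```python
import jax

def ekep_site_update(m_post, v_post, m_site, v_site, y, h, noise_var, alpha=0.5):
    """EK-EP site update, scalar case: EP cavity + EKF-style linearisation.
    `h(f, sigma)` is the measurement model; `noise_var` plays Σ_k."""
    # Cavity: remove a fraction alpha of the site (natural parameters).
    v_cav = 1.0 / (1.0 / v_post - alpha / v_site)
    m_cav = v_cav * (m_post / v_post - alpha * m_site / v_site)
    # Linearise h about the cavity mean, as in the EKS update.
    Hf = jax.grad(h, argnums=0)(m_cav, 0.0)  # dh/df
    Hs = jax.grad(h, argnums=1)(m_cav, 0.0)  # dh/dsigma
    R = Hs**2 * noise_var                    # linearised noise variance
    v = y - h(m_cav, 0.0)                    # innovation
    # New site from the linearised Gaussian likelihood.
    v_site_new = R / Hf**2
    m_site_new = m_cav + v / Hf
    return m_site_new, v_site_new
```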

Spatio-Temporal Classification

  • We show that our smoothing methods can be applied to tasks with more than one input dimension. We treat the first dimension (x-axis) as time, and run iterated spatio-temporal smoothing (this demo uses EP).

Fast Learning Using JAX

  • Temporal GP methods have been limited by a lack of appropriate software for hyperparameter learning: automatic differentiation + massive for loops.
  • We provide a temporal GP framework in JAX with all inference methods implemented. It:
    i) avoids loop “unrolling” to reduce compilation overheads
    ii) uses JIT compilation to avoid graph retracing
    iii) exploits accelerated linear algebra (XLA) ops

  https://github.com/AaltoML/kalman-jax
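For instance, writing the filter as a jax.lax.scan (a sketch under our own naming, not the kalman-jax code) keeps the compiled graph size independent of the number of time steps, and jax.jit compiles it once:

```python
import jax
import jax.numpy as jnp

@jax.jit
def kalman_filter(m0, P0, A, Q, H, site_means, site_covs):
    """Kalman filter over all Gaussian sites via lax.scan (no loop unrolling)."""
    def step(carry, site):
        m, P = carry
        m_site, P_site = site
        m_pred, P_pred = A @ m, A @ P @ A.T + Q     # predict
        S = H @ P_pred @ H.T + P_site               # innovation covariance
        K = jnp.linalg.solve(S, H @ P_pred).T       # gain K = P_pred Hᵀ S⁻¹
        m_new = m_pred + K @ (m_site - H @ m_pred)  # update with the site
        P_new = P_pred - K @ S @ K.T
        return (m_new, P_new), (m_new, P_new)
    _, (ms, Ps) = jax.lax.scan(step, (m0, P0), (site_means, site_covs))
    return ms, Ps
```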


Results

We run extensive analysis on synthetic and real-world data:

  • Heteroscedastic Noise
  • 1D & 2D Log-Gaussian Cox Process
  • 1D & 2D Classification
  • Audio Amplitude Demodulation

  • No consistently best inference method or EP power value:
    • EK-EP is the only practical method when updates are high-dimensional (rainforest)
    • EP or VI is needed when the likelihood is highly nonlinear
  • We compare against non-sequential baselines (SVGP and EP).
  • Sequential learning methods match the performance of batch methods, whilst scaling to larger data.
  • See the paper for the full results table.

Thanks for Listening

Take-home messages:

  • Any approximate inference method can be framed as a simple parameter update rule during Kalman smoothing.
  • Sequential methods match the performance of batch methods, and can be extended to multiple dimensions.
  • We provide fast JAX code for all methods.

Contact: william.wilkinson@aalto.fi
JAX code: https://github.com/AaltoML/kalman-jax