Numerical Gaussian Processes (Physics Informed Learning Machines) - PowerPoint PPT Presentation

SLIDE 1

Numerical Gaussian Processes

(Physics Informed Learning Machines)

Maziar Raissi Division of Applied Mathematics, Brown University, Providence, RI, USA maziar_raissi@brown.edu

June 7, 2017

SLIDE 4

Probabilistic Numerics vs. Numerical Gaussian Processes

Probabilistic numerics aims to capitalize on the recent developments in probabilistic machine learning to revisit classical methods in numerical analysis and mathematical physics from a statistical inference point of view. This is exciting. However, it would be even more exciting if we could do the exact opposite. Numerical Gaussian processes aim to capitalize on the long-standing developments of classical methods in numerical analysis and revisit machine learning from a mathematical physics point of view.

Maziar Raissi | Numerical Gaussian Processes

SLIDE 6

Physics Informed Learning Machines

Numerical Gaussian processes enable the construction of data-efficient learning machines that can encode physical conservation laws as structured prior information. Numerical Gaussian processes are essentially physics informed learning machines.

SLIDE 7

Content

Motivating Example
Introduction to Gaussian Processes: Prior, Training, Posterior
Numerical Gaussian Processes: Burgers' Equation – Nonlinear PDEs (Backward Euler: Prior, Training, Posterior)
General Framework: Linear Multi-step Methods, Runge-Kutta Methods
Experiments
Navier-Stokes Equations

SLIDE 8

Motivating Example

SLIDE 9

Road Networks

Consider a 2 × 2 junction as shown below, with roads labeled 1, 2, 3, 4. Roads have length L_i, i = 1, 2, 3, 4.

SLIDE 10

Road Networks

SLIDE 11

Hyperbolic Conservation Law

The road traffic densities ρ_i(t, x) ∈ [0, 1] satisfy the one-dimensional hyperbolic conservation law

∂_t ρ_i + ∂_x f(ρ_i) = 0,  on [0, T] × [0, L_i].

Here, f(ρ) = ρ(1 − ρ).
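To make the setting concrete, this conservation law can be advanced with a standard finite-volume scheme. The sketch below is not from the slides: it uses a Lax-Friedrichs step with the flux f(ρ) = ρ(1 − ρ), and the grid size, time step, boundary treatment (periodic), and initial density are all illustrative choices.

```python
import numpy as np

def flux(rho):
    # flux from the slide: f(rho) = rho * (1 - rho)
    return rho * (1.0 - rho)

def lax_friedrichs_step(rho, dx, dt):
    """One Lax-Friedrichs update for d/dt rho + d/dx f(rho) = 0 with periodic BCs."""
    rho_m = np.roll(rho, 1)   # rho_{i-1}
    rho_p = np.roll(rho, -1)  # rho_{i+1}
    return 0.5 * (rho_m + rho_p) - 0.5 * (dt / dx) * (flux(rho_p) - flux(rho_m))

# smooth initial density, values kept inside [0, 1]
x = np.linspace(0.0, 1.0, 200, endpoint=False)
rho = 0.5 + 0.4 * np.sin(2 * np.pi * x)
dx, dt = x[1] - x[0], 0.002  # dt satisfies the CFL condition max|f'(rho)| * dt/dx <= 1
for _ in range(100):
    rho = lax_friedrichs_step(rho, dx, dt)
```

A Godunov flux would resolve shocks more sharply; Lax-Friedrichs is chosen here only for brevity. The scheme is conservative, so the mean density (total mass) is preserved.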

SLIDE 12

Black Box Initial Conditions

The densities must satisfy the initial conditions

ρ_i(0, x) = ρ⁰_i(x),

where ρ⁰_i(x) are black-box functions. This means that ρ⁰_i(x) are observable only through noisy measurements {x⁰_i, ρ⁰_i}.

SLIDE 13

Introduction to Gaussian Processes

SLIDE 14

Gaussian Processes

A Gaussian process f(x) ∼ GP(0, k(x, x′; θ)) is just a shorthand notation for

[f(x), f(x′)]ᵀ ∼ N( 0, [[k(x, x; θ), k(x, x′; θ)], [k(x′, x; θ), k(x′, x′; θ)]] ).

SLIDE 16

Squared Exponential Covariance Function

A typical example for the kernel k(x, x′; θ) is the squared exponential covariance function, i.e.,

k(x, x′; θ) = γ² exp( −(x − x′)² / (2w²) ),

where θ = (γ, w) are the hyper-parameters of the kernel.
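As a minimal numpy sketch (hyper-parameter values and grid are illustrative), this kernel can be evaluated and used to draw samples from the zero-mean GP prior:

```python
import numpy as np

def k_se(x, xp, gamma=1.0, w=0.5):
    """Squared exponential kernel k(x, x'; theta) = gamma^2 exp(-(x - x')^2 / (2 w^2))."""
    return gamma**2 * np.exp(-0.5 * (x[:, None] - xp[None, :])**2 / w**2)

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
K = k_se(x, x) + 1e-10 * np.eye(50)  # small jitter for numerical stability
samples = rng.multivariate_normal(np.zeros(50), K, size=3)  # three prior draws
```

Larger w yields smoother draws; γ controls their amplitude.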

SLIDE 17

Training

Given a dataset {x, y} of size N, the hyper-parameters θ and the noise variance parameter σ² can be trained by minimizing the negative log marginal likelihood

NLML(θ, σ) = (1/2) yᵀK⁻¹y + (1/2) log|K| + (N/2) log(2π),

resulting from y ∼ N(0, K), where K = k(x, x; θ) + σ²I.
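The objective can be sketched as follows; the Cholesky-based evaluation and the squared exponential kernel are illustrative choices, not the slides' exact implementation:

```python
import numpy as np

def k_se(a, b, gamma, w):
    # squared exponential kernel with hyper-parameters theta = (gamma, w)
    return gamma**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / w**2)

def nlml(theta, sigma, x, y):
    """NLML(theta, sigma) = 0.5 y^T K^-1 y + 0.5 log|K| + (N/2) log(2 pi),
    with K = k(x, x; theta) + sigma^2 I, evaluated via a Cholesky factorization."""
    N = len(y)
    K = k_se(x, x, *theta) + sigma**2 * np.eye(N)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^-1 y
    # sum(log diag(L)) equals 0.5 * log|K|
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * N * np.log(2 * np.pi)

x = np.linspace(0.0, 1.0, 30)
y = np.sin(2 * np.pi * x)
value = nlml((1.0, 0.2), 0.1, x, y)
```

In practice this scalar would be handed to a gradient-based optimizer (e.g. L-BFGS) over (θ, σ).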

SLIDE 19

Prediction

Having trained the hyper-parameters and parameters of the model,

  • ne can use the posterior distribution

f(x∗)|y ∼ N(k(x∗, x)K −1y, k(x∗, x∗) − k(x∗, x)K −1k(x, x∗)). to make predictions at a new test point x∗. This is obtained by writing the joint distribution f(x∗) y

  • ∼ N(
  • ,

k(x∗, x) k(x∗, x) k(x, x∗) K

  • .
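In code, the posterior predictive equations above look like the following sketch (the squared exponential kernel, noise level, and test data are illustrative assumptions):

```python
import numpy as np

def k_se(a, b, gamma=1.0, w=0.5):
    return gamma**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / w**2)

def gp_posterior(x_star, x, y, sigma=0.1):
    """Posterior f(x*) | y ~ N( k(x*, x) K^-1 y,
    k(x*, x*) - k(x*, x) K^-1 k(x, x*) ), with K = k(x, x) + sigma^2 I."""
    K = k_se(x, x) + sigma**2 * np.eye(len(x))
    k_s = k_se(x_star, x)
    mean = k_s @ np.linalg.solve(K, y)
    cov = k_se(x_star, x_star) - k_s @ np.linalg.solve(K, k_s.T)
    return mean, cov

x = np.linspace(-1.0, 1.0, 20)
y = np.sin(np.pi * x)          # noiseless observations of a test function
x_star = np.array([0.0, 0.5])
mu, cov = gp_posterior(x_star, x, y)
```

The posterior variance on the diagonal of `cov` shrinks below the prior variance γ² wherever data is informative.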

SLIDE 20

Example

[Code]

SLIDE 21

Numerical Gaussian Processes

SLIDE 22

Numerical Gaussian Processes

Definition

Numerical Gaussian processes are Gaussian processes with covariance functions resulting from temporal discretization of time-dependent partial differential equations.

SLIDE 24

Example: Burgers’ Equation

Burgers' equation is a fundamental non-linear partial differential equation arising in various areas of applied mathematics, including fluid mechanics, nonlinear acoustics, gas dynamics, and traffic flow. In one space dimension, Burgers' equation reads

u_t + u u_x = ν u_xx,

along with Dirichlet boundary conditions u(t, −1) = u(t, 1) = 0, where u(t, x) denotes the unknown solution and ν = 0.01/π is a viscosity parameter.

SLIDE 25

Problem Setup

Burgers’ Equation

Let us assume that all we observe are noisy measurements {x⁰, u⁰} of the black-box initial function u(0, x).

Given such measurements, we would like to solve the Burgers’ equation while propagating through time the uncertainty associated with the noisy initial data.

SLIDE 26

Burgers’ equation

[Movie] [Code]

SLIDE 27

Burgers’ equation

It is remarkable that the proposed methodology can effectively propagate an infinite collection of correlated Gaussian random variables (i.e., a Gaussian process) through the complex nonlinear dynamics of the Burgers’ equation.

SLIDE 28

Backward Euler

Burgers’ Equation

Let us apply the backward Euler scheme to the Burgers' equation. This can be written as

u^n + ∆t u^n (d/dx)u^n − ν∆t (d²/dx²)u^n = u^{n−1}.

SLIDE 29

Backward Euler

Burgers’ Equation

Let us apply the backward Euler scheme to the Burgers' equation. Replacing u^n in the nonlinear coefficient with μ^{n−1}, the posterior mean of the previous time step, this can be written as

u^n + ∆t μ^{n−1} (d/dx)u^n − ν∆t (d²/dx²)u^n = u^{n−1}.

SLIDE 30

Prior Assumption

Burgers' Equation

Let us make the prior assumption that u^n(x) ∼ GP(0, k(x, x′; θ)) is a Gaussian process with a neural network covariance function

k(x, x′; θ) = (2/π) sin⁻¹( 2(σ₀² + σ²xx′) / √[(1 + 2(σ₀² + σ²x²))(1 + 2(σ₀² + σ²x′²))] ),

where θ = (σ₀², σ²)
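A direct numpy transcription of this neural network (arcsine) kernel, with illustrative hyper-parameter values:

```python
import numpy as np

def k_nn(x, xp, s0=1.0, s=1.0):
    """Neural network covariance
    k(x, x') = (2/pi) asin( 2(s0 + s x x') /
               sqrt((1 + 2(s0 + s x^2)) (1 + 2(s0 + s x'^2))) ),
    with hyper-parameters theta = (s0, s) = (sigma_0^2, sigma^2)."""
    num = 2.0 * (s0 + s * x[:, None] * xp[None, :])
    den = np.sqrt((1.0 + 2.0 * (s0 + s * x[:, None]**2))
                  * (1.0 + 2.0 * (s0 + s * xp[None, :]**2)))
    return (2.0 / np.pi) * np.arcsin(num / den)

x = np.linspace(-2.0, 2.0, 40)
K = k_nn(x, x)  # Gram matrix on a grid
```

Unlike the squared exponential kernel, this covariance is non-stationary, which is what makes it well suited to solutions with steep gradients such as those of Burgers' equation.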

SLIDE 31

Numerical Gaussian Process

Burgers’ Equation – Backward Euler

This enables us to obtain the following numerical Gaussian process:

[u^n, u^{n−1}]ᵀ ∼ GP( 0, [[k^{n,n}_{u,u}, k^{n,n−1}_{u,u}], [k^{n−1,n}_{u,u}, k^{n−1,n−1}_{u,u}]] ).

SLIDE 33

Kernels

Burgers’ Equation – Backward Euler

The covariance functions for the Burgers' equation example are given by

k^{n,n}_{u,u} = k,
k^{n,n−1}_{u,u} = k + ∆t μ^{n−1}(x′) (d/dx′)k − ν∆t (d²/dx′²)k.

Compare this with

u^n + ∆t μ^{n−1} (d/dx)u^n − ν∆t (d²/dx²)u^n = u^{n−1}.
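These operator-applied kernels can be derived symbolically. The sketch below uses sympy with a squared exponential base kernel as a stand-in (the slides use the neural network kernel), and treats μ^{n−1}(x′) as a constant symbol μ for brevity; in the actual method it is a function of x′.

```python
import sympy as sp

x, xp, dt, nu, mu = sp.symbols('x x_prime Delta_t nu mu')
gamma, w = sp.symbols('gamma w', positive=True)

# stand-in base kernel k(x, x'); an assumption, not the slides' choice
k = gamma**2 * sp.exp(-(x - xp)**2 / (2 * w**2))

# k^{n,n}_{u,u} = k, and
# k^{n,n-1}_{u,u} = k + dt * mu * (d/dx') k - nu * dt * (d^2/dx'^2) k
k_nn = k
k_n_nm1 = k + dt * mu * sp.diff(k, xp) - nu * dt * sp.diff(k, xp, 2)
```

The same mechanics (apply the discretized operator to the second argument, then to the first) produce k^{n−1,n−1}_{u,u} on the next slide.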

SLIDE 34

Kernels

Burgers’ Equation – Backward Euler

k^{n−1,n−1}_{u,u} = k + ∆t μ^{n−1}(x′) (d/dx′)k − ν∆t (d²/dx′²)k
+ ∆t μ^{n−1}(x) (d/dx)k + ∆t² μ^{n−1}(x) μ^{n−1}(x′) (d/dx)(d/dx′)k − ν∆t² μ^{n−1}(x) (d/dx)(d²/dx′²)k
− ν∆t (d²/dx²)k − ν∆t² μ^{n−1}(x′) (d²/dx²)(d/dx′)k + ν²∆t² (d²/dx²)(d²/dx′²)k.

SLIDE 35

Training

Burgers’ Equation – Backward Euler

The hyper-parameters θ and the noise parameters σ_n², σ_{n−1}² can be trained by employing the negative log marginal likelihood resulting from

[u^n_b, u^{n−1}]ᵀ ∼ N(0, K),

where {x^n_b, u^n_b} are the (noisy) data on the boundary and {x^{n−1}, u^{n−1}} are artificially generated data. Here,

K = [[k^{n,n}_{u,u}(x^n_b, x^n_b; θ) + σ_n²I, k^{n,n−1}_{u,u}(x^n_b, x^{n−1}; θ)],
[k^{n−1,n}_{u,u}(x^{n−1}, x^n_b; θ), k^{n−1,n−1}_{u,u}(x^{n−1}, x^{n−1}; θ) + σ_{n−1}²I]].
SLIDE 36

Prediction & Propagating Uncertainty

Burgers’ Equation – Backward Euler

In order to predict u^n(x^n_∗) at a new test point x^n_∗, we use the conditional distribution

u^n(x^n_∗) | u^n_b ∼ N( μ^n(x^n_∗), Σ^{n,n}(x^n_∗, x^n_∗) ),

where

μ^n(x^n_∗) = qᵀ K⁻¹ [u^n_b, μ^{n−1}]ᵀ

and

Σ^{n,n}(x^n_∗, x^n_∗) = k^{n,n}_{u,u}(x^n_∗, x^n_∗) − qᵀK⁻¹q + qᵀK⁻¹ [[0, 0], [0, Σ^{n−1,n−1}]] K⁻¹q.

Here, qᵀ = [ k^{n,n}_{u,u}(x^n_∗, x^n_b), k^{n,n−1}_{u,u}(x^n_∗, x^{n−1}) ].

SLIDE 37

Artificial data

Burgers’ Equation – Backward Euler

Now, one can use the resulting posterior distribution to obtain the artificially generated data {x^n, u^n} for the next time step with

u^n ∼ N(μ^n, Σ^{n,n}).

Here, μ^n = μ^n(x^n) and Σ^{n,n} = Σ^{n,n}(x^n, x^n).

SLIDE 38

Noiseless data

[Movie] [Code]

SLIDE 39

General Framework

SLIDE 40

Numerical Gaussian Processes

It must be emphasized that numerical Gaussian processes, by construction, are designed to deal with cases where:

◮ (1) all we observe is noisy data on black-box initial conditions, and

◮ (2) we are interested in quantifying the uncertainty associated with such noisy data in our solutions to time-dependent partial differential equations.

SLIDE 41

General Framework

Numerical Gaussian Processes

Let us consider linear partial differential equations of the form

u_t = L_x u,  x ∈ Ω,  t ∈ [0, T],

where L_x is a linear operator and u(t, x) denotes the latent solution.

SLIDE 42

Linear Multi-step Methods

SLIDE 43

Linear Multi-step Methods

Trapezoidal Rule

The trapezoidal time-stepping scheme can be written as

u^n − (∆t/2) L_x u^n = u^{n−1} + (∆t/2) L_x u^{n−1}.

SLIDE 44

Linear Multi-step Methods

Trapezoidal Rule

The trapezoidal time-stepping scheme can be written as

u^n − (∆t/2) L_x u^n =: u^{n−1/2} := u^{n−1} + (∆t/2) L_x u^{n−1}.
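As a quick sanity check of the scheme itself (independent of the GP machinery), here is a method-of-lines sketch where the operator L_x is replaced by a matrix L; the decay test problem L = −I is an illustrative choice with exact solution u₀ e^{−t}.

```python
import numpy as np

def trapezoidal_step(u, L, dt):
    """One trapezoidal step for u_t = L u:
    solve (I - dt/2 L) u^n = (I + dt/2 L) u^{n-1}."""
    I = np.eye(len(u))
    return np.linalg.solve(I - 0.5 * dt * L, (I + 0.5 * dt * L) @ u)

u = np.ones(3)
L = -np.eye(3)        # simple decay problem u_t = -u
for _ in range(100):  # integrate to t = 1 with dt = 0.01
    u = trapezoidal_step(u, L, 0.01)
```

The scheme is second-order accurate, so at t = 1 the result matches e⁻¹ to roughly dt² accuracy.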

SLIDE 45

Numerical Gaussian Process

Trapezoidal Rule

By assuming u^{n−1/2}(x) ∼ GP(0, k(x, x′; θ)), we can capture the entire structure of the trapezoidal rule in the resulting joint distribution of u^n and u^{n−1}.

SLIDE 46

Runge-Kutta Methods

SLIDE 47

Runge-Kutta Methods

Trapezoidal Rule

The trapezoidal time-stepping scheme can be written as

u^n = u^{n−1} + (∆t/2) L_x u^{n−1} + (∆t/2) L_x u^n,
u^n = u^n.

SLIDE 48

Runge-Kutta Methods

Trapezoidal Rule

The trapezoidal time-stepping scheme can be written as

u^n_2 = u^{n−1} + (∆t/2) L_x u^{n−1} + (∆t/2) L_x u^n,
u^n_1 = u^n.

SLIDE 49

Numerical Gaussian Process

Trapezoidal Rule – Runge-Kutta Methods

By assuming

u^n(x) ∼ GP(0, k^{n,n}(x, x′; θ_n)),
u^{n−1}(x) ∼ GP(0, k^{n−1,n−1}(x, x′; θ_{n−1})),

we can capture the entire structure of the trapezoidal rule in the resulting joint distribution of u^n, u^{n−1}, u^n_2, and u^n_1. Here,

u^n_2 = u^n_1 = u^n.

SLIDE 50

Experiments

SLIDE 51

Wave Equation – Trapezoidal Rule

[Movie] [Code]

SLIDE 52

Advection Equation – Gauss-Legendre

[Movie] [Code]

SLIDE 53

Heat Equation – Trapezoidal Rule

[Movie] [Code]

SLIDE 54

Navier-Stokes Equations

SLIDE 55

Navier-Stokes Equations in 2D

Let us consider the Navier-Stokes equations in 2D, given explicitly by

u_t + u u_x + v u_y = −p_x + (1/Re)(u_xx + u_yy),
v_t + u v_x + v v_y = −p_y + (1/Re)(v_xx + v_yy),

where the unknowns are the 2-dimensional velocity field (u(t, x, y), v(t, x, y)) and the pressure p(t, x, y). Here, Re denotes the Reynolds number.

SLIDE 56

Continuity Equation

Solutions to the Navier-Stokes equations are sought in the set of divergence-free functions, i.e., u_x + v_y = 0. This extra equation is the continuity equation for incompressible fluids, which describes the conservation of mass of the fluid.

SLIDE 57

Backward Euler

Applying the backward Euler time stepping scheme to the Navier-Stokes equations we obtain

u^n + ∆t u^{n−1} u^n_x + ∆t v^{n−1} u^n_y + ∆t p^n_x − (∆t/Re)(u^n_xx + u^n_yy) = u^{n−1},
v^n + ∆t u^{n−1} v^n_x + ∆t v^{n−1} v^n_y + ∆t p^n_y − (∆t/Re)(v^n_xx + v^n_yy) = v^{n−1},

where u^n(x, y) = u(t^n, x, y).

SLIDE 58

Divergence Free

We make the assumption that

u^n = ψ^n_y,  v^n = −ψ^n_x,

for some latent function ψ^n(x, y). Under this assumption, the continuity equation is automatically satisfied. We proceed by placing a Gaussian process prior on ψ^n(x, y):

ψ^n(x, y) ∼ GP( 0, k((x, y), (x′, y′); θ) ),

where θ are the hyper-parameters of the kernel k((x, y), (x′, y′); θ).
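This construction can be checked symbolically: any velocity field derived from a stream function this way is divergence-free. The particular ψ below is an arbitrary illustrative choice, not one from the slides.

```python
import sympy as sp

x, y = sp.symbols('x y')
psi = sp.sin(x) * sp.cos(y)  # arbitrary smooth stream function (illustrative)
u = sp.diff(psi, y)          # u^n = psi_y
v = -sp.diff(psi, x)         # v^n = -psi_x
divergence = sp.simplify(sp.diff(u, x) + sp.diff(v, y))  # u_x + v_y
```

The divergence vanishes identically, for any smooth ψ, because ψ_yx − ψ_xy = 0.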

SLIDE 59

Divergence Free Prior

This will result in the following multi-output Gaussian process

[u^n, v^n]ᵀ ∼ GP( 0, [[k^{n,n}_{u,u}, k^{n,n}_{u,v}], [k^{n,n}_{v,u}, k^{n,n}_{v,v}]] ),

where

k^{n,n}_{u,u} = ∂²k/∂y∂y′,   k^{n,n}_{u,v} = −∂²k/∂y∂x′,
k^{n,n}_{v,u} = −∂²k/∂x∂y′,  k^{n,n}_{v,v} = ∂²k/∂x∂x′.

Any samples generated from this multi-output Gaussian process will satisfy the continuity equation.

SLIDE 60

Pressure

Moreover, independent from ψ^n(x, y), we will place a Gaussian process prior on p^n(x, y); i.e.,

p^n(x, y) ∼ GP( 0, k^{n,n}_{p,p}((x, y), (x′, y′); θ_p) ).

SLIDE 61

Numerical Gaussian Processes

Navier-Stokes equations

This will allow us to obtain the following numerical Gaussian process encoding the structure of the Navier-Stokes equations and the backward Euler time stepping scheme in its kernels; i.e.,

[u^n, v^n, p^n, u^{n−1}, v^{n−1}]ᵀ ∼ GP(0, K),

where K is symmetric with blocks (upper triangle shown; p^n is a priori independent of u^n and v^n)

K = [[k^{n,n}_{u,u}, k^{n,n}_{u,v}, 0, k^{n,n−1}_{u,u}, k^{n,n−1}_{u,v}],
[ ·, k^{n,n}_{v,v}, 0, k^{n,n−1}_{v,u}, k^{n,n−1}_{v,v}],
[ ·, ·, k^{n,n}_{p,p}, k^{n,n−1}_{p,u}, k^{n,n−1}_{p,v}],
[ ·, ·, ·, k^{n−1,n−1}_{u,u}, k^{n−1,n−1}_{u,v}],
[ ·, ·, ·, ·, k^{n−1,n−1}_{v,v}]].

SLIDE 62

Taylor-Green Vortex

[Figure: Taylor-Green vortex. Time: 1.000000, u error: 2.896727e-02, v error: 2.736063e-02]

SLIDE 63

Concluding Remarks

We have presented a novel machine learning framework for encoding physical laws described by partial differential equations into Gaussian process priors for nonparametric Bayesian regression. The proposed algorithms can be used to infer solutions to time-dependent and nonlinear partial differential equations, and effectively quantify and propagate uncertainty due to noisy initial or boundary data.

SLIDE 64

Thank you!