Structured Inference Networks for Nonlinear State Space Models - PowerPoint PPT Presentation


SLIDE 1

Structured Inference Networks for Nonlinear State Space Models

Rahul G. Krishnan, Uri Shalit, David Sontag (New York University), 30 Sep 2016

Chris Cremer, CSC2541, Nov 4 2016

SLIDE 2

Overview

  • VAE
  • Gaussian State Space Models
  • Inference Network
  • Results
SLIDE 3

Recap - VAE

π‘Ÿ" 𝑨 𝑦 = π’ͺ(𝜈" 𝑦 , Ξ£"(𝑦)) π‘ž- 𝑦 𝑨 = π’ͺ 𝜈- 𝑨 , Ξ£- 𝑨 π‘ž-(𝑨) = π’ͺ(0,𝐽) Generative Model Recognition Network Use MLP to model the mean and covariance Learning and Inference –> Maximize Lower Bound

Reconstruction Loss Divergence from Prior Calculated by sampling π‘Ÿ" 𝑨 𝑦 with reparameterization trick Analytic equation
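The two terms above can be sketched numerically. A minimal NumPy sketch (the function names are illustrative, not from the paper) of the reparameterization trick and the analytic KL divergence between a diagonal Gaussian and the standard-normal prior:

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var, rng):
    # z = mu + sigma * eps with eps ~ N(0, I): the sample is a deterministic
    # function of (mu, log_var), so gradients could flow through them
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    # Analytic KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

# Toy recognition-network output for a single datapoint (made-up values)
mu = np.array([0.5, -0.3])
log_var = np.array([-1.0, 0.2])

z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
```

The KL term is zero exactly when the posterior equals the prior (mu = 0, log_var = 0), and positive otherwise.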

SLIDE 4

Gaussian State Space Models

Generative Model

  • HMM with continuous hidden state
  • If the transition and emission distributions are linear Gaussian, we can do inference analytically (Kalman filter)

  • Deep Markov Model:
  • Transition and emission distributions are parametrized by MLPs

  • Inference: VAE
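The generative process described above can be sketched as ancestral sampling: draw z_t from a transition distribution conditioned on z_{t-1}, then x_t from an emission distribution conditioned on z_t. This NumPy sketch assumes unit-variance Gaussians and uses tiny random untrained MLPs as stand-ins for the learned transition and emission networks (all names and sizes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, W1, b1, W2, b2):
    # Tiny two-layer MLP with a tanh hidden layer
    return np.tanh(x @ W1 + b1) @ W2 + b2

def sample_dmm(T, z_dim, x_dim, rng, hidden=8):
    # Deep Markov Model: z_t ~ N(mu_trans(z_{t-1}), I), x_t ~ N(mu_emit(z_t), I)
    Wt1 = rng.normal(size=(z_dim, hidden)) * 0.1; bt1 = np.zeros(hidden)
    Wt2 = rng.normal(size=(hidden, z_dim)) * 0.1; bt2 = np.zeros(z_dim)
    We1 = rng.normal(size=(z_dim, hidden)) * 0.1; be1 = np.zeros(hidden)
    We2 = rng.normal(size=(hidden, x_dim)) * 0.1; be2 = np.zeros(x_dim)

    z = np.zeros(z_dim)  # start the latent chain at the origin
    xs = []
    for _ in range(T):
        z = mlp(z, Wt1, bt1, Wt2, bt2) + rng.standard_normal(z_dim)  # transition
        x = mlp(z, We1, be1, We2, be2) + rng.standard_normal(x_dim)  # emission
        xs.append(x)
    return np.stack(xs)  # shape (T, x_dim)

x_seq = sample_dmm(T=10, z_dim=2, x_dim=4, rng=rng)
```

Replacing both MLPs with linear maps would recover the linear Gaussian state space model, where the Kalman filter applies.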
SLIDE 5

Inference – Factorized Lower Bound

The lower bound factorizes over time steps: the reconstruction loss at each step is estimated by sampling z_t from q_φ(z_t | x⃗) with the reparameterization trick, and each divergence-from-prior term, KL(q_φ(z_t | z_{t-1}, x⃗) || p_θ(z_t | z_{t-1})), has an analytic equation for Gaussians.

SLIDE 6

Inference Networks

  • Evaluate possibilities for the inference networks
  • Mean-Field Model (MF) vs Structured Model (ST)
  • Observations from past (L), future (R), or both (LR)
  • Combiner Function: MLP that combines the previous state with the RNN output

Deep Kalman Smoother (DKS): the ST-R variant
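A sketch of one plausible combiner function: it merges the previous latent state z_{t-1} with the RNN's summary of the observations, then reads off the Gaussian parameters of q(z_t | z_{t-1}, x⃗). The specific parameterization here (tanh merge with equal weighting, softplus for the scale) is an assumption for illustration, not the paper's exact equations:

```python
import numpy as np

def combiner(z_prev, h_rnn, W_z, b_z, W_mu, b_mu, W_sig, b_sig):
    # Merge the previous latent state with the RNN hidden state, then
    # produce the mean and scale of the structured variational posterior
    h_combined = 0.5 * (np.tanh(z_prev @ W_z + b_z) + h_rnn)
    mu = h_combined @ W_mu + b_mu
    sigma = np.log1p(np.exp(h_combined @ W_sig + b_sig))  # softplus keeps sigma > 0
    return mu, sigma

# Zero-initialized toy weights, just to exercise the shapes (all hypothetical)
z_dim, h_dim = 2, 3
z_prev, h_rnn = np.zeros(z_dim), np.zeros(h_dim)
W_z, b_z = np.zeros((z_dim, h_dim)), np.zeros(h_dim)
W_mu, b_mu = np.zeros((h_dim, z_dim)), np.zeros(z_dim)
W_sig, b_sig = np.zeros((h_dim, z_dim)), np.zeros(z_dim)

mu, sigma = combiner(z_prev, h_rnn, W_z, b_z, W_mu, b_mu, W_sig, b_sig)
```

Because the combiner conditions on z_{t-1}, the posterior keeps the Markov structure of the generative model rather than assuming a fully factorized (mean-field) form.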

SLIDE 7

Inference Networks Results

Results:

  • ST-LR and DKS substantially outperform MF-LR and ST-L
  • Due to conditioning on the previous state (z_{t-1}) and the future observations (x_t, …, x_T)
  • z_{t-1} summarizes the past observations (x_1, …, x_{t-1})
  • The DKS network has half the parameters of ST-LR

Polyphonic music data (Boulanger-Lewandowski et al., 2012)

  • Sequence of 88-dimensional binary vectors corresponding to the notes of a piano
  • Report held-out negative log-likelihood (NLL)
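For these binary piano-roll frames, the per-frame NLL is a sum of Bernoulli log-likelihoods over the 88 notes. A small NumPy sketch (the chord and the predicted probabilities are made-up illustrative values):

```python
import numpy as np

def bernoulli_nll(x, p, eps=1e-7):
    # Negative log-likelihood of a binary note vector x under predicted
    # per-note "on" probabilities p; clipping avoids log(0)
    p = np.clip(p, eps, 1 - eps)
    return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

x = np.zeros(88)
x[[60, 64, 67]] = 1.0        # a C-major triad as one binary piano-roll frame
p = np.full(88, 0.05)        # a model that assigns 5% probability to every note

nll = bernoulli_nll(x, p)    # lower is better on held-out data
```

Summing this over the frames of a held-out sequence (and averaging over sequences) gives the reported NLL numbers.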
SLIDE 8

Model Comparison

Results:

  • Increasing the complexity of the generative model improves the likelihood (DMM vs DMM-Aug)
  • DMM-Aug (DKS) obtains better results on all datasets (except LV-RNN on JSB)
  • Demonstrates the inference network’s ability to learn powerful generative models

[Table: held-out negative log-likelihood (NLL) comparing DMM (DKS), DMM-Aug (DKS), HMSBN, STORN, TSBN, and LV-RNN (NASMC); numeric values not preserved]

SLIDE 9

EHR Patient Data

  • What would happen if the patient did or did not receive diabetic medication?
SLIDE 10

Conclusion

  • Structured Inference Networks for Nonlinear State Space Models: a VAE for sequential data

SLIDE 11

Questions?