Stan: Probabilistic Modeling Language, MCMC Sampler, and Optimizer


Stan: Probabilistic Modeling Language, MCMC Sampler, and Optimizer. Development Team: Andrew Gelman, Bob Carpenter, Matt Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, Allen Riddell. MCMski 2014.


SLIDE 1

Stan: Probabilistic Modeling Language, MCMC Sampler, and Optimizer

Development Team: Andrew Gelman, Bob Carpenter, Matt Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, Allen Riddell

MCMski 2014 mc-stan.org

SLIDE 2

Goals / Aims

  • Scalability
      – model complexity, number of parameters, data size
  • Efficiency
      – fast iterations, low memory, high effective sample sizes
  • Robustness
      – numerical routines, model structure (i.e., posterior geometry)
  • Usability
      – general purpose, clear modeling language, integration (R, Python, command line), expose log prob & gradients/Hessians & I/O

SLIDE 3

History

  • Derived from BUGS
  • declarative → imperative
  • untyped → strong static typing
  • Gibbs sampling → adaptive (R)HMC & optimization
  • interpreted → compiled
  • restrictive licenses (proprietary/GPL) → liberal (BSD)

SLIDE 4

Technical Implementation

  • Model Specification
      – (trans) data, (trans) parameters, log prob, generated quantities
  • Sampling via Adaptive Hamiltonian Monte Carlo
      – warmup converges & estimates mass matrix and step size
      – (Geo)NUTS adapts number of steps
  • Optimization via BFGS Quasi-Newton
  • Translated to C++ with Template Metaprogramming
      – constraints to transforms + Jacobians; declarations to I/O
      – automatic differentiation for gradients & Hessians
      – custom probability and special functions
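
To make "constraints to transforms + Jacobians" and Hamiltonian Monte Carlo concrete, here is a minimal Python sketch. It is not Stan's implementation: the target (a positive scale with an Exponential(1) density), the function names, and the fixed step size and number of leapfrog steps are all assumptions chosen for illustration. Stan adapts the step size and mass matrix during warmup and lets NUTS choose the number of steps; this sketch does neither.

```python
import math
import random

# Illustrative target (an assumption): sigma > 0 with density exp(-sigma).
# Sampling happens on the unconstrained scale u = log(sigma), so the log
# density picks up the log Jacobian term log|d sigma / d u| = u.

def log_prob_unconstrained(u):
    sigma = math.exp(u)
    return -sigma + u  # log Exponential(1) density plus log Jacobian

def grad_log_prob(u):
    # Hand-written derivative; Stan would obtain this by automatic differentiation.
    return -math.exp(u) + 1.0

def leapfrog(u, p, step_size, n_steps):
    """Leapfrog integration of Hamiltonian dynamics with unit mass."""
    p += 0.5 * step_size * grad_log_prob(u)      # initial half step for momentum
    for i in range(n_steps):
        u += step_size * p                       # full step for position
        if i < n_steps - 1:
            p += step_size * grad_log_prob(u)    # full step for momentum
    p += 0.5 * step_size * grad_log_prob(u)      # final half step for momentum
    return u, p

def hmc_sample(n_draws, step_size=0.2, n_steps=10, seed=0):
    """Plain HMC with fixed tuning (no warmup adaptation, no NUTS)."""
    rng = random.Random(seed)
    u = 0.0
    draws = []
    for _ in range(n_draws):
        p0 = rng.gauss(0.0, 1.0)                 # resample momentum
        u_new, p_new = leapfrog(u, p0, step_size, n_steps)
        # Metropolis correction based on the change in total energy.
        log_accept = ((log_prob_unconstrained(u_new) - 0.5 * p_new ** 2)
                      - (log_prob_unconstrained(u) - 0.5 * p0 ** 2))
        if rng.random() < math.exp(min(0.0, log_accept)):
            u = u_new
        draws.append(math.exp(u))                # map back to the constrained scale
    return draws

if __name__ == "__main__":
    draws = hmc_sample(5000)
    print("posterior mean of sigma:", sum(draws) / len(draws))
```

Dropping the `+ u` Jacobian term would make the sampler target a different distribution on sigma, which is why the translation step handles it automatically.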

SLIDE 5

Strengths

  • high effective sample size/second (HMC / RHMC)
  • expressive language vs. BUGS; extensible like JAGS
  • extensive doc & example models
  • active, helpful user community
  • large, diverse development team
  • integrated into R, Python, command-line (shell)
  • reusable template lib (auto-diff, distributions & funs, models)
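
The "effective sample size" in the first bullet measures how many independent draws a correlated chain is worth. The sketch below is a simplified single-chain estimator, assuming a truncation rule that stops at the first non-positive autocorrelation; it is not the estimator Stan reports, and the helper names (`effective_sample_size`, `ar1_chain`) are hypothetical.

```python
import random

def autocorrelation(xs, lag, mean, var):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var

def effective_sample_size(xs):
    """ESS = N / (1 + 2 * sum of autocorrelations), truncated at the
    first non-positive autocorrelation (a simplifying assumption)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    rho_sum = 0.0
    for lag in range(1, n):
        rho = autocorrelation(xs, lag, mean, var)
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)

def ar1_chain(n, phi, seed=0):
    """Synthetic correlated chain: x[t] = phi * x[t-1] + standard normal noise."""
    rng = random.Random(seed)
    x = 0.0
    out = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

if __name__ == "__main__":
    independent = ar1_chain(2000, phi=0.0)
    sticky = ar1_chain(2000, phi=0.9)
    print("ESS, independent draws:", effective_sample_size(independent))
    print("ESS, strongly correlated draws:", effective_sample_size(sticky))
```

Dividing an estimate like this by wall-clock time gives the ESS/second figure that the slide uses to compare samplers.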

SLIDE 6

Limitations

  • no discrete parameters (can marginalize)
  • no implicit missing data (code as parameters)
  • not parallelized within chains
  • language limited relative to black boxes (cf. emcee)
  • limited data types and constraints
  • C++ template code is complex for user extension
  • sampling slow, nonscalable; optimization brittle or approx
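
The first limitation notes that discrete parameters can be marginalized out. As a hedged illustration, the Python sketch below sums out the component indicator of a finite normal mixture on the log scale, which is the usual way such a model is rewritten with only continuous parameters. The function names and the example numbers are assumptions for illustration, not code from the talk.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def normal_lpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z

def mixture_log_lik(ys, weights, mus, sigmas):
    """Log likelihood with the discrete indicator z_n summed out:
    log p(y_n) = log sum_k weights[k] * Normal(y_n | mus[k], sigmas[k])."""
    total = 0.0
    for y in ys:
        terms = [math.log(w) + normal_lpdf(y, mu, s)
                 for w, mu, s in zip(weights, mus, sigmas)]
        total += log_sum_exp(terms)
    return total

if __name__ == "__main__":
    ys = [-0.3, 0.1, 2.2, 2.9]
    print(mixture_log_lik(ys, [0.5, 0.5], [0.0, 2.5], [1.0, 1.0]))
```

The resulting log likelihood depends only on the continuous weights, locations, and scales, so gradient-based sampling or optimization can be applied to it.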

SLIDE 7

Current and Future Development

  • (stiff) diff eq solving by integration
  • Riemann manifold HMC (more complex geometry)
  • approximate inference: [stochastic] VB, EP, max marginal

  • structured matrices: Cholesky correlation, sparse
  • L-BFGS optimization (more scalable)
  • more robust adaptation (cross chain?)
  • parallelization within and across chains
  • better probabilistic testing for correctness
  • faster, cleaner C++ code & more useful interfaces

SLIDE 8

How Stan Got its Name

  • “Stan” is not an acronym; Gelman mashed up
      1. the Eminem song about a stalker fan, and
      2. Stanislaw Ulam (1909–1984), co-inventor of the Monte Carlo method (and the hydrogen bomb).

Ulam holding the Fermiac, Enrico Fermi’s physical Monte Carlo simulator for random neutron diffusion