  1. High-dimensional estimation of nonlinear transformations for Bayesian filtering
Ricardo Baptista, Daniele Bigoni, Alessio Spantini, Youssef Marzouk
Massachusetts Institute of Technology, Department of Aeronautics & Astronautics
7th International Symposium on Data Assimilation, Kobe, Japan, January 23, 2019
Baptista (rsb@mit.edu), Estimating Transformations in Filtering

  2. Bayesian Approach to Filtering

Non-Gaussian State-Space Model
◮ Model dynamics (transition kernel): x_t ~ f(· | x_{t-1})
◮ Observations (likelihood model): y_t ~ g(· | x_t)
[Diagram: hidden Markov model with states x_0, ..., x_{t-1}, x_t, x_{t+1} and observations y_1, ..., y_{t-1}, y_t, y_{t+1}]

Goal: characterize the filtering distributions π_{t|t} := π(x_t | y_1, ..., y_t)

Challenges of Filtering
◮ Complex nonlinear dynamics (e.g., chaotic systems)
◮ Observations that are sparse in space and time
◮ Limited model evaluations available (e.g., small ensemble sizes)
◮ High-dimensional states, x_t ∈ R^d for d ~ O(10^6)
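The Lorenz-96 model used later in the talk is a standard testbed for exactly this state-space setup. As a minimal illustrative sketch (not the authors' code; the time step, spin-up length, and observation noise level below are assumptions, not values from the slides), here is a forecast/observation step for Lorenz-96 with d = 40, F = 8, observing every other state component:

```python
import numpy as np

def lorenz96_rhs(x, F=8.0):
    # dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F, with cyclic indices
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def step_rk4(x, dt, F=8.0):
    # one classical fourth-order Runge-Kutta step of the dynamics
    k1 = lorenz96_rhs(x, F)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1, F)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2, F)
    k4 = lorenz96_rhs(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

rng = np.random.default_rng(0)
d = 40
x = 8.0 + 0.01 * rng.standard_normal(d)   # small perturbation of the rest state
for _ in range(500):                       # spin up onto the chaotic attractor
    x = step_rk4(x, 0.01)

# observe 20 of the 40 states (every other one) with additive Gaussian noise
H = np.eye(d)[::2]
y = H @ x + 0.5 * rng.standard_normal(d // 2)
```

The pair (x, y) is one draw from the transition kernel f and likelihood model g that the filtering distributions condition on.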

  3. Stochastic Maps Algorithm [Spantini et al., 2019]

Generalization of the EnKF for the Inference Step
Find a nonlinear map T that couples the forecast π_{t|t-1} and the analysis π_{t|t}

Main Idea
◮ Learn T given N ≪ d forecast samples x_t^(i) ~ π_{t|t-1}
◮ Generate analysis samples T(x_t^(i)) ~ π_{t|t} for i = 1, ..., N

  4. Building Block of Stochastic Maps

Transport Maps [Moselhy et al., 2012]
◮ Deterministic coupling between densities π, η on R^d such that
    π(x) = S^# η(x) := η ∘ S(x) |det ∇S(x)|
◮ The coupling exists and is unique for triangular, monotone maps
    S(x) = [ S_1(x_1);  S_2(x_1, x_2);  ...;  S_d(x_1, x_2, ..., x_d) ]
◮ For Gaussian η, find S by solving decoupled convex problems:
    min_S D_KL(π || S^# η)  ⇔  min_{S_k} E_π[ (1/2) S_k(x)^2 − log |∂_k S_k(x)| ]  for all k
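For a Gaussian target the triangular, monotone map is available in closed form, which makes the pullback formula easy to check numerically. In this sketch (an illustration under assumed values, not the authors' code), π = N(0, C) with C = AAᵀ for a lower-triangular A, so S(x) = A⁻¹x is lower triangular with positive diagonal and pushes π forward to η = N(0, I):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3
A = np.array([[ 2.0, 0.0, 0.0],
              [ 0.5, 1.5, 0.0],
              [-0.3, 0.2, 1.0]])      # lower-triangular, positive diagonal
C = A @ A.T                           # target covariance of pi = N(0, C)

S = np.linalg.inv(A)                  # triangular monotone map: S(x) = A^{-1} x

def log_std_normal(z):
    # log-density of eta = N(0, I) at z
    return -0.5 * (z @ z) - 0.5 * len(z) * np.log(2.0 * np.pi)

x = A @ rng.standard_normal(d)        # one sample from pi = N(0, C)

# pullback formula: pi(x) = eta(S(x)) |det grad S(x)|
log_pullback = log_std_normal(S @ x) + np.log(np.abs(np.linalg.det(S)))

# direct evaluation of the N(0, C) log-density
log_pi = (-0.5 * x @ np.linalg.solve(C, x)
          - 0.5 * (d * np.log(2.0 * np.pi) + np.log(np.linalg.det(C))))

assert np.allclose(log_pullback, log_pi)
```

The two log-densities agree exactly, since xᵀC⁻¹x = ‖Sx‖² and |det S| = (det C)^{-1/2} for this linear map.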

  5. Triangular Maps Enable Conditional Sampling

◮ Each component S_k characterizes one marginal conditional of π:
    π(x) = π(x_1) π(x_2 | x_1) ··· π(x_d | x_1, ..., x_{d-1})
◮ For π(y, x) and η(z_1, z_2), consider the block-triangular map
    S(y, x) = [ S^y(y);  S^x(y, x) ]
◮ The map x ↦ S^x(y*, x) pushes forward π(x | y*) to η(z_2)
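This conditioning property can be verified directly in a bivariate Gaussian example (an illustrative sketch with assumed numbers, not from the slides). Ordering y first, the lower-triangular map S = A⁻¹ from the Cholesky factor A of the joint covariance has second component S^x(y, x); evaluated at a fixed y*, it should map exact samples of π(x | y*) to a standard normal:

```python
import numpy as np

rng = np.random.default_rng(2)
# joint covariance of (y, x); y is ordered first so we can condition on y*
C = np.array([[1.0, 0.6],
              [0.6, 1.3]])
A = np.linalg.cholesky(C)
S = np.linalg.inv(A)          # lower-triangular map pushing N(0, C) to N(0, I)

y_star = 0.8
# exact Gaussian conditional x | y = y*
mu = C[1, 0] / C[0, 0] * y_star
var = C[1, 1] - C[1, 0] ** 2 / C[0, 0]

# push conditional samples through the second map component x -> S^x(y*, x)
xs = mu + np.sqrt(var) * rng.standard_normal(100_000)
zs = S[1, 0] * y_star + S[1, 1] * xs

# zs should be (approximately) standard normal
assert abs(zs.mean()) < 0.02 and abs(zs.std() - 1.0) < 0.02
```

The variable ordering matters: because S is triangular with y first, S^x(y*, ·) depends on x only through its last argument, which is exactly what makes conditional sampling possible.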

  6. Stochastic Maps Algorithm

Forecast Step
1. Apply the forward model to generate the forecast ensemble x_t^(i) ~ f(· | x_{t-1}^(i))

Analysis Step
1. Perturbed observations: sample y_t^(i) ~ g(· | x_t^(i)) using the forecast
2. Estimate a lower-triangular map Ŝ that couples π_{y_t, x_t} and N(0, I):
     Ŝ(y, x) = [ Ŝ^y(y);  Ŝ^x(y, x) ]
3. Compose maps: T̂(y, x) = Ŝ^x(y*, ·)^{-1} ∘ Ŝ^x(y, x)
4. Generate the analysis ensemble (x_t^a)^(i) = T̂(y_t^(i), x_t^(i)) for i = 1, ..., N
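When the map Ŝ is restricted to be linear, this analysis step reduces to the perturbed-observation EnKF, which gives a compact way to sketch the four steps. The following is a minimal illustration under that linear-map assumption (the test problem, noise levels, and function names are invented for the example, not taken from the talk):

```python
import numpy as np

def stochastic_map_analysis(X, y_obs, obs_op, obs_std, rng):
    """One analysis step with a linear triangular map estimated from the
    ensemble; for linear maps the composition T = S^x(y*,.)^{-1} o S^x(y,.)
    reduces to the stochastic-EnKF update x + K (y* - y).
    X: (N, d) forecast ensemble; y_obs: observed data vector y*."""
    N = X.shape[0]
    # step 1: perturbed observations y_i ~ g(. | x_i)
    Y = (np.array([obs_op(x) for x in X])
         + obs_std * rng.standard_normal((N, len(y_obs))))
    # step 2: estimate the linear map from the joint (y, x) sample covariance
    Yc, Xc = Y - Y.mean(0), X - X.mean(0)
    C_yy = Yc.T @ Yc / (N - 1)
    C_xy = Xc.T @ Yc / (N - 1)
    K = C_xy @ np.linalg.inv(C_yy)      # gain implied by the linear map
    # steps 3-4: compose and apply the map to every ensemble member
    return X + (y_obs - Y) @ K.T

rng = np.random.default_rng(3)
d, N = 5, 200
X = rng.standard_normal((N, d))         # forecast ensemble ~ N(0, I)
H = np.eye(d)[:2]                       # observe the first two components
y_obs = np.array([1.0, -0.5])
Xa = stochastic_map_analysis(X, y_obs, lambda x: H @ x, 0.5, rng)
```

Conditioning on data reduces the spread of the observed components, so `Xa[:, 0].std()` falls below the forecast spread; the nonlinear algorithm on this slide replaces the linear Ŝ with an estimated monotone triangular map.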

  7. Performance of Stochastic Maps

Lorenz-96 Model
◮ d = 40 with F = 8, Δt_obs = 0.4, and 20 observations
◮ The structure of S is set by tuning a localization radius

Challenge: build adaptive estimators for S using N ≪ d samples

  8. Structure Inherited by Maps

Theorem: Sparsity of Transport Maps [Spantini et al., 2018]
Conditional independence in π determines the functional dependence of each S_k(x)

Lorenz-96 Model
◮ Estimate the forecast covariance C_{t|t-1} over 1000 assimilation cycles
[Figures: average C_{t|t-1}^{-1} and sparsity pattern of C_{t|t-1}^{-1}]

  9. Learning Transport Maps with Sparse Structure

Key Idea: learn, rather than impose, sparsity in the map's parameters

Linear Transport Maps
◮ Linear components: S(x) = Lx, with lower-triangular L
◮ Approximating density: π = S^# η = N(0, C), where C^{-1} = L^T L

Connection to Linear Regression
◮ Normalize the diagonal: S_k(x) = L_kk (β_1 x_1 + ··· + β_{k-1} x_{k-1} + x_k)
◮ Rewrite the optimization problem for the linear map parameters:
    min_{L_kk > 0, β} E_π[ (1/2) L_kk^2 (x_{1:k-1} β + x_k)^2 − log |L_kk| ]
◮ Using samples from π:
    β̂ ∈ argmin_β (1/(2N)) ‖x_{1:k-1} β + x_k‖_2^2,
    L̂_kk = ( (1/N) ‖x_{1:k-1} β̂ + x_k‖_2^2 )^{-1/2}
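The regression view makes the estimator two lines of numerics per component: a least-squares fit for β and a residual-variance normalization for L_kk. A minimal sketch on a synthetic Gaussian target (the covariance below is invented for illustration), checking that the assembled lower-triangular L factors the precision as C⁻¹ = LᵀL for the map S(x) = Lx:

```python
import numpy as np

rng = np.random.default_rng(4)
# draw samples from a correlated zero-mean Gaussian pi = N(0, A A^T)
A_true = np.array([[ 1.0, 0.0, 0.0],
                   [ 0.7, 1.0, 0.0],
                   [-0.4, 0.3, 1.0]])
X = rng.standard_normal((5000, 3)) @ A_true.T

def fit_linear_component(X, k):
    """Estimate (beta, L_kk) for the k-th component
    S_k(x) = L_kk (beta . x_{1:k-1} + x_k) via least squares."""
    Xp, xk = X[:, :k], X[:, k]
    beta = np.linalg.lstsq(Xp, -xk, rcond=None)[0]  # min ||Xp b + xk||^2
    r = Xp @ beta + xk
    L_kk = 1.0 / np.sqrt(np.mean(r ** 2))           # normalize residual variance
    return beta, L_kk

# assemble the lower-triangular L component by component
L = np.zeros((3, 3))
L[0, 0] = 1.0 / np.sqrt(np.mean(X[:, 0] ** 2))      # first component has no parents
for k in (1, 2):
    beta, Lkk = fit_linear_component(X, k)
    L[k, :k], L[k, k] = Lkk * beta, Lkk

# L^T L should recover the precision matrix of pi
P_true = np.linalg.inv(A_true @ A_true.T)
assert np.allclose(L.T @ L, P_true, atol=0.2)
```

Each component is fit independently, which is exactly the decoupling that makes the problem embarrassingly parallel across k.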

  10. Learning Transport Maps with Sparse Structure

Proposed Approach
◮ Add an ℓ1-penalty for sparse linear regression (LASSO):
    β̂ ∈ argmin_β (1/(2N)) ‖x_{1:k-1} β + x_k‖_2^2 + λ_N ‖β‖_1

Existing Work in Filtering
◮ Learn the bandwidth of the inverse covariance C^{-1} using BIC [Ueno, 2009]
◮ Add an ℓ1-penalty to the negative log-likelihood of C^{-1} [Hou, 2016]
◮ Band or taper the Cholesky factor of C^{-1} [Nino-Ruiz, 2018]

Maps Generalize to Non-Gaussian Densities
◮ Parametrize monotone nonlinear maps using
    S_k(x_1, ..., x_k) = Σ_j β_j ψ_j(x_{1:k-1}) + ∫_0^{x_k} h_α(x_{1:k-1}, t) dt
◮ Add an ℓ1-penalty to learn the sparsity of the β, α parameters
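The ℓ1-penalized problem for one component can be solved with proximal gradient descent (ISTA), where the ℓ1 term becomes a soft-thresholding step. A self-contained sketch (the solver, data, and parameter choices are illustrative, not the authors' implementation), on a problem where x_k truly depends on only one of its 50 potential parents:

```python
import numpy as np

def lasso_ista(A, b, lam, n_iter=500):
    """Solve min_beta (1/(2N)) ||A beta + b||^2 + lam ||beta||_1
    by ISTA: gradient step on the smooth part, then soft-thresholding."""
    N, p = A.shape
    step = N / np.linalg.norm(A, 2) ** 2    # 1 / Lipschitz const of the gradient
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = A.T @ (A @ beta + b) / N
        z = beta - step * grad
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta

rng = np.random.default_rng(5)
N, p = 200, 50
Xp = rng.standard_normal((N, p))            # parent variables x_{1:k-1}
beta_true = np.zeros(p)
beta_true[0] = -0.9                          # x_k depends on x_1 only
# model: residual x_{1:k-1} beta + x_k is noise, so x_k = -Xp beta_true + eps
xk = -Xp @ beta_true + 0.3 * rng.standard_normal(N)

beta_hat = lasso_ista(Xp, xk, lam=0.05)
```

With N ≪ p² candidate interactions, the penalty zeroes out the spurious parents while keeping the true one (up to the usual small LASSO shrinkage bias), which is the "learn rather than impose sparsity" idea of the previous slide.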

  11. Theoretical Performance

Assumptions: sub-Gaussian density π and basis functions ψ_j(x)

Theorem [BZM]
For polynomial maps of degree m with sparsity s, with high probability
    E_π[ D_KL( π(x_k | x_{1:k-1}) || Ŝ_k^# η ) ]  ≲  s^2 m √(log k / N)

Takeaways
◮ Accurate estimation is feasible in high dimensions with N ≪ k
◮ By the factorization property of the density, the error in the conditionals ensures
    D_KL( π || Ŝ^# η )  ≲  d s^2 m √(log d / N)
◮ ℓ2 regularization requires N = O(k) samples for each component

  12. Numerical Results

◮ Map components: S_k(x) = Σ_j β_j ψ_j(x_{1:k-1}) + α_k x_k
◮ Solve the ℓ1-penalized problem to estimate the map coefficients
◮ Compare to an oracle (known sparsity) and to no regularization

Setup: total-order degree-2 Hermite basis ψ_j with random coefficients
[Figures: error with increasing N; error with increasing d]

In practice, this accuracy extends to maps with nonlinear diagonal functions.

  13. Transport Maps for Posterior Inference

Linear Gaussian Problem
◮ Prior: x ~ N(μ, Σ_pr) with exponential covariance
◮ Likelihood: local observations y = Hx + ε with ε ~ N(0, Γ)

Takeaway
◮ Learning a sparse prior-to-posterior map matches the oracle scaling

  14. Two Approaches for Posterior Sampling

◮ Invert the conditional map:  x | y* ~ ( Ŝ^x(y*, ·)^{-1} )_# η
◮ Compose maps:  x | y* ~ T̂_# π_{y,x}  for  T̂ = Ŝ^x(y*, ·)^{-1} ∘ Ŝ^x

Takeaway
◮ Propagating the forecast through the composed maps has lower error

  15. Performance of Stochastic Maps

Lorenz-96 Model
◮ d = 40 with F = 8, Δt_obs = 0.4, and 20 observations

Takeaway
◮ The best estimators adapt their complexity to extract information from the samples

  16. Conclusion and Outlook

Summary
◮ Learned sparse transport maps for prior-to-posterior transformations
◮ Regularization via map sparsity extends to the nonlinear case
◮ Demonstrated logarithmic dependence of the sample size on the dimension

Outlook on Future Work
◮ Exploration of sparse nonlinear transports in filtering applications
◮ Relate approximation errors to RMSE and to metrics on distributions

  17. Thank You

Supported by the Air Force Office of Scientific Research
