

Recurrent Structures in System Identification
Antônio H. Ribeiro
Universidade Federal de Minas Gerais - UFMG
Escola de Engenharia
Programa de Pós-Graduação em Engenharia Elétrica - PPGEE
Supervisor: Luis A. Aguirre
July 19, 2017


One-step-ahead Prediction vs Free-run Simulation (System Identification Procedure)

Nonlinear difference equation:

    y[k] = F(y[k−1], y[k−2], y[k−3], u[k−1], u[k−2], u[k−3]; Θ).

The same equation can be iterated in two ways: one-step-ahead prediction, which feeds the model past measured outputs, or free-run simulation, which feeds the model its own past outputs.

Antônio H. Ribeiro (UFMG), Recurrent Structures in System Identification, July 19, 2017.
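The distinction between the two modes can be made concrete with a short Python sketch. The map F below is a made-up, bounded stand-in for an identified model, not the one from the presentation:

```python
import numpy as np

def F(y1, y2, y3, u1, u2, u3):
    # Hypothetical bounded nonlinear map standing in for F(.; Theta).
    return np.tanh(0.5 * y1 - 0.2 * y2 + 0.1 * y3 + u1 + 0.2 * u2 - 0.1 * u3)

def one_step_ahead(y, u):
    # Prediction: the model is fed past *measured* outputs y[k-1..k-3].
    yhat = np.zeros(len(y))
    for k in range(3, len(y)):
        yhat[k] = F(y[k-1], y[k-2], y[k-3], u[k-1], u[k-2], u[k-3])
    return yhat

def free_run(y0, u):
    # Simulation: the model feeds back its *own* past outputs (recurrent).
    ys = np.zeros(len(u))
    ys[:3] = y0                      # initial conditions
    for k in range(3, len(u)):
        ys[k] = F(ys[k-1], ys[k-2], ys[k-3], u[k-1], u[k-2], u[k-3])
    return ys
```

In general the free-run output accumulates model error, while the one-step-ahead prediction is re-anchored to measurements at every step; the two coincide only when the measured data match the model's own simulation.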


Parameter Estimation (Prediction Error Methods, General Framework)

Noise model ⇒ optimal predictor: ŷ[k] = E{y[k] | k−1}.
Compute errors: e[k] = ŷ[k] − y[k].
Find the parameter Θ such that the sum of squared errors is minimized:

    min_Θ Σ_k ‖e[k]‖².

Figure: Parameter estimation framework.
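For a predictor that is linear in the parameters, the minimization above reduces to ordinary least squares. A minimal sketch, where the first-order system and its coefficients are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y[k] = 0.7*y[k-1] + 0.5*u[k-1] + small white noise.
N = 200
u = rng.standard_normal(N)
y = np.zeros(N)
for k in range(1, N):
    y[k] = 0.7 * y[k-1] + 0.5 * u[k-1] + 0.01 * rng.standard_normal()

# One-step-ahead predictor yhat[k] = theta1*y[k-1] + theta2*u[k-1]:
# minimizing sum_k e[k]^2 is then a linear least-squares problem.
Phi = np.column_stack([y[:-1], u[:-1]])       # regressor matrix
theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
```

The estimated `theta` recovers the coefficients (0.7, 0.5) up to the noise level.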


NARX Model (Prediction Error Methods)

NARX (Nonlinear AutoRegressive with eXogenous input) model.

True system:

    y[k] = F(y[k−1], y[k−2], y[k−3], u[k−1], u[k−2], u[k−3]; Θ) + v[k],   v[k] white noise.

Optimal predictor (one-step-ahead prediction):

    ŷ₁[k] = F(y[k−1], y[k−2], y[k−3], u[k−1], u[k−2], u[k−3]; Θ).

Figure: NARX model prediction error.

NOE Model (Prediction Error Methods)

NOE (Nonlinear Output Error) model.

True system:

    y*[k] = F(y*[k−1], y*[k−2], y*[k−3], u[k−1], u[k−2], u[k−3]; Θ),
    y[k] = y*[k] + w[k],   w[k] white noise.

Optimal predictor (free-run simulation):

    ŷ_s[k] = F(ŷ_s[k−1], ŷ_s[k−2], ŷ_s[k−3], u[k−1], u[k−2], u[k−3]; Θ).

Figure: NOE model prediction error.

NARMAX Model (Prediction Error Methods)

NARMAX (Nonlinear AutoRegressive Moving Average with eXogenous input) model.

True system:

    y[k] = F(y[k−1], y[k−2], y[k−3], u[k−1], u[k−2], u[k−3], v[k−1], v[k−2], v[k−3]; Θ) + v[k],   v[k] white noise.

Optimal predictor:

    ŷ[k] = F(y[k−1], y[k−2], y[k−3], u[k−1], u[k−2], u[k−3],
             y[k−1] − ŷ[k−1], y[k−2] − ŷ[k−2], y[k−3] − ŷ[k−3]; Θ).
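The NARMAX predictor is itself recurrent through the residuals: past prediction errors y[k−i] − ŷ[k−i] stand in for the unmeasured noise terms v[k−i]. A sketch of the recursion, for an arbitrary user-supplied F with three output, input, and noise lags:

```python
import numpy as np

def narmax_predict(F, y, u):
    # One-step-ahead NARMAX predictor: past residuals approximate v[k-i].
    yhat = y.astype(float).copy()   # yhat[0:3] := y[0:3], so early residuals are 0
    for k in range(3, len(y)):
        r1 = y[k-1] - yhat[k-1]
        r2 = y[k-2] - yhat[k-2]
        r3 = y[k-3] - yhat[k-3]
        yhat[k] = F(y[k-1], y[k-2], y[k-3],
                    u[k-1], u[k-2], u[k-3],
                    r1, r2, r3)
    return yhat
```

With all moving-average (residual) coefficients zero, the recursion collapses to the NARX one-step-ahead predictor.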


Figure: NARMAX model prediction error.

Recurrent Structures in System Identification (Motivation for this Dissertation)

Figure: Prediction depends only on measured values.
Figure: Predictor has a recurrent structure.

Challenges: unboundedness; multiple minima.

Nonlinear Least Squares Problem (Nonlinear Least Squares)

Minimizing the sum of squared errors:

    min_Θ V(Θ) = ‖e(Θ)‖².

Objective Function Derivatives (Nonlinear Least Squares)

Derivatives:

    ∇V(Θ) = J(Θ)ᵀ e(Θ),
    ∇²V(Θ) = Jᵀ(Θ) J(Θ) + Σ_{i=1}^{N_e} ∇²e_i(Θ) e_i(Θ).


Algorithms (Nonlinear Least Squares)

Iterative algorithms: starting at Θ₀, update the solution

    Θ_{n+1} = Θ_n + ΔΘ_n.

Gauss-Newton:

    ΔΘ = −μ [Jᵀ(Θ) J(Θ)]⁻¹ J(Θ)ᵀ e(Θ),

where μ is the step length, Jᵀ J the Hessian approximation, and Jᵀ e the gradient.

Levenberg-Marquardt:

    ΔΘ = −[Jᵀ(Θ) J(Θ) + λD]⁻¹ J(Θ)ᵀ e(Θ).
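A single Levenberg-Marquardt update can be written in a few lines. Here D is taken as diag(JᵀJ), one common choice; λ → 0 with μ = 1 recovers the Gauss-Newton step:

```python
import numpy as np

def lm_step(J, e, lam):
    # dTheta = -(J^T J + lam*D)^{-1} J^T e, with D = diag(J^T J).
    JtJ = J.T @ J
    D = np.diag(np.diag(JtJ))
    return -np.linalg.solve(JtJ + lam * D, J.T @ e)
```

For residuals that are linear in Θ, e(Θ) = AΘ − b, a single step with λ = 0 from Θ = 0 already lands on the least-squares solution.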


“Parallel Training Considered Harmful?”

Parallel vs Series-parallel Training (“Parallel Training Considered Harmful?”)

Parallel training ⇒ NOE model;
Series-parallel training ⇒ NARX model.

Literature Review (“Parallel Training Considered Harmful?”)

Alleged advantages of series-parallel training; series-parallel is to be preferred according to [Narendra and Parthasarathy, 1990]:
1. Bounded signals;
2. Smaller computational cost;
3. The simulated output should tend to the real one, therefore the results should not be significantly different;
4. More accurate inputs to the neural network during training.

* Ribeiro, A. H., and Aguirre, L. A. (2017). “Parallel Training Considered Harmful?”: Comparing Series-Parallel and Parallel Feedforward Network Training. arXiv preprint arXiv:1706.07119.


References (“Parallel Training Considered Harmful?”)

Narendra, K. S. and Parthasarathy, K. (1990). Identification and control of dynamical systems using neural networks. IEEE Transactions on Neural Networks, 1(1):4–27.
Zhang, D.-y., Sun, L.-p., and Cao, J. (2006). Modeling of temperature-humidity for wood drying based on time-delay neural network. Journal of Forestry Research, 17(2):141–144.
Singh, M., Singh, I., and Verma, A. (2013). Identification on non linear series-parallel model using neural network. MIT Int. J. Electr. Instrumen. Eng, 3(1):21–23.
Beale, M. H., Hagan, M. T., and Demuth, H. B. (2017). Neural network toolbox for use with MATLAB. Technical report, Mathworks.
Diaconescu, E. (2008). The use of NARX neural networks to predict chaotic time series. WSEAS Transactions on Computer Research, 3(3):182–191.

Saad, M., Bigras, P., Dessaint, L.-A., and Al-Haddad, K. (1994). Adaptive robot control using neural networks. IEEE Transactions on Industrial Electronics, 41(2):173–181.
Saggar, M., Meriçli, T., Andoni, S., and Miikkulainen, R. (2007). System identification for the Hodgkin-Huxley model using artificial neural networks. In Neural Networks, 2007. IJCNN 2007. International Joint Conference on, pages 2239–2244. IEEE.
Warwick, K. and Craddock, R. (1996). An introduction to radial basis functions for system identification. A comparison with other neural network methods. In Decision and Control, 1996. Proceedings of the 35th IEEE Conference on, volume 1, pages 464–469. IEEE.
Kamiński, W., Strumiłło, P., and Tomczak, E. (1996). Genetic algorithms and artificial neural networks for description of thermal deterioration processes. Drying Technology, 14(9):2117–2133.

Rahman, M. F., Devanathan, R., and Kuanyi, Z. (2000). Neural network approach for linearizing control of nonlinear process plants. IEEE Transactions on Industrial Electronics, 47(2):470–477.
Petrović, E., Ćojbašić, Ž., Ristić-Durrant, D., Nikolić, V., Ćirić, I., and Matić, S. (2013). Kalman filter and NARX neural network for robot vision based human tracking. Facta Universitatis, Series: Automatic Control and Robotics, 12(1):43–51.
Tijani, I. B., Akmeliawati, R., Legowo, A., and Budiyono, A. (2014). Nonlinear identification of a small scale unmanned helicopter using optimized NARX network with multiobjective differential evolution. Engineering Applications of Artificial Intelligence, 33:99–115.
Khan, E. A., Elgamal, M. A., and Shaarawy, S. M. (2015). Forecasting the number of muslim pilgrims using NARX neural networks with a comparison study with other modern methods. British Journal of Mathematics & Computer Science, 6(5):394.


Dynamic Systems Present During Identification (Parallel Training and Unbounded Signals)

The following dynamic systems are present during the system identification procedure:
1. True system;
2. Predictor;
3. Estimated model.


Feedforward Network (Neural Network Training)

Figure: Three-layer feedforward network.



Computational Cost per Stage (Complexity Analysis)

Table: Complexity analysis.

Stage (Levenberg-Marquardt)                  | Series-parallel    | Parallel
Compute error vector e                       | O(N·N_w)           | O(N·N_w)
Compute Jacobian matrix J                    | O(N·N_w·N_y)       | O(N·N_Θ·N_y²)
Parameter update ΔΘ = −(JᵀJ + λD)⁻¹ Jᵀe      | O(N·N_Θ² + N_Θ³)   | O(N·N_Θ² + N_Θ³)

Relation: N_y < N_y² < N_w ≈ N_Θ.





Computer Generated Example (Comparing Parallel and Series-parallel Models)

Problem statement: generate data using the following system [Chen et al., 1990]:

    y*[k] = (0.8 − 0.5 exp(−y*[k−1]²)) y*[k−1] − (0.3 + 0.9 exp(−y*[k−1]²)) y*[k−2]
            + u[k−1] + 0.2 u[k−2] + 0.1 u[k−1] u[k−2] + v[k],
    y[k] = y*[k] + w[k].

10 nodes in the hidden layer;
800 samples for identification and 200 samples for validation;
Compare error in the validation window.

Chen, S., Billings, S. A., and Grant, P. M. (1990). Non-linear system identification using neural networks. International Journal of Control, 51(6):1191–1214.
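The data-generating system can be simulated directly. The sketch below follows the equations above; the input signal and noise seed are arbitrary illustrative choices:

```python
import numpy as np

def chen_system(u, sigma_v=0.0, sigma_w=0.0, seed=0):
    # Simulate the Chen et al. (1990) benchmark: process noise v[k] enters
    # the dynamics, measurement noise w[k] corrupts the observed output.
    rng = np.random.default_rng(seed)
    ystar = np.zeros(len(u))
    for k in range(2, len(u)):
        g = np.exp(-ystar[k-1] ** 2)
        ystar[k] = ((0.8 - 0.5 * g) * ystar[k-1]
                    - (0.3 + 0.9 * g) * ystar[k-2]
                    + u[k-1] + 0.2 * u[k-2]
                    + 0.1 * u[k-1] * u[k-2]
                    + sigma_v * rng.standard_normal())
    return ystar + sigma_w * rng.standard_normal(len(u))
```

Sweeping `sigma_v` and `sigma_w` over a grid with this kind of generator is how noise-level comparisons like the MSE plots can be produced.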


Computer Generated Example (Comparing Parallel and Series-parallel Models)

Figure: MSE (mean square error) vs noise levels on the validation window for parallel training (•) and series-parallel training (×). (a) σ_v = 0; (b) σ_w = 0.

Computer Generated Example (Comparing Parallel and Series-parallel Models)

Table: Running time.

N_hidden | N            | Parallel training | Series-parallel training
10       | 1000 samples | 3.7 s             | 3.1 s
30       | 1000 samples | 6.4 s             | 5.7 s
10       | 5000 samples | 14.6 s            | 11.0 s
30       | 5000 samples | 18.5 s            | 17.5 s

Computer Generated Example (Comparing Parallel and Series-parallel Models)

Figure: Sum of squared simulation errors ‖e_s‖² per epoch k for Levenberg-Marquardt (LM), conjugate gradient (CG), and BFGS.

Optimization Methods and Unboundedness

Gradient Descent Applied to a Linear System (Optimization Methods and Unboundedness)

First-order linear system:

    ŷ[k] = θ₁ ŷ[k−1] + θ₂ u[k−1].

Figure: Set of parameters (θ₁, θ₂) that yield a bounded solution ŷ[k].
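For this first-order predictor the bounded-solution set in the figure is simply |θ₁| < 1 (the free-run feedback pole); θ₂ only scales the input. A quick numerical check:

```python
import numpy as np

def free_run_first_order(theta1, theta2, u, y0=0.0):
    # Free-run simulation yhat[k] = theta1*yhat[k-1] + theta2*u[k-1].
    yhat = np.zeros(len(u))
    yhat[0] = y0
    for k in range(1, len(u)):
        yhat[k] = theta1 * yhat[k-1] + theta2 * u[k-1]
    return yhat

u = np.ones(200)
bounded = free_run_first_order(0.9, 1.0, u)    # |theta1| < 1: settles at 1/(1-0.9) = 10
diverges = free_run_first_order(1.1, 1.0, u)   # |theta1| > 1: blows up geometrically
```

This is exactly why an unconstrained optimizer step that pushes θ₁ past 1 makes the simulation error explode.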


Classes of Algorithms that Can Cope with Unboundedness (Optimization Methods and Unboundedness)

Trust-region methods;
Levenberg-Marquardt;
Backtracking line search;
Pattern search.
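What these methods share is a step-acceptance mechanism: a candidate step that makes the free-run simulation diverge (non-finite objective) can simply be rejected and retried with a shorter step or heavier damping. A backtracking sketch; the objective V and the constants are illustrative, not from the presentation:

```python
import numpy as np

def backtracking_step(V, theta, d, beta=0.5, max_tries=30):
    # Shrink the step along direction d until the objective is finite and
    # decreases; a non-finite V (simulation blew up) is treated like an
    # increase and triggers further backtracking.
    V0 = V(theta)
    t = 1.0
    for _ in range(max_tries):
        candidate = theta + t * d
        Vc = V(candidate)
        if np.isfinite(Vc) and Vc < V0:
            return candidate
        t *= beta
    return theta  # no acceptable step found; keep the current point
```

Levenberg-Marquardt achieves the same effect by raising λ, and trust-region methods by shrinking the region radius.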

Multiple Shooting

Motivation (Shooting Methods for Parameter Estimation of Output Error Models)

Multiple shooting applications:
1. Boundary value problems;
2. ODE parameter estimation;
3. Optimal control.

Advantages: escapes local minima; better numerical stability; can be implemented in parallel.

Ribeiro, A. H., and Aguirre, L. A. (2017). Shooting Methods for Parameter Estimation of Output Error Models. IFAC World Congress (Toulouse, France, 2017).


Single Shooting (Shooting Methods for Parameter Estimation of Output Error Models)

Single shooting: estimate the NOE model by solving

    min_Θ ‖e_s‖².

Figure: The initial conditions are represented with circles (○) and subsequent simulated values with diamonds (◇).
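Multiple shooting, by contrast, splits the data into segments, simulates each from its own decision-variable initial condition, and penalizes the mismatch between the end of one segment and the start of the next; the segment simulations are independent, which is what allows parallel implementation. A sketch for a first-order model with a scalar state; the function and argument names here are hypothetical:

```python
import numpy as np

def multiple_shooting_residuals(simulate, theta, y, u, seg_len, x0s, rho=10.0):
    # Residual vector: per-segment simulation errors plus penalized
    # continuity mismatches between consecutive segments.
    # `simulate(theta, x0, u_seg)` is an assumed free-run simulator for one
    # segment started from the (scalar) initial condition x0.
    res = []
    for i, x0 in enumerate(x0s):
        s, e = i * seg_len, min((i + 1) * seg_len, len(y))
        y_sim = simulate(theta, x0, u[s:e])
        res.append(y_sim - y[s:e])                   # fit error on segment i
        if i + 1 < len(x0s):                         # continuity constraint
            res.append([rho * (y_sim[-1] - x0s[i + 1])])
    return np.concatenate(res)
```

With the true parameter and consistent segment initial conditions all residuals vanish; single shooting is recovered as the special case of one segment.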
