

SLIDE 1

Recurrent Structures in System Identification

Antônio H. Ribeiro

Universidade Federal de Minas Gerais - UFMG
Escola de Engenharia
Programa de Pós-Graduação em Engenharia Elétrica - PPGEE

Supervisor: Luis A. Aguirre

July 19, 2017

SLIDE 2

Overview

1 Introduction
2 "Parallel Training Considered Harmful?"
3 Optimization Methods and Unboundedness
4 Multiple Shooting
5 Conclusion

SLIDE 3

Introduction

SLIDE 4

Problem Statement

What is System Identification?

Figure: The system identification problem.

SLIDE 5

Typical Steps

System Identification Procedure

1 Test design and data collection;
2 Choice of mathematical representation:
  Dynamic representation; Approximation function; Noise model.
3 Choice of model order and structure;
4 Estimation of model parameters;
5 Model validation:
  Validation data. One-step-ahead prediction vs free-run simulation.

SLIDE 8

Dynamic Representation

System Identification Procedure

Nonlinear Difference Equation

y[k] = F(y[k − 1], y[k − 2], y[k − 3], u[k − 1], u[k − 2], u[k − 3]; Θ).
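To make the notation concrete, below is a minimal sketch of one such map F in Python; the polynomial regressor set and the coefficient vector theta are illustrative assumptions, not the structure used in the dissertation.

```python
import numpy as np

def F(y_past, u_past, theta):
    """A hypothetical nonlinear F with lags ny = nu = 3.

    y_past = (y[k-1], y[k-2], y[k-3]); u_past likewise.
    The regressor set below is an arbitrary illustrative choice.
    """
    y1, y2, y3 = y_past
    u1, u2, u3 = u_past
    phi = np.array([y1, y2, y3, u1, u2, u3, u1 * u2, y1 * u1])
    return float(theta @ phi)

theta = np.array([0.5, -0.2, 0.1, 1.0, 0.2, 0.0, 0.1, 0.05])
print(F((0.3, 0.1, 0.0), (1.0, 0.5, 0.0), theta))
```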


SLIDE 16

Noise Model

System Identification Procedure

Output Error, Equation Error and Errors-in-Variables

u[k] = u∗[k] + s[k],
y∗[k] = F(y∗[k − 1], y∗[k − 2], y∗[k − 3], u[k − 1], u[k − 2], u[k − 3]) + v[k],
y[k] = y∗[k] + w[k].
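As a concrete illustration of this noise structure, the sketch below simulates data with input noise s, equation noise v, and output noise w; the first-order linear F, the noise levels, and the input signal are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200
u_star = rng.uniform(-1.0, 1.0, N)     # noise-free input
s = 0.05 * rng.standard_normal(N)      # input measurement noise
v = 0.05 * rng.standard_normal(N)      # equation (process) noise
w = 0.05 * rng.standard_normal(N)      # output measurement noise

u = u_star + s                         # measured input
y_star = np.zeros(N)
for k in range(1, N):
    # first-order linear F, chosen only for illustration
    y_star[k] = 0.7 * y_star[k - 1] + 0.5 * u[k - 1] + v[k]
y = y_star + w                         # measured output
```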


SLIDE 25

Validation Data

System Identification Procedure

Figure: Comparison between measured data and predicted values.


SLIDE 27

One-step-ahead Prediction vs Free-run Simulation

System Identification Procedure

Nonlinear Difference Equation

y[k] = F(y[k − 1], y[k − 2], y[k − 3], u[k − 1], u[k − 2], u[k − 3]; Θ).

Panels: One-step-ahead Prediction; Free-run Simulation.
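The distinction is easiest to see in code. In the minimal sketch below, model stands for any fitted map F with three output and three input lags (like the hypothetical F sketched earlier); the one-step-ahead predictor feeds back measured outputs, while the free-run simulator feeds back its own past outputs.

```python
import numpy as np

def one_step_ahead(model, y, u):
    """ŷ1[k] = F(y[k-1], y[k-2], y[k-3], u[k-1], u[k-2], u[k-3]; Θ)."""
    y_hat = np.array(y, dtype=float)        # first 3 samples: measured data
    for k in range(3, len(y)):
        y_hat[k] = model((y[k - 1], y[k - 2], y[k - 3]),
                         (u[k - 1], u[k - 2], u[k - 3]))
    return y_hat

def free_run(model, y0, u, N):
    """ŷs[k] = F(ŷs[k-1], ŷs[k-2], ŷs[k-3], ...): feeds back its own output."""
    y_hat = np.zeros(N)
    y_hat[:3] = y0                          # only the initial conditions
    for k in range(3, N):
        y_hat[k] = model((y_hat[k - 1], y_hat[k - 2], y_hat[k - 3]),
                         (u[k - 1], u[k - 2], u[k - 3]))
    return y_hat
```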

SLIDE 34

Parameter Estimation

Prediction Error Methods

Figure: Parameter estimation framework.

General Framework

Noise model ⇒ optimal predictor: ŷ[k] = E{y[k] | k − 1}.
Compute errors: e[k] = ŷ[k] − y[k].
Find the parameter Θ such that the sum of squared errors is minimized:

min_Θ ∑_k e[k]²
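A minimal prediction-error fit for the first-order linear case, assuming SciPy is available; the data-generating model and noise level are placeholders for this sketch.

```python
import numpy as np
from scipy.optimize import least_squares

# Placeholder data from y[k] = 0.7 y[k-1] + 0.5 u[k-1] + noise.
rng = np.random.default_rng(1)
N = 300
u = rng.uniform(-1, 1, N)
y = np.zeros(N)
for k in range(1, N):
    y[k] = 0.7 * y[k - 1] + 0.5 * u[k - 1] + 0.02 * rng.standard_normal()

def one_step_errors(theta):
    """e[k] = ŷ[k] - y[k], with ŷ the one-step-ahead predictor."""
    y_hat = theta[0] * y[:-1] + theta[1] * u[:-1]
    return y_hat - y[1:]

fit = least_squares(one_step_errors, x0=np.zeros(2))
print(fit.x)                               # close to (0.7, 0.5)
```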

SLIDE 37

NARX Model

Prediction Error Methods

NARX (Nonlinear AutoRegressive with eXogenous input) model.

True system

y[k] = F(y[k − 1], y[k − 2], y[k − 3], u[k − 1], u[k − 2], u[k − 3]; Θ) + v[k], with v[k] white noise.

Optimal Predictor

One-step-ahead prediction:
ŷ1[k] = F(y[k − 1], y[k − 2], y[k − 3], u[k − 1], u[k − 2], u[k − 3]; Θ).

SLIDE 38

NARX Model

Prediction Error Methods

Figure: NARX model prediction error.

SLIDE 39

NOE Model

Prediction Error Methods

NOE (Nonlinear Output Error) model.

True system

y∗[k] = F(y∗[k − 1], y∗[k − 2], y∗[k − 3], u[k − 1], u[k − 2], u[k − 3]; Θ),
y[k] = y∗[k] + w[k], with w[k] white noise.

Optimal Predictor

Free-run simulation:
ŷs[k] = F(ŷs[k − 1], ŷs[k − 2], ŷs[k − 3], u[k − 1], u[k − 2], u[k − 3]; Θ).

SLIDE 40

NOE Model

Prediction Error Methods

Figure: NOE model prediction error.

SLIDE 41

NARMAX Model

Prediction Error Methods

NARMAX (Nonlinear AutoRegressive Moving Average with eXogenous input) model.

True system

y[k] = F(y[k − 1], y[k − 2], y[k − 3], u[k − 1], u[k − 2], u[k − 3], v[k − 1], v[k − 2], v[k − 3]; Θ) + v[k], with v[k] white noise.

Optimal Predictor

ŷv[k] = F(y[k − 1], y[k − 2], y[k − 3], u[k − 1], u[k − 2], u[k − 3],
          y[k − 1] − ŷ[k − 1], y[k − 2] − ŷ[k − 2], y[k − 3] − ŷ[k − 3]; Θ).
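A sketch of the residual feedback in this predictor, assuming a fitted map model that also accepts three lagged residuals; all names here are illustrative.

```python
import numpy as np

def narmax_predict(model, y, u):
    """ŷv[k] = F(y[k-1..k-3], u[k-1..k-3], e[k-1..k-3]; Θ),
    where e[k] = y[k] - ŷv[k] are past one-step residuals."""
    N = len(y)
    y_hat = np.array(y, dtype=float)       # first 3 samples from data
    e = np.zeros(N)                        # residuals initialized at zero
    for k in range(3, N):
        y_hat[k] = model((y[k - 1], y[k - 2], y[k - 3]),
                         (u[k - 1], u[k - 2], u[k - 3]),
                         (e[k - 1], e[k - 2], e[k - 3]))
        e[k] = y[k] - y_hat[k]
    return y_hat
```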

SLIDE 44

NARMAX Model

Prediction Error Methods

Figure: NARMAX model prediction error.

SLIDE 45

Recurrent Structures in System Identification

Motivation for this Dissertation

Figure: Prediction depends only on measured values.
Figure: Predictor has a recurrent structure.

Challenges

Unboundedness; multiple minima.

SLIDE 46

Nonlinear Least Squares Problem

Nonlinear Least Squares

Minimize the sum of squared errors:

min_Θ V(Θ) = ‖e(Θ)‖²

SLIDE 47

Objective Function Derivatives

Nonlinear Least Squares

Derivatives:

∇V(Θ) = J(Θ)ᵀ e(Θ),
∇²V(Θ) = J(Θ)ᵀ J(Θ) + ∑_{i=1}^{Ne} ei(Θ) ∇²ei(Θ).
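A quick numerical check of the gradient identity, using a toy residual vector invented for this sketch; note that the formula ∇V = Jᵀe corresponds to the convention V = ½‖e‖².

```python
import numpy as np

def e(theta):                              # toy residuals (illustrative)
    return np.array([theta[0] ** 2 - 1.0,
                     theta[0] * theta[1] - 2.0,
                     np.sin(theta[1])])

def J(theta):                              # Jacobian of e, derived by hand
    return np.array([[2 * theta[0], 0.0],
                     [theta[1], theta[0]],
                     [0.0, np.cos(theta[1])]])

theta = np.array([0.8, 1.3])
V = lambda t: 0.5 * np.dot(e(t), e(t))     # ½‖e‖², so that ∇V = Jᵀe exactly

h = 1e-6                                   # central finite differences
fd = np.array([(V(theta + h * d) - V(theta - h * d)) / (2 * h)
               for d in np.eye(2)])
print(fd, J(theta).T @ e(theta))           # the two should agree closely
```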

SLIDE 49

Algorithms

Nonlinear Least Squares

Iterative algorithms. Starting at Θ0, update the solution: Θn+1 = Θn + ∆Θn.

Gauss-Newton:
∆Θ = −µ [J(Θ)ᵀJ(Θ)]⁻¹ J(Θ)ᵀ e(Θ)
(µ: step length; J(Θ)ᵀJ(Θ): Hessian approximation; J(Θ)ᵀe(Θ): gradient)

Levenberg-Marquardt:
∆Θ = −[J(Θ)ᵀJ(Θ) + λD]⁻¹ J(Θ)ᵀ e(Θ)
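A minimal sketch of the Levenberg-Marquardt iteration under these formulas, with D = I and a fixed λ as simplifying assumptions (practical implementations adapt λ at each step); the exponential-fit residuals are invented for the example.

```python
import numpy as np

# Toy residuals: fit y ≈ θ1 exp(θ2 x) to synthetic data (illustrative).
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * x)

def e(theta):
    return theta[0] * np.exp(theta[1] * x) - y

def J(theta):
    return np.column_stack([np.exp(theta[1] * x),
                            theta[0] * x * np.exp(theta[1] * x)])

def lm_step(theta, lam):
    """∆Θ = -(JᵀJ + λD)⁻¹ Jᵀe, with D = I, solved as a linear system."""
    Jk = J(theta)
    A = Jk.T @ Jk + lam * np.eye(len(theta))
    return np.linalg.solve(A, -(Jk.T @ e(theta)))

theta = np.array([1.0, 0.0])
for _ in range(50):
    theta = theta + lm_step(theta, lam=1e-2)
print(theta)                               # converges near (2.0, -1.5)
```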

SLIDE 52

"Parallel Training Considered Harmful?"

SLIDE 53

Parallel vs Series-parallel Training

"Parallel Training Considered Harmful?"

Parallel training ⇒ NOE model;
Series-parallel training ⇒ NARX model.

SLIDE 54

Literature Review

"Parallel Training Considered Harmful?"

Series-parallel training: alleged advantages

Series-parallel to be preferred [Narendra and Parthasarathy, 1990]:

1 Bounded signals;
2 Smaller computational cost;
3 Simulated output should tend to the real one, therefore the results should not be significantly different;
4 More accurate inputs to the neural network during training.

* Ribeiro, A. H. and Aguirre, L. A. (2017). "Parallel Training Considered Harmful?": Comparing Series-Parallel and Parallel Feedforward Network Training. arXiv preprint arXiv:1706.07119.

SLIDE 58

References

"Parallel Training Considered Harmful?"

Narendra, K. S. and Parthasarathy, K. (1990). Identification and control of dynamical systems using neural networks. IEEE Transactions on Neural Networks, 1(1):4–27.
Zhang, D.-y., Sun, L.-p., and Cao, J. (2006). Modeling of temperature-humidity for wood drying based on time-delay neural network. Journal of Forestry Research, 17(2):141–144.
Singh, M., Singh, I., and Verma, A. (2013). Identification on non linear series-parallel model using neural network. MIT Int. J. Electr. Instrumen. Eng, 3(1):21–23.
Beale, M. H., Hagan, M. T., and Demuth, H. B. (2017). Neural network toolbox for use with MATLAB. Technical report, Mathworks.
Diaconescu, E. (2008). The use of NARX neural networks to predict chaotic time series. WSEAS Transactions on Computer Research, 3(3):182–191.

SLIDE 59

References

"Parallel Training Considered Harmful?"

Saad, M., Bigras, P., Dessaint, L.-A., and Al-Haddad, K. (1994). Adaptive robot control using neural networks. IEEE Transactions on Industrial Electronics, 41(2):173–181.
Saggar, M., Meriçli, T., Andoni, S., and Miikkulainen, R. (2007). System identification for the Hodgkin-Huxley model using artificial neural networks. In Neural Networks, 2007. IJCNN 2007. International Joint Conference on, pages 2239–2244. IEEE.
Warwick, K. and Craddock, R. (1996). An introduction to radial basis functions for system identification. A comparison with other neural network methods. In Decision and Control, 1996. Proceedings of the 35th IEEE Conference on, volume 1, pages 464–469. IEEE.
Kamiński, W., Strumiłło, P., and Tomczak, E. (1996). Genetic algorithms and artificial neural networks for description of thermal deterioration processes. Drying Technology, 14(9):2117–2133.

SLIDE 60

References

"Parallel Training Considered Harmful?"

Rahman, M. F., Devanathan, R., and Kuanyi, Z. (2000). Neural network approach for linearizing control of nonlinear process plants. IEEE Transactions on Industrial Electronics, 47(2):470–477.
Petrović, E., Ćojbašić, Ž., Ristić-Durrant, D., Nikolić, V., Ćirić, I., and Matić, S. (2013). Kalman filter and NARX neural network for robot vision based human tracking. Facta Universitatis, Series: Automatic Control and Robotics, 12(1):43–51.
Tijani, I. B., Akmeliawati, R., Legowo, A., and Budiyono, A. (2014). Nonlinear identification of a small scale unmanned helicopter using optimized NARX network with multiobjective differential evolution. Engineering Applications of Artificial Intelligence, 33:99–115.
Khan, E. A., Elgamal, M. A., and Shaarawy, S. M. (2015). Forecasting the number of Muslim pilgrims using NARX neural networks with a comparison study with other modern methods. British Journal of Mathematics & Computer Science, 6(5):394.


SLIDE 62

Dynamic Systems Present During Identification

Parallel Training and Unbounded Signals

The following dynamic systems are present during the system identification procedure:

1 True system;
2 Predictor;
3 Estimated model.

SLIDE 64

Feedforward Network

Neural Network Training

Figure: Three-layer feedforward network.


SLIDE 68

Computational Cost per Stage

Complexity Analysis

Table: Complexity analysis, per stage of the Levenberg-Marquardt method.

Stage                                        Series-parallel     Parallel
Compute error vector e                       O(N·Nw)             O(N·Nw)
Compute Jacobian matrix J                    O(N·Nw·Ny)          O(N·NΘ·Ny²)
Parameter update ∆Θ = −(JᵀJ + λD)⁻¹ Jᵀe      O(N·NΘ² + NΘ³)      O(N·NΘ² + NΘ³)

Relation: Ny < Ny² < Nw ≈ NΘ




SLIDE 79

Computer Generated Example

Comparing Parallel and Series-parallel Models

Problem Statement

Generate data using the following system [Chen et al., 1990]:

y∗[k] = (0.8 − 0.5 exp(−y∗[k − 1]²)) y∗[k − 1] − (0.3 + 0.9 exp(−y∗[k − 1]²)) y∗[k − 2]
        + u[k − 1] + 0.2 u[k − 2] + 0.1 u[k − 1] u[k − 2] + v[k],
y[k] = y∗[k] + w[k].

10 nodes in the hidden layer; 800 samples for identification and 200 samples for validation; compare error on the validation window.

Chen, S., Billings, S. A., and Grant, P. M. (1990). Non-linear system identification using neural networks. International Journal of Control, 51(6):1191–1214.
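A sketch of the data-generating step for this experiment; the input signal, seed, and function interface are assumptions made for illustration, with the noise levels σv and σw left as parameters.

```python
import numpy as np

def chen_system(N, sigma_v, sigma_w, seed=0):
    """Simulate the Chen et al. (1990) benchmark with equation noise v
    and measurement noise w."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(-1.0, 1.0, N)
    v = sigma_v * rng.standard_normal(N)
    w = sigma_w * rng.standard_normal(N)
    y_star = np.zeros(N)
    for k in range(2, N):
        a = 0.8 - 0.5 * np.exp(-y_star[k - 1] ** 2)
        b = 0.3 + 0.9 * np.exp(-y_star[k - 1] ** 2)
        y_star[k] = (a * y_star[k - 1] - b * y_star[k - 2]
                     + u[k - 1] + 0.2 * u[k - 2]
                     + 0.1 * u[k - 1] * u[k - 2] + v[k])
    return u, y_star + w

u, y = chen_system(1000, sigma_v=0.1, sigma_w=0.1)   # 800 id + 200 validation
```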

SLIDE 83

Computer Generated Example

Comparing Parallel and Series-parallel Models

Figure: MSE (mean square error) vs noise level on the validation window for parallel training (•) and series-parallel training (×). Panel (a): σw varied with σv = 0; panel (b): σv varied with σw = 0.

SLIDE 84

Computer Generated Example

Comparing Parallel and Series-parallel Models

Table: Running time.

Nhidden   N              Parallel training   Series-parallel training
10        1000 samples   3.7 s               3.1 s
30        1000 samples   6.4 s               5.7 s
10        5000 samples   14.6 s              11.0 s
30        5000 samples   18.5 s              17.5 s

SLIDE 85

Computer Generated Example

Comparing Parallel and Series-parallel Models

Figure: Sum of squared simulation errors ‖es‖² per epoch for Levenberg-Marquardt (LM), conjugate gradient (CG), and BFGS.

SLIDE 86

Optimization Methods and Unboundedness

SLIDE 87

Gradient Descent Applied to a Linear System

Optimization Methods and Unboundedness

First-Order Linear System

ŷ[k] = θ1 ŷ[k − 1] + θ2 u[k − 1]

Figure: Set of parameters (θ1, θ2) that yield a bounded solution ŷ[k].
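The boundedness problem shows up directly in the free-run cost: the simulated output of this model stays bounded only for |θ1| < 1, so an optimizer step that leaves that region makes the cost explode. A minimal sketch (the true system, input, and horizon are arbitrary choices):

```python
import numpy as np

def free_run_cost(theta, u, y):
    """Sum of squared simulation errors for ŷ[k] = θ1 ŷ[k-1] + θ2 u[k-1]."""
    y_hat = np.zeros_like(y)
    for k in range(1, len(y)):
        y_hat[k] = theta[0] * y_hat[k - 1] + theta[1] * u[k - 1]
    return np.sum((y_hat - y) ** 2)

rng = np.random.default_rng(2)
u = rng.uniform(-1, 1, 200)
y = np.zeros(200)
for k in range(1, 200):
    y[k] = 0.9 * y[k - 1] + 0.5 * u[k - 1]   # true system: θ = (0.9, 0.5)

print(free_run_cost(np.array([0.95, 0.5]), u, y))   # moderate: |θ1| < 1
print(free_run_cost(np.array([1.10, 0.5]), u, y))   # explodes: |θ1| > 1
```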

SLIDE 91

Class of Algorithms that Can Cope with Unboundedness

Optimization Methods and Unboundedness

Trust-region methods; Levenberg-Marquardt; backtracking line search; pattern search.
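What these methods have in common is the ability to reject and shrink a trial step whose cost is worse, including the case where the free-run cost overflows to inf or nan. A minimal backtracking sketch under that assumption:

```python
import numpy as np

def backtracking_step(cost, theta, direction, step=1.0, shrink=0.5, tries=30):
    """Shrink the step until the trial cost is finite and decreases.

    cost may return inf or nan when the simulated output diverges;
    such trial points are simply rejected, keeping the iterates inside
    the region where the predictor stays bounded.
    """
    c0 = cost(theta)
    for _ in range(tries):
        trial = theta + step * direction
        c = cost(trial)
        if np.isfinite(c) and c < c0:
            return trial
        step *= shrink
    return theta          # no acceptable step found; keep the current iterate
```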

SLIDE 92

Multiple Shooting

SLIDE 93

Motivation

Shooting Methods for Parameter Estimation of Output Error Models

Multiple Shooting

Applications:
1 Boundary value problems;
2 ODE parameter estimation;
3 Optimal control.

Escape local minima; better numerical stability; can be implemented in parallel.

Ribeiro, A. H. and Aguirre, L. A. (2017). Shooting Methods for Parameter Estimation of Output Error Models. IFAC World Congress (Toulouse, France, 2017).

SLIDE 100

Single Shooting

Shooting Methods for Parameter Estimation of Output Error Models

Figure: The initial conditions are represented with circles and subsequent simulated values with diamonds.

Single Shooting

Estimate the NOE model by solving:

min_Θ ‖es‖²

SLIDE 101

Multiple Shooting

Shooting Methods for Parameter Estimation of Output Error Models

Figure: Three consecutive simulations ŷ(i), i = 1, 2, 3, are indicated with different colors. The initial conditions are represented with circles and subsequent simulated values with diamonds.

Multiple Shooting

ms subdivisions; ŷ(i) ⇒ i-th simulation; es(i) ⇒ i-th error.

ems = [es(1); … ; es(ms)]  (the per-subdivision errors stacked into one vector)
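A sketch of how the stacked error ems can be computed; model is a one-lag map here for brevity, and the per-subdivision initial conditions y0s are the extra decision variables of the extended parameter vector. All names are illustrative.

```python
import numpy as np

def multiple_shooting_errors(model, theta, y0s, u, y, ms):
    """Stack the free-run errors of ms independent short simulations.

    y0s holds one initial condition per subdivision (they belong to the
    extended parameter Φ); each subdivision is simulated in free run
    and compared with its slice of the measured data.
    """
    N = len(y)
    bounds = np.linspace(0, N, ms + 1, dtype=int)
    errors = []
    for i in range(ms):
        a, b = bounds[i], bounds[i + 1]
        y_hat = np.empty(b - a)
        y_hat[0] = y0s[i]                  # subdivision initial condition
        for k in range(a + 1, b):
            y_hat[k - a] = model(y_hat[k - a - 1], u[k - 1], theta)
        errors.append(y_hat - y[a:b])
    return np.concatenate(errors)          # ems
```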

SLIDE 105

Multiple Shooting

Shooting Methods for Parameter Estimation of Output Error Models

Figure: Three consecutive simulations ŷ(i), i = 1, 2, 3, are indicated with different colors. The initial conditions are represented with circles and subsequent simulated values with diamonds.

Multiple Shooting

ems = es if the initial conditions match the previous ones.

SLIDE 106

Multiple Shooting

Shooting Methods for Parameter Estimation of Output Error Models

Single Shooting

Estimate the NOE model by solving:

min_Θ ‖es‖²

Multiple Shooting

Estimate the NOE model by solving:

min_Φ ‖ems‖²
subject to: ŷ(i)[end] = y(i+1),  i = 1, …, ms − 1

SLIDE 107

Multiple Shooting

Shooting Methods for Parameter Estimation of Output Error Models

Single Shooting

Parameter: Θ.

Multiple Shooting

Extended parameter: Φ = [Θ; y(1); … ; y(ms)], i.e. the model parameters stacked with the subdivision initial conditions.

SLIDE 108

Numerical Example

Multiple Shooting for Coping with Local Minima

Logistic Map

A dataset with 300 samples was generated using the logistic map: y[k] = θ y[k − 1](1 − y[k − 1]), for θ = 3.78.

Figure: Cost ‖ems‖² as a function of θ̂, for ms = 1.

SLIDE 109

Figure: Cost ‖ems‖² as a function of θ̂, and histogram of the estimates θ̂ (frequency, log scale), for ms = 30.

SLIDE 110

Figure: Cost ‖ems‖² as a function of θ̂, and histogram of the estimates θ̂ (frequency, log scale), for ms = 100.

SLIDE 111

Figure: Cost ‖ems‖² as a function of θ̂, and histogram of the estimates θ̂ (frequency, log scale), for ms = 300.
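A sketch of how this cost landscape flattens as ms grows, with the simplification that the subdivision initial conditions are fixed at the measured data instead of being estimated; the initial condition of the map and the grid of θ̂ values printed are arbitrary choices.

```python
import numpy as np

theta_true, N = 3.78, 300
y = np.empty(N)
y[0] = 0.5                                  # arbitrary initial condition
for k in range(1, N):
    y[k] = theta_true * y[k - 1] * (1.0 - y[k - 1])

def cost(theta_hat, ms):
    """‖ems‖² with subdivision initial conditions taken from the data."""
    L = N // ms                             # samples per subdivision
    total = 0.0
    for a in range(0, N, L):
        y_hat = y[a]                        # initial condition from data
        for k in range(a + 1, min(a + L, N)):
            y_hat = theta_hat * y_hat * (1.0 - y_hat)
            total += (y_hat - y[k]) ** 2
    return total

for th in np.linspace(3.5, 4.0, 6):
    print(f"{th:.2f}  ms=1: {cost(th, 1):10.2f}   ms=100: {cost(th, 100):8.2f}")
```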

SLIDE 112

Conclusion

SLIDE 113

Future Work

Penalty method ⇒ Byrd-Omojokun SQP method.

Structure selection procedure using l1 regularization:

min_Θ ‖e‖²₂ + µ‖Θ‖₁

And its application to multiple shooting:

min_Φ ½‖ems‖² + µ‖Θ‖₁
subject to: ŷ(i)[end] = y(i+1),  i = 1, …, ms − 1.

SLIDE 116

The End