PRINCIPAL COMPONENT ANALYSIS (PCA) Singular Value Decomposition - PowerPoint PPT Presentation



slide-1
SLIDE 1

MODEL IDENTIFICATION

BY GRADIENT METHODS

  • Dr. Julien Billeter

Laboratoire d'Automatique Ecole Polytechnique Fédérale de Lausanne (EPFL) MLS-S03 | 2013-2014

slide-2
SLIDE 2

MLS-S03 MODEL IDENTIFICATION BY GRADIENT METHODS 2

MODEL IDENTIFICATION BY GRADIENT METHODS

  • DYNAMIC MODELS

– Conservation of Mass (Concentration Measurements)
– Conservation of Energy (Calorimetry)
– Beer’s Law (Spectroscopy)

  • INTEGRATION OF DYNAMIC MODELS

– Euler’s Method
– Runge-Kutta Methods (RK)

  • LINEAR REGRESSION (OLS) PROBLEMS

– Calibration-free Calorimetry and Spectroscopy

  • GRADIENT-BASED NONLINEAR REGRESSION (NLR) METHODS

– Steepest Descent Method (SD)
– Newton-Raphson and Newton-Gauss Methods (NG)
– Newton-Gauss Levenberg-Marquardt Method (NGLM)

  • REFERENCES
slide-3
SLIDE 3

SCALAR, VECTOR AND MATRIX NOTATION

  • Scalars

(1 × 1) = number of dimension 1, written in lowercase/UPPERCASE italics: $\omega, \Omega, a, A$

  • Vectors

(n × 1) = n-dimensional array (column vector), written in lowercase boldface: $\boldsymbol{\omega}, \mathbf{a}$

  • Matrices

(n × m) = array of dimensions n (rows) by m (columns), written in UPPERCASE BOLDFACE: $\boldsymbol{\Omega}, \mathbf{A}$

slide-4
SLIDE 4

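SCALAR, VECTOR AND MATRIX OPERATIONS

  • Scalar multiplication: $\alpha\,\mathbf{a},\ \alpha\,\mathbf{A}$
  • Addition: $\mathbf{a} + \mathbf{b},\ \mathbf{A} + \mathbf{B}$
  • Multiplication: $\mathbf{a}^{\mathrm{T}}\mathbf{b},\ \mathbf{A}\,\mathbf{B}$ (inner dimensions must agree)
  • Transposition: $\mathbf{a}^{\mathrm{T}},\ \mathbf{A}^{\mathrm{T}}$
  • Inverse (identity matrix): $\mathbf{A}^{-1}\mathbf{A} = \mathbf{A}\,\mathbf{A}^{-1} = \mathbf{I}$
  • Rank and null space (kernel): $\mathrm{rank}(\mathbf{A}),\ \ker(\mathbf{A}) = \{\mathbf{x} : \mathbf{A}\,\mathbf{x} = \mathbf{0}\}$
  • Rank-nullity theorem: $\dim(\mathbf{A}) = \mathrm{rank}(\mathbf{A}) + \mathrm{nullity}(\mathbf{A})$, where $\dim(\mathbf{A})$ is the number of columns of $\mathbf{A}$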
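The rank-nullity theorem can be checked numerically. The sketch below is only an illustration: the matrix is a made-up example, and taking the null-space basis from the trailing right singular vectors is one common choice, not something prescribed by the slides.

```python
import numpy as np

# Hypothetical 3 x 3 matrix whose second row is twice the first (rank 2)
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

rank = np.linalg.matrix_rank(A)

# The right singular vectors beyond the rank span the kernel of A
_, s, Vt = np.linalg.svd(A)
null_basis = Vt[rank:].T          # columns form a basis of ker(A)
nullity = null_basis.shape[1]

# dim(A) = number of columns = rank + nullity
n_columns = A.shape[1]
```

Here `rank + nullity` equals the number of columns, and `A @ null_basis` is zero to within rounding.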

slide-5
SLIDE 5

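PRINCIPAL COMPONENT ANALYSIS (PCA)

  • Singular Value Decomposition (SVD) is a method to decompose a matrix $\mathbf{Y}$ into a product of orthonormal column ($\mathbf{U}$) and row ($\mathbf{V}^{\mathrm{T}}$) singular vectors weighted by singular values ($\mathbf{S}$).

$$\mathbf{Y} = \mathbf{U}\,\mathbf{S}\,\mathbf{V}^{\mathrm{T}} \quad \text{with} \quad \boldsymbol{\Lambda} = \mathbf{S}^{2}$$

  • Principal Component Analysis (PCA) is a method to reduce the dimensionality of a matrix to its number of significant singular values.

$$\mathbf{Y} \approx \bar{\mathbf{Y}} = \bar{\mathbf{U}}\,\bar{\mathbf{S}}\,\bar{\mathbf{V}}^{\mathrm{T}} \quad \text{with} \quad \text{noise} = \mathbf{Y} - \bar{\mathbf{Y}}$$

where the bar marks the factors truncated to the significant singular values.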
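The truncation described above can be sketched with NumPy. The data matrix, the noise level and the threshold used to decide which singular values are "significant" are all assumptions made for this illustration; the slides do not specify a selection rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical rank-2 data: 50 time points x 30 wavelengths, plus small noise
C = rng.random((50, 2))
A = rng.random((2, 30))
Y = C @ A + 1e-4 * rng.standard_normal((50, 30))

# Economy-size SVD: Y = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(Y, full_matrices=False)

# Count the singular values that stand clearly above the noise floor
# (the relative threshold 1e-2 is an arbitrary choice for this example)
n_sig = int(np.sum(s > 1e-2 * s[0]))

# Truncated reconstruction and the part of Y attributed to noise
Y_bar = U[:, :n_sig] @ np.diag(s[:n_sig]) @ Vt[:n_sig, :]
noise = Y - Y_bar
```

With these settings two singular values are retained, matching the two species used to build the data.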

slide-6
SLIDE 6

LAW OF CONSERVATION OF MASS

  • “Nothing is lost, nothing is created, everything is transformed” – Lavoisier (1743-1794)

Total mass:

$$m(t) = \mathbf{1}_S^{\mathrm{T}}\,\mathbf{m}(t) = \mathbf{1}_S^{\mathrm{T}}\,\mathbf{M}_w\,\mathbf{n}(t)$$

Mole and concentration balances:

$$\dot{\mathbf{n}}(t) = \mathbf{N}^{\mathrm{T}}\,V(t)\,\mathbf{r}(t) \pm \mathbf{W}_m\,\boldsymbol{\zeta}(t) + \mathbf{W}_{in}\,\mathbf{u}_{in}(t) - \omega(t)\,\mathbf{n}(t), \qquad \mathbf{n}(0) = \mathbf{n}_0$$

$$\dot{\mathbf{c}}(t) = \mathbf{N}^{\mathrm{T}}\,\mathbf{r}(t) \pm \frac{\mathbf{W}_m}{V(t)}\,\boldsymbol{\zeta}(t) + \frac{\mathbf{W}_{in}}{V(t)}\,\mathbf{u}_{in}(t) - \left(\omega(t) + \frac{\dot{V}(t)}{V(t)}\right)\mathbf{c}(t), \qquad \mathbf{c}(0) = \mathbf{c}_0$$

with

$$\omega(t) = \frac{u_{out}(t)}{m(t)}, \qquad m(t) = \rho(t)\,V(t) \qquad \text{and} \qquad \dot{m}(t) = \mathbf{1}^{\mathrm{T}}\mathbf{u}_{in}(t) \pm \mathbf{1}^{\mathrm{T}}\boldsymbol{\zeta}(t) - u_{out}(t)$$

slide-7
SLIDE 7

LAW OF CONSERVATION OF ENERGY

  • “Any theory which demands the annihilation of energy, is necessarily erroneous” – Joule (1818-1889)

$$q_{acc}(t) = \mathbf{1}^{\mathrm{T}}\,\mathbf{q}(t)$$

$$m(t)\,c_p(t)\,\dot{T}(t) = q_r \pm q_m + q_{ex} + q_{in} - q_{out} + q_h - q_{loss}, \qquad T(0) = T_0$$

with

$$q_{ex}(t) = UA\,\bigl(T_j - T(t)\bigr), \qquad q_r(t) = V(t)\,(-\Delta\mathbf{h}_r)^{\mathrm{T}}\,\mathbf{r}(t), \qquad q_m(t) = (-\Delta\mathbf{h}_m)^{\mathrm{T}}\,\boldsymbol{\zeta}(t), \qquad q_{out}(t) = c_p(t)\,u_{out}(t)\,T(t)$$

slide-8
SLIDE 8

BEER’S LAW

  • “The absorbance of a solution is proportional to the product of its concentration and the distance light travels through it” – Beer (1825-1863), Lambert (1728-1777) and Bouguer (1698-1758)

$$\mathbf{Y} = \mathbf{C}\,\mathbf{A}$$

with $\mathbf{Y}$ (nt × nw), $\mathbf{C} = [\mathbf{c}(t_1), \ldots, \mathbf{c}(t_{nt})]^{\mathrm{T}}$ (nt × S) and $\mathbf{A} = [\mathbf{a}(w_1), \ldots, \mathbf{a}(w_{nw})]$ (S × nw)

Units conversion: $A = -\log_{10}(T)$, $T = I / I_0$
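As a small illustration of the bilinear form above, the sketch below builds an absorbance matrix from concentration profiles and pure-component spectra, then converts to and from transmittance. All numbers are invented for the example, and the path length is assumed to be already folded into the spectra.

```python
import numpy as np

# Hypothetical example: nt = 3 time points, S = 2 species, nw = 4 wavelengths
C = 1e-3 * np.array([[1.0, 0.0],
                     [0.5, 0.5],
                     [0.0, 1.0]])            # concentrations (nt x S)
A = np.array([[100.0, 200.0, 50.0, 0.0],
              [0.0, 20.0, 80.0, 120.0]])      # pure spectra (S x nw)

# Beer's law in matrix form: each row of Y is the spectrum at one time
Y = C @ A                                     # absorbance (nt x nw)

# Units conversion between absorbance and transmittance T = I / I0
T = 10.0 ** (-Y)
Y_back = -np.log10(T)
```

The middle row of `Y` is the average of the first and last rows, because the mixture at that time is half of each species.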

slide-9
SLIDE 9

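NUMERICAL INTEGRATION OF ODE’S

h: integration stepsize

  • Euler’s method (implicit, explicit) was invented by the Swiss mathematician Euler (1707-1783)

$$\mathbf{y}_{i+1} = \mathbf{y}_i + h\,\dot{\mathbf{y}}_i + O(h^2)$$

  • Runge-Kutta’s methods (RK2, RK4, explicit, implicit) were elaborated by Runge (1856-1927) and Kutta (1867-1944)

RK2:

$$\mathbf{y}_{i+1} = \mathbf{y}_i + h\,\dot{\mathbf{y}}_{i+\frac{1}{2}} + O(h^3)$$

$$\text{with} \quad \mathbf{y}_{i+\frac{1}{2}} = \mathbf{y}_i + \tfrac{h}{2}\,\dot{\mathbf{y}}(t_i, \mathbf{y}_i) \quad \text{and} \quad \dot{\mathbf{y}}_{i+\frac{1}{2}} = \dot{\mathbf{y}}\bigl(t_i + \tfrac{h}{2},\ \mathbf{y}_{i+\frac{1}{2}}\bigr)$$

RK4:

$$\mathbf{y}_{i+1} = \mathbf{y}_i + \tfrac{h}{6}\,(\mathbf{k}_1 + 2\,\mathbf{k}_2 + 2\,\mathbf{k}_3 + \mathbf{k}_4) + O(h^5)$$

$$\text{with} \quad \mathbf{k}_1 = \dot{\mathbf{y}}(t_i, \mathbf{y}_i), \quad \mathbf{k}_2 = \dot{\mathbf{y}}\bigl(t_i + \tfrac{h}{2},\ \mathbf{y}_i + \tfrac{h}{2}\mathbf{k}_1\bigr), \quad \mathbf{k}_3 = \dot{\mathbf{y}}\bigl(t_i + \tfrac{h}{2},\ \mathbf{y}_i + \tfrac{h}{2}\mathbf{k}_2\bigr), \quad \mathbf{k}_4 = \dot{\mathbf{y}}(t_i + h,\ \mathbf{y}_i + h\,\mathbf{k}_3)$$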
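The explicit Euler and classical RK4 steps can be compared on a first-order decay, for which the exact solution is known. This is a minimal sketch: the rate constant, the step count and the helper names `euler_step`, `rk4_step` and `integrate` are illustrative choices, not part of the lecture.

```python
import math

def euler_step(f, t, y, h):
    """One explicit Euler step: y_{i+1} = y_i + h * f(t_i, y_i)."""
    return y + h * f(t, y)

def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(step, f, y0, t0, t1, n_steps):
    """Apply a one-step method n_steps times with constant stepsize h."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y = step(f, t, y, h)
        t += h
    return y

# First-order decay dc/dt = -k c with a hypothetical k = 1 and c(0) = 1
def decay(t, c):
    return -1.0 * c

exact = math.exp(-1.0)
err_euler = abs(integrate(euler_step, decay, 1.0, 0.0, 1.0, 10) - exact)
err_rk4 = abs(integrate(rk4_step, decay, 1.0, 0.0, 1.0, 10) - exact)
```

With the same ten steps, the RK4 error is several orders of magnitude smaller than the Euler error, which reflects the difference in local truncation order.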

slide-10
SLIDE 10

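REGRESSION PROBLEMS

  • A regression problem consists in minimizing the difference between measured output variables $\mathbf{y}(t)$ and modeled output variables $\hat{\mathbf{y}}(t, \mathbf{p}_f, \mathbf{p}_g)$ (the objective/cost function $\phi$) by postulating a dynamic model $\mathbf{f}(t, \mathbf{p}_f)$ and an output model $\mathbf{g}(\mathbf{c}(t, \mathbf{p}_f), \mathbf{p}_g)$, and adjusting the parameters $\mathbf{p}_f$ (and $\mathbf{p}_g$).

$$\{\mathbf{p}_f, \mathbf{p}_g\}^{*} = \arg\min_{\mathbf{p}_f,\,\mathbf{p}_g}\ \phi\bigl(\mathbf{y}(t),\ \hat{\mathbf{y}}(t, \mathbf{p}_f, \mathbf{p}_g)\bigr) \quad \text{s.t.} \quad \dot{\mathbf{c}}(t, \mathbf{p}_f) = \mathbf{f}(t, \mathbf{p}_f), \quad \hat{\mathbf{y}}(t, \mathbf{p}_f, \mathbf{p}_g) = \mathbf{g}\bigl(\mathbf{c}(t, \mathbf{p}_f), \mathbf{p}_g\bigr)$$

  • In least-squares problems, $\phi$ is defined as the sum of squared residuals ($ssq = \mathrm{vec}(\mathbf{R})^{\mathrm{T}}\,\mathrm{vec}(\mathbf{R})$ with $\mathbf{R} = \mathbf{Y} - \hat{\mathbf{Y}}$) and the following matrices are defined:

$$\mathbf{Y} = [\mathbf{y}(t_1), \ldots, \mathbf{y}(t_{nt})]^{\mathrm{T}}, \qquad \hat{\mathbf{Y}} = [\hat{\mathbf{y}}(t_1, \mathbf{p}_f, \mathbf{p}_g), \ldots, \hat{\mathbf{y}}(t_{nt}, \mathbf{p}_f, \mathbf{p}_g)]^{\mathrm{T}}, \qquad \mathbf{C} = [\mathbf{c}(t_1, \mathbf{p}_f), \ldots, \mathbf{c}(t_{nt}, \mathbf{p}_f)]^{\mathrm{T}}$$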

slide-11
SLIDE 11

SYSTEMS OF LINEAR EQUATIONS

  • A system of linear equations can be written in matrix form

$$S: \begin{cases} a_{1,1}\,x_1 + \cdots + a_{1,n}\,x_n = y_1 \\ \qquad\vdots \\ a_{m,1}\,x_1 + \cdots + a_{m,n}\,x_n = y_m \end{cases} \quad \Longleftrightarrow \quad \mathbf{A}\,\mathbf{x} = \mathbf{y}$$

with $\mathbf{A}$ (m × n), $\mathbf{x}$ (n × 1) the regressors and $\mathbf{y}$ (m × 1) the regressands

  • The number of solutions of S is:

∞ when m < n (underdetermined system)
1 when m = n (determined system, $\mathbf{A}$ invertible): $\mathbf{x} = \mathbf{A}^{-1}\mathbf{y}$
0 in general when m > n (overdetermined system, solved in the least-squares sense)

slide-12
SLIDE 12

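LINEAR REGRESSION (LR, OLS)

  • For univariate data (data organized in a vector y), a linear model relating the n independent variables (regressors, x) to the m > n dependent variables (regressands, y) can be constructed as:

$$\mathbf{y} = \mathbf{A}\,\mathbf{x} \quad \text{with} \quad \mathbf{A}\ (m \times n),\ \mathbf{x}\ (n \times 1) \quad \text{and} \quad \mathbf{y}\ (m \times 1)$$

$$\mathbf{x}^{*} = \arg\min_{\mathbf{x}} \left\{ \mathrm{vec}(\mathbf{A}\mathbf{x} - \mathbf{y})^{\mathrm{T}}\,\mathrm{vec}(\mathbf{A}\mathbf{x} - \mathbf{y}) \right\} = \mathbf{A}^{+}\mathbf{y} \quad \text{with} \quad \mathbf{A}^{+} = (\mathbf{A}^{\mathrm{T}}\mathbf{A})^{-1}\mathbf{A}^{\mathrm{T}}$$

The left pseudo-inverse $\mathbf{A}^{+}$ exists only if $\mathrm{rank}(\mathbf{A}) = \dim(\mathbf{A}) = n$.

  • For multivariate data (data organized in a matrix Y), the linear model relating the n ∙ w regressors X to the m ∙ w regressands Y is built as:

$$\mathbf{Y} = \mathbf{A}\,\mathbf{X} \quad \text{with} \quad \mathbf{X}\ (n \times w) \quad \text{and} \quad \mathbf{Y}\ (m \times w)$$

$$\mathbf{X}^{*} = \arg\min_{\mathbf{X}} \left\{ \mathrm{vec}(\mathbf{A}\mathbf{X} - \mathbf{Y})^{\mathrm{T}}\,\mathrm{vec}(\mathbf{A}\mathbf{X} - \mathbf{Y}) \right\} = \mathbf{A}^{+}\mathbf{Y}$$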
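The closed-form least-squares solution can be sketched as follows. The matrices are made-up, noise-free examples so that the true parameters are recovered exactly; forming the pseudo-inverse explicitly mirrors the formula, while `np.linalg.lstsq` is shown as the numerically preferable route in practice.

```python
import numpy as np

# Hypothetical overdetermined system: m = 5 equations, n = 2 unknowns
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
x_true = np.array([2.0, -0.5])
y = A @ x_true

# The left pseudo-inverse needs full column rank: rank(A) = n
full_column_rank = np.linalg.matrix_rank(A) == A.shape[1]

# A+ = (A^T A)^-1 A^T, then x* = A+ y
A_plus = np.linalg.inv(A.T @ A) @ A.T
x_star = A_plus @ y

# Multivariate case: the same A+ solves all w columns of Y at once
X_true = np.array([[2.0, 1.0, 0.0],
                   [-0.5, 0.3, 1.0]])
Y = A @ X_true
X_star = A_plus @ Y

# Library routine that avoids forming (A^T A)^-1 explicitly
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)
```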

slide-13
SLIDE 13

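LEFT OR RIGHT PSEUDO-INVERSE ?

  • Left pseudo-inverse

$$\mathbf{X}^{*} = \arg\min_{\mathbf{X}} \left\{ \mathrm{vec}(\mathbf{A}\mathbf{X} - \mathbf{Y})^{\mathrm{T}}\,\mathrm{vec}(\mathbf{A}\mathbf{X} - \mathbf{Y}) \right\} = \mathbf{A}^{+}\mathbf{Y} \quad \text{with} \quad \mathbf{A}^{+} = (\mathbf{A}^{\mathrm{T}}\mathbf{A})^{-1}\mathbf{A}^{\mathrm{T}}, \quad \mathrm{rank}(\mathbf{A}) = \dim(\mathbf{A})$$

Spectroscopy: $\mathbf{Y} = \mathbf{C}\,\mathbf{A} \ \Rightarrow\ \mathbf{A}^{*} = \mathbf{C}^{+}\mathbf{Y}$, requires $\mathrm{rank}(\mathbf{C}) = S$

Calorimetry: $\mathbf{q}_r = \mathbf{R}_v\,(-\Delta\mathbf{h}_r) \ \Rightarrow\ (-\Delta\mathbf{h}_r)^{*} = \mathbf{R}_v^{+}\,\mathbf{q}_r$, requires $\mathrm{rank}(\mathbf{R}_v) = R$

with $\mathbf{Y}$ (nt × nw), $\mathbf{C}$ (nt × S), $\mathbf{A}$ (S × nw), $\mathbf{q}_r$ (nt × 1), $\mathbf{R}_v$ (nt × R), $\Delta\mathbf{h}_r$ (R × 1)

  • Right pseudo-inverse

$$\mathbf{A}^{*} = \arg\min_{\mathbf{A}} \left\{ \mathrm{vec}(\mathbf{A}\mathbf{X} - \mathbf{Y})^{\mathrm{T}}\,\mathrm{vec}(\mathbf{A}\mathbf{X} - \mathbf{Y}) \right\} = \mathbf{Y}\,\mathbf{X}^{+} \quad \text{with} \quad \mathbf{X}^{+} = \mathbf{X}^{\mathrm{T}}(\mathbf{X}\mathbf{X}^{\mathrm{T}})^{-1}, \quad \mathrm{rank}(\mathbf{X}) = \dim(\mathbf{X})$$

Spectroscopy: $\mathbf{Y} = \mathbf{C}\,\mathbf{A} \ \Rightarrow\ \mathbf{C}^{*} = \mathbf{Y}\,\mathbf{A}^{+}$, requires $\mathrm{rank}(\mathbf{A}) = S$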
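The two pseudo-inverses can be contrasted on the spectroscopic model Y = C A. The sketch assumes random full-rank concentration profiles and spectra, chosen only so that both inverses exist; dimensions follow the slide (nt × S and S × nw).

```python
import numpy as np

rng = np.random.default_rng(1)
nt, S, nw = 20, 2, 15

# Hypothetical full-rank concentration profiles and pure spectra
C = rng.random((nt, S))     # tall matrix, rank S
A = rng.random((S, nw))     # wide matrix, rank S
Y = C @ A

# Left pseudo-inverse of the tall matrix C: C+ C = I_S
C_left = np.linalg.inv(C.T @ C) @ C.T      # (S x nt)
A_est = C_left @ Y                         # recovers the spectra

# Right pseudo-inverse of the wide matrix A: A A+ = I_S
A_right = A.T @ np.linalg.inv(A @ A.T)     # (nw x S)
C_est = Y @ A_right                        # recovers the concentrations
```

The left form multiplies from the left and solves for the right-hand factor; the right form does the opposite, which is why the choice depends on which factor of the product is unknown.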

slide-14
SLIDE 14

EXPLICIT vs IMPLICIT CALIBRATION

  • In explicit calibration, a static calibration set is used to construct a calibration model from which concentrations are predicted for dynamic experiments (the subscript cal marks the static calibration set).

$$\mathbf{Y}_{cal} = \mathbf{C}_{cal}\,\mathbf{A} \ \Rightarrow\ \hat{\mathbf{A}} = \mathbf{C}_{cal}^{+}\,\mathbf{Y}_{cal} \ \Rightarrow\ \hat{\mathbf{C}} = \mathbf{Y}\,\hat{\mathbf{A}}^{+}$$

  • In implicit calibration (i.e. calibration-free), dynamic experiments are used as an internal calibration set to eliminate the (static) linear counterpart A.

$$\mathbf{Y} = \mathbf{C}\,\mathbf{A} \ \Rightarrow\ \hat{\mathbf{A}} = \hat{\mathbf{C}}^{+}\,\mathbf{Y} \ \Rightarrow\ \hat{\mathbf{Y}} = \hat{\mathbf{C}}\,\hat{\mathbf{C}}^{+}\,\mathbf{Y}$$

Implicit calibration can even be used in the case of rank-deficient data, i.e. when rank(C) < S.

slide-15
SLIDE 15

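NONLINEAR (LEAST SQUARES) REGRESSION (NLR)

  • Unlike linear regression problems, nonlinear regression problems are solved iteratively. Since the linear parameters $\mathbf{p}_g$ can be estimated (eliminated) at each iteration, the optimization problem simplifies:

$$\mathbf{p}_f^{*} = \arg\min_{\mathbf{p}_f}\ ssq\bigl(\mathbf{Y},\ \hat{\mathbf{Y}}(\mathbf{p}_f)\bigr) \quad \text{s.t.} \quad \dot{\mathbf{c}}(t, \mathbf{p}_f) = \mathbf{f}(t, \mathbf{p}_f)$$

  • The (nested) linear regression problem is solved at each iteration as:

$$\hat{\mathbf{Y}}(\mathbf{p}_f, \mathbf{p}_g) = \mathbf{g}_{lin}\bigl(\mathbf{C}(\mathbf{p}_f)\bigr)\,\mathbf{p}_g \ \Rightarrow\ \hat{\mathbf{p}}_g = \mathbf{g}_{lin}^{+}\bigl(\mathbf{C}(\mathbf{p}_f)\bigr)\,\mathbf{Y} \ \Rightarrow\ \hat{\mathbf{Y}}(\mathbf{p}_f) = \mathbf{g}_{lin}\bigl(\mathbf{C}(\mathbf{p}_f)\bigr)\,\mathbf{g}_{lin}^{+}\bigl(\mathbf{C}(\mathbf{p}_f)\bigr)\,\mathbf{Y}$$

with

$$ssq = \mathrm{vec}(\mathbf{R})^{\mathrm{T}}\,\mathrm{vec}(\mathbf{R}), \qquad \mathbf{R} = \mathbf{Y} - \hat{\mathbf{Y}}(\mathbf{p}_f) = \mathbf{Y} - \mathbf{g}_{lin}\bigl(\mathbf{C}(\mathbf{p}_f)\bigr)\,\mathbf{g}_{lin}^{+}\bigl(\mathbf{C}(\mathbf{p}_f)\bigr)\,\mathbf{Y}$$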
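The elimination of the linear parameters can be illustrated on a hypothetical first-order reaction A → B followed by spectroscopy, where the output model is linear in the pure spectra (g_lin(C) = C). The rate constant, the spectra and the grid search over k are assumptions for this sketch; in practice the outer problem would be handed to a gradient method rather than scanned.

```python
import numpy as np

t = np.linspace(0.0, 10.0, 25)

def conc(k):
    """Concentration profiles of A and B for A -> B with c_A(0) = 1."""
    c_a = np.exp(-k * t)
    return np.column_stack([c_a, 1.0 - c_a])       # (nt x S)

# Hypothetical noise-free data generated with k = 0.4
A_true = np.array([[1.0, 0.5, 0.1],
                   [0.2, 0.8, 1.0]])               # (S x nw)
Y = conc(0.4) @ A_true

def residuals(k):
    """R(k) = Y - C C+ Y: the spectra are eliminated by linear regression."""
    C = conc(k)
    A_hat = np.linalg.pinv(C) @ Y                  # nested linear step
    return Y - C @ A_hat

def ssq(k):
    R = residuals(k)
    return float(np.sum(R ** 2))                   # vec(R)^T vec(R)

# Only the nonlinear parameter k remains in the outer problem
k_grid = np.linspace(0.1, 1.0, 91)
k_best = float(k_grid[np.argmin([ssq(k) for k in k_grid])])
```

The objective now depends on a single parameter instead of one rate constant plus six spectral values, which is the simplification the slide refers to.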

slide-16
SLIDE 16

CONVEX SETS AND FUNCTIONS

  • Convex set: A set $C \subset \mathbb{R}^n$ is said to be convex if for every pair of points $\mathbf{x}, \mathbf{y} \in C$, the points

$$\mathbf{z} = \lambda\,\mathbf{x} + (1 - \lambda)\,\mathbf{y}, \quad \forall \lambda \in [0, 1],$$

are also in the set $C$.

  • Convex function: A function $f: C \to \mathbb{R}$ defined on a convex set $C \subset \mathbb{R}^n$ is said to be convex if

$$f\bigl(\lambda\,\mathbf{x} + (1 - \lambda)\,\mathbf{y}\bigr) \le \lambda\,f(\mathbf{x}) + (1 - \lambda)\,f(\mathbf{y})$$

for each $\mathbf{x}, \mathbf{y} \in C$ and $\lambda \in [0, 1]$.

Figures courtesy of B. Chachuat

slide-17
SLIDE 17

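NECESSARY CONDITIONS OF OPTIMALITY (NCO)

  • 1st order NCO: If $\mathbf{x}^{*}$ is a local minimum of $\phi: C \to \mathbb{R}$, then

$$\nabla\phi(\mathbf{x}^{*}) = \mathbf{J}(\mathbf{x}^{*}) = \mathbf{0} \qquad (\mathbf{x}^{*} \text{ is a stationary point})$$

  • 2nd order NCO: If $\mathbf{x}^{*}$ is a local minimum of $\phi: C \to \mathbb{R}$, then

$$\nabla^{2}\phi(\mathbf{x}^{*}) = \mathbf{H}(\mathbf{x}^{*}) \text{ is positive semidefinite}$$

$$\mathbf{H}\mathbf{v} = \lambda\,\mathbf{v} \ \rightarrow\ (\mathbf{H} - \lambda\,\mathbf{I}_n)\,\mathbf{v} = \mathbf{0} \ \rightarrow\ p(\lambda) = \det(\mathbf{H} - \lambda\,\mathbf{I}_n) = 0 \ \rightarrow\ \lambda\text{'s} \ge 0$$

Note: 1st and 2nd order NCO form a set of sufficient conditions if $\phi(\mathbf{x})$ is a convex function defined on a convex set $C \subset \mathbb{R}^n$.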
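Both conditions can be checked numerically for a small least-squares objective φ(x) = ‖A x − y‖², whose gradient and Hessian are known in closed form. The matrix and data below are invented for the illustration.

```python
import numpy as np

# Hypothetical linear least-squares problem: phi(x) = ||A x - y||^2
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 2.0, 2.0])

# Candidate minimum from the pseudo-inverse
x_star = np.linalg.pinv(A) @ y

# 1st order NCO: the gradient 2 A^T (A x - y) vanishes at x*
grad = 2.0 * A.T @ (A @ x_star - y)
first_order_ok = bool(np.allclose(grad, 0.0, atol=1e-10))

# 2nd order NCO: the Hessian 2 A^T A has non-negative eigenvalues
H = 2.0 * A.T @ A
eigenvalues = np.linalg.eigvalsh(H)      # eigvalsh: H is symmetric
second_order_ok = bool(np.all(eigenvalues >= 0.0))
```

Since this objective is convex, a point passing both checks is a global minimum.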

slide-18
SLIDE 18

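STEEPEST (GRADIENT) DESCENT METHOD

  • By definition, the gradient $\nabla\phi(\mathbf{x})$ points in the direction of steepest increase of $\phi(\mathbf{x})$. Hence, a recurrence relation for finding the minimum is:

$$\mathbf{x}_{i+1} = \mathbf{x}_i - \gamma\,\mathbf{J}^{\mathrm{T}}(\mathbf{x}_i)\,\mathbf{r}(\mathbf{x}_i) \quad \text{with a stepsize parameter } \gamma$$

$$\mathbf{J}(\mathbf{x}_i) = \frac{\mathbf{r}(\mathbf{x}_i + \mathrm{d}\mathbf{x}_i) - \mathbf{r}(\mathbf{x}_i)}{\mathrm{d}\mathbf{x}_i} \quad \text{with} \quad \mathbf{x}_i + \mathrm{d}\mathbf{x}_i = (1 + \varepsilon)\,\mathbf{x}_i$$

The finite-difference Jacobian is evaluated column by column, perturbing one parameter at a time.

  • In simple algorithms, the stepsize parameter is fixed.
  • In more sophisticated algorithms, the stepsize parameter is adapted at each iteration so that the decrease of $\phi$ along the current search direction is maximum.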
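A fixed-stepsize steepest descent with a forward-difference Jacobian might look as follows. The residual function, the stepsize γ = 0.1, the relative perturbation ε and the iteration count are all assumptions chosen so that this small example converges; the helper names are hypothetical.

```python
import numpy as np

def jacobian_fd(r, x, eps=1e-6):
    """Forward-difference Jacobian of r(x), one perturbed parameter per column."""
    r0 = r(x)
    J = np.empty((r0.size, x.size))
    for j in range(x.size):
        # Relative perturbation; fall back to an absolute one if x[j] is zero
        dx = eps * x[j] if x[j] != 0.0 else eps
        x_pert = x.copy()
        x_pert[j] += dx
        J[:, j] = (r(x_pert) - r0) / dx
    return J

def steepest_descent(r, x0, gamma, n_iter):
    """Iterate x <- x - gamma * J^T r with a fixed stepsize gamma."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = x - gamma * jacobian_fd(r, x).T @ r(x)
    return x

# Hypothetical linear residuals r(x) = A x - y
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 2.0, 2.0])

def r(x):
    return A @ x - y

x_sd = steepest_descent(r, [1.0, 1.0], gamma=0.1, n_iter=500)
```

The fixed stepsize only works here because it is small relative to the largest eigenvalue of JᵀJ; a larger γ would make the iteration diverge, which is the motivation for adaptive stepsizes.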

slide-19
SLIDE 19

NEWTON-RAPHSON’S METHOD (NR)

  • The method of Newton (1642-1727) – Raphson (1648-1715) is an algorithm for finding iteratively the zeros of a system consisting of m equations and n unknowns:

$$\boldsymbol{\phi}(\mathbf{x}) = \mathbf{0}$$

For m = n = 1, one finds the well-known relation for f(x) = 0:

$$x_{i+1} = x_i - \frac{\phi(x_i)}{\dot{\phi}(x_i)}$$

For m = n, the unique solution is found as:

$$\mathbf{x}_{i+1} = \mathbf{x}_i - \bigl[\nabla\boldsymbol{\phi}(\mathbf{x}_i)\bigr]^{-1}\,\boldsymbol{\phi}(\mathbf{x}_i)$$

For m > n, a solution in the least-squares sense is found as:

$$\mathbf{x}_{i+1} = \mathbf{x}_i - \bigl[\nabla\boldsymbol{\phi}(\mathbf{x}_i)\bigr]^{+}\,\boldsymbol{\phi}(\mathbf{x}_i)$$
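All three cases can be covered by one short routine, because the pseudo-inverse reduces to the ordinary inverse for a square, regular Jacobian. The test functions, starting points and fixed iteration count are illustrative assumptions.

```python
import numpy as np

def newton_raphson(phi, jac, x0, n_iter=20):
    """Iterate x <- x - pinv(jac(x)) @ phi(x) for a fixed number of steps."""
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    for _ in range(n_iter):
        J = np.atleast_2d(jac(x))
        x = x - np.linalg.pinv(J) @ np.atleast_1d(phi(x))
    return x

# m = n = 1: root of x^2 - 2
root_scalar = newton_raphson(lambda x: x ** 2 - 2.0,
                             lambda x: 2.0 * x,
                             1.0)

# m = n = 2: circle of radius 2 intersected with the line x0 = x1
def phi_square(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]])

def jac_square(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]],
                     [1.0, -1.0]])

root_square = newton_raphson(phi_square, jac_square, [1.0, 2.0])

# m = 3 > n = 2: one extra, consistent equation x0 + x1 = 2 sqrt(2)
def phi_over(x):
    return np.append(phi_square(x), x[0] + x[1] - 2.0 * np.sqrt(2.0))

def jac_over(x):
    return np.vstack([jac_square(x), [1.0, 1.0]])

root_over = newton_raphson(phi_over, jac_over, [1.0, 2.0])
```

All three calls converge to √2 (in each component), since the extra equation in the overdetermined case is consistent with the other two.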

slide-20
SLIDE 20

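NEWTON-GAUSS METHOD (NG)

  • The 1st order NCO directly provides a procedure for finding the minimum of a (regression) function, which can be solved using the Newton-Raphson method. This leads to the method of Newton (1642-1727) – Gauss (1777-1855):

$$\text{NCO:} \quad \nabla\phi(\mathbf{x}^{*}) = \mathbf{J}(\mathbf{x}^{*}) = \mathbf{0}$$

$$\text{NR:} \quad \mathbf{x}_{i+1} = \mathbf{x}_i - \bigl[\nabla\mathbf{f}(\mathbf{x}_i)\bigr]^{-1}\,\mathbf{f}(\mathbf{x}_i) \quad \text{with} \quad \mathbf{f}(\mathbf{x}_i) = \nabla\phi(\mathbf{x}_i) = 2\,\mathbf{J}^{\mathrm{T}}(\mathbf{x}_i)\,\mathbf{r}(\mathbf{x}_i), \quad \nabla\mathbf{f}(\mathbf{x}_i) = \nabla^{2}\phi(\mathbf{x}_i) \approx 2\,\mathbf{J}^{\mathrm{T}}(\mathbf{x}_i)\,\mathbf{J}(\mathbf{x}_i)$$

The factors of 2 cancel, so that with $\mathbf{H}(\mathbf{x}_i) \approx \mathbf{J}^{\mathrm{T}}(\mathbf{x}_i)\,\mathbf{J}(\mathbf{x}_i)$:

$$\mathbf{x}_{i+1} = \mathbf{x}_i - \mathbf{H}^{-1}(\mathbf{x}_i)\,\mathbf{J}^{\mathrm{T}}(\mathbf{x}_i)\,\mathbf{r}(\mathbf{x}_i) \approx \mathbf{x}_i - \mathbf{J}^{+}(\mathbf{x}_i)\,\mathbf{r}(\mathbf{x}_i)$$

  • Relative convergence criterion

$$\frac{\left| ssq_{i-1} - ssq_i \right|}{ssq_{i-1}} \le tol$$

Courtesy of M. Maeder and Y.-M. Neuhold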
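A Newton-Gauss iteration with the relative convergence test can be sketched for a single-exponential model y = a·exp(−k t). The model, the noise-free data, the starting guess and the tolerance are assumptions made for this example; residuals follow the convention r = y − ŷ used earlier, so the Jacobian is the derivative of r, not of the model.

```python
import numpy as np

t = np.linspace(0.0, 5.0, 30)
x_true = np.array([2.0, 0.7])                 # hypothetical [a, k]

def model(x):
    return x[0] * np.exp(-x[1] * t)

y = model(x_true)                             # noise-free synthetic data

def residual(x):
    return y - model(x)

def jacobian(x):
    """Analytical Jacobian of the residuals: columns dr/da and dr/dk."""
    e = np.exp(-x[1] * t)
    return np.column_stack([-e, x[0] * t * e])

def newton_gauss(x0, tol=1e-12, max_iter=50):
    x = np.asarray(x0, dtype=float)
    ssq_old = float(residual(x) @ residual(x))
    for _ in range(max_iter):
        # Newton-Gauss shift: x <- x - J+ r
        x = x - np.linalg.pinv(jacobian(x)) @ residual(x)
        ssq = float(residual(x) @ residual(x))
        # Relative convergence criterion on the sum of squares
        if ssq_old == 0.0 or abs(ssq_old - ssq) / ssq_old <= tol:
            break
        ssq_old = ssq
    return x

x_ng = newton_gauss([1.5, 0.5])
```

From this reasonably close starting guess the iteration recovers the generating parameters; a poor guess can make the undamped shift diverge.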

slide-21
SLIDE 21

NG LEVENBERG-MARQUARDT METHOD (NGLM)

  • The Levenberg (1919-1973) – Marquardt (1929-1997) modification allows interpolating between the Newton-Gauss method (NG) and the steepest descent method (SD):

$$\mathbf{x}_{i+1} = \mathbf{x}_i - \bigl(\mathbf{H}(\mathbf{x}_i) + \lambda_i\,\mathbf{I}\bigr)^{-1}\,\mathbf{J}^{\mathrm{T}}(\mathbf{x}_i)\,\mathbf{r}(\mathbf{x}_i)$$

with $\mathbf{H}(\mathbf{x}_i) \approx \mathbf{J}^{\mathrm{T}}(\mathbf{x}_i)\,\mathbf{J}(\mathbf{x}_i)$ and $\lambda_i \ge 0$: Marquardt parameter (mp)

For $\lambda_i = 0$ ⟹ NG; for $\lambda_i \to \infty$ ⟹ SD (shorter stepsize).

The parameter $\lambda_i$ is adapted at each iteration according to heuristic arguments to avoid divergence problems due to a bad choice of the initial guesses in the original NG method.
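The damped update can be sketched as below for the same kind of single-exponential model. The heuristic for adapting the Marquardt parameter (divide by 3 after a successful step, multiply by 5 after a failed one), the initial value λ = 1, the starting guess and the fixed iteration count are assumptions for this illustration, not necessarily the rule used in the lecture.

```python
import numpy as np

t = np.linspace(0.0, 5.0, 30)
x_true = np.array([2.0, 0.7])                 # hypothetical [a, k]

def model(x):
    return x[0] * np.exp(-x[1] * t)

y = model(x_true)

def residual(x):
    return y - model(x)

def jacobian(x):
    e = np.exp(-x[1] * t)
    return np.column_stack([-e, x[0] * t * e])

def nglm(x0, lam=1.0, max_iter=100):
    x = np.asarray(x0, dtype=float)
    ssq = float(residual(x) @ residual(x))
    for _ in range(max_iter):
        J, r = jacobian(x), residual(x)
        H = J.T @ J                                   # Newton-Gauss Hessian
        shift = np.linalg.solve(H + lam * np.eye(x.size), J.T @ r)
        x_new = x - shift
        ssq_new = float(residual(x_new) @ residual(x_new))
        if ssq_new < ssq:
            # Successful step: accept it and move towards Newton-Gauss
            x, ssq, lam = x_new, ssq_new, lam / 3.0
        else:
            # Failed step: reject it and move towards steepest descent
            lam *= 5.0
    return x

x_lm = nglm([1.0, 0.2])
```

Because a step is only accepted when it lowers ssq, the sum of squares never increases, which is what protects the method against poor initial guesses.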

slide-22
SLIDE 22


NGLM ALGORITHM

Courtesy of M. Maeder and Y.-M. Neuhold

slide-23
SLIDE 23

STATISTICS PROVIDED BY GRADIENT METHODS

  • Degrees of freedom

$$df = \#\mathbf{Y} - (\#\mathbf{p}_f + \#\mathbf{p}_g)$$

  • Residual variance

$$\sigma_r^{2} = \frac{ssq}{df}$$

  • Variance/covariance matrix

$$\boldsymbol{\sigma}_{p_f}^{2} = \sigma_r^{2}\,\mathbf{H}^{-1}$$

  • Correlation matrix

$$corr(p_{f,i},\ p_{f,j}) = \frac{\sigma_{p_f}^{2}(i,j)}{\sigma_{p_f}(i,i) \cdot \sigma_{p_f}(j,j)} \in [-1, 1]$$
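These statistics follow directly from the residuals and the Jacobian at convergence. The sketch below uses a hypothetical straight-line model with synthetic noise so that the fit itself is a single linear solve; with H = JᵀJ as in the NGLM update, the same lines apply after a nonlinear fit.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0.0, 4.0, 20)

# Hypothetical model y = p0 + p1 * t with synthetic measurement noise
X = np.column_stack([np.ones_like(t), t])
y = 1.0 + 0.5 * t + 0.05 * rng.standard_normal(t.size)

p_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
r = y - X @ p_hat                 # residuals at the optimum
J = -X                            # Jacobian of r with respect to p

ssq = float(r @ r)
df = y.size - p_hat.size          # number of data minus number of parameters
var_r = ssq / df                  # residual variance

H = J.T @ J
cov = var_r * np.linalg.inv(H)    # variance/covariance matrix
sd = np.sqrt(np.diag(cov))        # standard deviations of the parameters
corr = cov / np.outer(sd, sd)     # correlation matrix
```

The off-diagonal correlation is negative here (intercept and slope compensate each other), which shows why correlation coefficients range over [−1, 1].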

slide-24
SLIDE 24

REFERENCES

  • M. Maeder, Y.-M. Neuhold, Practical Data Analysis in Chemistry, Elsevier, 2007
  • M. Maeder, Y.-M. Neuhold, Chapter 7 of P. Gemperline (ed.), Practical Guide to Chemometrics, Taylor and Francis, 2006
  • W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes in C++ – The Art of Scientific Computing, 2nd Edition, Cambridge University Press, 2005
  • B. Chachuat, G. François, Nonlinear Dynamic Optimization – From Theory to Practice, Lecture notes, McMaster University - EPFL, 2009

slide-25
SLIDE 25

REFERENCES

  • G. Puxty, M. Maeder, K. Hungerbühler, Chemom. Intell. Lab. Syst. 81 (2006), 149
  • V.M. Taavitsainen, H. Haario, J. Chemom. 15 (2001), 215
  • V.M. Taavitsainen, H. Haario et al., J. Chemom. 17 (2003), 140
  • M. Maeder, A.D. Zuberbühler, Anal. Chem. 62 (1990), 2220
  • J. Billeter, Chemometric Methods for Prediction of Uncertainties and Spectral Validation of Rank Deficient Mechanisms in Kinetic Hard-modelling of Spectroscopic Data, Doctoral dissertation n°18311, ETH Zurich, 2009