Projection-based Chemometrics and Deep Reconstruction
- Dr. Uwe Kruger
Department of Biomedical Engineering Jonsson Engineering Center Rensselaer Polytechnic Institute
Projection-Based Data Chemometrics and Deep Reconstruction Troy, November 19., 2017
Slide 2
the idea behind reproducing kernels:
variable X using a set of n observations drawn from the distribution of X?
Slide 3
used to formulate a total of n Bernoulli trials (like flipping a coin)
cumulative probability distribution function for x, i.e. F(x); and
probability that x_i is smaller than or equal to x is F(x) for 1 ≤ i ≤ n.
degrees of freedom and the probability of success is F(x):
Slide 4
$$S_n(x) \sim \mathcal{B}\big(n, F(x)\big), \qquad S_n(x) = \#\{\, i : x_i \le x \,\}$$
The binomial distribution can be approximated by a normal distribution with a reasonable degree of accuracy, given a large enough sample size: np > 5 and n(1 − p) > 5!
Slide 5
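$$E\{S_n(x)\} = nF(x), \qquad V\{S_n(x)\} = nF(x)\big(1 - F(x)\big)$$
$$\hat{F}(x) = \frac{S_n(x)}{n}: \qquad E\{\hat{F}(x)\} = \frac{1}{n}E\{S_n(x)\} = F(x), \qquad V\{\hat{F}(x)\} = \frac{1}{n^2}V\{S_n(x)\} = \frac{F(x)\big(1 - F(x)\big)}{n}$$
$$\lim_{n\to\infty} E\{\hat{F}(x)\} = F(x), \qquad \lim_{n\to\infty} V\{\hat{F}(x)\} = 0$$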
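As a small numerical illustration of the binomial argument, the sketch below estimates F(x) as the fraction of observations not exceeding x and attaches a normal-approximation interval. The function name `ecdf_with_ci`, the use of NumPy, the synthetic standard-normal data and the plug-in variance (using the estimate in place of the unknown F(x)) are assumptions made for this example, not part of the slides.

```python
import numpy as np

def ecdf_with_ci(samples, x, z=1.96):
    """Estimate F(x) as S_n(x) / n with a normal-approximation interval."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    s_n = np.count_nonzero(samples <= x)   # S_n(x): number of x_i <= x
    f_hat = s_n / n
    # Plug-in variance F_hat (1 - F_hat) / n (assumption: F itself is unknown).
    half_width = z * np.sqrt(f_hat * (1.0 - f_hat) / n)
    return f_hat, f_hat - half_width, f_hat + half_width

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
data = rng.standard_normal(2000)
f_hat, lower, upper = ecdf_with_ci(data, x=0.0)
print(f"F_hat(0) = {f_hat:.3f}, interval = [{lower:.3f}, {upper:.3f}]")
```

The interval width shrinks with the square root of n, which is the same behaviour as the variance expression for the estimate of F(x).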
α = 0.05!
second…
Slide 6
Slide 7
$$\hat{F}(x) - 1.96\sqrt{\frac{F(x)\big(1 - F(x)\big)}{n}} \;\le\; F(x) \;\le\; \hat{F}(x) + 1.96\sqrt{\frac{F(x)\big(1 - F(x)\big)}{n}}$$
$$\hat{F}(x) = \frac{1}{n}\sum_{i=1}^{n}\int_{-\infty}^{x}\delta(\xi - x_i)\,d\xi \;\;\text{(Dirac delta function)} \;\approx\; \frac{1}{n}\sum_{i=1}^{n}\int_{-\infty}^{x}K(\xi - x_i)\,d\xi \;\;\text{(slightly less "spiky")}$$
$$f(x) = \frac{dF(x)}{dx} = \frac{1}{n}\sum_{i=1}^{n}K(x - x_i) \pm 1.96\,\frac{d}{dx}\sqrt{\frac{F(x)\big(1 - F(x)\big)}{n}}$$
integral must be equal to one, so how about defining it as follows:
Slide 8
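$$K(x - x_i) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(x - x_i)^2}$$
$$f(x) = \frac{1}{n}\sum_{i=1}^{n}K(x - x_i) \pm 1.96\,\frac{d}{dx}\sqrt{\frac{F(x)\big(1 - F(x)\big)}{n}}$$
$$\lim_{n\to\infty} 1.96\,\frac{d}{dx}\sqrt{\frac{F(x)\big(1 - F(x)\big)}{n}} = 0 \quad\Rightarrow\quad f(x) = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}K(x - x_i)$$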
derivative shows that, asymptotically, the estimate $\hat{f}(x)$ converges to the true probability density function for any value of x. This estimator is defined as a kernel density estimator.
counterpart of data-driven chemometric modeling techniques, such as principal component analysis (PCA) and partial least squares (PLS).
modeling technique, i.e. the neurons are, effectively, small kernels.
Slide 9
$$\hat{f}(x) = \frac{1}{n}\sum_{i=1}^{n}K(x - x_i)$$
can be considered if their area is equal to 1 and include the Epanechnikov, the triangular and the uniform kernel among others.
function does not influence the estimate in an asymptotic sense.
the accuracy of the estimate. This yields the following general form of the kernel density estimator:
Slide 10
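$$K(x - x_i) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(x - x_i)^2}$$
$$\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n}K\!\left(\frac{x - x_i}{h}\right), \qquad K\!\left(\frac{x - x_i}{h}\right) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x - x_i}{h}\right)^2}, \qquad h\text{: bandwidth}$$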
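The general form of the kernel density estimator with a Gaussian kernel and bandwidth h can be sketched in a few lines. The function name `gaussian_kde_1d`, the synthetic sample, the grid and the bandwidth value 0.4 are illustrative assumptions; the slides do not prescribe a particular bandwidth.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, h):
    """Evaluate (1 / (n h)) * sum_i K((x - x_i) / h) with a Gaussian K."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    u = (grid[:, None] - samples[None, :]) / h           # scaled distances
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)  # unit-area Gaussian
    return kernel.sum(axis=1) / (samples.size * h)

# Synthetic data for illustration only.
rng = np.random.default_rng(1)
samples = rng.standard_normal(500)
grid = np.linspace(-6.0, 6.0, 1201)
density = gaussian_kde_1d(samples, grid, h=0.4)

# The estimate should integrate to (approximately) one.
area = density.sum() * (grid[1] - grid[0])
print(f"area under the estimate: {area:.4f}")
```

Because each kernel has unit area and is divided by n, the estimate integrates to one for any h > 0; the bandwidth only trades smoothness against resolution.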
al., 2008).
singular value decomposition
Slide 11
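$$\mathbf{z} = \mathbf{A}\mathbf{s}, \qquad \dim(\mathbf{z}) \ge \dim(\mathbf{s})$$
$$\mathbf{Z} = \begin{bmatrix}\mathbf{z}_1 & \mathbf{z}_2 & \cdots & \mathbf{z}_n\end{bmatrix}^T = \begin{bmatrix}\mathbf{s}_1 & \mathbf{s}_2 & \cdots & \mathbf{s}_n\end{bmatrix}^T\mathbf{A}^T = \mathbf{U}\mathbf{L}\mathbf{P}^T$$
Data covariance matrix and its eigendecomposition:
$$\boldsymbol{\Sigma}_z = \frac{1}{n-1}\mathbf{Z}^T\mathbf{Z} = \frac{1}{n-1}\mathbf{P}\mathbf{L}^2\mathbf{P}^T$$
Gram matrix and its eigendecomposition:
$$\boldsymbol{\Phi} = \mathbf{Z}\mathbf{Z}^T = \mathbf{U}\mathbf{L}^2\mathbf{U}^T$$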
similarity transformation) – which are the principal components:
measured variables nonlinear, i.e.: which we assume to be bijective!
Slide 12
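$$\begin{bmatrix}\mathbf{s}_1 & \mathbf{s}_2 & \cdots & \mathbf{s}_n\end{bmatrix}^T\mathbf{A}^T = \mathbf{U}\mathbf{L}\mathbf{P}^T \quad\Rightarrow\quad \mathbf{S} = \mathbf{U}\mathbf{L} = \mathbf{T}, \qquad \mathbf{A} = \mathbf{P}$$
$$\mathbf{t} = \mathbf{P}^T\mathbf{z} = \mathbf{L}^{-1}\mathbf{U}^T\mathbf{Z}\mathbf{z} = \mathbf{L}^{-1}\mathbf{U}^T\boldsymbol{\phi}(\mathbf{Z}, \mathbf{z}), \qquad \text{given that } \mathbf{Z} = \mathbf{U}\mathbf{L}\mathbf{P}^T,\; \mathbf{P}^T = \mathbf{L}^{-1}\mathbf{U}^T\mathbf{Z}$$
$$\mathbf{z} = \theta(\mathbf{s}), \qquad \mathbf{f} = \psi(\mathbf{z}), \qquad \mathbf{t} = \mathbf{P}^T\mathbf{f}, \qquad \mathbf{F} = \begin{bmatrix}\psi(\mathbf{z}_1) & \psi(\mathbf{z}_2) & \cdots & \psi(\mathbf{z}_n)\end{bmatrix}^T$$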
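The link between the covariance matrix and the Gram matrix can be checked numerically through the singular value decomposition. The sketch below assumes the convention Z = U L P^T with L holding the singular values, uses synthetic data, and all variable names are illustrative rather than taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 50, 4
Z = rng.standard_normal((n, m)) @ rng.standard_normal((m, m))
Z = Z - Z.mean(axis=0)                        # mean-centre the columns

U, sv, Pt = np.linalg.svd(Z, full_matrices=False)   # Z = U diag(sv) P^T
L = np.diag(sv)

# Z^T Z (m x m) and Z Z^T (n x n) share the non-zero eigenvalues sv**2.
cov_eigvals = np.linalg.eigvalsh(Z.T @ Z)[::-1]
gram_eigvals = np.linalg.eigvalsh(Z @ Z.T)[::-1][:m]

# Scores of the training data: primal form Z P versus dual form U L.
scores_primal = Z @ Pt.T
scores_dual = U @ L

# Score of a new point using only inner products with the training data.
z_new = rng.standard_normal(m)
t_primal = Pt @ z_new
t_dual = np.linalg.inv(L) @ U.T @ (Z @ z_new)
```

The last line is the step that a kernel replaces: `Z @ z_new` is a vector of inner products, so it can be swapped for kernel evaluations without ever forming P.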
Slide 13
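$$\boldsymbol{\Sigma}_f = \frac{1}{n-1}\big(\mathbf{F} - \mathbf{1}_n\bar{\mathbf{f}}^T\big)^T\big(\mathbf{F} - \mathbf{1}_n\bar{\mathbf{f}}^T\big) \qquad \text{(incorporating mean centering)}$$
$$\boldsymbol{\Phi} = \mathbf{F}\mathbf{F}^T = \begin{bmatrix}\phi_{11} & \phi_{12} & \cdots & \phi_{1n}\\ \phi_{21} & \phi_{22} & \cdots & \phi_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ \phi_{n1} & \phi_{n2} & \cdots & \phi_{nn}\end{bmatrix} \qquad \text{(defined as the kernel matrix)}$$
$$\bar{\boldsymbol{\Phi}} = \Big(\mathbf{I} - \tfrac{1}{n}\mathbf{1}_n\mathbf{1}_n^T\Big)\boldsymbol{\Phi}\Big(\mathbf{I} - \tfrac{1}{n}\mathbf{1}_n\mathbf{1}_n^T\Big) \qquad \text{(incorporating mean centering)}$$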
function using the kernel density estimator using kernels:
Slide 14
$$\phi_{ij} = \psi(\mathbf{z}_i)^T\psi(\mathbf{z}_j)$$
components:
Slide 15
Like a neural network, this is a weighted sum of basis functions.
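The "weighted sum of basis functions" view can be made concrete with a small kernel PCA sketch. It assumes a Gaussian kernel with width `sigma`, centres the Gram matrix in the feature space and obtains the scores from its eigendecomposition. The helper names `gaussian_gram` and `kernel_pca`, the noisy-circle data and the value of `sigma` are assumptions for illustration, not the slides' own example.

```python
import numpy as np

def gaussian_gram(A, B, sigma):
    """Gram matrix with entries exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def kernel_pca(Z, sigma, n_components):
    n = Z.shape[0]
    K = gaussian_gram(Z, Z, sigma)
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    Kc = J @ K @ J                             # mean centering in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam, U = eigvals[order], eigvecs[:, order]
    scores = U * np.sqrt(lam)                  # nonlinear principal components
    return scores, U, lam, Kc

# Synthetic data: noisy points on a circle (a nonlinear one-dimensional structure).
rng = np.random.default_rng(4)
theta = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
Z = np.column_stack([np.cos(theta), np.sin(theta)]) + 0.05 * rng.standard_normal((60, 2))
scores, U, lam, Kc = kernel_pca(Z, sigma=1.0, n_components=3)

# Each score is a weighted sum of (centred) kernel basis functions.
weights = U / np.sqrt(lam)
scores_as_weighted_sum = Kc @ weights
```

The `weights` matrix plays the role of the output-layer weights of a single-hidden-layer network whose hidden units are the kernels centred on the training points.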
important.
kernels, any function can be constructed in the feature space that maps the nonlinear surface in the data space to become a plane (subspace) in the feature space.
components in the feature space that are related to the source variables in the original variable space – connected through the following mappings:
Slide 16
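squares concept: orthogonally projecting the data points onto directions for the predictor space: $\cos\alpha = \dfrac{\mathbf{x}^T\mathbf{w}}{\|\mathbf{x}\|\,\|\mathbf{w}\|}$; with $\|\mathbf{w}\| = 1$, we get $\cos\alpha\,\|\mathbf{x}\| = \mathbf{x}^T\mathbf{w} = t$, and the response space: $\cos\beta\,\|\mathbf{y}\| = \mathbf{y}^T\mathbf{v} = u$ if $\|\mathbf{v}\| = 1$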
Slide 17
[Figure: projections in the predictor space (x, w, t) and in the response space (y, v, u)]
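dimension – X and Y describing the predictor and response sets that are related as follows: $\mathbf{Y} = \mathbf{B}\mathbf{X} + \mathbf{E}$, with E being a random vector describing uncertainty
$\mathbf{B} = \mathbf{S}_{YX}\mathbf{S}_{XX}^{-1}$
covariance matrix $\mathbf{S}_{XX}$ may not exist or is badly conditioned!
$T = \mathbf{X}^T\mathbf{w}$ and $U = \mathbf{Y}^T\mathbf{v}$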
their covariance!
Slide 18
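$$J = E\{TU\} - \lambda_1\big(\mathbf{w}^T\mathbf{w} - 1\big) - \lambda_2\big(\mathbf{v}^T\mathbf{v} - 1\big)$$
$$J = \mathbf{w}^T E\{\mathbf{X}\mathbf{Y}^T\}\mathbf{v} - \lambda_1\big(\mathbf{w}^T\mathbf{w} - 1\big) - \lambda_2\big(\mathbf{v}^T\mathbf{v} - 1\big)$$
$$J = \mathbf{w}^T\mathbf{S}_{XY}\mathbf{v} - \lambda_1\big(\mathbf{w}^T\mathbf{w} - 1\big) - \lambda_2\big(\mathbf{v}^T\mathbf{v} - 1\big)$$
$$\frac{\partial J}{\partial\mathbf{w}} = \mathbf{S}_{XY}\mathbf{v} - 2\lambda_1\mathbf{w} = \mathbf{0}, \qquad \frac{\partial J}{\partial\mathbf{v}} = \mathbf{S}_{YX}\mathbf{w} - 2\lambda_2\mathbf{v} = \mathbf{0}$$
$$\mathbf{S}_{XY}\mathbf{S}_{YX}\mathbf{w} = 4\lambda_1\lambda_2\mathbf{w}, \qquad \mathbf{S}_{YX}\mathbf{S}_{XY}\mathbf{v} = 4\lambda_1\lambda_2\mathbf{v}$$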
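The two eigenvalue problems say that w and v are the dominant left and right singular vectors of the cross-covariance matrix. The sketch below checks this on synthetic data; the data-generating model and all variable names are assumptions made for this example.

```python
import numpy as np

# Synthetic predictor and response data, mean-centred.
rng = np.random.default_rng(3)
n = 300
X = rng.standard_normal((n, 4))
Y = X @ rng.standard_normal((4, 2)) + 0.1 * rng.standard_normal((n, 2))
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)

S_xy = X.T @ Y / (n - 1)                       # sample cross-covariance matrix

# Dominant singular triplet of S_xy gives the first pair of weight vectors.
W_left, sing, Vt = np.linalg.svd(S_xy, full_matrices=False)
w, v = W_left[:, 0], Vt[0, :]

# Both eigenvalue problems share the eigenvalue sing[0]**2 (= 4 lambda_1 lambda_2).
lhs_w = S_xy @ S_xy.T @ w
lhs_v = S_xy.T @ S_xy @ v
max_covariance = w @ S_xy @ v                  # value of w^T S_xy v at the optimum
```

Under the unit-length constraints, the maximised covariance equals the largest singular value, so 2λ1 = 2λ2 = `sing[0]`.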
Slide 19
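$\mathbf{F} = \mathbf{X} - T\mathbf{p}$ and $\mathbf{E} = \mathbf{Y} - T\mathbf{q}$ – these being the residual vectors for X and Y, respectively.
squares regression problems – minimizing the length of the residual vectors: $\mathbf{p} = \dfrac{E\{\mathbf{X}T\}}{E\{T^2\}}$ and $\mathbf{q} = \dfrac{E\{\mathbf{Y}T\}}{E\{T^2\}}$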
vectors F and E instead of the original random vectors X and Y.
as the standard PLS algorithm and detailed on the next slide. This algorithm was first published by Herman Wold in 1966.
Slide 20
random vectors X and Y, respectively:
column are mean centered and scaled to have a unit variance:
Slide 21
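$$\mathbf{X} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1N}\\ x_{21} & x_{22} & \cdots & x_{2N}\\ \vdots & \vdots & \ddots & \vdots\\ x_{n1} & x_{n2} & \cdots & x_{nN}\end{bmatrix}, \qquad \mathbf{Y} = \begin{bmatrix} y_{11} & y_{12} & \cdots & y_{1N}\\ y_{21} & y_{22} & \cdots & y_{2N}\\ \vdots & \vdots & \ddots & \vdots\\ y_{n1} & y_{n2} & \cdots & y_{nN}\end{bmatrix}$$
Sample mean vectors:
$$\bar{\mathbf{x}} = \frac{1}{n}\mathbf{X}^T\mathbf{1}_n, \qquad \bar{\mathbf{y}} = \frac{1}{n}\mathbf{Y}^T\mathbf{1}_n$$
Sample variance vectors:
$$\boldsymbol{\sigma}_X^2 = \frac{1}{n-1}\operatorname{diag}\Big\{\big(\mathbf{X} - \mathbf{1}_n\bar{\mathbf{x}}^T\big)^T\big(\mathbf{X} - \mathbf{1}_n\bar{\mathbf{x}}^T\big)\Big\}, \qquad \boldsymbol{\sigma}_Y^2 = \frac{1}{n-1}\operatorname{diag}\Big\{\big(\mathbf{Y} - \mathbf{1}_n\bar{\mathbf{y}}^T\big)^T\big(\mathbf{Y} - \mathbf{1}_n\bar{\mathbf{y}}^T\big)\Big\}$$
Normalizing both matrices:
$$\mathbf{X} \leftarrow \big(\mathbf{X} - \mathbf{1}_n\bar{\mathbf{x}}^T\big)\operatorname{diag}\{\boldsymbol{\sigma}_X^2\}^{-1/2}, \qquad \mathbf{Y} \leftarrow \big(\mathbf{Y} - \mathbf{1}_n\bar{\mathbf{y}}^T\big)\operatorname{diag}\{\boldsymbol{\sigma}_Y^2\}^{-1/2}$$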
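The mean centering and unit-variance scaling step can be written compactly. The helper name `autoscale` and the tiny example matrix are assumptions for illustration; the variance uses the n − 1 denominator, matching the sample variance above.

```python
import numpy as np

def autoscale(M):
    """Mean-centre each column and scale it to unit sample variance."""
    M = np.asarray(M, dtype=float)
    mean = M.mean(axis=0)
    std = M.std(axis=0, ddof=1)        # ddof=1 gives the 1 / (n - 1) estimate
    return (M - mean) / std, mean, std

# Small illustrative matrix with two columns on different scales.
X_raw = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 30.0]])
X_scaled, x_mean, x_std = autoscale(X_raw)
```

Keeping `x_mean` and `x_std` matters in practice, because new observations have to be scaled with the statistics of the reference data rather than their own.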
Slide 22
Sample covariance matrix:
$$\mathbf{S}_{XX} = \frac{1}{n-1}\mathbf{X}^T\mathbf{X}, \qquad \mathbf{M}_{XX} = \mathbf{S}_{XX}$$
Sample covariance matrix:
$$\mathbf{S}_{XY} = \frac{1}{n-1}\mathbf{X}^T\mathbf{Y}, \qquad \mathbf{M}_{XY} = \mathbf{S}_{XY}$$

    for i = 1 : m
        eps = 100;  w = M_XY(:,1) / norm(M_XY(:,1));
        while eps > 1e-10
            v = M_YX * w;    v = v / norm(v);
            w1 = M_XY * v;   w1 = w1 / norm(w1);
            eps = norm(w1 - w);   w = w1;
        end
        w_i = w;
        p_i = M_XX * w_i / (w_i' * M_XX * w_i);
        q_i = M_YX * w_i / (w_i' * M_XX * w_i);
        Y = Y - X * w_i * q_i';
        X = X - X * w_i * p_i';
        M_XX = X' * X;   M_XY = X' * Y;
        P(:,i) = p_i;   Q(:,i) = q_i;   W(:,i) = w_i;
    end
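A runnable sketch of a covariance-based PLS iteration of this kind is given below: a power iteration for the weight vector, followed by loading vectors and deflation. It is one reading of the algorithm, not a verified transcription of the slide; the function name `pls_fit`, the iteration cap `max_iter` and the synthetic data are assumptions. The regression matrix is assembled as B = Q (W^T P)^(-1) W^T, so that predictions are X B^T.

```python
import numpy as np

def pls_fit(X, Y, n_components, tol=1e-10, max_iter=500):
    X = np.array(X, dtype=float)   # copies, so the inputs are not modified
    Y = np.array(Y, dtype=float)
    n = X.shape[0]
    W, P, Q = [], [], []
    for _component in range(n_components):
        M_xx = X.T @ X / (n - 1)
        M_xy = X.T @ Y / (n - 1)
        # Power iteration for the dominant eigenvector of M_xy M_yx.
        w = M_xy[:, 0] / np.linalg.norm(M_xy[:, 0])
        for _iteration in range(max_iter):
            v = M_xy.T @ w
            v /= np.linalg.norm(v)
            w_new = M_xy @ v
            w_new /= np.linalg.norm(w_new)
            eps = np.linalg.norm(w_new - w)
            w = w_new
            if eps < tol:
                break
        denom = w @ M_xx @ w
        p = M_xx @ w / denom           # X-loading vector
        q = M_xy.T @ w / denom         # Y-loading vector
        t = X @ w                      # score vector of the current component
        X = X - np.outer(t, p)         # deflate X
        Y = Y - np.outer(t, q)         # deflate Y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = (np.column_stack(m) for m in (W, P, Q))
    B = Q @ np.linalg.inv(W.T @ P) @ W.T
    return B, W, P, Q

# Synthetic, mean-centred data for illustration only.
rng = np.random.default_rng(5)
n = 200
X0 = rng.standard_normal((n, 3))
X0 = X0 - X0.mean(axis=0)
Y0 = X0 @ rng.standard_normal((3, 2)) + 0.1 * rng.standard_normal((n, 2))
Y0 = Y0 - Y0.mean(axis=0)

B, W, P, Q = pls_fit(X0, Y0, n_components=3)
B_ols = np.linalg.lstsq(X0, Y0, rcond=None)[0].T   # ordinary least squares
```

With as many components as predictor variables the PLS regression matrix coincides with ordinary least squares, which gives a convenient sanity check; the benefit of PLS appears when fewer components are retained and the predictor covariance matrix is badly conditioned.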
look at the standard algorithm again:
nonlinear transformation involving the random vectors X, Y and E:
Slide 23
$$\mathbf{B} = \mathbf{Q}\big(\mathbf{W}^T\mathbf{P}\big)^{-1}\mathbf{W}^T$$
This is a Gram matrix.
vector v as follows:
the following relationship:
Slide 24
This is $\boldsymbol{\Phi}_X(\mathbf{X}, \mathbf{X})$, the Gram matrix for $\psi(\mathbf{X})$.
$\mathbf{T}$, we can also use the nonlinear
Gram matrix $\boldsymbol{\Phi}_X(\mathbf{X}_0, \mathbf{X}_0)$, which gives rise to:
vector w, we can scale the vector t to unit length:
$T = \boldsymbol{\Phi}_X(\mathbf{X}_0, \mathbf{X}_0)$:
Slide 25
and the response matrix $\mathbf{Y}_0$:
again at the linear PLS algorithm first:
Slide 26
counterparts gives rise to:
Gram matrix, the rest of the algorithm is related to the linear PLS algorithm.
weights, the “only” parameter that needs to be specified is the kernel
regression problem and solved using the robust PLS algorithm!
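For orientation, a kernel PLS iteration can be sketched as follows. This follows a commonly used formulation (scores computed from the centred Gram matrix, scaled to unit length, with deflation of the Gram and response matrices) and may differ in detail from the steps on the slides. The names `gaussian_gram` and `kernel_pls_fit`, the kernel width and the sine-wave data are assumptions for this example.

```python
import numpy as np

def gaussian_gram(A, B, sigma):
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def kernel_pls_fit(K, Y, n_components, tol=1e-10, max_iter=500):
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                              # centred Gram matrix
    y_mean = Y.mean(axis=0)
    Yc = Y - y_mean
    Kd, Yd = Kc.copy(), Yc.copy()               # deflated working copies
    T = np.zeros((n, n_components))
    U = np.zeros((n, n_components))
    for a in range(n_components):
        u = Yd[:, 0] / np.linalg.norm(Yd[:, 0])
        for _iteration in range(max_iter):
            t = Kd @ u
            t /= np.linalg.norm(t)              # score vector scaled to unit length
            u_new = Yd @ (Yd.T @ t)
            u_new /= np.linalg.norm(u_new)
            converged = np.linalg.norm(u_new - u) < tol
            u = u_new
            if converged:
                break
        t = Kd @ u
        t /= np.linalg.norm(t)
        T[:, a], U[:, a] = t, u
        D = np.eye(n) - np.outer(t, t)
        Kd = D @ Kd @ D                         # deflate the Gram matrix
        Yd = Yd - np.outer(t, t @ Yd)           # deflate the response matrix
    # Dual regression coefficients: fitted values are Kc @ alpha + y_mean.
    alpha = U @ np.linalg.solve(T.T @ Kc @ U, T.T @ Yc)
    return alpha, T, U, Kc, y_mean

# Synthetic nonlinear regression problem for illustration only.
x = np.linspace(-3.0, 3.0, 80)[:, None]
y = np.sin(x)
K = gaussian_gram(x, x, sigma=1.0)
alpha, T, U, Kc, y_mean = kernel_pls_fit(K, y, n_components=5)
y_fit = Kc @ alpha + y_mean
r2 = 1.0 - np.sum((y - y_fit)**2) / np.sum((y - y.mean())**2)
print(f"training R^2 with 5 components: {r2:.4f}")
```

Apart from the number of retained components, the kernel parameter (`sigma` here) is the only quantity that has to be chosen. Predicting for new points would additionally require centring the test kernel matrix with the training statistics, which is omitted here.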
Slide 27
say beyond 10,000 (remember the size of the Gram matrix is equal to the number of data points squared)
neural network models when the number of variables x or y is larger and/or the number of data points is small.
structure of large network topologies on the next slide in more detail
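The remark on the size of the Gram matrix can be quantified with a short calculation. The function name and the assumption of 8-byte (double precision) entries are illustrative.

```python
def gram_matrix_bytes(n_points, bytes_per_entry=8):
    """Memory needed to store a dense n x n Gram matrix."""
    return n_points * n_points * bytes_per_entry

for n_points in (1_000, 10_000, 100_000):
    gigabytes = gram_matrix_bytes(n_points) / 1e9
    print(f"n = {n_points:>7,d}: {gigabytes:10.3f} GB")
```

At 10,000 data points the dense matrix already needs 0.8 GB, and every tenfold increase in the number of points multiplies that by one hundred.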
Slide 28
the accuracy of the network prediction – e.g. for specific tasks (set of lung images)?
combinations if we have two sets of images (one set that is labeled normal, whilst the other set is labeled as containing anomalies)?
Slide 29
Slide 30
Slide 31
[Figure: engine test-bed schematic. Labels: dynamometer; air in; compressor (turbocharger); intercooler; manifold plenum chamber; exhaust; inlet manifold pressure; inlet manifold temperature; turbine inlet pressure; turbine inlet temperature; turbine exit pressure. Fault 1: injector pump fuel meter (sensor). Fault 2: intercooler blockage (process).]
Slide 32
Modelling results

Principal Component   Variance Captured (%)   Variance Total (%)
1                     79.5998                 79.5998
2                     16.4492                 96.0490
3                     2.4169                  98.4659
4                     1.0745                  99.5404
5                     0.4010                  99.9414
6                     0.0586                  100.000

Number of Bottleneck Nodes   Variance Captured (%)   Note
1                            97.8160                 Important variation
2                            99.4212
3                            99.8336
4                            99.8725                 Negligible
5                            99.9401
6                            99.9414

Variables analysed

No   Engine Variable              Unit   Note
1    Fuel Flow                    kg/h
2    Air Flow                     kg/h
3    Inlet Manifold Pressure      Bar
4    Inlet Manifold Temperature
5    Turbine Inlet Pressure       Bar
6    Turbine Inlet Temperature

RPM              1500   2500   3500   4500
Pedal Position   30%    49%    57%    62%
                 40%    59%    64%    65%
                 54%    74%    74%    76%
                 62%    78%    80%    83%
                 100%   100%   100%   100%
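The "Variance Total" column of the principal component table is the running sum of the "Variance Captured" column, which can be checked directly. The 95% threshold used to pick a number of components is a hypothetical choice for illustration and does not come from the slides.

```python
import numpy as np

# Variance captured (%) by each principal component, taken from the table.
captured = np.array([79.5998, 16.4492, 2.4169, 1.0745, 0.4010, 0.0586])
cumulative = np.cumsum(captured)

# Hypothetical rule: smallest number of components reaching 95% of the variance.
threshold = 95.0
n_needed = int(np.searchsorted(cumulative, threshold) + 1)

for k, (c, total) in enumerate(zip(captured, cumulative), start=1):
    print(f"PC {k}: {c:8.4f} %   cumulative {total:9.4f} %")
```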
Slide 33
Air leak of 2mm in the manifold plenum chamber
Slide 34
could be successfully detected.
recorded engine variable is affected by this event could not be obtained.
diagnose this event.
diagnosis is expensive, whilst data-driven techniques are a viable, cost-effective alternative.
Slide 35
Air leak of 6mm in the manifold plenum chamber
Slide 36
(i) The fault could clearly be detected; (ii) The diagnosis provides the engine management system with sufficient information to trace this event to an air leak
Slide 37
energy
Variable Selection
Slide 38
Variable Selection