SKT: A Computationally Efficient SUPANOVA: Spline Kernel Based Machine Learning Tool. Boleslaw Szymanski, Lijuan Zhu, Long Han and Mark Embrechts, Rensselaer Polytechnic Institute, Troy, NY 12180, USA, and Alexander Ross and Karsten Sternickel
Presentation Outline
1. Introduction: SUPANOVA Kernels
2. Heuristic for Efficient SUPANOVA Kernel Computation
3. Results of SKT for Two Benchmarks:
   - Iris Data
   - Boston Housing Market
4. An Industrial Application: Automatic Analysis of Magnetocardiograms
5. Preprocessing Measurements into CMI Data
6. Results of Processing CMI Data under SKT
CMI/RPI: Spline Kernel Based Machine Learning 2
Represent the prediction function as a sum of kernels
- kernels weighted by nonnegative coefficients
ANOVA kernel
- decomposes functions of order m into a sum of 1-ary, 2-ary, ..., m-ary functions of the original arguments
Each kernel function in the ANOVA kernel is a spline kernel
- u and v are vectors with m dimensions
k_spline(u_i, v_i) = 1 + u_i v_i + (u_i v_i min(u_i, v_i)) / 2 − min(u_i, v_i)^3 / 6

The prediction function is a weighted sum of the M component kernels:

f(x) = K(x, a, c) = Σ_{j=1}^{M} c_j K_j(x, a),   c_j ≥ 0
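The per-coordinate spline kernel can be coded directly; a minimal sketch, assuming the cubic-spline form written above:

```python
def k_spline(u: float, v: float) -> float:
    """Per-coordinate spline kernel:
    k(u, v) = 1 + u*v + u*v*min(u, v)/2 - min(u, v)**3/6
    """
    m = min(u, v)
    return 1.0 + u * v + u * v * m / 2.0 - m ** 3 / 6.0
```

The kernel is symmetric in its arguments, as any valid kernel must be.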
SUPANOVA Kernels
K_ANOVA(u, v) = Π_{i=1}^{m} [1 + k(u_i, v_i)]
              = 1 + Σ_{i=1}^{m} k(u_i, v_i) + Σ_{i<j} k(u_i, v_i) k(u_j, v_j) + ... + Π_{i=1}^{m} k(u_i, v_i)

The expansion has M + 1 = C(m,1) + C(m,2) + ... + C(m,m) + 1 = 2^m terms: the constant term, C(m,1) first-order kernels, C(m,2) second-order kernels, and so on up to the single m-th order kernel.
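The product form above generates all 2^m interaction terms implicitly. The sketch below (illustrative helper functions, not the SKT implementation) verifies the identity between the product form and the explicit expansion over coordinate subsets:

```python
from itertools import combinations

def k_anova(u, v, k):
    """ANOVA kernel: product over coordinates of (1 + k(u_i, v_i))."""
    prod = 1.0
    for ui, vi in zip(u, v):
        prod *= 1.0 + k(ui, vi)
    return prod

def k_anova_expanded(u, v, k):
    """Equivalent expansion: 1 plus, for every non-empty subset of
    coordinates, the product of the per-coordinate kernels
    (2^m terms in total)."""
    m = len(u)
    total = 1.0
    for r in range(1, m + 1):
        for subset in combinations(range(m), r):
            term = 1.0
            for i in subset:
                term *= k(u[i], v[i])
            total += term
    return total
```

With the linear kernel k(a, b) = a*b and m = 3, both forms give (1+4)(1+10)(1+18) = 1045.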
SUPANOVA: Objective Function

- Objective function (S. Gunn and J. Kandola, 2002):

  Φ(c, a) = ||y − Σ_{j=1}^{M} c_j K_j a||_2^2 + λ_a Σ_{j=1}^{M} c_j a^T K_j a + λ_c Σ_{j=1}^{M} c_j,   c_j ≥ 0

  - quadratic error: ||y − Σ_j c_j K_j a||_2^2
  - smoothness error: λ_a Σ_j c_j a^T K_j a
  - approximate sparseness error with one-norm: λ_c Σ_j c_j

- Proposed objective function:

  Φ(c, a) = ||y − Σ_{j=1}^{M} c_j K_j a||_2^2 + λ_a Σ_{j=1}^{M} c_j a^T K_j a + λ_c × |{j : c_j ≠ 0}|,   c_j ≥ 0

  - ideal zero-norm sparseness error: λ_c × |{j : c_j ≠ 0}|
  - λ_a: weight for smoothness error
  - λ_c: weight for sparseness error
  - c: sparseness vector
  - a: same parameter as used in traditional kernels
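The proposed objective can be evaluated term by term; a NumPy sketch with illustrative variable names (not the SKT code itself):

```python
import numpy as np

def objective(c, a, y, kernels, lam_a, lam_c):
    """Phi(c, a) = ||y - sum_j c_j K_j a||^2
                 + lam_a * sum_j c_j a^T K_j a
                 + lam_c * |{j : c_j != 0}|   (zero-norm sparseness)."""
    pred = sum(cj * (Kj @ a) for cj, Kj in zip(c, kernels))
    quad = float(np.sum((y - pred) ** 2))                       # quadratic error
    smooth = lam_a * sum(cj * float(a @ Kj @ a)                 # smoothness error
                         for cj, Kj in zip(c, kernels))
    sparse = lam_c * int(np.count_nonzero(c))                   # zero-norm penalty
    return quad + smooth + sparse
```

Here K_j are the Gram matrices of the M component kernels on the training set.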
Gunn's Iterative Solution

Φ(a, c) = ||y − Σ_{j=1}^{M} c_j K_j a||_2^2 + λ_a Σ_{j=1}^{M} c_j a^T K_j a + λ_c Σ_{j=1}^{M} c_j,   c_j ≥ 0

- S1: set c'_j = 1 for all j and compute a' = argmin_a Φ(a, c'), starting with a large value of λ_a and decreasing it until the minimum error is achieved.
- S2: compute c = argmin_c Φ(a', c), where c ≥ 0 and λ_c is set so that the loss is the same as at initialization.
- S3: compute a* = argmin_a Φ(a, c), where λ_a is again computed starting with a large value and then decreasing it to minimize the error.
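With c fixed, the minimization over a in steps S1 and S3 is a ridge-type problem: writing K_c = Σ_j c_j K_j (symmetric positive semidefinite), the gradient of Φ in a is 2 K_c ((K_c + λ_a I) a − y), so a = (K_c + λ_a I)^{-1} y is a minimizer. A sketch of this sub-step, assuming K_c is well conditioned:

```python
import numpy as np

def solve_a(kernels, c, y, lam_a):
    """Minimize ||y - K_c a||^2 + lam_a * a^T K_c a over a,
    where K_c = sum_j c_j K_j.  Stationarity gives
    K_c ((K_c + lam_a I) a - y) = 0, solved by a = (K_c + lam_a I)^{-1} y."""
    K_c = sum(cj * Kj for cj, Kj in zip(c, kernels))
    n = K_c.shape[0]
    return np.linalg.solve(K_c + lam_a * np.eye(n), y)
```

Decreasing λ_a, as in S1/S3, trades smoothness against data fit.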
In step S2, with a' fixed, the heuristic minimizes over the sparseness vector c:

Φ(c, a') = ||y − Σ_{j=1}^{M} c_j K_j a'||_2^2 + λ_c × |{j : c_j ≠ 0}|,   c_j ≥ 0
Heuristic Method for Computing Sparse Vector c
- Initialization
  - create an empty set S of selected elements of vector c
  - create a set E containing all the remaining elements of vector c
- Selection
  - select, one by one, each c_i in set E and compute the minimal value of the loss function with this selection
  - choose the element c_i that achieves the smallest value of the loss function among all elements of set E, and move it from E to S
- Adjustment
  - solve a set of linear equations to refit, in a single step, the c values in set S
- Control loop
  - stop the heuristic when the process reaches the iteration limit or set E becomes empty
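A minimal sketch of the greedy heuristic above, assuming for illustration that each candidate c_i contributes a fixed prediction column g_i = K_i a' and that the Adjustment refit is an unconstrained least-squares solve (the actual SKT tool enforces c ≥ 0):

```python
import numpy as np

def greedy_select(G, y, max_iter):
    """Forward selection of columns of G (n x M) to approximate y.
    S holds selected column indices, E the remaining candidates."""
    n, M = G.shape
    S, E = [], list(range(M))
    for _ in range(max_iter):
        if not E:
            break                             # set E is empty: stop
        best_j, best_loss = None, None
        for j in E:                           # Selection: try each candidate
            cols = G[:, S + [j]]
            c, *_ = np.linalg.lstsq(cols, y, rcond=None)
            loss = float(np.sum((y - cols @ c) ** 2))
            if best_loss is None or loss < best_loss:
                best_j, best_loss = j, loss
        S.append(best_j)                      # keep the best element
        E.remove(best_j)
    c, *_ = np.linalg.lstsq(G[:, S], y, rcond=None)  # Adjustment: refit S
    return S, c
```

Each iteration refits all elements of S jointly, which is what distinguishes the with-adjustment variant compared on the next slide.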
SUPANOVA Experimental Results

[Figure: Iris data, target and predicted values versus sorted sequence number, and predicted versus target value: q2 = 0.023, Q2 = 0.173, RMSE = 0.355]
[Figure: Boston Housing, target and predicted values versus sorted sequence number, and predicted versus target value: q2 = 0.214, Q2 = 0.217, RMSE = 4.768]
- Iris data (4 variables, 50 samples, classification)
  - all combinations of features
- Boston Housing Market (13 variables, 506 samples, regression)
  - up to binary combinations of features
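The q2, Q2, and RMSE values quoted with the plots can be computed as below. The definitions are an assumption (the slides do not spell them out): the ratio of residual to total sum of squares, reported as q2 on training data and Q2 on test data, with 0 a perfect fit.

```python
import numpy as np

def q2(y_true, y_pred):
    """Residual sum of squares over total sum of squares (0 = perfect fit)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(ss_res / ss_tot)

def rmse(y_true, y_pred):
    """Root mean squared prediction error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```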
SUPANOVA Results Discussion
- Heuristic performance with adjustment versus without adjustment
[Figure: S2 error and number of non-zero elements versus step, with-adjustment versus without-adjustment]
[Figure: Boston Housing target and predicted values: q2 = 0.214, Q2 = 0.217, RMSE = 4.768]
[Figure: Boston Housing target and predicted values: q2 = 0.205, Q2 = 0.213, RMSE = 4.717]
- Up to binary versus up to ternary combinations of features
An Industrial Application: Automatic Classification of Magnetocardiograms
Technical Objectives and Goals
[MCG examples shown: Normal, LCX block, 3-vessel disease]
- Identify cardiac ischemia from
magnetocardiograms
- Accurate diagnosis of ischemia in patients
with left and/or right bundle branch blocks
- Incorporate domain knowledge to enhance
automatic classification of diseases
- Provide cardiologists with a hypothesis
testing module
- Use computational intelligence to solve a
“missing information” problem
- Reverse the process and identify the signature of cardiac diseases in magnetocardiograms
- Develop an online database for data exchange (pooling) between hospitals
Visualization of magnetocardiograms (examples): CMI 2409, a 9-channel MCG system capable of scanning a 20 cm x 20 cm region over the heart.
Typical Time Series Data for T3-T4 MagnetoCardiogram
From Magnetocardiogram to Data Series for Analysis
1. Preprocessing of the time series & averaging
2. Heart cycle interval selection: choose the T wave for the detection of ischemia
3. Automatically determine the window of interest
4. Data export to machine learning processing

Processing pipeline: 1. Recording, 2. Filtering, 3. Averaging, 4. Selecting, 5. Exporting
CMI Delay Data and Resulting Sparse Vector c
- Training data: 241 patients
- Test data: 84 patients
- Features: 74 values representing delays of the peak of the polarization signal
- Heuristic parameters: binary combinations of features
- Kernel processing results:
  - run time: 15 minutes (desktop)
  - size of sparse vector c: 10 elements

Code   Value
2517   15.28137
1431   16.20629
509     2.13248
1416   12.94138
579     5.35348
1319   13.27779
1256    8.21285
1492   10.90624
2506   16.66780
2207    6.72628
772     3.01011

All 10 selected features are combination features.
CMI Data Processing Results: Prediction versus True Value
[Figure: target and predicted values versus sorted sequence number: q2 = 0.620, Q2 = 0.627, RMSE = 0.786]
CMI Data Processing Results: ROC Curve
[Figure: ROC curve, true positives versus false positives. Area under the curve A_Z = 0.8672]
The ROC curve characterizes the system response as the threshold separating true from false predictions is varied.
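Sweeping the decision threshold over the predicted scores traces out the curve; a small sketch of the sweep and the trapezoid-rule area estimate:

```python
import numpy as np

def roc_auc(scores, labels):
    """Sweep the decision threshold from high to low, collecting
    (false positive rate, true positive rate) points, then integrate
    the resulting ROC curve with the trapezoid rule."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores)               # descending score order
    labels = labels[order]
    tpr = np.concatenate(([0.0], np.cumsum(labels) / labels.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(1 - labels) / (1 - labels).sum()))
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))
```

A perfect ranking gives an area of 1.0; a random one hovers around 0.5.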
            predicted negative   predicted positive
negative            32                    5
positive             9                   38
CMI Data Processing Results: Confusion Matrix and Distribution of True Values
Balanced accuracy: 83.67%
[Figure: distribution of true values for the Healthy and Sick classes]
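The 83.67% figure matches the mean of sensitivity and specificity computed from the confusion matrix above; a quick check:

```python
def balanced_accuracy(tn, fp, fn, tp):
    """Mean of sensitivity tp/(tp+fn) and specificity tn/(tn+fp)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

# Confusion matrix from the slide: 32 true negatives, 5 false positives,
# 9 false negatives, 38 true positives
score = balanced_accuracy(tn=32, fp=5, fn=9, tp=38)
```

Balancing the two class-wise rates matters here because the healthy/sick split of the 84 test patients is uneven.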