

SLIDE 1

Presented at the 11th Online World Conference on Soft Computing in Industrial Applications, September 18 – October 6, 2006

SKT: A Computationally Efficient SUPANOVA: Spline Kernel Based Machine Learning Tool

Boleslaw Szymanski, Lijuan Zhu, Long Han and Mark Embrechts Rensselaer Polytechnic Institute, Troy, NY 12180, USA and Alexander Ross and Karsten Sternickel Cardiomag Imaging, Inc. Schenectady, NY 12304, USA

SLIDE 2

Presentation Outline

1. Introduction: SUPANOVA Kernels
2. Heuristic for Efficient SUPANOVA Kernel Computation
3. Results of SKT for Two Benchmarks:
   • Iris Data
   • Boston Housing Market
4. An Industrial Application: Automatic Analysis of Magnetocardiograms
5. Preprocessing Measurements into CMI Data
6. Results of Processing CMI Data under SKT

CMI/RPI: Spline Kernel Based Machine Learning 2

SLIDE 3

SUPANOVA Kernels

Represent the prediction function as a sum of kernels
  • kernels weighted by nonnegative coefficients c_j:

f(x) = \sum_i a_i K(x, x_i, c), \qquad K(x, x_i, c) = \sum_{j=1}^{M} c_j K_j(x, x_i)

ANOVA kernel
  • decomposes functions of order m into a sum of terms: 1-ary, 2-ary, ..., m-ary order functions of the original arguments

K_{ANOVA}(u, v) = \prod_{i=1}^{m} [1 + k_i(u_i, v_i)] = 1 + \sum_{i=1}^{m} k_i(u_i, v_i) + \sum_{i<j} k_i(u_i, v_i)\, k_j(u_j, v_j) + \cdots + \prod_{i=1}^{m} k_i(u_i, v_i)

so the sparseness vector c has M = 1 + \binom{m}{1} + \binom{m}{2} + \cdots + \binom{m}{m} = 2^m components, one per term of the expansion

Each kernel function in the ANOVA kernel is a spline kernel
  • applied to a single dimension i of the input vectors u and v:

k_{spline}(u_i, v_i) = 1 + u_i v_i + \frac{u_i v_i \min(u_i, v_i)}{2} - \frac{\min(u_i, v_i)^3}{6}
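A minimal NumPy sketch of the two kernels above (the function names and the elementwise vectorization are my own; the spline kernel follows the first-order spline form shown):

```python
import numpy as np

def k_spline(u, v):
    """First-order spline kernel for scalar (or elementwise array) inputs."""
    m = np.minimum(u, v)
    return 1.0 + u * v + 0.5 * u * v * m - m ** 3 / 6.0

def k_anova(u, v):
    """ANOVA kernel: product over the m input dimensions of (1 + k_spline),
    which expands into the 1-ary, 2-ary, ..., m-ary interaction terms."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.prod(1.0 + k_spline(u, v)))
```

For m = 2 the product expands to 1 + k_1 + k_2 + k_1 k_2: a constant, the two univariate terms, and one binary interaction term.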


SLIDE 4
SUPANOVA: Objective Function

  • Objective function (S. Gunn and J. Kandola, 2002):

\Phi(a, c) = \left\| y - \sum_{j=1}^{M} c_j K_j a \right\|_2^2 + \lambda_a \sum_{j=1}^{M} c_j a^T K_j a + \lambda_c \sum_{j=1}^{M} c_j, \qquad c \geq 0

    • quadratic error: \left\| y - \sum_{j=1}^{M} c_j K_j a \right\|_2^2
    • smoothness error: \lambda_a \sum_{j=1}^{M} c_j a^T K_j a
    • approximate sparseness error with one-norm: \lambda_c \sum_{j=1}^{M} c_j
  • Proposed objective function: replace the one-norm with the ideal zero-norm sparseness error \lambda_c \times \|c\|_0
  • \lambda_a: weight for smoothness error
  • \lambda_c: weight for sparseness error
  • c: sparseness vector
  • a: same parameter as used in traditional kernels
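As a sketch, the objective can be evaluated directly. Names here are assumptions: `Ks` stands for the list of M Gram matrices K_j, and the `zero_norm` flag switches between the one-norm penalty of Gunn and Kandola and the proposed zero-norm penalty.

```python
import numpy as np

def supanova_objective(a, c, y, Ks, lam_a, lam_c, zero_norm=False):
    """Phi(a, c) = ||y - sum_j c_j K_j a||^2          (quadratic error)
                 + lam_a * sum_j c_j a^T K_j a        (smoothness error)
                 + lam_c * sparseness(c).
    Ks is the list of M Gram matrices K_j (each n x n)."""
    Kc = sum(cj * Kj for cj, Kj in zip(c, Ks))   # weighted kernel sum
    resid = y - Kc @ a
    quad = resid @ resid
    smooth = lam_a * (a @ (Kc @ a))              # = lam_a * sum_j c_j a^T K_j a
    sparse = lam_c * (np.count_nonzero(c) if zero_norm else np.sum(c))
    return quad + smooth + sparse
```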


SLIDE 5

Gunn's Iterative Solution

S0: c' = 1 (all sparseness coefficients initialized to one)
S1: a' = \arg\min_a \Phi(a, c'), computed starting with a large value of \lambda_a and decreasing it until the minimum error is achieved
S2: c = \arg\min_c \Phi(a', c), where \lambda_a = 0 and \lambda_c is set so that the loss is the same as at initialization
S3: a^* = \arg\min_a \Phi(a, c), where \lambda_c = 0 and \lambda_a is computed starting with a large value and then decreasing it to minimize the error

where

\Phi(a, c) = \left\| y - \sum_{j=1}^{M} c_j K_j a \right\|_2^2 + \lambda_a \sum_{j=1}^{M} c_j a^T K_j a + \lambda_c \sum_{j=1}^{M} c_j
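With c fixed, steps S1 and S3 reduce to a kernel-ridge solve, and with a fixed, step S2 becomes a nonnegative least-squares fit over the columns K_j a. A sketch of the two subproblem solvers (my own simplification, not the authors' code: the \lambda_c term of S2 is dropped, K_c is assumed nonsingular, and function names are assumptions):

```python
import numpy as np
from scipy.optimize import nnls

def step_a(Ks, c, y, lam_a):
    """S1/S3: with c fixed, minimizing Phi over a gives the kernel-ridge
    solution a = (K_c + lam_a I)^(-1) y, where K_c = sum_j c_j K_j."""
    Kc = sum(cj * Kj for cj, Kj in zip(c, Ks))
    return np.linalg.solve(Kc + lam_a * np.eye(len(y)), y)

def step_c(Ks, a, y):
    """S2 (lam_a = 0, lam_c term omitted): fitting c >= 0 is a
    nonnegative least-squares problem in the columns K_j a."""
    A = np.column_stack([Kj @ a for Kj in Ks])
    c, _residual = nnls(A, y)
    return c
```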


SLIDE 6

Heuristic Method for Computing Sparse Vector c

In step S2 (with \lambda_a = 0), the function minimized over c is

\Phi'(a', c) = \left\| y - \sum_{j=1}^{M} c_j K_j a' \right\|_2^2 + \lambda_c \sum_{j=1}^{M} c_j, \qquad c \geq 0

  • Initialization
    • create an empty set S of all selected elements of vector c
    • create a set E containing all the remaining elements of vector c
  • Selection
    • select, one by one, each c_i in set E and compute the minimal value of the loss function with this selection
    • choose the element c_i that achieves the smallest value of the loss function among all elements of set E
  • Adjustment
    • solve a set of linear equations to refit, in a single step, the c values in set S
  • Control loop
    • stop the heuristic when the process reaches the iteration limit or set E becomes empty
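The selection/adjustment loop can be sketched as follows. This is a simplified illustration, not the authors' implementation: the loss is taken as the squared residual for a fixed a', the \lambda_c term and the c >= 0 constraint are omitted, and all names are assumptions.

```python
import numpy as np

def greedy_sparse_c(y, cols, max_nonzero):
    """Greedy forward selection of the non-zero elements of c.
    cols: n x M matrix whose j-th column is K_j @ a' (a' fixed)."""
    n, M = cols.shape
    S, E = [], list(range(M))       # selected / remaining element indices
    c_S = np.zeros(0)
    while E and len(S) < max_nonzero:
        best_j, best_err = None, np.inf
        for j in E:                 # Selection: try each remaining element
            A = cols[:, S + [j]]
            c_try, *_ = np.linalg.lstsq(A, y, rcond=None)
            err = float(np.sum((y - A @ c_try) ** 2))
            if err < best_err:
                best_j, best_err = j, err
        S.append(best_j)
        E.remove(best_j)
        # Adjustment: refit all selected c values in a single step
        c_S, *_ = np.linalg.lstsq(cols[:, S], y, rcond=None)
    return S, c_S
```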


SLIDE 7

SUPANOVA Experimental Results

[Plot: Iris data, target and predicted values (sorted sequence and scatter): q2 = 0.023, Q2 = 0.173, RMSE = 0.355]
[Plot: Boston Housing, target and predicted values (sorted sequence and scatter): q2 = 0.214, Q2 = 0.217, RMSE = 4.768]

  • Iris data (4 variables, 50 samples, classification)
    • all combinations of features
  • Boston Housing Market (13 variables, 506 samples, regression)
    • up to binary combinations of features


SLIDE 8

SUPANOVA Results Discussion

  • Heuristic performance with adjustment versus without adjustment

[Plot: S2 error versus number of non-zero elements, with-adjustment and without-adjustment curves]

  • Up-to-binary versus up-to-ternary combinations of features

[Plots: target and predicted values; up-to-binary: q2 = 0.214, Q2 = 0.217, RMSE = 4.768; up-to-ternary: q2 = 0.205, Q2 = 0.213, RMSE = 4.717]


SLIDE 9

An Industrial Application: Automatic Classification of Magnetocardiograms

Technical Objectives
  • Identify cardiac ischemia from magnetocardiograms
  • Accurate diagnosis of ischemia in patients with left and/or right bundle branch blocks
  • Incorporate domain knowledge to enhance automatic classification of diseases
  • Provide cardiologists with a hypothesis testing module

Goals
  • Use computational intelligence to solve a "missing information" problem
  • Reverse the process and identify the signature of cardiac diseases in magnetocardiograms
  • Develop an online database for data exchange (pooling) between hospitals

[Figure: visualization of magnetocardiograms (examples): Normal, LCX block, 3-vessel disease]

The CMI 2409 is a 9-channel MCG system capable of scanning a 20 cm x 20 cm region over the heart.


SLIDE 10

Typical Time Series Data for T3-T4 MagnetoCardiogram



SLIDE 11

From Magnetocardiogram to Data Series for Analysis

  • 1. Preprocessing of the time series & averaging
  • 2. Heart cycle interval selection: choose the T wave for the detection of ischemia
  • 3. Automatically determine the window of interest
  • 4. Data export to machine learning processing

  • 1. Recording
  • 2. Filtering
  • 3. Averaging
  • 4. Selecting
  • 5. Exporting


SLIDE 12

CMI Delay Data and Resulting Sparse Vector c

  • Training data: 241 patients
  • Test data: 84 patients
  • Features: 74 values representing delays of the peak of the polarization signal
  • Heuristic parameters: binary combinations of features
  • Kernel processing results:
    • run time: 15 minutes (desktop)
    • size of sparse vector c: 10 elements

All 10 selected features are combination features:

  Code   Value
  2517   15.28137
  1431   16.20629
   509    2.13248
  1416   12.94138
   579    5.35348
  1319   13.27779
  1256    8.21285
  1492   10.90624
  2506   16.66780
  2207    6.72628
   772    3.01011


SLIDE 13

CMI Data Processing Results: Prediction versus True Value

[Plot: target and predicted values, sorted sequence number: q2 = 0.620, Q2 = 0.627, RMSE = 0.786]


SLIDE 14

CMI Data Processing Results: ROC Curve

[Plot: ROC curve, true positives versus false positives: AZ_area = 0.8672]

The ROC curve shows the system response under different choices of the true/false prediction threshold.
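As background for the figure, the area under the ROC curve equals the probability that a randomly chosen positive case is scored above a randomly chosen negative one. A generic sketch of that computation (not part of the SKT tool; the function name is mine):

```python
import numpy as np

def roc_auc(scores, labels):
    """Area under the ROC curve obtained by sweeping the decision threshold.
    Equals P(score of random positive > score of random negative); ties count 1/2."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = np.sum(pos[:, None] > neg[None, :])
    ties = np.sum(pos[:, None] == neg[None, :])
    return (greater + 0.5 * ties) / (len(pos) * len(neg))
```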


SLIDE 15

                     predicted negative   predicted positive
  true negative              32                    5
  true positive                9                   38

CMI Data Processing Results: Confusion Matrix and Distribution of True Values

Balanced accuracy: 83.67%
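The reported percentage matches the balanced accuracy of this confusion matrix, i.e. the mean of sensitivity and specificity:

```python
# Confusion matrix entries from the slide (rows: true class)
tn, fp = 32, 5      # true-negative row: 32 correct, 5 false positives
fn, tp = 9, 38      # true-positive row: 9 false negatives, 38 correct

sensitivity = tp / (tp + fn)              # 38 / 47
specificity = tn / (tn + fp)              # 32 / 37
balanced_accuracy = (sensitivity + specificity) / 2
print(round(100 * balanced_accuracy, 2))  # → 83.67
```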

[Histogram: distribution of true values, Healthy versus Sick]
