slide-1
SLIDE 1

Machine Learning Methods for Mortality Prediction in Patients with ST Elevation Myocardial Infarction

J. Vomlel (1), H. Kružík (2), P. Tůma (2), J. Přeček (3), and M. Hutyra (3)

(1) Institute of Information Theory and Automation (ÚTIA), Academy of Sciences of the Czech Republic

(2) Gnomon, Ltd.

Prague, Czech Republic

(3) First Department of Internal Medicine, University Hospital Olomouc, Czech Republic

slide-7
SLIDE 7

Contents

  • ST Elevation Myocardial Infarction
  • Motivation for mortality prediction
  • Hospital data
  • Data preprocessing
  • Tested methods
  • Results of experiments
slide-8
SLIDE 8

Acute Myocardial Infarction

Wikimedia Commons

  • An atherosclerotic plaque slowly builds up in the inner lining of a coronary artery.
  • Suddenly, it ruptures, causing catastrophic thrombus formation.
  • The thrombus totally occludes the artery and prevents blood flow downstream.

slide-9
SLIDE 9

Acute Myocardial Infarction

Wikimedia Commons, image by Patrick J. Lynch, medical illustrator and C. Carl Jaffe, MD, cardiologist

The heart cells in the territory of the occluded coronary artery die and do not grow back.

slide-12
SLIDE 12

ST Elevation Myocardial Infarction (STEMI)

  • STEMI is a myocardial infarction with ST elevation on the electrocardiogram (ECG)

  • STEMI is the leading cause of death in developed countries
  • Its treatment has a significant socio-economic impact
slide-17
SLIDE 17

Benchmarking of Hospitals using Mortality

  • 30-day mortality: what fraction of patients treated for STEMI at a given hospital die within 30 days?
  • This criterion is not fair for comparing hospitals, since some hospitals treat more complicated cases.
  • Instead, for each patient, compute the probability that he/she will die within 30 days given his/her health state at hospital admission.
  • For each hospital, compute the average of these probabilities and compare it with the true mortality at that hospital.
  • We need a prediction model that relates mortality to the attributes describing the health state at hospital admission.
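
The per-hospital comparison described on this slide can be sketched as follows. This is only an illustration under assumptions: the record layout (hospital, predicted probability, died) and the name `benchmark_hospitals` are hypothetical, and the toy numbers are made up.

```python
from collections import defaultdict

def benchmark_hospitals(records):
    """Return {hospital: (expected_mortality, observed_mortality)}.

    records: iterable of (hospital, predicted_probability, died) tuples,
    where predicted_probability comes from some prediction model and
    died is 1 if the patient died within 30 days, else 0.
    """
    # Per hospital: [sum of predicted probabilities, deaths, patients]
    sums = defaultdict(lambda: [0.0, 0, 0])
    for hospital, prob, died in records:
        acc = sums[hospital]
        acc[0] += prob
        acc[1] += died
        acc[2] += 1
    return {h: (p / n, d / n) for h, (p, d, n) in sums.items()}

# Hypothetical toy data, for illustration only.
records = [
    ("A", 0.10, 0), ("A", 0.30, 1), ("A", 0.20, 0), ("A", 0.20, 0),
    ("B", 0.05, 0), ("B", 0.05, 1),
]
result = benchmark_hospitals(records)
```

A hospital whose observed mortality lies well above its expected mortality does worse than its case mix suggests; one that lies below does better.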

slide-22
SLIDE 22

Dataset of patients with STEMI

  • 603 patients admitted to University Hospital in Olomouc.
  • The average age was 65 years.
  • There were 431 men (71%) and 172 women (29%) in the dataset.
  • For each patient we knew whether he/she died within 30 days.
  • Cardiologists selected 23 attributes that may influence STEMI mortality.

slide-23
SLIDE 23

Attributes

Attribute                                   Code    Type     Value range in data
Gender                                      SEX     nominal  {male, female}
Age                                         AGE     real     [23, 94]
Height                                      HT      real     [145, 205]
Weight                                      WT      real     [35, 150]
Body Mass Index                             BMI     real     [16.65, 48.98]
STEMI Location                              STEMI   nominal  {inferior, anterior, lateral}
Killip classification at admission          KILLIP  integer  {1, 2, 3, 4}
Potassium                                   K       real     [2.25, 7.07]
Urea                                        UR      real     [1.6, 46.5]
Creatinine                                  KREA    real     [17, 525]
Uric acid                                   KM      real     [109, 935]
Albumin                                     ALB     real     [23, 53.5]
HDL Cholesterol                             HDLC    real     [0.38, 2.21]
Cholesterol                                 CH      real     [1.8, 9.59]
Triacylglycerol                             TAG     real     [0.31, 8.13]
LDL Cholesterol                             LDLC    real     [0.63, 7.79]
Glucose                                     GLU     real     [4.2, 25.7]
C-reactive protein                          CRP     real     [0.3, 359]
Cystatin C                                  CYSC    real     [0.38, 5.22]
NT prohormone of brain natriuretic peptide  NTBNP   real     [22.2, 35000]
Troponin                                    TRPT    real     [0, 25]
Glomerular filtration rate (MDRD)           GFMD    real     [0.13, 7.31]
Glomerular filtration rate (Cystatin C)     GFCD    real     [0.09, 7.17]

slide-28
SLIDE 28

Ordinal Data

  • Ordinal attributes: attributes whose values have an ordering that is natural for quantifying their impact on the class.
  • This is satisfied by all attributes that can take only two values.
  • Most real-valued attributes are ordinal, but for some laboratory tests, values deviating from the normal range in either direction may increase the probability of death.
  • STEMI is nominal. We create one binary attribute for each state of STEMI indicating whether STEMI takes this state or not: STEMI inferior, STEMI anterior, and STEMI lateral.

  • We will refer to data in this form as D.ORD.
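
The replacement of the nominal STEMI attribute by three binary attributes might look like the sketch below. The function name `encode_stemi` and the `STEMI_<state>` naming are assumptions made for illustration.

```python
STEMI_STATES = ("inferior", "anterior", "lateral")

def encode_stemi(location):
    """Map a STEMI location to three 0/1 indicator attributes."""
    if location not in STEMI_STATES:
        raise ValueError(f"unknown STEMI location: {location!r}")
    # Exactly one indicator is 1: the one matching the given location.
    return {f"STEMI_{state}": int(location == state) for state in STEMI_STATES}

encoded = encode_stemi("lateral")
```

Each indicator takes only two values, so it meets the ordinality condition stated on this slide.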
slide-33
SLIDE 33

Discrete Data

  • Discrete attributes: attributes with a finite number of values.
  • The Czech National Code Book classifies numeric laboratory results into nine groups 1, 2, ..., 9. Group 5 corresponds to standard values in the standard population; groups < 5 correspond to decreased values and groups > 5 to increased values.
  • We discretized all laboratory tests X so that for each test we created two new attributes: one for decreased values and another for increased values.
  • The attributes Age, Height, and Weight were discretized into more than two groups.

  • We will refer to data in this form as D.DISCR.
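
One possible way to split a code-book group into a "decreased" and an "increased" attribute is sketched below. The slides do not say how the levels of the two new attributes were coded, so the distance-from-group-5 encoding and the name `split_group` are assumptions.

```python
NORMAL_GROUP = 5  # group 5 = standard values, as stated on the slide

def split_group(group):
    """Split a code-book group 1..9 into (decreased, increased) levels.

    Assumed encoding: each level counts how many groups the result lies
    below or above group 5; 0 means "not decreased" / "not increased".
    """
    if not 1 <= group <= 9:
        raise ValueError("group must be in 1..9")
    decreased = max(0, NORMAL_GROUP - group)
    increased = max(0, group - NORMAL_GROUP)
    return decreased, increased
```

With this encoding each new attribute is ordered in one direction only, which covers laboratory tests where deviation in either direction matters.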
slide-39
SLIDE 39

Binary Data

  • Binary attributes: attributes that take only two values.
  • All laboratory tests are encoded using two binary attributes: one for decreased values and another for increased values.
  • Killip classification was transformed by replacing value 1 with 0 and by merging the values 2, 3, 4 into a single value 1.
  • The attributes Age, Height, and Weight were removed since they appeared not to be relevant for mortality.
  • Body Mass Index (BMI) was encoded using two binary attributes, BMI high and BMI low.

  • We will refer to data in this form as D.BIN.
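
A minimal sketch of the binary encoding follows. The Killip mapping is taken from the slide; deriving the low/high flags from the code-book groups (below or above group 5) is an assumption, as are the function names.

```python
def binarize_killip(killip):
    """Killip class 1 -> 0; classes 2, 3, 4 -> 1 (as on the slide)."""
    if killip not in (1, 2, 3, 4):
        raise ValueError("Killip class must be 1, 2, 3 or 4")
    return 0 if killip == 1 else 1

def binarize_lab(name, group):
    """Return low/high flags for a laboratory test.

    Assumption: the flags are derived from the code-book group,
    with groups below 5 counted as low and groups above 5 as high.
    """
    return {f"{name}_low": int(group < 5), f"{name}_high": int(group > 5)}

flags = binarize_lab("ALB", 3)
```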
slide-46
SLIDE 46

Tested methods

  • Logistic regression (two versions): LOG.REG and LOG.BOOST.
  • Decision tree C4.5 – pruned C4.5 decision tree.
  • Naive Bayes classifier (two versions): NB.SIMPL and NB.
  • NN – Neural Network Multilayer Perceptron with sigmoid function.
  • Bayesian network classifier (two versions): BN.K2 and BN.TAN.

We applied the algorithms to both:

  • the full data and
  • the data with the attribute set reduced by a method that selects a subset of attributes highly correlated with the class while having low intercorrelation. We denote this variant by the extension .AS.
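
The slides do not name the attribute selection method. Its description resembles correlation-based feature selection (CFS), so the sketch below uses the CFS merit heuristic as an assumption; the function name `cfs_merit` is hypothetical.

```python
import math

def cfs_merit(class_corrs, mean_inter_corr):
    """Merit of an attribute subset under the CFS heuristic.

    class_corrs: absolute correlation of each selected attribute with the class.
    mean_inter_corr: mean absolute correlation between pairs of selected attributes.
    """
    k = len(class_corrs)
    if k == 0:
        raise ValueError("subset must not be empty")
    mean_class_corr = sum(class_corrs) / k
    # High class correlation raises the merit; high intercorrelation lowers it.
    return k * mean_class_corr / math.sqrt(k + k * (k - 1) * mean_inter_corr)
```

Two attributes that each correlate 0.5 with the class score higher than one such attribute when they are uncorrelated with each other, and no higher when they are perfectly correlated.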

slide-49
SLIDE 49

Evaluation Criteria

  • Accuracy (ACC): the number of true positive and true negative classifications divided by the total number of classifications, reported as a percentage.
  • Area under the ROC curve (AUC). The ROC curve depicts the dependence of the True Positive Rate (sensitivity) on the False Positive Rate (1 - specificity), both as functions of the classification threshold.

[Figure: the ROC curve for LOG.BOOST on D.BIN.AS. x-axis: False Positive Rate; y-axis: True Positive Rate.]
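
Both criteria can be computed in a few lines. The sketch below is an illustration on made-up labels and scores, not the evaluation code behind the reported results; it uses the standard identity that the AUC equals the probability that a randomly chosen positive case is scored above a randomly chosen negative one.

```python
def accuracy(y_true, y_pred):
    """Percentage of correct classifications (TP + TN over all cases)."""
    correct = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count 1/2)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = died within 30 days, 0 = survived.
y_true = [0, 0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.3]
y_pred = [int(s >= 0.5) for s in scores]  # threshold 0.5
```

Accuracy depends on the chosen threshold, while the AUC summarizes the ranking over all thresholds, so the two criteria can disagree.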

slide-50
SLIDE 50

Results

Classifier  Criterion  D.ORD   D.ORD.AS  D.DISCR  D.DISCR.AS  D.BIN   D.BIN.AS
LOG.BOOST   ACC        94.03   94.20     93.86    88.23       94.03   93.86
            AUC        0.618   0.646     0.722    0.640       0.802   0.832
LOG.REG     ACC        92.54   93.86     90.05    87.56       92.87   93.70
            AUC        0.792   0.821     0.646    0.607       0.743   0.798
C4.5        ACC        93.86   94.69     94.20    88.72       93.53   94.53
            AUC        0.618   0.569     0.600    0.544       0.547   0.610
NB          ACC        89.22   91.04     86.90    87.73       86.90   94.20
            AUC        0.820   0.813     0.806    0.649       0.811   0.809
NB.SIMPL    ACC        89.72   90.88     86.90    87.73       86.90   94.20
            AUC        0.828   0.769     0.806    0.649       0.811   0.809
NN          ACC        91.38   93.86     93.20    87.40       92.04   93.53
            AUC        0.763   0.746     0.737    0.550       0.767   0.759
BN.K2       ACC        NA      NA        92.04    94.53       94.03   94.36
            AUC        NA      NA        0.769    0.783       0.769   0.821
BN.TAN      ACC        NA      NA        92.04    88.89       94.20   94.86
            AUC        NA      NA        0.787    0.590       0.811   0.818

slide-52
SLIDE 52

LOG.BOOST for D.ORD.AS and D.BIN.AS

D.ORD.AS:

0.87 + STEMI_lateral * -0.41 + ALB * -0.08 + HDLC * 0.21 + CYSC * 0.24 + KILLIP * 0.31

D.BIN.AS:

-1.64 + ALB_low * 0.76 + CYSC_high * 0.62 + KILLIP * 0.68
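
To show how such a linear expression is used, the sketch below evaluates the first expression on this slide (the one over ALB, HDLC and CYSC) for a made-up patient. The slide does not state how the score is mapped to a probability or which class it refers to; the plain logistic sigmoid used here is an assumption, and the patient values are hypothetical.

```python
import math

# Coefficients copied from the first expression on the slide.
INTERCEPT = 0.87
WEIGHTS = {
    "STEMI_lateral": -0.41,
    "ALB": -0.08,
    "HDLC": 0.21,
    "CYSC": 0.24,
    "KILLIP": 0.31,
}

def linear_score(patient):
    """Intercept plus the weighted sum of the patient's attribute values."""
    return INTERCEPT + sum(w * patient[name] for name, w in WEIGHTS.items())

def sigmoid(score):
    """Assumed link function; not confirmed by the slides."""
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical patient, for illustration only.
patient = {"STEMI_lateral": 0, "ALB": 40.0, "HDLC": 1.0, "CYSC": 1.0, "KILLIP": 1}
score = linear_score(patient)
probability = sigmoid(score)
```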

slide-54
SLIDE 54

C4.5 for D.ORD.AS and D.BIN.AS

D.ORD.AS:

CYSC <= 1.64: 0 (553.0)
CYSC > 1.64
|   HDLC <= 0.56: 1 (5.0)
|   HDLC > 0.56
|   |   KILLIP <= 1
|   |   |   ALB <= 25.2: 1 (2.21)
|   |   |   ALB > 25.2: 0 (29.79)
|   |   KILLIP > 1
|   |   |   UR <= 15.8: 1 (6.0)
|   |   |   UR > 15.8: 0 (7.0)

D.BIN.AS:

CYSC_high = 0: 0 (526.0)
CYSC_high = 1
|   ALB_low = 0: 0 (63.29)
|   ALB_low = 1: 1 (13.71)
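
The smaller tree on this slide (the one over CYSC_high and ALB_low) translates directly into nested conditions. The function name is hypothetical, and reading class 1 as "died within 30 days" is an assumption.

```python
def predict_bin_tree(cysc_high, alb_low):
    """Apply the tree over the binary attributes CYSC_high and ALB_low.

    Returns the predicted class: 1 only when cystatin C is increased
    and albumin is decreased, otherwise 0.
    """
    if cysc_high == 0:
        return 0
    return 1 if alb_low == 1 else 0
```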

slide-55
SLIDE 55

Bayesian networks

BN learned by the K2 algorithm

[Figure: Bayesian network graph. Nodes: STEMI_lateral, ALB_high, LDLC_low, ALB_low, KILLIP, CYSC_high, BMI_low, K_low, MORTALITY.]

slide-56
SLIDE 56

Bayesian networks

BN learned by TAN

[Figure: Bayesian network graph. Nodes: MORTALITY, LDLC_low, BMI_low, K_low, KILLIP, CYSC_high, STEMI_lateral, ALB_high, ALB_low.]

slide-60
SLIDE 60

Conclusions

  • We compared different machine learning methods using real medical data from a hospital.
  • The best performance was achieved on discretized data where the discretization was based on expert knowledge and the attributes had only two values.
  • The best performing classifiers were based on logistic regression and on simple Bayesian networks.
  • In the future we would like to extend the set of attributes and obtain datasets with a larger number of patients.