
Analysis of Survival Times Using Bayesian Networks, by Helge Langseth



  1. Analysis of Survival Times Using Bayesian Networks. Helge Langseth, NTNU. Presented at ESREL '98, Trondheim, Norway, 16-19 June 1998.

  2. Two types of statistical models
     Neyman categorises statistical models into two groups:
     • Interpolating models: used merely to capture rough effects in the data
     • Explorative models: used to explore the underlying process that generates the data we have observed

  3. Scope
     With a database as the starting point, we want to build an explorative model that pinpoints how to reduce the rate of critical failures in a system's components. Our main goal is to understand how the covariates contribute to the system's survival times.

  4. The History of Graphical Models
     • Graphical models in statistics date back to Wright's notation in 1921.
     • Computational complexity, however, left Bayesian networks neglected for 60 years.
     • In the 1980s, efficient algorithms for exact calculations on graphs, and later computer-intensive methods such as Markov chain Monte Carlo, brought Bayesian networks back into the light and onto the "Top 5 Statistical Buzzword of the Week" list.

  5. Bayesian Networks
     [Figure: example network with nodes Age, Gender, Exposure To Toxic, Smoking, Cancer, Serum Calcium and Lung Tumour]

  6. Conditional Independence
     Cancer is independent of Age and Gender given Exposure To Toxic and Smoking.
     [Figure: example network with nodes Age, Gender, Exposure To Toxic, Smoking, Cancer, Serum Calcium and Lung Tumour]
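Written as a formula, the independence claim on this slide reads as follows. The abbreviation "Exposure" stands for the node Exposure To Toxic; the notation is introduced here for illustration and does not appear on the slides.

```latex
P(\text{Cancer} \mid \text{Age}, \text{Gender}, \text{Exposure}, \text{Smoking})
  = P(\text{Cancer} \mid \text{Exposure}, \text{Smoking})
```

In words: once Exposure and Smoking are known, learning Age or Gender does not change the probability of Cancer.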

  7. "Fundamental Theorem"
     Every multidimensional statistical distribution function can be represented by a Bayesian network:
     $$f(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f(x_i \mid x_1, x_2, \ldots, x_{i-1}) = \prod_{i=1}^{n} f(x_i \mid \text{"all predecessors"})$$
     [Figure: graph with nodes 1, 2, 3, 4, ..., n]
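The factorisation on this slide can be checked numerically on a small network. The sketch below is only an illustration under stated assumptions: it uses four binary nodes (Age, Exposure, Smoking, Cancer), assumes the edges Age → Exposure, Age → Smoking and Exposure, Smoking → Cancer, and fills the tables with made-up numbers, apart from the 5 % / 1 % exposure figures shown on the probability-table slide.

```python
from itertools import product

# Hypothetical tables; only the exposure figures (5 % / 1 %) come from the slides.
p_age = {True: 0.6, False: 0.4}                    # P(Age in range), assumed
p_exposure_given_age = {True: 0.05, False: 0.01}   # P(Exposure=True | Age)
p_smoking_given_age = {True: 0.30, False: 0.20}    # P(Smoking=True | Age), assumed
p_cancer_given_es = {                              # P(Cancer=True | Exposure, Smoking), assumed
    (True, True): 0.10, (True, False): 0.05,
    (False, True): 0.03, (False, False): 0.01,
}


def bern(p_true, value):
    """Probability of a binary value when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(a, e, s, c):
    """Joint probability as the product of each node given its parents."""
    return (p_age[a]
            * bern(p_exposure_given_age[a], e)
            * bern(p_smoking_given_age[a], s)
            * bern(p_cancer_given_es[(e, s)], c))


# The factorised joint must sum to one over all 2^4 configurations.
total = sum(joint(*values) for values in product([True, False], repeat=4))


def cancer_given(e, s, a=None):
    """P(Cancer=True | Exposure=e, Smoking=s), optionally also given Age=a."""
    ages = [True, False] if a is None else [a]
    num = sum(joint(x, e, s, True) for x in ages)
    den = sum(joint(x, e, s, c) for x in ages for c in (True, False))
    return num / den


print("sum of joint:", round(total, 12))
print("P(C | E, S)    =", round(cancer_given(True, True), 6))
print("P(C | E, S, A) =", round(cancer_given(True, True, a=True), 6))
```

The last two printed values agree, which is the conditional independence of Cancer and Age given Exposure and Smoking in this assumed network.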

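  8. Nodes are Probability Tables
     Table for the node Exposed To Toxic Material, conditioned on Age:

     Exposed To Toxic Material | Age in (25, 65) | Age not in (25, 65)
     True                      | 5 %             | 1 %
     False                     | 95 %            | 99 %

     [Figure: example network with nodes Age, Gender, Exposure To Toxic, Smoking, Cancer, Serum Calcium and Lung Tumour]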
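A node table like the one on this slide maps naturally onto a nested dictionary. The sketch below stores the table, checks that each column sums to one, and marginalises over Age. The prior `p_age_in_range` is a hypothetical value chosen for illustration; the slides do not give it.

```python
# Table from the slide: P(Exposed To Toxic Material | Age in (25, 65)?)
cpt_exposed = {
    True:  {True: 0.05, False: 0.95},   # age in (25, 65)
    False: {True: 0.01, False: 0.99},   # age not in (25, 65)
}

# Hypothetical prior for the parent node; not given in the slides.
p_age_in_range = 0.6

# Each column of a conditional probability table must sum to one.
for age_in_range, column in cpt_exposed.items():
    assert abs(sum(column.values()) - 1.0) < 1e-12

# Marginal probability of exposure: sum over the parent's states.
p_exposed = (p_age_in_range * cpt_exposed[True][True]
             + (1 - p_age_in_range) * cpt_exposed[False][True])
print(f"P(exposed) = {p_exposed:.3f}")   # 0.6 * 0.05 + 0.4 * 0.01 = 0.034
```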

  9. Where do the Networks come from?
     Situation: We want to build a model to analyse a multidimensional vector X.
     Aid: To do so, we have N i.i.d. realisations of X, x_1, …, x_N, and/or a domain expert.
     Unknowns:
     • The network structure
     • The parameters in the local node tables

  10. Generating Networks:
      • Initialize the network
      • Repeat until finished:
        – Propose some change to the structure
        – Fit parameters to the new structure
        – Evaluate the new network according to some measure (like BIC, AIC or MDL)
        – If the new network is better than the previous one, keep the change
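The search loop on this slide can be sketched as a greedy hill climb over edge additions and removals, scored by BIC. This is a minimal illustration, not the procedure used in the talk: it assumes binary variables, evaluates every single-edge change per round and keeps the best one (a best-improvement variant of the loop), and fits parameters implicitly through maximum-likelihood counts inside the score. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations


def node_bic(data, child, parents, arity=2):
    """BIC term of one node: maximised log-likelihood minus complexity penalty."""
    parents = sorted(parents)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marg = Counter(tuple(row[p] for p in parents) for row in data)
    loglik = sum(n * math.log(n / marg[cfg]) for (cfg, _), n in joint.items())
    n_params = (arity - 1) * arity ** len(parents)
    return loglik - 0.5 * math.log(len(data)) * n_params


def bic(data, structure):
    """BIC of a whole network; structure maps each node to its parent set."""
    return sum(node_bic(data, c, ps) for c, ps in structure.items())


def creates_cycle(structure, parent, child):
    """True if adding parent -> child would close a directed cycle."""
    stack, seen = [parent], set()
    while stack:                      # walk the ancestors of `parent`
        node = stack.pop()
        if node == child:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(structure[node])
    return False


def hill_climb(data, n_vars):
    structure = {v: set() for v in range(n_vars)}    # initialise: empty graph
    score = bic(data, structure)
    while True:
        best = None
        for parent, child in permutations(range(n_vars), 2):   # propose changes
            candidate = {v: set(ps) for v, ps in structure.items()}
            if parent in candidate[child]:
                candidate[child].remove(parent)       # remove an existing edge
            elif not creates_cycle(structure, parent, child):
                candidate[child].add(parent)          # add a new edge
            else:
                continue
            new_score = bic(data, candidate)          # fit and evaluate
            if new_score > score + 1e-9 and (best is None or new_score > best[0]):
                best = (new_score, candidate)
        if best is None:                              # finished: nothing improves
            return structure, score
        score, structure = best                       # keep the best change


# Toy data: X0 and X1 agree in 90 % of the rows, X2 is independent of both.
data = [(x0, x1, x2)
        for x0 in (0, 1) for x1 in (0, 1) for x2 in (0, 1)
        for _ in range(45 if x0 == x1 else 5)]

structure, score = hill_climb(data, 3)
edges = {(p, c) for c, ps in structure.items() for p in ps}
print("edges:", edges, "BIC:", round(score, 2))
```

On this toy data the search links X0 and X1 and leaves X2 isolated. The direction of the single edge is arbitrary, because both orientations receive the same BIC.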

  11. Bayesian Networks are used in:
      • "Expert systems", mostly in medical domains (e.g. the MUNIN system)
      • Decision support systems (e.g. for NASA)
      • Analysis of dynamic systems (e.g. speech recognition, the BAT-Mobile)
      • …

  12. Bayesian Networks, Summary:
      • An estimate of the multidimensional density
      • Easy to understand for non-statisticians (e.g. a domain expert)
      • The representation is optimized for tasks like
        – Prediction
        – Classification
        – Decision support
      • Can incorporate prior domain knowledge:
        – "Top down" analysis: expert knowledge
        – "Bottom up" analysis: data-driven system verification

  13. Reliability Analysis
      • Data set: 219 gas turbines with 2921 failures and 300 censored survival times from the OREDA-IV database
      • Each failure is described by ten covariates, e.g. System Type, Manufacturer, Actual/Planned PM, ...
      • We have special interest in Time To Fail and Failure Severity (Critical or Degraded)
      • Problem to solve: "How can we reduce the frequency of critical failures?"

  14. Generated Network
      [Figure: generated network with nodes System Code, Installation Code, Location, Design Class, Environment, Operating Mode, Severity Class, Manufact., Planned PM, Sub unit, Actual PM and Time to Fail]

  15. "Clique" Graph
      [Figure: clique graph over the grouped nodes Environ., Location, System, Severity, Subunit, TTF and PM]
      Legend:
      • Location: Installation Code, Location
      • PM: Planned PM, Actual PM
      • System and Environment: System Code, Installation Code, Operating Mode, Manufacturer, Environment, Design Class

  16. Model Verification
      [Figure: plot comparing the results of Cox regression (vertical axis, 0 to 5000) with those of the Bayesian network (horizontal axis, 0 to 5000)]

  17. Conclusions
      • We have generated a Bayesian network to analyse a data set from the OREDA-IV database.
      • The Bayesian network enabled both qualitative and quantitative analysis of the data set.
      • To verify the calculations, the numerical results were compared to those found by Cox regression. The results of the two methods were at the same level.
