PREDICTING THE DISEASE OF ALZHEIMER WITH SNP BIOMARKERS AND CLINICAL DATA USING DATA MINING CLASSIFICATION APPROACH: DECISION TREE

Onur ERDOĞAN, YEŞİM AYDIN SON
MEDICAL INFORMATICS, MIDDLE EAST TECHNICAL UNIVERSITY

Outline q Mo#va#on ¡ q Introduc#on ¡ q Materials ¡& ¡Methods ¡ q Results ¡ q Discussion ¡ q Future ¡Work ¡

Motivation
- Recent developments in biotechnology have allowed high-throughput data generation from biological samples.
- Genomic medicine takes a whole-genome view of genetic disorders so that we can discover:
  - Whether a person is susceptible to complex diseases
  - The underlying genetic reasons
  - Insight into the pathoetiology of the disease
  - How to select the appropriate treatment
  - How to prevent disease

Motivation (Contd)
- In this study we applied a widely used data mining classification method, the decision tree, to associate SNP biomarkers and clinical data with Alzheimer's disease (AD), the most common form of dementia.
- To determine whether clinical information contributes to the performance of the classifier (prediction model).
- To determine which combinations of attributes (SNPs, clinical, or demographic features) put individuals at risk of Alzheimer's disease.

Introduction
- Genetic disorders are illnesses caused by one or more abnormalities in the human genome.
- Modern genetics has so far had its major impact on medicine by defining diseases caused by visible chromosomal defects and by finding variations in a gene (mutant gene).
  - Single-gene diseases
    - Over 1500 mutations detected
    - Generally rare
    - Follow Mendelian inheritance (autosomal dominant)
- Most multifactorial, chronic diseases are caused by the combined effects of genetic variation at different genomic locations (e.g. Alzheimer's disease, joint illness).
  - No clear pattern of inheritance (environmental factors may contribute)
  - Common in the population
  - Result from the interactions of multiple genes

Introduction (Contd)
- The most common genetic variations are single DNA building block alterations, which are a challenging aspect of post-genome biology.
- Finding DNA mutations in genes that cause or contribute to a disease is one of the most challenging tasks. It is like looking for a needle in a haystack, since millions of SNPs exist in the human genome.
  - Genetic reason behind individual phenotypic differences
  - Basis of many complex diseases
- Intelligent computational techniques are needed to draw conclusions from the large amount of data.
- Data mining methods have become promising for determining the significant genetic variation among individuals.
  - Supervised learning (decision tree, support vector machines, artificial neural networks, etc.)
  - Unsupervised learning (k-means clustering, hierarchical clustering, etc.)
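As a rough illustration of how a decision tree, one of the supervised methods listed above, picks a split, the sketch below scores every candidate threshold by weighted Gini impurity. The toy rows, the column meanings, and the function names `gini` and `best_split` are assumptions made for illustration; the slides do not state which split criterion or software the study used.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split.

    Each candidate split sends rows with value <= threshold to the left child.
    If no split improves on the unsplit node, feature_index and threshold are None.
    """
    best = (None, None, gini(labels))
    n = len(rows)
    for j in range(len(rows[0])):
        for threshold in sorted({row[j] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            if not left or not right:
                continue  # a split must put samples on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, threshold, score)
    return best


# Hypothetical toy data: columns are [minor_allele_count, age]
rows = [[0, 62], [0, 70], [0, 68], [2, 71], [1, 66], [1, 75]]
labels = ["control", "control", "control", "case", "case", "case"]
feature, threshold, score = best_split(rows, labels)
```

On this toy data the first column separates the two classes perfectly, so the chosen split is on feature 0 at threshold 0. A full tree would apply the same search recursively to each child node.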

Background Information: Biological Background
SNPs
- Single nucleotide polymorphisms (SNPs) are the most common DNA sequence variation, where only a single nucleotide (A, T, C, G) in the human genome differs between individuals.
- Humans share about 99.9% sequence identity
- The other 0.1% (about 10 million bases) are mostly SNPs
- SNPs occur approximately every 3000 bases
- Most SNPs have only 2 alleles
- Most SNPs are not in coding regions (99% not in genes)
- SNPs can cause silent, harmless, harmful, or latent changes
- When frequent enough in a population they can be linked to specific traits, e.g. a disease
- In reality few SNPs act on their own
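Because most SNPs have only two alleles, a genotype is often turned into a number before it is fed to a classifier. The sketch below uses one common convention, counting copies of the minor allele (0, 1, or 2). This encoding, the function name `encode_genotype`, and the sample genotypes are assumptions for illustration; the slides do not say how the study coded its SNPs.

```python
def encode_genotype(genotype, minor_allele):
    """Count copies of the minor allele in a two-letter genotype such as 'AG'.

    Returns 0, 1, or 2, or None when the genotype is missing or malformed.
    """
    if genotype is None or len(genotype) != 2:
        return None
    if any(base not in "ACGT" for base in genotype):
        return None
    return sum(1 for base in genotype if base == minor_allele)


# Hypothetical genotype calls for one SNP whose minor allele is G;
# "00" stands in for a missing call.
genotypes = ["AA", "AG", "GG", "00", "GA"]
encoded = [encode_genotype(g, minor_allele="G") for g in genotypes]
# encoded == [0, 1, 2, None, 1]
```

Missing calls are kept as `None` so that a later step can decide whether to drop or impute them.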

Background Information: Biological Background (Contd)
[Figure: SNP example. G A C to G A G; Leucine to Leucine]

Background Information: Biological Background (Contd)
[Figure: SNP example. C T A to C T T; Aspartic Acid to Valine]

Materials & Methods
Dataset
- The AD genotyping and phenotyping data were obtained from GENADA. Through authorized access, we obtained individual-level data for 1718 participants.
- The GENADA dataset contains both genotypic and phenotypic information.
- In this study, 410907 SNPs were captured for each individual. Participants were recruited at 9 medical centers in Canada: eligible individuals with mild to moderate Alzheimer's disease, and a group of ethnically matched controls who are not yet considered to have AD.

Materials & Methods (Contd)
Preprocessing (takes 60% of the effort of the whole process)
Integrating Data
- GENADA contains separate repositories for genotype data and laboratory results. Each repository is organized by individual ID.
- Genotyping data (SNPs), phenotype information, and clinical information are joined using Structured Query Language (SQL).
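The integration step can be pictured as a join on the individual ID. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table names, column names, and values are all hypothetical; they do not reflect the actual GENADA schema or data.

```python
import sqlite3

# In-memory database with three hypothetical repositories keyed by subject_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phenotype (subject_id INTEGER PRIMARY KEY, status TEXT, gender TEXT);
    CREATE TABLE clinical  (subject_id INTEGER PRIMARY KEY, bmi REAL, chol REAL);
    CREATE TABLE genotype  (subject_id INTEGER, snp_id TEXT, genotype TEXT);

    INSERT INTO phenotype VALUES (1, 'case', 'F'), (2, 'control', 'M');
    INSERT INTO clinical  VALUES (1, 27.4, 5.1), (2, 24.0, 4.6);
    INSERT INTO genotype  VALUES (1, 'snp_a', 'AG'), (2, 'snp_a', 'AA');
""")

# Join phenotype, clinical, and genotype records on the shared subject ID.
rows = conn.execute("""
    SELECT p.subject_id, p.status, c.bmi, c.chol, g.snp_id, g.genotype
    FROM phenotype AS p
    JOIN clinical  AS c ON c.subject_id = p.subject_id
    JOIN genotype  AS g ON g.subject_id = p.subject_id
    ORDER BY p.subject_id
""").fetchall()
conn.close()
```

An inner join drops subjects that are missing from any one table. A `LEFT JOIN` would keep them with empty fields, which may be preferable when clinical measurements are incomplete.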

Materials & Methods (Contd)
In addition to genotype data, clinical attributes are included in the study for the prediction of AD.

| Variable Name | Description |
| --- | --- |
| Subject ID | Individual identification number. |
| age | Diagnosis age (for controls). |
| age_on | Onset age of AD (for patients). |
| gender | Sex of the individuals. |
| case/control | Information about affection status. |
| Body Mass Index (BMI) | Body fatness for individuals. |
| CHOL (mmol/l) | Amount of fat lipid carried in the blood by molecules called lipoproteins. |
| HB (g/l) | Amount of proteins that are found in red blood cells. |
| HBA1C_PCT | Percentage of hemoglobin in red blood cells (erythrocytes) that is tied up to glucose. |
| HDLCH (mmol/l) | Amount of high density lipoprotein. |
| LDLCH (mmol/l) | Amount of low density lipoprotein. |
| TRIG (mmol/l) | Amount of triglycerides in blood plasma. |
| WBC (giga/l) | The number of leukocytes in the blood. |

Materials & Methods (Contd)
Preprocessing (takes 60% of the effort of the whole process)
Dimension Reduction
From the high-dimensional set of data attributes, the subset containing the most significant attributes is chosen for constructing the decision tree prediction model.
- GWAS
- AHP scoring
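To give a feel for what selecting the most significant attributes can look like, the sketch below ranks SNPs by a Pearson chi-square statistic computed on case versus control genotype counts and keeps the top k. This is a simplified stand-in, not the GWAS or AHP scoring procedure used in the study. The counts, the SNP names, and the functions `chi_square` and `top_snps` are hypothetical, and the code assumes both groups are non-empty.

```python
def chi_square(case_counts, control_counts):
    """Pearson chi-square statistic for a 2 x k table of genotype counts."""
    total_case = sum(case_counts)
    total_control = sum(control_counts)
    total = total_case + total_control
    stat = 0.0
    for a, b in zip(case_counts, control_counts):
        col = a + b
        if col == 0:
            continue  # genotype not observed in either group
        for observed, row_total in ((a, total_case), (b, total_control)):
            expected = row_total * col / total
            stat += (observed - expected) ** 2 / expected
    return stat


def top_snps(tables, k):
    """Rank SNPs by chi-square statistic and keep the k strongest."""
    ranked = sorted(tables, key=lambda snp: chi_square(*tables[snp]), reverse=True)
    return ranked[:k]


# Hypothetical counts of genotypes with 0, 1, 2 minor-allele copies: (cases, controls)
tables = {
    "snp_a": ([10, 20, 20], [30, 15, 5]),
    "snp_b": ([17, 16, 17], [16, 17, 17]),
    "snp_c": ([25, 15, 10], [15, 20, 15]),
}
selected = top_snps(tables, k=2)
```

Here `snp_a` shows the largest difference between cases and controls, followed by `snp_c`, so those two are kept. A real pipeline would also correct for multiple testing across hundreds of thousands of SNPs.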
