Big data approaches in cardiovascular genomic research (Randal Westrick)



SLIDE 1

CENTER FOR DATA SCIENCE AND BIG DATA ANALYTICS

Strength in Numbers

Big data approaches in cardiovascular genomic research

Randal Westrick CDaS and Department of Biological Sciences

  • Dec. 1, 2016
SLIDE 2

Thrombosis is the #1 Cause of Morbidity and Mortality

  • Thrombosis
  • Blood clotting INSIDE the blood vessels
  • Arterial Thrombosis
  • Atherothrombosis in arteries following plaque rupture causes heart attacks and strokes

  • No way to predict which plaques will rupture
  • Venous thrombosis
  • Occurs in veins at sites of stasis, commonly veins of the lower extremity
  • Venous thromboembolism (VTE)
  • Life threatening
  • No way to predict when VTE will occur
SLIDE 3

What Are the Genetic and Environmental Risk Factors for Blood Clotting (Thrombosis)?

Arterial thrombosis

  • 60% heritable
  • Risk factors?
  • Plasminogen activator inhibitor 1 (PAI-1)
  • Causal?

Venous thrombosis

  • 60% heritable
  • Factor V Leiden (F5L)
    – Accounts for 25% of genetic risk
    – Not everybody inheriting F5L will get venous thrombosis
    – Tissue Factor Pathway Inhibitor?

[Pie chart: GENES 60% (including unknown genes?), ENVIRONMENT 40%]

Improved human and animal studies of arterial and venous thrombosis are essential for identifying thrombosis genes and environmental triggers
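
The percentages on this slide can be combined into a rough partition of venous thrombosis risk. The sketch below is only an illustration: it assumes that "25% of genetic risk" means 25% of the 60% heritable share, which the slide does not state explicitly, and the function name `partition_risk` is hypothetical.

```python
def partition_risk(heritable_pct, known_locus_pct_of_genetic):
    """Split total risk (100%) into known-locus, other-gene and environment shares.

    ASSUMPTION: the known locus explains a percentage of the heritable share,
    not of the total risk.
    """
    known = heritable_pct * known_locus_pct_of_genetic / 100
    return {
        "known_locus": known,                  # e.g. Factor V Leiden
        "other_genes": heritable_pct - known,  # the "unknown genes?" share
        "environment": 100 - heritable_pct,
    }


# Slide values: 60% heritable, F5L accounts for 25% of genetic risk.
shares = partition_risk(60, 25)
for name, pct in shares.items():
    print(f"{name}: {pct:.0f}% of total risk")
```

Under that reading, F5L would correspond to about 15% of total risk, leaving about 45% to genes that are still unknown.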

SLIDE 4

The Present Role of Big Data in Cardiovascular Thrombotic Disease

  • Researchers are harnessing Big Data Genomic Analyses to:
    – Identify disease susceptibility even before illness appears
    – Facilitate preventive treatment
    – Improve diagnosis
    – Tailor treatment to diagnosis
    – Facilitate treatment prescription with minimum toxicity and maximum efficacy (pharmacogenomics)
  • Efforts are centered on genotyping and sequencing hundreds of thousands of patient and control genomes
  • Analyses have not yet identified big effect genes (disregarding environment???)

SLIDE 5

Genome Wide ENU Mutagenesis Suppressor Screen for Thrombosis Genes

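[Breeding scheme diagram: ENU-treated mice, crosses (X) between F5L/L and F5L/+ TFPI+/- mice (♀ ♂), and surviving F5L/L TFPI+/- S+/- offspring. Counts shown in the diagram: 10,415 and 134.]

  • ENU induces ~30 random mutations throughout the genome per offspring
  • ~2700 conceptions (~2-3X genome coverage)
  • Nonlethal genotypes: F5L/L TFPI+/+, F5L/+ TFPI+/+, F5L/+ TFPI+/-
  • Normally lethal: F5L/L TFPI+/-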

SLIDE 6

Identification of Thrombosis Genes by Whole Exome Sequencing

  • Pull out all coding sequence (exome) from genomic DNA samples
  • 100x genome coverage per 50 megabase exome
  • Assembly and analysis of sequencing reads
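
As a rough illustration of what 100x coverage of a 50 megabase exome implies, mean depth is commonly estimated as (number of reads × read length) / target size. The sketch below inverts that relation. The read length and on-target fraction are assumptions chosen for illustration, not values from the slides, and the function names are hypothetical.

```python
import math


def bases_needed(target_size_bp, mean_depth):
    """Total on-target bases required for a given mean depth."""
    return target_size_bp * mean_depth


def reads_needed(target_size_bp, mean_depth, read_length_bp, on_target_fraction=1.0):
    """Number of reads required, allowing for reads that miss the target."""
    total_bases = bases_needed(target_size_bp, mean_depth) / on_target_fraction
    return math.ceil(total_bases / read_length_bp)


EXOME_BP = 50_000_000  # 50 megabase exome (from the slide)
DEPTH = 100            # 100x coverage (from the slide)
READ_LEN = 100         # ASSUMPTION: 100 bp reads
ON_TARGET = 0.7        # ASSUMPTION: 70% of reads land on the exome

total_bases = bases_needed(EXOME_BP, DEPTH)
ideal_reads = reads_needed(EXOME_BP, DEPTH, READ_LEN)
realistic_reads = reads_needed(EXOME_BP, DEPTH, READ_LEN, ON_TARGET)

print(f"on-target bases per sample: {total_bases:,}")
print(f"reads if every read is on target: {ideal_reads:,}")
print(f"reads at the assumed on-target rate: {realistic_reads:,}")
```

With these numbers each sample needs 5 gigabases of on-target sequence, which is why exome capture is far cheaper per sample than sequencing the whole genome to the same depth.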

SLIDE 7

Mutation Burden Testing and Distributions of ENU Induced Variants in 114 F5L/L TFPI+/- Mice

NHLBI resequencing program

[Figure: mutation burden across mice carrying suppressor 1, suppressor 2, suppressor 3, suppressor 4, ... suppressor n. Legend: ENU mutation, suppressor mutation.]

SLIDE 8

A Subset of Novel Thrombosis Genes Identified by Whole Exome Sequencing

  • Several thrombosis genes have been found through this method
  • The majority have not been found
  • Whole genome sequencing for the remainder

p-value calculated by 10,000,000 permutations (normalized to gene size)
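
A permutation p-value normalized to gene size can be sketched as follows. This is a minimal illustration of the general idea, not the pipeline used in the talk: it assumes that, under the null hypothesis, each ENU mutation lands in a gene with probability proportional to that gene's size. The gene names, sizes and counts are made up, and the sketch uses far fewer than 10,000,000 permutations to stay fast.

```python
import random


def burden_p_value(gene_sizes, gene, observed_count, total_mutations,
                   n_permutations=2000, seed=0):
    """Empirical p-value for seeing >= observed_count mutations in `gene`.

    Null model (ASSUMPTION): mutations fall in genes with probability
    proportional to gene size.
    """
    rng = random.Random(seed)
    genes = list(gene_sizes)
    weights = [gene_sizes[g] for g in genes]
    hits = 0
    for _ in range(n_permutations):
        # Randomly redistribute all mutations across genes, weighted by size.
        placed = rng.choices(genes, weights=weights, k=total_mutations)
        if placed.count(gene) >= observed_count:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical toy data: sizes in base pairs, 50 mutations in total.
toy_sizes = {"geneA": 1_000, "geneB": 5_000, "geneC": 20_000, "geneD": 74_000}

p_small_gene = burden_p_value(toy_sizes, "geneA", observed_count=5, total_mutations=50)
p_large_gene = burden_p_value(toy_sizes, "geneD", observed_count=1, total_mutations=50)

print(f"5 hits in a small gene: p = {p_small_gene:.4f}")
print(f"1 hit in a large gene:  p = {p_large_gene:.4f}")
```

Weighting by gene size keeps large genes from looking significant merely because they present a bigger mutational target. The smallest p-value this estimator can report is 1 / (n_permutations + 1), which is one reason to run many permutations.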

SLIDE 9

Cardiovascular Thrombotic Disease Research at CDaS

  • Leverage Center expertise to analyze and identify environmental thrombosis triggers from lifestyle data (sleep, activity, stress…) using biosensors
    – In our genetically susceptible rodents (proof of principle)
    – In humans (in partnership with translational medicine researchers and companies)
  • Integrate Genome and Envirome information to inform us about how particular genomes interact with their environments
  • Graduate and Undergraduate Students will acquire research skills including Big Data Genome Analysis and Big Data Sensor/Scanning analyses
  • Develop novel preventive or pre-disease therapeutic strategies for thrombosis/cardiovascular disease