Accelerating Gene Set Enrichment Analysis on CUDA-Enabled GPUs



SLIDE 1

Accelerating Gene Set Enrichment Analysis on CUDA-Enabled GPUs

Bertil Schmidt Christian Hundt

SLIDE 2

Contents

  • Gene Set Enrichment Analysis (GSEA)

– Background
– Algorithmic details

  • cudaGSEA
  • Performance evaluation
SLIDE 3

GSEA and Bioinformatics

  • High-throughput technologies generate large-scale gene expression data sets

– RNA-Seq
– Microarrays

  • GSEA uses annotated gene sets to mine a given gene expression matrix

– MSigDB contains over 10K signatures, each containing around 100 gene identifiers on average

  • Typical GSEA study:

– Identify metabolic pathways that are differentially changed in human type-2 diabetes

SLIDE 4

Gene Set Enrichment Analysis

  • Reveals correlations between gene sets and diseases using gene expression data
  • State-of-the-art tool with over 10,000 citations
  • Written in (multi-threaded) Java
  • Highly time-consuming

– Analyzing 20,639 genes measured in 200 patients with 4,726 pathways and 1M permutations takes around 1 week with the GSEA 2.2.2 software on a CPU

  • We present

– GSEA parallelization on a GPU using CUDA (cudaGSEA)
– cudaGSEA is around two orders of magnitude faster than BroadGSEA

SLIDE 5

GSEA Algorithm – Gene Ranking

  • Gene expression matrix D is obtained from RNA-Seq or microarray experiments
  • For each gene i and patient j with associated (binary) phenotype C, the expression value D[i,j] is stored
  • Diseases are driven by complex gene interactions, so simply reporting top-ranked genes produces many false positives
  • Domain experts provide sets of genes that might explain the observed phenotypes
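
The ranking step can be sketched in a few lines of Python. This is a minimal illustration, not the cudaGSEA implementation: the function name `rank_genes`, the toy matrix, and the choice of the signal-to-noise ratio with sample standard deviations are assumptions made for the example.

```python
from statistics import mean, stdev

def rank_genes(D, C):
    """Rank genes by signal-to-noise ratio between two phenotype classes.

    D: list of rows, D[i][j] = expression value of gene i in patient j.
    C: list of binary phenotype labels (0 or 1), one per patient.
    Returns (scores, order); order lists gene indices from the highest
    to the lowest score. Assumes at least two patients per class and a
    non-zero summed standard deviation for every gene.
    """
    scores = []
    for row in D:
        a = [x for x, c in zip(row, C) if c == 1]
        b = [x for x, c in zip(row, C) if c == 0]
        # Difference of class means divided by the summed standard deviations.
        scores.append((mean(a) - mean(b)) / (stdev(a) + stdev(b)))
    order = sorted(range(len(D)), key=lambda i: scores[i], reverse=True)
    return scores, order

# Toy data (hypothetical): 3 genes, 4 patients.
D = [
    [5.0, 6.0, 1.0, 2.0],  # gene 0: higher in class 1
    [1.0, 2.0, 5.0, 6.0],  # gene 1: higher in class 0
    [3.0, 4.0, 3.0, 4.0],  # gene 2: no class difference
]
C = [1, 1, 0, 0]
scores, order = rank_genes(D, C)
```

Here gene 0 ranks first, gene 2 sits in the middle, and gene 1 ranks last.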

SLIDE 6

GSEA Algorithm – Enrichment score

  • Enrichment score (ES) measures the correlation between a given gene set S and the calculated gene ranking g(i)

– Report the maximum deviation from zero of a running sum (k)
– The sum increases if we hit a member of S and decreases otherwise

  • How significant is ES = 0.857? This is answered by p-value calculation using permutation testing
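
The running sum can be sketched as follows. The slides do not state the step sizes, so this example assumes the simple unweighted variant: each hit adds 1/|S| and each miss subtracts 1/(N - |S|). The name `enrichment_score` is hypothetical.

```python
def enrichment_score(order, gene_set):
    """Unweighted running-sum enrichment score (illustrative sketch).

    order: gene indices sorted from highest to lowest ranking score.
    gene_set: set of gene indices forming the gene set S.
    Returns the signed running-sum value with the largest magnitude.
    """
    n = len(order)
    hits = sum(1 for g in order if g in gene_set)
    if hits == 0 or hits == n:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    up = 1.0 / hits          # step when the walk hits a member of S
    down = 1.0 / (n - hits)  # step when the walk misses
    running = 0.0
    best = 0.0
    for g in order:
        running += up if g in gene_set else -down
        if abs(running) > abs(best):
            best = running
    return best

ranked = list(range(6))                          # hypothetical ranking of six genes
top_heavy = enrichment_score(ranked, {0, 1})     # set members at the top
bottom_heavy = enrichment_score(ranked, {4, 5})  # set members at the bottom
```

A gene set concentrated at the top of the ranking yields a positive score, one concentrated at the bottom a negative score.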
SLIDE 7

GSEA Algorithm – Permutation testing

SLIDE 8

GSEA Algorithm – Permutation testing

SLIDE 9

GSEA Algorithm

[Figure: histogram with axis label |ES|]

  • Histogram of 1,000,000 enrichment scores obtained by permuting patient phenotypes

  • Estimate p-value by counting events in both tails
  • Why so many permutations?

– When testing 1,000 gene sets at significance level p < 0.001, the Bonferroni-corrected condition is 1,000·p < 0.001, i.e. p < 0.000001 per gene set; resolving such a p-value requires more than 1,000,000 permuted samples
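
A small CPU-side sketch of the permutation test, under stated assumptions: genes are ranked by the difference of class means, the enrichment score is the unweighted running sum, and the p-value uses the common (extreme + 1) / (permutations + 1) convention so that it is never exactly zero. None of these choices are confirmed by the slides, and all function names are hypothetical.

```python
import random

def diff_of_means(row, labels):
    a = [x for x, c in zip(row, labels) if c == 1]
    b = [x for x, c in zip(row, labels) if c == 0]
    return sum(a) / len(a) - sum(b) / len(b)

def es_for_labels(D, labels, gene_set):
    """Rank genes for the given labels, then compute the running-sum ES."""
    order = sorted(range(len(D)),
                   key=lambda i: diff_of_means(D[i], labels), reverse=True)
    hits = len(gene_set)
    up, down = 1.0 / hits, 1.0 / (len(D) - hits)
    running = best = 0.0
    for g in order:
        running += up if g in gene_set else -down
        if abs(running) > abs(best):
            best = running
    return best

def permutation_p_value(D, labels, gene_set, num_perms, seed=0):
    """Two-tailed p-value from shuffled phenotype labels."""
    rng = random.Random(seed)
    observed = es_for_labels(D, labels, gene_set)
    shuffled = list(labels)
    extreme = 0
    for _ in range(num_perms):
        rng.shuffle(shuffled)  # class sizes are preserved by shuffling
        if abs(es_for_labels(D, shuffled, gene_set)) >= abs(observed):
            extreme += 1       # event in either tail
    return observed, (extreme + 1) / (num_perms + 1)

# Toy data (hypothetical): 6 genes, 6 patients, genes 0 and 1 form the set.
D = [
    [9, 8, 9, 1, 2, 1],
    [7, 8, 7, 2, 1, 2],
    [3, 4, 3, 4, 3, 4],
    [5, 5, 4, 5, 4, 5],
    [2, 3, 2, 3, 2, 3],
    [1, 1, 2, 6, 7, 6],
]
labels = [1, 1, 1, 0, 0, 0]
observed, p = permutation_p_value(D, labels, {0, 1}, num_perms=200)
```

The smallest p-value this estimator can report is about 1/num_perms, which is why a Bonferroni-corrected threshold of 0.000001 calls for more than 1,000,000 permutations.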

SLIDE 10

CUDA Parallelization

Transpose D to ensure coalesced memory accesses
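
Why the transpose helps can be shown by looking at flat memory offsets. The sketch below assumes one GPU thread per gene, with every thread reading the value for the same patient at each loop step, and a row-major array. The slides do not spell out this thread mapping, so treat it as an assumption; `thread_offsets` is a hypothetical helper.

```python
def thread_offsets(num_genes, num_patients, patient, transposed, num_threads=4):
    """Flat element offsets read by the first few threads at one loop step.

    Assumes thread t handles gene t and all threads read the same patient.
    """
    offsets = []
    for gene in range(num_threads):
        if transposed:
            # Layout patients x genes: element (patient, gene).
            offsets.append(patient * num_genes + gene)
        else:
            # Layout genes x patients: element (gene, patient).
            offsets.append(gene * num_patients + patient)
    return offsets

original = thread_offsets(20639, 200, patient=0, transposed=False)
after_transpose = thread_offsets(20639, 200, patient=0, transposed=True)
```

In the genes x patients layout, neighbouring threads are 200 elements apart; after the transpose they read consecutive elements, which is the access pattern a GPU can coalesce into few memory transactions.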

SLIDE 11

CUDA Parallelization

SLIDE 12

CUDA Parallelization

SLIDE 13

CUDA Implementation Details

  • Support for single-precision and double-precision
  • Resulting matrix of enrichment scores (#gene sets x #permutations) can be large

– e.g. 5K x 1M x 8 B = 40 GB

  • p-value estimation, family-wise error rate (FWER), and normalized enrichment score (NES) computation can be accomplished on the GPU with (sum/max) reduction kernels, without the need to store this matrix
  • For false discovery rate (FDR) computation, this matrix is transferred to the CPU for post-processing
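
The memory estimate and the idea of reducing scores on the fly can be sketched as below. This is a plain-Python illustration of sum-style and max-style reductions over batches of permuted scores, not the actual GPU kernels; `es_matrix_bytes` and `StreamingStats` are hypothetical names.

```python
def es_matrix_bytes(num_gene_sets, num_perms, bytes_per_value):
    """Size of the full enrichment-score matrix in bytes."""
    return num_gene_sets * num_perms * bytes_per_value

class StreamingStats:
    """Per-gene-set statistics accumulated without storing permuted scores."""

    def __init__(self, observed):
        self.observed = observed
        self.count = 0
        self.extreme = 0    # sum-style reduction: tail events for the p-value
        self.max_abs = 0.0  # max-style reduction: largest |ES| seen so far

    def update(self, batch):
        for es in batch:
            self.count += 1
            if abs(es) >= abs(self.observed):
                self.extreme += 1
            self.max_abs = max(self.max_abs, abs(es))

    def p_value(self):
        return self.extreme / self.count

full_matrix = es_matrix_bytes(5_000, 1_000_000, 8)  # double precision

stats = StreamingStats(observed=0.8)
stats.update([0.1, -0.9, 0.5])  # first batch of permuted scores
stats.update([0.85, -0.2])      # second batch; earlier scores are already discarded
```

Only a few counters per gene set survive between batches, which is what makes the 40 GB matrix unnecessary for these statistics.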

SLIDE 14

cudaGSEA Features

  • Reading data sets directly in Broad Institute-compatible file formats
  • Supporting several local deviation measures

– Mean-based measures (difference/quotient/log-quotient of means)
– Mean and standard deviation-based measures (signal-to-noise ratio, t-tests, one/two-pass estimation)
– Numerically stable summation schemes for local measures and ES (Kahan etc.)

  • Package for the R framework and standalone application
  • Multi-threaded CPU version in C++ using OpenMP
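
Kahan summation, named above as one of the stable summation schemes, can be illustrated as follows. This is the textbook algorithm in Python, shown next to a naive loop for comparison; it is not taken from the cudaGSEA source.

```python
def naive_sum(values):
    total = 0.0
    for x in values:
        total += x
    return total

def kahan_sum(values):
    """Compensated summation: carries the rounding error of each addition."""
    total = 0.0
    compensation = 0.0  # running estimate of the lost low-order bits
    for x in values:
        y = x - compensation            # re-inject the previously lost part
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
    return total

# One large value followed by many values too small to register individually.
values = [1.0] + [1e-16] * 100_000
naive = naive_sum(values)
stable = kahan_sum(values)
```

The naive loop returns exactly 1.0 because each tiny addend is rounded away, while the compensated sum stays close to the true value of 1 + 1e-11.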
SLIDE 15

Performance Evaluation

  • GSE19429 dataset

– Collapsed to 20,639 gene symbols; 200 patients (183 cases + 17 controls)

  • Hallmark: 50 gene sets

– Smallest gene set collection in MSigDB 5.1

  • GeForce Titan X (single precision) / Tesla K40c (double precision, ECC off), CUDA 7.5
  • 10-core Xeon E5-2660v3 @ 2.60 GHz, 20 threads, Ubuntu 14.04, gcc 4.8.4, 64-bit OpenJDK
  • BroadGSEA v.2.2.2

SLIDE 16

Performance Evaluation

  • GSE19429 dataset

– Collapsed to 20,639 gene symbols; 200 patients (183 cases + 17 controls)

  • C2: 4,726 gene sets

– Largest gene set collection in MSigDB 5.1

  • GeForce Titan X (single precision) / Tesla K40c (double precision, ECC off), CUDA 7.5
  • 10-core Xeon E5-2660v3 @ 2.60 GHz, 20 threads, Ubuntu 14.04, gcc 4.8.4, 64-bit OpenJDK
  • BroadGSEA v.2.2.2
SLIDE 17

Conclusion

  • High-throughput technologies establish the need for scalable bioinformatics tools that can process large-scale gene expression data sets
  • CUDA is a suitable technology to address this need
  • cudaGSEA on one GPU achieves a speedup of around two orders of magnitude versus BroadGSEA on a CPU

– Analyzing 20,639 genes measured in 200 patients with 4,726 pathways and 1M permutations takes around 1 week with GSEA 2.2.2 on a Xeon E5-2660v3 CPU, but less than 1 hour on a GeForce Titan X

  • Source code available at:

– https://github.com/gravitino/cudaGSEA

  • Group website:

– https://www.hpc.informatik.uni-mainz.de/

SLIDE 18

Accelerating Gene Set Enrichment Analysis on CUDA-Enabled GPUs

Bertil Schmidt, Christian Hundt
Institute of Computer Science
Johannes Gutenberg University Mainz
{bertil.schmidt, hundt}@uni-mainz.de

Thank you!