GPU computing and the tree of life Michael P . Cummings Center for - - PowerPoint PPT Presentation


GPU computing and the tree of life Michael P . Cummings Center for Bioinformatics and Computational Biology University of Maryland Institute of Advanced Computer Studies GPU summit 27 October 2014 some domain science context the great apes


slide-1
SLIDE 1

GPU computing and the tree of life

Michael P. Cummings, Center for Bioinformatics and Computational Biology, University of Maryland Institute of Advanced Computer Studies

GPU summit

27 October 2014

slide-2
SLIDE 2

some domain science context

slide-3
SLIDE 3

the great apes

slide-4
SLIDE 4

great apes: phylogenetic relationships?

slide-5
SLIDE 5

great apes: phylogenetic relationships?

slide-6
SLIDE 6

great apes: phylogenetic relationships?

slide-7
SLIDE 7

phylogenetic relationships of great apes

when subjected to phylogenetic analysis, overwhelming evidence supports chimps and humans being each other's closest relatives

slide-8
SLIDE 8

number of possible topologies

tips   unrooted trees
3      1
4      3
5      15
6      105
7      945
8      10,395
9      135,135
10     2,027,025
11     34,459,425
12     654,729,075
13     13,749,310,575
14     316,234,143,225
15     7,905,853,580,625
20     221,643,095,476,699,771,875
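The counts above follow the double factorial (2n − 5)!! for n tips, i.e. the product of the odd numbers 3, 5, ..., 2n − 5. A minimal Python sketch that reproduces them; the function name is illustrative and not from the talk:

```python
def num_unrooted_trees(tips):
    """Number of unrooted, strictly bifurcating topologies for `tips` labelled tips.

    Uses the double factorial (2n - 5)!! = 3 * 5 * ... * (2n - 5).
    """
    if tips < 3:
        raise ValueError("need at least 3 tips")
    count = 1
    for k in range(3, 2 * tips - 4, 2):  # odd factors 3, 5, ..., 2n - 5
        count *= k
    return count


if __name__ == "__main__":
    for n in (3, 4, 5, 10, 15, 20):
        print(f"{n:>4}  {num_unrooted_trees(n):,}")
```

Python integers are arbitrary precision, so the count for 20 tips is computed exactly rather than rounded.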

slide-9
SLIDE 9

phylogenetic analysis

the most accurate methods are model-based and involve likelihood calculations

  • maximum likelihood estimation
  • Bayesian analysis

$$\mathrm{Prob}(H \mid D) = \frac{\mathrm{Prob}(D \mid H)\,\mathrm{Prob}(H)}{\mathrm{Prob}(D)}$$

we can only directly calculate Prob(D|H)

[figure: four aligned sequences (A A C T ..., A A T G ..., A A T A ..., A C T G ...) evaluated on a tree; log likelihood = -1574.63624]
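As a small illustration of Bayes' rule for a discrete set of hypotheses, the sketch below turns log likelihoods into posterior probabilities. The three log likelihood values and the uniform prior are made-up placeholders, not results from the talk, and the function name is illustrative. Summing over hypotheses to get Prob(D) is only feasible here because there are three of them; with the numbers of topologies listed earlier, that sum cannot be enumerated, which is why Bayesian samplers work with ratios in which Prob(D) cancels.

```python
import math


def posterior(log_likelihoods, priors):
    """Return Prob(H|D) for each hypothesis, normalising in log space."""
    # log of Prob(D|H) * Prob(H) for every hypothesis H
    log_joint = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    # log Prob(D) = log sum_H Prob(D|H) Prob(H), via the log-sum-exp trick;
    # exponentiating a log likelihood near -1575 directly would underflow to 0.0
    m = max(log_joint)
    log_evidence = m + math.log(sum(math.exp(v - m) for v in log_joint))
    return [math.exp(v - log_evidence) for v in log_joint]


# Made-up log likelihoods for three hypothetical topologies, uniform prior.
example = posterior([-1574.6, -1580.1, -1581.3], [1 / 3, 1 / 3, 1 / 3])
```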

slide-10
SLIDE 10

likelihood calculation

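[figure: parent node x0 with child nodes x1 and x2 on branches of length t1 and t2]

$$L_0^{(i)}(x_0) = \left( \sum_{x_1} \mathrm{Prob}(x_1 \mid x_0, t_1)\, L_1^{(i)}(x_1) \right) \left( \sum_{x_2} \mathrm{Prob}(x_2 \mid x_0, t_2)\, L_2^{(i)}(x_2) \right)$$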

nonetheless, likelihood calculations are very computationally intensive - O(taxa x sites x rates x states²)

peeling algorithm (Felsenstein 1981) does post-order traversal with calculation of partial likelihoods at each node that depend only on its immediate children

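The recursion above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the BEAGLE or GARLI implementation: it assumes the Jukes-Cantor (JC69) substitution model, a single rate category, uniform state frequencies at the root, and a hypothetical tree encoding in which a tip is a sequence string and an internal node is a list of (child, branch length) pairs. The nested loops over parent and child states make the states² factor in the cost visible.

```python
import math

STATES = "ACGT"


def jc69(t):
    """4x4 Jukes-Cantor transition matrix for branch length t."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def partials(node, site):
    """Post-order traversal: partial likelihoods L(x) at `node` for one site."""
    if isinstance(node, str):  # tip: 1 for the observed state, 0 otherwise
        return [1.0 if s == node[site] else 0.0 for s in STATES]
    result = [1.0] * 4
    for child, t in node:  # each child contributes one factor of the product
        p = jc69(t)
        child_l = partials(child, site)
        for x0 in range(4):  # parent state
            result[x0] *= sum(p[x0][x1] * child_l[x1] for x1 in range(4))
    return result


def log_likelihood(tree, n_sites):
    """Sum of per-site log likelihoods with uniform root frequencies."""
    total = 0.0
    for site in range(n_sites):
        root = partials(tree, site)
        total += math.log(sum(0.25 * value for value in root))
    return total


# Hypothetical four-tip tree; the branch lengths are made up for illustration.
tree = [([("AACT", 0.1), ("AATG", 0.1)], 0.05),
        ([("AATA", 0.1), ("ACTG", 0.1)], 0.05)]
lnL = log_likelihood(tree, 4)
```

Each internal node needs only the partial likelihoods of its immediate children, so a production implementation can store them per node and update only the nodes affected by a change to the tree.
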
slide-11
SLIDE 11

likelihood calculations: majority of computation

data type     likelihood related calculations
nucleotide    94.69%
amino acid    95.72%
codon         81.24%

GARLI profiling; 11 taxa; 2178 characters

slide-12
SLIDE 12

BEAGLE: broad-platform evolutionary analysis general likelihood evaluator

  • an application programming interface (API) and high-performance computing library for statistical phylogenetics
  • emphasis is evaluating phylogenetic likelihoods of biomolecular sequence evolution
  • aim is to provide high performance evaluation 'services' to a wide range of phylogenetic software, both Bayesian samplers and maximum likelihood optimizers
  • allows phylogenetic software using the library to make use of optimized hardware such as GPUs

slide-13
SLIDE 13

BEAGLE library design goals

  • open-source (LGPL)
  • multi-platform support (i.e., Linux, OS X, Windows)
  • low level C API
  • does not explicitly have concept of tree
  • minimize transfer of data
  • support multiple implementations (e.g., CPU, SSE, CUDA, OpenCL)
  • uses dynamic plug-in system
  • support both single and double precision

slide-14
SLIDE 14

GPU implementation

  • CPU-side code only used to manage GPU: memory allocations and transfers, kernel launches
  • allows client to use CPU in parallel to GPU
  • GPU interface abstraction layer
  • CUDA and OpenCL implementations share same CPU-side code
  • CUDA implementation uses the driver API
  • Parallel Thread Execution (PTX) kernels
  • just-in-time (JIT) compilation
  • templated kernels support arbitrary number of states
  • multiple GPUs supported via client-side partitioning (scales linearly)
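Client-side partitioning works because site patterns are independent given the tree: the total log likelihood is a sum of per-pattern terms, so each device can evaluate its own block and the client adds the results. A minimal sketch of that idea in plain Python; the function names are hypothetical, the inner sums merely stand in for per-device work, and no BEAGLE calls are shown:

```python
def partition(n_patterns, n_devices):
    """Split pattern indices 0..n_patterns-1 into contiguous, near-equal ranges."""
    base, extra = divmod(n_patterns, n_devices)
    ranges, start = [], 0
    for d in range(n_devices):
        size = base + (1 if d < extra else 0)  # first `extra` devices get one more
        ranges.append(range(start, start + size))
        start += size
    return ranges


def total_log_likelihood(pattern_log_likelihoods, n_devices):
    """Add up per-device partial sums; each inner sum models one device's block."""
    parts = partition(len(pattern_log_likelihoods), n_devices)
    return sum(sum(pattern_log_likelihoods[i] for i in r) for r in parts)
```

Near-equal block sizes keep the devices similarly loaded, which is what the linear scaling noted on the slide relies on when the devices are identical.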

slide-15
SLIDE 15

gross structure of BEAGLE

[figure: block diagram. client programs: BEAST, MrBayes, GARLI. BEAGLE components: C API, JNI wrapper, implementation manager, CPU implementation, GPU implementation, CUDA interface, OpenCL interface, CUDA kernels, OpenCL kernels]

slide-16
SLIDE 16

throughput for nucleotide data (4 states)

[figure: GFLOPS and speedup factor versus number of unique site patterns]

  • GPU: AMD Radeon HD 7970 GHz Edition
  • GPU: NVIDIA GeForce GTX 580 (CUDA)
  • GPU: NVIDIA Tesla K20m
  • MIC: Intel Xeon Phi SE10P
  • CPU: Intel Xeon E5-2680 x2 (16 cores)
  • CPU: Intel Xeon E5-2680 (single core)

slide-17
SLIDE 17

throughput for codon data (64 states)

[figure: GFLOPS and speedup factor versus number of unique site patterns]

  • GPU: AMD Radeon HD 7970 GHz Edition
  • GPU: NVIDIA GeForce GTX 580 (CUDA)
  • GPU: NVIDIA Tesla K20m
  • MIC: Intel Xeon Phi SE10P
  • CPU: Intel Xeon E5-2680 x2 (16 cores)
  • CPU: Intel Xeon E5-2680 (single core)

slide-18
SLIDE 18

MrBayes speedup

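[figure: bar chart of speedup factor for the nucleotide model and the codon model, each in double and single precision; reference bars labelled MrBayes and MrBayes SSE; bar labels in extraction order: 23, 525, 2.6, 15, 16, 18, 1.3, 40, 35, 89, 1.9, 3.1, 2.3, 3.4, 10, 13, 15, 1.3, 1.6, 17, 20, 50]

  • GPU: AMD 7970
  • MIC: Xeon Phi
  • CPU: 16 cores
  • CPU: SSE
  • CPU: standard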

slide-19
SLIDE 19

BEAST speedup

[figure: bar chart of speedup factor for the nucleotide model and the codon model, each in double and single precision; bar labels in extraction order: 889, 806, 1.0, 115, 1.4, 16, 18, 44, 27, 47, 1.2, 1.5, 8.6, 12, 28, 1.2, 1.5, 15, 25, 55]

  • GPU: AMD 7970
  • MIC: Xeon Phi
  • CPU: 16 cores
  • CPU: SSE
  • CPU: standard

slide-20
SLIDE 20

more than academic

academic: having no practical or useful significance

Webster’s New Collegiate Dictionary

slide-21
SLIDE 21

two recent studies using BEAGLE library

slide-22
SLIDE 22

phylogenetics in use: early spread of HIV-1

Faria et al. 2014 The early spread and epidemic ignition of HIV-1 in human populations. Science 346:56-61

slide-23
SLIDE 23

phylogenetics in use: 2014 Ebola outbreak

Gire et al. 2014 Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak. Science 345:1369-1372

slide-24
SLIDE 24

acknowledgements

Daniel Ayres, University of Maryland
Peter Beerli, Florida State University
Aaron Darling, University of Technology Sydney
Mark Holder, University of Kansas
John Huelsenbeck, University of California, Berkeley
Paul Lewis, University of Connecticut
Andrew Rambaut, University of Edinburgh
Fredrik Ronquist, Swedish National Museum of Natural History
Marc Suchard, University of California, Los Angeles
David Swofford, Duke University
Derrick Zwickl, University of Arizona
Dan Stanzione, Texas Advanced Computing Center
Yariv Aridor and Arik Narkis, Intel Israel
Altera University Donation Program
National Science Foundation