Massively parallel read mapping on graphics cards (Johannes Köster)


SLIDE 1

1 / 23 Genome Informatics

Massively parallel read mapping on graphics cards

Johannes Köster

May 15, 2014

SLIDE 2

Outline

1. Next-Generation-Sequencing of DNA
2. Read Mapping
3. Algorithm
4. Results

SLIDE 4

Next-Generation-Sequencing

1. Chop DNA/RNA into small fragments.
2. Ligate adapters to both ends.
3. Spread the fragment solution across a flowcell with beads.
4. Amplify fragments into clusters (PCR).
5. Sequence fragments by adding fluorescent complementary bases → reads.

(Figure source: Illumina, 2013)

SLIDE 6

Read Mapping

For each read, find its position in the known reference genome.

  • A DNA sequence is a word over Σ = {A, C, G, T}.
  • Read mapping is string matching, but with error tolerance.
SLIDE 7

Read Mapping

For each read, find the position(s) with optimal alignment(s) to either strand of the reference.

(Figure: example alignment of reads against a reference; extracted sequences: ACTGTGGACTATCAATGGAC, GGTACTGT, CTATCTATGGACCGTTAG.)

→ Smith-Waterman algorithm

This is too slow, therefore heuristics are used to find anchor positions:

  • suffix array / Burrows-Wheeler transform (BWA, Bowtie 2)
  • q-gram indices (RazerS 3)
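
To make the cost of the exact approach concrete, here is a minimal Python sketch of Smith-Waterman local alignment scoring, applied to both strands of a read. The scoring values (match +1, mismatch −1, gap −1) and the function names are assumptions chosen for illustration; they are not taken from the talk. The nested loop over read and reference is the quadratic work that makes this too slow for whole genomes.

```python
def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def smith_waterman_score(read, ref, match=1, mismatch=-1, gap=-1):
    """Best local alignment score of read against ref (linear gap cost).

    Only the previous row of the dynamic-programming matrix is kept.
    Runtime is O(len(read) * len(ref)).
    """
    prev = [0] * (len(ref) + 1)
    best = 0
    for r in read:
        cur = [0]
        for j, t in enumerate(ref, 1):
            diag = prev[j - 1] + (match if r == t else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def best_strand(read, ref):
    """Score the read on the forward and the reverse strand, keep the better."""
    fwd = smith_waterman_score(read, ref)
    rev = smith_waterman_score(reverse_complement(read), ref)
    return ("+", fwd) if fwd >= rev else ("-", rev)
```

A read sequenced from the reverse strand only aligns well after reverse complementing, which is why both orientations are scored.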
SLIDE 8

Read mapping on GPUs

Challenges:

  • limited and slow memory (a problem for q-gram indices)
  • branching interrupts parallelism (a problem for BWT-based approaches)

Idea:

  • Use a special q-gram index with small memory footprint.
  • Use parallelism to hide memory latency.
  • Replace branching with bit-vector operations.

→ PEANUT: the ParallEl AligNment UTility

SLIDE 10

Algorithm

Main steps:

  • Filtration: find potential hits between reads and reference using a special q-gram index
  • Validation: validate hits using a bit-parallel alignment algorithm

SLIDE 12

Q-Gram Index

For a given DNA sequence T:

  • consider q-grams (substrings of length q), e.g. in GGTACTGACGTTCTATGGACCGTTAG
  • encode them as integers with 2 bits per base (A = 00, C = 01, G = 10, T = 11, first base in the lowest bits), e.g. ACGT = 11 10 01 00 = 228
  • array P with concatenation of q-gram positions
  • array Q with address in P for each q-gram

→ size 4^q + |T|

Positions of q-gram 228: P[Q[228]] ... P[Q[229]]
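
The q-gram index described above can be sketched in a few lines of Python. The array names P and Q and the 2-bit encoding follow the slide; the function names, the counting-sort construction, and the extra sentinel entry at the end of Q are assumptions made for this sketch. With the sentinel, the positions of q-gram g are P[Q[g]], ..., P[Q[g + 1] − 1].

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(qgram):
    """Encode a q-gram with 2 bits per base, first base in the lowest bits."""
    value = 0
    for i, base in enumerate(qgram):
        value |= CODE[base] << (2 * i)
    return value


def build_qgram_index(text, q):
    """Return (Q, P): Q[g] is the start in P of the positions of q-gram g."""
    codes = [encode(text[i:i + q]) for i in range(len(text) - q + 1)]
    Q = [0] * (4 ** q + 1)           # one entry per q-gram plus a sentinel
    for g in codes:
        Q[g + 1] += 1                # count occurrences
    for g in range(4 ** q):
        Q[g + 1] += Q[g]             # prefix sums turn counts into addresses
    P = [0] * len(codes)
    next_slot = Q[:]                 # next free slot in P for each q-gram
    for pos, g in enumerate(codes):
        P[next_slot[g]] = pos
        next_slot[g] += 1
    return Q, P


def positions(Q, P, qgram):
    """All text positions at which the q-gram occurs."""
    g = encode(qgram)
    return P[Q[g]:Q[g + 1]]
```

For the example text on the slide and q = 4, `positions(Q, P, "ACGT")` looks up code 228 and returns the single occurrence at position 7 (0-based).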

SLIDE 13

Q-Group Index

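  • assign each q-gram g to the q-group ⌊g/w⌋ (w = 32 in the example)
  • store the occurrence of each q-gram in a bit-vector
  • two address arrays guide from the q-group to the positions of the q-gram in the text

→ size 2/w · 4^q + min{4^q, |T|} + |T|

(Figure: lookup of q-gram 228 with w = 32: q-group 228 / 32 = 7, bit 228 % 32 = 4; a set bit means the q-gram was found.)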
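
The two size formulas can be compared directly. The sketch below counts array entries, not bytes, and the function names and the example values q = 16 and |T| = 10^8 are assumptions for illustration only.

```python
def qgram_index_size(q, text_len):
    """Entries in a plain q-gram index: 4^q + |T|."""
    return 4 ** q + text_len


def qgroup_index_size(q, text_len, w=32):
    """Entries in a q-group index: 2/w * 4^q + min(4^q, |T|) + |T|."""
    return 2 * 4 ** q // w + min(4 ** q, text_len) + text_len


# The lookup from the slide: q-gram 228 with w = 32.
group, bit = divmod(228, 32)   # q-group 7, bit 4

# Illustrative comparison with assumed parameters.
plain = qgram_index_size(16, 10 ** 8)
grouped = qgroup_index_size(16, 10 ** 8)
```

With these assumed parameters the q-group index needs roughly a tenth of the entries. The saving depends on 4^q being large relative to |T|; for small q the extra min{4^q, |T|} term can make the q-group index slightly larger.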

SLIDE 14

Q-Group Index

(Figure: example q-group index with the arrays I, S, S' and O, looking up the q-gram GAAA.)

Less memory, because we consider only ...

  • q-groups at the top level
  • occurring q-grams at the bottom

Calculate address ranges in parallel by

  • population counts
  • prefix sums
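
One way to read the four arrays I, S, S' and O is sketched below in sequential Python. This is an interpretation of the slide, not the PEANUT implementation: the roles assigned to S and S' (written `S2` in the code), the function names, and the sequential construction are all assumptions; on a GPU the population counts and prefix sums would be computed in parallel.

```python
from itertools import accumulate

CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(qgram):
    """2 bits per base, first base in the lowest bits (ACGT -> 228)."""
    return sum(CODE[base] << (2 * i) for i, base in enumerate(qgram))


def popcount(x):
    """Number of set bits."""
    return bin(x).count("1")


def build_qgroup_index(text, q, w=32):
    occurrences = {}                                  # q-gram code -> positions
    for pos in range(len(text) - q + 1):
        occurrences.setdefault(encode(text[pos:pos + q]), []).append(pos)
    I = [0] * ((4 ** q + w - 1) // w)                 # one bit-vector per q-group
    for g in occurrences:
        I[g // w] |= 1 << (g % w)
    # S[k]: number of occurring q-grams in all q-groups before group k
    S = [0] + list(accumulate(popcount(bits) for bits in I))
    # S2[r]: start in O of the r-th occurring q-gram; O: concatenated positions
    S2, O = [0], []
    for g in sorted(occurrences):
        O.extend(occurrences[g])
        S2.append(len(O))
    return I, S, S2, O


def lookup(index, qgram, w=32):
    I, S, S2, O = index
    group, bit = divmod(encode(qgram), w)
    if not (I[group] >> bit) & 1:
        return []                                     # q-gram does not occur
    # rank among occurring q-grams: earlier groups plus lower bits in this group
    rank = S[group] + popcount(I[group] & ((1 << bit) - 1))
    return O[S2[rank]:S2[rank + 1]]
```

The only data-dependent decision in `lookup` is the single bit test; the address itself comes from a population count and two array reads.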
SLIDE 16

Validation

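(Figure: edit graph of an example alignment with two transitions labeled +1; extracted sequence letters: T G T C T A T G T A.)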

Observations:

  • calculating column j needs only column j − 1
  • each transition changes edit distance by at most 1

Myers' bit-parallel algorithm [1]:

  • process graph column-wise
  • maintain distance deltas in bitvectors

(Figure: bit-vector of deltas, updated with the operations +, <<, & and ^.)

[1] Myers, 1999. J. ACM 46.
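
A compact Python sketch of the column-wise update is given below. It follows the commonly published formulation of Myers' algorithm for approximate string search; the variable names (`pv`, `mv` for the vertical +1/−1 deltas, `ph`, `mh` for the horizontal ones) and the use of Python integers as bit-vectors are choices made for this sketch, and a non-empty pattern is assumed.

```python
def myers_search(pattern, text):
    """For every text position j, return the smallest edit distance between
    pattern and any substring of text that ends at j."""
    m = len(pattern)
    mask = (1 << m) - 1               # keep all bit-vectors at m bits
    high = 1 << (m - 1)               # bit of the last pattern row
    peq = {}                          # peq[c]: bit i set if pattern[i] == c
    for i, base in enumerate(pattern):
        peq[base] = peq.get(base, 0) | (1 << i)
    pv, mv, score = mask, 0, m        # first column: distances 1, 2, ..., m
    scores = []
    for base in text:
        eq = peq.get(base, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask  # rows where the horizontal delta is +1
        mh = pv & xh                   # rows where the horizontal delta is -1
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1
        ph = (ph << 1) & mask          # shift in 0: a match may start anywhere
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
        scores.append(score)
    return scores
```

Each column is processed with a constant number of additions, shifts and logical operations, and the only branch is the score update on the last row.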

SLIDE 17

Workflow

  • load reads into buffer
  • build q-group index of reads
  • filtration of hits
  • validation of hits
  • postprocessing
  • writing
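
The legend of this slide (IO, GPU, CPU) suggests that the steps overlap, so that one buffer of reads can be loaded or written while another is being processed. A generic way to sketch such overlap is a pipeline of threads connected by bounded queues. Everything in the block below (function names, queue sizes, one thread per stage) is an assumption for illustration and says nothing about how PEANUT schedules its work.

```python
import queue
import threading


def stage(work, inbox, outbox):
    """Apply work to every item from inbox until the None sentinel arrives."""
    for item in iter(inbox.get, None):
        outbox.put(work(item))
    outbox.put(None)                      # pass the end-of-stream marker on


def run_pipeline(buffers, stages):
    """Run the stage functions concurrently over a stream of read buffers."""
    queues = [queue.Queue(maxsize=2) for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=stage, args=(work, queues[i], queues[i + 1]))
        for i, work in enumerate(stages)
    ]

    def feed():                           # stands in for loading reads (IO)
        for buf in buffers:
            queues[0].put(buf)
        queues[0].put(None)

    threads.append(threading.Thread(target=feed))
    for t in threads:
        t.start()
    results = list(iter(queues[-1].get, None))   # stands in for writing hits
    for t in threads:
        t.join()
    return results
```

Calling `run_pipeline(buffers, [filtrate, validate, postprocess])` with three hypothetical stage functions would process the buffers in order while keeping each stage busy on a different buffer.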

(Figure: timeline from start to stop. Read sequences are loaded, repeated rounds of filtration and validation follow, then postprocessing, and the hits are written. Legend: IO, GPU, CPU.)

SLIDE 19

Results

(Figure: occupancy, axis from 0.2 to 1.0, as a function of block size, axis from 100 to 600, for the kernels filter_reference, create_queries_index and validate_hits.)

SLIDE 20

Sensitivity

  • assessed with the Rabema [2] benchmark on the S. cerevisiae genome
  • 100% for reads with error rate less than 7%
  • 99.77% for error rates up to 10%
  • 98.97% for error rates up to 20%

[2] Holtgrewe et al. 2011. BMC Bioinformatics

SLIDE 21

Performance

Output types:

  • all: all alignments of a read
  • best: one of the best alignments
  • best-stratum: all best alignments

5 million simulated human reads:

mapper     type          time [min:sec]  sens. [%]
PEANUT     best-stratum  1:55            98.62
BWA-MEM    best          3:16            96.99
Bowtie 2   best          5:21            96.85
PEANUT     all           18:29           98.74
RazerS 3   all           199:55          98.83

Hardware: Intel Core i7, 16 GB RAM; NVIDIA GeForce 780, 3 GB RAM
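
The timings above can be turned into speedup factors with a little arithmetic. The helper names below are made up for this sketch, and note that the "best" and "best-stratum" output types are not identical, so the first two ratios compare slightly different tasks.

```python
def seconds(min_sec):
    """Convert a 'min:sec' string such as '18:29' into seconds."""
    minutes, secs = min_sec.split(":")
    return int(minutes) * 60 + int(secs)


def speedup(slower, faster):
    """How many times faster the second timing is than the first."""
    return seconds(slower) / seconds(faster)


bwa_vs_peanut = speedup("3:16", "1:55")        # BWA-MEM best vs PEANUT best-stratum
bowtie_vs_peanut = speedup("5:21", "1:55")     # Bowtie 2 best vs PEANUT best-stratum
razers_vs_peanut = speedup("199:55", "18:29")  # RazerS 3 all vs PEANUT all
```

On the simulated data set this gives factors of about 1.7, 2.8 and 10.8 respectively.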

SLIDE 22

Performance

5 million real human exome reads:

mapper     type          time [min:sec]
PEANUT     best-stratum  1:33
BWA-MEM    best          1:58
Bowtie 2   best          3:12
PEANUT     all           10:52
RazerS 3   all           89:38

Hardware: Intel Core i7, 16 GB RAM; NVIDIA GeForce 780, 3 GB RAM

SLIDE 23

Performance

10 million human exome paired-end reads:

mapper     type          time [min:sec]
PEANUT     best-stratum  3:08
BWA-MEM    best          4:44
Bowtie 2   best          8:18
PEANUT     all           21:54
RazerS 3   all           150:59

Hardware: Intel Core i7, 16 GB RAM; NVIDIA GeForce 780, 3 GB RAM

SLIDE 24

Summary

PEANUT is a GPU-based read mapper that outperforms the other state-of-the-art mappers tested here in terms of

  • sensitivity
  • speed

by introducing the q-group index with small memory footprint and exploiting

  • bit-vector operations
  • prefix sums
  • population counts

http://peanut.readthedocs.org