EECS 573 GenAx Paper Presentation
GenAx: A Genome Sequence Accelerator
Daichi Fujiki et al.
Presented by: Amani Alkayyali, Ben Cyr

Genome Sequencing
DNA: Thymine, Cytosine, Adenine, Guanine
Genome sequencing: determining the order of T, C, A, G
○ Understanding the entire DNA sequence as a system
[Figure: DNA bases labeled Thymine, Adenine, Cytosine, Guanine]
lgrdlmnqvtthequickababcmfxlqbrownfoxjurvsmpedoverthelazyyyzplfdogjjiurttlythedoglayhhbeldquietlydreaminghwwiqldnsofdinnerplwosiucnd
and personalized medicine
○ Understanding an individual’s cancer cell mutations
diseases
https://rnsights.com/the-push-for-personalized-medicine/
○ Break into small pieces (reads) at random positions
○ Determine the sequence
○ Figure out which pieces fit together (read alignment)

○ Clone-by-Clone
○ Whole Genome Sequencing
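
A minimal Python sketch of the "break into reads, then align" idea above. The function names (`make_reads`, `align_exact`), the toy genome, and the error-free reads are all assumptions made for illustration; real reads contain sequencing errors and real aligners tolerate mismatches.

```python
import random

def make_reads(genome, read_len, n_reads, seed=0):
    """Cut n_reads substrings of length read_len at random start positions."""
    rng = random.Random(seed)  # fixed seed so the toy example is repeatable
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        reads.append((start, genome[start:start + read_len]))
    return reads

def align_exact(reference, read):
    """Return every position where the read matches the reference exactly."""
    positions = []
    pos = reference.find(read)
    while pos != -1:
        positions.append(pos)
        pos = reference.find(read, pos + 1)
    return positions

# Hypothetical toy genome; a human genome has roughly 3 billion bases.
genome = "ACGTTGCAAGCTTAGGCTAACGTAGCTAGGATCC"
reads = make_reads(genome, read_len=8, n_reads=5)
for true_start, read in reads:
    print(read, "cut at", true_start, "aligns at", align_exact(genome, read))
```

Each read aligns back to at least the position it was cut from. Short reads can also match elsewhere, which is one reason read alignment is expensive at genome scale.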
○ 2001: $3 billion - first human genome sequencing
○ Data from 1 million genomes amounts to over 300 petabytes (roughly 300 GB per genome)
○ Broad Institute’s standard software for read alignment
Automata (LA)
○ Improve scaling
substitutions
[Figure: read alignment against the reference genome in two steps: (1) seeding, (2) seed extension]
○ “k-mers”: string matches of length k
○ Super Maximal Exact Matches (SMEMs): maximum-length match extending from a k-mer
match is found.
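
A simplified Python sketch of k-mer seeding as described above. It uses a plain hash index and extends each hit to the right only, which is an assumption made to keep the example short: true SMEMs are extended in both directions and filtered so that no match is contained in a longer one, and neither BWA-MEM nor the GenAx hardware is implemented this way. The names `build_kmer_index` and `seed_and_extend` are hypothetical.

```python
from collections import defaultdict

def build_kmer_index(reference, k):
    """Map every length-k substring of the reference to its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def seed_and_extend(reference, read, index, k, read_pos=0):
    """Look up the k-mer at read_pos, then grow each hit while bases agree.

    Returns a list of (reference_position, match_length) pairs.
    """
    kmer = read[read_pos:read_pos + k]
    seeds = []
    for ref_pos in index.get(kmer, []):
        length = k
        # Extend to the right until a mismatch or the end of either string.
        while (read_pos + length < len(read)
               and ref_pos + length < len(reference)
               and read[read_pos + length] == reference[ref_pos + length]):
            length += 1
        seeds.append((ref_pos, length))
    return seeds

reference = "ACGTACGGACGTTC"   # toy reference, not real genomic data
read = "ACGTTCAA"
k = 4
index = build_kmer_index(reference, k)
seeds = seed_and_extend(reference, read, index, k)
longest = max(seeds, key=lambda s: s[1])
print(seeds, longest)
```

The k-mer `ACGT` occurs twice in the toy reference, but only one occurrence extends beyond k bases, so the longer match is the more promising seed to pass to seed extension.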
K = 4
○ Content-addressable memories (CAMs) report very quickly whether given data is stored in the CAM block
○ Small 512-entry CAM table
○ When k = 12 (average case), there are usually fewer than 500 matches
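
A small software model of what a CAM does, for readers who have not met one: you present a value and get back the addresses that hold it. The class `SimpleCAM` and its methods are hypothetical, and how GenAx wires the CAM into seeding is not spelled out in the slide text, so the usage below is only an assumed illustration. In hardware all entries are compared in parallel; the loop here is sequential.

```python
class SimpleCAM:
    """Software model of a content-addressable memory with fixed capacity."""

    def __init__(self, capacity=512):
        self.capacity = capacity
        self.entries = [None] * capacity

    def load(self, values):
        """Fill the table from the start; reject more values than entries."""
        values = list(values)
        if len(values) > self.capacity:
            raise ValueError("too many values for this CAM")
        self.entries = values + [None] * (self.capacity - len(values))

    def search(self, key):
        """Return the addresses of all entries equal to key."""
        return [addr for addr, value in enumerate(self.entries) if value == key]

# Assumed usage: store candidate reference positions, then test membership.
cam = SimpleCAM(capacity=512)
cam.load([10, 42, 42, 99])
print(cam.search(42))  # addresses holding 42
print(cam.search(7))   # no entry holds 7
```

With fewer than 500 matches in the average case, a candidate list like this fits in a 512-entry table without overflow.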
with neighbor
○ Intel Xeon Processor running BWA-MEM (128 GB DDR4)
○ Nvidia TITAN Xp running CUSHAW2

○ 800 million reads at 101 base pairs per read
○ 31.7x speedup
○ 12x less power
○ ~10 hrs vs. ~300 hrs

○ 72.4x speedup
○ Computes edit distance between two strings
○ String independent and local communication

○ Accelerator for Silla supporting traceback

○ SillaX + Seeding Accelerator
○ Drop-in replacement for BWA-MEM software
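
For reference, the edit distance mentioned above can be computed in software with the classic dynamic-programming recurrence (Wagner-Fischer). This is a sketch of the quantity being computed, not of the automaton-based hardware approach the slides describe; the function name `edit_distance` is an assumption.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    # prev[j] holds the distance between the first i-1 chars of a and first j of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]

print(edit_distance("GATTACA", "GATACA"))
```

Every cell depends on its left, upper, and upper-left neighbors, so the software version takes time proportional to the product of the two string lengths. That cost is what motivates a hardware accelerator for seed extension.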
large K-edit distances, many “k-mer” seeds). Is it worth using GenAx even if it is not flexible enough to handle these edge cases?
good solution for scaling?