HG-CoLoR: enHanced de bruijn Graph for the error COrrection of LOng Reads

Pierre Morisse, Thierry Lecroq and Arnaud Lefebvre, pierre.morisse2@univ-rouen.fr, Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes


  1. [Sections: Introduction, Main idea, Enhanced de Bruijn graph, Workflow, Experimental results, Conclusion] NaS overview. NaS corrects a long read as follows. Third step: assemble the obtained subset of short reads. [P. Morisse, T. Lecroq, A. Lefebvre, HG-CoLoR, 9/33]

  2. NaS overview. NaS corrects a long read as follows. Fourth step: use the obtained contig as the correction of the initial long read.

  3. Main idea. Use long reads as templates. Get rid of the time-consuming step of aligning the short reads against each other. Focus on a seed-and-extend approach. Rely on an enhanced de Bruijn graph, built from the short reads.

  7. Outline: 1. Introduction; 2. Main idea; 3. Enhanced de Bruijn graph; 4. Workflow; 5. Experimental results; 6. Conclusion.

  8. Enhanced de Bruijn graph. Problem: de Bruijn graphs are widely used for correction and assembly, but face difficulties with locally insufficient coverage. Usual solution: multiple de Bruijn graphs of different orders are built, which requires a different graph for each order and consumes large amounts of time and memory.

  10. Enhanced de Bruijn graph. Idea: enhance the de Bruijn graph with the capability of computing overlaps of variable lengths between the k-mers, in an overlap graph fashion, in order to avoid building multiple de Bruijn graphs of different orders.

  11. Enhanced de Bruijn graph. Example: with the set of reads S = { AGCTTACA, CTTACGTA, GTATACTG }, we obtain the following enhanced de Bruijn graph of order 6. [Figure: graph over the nine 6-mers AGCTTA, GCTTAC, CTTACA, CTTACG, TTACGT, TACGTA, GTATAC, TATACT and ATACTG, with edges labeled by overlap lengths 3, 4 and 5.]
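
The example above can be reproduced with a brute-force sketch. It is not how HG-CoLoR builds the graph (the slides say the graph is never built explicitly); it only illustrates the definition. The names `kmers_of` and `enhanced_edges` and the parameter `min_overlap` are hypothetical, and the minimum overlap of 3 is an assumption read off the smallest edge label on the slide.

```python
from itertools import permutations

def kmers_of(reads, k):
    """Return the sorted set of distinct k-mers found in the reads."""
    return sorted({r[i:i + k] for r in reads for i in range(len(r) - k + 1)})

def enhanced_edges(kmers, min_overlap):
    """Map each ordered pair (u, v) to its longest suffix/prefix overlap.

    Only overlaps of length min_overlap..k-1 are considered
    (min_overlap is an assumed parameter, not taken from the slides).
    """
    k = len(kmers[0])
    edges = {}
    for u, v in permutations(kmers, 2):
        for ell in range(k - 1, min_overlap - 1, -1):
            if u[k - ell:] == v[:ell]:
                edges[(u, v)] = ell
                break
    return edges

reads = ["AGCTTACA", "CTTACGTA", "GTATACTG"]
kmers = kmers_of(reads, 6)
edges = enhanced_edges(kmers, min_overlap=3)
for (u, v), ell in sorted(edges.items()):
    print(f"{u} -> {v} (overlap {ell})")
```

Under these assumptions the sketch yields 16 edges, which agrees with the number of edge labels on the slide. Only the 7 edges with overlap 5 would exist in a classical de Bruijn graph of order 6; the others come from the shorter overlaps.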

  14. Traversal. The enhanced de Bruijn graph does not need to be explicitly built. It can be traversed with the help of PgSA [Kowalski et al., 2015]: the k-mers from the reads are stored in the index, and the index is queried in order to retrieve the edges. This makes backwards traversal easy.

  19. Traversal. Example: traversing the previous enhanced de Bruijn graph. [Figure: the enhanced de Bruijn graph of order 6 from the previous example, with edges labeled 3, 4 and 5.]

  20. Traversal. Example: the PgSA index stores the k-mers set (1: AGCTTA, 2: ATACTG, 3: CTTACA, 4: CTTACG, 5: GCTTAC, 6: GTATAC, 7: TACGTA, 8: TATACT, 9: TTACGT). Starting from AGCTTA, three successive "Occurrences positions?" queries return { (1,1) ; (5,0) }, then { (1,2) ; (3,0) ; (4,0) ; (5,1) }, then { (1,3) ; (3,1) ; (4,1) ; (5,2) ; (9,0) }. [Figure: AGCTTA linked to GCTTAC (label 5), to CTTACA and CTTACG (label 4), and to TTACGT (label 3).]
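
The three result sets above are consistent with querying the suffixes of AGCTTA of lengths 5, 4 and 3 (GCTTA, CTTA, TTA) and reading each pair as (k-mer id, offset). The sketch below emulates such a query with a naive scan. It does not use the PgSA API, and the function names `occurrences` and `successors` are hypothetical. It assumes ids are 1-based and offsets 0-based, as the slide values suggest.

```python
KMERS = ["AGCTTA", "ATACTG", "CTTACA", "CTTACG", "GCTTAC",
         "GTATAC", "TACGTA", "TATACT", "TTACGT"]

def occurrences(kmers, pattern):
    """Naive stand-in for an index query.

    Returns every (k-mer id, offset) pair where pattern occurs,
    with 1-based ids and 0-based offsets.
    """
    hits = []
    for idx, kmer in enumerate(kmers, start=1):
        pos = kmer.find(pattern)
        while pos != -1:
            hits.append((idx, pos))
            pos = kmer.find(pattern, pos + 1)
    return hits

def successors(kmers, node, min_overlap=3):
    """Derive outgoing edges of node from occurrence queries.

    A hit at offset 0 means the queried suffix of node is a prefix
    of that k-mer, so it is a successor with overlap len(suffix).
    """
    k = len(node)
    out = []
    for ell in range(k - 1, min_overlap - 1, -1):
        for idx, pos in occurrences(kmers, node[k - ell:]):
            if pos == 0 and kmers[idx - 1] != node:
                out.append((kmers[idx - 1], ell))
    return out

print(occurrences(KMERS, "GCTTA"))
print(successors(KMERS, "AGCTTA"))
```

Under the same reading, predecessors could be obtained by querying prefixes of the node and keeping hits whose offset places the match at the end of a k-mer, which may be what makes backwards traversal easy.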

  40. Workflow, 5 steps: (1) Correct the short reads (with QuorUM [Marçais et al., 2015]). (2) Filter out corrected short reads containing weak k-mers, and index solid k-mers with PgSA. (3) Align the remaining short reads to the long reads, to find seeds (with BLASR [Chaisson and Tesler, 2012]). (4) Merge the overlapping seeds, and link them together, by traversing the graph. (5) Extend the obtained corrected long read, on the left (resp. right) of the leftmost (resp. rightmost) seed.
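
Step 2 of the workflow can be illustrated with a small sketch. The slides do not define "weak" and "solid"; the code assumes the common convention that a k-mer is solid when it occurs at least `solid_threshold` times in the short reads and weak otherwise. The names `kmer_counts`, `filter_reads` and `solid_threshold` are hypothetical, and the real tool relies on QuorUM and PgSA rather than on this counting code.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every k-mer occurrence across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def filter_reads(reads, k, solid_threshold):
    """Keep reads made only of solid k-mers; also return the solid k-mer set.

    Assumption: solid means count >= solid_threshold.
    """
    counts = kmer_counts(reads, k)
    solid = {kmer for kmer, c in counts.items() if c >= solid_threshold}
    kept = [
        read for read in reads
        if all(read[i:i + k] in solid for i in range(len(read) - k + 1))
    ]
    return kept, solid

kept, solid = filter_reads(["ACGTAC", "ACGTAC", "ACGTAG"], k=5, solid_threshold=2)
print(kept, sorted(solid))
```

In this toy input the read ACGTAG is dropped because its 5-mer CGTAG occurs only once.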

  45. Step 4: Seeds merging and linking. Seeds with overlapping mapping positions are merged: if the overlap is perfect, the seeds are merged; otherwise, the best seed is kept. Seeds are used as anchor points on the graph. The graph is traversed to link the seeds together and assemble the k-mers.
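
The merging rule above can be sketched as follows. Several points are assumptions made for illustration: a seed is modeled as a hypothetical `Seed` record whose sequence length equals its mapped span (indels are ignored), "perfect overlap" is taken to mean that both seeds spell the same bases over the shared positions, and "best" is decided by a hypothetical alignment `score`.

```python
from dataclasses import dataclass

@dataclass
class Seed:
    start: int    # mapping start on the long read (0-based, inclusive)
    end: int      # mapping end on the long read (exclusive)
    seq: str      # seed sequence, assumed to have length end - start
    score: float  # hypothetical alignment score used to pick the best seed

def merge_seeds(seeds):
    """Merge seeds with overlapping mapping positions, left to right."""
    merged = []
    for seed in sorted(seeds, key=lambda s: s.start):
        if not merged or seed.start >= merged[-1].end:
            merged.append(seed)  # no overlap with the previous seed
            continue
        prev = merged[-1]
        shared = min(prev.end, seed.end) - seed.start
        offset = seed.start - prev.start
        if prev.seq[offset:offset + shared] == seed.seq[:shared]:
            # Perfect overlap: concatenate the non-shared tail of the new seed.
            merged[-1] = Seed(prev.start, max(prev.end, seed.end),
                              prev.seq + seed.seq[shared:],
                              max(prev.score, seed.score))
        elif seed.score > prev.score:
            merged[-1] = seed    # imperfect overlap: keep the better seed
    return merged

seeds = [Seed(3, 9, "TTACGT", 8), Seed(0, 6, "AGCTTA", 10),
         Seed(7, 12, "CCAAA", 5), Seed(20, 25, "GGGGG", 1)]
print(merge_seeds(seeds))
```

Here the first two seeds agree on TTA and are merged into AGCTTACGT, the third disagrees with the merged seed and is discarded because its score is lower, and the last one does not overlap anything.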

  50. Step 4: Seeds linking. [Figure: a long read carrying seed 1, seed 2 and seed 3. Seed 1 is the source (src) and seed 2 the destination (dst) of a graph traversal along edges labeled k − 1, k − 2 and k − 3; the linked seeds then become the source and seed 3 the destination, and the final result is the corrected long read.]
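
A minimal sketch of linking a source to a destination follows, reusing the nine 6-mers of the earlier example. The search strategy is an assumption, not taken from the slides: a depth-first search that tries the largest overlaps first, never revisits a k-mer on the current path, and gives up beyond `max_len` bases. The names `successors`, `link`, `min_overlap` and `max_len` are hypothetical.

```python
KMERS = ["AGCTTA", "ATACTG", "CTTACA", "CTTACG", "GCTTAC",
         "GTATAC", "TACGTA", "TATACT", "TTACGT"]

def successors(kmers, node, min_overlap):
    """Yield (next k-mer, overlap) pairs, largest overlaps first."""
    k = len(node)
    for ell in range(k - 1, min_overlap - 1, -1):
        suffix = node[k - ell:]
        for cand in sorted(kmers):
            if cand != node and cand.startswith(suffix):
                yield cand, ell

def link(kmers, src, dst, min_overlap=3, max_len=100):
    """Return the sequence spelled by a path from src to dst, or None."""
    def extend(node, seq, seen):
        if node == dst:
            return seq
        if len(seq) >= max_len:
            return None
        for nxt, ell in successors(kmers, node, min_overlap):
            if nxt in seen:
                continue
            found = extend(nxt, seq + nxt[ell:], seen | {nxt})
            if found is not None:
                return found
        return None
    return extend(src, src, {src})

print(link(KMERS, "AGCTTA", "ATACTG", min_overlap=3))
print(link(KMERS, "AGCTTA", "ATACTG", min_overlap=5))
```

With overlaps down to 3 the two k-mers are linked through the edge TACGTA to GTATAC. With `min_overlap=5`, which corresponds to a classical de Bruijn graph of order 6, no path exists and the function returns `None`.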

  72. Step 5: Tips extension. Seeds don't always map right at the beginning or until the end of the long read. Once all the seeds have been linked, HG-CoLoR keeps on traversing the graph. The traversal stops when the borders of the long read or a branching path are reached.
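
The stopping rule above can be sketched for the right-hand side (the left-hand side would be symmetric, using predecessors). The sketch assumes that "branching path" means more than one successor at the longest available overlap, and that the border of the long read is approximated by a length limit `max_len`. These choices and the names `unique_successor` and `extend_right` are hypothetical.

```python
KMERS = ["AGCTTA", "ATACTG", "CTTACA", "CTTACG", "GCTTAC",
         "GTATAC", "TACGTA", "TATACT", "TTACGT"]

def unique_successor(kmers, tail, min_overlap):
    """Return (next k-mer, overlap) if the longest available overlap
    leads to exactly one successor; None on a dead end or a branch."""
    k = len(tail)
    for ell in range(k - 1, min_overlap - 1, -1):
        cands = [c for c in kmers if c != tail and c.startswith(tail[k - ell:])]
        if cands:
            return (cands[0], ell) if len(cands) == 1 else None
    return None

def extend_right(kmers, seq, k, max_len, min_overlap=3):
    """Extend seq to the right until max_len, a dead end or a branch."""
    while len(seq) < max_len:
        step = unique_successor(kmers, seq[-k:], min_overlap)
        if step is None:
            break
        nxt, ell = step
        seq += nxt[ell:]
    return seq[:max_len]  # trim if the last step overshot the border

print(extend_right(KMERS, "CTTACG", k=6, max_len=8))
print(extend_right(KMERS, "AGCTTA", k=6, max_len=20))
```

The first call stops at the length limit; the second stops after one step because GCTTAC has two successors with overlap 5 (CTTACA and CTTACG).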

  75. Remark. Some seeds might be impossible to link together, which produces a corrected long read fragmented in multiple parts.
