
Principles and Applications of Modern DNA Sequencing - PowerPoint PPT Presentation



  1. Principles and Applications of Modern DNA Sequencing. EEEB GU4055. Session 11: Genome Assembly.

  2. Today's topics. 1. de Bruijn graphs and Euler. 2. Kmers. 3. Challenges in genome assembly. 4. Empirical example.

  3. Kmers and de Bruijn graphs. Reads start and end at different positions, covering all or nearly all of the genome. Decomposing reads into smaller kmers makes it more likely that we have uniformly sized bits covering the entire genome. This is useful for building a graph.
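To make the decomposition step concrete, here is a minimal Python sketch. The function name `get_kmers` and the two example reads are hypothetical, chosen only for illustration; they are not taken from the course notebooks.

```python
def get_kmers(read, k):
    """Return every overlapping substring of length k (kmer) in a read."""
    # A read of length L yields L - k + 1 kmers (none if the read is shorter than k).
    return [read[i:i + k] for i in range(len(read) - k + 1)]


# Two hypothetical reads that start at different positions in the same region.
reads = ["ATGGCGTGCA", "GCGTGCAATT"]

# Pool the kmers from all reads; a set keeps each distinct kmer once.
kmers = set()
for read in reads:
    kmers.update(get_kmers(read, 5))
```

The two reads overlap, so some 5mers are shared between them. Pooling them in a set yields uniformly sized pieces that no longer depend on where each read started.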

  4. Kmers and de Bruijn graphs. Goal: find the shortest possible superstring that contains all substrings of length k (all kmers).

  5. Kmers and de Bruijn graphs. A Hamiltonian graph approach requires comparing/aligning kmers to one another, which is hard when the number and size of kmers is large. de Bruijn graphs instead join identical (k-1)mers, such that kmers form the edges of the graph, which is a much simpler computation.
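The graph construction described above can be sketched in a few lines of Python. This is a minimal illustration under strong assumptions: error-free kmers, a single strand, and a graph that actually contains an Eulerian path. The function names and the toy sequence are hypothetical, and the edge walk uses Hierholzer's algorithm, one common way to find an Eulerian path.

```python
from collections import defaultdict


def build_debruijn(kmers):
    """Map each (k-1)mer prefix to the (k-1)mer suffixes it connects to."""
    graph = defaultdict(list)
    for kmer in kmers:
        # Each kmer is one edge: prefix (k-1)mer -> suffix (k-1)mer.
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Walk every edge exactly once (Hierholzer's algorithm)."""
    # Copy the edge lists so the input graph is not consumed.
    edges = {node: list(targets) for node, targets in graph.items()}
    # balance = in-degree minus out-degree; the start node has one extra outgoing edge.
    balance = defaultdict(int)
    for node, targets in edges.items():
        balance[node] -= len(targets)
        for target in targets:
            balance[target] += 1
    starts = [node for node, bal in balance.items() if bal < 0]
    start = starts[0] if starts else next(iter(edges))
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if edges.get(node):
            # Follow an unused edge out of the current node.
            stack.append(edges[node].pop())
        else:
            # No unused edges remain here: the node is finished.
            path.append(stack.pop())
    return path[::-1]


def path_to_sequence(path):
    """Spell the sequence: the first node plus the last base of each later node."""
    return path[0] + "".join(node[-1] for node in path[1:])


sequence = "ATGGCGTGCA"  # hypothetical error-free toy genome
k = 4
kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
graph = build_debruijn(kmers)
assembled = path_to_sequence(eulerian_path(graph))
```

No kmer is ever aligned against another here; identical (k-1)mers are joined simply by being the same dictionary key. In this toy case every (k-1)mer is unique, so the path is unambiguous. Repeated (k-1)mers create branches, and then more than one Eulerian path may exist.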

  6. When poll is active, respond at PollEv.com/dereneaton004. [3] Action: Write a function to get all 5 mers from the ...

  7. [6,7,8] Use functions to accomplish the designated tasks... Compare your functions and results with at least two of your ...

  8. Genome Assembly

  9. De novo Genome Assembly. De novo genome assembly is computationally demanding. It requires reads that cover the full genome many times (e.g., 50X). The end goal is to assemble scaffolds that match chromosomes, the real *bits* of the genome.
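As a rough worked example of what "50X" implies, mean depth is commonly estimated as (number of reads × read length) / genome size. The sketch below applies that formula. The 150 bp read length is an assumption for illustration, the 500 Mb genome size is borrowed from the Eucalyptus example in this deck, and the function names are hypothetical.

```python
import math


def mean_depth(n_reads, read_length, genome_size):
    """Expected mean coverage: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def reads_needed(genome_size, read_length, coverage):
    """Smallest number of reads whose total bases reach the target coverage."""
    return math.ceil(coverage * genome_size / read_length)


# Assumed values: 500 Mb genome, 150 bp reads, 50X target depth.
n_reads = reads_needed(500_000_000, 150, 50)
```

Under these assumptions roughly 167 million reads are needed. This is an average only; actual depth varies along the genome.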

  10. Combining short and long-read technologies. Short-read assemblies are highly fragmented. Long-read technologies are highly error prone. Combining the two technologies, while obtaining high coverage of both, is currently the gold standard.

  11. Caveats: Long reads require HMW (high-molecular-weight) DNA, sometimes a lot. Specialized DNA extraction kits and protocols are used to isolate long (unbroken) DNA fragments. This is more expensive and time-consuming, but worth it.

  12. Eucalyptus: (500Mb size, 170X ONT; 200X Illumina)

  13. Scaffolding: Hi-C Proximity Ligation. Chromosome conformation capture (3C) describes the organization and structure of the genome within a cell. Better than microscopy, it can tell us how close together (and potentially interacting) some regions of the genome are, such as promoters and enhancers. Hi-C, a high-throughput version of 3C, is based on a library preparation that builds chimeric reads, followed by short-read paired-end sequencing. It creates a contact map of interactions correlated with spatial distance.
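To illustrate what a contact map is, the sketch below bins the two mapped positions of each read pair and counts how often each pair of bins co-occurs. It is a toy example under simplifying assumptions: positions are 0-based coordinates on a single sequence, the read pairs are invented, and the function name `contact_map` is hypothetical. Real Hi-C pipelines add mapping, filtering, and normalization steps not shown here.

```python
def contact_map(pairs, genome_length, bin_size):
    """Count read pairs linking each pair of genomic bins (symmetric matrix)."""
    n_bins = -(-genome_length // bin_size)  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos1, pos2 in pairs:
        i, j = pos1 // bin_size, pos2 // bin_size
        matrix[i][j] += 1
        if i != j:
            # Contacts have no direction, so mirror off-diagonal counts.
            matrix[j][i] += 1
    return matrix


# Hypothetical mapped positions of the two ends of each chimeric read pair.
pairs = [(5, 12), (7, 95), (40, 48), (41, 44), (90, 99)]
matrix = contact_map(pairs, genome_length=100, bin_size=25)
```

Most counts fall on or near the diagonal, because regions that are close along the sequence are usually also close in space. Scaffolding uses that signal to order and orient contigs.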

  14. Scaffolding: Hi-C Proximity Ligation. Restriction digestion; streptavidin bead extraction; paired-end sequencing.

  15. Scaffolding: Amaranthus Hi-C Assembly
