Genome Sequencing Introduction and History Sample Preparation - PowerPoint PPT Presentation

Genome Sequencing

Introduction and History

The sequencing pipeline, stage by stage:

1. Sample Preparation
2. Fragments
3. Sequencing: Next Generation Sequencing (NGS)
4. Reads, for example: ACGTAGAATCGACCATG, ACGTAGAATACGTAGAA, GGGACGTAGAATACGAC
5. Assembly
6. Contigs, for example: ACGTAGAATACGTAGAAACAGATTAGAGAG…
7. Analysis
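The step from reads to contigs can be illustrated with a toy greedy overlap merge. This is only a sketch under simplifying assumptions (error-free reads, one strand, no repeats), not how any particular assembler works; the function names `overlap` and `greedy_assemble` and the three short reads are hypothetical and chosen for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that is a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(reads)
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            # No remaining pair overlaps by at least min_len bases.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


# Hypothetical error-free reads that tile a short sequence.
toy_reads = ["ACGTAGAAT", "GAATCGACC", "GACCATGGA"]
print(greedy_assemble(toy_reads))
```

Real reads contain errors and genomes contain repeats, which is why practical assemblers need the high coverage and paired-end data discussed in the following slides.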

Reference Genome

De novo vs. Re-sequencing
• De novo assembly ("from the beginning") implies that you have no prior knowledge of the genome.
• Re-sequencing assembly assumes you have a copy of the reference genome (that has been verified to a certain degree).
• The programs that work for re-sequencing will not work for de novo assembly.
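To make the difference concrete, re-sequencing can place each read directly on the reference, which a de novo assembler cannot do. The sketch below is a naive brute-force mapper, assuming a tiny reference and allowing a fixed number of mismatches; `map_read`, `hamming`, and the toy reference string are hypothetical, and real re-sequencing tools use indexed search rather than this scan.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read: str, reference: str, max_mismatches: int = 1):
    """Return (position, mismatches) of the best placement, or None."""
    best = None
    for pos in range(len(reference) - len(read) + 1):
        d = hamming(read, reference[pos:pos + len(read)])
        if d <= max_mismatches and (best is None or d < best[1]):
            best = (pos, d)
    return best


# Toy reference (hypothetical), far shorter than any real genome.
reference = "ACGTAGAATCGACCATG"
print(map_read("GAATCGAC", reference))  # exact match
print(map_read("GAATCGTC", reference))  # one substitution
print(map_read("TTTTTTTT", reference))  # does not map
```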


Re-sequencing (LOCAS, SHRiMP) requires 15x to 30x coverage. With less coverage, re-sequencing programs either produce no results or produce questionable ones.

De novo assembly requires higher coverage: at least 30x, and up to 100x. Most de novo assemblers require paired-end data.
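Coverage figures such as 30x are usually read as average depth: the total number of sequenced bases divided by the genome size. The following sketch uses that standard definition; the function names and the 5 Mbp genome with 100 bp reads are assumptions chosen for illustration, not values from the lecture.

```python
import math


def coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth of coverage: total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


# Hypothetical example: a 5 Mbp bacterial genome sequenced with 100 bp reads.
genome_size = 5_000_000
read_length = 100
print(reads_needed(30, read_length, genome_size))   # reads for 30x
print(reads_needed(100, read_length, genome_size))  # reads for 100x
print(coverage(1_500_000, read_length, genome_size))
```

Average coverage says nothing about how evenly the reads are distributed, so some regions will sit well below the average even when the target is met.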

Our focus for today's lecture:
1. Comparison of sequencing platforms
2. Details of sample preparation
3. Definitions and terminologies concerning data and sequencing platforms

History and Background

Landmarks in Sequencing

Year | Event | Efficiency (bp/person/year)
1870 | Miescher: discovers DNA | -
1940 | Avery: proposes DNA as "genetic material" | -
1953 | Watson & Crick: double helix structure of DNA | -
1965 | Holley: transfer RNA from yeast | 1
1977 | Maxam & Gilbert: "DNA sequencing by chemical degradation"; Sanger: "DNA sequencing with chain-terminating inhibitors" | 1,500
1981 | Messing and his colleagues develop the "shotgun sequencing" method | 15,000
1987 | ABI markets the first sequencing platform, the ABI 370 | 25,000
1990 | NIH begins large-scale sequencing of bacterial genomes | 50,000
1995 | Craig Venter and Hamilton Smith at The Institute for Genomic Research (TIGR) publish the first complete genome of a free-living organism in Science. This marks the first use of whole-genome shotgun sequencing, eliminating the need for initial mapping efforts | 200,000
2001 | A draft of the human genome is published in Science | -
2001 | A draft of the human genome is published in Nature | -
2002 | 454 Life Sciences comes out with a pyrosequencing machine | 50,000,000
2008 | Next generation sequencing machines arrive | 100,000,000
2011 | Oxford Nanopore: 600 million base pairs per hour | Huge

Photos:
• Robert Holley and team in 1965
• Watson and Crick
• Messing: world's most-cited scientist
• Francis Collins: Human Genome Project

Next-Gen Sequencing Platforms
• 454/Roche GS-20/FLX (2005)
• Illumina HiSeq (2007)
• PacBio RS (2009-2010): 3rd generation?

Comparison of Platforms

Technology | Reads per run | Average read length | bp per run | Types of errors
454 (Roche) | 400,000 | 250-1000 bp | 70 million | Substitution
SOLiD (ABI) | 88-132 million | 35 bp | 1 billion | -
Illumina HiSeq | 150 million | 100-200 bp | 15 billion | Substitution, with exponential increase
PacBio | 45,000 | 1000-2000 bp | 45 million | Insertions and deletions

Sequencing Methods and Terminology

Sanger Sequencing
• The key principle of the Sanger method is the use of dideoxynucleotide triphosphates (ddNTPs) as DNA chain terminators.
• These ddNTPs are also radioactively labeled for detection in automated sequencing machines.
• Positives: longer reads (600 to 1000 bp).
• Negatives: poor coverage (6x), expensive, inaccurate.
• Still commonly used for small-scale sequencing.

Sanger Sequencing Video

Sanger Sequencing
1. Start with the DNA target sample.
2. Shear it into fragments.
3. Clone each fragment many times.
[Figure: sheared double-stranded fragments shown as paired bases (T-A, A-T, C-G, G-C)]

Sanger Sequencing
[Figure, animation frames: a primer and DNA polymerase on a template strand, with bases (A, C, G, T) added to the growing strand frame by frame]

Sanger Sequencing
Continue until all strands of DNA have undergone this reaction. If you choose the reagents correctly, you should have all possible A-terminated strands, resulting in sequences of varying lengths.
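The set of "all possible A-terminated strands" can be enumerated directly: each one is a prefix of the newly synthesized strand that ends at an A. The sketch below assumes an idealized reaction in which every possible termination occurs at least once; `terminated_fragments` and the example strand are hypothetical.

```python
def terminated_fragments(strand: str, base: str):
    """Return every prefix of `strand` that ends in `base`.

    Models an idealized chain-termination reaction in which a ddNTP of
    type `base` stops synthesis at each position where that base occurs.
    """
    return [strand[:i + 1] for i, b in enumerate(strand) if b == base]


# Hypothetical newly synthesized strand (complementary to the template).
strand = "ACGTAGA"
for fragment in terminated_fragments(strand, "A"):
    print(len(fragment), fragment)
```

Running the same function with C, G, and T gives the other three reactions; together the four sets contain exactly one fragment of every length from 1 to the length of the strand.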

Sanger Sequencing
In the radioactive gel, the fragments separate by length: shorter DNA fragments migrate farther from the loading wells than longer ones.
Afterward the sequence can be read off one base at a time, going from the shortest fragment to the longest.
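Reading the gel amounts to sorting the bands by fragment length and noting which lane (A, C, G, or T) each band is in. The sketch below assumes an idealized gel with one clean band per length; `lanes_from_strand` and `read_gel` are hypothetical helper names, and the strand is a toy example.

```python
def lanes_from_strand(strand: str):
    """Simulate four termination reactions: lane -> list of fragment lengths."""
    lanes = {base: [] for base in "ACGT"}
    for i, base in enumerate(strand):
        lanes[base].append(i + 1)  # fragment ending at position i has length i + 1
    return lanes


def read_gel(lanes):
    """Read the sequence by visiting bands from shortest to longest fragment."""
    bands = sorted((length, base)
                   for base, lengths in lanes.items()
                   for length in lengths)
    return "".join(base for _, base in bands)


lanes = lanes_from_strand("ACGTAGA")
print(lanes)
print(read_gel(lanes))
```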
