Applications Anna De Grassi - - European Institute of Oncology - - PowerPoint PPT Presentation

Next Generation Sequencing: Applications
Anna De Grassi - European Institute of Oncology, Milan - F. Ciccarelli group
BITS - March 20, 2009 - Genoa

Several Flavours of Throughput…
• Genome sequencing
• Transcriptome Analysis
• Metagenomics
• Amplicon sequencing
• Ultra-deep sequencing
• ChIP-seq
• Structural Variations
• Nucleosome positioning
• SNPs and Point Mutations

Metagenomics
"Metagenomics is the application of modern genomics techniques to the study of communities of microbial organisms directly in their natural environments, bypassing the need for isolation and lab cultivation of individual species."
Kevin Chen and Lior Pachter (University of California, Berkeley)
>99% of all microbes cannot be cultured
Soil - Sea - Air - ancient DNA - body parts

Metagenomics (454)
Samples:
• Obese (ob/ob): ob1, ob2
• Lean (+/+, ob/+): lean1, lean2, lean3
• Selection of microbial cells
• DNA extraction
454 sequencing (figure labels: 3 runs, 2 runs):
• nebulization, ligation, fixation to beads and emulsion PCR
• GS20 pyrosequencer
Shotgun sequencing:
• cloning in plasmid library
• 3730xl capillary sequencer
Turnbaugh PJ, Nature 444, 1027-1031 (2006)

Metagenomics (454)
Draft genome of the most common bacterium (E. rectale):
• overlap generation
• contig layout
• consensus generation
Metagenomics analyses:
• BLASTX (E-value < 10^-5)
• EGTs = environmental gene tags
Turnbaugh PJ, Nature 444, 1027-1031 (2006)
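The E-value cutoff on the slide above can be illustrated with a short filter. This is a minimal sketch, assuming the hits are available in the standard 12-column tab-separated BLAST output, where the E-value is the 11th column; the function name and the two example hits are hypothetical and are not taken from the paper.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # threshold quoted on the slide


def filter_blast_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return (query, subject, evalue) for hits with an E-value below cutoff.

    Assumes 12-column tabular BLAST output:
    query, subject, %identity, length, mismatches, gap opens,
    q.start, q.end, s.start, s.end, evalue, bitscore.
    """
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        evalue = float(row[10])
        if evalue < cutoff:
            hits.append((row[0], row[1], evalue))
    return hits


# Two made-up hits: one significant, one not.
example = (
    "read1\tprotA\t90.0\t30\t3\t0\t1\t90\t1\t30\t1e-12\t60.0\n"
    "read2\tprotB\t40.0\t25\t15\t0\t1\t75\t1\t25\t0.5\t20.0\n"
)
kept = filter_blast_hits(example)
print(kept)
```

Only the first hit passes, since 0.5 is far above the 10^-5 cutoff.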

Metagenomics: 454 vs capillary
Capillary pros:
• more confident gene calling
454 pros:
• less time consuming
• higher sequence coverage
• not affected by cloning bias
Only 454 for metagenomics applications
Turnbaugh PJ, Nature 444, 1027-1031 (2006)

Ultra-deep sequencing
Re-sequencing a region several times to detect rare variants
[Figure: Sanger sequencing yields only the consensus sequence (ATCGT); NGS reads of the same region show many ATCGT reads plus a few variant reads.]
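The difference between a consensus call and per-read variant detection can be sketched as follows. This is an illustration only: it assumes equal-length reads that are already aligned to the same window, and the variant read ATAGT is a made-up example rather than one recovered from the figure.

```python
from collections import Counter


def allele_frequencies(reads):
    """Per-position base frequencies for equal-length, pre-aligned reads."""
    length = len(reads[0])
    freqs = []
    for i in range(length):
        counts = Counter(read[i] for read in reads)
        total = sum(counts.values())
        freqs.append({base: n / total for base, n in counts.items()})
    return freqs


def consensus(reads):
    """Most frequent base at each position (what a consensus call reports)."""
    return "".join(max(col, key=col.get) for col in allele_frequencies(reads))


# Nine reference-like reads and one hypothetical variant read.
reads = ["ATCGT"] * 9 + ["ATAGT"]
print(consensus(reads))               # the variant is invisible here
print(allele_frequencies(reads)[2])   # but shows up as a minor allele
```

The consensus hides the minority base at the third position, while the per-position frequencies keep it.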

Ultra-deep sequencing (454)
Detection of rare sub-clonal mutations in cancer cells
Samples:
• Blood of 24 patients affected by CLL (chronic lymphocytic leukemia)
• Renal cell of 1 patient
Procedure:
• PCR amplification
• equimolar pool of amplicons
• One 454 run
[Figure: ~300 bp amplicon with forward (F) and reverse (R) primers, e.g. ACT]
385,000 reads, ~250 bp per read (>95% aligned to the reference)
Campbell PJ, PNAS 105, 13081-13086 (2008)

Ultra-deep sequencing (454)
Error processing: analysis of the control locus, where all the variations from the reference sequence are artifacts
Sequencing errors:
• polyN > 4
• many indels (sequence ends)
• few substitutions (throughout the sequence)
DNA polymerase errors:
• not associated to polyN
• typical substitution pattern, e.g. G:C->A:T most common
Campbell PJ, PNAS 105, 13081-13086 (2008)
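Since the sequencing errors concentrate in homopolymer tracts longer than 4 bases, a first step is to locate those tracts. The sketch below is one simple way to do it with a regular expression; the function name and the test sequence are hypothetical.

```python
import re


def homopolymer_positions(sequence, min_run=5):
    """Return the 0-based positions that lie inside runs of at least
    min_run identical bases (polyN > 4 corresponds to min_run=5)."""
    positions = set()
    # (.) captures one base, \1{k,} requires k or more further copies of it.
    pattern = r"(.)\1{%d,}" % (min_run - 1)
    for match in re.finditer(pattern, sequence):
        positions.update(range(match.start(), match.end()))
    return positions


# Hypothetical sequence with a run of five A's at positions 4-8.
masked = homopolymer_positions("ACGTAAAAAGT")
print(sorted(masked))
```

Variants that fall on one of the returned positions could then be set aside as likely pyrosequencing artifacts.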

Ultra-deep sequencing (454)
Filter to detect "real" rare variants in 24 samples by excluding:
• poor quality reads
• indels and substitutions in polyN tracts > 4 bp
• variants expected from the distribution of polymerase errors
• variants seen only in forward or only in reverse reads
Sub-clonal mutations can be detected down to a frequency of 1/5000 reads
Phylogenetic analysis:
• ClustalW
• maximum parsimony
• 1000 bootstrap replicates
Campbell PJ, PNAS 105, 13081-13086 (2008)
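One way to decide whether a variant count exceeds what background errors would produce is a binomial tail probability. The slide does not state which statistical model the authors used, so the sketch below is an assumption for illustration only; the error rate of 1e-4 and the read counts are hypothetical.

```python
import math


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with terms computed in log space
    so that large read depths do not overflow."""
    if k <= 0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    below = 0.0
    for i in range(k):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        below += math.exp(log_term)
    return max(0.0, 1.0 - below)


# Hypothetical locus: 5000 reads, assumed background error rate of 1e-4.
depth, error_rate = 5000, 1e-4
for variant_reads in (1, 3, 5):
    print(variant_reads, binom_tail(variant_reads, depth, error_rate))
```

Under these assumed numbers a single variant read is quite likely to be an error, while five variant reads are very unlikely to arise by chance alone.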

Protein-DNA binding sites
ChIP-chip vs ChIP-seq (Fields S, Science 316, 1441-1442 (2007))
ChIP-chip limits:
- low resolution
- incorrect hybridizations
- a priori knowledge of potential binding sites
- no information on the sequence

Protein-DNA binding sites (Illumina)
Protein: NRSF (neuron-restrictive silencer factor)
• known "gold standard" target genes
• known DNA motif
• high-quality antibody
DNA samples:
• NRSF-enriched ChIP sample
• control of chromatin not immuno-enriched
Sequencing and mapping:
• 2-5M reads, 25 nt
• 50% map to unique locations
• <3 mismatches allowed
Detection of binding sites:
• >= 13 reads per sequence
• 5-fold enrichment vs control
• result: 1946 peaks
Johnson DS, Science 316, 1497-1502 (2007)
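The two detection criteria (at least 13 reads and 5-fold enrichment over the control) can be sketched as a simple filter. This is a toy version under stated assumptions: read counts per candidate region are already computed and comparable between the two samples, and a control count of zero is treated as one to avoid division by zero. The region names and counts are hypothetical.

```python
def call_binding_sites(chip_counts, control_counts, min_reads=13, min_fold=5.0):
    """Return the regions that pass both thresholds from the slide.

    chip_counts / control_counts: dicts mapping region name -> read count.
    Assumption: counts are already normalized between the two samples.
    """
    called = []
    for region, chip in chip_counts.items():
        control = max(control_counts.get(region, 0), 1)  # avoid dividing by zero
        if chip >= min_reads and chip / control >= min_fold:
            called.append(region)
    return called


# Hypothetical counts for three candidate regions.
chip = {"region1": 40, "region2": 12, "region3": 20}
control = {"region1": 2, "region3": 10}
sites = call_binding_sites(chip, control)
print(sites)
```

Here region2 fails the read-count threshold and region3 fails the enrichment threshold, so only region1 is reported.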

Protein-DNA binding sites (Illumina)
Benchmark:
• compare with known positive and negative binding sites
• sensitivity = 87%
• specificity = 98%
Variation of DNA motifs at the binding site:
• 100 bp from the "best" 10% segments screened by a motif-finding algorithm
• 75% have the known canonical motif
• detection of novel non-canonical motifs
[Figure: canonical and non-canonical motifs]
Johnson DS, Science 316, 1497-1502 (2007)
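For reference, the benchmark figures follow the usual definitions of sensitivity and specificity. The counts in the sketch below are hypothetical, chosen only so that the arithmetic reproduces the percentages on the slide; the real sizes of the positive and negative sets are not given here.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of known positive sites that were detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of known negative sites that were correctly not called."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (not from the paper), per 100 sites of each class.
print(sensitivity(true_pos=87, false_neg=13))
print(specificity(true_neg=98, false_pos=2))
```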

microRNA profiling (Illumina)
microRNAs: single-stranded RNA molecules, 21-23 nt long, that regulate gene expression
Samples:
• Pluripotent human embryonic stem cells (hESCs)
• Differentiated cells: embryoid bodies (EBs)
RNA preparation and sequencing:
• extraction of small RNAs
• libraries of single-stranded cDNA
• Illumina sequencing
Filter and mapping on the genome:
• unfiltered reads: 6M, 25 nt
• perfect alignments to the genome (no indels): ~4M (70%) reads and ~0.75M unique sequences
• only sequences observed in > 3 reads are kept
Overlap with DBs of known sequences: 5% of sequences
Morin RD, Genome Research 18, 610-621 (2008)
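Collapsing reads into unique sequences and keeping only those observed in more than 3 reads can be sketched with a counter. The read strings below are made up for illustration; the threshold is the one quoted on the slide.

```python
from collections import Counter


def collapse_reads(reads, min_reads=3):
    """Collapse identical reads into unique sequences with their counts,
    keeping only sequences seen in MORE than min_reads reads."""
    counts = Counter(reads)
    return {seq: n for seq, n in counts.items() if n > min_reads}


# Hypothetical small-RNA reads.
reads = ["TGAGGTAGTAGG"] * 5 + ["ACCGGTTAACCG"] * 3 + ["GGGTTTAAACCC"]
unique = collapse_reads(reads)
print(unique)
```

The count attached to each surviving sequence is also what the quantitative analysis later uses as an expression measure.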

microRNA profiling (Illumina)
Qualitative analysis (known microRNAs):
• detect the variability between reads of the same microRNA sequence: cleavage positions and post-transcriptional modifications
Morin RD, Genome Research 18, 610-621 (2008)

microRNA profiling (Illumina)
Quantitative analysis:
• read count per sequence is an index of the expression level (digital expression)
• detect the differential expression of microRNAs between samples
[Figure: 100 microRNAs]
Morin RD, Genome Research 18, 610-621 (2008)
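Digital expression can be illustrated by normalizing read counts to library size and comparing two samples. The reads-per-million scaling, the pseudocount, and the microRNA names below are assumptions made for this sketch; the slide does not say how the authors normalized or tested the counts.

```python
import math


def reads_per_million(counts):
    """Scale raw read counts to reads per million mapped reads."""
    total = sum(counts.values())
    return {name: 1e6 * n / total for name, n in counts.items()}


def log2_fold_change(counts_a, counts_b, pseudocount=1.0):
    """log2(B / A) per sequence on normalized counts.
    The pseudocount keeps sequences absent from one sample finite."""
    rpm_a, rpm_b = reads_per_million(counts_a), reads_per_million(counts_b)
    names = set(rpm_a) | set(rpm_b)
    return {
        name: math.log2(
            (rpm_b.get(name, 0.0) + pseudocount) / (rpm_a.get(name, 0.0) + pseudocount)
        )
        for name in names
    }


# Hypothetical counts for two made-up microRNAs in the two cell types.
hesc = {"miR-x": 300, "miR-y": 700}
eb = {"miR-x": 700, "miR-y": 300}
lfc = log2_fold_change(hesc, eb)
print(lfc)
```

A positive value means the sequence is more abundant in the second sample (here, the differentiated cells).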

Transcriptome profiling (SOLiD)
Samples:
• Pluripotent mouse embryonic stem cells (ES)
• Differentiated cells: embryoid bodies (EB)
Procedure:
• mRNA extraction
• library generation (in triplicate per sample)
• sequencing
Cloonan N, Nature Methods 5(7), 613-619 (2008)

Transcriptome profiling (SOLiD)
Filter and mapping strategy: 7 steps!!
1. Quality check or removal of 5 nt
2. Clustering to unique tags
3. Mapping on the genome (<= 2 mismatches)
Good quality reads: ~155M reads per sample
Reads mapping on the genome:
• ~95M reads (60%)
Multiple mapping is accepted (if fewer than 100 positions)
Cloonan N, Nature Methods 5(7), 613-619 (2008)
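The mapping rule (at most 2 mismatches, multi-mapping tags accepted if they hit fewer than 100 positions) can be sketched with a naive scan. This is a toy illustration in base space on a tiny made-up sequence; real SOLiD reads are in colour space and are mapped with dedicated aligners, and the function name is hypothetical.

```python
def map_tag(tag, genome, max_mismatches=2, max_hits=100):
    """Return all 0-based start positions where tag matches genome with at
    most max_mismatches substitutions. Tags that hit max_hits or more
    positions are rejected (empty list), mirroring the multi-mapping rule."""
    hits = []
    for start in range(len(genome) - len(tag) + 1):
        window = genome[start:start + len(tag)]
        mismatches = sum(a != b for a, b in zip(tag, window))
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits if len(hits) < max_hits else []


# Hypothetical genome fragment and 4-nt tag.
genome = "ACGTACGTTTGACC"
print(map_tag("ACGA", genome))                    # up to 2 mismatches
print(map_tag("ACGA", genome, max_mismatches=1))  # stricter setting
```

Clustering reads to unique tags before this step means each distinct tag is scanned only once, whatever its read count.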

Transcriptome profiling (SOLiD)
Custom track on UCSC:
• variation in tag coverage
• bias: multiple mapping
Gene expression (tag count):
• high reproducibility between replicates (r > 0.95)
• good reproducibility between tag counts per gene and microarray signal
Differential expression between samples:
• tag counts per gene in ES and EB (35/50 ES markers were confirmed): 70% sensitivity
Cloonan N, Nature Methods 5(7), 613-619 (2008)
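Reproducibility between replicates is reported as a correlation coefficient r. The sketch below computes a plain Pearson correlation on per-gene tag counts; the slide does not say whether counts were transformed first, and the replicate values are made up. The last line checks the 35/50 marker figure.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical tag counts per gene in two replicate libraries.
replicate_1 = [120, 35, 980, 4, 260]
replicate_2 = [131, 30, 1010, 6, 249]
print(pearson_r(replicate_1, replicate_2))

# Sensitivity from the slide: 35 of 50 ES markers confirmed.
marker_sensitivity = 35 / 50
print(marker_sensitivity)
```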

Transcriptome profiling (SOLiD)
Transcriptome discovery:
• ~33% of tags are in non-exonic sequences
• 20% of tags are in repeat elements (normally excluded from expression arrays)
Alternative splicing isoforms:
• high-quality 35-mers were clustered into longer consensus sequences (>50 nt)
• BLAT on the genome
Cloonan N, Nature Methods 5(7), 613-619 (2008)

Transcriptome profiling (SOLiD)
Discovery of expressed SNPs: extensive filtering!
• Mapping to the genome: only full-length tags (35 nt); multi-mapping tags are excluded
• Filter by colour-space errors and high quality
• Filter by error profile of tags: first 6 nt, last 5 nt and 26
• Filter by proportion: 75% of tags are mutated (heterozygous mutations are systematically discarded)
Results:
• 2,000 putative SNPs in both samples
• 643 in RefSeq (84% known SNPs)
• 8/10 non-synonymous SNPs validated by PCR: specificity = 80%
Cloonan N, Nature Methods 5(7), 613-619 (2008)
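The proportion filter explains why heterozygous mutations are discarded: a heterozygous site is expected to show the variant in roughly half of the tags, below the 75% threshold. A minimal sketch, assuming the threshold is inclusive and that tag counts per allele are already available; the counts are hypothetical.

```python
def passes_proportion_filter(ref_tags, alt_tags, min_fraction=0.75):
    """True if the variant allele is carried by at least min_fraction of the
    tags covering the position (inclusive threshold is an assumption)."""
    total = ref_tags + alt_tags
    if total == 0:
        return False  # no coverage, nothing to call
    return alt_tags / total >= min_fraction


# Hypothetical positions: near-homozygous variant vs heterozygous-like site.
print(passes_proportion_filter(ref_tags=2, alt_tags=8))   # 80% mutated
print(passes_proportion_filter(ref_tags=5, alt_tags=5))   # 50% mutated

# Validation rate from the slide: 8 of 10 tested SNPs confirmed by PCR.
validation_rate = 8 / 10
print(validation_rate)
```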

Summary
[Figure axes: number of reads, read length]

Application                 454              Illumina                      SOLiD
Genome sequencing           Small genomes    Small genomes                 No
Genome re-sequencing        Yes              Yes                           Small genome
Metagenomics                Yes              Only virus                    No
Amplicon sequencing         Yes              No                            No
Ultra-deep sequencing       Yes              Tested only for 100s reads    Tested only for 100s reads
Transcriptome analysis      Yes              Yes                           Yes
Structural variations       Yes              Yes                           Yes
SNPs and point mutations    Yes              Yes                           Yes
ChIP-seq                    Yes              Yes                           Yes
Nucleosome positioning      Yes              Yes                           Yes
