
Improving genome assemblies, assessing structural variation and trait association using chromosome genomics and Illumina skim genotyping by sequencing

David Edwards, University of Queensland, Australia (Dave.Edwards@uq.edu.au)


  1. Improving genome assemblies, assessing structural variation and trait association using chromosome genomics and Illumina skim genotyping by sequencing. David Edwards, University of Queensland, Australia. Dave.Edwards@uq.edu.au

  2. Outline • Chromosome sequencing • SNP discovery • Genotyping by sequencing (skim method) • Validating genome structure

  3. The challenge of genome sequence • Technology - Next Generation sequencing

  4. The challenge of genome sequence • Technology - Next Generation sequencing • Thanks to Roger Hellens, Plant and Food New Zealand

  5. Hexaploid wheat genome: 17 billion bases. http://www.jic.ac.uk/staff/graham-moore/wheat_meiosis.htm

  6. Chromosome sequencing • Isolate individual or groups of chromosomes using flow cytometry • Generate NGS libraries and PE Illumina data • Assemble or map reads to reference genome

  7. Mapping reads to reference genomes [Figure: panels labelled 1 to 12]

  8. Sequencing wheat chromosome arms [Figure labels: Ta 7DS, Bd 1, Bd 3] www.wheatgenome.info Berkman et al., Plant Biotechnology Journal (2011)

  9. 7BS/4AL translocation: 7DS and 7BL sequence similarity with Brachypodium

  10. 7BS/4AL translocation • Translocation between Bradi1g49500 and Bradi1g49550 • Intervening 4 genes missing from all assemblies • ~13% genes moved from 7BS to 4AL • 13 genes moved from 4AL to 7BS • Berkman et al. (2012) Theoretical and Applied Genetics 3, 423-432

  11. Wheat genome evolution [Figure: timeline with events 50,000 and 10,000 years ago; genome labels AA, BB, DD, AABB, AABBDD; chromosomes 7A, 7B, 7D]

  12. GBrowse http://wheatgenome.info/ Lai et al. (2012) Plant and Cell Physiology 53, 1-7

  13. Genome sequencing in chickpea: two draft genomes published in 2013

  14. Chickpea reference (Kabuli)

  15. Chickpea reference (Kabuli)

  16. Chickpea reference (Kabuli) K8 D8 K3 D3 K5 D5 K = Kabuli D = Desi

  17. Chickpea reference (Kabuli) K8 D8 K3 D3 K5 D5

  18. Chickpea reference (Desi) A 8 3 5

  19. Chromosome sequencing • Sequencing isolated chromosomes identifies misassemblies and rearrangements at base-pair resolution

  20. SGSautoSNP • Generate a reference • Map variety specific reads to the reference • Call differences between the varieties • At least two reads defining the difference • No conflict within a variety (homozygous genomes) • >95% accuracy for canola • >93% accuracy for wheat
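
The calling rule on this slide can be sketched in a few lines of Python. This is only an illustration of the two stated filters, not the SGSautoSNP code itself: the input format, the function name `call_snp` and the reading of "at least two reads" as "each allele supported by at least two reads" are all assumptions.

```python
from collections import Counter


def call_snp(reads_by_variety, min_reads=2):
    """Return the sorted alleles if the position looks like a SNP, else None.

    reads_by_variety maps a variety name to the list of bases observed at
    one reference position (a hypothetical input format).
    """
    allele_support = Counter()
    for variety, bases in reads_by_variety.items():
        observed = set(bases)
        if len(observed) > 1:
            # Conflict within a variety: reject, genomes assumed homozygous.
            return None
        if observed:
            allele_support[observed.pop()] += len(bases)
    # A difference between varieties needs at least two distinct alleles.
    if len(allele_support) < 2:
        return None
    # Assumption: every allele must be backed by at least min_reads reads.
    if any(n < min_reads for n in allele_support.values()):
        return None
    return sorted(allele_support)


if __name__ == "__main__":
    print(call_snp({"variety_1": ["T", "T"], "variety_2": ["C", "C", "C"]}))
```

Rejecting any position with a within-variety conflict trades sensitivity for accuracy, which fits the homozygous-genome assumption stated on the slide.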

  21. Brassica SNP matrix (lower triangle; columns in the same order as the rows: A Bn E I J M1 M51 M52 M91 M2 Mu N No S Sr T Tf Tr)
      A    0
      Bn   55,716  0
      E    57,492  67,676  0
      I    27,487  33,874  26,406  0
      J    100,933  108,457  86,807  52,377  0
      M1   52,541  61,657  43,746  20,655  93,148  0
      M51  53,627  69,495  54,071  30,968  93,966  56,190  0
      M52  64,088  68,533  63,092  34,656  51,013  63,219  60,793  0
      M91  70,214  80,230  57,023  38,612  89,294  67,496  60,932  58,091  0
      M2   34,535  38,248  27,954  18,731  41,866  34,073  29,306  27,318  11,944  0
      Mu   106,182  121,584  87,536  46,824  192,343  72,205  114,260  130,317  131,155  66,838  0
      N    159,608  208,373  146,700  73,345  270,623  139,082  178,653  205,985  215,689  113,928  258,980  0
      No   81,073  97,160  86,610  39,263  164,813  81,265  93,250  98,393  97,109  46,546  174,630  252,923  0
      S    40,857  42,661  53,786  28,431  92,840  51,584  55,260  60,118  64,493  31,424  101,900  160,234  81,474  0
      Sr   65,657  85,317  63,305  38,484  113,199  68,078  3,798  73,578  73,825  35,584  137,597  215,422  115,212  68,231  0
      T    124,971  149,974  100,000  51,304  212,272  61,611  132,415  153,887  153,504  82,307  175,304  296,891  213,237  119,697  157,308  0
      Tf   57,190  76,556  78,239  39,240  140,978  68,383  59,394  78,257  90,655  41,702  157,441  262,784  125,298  65,430  74,385  194,683  0
      Tr   11,193  14,028  12,553  6,760  21,972  12,045  6,624  13,849  16,149  7,794  25,791  39,920  20,127  12,249  8,314  30,468  12,331  0
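
A matrix of this shape can be built by counting, for each pair of varieties, the positions where both have a call and the alleles differ. The sketch below assumes that definition and a hypothetical `{variety: {position: allele}}` input; the slide does not state how its counts were produced.

```python
from itertools import combinations


def pairwise_snp_counts(calls):
    """Count differing alleles for every pair of varieties.

    calls maps variety -> {position: allele} (hypothetical format).
    Only positions called in both varieties are compared.
    """
    counts = {}
    for v1, v2 in combinations(sorted(calls), 2):
        shared = calls[v1].keys() & calls[v2].keys()
        counts[(v1, v2)] = sum(calls[v1][p] != calls[v2][p] for p in shared)
    return counts


if __name__ == "__main__":
    toy_calls = {
        "A": {1: "T", 2: "G", 3: "C"},
        "Bn": {1: "C", 2: "G"},
        "E": {1: "C", 3: "A"},
    }
    for pair, n in pairwise_snp_counts(toy_calls).items():
        print(pair, n)
```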

  22. Skim GBS • Determine SNPs by sequencing parents and running SGSautoSNP • Low-coverage skim sequencing of the segregating population • Map reads to the reference genome • Call genotype where reads cover previously defined SNP • Impute and clean to define haplotype blocks
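
The last two steps (calling at known SNPs, then imputing) might look like the following sketch for a single individual. The data layout, the function names and the simple "fill gaps flanked by the same parent" imputation rule are assumptions made for illustration; the cleaning step is not shown.

```python
def call_genotypes(snps, reads_at):
    """Call a parental genotype at each previously defined SNP.

    snps: list of (position, allele_parent_a, allele_parent_b), sorted by
    position. reads_at: {position: [bases]} for one individual.
    Returns 'A', 'B' or None (no reads, or ambiguous reads) per SNP.
    """
    calls = []
    for pos, allele_a, allele_b in snps:
        bases = set(reads_at.get(pos, []))
        if bases == {allele_a}:
            calls.append("A")
        elif bases == {allele_b}:
            calls.append("B")
        else:
            calls.append(None)
    return calls


def impute(calls):
    """Fill runs of missing calls flanked on both sides by the same parent."""
    imputed = list(calls)
    last_called = None
    for i, call in enumerate(calls):
        if call is None:
            continue
        if last_called is not None and calls[last_called] == call:
            for j in range(last_called + 1, i):
                imputed[j] = call
        last_called = i
    return imputed


if __name__ == "__main__":
    snps = [(10, "A", "G"), (20, "T", "C"), (30, "C", "A"), (40, "G", "T")]
    reads = {10: ["A"], 30: ["C", "C"], 40: ["T"]}
    print(impute(call_genotypes(snps, reads)))
```

Gaps between calls from different parents are left missing, since a recombination breakpoint could lie anywhere inside them.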

  23. Genotype calling • Call genotype of previously predicted SNPs [Figure labels: A, A, T/C, C/A]

  24. Pre-imputation

  25. After imputation and cleaning

  26. Misplaced contigs in assembly?
