Around the resistome in 80 ways: an empirical evaluation of antimicrobial resistance gene detection methods

Finlay Maguire
finlaymaguire@gmail.com
December 2, 2019
Faculty of Computer Science, Dalhousie University


  1. Highly similar families to blame

  2. Is there any way to improve this?

  3. Statistical/Machine-Learning Correction
      DIAMOND-BLASTX Output → Classifier → AMR Gene Predictions
      Average Precision: 0.63

  8. Revised classifier structure: exploiting the ARO
      DIAMOND-BLASTX Output → AMR Family Classifier → AMR Families (Family 1 Reads, ..., Family N Reads) → per-family classifiers (Family 1 Classifier, ..., Family N Classifier) → AMR Gene Predictions
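
The two-stage routing on this slide can be sketched roughly as below. This is only an illustration under stated assumptions, not the implementation from the talk: the gene names, the gene-to-family mapping, and the bitscores are hypothetical, and both stages use a simple highest-score rule in place of trained classifiers.

```python
from collections import defaultdict

# Hypothetical mapping from reference AMR gene to its AMR gene family.
GENE_TO_FAMILY = {
    "geneA1": "family_A",
    "geneA2": "family_A",
    "geneB1": "family_B",
}


def predict_family(bitscores):
    """Stage 1: pick the family with the highest summed bitscore."""
    totals = defaultdict(float)
    for gene, score in bitscores.items():
        totals[GENE_TO_FAMILY[gene]] += score
    return max(totals, key=totals.get)


def predict_gene(bitscores, family):
    """Stage 2: pick the best-scoring gene, restricted to one family."""
    in_family = {
        gene: score
        for gene, score in bitscores.items()
        if GENE_TO_FAMILY[gene] == family
    }
    return max(in_family, key=in_family.get)


def classify_read(bitscores):
    """Route a read through the family stage, then that family's gene stage."""
    family = predict_family(bitscores)
    return family, predict_gene(bitscores, family)


# One read's (hypothetical) bitscores against each reference gene.
read_hits = {"geneA1": 60.0, "geneA2": 55.0, "geneB1": 80.0}
result = classify_read(read_hits)
```

In this toy input the single best hit is `geneB1`, but the family stage favours `family_A`, so the read is resolved among that family's genes only. A real per-family stage would be a trained model rather than a maximum.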

  14. Slightly improved family performance
      [Figure: Family Test Performance. Precision and recall (proportion, 0.00 to 1.00) for Normalised Bitscore and Random Forest.]
      Mean Precision: 0.995, Mean Recall: 0.985

  15. Greatly improved gene performance

  16. Gains not evenly distributed
      [Figure: Median Precision-Recall Within Families. Precision and recall (proportion, 0.00 to 1.00) against Ordered AMR Family Index (0 to 225).]
      • Not enough signal in a read, so output the compatible set
      • Some fixed bugs
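
A per-family summary like the one plotted here could be computed along the following lines. This is a minimal sketch, assuming gene-level precision and recall are first computed per gene and then summarised by their median within each family; the labels and family mapping are hypothetical toy data.

```python
from collections import Counter, defaultdict
from statistics import median


def per_gene_precision_recall(true_labels, predicted_labels):
    """Return {gene: (precision, recall)} from paired true/predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted gene gains a false positive
            fn[truth] += 1  # true gene gains a false negative
    metrics = {}
    for gene in set(true_labels) | set(predicted_labels):
        n_predicted = tp[gene] + fp[gene]
        n_actual = tp[gene] + fn[gene]
        precision = tp[gene] / n_predicted if n_predicted else 0.0
        recall = tp[gene] / n_actual if n_actual else 0.0
        metrics[gene] = (precision, recall)
    return metrics


def median_by_family(metrics, gene_to_family):
    """Return {family: (median precision, median recall)} over its genes."""
    grouped = defaultdict(list)
    for gene, scores in metrics.items():
        grouped[gene_to_family[gene]].append(scores)
    return {
        family: (
            median([p for p, _ in scores]),
            median([r for _, r in scores]),
        )
        for family, scores in grouped.items()
    }


# Hypothetical toy labels: one read per entry.
families = {"a1": "family_A", "a2": "family_A", "b1": "family_B"}
truth = ["a1", "a1", "a2", "b1"]
predicted = ["a1", "a2", "a2", "b1"]
summary = median_by_family(per_gene_precision_recall(truth, predicted), families)
```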

  17. Metagenomic resistome profile
      [Figure: AMR hits related to Drug Class, for 47 human gut metagenome profiles. Axes: Normalised Read Proportion (log scale) against Drug Class.]
      Drug classes shown: ARO:0000042 glycylcycline; ARO:0000072 linezolid; ARO:0000004 monobactam; ARO:0000025 fosfomycin; ARO:3000157 rifamycin antibiotic; ARO:3000034 nucleoside antibiotic; ARO:3000111 novobiocin; ARO:3000282 sulfonamide antibiotic; ARO:3000053 peptide antibiotic; ARO:0000041 bacitracin; ARO:3003253 aminocoumarin sensitive parY; ARO:3000657 paromomycin; ARO:0000021 ribostamycin; ARO:3000701 lividomycin B; ARO:3000700 lividomycin A; ARO:3000655 gentamicin B; ARO:0000024 butirosin; ARO:0000049 kanamycin A; ARO:0000032 cephalosporin; ARO:3000387 phenicol antibiotic; ARO:3000554 mupirocin; ARO:0000001 fluoroquinolone antibiotic; ARO:0000044 cephamycin; ARO:3000103 aminocoumarin antibiotic; ARO:3000171 diaminopyrimidine antibiotic; ARO:0000000 macrolide antibiotic; ARO:0000016 aminoglycoside antibiotic; ARO:0000026 streptogramin antibiotic; ARO:3000081 glycopeptide antibiotic; ARO:0000022 polymyxin antibiotic; ARO:0000017 lincosamide antibiotic; Indeterminate Class; ARO:3001219 elfamycin antibiotic; ARO:3000007 beta-lactam antibiotic; ARO:3000050 tetracycline derivative

  18. Great, but...
      • Known AMR genes
      • Is one organism resistant to everything?
      • Are many organisms each resistant to one thing?
      • Have AMR genes been laterally transferred?

  19. Can we get the best of metagenomics and genomics?

  20. Metagenomic-Assembled Genomes

  21. MAG binning
      Genomes → Sequencing → Reads → Assembly → Contigs → Binning → Metagenome-Assembled Genomes

  22. MAGs are popular
      Figure from (Parks et al., 2017)

  23. What about plasmids?
      Figure from (Antipov et al., 2016)
      • Circular or linear extrachromosomal self-replicating DNA.
      • Dissemination of AMR genes.
      • Repetitive, variable copy number, different sequence composition.

  24. Or genomic islands
      www.pathogenomics.sfu.ca/islandviewer
      • Clusters of genes acquired through LGT
      • Integrons, transposons, integrative and conjugative elements (ICEs) and prophages
      • Variable copy number and composition (used by SIGI-HMM, IslandPath-DIMOB)

  25. How well do MAGs recover these sequences?

  26. Time to start simulating again
      • Simulate some metagenomes (lognormal abundance distribution) from difficult genomes
        • 10 genomes: lots of plasmids
        • 10 genomes: high % of genomic islands (compositional)
        • 10 genomes: low % of genomic islands
      • Assembly using 3 alternative methods: IDBA-UD, MetaSPAdes, Megahit
      • Bin contigs using 4 different tools: metabat2, maxbin2, concoct, dastool
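
Drawing relative abundances for the 30 genomes from a lognormal distribution might look like the sketch below. The `mu`, `sigma`, and `seed` values are assumptions chosen for illustration; the slide does not state the parameters used.

```python
import random


def lognormal_abundances(n_genomes, mu=1.0, sigma=2.0, seed=42):
    """Draw lognormal weights and normalise them to relative abundances.

    mu, sigma, and seed are illustrative defaults, not values from the talk.
    """
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    raw = [rng.lognormvariate(mu, sigma) for _ in range(n_genomes)]
    total = sum(raw)
    return [value / total for value in raw]


# 10 plasmid-rich + 10 high-GI + 10 low-GI genomes, as listed on the slide.
abundances = lognormal_abundances(30)
```

The resulting fractions sum to 1 and could be passed to a read simulator as per-genome sampling weights.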

  27. Chromosomes fairly well binned
      26-94.3% median chromosomal coverage (pre-print draft: github.com/fmaguire/mag_sim_paper)

  29. Plasmids are not
      1.5-29.2% plasmids binned

  30. Genomic islands are better but bad
      28-42% GIs binned

  31. What about AMR genes?
      24-43% AMR genes binned

  32. Which AMR genes are lost?
      • 30-53% chromosomal AMR genes (n=120)
      • 0-45% genomic island AMR genes (n=11)
      • 0% of plasmid AMR genes (n=20)
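
A per-location breakdown of this kind can be tallied as in the sketch below, which computes the fraction of AMR genes that end up in any bin for each genomic location. The gene identifiers and counts are hypothetical toy data, not the figures from the talk.

```python
def binned_fraction(genes_by_location, binned_genes):
    """Return {location: fraction of that location's AMR genes found in bins}."""
    fractions = {}
    for location, genes in genes_by_location.items():
        recovered = sum(1 for gene in genes if gene in binned_genes)
        fractions[location] = recovered / len(genes) if genes else 0.0
    return fractions


# Hypothetical AMR genes in the reference genomes, grouped by location.
reference_amr = {
    "chromosome": ["c1", "c2", "c3", "c4"],
    "genomic_island": ["g1", "g2"],
    "plasmid": ["p1", "p2"],
}
# Hypothetical AMR genes detected on contigs that were placed in a bin.
in_bins = {"c1", "c2", "g1"}

fractions = binned_fraction(reference_amr, in_bins)
```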

  33. Be cautious with MAGs
      • Regain some context but with biased data loss
      • Disproportionate loss of AMR genes
      • Mobile Genetic Elements poorly recovered
      • Cautionary tale: more processing = more data loss

  34. Conclusions

  39. Conclusions

      | Method | Strengths | Weaknesses |
      |---|---|---|
      | Targeted | Cheap, easy analysis | a priori, stagnation |
      | Genomics | Context, moderate analysis | Isolation, throughput |
      | Metagenomics | Many genomes at once | Fragmented, no context, difficult analysis |
      | Metagenomic-Assembled Genomes | Context for many genomes | Lose key data, complex analysis |

      • Simulation fundamental to evaluating approaches
