
Rapid Identification of AMR Determinants from Metagenomic Samples



  1. Rapid Identification of AMR Determinants from Metagenomic Samples: AMRtime Progress Report. Finlay Maguire, June 22, 2018. Faculty of Computer Science, Dalhousie University.

  2. Table of contents
     1. Overview
     2. Training Data
     3. Read filtering
     4. Sensitive Homology Search
     5. Variant Models
     6. Summary
     7. Acknowledgements

  3. Overview

  13. Comprehensive Antibiotic Resistance Database
      • https://card.mcmaster.ca/ (Jia et al., 2016) as of June 2018:
        • Built around Antibiotic Resistance Ontology (ARO): 3996 terms
        • 2536 AMR Detection Models with manually curated criteria:
          • Homology e.g. NDM beta-lactamases, aminoglycoside acetyltransferase
          • Protein Variant e.g. GyrA fluoroquinolone mutation, FolP sulfonamide mutation
          • rRNA gene variants e.g. Mycobacterium aminoglycoside resistance
          • Efflux pump e.g. AcrAB-TolC, MexAB-OprM mutations
          • Gene cluster e.g. Van glycopeptide resistance clusters
        • Resistance Gene Identifier (RGI): contigs, predicted genes and merged metagenomic reads
        • CARDPredicted prevalence dataset

  18. Metagenomic Analysis (modified from https://www.gatc-biotech.com/en/expertise/genomics/metagenome-analysis.html)
      Key difficulties:
      • Variation in abundance and diversity
      • Short fragmentary data
      • Large amounts of data
      • Compositionality
      • Sparse and imbalanced labels

  19. AMRtime Structure [flow diagram]
      Legend: Input files, Processes, Intermediate files, Output files
      Elements: Metagenomic Reads, AMR Filtering, Filtered reads, CARD, Sensitive Homology Search, Homology predictions, Variant Identification, Metamodels, Variant predictions, Metamodel predictions

  20. Training Data

  21. Dataset Generator [flow diagram]
      Elements: Assembled Genomes (*.fna), Resistance Gene Identifier (RGI), Abundance/Diversity Resampling, CARD AMR Annotations (*.gff), 'Assembled' metagenome (.fna), Illumina Simulator (ART), Labelling, Synthetic metagenome (.fq), Read labels (.txt)
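
The Labelling step in the Dataset Generator slide is not described in detail. One plausible reading is that each simulated read is labelled by whether its source coordinates overlap an AMR annotation from the GFF file. The sketch below illustrates that idea only; the `Interval` type, the use of a `Name=` attribute as the label, the `non-AMR` default and the `min_overlap` parameter are all assumptions, not taken from the slides.

```python
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    contig: str
    start: int  # 1-based, inclusive (GFF convention)
    end: int    # 1-based, inclusive
    label: str


def parse_gff_line(line: str) -> Interval:
    """Parse one tab-separated GFF feature line.

    Assumption: the AMR label is stored in a 'Name=' attribute.
    """
    fields = line.rstrip("\n").split("\t")
    attrs = dict(kv.split("=", 1) for kv in fields[8].split(";") if "=" in kv)
    return Interval(fields[0], int(fields[3]), int(fields[4]),
                    attrs.get("Name", "unknown"))


def label_read(contig: str, start: int, end: int,
               annotations: Dict[str, List[Interval]],
               min_overlap: int = 1) -> List[str]:
    """Return labels of all annotations overlapping the read by >= min_overlap bases."""
    labels = []
    for iv in annotations.get(contig, []):
        overlap = min(end, iv.end) - max(start, iv.start) + 1
        if overlap >= min_overlap:
            labels.append(iv.label)
    return labels or ["non-AMR"]  # hypothetical default label


# Hypothetical annotation and reads, for illustration only
feature = parse_gff_line("contig1\tRGI\tCDS\t100\t400\t.\t+\t0\tID=x;Name=OXA-2")
annotations = {feature.contig: [feature]}
print(label_read("contig1", 350, 500, annotations))
print(label_read("contig1", 401, 550, annotations))
```

A stricter `min_overlap` would avoid labelling reads that only graze the edge of a gene, at the cost of discarding some true positives.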

  22. Determinants are scarce

  23. Determinants are imbalanced

  24. AMR sequence space is biased

  25. Read filtering

  26. Homology Filter Approaches
      • BLASTX (Gish et al., 1993)
      • DIAMOND (Buchfink et al., 2015)
      • PALADIN (Westbrook et al., 2017)
      • MMSeqs2 (Steinegger and Söding, 2017)
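
A homology filter of this kind keeps only the reads that hit a reference AMR sequence. As a minimal sketch, assuming the search tool was run with BLAST-style 12-column tabular output (for example BLASTX `-outfmt 6` or DIAMOND `--outfmt 6`), the reads to retain could be collected as below. The e-value cutoff and the example rows are illustrative assumptions, not settings from the presentation.

```python
import csv
import io

# Standard 12-column BLAST tabular layout
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def reads_with_hits(handle, max_evalue=1e-5):
    """Return the set of read IDs with at least one hit at or below max_evalue."""
    keep = set()
    for row in csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t"):
        if float(row["evalue"]) <= max_evalue:
            keep.add(row["qseqid"])
    return keep


# Two made-up hit rows: read1 has a strong hit, read2 only a weak one
example = (
    "read1\tARO_A\t98.0\t50\t1\t0\t1\t150\t10\t59\t1e-20\t95.0\n"
    "read2\tARO_B\t40.0\t30\t18\t0\t1\t90\t5\t34\t0.5\t25.0\n"
)
print(reads_with_hits(io.StringIO(example)))
```

Loosening `max_evalue` trades precision for recall, which is the kind of trade-off the later "best setting per tool" slides appear to explore.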

  27. Performance at defaults?

  28. How computationally efficient are they?

  29. What about in terms of memory?

  30. Is there a cap on overall performance?

  31. What about to hit any ARO?

  32. Performance for best setting per tool

  33. But what about individual ARO performance?

  34. Systematically missing AROs

  43. Why are these 10 always missed?
      • Enterococcus faecalis liaS mutant conferring daptomycin resistance (AE016830.1):
        • Protein 2790824-2789724
        • DNA 1-732
      • OXA-2 (M95287.4):
        • Protein 2456-3280
        • DNA 1-828
      • Acinetobacter OprD conferring resistance to imipenem (CP006768.1):
        • Protein 3513470-3514777
        • DNA 3514887-3515414
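
The slide does not state why these entries are missed, but the listed coordinates can be compared directly. The sketch below computes the length of each protein and DNA range and checks whether the two ranges overlap. It assumes the coordinates are 1-based and inclusive, and that a range written high-to-low denotes the reverse strand; neither is confirmed by the slides.

```python
def span_length(start: int, end: int) -> int:
    """Length in nucleotides of an inclusive range, in either orientation."""
    return abs(end - start) + 1


def ranges_overlap(a, b) -> bool:
    """True if two inclusive ranges share at least one position."""
    a_lo, a_hi = sorted(a)
    b_lo, b_hi = sorted(b)
    return a_lo <= b_hi and b_lo <= a_hi


# Coordinates copied from the slide
coords = {
    "liaS mutant (AE016830.1)": {"protein": (2790824, 2789724), "dna": (1, 732)},
    "OXA-2 (M95287.4)": {"protein": (2456, 3280), "dna": (1, 828)},
    "OprD (CP006768.1)": {"protein": (3513470, 3514777), "dna": (3514887, 3515414)},
}

for name, c in coords.items():
    print(name,
          "protein span:", span_length(*c["protein"]),
          "DNA span:", span_length(*c["dna"]),
          "overlap:", ranges_overlap(c["protein"], c["dna"]))
```

Under these assumptions, none of the three protein ranges overlaps its DNA range, and the liaS and OprD spans differ substantially in length (1101 vs 732 and 1308 vs 528 nucleotides), while the OXA-2 spans differ by three nucleotides (825 vs 828).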

  46. CARD Full Length Alignment QC
      • 11 AROs protein not detected from DNA
      • 2 AROs different top protein hit from DNA
      • Warnings: 119 AROs with different top protein but ID % > 99
