Applications of the US EPAs CompTox Chemicals Dashboard to support - - PowerPoint PPT Presentation

SLIDE 1

Applications of the US EPA’s CompTox Chemicals Dashboard to support structure identification and chemical forensics using mass spectrometry

Antony Williams (1) and Andrew D. McEachran (2, 3)

1) National Center for Computational Toxicology, U.S. Environmental Protection Agency, RTP, NC
2) Oak Ridge Institute for Science and Education (ORISE) Research Participant, RTP, NC
3) Present Address: Agilent Inc., Santa Clara, CA

Pittcon, Philadelphia, March 2019
ORCID: http://www.orcid.org/0000-0002-2668-4821

The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA

SLIDE 2
National Center for Computational Toxicology

  • National Center for Computational Toxicology established in 2005 to integrate:
    – High-throughput and high-content technologies
    – Modern molecular biology
    – Data mining and statistical modeling
    – Computational biology and chemistry
  • Researching computational approaches to quickly evaluate the safety of chemicals for potential risk.
  • Outputs: a lot of data, models, algorithms and software applications

SLIDE 3

CompTox Chemicals Dashboard

  • A publicly accessible website delivering access to:
    – ~875,000 chemicals with related property data
    – Searchable by chemical, product use, gene and assay (ToxCast)
    – Experimental and predicted physicochemical property data
    – “Bioactivity data” for the ToxCast/Tox21 project
    – Generalized Read-Across (GenRA) module
    – Links to other agency websites and public data resources
    – “Literature” searches for chemicals using public resources
    – “Batch searching” for thousands of chemicals
    – DOWNLOADABLE Open Data for reuse and repurposing

SLIDE 4

CompTox Chemicals Dashboard

https://comptox.epa.gov/dashboard

SLIDE 5

Search Chemicals

SLIDE 6

Detailed Chemical Pages

SLIDE 7

Access to Chemical Hazard Data

SLIDE 8

In Vitro Bioassay Screening

ToxCast and Tox21

SLIDE 9

Sources of Exposure to Chemicals

SLIDE 10

MS-Ready Mappings

SLIDE 11

Specific Data-Mappings “MS-Ready Structures”

SLIDE 12

MS-Ready Publication

https://doi.org/10.1186/s13321-018-0299-2

SLIDE 13

MS-Ready Mappings Set

SLIDE 14

Mass and Formula Searches Supporting Mass Spectrometry

SLIDE 15

Advanced Searches: Mass-Based Search

SLIDE 16

Advanced Searches: Mass-Based Search

SLIDE 17

Advanced Searches: Mass-Based Search

SLIDE 18

Batch Searching

  • Singleton searches are useful but we work with thousands of chemicals!
  • Typical questions
    – What is the list of chemicals for the formula CxHyOz?
    – What is the list of chemicals for a mass +/- error?
    – Can I get chemical lists in Excel files? In SDF files?
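
The "mass +/- error" question can be illustrated with a tolerance window expressed in parts per million (ppm). The following is a minimal sketch under stated assumptions, not the Dashboard's own implementation: the function names and the candidate list are hypothetical, and the masses are placeholders rather than Dashboard data.

```python
def mass_window(mass, ppm):
    """Return (low, high) bounds for a mass with a symmetric ppm tolerance."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


def within_tolerance(candidates, target_mass, ppm):
    """Keep (name, mass) pairs whose mass falls inside the ppm window."""
    low, high = mass_window(target_mass, ppm)
    return [(name, m) for name, m in candidates if low <= m <= high]


# Hypothetical candidate list; names and masses are placeholders only.
candidates = [
    ("candidate_A", 292.0907),
    ("candidate_B", 292.1200),
    ("candidate_C", 292.0899),
]

# Search for a measured neutral mass with a 5 ppm tolerance.
hits = within_tolerance(candidates, 292.0907, ppm=5)
for name, m in hits:
    print(name, m)
```

A ppm tolerance scales with the mass being searched, which is why it is commonly preferred over a fixed absolute error for high-resolution data.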

SLIDE 19

Batch Searches

SLIDE 20

Batch Searches

SLIDE 21

Batch Searching Formula/Mass

SLIDE 22

Excel Output

SLIDE 23

Suspect Screening and Non-Targeted Analysis Workflow

[Figure: workflow diagram with two branches, Suspect Screening and Non-Targeted Analysis, both linked to the DSSTox Chemical Database]

Box labels, in extraction order: “Molecular Features”, Extracted Samples, Raw Samples, Raw Features, Matched Formulas, Mapped Structures, Prioritized Structures (using ToxPi), Confirmed Structures (using ToxCast standards), Processed Features, Prioritized Features, Predicted Formulas, Candidate Structures, Sorted Structures, Predicted Retention Times, Predicted/Observed Functional Use, Top Candidate Structure(s), Predicted Concentrations, Predicted/Observed Media Occurrence, Predicted Mass Spectra, Methodological Concordance

Color Key

  • Red = Analytical Chemistry
  • Blue = Data Processing & Analysis
  • Green = Informatics & Web Services
  • Purple = Mathematical & QSPR Modeling

SLIDE 24

MS-Ready Structures Underpin Analysis

SLIDE 25

MS-Ready Structures Underpin Analysis

SLIDE 26

The Dashboard to Support MS-Analysis

MS-Ready Structures Underpin Analysis

SLIDE 27

MS-Ready Mappings

  • Input Formula: C10H16N2O8: 3 Hits

SLIDE 28

MS-Ready Mappings

  • Same Input Formula: C10H16N2O8
  • MS Ready Formula Search: 125 Chemicals
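
A formula search and a mass search are linked by the monoisotopic mass of the formula. As a hedged illustration, the sketch below parses a simple molecular formula such as C10H16N2O8 and sums monoisotopic masses. The helper names are hypothetical, the element table covers only C, H, N and O, and the parser assumes a flat formula with no brackets, charges or isotope labels.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; small subset only.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
}


def parse_formula(formula):
    """Count atoms in a flat formula like 'C10H16N2O8' (no brackets)."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum monoisotopic masses over all atoms; raises KeyError for unknown elements."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


print(parse_formula("C10H16N2O8"))
print(round(monoisotopic_mass("C10H16N2O8"), 4))
```

The resulting mass can then be used as the target of a mass-based search with an error tolerance.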

SLIDE 29

MS-Ready Mappings

  • 125 chemicals returned in total
    – 8 of the 125 are single-component chemicals
    – 3 of the 8 are isotope-labeled
    – 3 are neutral compounds and 2 are charged
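
The categories in this breakdown (single-component, isotope-labeled, charged) can be approximated from a SMILES string with simple text heuristics. This is only a sketch under stated assumptions: it is not how the Dashboard or the MS-Ready workflow classifies structures, the function name is hypothetical, the example SMILES are illustrative rather than search results, and a real pipeline would use a cheminformatics toolkit.

```python
import re

# Bracket atoms such as [Na+], [O-] or [13CH3] carry charge and isotope info.
BRACKET_ATOM = re.compile(r"\[([^\]]+)\]")


def classify(smiles):
    """Flag component count, isotope labels and charged atoms in a SMILES string."""
    bracket_atoms = BRACKET_ATOM.findall(smiles)
    return {
        # A '.' separates disconnected components (salts, mixtures, solvates).
        "single_component": "." not in smiles,
        # An isotope label is a leading mass number inside a bracket atom.
        "isotope_labeled": any(atom[0].isdigit() for atom in bracket_atoms),
        # '+' or '-' inside a bracket atom marks a formal charge on that atom.
        "charged": any("+" in atom or "-" in atom for atom in bracket_atoms),
    }


for smiles in ["CC(=O)O", "CC(=O)[O-].[Na+]", "[13CH3]C(=O)O"]:
    print(smiles, classify(smiles))
```

Note that "charged" here means the string contains charged atoms; a zwitterion would be flagged even though its net charge is zero.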

SLIDE 30

Complexity to Simplicity

93 Chemicals – 7 in EPAHFR

SLIDE 31

Complexity to Simplicity

93 Chemicals – 7 in the list

SLIDE 32

Searching batches

Formula (or mass) searching

SLIDE 33

Downloadable Data

SLIDE 34

Work in Progress

  • CFM-ID
    – Viewing and downloading pre-predicted spectra
    – Search spectra against the database
  • Retention Time Index Prediction
  • Structure/substructure/similarity search
  • Generation of MS-ready structures:
    – Upload file, download results
    – Service-based generation

SLIDE 35

Predicted Mass Spectra

http://cfmid.wishartlab.com/

  • MS/MS spectra prediction for ESI+, ESI-, and EI
  • Predictions generated and stored for >700,000 structures, to be accessible via Dashboard

SLIDE 36

Predicted Mass Spectra

[Figure: Library Fragmentation Spectra (20 eV) compared with Observed Fragmentation Spectra (20 eV), with Match Score]
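
A match score between a library spectrum and an observed spectrum is often computed as a cosine similarity over binned peaks. The slides do not state which scoring method the prototype uses, so the sketch below is an assumption chosen for illustration; the function names, the bin width and the example peak lists are all hypothetical.

```python
import math


def bin_spectrum(peaks, bin_width=0.01):
    """Sum intensities of (mz, intensity) peaks that fall in the same m/z bin."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_score(peaks_a, peaks_b, bin_width=0.01):
    """Cosine similarity of two binned spectra; 0.0 if either is empty."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Placeholder peak lists (m/z, intensity); not real library or observed data.
library = [(100.0, 100.0), (150.0, 50.0)]
observed = [(100.0, 80.0), (150.0, 40.0)]
print(cosine_score(library, observed))
```

Because cosine similarity ignores overall scale, two spectra with the same relative peak intensities score close to 1 even when their absolute intensities differ.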

SLIDE 37

Search Expt. vs. Predicted Spectra

SLIDE 38

Prototype Development

SLIDE 39

Prototype Development

SLIDE 40

Conclusion

  • The CompTox Chemicals Dashboard provides access to data for ~875,000 chemicals
  • Multiple prediction models available for data gap filling
    – OPERA models and TEST models
    – PhysChem and Tox endpoints
    – Models based on in vitro data: classification models
    – Generalized Read-Across development in progress
  • 2 years of development as a CompTox Integration Hub

SLIDE 41

Acknowledgements

  • IT Development team – especially Jeff Edwards and Jeremy Dunne
  • Chris Grulke for the ChemReg system
  • NERL colleagues – Jon Sobus, Elin Ulrich, Mark Strynar, Seth Newton

SLIDE 42

Contact

Antony Williams

US EPA Office of Research and Development
National Center for Computational Toxicology (NCCT)
Williams.Antony@epa.gov
ORCID: https://orcid.org/0000-0002-2668-4821
