SASBDB Small Angle Scattering Biological Data Bank
Erica Valentini
Dmitri Svergun group Solution Scattering from biological macromolecules EMBO course 2014
Index
1. Introduction:
– What is SAS?
– Do we need a SAS database?
2. SASBDB:
– Features
– Usage
– Quality check
– Missing
2 SAS EMBO Course 2014 11/2/2014
SAS Experiment
|s| = 4π sin θ / λ
– s: scattering vector
– 2θ: scattering angle
– λ: wavelength
– I(s): scattering intensity
[Figure: X-ray/neutron beam, scattering angle 2θ and scattering vector s; scattering intensity, log I(s); ATSAS; low-resolution model]
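As a worked example of the definition |s| = 4π sin θ / λ, the following minimal Python sketch computes the modulus of the scattering vector from the scattering angle and the wavelength. The function name scattering_vector and the example values (2θ = 1°, λ = 0.15 nm) are illustrative assumptions and are not taken from the slides.

```python
import math


def scattering_vector(two_theta_deg: float, wavelength: float) -> float:
    """Return |s| = 4*pi*sin(theta)/lambda.

    `two_theta_deg` is the full scattering angle 2*theta in degrees.
    The result is in inverse units of `wavelength` (e.g. nm^-1 for nm).
    """
    if wavelength <= 0:
        raise ValueError("wavelength must be positive")
    theta = math.radians(two_theta_deg) / 2.0  # half of the scattering angle
    return 4.0 * math.pi * math.sin(theta) / wavelength


# Illustrative values only: 2*theta = 1 degree, wavelength = 0.15 nm
s = scattering_vector(1.0, 0.15)
print(f"|s| = {s:.3f} nm^-1")
```

With these assumed values |s| comes out at roughly 0.73 nm⁻¹, which shows how small the angles are for the s range typically covered in a solution scattering experiment.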
ATSAS Package
– Rg
– MM
– Dmax
– Volume
– Shape
– Rigid body modelling
– Missing fragments
– Oligomeric mixtures
– Flexible systems
SA(X)S advantages
Increasing popularity of SAXS
– Solution
– Broad size range: from a few kDa to GDa
– New developments in software and hardware
– Fast experiments: μ
– Small amount of sample: 5-30 μl
– Monitor alterations in environmental conditions
SAS database motivations
publications about SAS and the ATSAS package.
data collected with a single experiment.
the data underlying scientific publications available for the community.
Graewert, M.A. and Svergun, D.I. (2013) Impact and progress in small and wide angle X-ray scattering (SAXS and WAXS). Curr. Opin. Struct. Biol., 23, 748–54.
Franke, D., Kikhney, A.G. and Svergun, D.I. (2012) Automated acquisition and analysis of small angle X-ray scattering data. Nucl. Instruments Methods Phys. Res. Sect. A Accel. Spectrometers, Detect. Assoc. Equip., 689, 52–59.
Collins, F.S. and Tabak, L.A. (2014) Policy: NIH plans to enhance reproducibility. Nature, 505, 612–3.
[Figure: number of publications per year referring to biological SAS (bioSAS) and to ATSAS, 2002 to 2012]
wwPDB SAS task force
Trewhella, J., Hendrickson, W.A., Kleywegt, G.J., Sali, A., Sato, M., Schwede, T., Svergun, D.I., Tainer, J.A., Westbrook, J. and Berman, H.M. (2013) Report of the wwPDB Small-Angle Scattering Task Force: Data Requirements for Biomolecular Modeling and the
“…a global repository is needed that holds standard format X-ray and neutron SAS data that is searchable and freely accessible for download”
Database and small angle scattering experts
Database | SAS data included | Missing
PDB | 47 models where SAS was used for refinement | Primary data used to calculate the models
DARA | Scattering curves from 20,000 PDB structures | Models and possibility to deposit SAS data
BIOISIS | SAXS data and models | Complete search, cross-references to other databases, quality check on data
pE-DB | Scattering curves and ensemble models from disordered proteins | SAS data and models from “not disordered proteins”
Berman, H., Henrick, K., Nakamura, H. and Markley, J.L. (2007) The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data. Nucleic Acids Res., 35, D301–3.
dara.embl-hamburg.de
Sokolova, A.V., Volkov, V. and Svergun, D.I. (2003) Prototype of a database for rapid protein classification based on solution scattering data. J. Appl. Crystallogr., 36, 865–868.
Hura, G.L., Menon, A.L., Hammel, M., Rambo, R.P., Poole, F.L., Tsutakawa, S.E., Jenney, F.E., Classen, S., Frankel, K.A., Hopkins, R.C., et al. (2009) Robust, high-throughput solution structural analyses by small angle X-ray scattering (SAXS). Nat. Methods, 6, 606–12.
Varadi, M., Kosol, S., Lebrun, P., Valentini, E., Blackledge, M., Dunker, A.K., Felli, I.C., Forman-Kay, J.D., Kriwacki, R.W., Pierattelli, R., et al. (2014) pE-DB: a database of structural ensembles of intrinsically disordered and of unfolded proteins. Nucleic Acids Res., 42, D326–35.
Browsing unit
– Scattering curve
– Model
– Kratky plot
– Experiment information
– Publication
– Structural parameters
– Unique code format: SASXXXN
Chronological
Browse according to the selected field
Benchmark
set of 14 “standard proteins”
data
steps
algorithm testing purposes
Dissemination
Scattering plot Guinier region Kratky plot P(r) distribution
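To illustrate how a Kratky plot relates to the scattering plot, here is a minimal sketch that maps a curve (s, I(s)) to the conventional Kratky coordinates (s, s²·I(s)). The function name kratky_transform and the synthetic Gaussian-like curve are assumptions made for this example; this is not the code SASBDB uses to render its plots.

```python
import math


def kratky_transform(s_values, intensities):
    """Map a scattering curve (s, I(s)) to Kratky coordinates (s, s^2 * I(s))."""
    if len(s_values) != len(intensities):
        raise ValueError("s and I(s) must have the same length")
    return [(s, s * s * i) for s, i in zip(s_values, intensities)]


# Synthetic curve I(s) = exp(-(s*Rg)^2 / 3) with an assumed Rg of 2.0
rg = 2.0
s_grid = [0.01 * k for k in range(1, 201)]
curve = [math.exp(-(s * rg) ** 2 / 3.0) for s in s_grid]

kratky = kratky_transform(s_grid, curve)
# Locate the maximum of s^2 * I(s)
peak_s, peak_value = max(kratky, key=lambda point: point[1])
print(f"Kratky peak near s = {peak_s:.2f}")
```

For this synthetic compact-particle curve the Kratky representation shows a single bell-shaped peak close to s = √3/Rg; curves that keep rising or plateau at high s are usually read as a sign of flexibility or disorder.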
– Radius of gyration
– Maximum distance
– MWs & Porod volume
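As an illustration of where a radius of gyration can come from, the sketch below fits the standard Guinier approximation, ln I(s) ≈ ln I(0) − s²Rg²/3, by ordinary least squares. The function name guinier_fit and the synthetic data are assumptions for this example; the slides do not state how ATSAS or SASBDB select the Guinier region, so the caller is expected to pass only low-angle points (commonly s·Rg below about 1.3).

```python
import math


def guinier_fit(s_values, intensities):
    """Least-squares fit of ln I(s) = ln I(0) - (Rg^2 / 3) * s^2.

    Returns (Rg, I0). Only points from the Guinier region should be passed in.
    """
    if len(s_values) != len(intensities) or len(s_values) < 2:
        raise ValueError("need at least two matching (s, I) points")
    x = [s * s for s in s_values]            # abscissa: s^2
    y = [math.log(i) for i in intensities]   # ordinate: ln I(s)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative slope: data show no Guinier behaviour")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic, noise-free data with assumed Rg = 2.5 and I(0) = 100
s_values = [0.01 * k for k in range(1, 41)]
intensities = [100.0 * math.exp(-(s * 2.5) ** 2 / 3.0) for s in s_values]
rg_fit, i0_fit = guinier_fit(s_values, intensities)
print(f"Rg = {rg_fit:.3f}, I(0) = {i0_fit:.1f}")
```

On real data the fit range matters as much as the fit itself, which is why the Guinier region is shown as its own plot.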
Fitting 1 Model 1 Fitting 2 Model 2
Fitting 3 Model 1 Model 2 Model 3
Model 4
Experimental details Molecule details
in using ATSAS account
between:
– “on hold” – “public”
More than 500 users since August 2014. We are also currently monitoring search items and the number of downloads.
– SAS user
– SAS novice
– Article referee
Difference Rg (Guinier) and Rg (p(r))
Difference MW (expected) and MW (experimental)
Quality p(r) distribution
Quality Guinier region
Quality of the fit
Quality of the data
Quality score based on the comparison between the selected entry and all the other entries.
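The slides do not say how the quality score is computed, so the following is only a hypothetical sketch of the idea of comparing one entry with all the others. It takes one of the criteria listed above, the difference between Rg (Guinier) and Rg (p(r)), and reports the fraction of other entries with a larger discrepancy. The function names, the relative-difference definition and all numbers are assumptions made for illustration.

```python
def relative_difference(a: float, b: float) -> float:
    """Absolute difference of two positive values divided by their mean."""
    mean = (a + b) / 2.0
    if mean <= 0:
        raise ValueError("values must be positive")
    return abs(a - b) / mean


def rank_among_entries(selected: float, others) -> float:
    """Fraction of the other entries whose discrepancy is larger than `selected`.

    1.0 means the selected entry is more consistent than every other entry,
    0.0 means no other entry is less consistent.
    """
    others = list(others)
    if not others:
        raise ValueError("need at least one other entry to compare against")
    worse = sum(1 for value in others if value > selected)
    return worse / len(others)


# Hypothetical selected entry: Rg from Guinier and from p(r), same units
selected_discrepancy = relative_difference(2.51, 2.60)

# Hypothetical discrepancies of all the other entries
other_discrepancies = [0.01, 0.02, 0.05, 0.10, 0.20]

score = rank_among_entries(selected_discrepancy, other_discrepancies)
print(f"discrepancy = {selected_discrepancy:.3f}, score = {score:.1f}")
```

The same ranking could be applied to any of the other criteria, for example the difference between expected and experimental MW, as long as each is reduced to a single number per entry.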
Validation/Quality check
– Pipeline to compare values
– Assessment of the angular range
– Difference between curves
– Validation of models
Standard format
– sasCIF
Submission interface
– Automatic
Berman, H., Henrick, K., Nakamura, H. and Markley, J.L. (2007) The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data. Nucleic Acids Res., 35, D301–3.
Read, R.J., Adams, P.D., Arendall, W.B., III, Brunger, A.T., Emsley, P., Joosten, R.P., Kleywegt, G.J., Krissinel, E.B., Lutteke, T., Otwinowski, Z., Perrakis, A., Richardson, J.S., Sheffler, W.H., Smith, J.L., Tickle, I.J., Vriend, G. and Zwart, P.H. (2011) A new generation of crystallographic validation tools for the Protein Data Bank. Structure, 19, 1395–1412.
Franke, D., Kikhney, A.G. and Svergun, D.I. (2012) Automated acquisition and analysis of small angle X-ray scattering data. Nucl. Instruments Methods Phys. Res. Sect. A Accel. Spectrometers, Detect. Assoc. Equip., 689, 52–59.
Konarev, P. and Svergun, D.I. (2014) Submitted.
Franke, D., Jeffries, C.M. and Svergun, D.I. (2014) Submitted.
Tuukkanen, A. and Svergun, D.I. (2015) In preparation.
Malfois, M. and Svergun, D.I. (2000) sasCIF: an extension of core Crystallographic Information File for SAS. J. Appl. Crystallogr., 33, 812–816.
Yang, H., Guranovic, V., Dutta, S., Feng, Z., Berman, H. M. & Westbrook, J. D. (2004). Automated and accurate deposition of structures solved by X-ray diffraction to the Protein Data Bank. Acta Cryst. D60, 1833-1839.
largest repository of SAS data available.