Small angle X-ray scattering
Data processing and ab initio analysis
Al Kikhney, EMBL Hamburg
Outline
- 3D → 2D → 1D
- Experiment design and data reduction
- Exposure time, radiation damage
- Background subtraction
- Dilution series, structure factor
- Overall parameters:
  – Guinier analysis: Rg and I(0)
  – Molecular weight
  – PDDF p(r), Dmax
- 1D → 3D: ab initio shape reconstruction
- DAMMIF/DAMMIN, MONSA
- GASBOR
- Ambiguity
SAXS experiment
[Figure: X-ray → solution (solvent) → detector]
- Monodisperse and homogeneous*
- Few kDa to GDa
- Concentration: 0.5–10 mg/ml
- Amount: 10–100 μl
* Terms and conditions apply
2D → 1D
[Plot: log I(s), a.u. (10¹ to 10⁵) vs s, nm⁻¹ (0 to 3.0)]
- Normalization
- Transmitted beam
- Exposure time
- (Absolute scale)
- Experimental errors (uncertainties) estimation
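The 2D → 1D step can be sketched as a radial average followed by normalization. This is a minimal illustration, not the beamline pipeline: the function names, the geometry parameters and the choice of the ring scatter as the error estimate are all assumptions made for the example.

```python
import numpy as np

def radial_average(image, center, pixel_size, distance, wavelength, n_bins=100):
    """Average a 2D detector image over rings of constant s.

    pixel_size and distance share one length unit (e.g. mm); wavelength is
    in nm, so the returned s values are in nm^-1. center is (x, y) in pixels.
    """
    image = np.asarray(image, dtype=float)
    y, x = np.indices(image.shape)
    radius = np.hypot(x - center[0], y - center[1]) * pixel_size
    two_theta = np.arctan2(radius, distance)             # scattering angle 2θ
    s = 4.0 * np.pi * np.sin(two_theta / 2.0) / wavelength

    edges = np.linspace(0.0, s.max(), n_bins + 1)
    idx = np.clip(np.digitize(s.ravel(), edges) - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    total = np.bincount(idx, weights=image.ravel(), minlength=n_bins)
    total_sq = np.bincount(idx, weights=image.ravel() ** 2, minlength=n_bins)

    filled = counts > 0                                   # skip empty rings
    mean = total[filled] / counts[filled]
    variance = np.maximum(total_sq[filled] / counts[filled] - mean ** 2, 0.0)
    sigma = np.sqrt(variance / counts[filled])            # standard error of the ring mean
    s_mid = 0.5 * (edges[:-1] + edges[1:])[filled]
    return s_mid, mean, sigma

def normalize(intensity, sigma, transmitted_beam, exposure_time):
    """Scale a profile by the transmitted beam intensity and the exposure time."""
    factor = float(transmitted_beam) * float(exposure_time)
    return intensity / factor, sigma / factor
```

Real detectors also need masking of dead pixels and the beam stop, which this sketch leaves out.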
Notations and units
|s| = 4π sinθ/λ
- 2θ – scattering angle
- λ – wavelength
- s – scattering vector
- I(s) – intensity
[Figure: X-ray → solution, scattered by the angle 2θ; plot of log I(s), a.u. (10¹ to 10⁵) vs s, nm⁻¹ (0 to 3.0)]
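As a quick worked example of |s| = 4π sinθ/λ, the conversion between scattering angle and s can be written as below. The wavelength of 0.124 nm used in the usage note is only an illustrative assumption, and the function names are hypothetical.

```python
import math

def scattering_vector(two_theta_deg, wavelength_nm):
    """Return |s| = 4π sinθ / λ in nm^-1 for a scattering angle 2θ in degrees."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_nm

def scattering_angle_deg(s, wavelength_nm):
    """Inverse relation: scattering angle 2θ in degrees for a given s in nm^-1."""
    return math.degrees(2.0 * math.asin(s * wavelength_nm / (4.0 * math.pi)))
```

For example, `scattering_vector(1.0, 0.124)` gives about 0.88 nm⁻¹, so the plotted range up to 3 nm⁻¹ corresponds to scattering angles of only a few degrees at this wavelength.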
Primary data reduction
Exposure time
[Plots: I(s) vs s, nm⁻¹ at exposure times of 0.05, 0.1, 0.2, 0.4, 0.8 and 1.6 second]
Exposure time
[Plot: I(s) vs s, nm⁻¹ at 0.05, 0.2, 0.8 and 1.6 second overlaid: RADIATION DAMAGE!]
Multiple exposures
[Plots: I(s) vs s, nm⁻¹; frame 1 vs frame 10; frame 1 vs frame 2; average]
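The multiple-exposure check (compare each frame with frame 1, then average the frames that agree) can be sketched as follows. The reduced chi-square criterion and the threshold of 1.5 are assumptions chosen for illustration; production software uses more careful statistical tests.

```python
import numpy as np

def reduced_chi2(i_a, sig_a, i_b, sig_b):
    """Reduced chi-square between two frames measured on the same s grid."""
    diff = (i_a - i_b) / np.sqrt(sig_a ** 2 + sig_b ** 2)
    return float(np.mean(diff ** 2))

def average_undamaged(frames, sigmas, threshold=1.5):
    """Average the leading frames that agree with frame 1.

    threshold is an arbitrary cutoff (assumption). Radiation damage
    accumulates, so the first deviating frame and all later ones are dropped.
    """
    frames = np.asarray(frames, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    keep = [0]
    for k in range(1, len(frames)):
        if reduced_chi2(frames[0], sigmas[0], frames[k], sigmas[k]) > threshold:
            break
        keep.append(k)
    mean = frames[keep].mean(axis=0)
    # Uncertainty of the mean of independent measurements
    sigma = np.sqrt((sigmas[keep] ** 2).sum(axis=0)) / len(keep)
    return mean, sigma, keep
```

The returned `keep` list shows which frame indices entered the average, which is worth inspecting before trusting the result.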
Sample and buffer
[Plots: I(s) vs s, nm⁻¹ for buffer + cell; 3.2 mg/ml lysozyme + buffer + cell; 3.2 mg/ml lysozyme]
Background subtraction
Solution minus Solvent
[Plot: I(s) vs s, nm⁻¹]
Normalization against:
- Concentration
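"Solution minus solvent", followed by normalization against concentration, amounts to a point-wise subtraction with propagated errors. A minimal sketch, assuming both curves are already on the same s grid and the same scale (the function name is hypothetical):

```python
import numpy as np

def subtract_background(i_solution, sig_solution, i_solvent, sig_solvent, concentration):
    """Return (solution - solvent) / concentration and its propagated uncertainty.

    concentration is in mg/ml; the errors of the two measurements are
    treated as independent.
    """
    i_sub = np.asarray(i_solution, float) - np.asarray(i_solvent, float)
    sig_sub = np.sqrt(np.asarray(sig_solution, float) ** 2
                      + np.asarray(sig_solvent, float) ** 2)
    return i_sub / concentration, sig_sub / concentration
```

For the 3.2 mg/ml lysozyme example one would pass `concentration=3.2`.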
Logarithmic plot
[Plot: log I(s) vs s, nm⁻¹]
Dilution series
[Plots: log I(s) vs s, nm⁻¹ at 2, 4, 8, 16 and 32 mg/ml; then 2 mg/ml and 32 mg/ml overlaid]
Inter-particle interactions
- No interactions
- Attractive interactions
- Repulsive interactions
Merging data
[Plots: log I(s) vs s, nm⁻¹]
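One common way to merge a dilution series is to take the low-angle part from the lowest concentration and the high-angle part from the highest concentration after scaling the two curves in an overlap region. The sketch below assumes exactly that strategy; the overlap range, the joining point and the function name are illustrative assumptions, not values from the slides.

```python
import numpy as np

def merge_curves(s, i_low, i_high, overlap=(0.5, 1.0), s_join=0.75):
    """Merge a low-concentration and a high-concentration curve on one s grid.

    The high-concentration curve is scaled onto the low-concentration one by
    least squares inside `overlap` (nm^-1); below s_join the low-concentration
    data are used, above it the scaled high-concentration data.
    """
    s = np.asarray(s, dtype=float)
    i_low = np.asarray(i_low, dtype=float)
    i_high = np.asarray(i_high, dtype=float)
    in_overlap = (s >= overlap[0]) & (s <= overlap[1])
    # Least-squares factor k minimizing sum (i_low - k * i_high)^2 in the overlap
    scale = np.sum(i_low[in_overlap] * i_high[in_overlap]) / np.sum(i_high[in_overlap] ** 2)
    merged = np.where(s < s_join, i_low, scale * i_high)
    return merged, scale
```

The overlap should be chosen where the two curves have the same shape, that is, beyond the range affected by inter-particle interactions.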
Data analysis
Shape
[Plot: log I(s) vs s for 100 nm³]
Size
[Plot: log I(s) vs s for 100 nm³, 50 nm³, 25 nm³ and 200 nm³]
Radius of gyration (Rg)
Rg² definition:
Average of the squared distances from the center of mass of the molecule, weighted by the scattering length density
[Figures: a 28 kDa protein with Rg: 2.5 nm, Rg: 2.9 nm and Rg: 3.2 nm]
[Figure: shapes labelled 6 nm, 100 nm³, 3.6 nm, 6.4 nm, 3.4 nm, 4.8 nm, 2.2 nm]
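The definition of Rg translates directly into code. The sketch below computes it for a set of points with optional weights standing in for the scattering length density; the function name and the point representation are assumptions for illustration.

```python
import numpy as np

def radius_of_gyration(coords, weights=None):
    """Square root of the weighted mean squared distance from the center of mass."""
    coords = np.asarray(coords, dtype=float)
    if weights is None:
        weights = np.ones(len(coords))       # uniform scattering length density
    weights = np.asarray(weights, dtype=float)
    center = np.average(coords, axis=0, weights=weights)
    sq_dist = np.sum((coords - center) ** 2, axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=weights)))

def filled_sphere(radius, spacing):
    """Grid points inside a sphere, a crude stand-in for a homogeneous body."""
    axis = np.arange(-radius, radius + spacing / 2, spacing)
    grid = np.array(np.meshgrid(axis, axis, axis)).reshape(3, -1).T
    return grid[np.sum(grid ** 2, axis=1) <= radius ** 2 + 1e-9]
```

For a homogeneous sphere of radius R the analytical value is Rg = √(3/5)·R, so `radius_of_gyration(filled_sphere(3.0, 0.2))` should come out close to 2.32 nm for a 6 nm wide sphere.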
Radius of gyration (Rg)
André Guinier (1911–2000)
Guinier approximation:
$I(s) \approx I(0)\,\exp(-s^2 R_g^2/3)$ for $s \lesssim 1/R_g$
Guinier plot
[Plots: ln I(s) vs s²]
Linear fit $y = ax + b$ with $y = \ln I(s)$ and $x = s^2$: $R_g = \sqrt{-3a}$, $b = \ln I(0)$
Fit range: $sR_g < 1.3$
Guinier plot gives:
- Rg ± stdev
- Forward scattering I(0)
- Data quality
- Data range
Aggregation
[Plots: log I(s) vs s, 1/nm for a monodisperse sample and an aggregated sample, shown as logarithmic plots and as Guinier plots (ln I(s) vs s²)]
Guinier plot
- smin = 0.26 nm⁻¹, smax = 0.63 nm⁻¹, Rg = 2.0 nm: sminRg = 0.52, smaxRg = 1.26 < 1.3
- smin = 0.44 nm⁻¹, smax = 0.63 nm⁻¹, Rg = 2.3 nm: sminRg = 1.01, smaxRg = 1.45 > 1.3
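A Guinier analysis is a straight-line fit of ln I(s) against s², with Rg = √(−3a) from the slope a and I(0) from the intercept, plus a check that smax·Rg stays below 1.3. The sketch below uses an unweighted fit for brevity, which is an assumption; a real analysis would weight the points by their errors and search for the best range.

```python
import numpy as np

def guinier_fit(s, intensity, s_min, s_max):
    """Fit ln I(s) = ln I(0) - (Rg^2 / 3) s^2 on s_min <= s <= s_max.

    Returns (Rg, I(0), smin*Rg, smax*Rg) for the points actually used.
    """
    s = np.asarray(s, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    mask = (s >= s_min) & (s <= s_max)
    slope, intercept = np.polyfit(s[mask] ** 2, np.log(intensity[mask]), 1)
    if slope >= 0:
        raise ValueError("non-negative slope: no Guinier region in this range")
    rg = float(np.sqrt(-3.0 * slope))
    i0 = float(np.exp(intercept))
    return rg, i0, float(s[mask].min() * rg), float(s[mask].max() * rg)

def range_is_valid(s_max_rg, limit=1.3):
    """Rule of thumb for the upper end of the Guinier range."""
    return s_max_rg < limit
```

With the numbers quoted above, 0.63 nm⁻¹ × 2.0 nm = 1.26 passes the check, while 0.63 nm⁻¹ × 2.3 nm = 1.45 does not.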
Molecular weight (MW)
- From I(0)
  – I(s) on an absolute scale
  – I(s) on a relative scale
- From the Porod volume
- SAXSMoW (a.k.a. Fischer method)
- Volume-of-correlation method
- Consensus Bayesian assessment of MW
Molecular weight from I(0)
- I(s) on an absolute scale (cm⁻¹)
Assuming I(s) is normalized against concentration (mg/ml):
$MW_{sample} = 10^{3}\, I(0)_{sample}\, N_A / (\Delta\rho\, v_{sample})^2$
Δρ – contrast in cm⁻²
v_sample – partial specific volume in cm³/g
N_A – Avogadro’s number
- I(s) on a relative scale (a.u.)
$\frac{MW_{sample}}{MW_{BSA}} = \frac{I(0)_{sample}}{I(0)_{BSA}}$, so $MW_{sample} = I(0)_{sample} \cdot MW_{BSA} / I(0)_{BSA}$
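Both I(0)-based estimates are one-line calculations: MW = I(0)·MW_BSA / I(0)_BSA on a relative scale, and MW = 10³·I(0)·N_A / (Δρ·v)² on an absolute scale. In the sketch below the numerical values for BSA (66 kDa), the contrast (2.8·10¹⁰ cm⁻²) and the partial specific volume (0.7425 cm³/g) are typical protein values used purely as illustrative assumptions.

```python
AVOGADRO = 6.02214076e23  # 1/mol

def mw_relative(i0_sample, i0_standard, mw_standard):
    """MW from I(0) relative to a standard (e.g. BSA) measured the same way.

    Both I(0) values must be normalized against concentration.
    """
    return i0_sample * mw_standard / i0_standard

def mw_absolute(i0_per_conc, contrast, partial_specific_volume):
    """MW in g/mol from I(0) on an absolute scale.

    i0_per_conc: I(0) in cm^-1 divided by the concentration in mg/ml
    contrast: scattering length density contrast in cm^-2
    partial_specific_volume: in cm^3/g
    """
    return 1e3 * i0_per_conc * AVOGADRO / (contrast * partial_specific_volume) ** 2

# Illustrative assumptions, not values from the slides
MW_BSA = 66_000.0          # g/mol
PROTEIN_CONTRAST = 2.8e10  # cm^-2
PROTEIN_PSV = 0.7425       # cm^3/g
```

For instance, a sample whose normalized I(0) is half that of BSA would be estimated at `mw_relative(0.5, 1.0, MW_BSA)`, that is 33 kDa. Both routes rely on an accurate concentration.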
Porod volume
Excluded volume of the hydrated particle
$$V_P = \frac{2\pi^2 I(0)}{\int_0^\infty [I(s) - K_4]\, s^2\, ds}$$
Porod invariant: $\int_0^\infty I(s)\, s^2\, ds$
- G. Porod (1982) Academic Press, London
For proteins: MW [kDa] ~ Vp [nm³] / 1.6
[Figure: Porod volumes of 69.5 nm³ and 69.2 nm³, each ~43 kDa]
*Simulated data
[Figure: Porod volumes of 770 nm³ (480 kDa) and 23.7 nm³ (14.8 kDa); SASBDB: SASDA82, SASDA96]
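The Porod volume V_P = 2π²·I(0) / ∫[I(s) − K₄]·s² ds and the rule MW [kDa] ~ Vp [nm³] / 1.6 can be tried out numerically. The sketch below integrates over the available s range with the trapezoidal rule and takes K₄ as a user-supplied constant; both are simplifying assumptions, since real data need a proper estimate of K₄ and cover a limited angular range.

```python
import numpy as np

def porod_volume(s, intensity, i0, k4=0.0):
    """V_P = 2π² I(0) / ∫ [I(s) - K4] s² ds, integrated over the given s range."""
    s = np.asarray(s, dtype=float)
    y = (np.asarray(intensity, dtype=float) - k4) * s ** 2
    invariant = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(s))  # trapezoidal rule
    return 2.0 * np.pi ** 2 * i0 / invariant

def mw_from_porod_volume(volume_nm3):
    """Empirical rule for proteins: MW [kDa] ~ Vp [nm^3] / 1.6."""
    return volume_nm3 / 1.6

def sphere_intensity(s, radius):
    """Normalized scattering of a homogeneous sphere, I(0) = 1."""
    x = np.asarray(s, dtype=float) * radius
    return (3.0 * (np.sin(x) - x * np.cos(x)) / x ** 3) ** 2
```

The rule reproduces the numbers on the slides: 69.5 / 1.6 ≈ 43 kDa, 770 / 1.6 ≈ 480 kDa and 23.7 / 1.6 ≈ 14.8 kDa. For an ideal sphere of radius 3 nm, `porod_volume` applied to `sphere_intensity` over a wide s range returns a value close to the geometric volume of about 113 nm³.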
Other methods
- Porod volume (VP)
- SAXS MoW: H Fischer et al. (2010) J. Appl. Cryst. 43, 101–109
- Volume-of-correlation (Vc): RP Rambo and J.A. Tainer (2013) Nature 496(7446) 477–481
- Bayesian inference: Hajizadeh NR, Franke D, Jeffries CM, Svergun DI (2018) Sci. Rep. 8:7204
[Figure: MW estimates for SASBDB: SASDA96 and SASDA82: 11.2 kDa and 510 kDa; 12.3 kDa and 540 kDa; 11.3 kDa and 479 kDa]
Distance distribution function
[Plots: γ(r) vs r, nm]
p(r) = r² γ(r)
[Plot: p(r) vs r, nm for a body labelled 6 nm, 100 nm³; Dmax = 6 nm]
[Plots: p(r) vs r, nm alongside log I(s) vs s, nm⁻¹]
$$I(s) = 4\pi \int_0^{D_{max}} p(r)\, \frac{\sin(sr)}{sr}\, dr$$
$$p(r) = \frac{r^2}{2\pi^2} \int_0^\infty s^2 I(s)\, \frac{\sin(sr)}{sr}\, ds$$
p(r) plot
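The forward transform I(s) = 4π ∫ p(r)·sin(sr)/(sr) dr over 0 ≤ r ≤ Dmax is easy to check numerically. The sketch below applies it to the analytical p(r) of a homogeneous sphere and can be compared with the known sphere scattering; the function names and the sphere test case are assumptions chosen for illustration.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal integration along the last axis."""
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)

def intensity_from_pr(s, r, p):
    """I(s) = 4π ∫ p(r) sin(sr)/(sr) dr on the given r grid."""
    s = np.atleast_1d(np.asarray(s, dtype=float))
    # np.sinc(x) = sin(πx)/(πx), so sin(sr)/(sr) = np.sinc(sr/π); it equals 1 at sr = 0
    kernel = np.sinc(np.outer(s, r) / np.pi)
    return 4.0 * np.pi * trapezoid(p * kernel, r)

def sphere_pr(r, d_max):
    """p(r) = r² γ(r) of a homogeneous sphere of diameter d_max, with γ(0) = 1."""
    x = np.asarray(r, dtype=float) / d_max
    return r ** 2 * (1.0 - 1.5 * x + 0.5 * x ** 3)

r = np.linspace(0.0, 6.0, 2001)     # Dmax = 6 nm
s = np.linspace(0.01, 3.0, 50)      # nm^-1
i_calc = intensity_from_pr(s, r, sphere_pr(r, 6.0))
```

With this normalization I(0) equals the sphere volume, πDmax³/6. The inverse transform is much less benign on real data, because I(s) is noisy and measured only over a finite range.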
Distance distribution function
[Plots: p(r) vs r, nm with Dmax marked; SASBDB: SASDA96, SASDA82]
Data quality
smin < π/Dmax
[Plots: I(s) vs s, 1/nm with smin marked; p(r) vs r, nm with Dmax marked]
Rg and I(0) from p(r)
$$I(0) = 4\pi \int_0^{D_{max}} p(r)\, dr$$
$$R_g^2 = \frac{\int_0^{D_{max}} r^2 p(r)\, dr}{2 \int_0^{D_{max}} p(r)\, dr}$$
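Given p(r) on a grid, I(0) = 4π ∫ p(r) dr and Rg² = ∫ r² p(r) dr / (2 ∫ p(r) dr) reduce to two numerical integrals. A minimal sketch, assuming a simple trapezoidal rule and a hypothetical function name:

```python
import numpy as np

def rg_i0_from_pr(r, p):
    """Return (Rg, I(0)) computed from a distance distribution function p(r)."""
    r = np.asarray(r, dtype=float)
    p = np.asarray(p, dtype=float)

    def integrate(y):
        return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r))  # trapezoidal rule

    norm = integrate(p)
    i0 = 4.0 * np.pi * norm
    rg = np.sqrt(integrate(r ** 2 * p) / (2.0 * norm))
    return float(rg), float(i0)

# Test case (assumption): homogeneous sphere with Dmax = 6 nm
r = np.linspace(0.0, 6.0, 2001)
x = r / 6.0
p_sphere = r ** 2 * (1.0 - 1.5 * x + 0.5 * x ** 3)
rg, i0 = rg_i0_from_pr(r, p_sphere)
```

For this sphere the result should match the analytical Rg = √(3/5)·R ≈ 2.32 nm. Unlike the Guinier fit, this estimate uses the whole curve rather than only the lowest angles.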
1D → 3D: modelling!
SAXS data from macromolecules in solution
[Plots: log I(s) vs s, nm⁻¹; experimental SAXS pattern and pattern calculated from model]
Ab initio shape reconstruction: dummy atom modelling
[Plots: log I(s) vs s, nm⁻¹; experimental SAXS pattern and pattern calculated from model]
Rg = 3.4 nm, smax = 8/Rg ≈ 2.35 nm⁻¹
[Plots: log I(s) vs s, nm⁻¹, experimental SAXS pattern with fit by p(r) up to smax = 8/Rg; p(r) vs r, nm]
DAMMIF
[Plot: log I(s) vs s, nm⁻¹; target curve and curve calculated from the model]
≈2000–10000 “dummy atoms”, 2–10 Å
Franke, D. and Svergun, D.I. (2009) J Appl Cryst 42, 342–346.
DAMMIN
- Variable number of “dummy atoms” on a fixed grid
- Scattering is computed using spherical harmonics
- Monte-Carlo type search
- Fixed search space (defined by Dmax)
- Provides volume/molecular weight estimate
- Idea first published by P. Chacón et al. (1998) Biophys J 74
Svergun, D.I. (1999) Biophys J 76
DAMMIF
- Variable number of “dummy atoms” on a fixed grid
- Scattering is computed using spherical harmonics
- Monte-Carlo type search
- Expandable search space
- Provides volume/molecular weight estimate
- 40 times faster than DAMMIN (D. I. Svergun (1999) Biophys J 76)
Franke, D. and Svergun, D.I. (2009) J Appl Cryst 42, 342–346
https://www.embl-hamburg.de/biosaxs/atsas-online/dammif.php
Ab initio shape reconstruction
Single phase shape determination: fit one data set
Ab initio shape reconstruction: multi-phase dummy atom modelling
Fit data from several subunits
https://www.embl-hamburg.de/biosaxs/atsas-online/monsa.php
Svergun, D.I., Petoukhov, M.V, Koch, M.H.J. (2001) Biophys J 80, 2946–2953.
Ab initio reconstruction: dummy residue modelling
GASBOR
[Figure: dummy residue model labelled 3.8 Å and Dmax]
Number of dummy residues = number of residues in the protein
[Plots: log10 I(q) vs q, nm⁻¹; target curve and curve calculated from the model]
GASBOR
- Fixed number of “dummy residues”
- Distances to neighbor “residues” like in proteins
- Fixed search space
- Scattering is computed using Debye formula
- Higher angles used (up to 12 nm-1)
- Only for proteins smaller than 660 kDa
Svergun, D.I., Petoukhov, M.V, Koch, M.H.J. (2001) Biophys J 80, 2946–2953
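The Debye formula sums sin(s·r_ij)/(s·r_ij) over all pairs of scatterers. The sketch below evaluates it for point-like dummy residues with identical form factors, which is a simplification and not GASBOR's actual implementation; the function name and the straight-chain test model are assumptions.

```python
import numpy as np

def debye_intensity(coords, s, form_factors=None):
    """I(s) = Σ_i Σ_j f_i f_j sin(s r_ij) / (s r_ij) for point scatterers.

    coords are in nm and s in nm^-1; form_factors defaults to 1 for every point.
    """
    coords = np.asarray(coords, dtype=float)
    s = np.atleast_1d(np.asarray(s, dtype=float))
    n = len(coords)
    f = np.ones(n) if form_factors is None else np.asarray(form_factors, dtype=float)
    # All pairwise distances r_ij (n x n matrix with zeros on the diagonal)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    weights = np.outer(f, f)
    # np.sinc(x) = sin(πx)/(πx) handles r_ij = 0, where the term equals 1
    return np.array([np.sum(weights * np.sinc(q * dist / np.pi)) for q in s])

# Hypothetical model: 10 dummy residues on a straight line, 0.38 nm (3.8 Å) apart
chain = np.column_stack([0.38 * np.arange(10), np.zeros(10), np.zeros(10)])
i_chain = debye_intensity(chain, np.linspace(0.0, 12.0, 100))
```

The cost grows with the square of the number of residues, because every pair of points contributes a term.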
Ambiguity
[Plots: I(s) vs s]
First formulated by R. Kirste in 1964
From a study by M. Petoukhov, 2015
AMBIMETER
Petoukhov, M.V. and Svergun, D.I. (2015) Acta Cryst D71, 1051–1058
Curves from all 14 112 possible shapes represented by one to seven interconnected beads
Ab initio model validity
First validate your sample and input data! Check for:
– monodispersity;
– radiation damage;
– aggregation;
– concentration effects;
– overall parameters;
– signal-to-noise level.
Make sure your model fits the data. Repeat multiple times.
Data reduction and analysis steps
- Radial averaging
- Radiation damage check
- Normalization
- Background subtraction
- Merge multiple concentrations
- Rg, molecular weight
- Dmax, p(r)
- …
- Ab initio shape determination
Thank you!
www.saxier.org/forum www.sasbdb.org