Data processing and ab initio analysis Al Kikhney EMBL Hamburg - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Small angle X-ray scattering

Data processing and ab initio analysis

Al Kikhney EMBL Hamburg

slide-2
SLIDE 2

Outline

  • 3D → 2D → 1D
  • Experiment design and data reduction
      – Exposure time, radiation damage
      – Background subtraction
      – Dilution series, structure factor
  • Overall parameters:
      – Guinier analysis: Rg and I(0)
      – Molecular weight
      – PDDF p(r), Dmax
  • 1D → 3D: ab initio shape reconstruction
      – DAMMIF/DAMMIN, MONSA
      – GASBOR
  • Ambiguity
slide-3
SLIDE 3

SAXS experiment

[Diagram: X-ray → solution / solvent → detector]

  • Monodisperse and homogeneous*
  • Few kDa to GDa
  • Concentration: 0.5–10 mg/ml
  • Amount: 10–100 μl

* Terms and conditions apply

slide-4
SLIDE 4

2D → 1D

[Plot: log I(s), a.u. (10¹ to 10⁵) versus s, nm⁻¹ (0 to 3.0)]

slide-5
SLIDE 5

[Plot: log I(s), a.u. (10¹ to 10⁵) versus s, nm⁻¹ (0 to 3.0)]

  • Normalization
      – Transmitted beam
      – Exposure time
      – (Absolute scale)
  • Experimental errors (uncertainties) estimation

slide-6
SLIDE 6

Notations and units

|s| = 4π sinθ/λ

  • 2θ – scattering angle
  • λ – wavelength
  • s – scattering vector
  • I(s) – intensity

[Diagram: X-ray → solution; scattering vector s]

[Plot: log I(s), a.u. (10¹ to 10⁵) versus s, nm⁻¹ (0 to 3.0)]
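As a small illustration of the notation on this slide, the sketch below evaluates |s| = 4π sinθ/λ for a given scattering angle 2θ. The function name and the example wavelength of 0.124 nm are assumptions chosen for the example, not values from the slides.

```python
import math

def scattering_vector(two_theta_deg: float, wavelength_nm: float) -> float:
    """Return |s| = 4*pi*sin(theta)/lambda in nm^-1; 2*theta is the scattering angle."""
    theta = math.radians(two_theta_deg) / 2.0  # theta is half of the scattering angle
    return 4.0 * math.pi * math.sin(theta) / wavelength_nm

# Example: scattering angle of 1 degree at an assumed wavelength of 0.124 nm
s = scattering_vector(1.0, 0.124)
print(f"s = {s:.3f} nm^-1")
```

With the wavelength in nm the result is in nm⁻¹, the unit used on the plots in this deck.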

slide-7
SLIDE 7

Primary data reduction

slide-8
SLIDE 8

Exposure time

0.05 second

I(s) s, nm-1

slide-9
SLIDE 9

0.1 second

Exposure time

I(s) s, nm-1

slide-10
SLIDE 10

0.2 second

Exposure time

I(s) s, nm-1

slide-11
SLIDE 11

0.4 second

Exposure time

I(s) s, nm-1

slide-12
SLIDE 12

0.8 second

Exposure time

I(s) s, nm-1

slide-13
SLIDE 13

1.6 second

Exposure time

I(s) s, nm-1

slide-14
SLIDE 14

Exposure time

[Plot: I(s) versus s, nm⁻¹ for exposures of 0.05, 0.2, 0.8 and 1.6 seconds]

RADIATION DAMAGE!

slide-15
SLIDE 15

[Plot: I(s) versus s, nm⁻¹; frame 1 and frame 10]

Multiple exposures

slide-16
SLIDE 16

[Plot: I(s) versus s, nm⁻¹; frame 1 and frame 2]

Multiple exposures

slide-17
SLIDE 17

[Plot: I(s) versus s, nm⁻¹; average]

Multiple exposures
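The multiple-exposure slides compare later frames with the first one and then average. A minimal sketch of that idea follows, assuming a reduced chi-square comparison against frame 1 with an arbitrary threshold; the function name, the threshold and the choice of test are illustrative assumptions, not the criterion used in the lecture or at any beamline.

```python
import numpy as np

def average_similar_frames(frames, errors, max_chi2=1.5):
    """Average the frames that agree with the first frame.

    frames, errors: arrays of shape (n_frames, n_points).
    A frame is kept if its reduced chi-square against frame 0 is below
    max_chi2 (an assumed, illustrative threshold).
    """
    frames = np.asarray(frames, dtype=float)
    errors = np.asarray(errors, dtype=float)
    ref, ref_err = frames[0], errors[0]
    keep = []
    for k in range(len(frames)):
        var = errors[k] ** 2 + ref_err ** 2            # variance of the difference
        chi2 = np.mean((frames[k] - ref) ** 2 / var)   # reduced chi-square vs frame 0
        if chi2 < max_chi2:
            keep.append(k)
    mean = frames[keep].mean(axis=0)
    # Standard error of the mean of independent measurements
    mean_err = np.sqrt(np.sum(errors[keep] ** 2, axis=0)) / len(keep)
    return mean, mean_err, keep

# Three frames; the last one has changed (e.g. through radiation damage)
frames = [[10.0, 5.0, 1.0], [10.1, 5.0, 1.0], [20.0, 8.0, 1.0]]
errors = [[0.2, 0.2, 0.2]] * 3
mean, mean_err, kept = average_similar_frames(frames, errors)
print(kept)
```

Frame 0 always matches itself, so at least one frame is averaged.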

slide-18
SLIDE 18

buffer + cell

Sample and buffer

I(s) s, nm-1

slide-19
SLIDE 19

3.2 mg/ml lysozyme + buffer + cell

Sample and buffer

I(s) s, nm-1

slide-20
SLIDE 20

3.2 mg/ml lysozyme

Sample and buffer

I(s) s, nm-1

slide-21
SLIDE 21

Background subtraction

Solution minus Solvent

I(s) s, nm-1

slide-22
SLIDE 22

Background subtraction

Solution minus Solvent

I(s) s, nm-1

Normalization against:

  • Concentration
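The subtraction and normalization shown on these slides can be sketched as follows, assuming the sample and buffer curves share the same s grid and have independent errors. The function and argument names are hypothetical.

```python
import numpy as np

def subtract_background(i_solution, err_solution, i_solvent, err_solvent, conc_mg_ml):
    """Return (I, sigma) of the solute: (solution - solvent) / concentration."""
    diff = np.asarray(i_solution, dtype=float) - np.asarray(i_solvent, dtype=float)
    # Independent errors add in quadrature
    sigma = np.sqrt(np.asarray(err_solution, dtype=float) ** 2
                    + np.asarray(err_solvent, dtype=float) ** 2)
    return diff / conc_mg_ml, sigma / conc_mg_ml

# Example with the 3.2 mg/ml concentration quoted for the lysozyme sample
i_sub, err_sub = subtract_background([13.2, 8.2], [0.3, 0.3], [10.0, 5.0], [0.4, 0.4], 3.2)
print(i_sub, err_sub)
```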
slide-23
SLIDE 23

Log I(s) s, nm-1

Logarithmic plot

slide-24
SLIDE 24

Dilution series

2 mg/ml

Log I(s) s, nm-1

slide-25
SLIDE 25

Dilution series

4 mg/ml

Log I(s) s, nm-1

slide-26
SLIDE 26

Dilution series

8 mg/ml

Log I(s) s, nm-1

slide-27
SLIDE 27

Dilution series

16 mg/ml

Log I(s) s, nm-1

slide-28
SLIDE 28

Dilution series

32 mg/ml

Log I(s) s, nm-1

slide-29
SLIDE 29

Dilution series

2 mg/ml 32 mg/ml

Log I(s) s, nm-1

slide-30
SLIDE 30

Inter-particle interactions

No interactions

slide-31
SLIDE 31

Inter-particle interactions

Attractive interactions

Repulsive interactions

slide-32
SLIDE 32

Merging data

Log I(s) s, nm-1

slide-33
SLIDE 33

Merging data

Log I(s) s, nm-1

slide-34
SLIDE 34

Merging data

Log I(s) s, nm-1
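One common way to merge a dilution series is to take the low-angle part from a low-concentration curve (less affected by inter-particle interactions) and the high-angle part from a high-concentration curve (better signal). The sketch below is an assumption about how such a merge can be done, not the procedure used for the curves on these slides; the least-squares scaling in an overlap window and all names are illustrative.

```python
import numpy as np

def merge_curves(s, i_low_conc, i_high_conc, overlap_min, overlap_max):
    """Splice two curves measured on the same s grid.

    The high-concentration curve is scaled to the low-concentration curve by
    least squares inside [overlap_min, overlap_max]; the merged curve switches
    from one to the other at the middle of the overlap window.
    """
    s = np.asarray(s, dtype=float)
    i_low = np.asarray(i_low_conc, dtype=float)
    i_high = np.asarray(i_high_conc, dtype=float)
    mask = (s >= overlap_min) & (s <= overlap_max)
    # Scale factor k minimizing sum((i_low - k * i_high)^2) in the overlap
    k = np.sum(i_low[mask] * i_high[mask]) / np.sum(i_high[mask] ** 2)
    s_switch = 0.5 * (overlap_min + overlap_max)
    merged = np.where(s < s_switch, i_low, k * i_high)
    return merged, k

# Synthetic example: the concentrated sample is twice as intense but is
# suppressed at low angles, as with repulsive interactions
s = np.linspace(0.1, 3.0, 30)
true_curve = np.exp(-s ** 2)
i_low = true_curve
i_high = 2.0 * true_curve * (1.0 - 0.3 * np.exp(-10.0 * s ** 2))
merged, k = merge_curves(s, i_low, i_high, 1.0, 2.0)
print(round(k, 3))
```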

slide-35
SLIDE 35

Data analysis

slide-36
SLIDE 36

Log I(s)

100 nm³

s

Shape

slide-37
SLIDE 37

Log I(s)

100 nm³, 50 nm³, 25 nm³, 200 nm³

Size

slide-38
SLIDE 38

Radius of gyration (Rg)

Rg² definition:

Average of the squared distances from the center of mass of the molecule, weighted by the scattering length density

Rg: 2.5 nm

28 kDa protein

slide-39
SLIDE 39

Radius of gyration (Rg)

Rg² definition:

Average of the squared distances from the center of mass of the molecule, weighted by the scattering length density

Rg: 2.9 nm

28 kDa protein

slide-40
SLIDE 40

Radius of gyration (Rg)

Rg² definition:

Average of the squared distances from the center of mass of the molecule, weighted by the scattering length density

Rg: 3.2 nm

28 kDa protein

slide-41
SLIDE 41

Radius of gyration (Rg)

[Figure labels: 100 nm³; 6 nm, 3.6 nm, 6.4 nm, 3.4 nm, 4.8 nm, 2.2 nm]

slide-42
SLIDE 42

Radius of gyration (Rg)

André Guinier (1911–2000)

Guinier approximation:

$$I(s) \approx I(0)\,\exp\left(-\frac{s^2 R_g^2}{3}\right), \qquad s \lesssim 1/R_g$$

slide-43
SLIDE 43

Ln I(s) s2

Guinier plot

Radius of gyration (Rg)

slide-44
SLIDE 44

Guinier plot

Radius of gyration (Rg)

Ln I(s)

s2

slide-45
SLIDE 45

Ln I(s)

s2

Guinier plot

Radius of gyration (Rg)

y = ax + b, with y = ln I(s) and x = s²

Rg = √(−3a), ln I(0) = b

sRg < 1.3
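The straight-line fit on the Guinier plot can be sketched as below: fit ln I(s) against s², then read Rg from the slope and I(0) from the intercept. This is a plain unweighted fit on a user-chosen range, offered as an assumption-laden illustration rather than the algorithm of any particular program; the function name is hypothetical.

```python
import numpy as np

def guinier_fit(s, intensity, s_min, s_max):
    """Fit ln I(s) = a*s^2 + b on [s_min, s_max]; return (Rg, I0)."""
    s = np.asarray(s, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    mask = (s >= s_min) & (s <= s_max) & (intensity > 0)
    a, b = np.polyfit(s[mask] ** 2, np.log(intensity[mask]), 1)  # slope, intercept
    if a >= 0:
        raise ValueError("non-negative slope: no Guinier region in this range")
    rg = np.sqrt(-3.0 * a)   # Rg = sqrt(-3a)
    i0 = np.exp(b)           # ln I(0) = b
    return rg, i0

# Synthetic example: an ideal Guinier curve with Rg = 2.0 nm and I(0) = 100
s = np.linspace(0.05, 1.0, 200)
i_s = 100.0 * np.exp(-(s * 2.0) ** 2 / 3.0)
rg, i0 = guinier_fit(s, i_s, 0.26, 0.63)
print(rg, i0, "s_max*Rg < 1.3:", 0.63 * rg < 1.3)
```

After the fit, the range should be checked against the sRg < 1.3 criterion and narrowed if the upper limit violates it.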

slide-46
SLIDE 46

Ln I(s)

s2

Guinier plot

Radius of gyration (Rg)

  • Rg ± stdev
  • Forward scattering I(0)
  • Data quality
  • Data range

slide-47
SLIDE 47

Log I(s) s, 1/nm

slide-48
SLIDE 48

Log I(s) s, 1/nm

slide-49
SLIDE 49

Aggregation

Monodisperse sample

slide-50
SLIDE 50

Aggregated sample

Aggregation

slide-51
SLIDE 51

Log I(s) s, 1/nm

Logarithmic plot

slide-52
SLIDE 52

Guinier plot

Ln I(s)

s2

slide-53
SLIDE 53

Guinier plot

Ln I(s)

s2

smin = 0.26 nm⁻¹, smax = 0.63 nm⁻¹
Rg = 2.0 nm
smin·Rg = 0.52, smax·Rg = 1.26 < 1.3

slide-54
SLIDE 54

Guinier plot

Ln I(s)

s2

smin = 0.44 nm⁻¹, smax = 0.63 nm⁻¹
Rg = 2.3 nm
smin·Rg = 1.01, smax·Rg = 1.45 > 1.3

slide-55
SLIDE 55

Molecular weight (MW)

  • From I(0)

      – I(s) on an absolute scale
      – I(s) on a relative scale

  • From the Porod volume
  • SAXSMoW (a.k.a. Fischer method)
  • Volume-of-correlation method
  • Consensus Bayesian assessment of MW
slide-56
SLIDE 56

Molecular weight from I(0)

  • I(s) on an absolute scale (cm⁻¹)

Assuming I(s) is normalized against concentration (mg/ml):

$$MW_{sample} = \frac{10^3\, I(0)_{sample}\, N_A}{(\Delta\rho\, v_{sample})^2}$$

Δρ – contrast in cm⁻²
v_sample – partial specific volume in cm³/g
N_A – Avogadro’s number

  • I(s) on a relative scale (a.u.)

$$\frac{MW_{sample}}{MW_{BSA}} = \frac{I(0)_{sample}}{I(0)_{BSA}} \quad\Rightarrow\quad MW_{sample} = \frac{I(0)_{sample} \cdot MW_{BSA}}{I(0)_{BSA}}$$
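Both routes from I(0) to the molecular weight are simple arithmetic, sketched below. The function names are hypothetical, and the numbers in the example (a 66 kDa standard, a contrast of 2.8×10¹⁰ cm⁻², a partial specific volume of 0.74 cm³/g) are assumed illustrative values, not data from the slides.

```python
N_A = 6.02214076e23  # Avogadro's number, 1/mol

def mw_relative(i0_sample, i0_standard, mw_standard):
    """MW from I(0) relative to a standard such as BSA.

    Both I(0) values must be normalized against concentration; the result
    has the same unit as mw_standard.
    """
    return i0_sample * mw_standard / i0_standard

def mw_absolute(i0_abs, contrast_cm2, psv_cm3_g):
    """MW in Da from I(0) in cm^-1, already normalized against c in mg/ml."""
    return 1e3 * i0_abs * N_A / (contrast_cm2 * psv_cm3_g) ** 2

# Relative scale: the sample scatters half as strongly as an assumed 66 kDa standard
print(mw_relative(0.5, 1.0, 66.0))        # kDa

# Absolute scale with assumed contrast and partial specific volume
print(mw_absolute(0.047, 2.8e10, 0.74))   # Da
```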
slide-57
SLIDE 57

Porod volume

Excluded volume of the hydrated particle

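$$V_P = \frac{2\pi^2 I(0)}{\int_0^\infty [I(s) - K_4]\, s^2\, ds}$$

where $\int_0^\infty I(s)\, s^2\, ds$ is the Porod invariant and $K_4$ is a constant subtracted from the intensity.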

  • G. Porod (1982) Academic Press, London
slide-58
SLIDE 58

Porod volume

Excluded volume of the hydrated particle

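$$V_P = \frac{2\pi^2 I(0)}{\int_0^\infty [I(s) - K_4]\, s^2\, ds}$$

where $\int_0^\infty I(s)\, s^2\, ds$ is the Porod invariant and $K_4$ is a constant subtracted from the intensity.

For proteins: MW [kDa] ~ Vp [nm³] / 1.6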
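A numerical sketch of the Porod volume follows, assuming a simple trapezoidal integration truncated at the last measured point and a user-supplied constant K4 (zero by default). It is checked on the analytic scattering of a homogeneous sphere, whose true volume is known; the function names are hypothetical and real programs handle the truncation and the constant more carefully.

```python
import numpy as np

def porod_volume(s, intensity, i0, k4=0.0):
    """V_P = 2*pi^2*I(0) / integral of [I(s) - K4]*s^2 ds (trapezoidal rule).

    With s in nm^-1 the result is in nm^3. The integral stops at max(s).
    """
    s = np.asarray(s, dtype=float)
    y = (np.asarray(intensity, dtype=float) - k4) * s ** 2
    q = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(s))  # trapezoidal integration
    return 2.0 * np.pi ** 2 * i0 / q

def mw_from_porod_kda(vp_nm3):
    """Rule of thumb for proteins: MW [kDa] ~ Vp [nm^3] / 1.6."""
    return vp_nm3 / 1.6

# Check on a homogeneous sphere of radius 3 nm (true volume about 113.1 nm^3)
R = 3.0
s = np.linspace(1e-4, 50.0, 200001)
x = s * R
i_sphere = (3.0 * (np.sin(x) - x * np.cos(x)) / x ** 3) ** 2  # normalized so I(0) = 1
vp = porod_volume(s, i_sphere, i0=1.0)
print(vp, mw_from_porod_kda(69.5))
```

Because the integral is cut off at a finite s, the volume comes out slightly too large; the effect grows as the measured range gets shorter.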

slide-59
SLIDE 59

Porod volume

Excluded volume of the hydrated particle

  • 69.5 nm³: ~43 kDa
  • 69.2 nm³: ~43 kDa

*Simulated data

slide-60
SLIDE 60

Porod volume

Excluded volume of the hydrated particle

  • SASBDB SASDA82: 770 nm³, 480 kDa
  • SASBDB SASDA96: 23.7 nm³, 14.8 kDa

slide-61
SLIDE 61

Other methods

Molecular weight estimates for SASBDB entries SASDA96 and SASDA82:

  • Porod volume (VP): 14.8 kDa and 480 kDa
  • SAXSMoW: 11.2 kDa and 510 kDa. H. Fischer et al. (2010) J. Appl. Cryst. 43, 101–109
  • Volume-of-correlation (Vc): 12.3 kDa and 540 kDa. R.P. Rambo and J.A. Tainer (2013) Nature 496(7446), 477–481
  • Bayesian inference: 11.3 kDa and 479 kDa. N.R. Hajizadeh, D. Franke, C.M. Jeffries, D.I. Svergun (2018) Sci. Rep. 8:7204

slide-62
SLIDE 62

Distance distribution function

r, nm γ(r)

slide-63
SLIDE 63

Distance distribution function

r, nm γ(r)

slide-64
SLIDE 64

Distance distribution function

r, nm

γ(r)

slide-65
SLIDE 65

Distance distribution function

r, nm γ(r)

p(r) = r² γ(r)

slide-66
SLIDE 66

[Figure: body labelled 6 nm, 100 nm³; plot of p(r) versus r, nm]

Dmax = 6 nm

Distance distribution function

slide-67
SLIDE 67

r, nm p(r)

Distance distribution function

slide-68
SLIDE 68

r, nm p(r)

Distance distribution function

slide-69
SLIDE 69

r, nm p(r)

Distance distribution function

slide-70
SLIDE 70

r, nm p(r) Log I(s) s, nm-1

Distance distribution function

slide-71
SLIDE 71

r, nm p(r) Log I(s) s, nm-1

$$I(s) = 4\pi \int_0^{D_{max}} p(r)\, \frac{\sin(sr)}{sr}\, dr$$

$$p(r) = \frac{r^2}{2\pi^2} \int_0^{\infty} s^2 I(s)\, \frac{\sin(sr)}{sr}\, ds$$
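The forward transform from p(r) to I(s) can be evaluated directly by numerical integration. The sketch below assumes a trapezoidal rule on a given r grid and checks itself against the known form factor of a homogeneous sphere; the function name is hypothetical. The inverse direction (from noisy I(s) to p(r)) is ill-posed and is not attempted here.

```python
import numpy as np

def intensity_from_pr(s, r, p):
    """I(s) = 4*pi * integral_0^Dmax p(r) * sin(sr)/(sr) dr (trapezoidal rule)."""
    s = np.atleast_1d(np.asarray(s, dtype=float))
    r = np.asarray(r, dtype=float)
    p = np.asarray(p, dtype=float)
    # np.sinc(x) = sin(pi*x)/(pi*x), so sin(sr)/(sr) = np.sinc(s*r/pi); safe at sr = 0
    y = np.sinc(np.outer(s, r) / np.pi) * p          # shape (len(s), len(r))
    return 4.0 * np.pi * np.sum(0.5 * (y[:, 1:] + y[:, :-1]) * np.diff(r), axis=1)

# p(r) = r^2 * gamma(r) of a homogeneous sphere of radius R, so Dmax = 2R
R = 3.0
r = np.linspace(0.0, 2.0 * R, 2001)
p = r ** 2 * (1.0 - 3.0 * r / (4.0 * R) + r ** 3 / (16.0 * R ** 3))
s = np.array([0.0, 0.5, 1.0])
i_s = intensity_from_pr(s, r, p)

# Analytic sphere form factor for comparison
x = s[1:] * R
expected = (3.0 * (np.sin(x) - x * np.cos(x)) / x ** 3) ** 2
print(i_s[1:] / i_s[0], expected)
```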

slide-72
SLIDE 72

p(r) plot

Distance distribution function

[Two plots: p(r) versus r, nm, each with Dmax marked]

SASBDB: SASDA96, SASDA82

slide-73
SLIDE 73

Data quality

smin < π/Dmax

[Plots: p(r) versus r, nm, with Dmax marked; I(s) versus s, 1/nm, with smin marked]

slide-74
SLIDE 74

r, nm p(r)

Rg and I(0) from p(r)

$$I(0) = 4\pi \int_0^{D_{max}} p(r)\, dr$$

$$R_g^2 = \frac{\int_0^{D_{max}} r^2 p(r)\, dr}{2 \int_0^{D_{max}} p(r)\, dr}$$
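The two integrals for I(0) and Rg from p(r) are easy to evaluate numerically. The sketch below uses a trapezoidal rule and is checked on a homogeneous sphere, for which Rg² = 3R²/5; the helper and function names are hypothetical.

```python
import numpy as np

def _trapezoid(y, x):
    """Trapezoidal integral of y over x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def rg_i0_from_pr(r, p):
    """Return (Rg, I0) from a distance distribution p(r) sampled on 0..Dmax."""
    r = np.asarray(r, dtype=float)
    p = np.asarray(p, dtype=float)
    norm = _trapezoid(p, r)
    i0 = 4.0 * np.pi * norm                                # I(0) = 4*pi * int p(r) dr
    rg = np.sqrt(_trapezoid(r ** 2 * p, r) / (2.0 * norm))  # Rg^2 = int r^2 p / (2 int p)
    return rg, i0

# Homogeneous sphere of radius 3 nm: expected Rg = sqrt(3/5) * R
R = 3.0
r = np.linspace(0.0, 2.0 * R, 2001)
p = r ** 2 * (1.0 - 3.0 * r / (4.0 * R) + r ** 3 / (16.0 * R ** 3))
rg, i0 = rg_i0_from_pr(r, p)
print(rg, np.sqrt(0.6) * R)
```

Unlike the Guinier fit, this estimate uses the whole curve through p(r) rather than only the lowest angles.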

slide-75
SLIDE 75

1D → 3D: modelling!

slide-76
SLIDE 76

[Plot: log I(s) versus s, nm⁻¹; experimental SAXS pattern]

SAXS data from macromolecules in solution

slide-77
SLIDE 77

[Plot: log I(s) versus s, nm⁻¹; experimental SAXS pattern and SAXS pattern calculated from model]

SAXS data from macromolecules in solution

slide-78
SLIDE 78

SAXS data from macromolecules in solution

[Plot: log I(s) versus s, nm⁻¹; experimental SAXS pattern and SAXS pattern calculated from model]

slide-79
SLIDE 79

[Plot: log I(s) versus s, nm⁻¹; experimental SAXS pattern]

SAXS data from macromolecules in solution

slide-80
SLIDE 80

Ab initio shape reconstruction: dummy atom modelling

[Plot: log I(s) versus s, nm⁻¹; experimental SAXS pattern and SAXS pattern calculated from model]

slide-81
SLIDE 81

Ab initio shape reconstruction: dummy atom modelling

[Plot: log I(s) versus s, nm⁻¹; experimental SAXS pattern and SAXS pattern calculated from model]

slide-82
SLIDE 82

Ab initio shape reconstruction: dummy atom modelling

[Plot: log I(s) versus s, nm⁻¹; experimental SAXS pattern]

Rg = 3.4 nm
smax = 8/Rg ≈ 2.35 nm⁻¹

slide-83
SLIDE 83

Ab initio shape reconstruction: dummy atom modelling

[Plot: log I(s) versus s, nm⁻¹; experimental SAXS pattern and fit by p(r)]

[Plot: p(r) versus r, nm]

smax = 8/Rg

slide-84
SLIDE 84

Ab initio shape reconstruction: dummy atom modelling

[Plot: log I(s) versus s, nm⁻¹; fit by p(r)]

[Plot: p(r) versus r, nm]

slide-85
SLIDE 85

log I(s) nm-1 target curve

≈2000–10000 “dummy atoms” 2–10 Å

Franke, D. and Svergun, D.I. (2009) J Appl Cryst 42, 342–346.

DAMMIF

slide-86
SLIDE 86

log I(s) nm-1 target curve calculated from the model

slide-87
SLIDE 87
slide-88
SLIDE 88
slide-89
SLIDE 89

DAMMIN

  • Variable number of “dummy atoms” on a fixed grid
  • Scattering is computed using spherical harmonics
  • Monte-Carlo type search
  • Fixed search space (defined by Dmax)
  • Provides volume/molecular weight estimate
  • Idea first published by P. Chacón et al. (1998) Biophys J 74

Svergun, D.I. (1999) Biophys J 76
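To make the idea of dummy atoms on a fixed grid with a Monte-Carlo type search concrete, here is a deliberately small toy. It is not DAMMIN: scattering is computed with the Debye formula instead of spherical harmonics, the acceptance rule is a plain Metropolis step with an arbitrary cooling schedule, and no compactness or connectivity restraints are applied. All names and parameter values are assumptions made for the example.

```python
import numpy as np

def debye_curve(kernels, occ):
    """I(s_k) = sum_ij occ_i occ_j sin(s_k r_ij)/(s_k r_ij) for 0/1 occupancies."""
    return (kernels @ occ) @ occ

def discrepancy(curve, target):
    """Mean squared difference between curves normalized to their first point."""
    return float(np.mean((curve / curve[0] - target / target[0]) ** 2))

def toy_bead_search(grid, s, target, n_steps=1000, t0=1e-3, cooling=0.995, seed=0):
    rng = np.random.default_rng(seed)
    dist = np.linalg.norm(grid[:, None, :] - grid[None, :, :], axis=-1)
    kernels = np.sinc(s[:, None, None] * dist[None, :, :] / np.pi)  # sin(sr)/(sr)
    occ = np.ones(len(grid))                 # start with every grid point occupied
    cur = start = discrepancy(debye_curve(kernels, occ), target)
    best_occ, best = occ.copy(), cur
    temp = t0
    for _ in range(n_steps):
        trial = occ.copy()
        i = rng.integers(len(grid))
        trial[i] = 1.0 - trial[i]            # add or remove one dummy atom
        if trial.sum() < 1:                  # never allow an empty model
            continue
        new = discrepancy(debye_curve(kernels, trial), target)
        # Metropolis rule: always accept improvements, sometimes accept worse moves
        if new < cur or rng.random() < np.exp((cur - new) / temp):
            occ, cur = trial, new
            if cur < best:
                best_occ, best = occ.copy(), cur
        temp *= cooling
    return best_occ, start, best

# Fixed search space: grid with 1 nm spacing inside a sphere of radius 2.5 nm
axis = np.arange(-2.0, 3.0)
grid = np.array([[x, y, z] for x in axis for y in axis for z in axis])
grid = grid[np.linalg.norm(grid, axis=1) <= 2.5]
s = np.linspace(0.0, 3.0, 16)

# Synthetic target: the curve of a flat disc cut out of the same grid
true_occ = (np.abs(grid[:, 0]) <= 0.5).astype(float)
dist = np.linalg.norm(grid[:, None, :] - grid[None, :, :], axis=-1)
target = debye_curve(np.sinc(s[:, None, None] * dist[None, :, :] / np.pi), true_occ)

model, start, best = toy_bead_search(grid, s, target)
print(f"discrepancy: {start:.2e} -> {best:.2e}, beads kept: {int(model.sum())}")
```

Running the search with different seeds generally gives different bead models with similar fits, which is one reason reconstructions are repeated multiple times.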

slide-90
SLIDE 90

DAMMIF

  • Variable number of “dummy atoms” on a fixed grid
  • Scattering is computed using spherical harmonics
  • Monte-Carlo type search
  • Expandable search space
  • Provides volume/molecular weight estimate
  • 40 times faster than DAMMIN (D. I. Svergun (1999) Biophys J 76)

Franke, D. and Svergun, D.I. (2009) J Appl Cryst 42, 342–346

slide-91
SLIDE 91

https://www.embl-hamburg.de/biosaxs/atsas-online/dammif.php

slide-92
SLIDE 92

Single phase shape determination

Fit one data set

Ab initio shape reconstruction

slide-93
SLIDE 93

Fit data from several subunits

Ab initio shape reconstruction: multi-phase dummy atom modelling

slide-94
SLIDE 94

https://www.embl-hamburg.de/biosaxs/atsas-online/monsa.php

slide-95
SLIDE 95

Svergun, D.I., Petoukhov, M.V, Koch, M.H.J. (2001) Biophys J 80, 2946–2953.

GASBOR

Ab initio reconstruction: dummy residue modelling

[Figure labels: 3.8 Å, Dmax]

Number of dummy residues = number of residues in the protein

slide-96
SLIDE 96

log10I(q) q, nm-1

Ab initio reconstruction: dummy residue modelling

target curve calculated from the model

slide-97
SLIDE 97

log10I(q) q, nm-1

Ab initio reconstruction: dummy residue modelling

target curve calculated from the model

slide-98
SLIDE 98

log10I(q) q, nm-1

Ab initio reconstruction: dummy residue modelling

target curve calculated from the model

slide-99
SLIDE 99

GASBOR

  • Fixed number of “dummy residues”
  • Distances to neighbor “residues” like in proteins

Svergun, D.I., Petoukhov, M.V, Koch, M.H.J. (2001) Biophys J 80, 2946–2953

slide-100
SLIDE 100

GASBOR

  • Fixed number of “dummy residues”
  • Distances to neighbor “residues” like in proteins
  • Fixed search space
  • Scattering is computed using Debye formula
  • Higher angles used (up to 12 nm-1)
  • Only for proteins smaller than 660 kDa

Svergun, D.I., Petoukhov, M.V, Koch, M.H.J. (2001) Biophys J 80, 2946–2953
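The Debye formula mentioned in the list above sums sin(s·r)/(s·r) over all pairs of scatterers. A minimal sketch for point-like beads with constant form factors follows; it only illustrates the formula and leaves out everything else GASBOR does (residue form factors, solvent, restraints). The function name is hypothetical.

```python
import numpy as np

def debye_intensity(s, coords, f=None):
    """I(s) = sum_i sum_j f_i f_j sin(s r_ij)/(s r_ij) for point scatterers."""
    s = np.atleast_1d(np.asarray(s, dtype=float))
    coords = np.asarray(coords, dtype=float)           # shape (N, 3), in nm
    f = np.ones(len(coords)) if f is None else np.asarray(f, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))         # (N, N) pair distances
    ff = np.outer(f, f)
    # sin(x)/x via np.sinc, which also handles the i == j terms (r_ij = 0)
    return np.array([np.sum(ff * np.sinc(si * dist / np.pi)) for si in s])

# Two beads 0.38 nm (3.8 Å) apart
coords = [[0.0, 0.0, 0.0], [0.38, 0.0, 0.0]]
i_s = debye_intensity([0.0, 5.0], coords)
print(i_s)
```

The cost grows with the square of the number of beads, which is manageable for a fixed number of dummy residues but not for very large particles.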

slide-101
SLIDE 101
slide-102
SLIDE 102

Ambiguity

I(s) s I(s) s

slide-103
SLIDE 103

I(s) s I(s) s

Ambiguity

First formulated by R. Kirste in 1964

slide-104
SLIDE 104

I(s) s I(s) s

Ambiguity

From a study by M. Petoukhov, 2015

slide-105
SLIDE 105

AMBIMETER

Curves from all 14 112 possible shapes represented by one to seven interconnected beads

Petoukhov, M.V. and Svergun, D.I. (2015) Acta Cryst D71, 1051–1058

slide-106
SLIDE 106

Ab initio model validity

First validate your sample and input data! Check for:

      – monodispersity;
      – radiation damage;
      – aggregation;
      – concentration effects;
      – overall parameters;
      – signal-to-noise level.

Make sure your model fits the data. Repeat multiple times.
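One way to check that a model fits the data is a reduced chi-square between the experimental curve and the model curve after scaling. The sketch below assumes both curves are on the same s grid and that a single scale factor is the only fitted parameter; the function name is hypothetical.

```python
import numpy as np

def reduced_chi2(i_exp, sigma, i_model, n_params=1):
    """Reduced chi-square after scaling the model to the data by least squares."""
    i_exp, sigma, i_model = (np.asarray(a, dtype=float) for a in (i_exp, sigma, i_model))
    w = 1.0 / sigma ** 2
    scale = np.sum(w * i_exp * i_model) / np.sum(w * i_model ** 2)  # optimal scale factor
    resid = (i_exp - scale * i_model) / sigma
    return float(np.sum(resid ** 2) / (len(i_exp) - n_params)), float(scale)

# A model that matches the data up to a scale factor, and one that does not
model_curve = [1.0, 2.0, 3.0, 4.0]
chi2_good, scale = reduced_chi2([2.0, 4.0, 6.0, 8.0], [0.1] * 4, model_curve)
chi2_bad, _ = reduced_chi2([2.0, 4.0, 6.0, 9.0], [0.1] * 4, model_curve)
print(chi2_good, chi2_bad)
```

Values near 1 indicate agreement within the stated errors; the number is only meaningful if the experimental uncertainties are realistic.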

slide-107
SLIDE 107

Data reduction and analysis steps

  • Radial averaging
  • Radiation damage check
  • Normalization
  • Background subtraction
  • Merge multiple concentrations
  • Rg, molecular weight
  • Dmax, p(r) …
  • Ab initio shape determination

slide-108
SLIDE 108

Thank you!

www.saxier.org/forum

www.sasbdb.org