SAXS/ SANS data processing and overall parameters Petr V. Konarev - - PowerPoint PPT Presentation


EMBO Global Exchange Lecture Course 30 November 2012 Hyderabad India SAXS/ SANS data processing and overall parameters Petr V. Konarev European Molecular Biology Laboratory, Hamburg Outstation BioSAXS group Small-angle scattering:


slide-1
SLIDE 1

SAXS/ SANS data processing and overall parameters

Petr V. Konarev European Molecular Biology Laboratory, Hamburg Outstation BioSAXS group

EMBO Global Exchange Lecture Course 30 November 2012 Hyderabad India

slide-2
SLIDE 2

Small-angle scattering: experiment

Scheme: monochromatic beam with wave vector k (k=2π/λ), sample, scattered wave vector k1 at scattering angle 2θ, detector.

Scattering vector s=k1-k, s=4π sinθ/λ

Radiation sources:

  • X-ray generator (λ = 0.1 - 0.2 nm)
  • Synchrotron (λ = 0.03 - 0.35 nm)
  • Thermal neutrons (λ = 0.2 - 1 nm)

[Figure: Log (Intensity) vs s=4π sinθ/λ, nm-1]
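The definition s=4π sinθ/λ can be checked numerically. The sketch below is only an illustration; the function name `scattering_vector` and the example angles are assumptions, not part of the lecture.

```python
import numpy as np

def scattering_vector(two_theta_deg, wavelength_nm):
    """Return s = 4*pi*sin(theta)/lambda in nm^-1.

    two_theta_deg is the full scattering angle 2*theta in degrees.
    """
    theta = np.radians(np.asarray(two_theta_deg, dtype=float)) / 2.0
    return 4.0 * np.pi * np.sin(theta) / wavelength_nm

# Example: three scattering angles at a wavelength of 0.1 nm
s = scattering_vector([0.1, 1.0, 3.0], 0.1)
```

At small angles s grows almost linearly with 2θ, which is why the detector radius maps nearly linearly onto the s axis.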

slide-3
SLIDE 3

BIOSAXS beamline P12 (Petra-3)

Pilatus 2M

2D Raw Data Iron nanoparticles (June 2011)

slide-4
SLIDE 4

PILATUS Pixel X-ray Detector at P12

PILATUS 2M (24*100K modules)
Active area: 250*290 mm2, pixel size: 172 μm
Readout time: 3.6 ms, framing rate: 50 Hz

Silver behenate Axis calibration standard

slide-5
SLIDE 5

Raw data reduction steps

  • Radial integration of 2D image into 1D curve
  • Exact coordinates of the beam center are required for integration (determined from AgBeh data)
  • Mask file is used to eliminate beamstop and inactive detector area
  • Associated errors in the data points are computed from the numbers of counts using Poisson statistics
  • Data are normalized to the pindiode value (intensity of the transmitted beam) and exposure time
  • Data are transferred into ASCII format containing 3 columns: s I(s) Er(s)
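The reduction steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration under simplifying assumptions (equal-width rings in pixel units, no solid-angle or flat-field corrections), not the beamline software; the names `radial_average` and `normalize` are hypothetical. Converting the ring radius in pixels to s additionally needs the sample-detector distance, which is not handled here.

```python
import numpy as np

def radial_average(image, mask, center, n_bins):
    """Average a 2D count image in rings around the beam center.

    image  : 2D array of photon counts
    mask   : 2D bool array, True for pixels to ignore (beamstop, inactive area)
    center : (row, col) beam center in pixels
    Returns ring radius (pixels), mean counts per pixel, Poisson error of the mean.
    """
    rows, cols = np.indices(image.shape)
    radius = np.hypot(rows - center[0], cols - center[1])
    valid = ~mask
    edges = np.linspace(0.0, radius[valid].max(), n_bins + 1)
    ring = np.clip(np.digitize(radius[valid], edges) - 1, 0, n_bins - 1)
    counts = np.bincount(ring, weights=image[valid], minlength=n_bins)
    npix = np.bincount(ring, minlength=n_bins)
    filled = npix > 0
    r_mid = 0.5 * (edges[:-1] + edges[1:])
    mean = counts[filled] / npix[filled]
    # Poisson statistics: the variance of the summed counts equals the sum
    err = np.sqrt(counts[filled]) / npix[filled]
    return r_mid[filled], mean, err

def normalize(intensity, error, transmitted, exposure_s):
    """Scale intensity and error by transmitted beam intensity and exposure time."""
    scale = 1.0 / (transmitted * exposure_s)
    return intensity * scale, error * scale
```

The three returned arrays correspond to the three columns of the ASCII file once the radius has been converted to s.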

slide-6
SLIDE 6

Normalization against:

  • data collection time,
  • transmitted sample intensity.

[Figure: radial averaging of the 2D small-angle scattering pattern into a 1D curve, Log I(s), a.u. vs s, nm-1, where |s| = 4π sinθ/λ]

slide-7
SLIDE 7

Scattering by matter

  • X-rays are scattered mostly by electrons
  • Thermal neutrons are scattered mostly by nuclei
  • Scattering amplitude from an ensemble of atoms A(s) is the Fourier transform of the scattering length density distribution in the sample ρ(r)
  • Experimentally, scattering intensity I(s) = [A(s)]^2 is measured.

slide-8
SLIDE 8

Small-angle scattering: contrast

[Figure panels: Isample(s), Imatrix(s), Iparticle(s)]

♦ To obtain scattering from the particles, matrix scattering must be subtracted, which also significantly reduces the contribution from parasitic background (slits, sample holder etc.)

♦ Contrast Δρ = <ρ(r) - ρs>, where ρs is the scattering density of the matrix, may be very small for biological samples

slide-9
SLIDE 9

X-rays neutrons

  • X-rays: scattering factor increases with atomic number, no difference between H and D
  • Neutrons: scattering factor is irregular, may be negative, huge difference between H and D

Element          H       D       C       N       O       P       S       Au
At. weight       1       2       12      14      16      31      32      197
N electrons      1       1       6       7       8       15      16      79
bX, 10-12 cm     0.282   0.282   1.69    1.97    2.26    4.23    4.51    22.3
bN, 10-12 cm     -0.374  0.667   0.665   0.940   0.580   0.510   0.280   0.760

The X-ray scattering length is bX = 0.282 × N electrons (in 10-12 cm).
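The "huge difference between H and D" for neutrons can be made concrete by summing the bN values from the table for light and heavy water. This is a small arithmetic sketch; the helper name `molecule_scattering_length` is an assumption.

```python
# Neutron scattering lengths bN from the table, in units of 1e-12 cm
B_NEUTRON = {"H": -0.374, "D": 0.667, "C": 0.665, "N": 0.940,
             "O": 0.580, "P": 0.510, "S": 0.280, "Au": 0.760}

def molecule_scattering_length(composition, table=B_NEUTRON):
    """Sum atomic scattering lengths for a composition such as {"H": 2, "O": 1}."""
    return sum(table[element] * count for element, count in composition.items())

b_h2o = molecule_scattering_length({"H": 2, "O": 1})  # 2*(-0.374) + 0.580
b_d2o = molecule_scattering_length({"D": 2, "O": 1})  # 2*0.667 + 0.580
```

The sum is slightly negative for H2O and strongly positive for D2O, which is what makes contrast variation by H2O/D2O mixtures possible in SANS.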

slide-10
SLIDE 10

Sample and buffer scattering

Looking for protein signals less than 5% above background level…

slide-11
SLIDE 11

Sample and buffer scattering

slide-12
SLIDE 12

Sample and buffer scattering

$$I(s) = \frac{T_m I_s(s) - T_s I_m(s) - (T_m - T_s)\, I_e(s)}{c\, T_s T_m\, \mathrm{Det}(s)}$$

Here, subscripts s, m and e denote the scattering from sample, matrix (e.g. solvent) and empty cell (camera background), T stands for transmission, c for sample concentration and Det(s) is the detector response function. For solution scattering studies Ts is usually equal to Tm and the third term vanishes.
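The background subtraction on this slide can be sketched as follows. The function names and the error-propagation helper are assumptions added for illustration; the helper treats the three measurements as independent, which the lecture does not state explicitly.

```python
import numpy as np

def subtract_background(i_s, i_m, i_e, t_s, t_m, conc, det=1.0):
    """I = [T_m*I_s - T_s*I_m - (T_m - T_s)*I_e] / (c * T_s * T_m * Det)."""
    numerator = t_m * i_s - t_s * i_m - (t_m - t_s) * i_e
    return numerator / (conc * t_s * t_m * det)

def subtraction_error(err_s, err_m, err_e, t_s, t_m, conc, det=1.0):
    """Gaussian error propagation, assuming independent errors (an assumption)."""
    var = (t_m * err_s) ** 2 + (t_s * err_m) ** 2 + ((t_m - t_s) * err_e) ** 2
    return np.sqrt(var) / (conc * t_s * t_m * det)
```

With equal transmissions the empty-cell term drops out and the result reduces to (I_s - I_m)/(c·T·Det), in line with the remark that the third term vanishes for solution scattering.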

slide-13
SLIDE 13

Analysis of biological SAS data

[Figure: scattering curve I(s), lg I (relative) vs s, nm-1]

Overall parameters: Rg, Dmax, MMexp, excluded volume

slide-14
SLIDE 14

Overall parameters

Radius of gyration Rg (Guinier, 1939):

$$I(s) \cong I(0)\exp\left(-\tfrac{1}{3} R_g^2 s^2\right)$$

Maximum size Dmax: p(r)=0 for r> Dmax

Excluded particle volume (Porod, 1952):

$$V = 2\pi^2 I(0)/Q; \qquad Q = \int_0^\infty s^2 I(s)\, ds$$

slide-15
SLIDE 15

Program PRIMUS: graphical package for data manipulations and analysis

♦ data manipulations (averaging, background subtraction, merging of data in different angular ranges, extrapolation to infinite dilution)
♦ evaluation of radius of gyration and forward intensity (Guinier plot, module AUTORG), estimation of Porod volume
♦ calculation of distance/size distribution function p(r)/V(r) (module GNOM)
♦ data fitting using the parameters of simple geometrical bodies (ellipsoid, elliptic/hollow cylinder, rectangular prism) (module BODIES)
♦ data analysis for polydisperse and interacting systems, mixtures and partially ordered systems (modules OLIGOMER, SVDPLOT, MIXTURE and PEAK)

P.V. Konarev, V.V. Volkov, A.V. Sokolova, M.H.J. Koch, D.I. Svergun, J. Appl. Cryst. (2003) 36, 1277-1282

slide-16
SLIDE 16

PRIMUS: graphical user interface

slide-17
SLIDE 17

Data quality

Radiation damage

[Figure: Log I(s), a.u. vs s, nm-1; curve: sample]

slide-18
SLIDE 18

Data quality

Radiation damage

[Figure: Log I(s), a.u. vs s, nm-1; curves: sample, same sample again. RADIATION DAMAGE!]

slide-19
SLIDE 19

Low and High Concentration

[Figure: Log I(s) vs s, nm-1; curves: 1 mg/ml, 10 mg/ml]

Merging data

slide-20
SLIDE 20

Low and High Concentration

[Figure: Log I(s) vs s, nm-1]

Merging data

slide-21
SLIDE 21

Merging data

Low and High Concentration

[Figure: Log I(s) vs s, nm-1]

slide-22
SLIDE 22

Low and High Concentration

[Figure: Log I(s) vs s, nm-1]

Merging data

slide-23
SLIDE 23

Low and High Concentration

[Figure: Log I(s) vs s, nm-1]

Merging data
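Merging a low-concentration curve (reliable at small s) with a high-concentration curve (better statistics at large s) can be sketched as a least-squares scaling in an overlap range followed by a splice. This is an illustrative simplification, not the PRIMUS procedure; `merge_curves`, `overlap` and `s_split` are hypothetical names, and both curves are assumed to share one s grid.

```python
import numpy as np

def merge_curves(s, i_low, i_high, overlap, s_split):
    """Scale i_high onto i_low in the overlap range, then splice at s_split.

    overlap : (s1, s2) range where both curves are trusted
    s_split : below this s the low-concentration data are kept
    """
    in_overlap = (s >= overlap[0]) & (s <= overlap[1])
    # Least-squares factor k minimizing sum (i_low - k*i_high)^2 in the overlap
    k = np.sum(i_low[in_overlap] * i_high[in_overlap]) / np.sum(i_high[in_overlap] ** 2)
    merged = np.where(s < s_split, i_low, k * i_high)
    return merged, k
```

In practice the choice of the overlap range matters: it should lie beyond the region affected by interparticle interference in the concentrated sample.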

slide-24
SLIDE 24

Extrapolation to zero concentration

Infinite dilution

[Figure: Log I(s) vs s, nm-1; curves: 10 mg/ml, 1 mg/ml, 0 mg/ml?]
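One simple way to estimate the "0 mg/ml" curve is a point-by-point linear extrapolation of the concentration-normalized intensities. The sketch below assumes a linear concentration dependence at each s and a common s grid; `extrapolate_to_zero` is a hypothetical helper, not the routine used in PRIMUS.

```python
import numpy as np

def extrapolate_to_zero(concentrations, curves):
    """Linear extrapolation of I(s)/c to c = 0, independently at each s.

    concentrations : sequence of n_conc values (e.g. mg/ml)
    curves         : array (n_conc, n_points) of I(s)/c on a common s grid
    """
    c = np.asarray(concentrations, dtype=float)
    y = np.asarray(curves, dtype=float)
    # np.polyfit fits all columns of y at once: row 0 = slopes, row 1 = intercepts
    slope, intercept = np.polyfit(c, y, 1)
    return intercept
```

At least two concentrations are needed; with more, the fit also averages out part of the noise.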

slide-25
SLIDE 25

Shape and size

[Figure: Log I(s), a.u. vs s, nm-1; curves: lysozyme, apoferritin]

slide-26
SLIDE 26

The scattering is related to the shape

[Figure: lg I(s), relative vs s, nm-1 (0.0 to 0.5) for five bodies: solid sphere, long rod, flat disc, hollow sphere, dumbbell]
slide-27
SLIDE 27

Guinier law

$$I(s) = 4\pi \int_0^{D_{max}} p(r)\, \frac{\sin(sr)}{sr}\, dr$$

For small values of x, sin x/x can be expressed as:

$$\frac{\sin(sr)}{sr} = 1 - \frac{(sr)^2}{3!} + \frac{(sr)^4}{5!} - \dots$$

Hence, close to the origin: I(s) = I(0)[1-ks2+…] ≈ I(0)exp(-ks2)

The scattering curve of a particle can be approximated by a Gaussian curve in the vicinity of the origin:

$$I(s) \cong I(0)\exp(-s^2 R_g^2/3)$$

This is a classical formula derived by André Guinier in 1938, in his first SAXS application (to defects in metals)

slide-28
SLIDE 28

Radius of gyration

Radius of gyration:

$$R_g^2 = \frac{\int_V \Delta\rho(\mathbf{r})\, r^2\, dV}{\int_V \Delta\rho(\mathbf{r})\, dV}$$

Rg is the quadratic mean of distances to the center of mass weighted by the contrast of electron density. Rg is an index of non-sphericity. For a given volume the smallest Rg is that of a sphere:

  • Sphere (R): $R_g = \sqrt{3/5}\, R$
  • Ellipsoid of revolution (a, b): $R_g = \sqrt{(2a^2 + b^2)/5}$
  • Cylinder (D, H): $R_g = \sqrt{D^2/8 + H^2/12}$

ideal monodisperse
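The Rg formulas for the three simple bodies on this slide translate directly into code. The function names are assumptions; the ellipsoid of revolution is taken to have semi-axes a, a, b.

```python
import math

def rg_sphere(radius):
    """Rg of a solid sphere of radius R."""
    return math.sqrt(3.0 / 5.0) * radius

def rg_ellipsoid_of_revolution(a, b):
    """Rg of a solid ellipsoid of revolution with semi-axes a, a, b."""
    return math.sqrt((2.0 * a ** 2 + b ** 2) / 5.0)

def rg_cylinder(diameter, height):
    """Rg of a solid cylinder of diameter D and height H."""
    return math.sqrt(diameter ** 2 / 8.0 + height ** 2 / 12.0)

# A prolate ellipsoid with the same volume as a sphere of R = 5 (a*a*b = R^3)
rg_round = rg_sphere(5.0)
rg_prolate = rg_ellipsoid_of_revolution(4.0, 125.0 / 16.0)
```

Comparing `rg_round` with `rg_prolate` illustrates the statement that, at fixed volume, the sphere has the smallest Rg.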

slide-29
SLIDE 29

Guinier plot example

Validity range: 0 < sRg < 1.3

The law is generally used under its log form:

$$\ln[I(s)] \cong \ln[I(0)] - s^2 R_g^2/3$$

A linear regression yields two parameters:

  • I(0) (y-intercept)
  • Rg from the slope
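The linear regression in the Guinier plot can be sketched as below. This is a bare-bones illustration that iterates the fit range until s·Rg stays below the validity limit; it is not the AUTORG algorithm, the name `guinier_fit` is hypothetical, and it assumes positive intensities and a negative slope.

```python
import numpy as np

def guinier_fit(s, intensity, s_rg_max=1.3, n_iter=20):
    """Fit ln I = ln I(0) - s^2 Rg^2 / 3 on the points with s*Rg < s_rg_max."""
    s = np.asarray(s, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    use = np.ones(s.size, dtype=bool)
    for _ in range(n_iter):
        slope, intercept = np.polyfit(s[use] ** 2, np.log(intensity[use]), 1)
        rg = np.sqrt(-3.0 * slope)          # slope = -Rg^2 / 3
        new_use = s * rg < s_rg_max
        if new_use.sum() < 3 or np.array_equal(new_use, use):
            break
        use = new_use
    return rg, np.exp(intercept)            # Rg and I(0)
```

For real data the lowest-angle points may also need to be discarded (beamstop shadow, aggregation), which this sketch does not attempt.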

slide-30
SLIDE 30

Guinier

PRIMUS: Guinier plot

Rg = 2.68 +- 1.11e-2, I0 = 271.07 +- 0.605

$$I(s) = I(0)\cdot\exp(-s^2 \cdot R_g^2/3)$$

Rg: radius of gyration

M≈MBSA*(I(0)/IBSA(0))
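The molecular mass estimate relative to a BSA standard is a one-line calculation. In the sketch below the forward intensities are explicitly divided by concentration, which the slide formula leaves implicit (an assumption), and the default reference mass of 66 kDa for BSA is also an assumed value supplied for illustration.

```python
def molecular_mass_from_i0(i0, conc, i0_ref, conc_ref, mm_ref_kda=66.0):
    """M ≈ M_ref * (I(0)/c) / (I_ref(0)/c_ref), in kDa.

    i0, conc         : forward intensity and concentration of the sample
    i0_ref, conc_ref : the same for the reference protein (e.g. BSA)
    """
    return mm_ref_kda * (i0 / conc) / (i0_ref / conc_ref)

# A sample with three times the normalized forward scattering of the reference
mm = molecular_mass_from_i0(300.0, 1.0, 100.0, 1.0)
```

Both I(0) values must be measured on the same scale and in the same concentration units for the ratio to be meaningful.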

slide-31
SLIDE 31

PRIMUS: AutoRG module

AutoRg

Petoukhov, M.V., Konarev, P.V., Kikhney, A.G. & Svergun, D.I. (2007) J. Appl. Cryst., 40, s223-s228.

slide-32
SLIDE 32

Rods and platelets

In the case of very elongated particles, the radius of gyration of the cross-section can be derived using a similar representation, plotting this time sI(s) vs s2:

$$sI(s) \propto \exp(-s^2 R_c^2/2)$$

Finally, in the case of a platelet, a thickness parameter is derived from a plot of s2I(s) vs s2:

$$s^2 I(s) \propto \exp(-s^2 R_t^2), \qquad R_t = T/\sqrt{12}$$

with T: thickness
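The modified Guinier plots for rods and platelets use the same linear-regression idea as the ordinary Guinier plot. The sketch below assumes the data have already been restricted to the s range where the respective approximation holds; the function names are hypothetical.

```python
import numpy as np

def cross_section_rg(s, intensity):
    """Rc from the slope of ln[s*I(s)] vs s^2 (slope = -Rc^2 / 2)."""
    slope, _ = np.polyfit(s ** 2, np.log(s * intensity), 1)
    return np.sqrt(-2.0 * slope)

def platelet_thickness(s, intensity):
    """T from the slope of ln[s^2*I(s)] vs s^2 (slope = -Rt^2, T = sqrt(12)*Rt)."""
    slope, _ = np.polyfit(s ** 2, np.log(s ** 2 * intensity), 1)
    return np.sqrt(12.0) * np.sqrt(-slope)
```

Both functions return NaN if the fitted slope is positive, which signals that the chosen range does not follow the assumed law.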

slide-33
SLIDE 33

Porod law and excluded particle Volume

I(s) ~ s-4

Intensity decay is proportional to s-4 at higher angles (for globular particles of uniform density)

$$V_P = \frac{2\pi^2 I(0)}{\int_0^\infty [I(s) - K]\, s^2\, ds}$$

K is a constant determined to ensure the asymptotic intensity decay proportional to s-4 at higher angles, following Porod's law for homogeneous particles.

Vp is the excluded volume of the hydrated particle. For globular macromolecules its value in nm3 is approximately 1.7 times (roughly twice) the molecular mass in kDa.

Example: Vp = 120 nm3 gives MMexp = (70±5) kDa
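The Porod volume can be estimated by numerical integration of s²[I(s) - K]. The sketch below uses a plain trapezoid rule over the measured range and ignores the contributions below s_min and above s_max, so it slightly overestimates V for truncated data; the function names and the helper applying the 1.7 ratio are assumptions for illustration.

```python
import numpy as np

def porod_volume(s, intensity, i0, k=0.0):
    """V = 2*pi^2*I(0)/Q with Q = integral of s^2 [I(s) - K] ds (trapezoid rule)."""
    y = s ** 2 * (intensity - k)
    q = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(s))
    return 2.0 * np.pi ** 2 * i0 / q

def mm_from_porod_volume(volume_nm3, ratio=1.7):
    """Rough molecular mass in kDa from the Porod volume in nm^3."""
    return volume_nm3 / ratio
```

For the example on the slide, `mm_from_porod_volume(120.0)` gives about 70.6 kDa, consistent with the quoted (70±5) kDa.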

slide-34
SLIDE 34

PRIMUS: Porod plot

Porod

$$V = \frac{2\pi^2 I(0)}{Q} = \frac{2\pi^2 I(0)}{\int s^2 (I(s) - K)\, ds}$$

V: excluded volume of particle

V = 92.37

slide-35
SLIDE 35

Real/reciprocal space transformation

p(r)=r2 γ(r): distance distribution function, with γ0(r)=γ(r)/γ(0)

Probability to find a point at distance r from a given point inside the particle

[Figure: points i and j inside the particle separated by the distance rij]
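For a particle represented by a set of points, p(r) can be approximated by a histogram of all pair distances rij. The sketch below is an illustration with hypothetical function names; it also uses the standard relation Rg² = ∫r²p(r)dr / (2∫p(r)dr), which is not stated on the slide.

```python
import numpy as np

def pair_distance_histogram(points, n_bins=50):
    """Histogram of all pair distances r_ij for points filling the particle."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    r_ij = dist[np.triu_indices(len(points), k=1)]   # count each pair once
    hist, edges = np.histogram(r_ij, bins=n_bins, range=(0.0, r_ij.max()))
    r = 0.5 * (edges[:-1] + edges[1:])
    return r, hist.astype(float)

def rg_from_pr(r, p):
    """Rg^2 = sum(r^2 p) / (2 sum(p)); the equal bin widths cancel."""
    return np.sqrt(np.sum(r ** 2 * p) / (2.0 * np.sum(p)))
```

The largest distance in the histogram plays the role of Dmax, and the Rg computed from p(r) should agree with the Rg computed directly from the coordinates.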

slide-36
SLIDE 36

Distance distribution function from simple shapes

slide-37
SLIDE 37

Distance distribution function of helix

slide-38
SLIDE 38

Gnom Run

PRIMUS: GNOM menu

Indirect Fourier Transform

slide-39
SLIDE 39

Gnom

PRIMUS: P(R) function

Indirect Fourier Transform

slide-40
SLIDE 40

$$J(s) = \int_{D_{min}}^{D_{max}} K(s,r)\, p(r)\, dr$$

The operator K(s,r) includes the Fourier transform and smearing effects

This is a typical ill-posed problem, i.e. small errors in J(s) may lead to large errors in p(r). Tikhonov's regularization method is used in GNOM to solve this problem

$$T_\alpha[p] = \| J - Kp \|^2 + \alpha\, \Omega(p)$$

Ω(p): a stabilizer that takes into account the smoothness and non-negativity of p(r) and the systematic deviations between the experimental J(s) and the restored function J(α,s)=Kp(α)

D.I. Svergun (1992) J. Appl. Cryst. 25, 495-503

Estimation of overall parameters in GNOM
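The idea of a regularized indirect Fourier transform can be illustrated with a much simpler scheme than the one in GNOM: a discretized kernel without smearing, and a stabilizer that penalizes only the curvature of p(r). Everything in this sketch (function names, the second-difference stabilizer, the fixed α) is an assumption for illustration and does not reproduce GNOM's stabilizer or its choice of α.

```python
import numpy as np

def ift_kernel(s, r):
    """K[i, j] = 4*pi*sin(s_i r_j)/(s_i r_j) * dr (Fourier part only, no smearing)."""
    dr = r[1] - r[0]
    sr = np.outer(s, r)
    return 4.0 * np.pi * np.sinc(sr / np.pi) * dr   # np.sinc(x) = sin(pi x)/(pi x)

def second_difference(n):
    """(n-2) x n matrix; L @ p approximates the curvature of p(r)."""
    L = np.zeros((n - 2, n))
    idx = np.arange(n - 2)
    L[idx, idx], L[idx, idx + 1], L[idx, idx + 2] = 1.0, -2.0, 1.0
    return L

def regularized_pr(s, j_exp, d_max, alpha, n_r=40):
    """Minimize ||J - K p||^2 + alpha*||L p||^2 via a stacked least-squares system."""
    r = np.linspace(0.0, d_max, n_r + 2)[1:-1]      # interior grid, endpoints excluded
    K = ift_kernel(s, r)
    L = second_difference(n_r)
    A = np.vstack([K, np.sqrt(alpha) * L])
    b = np.concatenate([j_exp, np.zeros(L.shape[0])])
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return r, p
```

Larger α gives a smoother p(r) at the cost of a worse fit to J(s); choosing that balance automatically is the part that GNOM's criteria address and that this sketch leaves to the user.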

slide-41
SLIDE 41

SANS data from bacteriophage T7 in D2O buffer (importance of smearing effects)

Bacteriophage T7 is a large bacterial virus with MM of 56 MDa consisting of an icosahedral protein capsid (diameter of about 600 Å) that contains a double-stranded DNA molecule. The skewed shape of the p(r) function is typical for hollow particles, which is in agreement with a core-shell like structure of the virus. The DNA molecule (having lower contrast in D2O than the protein) is located inside the protein capsid of the phage.

$$J(Q) = W[I(Q)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{0}^{\infty} W_w(u)\, W_l(t)\, W_\lambda(\lambda)\, I\left\{\left[(Q-u)^2 + t^2\right]^{1/2}\lambda^{-1}\right\}\, d\lambda\, dt\, du$$

slide-42
SLIDE 42

AUTOGNOM: automated version of GNOM for monodisperse systems

♦ In the original version of GNOM the maximum particle size Dmax is a user-defined parameter and successive calculations with different Dmax are required to select its optimum value.

♦ This optimum Dmax should provide a smooth real space distance distribution function p(r) such that p(Dmax) and its first derivative p'(Dmax) are approaching zero, and the back-transformed intensity from the p(r) fits the experimental data.

Petoukhov, M.V., Konarev, P.V., Kikhney, A.G. & Svergun, D.I. (2007) J. Appl. Cryst., 40, s223-s228.

slide-43
SLIDE 43

Estimation of Dmax with GNOM (under-estimation)

6.0

  • Poor fit to experimental data
  • Distance distribution function p(r) goes to zero too abruptly

slide-44
SLIDE 44

Estimation of Dmax with GNOM (over-estimation)

  • Good fit to experimental data
  • BUT: distance distribution function p(r) becomes negative

12.0

slide-45
SLIDE 45

8.0

Estimation of Dmax with GNOM (correct case)

  • Good fit to experimental data
  • Distance distribution function p(r) goes smoothly to zero

slide-46
SLIDE 46

AUTOGNOM: automated version of GNOM for monodisperse systems

♦ The maximum size is determined from automated comparison of the p(r) functions calculated at different Dmax values ranging from 2Rg to 4Rg, where Rg is the radius of gyration provided by AUTORG.

♦ The calculated p(r) functions and corresponding fits to the experimental curves are compared using the perceptual criteria of GNOM (Svergun, 1992) together with the analysis of the behavior of the p(r) function near Dmax, and the best p(r) function is chosen for the final output.

Petoukhov, M.V., Konarev, P.V., Kikhney, A.G. & Svergun, D.I. (2007) J. Appl. Cryst., 40, s223-s228.

slide-47
SLIDE 47

An automated SAXS pipeline at P12

  • Data normalization
  • 2D-1D reduction
  • Data processing
  • Check for radiation damage
  • Computation of overall parameters
  • Database search
  • Ab initio modelling
  • XML-summary file generation

Hardware-independent analysis block

slide-48
SLIDE 48

Kratky plot

Scattering data provide a sensitive means of monitoring the degree of compactness of a protein as a function of a given parameter. This is most conveniently represented using the so-called Kratky plot of s2I(s) vs s.

  • Globular particle: bell-shaped curve
  • Gaussian chain: plateau at large s-values

But beware: a plateau does not imply a Gaussian chain.
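The two limiting behaviours in the Kratky plot can be reproduced with analytical models: a solid sphere for the globular case and the Debye function for a Gaussian chain. Both model functions are standard textbook expressions rather than something taken from the slides, and the function names are assumptions.

```python
import numpy as np

def kratky(s, intensity):
    """Ordinate of the Kratky plot, s^2 * I(s)."""
    return s ** 2 * intensity

def sphere_intensity(s, radius):
    """Normalized scattering of a solid sphere, I(0) = 1."""
    x = s * radius
    return (3.0 * (np.sin(x) - x * np.cos(x)) / x ** 3) ** 2

def gaussian_chain_intensity(s, rg):
    """Debye function for a Gaussian chain, I(0) = 1."""
    x = (s * rg) ** 2
    return 2.0 * (np.exp(-x) + x - 1.0) / x ** 2

rg = 3.0
s = np.linspace(0.01, 10.0 / rg, 500)
k_globular = kratky(s, sphere_intensity(s, rg * np.sqrt(5.0 / 3.0)))  # same Rg
k_chain = kratky(s, gaussian_chain_intensity(s, rg))
```

Plotting `k_globular` and `k_chain` against s shows the bell-shaped curve returning towards zero and the plateau approaching 2/Rg², respectively.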

slide-49
SLIDE 49

SAXS patterns of globular and flexible proteins

[Figure: SAXS patterns of natively unfolded, globular, and multidomain (with flexible linkers) proteins]

slide-50
SLIDE 50

Summary of model-independent information

  • I(0)/c, i.e. molecular mass (from Guinier plot or p(r) function)
  • Radius of gyration Rg (from Guinier plot or p(r) function)
  • Radii of gyration of thickness or cross-section (anisometric particles)
  • Maximum particle size Dmax (from p(r) function)
  • Particle volume V (from I(0) and Porod invariant)
  • Globular or unfolded (from Kratky plot)

slide-51
SLIDE 51

Thank you!