Automated data analysis on ESRF BM29, Martha Brennich (EMBL Grenoble)



slide-1
SLIDE 1

Martha Brennich (EMBL Grenoble) Automated data analysis on ESRF BM29

slide-2
SLIDE 2

Idealized bio-SAS experiment

Solution Scattering Data from Protein of Interest Black Box Neutron source/beamline homesource

slide-3
SLIDE 3

What can we learn from BioSAXS?

  • Low-resolution structural information – shape, overall fold
  • Mean molecular weight, oligomeric state
  • Mixing ratios
  • Model validation
  • Domain placement
  • Complex structures
  • Ab-initio models

?

slide-4
SLIDE 4
  • Dedicated solution scattering beamline
  • Optimized for macromolecules (4 kDa to 1 MDa)
  • Many “non-expert” users, short visits

slide-5
SLIDE 5

Automated sample handling

slide-6
SLIDE 6

x-rays

sample changer

capillary 3 m detector

Inline HPLC

slide-7
SLIDE 7

sample changer

slide-8
SLIDE 8

Automated data acquisition

About 3 minutes per buffer/sample/buffer set

Actual acquisition rate: 10 frames/minute

slide-9
SLIDE 9

ISPyB: Prepare your acquisition from anywhere!

ISPyB: Information System for Protein CrystallographY Beamlines

slide-10
SLIDE 10

Data Processing - EDNA

Pipeline steps: pyFAI → Select → Average → Subtract → autorg → datgnom → dammif → damaver → dammin (frame counts shown on the slide: 2 x 10 and 10 going in, 1 after averaging)

slide-11
SLIDE 11

Data Processing - EDNA

Image Processing
  • Radial integration (pyFAI)
  • Frame merging and radiation damage detection
  • Output: 1D curve

1D data reduction
  • Compare buffers to determine the "best"
  • Subtract the "best" buffer from the protein curve
  • Output: protein curve

Curve reduction
  • Group all protein curves from the same construct
  • Compare curves
  • Output: idealized curve, indication of quality (similarity of all curves)

Curve Analysis
  • AutoRg, DATGNOM, DAMMIF
  • Output: ab-initio models, model-independent parameters
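The averaging and buffer subtraction steps above can be sketched in a few lines of Python. This is only an illustration, not the EDNA implementation: the function names `average_curves` and `subtract_buffer` are hypothetical, each curve is assumed to be a list of (intensity, sigma) pairs on a shared q grid, and errors are assumed to be independent between frames.

```python
import math


def average_curves(curves):
    """Average several 1D curves point by point.

    Each curve is a list of (intensity, sigma) tuples on the same q grid.
    """
    n = len(curves)
    averaged = []
    for points in zip(*curves):
        mean_intensity = sum(i for i, _ in points) / n
        # Error of the mean, assuming independent errors between frames.
        sigma = math.sqrt(sum(s * s for _, s in points)) / n
        averaged.append((mean_intensity, sigma))
    return averaged


def subtract_buffer(sample_curve, buffer_curve):
    """Subtract a buffer curve from a sample curve, adding errors in quadrature."""
    return [
        (i_s - i_b, math.sqrt(s_s ** 2 + s_b ** 2))
        for (i_s, s_s), (i_b, s_b) in zip(sample_curve, buffer_curve)
    ]
```

In this sketch the selected buffer frames and the selected sample frames would each be passed through `average_curves` before calling `subtract_buffer`.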

slide-12
SLIDE 12

ISPyB: Data Analysis Overview

slide-13
SLIDE 13

ISPyB: 1D Visualisation

slide-14
SLIDE 14

ISPyB: Model Visualisation

slide-15
SLIDE 15

x-rays

sample changer Inline HPLC

slide-16
SLIDE 16

In-Situ HPLC – increase sample monodispersity

UV cell capillary from GPC MAX pump automatic valve column not controlled by beamline mode valve

slide-17
SLIDE 17

In-situ HPLC – data acquisition

1000 or more single measurements in a dataset

slide-18
SLIDE 18

PROCESSING FOR HPLC

Pipeline steps: 1000 frames → pyFAI → Select → Average → 1 buffer (e.g. frames 1-456) and 544 sample frames → Subtract (544 curves) → autorg (544 curves) → peak finder → 4 peaks → autorg, datgnom, dammif, damaver, dammin (4 curves)

slide-19
SLIDE 19

AUTOMATED PROCESSING FOR HPLC

Image Processing
  • Radial integration (pyFAI)
  • Merge first frames to create buffer
  • Output: 1D curves

1D data reduction
  • Subtract buffer
  • Determine invariants
  • Output: protein curves

Curve reduction
  • Find peaks
  • Merge curves in peak
  • Output: idealized curves

Curve Analysis
  • AutoRg, DATGNOM, DAMMIF
  • Output: ab-initio models, model-independent parameters
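As an illustration of the "find peaks" step, the sketch below marks contiguous stretches of an elution trace (for example, one summed intensity value per frame) that stay above a threshold. It is a simple stand-in, not the peak finder used at BM29; the name `find_peak_regions` and the `threshold` and `min_length` parameters are assumptions made for this example.

```python
def find_peak_regions(trace, threshold, min_length=3):
    """Return (start, stop) frame index pairs where the trace exceeds threshold.

    stop is exclusive; regions shorter than min_length frames are dropped.
    """
    regions = []
    start = None
    for index, value in enumerate(trace):
        if value > threshold and start is None:
            start = index  # a new region opens here
        elif value <= threshold and start is not None:
            if index - start >= min_length:
                regions.append((start, index))
            start = None  # region closed
    # Close a region that runs until the last frame.
    if start is not None and len(trace) - start >= min_length:
        regions.append((start, len(trace)))
    return regions
```

The frames inside each returned region are the candidates for merging into one curve per peak.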

slide-20
SLIDE 20

HPLC: Real Time feedback

  • Background quality
  • Signal strength
  • Spoiling?

slide-21
SLIDE 21

ISPyB: HPLC overview

slide-22
SLIDE 22
  • Data processing framework
  • Collaboration between ESRF and Diamond
  • Mostly used in macromolecular crystallography
  • Python 2.7 based
  • At BM29 as a TANGO device
  • No direct user interaction: at BM29, the users only need to explicitly provide sample concentrations

EDNA

slide-23
SLIDE 23

3 local machines for online processing, in principle each can do everything

BM29 Data Analysis Hardware

  • Primary processing: XEON 2 core, 3 GHz; nVidia Quadro 4000, 2 GB memory; before 2009
  • Bead modelling: 2 x XEON 4 core, 2.26 GHz; nVidia GeForce GTX 750 Ti, 2 GB memory; 2011
  • HPLC processing: XEON 6 core, 3.40 GHz; nVidia Quadro M2000, 4 GB memory; 2016

slide-24
SLIDE 24
  • Reject radiation damaged data
  • Identify peaks in HPLC mode

Why do we select frames?

[Plot axes: q (nm⁻¹), log[I(q)], time]

slide-25
SLIDE 25
  • Oversampled data; the error bars of each data point are non-ideal (correlated, …)
  • Correlation Map (CORMAP) test, originally proposed by Daniel Franke at EMBL Hamburg
  • Core idea: if two frames come from “the same” sample, the difference between them should be random!
  • Hence the distribution of + and – differences corresponds to a series of coin tosses

How do we select frames?
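The coin-toss picture can be made concrete by computing the longest run of same-sign differences between two frames. The sketch below is a minimal illustration and not the freesas or ATSAS code: the function name `longest_sign_run` is hypothetical, both curves are assumed to share the same q grid, and points where the two intensities are exactly equal are assumed to carry no sign information and are skipped.

```python
def longest_sign_run(curve_a, curve_b):
    """Length of the longest run of same-sign differences between two curves."""
    longest = 0
    current = 0
    previous_sign = 0
    for a, b in zip(curve_a, curve_b):
        diff = a - b
        sign = (diff > 0) - (diff < 0)  # +1, -1 or 0
        if sign == 0:
            continue  # assumption: exact ties are ignored
        if sign == previous_sign:
            current += 1
        else:
            current = 1
            previous_sign = sign
        longest = max(longest, current)
    return longest
```

A long run means that one frame lies systematically above or below the other over a stretch of q, which is unlikely if the two frames differ only by noise.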

slide-26
SLIDE 26

CORMAP II

Mark F. Schilling, The College Mathematics Journal, Vol. 21, No. 3 (May 1990), pp. 196-207

  • Distribution is recursive for the number of coin tosses
  • The longest run is actually pretty short!
  • e.g. at BM29 with 1043 q-bins, the longest run is expected in the range between 7 and 14 points

  • Available in freesas
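The recursion for the longest-run distribution can be sketched as follows. This is an illustrative implementation of the counting argument for a fair coin, not the freesas code; the function names are hypothetical. It counts sequences whose longest run of heads is at most x, then uses the fact that a run of k identical outcomes (heads or tails) corresponds to k - 1 consecutive "same as previous" events in a sequence that is one toss shorter.

```python
from fractions import Fraction


def count_head_runs_at_most(n, x):
    """Count length-n toss sequences whose longest run of heads is <= x."""
    if x < 0:
        return 0
    counts = []
    for m in range(n + 1):
        if m <= x:
            counts.append(2 ** m)  # every sequence qualifies
        else:
            # Start with j heads (0 <= j <= x) followed by a tail.
            counts.append(sum(counts[m - 1 - j] for j in range(x + 1)))
    return counts[n]


def prob_longest_run_at_most(n, x):
    """P(longest run of either heads or tails <= x) in n fair tosses."""
    if n < 1:
        raise ValueError("need at least one toss")
    return Fraction(count_head_runs_at_most(n - 1, x - 1), 2 ** (n - 1))


def longest_run_p_value(n, observed):
    """P(longest run >= observed) in n fair tosses."""
    return 1 - prob_longest_run_at_most(n, observed - 1)
```

With exact fractions the result is free of rounding errors; `float(longest_run_p_value(1043, run))` gives a probability that can be compared against a chosen significance threshold.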
slide-27
SLIDE 27
  • Forward scattering and radius of gyration are useful for identifying concentration effects on the scattering signal
  • But the appropriate data range for the Guinier approximation is sample dependent and a priori unknown
  • Score fits in different regions
  • Originally used ATSAS version
  • Moved to freeSAS implementation for HPLC

AutoRg
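For a single candidate region, the Guinier fit reduces to a straight-line fit of ln I(q) against q², using ln I(q) ≈ ln I(0) − Rg²·q²/3. The sketch below shows that one step only; it is not the ATSAS or freeSAS AutoRg algorithm, and the function name `guinier_fit` and its index arguments are assumptions for illustration. A scoring scheme would repeat this over many regions and rank them, for example by fit quality and by the value of q_max·Rg (a limit of about 1.3 is commonly quoted for globular particles).

```python
import math


def guinier_fit(q, intensity, first, last):
    """Fit ln I = ln I0 - (Rg^2 / 3) q^2 on points first .. last-1.

    Returns (Rg, I0, q_max * Rg) for the chosen region.
    """
    x = [qi * qi for qi in q[first:last]]
    y = [math.log(ii) for ii in intensity[first:last]]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative Guinier slope: no physical Rg here")
    rg = math.sqrt(-3.0 * slope)
    i0 = math.exp(intercept)
    return rg, i0, q[last - 1] * rg
```

This unweighted fit ignores the error bars; a weighted fit would be the natural refinement when the per-point uncertainties are trusted.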

slide-28
SLIDE 28

Beam center - the BM29 way

X, Y; Transmission 3 × 10⁻⁷

slide-29
SLIDE 29

Adam Round, Andrew McCarthy

ACKNOWLEDGMENTS

GRENOBLE: Petra Pernot, Mark Tully, Benoit Maillot, Jérôme Kieffer, Staffan Ohlsson, Matias Guijarro, Antonia Beteva, Alejandro De Maria Antolinos

freesas pipeline