SLIDE 1 Kristina Djinović-Carugo
Advanced use of databases in the hybrid structural research: PDB
Department of Structural and Computational Biology
Max F. Perutz Laboratories University of Vienna
Austria
EMBO Global WS: Structural and biophysical methods for biological macromolecules in solution Singapore, 12th December 2017
SLIDE 2 Structural databases
SLIDE 3
Statistics by method
SLIDE 4 Pathway for generation
SLIDE 5 Pathway
- Experiments: Crystallisation, X-ray diffraction
- Computational: Structure determination, refinement, analysis
SLIDE 6 Why a crystal
- X-ray scattering from a single molecule would be
very weak and could not be detected above the noise level
- A crystal arranges large numbers of molecules in
the same orientation
- Scattered waves can add up in phase and raise the signal
to a measurable level
- Crystal acts as an amplifier
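The amplifier effect can be illustrated with a toy calculation. This is only a sketch: every molecule is assumed to contribute a wave of unit amplitude, and the number of molecules, the random seed and the use of random phases for the unordered case are all illustrative assumptions.

```python
import cmath
import random


def summed_amplitude(phases):
    """Magnitude of the sum of unit-amplitude waves with the given phases (radians)."""
    return abs(sum(cmath.exp(1j * p) for p in phases))


n_molecules = 1000  # hypothetical number of scatterers
rng = random.Random(0)

# Ordered case: all waves share the same phase.
in_phase = summed_amplitude([0.0] * n_molecules)

# Unordered case: each wave gets an arbitrary phase.
random_phase = summed_amplitude(
    [rng.uniform(0.0, 2.0 * cmath.pi) for _ in range(n_molecules)]
)

print(f"in phase: {in_phase:.1f}, random phases: {random_phase:.1f}")
```

In phase, the summed amplitude grows with the number of molecules N, so the intensity grows with N squared; with random phases the sum stays far smaller.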
SLIDE 7 Why crystal
- The waves add up in phase in some directions and have to
cancel out in other directions
- X-rays diffracted from a crystal and detected on a flat 2D-detector
SLIDE 8 Image of molecule: electron density distribution
- Electromagnetic radiation interacts with matter
through its fluctuating electric field
- The result of an X-ray crystallographic experiment is
the distribution of electrons in the molecule
SLIDE 9
Image of molecule – electron density
SLIDE 10
Final result of structure determination (is none of what you see)
SLIDE 11
Final result: atomic 3D coordinates
SLIDE 12
Crystallographic terms
SLIDE 13 Reflection - Intensity
- Intensities of diffracted beams are measured
- Reflections, I – intensities
- √I = |Fo| - structure factor amplitudes
SLIDE 14 Resolution
- RES = ½ [λ/sin(θ)]
- Detail that can be resolved in electron density
maps
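As a quick worked example of the formula above, a minimal sketch; the wavelength and angle are arbitrary illustrative values, and θ is taken to be the Bragg angle in degrees.

```python
import math


def resolution(wavelength_angstrom, theta_deg):
    """RES = 0.5 * lambda / sin(theta), theta being the Bragg angle in degrees."""
    return 0.5 * wavelength_angstrom / math.sin(math.radians(theta_deg))


# Illustrative values only: 1.0 A radiation, reflections at increasing angle.
for theta in (5.0, 15.0, 30.0):
    print(f"theta = {theta:4.1f} deg -> RES = {resolution(1.0, theta):.2f} A")
```

Reflections recorded at larger angles correspond to smaller RES values, i.e. finer detail in the map.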
SLIDE 15 Resolution of electron density map and consequently of the 3D model
1.0 Å 2.5 Å 3.0 Å 4.0 Å
SLIDE 16 Resolution of electron density map
- At 1 Å resolution individual atoms can be fitted
SLIDE 17 Resolution of electron density map
- Alpha-helices are clear at 6 Å resolution, but beta-strands are not.
- At lower resolutions than about 8 Å, only whole molecules can be
placed.
SLIDE 18
RES in PDB
SLIDE 19 Rfactor
- |Fo| - from measured diffraction intensity
- Experimental data
- |Fc| - calculated from coordinates
- Global measure of agreement between experiment and
the model
- Surface residues can be less recognizable
$$R = \frac{\sum_h \left|\, |F_{obs}|_h - |F_{calc}|_h \,\right|}{\sum_h |F_{obs}|_h}$$
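A minimal sketch of the R-factor as a function of two lists of amplitudes. The numbers are made up for illustration, and scaling between |Fobs| and |Fcalc| is assumed to have been done already.

```python
def r_factor(f_obs, f_calc):
    """R = sum | |Fobs| - |Fcalc| | / sum |Fobs| over all reflections."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Hypothetical amplitudes for four reflections.
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [90.0, 55.0, 30.0, 20.0]

print(f"R = {r_factor(f_obs, f_calc):.3f}")
```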
SLIDE 20 Rfree
- Calculated for 5-10% of reflections not used in
refinement
- How well the model agrees with the data that it
has not been fit to
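$$R_{free} = \frac{\sum_{h \in T} \left|\, |F_{obs}|_h - |F_{calc}|_h \,\right|}{\sum_{h \in T} |F_{obs}|_h}$$
where T is the set of reflections not used in refinement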
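The bookkeeping behind Rfree can be sketched as follows. This is not a refinement: the amplitudes are random illustrative numbers, and the function names, the 5% fraction and the seeds are assumptions. It only shows how a test set is set aside and how R is evaluated separately on the two sets.

```python
import random


def r_factor(f_obs, f_calc, indices):
    """R evaluated only over the reflections listed in `indices`."""
    numerator = sum(abs(abs(f_obs[i]) - abs(f_calc[i])) for i in indices)
    denominator = sum(abs(f_obs[i]) for i in indices)
    return numerator / denominator


def pick_test_set(n_reflections, fraction=0.05, seed=0):
    """Randomly flag a fraction of reflections to be excluded from refinement."""
    rng = random.Random(seed)
    n_test = max(1, round(n_reflections * fraction))
    return set(rng.sample(range(n_reflections), n_test))


# Hypothetical data: 200 reflections, model amplitudes within 20% of observed.
rng = random.Random(1)
f_obs = [rng.uniform(10.0, 100.0) for _ in range(200)]
f_calc = [fo * rng.uniform(0.8, 1.2) for fo in f_obs]

test_set = pick_test_set(len(f_obs), fraction=0.05)
work_set = set(range(len(f_obs))) - test_set

r_work = r_factor(f_obs, f_calc, work_set)
r_free = r_factor(f_obs, f_calc, test_set)
print(f"R(work) = {r_work:.3f}, Rfree = {r_free:.3f}")
```

In this toy case nothing is fitted, so the two values differ only by sampling noise. In a real refinement the model is adjusted against the work set only, which is why Rfree is the more honest number.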
SLIDE 21 R/Rfree
- For most structures refined at RES=2.5 Å, R is
less than 0.2 and Rfree less than 0.25
SLIDE 22
Rfree in PDB
SLIDE 23 RES and Rfactor for ranking
- Ranking of quality of structures
- Higher RES and lower R are associated with
higher Q
$$Q = \frac{1}{RES} - R$$
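A short sketch of ranking by Q, assuming Q = 1/RES − R with RES in Å. The entry names and values are hypothetical.

```python
def quality(res, r):
    """Q = 1/RES - R; larger Q means a better ranked structure."""
    return 1.0 / res - r


# Hypothetical entries: (label, RES in A, R-factor).
entries = [("A", 2.0, 0.20), ("B", 1.0, 0.15), ("C", 3.0, 0.25)]

ranked = sorted(entries, key=lambda e: quality(e[1], e[2]), reverse=True)
for label, res, r in ranked:
    print(f"{label}: RES = {res:.1f} A, R = {r:.2f}, Q = {quality(res, r):.3f}")
```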
SLIDE 24 Thermal motion
- B iso = isotropic thermal factor
- B iso = 8π²⟨u²⟩
- u = displacement of the atom from its mean position;
⟨u²⟩ is its mean square value
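To get a feeling for the numbers, the relation B iso = 8π²⟨u²⟩ can be inverted to give a root-mean-square displacement. A minimal sketch; the B values are arbitrary examples.

```python
import math


def mean_square_displacement(b_iso):
    """<u^2> in A^2 from B iso in A^2, using B = 8 * pi^2 * <u^2>."""
    return b_iso / (8.0 * math.pi ** 2)


def rms_displacement(b_iso):
    """Root-mean-square displacement in A."""
    return math.sqrt(mean_square_displacement(b_iso))


for b in (10.0, 20.0, 60.0):
    print(f"B = {b:5.1f} A^2 -> rms displacement = {rms_displacement(b):.2f} A")
```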
SLIDE 25
Thermal motion
- B iso: 3 coordinates + 1 B per atom
- B aniso: 3 coordinates + 6 B per atom
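The parameter count per atom translates directly into the size of the refinement problem. A small arithmetic sketch; the atom count is a hypothetical example.

```python
def n_parameters(n_atoms, anisotropic=False):
    """3 coordinates plus 1 (isotropic) or 6 (anisotropic) B parameters per atom."""
    per_atom = 3 + (6 if anisotropic else 1)
    return n_atoms * per_atom


n_atoms = 1000  # hypothetical model size
print("isotropic:  ", n_parameters(n_atoms))
print("anisotropic:", n_parameters(n_atoms, anisotropic=True))
```

The anisotropic model needs more than twice as many parameters, so it requires correspondingly more observations, which in practice means high RES data.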
SLIDE 26
Thermal motion
SLIDE 27 Atomic displacement parameters (B)
- Bmain < Bside
- B factors absorb lattice defects, large scale
movements, disorder
SLIDE 28 Atomic displacement parameters (B)
- B factors absorb lattice defects, large scale movements,
disorder
- Inform on function
- Access to internal cavities, substrate channels
- Correlation between thermal stability and thermal
motions/flexibility
SLIDE 29 T of experiment
- X-tal structures @ T = 100 K
- NMR @ RT
SLIDE 30 T of experiment
- At 160 K – 200 K proteins undergo a phase
transition
- Conformational disorder goes from dynamic to
static
- Lower T reduces conformational distribution of
sidechains:
- → smaller and more packed and unique modes
SLIDE 31 Flowchart/Crystallographic terms
[Flowchart of structure determination with the associated terms]
- Protein solution
- Crystals: a, b, c, α, β, γ, symmetry, solvent content, # mol./a.u.
- h, k, l, |FP|
- Heavy atom derivative: h, k, l, |FPH|
- Electron density ρ(x,y,z), αP
- Structure (xi,yi,zi,Bi): h, k, l, |Fcalc|, αcalc
- |Fo| and |Fcalc|: R-factor
- Publication, …
SLIDE 32
PDB statistics – depositions per year and cumulative growth
SLIDE 33
Folds
SLIDE 34 Redundancy of PDB
- 135787 Biological Macromolecular Structures
- 12.12.2017
SLIDE 35 Why redundancy…
- Cover a limited space of biological
macromolecular universe
- Same or similar proteins, e.g.
- Lysozyme > 500 entries
- Membrane proteins: 2-3% of PDB entries
- but 15% - 35% of the human proteome
- Intrinsically disordered proteins
SLIDE 36 Non redundant PDB subsets
- PDBselect: reject proteins with aa sequence
identity > threshold
- http://bioinfo.tg.fh-giessen.de/pdbselect/
- PISCES: download precompiled datasets
- dunbrack.fccc.edu/PISCES.php
- “Advanced search utility” in PDB
- Skip-Redundant of EMBOSS
- Cd-hit
- http://weizhongli-lab.org/cd-hit/
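The idea behind these tools (reject proteins whose sequence identity to an already accepted protein exceeds a threshold) can be sketched as a greedy filter. This is not how any of the listed programs is implemented: the `identity` function below is a crude ungapped stand-in for a real alignment, and the sequences, names and threshold are hypothetical.

```python
def identity(seq_a, seq_b):
    """Crude stand-in for alignment-based identity: fraction of matching
    positions over the shorter sequence, compared without gaps."""
    n = min(len(seq_a), len(seq_b))
    if n == 0:
        return 0.0
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / n


def non_redundant(sequences, threshold=0.25):
    """Greedy selection: visit sequences longest first and keep one only if
    its identity to every sequence kept so far does not exceed the threshold."""
    kept = []
    ordered = sorted(sequences.items(), key=lambda item: len(item[1]), reverse=True)
    for name, seq in ordered:
        if all(identity(seq, sequences[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical sequences: p2 differs from p1 in a single position.
sequences = {
    "p1": "MKTAYIAKQR",
    "p2": "MKTAYIAKQK",
    "p3": "GSHMLEDPVA",
}
print(non_redundant(sequences, threshold=0.5))
```

A real selection would typically also decide which member of a redundant group to keep, for example the one with the best RES and R.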
SLIDE 37 … but sequence is not all
- Same sequence can adopt 2 different structures
- e.g.: Calmodulin
- Procedure that also takes topology into account
- Bioinformatics (2008), 24, 2632
SLIDE 38
CHECK FIGURES OF MERIT, STEREOCHEMISTRY
SLIDE 39 Missing residues
- Interpretation of electron density (ρ) allows
positioning of atoms
- Sometimes ρ is elusive and thus positioning of atoms
uncertain or impossible
SLIDE 40 Treatments of invisible residues/atoms
- Omit the atoms/aa residues
- Amino acid residues ‘torsos’
- … molecular graphics software does not always warn
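Since the graphics program may stay silent, omitted residues can be looked for directly. A minimal sketch that reports gaps in the residue numbering of one chain; the input list is hypothetical, and a numbering gap is only a hint, because numbering schemes with insertions or deliberate jumps exist.

```python
def numbering_gaps(residue_numbers):
    """Return (first_missing, last_missing) for every break in the numbering."""
    ordered = sorted(set(residue_numbers))
    return [(a + 1, b - 1) for a, b in zip(ordered, ordered[1:]) if b - a > 1]


# Hypothetical residue numbers observed in one chain of a model.
observed = [1, 2, 3, 7, 8, 10]
for first, last in numbering_gaps(observed):
    print(f"residues {first}-{last} are not in the model")
```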
SLIDE 41
Example
“Torso”
SLIDE 42 Treatments of invisible residues/atoms
- Leave the atom in the model → large B
factors
- Easily visualized by molecular graphics
SLIDE 43 Treatments of invisible residues/atoms
- Leave the atom in the model with occupancy 0
- No alert by molecular graphics
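Both treatments (large B factors, occupancy 0) can be detected by reading the coordinate file itself. A sketch that scans fixed-column PDB-format ATOM/HETATM records; the B threshold of 80 Å² is an arbitrary illustrative choice, and `make_atom_line` is a hypothetical helper used only to build demo records.

```python
def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z, occ, b):
    """Hypothetical helper: build a fixed-column ATOM record for the demo."""
    template = ("{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   "
                "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}")
    return template.format("ATOM", serial, name, "", res_name, chain,
                           res_seq, "", x, y, z, occ, b)


def flag_atoms(lines, b_threshold=80.0):
    """List atoms with zero occupancy or with B above the chosen threshold."""
    flagged = []
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16].strip()      # columns 13-16: atom name
        chain = line[21]                # column 22: chain identifier
        res_seq = int(line[22:26])      # columns 23-26: residue number
        occupancy = float(line[54:60])  # columns 55-60: occupancy
        b_factor = float(line[60:66])   # columns 61-66: B factor
        if occupancy == 0.0:
            flagged.append((chain, res_seq, name, "zero occupancy"))
        elif b_factor > b_threshold:
            flagged.append((chain, res_seq, name, "high B"))
    return flagged


# Hypothetical records for one lysine side chain.
lines = [
    make_atom_line(1, "N", "LYS", "A", 43, 11.1, 12.2, 13.3, 1.00, 25.0),
    make_atom_line(2, "CA", "LYS", "A", 43, 12.1, 13.2, 14.3, 1.00, 95.5),
    make_atom_line(3, "NZ", "LYS", "A", 43, 13.1, 14.2, 15.3, 0.00, 30.0),
]
for item in flag_atoms(lines):
    print(item)
```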
SLIDE 44 Occurrence of invisible residues/atoms
- 20% of structures at atomic RES contain
invisible residues
- 80% of structures at 1.5 Å RES contain
invisible residues
- At atomic RES 2-3% residues are invisible
- At 2.0 Å RES 7% residues are invisible
- At 3.0 Å RES 10% residues are invisible
SLIDE 45 Reasons for invisible residues/atoms
- Proteolysis Ltd
- Quality/quantity of diffraction data
- Conformational disorder
SLIDE 46 …caveat
- Invisible residues are often on surfaces
- Caution if surface properties are investigated:
- Electrostatic potential
K43
SLIDE 47 …caveat
- Invisible residues are often on surfaces
- Caution if surface properties are investigated:
- Electrostatic potential calculation is affected!
“Torso”
SLIDE 48 Conformational disorder / Occupancy
- Residues/atoms do not occupy the same
position in all molecules in the crystal/ensemble
- → weak electron density → invisible
- At medium/high RES observe multiple
conformations
- Static disorder
- Dynamic disorder
SLIDE 49 Static/dynamic disorder
- Static: two or more conformations
exist
- Dynamic (at higher T): shuffling
from one conformation to another
- Crystal structure determination
gives time and space averaged structural information
- → cannot distinguish between
static and dynamic disorder
SLIDE 50 Alternative conformations
The sum of the occupancies of both conformations is 1
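This rule is easy to verify for a given model. A minimal sketch, assuming the atoms have already been read into tuples of (chain, residue number, atom name, alternate location label, occupancy); the data and the tolerance are hypothetical.

```python
import math
from collections import defaultdict


def occupancy_sums(atoms):
    """Sum the occupancies of all alternative positions of each atom."""
    totals = defaultdict(float)
    for chain, res_seq, name, alt_loc, occupancy in atoms:
        totals[(chain, res_seq, name)] += occupancy
    return dict(totals)


def inconsistent_atoms(atoms, tol=0.01):
    """Atoms whose alternative conformations do not add up to occupancy 1."""
    return [key for key, total in occupancy_sums(atoms).items()
            if not math.isclose(total, 1.0, abs_tol=tol)]


# Hypothetical atoms, each modelled in two conformations (A and B).
atoms = [
    ("A", 10, "CB", "A", 0.6), ("A", 10, "CB", "B", 0.4),
    ("A", 22, "OG", "A", 0.7), ("A", 22, "OG", "B", 0.5),
]
print(inconsistent_atoms(atoms))
```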
SLIDE 51 Disorder continued…
- Occupancy of atoms/residues is < 1
- Ligand is not bound to a fraction of molecules in
the crystal
- Weak binding, suboptimal binding conditions
- Misplaced ligand
- Partial disorder
- X-ray induced radiation damage
- loss of carboxylates, methyl groups…
SLIDE 52 Partial disorder of the ligand
- Weichenberger et al.
- Volume 73 | Part 3 | March 2017 | Pages 211–222 | 10.1107/S205979831601620X
SLIDE 53 Radiation damage at work
- Garman
- Volume 66 | Part 4 | April 2010 | Pages 339–351 | 10.1107/S0907444910008656
SLIDE 54 …caveat
- Molecular graphics shows all alternative
conformations
- Surface properties calculation with all
conformations!
SLIDE 55 Stereochemistry
- Is the protein stereo-chemically sound?
- Covalent distances, angles, torsion angles,
backbone conformation, group planarity, chirality, H-bonds, electrostatic interactions
SLIDE 56
Ramachandran plot
SLIDE 57 Ideal stereochemical parameters
Peptide bond    Average      Single bond    Average
Cα - C          1.53 Å       C - C          1.54 Å
C - N           1.33 Å       C - N          1.48 Å
N - Cα          1.46 Å       C - O          1.43 Å

Hydrogen bond   Average (±0.3)
O-H --- O-H     2.8 Å
N-H --- O=C     2.9 Å
O-H --- O=C     2.8 Å

Rms deviations from ideal geometry:
- bond lengths: 0.01 - 0.02 Å
- bond angles: 1.2 - 1.5 deg
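The rms deviation from ideal geometry quoted above can be computed as follows. A sketch using the peptide bond averages from this slide as ideal values; the observed bond lengths are made-up numbers.

```python
import math

# Ideal peptide bond lengths in A, taken from the table above.
IDEAL = {"CA-C": 1.53, "C-N": 1.33, "N-CA": 1.46}


def rms_deviation(observed):
    """Root-mean-square deviation of observed bond lengths from their ideal values."""
    squared = [(length - IDEAL[bond]) ** 2 for bond, length in observed]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical bond lengths measured in a model.
observed = [("CA-C", 1.54), ("C-N", 1.32), ("N-CA", 1.47), ("CA-C", 1.52)]
print(f"rms deviation of bond lengths = {rms_deviation(observed):.3f} A")
```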
SLIDE 58 Tools to check stereochemistry
- Procheck
- What_Check
- MolProbity
- ProSA
- PDB – validation protocol, which examines also
fit of experimental data
SLIDE 59
PDB – validation report
SLIDE 60
[Percentile scale of the validation report: 10 to 100, from WORSE to BETTER]
- 15% of the PDB files of similar resolution are worse than this one
- 35% of the PDB files are worse than this one (and 65% are better).
READ the REPORT
SLIDE 61 Re-refined structures
- Database of automatically re-refined structures
SLIDE 62
Carugo O, & Djinovic Carugo, K. Criteria to extract high quality protein databank subsets from PDB. Methods Mol Biol 2016.
And references therein