Geometric arrangement algorithms for protein structure determination - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Bruce Donald Laboratory

Jeff Martin

Geometric arrangement algorithms for protein structure determination

http://www.cs.duke.edu/donaldlab/

slide-2
SLIDE 2

Protein structure determination

Protein Synthesis

slide-3
SLIDE 3

Protein structure determination

Protein Synthesis

Primary sequence

ANNTTGFTRIIKAAGYSWKGLRAAWINEAAF RQEGVAVLLAVVIACWLDVDAITRVLLISSV MLVMIVEILNSAIEAVVDRIGSEYHELSGRAK DMGSAAVLIAIIVAVITWCILLWSHFG

slide-4
SLIDE 4

Protein structure determination

Protein Synthesis Protein Folding

Primary sequence

ANNTTGFTRIIKAAGYSWKGLRAAWINEAAF RQEGVAVLLAVVIACWLDVDAITRVLLISSV MLVMIVEILNSAIEAVVDRIGSEYHELSGRAK DMGSAAVLIAIIVAVITWCILLWSHFG

slide-5
SLIDE 5

Protein structure determination

Protein Synthesis Protein Folding

Primary sequence

Experimental measurements

ANNTTGFTRIIKAAGYSWKGLRAAWINEAAF RQEGVAVLLAVVIACWLDVDAITRVLLISSV MLVMIVEILNSAIEAVVDRIGSEYHELSGRAK DMGSAAVLIAIIVAVITWCILLWSHFG

slide-6
SLIDE 6

Why solve protein structures?

Structure determines function! Hence, studying structure helps us understand function.

slide-7
SLIDE 7

How is protein function important?

Disease mechanisms

MPER: HIV surface protein

P Reardon, et al., (in preparation)

slide-8
SLIDE 8

How is protein function important?

Disease mechanisms

MPER: HIV surface protein

T Zhou, I Georgiev, et al., Science, 2010

VRC01: HIV antibody

P Reardon, et al., (in preparation)

slide-9
SLIDE 9

How is protein function important?

New drugs Disease mechanisms

MPER: HIV surface protein

VRC01: HIV antibody

GrsA PheA: helps manufacture antibiotics

CY Chen, I Georgiev, et al., PNAS, 2009
T Zhou, I Georgiev, et al., Science, 2010
P Reardon, et al., (in preparation)

slide-10
SLIDE 10

One protein function: a molecular assembly line

The domains have specific tasks, but how do they work?

Protein Antibiotic

slide-11
SLIDE 11

One protein function: a molecular assembly line

Solving the structure for a domain shows us how it works.

Protein Antibiotic

slide-12
SLIDE 12

One protein function: a molecular assembly line

If we modify a domain, the protein could perform a new function.

Protein Antibiotic

slide-13
SLIDE 13

Protein redesign relies on protein structure

Cheng-Yu Chen, Ivelin Georgiev, Amy C. Anderson, Bruce R. Donald, Computational structure-based redesign of enzyme activity, PNAS 2009

Switched specificity from Phenylalanine to Leucine!

Perhaps we can engineer molecules to make new drugs?

Predicted binding model for Leucine

Redesign!

slide-14
SLIDE 14

Experimental methods

Nuclear Magnetic Resonance spectroscopy (NMR)

Duke NMR Center Crystals of Insulin

X-ray Crystallography

For atomic-precision protein structure determination

NMR is often preferred because measurements are in solution state.

slide-15
SLIDE 15

How are structures traditionally solved by NMR?

  • 13Cα chemical shifts
  • 13Cβ chemical shifts
  • 13CO chemical shifts
  • 1Hα chemical shifts
  • 1HN chemical shifts
  • 3JHNHα couplings
  • 1H-15N RDCs
  • 15N R2
  • Nitroxide spin-label PREs
  • ATCUN PREs
  • NOEs
  • O2-induced 13C paramagnetic shifts
  • Tryptophan indole solvent accessibility

Experimental Restraints

slide-16
SLIDE 16

Geometric restraints on protein structure

Restraints on distance:

  • Nuclear Overhauser Effect (NOE)
  • Paramagnetic Relaxation Enhancement (PRE)

  • 13Cα chemical shifts
  • 13Cβ chemical shifts
  • 13CO chemical shifts
  • 1Hα chemical shifts
  • 1HN chemical shifts
  • 3JHNHα couplings
  • 1H-15N RDCs
  • 15N R2
  • Nitroxide spin-label PREs
  • ATCUN PREs
  • NOEs
  • O2-induced 13C paramagnetic shifts
  • Tryptophan indole solvent accessibility

Experimental Restraints

slide-17
SLIDE 17

Geometric restraints on protein structure

Restraints on orientation:

  • Residual Dipolar Couplings (RDCs)

  • 13Cα chemical shifts
  • 13Cβ chemical shifts
  • 13CO chemical shifts
  • 1Hα chemical shifts
  • 1HN chemical shifts
  • 3JHNHα couplings
  • 1H-15N RDCs
  • 15N R2
  • Nitroxide spin-label PREs
  • ATCUN PREs
  • NOEs
  • O2-induced 13C paramagnetic shifts
  • Tryptophan indole solvent accessibility

Experimental Restraints

slide-19
SLIDE 19

How are structures traditionally solved by NMR?

  • 13Cα chemical shifts
  • 13Cβ chemical shifts
  • 13CO chemical shifts
  • 1Hα chemical shifts
  • 1HN chemical shifts
  • 3JHNHα couplings
  • 1H-15N RDCs
  • 15N R2
  • Nitroxide spin-label PREs
  • ATCUN PREs
  • NOEs
  • O2-induced 13C paramagnetic shifts
  • Tryptophan indole solvent accessibility

Experimental Restraints

Simulated Annealing: molecular dynamics simulation and energy minimization

slide-20
SLIDE 20

Structure determination of protein complexes

traditionally uses simulated annealing as well

slide-21
SLIDE 21

Simulated annealing is based on heuristics

Stochastic search

slide-22
SLIDE 22

Simulated annealing is based on heuristics

  • Stochastic search
  • Simulation & Minimization

slide-23
SLIDE 23

Simulated annealing is based on heuristics

  • Stochastic search
  • Simulation & Minimization
  • Convergence not guaranteed

slide-24
SLIDE 24

Protein complexes are composed of subunits

  • Homodimers have 2 identical subunits
  • Homo-oligomers have n identical subunits

slide-25
SLIDE 25

Structure determination of protein complexes

using divide and conquer instead

slide-26
SLIDE 26

Divide and conquer

de novo structure determination

slide-27
SLIDE 27

Divide and conquer

  • de novo structure determination
  • Oligomeric assembly

slide-28
SLIDE 28

Related work in structure determination by solution NMR

Chris Bailey-Kellogg, et al., 2000
Wang and Donald, 2004
Wang, Mettu, and Donald, 2006
Zeng, Tripathy, Zhou, and Donald, 2008
Zeng, et al., 2009
Zeng, Zhou, and Donald, 2011
Zeng, Roberts, Zhou, Donald, 2011
Tripathy, Zeng, Zhou and Donald, 2011

Nilges, 1993
Nilges, 1995
Nilges, et al., 1997
Meiler, et al., 2000
Fowler, et al., 2000
Tian, Valafar, and Prestegard, 2001
Herrmann, Güntert, and Wüthrich, 2002
Wedemeyer, Rohl, and Scheraga, 2002
Rieping, et al., 2007
Bardiaux, et al., 2009

Heuristic

de novo determination

Provable

Potluri, et al., 2006
Potluri, et al., 2007
Martin, Yan, Bailey-Kellogg, Zhou, and Donald, Protein Science, 2011
Martin, Yan, Bailey-Kellogg, Zhou, and Donald, J Comp Bio, 2011

Oligomeric assembly

Wang, Lozano-Pérez, and Tidor, 1998
Wang, Bansal, Jiang, and Prestegard, 2008

* polynomial time algorithms

slide-29
SLIDE 29

Full citations for Donald lab work

Bailey-Kellogg C, Widge A, Kelley JJ, Berardi MJ, Bushweller JH, Donald BR. The NOESY jigsaw: automated protein secondary structure and main-chain assignment from sparse, unassigned NMR data. J Comput Biol. 2000;7(3-4):537-58. PMID: 11108478.

Martin JW, Yan AK, Bailey-Kellogg C, Zhou P, Donald BR. A geometric arrangement algorithm for structure determination of symmetric protein homo-oligomers from NOEs and RDCs. J Comput Biol. 2011 Nov;18(11):1507-23. PMID: 22035328.

Martin JW, Yan AK, Bailey-Kellogg C, Zhou P, Donald BR. A graphical method for analyzing distance restraints using residual dipolar couplings for structure determination of symmetric protein homo-oligomers. Protein Sci. 2011 Jun;20(6):970-85. doi: 10.1002/pro.620. PMID: 21413097.

Donald BR, Martin J. Automated NMR Assignment and Protein Structure Determination using Sparse Dipolar Coupling Constraints. Prog Nucl Magn Reson Spectrosc. 2009 Aug 1;55(2):101-127. PMID: 20160991.

Wang L, Donald BR. Exact solutions for internuclear vectors and backbone dihedral angles from NH residual dipolar couplings in two media, and their application in a systematic search algorithm for determining protein backbone structure. J Biomol NMR. 2004 Jul;29(3):223-42. PMID: 15213422.

Wang L, Mettu RR, Donald BR. A polynomial-time algorithm for de novo protein backbone structure determination from nuclear magnetic resonance data. J Comput Biol. 2006 Sep;13(7):1267-88. PMID: 17037958.

Zeng J, Tripathy C, Zhou P, Donald BR. A Hausdorff-based NOE assignment algorithm using protein backbone determined from residual dipolar couplings and rotamer patterns. Comput Syst Bioinformatics Conf. 2008;7:169-81. PMID: 19642278.

Zeng J, Boyles J, Tripathy C, Wang L, Yan A, Zhou P, Donald BR. High-resolution protein structure determination starting with a global fold calculated from exact solutions to the RDC equations. J Biomol NMR. 2009 Nov;45(3):265-81. Epub 2009 Aug 27. PMID: 19711185.

Zeng J, Zhou P, Donald BR. Protein side-chain resonance assignment and NOE assignment using RDC-defined backbones without TOCSY data. J Biomol NMR. 2011 Aug;50(4):371-95. Epub 2011 Jun 25. PMID: 21706248.

Zeng J, Roberts KE, Zhou P, Donald BR. A Bayesian approach for determining protein side-chain rotamer conformations using unassigned NOE data. J Comput Biol. 2011 Nov;18(11):1661-79. Epub 2011 Oct 4. PMID: 21970619.

Tripathy C, Zeng J, Zhou P, Donald BR. Protein loop closure using orientational restraints from NMR data. Proteins. 2011 Sep 26. doi: 10.1002/prot.23207. PMID: 22161780.

Potluri S, Yan AK, Chou JJ, Donald BR, Bailey-Kellogg C. Structure determination of symmetric homo-oligomers by a complete search of symmetry configuration space, using NMR restraints and van der Waals packing. Proteins. 2006 Oct 1;65(1):203-19. PMID: 16897780.

Potluri S, Yan AK, Donald BR, Bailey-Kellogg C. A complete algorithm to resolve ambiguity for intersubunit NOE assignment in structure determination of symmetric homo-oligomers. Protein Sci. 2007 Jan;16(1):69-81. PMID: 17192589.

slide-30
SLIDE 30

Protein complexes tend to be symmetric

50–70% of known complexes are symmetric homo-oligomers

Levy, et al. Assembly reflects evolution of protein complexes. Nature, 453(7199):1262–1265, June 2008.

slide-32
SLIDE 32

Protein oligomers with cyclic symmetry

cyclic symmetry (C5) (C3)

slide-33
SLIDE 33

Benefits of symmetry

cyclic symmetry (C5) (C3)

If the subunit structure is known, the oligomer structure is completely specified by the symmetry axis.

Just need the axis orientation and position relative to the subunit.
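
The claim above can be illustrated with a minimal sketch, assuming NumPy and assuming the subunit is given as an (N, 3) coordinate array. The function names and arguments here are hypothetical and are not taken from DISCO: each of the n subunits is obtained by rotating the known subunit by a multiple of 2π/n about the symmetry axis.

```python
import numpy as np

def rotation_about_axis(direction, angle):
    """Rodrigues' formula: 3x3 matrix rotating by `angle` about `direction`."""
    u = np.asarray(direction, dtype=float)
    u = u / np.linalg.norm(u)
    K = np.array([[0.0, -u[2], u[1]],
                  [u[2], 0.0, -u[0]],
                  [-u[1], u[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def assemble_cyclic_oligomer(subunit_xyz, axis_point, axis_direction, n):
    """Return n copies of the subunit related by C_n symmetry about the axis."""
    subunit_xyz = np.asarray(subunit_xyz, dtype=float)
    axis_point = np.asarray(axis_point, dtype=float)
    copies = []
    for k in range(n):
        R = rotation_about_axis(axis_direction, 2.0 * np.pi * k / n)
        # Rotate about the axis line: shift to the axis, rotate, shift back.
        copies.append((subunit_xyz - axis_point) @ R.T + axis_point)
    return copies
```

For a trimer (C3), `assemble_cyclic_oligomer(xyz, point, direction, 3)` returns the original subunit plus its images rotated by 120° and 240°.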

slide-34
SLIDE 34

Oligomeric assembly using DISCO

  • Compute symmetry axis orientation
  • Compute symmetry axis position
  • Assemble oligomer using the symmetry

slide-36
SLIDE 36

Geometric restraints on protein structure

Restraints on orientation:

  • Residual Dipolar Couplings (RDCs)

Principal order frame

slide-38
SLIDE 38

Geometry of RDC curves

  • RDC measurement (scalar)
  • Bond orientation (vector)
  • Alignment tensor (matrix)

slide-39
SLIDE 39

Geometry of RDC curves

  • RDC measurement (scalar)
  • Bond orientation (vector)
  • Alignment tensor (matrix)
  • Rotation
  • Scaling

slide-40
SLIDE 40

Geometry of RDC curves

  • RDC measurement (scalar)
  • Bond orientation (vector)
  • Alignment tensor (matrix)
  • Rotation
  • Scaling

After rotation:

slide-42
SLIDE 42

Geometry of RDC curves

  • RDC measurement (scalar)
  • Bond orientation (vector)
  • Alignment tensor (matrix)
  • Rotation
  • Scaling

After rotation: Quadric surface!
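
The equation these labels annotate did not survive in the transcript. As a hedged reconstruction in the standard RDC formulation (the symbols r, D_max, v, S, R and Σ are assumed notation, not taken from the slide), the scalar RDC is a quadratic form of the unit bond vector in the alignment tensor. Diagonalizing the tensor into a rotation and a scaling gives the quadric:

```latex
\begin{align}
  r &= D_{\max}\, \mathbf{v}^{\mathsf T} S\, \mathbf{v}, \qquad \lVert \mathbf{v} \rVert = 1
    && \text{(scalar, vector, matrix)} \\
  S &= R\, \Sigma\, R^{\mathsf T}, \qquad \Sigma = \operatorname{diag}(S_{xx}, S_{yy}, S_{zz})
    && \text{(rotation, scaling)} \\
  r &= D_{\max}\left(S_{xx}\,x^2 + S_{yy}\,y^2 + S_{zz}\,z^2\right),
    \qquad (x, y, z)^{\mathsf T} = R^{\mathsf T}\mathbf{v}
    && \text{(after rotation)}
\end{align}
```

For a fixed measurement r, the last line is a quadric surface in (x, y, z). Intersecting it with the unit sphere x² + y² + z² = 1 gives the curves of bond orientations consistent with that RDC.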

slide-43
SLIDE 43

Solving for symmetry axis orientation

Molecular frame

Principal order frame (defined by the alignment tensor)

slide-44
SLIDE 44

Solving for symmetry axis orientation

Molecular frame

The alignment tensor is unknown

Principal order frame (defined by the alignment tensor)

slide-45
SLIDE 45

Solving for symmetry axis orientation

Molecular frame

The alignment tensor is unknown

Principal order frame (defined by the alignment tensor)

Can solve for the alignment tensor (and thus the principal order frame) with at least 5 RDCs

slide-46
SLIDE 46

Solving for symmetry axis orientation

Molecular frame

Principal order frame

z-axis of alignment tensor is parallel to the symmetry axis

The alignment tensor is unknown

Can solve for the alignment tensor (and thus the principal order frame) with at least 5 RDCs
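
Why five RDCs suffice can be sketched as follows, assuming NumPy and the standard model r = D_max · vᵀSv with S symmetric and traceless. The function names are hypothetical and this is a generic least-squares fit, not necessarily how DISCO solves the problem. A symmetric traceless 3×3 tensor has five free parameters, so five or more independent RDCs give a solvable linear system, and the principal axis with the largest-magnitude eigenvalue is taken as the z-axis of the principal order frame.

```python
import numpy as np

def fit_alignment_tensor(bond_vectors, rdcs, d_max=1.0):
    """Least-squares fit of symmetric traceless S with r_i = d_max * v_i^T S v_i."""
    v = np.asarray(bond_vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    x, y, z = v[:, 0], v[:, 1], v[:, 2]
    # Unknowns: Sxx, Syy, Sxy, Sxz, Syz (Szz = -Sxx - Syy by tracelessness).
    A = d_max * np.column_stack([x * x - z * z, y * y - z * z,
                                 2 * x * y, 2 * x * z, 2 * y * z])
    params, *_ = np.linalg.lstsq(A, np.asarray(rdcs, dtype=float), rcond=None)
    sxx, syy, sxy, sxz, syz = params
    return np.array([[sxx, sxy, sxz],
                     [sxy, syy, syz],
                     [sxz, syz, -sxx - syy]])

def principal_z_axis(S):
    """Unit eigenvector of S whose eigenvalue has the largest magnitude."""
    eigvals, eigvecs = np.linalg.eigh(S)
    return eigvecs[:, np.argmax(np.abs(eigvals))]
```

The returned axis is defined only up to sign, which is sufficient for fixing the direction of a symmetry axis.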

slide-47
SLIDE 47

Oligomeric assembly using DISCO

  • Compute symmetry axis orientation
  • Compute symmetry axis position
  • Assemble oligomer using the symmetry

slide-48
SLIDE 48

Geometric restraints on protein structure

Restraints on distance:

  • Nuclear Overhauser Effect (NOE)
  • Paramagnetic Relaxation Enhancement (PRE)

slide-49
SLIDE 49

Solving for the symmetry axis position

Consider the geometry of an inter-subunit distance restraint:

slide-52
SLIDE 52

Analytical solution for the green annulus
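
The slide's own derivation is not in the transcript. The following is a minimal sketch of one way to obtain such an annulus, assuming NumPy and a coordinate frame in which the symmetry axis is parallel to z, so the unknown axis position is a 2D point a in the xy-plane. The function name and arguments are hypothetical. If atom q is rotated by θ about the axis to give its image q' in a neighboring subunit, then p − q' = (I − R)(c − a) in the plane, where I − R is a rotation scaled by 2|sin(θ/2)|. A bound on |p − q'| therefore becomes a bound on |a − c|.

```python
import numpy as np

def annulus_for_restraint(p, q, theta, d_lo, d_hi):
    """Axis positions a (in the xy-plane) with d_lo <= |p - q'| <= d_hi,
    where q' is q rotated by theta about the z-parallel axis through a.
    Returns (center, r_inner, r_outer), or None if unsatisfiable.
    theta must not be a multiple of 2*pi."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    M = np.eye(2) - R                      # rotation scaled by 2*|sin(theta/2)|
    scale = 2.0 * abs(np.sin(theta / 2.0))
    dz2 = (p[2] - q[2]) ** 2               # z-separation is unchanged by the rotation
    if d_hi ** 2 < dz2:
        return None
    center = np.linalg.solve(M, p[:2] - R @ q[:2])
    r_outer = np.sqrt(d_hi ** 2 - dz2) / scale
    r_inner = np.sqrt(max(d_lo ** 2 - dz2, 0.0)) / scale
    return center, r_inner, r_outer
```

With a lower bound of zero the annulus degenerates to a disk, which is the usual case for upper-bound-only NOE restraints.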

slide-53
SLIDE 53

Symmetry causes uncertainty

For homo-trimers and higher, subunit assignments for distance restraints are not known

Subunit ambiguity

slide-55
SLIDE 55

Encoding subunit ambiguity

  • Each assignment generates a constraint annulus
  • Annuli are combined using set union
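
The set-union encoding can be sketched as a point-membership test, assuming NumPy and assuming each annulus is represented as a hypothetical (center, r_inner, r_outer) tuple. A restraint is then a list of annuli, one per possible subunit assignment, and it is satisfied at a candidate axis position if any one of its annuli contains that position.

```python
import numpy as np

def in_annulus(point, annulus):
    """True if the 2D point lies in the closed annulus (center, r_inner, r_outer)."""
    center, r_inner, r_outer = annulus
    r = np.linalg.norm(np.asarray(point, dtype=float) - np.asarray(center, dtype=float))
    return r_inner <= r <= r_outer

def satisfies_restraint(point, annuli):
    """Set union: satisfied if at least one assignment's annulus contains the point."""
    return any(in_annulus(point, a) for a in annuli)

def count_satisfied(point, restraints):
    """Number of restraints (each a list of annuli) satisfied at the point."""
    return sum(satisfies_restraint(point, annuli) for annuli in restraints)
```

Counting satisfied restraints at a single point is only a spot check. Finding every region where the count is maximal requires analyzing the whole arrangement of curves.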

slide-56
SLIDE 56

Analysis of multiple distance restraints

DISCO computes the Maximally Satisfying Regions by analyzing the arrangement of the unions of annuli

slide-57
SLIDE 57

Efficient computation of the MSRs

Compute the arrangement of the circular curves bounding the annuli using CGAL

slide-58
SLIDE 58

Efficient computation of the MSRs

Search dual graph

Compute face depths using BFS
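
The BFS step can be sketched in pure Python, under the assumption that the arrangement has already been computed (for example by CGAL) and exported as a dual graph. The data layout below is hypothetical: each dual edge records which circle is crossed between two adjacent faces, and crossing a circle toggles whether a face is inside that circle's disk. A face's depth is the number of restraints satisfied there, and the faces of maximum depth are the maximally satisfying regions.

```python
from collections import deque

def face_depths(dual_edges, unbounded_face, restraints):
    """dual_edges: {face: [(neighbor, circle_id), ...]}.
    restraints: list of restraints, each a list of (outer_circle, inner_circle)
    annuli, with inner_circle None for a full disk.
    Returns {face: number of restraints satisfied in that face}."""
    inside = {unbounded_face: frozenset()}   # the unbounded face is inside no disk
    queue = deque([unbounded_face])
    while queue:
        face = queue.popleft()
        for neighbor, circle in dual_edges[face]:
            if neighbor not in inside:
                # Crossing a circle flips membership in that circle's disk.
                inside[neighbor] = inside[face] ^ {circle}
                queue.append(neighbor)

    def depth(circles):
        return sum(
            any(outer in circles and (inner is None or inner not in circles)
                for outer, inner in annuli)
            for annuli in restraints)

    return {face: depth(circles) for face, circles in inside.items()}

def maximally_satisfying_faces(depths):
    """Faces whose depth equals the maximum depth."""
    best = max(depths.values())
    return [face for face, d in depths.items() if d == best]
```

Tracking the set of enclosing disks, rather than adding or subtracting one per crossing, keeps the depth correct when several annuli of the same restraint overlap.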

slide-59
SLIDE 59

Efficient computation of the MSRs

slide-64
SLIDE 64

Polynomial time computation of the MSRs

  • Let n be the number of distance restraints
  • Let a be the max number of assignments per distance restraint
  • There are O(an) curves in the arrangement

Search dual graph

slide-65
SLIDE 65

Polynomial time computation of the MSRs

  • Let n be the number of distance restraints
  • Let a be the max number of assignments per distance restraint
  • There are O(an) curves in the arrangement
  • Arrangement: O(a²n²) faces, O(a²n²) edges

Search dual graph

Expected O(a²n²) time

slide-66
SLIDE 66

Polynomial time computation of the MSRs

  • Let n be the number of distance restraints
  • Let a be the max number of assignments per distance restraint
  • There are O(an) curves in the arrangement
  • Arrangement: O(a²n²) faces, O(a²n²) edges
  • Dual graph: O(a²n²) nodes, O(a²n²) edges

Search dual graph

Expected O(a²n²) time

slide-67
SLIDE 67

Polynomial time computation of the MSRs

  • Let n be the number of distance restraints
  • Let a be the max number of assignments per distance restraint
  • There are O(an) curves in the arrangement
  • Arrangement: O(a²n²) faces, O(a²n²) edges
  • Dual graph: O(a²n²) nodes, O(a²n²) edges
  • Compute face depths using BFS in expected O(a²n²) time

Search dual graph

Expected O(a²n²) time

slide-68
SLIDE 68

Polynomial time computation of the MSRs

  • Let n be the number of distance restraints
  • Let a be bounded by a small constant
  • There are O(n) curves in the arrangement
  • Arrangement: O(n²) faces, O(n²) edges
  • Dual graph: O(n²) nodes, O(n²) edges
  • Compute face depths using BFS in expected O(n²) time

Search dual graph

Expected O(n²) time
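
The quadratic bounds can be motivated by a short counting argument. This is a sketch under the assumption that the curves are m distinct circles in general position, two per annulus, and it is not the derivation from the slides. The symbols m, V, E, F and c (number of connected components) are introduced here for illustration.

```latex
\begin{align}
  m &\le 2an
    && \text{two boundary circles per annulus, at most } a \text{ annuli per restraint} \\
  V &\le 2\binom{m}{2} = m(m-1)
    && \text{two distinct circles meet in at most 2 points} \\
  E &\le 2V + m
    && \text{each vertex has degree 4; an uncrossed circle is one edge} \\
  F &= E - V + 1 + c \le V + 2m + 1 = O(m^2) = O(a^2 n^2)
    && \text{Euler's formula, with } c \le m
\end{align}
```

When a is bounded by a small constant, every bound above reduces to O(n²).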

slide-69
SLIDE 69

Oligomeric assembly using DISCO

  • Compute symmetry axis orientation
  • Compute symmetry axis position
  • Assemble oligomer using the symmetry

slide-70
SLIDE 70

Oligomeric assembly using DISCO

Compute symmetry axis position

DISCO guarantees:

  • Computes all satisfying symmetry axis positions
  • Runs in polynomial time
slide-71
SLIDE 71

Validation using experimental data

  • Diacylglycerol Kinase (DAGK)
  • Membrane protein
  • Homotrimer
  • 121 × 3 residues
  • 67 orientational restraints
  • 23 distance restraints

Van Horn et al., 2009

Ground truth (reference structure): PDB 2KDC, Model 1

slide-72
SLIDE 72

Symmetry axis orientations computed by DISCO

Using 67 restraints on orientation

slide-73
SLIDE 73

Symmetry axis positions computed by DISCO

Using 23 uncertain distance restraints

slide-74
SLIDE 74

Individual structure evaluation

slide-75
SLIDE 75

Ensemble of structures computed by DISCO

Reference DISCO

All structures within 1.5 Å backbone RMSD
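
For readers unfamiliar with the metric, backbone RMSD is usually reported after optimal rigid superposition. The sketch below uses the standard Kabsch algorithm and assumes NumPy and two (N, 3) arrays of matched backbone atom coordinates. The function name is hypothetical, and the talk does not state which superposition procedure was used.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between matched (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    P0 = P - P.mean(axis=0)                 # remove translation
    Q0 = Q - Q.mean(axis=0)
    H = P0.T @ Q0                           # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T  # rotation taking P0 onto Q0
    diff = P0 @ R.T - Q0
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

Under this measure, a computed structure matches the reference to within 1.5 Å when `kabsch_rmsd(computed, reference) <= 1.5`, with coordinates in ångströms.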

slide-76
SLIDE 76

Acknowledgements

We are very grateful for funding from:

The National Institutes of Health

DISCO is open source software

slide-77
SLIDE 77