
Global Molecular Replacement for Protein Structure Determination



  1. Global Molecular Replacement for Protein Structure Determination Ian Stokes-Rees SBGrid - Harvard Medical School

  2. SBGrid and NEBioGrid Cornell U. Washington U. School of Med. R. Cerione NE-CAT T. Ellenberger B. Crane R. Oswald D. Fremont S. Ealick C. Parrish Rosalind Franklin NIH M. Jin H. Sondermann M. Mayer D. Harrison A. Ke UMass Medical U. Washington U. Maryland W. Royer T. Gonen E. Toth Brandeis U. UC Davis N. Grigorieff H. Stahlberg Tufts U. K. Heldwein UCSF Columbia U. JJ Miranda Q. Fan Y. Cheng Rockefeller U. R. MacKinnon Stanford A. Brunger Yale U. K. Garcia T. Boggon K. Reinisch T. Jardetzky D. Braddock J. Schlessinger Y. Ha F. Sigworth CalTech E. Lolis F. Zhou Harvard and Affiliates P. Bjorkman W. Clemons N. Beglova A. Leschziner Rice University G. Jensen S. Blacklow K. Miller D. Rees E. Nikonowicz B. Chen A. Rao Y. Shamoo Vanderbilt J. Chou T. Rapoport Y.J. Tao J. Clardy M. Samso WesternU Center for Structural Biology M. Eck P. Sliz W. Chazin C. Sanders M. Swairjo B. Furie T. Springer B. Eichman B. Spiller R. Gaudet G. Verdine M. Egli M. Stone UCSD M. Waterman M. Grant G. Wagner B. Lacy T. Nakagawa S.C. Harrison L. Walensky H. Viadiu Thomas Jefferson J. Hogle S.Walker D. Jeruzalmi T.Walz J. Williams D. Kahne J. Wang Not Pictured: T. Kirchhausen S. Wong University of Toronto: L. Howell, E. Pai, F. Sicheri; NHRI (Taiwan): G. Liou; Trinity College, Dublin: Amir Khan

  3. Primary thesis: Molecular replacement, used to solve over 60% of known structures, can benefit from novel computationally intensive techniques to identify search models, including those with low sequence identity or no previous association with the unknown structure. Expected benefits: identify search models that would otherwise be missed; faster bootstrapping of MR search model selection; broaden the range of structures amenable to MR, avoiding more costly phasing techniques; allow greater parameter tuning of the MR stage. Transferable infrastructure: the framework developed to support a 20,000 CPU-hour computation with 10 GB of data, 100,000 invocations of a scientific application, and the consequent results filtering, aggregation, and analysis can be re-used for other applications.

  4. Traditional Molecular Replacement: one, or maybe more, carefully selected search models. Diagram labels: Internal Validation; 10-20 Hit; 0.1 CPUh; Solutions + Refinement; Validation; Target Data. [Two bar charts, categories 1-5; axis labels not recoverable]

  5. Global Molecular Replacement: 95,000 carefully edited search models. Diagram labels: External Validation; ~50K Hit; 9500 CPUh; Solutions + Refinement; Validation; Target Data; Score Individual Models.
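The 9500 CPUh figure is consistent with a simple per-model estimate. The sketch below multiplies the 95,000 search models by the 0.1 CPUh quoted for a single traditional MR run. Treating 0.1 CPUh as the average cost per model is an assumption, as is the idealized division by the 2000 cores mentioned later in the deck; the function names are illustrative only.

```python
def total_cpu_hours(n_models: int, cpu_hours_per_model: float) -> float:
    """Total compute for one global MR pass, assuming a uniform per-model cost."""
    return n_models * cpu_hours_per_model


def ideal_wall_clock_hours(cpu_hours: float, n_cores: int) -> float:
    """Lower bound on elapsed time: perfect parallelism, no queueing or staging."""
    return cpu_hours / n_cores


N_MODELS = 95_000      # search models, from the slide
CPUH_PER_MODEL = 0.1   # assumed average, taken from the traditional MR slide
N_CORES = 2_000        # cores quoted later in the deck for the Phaser rounds

budget = total_cpu_hours(N_MODELS, CPUH_PER_MODEL)
print(f"estimated budget: {budget:.0f} CPUh")
print(f"ideal wall clock on {N_CORES} cores: "
      f"{ideal_wall_clock_hours(budget, N_CORES):.2f} h")
```

The ideal wall-clock figure is only a floor; the deck quotes 24h per round, which is plausible once scheduling, file staging, and uneven job lengths on a shared grid are taken into account.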

  6. Small Physical Differences, Big Impact On Results. [Figure panels: TARGET, MODEL A, MODEL B, MODEL C] Differences in loops and shifts of the secondary structure elements degrade results.

  7. • Would global search work? What are the boundaries of the global search method? • What is the best scoring function? • Is the MR score related to RMSD/sequence identity of the target molecule? • Real-life example

  8. Target I: 2VLJ. T cell receptor (4 immunoglobulin domains); influenza-virus matrix peptide; presentation of the peptide by the Major Histocompatibility Complex (MHC) molecule (2 immunoglobulin domains + peptide binding domain). [Figure labels: Vα, Vβ, α12, β2m, α3]

  9. 1 x MHC domain + 6 x Ig domain. SCOP b.1.1.2 - antibody constant domain-like, 2535 domains, ~12.5% by MW. SCOP b.1.1.1 - antibody variable domain-like, 2001 domains. SCOP d.19.1.1 - MHC antigen recognition domain, 568 domains, ~22% by MW. Molecular weight of the complex: 94.495 kDa. [Figure labels: Vα, Vβ, α12, β2m, α3]

  10. Selection Criteria: • a multidomain protein • wide range of models. Bjorkman et al. Structure of the human class I histocompatibility antigen, HLA-A2. Nature (1987) vol. 329 (6139) pp. 506-12. Garboczi et al. Structure of the complex between human T-cell receptor, viral peptide and HLA-A2. Nature (1996) vol. 384 (6605) pp. 134-41. Phaser - round I: search with 95K SCOP models; 5 min timeout; 2000 CPU cores on OSG; 24h. [Figure labels: Vα, Vβ, α12, β2m, α3]
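A per-model timeout is what keeps a 95K-model sweep bounded: a search that has not finished after 5 minutes is abandoned rather than allowed to hold a core. The following is a minimal sketch of that idea using only the Python standard library. It is not the deck's osg_wrap script, the helper names `run_with_timeout` and `sweep` are hypothetical, and the stand-in commands are Python one-liners rather than real Phaser invocations.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run one job; return (status, stdout), status in {'done', 'failed', 'timeout'}."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so nothing lingers.
        return "timeout", ""
    status = "done" if proc.returncode == 0 else "failed"
    return status, proc.stdout


def sweep(commands, timeout_s=300):
    """Run every command (model_id -> argv list) under the same timeout."""
    return {model_id: run_with_timeout(cmd, timeout_s)
            for model_id, cmd in commands.items()}


if __name__ == "__main__":
    # Placeholder jobs standing in for one MR search per model (assumption).
    jobs = {
        "model_fast": [sys.executable, "-c", "print('finished')"],
        "model_slow": [sys.executable, "-c", "import time; time.sleep(10)"],
    }
    for model_id, (status, _) in sweep(jobs, timeout_s=2).items():
        print(model_id, status)
```

On a real grid the jobs would run concurrently across sites; the serial loop here only illustrates the timeout bookkeeping.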

  11. 2D representation of MR results. Top scoring solution: 1im3a2 (100%, 181aa; TFZ=13, LLG=92), α12 domains, SCOP class d.19.1.1, target 2vlj. Predictors: LLG (strongest predictor); TFZ (good predictor); R factor (weak predictor). [Plot legend: positive, negative]

  12. Phaser - round II: fix the α12 domain; repeat MR search with the 95K SCOP dataset; 5 min timeout; 2000 CPU cores on OSG; 24h. [Figure labels: Vα, Vβ, α12, β2m, α3, with percentages 18%, 14%, 13%, 31%, 22%; assignment of percentages to domains not recoverable]

  13. Two solutions for Ig domains from TCR: 1ogad1 (7.9, 43); 1ogae1 (6.8, 37). False positive: 1g3iv_ (5.4, 46), HSLUV PROTEASE-CHAPERONE COMPLEX. Quick refinement: R factor above 55. [Plot labels: A, B, C; axes: LLG, RFZ]

  14. Domain A12 placed, searching for next domain. Labels: 1ogad1 100%, 115aa; 1kgce2 (19.2, 220); 1ogae1 99.2%, 129aa; 100%, 114aa. [Figure labels: E, D, A, B]

  15. Refinement: 3 cycles of rigid body; three domains added rigid. Values: 42.26/43.74; 40.78/42.75.

  16. 4 domains placed, searching for 3 remaining domains. Top 280 solutions with B2M SCOP domains; highest scoring TCR D2 ranks as #345. #1: 1agdb - 100% B2M, 99aa; #2: 2bnra1 - 100% A3, 95aa; #3: 1kgcd2 - 100% D2, 89aa. [Plot labels: b.1.1.2, b.1.1.1, B2M, A3, D2]

  17. Refinement: 3 cycles of rigid body; three domains added rigid. Values: 40.78/42.75; 42.26/43.74; 32.23/34.95. Solved!

  18. • Would global search work? What are the boundaries of the global search method? • What is the best MR scoring function? • Is the MR score related to RMSD/sequence identity of the target molecule? • Real-life example

  19. Common approach to molecular replacement: Least Squares. Least Squares: commonly used for molecular replacement; select the model with minimum error between observed real-space amplitudes |F_O| and calculated amplitudes |F_C|. Iterative Convergence: rotate the search model (3D RF), then translate (3D TF), to find the best (lowest) least squares fit. Problem: implicitly biased towards model to select h (structure parameters) based on model phasing. Solution Quality: typically measured by heuristic score, or residual factor (measure of agreement between solution and experimental observations). [Figure labels: match difference between scalar amplitudes; model quality measure; observations; parametric model; equivalent fit to observations; magnitude of vector difference]
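The least-squares idea on this slide can be illustrated with a toy calculation: scale the calculated amplitudes |F_C| onto the observed amplitudes |F_O|, sum the squared differences, and prefer the candidate with the smallest residual. This is a sketch under stated assumptions, not the algorithm of any particular MR program: the least-squares scale factor `k`, the residual-factor form sum(||F_O| - k|F_C||) / sum(|F_O|), and all amplitude values are illustrative choices, and the rotation and translation searches are omitted entirely.

```python
def optimal_scale(f_obs, f_calc):
    """Scale k that minimises sum((|Fo| - k*|Fc|)**2) over all reflections."""
    num = sum(o * c for o, c in zip(f_obs, f_calc))
    den = sum(c * c for c in f_calc)
    return num / den


def least_squares_residual(f_obs, f_calc):
    """Sum of squared amplitude differences after scaling |Fc| onto |Fo|."""
    k = optimal_scale(f_obs, f_calc)
    return sum((o - k * c) ** 2 for o, c in zip(f_obs, f_calc))


def r_factor(f_obs, f_calc):
    """Residual factor: sum(| |Fo| - k*|Fc| |) / sum(|Fo|), same scale k as above."""
    k = optimal_scale(f_obs, f_calc)
    return sum(abs(o - k * c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def best_model(f_obs, candidates):
    """Name of the candidate (name -> list of |Fc|) with the lowest residual."""
    return min(candidates, key=lambda name: least_squares_residual(f_obs, candidates[name]))


# Toy amplitudes for three reflections (made up for illustration).
observed = [10.0, 20.0, 30.0]
models = {
    "model_a": [5.0, 10.0, 15.0],   # proportional to the observations
    "model_b": [15.0, 10.0, 5.0],   # poorly matching amplitudes
}
for name, f_calc in models.items():
    print(name, round(r_factor(observed, f_calc), 3))
print("selected:", best_model(observed, models))
```

Because only amplitudes are compared, a score of this kind says nothing directly about phases, which is part of why the deck goes on to compare it with maximum-likelihood scoring.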

  20. Phaser performs better (although more CPU demanding). Phaser (maximum likelihood) vs. Molrep (Crowther rotation + FFT in reciprocal space). Fast and slow searches return comparable results. Clear separation between two populations! [Plot: LLG vs. TFZ; legend: positive, negative]

  21. Extended range of correct solutions! Traditional TFZ region: TFZ > 7; extended TFZ/LLG region: TFZ > 4. [Plot for α12, LLG vs. TFZ; labeled models: 2ak4f2, 2nx5q2, 1mhca2, 2mhac2; annotations: 80%, 60% B=44, 60% B=24, 72%]
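The two TFZ cut-offs on this slide translate directly into a filter over MR hits. The sketch below applies them to the (TFZ, LLG) pairs quoted elsewhere in the deck for α12 search models. The `min_llg` parameter is a hypothetical stand-in for the LLG side of the extended region, whose exact boundary the slide does not give, and the category names are illustrative.

```python
TRADITIONAL_TFZ = 7.0   # traditional cut-off from the slide
EXTENDED_TFZ = 4.0      # extended cut-off from the slide


def classify(tfz, llg, min_llg=0.0):
    """Place one MR hit in the traditional or extended region, or neither.

    min_llg is an assumed lower bound on LLG; the deck does not state one.
    """
    if llg <= min_llg:
        return "below threshold"
    if tfz > TRADITIONAL_TFZ:
        return "traditional"
    if tfz > EXTENDED_TFZ:
        return "extended"
    return "below threshold"


# (TFZ, LLG) pairs as listed in the deck for alpha-12 search models.
hits = {
    "1im3a2": (13.0, 92.0),
    "1mhca2": (6.0, 49.0),
    "1zagb2": (4.8, 31.0),
    "2nx5q2": (3.6, 51.0),
    "d2fsea2": (3.1, 14.0),
}

for model, (tfz, llg) in hits.items():
    print(f"{model}: TFZ={tfz}, LLG={llg} -> {classify(tfz, llg)}")
```

Lowering the cut-off from 7 to 4 admits models such as 1mhca2 and 1zagb2 that a traditional filter would discard, which is the gain the slide points to; whether a given hit is actually correct still has to be confirmed by refinement.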

  22. Rotation Function Score MHC molecules LLG heat

  23. • Would global search work? What are the boundaries of the global search method? • What is the best MR scoring function? • Is the MR score related to RMSD/sequence identity of the target molecule? • Real-life example

  24. Search for the first molecule. MHC: with a small fraction of the target (~22%), sequence identity > 60% (rmsd < 1.5) is required. Ig: for Ig domains (~12%), even 100% is barely sufficient. [Heat maps: Seq ID, for MHC and Ig]

  25. Differences between A12 solutions, listed as SCOP ID (TFZ, LLG), percentage: 1im3a2 (13, 92), 100% (2vlj); 2nx5q2 (3.6, 51), 84.8%; 1mhca2 (6, 49), 64%; 1zagb2 (4.8, 31), 37.1%; d2fsea2 (3.1, 14), 14.7%. [Figure labels: A, B, C, D, W]

  26. Structure Superimposition. [Figure panels: TARGET, MODEL A, MODEL B, MODEL C] Differences in loops and shifts of the secondary structure elements degrade results.

  27. Ig domains, variable and constant. [Plots: Seq ID vs. LLG; RMSD vs. LLG]

  28. • Would global search work? What are the boundaries of the global search method? • What is the best MR scoring function? • Is the MR score related to RMSD/sequence identity of the target molecule? • Real-life example

  29. 72% Solvent

  30. Sequence identity < 20%. 3 cycles of refinement in Phenix shift secondary structure elements and lower Rfac to 43%.

  31. • NEBioGrid Django Portal: interactive dynamic web portal for workflow definition, submission, monitoring, and access control • NEBioGrid Web Portal: GridSite based web portal for file-system level access (raw job output), meta-data tagging, X.509 access control/sharing, CGI • PyGACL: Python representation of GACL model and API to work with GACL files • osg_wrap: Swiss army knife OSG wrapper script to handle file staging, parameter sweep, DAG, results aggregation, monitoring • sbanalysis: data analysis and graphing tools for structural biology data sets • PyCCP4: Python wrappers around CCP4 structural biology applications • osg.monitoring: tools to enhance monitoring of job set and remote OSG site status • PyCondor: Python wrappers around common Condor operations; enhanced Condor log analysis • shex: write bash scripts in Python: replicate commands, syntax, behavior • PyOSG: Python wrappers around common OSG operations • xconfig: universal configuration
