SLIDE 1
15 July 2017 | CCPN conference 2017, Stirling University, UK
Malgosia Duszczyk, Departement D-BIOL, Institute of Molecular Biology and Biophysics, Frédéric Allain lab
De novo structure determination of a 27.5 kDa protein-RNA complex: a Dead End for classical NMR approaches?
SLIDE 2 Dead End (Dnd1) in zebrafish development & mice
Weidinger et al. Curr. Biol. 13 (2003), 1429-1434
- Vertebrate-specific germ cell viability mediator
- Expressed in primordial germ cells (PGCs) and is essential for their correct
migration in zebrafish
- Dnd1 deletions lead to PGC death
- Ter mutant mice display germ-cell
loss and testicular tumors
SLIDE 3 Dnd1 is linked to miRNA regulation & cancer
- Two miRNA targets of miR-430 (nanos; tdrd7) persist in zebrafish germ
cells while miR-430 is present
- The human homologue miR-373 family acts as an oncogene by repressing
tumour suppressors LATS2 and p27
- Genetic screens found that Dnd1 is associated with this phenomenon by
protecting mRNAs from miRNA-mediated repression in human cell culture and zebrafish PGCs
- HOW?? Through interaction with conserved U-rich regions (URRs) in
3’UTRs of these targets Dnd1 blocks miRNA accessibility
Kedde et al. (Agami group NKI) Cell 131, 1273–1286, December 28, 2007
SLIDE 4 Canonical RRM fold & RNA recognition
- The RNA Recognition Motif is
the most abundant RBD in higher vertebrates
- Binds primarily ssRNA
- babbab fold: 4-strand beta
sheet packed on two alpha-helices
- Conserved sequences (RNP1 & RNP2) with exposed aromatic residues accommodate two RNA nucleotides by stacking
- RNA is usually bound 5’-3’ in
the direction from b4 to b2
RNP1: [R/K]-G-[F/Y]-[G/A]-[F/Y]-[I/L/V]-X-[F/Y]
RNP2: [I/L/V]-[F/Y]-[I/L/V]-X-N-L
[Figure labels: N, C; RNA 5’ to 3’]
Adapted from 2UP1 (Ding et al. 1999)
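Sketch (not from the slides): the RNP1 and RNP2 consensus strings above translate directly into regular expressions, with X (any residue) written as ".". The function name and the toy sequence are hypothetical and only serve to show one hit of each motif.

```python
import re

# Consensus patterns transcribed from the slide; X (any residue) becomes "."
RNP1 = re.compile(r"[RK]G[FY][GA][FY][ILV].[FY]")
RNP2 = re.compile(r"[ILV][FY][ILV].NL")


def find_rnp_motifs(protein_seq):
    """Return (motif name, 0-based start, matched text) for every consensus hit."""
    seq = protein_seq.upper()
    hits = []
    for name, pattern in (("RNP1", RNP1), ("RNP2", RNP2)):
        for match in pattern.finditer(seq):
            hits.append((name, match.start(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy sequence, built only to contain one hit of each motif
toy_seq = "MSLFVGNLPAAAAARGFGFVEFAA"
for hit in find_rnp_motifs(toy_seq):
    print(hit)
```

A strict consensus search like this only flags canonical RNPs. RRMs whose RNP sequences deviate from the consensus will not be reported, so an empty result does not rule out an RRM.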
SLIDE 5
To exert their different functions, RBPs often harbor multiple non-identical copies of RBDs
- U1A, U2B’’: pre-mRNA splicing
- Sex-lethal, HuD: alternative splicing
- PABP: mRNA stability
- PTB/hnRNP L: alternative splicing
- CPEB1/4: mRNA stability / alternative splicing
- hnRNP A1: alternative splicing
- TDP43: alternative splicing
- Dnd1: inhibition of microRNA-mediated repression
[Figure: domain architecture of each protein, built from RRM1 to RRM4 and, for Dnd1, RRM1, RRM2 and a dsRBD; legend: canonical RNPs, non-canonical RNPs]
SLIDE 6 Both RRMs of Dnd1 necessary for tight binding to p27 3’UTR URR1
- Dnd1 RRM2 + p27 URR 1/2: no binding
- Dnd1 RRM1 + p27 URR1b RNA: Kd = 41 µM
- Dnd1 RRM12 + p27 URR1b RNA: Kd = 1.2 µM
[Figure panels 1) to 3): two ITC titrations with CUUAUUUG; NMR 15N HSQC ‘fingerprint’ of RRM12 and RRM12 + CUUAUUUG]
p27 3’UTR: 5’-AAGCGUUGGAUGUAGCAUUAUGCAAUUAGGUUUUUCCUUAUUUGCUUCAUUGUACUACCUGUGUAUAUAGUUUUUACCUUUUAUGUAGCACAUAAACUUU-3’
[Sequence annotations: miRNA seeds; URR1 (1a, 1b); URR2 (2a, 2b)]
27.5 kDa complex
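Sketch (not from the slides): the two Kd values above can be compared as a fold change and as a difference in binding free energy via ΔG = RT ln(Kd). The temperature of 298.15 K is an assumption, since the slide does not state the ITC temperature, and the function name is hypothetical.

```python
import math

R_GAS = 8.314462618e-3  # gas constant in kJ/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Binding free energy in kJ/mol from a dissociation constant (1 M standard state).

    The default temperature is an assumption, not a value from the slides.
    """
    return R_GAS * temperature_k * math.log(kd_molar)


kd_rrm1 = 41e-6    # M, Dnd1 RRM1 + p27 URR1b RNA (from the slide)
kd_rrm12 = 1.2e-6  # M, Dnd1 RRM12 + p27 URR1b RNA (from the slide)

fold_tighter = kd_rrm1 / kd_rrm12
ddg = binding_free_energy(kd_rrm12) - binding_free_energy(kd_rrm1)

print(f"RRM12 binds {fold_tighter:.0f}-fold tighter than RRM1")
print(f"ddG = {ddg:.1f} kJ/mol at the assumed temperature")
```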
SLIDE 7
Specificity? Dnd1 RRM-p27URR interaction data
SLIDE 8 CUUAUUUG binding site mapping by NMR
RNP2 VFIGRL
RNP1 RGFAYA
Kedde: Y94>C mutant inactive
- Note that some putative RNA binding residues cannot be traced due to large shift
or exchange in complex (e.g. RNP1 Y94)
- As expected from mutational analysis, RNA binds to the canonical RRM1 beta-sheet
surface, but unexpectedly also to non-canonical elements on RRM1 (a0b0) and RRM2 (a2b4)
SLIDE 9
RRM12-URR1-CUUAUUUG complex 15N relaxation
18ns ~30kDa
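Sketch (not from the slides): a rotational correlation time is commonly estimated from 15N T1/T2 with the slow-tumbling approximation tau_c ≈ sqrt(6·T1/T2 − 7) / (4π·ν_N). The slides do not say how the 18 ns value was obtained, so this is only one plausible route. The 15N frequency of 60.8 MHz (a 600 MHz spectrometer) and the function names are assumptions.

```python
import math


def tau_c_from_t1_t2(t1_s, t2_s, nu_n_hz):
    """Estimate the correlation time (s) from 15N T1 and T2 (s).

    Uses the slow-tumbling approximation; nu_n_hz is the 15N Larmor frequency in Hz.
    """
    ratio = t1_s / t2_s
    return math.sqrt(6.0 * ratio - 7.0) / (4.0 * math.pi * nu_n_hz)


def t1_t2_ratio_from_tau_c(tau_c_s, nu_n_hz):
    """Inverse of tau_c_from_t1_t2: the T1/T2 ratio expected for a given tau_c."""
    return ((4.0 * math.pi * nu_n_hz * tau_c_s) ** 2 + 7.0) / 6.0


NU_N = 60.8e6  # Hz, 15N at 600 MHz 1H (assumed field, not stated on the slide)

expected_ratio = t1_t2_ratio_from_tau_c(18e-9, NU_N)
recovered_tau = tau_c_from_t1_t2(expected_ratio, 1.0, NU_N)

print(f"T1/T2 expected for 18 ns: {expected_ratio:.1f}")
print(f"tau_c recovered: {recovered_tau * 1e9:.1f} ns")
```

The approximation ignores internal motion and exchange contributions, so in practice residues with low hetNOE or exchange broadening are usually left out before averaging.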
SLIDE 10
NMR spectroscopy is an important method for solving structures of small to medium-sized RNAs and protein-RNA complexes
PDB statistics (09.07.2017)
Molecule         Xray     solNMR   EM    ssNMR
All structures:
  All            117968   11827    102   1594
  RNA only       783      522      29    1
  Protein-RNA    1557     116      348
Under 40 kDa:
  All            44736    11564    75    76
  RNA only       624      516      7     1
  Protein-RNA    185      109      10
SLIDE 11 Solving protein-RNA structures using NMR spectroscopy: pipeline
- Production of Dnd1 RRM12-CUUAUUUG complex:
Recombinant protein expression
Purchase of short unlabeled RNA oligos, or synthesis using isotope-labeled phosphoramidites
In vitro transcription using 13C/15N labeled and/or deuterated nucleotides (used for longer RNAs): not possible
From F. Nelissen RUNijmegen
SLIDE 12
Backbone assignment RRM12 – CUUAUUUG complex
227 aa + 8 nt RNA = 27.4 kDa
Fractional random deuteration (auto-induction in D2O medium) & TROSY
[Spectra compared: trHNCACB (2H, 3D) and CBCACONH (3D); both 0.4 mM and comparable measurement time]
Bruker experiments: trHNCO, trHN(CA)CO, trHNCA, trHN(CO)CA, trHNCACB, trHN(CO)CACB, 15N-NOESY on 2H sample
93.4% assigned (22 prolines)
SLIDE 13 Structure determination strategy
- Free protein precipitates too rapidly for 3D experiments, and the complex
precipitates above 298K
If possible, solve the free protein structure first and work at as high a temperature as possible
- Backbone assignment using TROSY-based triple resonance
experiments on randomly fractionally deuterated 27.4 kDa complex (93% assigned)
- Few signals in HCCCONH type sidechain experiments and many
missing and overlapped peaks in HcCH- and hCCH-TOCSY, so sidechain assignment must be done mainly through NOESYs:
Look for HN-HA/HB correlations in HN plane of n & n+1 at expected CA & CB shifts, compare to HN NOE strips, then move further out the sidechain at expected C-shift
>> it helps a lot if you have a homology model
Use HcCH/hCCH-TOCSY or COSY (less overlap) where possible to confirm intra-spin system correlation
SLIDE 14 Structure determination strategy
- 3D-13C-HMQC-NOESY for higher sensitivity – NOE in acquisition
dimension for higher resolution
- Some RNA-interacting residues are exchange broadened: partial, but
not full, improvement on addition of excess RNA
- Lots of overlap (e.g. 124 methyl groups)
If possible, try segmental labeling (protein ligation)
- Sidechain assignment of free RRM2
- Transfer of assignments to RRM12
- 1H assignment completeness 80% (75% RRM1, 88% RRM2)
- High number of prolines (22) > still many not assigned in RRM1
SLIDE 15
Protein-RNA complex structure calculation protocol
Peak picking & automated NOE assignment using ATNOS/CANDID
T.Herrmann et al. J. Mol. Biol. 319 (2002) 209
List of protein-protein distance constraints
Manually assigned intra-RNA and intermolecular distance constraints
H-bonds, dihedral angles, RDCs
Structure calculation using CYANA: 100 structures
Güntert et al. J. Mol. Biol. 273 (1997), p. 283
Final refinement with Amber 12.0
Structures
Manual adjustment of peak lists, automatic NOE assignment using CYANA noeassign (more flexible approach, manual assignments kept)
SLIDE 16 ATNOS/CANDID protocol
Essential to obtain correct fold for multidomain protein: Calculate single domains first using:
- 3D 15N NOESY (H2O)
- 3D 13C ali-NOESY (D2O & H2O)
- 3D 13C aro-NOESY (H2O)
- 2D NOESY (D2O)
- H-bonds (based on non-exchanged HNs in D2O HSQC)
- TALOS restraints
- 140 upls from manually assigned NOEs
Then combine: do a run with all chemical shifts and the upls obtained from the single-domain runs
SLIDE 17 Strategy for structure calculations/improving structural statistics of Dnd1 RRM12
- ATNOS/CANDID fails to pick full peak lists – manual adjustments:
Spectrum                # peaks AC   # peaks man
3D 15N-NOESY            1947         3569
3D 13C-NOESYali(D2O)    2568         3956
3D 13C-NOESYali(H2O)    3627         5369
3D 13C-NOESYaro         382          420
- Peak lists, partly manually assigned (intensities) fed into CYANA for
noeassign procedure
- Using ‘KEEP’ for keeping manual assignments
Systematic underestimation of distances caused by spin diffusion
Adjusting calibration parameters to improve statistics:
dref for median peak volume (4.0 > 4.6 Å)
upl_values (2.4..5.5 Å > 2.7..6.0 Å)
elasticity (changevol) = small increase violated upls
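Sketch (not from the slides): the effect of raising dref and the upl range can be illustrated with the generic 1/r^6 volume-to-distance calibration, where the median peak volume maps to dref and the result is clamped to the allowed upl range. This is a simplified illustration, not CYANA's actual calibration code. The peak volumes and the function name are hypothetical.

```python
import statistics


def calibrate_upper_limits(volumes, dref, upl_min, upl_max):
    """Convert NOE peak volumes into upper distance limits (Angstrom).

    Assumes volume ~ 1/d^6, with the median volume mapped to dref,
    and clamps every limit to the interval [upl_min, upl_max].
    """
    v_median = statistics.median(volumes)
    limits = []
    for volume in volumes:
        distance = dref * (v_median / volume) ** (1.0 / 6.0)
        limits.append(min(max(distance, upl_min), upl_max))
    return limits


# Hypothetical peak volumes, strongest first
volumes = [8.0e6, 1.0e6, 2.5e5, 1.0e4]

old_upls = calibrate_upper_limits(volumes, dref=4.0, upl_min=2.4, upl_max=5.5)
new_upls = calibrate_upper_limits(volumes, dref=4.6, upl_min=2.7, upl_max=6.0)

for volume, old, new in zip(volumes, old_upls, new_upls):
    print(f"volume {volume:9.0f}: upl {old:.2f} > {new:.2f}")
```

With these settings every limit is loosened or left at a looser clamp, which is the intended compensation for distances that come out too short because of spin diffusion.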
SLIDE 18
Cycle-dependent parameters of automated NOE assignment: may be changed in noeassign.cya
J Biomol NMR (2015) 62:453–471
Changed values: 1.0 > 1.25; 1.25 > 1.5; Changevol=true
SLIDE 19
Structure of Dnd1 RRMs within the RRM12-CUUAUUUG complex
RRM1: b4b1b3b2, a1, a2; non-canonical extensions a0b0
  Novel RRM abb_babbab fold: 6-strand beta-sheet packed on triple helix
  Mean global bb (6..124) RMSD: 0.92
RRM2: b4b1b3b2, a1, a2
  Canonical RRM babbab fold; C-term a helix a3 independent
  Mean global bb (130..208) RMSD: 0.64
[Figure labels: N, C]
SLIDE 20
Structural ensemble RRM12-CUUAUUUG complex
[Figure labels: RRM1, RRM2; RNA 3’ and 5’; C, N]
Input MD:
  Intra protein: 3602 restraints
  Intra RNA: 171 restraints
  Intermolecular: 93 restraints
  RDCs: 112 restraints (NH)
  H-bonds: 158 restraints
Mean global bb (6..208) RMSD: 1.48
RNA (U3-U7): 1.50
SLIDE 21
T1/T2/hetNOE measurements show one rigid complex
Domain:domain contacts almost 100% via few RNA residues; better orientation should be achieved with long distance restraints
Tested alignment media:
- Pf1 at 12.5mg/ml > 15Hz D2O splitting > too strong alignment at
low salt (100mM KPO4 pH 6.5) > severe broadening even after dilution
- C12E5/hexanol at 4.2% OK > careful of temperature dependency of
aligned phase!!!
- N-H in aligned phase quite broadened > improvement with
deuterated sample?
RDCs for improving local structures & domain-domain orientation
SLIDE 22
Initial fit RDCs to complex structure
RRM1 RRM2
SLIDE 23 RDC refinement Strategy
- ‘FindTensor’ procedure in CYANA to find initial tensor using SVD
and preliminary structure: Two separate tensors for the two RRMs
- Use initial tensors to refine free RRM structures (within the
RRM12-CUUAUUUG complex)
- Use .upls & refined tensors from these calculations in a new
calculation, now including the intra-RNA and intermolecular upls and RDCs > still using two tensors
If all is well, the two tensors should converge, provided the domain-domain orientation is fixed
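Sketch (not from the slides): the idea of fitting one alignment tensor per RRM and checking that they agree can be illustrated with a linear least-squares fit of the five independent tensor elements to bond vectors and RDCs. This is not CYANA's FindTensor code: it solves the same least-squares problem through the normal equations instead of SVD, to stay within the Python standard library. All vectors, RDC values, the tensor and the function names are hypothetical, and both domains are expressed in one common molecular frame.

```python
import math
import random


def design_row(x, y, z):
    """Coefficients multiplying (Syy, Szz, Sxy, Sxz, Syz) for a unit bond vector."""
    return [y * y - x * x, z * z - x * x, 2 * x * y, 2 * x * z, 2 * y * z]


def solve_linear(a, b):
    """Solve the square system a*s = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    s = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * s[c] for c in range(r + 1, n))
        s[r] = (m[r][n] - tail) / m[r][r]
    return s


def fit_tensor(vectors, rdcs):
    """Least-squares fit of the five tensor elements (in the units of the RDCs)."""
    rows = [design_row(*v) for v in vectors]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(5)] for i in range(5)]
    atd = [sum(r[i] * d for r, d in zip(rows, rdcs)) for i in range(5)]
    return solve_linear(ata, atd)


def back_calculate(vectors, tensor):
    """RDCs predicted by a tensor for a set of unit bond vectors."""
    return [sum(c * t for c, t in zip(design_row(*v), tensor)) for v in vectors]


def random_unit_vectors(n, seed):
    """Hypothetical N-H bond orientations, uniformly distributed on the sphere."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-6:
            out.append([c / norm for c in v])
    return out


# Hypothetical shared tensor (Hz) and synthetic, noise-free RDCs for each domain
true_tensor = [3.0, -8.0, 1.5, -2.0, 4.0]
rrm1_vectors = random_unit_vectors(60, seed=1)
rrm2_vectors = random_unit_vectors(50, seed=2)
tensor_rrm1 = fit_tensor(rrm1_vectors, back_calculate(rrm1_vectors, true_tensor))
tensor_rrm2 = fit_tensor(rrm2_vectors, back_calculate(rrm2_vectors, true_tensor))

max_diff = max(abs(a - b) for a, b in zip(tensor_rrm1, tensor_rrm2))
print(f"largest difference between the two fitted tensors: {max_diff:.2e} Hz")
```

With real data the two tensors never agree exactly. The useful check is whether their difference is within the scatter expected from RDC measurement error and structural noise.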
SLIDE 24
RDC fit to refined complex structure
RRM1 RRM2
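Sketch (not from the slides): one common way to put a number on an RDC fit is the Q-factor, the rms deviation between observed and back-calculated RDCs divided by the rms of the observed RDCs. The slides show the fits as plots and do not state which metric was used, so this is an illustration only. The RDC values and the function name are hypothetical.

```python
import math


def rdc_q_factor(observed, calculated):
    """Q = rms(D_obs - D_calc) / rms(D_obs); lower means better agreement."""
    if len(observed) != len(calculated) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    num = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    den = sum(o ** 2 for o in observed)
    return math.sqrt(num / den)


# Hypothetical N-H RDCs in Hz
observed = [12.1, -7.4, 3.3, -15.0, 8.8]
initial_fit = [9.0, -3.1, 6.0, -10.2, 4.9]
refined_fit = [11.8, -7.0, 3.6, -14.5, 8.5]

print(f"Q initial: {rdc_q_factor(observed, initial_fit):.2f}")
print(f"Q refined: {rdc_q_factor(observed, refined_fit):.2f}")
```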
SLIDE 25
Structure RRM12-CUUAUUUG complex
[Figure labels: RRM1, RRM2; RNA 3’ and 5’; C, N; non-canonical extensions a0b0, a2, b4]
SLIDE 26 Differences between protein and RNA NMR
- Spectral overlap 1H: 4 nt vs 20 aa, sugar 1H very alike
sequential assignment
Through-space NOE-based assignment (RNA)
SLIDE 27 RNA ribose assignment
Synthesis of selectively labelled oligos from 13C sugar labelled phosphoramidites CUUAUUUG
- Assignment using long 13C HSQC on
selectively ribose labelled complexes and sequential walk through and between the sugar spin systems (2D NOESY & TOCSY)
- Several exchange broadened signals
only visible in overnight HSQC
C1’H1’ C4’H4’ C3’H3’ C2’H2’ C5’H5’/’’
SLIDE 28 In vitro transcription short RNA CUUAUUUG using ribozyme technology
[Figure labels: RNA of interest; Tm 20 degrees; Mg2+ 48 mM, 10 mM]
Problem: co-transcriptional cleavage; the 8-mer product cannot be separated from the aborts produced during transcription, either on gel or by HPLC
SLIDE 29 Strategy no. 2: using Csy4 nuclease from CRISPR system
[Figure label: RNA of interest]
Problem: ssRNA often self-cleaves during in vitro transcription, mainly at CA or UA steps
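Sketch (not from the slides): a target sequence can be screened for the CA and UA steps named above before choosing a production strategy. The function name is hypothetical.

```python
def cleavage_prone_steps(rna_seq, steps=("CA", "UA")):
    """Return (0-based position, dinucleotide) for every CA or UA step."""
    seq = rna_seq.upper()
    return [
        (i, seq[i:i + 2])
        for i in range(len(seq) - 1)
        if seq[i:i + 2] in steps
    ]


for position, step in cleavage_prone_steps("CUUAUUUG"):
    print(f"{step} step at position {position}")
```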
SLIDE 30 Strategy no. 3: RNAse H cleavage
Principle of the method
Step 1: transcription
Step 2: RNAse H cleavage
Step 3: VS ribozyme cleavage
[Figure labels: precursor RNA; RNA of interest; VS ribozyme SL; trans VS ribozyme; 2’-O-methyl RNA/DNA]
SLIDE 31 Strategy no. 3: RNAse H cleavage
ssRNA of interest protected as ds during transcription & purification
[Figure label: RNA of interest]
SLIDE 32
Product RNAse H cleavage vs Thermo CUUAUUUG
[Figure panels: aro 13C HSQC, ali 13C HSQC; labels: 5’P, 3’P-2’/3’, 5’OH, 3’OH]
SLIDE 33
Selectively ribose 13C labeled CUUAUUUG vs Thermo
13CHSQC
SLIDE 34 Use of editing/filtering in NMR to obtain intermolecular distance restraints
extraction of distances between protein & RNA
Protein: isotope labeled; RNA: unlabeled
SLIDE 35
Suite of edited/filtered 2D NOESYs for selectively labeled RNA & protein-RNA complexes
Peterson RD et al. J Biomol NMR. 2004 28(1):59-67.
SLIDE 36 F2f: 13C RRM12, unlabelled RNA
- Mainly 2D Filtered/edited NOESYs
used for sensitivity reasons
- Only a handful of strips in 3D 13Cali-F2f NOESY, mostly to methyls
> can be used as relatively certain starting points for interNOE assignment
- Additional interNOEs from 2D
NOESY on unlabeled complex: region to H1’/H5 can be used to screen for interNOEs even without labeling (free protein vs complex)
SLIDE 37
Solving ambiguities in intermolecular NOEs using selectively ribose labelled RNAs
U5H1’ & U6H1’ overlap
[Figure: F2f 2D NOESY, 13C-H (protein) vs 12C-H (RNA), for three CUUAU5U6UG RNA samples; U5H1’ and U6H1’ marked]
SLIDE 38
Protein-RNA interface RRM12-CUUAUUUG complex
[Figure labels: RRM1, RRM2; RNA 3’ and 5’; C, N]
LRRM12: T127 – A4
b4: L122 – U2/3
b1: F53 (RNP2) – A4
b3: F92 & Y94 (RNP1) – U5
b2: M82 – U5
b0: V26 & V28 – U7
L2-3: T84 & F85 – G8
b4: W207
a2: K189 – U6
SLIDE 39
Frog      S-QVLAVVKYDSHRAAAMAKKTLCEGSPILPGLPLTVNWLK
Human     P-GQIALLKFSSHRAAAMAKKALVEGQSHLCGEQVAVEWLK
Chimp     P-GQIAVLKFSSHRAAAMAKKALVEGQSHLCGEQVAVEWLK
Horse     S-AQIALLKFSSHRAAAMAKKALVEGQSRLCGEQVAVEWLK
Mouse     P-SQIALLKFSTHRAAAMAKKALVEGQSRLCGEQVAVEWLK
Rat       P-SQIALLKFSTHRAAAMAKKALVEGQSRLCGEQVAVEWLK
Opossum   H-TQIALLKFSSHRAAAMAKKALVEGRSKLCGDQVTVEWLK
Medaka    G--VSAVVAFSSHHAASMAKKALGEEFKKQFCLDISIKWLS
Zebrafish GKEVVALVNYTSHYAASMAKKVLVEAFRNRYGISITVRWTS
Marked (*): R182, M186, K189, W207
- Novel RNA-binding interface: alpha helix 2 is almost fully conserved
- The equivalent of the K189T mutation is the only shotgun mutant outside of RRM1
that fails to rescue loss of germ cells in Dnd1-depleted zebrafish embryos
- The equivalent of R182 is the site of the Ter-mouse premature stop codon
- Validation of these and other contacts on the interface, and of the specifically
recognized RNA nts, using mutational analysis in biophysical assays (ITC) and luciferase-based assays is in progress
Validation of novel RRM2 helix2-beta4-RNA interface
SLIDE 40
Tracing RNA binding of mutants using NMR
Mutation of V26 in the RRM1 extension destabilizes the protein but does not influence RNA binding
However, mutation of K189 (RRM2 helix2) moves binding into the intermediate exchange regime > evidence for impaired binding
SLIDE 41 Different strategies for tandem RRM-RNA recognition
[Figure: tandem RRM-RNA complexes of CPEB4, TDP43, PABP and HuD compared with Dnd1; RRM1 and RRM2 labeled in each]
SLIDE 42 Conclusion & Outlook
- Dnd1 harbours a novel abb_babbab – RRM fold consisting of a 6-strand
beta-sheet packed on triple alpha-helix
- The Dnd1 RRM12 – p27URR1 complex shows an unprecedented mode of
cooperative RNA recognition > in this arrangement RNA is fully buried over long sequence > fits physical blocking model (+ RNA remodeling? Change of localization?)
- Dnd1’s RRM12 recognizes a 5-mer NAUNU motif, much shorter than the
originally postulated 12-mer URR
> this already assisted in identifying potential targets in LATS2 3’UTR
- De-novo structure determination of protein-RNA complexes of this size is
possible with classical NMR methods, although this will depend on the system (exchange regime, overlap)
- Toolbox may be limited by the (im)possibilities to produce a specific
isotope labeled or deuterated protein & RNA sequence
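Sketch (not from the slides): candidate sites for the 5-mer NAUNU motif can be listed with a regular expression, using a lookahead so that overlapping sites are all reported. This is a plain pattern scan, not the target-identification procedure used for LATS2, which the slides do not describe. The function name is hypothetical.

```python
import re

# N = any nucleotide; the lookahead lets overlapping motifs be found
NAUNU = re.compile(r"(?=([ACGU]AU[ACGU]U))")


def find_naunu(rna_seq):
    """Return (0-based start, matched 5-mer) for every NAUNU site, overlaps included."""
    seq = rna_seq.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in NAUNU.finditer(seq)]


for start, site in find_naunu("CUUAUUUG"):
    print(f"{site} at position {start}")
```

A short degenerate motif like this occurs often by chance in U-rich 3’UTRs, so hits are candidates for follow-up rather than confirmed binding sites.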
SLIDE 43
Acknowledgements
Group Allain: Fred Allain; Fred Damberger & Thea Stahel (NMR Support); Tamara Kazeeva & Christine von Schroetter (Technical Support); Julien Boudet (ITC Support); Fionna Loughlin & Antoine Clery (initial experiments)
Group Hall ETHZ: Ugo Pradere & Mauro Zimmermann (selectively labeled RNA synthesis)
EMBL Heidelberg; Helmholtz Zentrum Munich & TUM: Bernd Simon & Tobias Madl (NMR pp)
PEPF: Arie Geerlof, Gunter Stier (plasmids & expression protocols)
Funding: UBS Promedica Stiftung; NCCR RNA&Disease; SNF 120% grant support