RNA bioinformatics
folding/interaction, stability/dynamics, thermodynamics/kinetics, …
Fabrice Leclerc fabrice.leclerc@u-psud.fr
November 16-17 2017
GT MASIM
I2BC - Dept. M2GB Genomics, Molecular Genetics, Bioinformatics
Team: RNA Sequence, Structure & Function - D. Gautheret
RNA-RNA interactions, RNA-protein interactions - F. Leclerc
regulations - J. Lehmann
platform), 2 Ph.D. students, 2 CDI IE, (1 invited professor)
specificity (modeling/design)
Khvorova et al., Nat. Struct. Biol., 2003; de la Peña et al., EMBO J., 2003; Canny et al., JACS, 2004; Wang et al., Biochem., 1999
[Figure: secondary structure of an H/ACA guide RNA (nucleotides numbered 10 to 50) showing the 5' and 3' guide sequences, the K-loop, the internal loop and the ANA loop, base-paired with its rRNA (tRNA) target]
Pab_HACA
[Figure: consensus H/ACA guide motif with lower stem (7-9 bp), 5´ and 3´ guide sequences, GA stem (8-11 bp), K-turn/K-loop and ANA, with the allowed length ranges of the loops and linkers]
[Figure: Pab/Pfu/Afu (PDB) H/ACA motif with 9 bp and 10 bp stems, bound by the proteins L7Ae, Nop10, Cbf5 and Gar1]
[Figure: guide RNA/target RNA duplex with an LSU target]
Toffano-Nioche et al., 2013; Toffano-Nioche et al., 2015
[Figure legend: variable-length region, variable-length loop, variable-length stem, connector (zero length), modular sub-structure; nucleotide present (50%, 75%, 90%, 97%); nucleotide identity (N: 75%, 90%, 97%); base pair annotations: covarying mutations, compatible mutations, no mutations observed]
11 guide motifs (7 genes), 23 potential targets
15 productive guide-target
[Flowchart: classification of H/ACA(-like) guide:target candidates]
Fold: productive fold or non-productive fold (refolding)
Guide:target base-pairing: productive H/ACA guide:target (7-9,11/j/k 10/5-6/9) or mispaired RNA:RNA duplex
Guide:target energetics: stable productive duplex (Ehybrid ≤ -27 & E5'duplex ≤ -6.0) or unstable hybrid or 5' duplex
Counts shown in the flowchart: 13, 2, 2, 6; 15, 2, 6, 15, 8, 15, 8
tested rRNA targets
non-productive H/ACA guide:target
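The energy criterion quoted in the flowchart (Ehybrid ≤ -27 & E5'duplex ≤ -6.0) can be read as a simple two-threshold filter. The sketch below is only an illustration of that reading: the class and function names are hypothetical, the unit (kcal/mol) is assumed from the energies quoted elsewhere in these slides, the first two entries reuse the two (E, E5') pairs shown on the following slide, and the last two entries are made-up values added to show rejections.

```python
from dataclasses import dataclass

# Thresholds quoted on the slide; kcal/mol is an assumption.
E_HYBRID_MAX = -27.0
E_5P_DUPLEX_MAX = -6.0


@dataclass
class GuideTargetPair:
    """Hypothetical container for one guide:target candidate."""
    name: str
    e_hybrid: float      # energy of the whole guide:target hybrid
    e_5p_duplex: float   # energy of the 5' duplex alone


def is_energetically_stable(pair: GuideTargetPair) -> bool:
    """Both energies must be at or below their thresholds."""
    return (pair.e_hybrid <= E_HYBRID_MAX
            and pair.e_5p_duplex <= E_5P_DUPLEX_MAX)


candidates = [
    GuideTargetPair("slide_example_1", -30.2, -6.2),
    GuideTargetPair("slide_example_2", -29.5, -9.5),
    GuideTargetPair("made_up_weak_hybrid", -20.0, -7.0),     # invented value
    GuideTargetPair("made_up_weak_5p_duplex", -28.0, -3.0),  # invented value
]

stable = [p.name for p in candidates if is_energetically_stable(p)]
print(stable)
```

The flowchart also involves fold and base-pairing checks; only the energetic step is sketched here.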
[Figure: consensus H/ACA guide motif with 9 bp and 10 bp stems and variable-length loops]
[Figure: guide:target duplex, LSU target 2588: E = -30.2 kcal/mol, E5' = -6.2 kcal/mol]
[Figure: guide:target duplex: E = -29.5 kcal/mol, E5' = -9.5 kcal/mol]
Toffano-Nioche et al., 2015
12 subfamilies
RNA interference
RNA-RNA interactome
[Figure: schematic base-pairing between RNA strands A, B and C]
RNA assembly
ribozymes
Monomer 1 (helices HI, HII, HIII), 79 nt, numbered 1 to 79:
gguucuucccaucuuucccugaagagacgaagcaagucgaaacucagagucggaaagucggaacagaccugguuucguc
Emonomer(10 °C) = -26.9 kcal/mol
Emonomer(25 °C) = -19.4 kcal/mol
Emonomer(45 °C) = -9.6 kcal/mol
[Figure A: secondary structure of the monomer; legend: intra-molecular base-pairs, tertiary or inter-molecular contacts, nucleotide in tertiary contact, cleavage site]
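The three monomer energies above vary almost linearly with temperature. As a rough illustration, and assuming (this is not stated on the slide) that E behaves like a free energy ΔG = ΔH - TΔS with temperature-independent ΔH and ΔS, a least-squares line through the three points gives apparent values for both. The function names below are hypothetical.

```python
# Monomer energies read off the slide: temperature (deg C) -> E (kcal/mol).
energies = {10.0: -26.9, 25.0: -19.4, 45.0: -9.6}


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


temps_k = [t + 273.15 for t in energies]   # convert to kelvin
values = list(energies.values())

slope, intercept = fit_line(temps_k, values)
delta_s = -slope      # apparent entropy, kcal/(mol K), under the assumption above
delta_h = intercept   # apparent enthalpy, kcal/mol, under the assumption above


def predicted_energy(temp_c):
    """Interpolated energy at temp_c (deg C) from the fitted line."""
    return delta_h - (temp_c + 273.15) * delta_s


print(f"apparent dH = {delta_h:.1f} kcal/mol, dS = {delta_s:.3f} kcal/(mol K)")
print(f"interpolated E(37 C) = {predicted_energy(37.0):.1f} kcal/mol")
```

With only three points this is a consistency check on the numbers, not a thermodynamic measurement.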
Template Model
SANS: Small Angle Neutron Scattering Institut Laue-Langevin (ILL), Grenoble
[Figure B: dimer of monomer 1 and monomer 2 (helices HI, HII, HIII), numbered 1a to 79a and 1b to 79b]
Temperature   Eint (kcal/mol)   Edimer (kcal/mol)
10 °C         -9.3              -47.1
25 °C         -8.5              -33.6
45 °C         -5.8              -15.7
seed sequences
IntaRNA
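To make the notion of a seed concrete, the sketch below scans two RNA strands for uninterrupted antiparallel runs of Watson-Crick or G-U pairs of a minimum length. This is not IntaRNA's algorithm (IntaRNA combines seed constraints with accessibility and hybridization energies); it is a plain string scan, and the function name, the minimum length and the two toy sequences are assumptions made for illustration.

```python
# Allowed base pairs: Watson-Crick plus G-U wobble.
PAIRS = {("a", "u"), ("u", "a"), ("g", "c"), ("c", "g"), ("g", "u"), ("u", "g")}


def can_pair(x, y):
    return (x, y) in PAIRS


def find_seeds(rna_a, rna_b, min_len=7):
    """Return (start_a, start_b, length) for every maximal run of
    consecutive antiparallel base pairs of at least min_len.
    start_a and start_b are 0-based positions of the first nucleotide
    of the run in each strand, read left to right as written."""
    a = rna_a.lower()
    b_rev = rna_b.lower()[::-1]   # reversing b turns antiparallel runs into diagonals
    seeds = []
    for i in range(len(a)):
        for k in range(len(b_rev)):
            if not can_pair(a[i], b_rev[k]):
                continue
            # Skip positions that only continue a run started earlier.
            if i > 0 and k > 0 and can_pair(a[i - 1], b_rev[k - 1]):
                continue
            length = 0
            while (i + length < len(a) and k + length < len(b_rev)
                   and can_pair(a[i + length], b_rev[k + length])):
                length += 1
            if length >= min_len:
                seeds.append((i, len(b_rev) - k - length, length))
    return seeds


# Toy strands (invented): "ggacuc" in the first pairs with "gagucc" in the second.
toy_a = "aaggacucaa"
toy_b = "ccgagucccc"
print(find_seeds(toy_a, toy_b, min_len=6))
```

The scan is quadratic in the sequence lengths, which is fine for strands of the size shown here (79 nt).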
[Figure C: dimer of monomer 1 and monomer 2 (helices HI, HII, HIII), numbered 1a to 79a and 1b to 79b]
Edimer(10 °C) = -53.7 kcal/mol
Edimer(25 °C) = -38.9 kcal/mol
Edimer(45 °C) = -19.2 kcal/mol
Leclerc et al., 2016
[Figure: RMSD (Å) vs time (ns) and Rg (Å) vs time (ns), with histograms of RMSD and of Rg; Rg (monomer 1) and Rg (monomer 2) vs time (ns), with their histograms]
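For reference, the two quantities plotted here can be computed per trajectory frame as follows. This is a minimal sketch under simplifying assumptions that are not taken from the slides: atoms are unweighted (no masses), the RMSD is computed without prior superposition onto the reference, and the coordinates are a made-up four-point example rather than simulation data.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration: root-mean-square distance
    of the points from their geometric center."""
    n = len(coords)
    center = [sum(c[d] for c in coords) / n for d in range(3)]
    total = sum(sum((c[d] - center[d]) ** 2 for d in range(3)) for c in coords)
    return math.sqrt(total / n)


def rmsd(coords, reference):
    """Root-mean-square deviation between two equally sized coordinate
    sets, atom by atom, without fitting one onto the other."""
    if len(coords) != len(reference):
        raise ValueError("coordinate sets must have the same length")
    total = sum(sum((p[d] - q[d]) ** 2 for d in range(3))
                for p, q in zip(coords, reference))
    return math.sqrt(total / len(coords))


# Made-up coordinates (Å): a square, and the same square shifted by 1 Å along z.
reference = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (2.0, 2.0, 0.0)]
frame = [(x, y, z + 1.0) for x, y, z in reference]

print(radius_of_gyration(reference), rmsd(frame, reference))
```

In practice, frames are usually superposed on the reference before the RMSD is taken, so that rigid-body motion does not contribute.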
[Figure: domain architecture of RNA-binding proteins (PTB, PABP, U1A, U2AF65, U2AF35, SF1, NusA, Vigilin, ADAR2, PKR, RNase III, Dicer, Staufen, TFIIIA, TTP) on a scale of 500 to 2,000; domains shown: Deaminase, Endonuclease, Kinase, KH, Helicase, PAZ, NusA N-terminal, PABP, dsRBD, RRM, R/S, S1, ZnF-CCHC, ZnF-CCHH, ZnF-CCCH]
[Figure panels: a, RRM; b, KH; c, dsRBD; d, Zinc finger (Finger 1, Finger 2)]
Figure 3 | How RNA-binding modules recognize RNA. a | Structure of the N-terminal
Lunde et al., Nat. Rev. Mol. Cell Biol., 2007
Zinc Finger
MCSS
TIS11d (CCCH)
[Figure: chemical structures of the A and U nucleotides]
3D/NMR: 2.2 Å ≤ RMSD ≤ 3.6 Å
PDB ID: 1RGO; constraint: n ≥ 8
Molpy CHARMM
Chauvot de Beauchene et al., 2016
trinucleotide libraries RRM
POSEVIEW: subsites S1 (3') and S2 (5')
Ligand    N      ΔH    TΔS    Kd (μM)
5'-AMP    0.89                124
5'-GMP    1.14                568
3'-CMP    0.99                51.6
3'-TMP    0.99                15
3'-UMP    NA     NA    NA     9.7
S1: U>C; S2: A>G
Doucet et al., Proteins, 2010
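The dissociation constants in the table above can be converted to standard binding free energies with ΔG = RT ln(Kd), with Kd expressed in mol/L. The sketch below does this for the five ligands. It reads the last value of each row as Kd in μM (an assumption about the flattened table) and assumes 298.15 K, since the slide does not state a measurement temperature; the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def delta_g_from_kd(kd_micromolar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant
    given in micromolar, relative to a 1 M standard state."""
    return R_KCAL * temperature_k * math.log(kd_micromolar * 1e-6)


# Kd values (μM) as read from the table; the column assignment is an assumption.
kd_um = {"5'-AMP": 124.0, "5'-GMP": 568.0, "3'-CMP": 51.6,
         "3'-TMP": 15.0, "3'-UMP": 9.7}

delta_g = {ligand: delta_g_from_kd(kd) for ligand, kd in kd_um.items()}

# Strongest binder (most negative free energy) first.
ranking = sorted(delta_g, key=delta_g.get)
for ligand in ranking:
    print(f"{ligand}: {delta_g[ligand]:.2f} kcal/mol")
```

Because ΔG depends only logarithmically on Kd, the roughly 60-fold spread in Kd corresponds to a difference of a few kcal/mol.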
box setting & fragment distribution (MCSS) RNase A
(PDB ID: 1RCN)
nucleotides RNase A
(PDB ID: 1RCN)
nucleobases: A, U, C, G
ribose-phosphate: ribose, phosphate
focusing on high density regions RNase A
(PDB ID: 1RCN)
23 × 23 × 23 Å³
17 × 17 × 17 Å³
[Figure: sequence logos (bits)]
S1 (U>C): cluster S1
S2 (A>G): cluster S2
Ligand   ∆H   nclust   T∆S   ∆G   ∆Gexp
5'-AMP   15   7.5
         21   11
         29   15
5'-UMP   52   26

PDB ID: 1RCN
Ligand   ∆H   nclust   T∆S   ∆G   ∆Gexp
5'-AMP   27   14
5'-GMP   19   10
5'-CMP   31   16
         25   13
Strasbourg