RNA bioinformatics: folding/interaction, stability/dynamics, thermodynamics/kinetics



SLIDE 1

RNA bioinformatics

folding/interaction, stability/dynamics, thermodynamics/kinetics, …


Fabrice Leclerc fabrice.leclerc@u-psud.fr

November 16-17 2017

GT MASIM

SLIDE 2

I2BC - Dept. M2GB Genomics, Molecular Genetics, Bioinformatics
Team: RNA Sequence, Structure & Function - D. Gautheret

  • RNA discovery: genomics, transcriptomics (cancer) - D. Gautheret
  • RNA structure & interactions: snoRNAs, ribozymes-viroids, RNA-RNA interactions, RNA-protein interactions - F. Leclerc
  • RNA function & processing: genetic code, RNA-based regulations - J. Lehmann
  • 4 senior researchers (C. Toffano: vice-director of the eBio platform), 2 Ph.D. students, 2 CDI IE, (1 invited professor)

SLIDE 3

Some recurrent problems & issues

  • 2D&3D modeling of RNA variants/mutants from 3D structures
  • sparse structural data
  • experimental 3D structures & biological-physiological conditions
  • 2D&3D modeling of RNA-RNA interactions
  • dense RNA-RNA interactome
  • predict RNA interactions; 2D to 3D transposition
  • 3D modeling of RNA/protein interactions
  • RNA binding? Binding interface, binding mode, binding specificity (modeling/design)

SLIDE 4

Structural/biochemical data inconsistencies

Khvorova et al., Nat. Struct. Biol., 2003; de la Peña et al., EMBO J., 2003; Canny et al., JACS, 2004; Wang et al., Biochem., 1999

Sequence lengths shown: 41-nt, 50-nt, 56-nt, 50-nt

SLIDE 5

2D&3D structures of H/ACA box s(no)RNAs (guide)

[Figure: 2D and 3D structures of an H/ACA guide RNA paired with its target rRNA (tRNA). Labels: K-loop, internal loop, ANA loop, 5' guide sequence, 3' guide sequence, guide, target.]

Guide sequence (5' to 3', numbered 10 to 50 in the figure): GGGCCCGGCUCCCGCCCUCUCCGGGGAAUCGUGAACCGGGGGUUCCGGCCGGGCCUACA

SLIDE 6

Structure/function relationships in H/ACA box s(no)RNAs

[Figure: consensus 2D structure of the Pab_HACA guides next to the Pab/Pfu/Afu structures from the PDB, with a guide RNA/target RNA pairing on an LSU target. Proteins labeled: L7Ae, Nop10, Cbf5, Gar1. Structural labels: lower stem, GA stem, K-turn/K-loop, 5´ guide sequence, 3´ guide sequence, ANA. Stem lengths: 7-9 bp and 8-11 bp in the Pab_HACA consensus; 9 bp and 10 bp in the Pab/Pfu/Afu structures. Legend: nucleotide present and nucleotide identity (50%, 75%, 90%, 97%); base pair annotations (covarying mutations, compatible mutations, no mutations observed); variable-length region, loop and stem; connector (zero length); modular sub-structure.]

Toffano-Nioche et al., 2013; Toffano-Nioche et al., 2015

11 guide motifs (7 genes) 23 potential targets

15 productive guide-target

SLIDE 7

2D(3D) structural&energetic rules to classify “productive” guide RNAs

[Figure: decision flowchart. H/ACA(-like) guide:target candidates are sorted by fold (productive fold with motif 7-9,11/j/k or 10/5-6/9, versus non-productive fold, with refolding), by guide:target energetics (stable productive duplex versus unstable hybrid or 5' duplex) and by guide:target base-pairing (productive versus mispaired RNA:RNA duplex), giving productive and non-productive H/ACA guide:target pairs among the tested rRNA targets. A consensus guide with 9 bp and 10 bp stems is shown alongside.]

Energetic criterion: Ehybrid ≤ -27 & E5'duplex ≤ -6.0

  • 1. motif 10/5-6/9
  • 2. guide-target stability
  • 3. duplex 5’ stability

Example guide:target duplexes (motif 10/5/9, ANA box):

  • LSU 2588: E = -30.2 kcal/mol, E5’ = -6.2 kcal/mol
  • E = -29.5 kcal/mol, E5’ = -9.5 kcal/mol

Toffano-Nioche et al., 2015
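
The energetic rule on this slide (Ehybrid ≤ -27 & E5'duplex ≤ -6.0) can be written as a small predicate. This is a minimal sketch, assuming that both thresholds are in kcal/mol (the unit of the example duplexes) and that the motif check of rule 1 has been done separately; the function and variable names are illustrative and do not come from the original work.

```python
# Thresholds quoted on the slide; the kcal/mol unit is an assumption.
E_HYBRID_MAX = -27.0
E_5P_DUPLEX_MAX = -6.0


def passes_energy_rules(e_hybrid, e_5p_duplex):
    """Return True if a guide:target pair meets both energetic criteria.

    e_hybrid    -- energy of the whole guide:target hybrid
    e_5p_duplex -- energy of the 5' duplex alone
    """
    return e_hybrid <= E_HYBRID_MAX and e_5p_duplex <= E_5P_DUPLEX_MAX


# The two example duplexes shown on the slide, as (E, E5') pairs.
examples = [(-30.2, -6.2), (-29.5, -9.5)]
results = [passes_energy_rules(e, e5) for e, e5 in examples]
print(results)
```

Both example duplexes satisfy the two thresholds, which is consistent with their being shown as productive pairs.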
SLIDE 8

2D aligned structures of archaeal H/ACA box s(no)RNAs

[Figure: 2D structures (HACAprodfold) of archaeal H/ACA guides drawn on a common layout: 9 bp stem, 10 bp stem, K-turn or K-loop, ANA box, with nucleotide numbering and the consensus motif.]

  • K-loop guides: Af190, Af46, Pa21-S891, Pa35.2-L2250, Pa35.2-L2549, Pa35.2-L2697, Pa40.1-L2588, Pa40.2-L1932, Pa91-L2685
  • K-turn guides: Af4.1, Af4.2a, Af4.2b, Af4.3, Hvo.1, Hvo.2a, Hvo.2b, Pa105.1-L2554, Pa105.2-L1377, Pa105.2-L2794, Pa105.2-S27, Pa105.2-S995, Pa40.3-L2016
  • Other guides shown: Pa160-S922, Pa19-S1017, Pa35.1-L2672, Pa35.1-L2930

12 subfamilies

SLIDE 9

RNA-RNA interactions

[Figure: schematic pairing examples between RNA partners (A, B, C) illustrating RNA interference, the RNA-RNA interactome, RNA assembly and ribozymes.]

SLIDE 10

2D structure of Hammerhead RNA

[Figure, panel A: 2D structure of monomer 1 (79 nt) with helices HI, HII and HIII. Legend: intra-molecular base-pairs, tertiary or inter-molecular contacts, nucleotide in tertiary contact, cleavage site.]

Sequence (1-79): gguucuucccaucuuucccugaagagacgaagcaagucgaaacucagagucggaaagucggaacagaccugguuucguc

  • Emonomer(10ºC) = -26.9 kcal/mol
  • Emonomer(25ºC) = -19.4 kcal/mol
  • Emonomer(45ºC) = -9.6 kcal/mol
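
The three monomer energies lie almost on a straight line in temperature, so they can be summarized by an effective enthalpy and entropy. This is a rough sketch, assuming E(T) = ΔH − T·ΔS with temperature-independent ΔH and ΔS and an unchanged structure over the range; these assumptions are not stated on the slide, and the helper name is illustrative.

```python
# Monomer folding energies from the slide: (temperature in °C, E in kcal/mol).
points = [(10.0, -26.9), (25.0, -19.4), (45.0, -9.6)]


def fit_enthalpy_entropy(data):
    """Least-squares fit of E(T) = dH - T*dS, with T converted to kelvin.

    Returns (dH in kcal/mol, dS in kcal/(mol*K)).
    """
    temps = [t + 273.15 for t, _ in data]
    energies = [e for _, e in data]
    n = len(data)
    t_mean = sum(temps) / n
    e_mean = sum(energies) / n
    s_xy = sum((t - t_mean) * (e - e_mean) for t, e in zip(temps, energies))
    s_xx = sum((t - t_mean) ** 2 for t in temps)
    slope = s_xy / s_xx
    d_s = -slope                 # E = dH - T*dS, so the slope is -dS
    d_h = e_mean + d_s * t_mean  # the fitted line passes through the mean point
    return d_h, d_s


d_h, d_s = fit_enthalpy_entropy(points)
t_zero = d_h / d_s - 273.15      # temperature (°C) at which the fitted E reaches zero
print(f"dH = {d_h:.1f} kcal/mol, dS = {d_s * 1000:.0f} cal/(mol*K), E = 0 near {t_zero:.0f} °C")
```

The zero crossing is only an extrapolation of the fitted line, not a measured melting temperature.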

SLIDE 11

3D Modeling of HHR variant

[Figure: 3D structures of the template and of the model.]

SLIDE 12

SANS* & Modeling

  • d(exp) = 96.0 Å; d(calc) = 96.7 Å
  • Rg(exp) = 31 Å; Rg(calc) = 29 (26) Å

*SANS: Small Angle Neutron Scattering, Institut Laue-Langevin (ILL), Grenoble
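
A calculated Rg and a maximal dimension can be obtained from model coordinates as follows. This is a minimal sketch, assuming that d denotes the largest inter-atomic distance and that a mass-weighted (here uniformly weighted) Rg is an acceptable stand-in for the scattering-weighted value; the toy coordinates are made up for illustration.

```python
import math
from itertools import combinations


def radius_of_gyration(coords, masses=None):
    """Weighted radius of gyration of a set of 3D points."""
    if masses is None:
        masses = [1.0] * len(coords)  # uniform weights if no masses are given
    total = sum(masses)
    center = [sum(m * c[i] for m, c in zip(masses, coords)) / total for i in range(3)]
    weighted_sq = sum(
        m * sum((c[i] - center[i]) ** 2 for i in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)


def max_dimension(coords):
    """Largest pairwise distance (brute force, fine for small models)."""
    return max(math.dist(a, b) for a, b in combinations(coords, 2))


# Toy example: four points on a 10 Å square (hypothetical coordinates).
toy = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (10.0, 10.0, 0.0)]
rg = radius_of_gyration(toy)
d_max = max_dimension(toy)
print(f"Rg = {rg:.2f} Å, d = {d_max:.2f} Å")
```

For a real model the coordinates would be read from the structure file, and the pairwise search would be replaced by a faster method for large systems.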

SLIDE 13

HHR self-association

[Figure, panel B: 2D structures of the HHR dimer (monomer 1: 1a-79a; monomer 2: 1b-79b) with helices HI, HII and HIII; seed sequences marked, position 30b labeled.]

  • Eint(10ºC) = -9.3 kcal/mol
  • Eint(25ºC) = -8.5 kcal/mol
  • Eint(45ºC) = -5.8 kcal/mol
  • Edimer(10ºC) = -47.1 kcal/mol
  • Edimer(25ºC) = -33.6 kcal/mol
  • Edimer(45ºC) = -15.7 kcal/mol

IntaRNA

SLIDE 14

HHR self-association & loss of catalytic activity

[Figure, panel C: 2D structures of the HHR dimer (monomer 1: 1a-79a; monomer 2: 1b-79b) with helices HI, HII and HIII.]

  • Edimer(10ºC) = -53.7 kcal/mol
  • Edimer(25ºC) = -38.9 kcal/mol
  • Edimer(45ºC) = -19.2 kcal/mol

SLIDE 15

3D Model of HHR self-assembly

Leclerc et al., 2016

SLIDE 16

Dynamics of HHR(-) dimer

[Figure: time series and histograms. Panels: RMSD vs time (ns); Rg vs time (ns); histogram of RMSD; histogram of Rg; Rg (monomer 1) vs time (ns); Rg (monomer 2) vs time (ns); histogram of Rg (monomer 1); histogram of Rg (monomer 2). Axes: RMSD (Å), Rg (Å), time (ns).]
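
The RMSD series and its histogram can be computed along these lines. This is a minimal sketch, assuming that each frame has already been superimposed on the reference (no fitting is done here) and using made-up two-atom frames; it is not the analysis pipeline used for the slide.

```python
import math


def rmsd(frame, reference):
    """RMSD between two pre-aligned coordinate sets of equal length."""
    if len(frame) != len(reference):
        raise ValueError("frame and reference must have the same number of atoms")
    sq_sum = sum(math.dist(a, b) ** 2 for a, b in zip(frame, reference))
    return math.sqrt(sq_sum / len(frame))


def histogram(values, bin_width):
    """Count values per bin; keys are the lower edges of the bins."""
    counts = {}
    for v in values:
        edge = math.floor(v / bin_width) * bin_width
        counts[edge] = counts.get(edge, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical trajectory: the reference, then two frames shifted along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frames = [
    reference,
    [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)],
    [(0.0, 0.0, 3.5), (1.0, 0.0, 3.5)],
]
series = [rmsd(f, reference) for f in frames]
hist = histogram(series, bin_width=2.0)
print(series, hist)
```

The same `histogram` helper applies to an Rg series, one value per frame.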

SLIDE 17

RNA Binding Proteins (RBPs): Modularity, dsRNA/ssRNA

[Figure: domain architecture of PTB, PABP, U1A, U2AF65, U2AF35, SF1, NusA, Vigilin, ADAR2, PKR, RNase III, Dicer, Staufen, TFIIIA and TTP on a length scale (500 to 2,000). Domains: Deaminase, Endonuclease, Kinase, KH, Helicase, PAZ, NusA N-terminal, PABP, dsRBD, RRM, R/S, S1, ZnF-CCHC, ZnF-CCHH, ZnF-CCCH.]

[Figure: RNA-binding modules bound to RNA: a RRM, b KH, c dsRBD, d Zinc finger (Finger 1, Finger 2).]

Figure 3 | How RNA-binding modules recognize RNA. a | Structure of the N-terminal …

Lunde et al., Nat. Rev. Mol. Cell Biol., 2007

SLIDE 18

Naive approach for modeling ssRNA ligands (fragment-linking)

  • Zinc finger: TIS11d (CCCH), PDB ID: 1RGO
  • MCSS fragments: A and U nucleotides [Figure: chemical structures of the two nucleotides]
  • 3D/NMR: 2.2 Å ≤ RMSD ≤ 3.6 Å
  • constraint: n ≥ 8
  • Molpy, CHARMM

SLIDE 19

Proof of concept: fragment-based modeling of RNA ligands

trinucleotide libraries, RRM

Chauvot de Beauchene et al., 2016

SLIDE 20

Binding strength & specificity in RBPs: RNase A

[Figure: POSEVIEW diagram of the binding sites S1 (3’) and S2 (5’).]

Ligand    N      ΔH      TΔS     Kd (μM)
5’-AMP    0.89   -17.1   -11.6   124
5’-GMP    1.14   -10.9   -6.4    568
3’-CMP    0.99   -13.5   -7.5    51.6
3’-TMP    0.99   -14.3   -7.6    15
3’-UMP    NA     NA      NA      9.7

S1: U>C S2: A>G

Doucet et al., Proteins, 2010
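
The tabulated quantities are linked by ΔG = ΔH − TΔS and ΔG = RT·ln(Kd). The sketch below checks that the two routes agree for the ligands with a full row. It assumes energies in kcal/mol and a temperature of 25 °C, neither of which is stated on the slide, and it reads the ΔH and TΔS entries as negative values; the function names are illustrative.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15   # assumed temperature (25 °C); not given on the slide

# (ligand, dH, TdS, Kd in µM) as tabulated; kcal/mol is an assumption.
itc = [
    ("5'-AMP", -17.1, -11.6, 124.0),
    ("5'-GMP", -10.9, -6.4, 568.0),
    ("3'-CMP", -13.5, -7.5, 51.6),
    ("3'-TMP", -14.3, -7.6, 15.0),
]


def delta_g_from_terms(d_h, t_d_s):
    """Binding free energy from the enthalpic and entropic terms."""
    return d_h - t_d_s


def delta_g_from_kd(kd_micromolar, temperature=T_KELVIN):
    """Binding free energy from a dissociation constant (1 M standard state)."""
    return R_KCAL * temperature * math.log(kd_micromolar * 1e-6)


rows = [
    (name, delta_g_from_terms(d_h, t_d_s), delta_g_from_kd(kd))
    for name, d_h, t_d_s, kd in itc
]
for name, dg_terms, dg_kd in rows:
    print(f"{name}: dH - TdS = {dg_terms:.1f}, RT ln(Kd) = {dg_kd:.1f}")
```

Under these assumptions the two estimates differ by less than 0.3 kcal/mol for every ligand.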

SLIDE 21

Predict “strong” nucleotide binding sites (fragment-growing)

box setting & fragment distribution (MCSS), RNase A (PDB ID: 1RCN)

SLIDE 22

Predict nucleotide binding sites specificity

[Figure: RNase A (PDB ID: 1RCN) with panels for nucleotides, nucleobases (A, U, C, G) and ribose-phosphate (ribose, phosphate).]

SLIDE 23

Predict nucleotide binding preferences in “hotspots”

focusing on high density regions, RNase A (PDB ID: 1RCN)

  • 23×23×23 Å³
  • 17×17×17 Å³

SLIDE 24

Clustering & scoring nucleotide binding preferences

[Figure: sequence logo (bits) of the nucleotide preferences at positions 1 and 2, with clusters S1 (U>C) and S2 (A>G). PDB ID: 1RCN]

S1 (U>C)

Ligand    ∆H    nclust   T∆S   ∆G    ∆Gexp
5’-AMP    -16   15       7.5   -24   -
5’-GMP    -19   21       11    -30   -
5’-CMP    -18   29       15    -33   -6.0
5’-UMP    -21   52       26    -47   -6.7

S2 (A>G)

Ligand    ∆H    nclust   T∆S   ∆G    ∆Gexp
5’-AMP    -23   27       14    -37   -5.5
5’-GMP    -24   19       10    -34   -4.5
5’-CMP    -18   31       16    -34   -
5’-UMP    -20   25       13    -33   -
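
The scores in the two tables are consistent with ∆G = ∆H − T∆S, and ranking the ligands by ∆G gives the preferred nucleotide per site. This is a minimal sketch, assuming that the first table belongs to site S1 and the second to site S2, and that bulleted entries in the extracted text are negative values; the dictionary and function names are illustrative.

```python
# (dH, TdS) per ligand, transcribed from the two tables on the slide.
site_s1 = {"5'-AMP": (-16, 7.5), "5'-GMP": (-19, 11), "5'-CMP": (-18, 15), "5'-UMP": (-21, 26)}
site_s2 = {"5'-AMP": (-23, 14), "5'-GMP": (-24, 10), "5'-CMP": (-18, 16), "5'-UMP": (-20, 13)}


def score(d_h, t_d_s):
    """Score used for ranking: dG = dH - TdS (more negative is better)."""
    return d_h - t_d_s


def rank_ligands(site):
    """Return ligand names sorted from best (lowest dG) to worst."""
    return sorted(site, key=lambda name: score(*site[name]))


rank_s1 = rank_ligands(site_s1)
rank_s2 = rank_ligands(site_s2)
print("S1:", rank_s1)
print("S2:", rank_s2)
```

With these values UMP ranks first in the first table and AMP first in the second, which agrees with the U>C and A>G preferences quoted for S1 and S2.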

SLIDE 25

Acknowledgments

  • Claire Toffano-Nioche, Daniel Gautheret: I2BC
  • Marie-Christine Maurel, Jacques Vergne: MNHN, Univ. Paris 6
  • Giuseppe Zaccai & Anne Martel, ILL, Grenoble
  • Nicolas Chevrollier, I2BC, Univ. Paris 11
  • Manuel Simoes, Ph. D., Université de Strasbourg
  • Martin Karplus, Prof. emeritus, Harvard University - Univ. Strasbourg