Marc A. Marti-Renom Genome Biology Group (CNAG) Structural - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Hybrid [integrative] methods for structure determination

  • Marc A. Marti-Renom

Genome Biology Group (CNAG) Structural Genomics Group (CRG)

slide-2
SLIDE 2

IMP

slide-3
SLIDE 3

Francis Crick, Rosalind Franklin, Maurice Wilkins, James Watson

Integrative Modeling

Since 1956

slide-4
SLIDE 4

Build geometric models

Components, spatial restraints, model

  • X-ray crystallography
  • NMR
  • Electron microscopy
  • Small-angle X-ray scattering
  • Cross-linking
  • HDXMS
  • Proteomics, mass spectrometry
  • Copurification
  • Bioinformatics, physics

Sampling and analysis

slide-5
SLIDE 5

The Integrative Modeling Platform (IMP)

http://www.integrativemodeling.org

  • IMP C++/Python library
  • multifit/restrainer
  • Chimera tools/web apps
  • Domain-specific applications

[Figure axis labels: Simplicity, Generalization]

Russel et al, PLoS Biology, 2012

slide-6
SLIDE 6

Russel et al, PLoS Biology, 2012

  • Stage 1: Gathering information.
  • Stage 2: Choosing how to represent and evaluate models.
  • Stage 3: Finding models that score well.
  • Stage 4: Analyzing resulting models and information.

IMP

The Integrative Modeling Platform (IMP)

http://www.integrativemodeling.org

slide-7
SLIDE 7

Representation

  • Atomic
  • Rigid bodies
  • Coarse grained
  • Multi-scale
  • Symmetry/periodicity
  • Multi-state systems
slide-8
SLIDE 8

Scoring

  • Proteomics
  • Density maps
  • EM images
  • FRET
  • Chemical cross-linking
  • Homology-derived restraints
  • SAXS
  • Native mass spec
  • Statistical potentials
  • Molecular mechanics forcefields
  • Bayesian scoring functions
  • Library of functional forms (ambiguity, ...)
slide-9
SLIDE 9

Sampling

  • Monte Carlo
  • Conjugate Gradients
  • Quasi-Newton
  • Simplex
  • Divide and conquer sampler
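
As an illustration of the first sampler in the list, here is a minimal Metropolis Monte Carlo sketch in plain Python. It does not use IMP's own optimizer classes. The toy score (consecutive beads should sit 1.0 apart), the step size, the temperature and the function names are all assumptions made for this example.

```python
import math
import random

def score(coords, target=1.0):
    """Toy score: squared deviation of consecutive bead distances from `target`."""
    total = 0.0
    for a, b in zip(coords, coords[1:]):
        total += (math.dist(a, b) - target) ** 2
    return total

def metropolis(coords, steps=5000, step_size=0.2, temperature=0.05, seed=0):
    """Move one random bead per step; accept with the Metropolis criterion."""
    rng = random.Random(seed)
    current = [list(c) for c in coords]
    current_score = score(current)
    best, best_score = [c[:] for c in current], current_score
    for _ in range(steps):
        i = rng.randrange(len(current))
        old = current[i][:]
        current[i] = [x + rng.uniform(-step_size, step_size) for x in old]
        new_score = score(current)
        delta = new_score - current_score
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_score = new_score  # accept the move
            if new_score < best_score:
                best, best_score = [c[:] for c in current], new_score
        else:
            current[i] = old  # reject: restore the previous position
    return best, best_score

# Hypothetical starting configuration of four beads.
start = [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 4.0, 0.0]]
best, best_score = metropolis(start)
```

Uphill moves are accepted with probability exp(-delta/temperature), which is what lets the sampler leave local minima that a pure gradient method would stay in.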
slide-10
SLIDE 10

Analysis

  • Clustering
  • Output
  • Chimera
  • PyMOL
  • PDBs
  • Density maps
slide-11
SLIDE 11
  • The NPC

Alber, F., Dokudovskaya, S., Veenhoff, L. M., Zhang, W., Kipper, J., Devos, D., Suprapto, A., et al. (2007). Nature, 450(7170), 695–701

slide-12
SLIDE 12
[Table: per-nucleoporin representation, with header symbols N, {Bj}, n and r. The numeric columns are not recoverable from this transcript. Proteins listed: Nup192, Nup188, Nup1, Nup170, Nup157, Nsp1, Nup133, Gle1, Nup120, Nup60, Nup85, Nup84, Nup59, Nup145C, Seh1, Sec13, Nup57, Gle2, Nic96, Nup53, Nup82, Nup145N]

  • Representation

436 proteins!

slide-13
SLIDE 13

Data generation and data interpretation

Columns: Method; Experiments; Restraint; RC; RO; RA; Functional form of activated feature restraint.

[The particle index sets that follow "Applied to" are garbled in this transcript; only named proteins are kept.]

Bioinformatics and membrane fractionation

  • Protein excluded volume restraint (30 Nup sequences); count 1,864 × 1,863 / 2. Protein-protein: violated for f < fo. f is the distance between two beads, fo is the sum of the bead radii, and σ is 0.01 nm. Applied to all pairs of particles in representation 1.
  • Surface localization restraint (30 Nup sequences and immuno-EM, see below):
  • Membrane-surface location; count 48. Violated if f ≠ fo. f is the distance between a protein particle and the closest point on the NE surface (half-torus), fo = 0 nm, and σ is 0.2 nm. Applied to particles of Ndc1, Pom152, Pom34.
  • Pore-side volume location; count 64. Violated if f < fo. f is the distance between a protein particle and the closest point on the NE surface (half-torus), fo = 0 nm, and σ is 0.2 nm. Applied to particles of Ndc1, Pom152, Pom34.
  • Perinuclear volume location; count 80. Violated if f > fo. f is the distance between a protein particle and the closest point on the NE surface (half-torus), fo = 0 nm, and σ is 0.2 nm. Applied to particles of Pom152.

Hydrodynamics experiments

  • Complex shape restraint (1 S-value); RC 1, RO 164, RA 1. Complex diameter: violated if f < fo. f is the distance between two protein particles representing the largest diameter of the largest complex, fo is the complex maximal diameter D = 19.2 − R, where R is the sum of both particle radii, and σ is 0.01 nm. Applied to particles of proteins in composite C45.
  • Protein chain restraint (30 S-values); count 1,680. Protein chain: violated if f ≠ fo. f is the distance between two consecutive particles in a protein, fo is the sum of the particle radii, and σ is 0.01 nm.

Immuno-electron microscopy

  • Protein localization restraint (10,940 gold particles):
  • Z-axial position, lower bound; count 456. Violated for f < fo. f is the absolute Cartesian Z-coordinate of a protein particle, fo is the lower bound defined for the protein type, and σ is 0.1 nm.
  • Z-axial position, upper bound; count 456. Violated for f > fo. f is the absolute Cartesian Z-coordinate of a protein particle, fo is the upper bound defined for the protein type, and σ is 0.1 nm.
  • Radial position, lower bound; count 456. Violated for f < fo. f is the radial distance between a protein particle and the Z-axis in a plane parallel to the X and Y axes, fo is its lower bound defined for the protein type, and σ is 0.1 nm.
  • Radial position, upper bound; count 456. Violated for f > fo. f is the radial distance between a protein particle and the Z-axis in a plane parallel to the X and Y axes, fo is its upper bound defined for the protein type, and σ is 0.1 nm.

Overlay assays

  • Protein interaction restraint (13 contacts); RC 20, RO 112, RA 20. Protein contact: violated for f > fo. f is the distance between two protein particles, fo is the sum of the particle radii multiplied by a tolerance factor of 1.3, and σ is 0.01 nm.

Affinity purification

  • Competitive binding restraint (4 complexes); RC 1, RO 132, RA 4. Protein contact: violated for f > fo. f is the distance between two protein particles, fo is the sum of the particle radii multiplied by a tolerance factor of 1.3, and σ is 0.01 nm. Applied to Nup82, Nic96, Nup49, Nup57.
  • Protein proximity restraint (64 complexes); RC 692, RO 25,348, RA 692. Protein proximity: violated for f > fo. f is the distance between two protein particles, fo is the maximal diameter of a composite complex, and σ is 0.01 nm.
  • Scoring
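
The restraint table describes each restraint by a feature f, a bound fo and a small tolerance in nm. The sketch below shows one way such "violated for f < fo" and "violated for f > fo" restraints can be scored. The quadratic penalty, the reading of the tolerance as a scale parameter `sigma`, and all function names are assumptions for illustration, not the functional form used in the cited work.

```python
import math

def lower_bound(f, fo, sigma):
    """Penalty when f < fo (assumed quadratic); zero when satisfied."""
    return ((fo - f) / sigma) ** 2 if f < fo else 0.0

def upper_bound(f, fo, sigma):
    """Penalty when f > fo (assumed quadratic); zero when satisfied."""
    return ((f - fo) / sigma) ** 2 if f > fo else 0.0

def excluded_volume(p1, r1, p2, r2, sigma=0.01):
    """Two beads should not overlap: distance >= sum of radii."""
    return lower_bound(math.dist(p1, p2), r1 + r2, sigma)

def protein_contact(p1, r1, p2, r2, tolerance=1.3, sigma=0.01):
    """Two particles in contact: distance <= 1.3 x sum of radii."""
    return upper_bound(math.dist(p1, p2), tolerance * (r1 + r2), sigma)

# Hypothetical particles, coordinates and radii in nm.
a, b = (0.0, 0.0, 0.0), (3.0, 0.0, 0.0)
overlap = excluded_volume(a, 2.0, b, 2.0)  # beads overlap by 1 nm
contact = protein_contact(a, 1.0, b, 1.0)  # 3.0 nm apart, bound is 2.6 nm
satisfied = excluded_volume(a, 1.0, b, 1.0)  # no overlap, no penalty
```

A total score is then the sum of such terms over all activated restraints, so a configuration that satisfies every bound scores zero.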
slide-14
SLIDE 14
Optimization

[Plots: score versus contact similarity; score versus number of configurations]

slide-15
SLIDE 15

Integrating data

  • Immuno-EM
  • Ultracentrifugation
  • Nucleoporin stoichiometry
  • NPC symmetry
  • Nuclear envelope pore volume
  • Overlay assays
  • Affinity purifications

slide-16
SLIDE 16
[Figure labels: Spoke; Cytoplasm, Nucleoplasm; scale bars 5 nm. Groups: outer rings, inner rings, membrane rings, linker nucleoporins, FG nucleoporins. Proteins: Pom152, Ndc1, Pom34, Nup120, Nup85, Nup145C, Nup84, Sec13, Seh1, Nup133, Nup188, Nup192, Nup170, Nup157, Nup82, Nic96, Nup145N, Nup53, Nup1, Nup60, Nsp1, Nup59, Nup49, Nup57, Nup159, Nup100, Nup116, Nup42]

The STRUCTURE of NPC

slide-17
SLIDE 17

IMP-based efforts

  • PCSK9-Fab complex: Sali, Cheng, Agard, Pons
  • Nup84 complex: Sali, Rout, Chait
  • Nuclear Pore Complex: Sali, Rout, Chait
  • Nuclear Pore Complex transport: Sali, Rout, Chait, Chook, Liphardt
  • 26S Proteasome: Sali, Baumeister
  • Spindle Pole Body: Sali, Davis, Muller
  • Chromatin globin domain: Marti-Renom
  • Lymphoblastoid cell genome: Alber, Chen
  • Microtubule nucleation: Sali, Agard
  • Ribosomes: Sali, Frank; Sali, Akey
  • Hsp90 landscape: Sali, Agard
  • TRiC/CCT: Sali, Frydman, Chiu
  • Actin: Sali, Chiu
  • RyR channel: Sali, Serysheva, Chiu

slide-18
SLIDE 18

Who is developing with IMP?

slide-19
SLIDE 19

Sperm whale myoglobin structure (1960)

Alpha-globin genomic domain structure (2011)

From proteins to genomes

slide-20
SLIDE 20

Resolution Gap

Marti-Renom, M. A. & Mirny, L. A. PLoS Comput Biol 7, e1002125 (2011)

[Figure axes: resolution (μ), time (s), volume (μm), DNA length (nt); the exponents of the tick values are lost]

Knowledge

IDM INM

slide-21
SLIDE 21

“Bridging” the Resolution Gap

Dekker, J., Marti-Renom, M. A., & Mirny, L. A. (2013). Nature Reviews Genetics, 14(6), 390–403.

[Figure labels: A compartments, B compartments; scales 20 Mb and 2 Mb; interaction preference; TADs; compartments]

slide-22
SLIDE 22

Experiments Computation

Hybrid Method

Baù, D. & Marti-Renom, M. A. Methods 58, 300–306 (2012).

slide-23
SLIDE 23

Hi-C technology

Lieberman-Aiden, E. et al. Science 326, 289–293 (2009).

http://3dg.umassmed.edu


Chr.18 (Hind III)

slide-24
SLIDE 24

  • Biomolecular structure determination: 2D-NOESY data
  • Chromosome structure determination: 3C-based data

slide-25
SLIDE 25

3C-like data

Nora, E. P., et al. (2012). Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature

slide-26
SLIDE 26

http://www.3Dgenomes.org

slide-27
SLIDE 27

[Figure labels: P1, P2; i, i+1, i+2, i+n]
slide-28
SLIDE 28

On TADs and hormones

Davide Baù, François Serra, François le Dily

slide-29
SLIDE 29

Progesterone-regulated transcription in breast cancer

  • > 2,000 genes up-regulated
  • > 2,000 genes down-regulated

Regulation in 3D?

Vicent et al. 2011, Wright et al. 2012, Ballare et al. 2012

slide-30
SLIDE 30

Experimental design

  • Hi-C libraries: Chr.18 (NcoI), Chr.18 (Hind III)
  • ChIP-Seq
  • RNA-Seq
  • Hi-C

slide-31
SLIDE 31

Are there TADs? How robust?

Chr.18, −Pg and +Pg

[Figure legend: conserved; ± 100 kb; 200 kb or more. Values shown: 8%, 12%, 80%]

>2,000 detected TADs

[Plot: size (Mb) per chromosome]

slide-32
SLIDE 32

Are TADs homogeneous?

Marks: H3K36me2, H3K4me3, H3K4me1, H3K14ac, H3K9me3, H3K27me3, HP1, H1.2

[Figure, two panels (−Pg; +Pg/−Pg): Chr.2, 5 Mb scale. Correlation coefficient (−1.0 to 1.0) for same TAD, same random TAD, inter-TADs and consecutive TADs; comparisons marked ***]

slide-33
SLIDE 33

Do TADs respond differently to Pg treatment?

[Plot: % of genes per TAD with positive or negative fold change (bins from 0-10% to 90-100%); frequencies, observed versus expected, and observed/expected ratio (Log2)]

[Figure: two example TADs (TAD 469, TAD 821) with expression levels (Log2 RPKM) under −Pg and +Pg and Log2 fold change per gene. Genes shown: ZBTB2, RMND1, C6orf211, CCDC170, ESR1, SYNE1; and MRFAP1, S100P, MRFAP1L1, BLOC1S4, KIAA0232, TBC1D14, CCDC96, TADA2B, GRPEL1]

slide-34
SLIDE 34

Do TADs respond differently to Pg treatment?


[Figure panels, each comparing repressed TADs, activated TADs and other TADs:]

  • Pg induced fold change per TAD (6h): mean, replicate 1, replicate 2
  • Fold change per TAD (Log2), 1h Pg and 6h Pg (marked *** *** ***)
  • Pg induced fold change (log2) per gene (marked *** *** ***)
  • Pg induced fold change (log2) per TAD, non-coding (marked *** *** **)
  • Pg induced changes in intra-TAD interactions (z-score)

slide-35
SLIDE 35

Chr1:26,800,000-28,700,000

[Figure: pool 1 and pool 2; distances in models (μm) versus FISH (μm) for pairs 1-5, 2-4, 2-3 and 3-4; r = 0.94]

Modeling 3D TADs

61 genomic regions containing 209 TADs covering 267 Mb

slide-36
SLIDE 36

How do TADs respond structurally to Pg?

[Plot: Pg-induced changes in accessibility (axis 0.8 to 1.2) for repressed TADs, activated TADs and other TADs; marked *** and **]

[Plot: Pg-induced changes in radius of gyration (axis 0.95 to 1.10) for repressed TADs, activated TADs and other TADs; marked *** and **]
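
For reference, the radius of gyration of a particle model is the root-mean-square distance of its particles from their centroid. The sketch below assumes equal-mass particles and uses two made-up four-particle models; expressing a change as the ratio between two conditions is an assumption about how the plotted values could be derived, not something the slide states.

```python
import math

def radius_of_gyration(coords):
    """RMS distance of equal-mass particles from their centroid."""
    n = len(coords)
    centroid = [sum(c[k] for c in coords) / n for k in range(3)]
    mean_sq = sum(math.dist(c, centroid) ** 2 for c in coords) / n
    return math.sqrt(mean_sq)

# Hypothetical models: a compact square and an extended line of particles.
compact = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
extended = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]

# A ratio above 1 means the second model is more expanded than the first.
ratio = radius_of_gyration(extended) / radius_of_gyration(compact)
```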

[Plot: particle accessibility (%), non-TSS versus TSS; marked ***]

[Dendrograms of model clusters by dRMSD (nm), axis 50 to 200. Each cluster is given with its bracketed count and condition.]

Chr2:9,600,000-13,200,000, Chr2 U170 (activated)

  • −Pg: cl12 [13], cl17 [10], cl11 [30], cl2 [267], cl13 [11], cl1 [297], cl10 [69], cl9 [84]
  • mixt: cl14 [10]
  • +Pg: cl16 [10], cl4 [172], cl6 [142], cl7 [89], cl3 [176], cl8 [85], cl15 [10], cl5 [144]

Chr6:71,800,000-76,500,000, Chr6 U767 (repressed)

  • +Pg: cl23 [11], cl24 [10], cl5 [34], cl26 [10], cl27 [10], cl28 [10], cl21 [12], cl9 [21]
  • −Pg: cl19 [14], cl16 [15], cl10 [20], cl6 [32], cl7 [32], cl18 [15], cl25 [10], cl17 [15], cl20 [12], cl3 [73], cl14 [16], cl22 [12], cl13 [16], cl1 [118], cl12 [17], cl4 [46], cl11 [18], cl15 [16], cl2 [112], cl8 [25]
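
The cluster trees are scaled in dRMSD (distance root-mean-square deviation), which compares two models through their internal pairwise distances and so needs no superposition. A minimal sketch of the standard definition follows; the three-particle models are made up for illustration, and this is not the clustering code behind the figure.

```python
import math
from itertools import combinations

def drmsd(model_a, model_b):
    """Root-mean-square difference of all pairwise intra-model distances."""
    if len(model_a) != len(model_b):
        raise ValueError("models must have the same number of particles")
    pairs = list(combinations(range(len(model_a)), 2))
    total = 0.0
    for i, j in pairs:
        da = math.dist(model_a[i], model_a[j])
        db = math.dist(model_b[i], model_b[j])
        total += (da - db) ** 2
    return math.sqrt(total / len(pairs))

# Hypothetical models, coordinates in nm.
a = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
rotated = [(0, 0, 0), (0, 1, 0), (0, 2, 0)]    # same shape, other orientation
stretched = [(0, 0, 0), (2, 0, 0), (4, 0, 0)]  # same shape, doubled spacing

same_shape = drmsd(a, rotated)
different = drmsd(a, stretched)
```

Because only internal distances enter, a rotated or translated copy of a model gives a dRMSD of zero. One known consequence of that choice is that mirror images are also indistinguishable.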

Legend: −Pg, +Pg

slide-37
SLIDE 37

Model for TAD regulation

Structural transition

+Pg

Tracks: DHS, HP1, H1.2, H2A, MNase, H3K27me3, H3K9me3, H3K14ac, H3K4me1, H3K36me2, H3K4me3

Legend: Histone H1; Nucleosome; Histones H2A/H2B; Progesterone Receptor

Repressed TAD chr1 U41

Tracks: DHS, HP1, H1.2, H2A, MNase, H3K27me3, H3K9me3, H3K14ac, H3K4me1, H3K36me2, H3K4me3

Activated TAD chr2 U207

slide-38
SLIDE 38

STRUCTURE FUNCTION

slide-39
SLIDE 39

Acknowledgments

http://marciuslab.org http://3DGenomes.org http://cnag.cat · http://crg.cat

Open positions http://marciuslab.org

François Serra, Davide Baù, François le Dily, David Dufour, Mike Goodstadt, Gireesh Bogu, Francisco Martínez-Jiménez

Miguel Beato, Thomas Graf and Guillaume Filion