3D genome conformation and gene expression in fetal pig muscle at late gestation



slide-1
SLIDE 1

1

3D genome conformation and gene expression in fetal pig muscle at late gestation

Maria Marti Marimon 4 December 2019

slide-2
SLIDE 2

2

Agronomic interest

Voillet et al. 2016

Maturity of fetal muscle

  • Factors responsible for piglet mortality: weight, genotype and maturity
  • Motor functions
  • Thermoregulation
slide-3
SLIDE 3

3

Agronomic interest

  • Factors responsible for piglet mortality: weight, genotype and maturity

Voillet et al. 2016

Maturity of fetal muscle

  • Motor functions
  • Thermoregulation
  • Muscle transcriptome study (Voillet et al. BMC Genomics, 2014)

90 d

↑ genes muscle development ↓ genes energy metabolism

110 d

↑ genes energy metabolism ↓ genes muscle development

Transcriptional change associated with 3D genome organization?
slide-4
SLIDE 4

4

3D genome architecture

Doğan ES & Liu C, 2018

slide-5
SLIDE 5

5

3D Genome dynamics during early development

Ke Y., et al., Cell. 2017

Dixon JR, et al., Nature 2015

slide-6
SLIDE 6

6

3D Genome dynamics during early development

Ke Y., et al., Cell. 2017

Zygote genome activation

Cell differentiation expression programs

Dixon JR, et al., Nature 2015

slide-7
SLIDE 7

7

Experimental design

  • Gene expression (Voillet et al. 2014)

90 days gestation 110 days gestation

90 d

↑ genes muscle development ↓ genes energy metabolism

110 d

↑ genes energy metabolism ↓ genes muscle development

?

slide-8
SLIDE 8

8

Experimental design

3 fetuses (90 days gestation)

  • Rep1-90
  • Rep2-90
  • Rep3-90

3 fetuses (110 days gestation)

  • Rep1-110
  • Rep2-110
  • Rep3-110

In situ Hi-C on fetal muscle

  • 3D genome organization
  • Gene expression (Voillet et al. 2014)

90 days gestation 110 days gestation

90 d

↑ genes muscle development ↓ genes energy metabolism

110 d

↑ genes energy metabolism ↓ genes muscle development

? ?

Rao et al, 2014

slide-9
SLIDE 9

9

Hi-C data analysis

HiC-Pro (Servant et al. 2015)

Workflow: Paired-end (PE) reads → Read alignment → Detection of valid pairs → Raw contact maps → Normalized contact maps → TAD finding and A/B compartments

  • 476–685M read pairs/sample
  • 3.45 billion read pairs (total)
  • 63–73% mapped pairs/sample
  • 122–283M valid pairs/sample

slide-10
SLIDE 10

10

Hi-C data analysis

HiC-Pro (Servant et al. 2015)

  • 476–685M read pairs/sample
  • 3.45 billion read pairs (total)
  • 63–73% mapped pairs/sample
  • 122–283M valid pairs/sample

Distribution of valid pairs per replicate:

Replicate   Trans valid pairs   Cis long-range valid pairs   Cis short-range valid pairs
Rep1-90     45.45%              51.99%                       2.56%
Rep2-90     51.79%              45.92%                       2.30%
Rep3-90     40.71%              55.51%                       3.78%
Rep1-110    56.01%              43.09%                       0.89%
Rep2-110    46.29%              51.07%                       2.64%
Rep3-110    47.54%              50.35%                       2.11%

  • High percentages of trans valid pairs
  • Low percentages of cis short-range valid pairs
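To make the three categories concrete, here is a minimal sketch of how valid pairs could be split into trans, cis long-range and cis short-range. The 20 kb cutoff, the function names and the toy pairs are assumptions made for illustration; the slides do not state the threshold that was used.

```python
from collections import Counter

# Assumed cutoff between cis short-range and cis long-range pairs.
# Illustrative value only; it is not given in the slides.
SHORT_RANGE_BP = 20_000

def classify_pair(chrom1, pos1, chrom2, pos2, short_range_bp=SHORT_RANGE_BP):
    """Label one valid pair as trans, cis long-range or cis short-range."""
    if chrom1 != chrom2:
        return "trans"
    if abs(pos2 - pos1) <= short_range_bp:
        return "cis_short"
    return "cis_long"

def pair_fractions(pairs, short_range_bp=SHORT_RANGE_BP):
    """Return the percentage of pairs falling in each category."""
    counts = Counter(
        classify_pair(*pair, short_range_bp=short_range_bp) for pair in pairs
    )
    total = sum(counts.values())
    return {k: 100.0 * counts[k] / total for k in ("trans", "cis_long", "cis_short")}

# Hypothetical toy input: (chrom1, pos1, chrom2, pos2)
toy_pairs = [
    ("1", 1_000, "2", 5_000),      # different chromosomes: trans
    ("1", 1_000, "1", 2_000_000),  # same chromosome, far apart: cis long-range
    ("3", 10_000, "3", 15_000),    # same chromosome, 5 kb apart: cis short-range
    ("4", 500, "7", 900),          # different chromosomes: trans
]
fractions = pair_fractions(toy_pairs)
```

On the toy input, half of the pairs are trans and a quarter fall in each cis category.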


slide-11
SLIDE 11

11

Hi-C data analysis

HiC-Pro (Servant et al. 2015)

Normalized contact maps (figure: maps at 500 kb, 200 kb and 50 kb resolution)

slide-12
SLIDE 12

12

Li et al. 2016

TADs detection

  • 50 Kb resolution matrices
  • 1312 TADs per replicate on average
  • Average mean size: 1480 Kb
  • Global conservation of TAD structure (74–79% of TAD boundaries from each condition are identical to those of the other condition)
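A figure such as "74–79% of boundaries are identical" can be obtained by comparing boundary sets between the two conditions. The sketch below assumes that "identical" means the same boundary bin at the matrix resolution; the slides do not give the exact matching rule, and the function names and toy TADs are hypothetical.

```python
def boundaries(tads):
    """Collect the set of boundary positions from (chrom, start, end) TADs."""
    out = set()
    for chrom, start, end in tads:
        out.add((chrom, start))
        out.add((chrom, end))
    return out

def shared_boundary_pct(tads_a, tads_b):
    """Percentage of boundaries of condition A found at the same position in B."""
    a, b = boundaries(tads_a), boundaries(tads_b)
    return 100.0 * len(a & b) / len(a)

# Hypothetical toy TADs on one chromosome; one internal boundary moves by 50 kb.
tads_90 = [("1", 0, 1_500_000), ("1", 1_500_000, 3_000_000), ("1", 3_000_000, 4_000_000)]
tads_110 = [("1", 0, 1_500_000), ("1", 1_500_000, 3_050_000), ("1", 3_050_000, 4_000_000)]
pct = shared_boundary_pct(tads_90, tads_110)
```

Here three of the four boundaries at 90 days are found at the same position at 110 days. A tolerance of one or two bins would be a reasonable alternative matching rule.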

Hi-C data analysis (TADs)

Juicer Arrowhead (Durand et al., Cell Systems 2016; Rao et al., Cell 2014)

Kim et al., Cell 2016

slide-13
SLIDE 13

13

CTCF CTCF

TAD validation: consistent Hi-C data

Hi-C data analysis (TADs)

slide-14
SLIDE 14

14


Hi-C data analysis (A/B compartments)

(Lieberman-Aiden et al., Science 2009)

  • 500 Kb resolution matrices
  • 682 compartments/replicate (average)
  • Median size: 2.6 Mb – 3.5 Mb

slide-15
SLIDE 15

15


Hi-C data analysis (A/B compartments)

(Lieberman-Aiden et al., Science 2009)

  • 500 Kb resolution matrices
  • 682 compartments/replicate (average)
  • Median size: 2.6 Mb – 3.5 Mb

Figure panels: gene density in A/B compartments; gene expression in A/B compartments

slide-16
SLIDE 16

16

A/B compartments

Bins assigned to the same compartment type:

  • 83.3% in all 6 replicates

Good consistency of results across replicates
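The consistency figure can be read as the share of bins that receive the same label in every replicate. A minimal sketch, assuming one string of A/B labels per replicate with one character per bin; the labels below are made up.

```python
def consistent_pct(assignments):
    """Percentage of bins with the same A/B label in every replicate.

    assignments: list of strings, one per replicate, one character per bin.
    """
    n_bins = len(assignments[0])
    same = sum(
        1 for i in range(n_bins) if len({rep[i] for rep in assignments}) == 1
    )
    return 100.0 * same / n_bins

# Hypothetical labels for 10 bins in 6 replicates; bins 4 and 9 disagree.
toy_replicates = [
    "AABBBAABBA",
    "AABBBAABBA",
    "AABBBAABBA",
    "AABBAAABBA",
    "AABBBAABBA",
    "AABBBAABBB",
]
consistency = consistent_pct(toy_replicates)
```

In this toy case 8 of the 10 bins agree across all six replicates.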

slide-17
SLIDE 17

17

Genomic regions switching compartments

slide-18
SLIDE 18

18

Genomic regions switching compartments

Variability between conditions (90 d → 110 d): 3.3% switching bins (52 Mb)

  • 43.3% (AAA → BBB)
  • 56.7% (BBB → AAA)
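The AAA/BBB notation suggests that a bin counts as switching only when all three replicates of a condition agree and the two conditions disagree. The sketch below encodes that reading; it is an interpretation of the slide rather than the authors' code, and the labels are made up.

```python
def count_switches(reps_90, reps_110):
    """Count bins that are unanimous within each condition and differ between them."""
    switches = {"A->B": 0, "B->A": 0}
    for i in range(len(reps_90[0])):
        at_90 = {rep[i] for rep in reps_90}
        at_110 = {rep[i] for rep in reps_110}
        # Keep only bins where the three replicates agree in both conditions.
        if len(at_90) == 1 and len(at_110) == 1 and at_90 != at_110:
            (before,), (after,) = at_90, at_110
            switches[f"{before}->{after}"] += 1
    return switches

# Hypothetical labels for 6 bins, 3 replicates per condition.
toy_90 = ["AABBAB", "AABBAB", "AABBAB"]
toy_110 = ["ABBAAA", "ABBAAA", "ABBBAA"]
switch_counts = count_switches(toy_90, toy_110)
```

Bin 1 switches from A to B, bin 5 from B to A, and bin 3 is ignored because the 110-day replicates disagree.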

slide-19
SLIDE 19

19

A/B compartments and gene expression

slide-20
SLIDE 20

20

A/B compartments and gene expression

Switching regions are associated with transcriptional changes

slide-21
SLIDE 21

21

Genome-wide fragmentation during the muscle maturation process

Number distribution of compartments

slide-22
SLIDE 22

22

Genome-wide fragmentation during the muscle maturation process

Number distribution of compartments

Fragmentation of genome compartmentalization
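One simple way to quantify fragmentation is to count compartments as contiguous runs of identical labels along a chromosome: more runs over the same length means a more fragmented compartmentalization. This is an assumed reading of "number of compartments", and the label strings are made up.

```python
from itertools import groupby

def count_compartments(labels):
    """Number of contiguous runs of identical A/B labels along one chromosome."""
    return sum(1 for _ in groupby(labels))

# Hypothetical label strings of equal length (10 bins each).
few_large = count_compartments("AAABBBAAAB")   # runs: AAA, BBB, AAA, B
many_small = count_compartments("AABABBABAB")  # runs: AA, B, A, BB, A, B, A, B
```

Comparing the per-replicate counts between 90 and 110 days would then give the number distribution named on the slide.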

slide-23
SLIDE 23

23

Differentially distal genomic regions

                                 500 Kb            200 Kb
Total bin pairs with any count   9,262,199         3,844,272
Differential bin pairs           10,183 (0.11%)    3,417 (0.09%)

  • Filtering, normalization and detection of bin pairs with a significant difference in the number of contacts between conditions (method: generalized linear model, "GLM", functionality of edgeR)

slide-24
SLIDE 24

24

Differentially distal genomic regions

                                 500 Kb            200 Kb
Total bin pairs with any count   9,262,199         3,844,272
Differential bin pairs           10,183 (0.11%)    3,417 (0.09%)

  • Filtering, normalization and detection of bin pairs with a significant difference in the number of contacts between conditions (method: generalized linear model, "GLM", functionality of edgeR)
  • Positive logFC = more contacts (counts) at 110 days than at 90 days = genomic regions closer to each other at 110 days
  • Negative logFC = more contacts (counts) at 90 days than at 110 days = genomic regions closer to each other at 90 days
  • logFC (bin pair) = log2 [ (counts at 110 days) / (counts at 90 days) ]
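As a worked example of the sign convention, the logFC of a single bin pair can be computed directly from its counts. This is only the arithmetic of the definition, not the edgeR model fit, and the counts are made up.

```python
import math

def log_fc(counts_110, counts_90):
    """log2 ratio of contact counts at 110 days over counts at 90 days."""
    return math.log2(counts_110 / counts_90)

# Hypothetical counts for two bin pairs.
closer_at_110 = log_fc(counts_110=40, counts_90=10)  # 4x more contacts at 110 days
closer_at_90 = log_fc(counts_110=10, counts_90=40)   # 4x more contacts at 90 days
```

The first bin pair has a logFC of 2 (closer at 110 days) and the second a logFC of -2 (closer at 90 days).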

slide-25
SLIDE 25

25

Gene expression in differentially distal genomic regions

Distributions of logFC expression values of probes mapped to different categories of genomic regions

slide-26
SLIDE 26

26

Gene expression in differentially distal genomic regions

Distributions of logFC expression values of probes mapped to different categories of genomic regions

The expression values of probes in genomic regions closer at either 90 or 110 days are significantly lower

slide-27
SLIDE 27


27

Differential interacting regions (90-110 days of gestation)

slide-28
SLIDE 28


28

Cis bin pairs: 81.8%

Differential interacting regions (90-110 days of gestation)

slide-29
SLIDE 29


29

Differential interacting regions (90-110 days of gestation)

slide-30
SLIDE 30

30

Differential interacting regions (cis)

Positive logFC Negative logFC

slide-31
SLIDE 31

31

Positive logFC Negative logFC

Differential interacting regions (cis)

slide-32
SLIDE 32

32

Positive logFC Negative logFC

Differential interacting regions (cis)

Large dynamic differential regions (90-110 days gestation)

slide-33
SLIDE 33

33

Differential genomic regions (trans)

Positive logFC Negative logFC

slide-34
SLIDE 34

34

Positive logFC Negative logFC

Telomeric regions (ptel, qtel): negative logFC

Differential genomic regions (trans)

slide-35
SLIDE 35

35

Positive logFC Negative logFC

Telomeric regions (ptel, qtel): negative logFC

Preferential clustering of telomeres at 90 days

Differential genomic regions (trans)

slide-36
SLIDE 36

36

Positive logFC Negative logFC

Telomeric regions (ptel, qtel): negative logFC

Differential genomic regions (trans)

Preferential clustering of telomeres at 90 days
slide-37
SLIDE 37

37

Preferential associations of telomeres (90 days gestation)

  • SSC2pter-SSC9qter
  • SSC15qter-SSC9qter
  • SSC13qter-SSC9qter

slide-38
SLIDE 38

38

General output

  • Changes in genome structure at late gestation:
      → switching A/B compartments (3.1% of regions switching compartment)
      → genome-wide fragmentation
      → differentially interacting regions (telomeres; up to 10,000 differential interacting pairs)
  • These changes are associated with variations in gene expression

90 d

↑ genes muscle development ↓ genes energy metabolism

110 d

↑ genes energy metabolism ↓ genes muscle development

Gene expression (Voillet et al. 2014)

3D structure

  • Expression changes associated with the switching regions
  • Expression changes associated with differentially distal regions

slide-39
SLIDE 39

39

Hi-C working team:

  • Experiments: Hervé Acloque & Florence Mompart
  • Sequencing: Diane Esquerré
  • Data analysis: Sylvain Foissac, Sarah Djebali, Matthias Zytnicki & David Robelin
  • Statistical analysis: Nathalie Vialaneix

Cytogenetic team:

  • Yvette Lahbib-Mansais
  • Martine Bouissou-Matet

Funding: SCALES project (CNRS): Pierre Neuvial & Nathalie Vialaneix

slide-40
SLIDE 40

40

Hi-C bioinformatics workflow → Read alignment

HiC-Pro (Servant et al. 2015)

476–685M read pairs/sample → 3.45 billion read pairs in total

slide-41
SLIDE 41

41

Normalized matrices

Figure: chr1, Rep1-90: raw matrix and ICE-normalized matrix
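ICE normalization is a matrix-balancing procedure: rows and columns of the symmetric contact matrix are rescaled repeatedly until every bin has the same total coverage. The sketch below is a bare-bones version of that idea using NumPy. It leaves out the filtering of low-coverage bins that production pipelines apply, and the function name, parameters and toy matrix are assumptions.

```python
import numpy as np

def ice_balance(matrix, n_iter=500, tol=1e-8):
    """Iteratively rescale a symmetric matrix until all row sums are equal.

    Returns the balanced matrix and the accumulated per-bin bias vector.
    """
    m = np.asarray(matrix, dtype=float).copy()
    bias = np.ones(m.shape[0])
    for _ in range(n_iter):
        row_sums = m.sum(axis=1)
        nonzero = row_sums > 0
        scale = np.ones_like(row_sums)
        # Relative coverage of each bin compared with the mean coverage.
        scale[nonzero] = row_sums[nonzero] / row_sums[nonzero].mean()
        m /= np.outer(scale, scale)
        bias *= scale
        if np.abs(scale - 1.0).max() < tol:
            break
    return m, bias

# Hypothetical 3-bin symmetric contact matrix.
toy_raw = np.array([[10, 4, 1],
                    [4, 8, 2],
                    [1, 2, 6]], dtype=float)
balanced, bias = ice_balance(toy_raw)
```

After balancing, the three row sums are equal and the matrix is still symmetric, which is the property the raw versus ICE-normalized comparison illustrates.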

slide-42
SLIDE 42

42

Identification of A/B compartments

1- ICE normalization (matrix balancing)
2- Distance normalization (observed/expected)
3- Pearson correlation matrix
4- Principal component analysis on the bins

Figure (chr 1, genomic position in bp): raw matrix (500 Kb), ICE-normalized matrix, distance-normalized matrix, Pearson correlation matrix, principal component of the correlation matrix
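Steps 2 to 4 can be sketched with NumPy as follows. The input is assumed to be an already balanced matrix, the first principal component is taken as the leading eigenvector of the correlation matrix (a common simplification), and its sign is oriented with a gene-density vector so that the gene-rich side is called A. The function names and toy data are hypothetical.

```python
import numpy as np

def observed_over_expected(m):
    """Divide each diagonal by its mean to remove the distance effect."""
    n = m.shape[0]
    oe = np.zeros_like(m, dtype=float)
    for d in range(n):
        idx = np.arange(n - d)
        diag_mean = m[idx, idx + d].mean()
        if diag_mean > 0:
            oe[idx, idx + d] = m[idx, idx + d] / diag_mean
            oe[idx + d, idx] = oe[idx, idx + d]
    return oe

def ab_compartments(balanced, gene_density):
    """Return one 'A' or 'B' label per bin for a single chromosome."""
    oe = observed_over_expected(np.asarray(balanced, dtype=float))
    corr = np.corrcoef(oe)                    # Pearson correlation between bins
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigenvalues in ascending order
    pc1 = eigvecs[:, -1]                      # leading eigenvector
    # The eigenvector sign is arbitrary: orient it so that A is gene-rich.
    if np.corrcoef(pc1, gene_density)[0, 1] < 0:
        pc1 = -pc1
    return "".join("A" if v > 0 else "B" for v in pc1)

# Hypothetical toy chromosome: 8 bins, two blocks with a planted plaid pattern.
n = 8
truth = np.array([1, 1, 1, 1, -1, -1, -1, -1])
dist = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
toy = (1.0 / (1.0 + dist)) * np.where(np.outer(truth, truth) > 0, 1.5, 0.5)
toy_gene_density = np.array([9, 8, 9, 7, 2, 1, 2, 1], dtype=float)
labels = ab_compartments(toy, toy_gene_density)
```

On this toy matrix the first four bins are labelled A and the last four B. On real data, bins with no coverage would need to be masked before the correlation step.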

slide-43
SLIDE 43

43

Genome-wide fragmentation during development

Number distribution of compartments

Number of compartments vs. coverage

Fragmentation of genome compartmentalization

slide-44
SLIDE 44

44

slide-45
SLIDE 45

45

Differentially distal genomic regions

                                         500 Kb            200 Kb
Total bin pairs with any count           9,262,199         3,844,272
Differential bin pairs                   10,183 (0.11%)    3,417 (0.09%)
% differential bin pairs with logFC(+)   56.9              50.7
% differential bin pairs with logFC(-)   43.1              49.3

  • Filtering, normalization and detection of bin pairs with a significant difference in the number of contacts between conditions (method: generalized linear model, "GLM", functionality of edgeR)
  • Positive logFC = more contacts (counts) at 110 days than at 90 days = genomic regions closer to each other at 110 days
  • Negative logFC = more contacts (counts) at 90 days than at 110 days = genomic regions closer to each other at 90 days
  • logFC (bin pair) = log2 [ (counts at 110 days) / (counts at 90 days) ]

slide-46
SLIDE 46

46

Differential analysis (90 – 110 days of gestation)

  • Raw matrices of the 18 autosomes (500, 200 and 40 kb)
  • Inter-matrix normalization
  • Detection of pairs of bins with a significant difference in the number of counts: generalized linear model based on the negative binomial distribution (edgeR)

Figure: pseudo-counts per sample, as log2(count + 1), before and after normalization, for Rep1-90, Rep2-90, Rep3-90, Rep1-110, Rep2-110 and Rep3-110
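The purpose of inter-matrix normalization is to make counts comparable between samples sequenced at different depths. The sketch below uses plain library-size scaling as a stand-in; it is not the normalization performed by edgeR, and the sample counts are made up. The log2(count + 1) transform is the one used for the per-sample distributions.

```python
import math

def library_size_normalize(counts_by_sample):
    """Scale each sample so that all samples share the same total count."""
    sizes = {s: sum(c) for s, c in counts_by_sample.items()}
    target = sum(sizes.values()) / len(sizes)  # mean library size
    return {
        s: [x * target / sizes[s] for x in c]
        for s, c in counts_by_sample.items()
    }

def log2_pseudo(counts):
    """log2(count + 1) transform applied before plotting distributions."""
    return [math.log2(x + 1) for x in counts]

# Hypothetical counts for three bin pairs; the second sample is twice as deep.
toy_counts = {"Rep1-90": [10, 20, 70], "Rep1-110": [20, 40, 140]}
normalized = library_size_normalize(toy_counts)
log_values = log2_pseudo(normalized["Rep1-90"])
```

After scaling, the two toy samples have identical counts, so any remaining difference between real samples would reflect contact changes rather than sequencing depth.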