Towards reconstructing personalised causal regulatory networks using large-scale trans-eQTL and single-cell co-expression QTL analysis
Annique Claringbould, Department of Genetics, University Medical Center Groningen. Slides: Lude Franke
Twelve years of genome-wide association studies
[Diagram: >10,000 known genetic risk factors for >200 diseases; the genes, pathways and cell types linking risk factor to disease are unknown]
The black box challenge
The vast majority of genetic risk factors affect gene expression
genetic risk factor for type 1 diabetes
Dubois et al, Nature Genetics 2010 Westra et al, Nature Genetics 2013 Fehrmann et al, Nature Genetics 2015 Zhernakova et al, Nature Genetics 2017
[Figure: expression of gene X by genotype (CC, CT, TT) in B cells, T cells and monocytes; P values shown: 0.9, 0.8 and 10^-9]
Cell-type specific cis-eQTL effect:
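A cell-type specific cis-eQTL can be illustrated with a per-cell-type linear regression of expression on genotype dosage. This is a minimal sketch, not the method used in the cited studies: the data are invented, the dosage coding (CC = 0, CT = 1, TT = 2) is an assumption, and which cell type carries the effect is an arbitrary choice for illustration.

```python
import math

def eqtl_regression(dosages, expression):
    """Least-squares slope of expression on genotype dosage, plus its t statistic."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(dosages, expression))
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else math.inf
    return slope, t_stat

# Hypothetical data: allele dosage (CC=0, CT=1, TT=2) for six individuals.
dosages = [0, 0, 1, 1, 2, 2]
expression_by_cell_type = {
    "B cell": [5.0, 5.2, 6.1, 5.9, 7.0, 7.2],    # expression rises with dosage
    "T cell": [5.1, 4.9, 5.0, 5.2, 4.9, 5.1],    # no genotype effect
    "Monocyte": [6.0, 6.2, 6.1, 5.9, 6.2, 6.0],  # no genotype effect
}

# One regression per cell type: the same SNP can be an eQTL in one and not in others.
results = {
    cell_type: eqtl_regression(dosages, expression)
    for cell_type, expression in expression_by_cell_type.items()
}
```

A large t statistic in one cell type next to near-zero slopes in the others is the pattern the slide describes.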
[Figure: t-SNE plot (t-SNE 1 vs t-SNE 2) with 12 numbered clusters: CD4+ T, CD8+ T, CD56(dim) NK, CD56(bright) NK, cMonocyte, ncMonocyte, B, Plasma, mDC, pDC, Megakaryocyte, HPC]
scRNA-seq analysis in 25,000 PBMCs (45 different individuals)
Monique van der Wijst et al, Nature Genetics 2018
rs4821670 affects LGALS2 in cis:
Single-cell cis-eQTL analysis
rs9332431 affects CHTF8 in cis (van der Wijst et al, Nature Genetics 2018)
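The slides do not describe how single cells are combined before eQTL testing. One common approach, assumed here, is to average a gene's expression over all cells of one cell type within one individual, which gives one value per individual per cell type for a regression on genotype. The individual labels and values below are hypothetical.

```python
from collections import defaultdict

def pseudobulk(cells):
    """Average one gene's expression per (individual, cell type) pair.

    `cells` is an iterable of (individual, cell_type, expression) tuples,
    one tuple per single cell.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for individual, cell_type, value in cells:
        key = (individual, cell_type)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Hypothetical single-cell measurements for one gene.
cells = [
    ("ind1", "cMonocyte", 2.0), ("ind1", "cMonocyte", 4.0),
    ("ind1", "CD4+ T", 1.0),
    ("ind2", "cMonocyte", 6.0),
    ("ind2", "CD4+ T", 0.0), ("ind2", "CD4+ T", 2.0),
]
means = pseudobulk(cells)
```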
Goal
Genome-wide association studies: disease SNPs
cis-eQTL mapping: cis-eQTL effects
trans-eQTL mapping: trans-eQTL effects
Key driver gene identification: key driver gene
[Diagram: disease SNPs connect to genes A, B, C, D, E, X, Y and Z across tissue 1 and tissue 2, converging on disease]
Get larger sample sizes: meta-analysis in 5,311 samples
Westra et al, Nature Genetics 2013 Zhernakova et al, Nature Genetics 2017 Bonder et al, Nature Genetics 2017
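The slides do not state which meta-analysis method was used. A sample-size weighted Z-score meta-analysis is one standard choice and is sketched here as an assumption. Cohort Z-scores and sizes are hypothetical.

```python
import math

def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-cohort Z-scores, weighting each cohort by sqrt(sample size)."""
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z_meta = numerator / denominator
    # Two-sided P value under the standard normal distribution.
    p_value = math.erfc(abs(z_meta) / math.sqrt(2))
    return z_meta, p_value

# Two hypothetical cohorts of 100 samples, each with a modest effect (Z = 2).
z_meta, p_meta = weighted_z_meta([2.0, 2.0], [100, 100])
```

Two cohorts that are each only nominally significant combine into a stronger signal, which is the motivation for pooling many cohorts.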
Downstream trans-eQTL effects
Systemic lupus erythematosus risk factor
Local expression effect: IKZF1 (in monocytes)
Type 1 interferon response: MX1, IFIT1, IFI44L, IFI6
2013: Downstream effects identified for 346 genetic risk factors
2018: Aim to find more, using many more samples
Large-scale eQTL analysis: eQTLGen (www.eqtlgen.org, genenetwork.nl/eqtlgen)
37 population-based cohorts
Genotype data and gene expression in blood available
31,684 samples
cis-eQTL analysis: 11M SNPs studied (window size 1 Mb, MAF ≥ 1%)
cis-eQTL analysis results: 16,989 (88.3%) cis-eQTL genes; 238,340 unlinked cis-eQTL SNPs
trans-eQTL analysis: 10,317 trait-associated SNPs studied
trans-eQTL analysis results: 6,298 (31%) trans-eQTL genes; 3,853 (36%) genetic risk factors
Polygenic risk score analysis: 1,263 traits studied
Polygenic score analysis results: 2,658 (13%) eQTS genes; 689 (54%) traits affect gene expression
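The cis window and MAF threshold can be expressed as a small filter. This is a sketch under assumptions: the 1 Mb window is read as a maximum SNP-to-gene distance, and the 5 Mb minimum distance for calling a same-chromosome pair trans is an assumption not stated on the slides. All function and constant names are hypothetical.

```python
CIS_WINDOW = 1_000_000          # from the slide: window size 1 Mb
TRANS_MIN_DISTANCE = 5_000_000  # assumption, not stated on the slides
MIN_MAF = 0.01                  # from the slide: MAF >= 1%

def minor_allele_frequency(dosages):
    """Minor allele frequency from per-individual allele dosages (0, 1 or 2)."""
    freq = sum(dosages) / (2 * len(dosages))
    return min(freq, 1 - freq)

def passes_maf(dosages):
    return minor_allele_frequency(dosages) >= MIN_MAF

def classify_pair(snp_chrom, snp_pos, gene_chrom, gene_pos):
    """Label a SNP-gene pair as 'cis', 'trans' or 'untested' by genomic distance."""
    if snp_chrom != gene_chrom:
        return "trans"
    distance = abs(snp_pos - gene_pos)
    if distance <= CIS_WINDOW:
        return "cis"
    if distance > TRANS_MIN_DISTANCE:
        return "trans"
    return "untested"  # same chromosome, between the two thresholds
```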
eQTLGen Consortium: 31,684 blood samples (Võsa et al, BioRxiv 2018)
[Diagram: 19,960 genes studied; 11M SNPs (MAF ≥ 1%) tested for cis-eQTL effects; 10,317 trait-associated SNPs tested for trans-eQTL effects on genes A, B and C; polygenic risk for disease tested against gene expression in example individuals Susan, Peter, Kate and John]
Expression levels of nearly every gene are influenced by SNPs
cis-eQTLs
Nearly every gene shows a significant cis-eQTL effect
[Figure: proportion of genes showing a cis-eQTL effect, by average blood gene expression (low to high) and by pLI score (0.0 to 1.0)]
Loss of function intolerant genes
[Figure: proportion of genes showing a cis-eQTL effect, by average blood gene expression (low to high)]
[Pie chart (34% vs 66%): genes showing no eQTL effect in blood but showing an eQTL in GTEx; genes showing no eQTL effect in either eQTLGen or GTEx]
Enriched pathway    P-value
Carcinoma           5 x 10^-11
RNA processing      2 x 10^-9
RNA splicing        2 x 10^-7
96.2% of lead eSNPs map within 100kb of cis-gene
Limited evidence blood cis-eQTLs pinpoint disease genes
Analysis of cis-eQTLs using SMR for 16 well-powered traits
Genes prioritised by SMR do not overlap more often than expected with genes prioritised using the pathway enrichment method DEPICT
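For orientation, the core SMR test (summary-data-based Mendelian randomisation) combines the eQTL and GWAS summary statistics of one SNP. The sketch below uses the published SMR test statistic and leaves out the accompanying HEIDI heterogeneity test. Input values are hypothetical.

```python
import math

def smr(b_eqtl, se_eqtl, b_gwas, se_gwas):
    """SMR effect estimate and test for one SNP, gene and trait.

    b_eqtl, se_eqtl: SNP effect on gene expression and its standard error.
    b_gwas, se_gwas: SNP effect on the trait and its standard error.
    """
    z_eqtl = b_eqtl / se_eqtl
    z_gwas = b_gwas / se_gwas
    # Estimated effect of expression on the trait (ratio of SNP effects).
    b_xy = b_gwas / b_eqtl
    # Test statistic, approximately chi-squared with 1 degree of freedom.
    t_smr = (z_eqtl ** 2 * z_gwas ** 2) / (z_eqtl ** 2 + z_gwas ** 2)
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(t_smr / 2))
    return b_xy, t_smr, p_value

# Hypothetical summary statistics: strong eQTL (Z = 10), moderate GWAS signal (Z = 5).
b_xy, t_smr, p_smr = smr(b_eqtl=0.5, se_eqtl=0.05, b_gwas=0.1, se_gwas=0.02)
```

The statistic is bounded by the weaker of the two signals, so a very strong eQTL cannot rescue a weak GWAS association.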
37% of genetic risk factors for disease affect expression in trans
trans-eQTLs
37% of 10,000 risk factors affect gene expression levels in trans
[Figure: proportion of genes showing a trans-eQTL effect, by average blood gene expression (low to high) and by pLI score (0.0 to 1.0)]
Biological mechanism of trans-eQTLs
[Pie chart: biological mechanism known versus unknown for trans-eQTLs; share labels legible on the slide: 3%, 1%, 83%, 4%]
Transcription factor binding: fold enrichment = 2.2x, P = 2 x 10^-61 (RegulatoryCircuits, Marbach et al, Nature Biotechnology 2016)
Co-expression: fold enrichment = 1.98x, P = 4 x 10^-83 (co-expression based on 31,684 eQTLGen samples)
Protein-protein interaction: fold enrichment = 1.19x, P = 0.05 (protein interactions based on InWeb)
Close physical proximity: fold enrichment = 0.99x, P = 0.30 (Hi-C interactions, Rao et al, Cell 2014)
Indirect transcription factor binding: fold enrichment = 3.2x, P < 10^-300 (RegulatoryCircuits, Marbach et al, Nature Biotechnology 2016, co-expression based on 31,684 eQTLGen samples)
Expression levels of local gene mediate trans-eQTL effect: fold enrichment = 5.3x, P = 10^-67 (tested in 3,831 BIOS samples)
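The slides report fold enrichments without saying how they were computed. One plausible reading, assumed here, compares the fraction of trans-eQTL SNP-gene pairs supported by an annotation (for example transcription factor binding) with the fraction among all tested pairs, and gets a P value from a hypergeometric upper tail. The counts are hypothetical.

```python
from math import comb

def fold_enrichment(hits_in_set, set_size, hits_in_background, background_size):
    """Ratio of the annotated fraction in the set to that in the background."""
    return (hits_in_set / set_size) / (hits_in_background / background_size)

def hypergeom_upper_tail(k, n_draws, n_success, n_total):
    """P(X >= k) when drawing n_draws items from n_total, n_success of them annotated."""
    upper = min(n_draws, n_success)
    favourable = sum(
        comb(n_success, i) * comb(n_total - n_success, n_draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(n_total, n_draws)

# Hypothetical counts: 1,000 tested pairs, 100 of them annotated;
# 50 pairs are trans-eQTLs, 10 of which are annotated.
fold = fold_enrichment(10, 50, 100, 1000)
p_enrich = hypergeom_upper_tail(10, 50, 100, 1000)
```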
Using blood trans-eQTLs to gain insight into brain genes
CAD SNP rs17087335 affects the REST transcription factor in cis
Trans-genes specific for brain
Trans-eQTL effects in cancer
Multiple myeloma risk factors rs116766442 and rs4487645: cis-eQTL effects (ATF7IP) and trans-eQTL effects (GOLM1; O-glycan biosynthesis)
Hodgkin's lymphoma risk factors rs7745098 and rs114865495: trans-eQTL effect (RTKN2; cell cycle)
Testicular germ cell tumor risk factor rs2900333: trans-eQTL effects (GTSF1, FAM50B, DDX43; annotations on the slide: chromatin organization, DNA repair, gametocyte specific factor 1, male meiosis, highly expressed in testis)
Melanoma risk factor rs1801516 (missense variant in ATM): trans-eQTL effects (HIST1H2AC, H1F0, HIST2H2BF, HIST1H2BC, HIST1H2BE, HIST2H2BE, HIST1H1PS1, HIST1H4E, HIST1H2BD, HIST1H1C, HIST1H4H, HIST1H3D; DNA Damage / Telomere Stress Induced Senescence)
Converging effects in systemic lupus erythematosus
SLE risk SNPs: rs7097397, rs2111485, rs1990760, rs4917014, rs877819, rs2663052, rs17849501, rs597808, rs9888739, rs11574637, rs35472514, rs34572943, rs1143679, rs10774625, rs1913517
Loci: 16p11.2, 10q11.23, 12q24.12, 7p12.2, 1q25.3, 2q24.2
Genes: ISG15, IFI6, IFI44L, IFI44, RSAD2, HERC5, IFIT1, OAS3, OAS2, OASL, EPSTI1, MX1
Polygenic SLE risk > expression of interferon genes
Interferon genes: EIF2AK2, OAS1, DDX58, OAS3, OAS2, IFIT3, IFIT2, IFI6, HELZ2, XAF1, EPSTI1, RSAD2, CMPK2, OASL, IFI44L, IFI44, PARP9, HERC5, MX1, PARP14, IFIT1
What is a polygenic risk score?
403 variants associated with type 2 diabetes
Alice: 11 risk alleles
Bob: 189 risk alleles
Carl: 362 risk alleles
Polygenic scores calculated for 1,263 diseases and traits
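A polygenic score can be sketched as a sum over variants. The unweighted version counts risk alleles, as in the Alice, Bob and Carl example. The weighted version multiplies each dosage by a per-variant effect size. Whether and how the scores in this talk were weighted is not stated on the slides, and the dosages and effect sizes below are hypothetical.

```python
def risk_allele_count(dosages):
    """Unweighted score: total number of risk alleles carried (0, 1 or 2 per variant)."""
    return sum(dosages)

def weighted_polygenic_score(dosages, effect_sizes):
    """Weighted score: risk allele dosage times effect size, summed over variants."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("one effect size is needed per variant")
    return sum(d * b for d, b in zip(dosages, effect_sizes))

# Hypothetical individual genotyped at four risk variants.
dosages = [2, 0, 1, 1]
effect_sizes = [0.10, 0.30, 0.05, 0.20]  # hypothetical GWAS effect sizes

count = risk_allele_count(dosages)
score = weighted_polygenic_score(dosages, effect_sizes)
```

With 403 variants, the unweighted count ranges from 0 to 806 risk alleles.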
Genetic risk scores on several metabolites
PGS correlations: PHGDH and PSAT1: enzymes in formation of serine, acetylglycine, glycine and creatine
[Pathway diagram: glucose, 3-phosphoglycerate, 3-phosphohydroxypyruvate, 3-phosphoserine, L-serine and glycine, with enzymes 3-PGDH, PSAT, PSPH and SHMT; further nodes: pyruvate, phospholipids, creatine; N-acetylglycine is a derivative of glycine; glycine is upstream on the biosynthetic pathway]
Genes: ASNS, ANKRD36BP2, CHRM3-AS2, ALKBH7, SLC7A1, FBXO9, RP11-439E19.8, ANKHD1, AARS, PSAT1, PHGDH
Metabolites: L-serine, glycine, creatine, N-acetylglycine
HDL Cholesterol: genetic risk score correlation
[Pathway diagram: HDL cholesterol metabolism, showing ABCA1, ABCG1, Apo A-1, LDLR, SR-BI, CETP, SREBP2 (SREBF2), nascent HDL, mature HDL, LDL / VLDL, cholesterol, foam cell and liver]
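A genetic risk score correlation (eQTS) asks, gene by gene, whether expression tracks the polygenic score across individuals. The sketch below uses a plain Pearson correlation, which is an assumption about the test. The scores, gene names and expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical polygenic scores for five individuals.
polygenic_scores = [1.0, 2.0, 3.0, 4.0, 5.0]

# Hypothetical expression of two genes in the same five individuals.
expression = {
    "gene_X": [2.0, 4.0, 6.0, 8.0, 10.0],  # tracks the score
    "gene_Y": [5.0, 5.1, 4.9, 5.0, 5.0],   # largely unrelated to the score
}

eqts = {gene: pearson(polygenic_scores, values) for gene, values in expression.items()}
```

Genes whose expression correlates with the score for a trait are candidates for the downstream genes on which many risk variants converge.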
Celiac disease
Celiac disease polygenic risk score correlation on expression
Cell types with significant correlation to ePRS results (BLUEPRINT consortium), cell types labelled: CD14+, CD16- classical monocyte; macrophage; inflammatory macrophage; monocyte; osteoclast; effector memory CD8+ αβ T cell; effector memory CD4+ αβ T cell; CD3+, CD4-, CD8+ double positive thymocyte; central memory CD4+ αβ T cell; CD4+ αβ thymocyte; CD8+ αβ thymocyte; CD4+ αβ T cell
Single peripheral blood mononuclear cells with significant correlation to ePRS results, cell types labelled: CD56 (dim) NK cells, CD56 (bright) NK cells, plasma cells, mDCs, pDC, megakaryocytes
Expression enrichment values legible on the slide: 9.8, 9.8, 9.6, 9.2, 8.3
eQTLGen summary
Võsa et al, BioRxiv 2018
Blood cis-eQTLs have a different genetic architecture than disease-associated SNPs
Trans-eQTLs are more informative for gaining insight into downstream consequences
eQTS effects are less abundant, but can help identify core genes
[Figure: cis-eQTLs (16,989 genes), trans-eQTLs (6,298 genes) and eQTS (2,568 genes) rated low, intermediate or high on cell type specificity and on biological relevance for disease]
Preprint available at BioRxiv (doi.org/10.1101/447367)
All summary statistics (including non-significant results) available for downloading at www.eqtlgen.org
SREBF2 ChIP-seq binding
cis-eQTL effect on FADS2 mediated by SREBF2
Zhernakova et al, Nature Genetics 2017
[Figure: FADS2 expression against SREBF2 expression by rs968567 genotype (A/A, A/G, G/G); r2 = 0.51, 0.28 and 0.10]
[Diagram: SREBF2, rs968567, FADS2]
Regulatory network reconstruction
[Diagram: network of SNPs and genes in which disease SNPs in genes, cis-eQTL disease SNPs and trans-eQTL disease SNPs converge on a key driver gene]
From GWAS to key driver genes
Pers et al, Nature Communications 2015
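The slides do not define how a key driver gene is scored. As an illustration only, one simple heuristic ranks downstream genes by the number of distinct disease SNPs whose trans-eQTL effects converge on them. The SNP and gene labels below are hypothetical.

```python
from collections import defaultdict

def rank_key_driver_candidates(trans_eqtls):
    """Rank genes by the number of distinct disease SNPs affecting them in trans.

    `trans_eqtls` is an iterable of (snp, gene) pairs.
    Returns (gene, snp_count) tuples, highest count first, ties ordered by name.
    """
    snps_per_gene = defaultdict(set)
    for snp, gene in trans_eqtls:
        snps_per_gene[gene].add(snp)
    counts = ((gene, len(snps)) for gene, snps in snps_per_gene.items())
    return sorted(counts, key=lambda item: (-item[1], item[0]))

# Hypothetical trans-eQTL effects of four disease SNPs.
trans_eqtls = [
    ("snp1", "Z"), ("snp2", "Z"), ("snp3", "Z"),
    ("snp1", "X"), ("snp4", "Y"), ("snp2", "Y"),
]
ranking = rank_key_driver_candidates(trans_eqtls)
```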
Inferring relationships using single-cell eQTLs
[Figure: co-expression of RPS26 and RPL32 in single cells by rs7297175 genotype (CC, CT, TT); expression data imputed with MAGIC; one panel labelled r = 0.07]
rs7297175 T/T genotype: 9 individuals
rs7297175 C/C genotype: 12 individuals
[Figure: per-individual correlation between RPS26 and RPL32 by rs7297175 genotype (C/C, C/T, T/T), r = 0.84; correlation values legible on the slide: 0.12, 0.02, 0.04 and 0.94, 0.69, 0.78, 0.92, 0.92, 0.88, 0.89, 0.91, 0.79]
Replication of effect in whole-blood bulk RNA-seq data (4,200 samples)
[Figure: RPS26 against RPL32 by genotype (CC, CT, TT) in scRNA-seq and in bulk RNA-seq; r = 0.00, r = 0.03, r = 0.26]
Interaction effect p = 1.3 x 10^-8
van der Wijst et al, Nature Genetics 2018
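A co-expression QTL can be sketched in two steps: compute the correlation between two genes across the cells of each individual, then test whether that per-individual correlation changes with genotype. The sketch below uses Pearson correlation and an unweighted regression slope, which are simplifying assumptions and not necessarily what the cited study did. The expression values and dosages are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def regression_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

# Hypothetical single-cell expression of two genes in three individuals (four cells each).
cells = {
    "ind1": {"gene_a": [1, 2, 3, 4], "gene_b": [4, 1, 3, 2]},
    "ind2": {"gene_a": [1, 2, 3, 4], "gene_b": [1, 3, 2, 4]},
    "ind3": {"gene_a": [1, 2, 3, 4], "gene_b": [2, 4, 6, 8]},
}
dosage = {"ind1": 0, "ind2": 1, "ind3": 2}  # hypothetical allele dosages

# Step 1: one co-expression value per individual (a personalised network edge).
correlations = {ind: pearson(e["gene_a"], e["gene_b"]) for ind, e in cells.items()}

# Step 2: does the strength of co-expression depend on genotype?
individuals = sorted(correlations)
coeqtl_slope = regression_slope(
    [dosage[ind] for ind in individuals],
    [correlations[ind] for ind in individuals],
)
```

In bulk data there is only one sample per individual, so the analogous test is an interaction term between genotype and the second gene's expression.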
Drug response depends on your regulatory network
[Diagram: two individuals, John and Kate, each with disease SNPs whose cis-eQTL effects converge on a key driver gene. In one network the drug cures symptoms (no disease symptoms); in the other the drug does not cure symptoms (disease symptoms)]
Consortium aim:
Perform large-scale single-cell eQTL meta-analysis in peripheral blood mononuclear cells
Build personalised co-expression networks and identify co-expression QTLs
Apply these models to individual genomes, predict cell-type specific gene expression levels
Single-cell eQTLGen Consortium
Large-scale datasets necessary to better understand biology behind GWAS associations
Downstream effects of genetic variants more informative than local effects
Combined genetic risk score associations can identify key driver genes
Single-cell data identifies the cell types at play
Personalised co-expression networks can be built using these ingredients, and will ultimately aid pharmacological decision-making
Conclusion
Acknowledgements
Funding
UMC Groningen, BBMRI-NL, BIOS Consortium, eQTLGen Consortium
Juha Karjalainen, Patrick Deelen, Marc Jan Bonder, Annique Claringbould, Sipko van Dam, Jackie Dekens, Monique van der Wijst, Morris Swertz, Peter-Bram ’t Hoen, Tonu Esko, Freerk van Dijk, Niek de Klein, Harm-Jan Westra, Urmo Võsa, Dylan de Vries, Harm Brugge, Sasha Zhernakova, Jingyuan Fu, Bas Heijmans, Vinod Kumar, Sebo Withoff, Yang Li, Serena Sanna, Dasha Zhernakova, Raúl Aguirre, Cisca Wijmenga, Lude Franke