Discovering Mechanistic Heterogeneity using Mendelian Randomization

Qingyuan Zhao, Statistical Laboratory, University of Cambridge. Joint work with Daniel Iong (who made most of the slides) and Yang Chen. September 26, 2020 @ PCIC.


SLIDE 1

Discovering Mechanistic Heterogeneity using Mendelian Randomization

Qingyuan Zhao

Statistical Laboratory, University of Cambridge Joint work with Daniel Iong (who made most of the slides) and Yang Chen

September 26, 2020 @ PCIC

Qingyuan Zhao (Cambridge) MR-PATH September 26, 2020 @ PCIC 1 / 21

SLIDE 2

Outline

1 Motivation
2 Mechanistic Heterogeneity in MR
3 MR-PATH
    Model Assumptions
    Statistical inference
4 Results
    HDL-CHD
    BMI-T2D
5 Conclusion

SLIDE 3

Motivation

Mendelian randomization (MR)

MR = Using genetic variation as instrumental variables. Surging interest in epidemiology and genetics.

Number of publications in MR by year (Source: Web of Science).

SLIDE 4

Motivation

Example: Causal effect of LDL cholesterol

Basic idea: People who inherited certain alleles of rs17238484 and rs12916 naturally have a higher concentration of LDL cholesterol.

SLIDE 6

Motivation

Motivation for this work

Exclusion restriction: Instruments (genetic variants) can only affect the outcome through the risk exposure.

In MR, this assumption may be violated due to pleiotropy. Many pleiotropy-robust MR methods (e.g. MR-RAPS) have been developed.

Most robust MR methods rely on the “effect homogeneity” assumption: the risk exposure has the same causal effect for every individual.

Our contributions

1 A novel concept: mechanistic heterogeneity.
2 A transparent mixture model: MR-PATH.

SLIDE 7

Mechanistic Heterogeneity in MR

Review: Linear structural equation model for MR

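[Diagram: SNPs $Z_1, \dots, Z_p$ affect the exposure $X$ with effects $\theta_{X1}, \dots, \theta_{Xp}$ and the outcome $Y$ directly with effects $\alpha_1, \dots, \alpha_p$; $X$ affects $Y$ with effect $\beta$; $U$ is the unobserved confounder.]

For exposure $X$, outcome $Y$, unobserved confounding variables $U$, and SNPs $Z_1, \dots, Z_p$, the commonly assumed linear structural equation model is given by

$$X = \sum_{i=1}^{p} \theta_{Xi} Z_i + \eta_X U + E_X, \qquad Y = \beta X + \sum_{i=1}^{p} \alpha_i Z_i + \eta_Y U + E_Y.$$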

SLIDE 8

Mechanistic Heterogeneity in MR

Review: Linear structural equation model for MR

$$X = \sum_{i=1}^{p} \theta_{Xi} Z_i + \eta_X U + E_X, \qquad Y = \beta X + \sum_{i=1}^{p} \alpha_i Z_i + \eta_Y U + E_Y$$

If $Z_i$ is a valid instrument, then $\theta_{Xi} \neq 0$, $Z_i \perp\!\!\!\perp \{U, E_X, E_Y\}$, and $\alpha_i = 0$.

However, it is often the case that $\alpha_i \neq 0$ due to pleiotropy and multiple causal pathways.

If $\alpha_i \neq 0$ for some SNPs, then the causal effect $\beta$ cannot be estimated consistently without further assumptions on $\alpha_i$,

e.g. $\alpha_i \sim N(0, \tau^2)$ for most SNPs.
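
The effect of a nonzero $\alpha_i$ can be illustrated with a small simulation. The sketch below is not from the talk: the function name `simulate_wald_ratios`, the allele frequency of 0.3, independent SNPs, standard normal errors, and $\eta_X = \eta_Y = 1$ are all assumptions made for illustration. It draws data from the linear structural equation model and returns, for each SNP, the ratio of its marginal regression slope on $Y$ to its slope on $X$.

```python
import numpy as np


def simulate_wald_ratios(theta_x, alpha, beta, n=200_000, seed=0):
    """Simulate the linear SEM and return one ratio estimate per SNP.

    Hypothetical helper for illustration only. Assumes independent SNPs
    coded 0/1/2 with allele frequency 0.3, eta_X = eta_Y = 1, and
    standard normal U, E_X, E_Y.
    """
    rng = np.random.default_rng(seed)
    theta_x = np.asarray(theta_x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    p = theta_x.size

    z = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # SNPs Z_1..Z_p
    u = rng.normal(size=n)                               # unobserved confounder U
    x = z @ theta_x + u + rng.normal(size=n)             # exposure X
    y = beta * x + z @ alpha + u + rng.normal(size=n)    # outcome Y

    # Marginal least-squares slopes of X and Y on each SNP.
    zc = z - z.mean(axis=0)
    denom = (zc ** 2).sum(axis=0)
    slope_x = zc.T @ (x - x.mean()) / denom
    slope_y = zc.T @ (y - y.mean()) / denom
    return slope_y / slope_x


# Two valid instruments (alpha = 0) and one pleiotropic SNP (alpha = 0.2).
ratios = simulate_wald_ratios(theta_x=[0.4, 0.4, 0.4], alpha=[0.0, 0.0, 0.2], beta=0.5)
print(ratios)
```

Under these assumptions the first two ratios should be close to $\beta = 0.5$, while the third should be close to $\beta + \alpha_3/\theta_{X3} = 1.0$, which is why a SNP with $\alpha_i \neq 0$ does not identify $\beta$ on its own.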

SLIDE 9

Mechanistic Heterogeneity in MR

Two scenarios of mechanistic heterogeneity

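[Diagram (a): three groups of SNPs $Z_{k,1}, \dots, Z_{k,p_k}$ ($k = 1, 2, 3$) act on pathways $M_1, M_2, M_3$; each $M_k$ affects the exposure $X$ with effect $\theta_k$; $X$ affects $Y$ with effect $\beta$; $M_2$ and $M_3$ also affect $Y$ directly with effects $\alpha_2$ and $\alpha_3$; $U$ is the unobserved confounder.]

(a) Scenario 1: Multiple pathways of horizontal pleiotropy.

[Diagram (b): three groups of SNPs $Z_{k,1}, \dots, Z_{k,p_k}$ act on pathways $M_1, M_2, M_3$; each $M_k$ affects its own exposure component $X_k$ with effect $\theta_k$, where $X = X_1 + X_2 + X_3$; each $X_k$ affects $Y$ with effect $\beta_k$; $U$ is the unobserved confounder.]

(b) Scenario 2: Multiple mechanisms for the exposure X.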

SLIDE 10

Mechanistic Heterogeneity in MR

Two scenarios of mechanistic heterogeneity

If we interpret the diagrams in the previous slide as linear structural equations as before, we can derive the Wald estimands for each pathway.

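Scenario   | Instruments Z                   | Pathway M | Effect of M on X | Effect of M on Y            | Wald estimand
-----------|---------------------------------|-----------|------------------|-----------------------------|----------------------------
Scenario 1 | $Z_{1,1}, \dots, Z_{1,p_1}$     | $M_1$     | $\theta_1$       | $\theta_1 \beta$            | $\beta$
Scenario 1 | $Z_{2,1}, \dots, Z_{2,p_2}$     | $M_2$     | $\theta_2$       | $\theta_2 \beta + \alpha_2$ | $\beta + \alpha_2/\theta_2$
Scenario 1 | $Z_{3,1}, \dots, Z_{3,p_3}$     | $M_3$     | $\theta_3$       | $\theta_3 \beta + \alpha_3$ | $\beta + \alpha_3/\theta_3$
Scenario 2 | $Z_{1,1}, \dots, Z_{1,p_1}$     | $M_1$     | $\theta_1$       | $\theta_1 \beta_1$          | $\beta_1$
Scenario 2 | $Z_{2,1}, \dots, Z_{2,p_2}$     | $M_2$     | $\theta_2$       | $\theta_2 \beta_2$          | $\beta_2$
Scenario 2 | $Z_{3,1}, \dots, Z_{3,p_3}$     | $M_3$     | $\theta_3$       | $\theta_3 \beta_3$          | $\beta_3$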

SNPs on the same pathway have the same Wald estimand, while SNPs across different pathways generally have different estimands. Mechanistic heterogeneity can arise even when all SNPs are valid instruments (Scenario 2).
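
The Wald estimand of a pathway is the effect of $M$ on $Y$ divided by the effect of $M$ on $X$. As a worked illustration, the sketch below evaluates that ratio for both scenarios. All numeric values and the helper name `wald_estimand` are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def wald_estimand(effect_on_x, effect_on_y):
    """Ratio of a pathway's effect on Y to its effect on X."""
    return effect_on_y / effect_on_x


# Illustrative (made-up) parameter values.
theta = [1.0, 2.0, 4.0]   # effect of M_k on X

# Scenario 1: common causal effect beta, direct effects alpha_k of M_k on Y.
beta = 0.5
alpha = [0.0, 0.4, -0.8]
scenario_1 = [wald_estimand(t, t * beta + a) for t, a in zip(theta, alpha)]

# Scenario 2: mechanism-specific causal effects beta_k, no direct effects.
betas = [0.5, -0.2, 0.0]
scenario_2 = [wald_estimand(t, t * b) for t, b in zip(theta, betas)]

print(scenario_1)  # approximately beta + alpha_k / theta_k for each pathway
print(scenario_2)  # approximately beta_k for each pathway
```

With these values the three pathways in Scenario 1 have estimands near 0.5, 0.7 and 0.3, so the estimands differ across pathways even though $\beta$ is the same; in Scenario 2 the estimands simply recover the chosen $\beta_k$.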

SLIDE 11

Mechanistic Heterogeneity in MR

Mechanism-specific causal effect

The same clustering phenomenon also occurs in nonlinear models.

It is well known that, assuming monotonicity, IV nonparametrically estimates the complier average treatment effect (Angrist et al., JASA, 1996).

By assuming monotonicity and Pearl’s nonparametric structural equation model with independent errors (NPSEM-IE), our paper showed that (if X, Z, M are all binary variables)

$$E[Y(X=1) - Y(X=0) \mid X(Z_{kj}=1) > X(Z_{kj}=0)] = E[Y(X=1) - Y(X=0) \mid X(M_k=1) > X(M_k=0)],$$

where $k$ indexes the mechanism and $j$ indexes the gene within it.

SLIDE 12

MR-PATH Model Assumptions

MR-PATH: Model Assumptions

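Assumption (Error-in-variables regression)

The observed SNP-exposure and SNP-outcome associations are distributed as

$$\begin{pmatrix} \hat{\theta}_{Xi} \\ \hat{\theta}_{Yi} \end{pmatrix} \overset{\text{indep.}}{\sim} N\left( \begin{pmatrix} \theta_{Xi} \\ \beta_i \theta_{Xi} \end{pmatrix}, \begin{pmatrix} \sigma_{Xi}^2 & 0 \\ 0 & \sigma_{Yi}^2 \end{pmatrix} \right), \quad i = 1, \dots, p,$$

where $\sigma_{Xi}, \sigma_{Yi}$ are (fixed) measurement errors.

Assumption (Mixture model for mechanistic heterogeneity)

$$Z_i \sim \text{Categorical}(\pi_1, \dots, \pi_K), \qquad \beta_i \mid Z_i = k \sim N(\mu_k, \sigma_k^2), \quad k = 1, \dots, K.$$

Here $Z_i$ denotes the latent cluster label of SNP $i$.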
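
To make the two assumptions concrete, the following sketch draws summary statistics from the generative model they describe. It is an illustration only: the function name `simulate_mr_path` and all parameter values are hypothetical, and the true SNP-exposure effects $\theta_{Xi}$ are supplied by the caller because the slide does not specify a distribution for them.

```python
import numpy as np


def simulate_mr_path(theta_x, sigma_x, sigma_y, pi, mu, sigma, seed=0):
    """Draw (cluster labels, beta_i, theta_X-hat, theta_Y-hat) from the model.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    theta_x = np.asarray(theta_x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    p = theta_x.size

    # Latent cluster label: Z_i ~ Categorical(pi_1, ..., pi_K), coded 0..K-1.
    z = rng.choice(len(pi), size=p, p=pi)
    # SNP-specific effect: beta_i | Z_i = k ~ N(mu_k, sigma_k^2).
    beta = rng.normal(mu[z], sigma[z])
    # Observed associations with independent measurement error.
    theta_x_hat = rng.normal(theta_x, sigma_x)
    theta_y_hat = rng.normal(beta * theta_x, sigma_y)
    return z, beta, theta_x_hat, theta_y_hat


# Two clusters with made-up means -1.0 and 0.5.
z, beta, theta_x_hat, theta_y_hat = simulate_mr_path(
    theta_x=np.full(2000, 0.1), sigma_x=0.01, sigma_y=0.01,
    pi=[0.5, 0.5], mu=[-1.0, 0.5], sigma=[0.05, 0.05],
)
print(np.bincount(z), beta[z == 0].mean(), beta[z == 1].mean())
```

Plotting `theta_y_hat` against `theta_x_hat` for such a draw would show SNPs scattered around two lines through the origin with slopes near the cluster means, which is the pattern the mixture model is designed to capture.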

SLIDE 13

MR-PATH Statistical inference

MR-PATH: Statistical Inference

1 Monte-Carlo EM algorithm for obtaining model parameter estimates
2 Approximate confidence intervals for quantifying uncertainty of the estimates
3 Modified Bayesian information criterion (BIC) for selecting the number of clusters

We perform simulation studies to verify the efficacy of these inference procedures. See paper for implementation details.
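
The paper's Monte-Carlo EM algorithm is not reproduced here. As a rough intuition for what an E-step computes, the sketch below evaluates cluster membership probabilities under a deliberate simplification: it treats $\hat{\theta}_{Xi}$ as if it were the true $\theta_{Xi}$ (ignoring $\sigma_{Xi}$), in which case $\hat{\theta}_{Yi} \mid Z_i = k \sim N(\mu_k \theta_{Xi},\ \sigma_k^2 \theta_{Xi}^2 + \sigma_{Yi}^2)$ and the probabilities have a closed form. The function name `cluster_membership_probs` and all inputs are hypothetical.

```python
import numpy as np


def cluster_membership_probs(theta_x_hat, theta_y_hat, sigma_y, pi, mu, sigma):
    """P(Z_i = k | data) for fixed parameters, ignoring error in theta_x_hat.

    Simplified illustration, not the Monte-Carlo EM algorithm of the paper.
    Returns an array of shape (p, K) whose rows sum to 1.
    """
    tx = np.asarray(theta_x_hat, dtype=float)[:, None]          # shape (p, 1)
    ty = np.asarray(theta_y_hat, dtype=float)[:, None]          # shape (p, 1)
    sy = np.broadcast_to(np.asarray(sigma_y, dtype=float), ty.shape)
    pi = np.asarray(pi, dtype=float)[None, :]                   # shape (1, K)
    mu = np.asarray(mu, dtype=float)[None, :]
    sigma = np.asarray(sigma, dtype=float)[None, :]

    mean = mu * tx                        # E[theta_Y-hat | Z_i = k]
    var = (sigma * tx) ** 2 + sy ** 2     # Var[theta_Y-hat | Z_i = k]
    log_w = np.log(pi) - 0.5 * np.log(2 * np.pi * var) - 0.5 * (ty - mean) ** 2 / var
    log_w = log_w - log_w.max(axis=1, keepdims=True)   # stabilize before exp
    w = np.exp(log_w)
    return w / w.sum(axis=1, keepdims=True)


probs = cluster_membership_probs(
    theta_x_hat=[0.1, 0.1], theta_y_hat=[-0.1, 0.05], sigma_y=0.005,
    pi=[0.5, 0.5], mu=[-1.0, 0.5], sigma=[0.05, 0.05],
)
print(probs)
```

In this made-up example the first SNP lies on the line with slope $-1.0$ and the second on the line with slope $0.5$, so each is assigned almost entirely to the matching cluster. Handling the measurement error in $\hat{\theta}_{Xi}$ is what makes the full problem harder and motivates a Monte-Carlo approach.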

SLIDE 14

Results HDL-CHD

Example: HDL-CHD

Data (Three-sample MR design)

Selection dataset: Teslovich et al. 2010 [1]
Exposure dataset: Kettunen et al. 2016 [2]
Outcome dataset: Nikpay et al. 2015 [3]

[1] Tanya M Teslovich et al. “Biological, clinical and population relevance of 95 loci for blood lipids”. In: Nature 466.7307 (2010), pp. 707–713.
[2] Johannes Kettunen et al. “Genome-wide study for circulating metabolites identifies 62 loci and reveals novel systemic effects of LPA”. In: Nature Communications 7.1 (2016), pp. 1–9.
[3] Majid Nikpay et al. “A comprehensive 1000 Genomes–based genome-wide association meta-analysis of coronary artery disease”. In: Nature Genetics 47.10 (2015), p. 1121.

SLIDE 15

Results HDL-CHD

Example: HDL-CHD

[Figure: scatter plot of SNP association with HDL-C and SNP association with CHD.]

Results of MR-RAPS.

SLIDE 16

Results HDL-CHD

Example: HDL-CHD

[Figure: scatter plot of SNP association with HDL-C and SNP association with CHD, with a legend for clusters 1 and 2.]

Results of MR-PATH (http://danieliong.me/mr-path/).

SLIDE 17

Results HDL-CHD

Example: HDL-CHD

[Figure: 95% posterior credible interval for each SNP (axis from -1.0 to 0.0), shown with cluster membership probabilities (0 to 1) for clusters 1 and 2.]

SNPs shown: rs838880, rs4846914, rs2954029, rs2814944, rs12678919, rs2241770, rs9326246, rs863750, rs2943634, rs643531, rs17782313, rs11869286, rs4731702, rs8071366, rs7134375, rs2243976, rs4969178, rs4939883, rs4841132, rs2293889, rs4660293, rs3136441, rs2923084, rs7679, rs9989419, rs3890182, rs16942887, rs11067231, rs174546, rs588136, rs1532085.

SLIDE 18

Results HDL-CHD

Example: HDL-CHD

[Figure: heatmap of z-scores (color scale from -7 to 7) across SNPs and lipoprotein subfraction traits.]

Traits shown: VLDL-D, XXL-VLDL-TG, XXL-VLDL-PL, XXL-VLDL-P, XXL-VLDL-L, XL-VLDL-TG, XL-VLDL-PL, XL-VLDL-P, XL-VLDL-L, L-VLDL-TG, L-VLDL-PL, L-VLDL-P, L-VLDL-L, L-VLDL-FC, L-VLDL-CE, L-VLDL-C, M-VLDL-TG, M-VLDL-PL, M-VLDL-P, M-VLDL-L, M-VLDL-FC, M-VLDL-CE, M-VLDL-C, S-VLDL-TG, S-VLDL-PL, S-VLDL-P, S-VLDL-L, S-VLDL-FC, XS-VLDL-TG, XS-VLDL-PL, XS-VLDL-P, XS-VLDL-L, LDL-D, LDL-C, APOB, L-LDL-PL, L-LDL-P, L-LDL-L, L-LDL-FC, L-LDL-CE, L-LDL-C, M-LDL-PL, M-LDL-P, M-LDL-L, M-LDL-CE, M-LDL-C, S-LDL-P, S-LDL-L, S-LDL-C, IDL-TG, IDL-PL, IDL-P, IDL-L, IDL-FC, IDL-C, HDL-D, HDL-C, APOA1, XL-HDL-TG, XL-HDL-PL, XL-HDL-P, XL-HDL-L, XL-HDL-FC, XL-HDL-CE, XL-HDL-C, L-HDL-PL, L-HDL-P, L-HDL-L, L-HDL-FC, L-HDL-CE, L-HDL-C, M-HDL-PL, M-HDL-P, M-HDL-L, M-HDL-FC, M-HDL-CE, M-HDL-C, S-HDL-TG, S-HDL-P, S-HDL-L.

SNPs shown: rs838880, rs4846914, rs2954029, rs2814944, rs12678919, rs2241770, rs9326246, rs863750, rs2943634, rs643531, rs17782313, rs11869286, rs4731702, rs8071366, rs7134375, rs2243976, rs4969178, rs4939883, rs4841132, rs2293889, rs4660293, rs3136441, rs2923084, rs7679, rs9989419, rs3890182, rs16942887, rs11067231, rs174546, rs588136, rs1532085.

SLIDE 19

Results BMI-T2D

Example: BMI-T2D

Data (Three-sample MR design)

Selection dataset: Akiyama et al. 2017 [1]
Exposure dataset: Locke et al. 2015 [2]
Outcome dataset: Mahajan et al. 2018 [3]

[1] Masato Akiyama et al. “Genome-wide association study identifies 112 new loci for body mass index in the Japanese population”. In: Nature Genetics 49.10 (2017), p. 1458.
[2] Adam E Locke et al. “Genetic studies of body mass index yield new insights for obesity biology”. In: Nature 518.7538 (2015), pp. 197–206.
[3] Anubha Mahajan et al. “Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps”. In: Nature Genetics 50.11 (2018), pp. 1505–1513.

SLIDE 20

Results BMI-T2D

Example: BMI-T2D

[Figure: scatter plot of SNP association with BMI and SNP association with T2D.]

Results of MR-RAPS.

SLIDE 21

Results BMI-T2D

Example: BMI-T2D

[Figure: scatter plot of SNP association with BMI and SNP association with T2D, with a legend for clusters 1 and 2.]

Results of MR-PATH.

SLIDE 22

Results BMI-T2D

Example: BMI-T2D

[Figure: 95% posterior credible interval for each SNP (axis from -20 to -5), shown with cluster membership probabilities (0 to 1) for clusters 1 and 2.]

SNPs shown: rs7903146, rs9368222, rs7020996, rs7923837, rs6444082, rs10906111, rs2237892, rs4357030, rs1568079, rs5007237, rs10208649, rs3820594, rs10504374, rs6734118, rs1592269, rs16847378, rs5015933, rs1562987, rs4794047, rs10828399, rs1442493, rs2301712, rs1459194, rs12236219, rs7588043, rs4803846, rs2977907, rs16978957, rs2390669, rs6947395, rs1048029, rs4790981, rs729050, rs6507516, rs9397585, rs11235, rs139913, rs3826893, rs3814424, rs11951673, rs713586, rs4409766, rs10501087, rs1899444, rs12597682, rs12617233, rs17640009, rs1996023, rs764975, rs2920930, rs12595158, rs16937956, rs633715, rs2053979, rs6567160, rs2307111, rs939584, rs9568867, rs2206277, rs1558902.

SLIDE 23

Results BMI-T2D

Example: BMI-T2D

[Figure: SNP-specific slope and SNP association with peak blood insulin, with a legend for clusters 1 and 2. Labeled SNPs: rs10906111, rs2237892, rs6444082, rs7020996, rs7903146, rs7923837, rs9368222.]

SLIDE 24

Conclusion

Concluding remarks

A few other related methods:

- MR-Clust: Constructs a mixture model based on SNP-specific Wald estimators.
- GRAPPLE: A visualization tool that does not attempt to model different mechanisms explicitly.
- BESIDE-MR: A Bayesian model averaging approach that extends the profile likelihood used in MR-RAPS.

Advantages of MR-PATH:

- Does not require individually strong instruments.
- Accounts for measurement error in the summary data.
- An interpretable generative model for multiple causal mechanisms.
- Potential extensions to multivariable MR with correlated SNPs.

Further information: http://danieliong.me/mr-path/
