Estimation of causal direction in the presence of latent confounders and linear non-Gaussian SEMs

1 Causal Modeling and Machine Learning, Beijing, China, June 2014
Estimation of causal direction in the presence of latent confounders and linear non-Gaussian SEMs
Shohei Shimizu, Osaka University, Japan
with Kenneth Bollen, University of North Carolina, Chapel Hill, USA

2 Abstract
• Estimation of the causal direction between two observed variables in the presence of latent confounders
• A key challenge in causal discovery
• We propose a non-Gaussian method
• It does not require specifying the number of latent confounders
• Experiments on artificial and sociology data

Background

4 Motivation
• Causality is a main interest in many empirical sciences
• Many recent methods for estimating causal directions (with no temporal information)
– Linear non-Gaussian model (Dodge & Rousson, 2001; Shimizu et al., 2006)
– Nonlinear model (Hoyer et al., 2009; Zhang & Hyvarinen, 2009; Peters et al., 2011)
• Example from epidemiology (Rosenstrom et al., 2012): sleep problems → depression mood, or depression mood → sleep problems? Which is dominant?
• Another important challenge: latent confounders

5 Structural equation modeling (SEM) (Bollen, 1989; Pearl, 2000, 2009)
• A framework for describing causal relations
• An example (of linear cases): $x_2 := f(x_1, e_2) = b_{21} x_1 + e_2$
– The value of $x_2$ is determined by the values of $x_1$ and the error/exogenous variable $e_2$ through the linear function
• Generally speaking, if the value of $x_1$ is changed and that of $x_2$ also changes, then $x_1$ causes $x_2$
[Figure: path diagram x1 → x2, with error e2 pointing into x2]

6 Major challenges
1. Estimation of causal direction when temporal information is not available: x1 → x2 or x1 ← x2?
2. Coping with latent confounders: x1 → x2 or x1 ← x2, when a latent confounder f1 affects both x1 and x2?

7 Non-Gaussian approach: LiNGAM (Linear Non-Gaussian Acyclic Model) (Shimizu et al., 2006)
• Acyclic SEMs with different causal directions are distinguishable (Dodge & Rousson, 2001; Shimizu et al., 2006)
Model 1 (x1 → x2):
$$x_1 = e_1, \qquad x_2 = b_{21} x_1 + e_2$$
or Model 2 (x1 ← x2):
$$x_1 = b_{12} x_2 + e_1, \qquad x_2 = e_2$$
where $e_1$ and $e_2$ are error/exogenous variables
• Fundamental assumptions:
– $e_1$ and $e_2$ are non-Gaussian
– $e_1$ and $e_2$ are independent (no latent confounders)

8 Different directions give different data distributions
Model 1: $x_1 = e_1$, $x_2 = 0.8\,x_1 + e_2$
Model 2: $x_1 = 0.8\,x_2 + e_1$, $x_2 = e_2$
with $E(e_1) = E(e_2) = 0$ and $\mathrm{var}(x_1) = \mathrm{var}(x_2) = 1$
[Figure: scatter plots of x1 and x2 under each model, once with Gaussian errors and once with non-Gaussian (uniform) errors]
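
The contrast on this slide can be checked numerically. The sketch below is an illustration only, not the estimation method of the talk: it simulates Model 1 with uniform errors, fits an ordinary least-squares regression in each direction, and then correlates the squared regressor with the squared residual. The helper names (`zero_mean_uniform`, `pearson`, `squared_dependence`) and the choice of this particular dependence measure are assumptions made for the example.

```python
import math
import random


def zero_mean_uniform(rng, sd):
    """Draw from a zero-mean uniform distribution with standard deviation sd."""
    half_width = sd * math.sqrt(3.0)  # variance of U(-a, a) is a^2 / 3
    return rng.uniform(-half_width, half_width)


def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def squared_dependence(cause, effect):
    """Regress effect on cause by OLS, then correlate cause^2 with residual^2.

    The OLS residual is always uncorrelated with the regressor; the squared
    terms expose dependence that remains when the assumed direction is wrong.
    """
    n = len(cause)
    mc, me = sum(cause) / n, sum(effect) / n
    c = [a - mc for a in cause]
    e = [b - me for b in effect]
    slope = sum(a * b for a, b in zip(c, e)) / sum(a * a for a in c)
    resid = [b - slope * a for a, b in zip(c, e)]
    return pearson([a * a for a in c], [r * r for r in resid])


rng = random.Random(0)
n = 5000
# Model 1: x1 = e1, x2 = 0.8 * x1 + e2, scaled so that var(x1) = var(x2) = 1
x1 = [zero_mean_uniform(rng, 1.0) for _ in range(n)]
x2 = [0.8 * a + zero_mean_uniform(rng, 0.6) for a in x1]

forward = squared_dependence(x1, x2)   # assumes x1 -> x2 (the true direction)
backward = squared_dependence(x2, x1)  # assumes x2 -> x1 (the wrong direction)
print(f"x1 -> x2: {forward:+.3f}   x2 -> x1: {backward:+.3f}")
```

In the true direction the statistic should be close to zero, while in the wrong direction it is clearly non-zero. With Gaussian errors both directions would give values near zero, which is why the Gaussian case cannot be resolved this way.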

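9 LiNGAM with latent confounders (Hoyer, Shimizu & Kerminen, 2008)
• Extension to incorporate non-Gaussian latent confounders $f_q$:
$$x_i = \mu_i + \sum_{q=1}^{Q} \lambda_{iq} f_q + \sum_{j \neq i} b_{ij} x_j + e_i$$
where, without loss of generality, the $f_q$ $(q = 1, \ldots, Q)$ are independent
• For two variables with x1 → x2:
$$x_1 = \mu_1 + \sum_{q=1}^{Q} \lambda_{1q} f_q + e_1$$
$$x_2 = \mu_2 + \sum_{q=1}^{Q} \lambda_{2q} f_q + b_{21} x_1 + e_2$$
[Figure: path diagram with latent confounders f1, …, fQ pointing into x1 and x2, x1 → x2, and errors e1, e2]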
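
To make the two-variable model concrete, here is a minimal simulation sketch. It assumes Laplace-distributed confounders and errors, sets the constant intercepts to zero, and uses hypothetical coefficient values; the function names `laplace`, `simulate_pair`, and `pearson` are introduced only for this example.

```python
import math
import random


def laplace(rng):
    """Standard Laplace draw (variance 2) as a difference of two exponentials."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def simulate_pair(n, lam1, lam2, b21, seed=0):
    """Simulate x1 -> x2 with Q = len(lam1) independent latent confounders.

    x1 = sum_q lam1[q] * f_q + e1
    x2 = sum_q lam2[q] * f_q + b21 * x1 + e2
    The constant intercepts are set to zero in this sketch.
    """
    rng = random.Random(seed)
    x1, x2 = [], []
    for _ in range(n):
        f = [laplace(rng) for _ in lam1]          # latent confounders f_1..f_Q
        e1, e2 = laplace(rng), laplace(rng)       # non-Gaussian errors
        v1 = sum(l * fq for l, fq in zip(lam1, f)) + e1
        v2 = sum(l * fq for l, fq in zip(lam2, f)) + b21 * v1 + e2
        x1.append(v1)
        x2.append(v2)
    return x1, x2


def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


# b21 = 0: x1 has no causal effect on x2, yet the two share confounders
x1, x2 = simulate_pair(n=4000, lam1=[1.0, 1.0], lam2=[1.0, 1.0], b21=0.0)
r = pearson(x1, x2)
print(f"correlation with b21 = 0: {r:.3f}")
```

Even with no causal effect, the shared confounders induce a substantial correlation between x1 and x2, which is what makes the direction hard to estimate.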

10 Previous estimation approaches
• Explicitly model latent confounders and compare two models with opposite directions of causation
– Maximum likelihood principle (Hoyer et al., 2008)
– Bayesian model selection (Henao & Winther, 2011)
– Laplace / finite mixture of Gaussians for $p(e_i)$
• These approaches require specifying the number of latent confounders, which is difficult in general
[Figure: the two candidate models, x1 → x2 or x1 ← x2, each with latent confounders f1, …, fQ and errors e1, e2]

Our proposal
Reference: Shimizu and Bollen (2014), Journal of Machine Learning Research, in press

12 Key idea (1/2)
• Another look at the LiNGAM with latent confounders, for the m-th observation:
$$x_2^{(m)} = \mu_2 + \underbrace{\sum_{q=1}^{Q} \lambda_{2q} f_q^{(m)}}_{\tilde{\mu}_2^{(m)}} + b_{21} x_1^{(m)} + e_2^{(m)}$$
• Observations are generated from the LiNGAM model with possibly different intercepts $\mu_2 + \tilde{\mu}_2^{(m)}$
[Figure: the confounded path diagram for x1 and x2, redrawn as one unconfounded diagram x1 → x2 per observation, each with its own intercept]

13 Key idea (2/2)
• Include the sums of latent confounders as the observation-specific intercepts:
$$x_2^{(m)} = \mu_2 + \underbrace{\sum_{q=1}^{Q} \lambda_{2q} f_q^{(m)}}_{\tilde{\mu}_2^{(m)}\ \text{(obs.-specific intercept)}} + b_{21} x_1^{(m)} + e_2^{(m)}$$
• Latent confounders are not modeled explicitly
• It is necessary neither to specify the number of latent confounders $Q$ nor to estimate the coefficients $\lambda_{2q}$
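
The effect of treating the confounder sums as intercepts can be illustrated with a small simulation. This is a sketch under strong assumptions: the true intercepts are treated as known ("oracle"), which is never the case in practice, where the method places a prior on them instead. The coefficient values, the Laplace distributions, and the helper names are hypothetical.

```python
import random


def laplace(rng):
    """Standard Laplace draw as a difference of two exponentials."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def ols_slope(x, y):
    """Slope of the least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)
n, b21 = 5000, 0.5
lam1 = [1.0, 0.5, 0.8]  # hypothetical confounder coefficients for x1 (Q = 3)
lam2 = [0.7, 1.0, 0.6]  # hypothetical confounder coefficients for x2

x1, x2, intercept2 = [], [], []
for _ in range(n):
    f = [laplace(rng) for _ in lam1]
    m1 = sum(l * fq for l, fq in zip(lam1, f))  # obs.-specific intercept of x1
    m2 = sum(l * fq for l, fq in zip(lam2, f))  # obs.-specific intercept of x2
    v1 = m1 + laplace(rng)
    v2 = m2 + b21 * v1 + laplace(rng)
    x1.append(v1)
    x2.append(v2)
    intercept2.append(m2)

# Ignoring the confounders biases the estimated effect of x1 on x2
naive = ols_slope(x1, x2)
# Subtracting the (oracle) intercepts leaves x2 - intercept = b21 * x1 + e2
oracle = ols_slope(x1, [v - m for v, m in zip(x2, intercept2)])
print(f"true b21 = {b21}, naive = {naive:.3f}, oracle = {oracle:.3f}")
```

Once the observation-specific intercept is accounted for, only its value matters; neither the number of confounders nor the individual coefficients is needed.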

14 Our approach
• Compare these two LiNGAM models with opposite directions:
Model 3 (x1 → x2):
$$x_1^{(m)} = \mu_1 + \tilde{\mu}_1^{(m)} + e_1^{(m)}, \qquad x_2^{(m)} = \mu_2 + \tilde{\mu}_2^{(m)} + b_{21} x_1^{(m)} + e_2^{(m)}$$
Model 4 (x1 ← x2):
$$x_1^{(m)} = \mu_1 + \tilde{\mu}_1^{(m)} + b_{12} x_2^{(m)} + e_1^{(m)}, \qquad x_2^{(m)} = \mu_2 + \tilde{\mu}_2^{(m)} + e_2^{(m)}$$
• Many additional parameters $\tilde{\mu}_i^{(m)}$ $(i = 1, 2;\ m = 1, \ldots, n)$
• Prior for the observation-specific intercepts $\tilde{\mu}_i^{(m)}$
• Other parameters: low-informative priors (Gaussian with large sd)
• Bayesian model selection (marginal likelihoods)
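
Given the log marginal likelihoods of the two models, the final comparison step is simple arithmetic. The sketch below shows only that step, assuming equal prior weight on both models; computing the marginal likelihoods themselves is the hard part and is not shown. The numeric values are hypothetical, and `posterior_model_probs` is a name introduced for this example.

```python
import math


def posterior_model_probs(log_marginals, log_priors=None):
    """Posterior model probabilities from log marginal likelihoods."""
    if log_priors is None:
        log_priors = [0.0] * len(log_marginals)  # equal prior weight per model
    scores = [lm + lp for lm, lp in zip(log_marginals, log_priors)]
    top = max(scores)
    # Subtract the maximum before exponentiating to avoid underflow
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log marginal likelihoods, for illustration only
log_ml = {"Model 3 (x1 -> x2)": -2810.4, "Model 4 (x1 <- x2)": -2812.4}
probs = posterior_model_probs(list(log_ml.values()))
best = max(zip(log_ml, probs), key=lambda pair: pair[1])[0]
for name, p in zip(log_ml, probs):
    print(f"{name}: {p:.3f}")
print("selected:", best)
```

A difference of 2 in log marginal likelihood corresponds to a posterior probability of about 0.88 for the better model under equal priors.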

15 Prior for the observation-specific intercepts
$$\tilde{\mu}_1^{(m)} = \sum_{q=1}^{Q} \lambda_{1q} f_q^{(m)}, \qquad \tilde{\mu}_2^{(m)} = \sum_{q=1}^{Q} \lambda_{2q} f_q^{(m)}$$
• Motivation: central limit theorem
– Sums of independent variables tend to be more Gaussian
• Approximate the density by a bell-shaped curve distribution:
– $(\tilde{\mu}_1^{(m)}, \tilde{\mu}_2^{(m)})^\top \sim$ t-distribution with sd $\tau_l$ $(l = 1, 2)$, correlation $\sigma_{12}$, and DOF $\nu$
• Select the hyper-parameter values that maximize the marginal likelihood (empirical Bayes):
– $\tau_l \in \{0,\ 0.2\,\mathrm{sd}(x_l),\ \ldots,\ 1.0\,\mathrm{sd}(x_l)\}$, $\sigma_{12} \in \{0,\ 0.1,\ \ldots,\ 0.9\}$
– DOF $\nu$ fixed to be 6 in the experiments below
• Small $\tau_l$ means similar intercepts
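
The hyper-parameter grid and the bell-shaped prior density can be sketched as follows. The grid follows the values listed on the slide. The density function is an assumption about the parametrization: it treats the two scale hyper-parameters as standard deviations of a zero-mean bivariate t-distribution, so the scale matrix is the covariance matrix times (DOF - 2) / DOF. The function names are introduced for this example, and the marginal likelihood that empirical Bayes would maximize over the grid is not computed here.

```python
import itertools
import math


def hyperparameter_grid(sd_x1, sd_x2):
    """All (tau1, tau2, sigma12) combinations listed on the slide."""
    taus1 = [0.2 * k * sd_x1 for k in range(6)]  # 0, 0.2 sd(x1), ..., 1.0 sd(x1)
    taus2 = [0.2 * k * sd_x2 for k in range(6)]  # 0, 0.2 sd(x2), ..., 1.0 sd(x2)
    sigmas = [0.1 * k for k in range(10)]        # 0, 0.1, ..., 0.9
    return list(itertools.product(taus1, taus2, sigmas))


def bivariate_t_logpdf(m1, m2, tau1, tau2, sigma12, dof=6.0):
    """Log density of a zero-mean bivariate t with sds tau1, tau2 > 0.

    Assumption: tau1 and tau2 are standard deviations, so the scale matrix
    equals the covariance matrix multiplied by (dof - 2) / dof.
    """
    shrink = (dof - 2.0) / dof
    s11 = shrink * tau1 ** 2
    s22 = shrink * tau2 ** 2
    s12 = shrink * sigma12 * tau1 * tau2
    det = s11 * s22 - s12 ** 2
    # Quadratic form (m1, m2) S^{-1} (m1, m2)^T for the 2x2 scale matrix S
    quad = (s22 * m1 ** 2 - 2.0 * s12 * m1 * m2 + s11 * m2 ** 2) / det
    return (math.lgamma((dof + 2.0) / 2.0) - math.lgamma(dof / 2.0)
            - math.log(dof * math.pi) - 0.5 * math.log(det)
            - 0.5 * (dof + 2.0) * math.log1p(quad / dof))


grid = hyperparameter_grid(sd_x1=1.0, sd_x2=1.0)
print(len(grid), "hyper-parameter combinations")

at_zero = bivariate_t_logpdf(0.0, 0.0, tau1=1.0, tau2=1.0, sigma12=0.0)
far_out = bivariate_t_logpdf(2.0, -2.0, tau1=1.0, tau2=1.0, sigma12=0.0)
print(f"log density at origin: {at_zero:.4f}, at (2, -2): {far_out:.4f}")
```

Grid entries with a zero scale are degenerate (all intercepts equal, so no confounding) and would need separate handling; the density function above is defined only for strictly positive scales.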

Experiments on artificial data

17 Experimental results (100 obs.)
• Data generated from LiNGAM with latent confounders
• Various non-Gaussian distributions
– Laplace, uniform, asymmetric distributions, etc.
• Our method uses Laplace for $p(e_i)$
[Figure: bar charts of the numbers of successful discoveries (100 repetitions), one panel for 1 latent confounder and one for 6 latent confounders, comparing our method, Hoyer (1, 4 conf.), and Henao (1, 4, 10 conf.)]

Experiment on sociology data

19 Sociology data
• Source: General Social Survey (n=1380)
– Non-farm background, ages 35-44, white, male, in the labor force, no missing data for any of the covariates, 1972-2006
[Figure: status attainment model (Duncan et al., 1972), including x2: Son’s Income]

20 Evaluation of our method using the sociology data
Known (temporal) orderings of 15 pairs:
Father’s Education → Son’s Education
…
Father’s Education → Son’s Income
…
Son’s Occupation → Son’s Income

Conclusions

22 Conclusions
• Estimation of causal direction in the presence of latent confounders is a major challenge in causal discovery
• Our proposal: fit a linear non-Gaussian SEM with possibly different intercepts to the data
• Future work
– Test other informative priors for observation-specific intercepts
– Implement a wider variety of error/prior distributions (e.g., learn the DOF of the t-distribution)
– Develop extensions using nonlinear/cyclic models (Hoyer et al., 2009; Zhang & Hyvarinen, 2009; Lacerda et al., 2008) instead of LiNGAM
