SLIDE 1

Entropic Causal Inference

Murat Kocaoglu, Alexandros G. Dimakis, Sriram Vishwanath and Babak Hassibi

University of Texas at Austin

November 28, 2019

Presented by Amirkasra Jalaldoust

SLIDES 2–9

Outline

Problem Definition
Approach
Background and Notation
Identifiability (H0)
Identifiability (H1)
Greedy Entropy Minimization
Experiments

SLIDES 10–15

Problem Definition

Pair of random variables: (X, Y) ∼ p_{X,Y}
Causal discovery: decide the direction, X → Y or Y → X
Structural causal model: E ∼ p_E, Y = f(X, E)
Causal sufficiency: X ⊥⊥ E
Example (additive noise): f(X, E) = f(X) + E; linear causal mechanism: f(X) = A·X + µ
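
To make the setup concrete, here is a minimal sketch (mine, not from the slides) of such a discrete structural causal model: X and E are drawn independently, Y = f(X, E) is deterministic, and together they induce a joint distribution p_{X,Y}. The state counts and the function table are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 4                          # states of X and of the exogenous E

p_X = rng.dirichlet(np.ones(n))      # distribution of the cause X
p_E = rng.dirichlet(np.ones(m))      # distribution of the noise E
f = rng.integers(0, n, size=(n, m))  # deterministic mechanism f: [n] x [m] -> [n]

# Causal sufficiency: X and E are sampled independently.
x = rng.choice(n, size=10_000, p=p_X)
e = rng.choice(m, size=10_000, p=p_E)
y = f[x, e]                          # Y = f(X, E)

# Empirical joint distribution p_{X,Y} induced by the model.
p_XY = np.zeros((n, n))
np.add.at(p_XY, (x, y), 1.0 / len(x))
print(np.round(p_XY, 3))
```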

SLIDES 16–18

Approach

The use of information theory as a tool for causal discovery, e.g. Granger causality, directed information, etc.


SLIDE 21

Approach

Key assumption: the exogenous noise E is "simple" in the correct causal direction.
Occam's razor: there should not be much complexity left outside the causal model.

SLIDE 22

Approach

Focus on discrete random variables, i.e. categorical variables: p_X(i) = P(X = i).
Notion of simplicity: Rényi entropy,
    H_a(X) = 1/(1 − a) · log(Σ_i p_X(i)^a)
This work emphasizes two special cases:
    Shannon entropy: H_1
    Cardinality: H_0
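
A quick numerical illustration (mine, not the slides'): H_0 is the log of the support size, and H_1 recovers Shannon entropy as a → 1.

```python
import numpy as np

def renyi_entropy(p, a):
    """Rényi entropy H_a(p) = log(sum_i p_i^a) / (1 - a), in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if a == 0:                        # H_0: log of the support size
        return np.log2(len(p))
    if a == 1:                        # limit a -> 1: Shannon entropy
        return -np.sum(p * np.log2(p))
    return np.log2(np.sum(p ** a)) / (1 - a)

p = [0.5, 0.25, 0.25]
print(renyi_entropy(p, 0))            # log2(3) ≈ 1.585 bits
print(renyi_entropy(p, 1))            # 1.5 bits (Shannon)
```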

SLIDE 23

Approach

Objective: find the minimum H(E) such that Y = f(X, E) is feasible, i.e. such that some deterministic f and some E ⊥⊥ X induce the observed p_{X,Y}.

SLIDE 24

Identifiability (H0)

Causal model: M = ({X, Y}, E, f, X → Y, p_{X,E})
Independent, identically distributed samples: {(x_i, y_i)}_i ∼ p_{X,Y}
Task: decide X → Y or Y → X, given the joint distribution p_{X,Y}.

SLIDE 25

Identifiability (H0)

Throughout, both X and Y have cardinality n, and E has cardinality m.

Definition (Conditional distribution matrix). The n × n matrix Y|X with Y|X(i, j) := P(Y = i | X = j). The vector vec(Y|X), with vec(Y|X)(i + (j − 1)n) = Y|X(i, j), is called the conditional distribution vector.

Definition (Block partition matrices). Consider a matrix M ∈ {0, 1}^(n² × m). Let m_{i,j} denote the (i + (j − 1)n)-th row of M, and let S_{i,j} = {k ∈ [m] : m_{i,j}(k) = 1}. The matrix M is called a block partition matrix if it belongs to
    C := {M ∈ {0, 1}^(n² × m) : ∪_{i ∈ [n]} S_{i,j} = [m] and S_{i,j} ∩ S_{l,j} = ∅ for all i ≠ l}.
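
For concreteness, a small sketch (mine) of both objects, computed from a random joint distribution:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
p_XY = rng.dirichlet(np.ones(n * n)).reshape(n, n)  # p_XY[j, i] = P(X=j, Y=i)

# Conditional distribution matrix: Y|X(i, j) = P(Y = i | X = j).
p_X = p_XY.sum(axis=1)
cond = (p_XY / p_X[:, None]).T      # rows i (values of Y), columns j (values of X)

# Conditional distribution vector: stack the columns, so the entry at
# i + (j - 1)n (1-indexed) equals Y|X(i, j).
vec = cond.T.reshape(-1)            # column-major stacking of `cond`
assert np.isclose(vec[0 + 1 * n], cond[0, 1])   # i = 1, j = 2 in 1-indexed terms
```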

SLIDE 26

Identifiability (H0)

Equivalent condition for the existence of a causal mechanism:

Lemma (Lemma 1). Given discrete random variables X, Y with distribution p_{X,Y}, there exists a causal model M = ({X, Y}, E, f, X → Y, p_{X,E}) with E of cardinality m if and only if there exist M ∈ C and e ∈ R^m_+ with Σ_i e(i) = 1 that satisfy vec(Y|X) = Me.
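
The "if" direction can be seen computationally. In this toy sketch of mine (arbitrary n = 2, m = 3), each column block of M partitions the states of E among the values of Y, which is exactly a deterministic f(x, e); the induced conditional is then vec(Y|X) = Me.

```python
import numpy as np

n, m = 2, 3
# For each x, assign each state of E to a value of Y: this table is f(x, e).
f = np.array([[0, 0, 1],    # f(x=0, e) for e = 0, 1, 2
              [1, 0, 1]])   # f(x=1, e)

# Build the block partition matrix M: row i + j*n (0-indexed) has a 1 in
# column k iff f(j, k) = i, i.e. S_{i,j} = {e : f(j, e) = i}.
M = np.zeros((n * n, m))
for j in range(n):
    for k in range(m):
        M[f[j, k] + j * n, k] = 1

e = np.array([0.5, 0.3, 0.2])       # distribution of E, sums to 1
vec = M @ e                          # vec(Y|X), columns stacked
print(vec.reshape(n, n).T)           # Y|X: each column P(Y = . | X = j) sums to 1
```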

SLIDE 27

Identifiability (H0)

Lemma (Upper bound on the minimum cardinality of E). Let X, Y be two random variables with joint distribution p_{X,Y}(x, y), each with n states. Then there exists a causal model Y = f(X, E), X ⊥⊥ E, that induces p_{X,Y} with m = |E| ≤ n(n − 1) + 1. Moreover, if the columns of Y|X are uniformly sampled points in the (n − 1)-dimensional simplex, then n(n − 1) states are necessary for E.
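
The slides do not include the proof, but one standard construction achieving this bound (sketched here under that assumption) couples all columns through a single E uniform on [0, 1]: cut [0, 1] at the CDF breakpoints of every column of Y|X, and let f(x, ·) map each resulting interval to the corresponding value of Y. Each column has at most n − 1 interior breakpoints, so at most n(n − 1) cuts, i.e. n(n − 1) + 1 intervals (states of E).

```python
import numpy as np

def exogenous_states(cond, tol=1e-12):
    """States needed by the coupled E: intervals of [0, 1] obtained by cutting
    at the CDF breakpoints of every column of cond (cond[i, j] = P(Y=i|X=j));
    f(x, e) is the value of Y whose interval in column x contains e."""
    n = cond.shape[1]
    cuts = set()
    for j in range(n):
        cdf = np.cumsum(cond[:, j])[:-1]      # up to n-1 interior breakpoints
        cuts.update(float(c) for c in cdf if tol < c < 1 - tol)
    return len(cuts) + 1                      # intervals = interior cuts + 1

rng = np.random.default_rng(2)
n = 4
cond = rng.dirichlet(np.ones(n), size=n).T    # columns: random simplex points
print(exogenous_states(cond), "<=", n * (n - 1) + 1)
```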

SLIDE 28

Identifiability (H0)

True causal direction: Y = f(X, E), X ⊥⊥ E
Wrong causal direction: X = g(Y, Ẽ), Ẽ ⊥⊥ Y
Under mild assumptions on the generation of f, X, and E (rather than on Y|X directly), the same lower bound holds.

SLIDE 29

Identifiability (H0)

Definition (Generic function). Let Y = f(X, E), where the variables X, Y, E have supports 𝒳, 𝒴, ℰ, respectively. Let S_{y,x} = f_x^{−1}(y) ⊆ ℰ be the inverse map, i.e. S_{y,x} = {e ∈ ℰ : y = f(x, e)}. A function f is called "generic" if for each triple (x_1, x_2, y) with x_1 ≠ x_2, f_{x_1}^{−1}(y) ≠ f_{x_2}^{−1}(y), and for every pair (x, y), f_x^{−1}(y) ≠ ∅.

A randomly chosen causal mechanism f is generic almost surely(!)
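
The definition can be checked mechanically; a toy checker of mine, with f stored as a table f[x, e]:

```python
import numpy as np

def is_generic(f, n_y):
    """Check genericity of f[x, e]: every preimage f_x^{-1}(y) must be
    nonempty, and preimages must differ across distinct x for every y."""
    n_x = f.shape[0]
    pre = [[frozenset(np.flatnonzero(f[x] == y)) for y in range(n_y)]
           for x in range(n_x)]
    for y in range(n_y):
        for x1 in range(n_x):
            if not pre[x1][y]:                    # empty preimage
                return False
            for x2 in range(x1 + 1, n_x):
                if pre[x1][y] == pre[x2][y]:      # identical preimages
                    return False
    return True

rng = np.random.default_rng(3)
f = rng.integers(0, 3, size=(3, 9))               # random f: [3] x [9] -> [3]
print(is_generic(f, 3))                           # True with high probability
```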

SLIDE 30

Identifiability (H0)

Theorem (Identifiability). Consider the causal model M = ({X, Y}, E, f, X → Y, p_{X,E}), where the random variables X, Y have n states, E ⊥⊥ X has θ states, and f is a generic function. If the distributions of X and E are uniformly randomly selected from the (n − 1)- and (θ − 1)-dimensional simplices, then with probability 1, any Ẽ ⊥⊥ Y that satisfies X = g(Y, Ẽ) for some deterministic function g has cardinality at least n(n − 1).

SLIDE 31

Identifiability (H0)

Assume we have an algorithm A that, given the joint distribution of X and Y, outputs E and f such that Y = f(X, E) with E of minimum cardinality.

Corollary

The causal direction can be recovered with probability 1 if the original exogenous variable E has cardinality less than n(n − 1), the causal mechanism f is generic, and the distributions of X and E are selected uniformly randomly from the appropriate simplices.

SLIDE 32

Identifiability (H0)

Proposition (Inference algorithm). Suppose X → Y. Let X ∈ 𝒳, Y ∈ 𝒴, |𝒳| = n, |𝒴| = m. Assume that A is the algorithm that finds the exogenous variables E, Ẽ of minimum cardinality in the two directions. Then, if the underlying exogenous variable has cardinality less than n(m − 1), with probability 1 we have
    H_0(X) + H_0(E) < H_0(Y) + H_0(Ẽ).

Unfortunately, it turns out that no efficient algorithm A exists unless P = NP:

Definition (Subset sum problem). For a given set of integers V and an integer a, decide whether there exists a subset S ⊆ V such that Σ_{u ∈ S} u = a.

SLIDE 33

Identifiability (H1)

THE EXACT SAME STORY! (now with Shannon entropy H_1 in place of the cardinality H_0)

SLIDE 34

Identifiability (H1)

Theorem (Minimum entropy causal model). Assume there exists an algorithm A that, given n random variables {Z_i}, i ∈ [n], with distributions p_i, each with n states, outputs the minimum-entropy joint distribution over the Z_i consistent with the given marginals. Then A can be used to find the causal model with minimum input entropy, given any joint distribution p_{X,Y}.

However, finding the causal model with minimum entropy of the exogenous variable that induces a given distribution p_{X,Y} is NP-hard!

SLIDE 35

Identifiability (H1)

Conjecture. Consider the causal model with random variables X, Y having n states and E ⊥⊥ X having θ states. If the distribution of X is uniformly randomly selected from the (n − 1)-dimensional simplex, the distribution of E is uniformly selected from the distributions that satisfy H_1(E) ≤ log(n) + O(1), and f is randomly selected from all functions f : [n] × [θ] → [n], then with high probability, any Ẽ ⊥⊥ Y that satisfies X = g(Y, Ẽ) for some deterministic g entails
    H_1(X) + H_1(E) < H_1(Y) + H_1(Ẽ).

SLIDE 36

Greedy Entropy Minimization
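
The slide's content (the algorithm figure) is not in the transcript. Below is a sketch of the greedy idea as I understand it from the paper: given the marginals of the {Z_i}, repeatedly take the largest remaining mass in each marginal, assign their minimum to the joint outcome formed by those argmax states, and subtract it from every marginal. This yields a feasible, low-entropy (not necessarily minimum-entropy) joint distribution, and hence an exogenous E via the theorem on slide 34.

```python
import numpy as np

def greedy_min_entropy_joint(marginals, tol=1e-12):
    """Greedily build a low-entropy joint distribution consistent with the
    given marginals by repeatedly matching the largest remaining masses."""
    marginals = [np.asarray(p, dtype=float).copy() for p in marginals]
    outcomes, masses = [], []
    while True:
        idx = [int(np.argmax(p)) for p in marginals]  # largest mass per marginal
        mass = min(p[i] for p, i in zip(marginals, idx))
        if mass <= tol:
            break
        outcomes.append(tuple(idx))                   # one joint state of (Z_1, ..., Z_n)
        masses.append(mass)
        for p, i in zip(marginals, idx):
            p[i] -= mass                              # consume the assigned mass
    return outcomes, np.array(masses)

def shannon(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

outcomes, q = greedy_min_entropy_joint([[0.5, 0.3, 0.2], [0.6, 0.4]])
print(q, shannon(q))   # joint masses [0.5 0.3 0.1 0.1] and their entropy in bits
```

Each iteration zeroes out at least one marginal entry, so the loop terminates after at most the total number of marginal states.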

SLIDE 37

Experiments
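
The experiment figures are not in the transcript. For completeness, here is a hedged sketch of the entropic decision rule that the experiments presumably evaluate (the H_1 variant): estimate the conditionals in both directions, greedily build a low-entropy exogenous variable for each via the theorem on slide 34, and declare the direction with the smaller total entropy H(cause) + H(noise). It reuses greedy_min_entropy_joint and shannon from the previous sketch and assumes all marginal probabilities are positive.

```python
def direction_score(p_XY):
    """H(X) + H(E~) for the candidate model Y = f(X, E): the marginals fed
    to the greedy coupler are the conditionals p(Y | X = j), one per value
    of X (rows of p_XY index X)."""
    p_X = p_XY.sum(axis=1)
    conds = [p_XY[j] / p_X[j] for j in range(len(p_X))]   # assumes p_X[j] > 0
    _, e = greedy_min_entropy_joint(conds)
    return shannon(p_X) + shannon(e)

def infer_direction(p_XY):
    # Smaller total entropy H(cause) + H(noise) wins.
    return "X -> Y" if direction_score(p_XY) < direction_score(p_XY.T) else "Y -> X"
```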
