CAUSAL INFERENCE AS COMPUTATIONAL LEARNING Judea Pearl - PowerPoint PPT Presentation

CAUSAL INFERENCE AS COMPUTATIONAL LEARNING Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea)

OUTLINE • Inference: Statistical vs. Causal distinctions and mental barriers • Formal semantics for counterfactuals: definition, axioms, graphical representations • Inference to three types of claims: 1. Effect of potential interventions 2. Attribution (Causes of Effects) 3. Direct and indirect effects

TRADITIONAL STATISTICAL INFERENCE PARADIGM P Q ( P ) Data Joint (Aspects of P ) Distribution Inference e.g., Infer whether customers who bought product A would also buy product B . Q = P ( B | A )

FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES Probability and statistics deal with static relations P ′ P Q ( P ′ ) Joint Data Joint (Aspects of P ′ ) Distribution Distribution change Inference What happens when P changes? e.g., Infer whether customers who bought product A would still buy A if we were to double the price.

FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES What remains invariant when P changes say, to satisfy P ′ ( price =2)=1 P ′ P Q ( P ′ ) Joint Data Joint (Aspects of P ′ ) Distribution Distribution change Inference Note: P ′ ( v ) ≠ P ( v | price = 2) P does not tell us how it ought to change e.g. Curing symptoms vs. curing diseases e.g. Analogy: mechanical deformation

FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES (CONT) 1. Causal and statistical concepts do not mix. CAUSAL STATISTICAL Spurious correlation Regression Randomization Association / Independence Confounding / Effect “Controlling for” / Conditioning Instrument Odd and risk ratios Holding constant Collapsibility Explanatory variables Propensity score 2. 3. 4.

FROM STATISTICAL TO CAUSAL ANALYSIS: 2. MENTAL BARRIERS 1. Causal and statistical concepts do not mix. CAUSAL STATISTICAL Spurious correlation Regression Randomization Association / Independence Confounding / Effect “Controlling for” / Conditioning Instrument Odd and risk ratios Holding constant Collapsibility Explanatory variables Propensity score 2. No causes in – no causes out (Cartwright, 1989) statistical assumptions + data } ⇒ causal conclusions causal assumptions 3. Causal assumptions cannot be expressed in the mathematical language of standard statistics. 4.

FROM STATISTICAL TO CAUSAL ANALYSIS: 2. MENTAL BARRIERS 1. Causal and statistical concepts do not mix. CAUSAL STATISTICAL Spurious correlation Regression Randomization Association / Independence Confounding / Effect “Controlling for” / Conditioning Instrument Odd and risk ratios Holding constant Collapsibility Explanatory variables Propensity score 2. No causes in – no causes out (Cartwright, 1989) statistical assumptions + data } ⇒ causal conclusions causal assumptions 3. Causal assumptions cannot be expressed in the mathematical language of standard statistics. 4. Non-standard mathematics: a) Structural equation models (Wright, 1920; Simon, 1960) Counterfactuals (Neyman-Rubin ( Y x ) , Lewis ( x b) Y ))

WHY CAUSALITY NEEDS SPECIAL MATHEMATICS Scientific Equations (e.g., Hooke’s Law) are non-algebraic e.g., Pricing Policy: “Double the competitor’s price” Correct notation: Y : = 2 X X = 1 Y = 2 X Y = 2 X = 1 Process information The solution Had X been 3, Y would be 6. If we raise X to 3, Y would be 6. Must “wipe out” X = 1.

WHY CAUSALITY NEEDS SPECIAL MATHEMATICS Scientific Equations (e.g., Hooke’s Law) are non-algebraic e.g., Pricing Policy: “Double the competitor’s price” Correct notation: (or) Y ← 2 X X = 1 Y = 2 X = 1 Process information The solution Had X been 3, Y would be 6. If we raise X to 3, Y would be 6. Must “wipe out” X = 1.

THE STRUCTURAL MODEL PARADIGM Data Joint Q ( M ) Data Generating Distribution (Aspects of M ) Model M Inference M – Invariant strategy (mechanism, recipe, law, protocol) by which Nature assigns values to variables in the analysis.

FAMILIAR CAUSAL MODEL ORACLE FOR MANIPILATION X Y Z INPUT OUTPUT

STRUCTURAL CAUSAL MODELS Definition: A structural causal model is a 4-tuple 〈 V,U, F, P ( u ) 〉 , where V = { V 1 ,...,V n } are observable variables • • U = { U 1 ,...,U m } are background variables • F = { f 1 ,..., f n } are functions determining V , v i = f i ( v , u ) • P ( u ) is a distribution over U P ( u ) and F induce a distribution P ( v ) over observable variables

STRUCTURAL MODELS AND CAUSAL DIAGRAMS The arguments of the functions v i = f i ( v,u ) define a graph v i = f i ( pa i ,u i ) PA i ⊆ V \ V i U i ⊆ U Example: Price – Quantity equations in economics U 1 U 2 I W = + + Q P q b p d i u 1 1 1 PA Q = + + p b q d w u 2 2 2

STRUCTURAL MODELS AND INTERVENTION Let X be a set of variables in V . The action do ( x ) sets X to constants x regardless of the factors which previously determined X . do ( x ) replaces all functions f i determining X with the constant functions X=x , to create a mutilated model M x U 1 U 2 I W = + + q b p d i u 1 1 1 = + + p b q d w u 2 2 2 Q P

STRUCTURAL MODELS AND INTERVENTION Let X be a set of variables in V . The action do ( x ) sets X to constants x regardless of the factors which previously determined X . do ( x ) replaces all functions f i determining X with the constant functions X=x , to create a mutilated model M x M p = + + U 1 U 2 I W q b p d i u 1 1 1 = + + p b q d w u 2 2 2 = p p Q P 0 P = p 0

CAUSAL MODELS AND COUNTERFACTUALS Definition: The sentence: “ Y would be y (in situation u ), had X been x , ” denoted Y x ( u ) = y , means: The solution for Y in a mutilated model M x , (i.e., the equations for X replaced by X = x ) with input U=u , is equal to y . The Fundamental Equation of Counterfactuals: = Y ( u ) Y ( u ) x M x

CAUSAL MODELS AND COUNTERFACTUALS Definition: The sentence: “ Y would be y (in situation u ), had X been x , ” denoted Y x ( u ) = y , means: The solution for Y in a mutilated model M x , (i.e., the equations for X replaced by X = x ) with input U=u , is equal to y . • Joint probabilities of counterfactuals: = = = ∑ P ( Y y , Z z ) P ( u ) x w = = u : Y ( u ) y , Z ( u ) z The Fundamental Equation of Counterfactuals: x w In particular: ∆ = = ∑ P ( y | do ( x ) ) P ( Y y ) P ( u ) = = x Y ( u ) Y ( u ) x M = x u : Y ( u ) y x = = ∑ PN Y y x y P u x y ( ' | , ) ( | , ) x ' = u : Y ( u ) y ' x '

AXIOMS OF CAUSAL COUNTERFACTUALS = Y would be y , had X been x (in state U = u ) Y x ( u ) y : 1. Definiteness ∃ ∈ = x X s . t . X ( u ) x y 2. Uniqueness = = ⇒ = ( X ( u ) x ) & ( X ( u ) x ' ) x x ' y y 3. Effectiveness = X xw ( u ) x 4. Composition = ⇒ = W ( u ) w Y ( u ) Y ( u ) x xw x 5. Reversibility = = ⇒ = ( Y ( u ) y & ( W ( u ) w ) Y ( u ) y xw xy x

INFERRING THE EFFECT OF INTERVENTIONS The problem: To predict the impact of a proposed intervention using data obtained prior to the intervention. The solution (conditional): Causal Assumptions + Data → Policy Claims 1. Mathematical tools for communicating causal assumptions formally and transparently. 2. Deciding (mathematically) whether the assumptions communicated are sufficient for obtaining consistent estimates of the prediction required. 3. Deriving (if (2) is affirmative) 4. Suggesting (if (2) is negative) a closed-form expression for the predicted impact a set of measurements and experiments that, if performed, would render a consistent estimate feasible.

NON-PARAMETRIC STRUCTURAL MODELS Given P ( x,y,z ), should we ban smoking? U U 1 1 U 3 U 3 U 2 U 2 f 3 f 1 f 2 α β X Y Z X Y Z Smoking Tar in Cancer Smoking Tar in Cancer Lungs Lungs Linear Analysis Nonparametric Analysis x = u 1 , x = f 1 ( u 1 ), z = α x + u 2 , z = f 2 ( x , u 2 ), y = β z + γ u 1 + u 3 . y = f 3 ( z , u 1 , u 3 ). Find: α ⋅ β Find: P ( y | do ( x ))

EFFECT OF INTERVENTION AN EXAMPLE Given P ( x,y,z ), should we ban smoking? U U 1 1 U 3 U 3 U U 2 2 f 3 f 2 α β X = x Y Z X Y Z Smoking Tar in Cancer Smoking Tar in Cancer Lungs Lungs Linear Analysis Nonparametric Analysis x = u 1 , x = const. z = α x + u 2 , z = f 2 ( x , u 2 ), y = β z + γ u 1 + u 3 . y = f 3 ( z , u 1 , u 3 ). ∆ Find: α ⋅ β Find: P ( y | do ( x )) = P ( Y = y ) in new model

EFFECT OF INTERVENTION AN EXAMPLE (cont) Given P ( x,y,z ) , should we ban smoking? U (unobserved) U (unobserved) X = x Y Z X Y Z Smoking Tar in Cancer Smoking Tar in Cancer Lungs Lungs • • •

EFFECT OF INTERVENTION AN EXAMPLE (cont) Given P ( x,y,z ) , should we ban smoking? U (unobserved) U (unobserved) X = x Y Z X Y Z Smoking Tar in Cancer Smoking Tar in Cancer Lungs Lungs Pre-intervention Post-intervention = = ∑ ∑ P ( x , y , z ) P ( u ) P ( x | u ) P ( z | x ) P ( y | z , u ) P ( y , z | do ( x )) P ( u ) P ( z | x ) P ( y | z , u ) u u •

CAUSAL INFERENCE AS COMPUTATIONAL LEARNING Judea Pearl - PowerPoint PPT Presentation

CAUSAL INFERENCE AS COMPUTATIONAL LEARNING Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea) OUTLINE Inference: Statistical vs. Causal distinctions and mental barriers Formal semantics for

Political Science 209 - Fall 2018 Causal Inference Florian Hollenbach 7th September 2018 Causal

Causal Effect Evaluation and Causal Network Learning Zhi Geng Peking University, China June

Causal Inference By: Miguel A. Hern an and James M. Robins Part I: Causal inference without

A Brief Introduction to Causal Inference Brady Neal causalcourse.com What is causal inference?

Introduction to Causal Inference Lan Liu University of Minnesota at Twin Cities liux3771@umn.edu

Foundations of Causal Discovery Frederick Eberhardt KDD Causality Workshop 2016 Causal Discovery

Modes of Statistical Inference for Causal Efgects Plus an overview of the testing based approach

Geographic Data Science - Lecture IX Causal Inference Dani Arribas-Bel Today Correlation Vs

Causal inference Gary Goertz Kroc Institute for International Peace Studies University of Notre

Causal Inference An introduction based on S. Wagers course on Causal Inference (OIT 661) Imke

Geographic Data Science - Lecture IX Causal Inference Dani Arribas-Bel Today Correlation Vs

Causal Inference Theory and Applications Dr. Matthias Uflacker, Johannes Huegle, Christopher

Geographic Data Science - Lecture IX Causal Inference Dani Arribas-Bel Today Correlation Vs

Causal Inference and Response Surface Modeling Inference and

Causal Programming Causal Programming Joshua Brul Joshua Brul

Few-shot Domain Adaptation 1/12 by Causal Mechanism Transfer Domain adaptation Causal mechanism

Paclitaxel: Should we be concerned about the risks? Rajabrata Sarkar M.D. Ph.D. Barbara Baur

Lee Kuan Yew -British Colony [Independence 1965] Singapore Stats: -Area: 275 Square Miles

Exchanging a key - how hard can it be? Cas Cremers Joint work with Michle Feltz Authenticated

Towards a sustainable solution to open source sustainability Tobie Langel, Principal, UnlockOpen

16 Applications 1: Monolingual Sequence-to-sequence Prob- lems Up until now, we have largely

Basic Elements: A Framework for Automated Evaluation of Summary Content Eduard Hovy, Chin-Yew

How to Build a Liveable Megacity from Globopolis to Cosmopolis in Asia Mike Douglass Asia Research

EE663: Optimizing Compilers Prof. R. Eigenmann Purdue University School of Electrical and