Causal Inference: An Introduction

Qingyuan Zhao
Statistical Laboratory, University of Cambridge
4th March, 2020 @ Social Sciences Research Methods Programme (SSRMP), University of Cambridge
Slides and more information are available at http://www.statslab.cam.ac.uk/~qz280/.

About this lecture

About me
2019 – : University Lecturer in the Statistical Laboratory (in the Centre for Mathematical Sciences, West Cambridge).
2016 – 2019: Postdoc, Wharton School, University of Pennsylvania.
2011 – 2016: PhD in Statistics, Stanford University.

Disclaimer
I am a statistician who works on causal inference, but not a social scientist.
Bad news: What’s in this lecture may not reflect the current practice of causal inference in social sciences.
Good news (hopefully): What’s in this lecture will give you an up-to-date view of the design, methodology, and interpretation of causal inference (especially observational studies).
I tried to make the materials as accessible as possible, but some amount of maths seemed inevitable. Please bear with me and don’t hesitate to ask questions.

Growing interest in causal inference

[Figure: Interest (Google Trends), on a scale of 0 to 100, plotted against time from Jan 2010 to Jan 2020; panels: United States, United Kingdom. Data from Google Trends.]

A diverse field

Causal inference is driven by applications and is at the core of statistics (“the science of using information discovered from collecting, organising, and studying numbers”, Cambridge Dictionary).

Many origins of causal inference
Biology and genetics;
Agriculture;
Epidemiology, public health, and medicine;
Economics, education, psychology, and other social sciences;
Artificial intelligence and computer science;
Management and business.

In the last decade, independent developments in these disciplines have been merging into a single field called “Causal Inference”.

Examples in social sciences

1. Economics: How do supply and demand (causally) depend on price?
2. Policy: Are job training programmes actually effective?
3. Education: Does learning “mindset” affect academic achievements?
4. Law: Is it justifiable to sue the factory over injuries due to poor working conditions?
5. Psychology: What is the effect of family structure on children’s outcomes?

Outline for this lecture

To study causal relationships, empirical studies can be categorised into

Randomised Experiments (Part I)
1. Completely randomised;
2. Stratified (pairs or blocks);
3. With regression adjustment (also called covariance adjustment);
4. More sophisticated designs (e.g. sequential experiments).

↓↓ Question: How to define causality? (Part II) ↓↓

Observational Studies (Part III)
Also called quasi-experiments in social sciences (I think it’s a poor name).
1. Controlling for confounders;
2. Instrumental variables;
3. Regression discontinuity design;
4. Negative control (e.g. difference in differences).

Part I: Randomised experiments

The breakthrough
The idea of randomised experiments dates back to the early development of experimental psychology in the late 1800s by Charles Sanders Peirce (American philosopher).
In the 1920s, Sir Ronald Fisher established randomisation as a principled way for causal inference in scientific research (The Design of Experiments, 1935).

Fundamental logic*
1. Suppose we let half of the participants receive the treatment at random,
2. If significantly more treated participants have better outcomes,
3. Then the treatment must be beneficial.
Randomisation (1) $\Longrightarrow$ a choice of statistical error (2) vs. causality (3), because there can be no other logical explanations.
*We will revisit this logic when moving to observational studies.

Randomisation

Some notations
$A$ is treatment (e.g. job training); for now let $A$ be binary (0 = control, 1 = treated).
$Y$ is outcome (e.g. employment status 6 months after job training).
$X$ is a vector of covariates measured before the treatment (e.g. gender, education, income, ...).
Subscript $i = 1, \dots, n$ indexes the study participants.

Different designs of randomised experiments
Bernoulli trial: $A_1, \dots, A_n$ independent and $P(A_i = 1) = 0.2$.
Completely randomised: $P(A_1 = a_1, \dots, A_n = a_n) = \binom{n}{n/2}^{-1}$ if $a_1 + \cdots + a_n = n/2$.
Stratified: $A_1, \dots, A_n$ independent, $P(A_i = 1 \mid X_i) = \pi(X_i)$ where $\pi(\cdot)$ is a given function. For example: $P(A_i = 1 \mid X_{i1} = \text{male}) = 0.5$ and $P(A_i = 1 \mid X_{i1} = \text{female}) = 0.75$.
Blocked: Completely randomised within each block of participants similar in $X$.
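The four designs above can be sketched as treatment-assignment samplers. This is a minimal illustration using only the Python standard library; the function names, the `rng` argument, and the representation of blocks as lists of participant indices are assumptions made for this sketch, not part of the lecture.

```python
import random


def bernoulli_design(n, p, rng):
    """Bernoulli trial: each A_i is 1 independently with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def completely_randomised(n, rng):
    """Exactly n/2 treated; every such assignment is equally likely."""
    treated = set(rng.sample(range(n), n // 2))
    return [1 if i in treated else 0 for i in range(n)]


def stratified_design(x, pi, rng):
    """Independent A_i with P(A_i = 1 | X_i) = pi(X_i)."""
    return [1 if rng.random() < pi(xi) else 0 for xi in x]


def blocked_design(blocks, rng):
    """Complete randomisation within each block (a list of indices)."""
    n = sum(len(block) for block in blocks)
    a = [0] * n
    for block in blocks:
        for i in rng.sample(block, len(block) // 2):
            a[i] = 1
    return a


rng = random.Random(2020)  # fixed seed so the draws are reproducible
# Hypothetical covariate: the first component of X_i, coded as a string.
sex = ["male", "female", "female", "male"]
assignment = stratified_design(
    sex, lambda s: 0.5 if s == "male" else 0.75, rng
)
```

In the completely randomised and blocked designs the number of treated participants is fixed in advance, whereas in the Bernoulli and stratified designs it is itself random.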

Statistical inference: Approach 1

Randomisation inference (permutation test)
Test the hypothesis $H_0: A \perp\!\!\!\perp Y \mid X$ (or $H_0: A \perp\!\!\!\perp Y$ if randomisation does not depend on $X$).
1. Choose a test statistic $T(X, A, Y)$ (e.g. in a blocked experiment with matched pairs, the average pairwise treated-minus-control difference in $Y$).
2. Obtain the randomisation distribution of $T(X, A, Y)$ by permuting $A$, according to how it was randomised.
3. Compute the p-value: $P_{A \sim \pi}\big( T(X, A, Y) \ge T(X, A^{\text{obs}}, Y) \mid X, Y \big)$.

Note that the randomisation inference treats $X$ and $Y$ as given and only considers randomness in the treatment $A \sim \pi$ (which is exactly the randomness introduced by the experimenter).
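The three steps can be sketched for the matched-pairs example. The sketch assumes that within each pair exactly one participant is treated, chosen by a fair coin flip, so that re-randomising $A$ amounts to flipping the sign of each pairwise difference. The function name and the input layout (two equal-length lists of outcomes) are assumptions for illustration; full enumeration is only practical for a small number of pairs.

```python
from itertools import product


def paired_randomisation_pvalue(treated, control):
    """Exact one-sided randomisation p-value for a matched-pairs experiment.

    treated[j] and control[j] are the observed outcomes of the treated and
    control participant in pair j. Outcomes are held fixed; only the
    treatment labels within each pair are re-randomised.
    """
    diffs = [t - c for t, c in zip(treated, control)]
    n = len(diffs)
    # Step 1: observed statistic, the average treated-minus-control difference.
    t_obs = sum(diffs) / n
    # Step 2: enumerate all 2^n equally likely within-pair assignments.
    as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=n):
        t = sum(s * d for s, d in zip(signs, diffs)) / n
        # Small tolerance guards against floating-point ties.
        if t >= t_obs - 1e-12:
            as_extreme += 1
        total += 1
    # Step 3: p-value is the share of assignments with T >= T_obs.
    return t_obs, as_extreme / total
```

With many pairs, the enumeration would typically be replaced by a Monte Carlo sample of sign flips, which gives an approximation to the same p-value.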

Statistical inference: Approach 2

Regression analysis
Simplest form: $E[Y \mid A] = \alpha + \beta A$.
Regression adjustment (also called covariance adjustment): $E[Y \mid A, X] = \alpha + \beta A + \gamma X + \delta A X$.
More complex mixed-effect models, to account for heterogeneity of the participants.

Interpretation of regression analysis
The slope coefficient $\beta$ of the treatment $A$ in these regression models is usually interpreted as the average treatment effect, although this becomes difficult to justify in complex designs/regression models.
To differentiate them from structural equation models, regression models are written in the form $E[Y \mid A] = \alpha + \beta A$ instead of the “traditional” form $Y = \alpha + \beta A + \epsilon$. We will explain their differences later.
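For the simplest form with a binary treatment, the least-squares estimate of $\beta$ coincides with the treated-minus-control difference in sample means. The sketch below checks this numerically; the function names and the toy data are assumptions made for illustration, and the adjusted model with covariates is not covered.

```python
def ols_simple(a, y):
    """Least-squares fit of E[Y | A] = alpha + beta * A; returns (alpha, beta)."""
    n = len(a)
    a_bar = sum(a) / n
    y_bar = sum(y) / n
    s_ay = sum((ai - a_bar) * (yi - y_bar) for ai, yi in zip(a, y))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    beta = s_ay / s_aa
    alpha = y_bar - beta * a_bar
    return alpha, beta


def difference_in_means(a, y):
    """Mean outcome among the treated minus mean outcome among controls."""
    treated = [yi for ai, yi in zip(a, y) if ai == 1]
    control = [yi for ai, yi in zip(a, y) if ai == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


# Toy data (made up for illustration): two controls, two treated.
a = [0, 0, 1, 1]
y = [1.0, 3.0, 4.0, 6.0]
alpha_hat, beta_hat = ols_simple(a, y)
gap = difference_in_means(a, y)  # agrees with beta_hat up to rounding
```

The intercept estimate is the control-group mean, so the fitted model simply reproduces the two group means.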

Comparison of the two approaches

Randomisation inference
Advantages:
1. Only uses randomness in the design.
2. Distribution-free and exact finite-sample test.
Disadvantages:
1. Only gives a hypothesis test for “no treatment effect whatsoever” (can be extended to constant treatment effect).

Regression analysis
Advantages:
1. Accounts for treatment effect heterogeneity.
2. Well-developed extensions: mixed-effect models, generalised linear models, Cox proportional-hazards models, etc.
Disadvantages:
1. Inference usually relies on normality or large-sample approximations.
2. Causal interpretation is model-dependent!

Internal vs. external validity

Internal validity
Campbell and Stanley (1963): “Whether the experimental treatments make a difference in this specific experimental instance”.
Exactly what randomisation inference tries to do.

External validity
Shadish, Cook and Campbell (2002): “Whether the cause-effect relationship holds over variation in persons, settings, treatment variables, and measurement variables”.

Related concepts
Another important concept in social sciences is construct validity: “the validity of inferences about the higher order constructs that represent sampling particulars”. See Shadish et al. (2002) for more discussion.
Peirce’s three kinds of inferences: deduction, induction, abduction.
