

  1. Predicting perturbation effects in large-scale systems from observational data
     Marloes Maathuis, Seminar für Statistik, ETH Zürich, Switzerland

  2. Joint work with Peter Bühlmann, Diego Colombo, and Markus Kalisch
     Marloes Maathuis, ETH Zürich 2 / 29

  4. Research question
     • In short: Can we learn perturbation effects without doing perturbation experiments?
     • Concretely: Can we learn the gene regulatory network of yeast from observational data?
       • Predict perturbation effects between all pairs of genes
       • Identify pairs of genes between which there is a large effect

  6. Why use observational data?
     • Thousands of perturbation experiments are needed to estimate all perturbation effects ⇒ time-consuming and expensive
     • Questions:
       • Do observational data provide some information on perturbation effects?
       • Can this information be used to guide and prioritize perturbation experiments?

  8. Definition of perturbation effect
     • Consider the effect of gene i on gene j. Let $X_i$ and $X_j$ be the expression levels of the genes.
     • If we experimentally change $X_i$, what happens to $X_j$?
     • Hypothetical experiment: compare a system genetically modified such that $X_i \approx a$ with one genetically modified such that $X_i \approx a + 1$

  10. Definition of perturbation effect
      • Consider the effect of gene i on gene j. Let $X_i$ and $X_j$ be the expression levels of the genes.
      • If we experimentally change $X_i$, what happens to $X_j$?
      • Hypothetical experiment: $\mathrm{do}(X_i = a)$ versus $\mathrm{do}(X_i = a + 1)$
      • Perturbation effect of gene i on gene j: $E(X_j \mid \mathrm{do}(X_i = a + 1)) - E(X_j \mid \mathrm{do}(X_i = a))$ (the value of a drops out if the system is linear)
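
The remark that the value of a drops out in a linear system can be checked with a small simulation. The sketch below is illustrative only: the structural equation, the coefficient `beta`, the intercept, and the function names are assumptions made for this example, not part of the talk.

```python
import random


def mean_xj_under_do(a, beta=0.8, intercept=2.0, n=10_000, seed=0):
    """Monte Carlo estimate of E(X_j | do(X_i = a)) in a toy linear model.

    Assumed (hypothetical) structural equation:
        X_j = intercept + beta * X_i + noise,  noise ~ N(0, 1)
    """
    rng = random.Random(seed)  # same seed => same noise draws for every a
    total = 0.0
    for _ in range(n):
        total += intercept + beta * a + rng.gauss(0.0, 1.0)
    return total / n


def perturbation_effect(a, **kwargs):
    """E(X_j | do(X_i = a + 1)) - E(X_j | do(X_i = a))."""
    return mean_xj_under_do(a + 1, **kwargs) - mean_xj_under_do(a, **kwargs)


# The effect equals beta regardless of the baseline level a.
effect_at_0 = perturbation_effect(0.0)
effect_at_5 = perturbation_effect(5.0)
print(effect_at_0, effect_at_5)
```

Because both interventions reuse the same noise draws, the difference isolates the coefficient `beta` up to floating-point error, whatever baseline `a` is chosen.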

  13. Estimating perturbation effects from observational data
      • It is easy to estimate associations from observational data. But association is not causation!
      • Pearl (2003):
        • “An associational concept is any relationship that can be defined in terms of the joint distribution of observed variables.”
        • “A causal concept [such as a perturbation effect] is any relationship that cannot be defined from the distribution alone (...) Any claim invoking causal concepts must be traced to some premises that invoke such concepts; it cannot be inferred or derived from statistical associations alone.”
      • An assumption that is often made: the data were generated by a known directed acyclic graph (DAG)

  14. Directed acyclic graph (DAG)
      [Figure: example DAG on the nodes $X_1$, $X_2$, $X_3$]
      • Nodes represent random variables and edges represent conditional independence relationships
      • The DAG encodes causal assumptions:
        • Edge $X_2 \to X_1$: $X_2$ may have a direct causal effect on $X_1$
        • No edge $X_1 \to X_3$: $X_1$ cannot have a direct causal effect on $X_3$ (but $X_1$ and $X_3$ will be correlated!)

  17. Pearl’s intervention-calculus / do-calculus
      [Figure: example DAG on the nodes $X_1$, $X_2$, $X_3$]
      • The perturbation effect of $X_1$ on $X_3$: $E(X_3 \mid \mathrm{do}(X_1 = a + 1)) - E(X_3 \mid \mathrm{do}(X_1 = a))$
      • The do-operator stands for a hypothetical experiment. So $E(X_3 \mid \mathrm{do}(X_1 = a))$ is not the usual conditional expectation! In the example:
        • $E(X_3 \mid X_1 = a) \neq E(X_3)$
        • $E(X_3 \mid \mathrm{do}(X_1 = a)) = E(X_3)$
      • Pearl’s do-calculus uses the DAG to write expressions involving the do-operator in terms of pre-intervention conditional distributions
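
The gap between conditioning and intervening can be illustrated numerically. The sketch below assumes a DAG in which X2 is a common cause of X1 and X3 (X2 → X1 and X2 → X3, with no edge between X1 and X3); this structure, the unit coefficients, and all function names are assumptions chosen to match the two displayed relations, not details taken from the slide figure.

```python
import random


def simulate(n, seed, do_x1=None):
    """Draw (X1, X3) pairs from the assumed linear Gaussian model.

    X2 ~ N(0, 1), X1 = X2 + noise, X3 = X2 + noise.
    If do_x1 is given, X1 is set to that value by intervention,
    which cuts the influence of X2 on X1.
    """
    rng = random.Random(seed)
    x1s, x3s = [], []
    for _ in range(n):
        x2 = rng.gauss(0.0, 1.0)
        noise1 = rng.gauss(0.0, 1.0)  # always drawn, so streams stay aligned
        noise3 = rng.gauss(0.0, 1.0)
        if do_x1 is None:
            x1 = x2 + noise1
        else:
            x1 = do_x1
        x1s.append(x1)
        x3s.append(x2 + noise3)
    return x1s, x3s


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Observational data: X1 predicts X3 through the common cause X2.
x1_obs, x3_obs = simulate(20_000, seed=1)
observational_slope = slope(x1_obs, x3_obs)  # population value is 0.5

# Interventional data: setting X1 leaves the mean of X3 unchanged.
_, x3_do0 = simulate(20_000, seed=1, do_x1=0.0)
_, x3_do1 = simulate(20_000, seed=1, do_x1=1.0)
interventional_diff = mean(x3_do1) - mean(x3_do0)

print(observational_slope, interventional_diff)
```

Under these assumptions the regression slope is clearly nonzero, while the interventional difference is zero: X1 carries information about X3 without having any effect on it.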

  18. Pearl’s intervention-calculus / do-calculus
      [Figure: example DAG on the nodes $X_1$, $X_2$, $X_3$]
      • Summary: If the DAG is given, one can estimate perturbation effects (or causal effects) from observational data
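
As a minimal sketch of what "estimating from observational data with a given DAG" can look like in the linear Gaussian case: the causal effect of X1 on X3 is the coefficient of X1 in a regression of X3 on X1 and the parents of X1. The DAG used here (X2 → X1, X2 → X3, X1 → X3), the true effect of 0.7, and the helper names are assumptions for illustration.

```python
import random


def simulate_observational(n=20_000, seed=2, true_effect=0.7):
    """Observational draws from an assumed linear Gaussian DAG:
    X2 -> X1, X2 -> X3, X1 -> X3 (coefficient true_effect)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x2 = rng.gauss(0.0, 1.0)
        x1 = x2 + rng.gauss(0.0, 1.0)
        x3 = true_effect * x1 + x2 + rng.gauss(0.0, 1.0)
        rows.append((x1, x2, x3))
    return rows


def cov(u, v):
    """Sample covariance (divisor n; the scaling cancels in the ratios below)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def naive_slope(rows):
    """Regress X3 on X1 only: an association, biased by the confounder X2."""
    x1, _, x3 = zip(*rows)
    return cov(x1, x3) / cov(x1, x1)


def adjusted_coefficient(rows):
    """Coefficient of X1 when regressing X3 on X1 and its parent X2.

    Closed-form solution of the 2x2 normal equations.
    """
    x1, x2, x3 = zip(*rows)
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s13, s23 = cov(x1, x3), cov(x2, x3)
    return (s13 * s22 - s23 * s12) / (s11 * s22 - s12 ** 2)


rows = simulate_observational()
naive = naive_slope(rows)              # population value 1.2
adjusted = adjusted_coefficient(rows)  # population value 0.7, the true effect
print(naive, adjusted)
```

The unadjusted slope overstates the effect because it also picks up the path through X2; adjusting for the parent of X1 recovers the coefficient used to generate the data.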

  19. Main points in this talk
      • Present IDA (Intervention calculus when the DAG is Absent)
      • Requires observational data
        • generated from an unknown DAG
        • multivariate Gaussian
        • no hidden confounders
        • potentially high-dimensional system
      • Returns (summary measures of) the estimated set of possible causal effects
      • Consistent in sparse high-dimensional settings
      • Validation on yeast data

  23. What to do when the DAG is unknown?
      • A DAG encodes conditional independence relationships
      • So given all conditional independence relationships of the data, can we infer the DAG?
      • Almost... several DAGs can encode the same conditional independence relationships. They form an equivalence class, described by a CPDAG.
      • One can estimate this CPDAG, for example using the PC-algorithm of Peter Spirtes and Clark Glymour (Spirtes et al., 2000)
        • Fast implementation in the R package pcalg
        • Consistent in sparse high-dimensional settings (Kalisch and Bühlmann, JMLR 2007)
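
To make the notion of an equivalence class concrete, the sketch below uses the standard graphical criterion that two DAGs encode the same conditional independence relationships exactly when they share the same skeleton and the same v-structures. It is not the PC-algorithm and not the pcalg API; the edge-set representation and function names are assumptions for this example.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of a DAG given as a set of (parent, child) pairs."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """All colliders a -> c <- b in which a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pas in parents.items():
        for a, b in combinations(sorted(pas), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(dag_a, dag_b):
    """Same skeleton and same v-structures (DAGs over the same node set)."""
    return (skeleton(dag_a) == skeleton(dag_b)
            and v_structures(dag_a) == v_structures(dag_b))


chain = {("X1", "X2"), ("X2", "X3")}           # X1 -> X2 -> X3
reverse_chain = {("X3", "X2"), ("X2", "X1")}   # X1 <- X2 <- X3
fork = {("X2", "X1"), ("X2", "X3")}            # X1 <- X2 -> X3
collider = {("X1", "X2"), ("X3", "X2")}        # X1 -> X2 <- X3

print(markov_equivalent(chain, fork))      # same equivalence class
print(markov_equivalent(chain, collider))  # different class
```

The chain, the reversed chain, and the fork are indistinguishable from conditional independence information alone, so all three would be summarized by one CPDAG with undirected edges; the collider forms its own class.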

  24. IDA (oracle version)
      [Diagram: oracle → PC-algorithm → CPDAG → DAG 1, DAG 2, ..., DAG m → do-calculus → effect 1, effect 2, ..., effect m → multi-set Θ]
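
The oracle pipeline can be traced by hand on a three-node example. The sketch below assumes the true DAG is the chain X1 → X2 → X3 with coefficients 0.5 and 2 and unit noise variances, so the CPDAG is X1 - X2 - X3 and its class contains three DAGs. The parent sets of X1 are listed by hand rather than enumerated from the CPDAG, and the covariance matrix, function names, and the regression-on-parents formula for the linear Gaussian case are assumptions of this illustration, not the pcalg implementation.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def effect(cov, i, j, parents_of_i):
    """Effect of X_i on X_j for one DAG: the coefficient of X_i when
    regressing X_j on X_i and the parents of X_i (population version)."""
    if j in parents_of_i:
        return 0.0  # a parent of X_i cannot be a descendant of X_i
    z = [i] + sorted(parents_of_i)
    a = [[cov[r][c] for c in z] for r in z]
    b = [cov[r][j] for r in z]
    return solve(a, b)[0]


# Population covariance of (X1, X2, X3) under the assumed chain model.
cov = [
    [1.0, 0.5, 1.0],
    [0.5, 1.25, 2.5],
    [1.0, 2.5, 6.0],
]

# Parents of X1 (index 0) in the three DAGs of the equivalence class:
# X1 -> X2 -> X3, X1 <- X2 -> X3, X1 <- X2 <- X3.
parent_sets_of_x1 = [set(), {1}, {1}]

theta = [effect(cov, 0, 2, parents) for parents in parent_sets_of_x1]
print(theta)
```

Only the first DAG attributes an effect to X1 (the product 0.5 × 2 = 1); the other two give zero. The repeated zero shows why Θ is a multi-set rather than a set.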

  27. The multi-set Θ
      • Why multi-set instead of a unique value?
      • Recall the quote of Pearl. We make “weak” causal assumptions:
        • The data are generated from an unknown DAG
        • There are no hidden confounders
      • What information does Θ provide? Examples:
        • $\Theta = \{1.5\}$ ⇒ causal effect is 1.5
        • $\Theta = \{1.5, 0.5, 3.1\}$ ⇒ causal effect is positive
        • $\Theta = \{1.5, 1.5, -1\}$ ⇒ absolute value of causal effect ≥ 1
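
The three examples can be reproduced with a small helper that reduces a multi-set to summary statements. This is a sketch only: the function name and the returned fields are hypothetical, and the minimum absolute value is used here as one possible summary measure.

```python
def summarize_theta(theta):
    """Summarize a multi-set of possible causal effects.

    Returns whether the effect is uniquely determined, whether its sign
    is determined, and a lower bound on its absolute value.
    """
    if not theta:
        raise ValueError("theta must contain at least one value")
    if all(t > 0 for t in theta):
        sign = "positive"
    elif all(t < 0 for t in theta):
        sign = "negative"
    elif all(t == 0 for t in theta):
        sign = "zero"
    else:
        sign = "undetermined"
    return {
        "unique": len(set(theta)) == 1,
        "sign": sign,
        "min_abs": min(abs(t) for t in theta),
    }


summary_a = summarize_theta([1.5])
summary_b = summarize_theta([1.5, 0.5, 3.1])
summary_c = summarize_theta([1.5, 1.5, -1])
print(summary_a, summary_b, summary_c)
```

For the third multi-set the sign is undetermined, yet every DAG in the class agrees that the effect has absolute value at least 1, which is the kind of statement that can help prioritize experiments.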
