
Cause-Effect Pairs Challenge. Isabelle Guyon, ChaLearn



  1. Cause-Effect Pairs Challenge. Isabelle Guyon, ChaLearn

  2. Thanks • Initial impulse: Joris Mooij, Dominik Janzing, and Bernhard Schölkopf, from the Max Planck. • Examples of algorithms and data: Povilas Daniušis, Arthur Gretton, Patrik O. Hoyer, Dominik Janzing, Antti Kerminen, Joris Mooij, Jonas Peters, Bernhard Schölkopf, Shohei Shimizu, Oliver Stegle, Kun Zhang, and Jakob Zscheischler. • Datasets and result analysis: Isabelle Guyon + Mehreen Saeed + {Mikael Henaff, Sisi Ma, and Alexander Statnikov}, from NYU. • Website and sample code: Isabelle Guyon + Ben Hamner (Kaggle). • Review, testing: Marc Boullé, Hugo Jair Escalante, Frederick Eberhardt, Seth Flaxman, Patrik Hoyer, Dominik Janzing, Richard Kennaway, Vincent Lemaire, Joris Mooij, Jonas Peters, Florin, Peter Spirtes, Ioannis Tsamardinos, Jianxin Yin, Kun Zhang. Challenges in Machine Learning http://chalearn.org

  3. Causal discovery without overfitting? • Neural networks: 100 billion neurons • Gene networks: 100,000 genes • Small networks: influence diagrams

  4. Causation coefficient C. Scale centered at 0: C < 0 points to A <- B, C near 0 to A – B or A|B, C > 0 to A -> B. C can be used to: • rank pairs of variables and prioritize experiments • orient edges in degenerate causal graphs. Causality Workbench clopinet.com/causality
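A minimal sketch of how a causation coefficient can be used to rank pairs and how such a ranking can be scored with the area under the ROC curve. The pair names, scores, and truth labels below are invented for illustration, and the scoring shown is one reasonable choice, not necessarily the exact challenge metric.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data: (pair id, causation coefficient C, truth),
# with truth +1 for A -> B, -1 for A <- B, 0 for A - B or A | B.
pairs = [("p1", 0.9, 1), ("p2", -0.7, -1), ("p3", 0.1, 0),
         ("p4", 0.4, 1), ("p5", -0.2, 0), ("p6", 0.5, 0)]

# Rank pairs from "most likely A -> B" to "most likely A <- B".
ranked = sorted(pairs, key=lambda p: p[1], reverse=True)

# Separate A -> B from everything else using C, and A <- B using -C.
auc_forward = auc([c for _, c, _ in pairs], [int(t == 1) for _, _, t in pairs])
auc_backward = auc([-c for _, c, _ in pairs], [int(t == -1) for _, _, t in pairs])
print([name for name, _, _ in ranked], auc_forward, auc_backward)
```

Pairs at the top of `ranked` would be the first candidates for a confirming experiment.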

  5. ROC curves for A->B

  6. Winners 1. ProtoML (Rank 1): Diogo Moitinho de Almeida. 2. Jarfo (Rank 2): José Adrián Rodríguez Fonollosa. 3. FirfID (Rank 4): Spyridon Samothrakis.

  7. Data

  8. Cause-effect pairs method. Test whether A -> B is a better explanation than A <- B by comparing two hypotheses: B = f(A, noise) and A = f(B, noise).
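The comparison of the two hypotheses can be sketched as follows, under strong simplifying assumptions. The regression is a crude per-bin mean and the dependence check is the variation of the residual spread across bins. Both are stand-ins chosen to keep the example self-contained, not the methods used in the challenge; a practical version would use a smoother regressor and a proper independence test. All function names and the simulated data are hypothetical.

```python
import random
import statistics

def residual_spread_variation(x, y, n_bins=40):
    """Regress y on x with per-bin means and return the coefficient of
    variation of the per-bin residual standard deviations.
    A value near 0 means the residual spread does not depend on x."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    size = len(x) // n_bins  # leftover points beyond n_bins * size are ignored
    spreads = []
    for k in range(n_bins):
        ys = [y[i] for i in order[k * size:(k + 1) * size]]
        spreads.append(statistics.pstdev(ys))  # spread around the bin mean
    return statistics.pstdev(spreads) / statistics.mean(spreads)

def causation_score(a, b):
    """Positive when B = f(A, noise) fits better than A = f(B, noise)."""
    forward = residual_spread_variation(a, b)   # hypothesis A -> B
    backward = residual_spread_variation(b, a)  # hypothesis A <- B
    return backward - forward

# Simulated pair with a known direction A -> B (assumed mechanism and noise).
random.seed(0)
a = [random.uniform(-1, 1) for _ in range(2000)]
b = [v ** 3 + v + random.uniform(-0.3, 0.3) for v in a]
score = causation_score(a, b)
print("A -> B" if score > 0 else "A <- B", round(score, 3))
```

Swapping the arguments flips the sign of the score, so the same function covers both directions.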

  9. Setting of the challenge (diagram). Cases shown: A -> B; A <- B; A <- Z -> B (~ A - B); A | B.

  10. Setting • No feedback loops. • No explicit time information. • A variable can be thought of as an aggregate statistic, like life expectancy of a population, or a measurement like temperature. • We consider pairs of variables {A, B} for which A -> B means B = f(A, noise). • Pairs are independent of each other.
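To make the setting concrete, here is a small simulation of the four situations (A -> B, A <- B, a common cause Z, and independence). The mechanism, noise level, and sample size are arbitrary assumptions for illustration and do not reproduce how the challenge data were generated.

```python
import math
import random

def mechanism(v):
    """Hypothetical nonlinear mechanism f."""
    return v ** 3 + v

def make_pair(kind, n=2000, seed=0):
    """Simulate one pair (a, b) of the requested kind."""
    rng = random.Random(seed)
    def draw():
        return [rng.uniform(-1, 1) for _ in range(n)]
    def noisy(values, f):
        return [f(v) + rng.gauss(0, 0.2) for v in values]
    if kind == "A->B":
        a = draw()
        b = noisy(a, mechanism)      # B = f(A, noise)
    elif kind == "A<-B":
        b = draw()
        a = noisy(b, mechanism)      # A = f(B, noise)
    elif kind == "A-B":
        z = draw()                   # unobserved common cause Z
        a = noisy(z, mechanism)
        b = noisy(z, lambda v: math.tanh(2 * v))
    elif kind == "A|B":
        a, b = draw(), draw()        # independent variables
    else:
        raise ValueError(f"unknown kind: {kind}")
    return a, b

def pearson(x, y):
    """Plain Pearson correlation, used only as a sanity check."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)

pairs = {kind: make_pair(kind) for kind in ("A->B", "A<-B", "A-B", "A|B")}
for kind, (a, b) in pairs.items():
    print(kind, round(pearson(a, b), 2))
```

Correlation separates the independent case from the other three but cannot tell the three dependent cases apart, which is the difficulty the challenge addresses.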

  11. Data provided

  12. Example (figure). Best fit: A -> B. Panels: A -> B, A <- B.

  13. Large dataset • Real data (18%): – Altitude -> Temperature – Age -> Wages – Car color -> Price – Country -> Infant mortality • Artificial data (82%): B = f(A, noise)

  14. Real variables • Demographics: Sex -> Height; Age -> Wages; Native country -> Education; Latitude -> Infant mortality • Medicine: Cancer volume -> Recurrence; Metastasis -> Prognosis; Age -> Blood pressure • Genomics (mRNA level): transcription factor -> protein induced • Ecology: City elevation -> Temperature; Water level -> Algal frequency; Elevation -> Vegetation type; Distance to hydrology -> Fire • Engineering: Car model year -> Horsepower; Number of cylinders -> MPG; Cache memory -> Compute power; Roof area -> Heating load; Cement used -> Compressive strength • Econometrics: Mileage -> Car resell price; Number of rooms -> House price; Trade price last day -> Trade price

  15. Real variables (diagram). 2N manually curated pairs; N artificial A <-> B pairs. Operations: rank-preserving variable substitution; variable random permutations. Resulting classes, N pairs each: A <-> B, A | B, A -> B, A <- B.
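The two operations named on this slide can be illustrated generically. The sketch below is one plausible reading of the terms, not the organizers' actual procedure: a random permutation keeps a variable's marginal distribution while breaking its link to a partner variable, and a rank-preserving substitution swaps in values from another sample while keeping the ordering. Function names and data are hypothetical.

```python
import random

def permute_variable(values, rng):
    """Shuffled copy: same marginal distribution, but any dependence
    on a partner variable is destroyed."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled

def rank_preserving_substitution(values, donor):
    """Replace each value by the donor value of the same rank, so the
    ordering is kept while the marginal distribution becomes the donor's."""
    if len(values) != len(donor):
        raise ValueError("values and donor must have the same length")
    order = sorted(range(len(values)), key=lambda i: values[i])
    donor_sorted = sorted(donor)
    out = [None] * len(values)
    for rank, i in enumerate(order):
        out[i] = donor_sorted[rank]
    return out

rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(200)]
donor = [rng.expovariate(1.0) for _ in range(200)]

a_perm = permute_variable(a, rng)               # candidate for an A | B pair
a_sub = rank_preserving_substitution(a, donor)  # same ranks, new values
print(a_perm[:3], a_sub[:3])
```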

  16. Artificial data (diagram). Labels: A, Z, B; F(A, Z); real variables; mix of categorical + continuous.

  17. Data browser and sample code

  18. Result analysis

  19. Model-based methods • Additive Noise Model (ANM): Best fit, compare independence of input and residual. • Latent variable models (LiNGAM): Enforce independence of input and residual, compare model weights. • Complexity-based models: Select simplest explanation of the data (GPI and IGCI). http://webdav.tuebingen.mpg.de/causality/
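As an illustration of the complexity-based family, the sketch below follows the commonly described slope-based IGCI estimator with a uniform reference measure: both variables are rescaled to [0, 1], the average log-slope between neighbouring points is computed in each direction, and the direction with the smaller value is preferred. This is a simplified sketch on noise-free simulated data, not the reference implementation available at the URL above.

```python
import math
import random

def _unit_scale(values):
    """Rescale to [0, 1] (uniform reference measure)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def igci_slope(x, y):
    """Average log-slope of y against x over neighbouring points sorted by x."""
    pts = sorted(zip(_unit_scale(x), _unit_scale(y)))
    logs = []
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx != 0 and dy != 0:  # skip ties, whose log-slope is undefined
            logs.append(math.log(abs(dy / dx)))
    return sum(logs) / len(logs)

# Simulated deterministic pair A -> B (assumed mechanism b = a ** 3).
rng = random.Random(0)
a = [rng.uniform(0, 1) for _ in range(500)]
b = [v ** 3 for v in a]

c_ab = igci_slope(a, b)
c_ba = igci_slope(b, a)
inferred = "A -> B" if c_ab < c_ba else "A <- B"
print(round(c_ab, 2), round(c_ba, 2), inferred)
```

The difference `c_ba - c_ab` could serve as a causation coefficient for ranking pairs, with the caveat that this estimator assumes little noise and an invertible mechanism.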

  20. Empirical methods • 267 teams and 4578 entries. • All baseline methods outperformed! • Code of 3 winners available.

  21. No overfitting

  22. Result comparison

  23. Statistical significance

  24. Causation coefficient distribution

  25. Causation coefficient distribution

  26. Post-challenge verifications. 3648 cause-effect pairs from GeneNetWeaver 3.0 (http://gnw.sourceforge.net/), based on the E. coli transcriptional regulatory network. • Experiment 1: no retraining. • Experiment 2: train on ½, test on ½. Alexander Statnikov and Sisi Ma.

  27. Survey (27 responses)

  28. Preprocessing

  29. Feature extraction

  30. Dimensionality reduction

  31. Recognition

  32. Classifier

  33. Implementation

  34. Time spent

  35. Cause-Effect Pairs Challenge http://clopinet.com/causality
