

SLIDE 1

MACHINE LEARNING FOR CAUSE-EFFECT PAIRS DETECTION

Mehreen Saeed, CLE Seminar, 11 February 2014

SLIDE 2

WHY CAUSALITY….

  • Polio drops can cause polio epidemics
– (The Nation, January 2014)

  • A supernova explosion causes a burst of neutrinos
– (Scientific American, November 2013)

  • Mobile phones can cause brain tumors
– (The Telegraph, October 2012)

  • DDT pesticide may cause Alzheimer’s disease
– (BBC, January 2014)

  • Price of dollar going up causes price of gold to go down
– (Investopedia.com, March 2011)

SLIDE 3

OUTLINE

  • Causality
  • Coefficients for computing causality

– Independence measures
– Probabilistic measures
– Determining the direction of arrows

  • Transfer learning
  • Causality challenge
  • Conclusions
SLIDE 4

OBSERVATIONAL VS. EXPERIMENTAL DATA

  • Observational data is collected by recording values of different characteristics of the subject
  • Experimental data is collected by changing values of some characteristics of the subject; some values are under the control of an experimenter

Example: Randomly select 100 individuals and collect data on their everyday diet and their health issues
Vs.
Select 100 individuals with diabetes, omit a certain food from their diet, and observe the result
SLIDE 5
OBSERVATIONAL VS. EXPERIMENTAL DATA…(CONTD)

  • Observational data: Google receives around 2 million requests/minute, Facebook users post around 680,000 pieces of content/minute, email users send 200,000,000 messages in a minute
VS.
  • Experimental data: expensive, maybe unethical, maybe not possible

REF: http://mashable.com/2012/06/22/data-created-every-minute/

15 years ago it was thought that inferring causal relationships from observational data was not possible…. Research of machine learning scientists like Judea Pearl has changed this view

SLIDE 6

CAUSALITY: FROM OBSERVATIONAL DATA TO CAUSE-EFFECT DETECTION

  • X->Y
smoking causes lung cancer

  • Y->X
lung cancer causes coughing

  • X ⊥ Y
winning a cricket match and being born in February

  • X->Z->Y
X ⊥ Y | Z (conditional independence)

  • X<-Z->Y
X ⊥ Y | Z (conditional independence)
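The chain and fork structures above can be sketched numerically: in both X->Z->Y and X<-Z->Y, X and Y are marginally dependent but become independent once Z is known. A minimal check, using partial correlation as the conditional-independence measure (a simplifying assumption that is exact only for linear-Gaussian data, synthetic here):

```python
import numpy as np

rng = np.random.default_rng(7)
z = rng.normal(size=20_000)
x = z + 0.5 * rng.normal(size=20_000)   # X <- Z
y = z + 0.5 * rng.normal(size=20_000)   # Z -> Y   (fork: X <- Z -> Y)

def partial_corr(x, y, z):
    # Regress Z out of both variables, then correlate the residuals.
    rx = x - np.polyval(np.polyfit(z, x, 1), z)
    ry = y - np.polyval(np.polyfit(z, y, 1), z)
    return np.corrcoef(rx, ry)[0, 1]

print(np.corrcoef(x, y)[0, 1])   # clearly nonzero: marginally dependent
print(partial_corr(x, y, z))     # near zero: X independent of Y given Z
```

The same numbers come out for a chain X->Z->Y, which is why observational dependence patterns alone cannot separate the two structures.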

SLIDE 7

OUTLINE

  • Causality
  • Coefficients for computing causality

– Independence measures
– Probabilistic measures
– Determining the direction of arrows

  • Transfer learning
  • Causality challenge
  • Conclusions
SLIDE 8

CORRELATION ρ = {E(XY) − E(X)E(Y)} / (std(X) std(Y))

[Three scatter plots of (X, Y) pairs:]
  • X->Y, correlation = 0.9
  • X->Y, correlation = -0.04
  • X ⊥ Y, correlation = 0.73

Correlation does not necessarily imply causality
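A minimal sketch of the slide's coefficient, ρ = (E[XY] − E[X]E[Y]) / (std(X) std(Y)), in plain NumPy. The data is synthetic (an assumption), chosen only to show that the measure is symmetric and therefore carries no information about the direction of the arrow:

```python
import numpy as np

def correlation(x, y):
    # rho = (E[XY] - E[X]E[Y]) / (std(X) * std(Y))
    return (np.mean(x * y) - np.mean(x) * np.mean(y)) / (np.std(x) * np.std(y))

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = 0.9 * x + 0.4 * rng.normal(size=10_000)   # X -> Y by construction
z = rng.normal(size=10_000)                   # independent of x

print(correlation(x, y))   # strongly correlated
print(correlation(y, x))   # identical value: the coefficient has no direction
print(correlation(x, z))   # near zero
```

Because correlation(x, y) == correlation(y, x), a high value is equally compatible with X->Y, Y->X, or a common cause.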

SLIDE 9

χ2 TEST FOR INDEPENDENCE

[Scatter plots and binned contingency tables of (X, Y) pairs:]
  • p-value = 0.99, dof = 81, chi2 value = 52.6, truth: X ⊥ Y
  • p-value = 0, dof = 63, chi2 value = 3255, corr = 0.5948, truth: X ⊥ Y

Again, this test does not tell us anything about causal inference
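The chi-square test on the slide can be sketched with SciPy by binning a continuous pair into a contingency table, as the slide's 10×10 grid does. The data below is synthetic (an assumption); one pair is dependent, one independent:

```python
import numpy as np
from scipy.stats import chi2_contingency

def chi2_on_pair(x, y, bins=8):
    # Bin the continuous pair into a table of joint counts, then test.
    table, _, _ = np.histogram2d(x, y, bins=bins)
    chi2, p, dof, _ = chi2_contingency(table)
    return chi2, p, dof

rng = np.random.default_rng(1)
x = rng.normal(size=10_000)
y_dep = x + 0.5 * rng.normal(size=10_000)   # dependent pair
y_ind = rng.normal(size=10_000)             # independent pair

chi2_d, p_d, dof_d = chi2_on_pair(x, y_dep)
chi2_i, p_i, dof_i = chi2_on_pair(x, y_ind)
print(f"dependent:   chi2={chi2_d:.1f}, p={p_d:.3g}, dof={dof_d}")
print(f"independent: chi2={chi2_i:.1f}, p={p_i:.3g}, dof={dof_i}")
```

As on the slide, a tiny p-value only signals dependence; it says nothing about which variable causes the other.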

SLIDE 10

STATISTICAL INDEPENDENCE

FOR TWO INDEPENDENT EVENTS:

P(XY)=P(X)P(Y)

SLIDE 11

STATISTICAL INDEPENDENCE…CONTD…

Measuring P(XY) − P(X)P(Y)

[Two scatter plots of (X, Y) pairs:]
  • X->Y: P(XY) − P(X)P(Y) = 0.09
  • X ⊥ Y: P(XY) − P(X)P(Y) = 0.04
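One way to estimate the gap P(XY) − P(X)P(Y) from data is to turn each variable into a binary event and compare the joint frequency with the product of the marginals. The "above its median" event below is a simplifying assumption (the slides do not specify how the events were formed), and the data is synthetic:

```python
import numpy as np

def independence_gap(x, y):
    a = x > np.median(x)          # event on X
    b = y > np.median(y)          # event on Y
    # Empirical P(AB) - P(A)P(B): zero for independent events.
    return np.mean(a & b) - np.mean(a) * np.mean(b)

rng = np.random.default_rng(2)
x = rng.normal(size=10_000)
g_dep = independence_gap(x, x + 0.5 * rng.normal(size=10_000))  # dependent
g_ind = independence_gap(x, rng.normal(size=10_000))            # independent
print(g_dep, g_ind)
```

Like correlation, this gap is symmetric in X and Y, so it can flag dependence but not direction.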

SLIDE 12

CAUSALITY & DIRECTION OF ARROWS

X->Y VS. Y->X

SLIDE 13

CONDITIONAL PROBABILITY

[Histograms of P(X) and of P(X | Y > 0.4):]

Does the presence of another variable alter the distribution of X?

  • P(cause and effect) is more likely explained by P(cause)P(effect|cause) than by P(effect)P(cause|effect)
  • Also, if P(X) = P(X|Y), it may indicate that X is independent of Y
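The slide's question can be checked directly: does conditioning on Y change the distribution of X? Below we compare the mean of X with the mean of X | Y > 0.4 (the threshold shown on the slide) for a dependent and an independent pair; the data is synthetic (an assumption):

```python
import numpy as np

rng = np.random.default_rng(6)
y = rng.normal(size=20_000)
x_dep = 0.8 * y + 0.6 * rng.normal(size=20_000)   # X depends on Y
x_ind = rng.normal(size=20_000)                   # X independent of Y

mask = y > 0.4
print(np.mean(x_dep), np.mean(x_dep[mask]))  # conditional mean shifts up
print(np.mean(x_ind), np.mean(x_ind[mask]))  # essentially unchanged
```

For the independent pair the conditional and unconditional distributions agree, matching the bullet above: P(X) = P(X|Y) suggests independence.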
SLIDE 14

DETERMINING THE DIRECTION OF ARROWS

  • ANM: Fit Y=f(X)+ex; check the independence of X and ex to determine the strength of X->Y
  • PNL: Fit Y=g(f(X)+ex) and check the independence of X and ex
  • IGCI: If X->Y then the KL-divergence between P(Y) and a reference distribution is greater than the KL-divergence between P(X) and a reference distribution
  • GPI-MML, ANM-MML, ANM-GAUSS: Likelihood of the observed data given X->Y is inversely related to the complexity of P(X) and P(Y|X)
  • LINGAM: Fit Y=aX+ex and X=bY+ey; X->Y if a>b

REF: Statnikov et al., New methods for separating causes from effects in genomics data, BMC Genomics, 2012
Note: There are assumptions associated with each method, not stated here

SLIDE 15

USING REGRESSION

Determine the direction of causality: the idea behind ANM…

[Scatter plots with regression fits in both directions:]
Fit Y = f(X) + ex
Fit X = f(Y) + ey
Truth: X->Y
Check the independence of X and ex, and of Y and ey
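The ANM idea above can be roughed out in a few lines: regress in each direction and score how dependent the residuals are on the input. As a crude stand-in for a real independence test (HSIC and similar, omitted here as a simplifying assumption), we correlate the squared input with the squared residuals, which picks up the heteroscedasticity that appears in the wrong direction; the data is synthetic:

```python
import numpy as np

def residual_dependence(x, y, degree=3):
    # Fit y = f(x) + e with a polynomial, then score dependence between
    # input and residual via |corr(x^2, e^2)| (0 = looks independent).
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    return abs(np.corrcoef(x**2, resid**2)[0, 1])

rng = np.random.default_rng(3)
x = rng.uniform(-2, 2, size=5_000)
y = x**3 + rng.normal(scale=0.5, size=5_000)   # truth: X -> Y

score_xy = residual_dependence(x, y)  # forward: residuals are just the noise
score_yx = residual_dependence(y, x)  # backward: residual spread varies with y
print("X->Y" if score_xy < score_yx else "Y->X")
```

The direction with the more independent-looking residuals is taken as causal, exactly the asymmetry the slide's two fits illustrate.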

SLIDE 16

IDEA BEHIND LINGAM…

[Scatter plots with linear fits in both directions:]
y = 0.58x - 0.02, correlation = 0.58
x = 0.6y + 0.01
truth: Y->X
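The simplified rule from the previous slide (fit Y = aX + ex and X = bY + ey; X->Y if a > b) can be sketched as below. Note this slope comparison is scale-sensitive and is only the intuition; the full LiNGAM method relies on non-Gaussian noise and ICA, which this sketch leaves out as a simplifying assumption. The data is synthetic, built so that Y->X is the truth:

```python
import numpy as np

rng = np.random.default_rng(4)
y = rng.normal(size=10_000)
x = 0.6 * y + rng.normal(size=10_000)   # truth: Y -> X

a, _ = np.polyfit(x, y, 1)   # slope of the fit Y = aX + ex
b, _ = np.polyfit(y, x, 1)   # slope of the fit X = bY + ey
print("X->Y" if a > b else "Y->X")
```

Here b recovers the true coefficient 0.6 while a is attenuated by the noise in x, so b > a and the rule points the arrow from Y to X, matching the slide's example (a = 0.58 vs b = 0.6).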

SLIDE 17

OUTLINE

  • Causality
  • Coefficients for computing causality

– Independence measures
– Probabilistic measures
– Determining the direction of arrows

  • Transfer learning
  • Causality challenge
  • Conclusions
SLIDE 18

TRANSFER LEARNING

Can we use our knowledge from one problem and transfer it to another???

REF: Pan and Yang, A survey on transfer learning, IEEE TKDE, 22(10), 2010.

SLIDE 19

TRANSFER LEARNING…ONE POSSIBLE VIEW

[Diagram:]
SOURCE DOMAIN: lots of labeled data, truth values are known
→ feature construction (same features in both domains) → classification machine →
TARGET DOMAIN: output labels

SLIDE 20

CAUSALITY & FEATURE CONSTRUCTION FOR TRANSFER LEARNING

[Scatter plot of an (X, Y) pair, correlation = -0.04]

If we know the truth values for the X and Y relationship, then construct features such as:
  • independence based: correlation, chi square and so on
  • causality based: IGCI, ANM, PNM and so on
  • statistical: percentiles, medians and so on
  • machine learning: errors of prediction and so on
SLIDE 21

CAUSALITY AND TRANSFER LEARNING… THE WHOLE PICTURE

[Diagram: training pairs (PAIR 1, PAIR 2, PAIR 3) with known labels (X->Y, Y->X, X⊥Y) are converted into feature vectors (CORR, IG, CHI-SQ, ANM, …) and fed to a classification machine; unlabeled pairs (PAIR i, PAIR j, PAIR k) are converted with the same features, and the machine outputs their labels]
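The whole picture above can be sketched end to end: compute a feature vector per pair, train a classifier on pairs whose relationship is known, and predict for new pairs. The feature list here is far shorter than the slide's, and both the data and the labels are synthetic (assumptions for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)

def pair_features(x, y):
    # Two independence-based features per pair, plus absolute values.
    corr = np.corrcoef(x, y)[0, 1]
    a, b = x > np.median(x), y > np.median(y)
    gap = np.mean(a & b) - np.mean(a) * np.mean(b)   # P(XY) - P(X)P(Y)
    return [corr, gap, abs(corr), abs(gap)]

def make_pair(label, n=2_000):
    # label 1 = "X -> Y", label 0 = "X independent of Y"
    x = rng.normal(size=n)
    y = x + rng.normal(size=n) if label == 1 else rng.normal(size=n)
    return pair_features(x, y)

X_train = [make_pair(lbl) for lbl in (1, 0) * 100]
y_train = [1, 0] * 100
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

preds = clf.predict([make_pair(1), make_pair(0)])
print(preds)
```

Because these two features are symmetric in X and Y, this sketch can only separate dependent from independent pairs; telling X->Y from Y->X requires the asymmetric, causality-based features (ANM, IGCI, …) listed on the previous slide.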

SLIDE 22

OUTLINE

  • Causality
  • Coefficients for computing causality

– Independence measures
– Probabilistic measures
– Determining the direction of arrows

  • Transfer learning
  • Causality challenge
  • Conclusions
SLIDE 23

CAUSE EFFECT PAIRS CHALLENGE

Generated from artificial and real data (geography, demographics, chemistry, biology, etc.)
Training Data: 4050 pairs (truth values: known)
Validation Data: 4050 pairs (truth values: unknown)
Test Data: 4050 pairs (truth values: unknown)
Variables can be categorical, numerical or binary
Identity of the variables in all cases: unknown

REF: Guyon, Results and analysis of the 2013 ChaLearn cause-effect pair challenge, NIPS 2013. REF: http://www.causality.inf.ethz.ch/cause-effect.php

SLIDE 24

CAUSE EFFECT PAIRS CHALLENGE

https://www.kaggle.com/c/cause-effect-pairs

SLIDE 25

WHAT WERE THE BEST METHODS

Pre-processing: smoothing, binning, transforms, noise removal, etc.
Feature extraction: independence, entropy, residuals, statistical features, etc.
Dimensionality reduction: feature selection, PCA, ICA, clustering
Classifier: random forests, decision trees, neural networks, etc.

REF: Guyon, Results and analysis of the 2013 ChaLearn cause-effect pair challenge, NIPS 2013.

SLIDE 26

INTERESTING RESULTS... TRANSFER LEARNING

          NO RETRAINING   RETRAINING
Jarfo     0.87            0.997
FirfiD    0.60            0.984
ProtoML   0.81            0.990

3648 gene network cause-effect pairs from the E. coli regulatory network
REF: http://gnw.sourceforge.net/dreamchallenge.html
REF: Guyon, Results and analysis of the 2013 ChaLearn cause-effect pair challenge, NIPS 2013.

SLIDE 27

CONCLUSIONS

  • In many cases just one causal coefficient is not enough, so you may have to train a classifier with multiple causal features
  • Research on causal inference from the past decade has shown that it is possible, to a great extent, to isolate cause and effect pairs from observational data

THANK YOU

SLIDE 28

REFERENCES

1. Statnikov et al., New methods for separating causes from effects in genomics data, BMC Genomics, 2012.
2. NIPS 2013 Workshop on Causality, http://clopinet.com/isabelle/Projects/NIPS2013/
3. Pan and Yang, A survey on transfer learning, IEEE Transactions on Knowledge and Data Engineering, 22(10), 2010.
4. Kaggle website on machine learning challenges and the cause-effect pairs challenge, www.kaggle.com
5. All datasets are taken from the causality challenge: https://www.kaggle.com/c/cause-effect-pairs