The Finite-Set Independence Criterion (FSIC)



  1. The Finite-Set Independence Criterion (FSIC). Zoltán Szabó, Arthur Gretton, Wittawat Jitkrittum (wittawat@gatsby.ucl.ac.uk), Gatsby Unit, University College London. 3rd UCL Workshop on the Theory of Big Data, 28 June 2017.

  2. What Is Independence Testing?
     Let $(X, Y) \in \mathbb{R}^{d_x} \times \mathbb{R}^{d_y}$ be random vectors following $P_{xy}$.
     Given a joint sample $\{(x_i, y_i)\}_{i=1}^{n} \sim P_{xy}$ (unknown), test
     $$H_0: P_{xy} = P_x P_y \quad \text{vs.} \quad H_1: P_{xy} \neq P_x P_y.$$
     Compute a test statistic $\hat{\lambda}_n$. Reject $H_0$ if $\hat{\lambda}_n > T_\alpha$ (threshold), where $T_\alpha$ is the $(1 - \alpha)$-quantile of the null distribution.
     [Figure: the null density $P_{H_0}(\hat{\lambda}_n)$ plotted against $\hat{\lambda}_n$, with the threshold $T_\alpha$ marked.]
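The generic procedure on this slide (compute a statistic, then reject when it exceeds the $(1 - \alpha)$-quantile of the null distribution) can be sketched in a few lines. This is only an illustration under stated assumptions: the null distribution is approximated by permuting `y`, which the slide does not prescribe, and the squared Pearson correlation is a stand-in statistic, not FSIC. The function names `permutation_test` and `squared_correlation` are hypothetical.

```python
import numpy as np

def permutation_test(x, y, statistic, alpha=0.05, n_permutations=500, seed=0):
    """Reject H0 (independence) if the statistic exceeds T_alpha.

    T_alpha is the (1 - alpha)-quantile of the null distribution, which is
    approximated here (an assumption) by recomputing the statistic on
    randomly permuted copies of y.
    """
    rng = np.random.default_rng(seed)
    observed = statistic(x, y)
    null_stats = np.empty(n_permutations)
    for b in range(n_permutations):
        perm = rng.permutation(len(y))  # shuffling y breaks the pairing with x
        null_stats[b] = statistic(x, y[perm])
    threshold = np.quantile(null_stats, 1.0 - alpha)
    return observed, threshold, bool(observed > threshold)

def squared_correlation(x, y):
    """Stand-in statistic (assumption): squared Pearson correlation."""
    return np.corrcoef(x, y)[0, 1] ** 2

# Toy data with a strong linear dependence between x and y.
rng = np.random.default_rng(1)
x = rng.normal(size=300)
y_dep = x + 0.5 * rng.normal(size=300)

stat, thr, reject = permutation_test(x, y_dep, squared_correlation)
print(f"statistic={stat:.3f}, threshold={thr:.3f}, reject H0: {reject}")
```

Any statistic that grows under dependence can be passed in place of `squared_correlation`. The cost of the permutation loop is `n_permutations` evaluations of the statistic.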

  5. Motivations
     The modern state-of-the-art test is HSIC [Gretton et al., 2005].
     ✓ Nonparametric, i.e., no assumption on $P_{xy}$. Kernel-based.
     ✗ Slow. Runtime: $O(n^2)$, where $n$ is the sample size.
     ✗ No systematic way to choose kernels.
     We propose the Finite-Set Independence Criterion (FSIC), which is:
     (1) Nonparametric.
     (2) Linear-time. Runtime complexity: $O(n)$. Fast.
     (3) Tunable, i.e., it has a well-defined criterion for parameter tuning.

  7. Proposal: The Finite-Set Independence Criterion (FSIC)
     (1) Pick 2 positive definite kernels: $k$ for $X$, and $l$ for $Y$. Gaussian kernel: $k(x, v) = \exp\left(-\frac{\|x - v\|^2}{2\sigma_x^2}\right)$.
     (2) Pick some feature $(v, w) \in \mathbb{R}^{d_x} \times \mathbb{R}^{d_y}$.
     (3) Transform $(x, y) \mapsto (k(x, v), l(y, w))$, a map $\mathbb{R}^{d_x} \times \mathbb{R}^{d_y} \to \mathbb{R} \times \mathbb{R}$, then measure the covariance:
     $$\mathrm{FSIC}^2(X, Y) = \mathrm{cov}^2_{(x, y) \sim P_{xy}}\left[k(x, v),\, l(y, w)\right].$$
     [Figure, shown for several choices of the feature $(v, w)$ across successive builds of the slide: a "Data" panel plotting $y$ against $x$ with $(v, w)$ in the legend, and a panel plotting $l(y, w)$ against $k(x, v)$ titled with their correlation. Reported correlations: 0.97, -0.47, 0.33, 0.023, 0.025, 0.087.]
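The three steps of the proposal (pick kernels, pick one feature $(v, w)$, transform and measure covariance) can be sketched as below. This is a minimal illustration, assuming Gaussian kernels on both `x` and `y` and a simple biased (1/n) sample covariance. The function names, the bandwidths `sigma_x = sigma_y = 1`, the toy data, and the location `(v, w) = (1, 1)` are all assumptions made for the example, not values from the slides.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """k(a_i, b) = exp(-||a_i - b||^2 / (2 sigma^2)) for every row a_i of a."""
    sq_dist = np.sum((a - b) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def fsic2_single(x, y, v, w, sigma_x=1.0, sigma_y=1.0):
    """Squared sample covariance of k(x, v) and l(y, w) at one feature (v, w)."""
    kx = gaussian_kernel(x, v, sigma_x)  # shape (n,)
    ly = gaussian_kernel(y, w, sigma_y)  # shape (n,)
    cov = np.mean(kx * ly) - np.mean(kx) * np.mean(ly)  # biased 1/n estimate
    return cov ** 2

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 1))
y_dep = x + 0.1 * rng.normal(size=(n, 1))  # y depends on x
y_ind = rng.normal(size=(n, 1))            # y independent of x

v = np.array([1.0])  # hypothetical feature location for x
w = np.array([1.0])  # hypothetical feature location for y

fsic_dep = fsic2_single(x, y_dep, v, w)
fsic_ind = fsic2_single(x, y_ind, v, w)
print(f"dependent: {fsic_dep:.5f}, independent: {fsic_ind:.7f}")
```

Each call touches every sample once, so the cost is linear in `n`. As the correlation values reported on the slide suggest, the result depends heavily on where `(v, w)` is placed.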

  16. General Form of FSIC
     $$\mathrm{FSIC}^2(X, Y) = \frac{1}{J} \sum_{j=1}^{J} \mathrm{cov}^2_{(x, y) \sim P_{xy}}\left[k(x, v_j),\, l(y, w_j)\right],$$
     for $J$ features $\{(v_j, w_j)\}_{j=1}^{J} \subset \mathbb{R}^{d_x} \times \mathbb{R}^{d_y}$.
     Proposition 1. Assume
     (1) Kernels $k$ and $l$ satisfy some conditions (e.g. Gaussian kernels).
     (2) Features $\{(v_j, w_j)\}_{j=1}^{J}$ are drawn from a distribution with a density.
     Then, for any $J \geq 1$, $\mathrm{FSIC}(X, Y) = 0$ if and only if $X$ and $Y$ are independent.
     Under $H_0: P_{xy} = P_x P_y$, $n\,\widehat{\mathrm{FSIC}^2} \sim$ a weighted sum of $J$ dependent $\chi^2$ variables. It is difficult to get the $(1 - \alpha)$-quantile for the threshold.
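The general form averages the squared covariance over $J$ features. A rough plug-in sketch follows, assuming Gaussian kernels and the biased 1/n covariance. It is not necessarily the estimator used by the authors. The names `gaussian_features` and `fsic2_plugin`, the bandwidths, and the choice to draw the features from a standard normal (one example of "a distribution with a density") are assumptions for illustration.

```python
import numpy as np

def gaussian_features(a, locs, sigma):
    """Return the n x J matrix with entries exp(-||a_i - locs_j||^2 / (2 sigma^2))."""
    sq_dist = np.sum((a[:, None, :] - locs[None, :, :]) ** 2, axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def fsic2_plugin(x, y, V, W, sigma_x=1.0, sigma_y=1.0):
    """Average over J features of the squared sample covariance.

    V has shape (J, d_x) and W has shape (J, d_y); row j is the feature (v_j, w_j).
    Cost is O(n * J), i.e. linear in the sample size n.
    """
    K = gaussian_features(x, V, sigma_x)  # n x J, entries k(x_i, v_j)
    L = gaussian_features(y, W, sigma_y)  # n x J, entries l(y_i, w_j)
    cov = np.mean(K * L, axis=0) - K.mean(axis=0) * L.mean(axis=0)  # length J
    return float(np.mean(cov ** 2))

rng = np.random.default_rng(0)
n, J = 2000, 10
x = rng.normal(size=(n, 1))
y_dep = x + 0.3 * rng.normal(size=(n, 1))  # y depends on x
y_ind = rng.normal(size=(n, 1))            # y independent of x

# Features drawn from a distribution with a density (standard normal, an assumption).
V = rng.normal(size=(J, 1))
W = rng.normal(size=(J, 1))

est_dep = fsic2_plugin(x, y_dep, V, W)
est_ind = fsic2_plugin(x, y_ind, V, W)
print(f"dependent: {est_dep:.5f}, independent: {est_ind:.7f}")
```

The sketch only computes the statistic. It does not address the threshold, which the slide notes is hard to obtain because the null distribution is a weighted sum of dependent $\chi^2$ variables.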
