
SLIDE 1

Definition of ICA · Measures of nongaussianity · Natural images, sparsity, ICA · Independent subspaces and topography · Image sequences

Gatsby Theoretical Neuroscience Lectures: Non-Gaussian statistics and natural images Parts I-II

Aapo Hyvärinen

Gatsby Unit, University College London

27 Feb 2017

SLIDE 2

Outline

◮ Part I: Theory of ICA
  ◮ Definition and difference to PCA
  ◮ Importance of non-Gaussianity
◮ Part II: Natural images and ICA
  ◮ Application of ICA and sparse coding on natural images
  ◮ Extensions of ICA with dependent components
◮ Part III: Estimation of unnormalized models
  ◮ Motivation by extensions of ICA
  ◮ Score matching
  ◮ Noise-contrastive estimation
◮ Part IV: Recent extensions of ICA and natural image statistics
  ◮ A three-layer model, towards deep learning

SLIDE 3

Part I: Theory of ICA

◮ Definition of ICA as a non-Gaussian generative model
◮ Importance of non-Gaussianity
◮ Fundamental difference to PCA
◮ Estimation by maximization of non-Gaussianity
◮ Measures of non-Gaussianity

SLIDE 4

Problem of blind source separation

There are a number of "source signals". Due to some external circumstances, only linear mixtures of the source signals are observed. The task is to estimate (separate) the original signals!

SLIDE 5

A solution is possible

PCA does not recover the original signals.

SLIDE 6

A solution is possible

PCA does not recover the original signals; use information on statistical independence to recover them:

SLIDE 7

Independent Component Analysis

(Hérault and Jutten, 1984–1991)

◮ Observed random variables x_i are modelled as linear sums of hidden variables:

x_i = \sum_{j=1}^{m} a_{ij} s_j,   i = 1, ..., n    (1)

◮ Mathematical formulation of the blind source separation problem
◮ Not unlike factor analysis
◮ The matrix of the a_{ij} is the parameter matrix, called the "mixing matrix"
◮ The s_j are hidden random variables, called the "independent components" or "source signals"
◮ Problem: estimate both the a_{ij} and the s_j, observing only the x_i
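As a minimal numerical sketch of model (1), the snippet below generates two uniform (hence non-Gaussian) sources and mixes them with an arbitrary, illustrative 2×2 mixing matrix; the specific numbers are assumptions for the example, not part of the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two non-Gaussian (here: uniform, unit-variance) hidden source signals.
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, 1000))

# A fixed mixing matrix: the parameters a_ij of the model (arbitrary here).
A = np.array([[1.0, 0.5],
              [0.3, 2.0]])

# Observed variables: each x_i is a linear sum of the hidden s_j.
x = A @ s

print(x.shape)  # (2, 1000)
```

ICA's task is then to recover both `A` and `s` given only `x`.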

SLIDE 8

When can the ICA model be estimated?

◮ Must assume:
  ◮ The s_i are mutually statistically independent
  ◮ The s_i are nongaussian (non-normal)
  ◮ (Optional:) the number of independent components equals the number of observed variables
◮ Then: the mixing matrix and the components can be identified (Comon, 1994)
◮ A very surprising result!

SLIDE 9

Reminder: Principal component analysis

◮ Basic idea: find directions \sum_i w_i x_i of maximum variance
◮ We must constrain the norm of w, \sum_i w_i^2 = 1, otherwise the solution is that the w_i are infinite
◮ For more than one component, find the direction of maximum variance orthogonal to the components previously found
◮ Classic factor analysis has essentially the same idea as PCA: explain maximal variance with a limited number of components
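The points above can be sketched numerically: the principal directions are eigenvectors of the covariance matrix, which automatically satisfy the unit-norm and orthogonality constraints. The toy data below is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated toy data (500 samples of a 2-d variable), mean-centred.
x = rng.standard_normal((500, 2)) @ np.array([[3.0, 0.0], [1.0, 1.0]])
x -= x.mean(axis=0)

# Principal directions are eigenvectors of the covariance matrix;
# the constraint sum_i w_i^2 = 1 is automatic for eigenvectors.
C = np.cov(x, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order

w1 = eigvecs[:, -1]  # direction of maximum variance
w2 = eigvecs[:, -2]  # max variance among directions orthogonal to w1

print(np.var(x @ w1) >= np.var(x @ w2))  # True
print(abs(w1 @ w2) < 1e-10)              # True: orthogonal
```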

SLIDE 10

Comparison of ICA, PCA, and factor analysis

◮ In contrast to PCA and factor analysis, the components really give the original source signals or underlying hidden variables
◮ In PCA and factor analysis, only a subspace is properly determined (although an arbitrary basis is given as output)
◮ Catch: ICA only works when the components are nongaussian
◮ Many psychological or social-science hidden variables (e.g. "intelligence") may be (practically) gaussian, because they are sums of many independent variables (central limit theorem)
◮ But signals measured by sensors are usually quite nongaussian

SLIDE 11

Some examples of nongaussianity

[Figure: three example signals and their histograms]

SLIDE 12

Why classic methods cannot find original components or sources

◮ In PCA and FA: find components y_i which are uncorrelated,

cov(y_i, y_j) = E\{y_i y_j\} - E\{y_i\}E\{y_j\} = 0    (2)

and maximize explained variance (or the variance of the components)
◮ Such methods need only the covariances, cov(x_i, x_j)
◮ However, there are many different component sets that are uncorrelated, because:
  ◮ The number of covariances is ≈ n²/2, due to symmetry
  ◮ So we cannot solve for the n² mixing coefficients: not enough information! ("More variables than equations")

SLIDE 13

Nongaussianity, with independence, gives more information

◮ For independent variables we have

E\{h_1(y_1)h_2(y_2)\} - E\{h_1(y_1)\}E\{h_2(y_2)\} = 0    (3)

◮ For nongaussian variables, nonlinear covariances thus give more information than plain covariances
◮ This is not true for the multivariate gaussian distribution:
  ◮ The distribution is completely determined by the covariances
  ◮ Uncorrelated gaussian variables are independent
◮ ⇒ The ICA model cannot be estimated for gaussian data
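This contrast can be checked numerically. Below, a 45° rotation (an arbitrary, illustrative mixing) is applied to two independent sources. For Gaussian sources the rotated variables remain independent; for uniform sources they stay uncorrelated but become dependent, and a nonlinear covariance (here: covariance of squares, an assumed choice of h) exposes it:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
R = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)  # 45-degree rotation

def sq_cov(y):
    # A simple "nonlinear covariance": covariance of the squares.
    return np.cov(y[0] ** 2, y[1] ** 2)[0, 1]

# Gaussian sources: rotated variables stay uncorrelated AND independent.
yg = R @ rng.standard_normal((2, n))

# Uniform (non-Gaussian) sources: rotated variables are still
# uncorrelated, but dependent -- the nonlinear covariance reveals it.
yu = R @ rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))

print(np.cov(yg)[0, 1], sq_cov(yg))  # both close to 0
print(np.cov(yu)[0, 1], sq_cov(yu))  # ~0 and clearly nonzero
```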

SLIDE 14

Whitening as preprocessing for ICA

◮ Whitening is usually done before ICA
◮ Whitening means decorrelation and standardization: E\{xx^T\} = I
◮ After whitening, A can be considered orthogonal:

E\{xx^T\} = I = A E\{ss^T\} A^T = A A^T    (4)

◮ Half of the parameters are thereby estimated! (And there are other technical benefits.)
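A minimal sketch of PCA whitening, assuming illustrative Laplacian sources and an arbitrary mixing matrix: the whitening matrix V = D^{-1/2}E^T is built from the eigendecomposition of the sample covariance, after which the covariance of z = Vx is the identity:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.5], [0.3, 2.0]])
x = A @ rng.laplace(size=(2, 10_000))

# PCA whitening: V = D^{-1/2} E^T from the eigendecomposition of E{xx^T}.
C = (x @ x.T) / x.shape[1]
d, E = np.linalg.eigh(C)
V = np.diag(d ** -0.5) @ E.T
z = V @ x

# After whitening the covariance is the identity, so the remaining
# mixing matrix (VA) is orthogonal: only a rotation is left to estimate.
print(np.round((z @ z.T) / z.shape[1], 6))  # identity matrix
```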

SLIDE 15

Illustration

Two components with uniform distributions: original components, observed mixtures, PCA, ICA. PCA does not find the original coordinates; ICA does!

SLIDE 16

Illustration of problem with gaussian distributions

Original components, observed mixtures, PCA. The distribution after PCA is the same as the distribution before mixing! This is the "factor rotation problem" of classic factor analysis.

SLIDE 17

Basic intuitive principle of ICA estimation

◮ Inspired by the Central Limit Theorem:
  ◮ An average of many independent random variables will have a distribution that is close(r) to gaussian
  ◮ In the limit of an infinite number of random variables, the distribution tends to a gaussian
◮ Consider a linear combination \sum_i w_i x_i = \sum_i q_i s_i
◮ Because of the theorem, \sum_i q_i s_i should be more gaussian than the individual s_i
◮ By maximizing the nongaussianity of \sum_i w_i x_i, we can thus find the s_i
◮ Also known as projection pursuit
◮ Cf. principal component analysis: maximize the variance of \sum_i w_i x_i

SLIDE 18

Illustration of changes in nongaussianity

[Figure: histogram and scatterplot of the original uniform distributions]

[Figure: histogram and scatterplot of the mixtures given by PCA]

SLIDE 19

Sparsity is the dominant form of non-Gaussianity

◮ In natural signals, the fundamental form of non-Gaussianity is sparsity
◮ Sparsity = the probability density has heavy tails and a peak at zero:

[Figure: gaussian vs. sparse probability densities]

◮ (Another form of non-Gaussianity is skewness, i.e. asymmetry)

SLIDE 20

Kurtosis as nongaussianity measure

◮ Problem: how to measure nongaussianity (sparsity)?
◮ Definition:

kurt(x) = E\{x^4\} - 3(E\{x^2\})^2    (5)

◮ If the variance is constrained to unity, this is essentially the 4th moment
◮ Simple algebraic properties because it is a cumulant (for independent s_1, s_2):

kurt(s_1 + s_2) = kurt(s_1) + kurt(s_2)    (6)
kurt(\alpha s_1) = \alpha^4 kurt(s_1)    (7)

◮ Zero for a gaussian RV, non-zero for most nongaussian RVs
◮ Positive vs. negative kurtosis correspond to typical forms of pdf
◮ The variance must be constrained for kurtosis to measure non-Gaussianity
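Definition (5) and the scaling property (7) can be checked directly on samples; the three distributions below (all unit variance) are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

def kurt(x):
    # kurt(x) = E{x^4} - 3 (E{x^2})^2, for zero-mean x
    return np.mean(x ** 4) - 3 * np.mean(x ** 2) ** 2

g = rng.standard_normal(n)                       # gaussian
lap = rng.laplace(scale=1 / np.sqrt(2), size=n)  # laplacian, unit variance
u = rng.uniform(-np.sqrt(3), np.sqrt(3), n)      # uniform, unit variance

print(kurt(g))    # ~ 0
print(kurt(lap))  # ~ +3 (supergaussian)
print(kurt(u))    # ~ -1.2 (subgaussian)

# The scaling property kurt(alpha*s) = alpha^4 kurt(s) holds exactly,
# even on finite samples:
print(np.isclose(kurt(2 * u), 16 * kurt(u)))  # True
```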

SLIDE 21

Illustration of positive and negative kurtosis

Left: Laplacian pdf, positive kurtosis ("supergaussian"). Right: uniform pdf, negative kurtosis ("subgaussian").

SLIDE 22

Why kurtosis is not optimal

◮ Sensitive to outliers: consider a sample of 1000 values with unit variance, in which one value equals 10. The kurtosis then equals at least 10^4/1000 − 3 = 7.
◮ For supergaussian variables, statistical performance is not optimal even without outliers
◮ Other measures of nongaussianity should therefore be considered

SLIDE 23

Differential entropy as nongaussianity measure

◮ Generalization of the ordinary discrete Shannon entropy:

H(x) = E\{-\log p(x)\}    (8)

◮ For fixed variance, maximized by the gaussian distribution
◮ Often normalized to give the negentropy

J(x) = H(x_{gauss}) - H(x)    (9)

◮ Good statistical properties, but computationally difficult

SLIDE 24

Approximation of negentropy

◮ Approximations of negentropy (Hyvärinen, 1998):

J_G(x) = (E\{G(x)\} - E\{G(x_{gauss})\})^2    (10)

where G is a nonquadratic function
◮ Generalization of the (square of the) kurtosis (which corresponds to G(x) = x^4)
◮ A good compromise?
  ◮ Statistical properties not bad (for a suitable choice of G)
  ◮ Computationally simple
◮ Further possibility: if we know the data is sparse, the negative of the L1 norm, −E\{|x|\}, may be enough
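A sketch of approximation (10), assuming the common choice G(x) = log cosh(x) (the specific G and test distributions are illustrative, not prescribed by the slide). The gaussian reference term is itself estimated by sampling:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

def G(x):
    # A common nonquadratic choice: G(x) = log cosh(x).
    return np.log(np.cosh(x))

# E{G(x_gauss)} estimated once from a standard gaussian sample.
gauss_ref = np.mean(G(rng.standard_normal(n)))

def negentropy_approx(x):
    x = (x - x.mean()) / x.std()  # the measure assumes unit variance
    return (np.mean(G(x)) - gauss_ref) ** 2

print(negentropy_approx(rng.standard_normal(n)))  # ~ 0
print(negentropy_approx(rng.laplace(size=n)))     # clearly positive
```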

SLIDE 25

Basic ICA estimation procedure

1. Whiten the data to give z.
2. Set the iteration count i = 1.
3. Take a random vector w_i.
4. Maximize the nongaussianity of w_i^T z, under the constraints \|w_i\| = 1 and w_i^T w_j = 0 for j < i.
5. Increment the iteration count i by 1; go back to step 3.

Alternatively: maximize over all the w_i in parallel, keeping them orthogonal.
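A minimal sketch of this deflation procedure on toy data (two uniform sources and an arbitrary mixing matrix, both assumptions of the example). Step 4 is implemented with the FastICA-style fixed-point update mentioned later in the lecture, using g = tanh:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two uniform (sub-gaussian) sources, linearly mixed.
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, 20_000))
x = np.array([[1.0, 0.5], [0.3, 2.0]]) @ s

# Step 1: whiten the data to give z.
d, E = np.linalg.eigh(np.cov(x))
z = np.diag(d ** -0.5) @ E.T @ x

# Steps 2-5: estimate one w_i at a time, maximizing nongaussianity of
# w_i^T z with a FastICA-style fixed-point update (g = tanh), under
# ||w_i|| = 1 and orthogonality to the previously found w_j.
W = []
for _ in range(2):
    w = rng.standard_normal(2)
    w /= np.linalg.norm(w)
    for _ in range(200):
        y = w @ z
        w_new = (z * np.tanh(y)).mean(axis=1) - (1 - np.tanh(y) ** 2).mean() * w
        for wj in W:                    # deflation: enforce w_i^T w_j = 0
            w_new -= (w_new @ wj) * wj
        w_new /= np.linalg.norm(w_new)  # enforce ||w_i|| = 1
        converged = abs(w_new @ w) > 1 - 1e-10
        w = w_new
        if converged:
            break
    W.append(w)

y = np.vstack(W) @ z  # recovered components, up to sign and order
```

The recovered rows of `y` match the original sources up to sign and permutation, the well-known indeterminacies of ICA.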

SLIDE 26

Development of ICA algorithms

◮ Nongaussianity measure: the essential ingredient
  ◮ Kurtosis: global consistency, but nonrobust
  ◮ Differential entropy: statistically justified, but difficult to compute
    ◮ Essentially the same as likelihood (Pham et al, 1992/97) or infomax (Bell and Sejnowski, 1995)
  ◮ Rough approximations of entropy: a compromise
◮ Optimization methods
  ◮ Gradient methods (e.g. natural gradient; Amari et al, 1996)
  ◮ Fast fixed-point algorithm, FastICA (Hyvärinen, 1999)

SLIDE 27

Conclusion: Theory of ICA

◮ ICA is a non-Gaussian linear generative model (a form of factor analysis)
◮ Basic principle: maximize the non-Gaussianity of the components
◮ (Really very different from PCA, which maximizes the variance of the components)
◮ Sparsity is a form of non-Gaussianity prevalent in natural signals
◮ Measures of non-Gaussianity are crucial: kurtosis vs. differential entropy

SLIDE 28

Part II: Natural images and ICA

◮ Natural images have statistical regularities
◮ Statistical models show optimal processing
◮ The basic model is independent component analysis
◮ The components are not really independent
  ⇒ Need and opportunity for better models
◮ Instead of nongaussianity, we could use temporal correlations
◮ A unifying framework: bubbles

SLIDE 29

Linear statistical models of images

[Figure: an image patch written as s_1·(basis image 1) + s_2·(basis image 2) + ··· + s_k·(basis image k)]

◮ Each image (patch) is a linear sum of basis vectors (features)
◮ What are the "best" basis vectors for natural images?

SLIDE 30

The visual cortex of the brain

[Figure: visual pathway — retina → LGN → V1]

Receptive field of a simple cell in V1: [Figure]

SLIDE 31

Sparse coding

◮ Sparse coding means: for a random vector x, find a linear representation

x = As    (11)

such that the components s_i are as sparse (= supergaussian) as possible
◮ Important property: a given data point is represented using only a limited number of "active" (clearly non-zero) components s_i
◮ In contrast to PCA, the active components change from one image patch to another
◮ Cf. the vocabulary of a language, which can describe many different things by combining a small number of active words
◮ Maximizes non-Gaussianity, and is therefore like ICA!

SLIDE 32

ICA / sparse coding of natural images

(Olshausen and Field, 1996; Bell and Sejnowski, 1997)

Features similar to wavelets, Gabor functions, simple cells.


SLIDE 33

Dependence of “independent” components

◮ Components estimated from natural images are not really independent
◮ Next, we model some of the dependencies
◮ Independent subspaces + topographic ICA

SLIDE 34

Correlation of squares

◮ What kind of dependence remains between the components?
◮ Answer: the squares s_i^2 and s_j^2 are correlated inside a subspace
  ⇒ Dependence through variances
◮ Similar to the models by Simoncelli et al on wavelet coefficients, and by Valpola et al on variance sources

[Figure: two signals that are uncorrelated but whose squares are correlated]
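Such variance dependence is easy to simulate: a common "variance signal" (here an assumed uniform modulator) multiplies two otherwise independent gaussian signals, making them uncorrelated yet energy-correlated:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# A common variance signal v modulates two otherwise independent
# gaussian signals, giving dependence through variances.
v = rng.uniform(0.1, 2.0, n)
s1 = v * rng.standard_normal(n)
s2 = v * rng.standard_normal(n)

print(np.corrcoef(s1, s2)[0, 1])            # ~ 0: uncorrelated
print(np.corrcoef(s1 ** 2, s2 ** 2)[0, 1])  # clearly positive
```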

SLIDE 35

Grouping components (Cardoso, 1998; Hyvärinen and Hoyer, 2000)

◮ Assumption: the s_i can be divided into groups (subspaces), such that
  ◮ the s_i in the same group are dependent on each other
  ◮ dependencies between different groups are not allowed
◮ We also need to specify the distributions inside the groups, achieving correlations of squares
◮ Invariant features are given by the norms of the projections on the subspaces ⇒ spherically symmetric distribution inside subspaces
◮ Inside each subspace, minimize the (expected) norm

\sqrt{\sum_{i=1}^{k} (w_i^T x)^2}    (12)

SLIDE 36

Computation of invariant features

[Figure: network computing invariant features — the input I is projected onto the vectors w_1, …, w_8; each projection ⟨w_i, I⟩ is squared, and the squares are summed (Σ) within each subspace]

SLIDE 37

Independent subspaces of natural images

Emergence of phase-invariance, as in complex cells in V1.


SLIDE 38

Topographic ICA (Hyvärinen, Hoyer and Inki, 2001)

◮ Components are arranged on a two-dimensional lattice
◮ Statistical dependency follows the topography: the squares s_i^2 are correlated for nearby components
◮ Each local region is like a subspace

SLIDE 39

Topographic ICA on natural images

Topography similar to what is found in the cortex.


SLIDE 40

Temporally coherent components (Hurri and Hyvärinen, 2003)

◮ In image sequences (video) we can look at temporal correlations
◮ An alternative to nongaussianity
◮ Linear correlations give only Fourier-like receptive fields
◮ We proposed temporal correlations of squares
◮ Similar to source separation using nonstationary variance (Matsuoka et al, 1995)

SLIDE 41

Temporally coherent features on natural image sequences

Features similar to those obtained by ICA.

SLIDE 42

Bubbles: a unifying framework

◮ Correlation of squares both over time and over components
  ⇒ spatiotemporal modulating variance variables
◮ A simple objective can be obtained by pooling squares over space and time:

\sum_{t=1}^{T} \sum_{j=1}^{n} G\left( \sum_{i=1}^{n} \sum_{\tau} h(i,j,\tau)\, (w_i^T x(t-\tau))^2 \right)    (13)

where h(i, j, \tau) is a neighbourhood function and G is a nonlinear function.

SLIDE 43

Illustration of four types of representation

[Figure: four panels (filter position vs. time) illustrating sparse, sparse topographic, sparse temporally coherent, and bubble representations]

SLIDE 44

Conclusion: Natural images and ICA

◮ ICA is a non-Gaussian factor analysis
◮ Basic principle: maximize the non-Gaussianity of the components
◮ Measures of non-Gaussianity are crucial: kurtosis vs. differential entropy
◮ ICA and related models show optimal features for natural images
◮ ICA models basic linear features
◮ Independent subspaces and topographic ICA model basic dependencies or nonlinearities
◮ Temporal coherence is an alternative approach, leading to bubbles