ADVANCED MACHINE LEARNING: Kernel Canonical Correlation Analysis


SLIDE 1

MACHINE LEARNING Kernel Canonical Correlation Analysis

SLIDE 2

Structure of today’s and next week’s class

1) Briefly go through one extension of principal component analysis, namely Canonical Correlation Analysis (CCA).
2) Derive the non-linear version of CCA, kernel CCA (kCCA).
3) Do an exercise to understand the modulation of the space generated by CCA and kCCA.

SLIDE 3

Canonical Correlation Analysis (CCA)

[Figure: two descriptions of the same objects, a video description and an audio description, giving paired datapoints $(x^1, y^1), (x^2, y^2), \ldots$ with $x \in \mathbb{R}^{N_x}$ and $y \in \mathbb{R}^{N_y}$.]

$$\max_{w_x, w_y} \operatorname{corr}\left(w_x^T x,\; w_y^T y\right)$$

Determine features in two (or more) separate descriptions of the dataset that best explain each datapoint.

Extract the hidden structure that maximizes correlation across two different projections.

SLIDE 4

Canonical Correlation Analysis (CCA)

$$\max_{w_x, w_y} \operatorname{corr}\left(w_x^T x,\; w_y^T y\right)$$

Pair of multidimensional zero-mean variables $(x^1, y^1), (x^2, y^2), \ldots$; we have $M$ instances of the pairs:

$$X = \left\{x^i\right\}_{i=1}^{M} \subset \mathbb{R}^{N_x}, \qquad Y = \left\{y^i\right\}_{i=1}^{M} \subset \mathbb{R}^{N_y}$$

Search for two projections $w_x$ and $w_y$, $z_x = w_x^T X$ and $z_y = w_y^T Y$, solutions of:

$$\max_{w_x, w_y} \operatorname{corr}\left(z_x, z_y\right)$$

SLIDE 5

Canonical Correlation Analysis (CCA)

With $X$ and $Y$ zero mean, i.e. $E[X] = E[Y] = 0$:

$$\max_{w_x, w_y} \operatorname{corr}\left(w_x^T x,\; w_y^T y\right) = \max_{w_x, w_y} \frac{w_x^T E\left[X Y^T\right] w_y}{\sqrt{w_x^T C_{xx} w_x}\,\sqrt{w_y^T C_{yy} w_y}} = \max_{w_x, w_y} \frac{w_x^T C_{xy} w_y}{\sqrt{w_x^T C_{xx} w_x}\,\sqrt{w_y^T C_{yy} w_y}}$$

Search for two projections $w_x$ and $w_y$, $z_x = w_x^T X$ and $z_y = w_y^T Y$, solutions of:

$$\max_{w_x, w_y} \operatorname{corr}\left(z_x, z_y\right)$$

SLIDE 6

Canonical Correlation Analysis (CCA)

$$\max_{w_x, w_y} \frac{w_x^T C_{xy} w_y}{\sqrt{w_x^T C_{xx} w_x}\,\sqrt{w_y^T C_{yy} w_y}}$$

With $X$ and $Y$ zero mean, i.e. $E[X] = E[Y] = 0$.

Covariance matrices: $C_{xx} = E\left[X X^T\right]$ of size $N_x \times N_x$, and $C_{yy} = E\left[Y Y^T\right]$ of size $N_y \times N_y$.

The cross-covariance matrix $C_{xy}$, of size $N_x \times N_y$, measures the cross-correlation between $X$ and $Y$.

SLIDE 7

Canonical Correlation Analysis (CCA)

$$\max_{w_x, w_y} \frac{w_x^T C_{xy} w_y}{\sqrt{w_x^T C_{xx} w_x}\,\sqrt{w_y^T C_{yy} w_y}}$$

The correlation is not affected by rescaling the norm of the vectors, so we can require that $w_x^T C_{xx} w_x = w_y^T C_{yy} w_y = 1$:

$$\max_{w_x, w_y} w_x^T C_{xy} w_y \quad \text{u.c.} \quad w_x^T C_{xx} w_x = w_y^T C_{yy} w_y = 1$$

SLIDE 8

Canonical Correlation Analysis (CCA)

$$\max_{w_x, w_y} w_x^T C_{xy} w_y \quad \text{u.c.} \quad w_x^T C_{xx} w_x = w_y^T C_{yy} w_y = 1$$

To determine the optimum (maximum), solve by Lagrange:

$$L\left(\lambda_x, \lambda_y, w_x, w_y\right) = w_x^T C_{xy} w_y - \frac{\lambda_x}{2}\left(w_x^T C_{xx} w_x - 1\right) - \frac{\lambda_y}{2}\left(w_y^T C_{yy} w_y - 1\right)$$

and take the partial derivatives over $w_x$ and $w_y$.
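Writing the derivatives out (a short step the slide compresses): $\partial L / \partial w_x = C_{xy} w_y - \lambda_x C_{xx} w_x = 0$ and $\partial L / \partial w_y = C_{yx} w_x - \lambda_y C_{yy} w_y = 0$. Left-multiplying these by $w_x^T$ and $w_y^T$ respectively, and using the constraints $w_x^T C_{xx} w_x = w_y^T C_{yy} w_y = 1$, gives $\lambda_x = w_x^T C_{xy} w_y = w_y^T C_{yx} w_x = \lambda_y$, so a single multiplier $\lambda := \lambda_x = \lambda_y$ suffices.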

SLIDE 9

Canonical Correlation Analysis (CCA)

Replacing $\lambda := \lambda_x = \lambda_y$ and writing out the set of equations gives:

$$C_{xy} w_y = \lambda\, C_{xx} w_x, \qquad C_{yx} w_x = \lambda\, C_{yy} w_y,$$

which can be rewritten as:

$$C_{yx} C_{xx}^{-1} C_{xy} w_y = \lambda^2\, C_{yy} w_y$$

This is a generalized eigenvalue problem; it can be reduced to a classical eigenvalue problem if $C_{xx}$ is invertible. Solving for $w_y$ gives the equation above; if $C_{yy}$ is also invertible, it becomes a classical eigenvalue problem, and likewise for $w_x$.

These two eigenvalue problems yield $q$ pairs of vectors $\left(w_x^i, w_y^i\right)$, $i = 1, \ldots, q$, with $q \le \min\left(N_x, N_y\right)$.
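To make the two reduced eigenvalue problems concrete, here is a minimal numpy sketch of linear CCA (an illustration, not part of the original slides; function and variable names are mine, and the scaling of the returned vectors is left unnormalized):

```python
import numpy as np

def linear_cca(X, Y, n_components=1):
    """Linear CCA. X: (Nx, M) and Y: (Ny, M), one datapoint per column."""
    M = X.shape[1]
    X = X - X.mean(axis=1, keepdims=True)  # enforce the zero-mean assumption
    Y = Y - Y.mean(axis=1, keepdims=True)
    Cxx, Cyy, Cxy = X @ X.T / M, Y @ Y.T / M, X @ Y.T / M
    # Reduced problem: C_yy^{-1} C_yx C_xx^{-1} C_xy w_y = lambda^2 w_y
    Mat = np.linalg.solve(Cyy, Cxy.T) @ np.linalg.solve(Cxx, Cxy)
    vals, vecs = np.linalg.eig(Mat)
    order = np.argsort(-vals.real)[:n_components]
    lam = np.sqrt(np.clip(vals.real[order], 0.0, None))  # canonical correlations
    Wy = vecs[:, order].real
    # Recover w_x from C_xy w_y = lambda C_xx w_x
    Wx = np.linalg.solve(Cxx, Cxy @ Wy) / np.maximum(lam, 1e-12)
    return Wx, Wy, lam
```

With finite data, Cxx or Cyy may be ill-conditioned; in practice one adds a small ridge term before inverting, which foreshadows the regularization used for kernel CCA below.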

SLIDE 10

CCA: Exercise I

[Figure: four points X1, X2, X3, X4 plotted in the X space and their counterparts Y1, Y2, Y3, Y4 in the Y space.]

Consider the example below of a dataset of 4 points with 2-dimensional coordinates in both X and Y.

  • Determine by hand the directions found by CCA in each space.
  • Contrast to the directions found by PCA.
SLIDE 11

Kernel Canonical Correlation Analysis

CCA finds basis vectors such that the correlation between the projections (of all datapoints in X and Y) is mutually maximized. CCA is a generalized version of PCA for two or more multi-dimensional datasets, but unlike PCA it does not have the constraint of finding orthogonal vectors. CCA assumes a linear correlation; if the correlation is non-linear → kernel CCA.

SLIDE 12

Kernel Canonical Correlation Analysis (kCCA)

[Figure: paired data from two descriptions (video and audio), $(x^1, y^1), (x^2, y^2), \ldots$ with $x \in \mathbb{R}^{N_x}$ and $y \in \mathbb{R}^{N_y}$.]

$$\max_{w_x, w_y} \operatorname{corr}\left(w_x^T \phi_x(x),\; w_y^T \phi_y(y)\right)$$

Assume two transformations $\phi_x$ and $\phi_y$, and then perform the correlation analysis across the two feature spaces.

SLIDE 13

From CCA to Kernel CCA

$$X = \left\{x^i\right\}_{i=1}^{M} \subset \mathbb{R}^{N_x}, \qquad Y = \left\{y^i\right\}_{i=1}^{M} \subset \mathbb{R}^{N_y}$$

Send the data into two separate feature spaces, $\left\{\phi_x\left(x^i\right)\right\}_{i=1}^{M}$ for the data in X and $\left\{\phi_y\left(y^i\right)\right\}_{i=1}^{M}$ for the data in Y, with $\sum_{i=1}^{M} \phi_x\left(x^i\right) = 0$ and $\sum_{i=1}^{M} \phi_y\left(y^i\right) = 0$ (zero mean in feature space).

Construct the associated kernel matrices:

$$K_x = F_x^T F_x, \qquad K_y = F_y^T F_y,$$

where the columns of $F_x$, $F_y$ are the $\phi_x\left(x^i\right)$, $\phi_y\left(y^i\right)$.
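As a concrete illustration (a sketch, not from the slides; the RBF kernel and the width sigma are example choices), the Gram matrices with the zero-mean-in-feature-space assumption enforced by centering:

```python
import numpy as np

def rbf_gram(A, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||a_i - a_j||^2 / (2 sigma^2)), rows of A: (M, d)."""
    sq = np.sum(A**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (A @ A.T), 0.0)
    return np.exp(-d2 / (2.0 * sigma**2))

def center_gram(K):
    """Centering in feature space: K <- H K H with H = I - (1/M) 1 1^T."""
    M = K.shape[0]
    H = np.eye(M) - np.ones((M, M)) / M
    return H @ K @ H

# Kx = center_gram(rbf_gram(X, sigma=0.5))  # X: (M, Nx), one datapoint per row
# Ky = center_gram(rbf_gram(Y, sigma=0.5))  # Y: (M, Ny)
```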

SLIDE 14

From CCA to Kernel CCA

Express the projection vectors as a linear combination of the images of the datapoints in feature space (as in kPCA):

$$w_x = F_x \alpha_x = \sum_{i=1}^{M} \alpha_x^i\, \phi_x\left(x^i\right) \qquad \text{and} \qquad w_y = F_y \alpha_y = \sum_{i=1}^{M} \alpha_y^i\, \phi_y\left(y^i\right)$$

In linear CCA, we were solving for:

$$\max_{w_x, w_y} w_x^T C_{xy} w_y \quad \text{u.c.} \quad w_x^T C_{xx} w_x = w_y^T C_{yy} w_y = 1$$

Replace the covariance and cross-covariance matrices by the products of the projection matrices in feature space (as in kPCA):

$$C_{xx} = F_x F_x^T, \qquad C_{yy} = F_y F_y^T, \qquad C_{xy} = F_x F_y^T$$

In kernel CCA, we then solve for:

$$\max_{\alpha_x, \alpha_y} \alpha_x^T \underbrace{F_x^T F_x}_{K_x} \underbrace{F_y^T F_y}_{K_y} \alpha_y \quad \text{u.c.} \quad \alpha_x^T \underbrace{F_x^T F_x}_{K_x} \underbrace{F_x^T F_x}_{K_x} \alpha_x = \alpha_y^T \underbrace{F_y^T F_y}_{K_y} \underbrace{F_y^T F_y}_{K_y} \alpha_y = 1$$
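One line of algebra shows why only the kernel matrices are needed: substituting $w_x = F_x \alpha_x$ and $w_y = F_y \alpha_y$ into the linear objective gives

$$w_x^T C_{xy} w_y = \alpha_x^T F_x^T \left(F_x F_y^T\right) F_y\, \alpha_y = \alpha_x^T \left(F_x^T F_x\right)\left(F_y^T F_y\right) \alpha_y = \alpha_x^T K_x K_y\, \alpha_y,$$

and similarly $w_x^T C_{xx} w_x = \alpha_x^T K_x^2\, \alpha_x$ and $w_y^T C_{yy} w_y = \alpha_y^T K_y^2\, \alpha_y$.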

SLIDE 15

Kernel CCA

In summary, in kernel CCA we search for the projection vectors $w_x, w_y$ (which live in feature space) so as to maximize:

$$\max_{w_x, w_y} \operatorname{corr}\left(w_x^T \phi_x(x),\; w_y^T \phi_y(y)\right)$$

$$\max_{\alpha_x, \alpha_y} \alpha_x^T K_x K_y\, \alpha_y \quad \text{u.c.} \quad \alpha_x^T K_x^2\, \alpha_x = \alpha_y^T K_y^2\, \alpha_y = 1$$

This is again a generalized eigenvalue problem, with $\alpha_x, \alpha_y$ the dual eigenvectors (as the dual vectors in kPCA); see the documentation in the annexes for the derivation.

Generalized eigenvalue problem:

$$\begin{pmatrix} 0 & K_x K_y \\ K_y K_x & 0 \end{pmatrix}\begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix} = \lambda \begin{pmatrix} K_x^2 & 0 \\ 0 & K_y^2 \end{pmatrix}\begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix}$$

SLIDE 16

Kernel CCA

Generalized eigenvalue problem:

$$\begin{pmatrix} 0 & K_x K_y \\ K_y K_x & 0 \end{pmatrix}\begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix} = \lambda \begin{pmatrix} K_x^2 & 0 \\ 0 & K_y^2 \end{pmatrix}\begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix}$$

$$\max_{\alpha_x, \alpha_y} \alpha_x^T K_x K_y\, \alpha_y \quad \text{u.c.} \quad \alpha_x^T K_x^2\, \alpha_x = \alpha_y^T K_y^2\, \alpha_y = 1$$

If the intersection between the spaces spanned by $K_x \alpha_x$ and $K_y \alpha_y$ is non-zero, then the problem has a trivial solution, as $\lambda \sim \cos\left(K_x \alpha_x, K_y \alpha_y\right) = 1$ (see the solution to the exercises). Intuitively, when $K_x$ is invertible (e.g. an RBF Gram matrix on distinct points), any $\alpha_y$ admits $\alpha_x = K_x^{-1} K_y \alpha_y$ with $K_x \alpha_x = K_y \alpha_y$, i.e. perfectly correlated projections.

SLIDE 17

Kernel CCA

Generalized eigenvalue problem:

$$\begin{pmatrix} 0 & K_x K_y \\ K_y K_x & 0 \end{pmatrix}\begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix} = \lambda \begin{pmatrix} K_x^2 & 0 \\ 0 & K_y^2 \end{pmatrix}\begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix}$$

Add a regularization term to increase the rank of the matrix and make it invertible (to avoid the trivial solution):

$$K_x^2 \;\rightarrow\; \left(K_x + \frac{M\epsilon}{2} I\right)^2, \qquad K_y^2 \;\rightarrow\; \left(K_y + \frac{M\epsilon}{2} I\right)^2$$

Several methods have been proposed to choose carefully the regularization term so as to get projections that are as close as possible to the “true” projections.

SLIDE 18

Kernel CCA

$$\underbrace{\begin{pmatrix} 0 & K_x K_y \\ K_y K_x & 0 \end{pmatrix}}_{A}\begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix} = \lambda \underbrace{\begin{pmatrix} \left(K_x + \frac{M\epsilon}{2} I\right)^2 & 0 \\ 0 & \left(K_y + \frac{M\epsilon}{2} I\right)^2 \end{pmatrix}}_{B}\begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix}$$

This becomes a classical eigenvalue problem: set $B = C C^T$ (e.g. by Cholesky decomposition) and $\beta = C^T \alpha$, which gives

$$C^{-1} A\, C^{-T} \beta = \lambda\, \beta$$
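A minimal numpy/scipy sketch of this reduction (an illustration, not from the slides; eps stands for the regularization constant $\epsilon$ above, and the Gram matrices are assumed centered):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular, eigh

def kcca(Kx, Ky, eps=0.1, n_components=1):
    """Regularized two-view kernel CCA. Kx, Ky: centered (M, M) Gram matrices."""
    M = Kx.shape[0]
    R = (M * eps / 2.0) * np.eye(M)
    A = np.block([[np.zeros((M, M)), Kx @ Ky],
                  [Ky @ Kx, np.zeros((M, M))]])
    B = np.block([[(Kx + R) @ (Kx + R), np.zeros((M, M))],
                  [np.zeros((M, M)), (Ky + R) @ (Ky + R)]])
    # B = C C^T (Cholesky), beta = C^T alpha  =>  C^{-1} A C^{-T} beta = lambda beta
    C = cholesky(B, lower=True)
    T = solve_triangular(C, A, lower=True)         # C^{-1} A
    S = solve_triangular(C, T.T, lower=True).T     # C^{-1} A C^{-T} (symmetric)
    vals, betas = eigh((S + S.T) / 2.0)            # classical symmetric eigenproblem
    top = betas[:, ::-1][:, :n_components]         # eigh sorts eigenvalues ascending
    alphas = solve_triangular(C.T, top, lower=False)   # alpha = C^{-T} beta
    return vals[::-1][:n_components], alphas[:M], alphas[M:]
```

Equivalently, scipy.linalg.eigh(A, B) solves the generalized problem directly; the explicit Cholesky route is shown because it mirrors the reduction on the slide.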

SLIDE 19

Kernel CCA

Can be extended to multiple datasets.

Two-dataset case: $X = \left\{x^i\right\}_{i=1}^{M} \subset \mathbb{R}^{N_x}$, $Y = \left\{y^i\right\}_{i=1}^{M} \subset \mathbb{R}^{N_y}$.

$L$ datasets $X^1, \ldots, X^L$ with $M$ observations each; dimensions $N_1, \ldots, N_L$, i.e. $X^i: N_i \times M$.

Applying non-linear transformations $\phi_1, \ldots, \phi_L$ to $X^1, \ldots, X^L$, construct the Gram matrices $K_1, \ldots, K_L$.

SLIDE 20

Kernel CCA

Can be extended to multiple datasets.

$L$ datasets $X^1, \ldots, X^L$ with $M$ observations each; dimensions $N_1, \ldots, N_L$, i.e. $X^i: N_i \times M$. The regularized generalized eigenvalue problem becomes:

$$\begin{pmatrix} 0 & K_1 K_2 & \cdots & K_1 K_L \\ K_2 K_1 & 0 & \cdots & K_2 K_L \\ \vdots & & \ddots & \vdots \\ K_L K_1 & K_L K_2 & \cdots & 0 \end{pmatrix}\begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_L \end{pmatrix} = \lambda \begin{pmatrix} \left(K_1 + \frac{M\epsilon}{2} I\right)^2 & & 0 \\ & \ddots & \\ 0 & & \left(K_L + \frac{M\epsilon}{2} I\right)^2 \end{pmatrix}\begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_L \end{pmatrix}$$

(For the two-dataset case, $X = \left\{x^i\right\}_{i=1}^{M}$, $Y = \left\{y^i\right\}_{i=1}^{M}$, this reduces to the problem of the previous slide.)

SLIDE 21

Kernel CCA

Can be extended to multiple datasets (MKCCA).

The two-dataset case, $X = \left\{x^i\right\}_{i=1}^{M}$, $Y = \left\{y^i\right\}_{i=1}^{M}$, had as solution the generalized eigenvalue problem:

$$\underbrace{\begin{pmatrix} 0 & K_x K_y \\ K_y K_x & 0 \end{pmatrix}}_{A}\begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix} = \lambda \underbrace{\begin{pmatrix} \left(K_x + \frac{M\epsilon}{2} I\right)^2 & 0 \\ 0 & \left(K_y + \frac{M\epsilon}{2} I\right)^2 \end{pmatrix}}_{B}\begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix}$$

For $L$ datasets, this generalizes to:

$$\begin{pmatrix} 0 & K_1 K_2 & \cdots & K_1 K_L \\ K_2 K_1 & 0 & \cdots & K_2 K_L \\ \vdots & & \ddots & \vdots \\ K_L K_1 & K_L K_2 & \cdots & 0 \end{pmatrix}\begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_L \end{pmatrix} = \lambda \begin{pmatrix} \left(K_1 + \frac{M\epsilon}{2} I\right)^2 & & 0 \\ & \ddots & \\ 0 & & \left(K_L + \frac{M\epsilon}{2} I\right)^2 \end{pmatrix}\begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_L \end{pmatrix}$$
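A sketch of the corresponding solver for $L$ centered Gram matrices (an illustration, not from the slides; it reuses the same regularization constant eps):

```python
import numpy as np
from scipy.linalg import eigh

def mkcca(Ks, eps=0.1, n_components=1):
    """Multiple-set kernel CCA. Ks: list of L centered (M, M) Gram matrices."""
    L, M = len(Ks), Ks[0].shape[0]
    R = (M * eps / 2.0) * np.eye(M)
    A = np.zeros((L * M, L * M))
    B = np.zeros((L * M, L * M))
    for i in range(L):
        B[i*M:(i+1)*M, i*M:(i+1)*M] = (Ks[i] + R) @ (Ks[i] + R)
        for j in range(L):
            if i != j:
                A[i*M:(i+1)*M, j*M:(j+1)*M] = Ks[i] @ Ks[j]
    vals, vecs = eigh(A, B)                 # generalized symmetric eigenproblem
    idx = np.argsort(vals)[::-1][:n_components]
    # Each eigenvector stacks the L dual vectors alpha_1, ..., alpha_L
    return vals[idx], [vecs[l*M:(l+1)*M, idx] for l in range(L)]
```

With L = 2 this reduces to the two-view problem of the previous slides.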

SLIDE 22

CCA: Exercise II

Consider the following kernel matrices:

$$K_x = \begin{pmatrix} 1 & 0.5 & 0 \\ 0.5 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad K_y = \begin{pmatrix} 1 & 0.8 & 0 \\ 0.8 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

a) How many datapoints do you have? What are the dimensions of the two feature spaces?

b) Assume an RBF kernel with the same kernel width for $K_x$ and $K_y$. Draw the distribution of points and give the shape of the dual vectors $\alpha_x$ and $\alpha_y$, solutions of $\max_{\alpha_x, \alpha_y} \alpha_x^T K_x K_y\, \alpha_y$ with $\alpha_x^T K_x^2\, \alpha_x = \alpha_y^T K_y^2\, \alpha_y = 1$.

c) What is the effect of changing the kernel width on $K_x$ and $K_y$, and on $\alpha_x$ and $\alpha_y$?

d) Do (b) when considering a polynomial kernel, assuming the same distribution of points as for the RBF kernel. What are the Gram matrices?

SLIDE 23

CCA: Exercise III

[Figure: the same four points X1, X2, X3, X4 and Y1, Y2, Y3, Y4 as in Exercise I.]

Consider the example below of a dataset of 4 points with 2-dimensional coordinates in both X and Y.

  • Give the shape of the kernel matrices and of the dual eigenvectors, and draw the isolines, when considering an RBF kernel.
  • Do the same with a polynomial kernel.
SLIDE 24

Applications of Kernel CCA

Kernel matrices K1, K2 and K3 correspond to gene-gene similarities in pathways, genome position, and microarray expression data, respectively. An RBF kernel with fixed kernel width is used. Goal: to measure the correlation between heterogeneous datasets and to extract sets of genes which share similarities with respect to multiple biological attributes.

Y. Yamanishi, J.-P. Vert, A. Nakaya and M. Kanehisa, Extraction of correlated gene clusters from multiple genomic data by generalized kernel canonical correlation analysis, Bioinformatics, 2003.

Correlation scores in MKCCA: pathway vs. genome vs. expression.

SLIDE 25

Applications of Kernel CCA

Goal: To measure correlation between heterogeneous datasets and to extract sets of genes which share similarities with respect to multiple biological attributes

Y. Yamanishi, J.-P. Vert, A. Nakaya and M. Kanehisa, Extraction of correlated gene clusters from multiple genomic data by generalized kernel canonical correlation analysis, Bioinformatics, 2003.

Correlation scores in MKCCA: pathway vs. genome vs. expression.

[Plot annotations: the pairwise correlation between K1 and K2, and the pairwise correlation between K1 and K3.]

A readout of the entries with equal projection onto the first canonical vectors gives the genes which belong to each cluster. Two clusters correspond to genes close to each other with respect to their positions in the pathways, in the genome, and to their expression.

SLIDE 26

Applications of Kernel CCA

Goal: To construct appearance models for estimating an object’s pose from raw brightness images.

  • T. Melzer, M. Reiter and H. Bischof, Appearance models based on kernel canonical correlation analysis, Pattern Recognition 36 (2003), pp. 1961–1971.

Example of two image datapoints with different poses

X: set of images. Y: pose parameters (pan and tilt angle of the object w.r.t. the camera, in degrees). Method: a linear kernel on X and an RBF kernel on Y, with performance compared to applying PCA on the joint (X, Y) dataset directly.

SLIDE 27

Kernel CCA performs better than kPCA, especially for small testing/training ratios (i.e., for larger training sets). The kernel CCA estimators tend to produce fewer outliers, i.e., gross errors, and consequently yield a smaller standard deviation of the pose estimation error than their PCA-based counterparts.

Applications of Kernel CCA

Goal: To construct appearance models for estimating an object’s pose from raw brightness images

  • T. Melzer, M. Reiter and H. Bischof, Appearance models based on kernel canonical correlation analysis, Pattern Recognition 36 (2003), pp. 1961–1971.

For very small training sets, the performance of both approaches becomes similar.

[Plot: pose estimation error (in degrees) vs. testing/training ratio.]

SLIDE 28

Summary

  • CCA is an excellent means to discover appropriate projections when your data is multi-modal.

  • In each modality (separately), CCA finds projections that highlight features common to the datapoints as a whole.

  • It generates projections that are different from performing PCA on each modality separately.

  • The non-linear version of CCA, kernel CCA, generates sets of projections different from linear CCA and from kPCA.

  • These projections highlight sets of modalities that are common to groups of datapoints. Kernel CCA is a good pre-processing method before clustering.