Semi-supervised Kernel Canonical Correlation Analysis of Human Functional Magnetic Resonance Imaging Data
Jacquelyn A. Shelton
Max Planck Institute for Biological Cybernetics and Universität Tübingen, Tübingen, Germany
Women in Machine Learning Workshop
◮ Neuroscience: assess natural processing with fMRI
◮ Problems: high-dimensional data, expensive labels
◮ Goal: Canonical Correlation Analysis in a semi-supervised setting
◮ Samples in 2 modalities: two representations of 1 underlying process
fMRI data: (labeled) X = {x1, x2, . . . , xn}, (unlabeled) {xn+1, . . . , xp}
Corresponding labels: Y = {y1 = 1, y2 = 0, . . . , yn}
→ Paired data (fMRI with labels): (x1, y1), (x2, y2), . . . , (xn, yn)
◮ Finds projection directions in each modality’s subspace
◮ CCA: maximize correlation between X and Y projections
\[
\max_{w_x, w_y} \frac{w_x^\top C_{xy} w_y}{\sqrt{(w_x^\top C_{xx} w_x)\,(w_y^\top C_{yy} w_y)}}
\]
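This objective has a closed-form solution: after whitening by the within-modality covariances, the canonical correlations are the singular values of the whitened cross-covariance. A minimal NumPy sketch (not part of the poster; the small ridge term `reg` is an added numerical-stability assumption):

```python
import numpy as np

def cca(X, Y, reg=1e-8):
    """Top canonical correlation between row-sample matrices X and Y.

    The singular values of Cxx^{-1/2} Cxy Cyy^{-1/2} are the canonical
    correlations; `reg` is a tiny ridge added for numerical stability
    (an assumption, not part of the plain CCA objective).
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        # Symmetric inverse square root via eigendecomposition.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    M = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(M)
    wx = inv_sqrt(Cxx) @ U[:, 0]   # projection direction for X
    wy = inv_sqrt(Cyy) @ Vt[0]     # projection direction for Y
    return s[0], wx, wy

# Y is an exact linear function of X, so the top correlation approaches 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Y = X @ rng.normal(size=(5, 3))
rho, wx, wy = cca(X, Y)
```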
◮ Kernelized CCA (KCCA): more general, easier optimization
◮ Regularized KCCA: avoids degenerate solutions

\[
\max_{\alpha, \beta} \frac{\alpha^\top K_x K_y \beta}{\sqrt{\alpha^\top (K_x^2 + \varepsilon_x K_x)\,\alpha \;\; \beta^\top (K_y^2 + \varepsilon_y K_y)\,\beta}}
\]
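The regularized kernel objective admits the same whitened-SVD treatment, now over kernel matrices. A hedged NumPy sketch; the RBF kernel and the ε values are illustrative assumptions (the poster's experiments use a linear kernel):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between row-sample matrices A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kcca(Kx, Ky, eps_x=0.1, eps_y=0.1):
    """First canonical correlation of regularized KCCA.

    Maximizes a^T Kx Ky b / sqrt(a^T (Kx^2 + eps_x Kx) a * b^T (Ky^2 + eps_y Ky) b)
    by whitening with the regularized denominator matrices.
    """
    n = Kx.shape[0]
    Bx = Kx @ Kx + eps_x * Kx + 1e-10 * np.eye(n)  # jitter for stability
    By = Ky @ Ky + eps_y * Ky + 1e-10 * np.eye(n)

    def inv_sqrt(C):
        w, V = np.linalg.eigh(C)
        w = np.maximum(w, 1e-12)
        return V @ np.diag(w ** -0.5) @ V.T

    M = inv_sqrt(Bx) @ (Kx @ Ky) @ inv_sqrt(By)
    U, s, Vt = np.linalg.svd(M)
    alpha = inv_sqrt(Bx) @ U[:, 0]
    beta = inv_sqrt(By) @ Vt[0]
    return s[0], alpha, beta

rng = np.random.default_rng(1)
X = rng.normal(size=(80, 4))
Y = np.tanh(X @ rng.normal(size=(4, 2)))  # nonlinear but shared source
rho, a, b = kcca(rbf_kernel(X, X), rbf_kernel(Y, Y))
```

With ε > 0 the correlation is strictly below 1, which is exactly the point of the regularizer: without it, KCCA on invertible kernel matrices attains a degenerate correlation of 1.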
◮ Manifold assumption: high-dimensional data lie on a low-dimensional manifold M
◮ Functions should vary smoothly along M – small gradient
◮ Estimate the gradient ∇M by constructing a graph along the samples
[Figure: samples of the manifold; graph estimate of the manifold]
◮ Gradient estimate ∇M of functions along M leads to the graph Laplacian regularizer
◮ Optionally, unlabeled data can be included to improve the estimate
[Figure: poor estimate – graph with few data points; better estimate – graph with more data points]
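One common way to build such a graph estimate is a symmetrized k-nearest-neighbor graph and its unnormalized Laplacian L = D − W; the sketch below (the k-NN construction and the circle example are illustrative assumptions, not the poster's specific choices) shows that a function varying smoothly along the manifold has a small Laplacian quadratic form:

```python
import numpy as np

def knn_graph_laplacian(X, k=5):
    """Unnormalized graph Laplacian L = D - W from a symmetrized k-NN graph.

    W_ij = 1 if x_j is among the k nearest neighbors of x_i (or vice versa);
    this is one standard way to estimate the manifold's neighborhood structure.
    """
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)              # no self-edges
    W = np.zeros((n, n))
    nn = np.argsort(d2, axis=1)[:, :k]        # indices of k nearest neighbors
    rows = np.repeat(np.arange(n), k)
    W[rows, nn.ravel()] = 1.0
    W = np.maximum(W, W.T)                    # symmetrize
    D = np.diag(W.sum(axis=1))
    return D - W

# Points sampled along a 1-D manifold (a circle) embedded in R^2.
t = np.linspace(0, 2 * np.pi, 60, endpoint=False)
X = np.c_[np.cos(t), np.sin(t)]
L = knn_graph_laplacian(X, k=2)

# A function smooth along the circle yields a much smaller f^T L f
# than a random function of the same size.
f_smooth = np.sin(t)
f_rand = np.random.default_rng(0).normal(size=60)
```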
\[
\max_{\alpha, \beta} \frac{\alpha^\top K_{\hat{x}x} K_{y\hat{y}} \beta}{\sqrt{\alpha^\top (K_{\hat{x}x} K_{x\hat{x}} + R_{\hat{x}})\,\alpha \;\; \beta^\top (K_{\hat{y}y} K_{y\hat{y}} + R_{\hat{y}})\,\beta}}
\]
with \(R_{\hat{x}} = \varepsilon_x K_{\hat{x}\hat{x}} + \gamma_x K_{\hat{x}\hat{x}} L_{\hat{x}} K_{\hat{x}\hat{x}}\), where \(L_{\hat{x}}\) is the graph Laplacian (\(R_{\hat{y}}\) analogously)
◮ SSKCCA will favor directions α and β whose projections vary smoothly along the manifold
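Putting the pieces together, the semi-supervised objective above can be solved in the same whitened-SVD style. The sketch below is one plausible assembly, not the poster's code: it assumes a linear kernel, Gaussian graph similarities, and specific ε, γ values, with unlabeled samples entering only through the kernel blocks and the regularizer R = εK + γKLK:

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized Laplacian L = D - W from a similarity matrix W."""
    return np.diag(W.sum(axis=1)) - W

def gauss_sim(Z, sigma=1.0):
    """Gaussian similarities between all rows of Z (bandwidth is an assumption)."""
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def sskcca(X_all, Y_all, n_lab, eps=0.1, gamma=0.1):
    """Sketch of the semi-supervised KCCA objective above.

    The first n_lab rows of each modality are the paired (labeled) samples;
    the rest are unlabeled and appear only in the kernel blocks and in
    R = eps * K_all + gamma * K_all @ L @ K_all.
    """
    def blocks(Z):
        K_all = Z @ Z.T                    # linear kernel over all samples
        K_al = K_all[:, :n_lab]            # all-vs-labeled kernel block
        R = eps * K_all + gamma * K_all @ graph_laplacian(gauss_sim(Z)) @ K_all
        B = K_al @ K_al.T + R + 1e-8 * np.eye(Z.shape[0])  # denominator matrix
        return K_al, B

    Kx_al, Bx = blocks(X_all)
    Ky_al, By = blocks(Y_all)

    def inv_sqrt(C):
        w, V = np.linalg.eigh(C)
        return V @ np.diag(np.maximum(w, 1e-12) ** -0.5) @ V.T

    M = inv_sqrt(Bx) @ (Kx_al @ Ky_al.T) @ inv_sqrt(By)
    U, s, Vt = np.linalg.svd(M)
    return s[0], inv_sqrt(Bx) @ U[:, 0], inv_sqrt(By) @ Vt[0]

rng = np.random.default_rng(2)
X_all = rng.normal(size=(50, 6))                   # 30 labeled + 20 unlabeled
Y_all = np.tanh(X_all @ rng.normal(size=(6, 2)))   # a dependent second modality
rho, a, b = sskcca(X_all, Y_all, n_lab=30)
```

Because R is positive semidefinite, Cauchy-Schwarz bounds the objective by 1, so the returned correlation always lies in (0, 1].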
◮ fMRI data (X): human volunteer during viewing of 2 movies
◮ Labels (Y): continuous labels for 1 movie, from 5 observers' ratings
◮ Linear kernel in all experiments
◮ Model selection: criterion from (Hardoon et al., 2004) to choose the regularization parameters
[Table: mean holdout correlations from five-fold cross-validation across each of the five label variables in all experiments, comparing KCCA (Tikhonov regularization) against SSKCCA (Tikhonov and Laplacian regularization).]
◮ SSKCCA learned expected regions of brain activity
◮ KCCA with Laplacian regularization improves correlation
◮ SSKCCA with use of unlabeled data further improves correlation
◮ Check out poster M26 for our extension of this work
Bartels A, Zeki S, Logothetis NK (2008) Natural vision reveals regional specialization to local motion and to contrast-invariant, global flow in the human brain. Cereb Cortex 18:705-717.
Bartels A, Zeki S (2004) The chronoarchitecture of the human brain – natural viewing conditions reveal a time-based anatomy of the brain. NeuroImage 22:419-433.
Belkin M, Niyogi P, Sindhwani V (2006) Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples. JMLR.
Blaschko MB, Lampert CH, Gretton A (2008) Semi-supervised Laplacian Regularization of Kernel Canonical Correlation Analysis. ECML.
Hardoon DR, Szedmak S, Shawe-Taylor J (2004) Canonical Correlation Analysis: An Overview with Application to Learning Methods. Neural Computation 16(12):2639-2664.
Friston KJ, Ashburner JT, Kiebel SJ, Nichols TE, Penny WD, eds (2007) Statistical Parametric Mapping: The Analysis of Functional Brain Images. Academic Press.
Shelton JA, Blaschko MB, Bartels A (2009) Semi-supervised subspace analysis of human functional magnetic resonance imaging data. Max Planck Institute Tech Report 185, May 2009.
Blaschko MB, Shelton JA, Bartels A (2009) Augmenting Feature-driven fMRI Analyses: Semi-supervised learning and resting state activity. NIPS.
The CCA objective:

\[
\max_{w_x, w_y} \frac{w_x^\top C_{xy} w_y}{\sqrt{w_x^\top C_{xx} w_x \;\; w_y^\top C_{yy} w_y}} . \tag{4}
\]

We denote by \(\mathcal{H}_x\) the reproducing kernel Hilbert space (RKHS) associated with \(k_x\), and by \(\varphi_x : \mathcal{X} \to \mathcal{H}_x\) the associated feature map, i.e. \(k_x(x_i, x_j) = \langle \varphi_x(x_i), \varphi_x(x_j) \rangle\).

\[
\max_{f_x, f_y} \frac{f_x^\top \hat{C}_{xy} f_y}{\sqrt{f_x^\top \hat{C}_{xx} f_x \;\; f_y^\top \hat{C}_{yy} f_y}}
= \max_{\alpha, \beta} \frac{\alpha^\top K_x K_y \beta}{\sqrt{\alpha^\top K_x^2 \alpha \;\; \beta^\top K_y^2 \beta}} , \tag{5}
\]

\[
\max_{\alpha, \beta} \frac{\alpha^\top K_x K_y \beta}{\sqrt{\alpha^\top (K_x^2 + \varepsilon_x K_x)\,\alpha \;\; \beta^\top (K_y^2 + \varepsilon_y K_y)\,\beta}} , \tag{6}
\]

Denote the kernel matrix computed using the labeled data X as \(K_{xx} \in \mathbb{R}^{n \times n}\), the matrix computed using \(\hat{X}\) and X as \(K_{\hat{x}x} \in \mathbb{R}^{m_x \times n}\), the matrix computed using \(\hat{X}\) with itself as \(K_{\hat{x}\hat{x}} \in \mathbb{R}^{m_x \times m_x}\), etc. Kernel matrices for Y are defined analogously.

The semi-supervised, Laplacian-regularized generalization of Eq. (6):

\[
\max_{\alpha, \beta} \frac{\alpha^\top K_{\hat{x}x} K_{y\hat{y}} \beta}{\sqrt{\alpha^\top (K_{\hat{x}x} K_{x\hat{x}} + R_{\hat{x}})\,\alpha \;\; \beta^\top (K_{\hat{y}y} K_{y\hat{y}} + R_{\hat{y}})\,\beta}} , \tag{7}
\]

where \(R_{\hat{x}} = \varepsilon_x K_{\hat{x}\hat{x}} + \gamma_x K_{\hat{x}\hat{x}} L_{\hat{x}} K_{\hat{x}\hat{x}}\) and \(L_{\hat{x}} = D - W\) is the graph Laplacian: W is the matrix of similarities between data points (e.g. a Gaussian of width \(\sigma^2\)) and D is the diagonal matrix of W's row sums, \(D_{ii} = \sum_{j=1}^{n+p_x} W_{ij}\).
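As a sanity check on these definitions, the quadratic form of L = D − W decomposes into pairwise differences weighted by W, which is exactly why penalizing it enforces smoothness across similar points. A small NumPy verification (the Gaussian similarities with σ² = 1 are an assumption of this example):

```python
import numpy as np

# Verify the standard identity for L = D - W:
#   f^T L f = 1/2 * sum_ij W_ij (f_i - f_j)^2
rng = np.random.default_rng(3)
X = rng.normal(size=(30, 4))
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / 2.0)            # Gaussian similarities, sigma^2 = 1 assumed
np.fill_diagonal(W, 0.0)         # drop self-similarities
D = np.diag(W.sum(axis=1))       # row sums on the diagonal
L = D - W

f = rng.normal(size=30)
lhs = f @ L @ f
rhs = 0.5 * ((f[:, None] - f[None, :]) ** 2 * W).sum()
```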
◮ fMRI data of one human volunteer during viewing of 2 movies
◮ 350 time slices of 3-dimensional fMRI brain volumes
◮ Pre-processed according to standard procedures using the SPM toolbox (Statistical Parametric Mapping)
Visualization of learned weight vectors (wx) for color and face stimuli, following [2].
(a) CCA, Tikhonov regularization (b) CCA, Tikhonov and Laplacian regularization (c) Semi-supervised CCA, Tikhonov and Laplacian regularization