Classification of fMRI-based cognitive states (Stephen LaConte, PowerPoint PPT Presentation)

SLIDE 1

Classification of fMRI-based cognitive states

Stephen LaConte Department of Neuroscience

SLIDE 2
  • Background and motivation
  • Basic principles
  • Evaluation of predictive models
  • Model interpretation
  • fMRI considerations
  • Supervised learning-based real-time fMRI
  • Tools and resources

“Organization”

Stephen LaConte July 25, 2008

SLIDE 3

Background and Motivation

  • Complements univariate approaches

(Friston, 1995; McIntosh, 1996; Strother, 2002; Moeller and Habeck 2006)

  • Early demonstration

(Dehaene, 1998)

  • Methodology and validation

(Strother, 2002; LaConte, 2003; Shaw, 2003; Mitchell 2004; Mourão-Miranda, 2005; Martinez-Ramon, 2006)

  • Representation of

different classes of stimuli

(Haxby, 2001; Cox and Savoy, 2003; Haynes & Rees, 2005; Kamitani & Tong 2005)

  • Detecting and tracking

cognitive states

(Polyn, 2005)

  • Natural representation

for real-time fMRI

(LaConte, 2007)

  • Image identification

(Kay, 2008)

  • Data analysis competitions

HBM ’06 and ‘07

SLIDE 11
  • Background and motivation
  • Basic principles
  • Evaluation of predictive models
  • Model interpretation
  • fMRI considerations
  • Supervised learning-based real-time fMRI
  • Tools and resources

“Organization”

SLIDE 12

Supervised learning applied to fMRI

Step 1: Train with labeled data. The stimulus is presented on a visual display while data are acquired, yielding time-labeled scans: image data I_t paired with data labels y_t. The labeled scans are passed to supervised learning to build a model.

Step 2: Use the model to predict/decode. New image data I_t are fed to the trained model, which produces an estimated label ŷ_t for time t.

y represents the stimulus/behavioral categories for each volume. For classification, there is a set of stimulus categories y = {0, 1, 2, …, N}.
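The two-step pipeline can be sketched in a few lines. This is a minimal illustration, not the talk's actual method: a nearest-centroid rule stands in for the classifier, and the synthetic "volumes" stand in for masked, flattened fMRI scans.

```python
import numpy as np

rng = np.random.default_rng(0)
n_voxels = 50

# Synthetic "volumes": two stimulus categories with different mean patterns.
pattern0 = rng.normal(0.0, 1.0, n_voxels)
pattern1 = pattern0 + rng.normal(0.0, 1.0, n_voxels)
X_train = np.vstack([pattern0 + 0.1 * rng.normal(size=(20, n_voxels)),
                     pattern1 + 0.1 * rng.normal(size=(20, n_voxels))])
y_train = np.array([0] * 20 + [1] * 20)          # data labels y_t

def train(X, y):
    """Step 1: learn one centroid per stimulus category."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict(model, x):
    """Step 2: estimated label y_hat for a new volume x (I_t)."""
    return min(model, key=lambda c: np.linalg.norm(x - model[c]))

model = train(X_train, y_train)
y_hat = [predict(model, x) for x in X_train]
print(np.mean(np.array(y_hat) == y_train))       # training accuracy
```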

SLIDE 13

Mathematical Representation of fMRI Data

The data are collected into a matrix whose rows index observations (time intervals) and whose columns index variables (voxel features):

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1N} \\ x_{21} & x_{22} & \cdots & x_{2N} \\ \vdots & & & \vdots \\ x_{M1} & x_{M2} & \cdots & x_{MN} \end{bmatrix}$$

Rows: observations/time/intervals. Columns: variables/features/space.

SLIDE 15
Classification with individual volumes

Each scan over the time of the experiment is treated as a separate observation, giving a sequence of volume vectors $x_1, x_2, \ldots, x_N$ that are classified individually.

SLIDE 16

Temporal Regression

Over the time of the fMRI experiment, the label time course y is modeled from the volume vectors x by a linear function

$$f(x) = \beta_0 + \beta^T x,$$

fit by minimizing a loss $L(y, f(x))$.
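The temporal-regression view can be sketched with ordinary least squares. A minimal illustration on synthetic data (all names and sizes here are made up, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 120, 30                                 # time points, voxels
y = np.tile(np.repeat([-1.0, 1.0], 10), 6)     # block-design label time course
w_true = rng.normal(size=N)
X = np.outer(y, w_true) + 0.05 * rng.normal(size=(T, N))

# f(x) = beta0 + beta^T x: a column of ones carries the intercept beta0.
A = np.hstack([np.ones((T, 1)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares fit
y_fit = A @ beta
print(np.corrcoef(y_fit, y)[0, 1])             # close to 1
```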

SLIDE 17

Classification

A classifier maps x, a point in the high-dimensional input space, to a decision ĝ, a scalar (or very low-dimensional) output.

SLIDE 18

Training Data and Decision Boundary

SLIDE 21

Multi-class

Training data with four classes (1, 2, 3, 4) are handled by individual 2-class models: 1 vs. 2, 1 vs. 3, 1 vs. 4, 2 vs. 3, 2 vs. 4, 3 vs. 4.

SLIDE 22

4-Class Model

The individual 2-class models (1 vs. 2, 1 vs. 3, 1 vs. 4, 2 vs. 3, 2 vs. 4, 3 vs. 4) are combined into a single 4-class decision.
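The pairwise (one-vs-one) combination can be sketched as follows. This is an illustrative stand-in: a centroid-distance rule plays the role of each 2-class model, and the six pairwise winners vote on the 4-class decision.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
centers = np.eye(4) * 5.0                       # four well-separated classes
X = np.vstack([c + rng.normal(size=(15, 4)) for c in centers])
y = np.repeat([1, 2, 3, 4], 15)

centroids = {c: X[y == c].mean(axis=0) for c in (1, 2, 3, 4)}

def predict(x):
    votes = []
    for a, b in combinations((1, 2, 3, 4), 2):  # 1v2, 1v3, 1v4, 2v3, 2v4, 3v4
        da = np.linalg.norm(x - centroids[a])
        db = np.linalg.norm(x - centroids[b])
        votes.append(a if da < db else b)
    return np.bincount(votes).argmax()          # majority vote wins

acc = np.mean(np.array([predict(x) for x in X]) == y)
print(acc)
```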

SLIDE 24

Canonical Variates Analysis

x, the high-dimensional input, is reduced by a PCA rotation based on the training data, z = U^T x (a reduced-dimensional feature space vector), and the decision ĝ is formed from Lz, where the linear weights L come from eigenvectors of the class covariances, giving a scalar (or very low-dimensional) decision.

SLIDE 25

PCA/CVA

Data matrix X (time × voxels). PCA via SVD: X = UΛV^T, truncated to Q components (Q sets the model complexity). CVA: C = LQ* = LU*^T X, where the columns of L are determined by the eigenvectors of W⁻¹B; W is the within-class variance and B the between-class variance, both obtained from Q.
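A minimal numpy sketch of the PCA/CVA recipe, assuming standard LDA-style scatter matrices: SVD of the centered data, truncation to Q components, then the leading eigenvector of W⁻¹B in the reduced space. Data and dimensions are synthetic, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, Q = 40, 20, 5
X0 = rng.normal(size=(n, p))
X1 = rng.normal(size=(n, p)) + 2.0              # shifted second class
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)
Xc = X - X.mean(axis=0)

# PCA via SVD, truncated to Q components (model complexity control)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:Q].T                               # scores on Q principal components

# Within-class (W) and between-class (B) scatter in the reduced space
means = [Z[y == c].mean(axis=0) for c in (0, 1)]
W = sum((Z[y == c] - m).T @ (Z[y == c] - m) for c, m in zip((0, 1), means))
gm = Z.mean(axis=0)
B = sum((y == c).sum() * np.outer(m - gm, m - gm) for c, m in zip((0, 1), means))

# CVA: canonical direction = leading eigenvector of W^{-1} B
evals, evecs = np.linalg.eig(np.linalg.solve(W, B))
L = np.real(evecs[:, np.argmax(np.real(evals))])
c_scores = Z @ L
print(abs(c_scores[y == 0].mean() - c_scores[y == 1].mean()))  # class separation
```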

SLIDE 26

Support Vector Machine

x, the high-dimensional input, is mapped by a non-linear kernel function z = k(x) into a very high-dimensional feature space, and the scalar decision ĝ is formed from w · z with linear weights w.

SLIDE 27

SVM

Minimize

$$\frac{1}{2}\|w\|^2 + C \sum_{t=1}^{T} \xi_t$$

The second term allows some training errors; the first term favors the widest possible margin. C = infinity is the hard-margin SVM (as opposed to soft margin), because it does not allow any training errors.

Decision function: D(z) = w · z + w₀.

SLIDE 28

Model selection

  • Model parameters are fit based on training data
  • But…
    – finite sample sizes
    – noisy samples
  • Classical approach: bias-variance tradeoff
    – if the model is too flexible: overfitting to noise
    – if the model is not flexible enough: it will not adequately capture signal structure
  • Statistical learning theory approach: keep the empirical risk constant and minimize the confidence interval
  • In practice, model selection is done by
    – penalization with an analytical model (e.g., Akaike’s final prediction error)
    – resampling (cross-validation, bootstrap)
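The resampling option above can be sketched as a plain k-fold cross-validation loop. A nearest-centroid rule stands in for the model being evaluated; data and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 60, 10
X = np.vstack([rng.normal(0, 1, (n // 2, p)), rng.normal(2, 1, (n // 2, p))])
y = np.array([0] * (n // 2) + [1] * (n // 2))

def centroid_fit_predict(Xtr, ytr, Xte):
    """Fit class centroids on the training fold, predict the test fold."""
    cents = {c: Xtr[ytr == c].mean(axis=0) for c in np.unique(ytr)}
    return np.array([min(cents, key=lambda c: np.linalg.norm(x - cents[c]))
                     for x in Xte])

k = 5
idx = rng.permutation(n)                 # shuffle, then split into k folds
folds = np.array_split(idx, k)
accs = []
for i in range(k):
    test = folds[i]
    train = np.concatenate([folds[j] for j in range(k) if j != i])
    yhat = centroid_fit_predict(X[train], y[train], X[test])
    accs.append(np.mean(yhat == y[test]))
print(np.mean(accs))                     # cross-validated accuracy
```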

SLIDE 29

Complexity control

  • Traditional model complexity is model-based: controlled by the number of model parameters
  • SVM, kernels, statistical learning theory are margin-based: complexity is controlled by the size of the margin
  • Both methods have the goal of good generalization: high prediction accuracy on future samples

SLIDE 30

For SVM, minimize the following:

$$\frac{1}{2}\|w\|^2 + C \sum_{t=1}^{T} \xi_t$$

The second term allows some training errors; the first term favors the widest possible margin. C = infinity is the hard-margin SVM (as opposed to soft margin), because it does not allow any training errors.

SLIDE 31

Separable case: C = “small” vs. C = large. Side note, the w map: considering the role each voxel plays in the direction of the discriminant boundary, the absolute value of component 2 of w should be larger than that of component 1, so the “activation map” would show voxel 2.

SLIDE 32

Non-separable case: C = small vs. C = large. The use of C was primarily motivated by this case.

SLIDE 33
  • Background and motivation
  • Basic principles
  • Evaluation of predictive models
  • Model interpretation
  • fMRI considerations
  • Supervised learning-based real-time fMRI
  • Tools and resources

“Organization”

SLIDE 34

Evaluating the model: Generalization

  • Prediction accuracy on independent data
  • Independence

– Autocorrelation within a run
– Across runs
– Across sessions
– Across individuals

SLIDE 35

Evaluating the model: Generalization

Confusion matrix for binary classification:

                    Actual positive   Actual negative
Decision positive   true positive     false positive
Decision negative   false negative    true negative

SLIDE 36

Evaluating the model: Generalization

Quantifying predictive performance:

accuracy = (tp + tn) / (tp + fp + tn + fn)   (overall performance)
sensitivity = tp / (tp + fn)   (performance on class 1)
specificity = tn / (tn + fp)   (performance on class 0)
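These three measures can be worked through on a small example (the confusion-matrix counts below are made up for illustration):

```python
# Counts from a hypothetical confusion matrix
tp, fp, tn, fn = 40, 5, 45, 10

accuracy = (tp + tn) / (tp + fp + tn + fn)   # overall performance
sensitivity = tp / (tp + fn)                 # performance on class 1
specificity = tn / (tn + fp)                 # performance on class 0

print(accuracy, sensitivity, specificity)    # 0.85 0.8 0.9
```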

SLIDE 37

Reproducibility

Compare the test-statistic pattern from data set 1 with the test-statistic pattern from data set 2.

  • Strother SC, et al., Hum Brain Mapp, 5:312-316, 1997.
SLIDE 38

Evaluating models with NPAIRS

  • NPAIRS (Strother, 2002): Non-parametric Prediction, Activation, Influence, Reproducibility reSampling
  • Quantify the quality/validity of neuroimaging results with model performance metrics
  • NPAIRS is a general framework
  • Examine the impact of preprocessing using the NPAIRS framework
    – Prediction vs. reproducibility curves are proposed as an alternative to ROC analysis (LaConte, 2003)
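The split-half idea behind these metrics can be sketched as follows. This is an illustrative stand-in, not the NPAIRS implementation: centroid-difference weights play the role of a real model, prediction is accuracy on the opposite half, and reproducibility is the correlation of the two weight patterns.

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 80, 25
signal = np.zeros(p); signal[:5] = 1.5          # a few informative voxels
X = rng.normal(size=(n, p))
y = np.repeat([0, 1], n // 2)
X[y == 1] += signal

half1, half2 = np.arange(0, n, 2), np.arange(1, n, 2)   # interleaved split

def fit(idx):
    """Weight map = class-mean difference; bias centers the decision."""
    w = X[idx][y[idx] == 1].mean(0) - X[idx][y[idx] == 0].mean(0)
    b = -w @ X[idx].mean(0)
    return w, b

def accuracy(w, b, idx):
    return np.mean(((X[idx] @ w + b) > 0).astype(int) == y[idx])

(w1, b1), (w2, b2) = fit(half1), fit(half2)
pred = (accuracy(w1, b1, half2) + accuracy(w2, b2, half1)) / 2  # prediction
repro = np.corrcoef(w1, w2)[0, 1]                               # reproducibility
print(pred, repro)
```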

SLIDE 39

Prediction and Reproducibility

(LaConte, 2003)

SLIDE 40

Average P-R Plot (16 Subjects)

Mean prediction accuracy plotted against mean reproducibility for 10, 25, 50, 75, and 100 PCs, comparing AIR alignment vs. no alignment, high/low/no DC de-trend, and high/low smoothing; the highlighted curve is DC de-trend with no smoothing. (LaConte, 2003)

SLIDE 41

Prediction Accuracy Results

SVM vs. LDA prediction accuracy (50-100%) for subjects S1-S16 across preprocessing combinations (hh, hl, hn, lh, ll, ln, nh, nl, nn, xx). (LaConte, 2005)

SLIDE 42

NPAIRS

  • Preprocessing choices are critical in optimizing fMRI data analysis.
  • We have demonstrated a data-driven method for appraising competing methodologies using the NPAIRS performance metrics: prediction accuracy and reproducibility.
  • NPAIRS individualizes bias-variance trade-offs to the given data set.
  • It is computationally intensive and requires data that can be split into independent test/train sets.
  • SVM was more robust (less sensitive) to preprocessing than CVA.

SLIDE 43
  • Background and motivation
  • Basic principles
  • Evaluation of predictive models
  • Model interpretation
  • fMRI considerations
  • Supervised learning-based real-time fMRI
  • Tools and resources

“Organization”

SLIDE 44

Interpretation

  • “the predictive learning setting does not guarantee that accurate predictive models closely approximate the true model” (Cherkassky and Mulier, 2007)
    – a consequence of modeling finite data
  • Interpretation can be difficult if the input (e.g., brain voxels) contributes diffusely to the model
    – non-linear models are often difficult to interpret
  • As multivariate models, we can obtain information about both spatial and temporal properties
  • Get to know your data and the analysis method being applied!

SLIDE 45

Interpreting SVM Models

Support Vector Machines:

  • The optimal w is a linear combination of a subset of the training vectors, x_t, termed support vectors.
  • The SVM model consists of
    – the predefined k(x)
    – the support vectors
    – the SV class labels
  • Non-linear decision boundaries
  • High-dimensional problems

Pipeline: x → z = k(x) → ĝ from w · z.

$$D(z) = w \cdot z + w_0 = \sum_{t=1}^{T} \alpha_t y_t z_t^T z + w_0 = \sum_{t=1}^{T} \alpha_t y_t H(x_t, x) + w_0$$
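For a linear kernel, the expansion above means the weight map is w = Σ_t α_t y_t x_t, with α_t nonzero only for support vectors. A small numpy sketch (the alphas here are made-up numbers, just to show the reconstruction):

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(6, 8))                 # six training volumes, eight voxels
y = np.array([1, 1, 1, -1, -1, -1])
alpha = np.array([0.7, 0.0, 0.3, 0.5, 0.5, 0.0])   # zeros: non-support vectors

w = (alpha * y) @ X                         # w = sum_t alpha_t y_t x_t

# Changing a non-support-vector row leaves w unchanged:
X2 = X.copy(); X2[1] += 100.0
w2 = (alpha * y) @ X2
print(np.allclose(w, w2))                   # True
```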

SLIDE 46

Interpreting SVM Models

  • Each point in feature space corresponds to a spatial pattern
  • Support vectors, conceptually, are the observations that are most difficult to classify
  • Conversely, observations far away from the separating hyperplane are most easily classified
  • In fact, changing non-support-vector training data will result in an identical model

Linear kernel; 3rd-order polynomial kernel; 5th-order polynomial kernel

SLIDE 48

Spatial interpretation

Linear kernel; 3rd-order polynomial kernel; 5th-order polynomial kernel

  • J. Mourão-Miranda et al., NeuroImage 28 (2005) 980-995

SLIDE 49

Generating Sensitivity Maps

$$s_i = \left\langle \left( \frac{\partial\, p(g(j) \mid x(j))}{\partial x_i} \right)^{\!2} \right\rangle_{p(x,\,g)} \approx \frac{1}{N} \sum_{j=1}^{N} \left( \frac{\partial\, p(g(j) \mid x(j))}{\partial x_i} \right)^{\!2}$$

Kjems, U., et al. NeuroImage 15:772-786, 2002.

  • general enough to be applied to non-linear models
  • allows for direct comparison across methods
  • computationally expensive
  • sensitivity depends on the accuracy of the density estimation and its partial derivative

SLIDE 50

Generating Sensitivity Maps

$$p(g(j) \mid x(j)) \approx \exp\!\left(-\theta\,[1 - y(i)\,D(i)]_+\right) = \exp\!\left(-\theta\,\xi(i)\right)$$

Kwok, J., IEEE Trans Neur Net 10:1018-1031, 1999.

SLIDE 51
  • Background and motivation
  • Basic principles
  • Evaluation of predictive models
  • Model interpretation
  • fMRI considerations
  • Supervised learning-based real-time fMRI
  • Tools and resources

“Organization”

SLIDE 52

Some fMRI considerations

  • Dimensionality
  • Hemodynamic response
    – exclude transition scans from the model
    – time-shift labels to account for the delay
    – model the delay based on HRF characteristics
  • Paradigm timing
    – block design
    – event-related
    – complex stimuli

SLIDE 53

Dimensionality and feature selection

Besides other potential benefits, feature selection can also have interpretive value.

Voxel selection maps for the Left vs. Right task (A) and the Index vs. Pinky task (B). These maps were generated by averaging the voxel rank scores for each training set across the four runs. (LaConte, 2005)

SLIDE 54

Hemodynamic response

  • Model HRF
  • Temporal shift to compensate
  • Drop transition scans

(LaConte, 2003; LaConte 2005; Mitchell, 2004)
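Two of the fixes above, shifting the labels to account for hemodynamic delay and dropping transition scans, can be sketched directly. The TR, delay, and block lengths below are illustrative values, not from the talk.

```python
import numpy as np

TR = 2.0                                    # seconds per scan (assumed)
delay_s = 4.0                               # assumed hemodynamic delay
shift = int(round(delay_s / TR))            # delay expressed in scans

labels = np.repeat([0, 1, 0, 1], 10)        # block-design labels per scan
shifted = np.roll(labels, shift)            # time-shift to match the BOLD lag
shifted[:shift] = -1                        # -1 marks scans without a label

# Censor scans right after each transition of the shifted labels
censor = np.ones(labels.size, dtype=bool)
trans = np.flatnonzero(np.diff(shifted) != 0) + 1
for t in trans:
    censor[t:t + 2] = False                 # drop 2 transition scans
censor[shifted == -1] = False

print(censor.sum(), "of", labels.size, "scans kept for training")
```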

SLIDE 55

Effect of training with task transition images

Total misclassifications over the 2-30 s following each transition, comparing models trained on all images vs. excluding 1, 2, or 3 transition images from the training and testing sets.

SLIDE 57

Effect of training with task transition images: accuracy (70-100%) as a function of discarded transition images (all images; exclude 1; exclude 2; exclude 3), testing on all data vs. testing on the last 20 of 30 s (10 of 15 images).

SLIDE 58

Responsiveness to stimulus changes: average classifier output, and individual classifier output with behavioral data (all images vs. exclude 2). The model trained without transition images is more sluggish, but also more stable.

SLIDE 59

Classification of “transition” images

SLIDE 60

Paradigm timing

SLIDE 61

Matrix representation of fMRI data…

N = number of brain voxels, T = number of repeated measurements:

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1N} \\ x_{21} & & & \\ \vdots & & & \vdots \\ x_{T1} & & \cdots & x_{TN} \end{bmatrix}$$

Together with the experimental design (classification labels y_t for each time t), this leads to a natural vector representation for block-design experiments: each volume in the data matrix is a labeled example of Task 1 or Task 2, reflecting the basic signal characteristics.

SLIDE 62

The signal characteristics of ER-fMRI data: for event-related data, several images must be considered as belonging to the time evolution of the same class of stimuli. Combining images across HRF time points yields a “hyper-image” representation for each trial.

For rapid ER designs, the data matrix has overlapped class labels, and the hyper-image representation shares images that are a mixture of more than one experimental condition.

(Mitchell, et al. 2004. Mach Learn 57, 145-75; LaConte, et al. 2005. ISMRM 1583)

SLIDE 63

Signal model

$$y[t] = \sum_{s=1}^{S} \big( x_s[t] * h_s[t] \big) + n[t]$$

Assumes linear time invariance, with
y[t]: the BOLD signal over time
x[t]: neuronal responses
h[t]: hemodynamic responses
S: number of unique stimulus classes
n[t]: noise term

Matrix form of the signal model: y = Xh + n.

HRFs are known to vary with repetitions of identical stimuli (Lu, et al. IEEE TMI 24, 236-45). For classification, we require multiple observations of each of the S stimuli, not an estimate of h_s[t]. The mixed-model equation estimates the HRFs for each trial; the idea is to estimate each h by accounting for the additive effects of other event responses.

Mixed-model form: y = Xh + Vg + n, where V, N×(TL), contains the information of X, and g, (TL)×1, is the vector of random effects.
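The convolutional signal model can be simulated directly. A minimal sketch with S = 2 stimulus classes, each an event train convolved with its own HRF; the gamma-like HRF shapes and timings are toy values, not fitted responses.

```python
import numpy as np

rng = np.random.default_rng(7)
T = 200
x1 = np.zeros(T); x1[10::40] = 1.0          # neuronal event train, class 1
x2 = np.zeros(T); x2[30::40] = 1.0          # neuronal event train, class 2

t = np.arange(16)
h1 = t ** 2 * np.exp(-t)                    # toy HRF for class 1
h2 = 0.6 * t ** 2 * np.exp(-t / 1.2)        # slower, weaker HRF for class 2

n = 0.05 * rng.normal(size=T)               # noise term n[t]

# y[t] = sum_s (x_s * h_s)[t] + n[t], truncated to the scan length
y = np.convolve(x1, h1)[:T] + np.convolve(x2, h2)[:T] + n
print(y.shape)
```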

SLIDE 64

Direct hyper-image vectors vs. mixed-model vectors

  • Forming the vector representation directly
    – relies on the same principle as time-locked averaging
    – has limited power to accurately estimate the hemodynamic response function (HRF)
    – hemodynamic responses are known to vary with repetitions of identical stimuli
  • The mixed-model approach accounts for two sources of variation in ER data:
    – between-HRF variation, from a voxel’s relative sensitivity to different stimulus types
    – within-HRF variation, to explain the heterogeneity of a voxel’s response to several repetitions of the same stimulus

(Lu, et al. IEEE TMI 24, 236-45)
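The time-locked-averaging principle mentioned above can be sketched in a few lines: extract the epoch following each event onset and average. Onsets, epoch length, and the toy response are illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)
T, L = 300, 15                              # scans, epoch length in scans
t = np.arange(L)
hrf = t ** 2 * np.exp(-t)                   # toy "true" response

onsets = np.arange(5, T - L, 30)            # well-spaced, non-overlapping events
y = np.zeros(T)
for o in onsets:
    y[o:o + L] += hrf                       # each event evokes the response
y += 0.1 * rng.normal(size=T)

epochs = np.stack([y[o:o + L] for o in onsets])
hrf_hat = epochs.mean(axis=0)               # time-locked average estimate
print(np.corrcoef(hrf_hat, hrf)[0, 1])      # close to 1
```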

SLIDE 65
Simulation 1: estimating two HRFs from a time series

  • TR = 500 ms
  • Images = 500 (~4 min)
  • Two HRFs (randomized realizations for 40 events each):
    – h1: delay 1 to 3 samples, width 10 to 14 samples, amplitude 1.4 to 1.7, baseline 0
    – h2: delay 0 to 1 samples, width 14 to 16 samples, amplitude 0.5 to 0.8, baseline 0
  • Noise added (n ~ N(0, 0.25))

Time-locked vs. mixed-model estimates of h1 and h2 across the 40 realizations, with the average and standard deviation from the individual estimates.

SLIDE 66
  • Background and motivation
  • Basic principles
  • Evaluation of predictive models
  • Model interpretation
  • fMRI considerations
  • Supervised learning-based real-time fMRI
  • Tools and resources

“Organization”

SLIDE 68

Subjects can learn to control activation in a number of different brain areas

  • Somatomotor cortex: Posse 2001, Yoo 2002, deCharms 2004, Yoo 2004
  • Parahippocampal place area: Weiskopf 2004
  • Amygdala: Posse 2003
  • Insular cortex: Caria 2007
  • Anterior cingulate cortex: Weiskopf 2003, Yoo 2004, Birbaumer 2007, deCharms 2005

SLIDE 69

Localized real-time fMRI

  • Has demonstrated a high degree of potential
  • Activated areas are generally noisy
  • Generating a map requires
    – updating statistics at each pixel
    – time-window considerations
    – interpretation of brain activation
  • Tracking a region of interest requires
    – designation of that region
    – filtering and spatial averaging

SLIDE 70
Training run (conventional fMRI): the experimental design supplies class training labels; stimulus presentation and data acquisition yield image data, which after image reconstruction become time-labeled scans for SVM classification.

Testing run (real-time classification): incoming test data are passed through the trained classifier to produce the classifier output. (LaConte, 2007)

SLIDE 71

Results

SLIDE 72

Experimental Timing and Classifier Output (left finger = -1, right finger = +1)

Image classification output over the run (minutes 1-8):

Subject 1: 78% accuracy
Subject 2: 78% accuracy
Subject 3: 79% accuracy
Subject 4: 77% accuracy

SLIDE 73

Brain state classification covers a variety of cognitive domains. With the exact same experimental setup (different instructions), subjects can learn to move the arrow.

SLIDE 74

Average Learning Curve (+/- Standard Deviation), Subject 1: classification accuracy (0.4 to 0.9) over minutes 1-8.

SLIDE 75

Classification-based real-time fMRI

  • Real-time classification on a TR-by-TR basis
  • Without assumptions of localized brain areas
  • Adaptive feedback based on classified brain state
    – goes beyond linear-systems input-output relationships
    – adaptive fMRI and other real-time techniques may provide insights unattainable through traditional stimulus-response experiments
  • Applications: flexible fMRI experiments, biofeedback rehabilitation, therapeutic meditation, learning studies, sports therapy or other virtual reality-based training, and lie detection
  • Parallels EEG-BCI: can provide complementary experience and results

SLIDE 76

Applications

  • Basic science: a new class of fMRI experiments based on stimuli that adapt to internal states
  • Rehabilitation: addiction, depression, traumatic brain injury, speech and language
  • Examining the types of psychological stimuli that are most salient, including multimedia/complex stimuli
  • Learning, memory, emotional awareness

SLIDE 77
  • Background and motivation
  • Basic principles
  • Evaluation of predictive models
  • Model interpretation
  • fMRI considerations
  • Supervised learning-based real-time fMRI
  • Tools and resources

“Organization”

SLIDE 78

Resources

Reading:

  • Norman et al. 2006. Beyond mind-reading: multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences. 10:424-430.
  • Mitchell, et al. 2004. Learning to decode cognitive states from brain images. Machine Learning. 57:145-175.
  • Strother, et al. 2002. The quantitative evaluation of functional neuroimaging experiments: The NPAIRS data analysis framework. NeuroImage. 15:747-771.
  • Cherkassky and Mulier. 1998. Learning from Data: Concepts, Theory, and Methods. John Wiley & Sons, Inc., New York.
  • Hastie, Tibshirani, and Friedman. 2001. The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer-Verlag, New York.
  • Duda, Hart, and Stork. 2001. Pattern Classification. John Wiley & Sons, Inc., New York.

Software:

  • MVPA: http://www.csbmb.princeton.edu/mvpa/
  • PyMVPA: http://pkg-exppsy.alioth.debian.org/pymvpa/
  • 3dsvm: http://www.cpu.bcm.edu/laconte/3dsvm.html
  • NPAIRS: http://www.neurovia.umn.edu/INC/Resources/Downloads/download_npairs.php

SLIDE 79

3dsvm Plugin Snapshot: Support Vector Machine Analysis (training and testing)

SLIDE 80

Command Line

Training:

  3dsvm -trainvol run1+orig \
        -trainlabels run1_categories.1D \
        -mask mask+orig \
        -model model_run1

Testing:

  3dsvm -testvol run2+orig \
        -model model_run1+orig \
        -predictions pred2_model1

SLIDE 81

Example: Left vs. Right visual stimulus

  • 3T fMRI: 31 axial EPI slices, TR/TE = 2000/31 ms, voxel = 3.4 × 3.4 × 4 mm.
  • Randomized block lengths alternating between the left and right stimulus.

SLIDE 96

3dsvm features

  • Distributed with AFNI
  • Reads AFNI-supported formats (including NIfTI), and thus supports the preprocessing and data manipulation of all the major software packages
  • Masking of variables (brain pixels)
  • Censoring training samples
  • Visualizing alphas as time series and linear weight vectors as functional maps
  • Multi-class classification

SLIDE 97

Acknowledgments

Xiaoping Hu, Scott Peltier, Jihong Chen, Will Curtis, Jeffrey Prescott, Yang Zhi, Zhihao Li, Stephen Strother, Vladimir Cherkassky, Andrew Fischer, Dorina Papageorgiou, Prashant Prasad

Thank You!

Work partially supported by NINDS R21NS050183, the Robert and Janice McNair Foundation, and Baylor College of Medicine.