MACHINE LEARNING ON NEUROIMAGING DATA. LECTURE 2: INTRODUCTION TO MACHINE LEARNING


slide-1
SLIDE 1

MACHINE LEARNING ON NEUROIMAGING DATA

Ilya Kuzovkin

AACIMP, August 2014

LECTURE 2: INTRODUCTION TO MACHINE LEARNING

slide-2
SLIDE 2

PREVIOUSLY ON SLIDES…

slide-3
SLIDE 3

PREVIOUSLY ON SLIDES…

slide-4
SLIDE 4

PREVIOUSLY ON SLIDES…

slide-5
SLIDE 5

PREVIOUSLY ON SLIDES…

_____ data? _____ plot? _____ curve?

slide-6
SLIDE 6

PREVIOUSLY ON SLIDES…

slide-7
SLIDE 7

PREVIOUSLY ON SLIDES…

slide-8
SLIDE 8

PREVIOUSLY ON SLIDES…

F? M? R? I?

BOLD? pixels? Data?

slide-9
SLIDE 9

PREVIOUSLY ON SLIDES…

slide-10
SLIDE 10

PREVIOUSLY ON SLIDES…

slide-11
SLIDE 11

PREVIOUSLY ON SLIDES…

  • What do we measure?

  • Sampling rate?
  • Waves?
  • Fourier?
slide-12
SLIDE 12

PREVIOUSLY ON SLIDES…

slide-13
SLIDE 13

PREVIOUSLY ON SLIDES…

DATA

slide-14
SLIDE 14

PREVIOUSLY ON SLIDES…

DATA

What’s next?

slide-15
SLIDE 15

PREVIOUSLY ON SLIDES…

DATA

What’s next?

ANALYSIS

slide-16
SLIDE 16

MANUAL ANALYSIS

  • Very accurate
  • Easy to drop bad data
  • Human intuition
  • Human cognitive abilities to catch interesting stuff
  • As flexible as you want
  • Takes time
  • Takes manpower
  • Boring*
  • Infeasible on huge datasets

slide-17
SLIDE 17

MACHINE LEARNING

  • Makes errors
  • Does not know which data is good and which is not
  • Will try to find only what you asked for
  • You need to learn about it
  • Fast*
  • Calculates while you are free to do other things
  • Automatic
slide-18
SLIDE 18

PART I CONCEPTS WITH CATS

slide-19
SLIDE 19

?

slide-20
SLIDE 20

TRAINING SET

Machine learning algorithm learns from examples

slide-21
SLIDE 21

TRAINING SET

Machine learning algorithm learns from examples

slide-22
SLIDE 22

INSTANCE

Each object (instance) is described by a set of parameters, called features

slide-23
SLIDE 23

FEATURES

Length of the tail

slide-24
SLIDE 24

FEATURES

Length of the tail

Amount of fur

slide-25
SLIDE 25

FEATURES

Length of the tail

Amount of fur

Feature vector

slide-26
SLIDE 26

FEATURE VECTOR

slide-27
SLIDE 27

FEATURE VECTOR

slide-28
SLIDE 28

FEATURE VECTOR

slide-29
SLIDE 29

CLASS

slide-30
SLIDE 30

DATASET

(5, 143)  (11, 234)  (2, 210)  (12, 342)  (7, 198)  (5, 321)

slide-31
SLIDE 31

DATASET

(5, 143)  (11, 234)  (2, 210)  (12, 342)  (7, 198)  (5, 321)

Infer a rule to classify these cats.

slide-32
SLIDE 32

DATASET

(5, 143)  (11, 234)  (2, 210)  (12, 342)  (7, 198)  (5, 321)

Congratulations! You have invented the “OneR” algorithm*
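
The kind of rule inferred on this slide can be sketched in code. The example below is a simplified, OneR-style learner, not the full algorithm: it tries every single-feature threshold and keeps the one with the fewest training errors (the full OneR algorithm handles numeric features by binning). The class labels `"A"` and `"B"` are hypothetical, because the slide shows the classes only as pictures, and the feature order (tail length, amount of fur) is assumed.

```python
from collections import Counter

# Feature vectors from the slide; the order (tail length, amount of fur) is an assumption.
X = [(5, 143), (11, 234), (2, 210), (12, 342), (7, 198), (5, 321)]
# Hypothetical class labels: the slide shows the classes only as pictures.
y = ["B", "A", "B", "A", "A", "B"]


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_one_rule(X, y):
    """Return (errors, feature_index, threshold, label_below, label_above)."""
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        # Candidate thresholds lie halfway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            below = [label for row, label in zip(X, y) if row[f] < t]
            above = [label for row, label in zip(X, y) if row[f] >= t]
            label_below, label_above = majority(below), majority(above)
            errors = (sum(l != label_below for l in below)
                      + sum(l != label_above for l in above))
            # Keep the first rule with the fewest training errors.
            if best is None or errors < best[0]:
                best = (errors, f, t, label_below, label_above)
    return best


rule = fit_one_rule(X, y)
print(rule)
```

With these made-up labels the best rule uses feature 0 with a threshold of 6.0: instances below it are labeled B, the rest A, with no training errors.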

slide-33
SLIDE 33

FEATURE SPACE

slide-34
SLIDE 34

FEATURE SPACE

slide-35
SLIDE 35

FEATURE SPACE

slide-36
SLIDE 36

FEATURE SPACE

slide-37
SLIDE 37

?

FEATURE SPACE

slide-38
SLIDE 38

?

FEATURE SPACE

slide-39
SLIDE 39

FEATURE SPACE

slide-40
SLIDE 40

FEATURE SPACE

slide-41
SLIDE 41
  • Decision trees
  • C4.5
  • Random forests
  • Bayesian networks
  • Hidden Markov models
  • Artificial neural network
  • Data clustering
  • Expectation-maximization algorithm
  • Self-organizing map
  • Radial basis function network
  • Vector Quantization
  • Generative topographic map
  • Information bottleneck method
  • IBSEAD
  • Apriori algorithm
  • Eclat algorithm
  • FP-growth algorithm
  • Single-linkage clustering
  • Conceptual clustering
  • K-means algorithm
  • Fuzzy clustering
  • Temporal difference learning
  • Q-learning
  • Learning Automata
  • AODE
  • Artificial neural network
  • Backpropagation
  • Naive Bayes classifier
  • Bayesian network
  • Bayesian knowledge base
  • Case-based reasoning
  • Decision trees
  • Inductive logic programming
  • Gaussian process regression
  • Gene expression programming
  • Group method of data handling (GMDH)
  • Learning Automata
  • Learning Vector Quantization
  • Logistic Model Tree
  • Decision trees
  • Decision graphs
  • Lazy learning
  • Monte Carlo Method
  • SARSA
  • Instance-based learning
  • Nearest Neighbor Algorithm
  • Analogical modeling
  • Probably approximately correct learning (PAC)
  • Symbolic machine learning algorithms
  • Subsymbolic machine learning algorithms
  • Support vector machines
  • Random Forests
  • Ensembles of classifiers
  • Bootstrap aggregating (bagging)
  • Boosting (meta-algorithm)
  • Ordinal classification
  • Regression analysis
  • Information fuzzy networks (IFN)
  • Linear classifiers
  • Fisher's linear discriminant
  • Logistic regression
  • Naive Bayes classifier
  • Perceptron
  • Support vector machines
  • Quadratic classifiers
  • k-nearest neighbor
  • Boosting
slide-42
SLIDE 42

DOES IT WORK?

slide-43
SLIDE 43

ACCURACY

Create a classifier from the training set

slide-44
SLIDE 44

ACCURACY

Create a classifier from the training set

Apply to the test set
slide-45
SLIDE 45

ACCURACY

Create a classifier from the training set

Apply to the test set

Accuracy is the % of correctly classified instances
slide-46
SLIDE 46

ACCURACY

Create a classifier from the training set

Apply to the test set

Accuracy is the % of correctly classified instances

How good is a 50% accurate classifier?

slide-47
SLIDE 47

UNBALANCED SET

Stupid classifier

?

slide-48
SLIDE 48

UNBALANCED SET

Stupid classifier

?

Accuracy is ?

slide-49
SLIDE 49

UNBALANCED SET

Stupid classifier

?

Accuracy is 0.5

slide-50
SLIDE 50

UNBALANCED SET

Stupid classifier

?

Accuracy is 0.5
Accuracy is ?

slide-51
SLIDE 51

UNBALANCED SET

Stupid classifier

?

Accuracy is 0.5
Accuracy is 0.9
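
The two accuracy values can be reproduced with a short sketch. It assumes that the "stupid classifier" always answers with the same class, and that the two test sets are split 50/50 and 90/10 between the classes; the class names and set sizes are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def always_a(instances):
    """A 'stupid' classifier: ignores its input and always answers class A."""
    return ["A" for _ in instances]


balanced = ["A"] * 50 + ["B"] * 50      # hypothetical set: 50% A, 50% B
unbalanced = ["A"] * 90 + ["B"] * 10    # hypothetical set: 90% A, 10% B

acc_balanced = accuracy(balanced, always_a(balanced))
acc_unbalanced = accuracy(unbalanced, always_a(unbalanced))
print(acc_balanced, acc_unbalanced)
```

The classifier has learned nothing in either case; the higher accuracy on the second set only reflects the class proportions.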

slide-52
SLIDE 52

PRECISION

(Correctly identified as a class) / (Everything identified as that class)

slide-53
SLIDE 53

PRECISION

(Correctly identified as a class) / (Everything identified as that class)

slide-54
SLIDE 54

PRECISION

(Correctly identified as a class) / (Everything identified as that class)

Precision = ?

slide-55
SLIDE 55

PRECISION

(Correctly identified as a class) / (Everything identified as that class)

Precision = 0.9

slide-56
SLIDE 56

PRECISION

(Correctly identified as a class) / (Everything identified as that class)

Precision = 0.9
Precision = ?

slide-57
SLIDE 57

PRECISION

(Correctly identified as a class) / (Everything identified as that class)

Precision = 0.9
Precision = 0

slide-58
SLIDE 58

PRECISION

(Correctly identified as a class) / (Everything identified as that class)

Precision = 0.9
Precision = 0

Stupid classifier

Precision = 0.45 (average over both classes)

slide-59
SLIDE 59

RECALL

(Correctly identified as a class) / (All of that class in the test set)

slide-60
SLIDE 60

RECALL

(Correctly identified as a class) / (All of that class in the test set)

slide-61
SLIDE 61

RECALL

(Correctly identified as a class) / (All of that class in the test set)

Recall = ?

slide-62
SLIDE 62

RECALL

(Correctly identified as a class) / (All of that class in the test set)

Recall = 1

slide-63
SLIDE 63

RECALL

(Correctly identified as a class) / (All of that class in the test set)

Recall = 1
Recall = ?

slide-64
SLIDE 64

RECALL

(Correctly identified as a class) / (All of that class in the test set)

Recall = 1
Recall = 0

slide-65
SLIDE 65

RECALL

(Correctly identified as a class) / (All of that class in the test set)

Recall = 1
Recall = 0

Stupid classifier

Recall = 0.5 (average over both classes)

slide-66
SLIDE 66

F1 SCORE

slide-67
SLIDE 67

F1 SCORE

Recall = 1        Recall = 0
Precision = 0.9   Precision = 0

slide-68
SLIDE 68

F1 SCORE

Recall = 1        Recall = 0
Precision = 0.9   Precision = 0
F1 ≈ 0.95         F1 = 0

slide-69
SLIDE 69

F1 SCORE

Recall = 1        Recall = 0
Precision = 0.9   Precision = 0
F1 ≈ 0.95         F1 = 0

Average F1 ≈ 0.48
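
The per-class numbers above can be checked with a small sketch. It assumes the same setup as the unbalanced-set slides (a hypothetical test set with 90 instances of class A and 10 of class B, and a classifier that always answers A), uses the usual definition F1 = 2 · precision · recall / (precision + recall), and treats a 0/0 ratio as 0, which matches the "Precision = 0" shown on the slide.

```python
def precision_recall_f1(y_true, y_pred, cls):
    """Precision, recall and F1 for one class; 0/0 is treated as 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)  # correctly identified as cls
    fp = sum(t != cls and p == cls for t, p in pairs)  # wrongly identified as cls
    fn = sum(t == cls and p != cls for t, p in pairs)  # missed instances of cls
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical unbalanced test set and an always-"A" classifier.
y_true = ["A"] * 90 + ["B"] * 10
y_pred = ["A"] * 100

p_a, r_a, f1_a = precision_recall_f1(y_true, y_pred, "A")
p_b, r_b, f1_b = precision_recall_f1(y_true, y_pred, "B")
average_f1 = (f1_a + f1_b) / 2
print(p_a, r_a, round(f1_a, 3))
print(p_b, r_b, f1_b)
print(round(average_f1, 3))
```

Without rounding, the average F1 comes out at about 0.47; the 0.48 on the slide corresponds to averaging the rounded value 0.95 with 0.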

slide-70
SLIDE 70

What is this test set you are talking about?

slide-71
SLIDE 71

TRAINING - VALIDATION - TEST

50%: Fit model on a training set
25%: Tune parameters on a validation set
25%: Final test on a test set

slide-72
SLIDE 72

TRAINING - VALIDATION - TEST

50%: Fit model on a training set
25%: Tune parameters on a validation set
25%: Final test on a test set

But why can’t we do it all on one set?
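
A minimal sketch of the 50% / 25% / 25% split is shown below. The function name `split_dataset` and the fixed random seed are illustrative choices, not something the slides prescribe; the instances are shuffled first so that each part is a random sample of the data.

```python
import random


def split_dataset(instances, seed=0):
    """Shuffle and split into 50% training, 25% validation, 25% test."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = len(shuffled) // 2
    n_val = len(shuffled) // 4
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, validation, test


# 100 dummy instances, identified only by their index.
train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))
```

Each instance ends up in exactly one of the three parts, so the final test is run on data that was used neither for fitting nor for tuning.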

slide-73
SLIDE 73

BIAS-VARIANCE TRADEOFF

slide-74
SLIDE 74

BIAS-VARIANCE TRADEOFF

  • High bias
  • Low variance
  • Underfitting

a.k.a Too stupid

slide-75
SLIDE 75

BIAS-VARIANCE TRADEOFF

  • High bias
  • Low variance
  • Underfitting

a.k.a Too stupid

Balanced bias-variance tradeoff

a.k.a OK

slide-76
SLIDE 76

BIAS-VARIANCE TRADEOFF

  • High bias
  • Low variance
  • Underfitting

a.k.a Too stupid

Balanced bias-variance tradeoff

a.k.a OK

  • Low bias
  • High variance
  • Overfitting

a.k.a Too smart

slide-77
SLIDE 77

OVERFITTING

TRAINING VALIDATION

slide-78
SLIDE 78

OVERFITTING

TRAINING VALIDATION

slide-79
SLIDE 79

OVERFITTING

TRAINING VALIDATION

slide-80
SLIDE 80

OVERFITTING

TRAINING VALIDATION

slide-81
SLIDE 81

OVERFITTING

TRAINING VALIDATION
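
The gap between training and validation performance can be illustrated with a toy example. Everything in it is made up for illustration: a single numeric feature, two deliberately mislabeled training instances, a nearest-neighbor model that memorizes the training set, and a hand-set threshold rule standing in for a simpler model.

```python
def true_class(x):
    """Underlying rule that generated the data: class 1 if x >= 5, else 0."""
    return int(x >= 5)


train_x = list(range(10))
train_y = [true_class(x) for x in train_x]
train_y[2] = 1  # mislabeled (noisy) training instance
train_y[7] = 0  # mislabeled (noisy) training instance

# Validation instances lie close to the training ones and carry clean labels.
val_x = [x + 0.4 for x in train_x]
val_y = [true_class(x) for x in val_x]


def nearest_neighbor(x):
    """Flexible model: copy the label of the closest training instance."""
    return min(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x))[1]


def threshold_rule(x):
    """Simple model: a single hand-set threshold on the feature."""
    return int(x >= 5)


def accuracy(model, xs, ys):
    """Fraction of instances the model labels correctly."""
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(xs)


nn_train = accuracy(nearest_neighbor, train_x, train_y)
nn_val = accuracy(nearest_neighbor, val_x, val_y)
rule_train = accuracy(threshold_rule, train_x, train_y)
rule_val = accuracy(threshold_rule, val_x, val_y)
print(nn_train, nn_val, rule_train, rule_val)
```

The nearest-neighbor model is perfect on the training set because it also memorizes the two noisy labels, and it repeats those mistakes on the validation set. The threshold rule scores lower on the noisy training set but higher on validation.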

slide-82
SLIDE 82
  • ML for automatic analysis
  • Data as features
  • Feature space
  • Algorithms
  • Performance measures
  • Training, Validation, Test
  • Overfitting
slide-83
SLIDE 83

PART II BACK TO BRAINS

slide-84
SLIDE 84

Electrodes Time

http://fc09.deviantart.net/fs70/i/2011/213/b/a/open_closed_eye__by_hydrofaux-d42e82y.jpg

slide-85
SLIDE 85

?

Electrodes Time

slide-86
SLIDE 86

Electrodes Time

?

What is your next move?

slide-87
SLIDE 87

Electrodes Time

?

Fourier transform of what?

slide-88
SLIDE 88

Electrodes

TIME-FREQUENCY DOMAIN

?

Time

slide-89
SLIDE 89

Electrodes

TIME-FREQUENCY DOMAIN

?

Time

slide-90
SLIDE 90

Electrodes

TIME-FREQUENCY DOMAIN

?

Time

slide-91
SLIDE 91

Electrodes

TIME-FREQUENCY DOMAIN

?

Time

slide-92
SLIDE 92

Electrodes

TIME-FREQUENCY DOMAIN

?

Time

slide-93
SLIDE 93

TIME-FREQUENCY DOMAIN

slide-94
SLIDE 94

TIME-FREQUENCY DOMAIN

slide-95
SLIDE 95

TIME-FREQUENCY DOMAIN

What is one instance in this case?

slide-96
SLIDE 96

TIME-FREQUENCY DOMAIN

What are the features?

slide-97
SLIDE 97

TIME-FREQUENCY DOMAIN

What is the dimension of the feature space?

slide-98
SLIDE 98

TIME-FREQUENCY DOMAIN

What is the class of this instance?

slide-99
SLIDE 99

TIME-FREQUENCY DOMAIN

How do you obtain the test and training sets?
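
The slides leave the questions about instances, features and dimension open. One possible answer, given here only as an assumption, is to treat each time window of the recording as one instance and to use the spectral power of every electrode at every frequency as the features. The sketch below uses a naive discrete Fourier transform, a made-up sampling rate of 32 Hz and two synthetic "electrodes"; all names and values are hypothetical.

```python
import cmath
import math


def power_spectrum(signal):
    """Power at each non-negative frequency bin (naive DFT, fine for short windows)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def window_to_features(window):
    """Concatenate the power spectra of all electrodes into one feature vector."""
    features = []
    for electrode_signal in window:
        features.extend(power_spectrum(electrode_signal))
    return features


sampling_rate = 32          # Hz, hypothetical
n_samples = sampling_rate   # a one-second window, so bin k corresponds to k Hz

# Synthetic data: one electrode with a 10 Hz oscillation, one silent electrode.
oscillating = [math.sin(2 * math.pi * 10 * t / sampling_rate) for t in range(n_samples)]
silent = [0.0] * n_samples

features = window_to_features([oscillating, silent])
bins_per_electrode = n_samples // 2 + 1
peak_bin = max(range(bins_per_electrode), key=lambda k: features[k])
print(len(features), peak_bin)
```

Under these assumptions the dimension of the feature space is the number of electrodes times the number of frequency bins (here 2 × 17 = 34), and the class of an instance is whatever condition was recorded during that window.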

slide-100
SLIDE 100

PART III BRAIN-COMPUTER INTERFACE

slide-101
SLIDE 101
slide-102
SLIDE 102
slide-103
SLIDE 103
slide-104
SLIDE 104
slide-105
SLIDE 105

Now I know what your brain signal looks like when you think “LEFT” and “RIGHT”

slide-106
SLIDE 106

Now I know what your brain signal looks like when you think “LEFT” and “RIGHT”

Try me: think [image] or [image]
slide-107
SLIDE 107

Now I know what your brain signal looks like when you think “LEFT” and “RIGHT”

Try me: think [image] or [image]

It was [image], wasn’t it?

slide-108
SLIDE 108

Now I know what your brain signal looks like when you think “LEFT” and “RIGHT”

Try me: think [image] or [image]

It was [image], wasn’t it?

How would you use such technology?

slide-109
SLIDE 109

BRAIN-COMPUTER INTERFACE

slide-110
SLIDE 110

BRAIN-COMPUTER INTERFACE

slide-111
SLIDE 111

BRAIN-COMPUTER INTERFACE

slide-112
SLIDE 112

BRAIN-COMPUTER INTERFACE

slide-113
SLIDE 113

BRAIN-COMPUTER INTERFACE

slide-114
SLIDE 114

BRAIN-COMPUTER INTERFACE

slide-115
SLIDE 115

BRAIN-COMPUTER INTERFACE

slide-116
SLIDE 116

BRAIN-COMPUTER INTERFACE

slide-117
SLIDE 117

BRAIN-COMPUTER INTERFACE

slide-118
SLIDE 118

BRAIN-COMPUTER INTERFACE

slide-119
SLIDE 119

BRAIN-COMPUTER INTERFACE

slide-120
SLIDE 120

BRAIN-COMPUTER INTERFACE

slide-121
SLIDE 121

BRAIN-COMPUTER INTERFACE

?

slide-122
SLIDE 122

BRAIN-COMPUTER INTERFACE

?

slide-123
SLIDE 123

BRAIN-COMPUTER INTERFACE

?

slide-124
SLIDE 124
slide-125
SLIDE 125
slide-126
SLIDE 126
slide-127
SLIDE 127
slide-128
SLIDE 128

NEUROIMAGING TECHNIQUES & BCI

Technology  Name           Cost                   Temporal resolution  Spatial resolution  Performance
Electrical  EEG            From $100 to $30,000+  50 ms                1+ cm               2-class 90%, 3-class 80%, 4-class 60%; large number of targets
Electrical  ECoG           $1000 grid             3 ms                 1 mm                8-cls 90%
Electrical  Intracortical  $2000 per array        3 ms                 0.5 mm - 0.05 mm    High*
Magnetic    MEG            $1 mln                 50 ms                5 mm                ~ same as EEG based
Magnetic    fMRI           $2-3 mln               1-2 s                1 mm voxels         4-cls 90%
Optical     fNIRS          $200,000               1 s                  5 mm                2-cls 90%

EEG signal classification: VEP, ERD/ERS, P300
Invasive, Portable: [marks not preserved]

slide-129
SLIDE 129