Applied Machine Learning in Biomedicine
Enrico Grisan
Course details
Mon-Wed 10.30-12.00, Room 318
May 4th through May 27th
Contact: enrico.grisan@dei.unipd.it
Exam: project assignment
Cancer detection
Face detection
How would you detect a face? How does album software tag your friends?
What do we do?
Speech recognition
Brain-computer interface
Recommender systems
Amazon, Netflix, Spotify tell you what you might like
The Netflix Prize was an open competition: predict user ratings for films based on previous ratings, without any other information about the users or films. The grand prize of US$1,000,000 was given to the BellKor's Pragmatic Chaos team, which bested Netflix's own algorithm for predicting ratings by 10.06%.
The age of big data
“Every day, people create the equivalent of 2.5 quintillion bytes of data from sensors, mobile devices, online transactions, and social networks; so much that 90 percent of the world's data has been generated in the past two years.”
The Huffington Post, Arnal Dayaratna: IBM Releases Big Data
CERN Collider: 320×10^12 bytes/s
Personal connectome: 10^18 bytes/person
10^9 messages/day
30×10^6 messages/day
The role of machine learning
Design and analyze algorithms that
- improve their performance
- at some task
- with experience
Data (experience) → Learning algorithm → Knowledge (performance on task)
Imagenet challenge
Kaggle challenge
- $100,000 prize
- 35,000 retinal images
- 4 DR classes
Ongoing!!!
Machine learning in biomedicine
Usually extreme conditions:
- Very few samples (with respect to the problem)
- Very large number of descriptors per sample
- Very large amount of noise/uncertainty
Categories
– Supervised learning
classification, regression
– Unsupervised learning
Density estimation, clustering, dimensionality reduction
– Semisupervised learning
– Active learning
– Reinforcement learning
– …
Supervised learning
Feature space → Target space
Classification: gene expression → discrete labels (Normal, Metaplastic, Benign neoplastic, Malignant neoplastic)
Regression: demographic and clinical data → continuous labels (CHD risk score)
Roadmap
Binary classification
- Parametric and non-parametric prediction
Other supervised settings
Principles for learning
Oranges and Lemons
A two-dimensional space
Stars and galaxies
Minor elliptical axis (y) against Major elliptical axis (x) for stars (red) and galaxies (blue)
Coronary Heart Disease
Patients with (red) and without (blue) coronary heart disease in South Africa (Rousseauw et al., 1983)
Parametric model
Linear classifier
The weight vector
Geometric meaning
The weight vector
% ww = Dx1 weights
% Xstar = NxD test cases
y_pred = sign(Xstar*ww); % Nx1
Learning the weights
Rosenblatt’s Perceptron Learning
Perceptron criterion:
Stochastic gradient descent:
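The formulas on this slide were lost in extraction. A standard textbook form of the perceptron criterion and its stochastic gradient step is sketched below. The symbols are assumptions chosen to match the MATLAB snippets in these slides (weights $\mathbf{w}$, inputs $\mathbf{x}_n$, targets $y_n \in \{-1,+1\}$); the set of misclassified samples $\mathcal{M}$ and the learning rate $\eta$ do not appear in the original.

```latex
% Perceptron criterion: sum over the misclassified samples only
E_P(\mathbf{w}) = -\sum_{n \in \mathcal{M}} y_n \, \mathbf{w}^\top \mathbf{x}_n

% Stochastic gradient descent step for one misclassified sample n
\mathbf{w} \leftarrow \mathbf{w} - \eta \, \nabla_{\mathbf{w}} \bigl( -y_n \, \mathbf{w}^\top \mathbf{x}_n \bigr)
           = \mathbf{w} + \eta \, y_n \, \mathbf{x}_n
```

Under this reading, an update of the form (target minus prediction) times the input equals $2\,y_n\,\mathbf{x}_n$ whenever the prediction has the wrong sign, which corresponds to $\eta = 2$.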
Learning the weights
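% xx = NxD training cases
% yy = Nx1 targets (-1,+1)
% ww = Dx1 weights (learned)
[N,D]=size(xx);
old_ww=[];
ww=zeros(D,1);
% repeat until a full pass over the data leaves the weights unchanged
% (never terminates if the data are not linearly separable)
while (~isequal(ww,old_ww))
  old_ww=ww;
  for ct=1:N
    pred=sign(xx(ct,:)*ww);
    % the update is zero when pred equals the target
    ww=ww+(yy(ct)-pred)*xx(ct,:)';
  end
end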
Learning the weights
Implementing the bias
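The content of this slide was not preserved. One common way to implement the bias (an assumption here, not necessarily the author's) is to append a constant 1 to every input, so that the last weight acts as the bias. The sketch below does this in plain Python; `add_bias`, `train_perceptron` and the `max_epochs` cap are hypothetical names introduced for illustration.

```python
def add_bias(rows):
    """Append a constant 1.0 to each input so the last weight acts as the bias."""
    return [list(r) + [1.0] for r in rows]


def sign(v):
    """Return -1, 0 or +1, like MATLAB's sign()."""
    return (v > 0) - (v < 0)


def train_perceptron(xx, yy, max_epochs=100):
    """Perceptron rule: add y*x for every misclassified sample.

    Stops after the first epoch without mistakes, or after max_epochs.
    """
    ww = [0.0] * len(xx[0])
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(xx, yy):
            pred = sign(sum(w * xi for w, xi in zip(ww, x)))
            if pred != y:
                ww = [w + y * xi for w, xi in zip(ww, x)]
                mistakes += 1
        if mistakes == 0:
            break
    return ww


# 1-D toy data: the classes split between x=2 and x=3, so a boundary
# through the origin cannot separate them; the bias makes it possible.
xs = [[1.0], [2.0], [3.0], [4.0]]
ys = [-1, -1, 1, 1]

xb = add_bias(xs)
ww = train_perceptron(xb, ys)
preds = [sign(sum(w * xi for w, xi in zip(ww, x))) for x in xb]
```

After training, `preds` matches `ys`: the learned boundary sits where `ww[0]*x + ww[1]` changes sign.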
Output of the perceptron
Linear classifier revisited
If the data are not linearly separable, we must
- extend the model
- add features
Nonlinear basis function
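The slide body did not survive, but the idea of adding features through a nonlinear basis function can be sketched on XOR-like data, which no straight line separates in the two original inputs. The basis function `phi` below (it adds the product x1*x2) is an illustrative assumption, as is the small perceptron used to check whether the training set can be fitted.

```python
def sign(v):
    return (v > 0) - (v < 0)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def train_perceptron(feats, yy, max_epochs=50):
    """Perceptron rule on precomputed feature vectors."""
    ww = [0.0] * len(feats[0])
    for _ in range(max_epochs):
        mistakes = 0
        for f, y in zip(feats, yy):
            if sign(dot(ww, f)) != y:
                ww = [w + y * fi for w, fi in zip(ww, f)]
                mistakes += 1
        if mistakes == 0:
            break
    return ww


def linear(x):
    """Original inputs plus a constant for the bias."""
    return [x[0], x[1], 1.0]


def phi(x):
    """Hypothetical nonlinear basis: original inputs, their product, a constant."""
    return [x[0], x[1], x[0] * x[1], 1.0]


def fits_training_set(basis, xs, ys):
    """True if a perceptron on basis(x) classifies every training point correctly."""
    feats = [basis(x) for x in xs]
    ww = train_perceptron(feats, ys)
    return all(sign(dot(ww, f)) == y for f, y in zip(feats, ys))


# XOR-like labels: +1 when both inputs have the same sign
xs = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
ys = [1, -1, -1, 1]

separable_linear = fits_training_set(linear, xs, ys)  # no line fits XOR
separable_phi = fits_training_set(phi, xs, ys)        # product feature fits it
```

The classifier is still linear in its weights; only the feature space has changed.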
From model to no model
Faith in previous knowledge
Strong assumptions on
- data structure
- separating boundary shape
Faith in the data
No assumption on the underlying structure
Data tell me everything I need
K-nearest neighbours classifier
Fix and Hodges, 1951
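A minimal sketch of the k-nearest-neighbours rule: a query point gets the majority label among its k closest training points. Euclidean distance, the tie-breaking behaviour of `Counter.most_common`, and the toy data are assumptions for illustration; the slide does not specify them.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    # Count the labels of the k closest points and return the most frequent
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy two-dimensional data with two well-separated groups
train_x = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (3.0, 3.0), (3.2, 2.9), (2.8, 3.1)]
train_y = ["orange", "orange", "orange", "lemon", "lemon", "lemon"]

label_a = knn_predict(train_x, train_y, (1.1, 1.0), k=3)
label_b = knn_predict(train_x, train_y, (3.0, 2.8), k=3)
```

There is no training step: the data set itself is the model, which is why prediction cost grows with the number of stored samples.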
Decision boundaries
Panels: linear classification; 1-nearest neighbour classifier; 15-nearest neighbour classifier
Brain MRI application
MICCAI MS lesion challenge 2008 http://www.ia.unc.edu/MSseg/index.html
LANDSAT application
Identification via gait analysis
Nowlan 2009; Choi 2014
Characterize each person by the way he moves: gait signature
Parametric vs non-parametric
- Parametric: we start by assuming the decision boundary is a plane
- Non-parametric KNN has no fixed assumption: boundaries get more complicated with more data
- Non-parametric methods may need more data and can be computationally intensive
Batch supervised learning
Given: example inputs and targets (training set)
Task: predicting target for new inputs (test set)
Examples:
- classification (binary or multi-class)
- regression
- ordinal regression
- Poisson regression
- ranking
…
Batch supervised learning
- Many ways of mapping inputs to outputs
- How do we choose what to do?
- How do we know if we are doing well?
Algorithm’s objective/cost
Formal objective for algorithms:
- minimize a cost function
- maximize an objective function
Proving convergence:
- does objective monotonically improve?
Considering alternatives:
- does another algorithm score better?
Loss function
Choosing a loss function
- Motivated by the application
– 0-1 error, achieving a tolerance, business cost
- Computational convenience:
– Differentiability, convexity
- Beware of loss dominated by artifacts:
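As an illustration of the points in this list (not taken from the slides): the 0-1 error counts mistakes directly but is not differentiable, while the squared error is smooth and convex but can be dominated by a single corrupted value. The function names and toy numbers below are assumptions.

```python
def zero_one_loss(y_true, y_pred):
    """Fraction of misclassified labels."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference; large errors count quadratically."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference; large errors count linearly."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Classification: two of four labels are wrong
labels = [1, -1, 1, 1]
preds = [1, 1, 1, -1]
err01 = zero_one_loss(labels, preds)

# Regression: the last target mimics a measurement artifact
targets = [1.0, 2.0, 3.0, 100.0]
fit = [1.0, 2.0, 3.0, 4.0]
mse = mean_squared_error(targets, fit)
mae = mean_absolute_error(targets, fit)
```

Here three of the four regression predictions are exact, yet the single artifact sets the whole squared error, and it weighs far more under the squared loss than under the absolute loss.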