1
2nd Session Machine learning: feed-forward neural networks and self-organizing maps
2
Recommended reading
- J. Zupan, J. Gasteiger, Neural Networks in Chemistry and Drug Design: An Introduction, Wiley-VCH, Weinheim, 1999.
- Chemoinformatics - A Textbook, eds. Johann Gasteiger and Thomas Engel, Wiley-VCH, 2003.
- Handbook of Chemoinformatics, ed. Johann Gasteiger, Wiley-VCH, 2003.
3
Neural networks
Information processing systems inspired by biological nervous systems.
Ability to learn from observations:
- Extract knowledge
- Identify relationships
- Identify structures
- Generalize
4
Statistical methods process information and ‘learn’. The brain learns with no statistical methods! Neural networks simulate nervous systems using algorithms and mathematical models.
NNs are interesting from a neuroscience point of view, as models of the brain. NNs are interesting for computer science, as computational tools.
Neural networks
5
input → output
A black box?
Neural networks
6
input → output
Connected functional units
NEURONS
Neural networks
7
The biological neuron
Cell body, dendrites, axon, axon terminal. The human nervous system has ca. 10^15 neurons. Transmission of an electric signal between dendrites and axons occurs through the transport of ions.
8
Neurons in the superficial layers of the visual cortex in the brain of a mouse.
PLoS Biology Vol. 4, No. 2, e29 DOI: 10.1371/journal.pbio.0040029
The biological neuron
9
Synapses – neuron junctions
Axon – dendrite: chemical signal (neurotransmitter). The signal is transmitted in only one direction. Some neurons are able to modify the signal transmission at the synapses.
10
Loss of connections between neurons in Alzheimer's disease
Synapses – neuron junctions
11
Neural networks
Neurons are similar across species, and transmit the same type of signal. What is essential is the whole set of neurons and their connections:
THE NETWORK
12
Signal transmission at the synapse
The transmitted signal depends on the received signal and on the synaptic strength. In artificial neurons, the synaptic strength is called the weight:
p = w × s
where s is the signal sent from a previous neuron, w the weight of the synapse, and p the signal arriving at the neuron after crossing the synapse.
13
Synapses and learning Learning and memory are believed to result from long-term changes in synaptic strength. In artificial neural networks, learning occurs by correcting the weights.
14
Weights and net input
Each neuron receives signals (si) from many neurons. The net input is the weighted sum of the incoming signals, e.g. for inputs 0.2, 0.1, 0.3, 0.2 crossing synapses with weights 0.4, −0.1, −0.5, 0.2:
Net input = 0.4×0.2 − 0.1×0.1 − 0.5×0.3 + 0.2×0.2 = −0.04
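The weighted sum above can be sketched in Python (the signal/weight pairing is read off the slide's figure, so the exact pairing is an assumption):

```python
def net_input(signals, weights):
    """Net input of an artificial neuron: the weighted sum of the
    incoming signals, each multiplied by its synaptic weight."""
    return sum(w * s for w, s in zip(weights, signals))

signals = [0.2, 0.1, 0.3, 0.2]
weights = [0.4, -0.1, -0.5, 0.2]
print(net_input(signals, weights))  # ≈ -0.04
```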
15
Transfer functions
The net input is modified by a transfer function into an output: Out = f(Net)
16
Sigmoid transfer function
Out = 1 / (1 + e^(−Net))
Important: it is non-linear!
The derivative is easy to calculate: d(Out)/d(Net) = Out × (1 − Out)
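A minimal Python sketch of the sigmoid transfer function and its derivative, expressed in terms of the output itself as on the slide:

```python
import math

def sigmoid(net):
    """Sigmoid transfer function: Out = 1 / (1 + e^(-Net))."""
    return 1.0 / (1.0 + math.exp(-net))

def sigmoid_derivative(out):
    """d(Out)/d(Net) = Out * (1 - Out), computed from the output."""
    return out * (1.0 - out)

out = sigmoid(0.0)                   # a net input of zero gives the midpoint
print(out, sigmoid_derivative(out))  # 0.5 0.25
```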
17
Simulation of an artificial neuron
http://lcn.epfl.ch/tutorial/english/aneuron/html/index.html
18
The ‘100 steps paradox’
A neuron needs approximately one millisecond (10^(-3) s) to recover after firing. The human brain is able to perform intelligent processes, such as recognizing a friend's face or reacting to danger, in approximately one tenth of a second. Highly complex tasks have to be performed in less than 100 steps?! Conclusion: many tasks must be performed simultaneously, in parallel.
19
Neural network
Input layer → Hidden layer → Output layer
Input data → Output values
20
Architecture of a neural network
- Number of inputs and outputs
- Number of layers
- Number of neurons in each layer
- Number of weights in each neuron
- How neurons are connected
- Which neurons receive corrections
21
The ‘feed-forward’ or ‘backpropagation’ NN
Input data
22
The ‘backpropagation’ learning algorithm
- 1. Assignment of random values to the weights.
- 2. Input of an object X.
- 3. Computation of output values from all neurons in all layers.
- 4. Comparison of final output values with target values and
computation of an error.
- 5. Computation of corrections to be applied to the weights of
the last layer.
- 6. Computation of corrections to be applied to the weights of
the penultimate layer.
- 7. Application of corrections.
- 8. Return to step 2.
23
Introduction of a momentum parameter µ:
Correction = computed correction + µ × previous correction
The ‘backpropagation’ learning algorithm
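The eight steps, together with the momentum term, can be sketched for a one-hidden-layer network with a single sigmoid output. The layer sizes, learning rate, and initial weight range below are illustrative assumptions, not values from the slides:

```python
import math
import random

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def train_bpg(samples, n_in, n_hid, rate=0.5, momentum=0.9, epochs=1000):
    """Backpropagation sketch for a 1-hidden-layer, 1-output network."""
    random.seed(0)
    # Step 1: random initial weights (one extra weight per neuron for the bias)
    w_hid = [[random.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_hid)]
    w_out = [random.uniform(-0.5, 0.5) for _ in range(n_hid + 1)]
    dw_hid = [[0.0] * (n_in + 1) for _ in range(n_hid)]
    dw_out = [0.0] * (n_hid + 1)
    for _ in range(epochs):                            # Step 8: loop over objects
        for x, target in samples:                      # Step 2: input an object
            xb = list(x) + [1.0]                       # bias input
            hid = [sigmoid(sum(w * v for w, v in zip(ws, xb))) for ws in w_hid]
            hb = hid + [1.0]
            out = sigmoid(sum(w * v for w, v in zip(w_out, hb)))   # Step 3
            # Steps 4-5: error at the output, correction for the last layer
            delta_out = (target - out) * out * (1.0 - out)
            # Step 6: corrections for the penultimate (hidden) layer
            delta_hid = [h * (1.0 - h) * delta_out * w_out[j]
                         for j, h in enumerate(hid)]
            # Step 7: apply corrections, with the momentum term
            for j in range(n_hid + 1):
                dw_out[j] = rate * delta_out * hb[j] + momentum * dw_out[j]
                w_out[j] += dw_out[j]
            for j in range(n_hid):
                for i in range(n_in + 1):
                    dw_hid[j][i] = rate * delta_hid[j] * xb[i] + momentum * dw_hid[j][i]
                    w_hid[j][i] += dw_hid[j][i]
    def predict(x):
        xb = list(x) + [1.0]
        hid = [sigmoid(sum(w * v for w, v in zip(ws, xb))) for ws in w_hid]
        return sigmoid(sum(w * v for w, v in zip(w_out, hid + [1.0])))
    return predict

# Usage: a toy training set (XOR); the parameters are illustrative
xor = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
predict = train_bpg(xor, n_in=2, n_hid=3, epochs=2000)
```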
24
Steps in the training of a BPG NN
Analysis of the problem: which inputs? How many? Which output(s)? How many?
Data pre-processing: normalization (output varies within ]0,1[ !); splitting into training, test, and prediction sets.
Training with the training set and monitoring with the test set (to decide when training shall be stopped).
Repetition of training with different parameters (nr of hidden neurons, rate, and momentum) until the best network is found for the test set.
Application of the best network found to the prediction set.
Evaluation.
25
Monitoring the training of a BPG NN
Stop training
26
BPG NNs using JATOON software
http://www.dq.fct.unl.pt/staff/jas/jatoon
[Plot: training-set and test-set error vs. epochs; optimum nr of epochs]
27
BPG NNs in QSPR
Example: prediction of 1H NMR chemical shifts
[Figure: example molecule with protons labeled A–G and its 1H NMR spectrum; axis: chemical shift (ppm)]
BPG NNs
Training set with experimental values.
Input: descriptors of H-atoms.
Output: chemical shift.
- Y. Binev, J. Aires-de-Sousa, J. Chem. Inf. Comput. Sci. 2004, 44(3), 940-945.
28
Predictions with ASNN
Test with 952 + 259 protons
[Scatter plot: predicted vs. experimental chemical shift for Aromatics, Pi, Aliphatics, and Rigids (sets A and B)]
R^2 = 0.9830
29
Prediction of 1H NMR spectra using BPG NNs
The SPINUS program: www.dq.fct.unl.pt/spinus
30
Self-organizing maps
31
Kohonen neural networks “self-organizing maps (SOMs)”
Algebraic view of a data set (values, signals, magnitudes,...) vs. Topological view of a data set (relationships between information)
32
Kohonen neural networks “self-organizing maps (SOMs)”
These are two-dimensional arrays of neurons that reflect as well as possible the topology of information, that is, the relationships between individual pieces of data and not their magnitude.
Compression of information: mapping on a 2D surface. “Self-Organizing Topological Feature Maps” preserve topology.
33
Kohonen neural networks Goal
Mapping similar signals onto neighbor neurons
34
Kohonen neural networks
Similar signals in neighbor neurons Do similar signals correspond to the same class?
YES NO
35
Kohonen neural networks Architecture
One layer of neurons.
36
Kohonen neural networks Architecture
One layer of neurons.
n weights for each neuron (n = number of inputs)
37
Kohonen neural networks Topology
Definition of distance between neurons
Neuron
1st neighborhood 2nd neighborhood
The output of a neuron only affects neighbor neurons
38
Kohonen neural networks Toroidal surface
Neighborhood
Neuron 1st neighborhood 2nd neighborhood
39
Kohonen neural networks Competitive learning
After the input, only one neuron is activated (the central or winning neuron). The central neuron is the one whose weights are most similar to the input. Traditionally, similarity = Euclidean distance:
d = Σ_{i=1}^{n} (w_i − x_i)^2
n – number of inputs; w – value of the weight; x – value of the input
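Selecting the winning neuron by this distance can be sketched as follows (the 2×2 map and its weights are hypothetical):

```python
def winning_neuron(grid, x):
    """Return the (row, col) of the neuron whose weights are most
    similar to the input x (smallest sum of squared differences)."""
    best, best_d = None, float("inf")
    for r, row in enumerate(grid):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best

# A hypothetical 2x2 map, n = 3 weights per neuron
grid = [[[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]],
        [[0.5, 0.5, 0.5], [0.2, 0.8, 0.2]]]
print(winning_neuron(grid, [0.9, 1.0, 0.9]))  # (0, 1)
```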
40
Kohonen neural networks Competitive learning
winning neuron weights
41
Kohonen neural networks Competitive learning
The weights of the winning neuron are corrected to make them even more similar to the input. The weights of neighbor neurons are also adapted with the same goal but to a lesser extent. Neuron 1st neighborhood 2nd neighborhood
42
Kohonen neural networks Competitive learning
The correction of the neighbor neurons after the activation of a neuron depends on: 1. The distance to the winning neuron (the farther, the smaller the correction) 2. The stage of the training (at the beginning corrections are more drastic) 3. The difference between the weight and the input (the larger the difference, the stronger the correction).
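The correction step can be sketched on a toroidal map as below. The linear decay of the correction with grid distance is an assumption (only "the farther, the smaller" is stated); the dependence on the training stage (point 2) would be handled by the caller shrinking rate and radius over the epochs:

```python
def toroidal_distance(p, q, rows, cols):
    """Grid distance between two neurons on a toroidal map
    (Chebyshev distance with wrap-around at the edges)."""
    dr = min(abs(p[0] - q[0]), rows - abs(p[0] - q[0]))
    dc = min(abs(p[1] - q[1]), cols - abs(p[1] - q[1]))
    return max(dr, dc)

def update_weights(grid, x, winner, rate, radius):
    """Correct the winner and, to a lesser extent, its neighbors."""
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            d = toroidal_distance((r, c), winner, rows, cols)
            if d <= radius:
                # 1. the farther from the winner, the smaller the correction
                factor = rate * (1.0 - d / (radius + 1))
                # 3. correction proportional to (input - weight)
                grid[r][c] = [w + factor * (xi - w)
                              for w, xi in zip(grid[r][c], x)]

# Usage: a 5x5 map of zero weights, corrected towards the input [1, 1]
som = [[[0.0, 0.0] for _ in range(5)] for _ in range(5)]
update_weights(som, [1.0, 1.0], winner=(2, 2), rate=0.5, radius=1)
```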
43
Kohonen neural networks Normalization of data
The activation of neurons, and the corrections, depend on the Euclidean distance. If the values of one descriptor span a wider range than those of another, it will have a larger impact on the result. Therefore, for all descriptors to have a similar impact, NORMALIZATION of the data is required.
44
Kohonen neural networks Normalization of data
Example of normalization:
- 1. Find the maximum (MAX) and the minimum (MIN) value for a
descriptor.
- 2. Replace each value x by (x-MIN)/(MAX-MIN)
(now the descriptor varies between 0 and 1)
or by 0.1 + 0.8×(x-MIN)/(MAX-MIN)
(the descriptor will vary between 0.1 and 0.9, useful for BPG networks)
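Both variants of the min-max normalization can be sketched as:

```python
def minmax_normalize(values, low=0.0, high=1.0):
    """Rescale values linearly so they vary between low and high:
    low + (high - low) * (x - MIN) / (MAX - MIN)."""
    lo, hi = min(values), max(values)
    return [low + (high - low) * (x - lo) / (hi - lo) for x in values]

data = [2.0, 4.0, 6.0, 10.0]
print(minmax_normalize(data))            # [0.0, 0.25, 0.5, 1.0]
print(minmax_normalize(data, 0.1, 0.9))  # variant for BPG networks
```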
45
Kohonen neural networks Normalization of data
Another example of normalization (z normalization):
- 1. Calculate the average (aver) and the standard deviation (sd) for a
descriptor.
- 2. Replace each value x by (x-aver)/sd
(the normalized descriptor will have average = 0 and standard deviation = 1)
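A sketch of z-normalization (using the population standard deviation; whether the slides intend the population or the sample formula is not stated):

```python
import math

def z_normalize(values):
    """Replace each x by (x - aver) / sd, so that the normalized
    descriptor has average 0 and standard deviation 1."""
    aver = sum(values) / len(values)
    sd = math.sqrt(sum((x - aver) ** 2 for x in values) / len(values))
    return [(x - aver) / sd for x in values]

z = z_normalize([2.0, 4.0, 6.0, 8.0])
# the mean of z is 0 and its standard deviation is 1
```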
46
Kohonen neural networks : Application
Geographical classification of crude oil samples, from their chemical features, for the identification of spill sources. A database of chemical features of oils from different geographical origins was used.
Sample (chemical features) → NEURAL NETS → Geographical class
- A. M. Fonseca, J. L. Biscaya, J. Aires-de-Sousa, A. M. Lobo, "Geographical classification of crude oils by Kohonen self-organizing maps", Anal. Chim. Acta 2006, 556(2), 374-382.
47
Chemical features of oils
Content in several compounds, determined by GC/MS. Examples:
- (22R)17α(H),21β(H)-30,31-Bishomohopane / 17α(H),21β(H)-Hopane
- 18α(H)-Oleanane / 17α(H),21β(H)-Hopane
- 1-Isopropyl-2-methylnaphthalene
- 3-Methylphenanthrene
- 1-Methyldibenzothiophene
[Structures: 3-Methylphenanthrene and 18α(H)-Oleanane]
48
Input vector: GC/MS descriptors for a sample of oil
Kohonen neural networks
Weights Winning neuron
49
Results
Training set:
- 133 samples
- 20 different geographical origins
- 21 descriptors
- Good clustering
- 97% correct predictions
Test set:
- 55 samples
- 70% correct predictions
50
Input layer Output layer
Counterpropagation (CPG) neural network
A SOM with an output layer
51
Training of a CPG neural network
Submission of input
Correction of the weights at the input layer.
Correction of the corresponding weights at the output layer.
52
Prediction by a CPG neural network
Submission of input → prediction
53
A CPG neural network with several outputs
Prediction
Input layer Output layer
Winning neuron
Training
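Prediction with a CPG network (find the winner in the input layer, then read its weights in the output layer) can be sketched as below; the 1×2 map and its weights are hypothetical:

```python
def cpg_predict(input_weights, output_weights, x):
    """Find the winning neuron by comparing x with the input-layer
    weights, then read the prediction from the output-layer weights
    of the same neuron."""
    best, best_d = None, float("inf")
    for r in range(len(input_weights)):
        for c in range(len(input_weights[r])):
            d = sum((w - xi) ** 2 for w, xi in zip(input_weights[r][c], x))
            if d < best_d:
                best, best_d = (r, c), d
    r, c = best
    return output_weights[r][c]

# Hypothetical 1x2 map: 2 input weights and 1 output weight per neuron
inp = [[[0.0, 0.0], [1.0, 1.0]]]
out = [[[0.1], [0.9]]]
print(cpg_predict(inp, out, [0.8, 0.9]))  # [0.9]
```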
54
CPGNN: application
Ability of a compound to bind GPCR (G-Protein-Coupled Receptors)
P. Selzer, P. Ertl, QSAR Comb. Sci. 2005, 24, 270-276; J. Chem. Inf. Model. 2006, 46(6), 2319-2323.
55
CPGNN: application
Prediction of the ability to bind GPCR (G-Protein-Coupled Receptors)
CPG network of size 250×250.
Training set: 24870 molecules randomly taken from catalogs (“drug-like”) + 1709 known GPCR ligands.
Input: 225 descriptors (RDF descriptors).
Output: 9 levels (GPCR and sub-family “adrenalin, bradykinin, dopamine, endothelin, histamine, opioid, serotonin, vasopressin”); binary values (0/1) according to ‘YES’ or ‘NO’.
56
CPGNN: application to predict GPCR binding
Results: 1st output level (GPCR ligand) Weight values are translated into colors.
Regions activated by ligands
57
CPGNN: application to predict GPCR binding
Results:
Output levels nr 4 (‘dopamine’) and nr 7 (‘opioid’)
58
CPGNN: application to predict GPCR binding
Results: Test set
(25096 non-GPCR and 1490 GPCR)
71% of ligands correctly predicted; 18% false positives.
59
SOMs in the JATOON program
http://www.dq.fct.unl.pt/staff/jas/jatoon ‘Paste’ data
60
SOMs in the JATOON program
http://www.dq.fct.unl.pt/staff/jas/jatoon
Visualization of the distribution of the objects. Neurons colored according to the classes of the objects activating them.
61
SOMs in the JATOON program
http://www.dq.fct.unl.pt/staff/jas/jatoon
Distribution of the objects.
62
SOMs in the JATOON program
http://www.dq.fct.unl.pt/staff/jas/jatoon Inspection of the weights at level 2 of the input layer.