Supervised Self-Organising Maps



  1. Supervised Self-Organising Maps
     Ron Wehrens, Institute of Molecules and Materials (IMM), Radboud University Nijmegen, The Netherlands

     Self-organising maps
     ✎ Map high-dimensional data to a 2D grid of “units” according to similarity/distance (Kohonen, 1982).
     ✎ “Spatially smooth version of k-means” (Ripley, PRNN, 1996).

     Training SOMs
     Data: 177 Italian wines
     [Figure: initial state of the map; object 1]

  2. Training SOMs
     Algorithm:
     ✎ Pick random object
     ✎ Determine winner in map
     ✎ Update winner and environment
     ✎ Periodically, decrease environment and learning rate

     [Figure: winner 1 and update 1 for object 1]
     [Figure: wines, codebook vectors and mapping]

     R code:
     > library(kohonen)
     > data(wines)
     > somnet <- som(scale(wines), gr = somgrid(5, 5), rlen = 100)
     > plot(somnet, "codes")
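The four training steps listed above can be sketched in base R as follows. This is a minimal illustration, not the code inside the kohonen package: the function name `train_som`, the linear decrease of the learning rate and neighbourhood radius at every step (the slide only says "periodically"), the starting radius and the `alpha` values are all assumptions made for the example.

```r
# Minimal online SOM training in base R (illustrative sketch only).
train_som <- function(X, xdim = 5, ydim = 5, rlen = 100,
                      alpha = c(0.05, 0.01), seed = 1) {
  set.seed(seed)
  X <- as.matrix(X)
  n <- nrow(X)
  # Unit coordinates on a rectangular grid and pairwise grid distances
  grid <- as.matrix(expand.grid(x = seq_len(xdim), y = seq_len(ydim)))
  grid_dist <- as.matrix(dist(grid))
  n_units <- nrow(grid)
  # Initial state: codebook vectors are randomly chosen objects
  codes <- X[sample(n, n_units, replace = n < n_units), , drop = FALSE]
  n_steps <- rlen * n
  radius_start <- max(grid_dist) * 2 / 3        # assumed starting environment
  for (step in seq_len(n_steps)) {
    frac <- (step - 1) / n_steps
    # 4. decrease learning rate and environment (here: linearly, every step)
    rate <- alpha[1] - (alpha[1] - alpha[2]) * frac
    radius <- radius_start * (1 - frac)
    obj <- X[sample(n, 1), ]                     # 1. pick random object
    d <- rowSums(sweep(codes, 2, obj)^2)         # squared Euclidean distances
    winner <- which.min(d)                       # 2. determine winner in map
    hood <- which(grid_dist[winner, ] <= radius) # winner and its environment
    # 3. move the selected codebook vectors towards the object
    codes[hood, ] <- codes[hood, , drop = FALSE] +
      rate * sweep(-codes[hood, , drop = FALSE], 2, obj, "+")
  }
  list(codes = codes, grid = grid)
}

# Usage on a built-in data set (iris stands in for the wines data here)
fit <- train_som(scale(iris[, 1:4]), rlen = 10)
dim(fit$codes)
```

Because every update is a small step from a codebook vector towards an object, the codebook vectors stay inside the range of the data, and neighbouring units end up with similar vectors.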

  3. Supervised SOMs
     Why:
     ✎ use of all information
     ✎ better reproducibility
     ✎ better interpretability
     ✎ better predictions
     How:
     ✎ treat Y as a special (set of) variables
     ✎ separate range scaling of distances in X and Y
     ✎ explicit weighting of distances in X and Y
     ✎ for regression as well as classification

     W.J. Melssen, R. Wehrens and L.M.C. Buydens, Chemom. Intell. Lab. Syst. (2006), in press.

     R code:
     > library(kohonen)
     > data(wines)
     > xyfnet <- xyf(scale(wines), classvec2classmat(wine.classes),
     +               gr = somgrid(5, 5), rlen = 100, xweight = .5)

     X-ray powder patterns
     Descriptor of crystal structure: similar patterns should correspond to similar structures.
     [Figure: X-ray powder patterns, intensity I against 2θ, labelled A2, A6, B12, B14, C15, D16, E17, F18, N19, N20]
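One plausible reading of "separate range scaling" and "explicit weighting" of the X and Y distances is sketched below in base R: the distances from an object to all units are computed separately in X and in Y, each set is divided by its maximum, and the two are combined with a weight. The function name `xyf_winner`, its arguments and the choice of the maximum as scaling factor are assumptions for illustration; the slide does not give the exact formula, and this is not the kohonen implementation.

```r
# Winner determination with separately scaled and weighted X and Y distances
# (illustrative sketch; function and argument names are hypothetical).
xyf_winner <- function(x, y, codes_x, codes_y, xweight = 0.5) {
  dx <- sqrt(rowSums(sweep(codes_x, 2, x)^2))  # distances to units in X space
  dy <- sqrt(rowSums(sweep(codes_y, 2, y)^2))  # distances to units in Y space
  # Range scaling: divide each set of distances by its maximum
  scale_max <- function(d) if (max(d) > 0) d / max(d) else d
  combined <- xweight * scale_max(dx) + (1 - xweight) * scale_max(dy)
  which.min(combined)                          # index of the winning unit
}

# Three units with 2D X codebook vectors and class-membership Y vectors
codes_x <- rbind(c(0, 0), c(1, 1), c(5, 5))
codes_y <- rbind(c(1, 0), c(0, 1), c(0, 1))

xyf_winner(c(0.6, 0.6), c(1, 0), codes_x, codes_y, xweight = 1)    # X only: unit 2
xyf_winner(c(0.6, 0.6), c(1, 0), codes_x, codes_y, xweight = 0.5)  # X and Y: unit 1
```

With `xweight = 1` the object goes to the unit closest in X alone; with `xweight = .5` the class information in Y pulls it to a unit of its own class, which is the sense in which Y is treated as a special set of variables.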

  4. Package wccsom
     ✎ Self-organising maps for powder patterns
     ✎ Supervised and unsupervised mapping
     ✎ Special similarity function (WCC) with one parameter: triangle width

     Data set: steroids
     Space group   # compounds   label
     P212121       978           19
     P21           843           4
     P1            93            5
     C2            99            1
     Total         2013
     Training set (1342 compounds) and a test set (671 compounds).

     Mapping using cell volume
     [Figure: map coloured by log2(cell volume), colour scale 8 to 14]

     Mapping using space group
     [Figure: maps coloured by space group (legend: 5, 4, 19, 1), with panels “SOM: no Space Group information”, “XYF: including Space Group information” and “XYF: Volume and Space Group information”]

     Training time: 1 h 20’ (P 3.2GHz)

     R code:
     > xyfnet <- xyf(X[training,], Y[training],
     +               gr = somgrid(20, 20, "hexagonal"),
     +               rlen = 250, xweight = .5)
     > plot(xyfnet, "predict")

     > sompredictions <-
     +   predict(somnet, trainY = classvec2classmat(Ycl[training]))
     > plot(somnet, "property",
     +      property = sompredictions$unit.predictions)
     > plot(xyfnet, "predict")
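The slide names the WCC similarity function and its single parameter, the triangle width, without giving a formula. The base R sketch below assumes the usual form of a weighted cross-correlation: the cross-correlation of two patterns is weighted with a triangle centred at lag 0 and normalised by the equally weighted autocorrelations. The function name `wcc`, the measurement of `width` in data points and the zero-padding at the edges are assumptions for illustration, not the wccsom implementation.

```r
# Weighted cross-correlation (WCC) similarity between two patterns sampled on
# the same grid (illustrative sketch; 'width' is the triangle width in points).
wcc <- function(f, g, width = 20) {
  stopifnot(length(f) == length(g), width >= 1)
  n <- length(f)
  lags <- seq(-(width - 1), width - 1)
  w <- 1 - abs(lags) / width            # triangular weights, 1 at lag 0
  # Cross-correlation of a and b at integer lag r (zero outside the pattern)
  xcorr <- function(a, b, r) {
    i <- seq_len(n)
    j <- i + r
    ok <- j >= 1 & j <= n
    sum(a[i[ok]] * b[j[ok]])
  }
  weighted <- function(a, b) {
    sum(w * vapply(lags, function(r) xcorr(a, b, r), numeric(1)))
  }
  weighted(f, g) / sqrt(weighted(f, f) * weighted(g, g))
}

# Two synthetic patterns: the same peak, shifted by five points
x <- 1:200
peak <- function(centre) exp(-(x - centre)^2 / (2 * 3^2))
a <- peak(100)
b <- peak(105)

wcc(a, b, width = 1)   # lag 0 only: point-by-point comparison
wcc(a, b, width = 20)  # neighbouring lags contribute: shifted peaks still match
```

With `width = 1` only lag 0 counts and a shifted peak is heavily penalised; a wider triangle lets slightly shifted peaks contribute, which suits powder patterns where similar structures give peaks at nearby rather than identical positions.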

  5. Prediction results (test set)
     Volume prediction (correlation coefficients)
                              Seed 7   Seed 13   Seed 31
     SOM                      .01      -.04      .01
     XYF (class only)         .36      .41       .41
     XYF (class and volume)   .72      .28       .68

     Space group prediction (percentage correct)
                              Seed 7   Seed 13   Seed 31
     SOM                      43%      43%       24%
     XYF (class only)         87%      86%       85%
     XYF (class and volume)   79%      46%       66%

     Conclusions
     ✎ SOMs (supervised and unsupervised) are ideally suited for analysing databases of chemical structures
     ✎ Special distance measures can/must be used
     ✎ Supervised SOMs have many advantages: better predictions, easier to interpret, and better stability
     ✎ Training can take a long time but mapping is relatively fast
     ✎ Including space group information is important in predicting properties of crystals

     Acknowledgements
     ✎ René de Gelder
     ✎ Willem Melssen
     ✎ Egon Willighagen
     Library ’class’ by B.D. Ripley
     Edwards & Oman, RNews 3(3), 2003
