
LS-SVMlab & Large scale modeling. Kristiaan Pelckmans, ESAT-SCD/SISTA.



  1. LS-SVMlab & Large scale modeling
     Kristiaan Pelckmans, ESAT-SCD/SISTA
     J.A.K. Suykens, B. De Moor

  2. Content • I. Overview • II. Classification • III. Regression • IV. Unsupervised Learning • V. Time-series • VI. Conclusions and Outlooks

  3. Acknowledgements
Contributors to LS-SVMlab: Kristiaan Pelckmans, Johan Suykens, Tony Van Gestel, Jos De Brabanter, Lukas Lukas, Bart Hamers, Emmanuel Lambert. Supervisors: Bart De Moor, Johan Suykens, Joos Vandewalle.
Our research is supported by grants from several funding agencies and sources. Research Council K.U.Leuven: Concerted Research Action GOA-Mefisto 666 (Mathematical Engineering), IDO (IOTA Oncology, Genetic networks), several PhD/postdoc & fellow grants. Flemish Government: Fund for Scientific Research FWO Flanders (several PhD/postdoc grants, projects G.0407.02 (support vector machines), G.0080.01 (collective intelligence), G.0256.97 (subspace), G.0115.01 (bio-i and microarrays), G.0240.99 (multilinear algebra), G.0197.02 (power islands), research communities ICCoS, ANMMM), AWI (Bil. Int. Collaboration South Africa, Hungary and Poland), IWT (Soft4s (soft sensors), STWW-Genprom (gene promoter prediction), GBOU McKnow (knowledge management algorithms), Eureka-Impact (MPC control), Eureka-FLiTE (flutter modeling), several PhD grants). Belgian Federal Government: DWTC (IUAP IV-02 (1996-2001) and IUAP V-10-29 (2002-2006): Dynamical Systems and Control: Computation, Identification & Modelling), Program Sustainable Development PODO-II (CP-TR-18: Sustainability effects of Traffic Management Systems). Direct contract research: Verhaert, Electrabel, Elia, Data4s, IPCOS. JS is a professor at K.U.Leuven Belgium and a postdoctoral researcher with FWO Flanders. BDM and JWDW are full professors at K.U.Leuven Belgium.

  4. I. Overview • Goal of the presentation: 1. Overview & intuition, 2. Demonstration of LS-SVMlab, 3. Pinpoint research challenges, 4. Preparation for NIPS 2002 • Research results and challenges • Towards applications • Overview of LS-SVMlab

  5. I.2 Overview of research: “Learning, generalization, extrapolation, identification, smoothing, modeling” • Prediction (black-box modeling) • Points of view: Statistical Learning, Machine Learning, Neural Networks, Optimization, SVM

  6. I.2 Type, Target, Topic

  7. I.3 Towards applications • System identification • Financial engineering • Biomedical signal processing • Data mining • Bio-informatics • Text mining • Adaptive signal processing

  8. I.4 LS-SVMlab

  9. I.4 LS-SVMlab (2) • Starting points: – Modularity – Object-oriented & functional interface – Basic bricks for advanced research • Website and tutorial • Reproducibility (preprocessing)

  10. II. Classification: “Learn the decision function associated with a set of labeled data points to predict the values of unseen data” • Least Squares Support Vector Machines • Bayesian framework • Different norms • Coding schemes

  11. II.1 Least Squares Support Vector Machines (LS-SVM)
1. Least squares cost function + regularization & equality constraints
2. Non-linearity by Mercer kernels $K_\sigma(\cdot,\cdot)$
3. Primal-dual interpretation (Lagrange multipliers)
Primal parametric model: $y_i = w^T \varphi(x_i) + b + e_i$ → Dual non-parametric model: $y_i = \sum_{j=1}^{n} \alpha_j K_\sigma(x_j, x_i) + b + e_i$
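The primal-dual step on this slide can be made concrete with a small numerical sketch. The following is our own minimal illustration, not the LS-SVMlab implementation: the LS-SVM dual reduces to one linear system in the bias $b$ and the multipliers $\alpha$, here with an RBF kernel (all function names are ours).

```python
import numpy as np

def rbf_kernel(X, Z, sigma=1.0):
    # K_sigma(x, z) = exp(-||x - z||^2 / (2 sigma^2))
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    # Solve the LS-SVM dual linear system
    #   [[0, 1^T], [1, Omega + I/gamma]] [b; alpha] = [0; y]
    # where Omega is the kernel Gram matrix.
    n = len(y)
    Omega = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]          # bias b, multipliers alpha

def lssvm_predict(X_train, alpha, b, X_test, sigma=1.0):
    # Dual model: y_hat(x) = sum_j alpha_j K_sigma(x_j, x) + b
    return rbf_kernel(X_test, X_train, sigma) @ alpha + b
```

Note how the regularization constant $\gamma$ only enters as a ridge term $I/\gamma$ on the Gram matrix, which is what makes the LS-SVM training step a single linear solve rather than a quadratic program.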

  12. II.1 LS-SVM (L, a): “Learning representations from relations”
$\Omega = \begin{pmatrix} \langle a_1, a_1 \rangle & \langle a_1, a_2 \rangle & \cdots & \langle a_1, a_N \rangle \\ \langle a_2, a_1 \rangle & \cdots & & \\ \vdots & & \ddots & \\ \langle a_N, a_1 \rangle & \cdots & & \langle a_N, a_N \rangle \end{pmatrix}$

  13. II.2 Bayesian Inference
• Bayes rule (MAP): $P(\theta \mid X) = \dfrac{P(X \mid \theta)\, P(\theta)}{P(X)}$
• Closed-form formulas; approximations: Hessian in the optimum, Gaussian distribution
• Three levels of posteriors: Level 1: $P(\alpha \mid \gamma, K_\sigma, X)$; Level 2: $P(\gamma \mid K_\sigma, X)$; Level 3: $P(K_\sigma \mid X)$

  14. II.3 SVM formulations & norms • 1-norm + inequality constraints: SVM, extensions to any convex cost function • 2-norm + equality constraints: LS-SVM, weighted versions

  15. II.4 Coding schemes
Multi-class classification task → (multiple) binary classifiers
Encoding: class labels (e.g. 1 2 4 6 2 1 3 …) are mapped to binary codewords such as [-1 -1 -1 1] and [1 -1 -1 -1]; Decoding: binary classifier outputs (e.g. [1 -1 1 1]) are mapped back to class labels.
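The encoding/decoding idea on this slide can be sketched as follows. This is a minimal illustration of one possible scheme, one-vs-all codewords with minimum-Hamming-distance decoding; the function names are ours, not LS-SVMlab's.

```python
import numpy as np

def encode_one_vs_all(labels, n_classes):
    # Each class k gets the codeword with +1 in position k, -1 elsewhere;
    # column k of the result is the target vector for binary classifier k.
    Y = -np.ones((len(labels), n_classes), dtype=int)
    Y[np.arange(len(labels)), labels] = 1
    return Y

def decode(binary_outputs, n_classes):
    # Compare the sign pattern of the binary classifier outputs with
    # every class codeword and pick the closest one (Hamming distance).
    codebook = -np.ones((n_classes, n_classes), dtype=int)
    np.fill_diagonal(codebook, 1)
    signs = np.sign(binary_outputs)
    hamming = (signs[:, None, :] != codebook[None, :, :]).sum(-1)
    return hamming.argmin(axis=1)
```

Other codebooks (one-vs-one, error-correcting output codes) fit the same encode/decode pattern; only the codeword matrix changes.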

  16. III. Regression: “Learn the underlying function from a set of data points and its corresponding noisy targets in order to predict the values of unseen data” • LS-SVM (L, a) • Cross-validation (CV) • Bayesian inference • Robustness

  17. III.1 LS-SVM (L, a) • Least squares cost function + regularization & equality constraints • Mercer kernels • Lagrange multipliers: primal parametric → dual non-parametric

  18. III.1 LS-SVM (L, a) (2) • Regularization parameter: – Do not fit the noise (overfitting)! – Trade off noise and information. Example: $f(x) = \mathrm{sinc}(x) + \frac{\sin(10x)}{5} + e$

  19. III.2 Cross-validation (CV)
“How to estimate the generalization power of a model?”
• Division into training set and test set: 1 2 3 … t-1 | t … n
• Repeated division: leave-one-out CV (fast implementation)
• L-fold cross-validation
• Generalized cross-validation (GCV): $[\hat y_1 \;\cdots\; \hat y_N]^T = S(\gamma \mid X, K_\sigma)\,[y_1 \;\cdots\; y_N]^T$
• Complexity criteria: AIC, BIC, …
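The splitting schemes above can be sketched in a few lines. This is our own generic illustration, not toolbox code; leave-one-out CV is simply the special case l = n.

```python
import numpy as np

def l_fold_indices(n, l):
    # Split indices 0..n-1 into l contiguous folds; fold t is held out
    # for testing while the remaining folds form the training set.
    folds = np.array_split(np.arange(n), l)
    for t in range(l):
        test = folds[t]
        train = np.concatenate([folds[s] for s in range(l) if s != t])
        yield train, test

def cv_score(fit, predict, X, y, l=10):
    # Average held-out squared error over the l folds.
    errs = []
    for train, test in l_fold_indices(len(y), l):
        model = fit(X[train], y[train])
        errs.append(np.mean((predict(model, X[test]) - y[test]) ** 2))
    return float(np.mean(errs))
```

Minimizing `cv_score` over $(\gamma, \sigma)$ is one way to carry out the model-selection step discussed on the next slide.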

  20. III.2 Cross-validation Procedure (CVP): “How to optimize the model for optimal generalization performance” • Trade-off between fitting and model complexity • Kernel parameters • Optimization routine?

  21. III.1 LS-SVM (L, a) (3) • Kernel type and parameter: “Zoölogy as elephantism and non-elephantism” • Model comparison • By cross-validation or Bayesian inference

  22. III.3 Applications: “OK, but does it work?” • Soft4s – together with O. Barrero, L. Hoegaerts, IPCOS (ISMC), BASF, B. De Moor – soft sensor • ELIA – together with O. Barrero, I. Goethals, L. Hoegaerts, I. Markovsky, T. Van Gestel, ELIA, B. De Moor – prediction of short- and long-term electricity consumption

  23. III.2 Bayesian Inference
• Bayes rule (MAP): $P(\theta \mid X) = \dfrac{P(X \mid \theta)\, P(\theta)}{P(X)}$
• Closed-form formulas
• Three levels of posteriors: Level 1 (model parameters): $P(\alpha \mid \gamma, K_\sigma, X)$; Level 2 (regularization): $P(\gamma \mid K_\sigma, X)$; Level 3 (model comparison): $P(K_\sigma \mid X)$

  24. III.4 Robustness: “How to build good models in the case of non-Gaussian noise or outliers” • Influence function • Breakdown point • How: – depreciating the influence of large residuals – mean → trimmed mean → median • Robust CV, GCV, AIC, …
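The mean / trimmed mean / median progression on this slide can be illustrated numerically. This is our own toy example: a single large outlier shifts the mean drastically but barely affects the trimmed mean or the median, which is exactly the bounded-influence behaviour robust CV criteria rely on.

```python
import numpy as np

def trimmed_mean(x, trim=0.1):
    # Drop the trim fraction of smallest and largest values, then average.
    x = np.sort(np.asarray(x, dtype=float))
    k = int(trim * len(x))
    return x[k:len(x) - k].mean() if k > 0 else x.mean()

# Five inliers near 1.0 plus one gross outlier at 100.0.
data = np.array([1.0, 1.1, 0.9, 1.05, 0.95, 100.0])
print(np.mean(data))             # 17.5  (pulled away by the outlier)
print(trimmed_mean(data, 0.2))   # 1.025 (outlier trimmed off)
print(np.median(data))           # 1.025 (outlier ignored)
```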

  25. IV. Unsupervised Learning: “Extract important features from the unlabeled data” • Kernel PCA and related methods • Nyström approximation – from dual to primal – fixed size LS-SVM

  26. IV.1 Kernel PCA
[Figure: Principal Component Analysis vs. kernel-based PCA]
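A bare-bones kernel PCA sketch, written by us for illustration (the toolbox routines differ in detail): instead of eigendecomposing the covariance matrix, eigendecompose the centered kernel Gram matrix; the component scores are the scaled eigenvectors.

```python
import numpy as np

def kernel_pca(X, sigma=1.0, n_components=2):
    # Build the RBF Gram matrix, double-center it, and take the
    # leading eigenvectors as nonlinear component scores.
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = H @ K @ H
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Scale eigenvectors by sqrt(eigenvalue) to get projection scores.
    return eigvecs[:, order] * np.sqrt(np.abs(eigvals[order]))
```

With a linear kernel this reduces to ordinary PCA scores, which is one way to see the figure's left/right contrast.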

  27. IV.2 Kernel PCA (2) • Primal-dual LS-SVM style formulations • For kernel PCA, CCA, PLS

  28. IV.2 Nyström approximation
• Sampling of the integral equation
$\int K_\sigma(x, y)\, \phi_i(x)\, p(x)\, dx = \lambda_i \phi_i(y)$
⇓
$\frac{1}{N}\sum_{j=1}^{N} K_\sigma(x_j, y)\, \phi_i(x_j) = \lambda_i \phi_i(y)$ (full sample of size $N$), $\frac{1}{n}\sum_{j=1}^{n} K_\sigma(x_j, y)\, \phi_i(x_j) = \lambda_i \phi_i(y)$ (subsample of size $n$)
• Approximating the feature map $\varphi(\cdot)$ of a Mercer kernel $K_\sigma(x, y) = \varphi(x)^T \varphi(y)$
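The sampled eigenproblem above leads to the classic low-rank Gram-matrix approximation. The sketch below is our own minimal version, not toolbox code: pick $m$ landmark points, then approximate the full $n \times n$ Gram matrix as $K_{nm} K_{mm}^{+} K_{nm}^T$.

```python
import numpy as np

def rbf(X, Z, sigma=1.0):
    # RBF Mercer kernel K_sigma(x, z) = exp(-||x - z||^2 / (2 sigma^2)).
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def nystrom(kernel_fn, X, m, seed=0):
    # Nystrom approximation: sample m landmark points, then approximate
    # the full n x n Gram matrix as K_nm @ pinv(K_mm) @ K_nm.T.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)
    K_nm = kernel_fn(X, X[idx])            # n x m cross-kernel block
    K_mm = kernel_fn(X[idx], X[idx])       # m x m landmark Gram matrix
    return K_nm @ np.linalg.pinv(K_mm) @ K_nm.T
```

When $m = n$ the approximation is exact; for $m \ll n$ it trades accuracy for $O(nm^2)$ cost, which is what makes the fixed-size approach on the next slide tractable at large scale.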

  29. IV.3 Fixed Size LS-SVM
$y_i = w^T \varphi(x_i) + b + e_i \;\overset{?}{\rightarrow}\; y_i = \sum_{j=1}^{n} \alpha_j K_\sigma(x_j, x_i) + b + e_i$

  30. V. Time-series: “Learn to predict future values given a sequence of past values” • NARX • Recurrent vs. feedforward

  31. V.1 NARX
$\hat y_t = f(y_{t-1}, y_{t-2}, \ldots, y_{t-l})$
• Reducible to static regression over lagged values $\ldots, y_{t+1}, y_{t+2}, y_{t+3}, y_{t+4}, y_{t+5}, \ldots$
• CV and complexity criteria
• Predicting in recurrent mode
• Fixed size LS-SVM (sparse representation)
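The "reducible to static regression" step and the recurrent prediction mode can be sketched as below. This is our own illustration with made-up helper names: a lag matrix turns the series into (input, target) pairs, and recurrent prediction feeds the model's own outputs back as inputs.

```python
import numpy as np

def make_lag_matrix(y, l):
    # Turn a time series into a static regression problem: predict y_t
    # from the l previous values (y_{t-1}, ..., y_{t-l}), newest first.
    rows = [y[t - l:t][::-1] for t in range(l, len(y))]
    return np.array(rows), y[l:]

def predict_recurrent(predict_one, history, steps, l):
    # Iterated (recurrent) prediction: each predicted value is appended
    # to the lag buffer and reused as an input for the next step.
    buf = list(history[-l:])
    out = []
    for _ in range(steps):
        y_next = predict_one(np.array(buf[::-1]))   # lags, newest first
        out.append(y_next)
        buf = buf[1:] + [y_next]
    return np.array(out)
```

`predict_one` would be any one-step-ahead regressor (e.g. a trained LS-SVM); in recurrent mode its errors compound, which is why the slide treats recurrent and feedforward prediction separately.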

  32. V.1 NARX (2): Santa Fe time-series competition
