LS-SVMlab & Large scale modeling
Kristiaan Pelckmans, ESAT-SCD/SISTA
J.A.K. Suykens, B. De Moor
Content
- I. Overview
- II. Classification
- III. Regression
- IV. Unsupervised Learning
- V. Time-series
- VI. Conclusions and Outlooks
People

Contributors to LS-SVMlab:
- Kristiaan Pelckmans
- Johan Suykens
- Tony Van Gestel
- Jos De Brabanter
- Lukas Lukas
- Bart Hamers
- Emmanuel Lambert
Supervisors:
- Bart De Moor
- Johan Suykens
- Joos Vandewalle
Acknowledgements

Our research is supported by grants from several funding agencies and sources: Research Council K.U.Leuven: Concerted Research Action GOA-Mefisto 666 (Mathematical Engineering), IDO (IOTA Oncology, Genetic networks), several PhD/postdoc & fellow grants; Flemish Government: Fund for Scientific Research FWO Flanders (several PhD/postdoc grants, projects G.0407.02 (support vector machines), G.0080.01 (collective intelligence), G.0256.97 (subspace), G.0115.01 (bio-i and microarrays), G.0240.99 (multilinear algebra), G.0197.02 (power islands), research communities ICCoS, ANMMM), AWI (Bil. Int. Collaboration South Africa, Hungary and Poland), IWT (Soft4s (softsensors), STWW-Genprom (gene promotor prediction), GBOU McKnow (knowledge management algorithms), Eureka-Impact (MPC-control), Eureka-FLiTE (flutter modeling), several PhD grants); Belgian Federal Government: DWTC (IUAP IV-02 (1996-2001) and IUAP V-10-29 (2002-2006): Dynamical Systems and Control: Computation, Identification & Modelling), Program Sustainable Development PODO-II (CP-TR-18: Sustainability effects of Traffic Management Systems); Direct contract research: Verhaert, Electrabel, Elia, Data4s, IPCOS. JS is a professor at K.U.Leuven Belgium and a postdoctoral researcher with FWO Flanders. BDM and JWDW are full professors at K.U.Leuven Belgium.
I. Overview
- Goal of the Presentation
- 1. Overview & Intuition
- 2. Demonstration LS-SVMlab
- 3. Pinpoint research challenges
- 4. Preparation NIPS 2002
- Research results and challenges
- Towards applications
- Overview LS-SVMlab
I.2 Overview research
“Learning, generalization, extrapolation, identification, smoothing, modeling”
- Prediction (black box modeling)
- Point of view: Statistical Learning, Machine Learning, Neural Networks, Optimization, SVM
I.2 Type, Target, Topic
I.3 Towards applications
- System identification
- Financial engineering
- Biomedical signal processing
- Datamining
- Bio-informatics
- Textmining
- Adaptive signal processing
I.4 LS-SVMlab
I.4 LS-SVMlab (2)
- Starting points:
– Modularity
– Object Oriented & Functional Interface
– Basic bricks for advanced research
- Website and tutorial
- Reproducibility (preprocessing)
II. Classification
“Learn the decision function associated with a set of labeled data points to predict the values of unseen data”
- Least Squares Support Vector Machines
- Bayesian Framework
- Different norms
- Coding schemes
II.1 Least Squares Support Vector Machines (LS-SVM(L,a))
1. Least Squares cost-function + regularization & equality constraints
2. Non-linearity by Mercer kernels
3. Primal-dual interpretation (Lagrange multipliers)
Primal parametric model:

$$y_i = w^T x_i + b + e_i$$

Dual non-parametric model:

$$y_i = \sum_{j=1}^{n} \alpha_j \, K_\sigma(x_i, x_j) + b + e_i$$

with hyper-parameters: the regularization constant $\gamma$ and the kernel parameter $\sigma$ of the kernel $K_\sigma(\cdot,\cdot)$.
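To make the primal-dual pair concrete, here is a minimal numerical sketch of the dual linear system implied by the equations above: build the kernel matrix, solve for (b, α), and predict with the dual expansion. This is an illustration independent of the LS-SVMlab toolbox; the RBF kernel choice, the toy data and the values of γ and σ are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # K_sigma(x, z) = exp(-||x - z||^2 / (2 sigma^2))
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the LS-SVM dual linear system
       [0      1^T          ] [b    ]   [0]
       [1  Omega + I / gamma] [alpha] = [y]   for (b, alpha)."""
    n = X.shape[0]
    Omega = rbf_kernel(X, X, sigma)              # kernel (Gram) matrix
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = Omega + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]                       # bias b and dual weights alpha

def lssvm_predict(Xnew, alpha, b, Xtrain, sigma):
    # dual non-parametric model: sum_j alpha_j K_sigma(x, x_j) + b
    return rbf_kernel(Xnew, Xtrain, sigma) @ alpha + b

# toy usage (assumed data)
X = np.linspace(-3, 3, 50)[:, None]
y = np.sinc(X).ravel() + 0.1 * np.random.randn(50)
b, alpha = lssvm_fit(X, y, gamma=10.0, sigma=0.5)
yhat = lssvm_predict(X, alpha, b, X, sigma=0.5)
```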
II.1 LS-SVM(L,a)
“Learning representations from relations”
$$\Omega = \begin{bmatrix} \langle a_1, a_1\rangle & \langle a_1, a_2\rangle & \cdots & \langle a_1, a_N\rangle \\ \vdots & & \ddots & \vdots \\ \langle a_N, a_1\rangle & \langle a_N, a_2\rangle & \cdots & \langle a_N, a_N\rangle \end{bmatrix}$$
II.2 Bayesian Inference
- Bayes rule (MAP):
- Closed form formulas
- Approximations:
  – Hessian in the optimum
  – Gaussian distribution
- Three levels of posteriors:
$$P(\theta \mid X) = \frac{P(X \mid \theta)\,P(\theta)}{P(X)}$$

- Level 1: $P(\alpha \mid X, K_\sigma, \gamma)$
- Level 2: $P(\gamma \mid X, K_\sigma)$
- Level 3: $P(K_\sigma \mid X)$
II.3 SVM formulations & norms
- 1-norm + inequality constraints: SVM, extensions to any convex cost-function
- 2-norm + equality constraints: LS-SVM, weighted versions
II.4 Coding schemes
[Diagram: a sequence of multi-class labels (… 1 2 4 6 2 1 3 …) is encoded into several ±1 codewords, one per binary classifier, and the binary outputs are decoded back into class labels]

Encoding / decoding: a multi-class classification task is handled by (multiple) binary classifiers on the encoded labels.
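A minimal sketch of such an encoding/decoding scheme (a one-versus-rest ±1 code matrix and nearest-codeword decoding; the specific class labels and code matrix are assumptions for the example, not necessarily the codes on the slide):

```python
import numpy as np

# assumed code matrix: one row of +/-1 codewords per class (one-versus-rest here)
classes = np.array([1, 2, 4, 6])
codebook = 2 * np.eye(len(classes)) - 1          # class k -> k-th row of the codebook

def encode(labels):
    """Map multi-class labels to one +/-1 target per binary classifier."""
    idx = np.searchsorted(classes, labels)
    return codebook[idx]                         # shape (n_samples, n_binary_tasks)

def decode(binary_outputs):
    """Map the sign patterns of the binary classifiers back to class labels
    by picking the nearest codeword (minimum Hamming distance)."""
    dist = (binary_outputs[:, None, :] != codebook[None, :, :]).sum(-1)
    return classes[dist.argmin(axis=1)]

Y = encode(np.array([1, 2, 6, 2]))               # targets for the binary classifiers
labels_back = decode(np.sign(Y))                 # recovers [1, 2, 6, 2]
```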
III. Regression
“Learn the underlying function from a set of data points and its corresponding noisy targets in order to predict the values of unseen data”
- LS-SVM(L,a)
- Cross-validation (CV)
- Bayesian Inference
- Robustness
III.1 LS-SVM(L,a)
- Least Squares cost-function + regularization & equality constraints
- Mercer kernels
- Lagrange multipliers: primal parametric model ↔ dual non-parametric model
III.1 LS-SVM(L,a) (2)
- Regularization parameter:
  – Do not fit the noise (overfitting)!
  – Trade off noise and information

Toy example: $f(x) = \operatorname{sinc}(x) + \sin(10x)/5 + e$
III.2 Cross-validation (CV)
“How to estimate generalization power of model?”
- Division training set – test set
- Repeated division: Leave-one-out CV (fast implementation)
- L-fold cross-validation
- Generalized Cross-validation (GCV):
- Complexity criteria: AIC, BIC, …
$$\big[\hat{y}_1 \; \cdots \; \hat{y}_N\big]^T = S(X \mid K_\sigma, \gamma)\,\big[y_1 \; \cdots \; y_N\big]^T$$

with $S$ the smoother matrix mapping the targets to the fitted values.
[Diagram: index sequences 1 … n illustrating the data splits, e.g. leaving out a single point t or a block t-l, …, t+l]
III.2 Cross-validation Procedure (CVP)
“How to optimize model for optimal generalization performance”
- Trade-off between fitting and model complexity
- Kernel parameters
- Optimization routine?
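As an illustration of the procedure, a minimal sketch of L-fold cross-validation combined with a plain grid search over (γ, σ). It reuses the lssvm_fit / lssvm_predict helpers and the toy data from the sketch in II.1; the number of folds and the grid values are assumptions, and the sketch is independent of the toolbox.

```python
import numpy as np

def kfold_cv_mse(X, y, gamma, sigma, k=10, seed=0):
    """L-fold cross-validation estimate of the prediction error for one
    (gamma, sigma) pair, using lssvm_fit / lssvm_predict from the II.1 sketch."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for val in folds:
        tr = np.setdiff1d(np.arange(len(y)), val)            # training indices
        b, alpha = lssvm_fit(X[tr], y[tr], gamma, sigma)     # fit on k-1 folds
        yhat = lssvm_predict(X[val], alpha, b, X[tr], sigma) # predict held-out fold
        errors.append(np.mean((y[val] - yhat) ** 2))
    return np.mean(errors)

# plain grid search over the hyper-parameters (grid values are assumptions)
gammas = 10.0 ** np.arange(-2, 4)
sigmas = 10.0 ** np.linspace(-1, 1, 5)
scores = {(g, s): kfold_cv_mse(X, y, g, s) for g in gammas for s in sigmas}
gamma_best, sigma_best = min(scores, key=scores.get)
```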
III.1 LS-SVM(L,a) (3)
- Kernel type and parameter
“Zoölogy as elephantism and non-elephantism”
- Model Comparison
- By cross-validation or Bayesian Inference
III.3 Applications
“ok, but does it work?”
- Soft4s
  – Together with O. Barrero, L. Hoegaerts, IPCOS (ISMC), BASF, B. De Moor
  – Soft-sensor
- ELIA
  – Together with O. Barrero, I. Goethals, L. Hoegaerts, I. Markovsky, T. Van Gestel, ELIA, B. De Moor
  – Prediction of short- and long-term electricity consumption
III.2 Bayesian Inference
- Bayes rule (MAP):
- Closed form formulas
- Three levels of posteriors:
$$P(\theta \mid X) = \frac{P(X \mid \theta)\,P(\theta)}{P(X)}$$

- Level 1 (model parameters): $P(\alpha \mid X, K_\sigma, \gamma)$
- Level 2 (regularization): $P(\gamma \mid X, K_\sigma)$
- Level 3 (model comparison): $P(K_\sigma \mid X)$
III.4 Robustness
“How to build good models in the case of non-Gaussian noise or outliers”
- Influence function
- Breakdown point
- How:
– Depreciating the influence of large residuals
– Mean → trimmed mean → median
- Robust CV, GCV, AIC,…
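One way to depreciate large residuals, sketched below: fit an LS-SVM, compute a robust scale of the residuals, down-weight suspicious points, and refit with per-sample weights. The weight function, cut-off and number of reweighting steps are assumptions chosen for illustration, not toolbox defaults; the rbf_kernel helper is the one from the II.1 sketch, repeated here for completeness.

```python
import numpy as np

def rbf_kernel(A, B, sigma):   # same helper as in the II.1 sketch
    return np.exp(-((A[:, None, :] - B[None, :, :]) ** 2).sum(-1) / (2 * sigma ** 2))

def lssvm_fit_weighted(X, y, gamma, sigma, v):
    """Weighted LS-SVM: sample i gets effective regularization gamma * v[i],
    so the diagonal of the dual system becomes 1 / (gamma * v[i])."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.diag(1.0 / (gamma * v))
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]

def robust_lssvm(X, y, gamma, sigma, n_iter=3):
    """Iteratively reweighted LS-SVM: down-weight points with large residuals."""
    v = np.ones(len(y))
    for _ in range(n_iter):
        b, alpha = lssvm_fit_weighted(X, y, gamma, sigma, v)
        e = y - (rbf_kernel(X, X, sigma) @ alpha + b)          # residuals
        s = 1.4826 * np.median(np.abs(e - np.median(e)))       # robust scale (MAD)
        r = np.abs(e) / max(s, 1e-12)
        v = np.where(r <= 2.5, 1.0, 2.5 / r)                   # Huber-like weights
    return b, alpha
```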
IV. Unsupervised Learning
“Extract important features from the unlabeled data”
- Kernel PCA and related methods
- Nyström approximation
  – From dual to primal
  – Fixed size LS-SVM
IV.1 Kernel PCA
[Figure: linear Principal Component Analysis vs. kernel-based PCA]
IV.2 Kernel PCA (2)
- Primal Dual LS-SVM style formulations
- For Kernel PCA, CCA, PLS
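A minimal sketch of kernel PCA itself (center the kernel matrix in feature space, take its leading eigenvectors, and read off the projections). The RBF kernel and all parameter values are assumptions, and this is the standard kernel PCA computation rather than the primal-dual LS-SVM style formulation the slide refers to.

```python
import numpy as np

def rbf_kernel(A, B, sigma):   # same helper as in the II.1 sketch
    return np.exp(-((A[:, None, :] - B[None, :, :]) ** 2).sum(-1) / (2 * sigma ** 2))

def kernel_pca(X, sigma, n_components=2):
    """Standard kernel PCA: eigen-decompose the centered kernel matrix and
    return the projections of the data onto the leading components."""
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    Kc = J @ K @ J                               # centering in feature space
    lam, U = np.linalg.eigh(Kc)                  # eigenvalues in ascending order
    lam, U = lam[::-1][:n_components], U[:, ::-1][:, :n_components]
    scores = U * np.sqrt(np.maximum(lam, 0.0))   # projections of the training points
    return scores, lam

# usage on some data matrix X (parameters assumed):
# scores, lam = kernel_pca(X, sigma=0.5, n_components=2)
```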
IV.2 Nyström approximation
- Sampling of the integral equation
- Approximating feature map for a Mercer kernel

$$\int K_\sigma(x, y)\,\phi_i(x)\,p(x)\,dx = \lambda_i\,\phi_i(y)$$

$$\Downarrow \quad \text{(sampling with the } N \text{ data points)}$$

$$\sum_{j=1}^{N} K_\sigma(x_j, y)\,\phi_i(x_j) = \lambda_i\,\phi_i(y)$$

$$\Downarrow \quad \text{(restricting to a subsample of size } n\text{)}$$

$$\sum_{j=1}^{n} K_\sigma(x_j, y)\,\phi_i(x_j) = \lambda_i\,\phi_i(y)$$

yielding an approximate feature map $\varphi(\cdot)$ with

$$K_\sigma(x, y) \approx \varphi(x)^T \varphi(y)$$
IV.3 Fixed Size LS-SVM

Primal parametric model:

$$y_i = w^T \varphi(x_i) + b + e_i$$

Dual non-parametric model:

$$y_i = \sum_{j=1}^{n} \alpha_j\,K_\sigma(x_i, x_j) + b + e_i$$

? How to go back from the dual to a finite-dimensional primal model for large-scale data (see the sketch below)?
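A minimal sketch of the fixed-size idea under the Nyström approximation above: build an approximate feature map from the eigendecomposition of the kernel matrix on a small subsample, then estimate (w, b) in the primal by ridge regression. The random subsample selection (rather than an active selection criterion), the rbf_kernel helper repeated from the II.1 sketch, and all parameter values are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma):   # same helper as in the II.1 sketch
    return np.exp(-((A[:, None, :] - B[None, :, :]) ** 2).sum(-1) / (2 * sigma ** 2))

def nystrom_features(X, Xsub, sigma):
    """Approximate feature map phi(x) from the eigendecomposition of the kernel
    matrix on a subsample Xsub (n << N), so that K(x, z) ~= phi(x)^T phi(z)."""
    Ksub = rbf_kernel(Xsub, Xsub, sigma)          # n x n kernel matrix
    lam, U = np.linalg.eigh(Ksub)
    keep = lam > 1e-10                            # drop numerically zero modes
    lam, U = lam[keep], U[:, keep]
    Knm = rbf_kernel(X, Xsub, sigma)              # N x n cross-kernel matrix
    return Knm @ U / np.sqrt(lam)                 # N x n_kept feature matrix

def fixed_size_lssvm(X, y, sigma, gamma, n_sub=50, seed=0):
    """Estimate the model in the primal: ridge regression on Nystrom features."""
    rng = np.random.default_rng(seed)
    Xsub = X[rng.choice(len(X), size=n_sub, replace=False)]
    Phi = np.hstack([nystrom_features(X, Xsub, sigma), np.ones((len(X), 1))])
    reg = np.eye(Phi.shape[1]) / gamma
    reg[-1, -1] = 0.0                             # do not regularize the bias term
    wb = np.linalg.solve(Phi.T @ Phi + reg, Phi.T @ y)
    return wb, Xsub                               # primal weights [w; b] and subsample
```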
V. Time-series
“Learn to predict future values given a sequence of past values”
- NARX
- Recurrent vs. feedforward
V.1 NARX
- Reducible to static regression
- CV and Complexity criteria
- Predicting in recurrent mode
- Fixed size LS-SVM (sparse representation)
$$\hat{y}_t = f(y_{t-1}, y_{t-2}, \dots, y_{t-l})$$

applied along the sequence $\dots, y_t, y_{t+1}, y_{t+2}, y_{t+3}, y_{t+4}, y_{t+5}, \dots$
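A minimal sketch of the reduction to static regression and of prediction in recurrent mode: build a matrix of lagged values, fit the LS-SVM sketch from II.1 on it, and then iterate the one-step-ahead predictor while feeding its own predictions back in. The lag order, the toy series and the hyper-parameter values are assumptions.

```python
import numpy as np

def make_lagged(y, lags):
    """Reduce the series to static regression: inputs (y[t-1], ..., y[t-lags]),
    target y[t], for t = lags, ..., len(y)-1."""
    X = np.column_stack([y[lags - k - 1 : len(y) - k - 1] for k in range(lags)])
    return X, y[lags:]

def recurrent_forecast(y_hist, n_steps, lags, predict_one):
    """Iterate a one-step-ahead predictor in recurrent mode:
    each prediction is fed back in as an input for the next step."""
    buf = list(y_hist[-lags:])                    # most recent 'lags' values
    out = []
    for _ in range(n_steps):
        x = np.array(buf[::-1])[None, :]          # (y[t-1], ..., y[t-lags])
        y_next = float(predict_one(x))
        out.append(y_next)
        buf = buf[1:] + [y_next]                  # slide the window forward
    return np.array(out)

# usage with the LS-SVM sketch from II.1 (toy series, parameters assumed)
t = np.arange(300)
y = np.sin(0.2 * t) + 0.05 * np.random.randn(300)
Xlag, ytarget = make_lagged(y, lags=10)
b, alpha = lssvm_fit(Xlag, ytarget, gamma=100.0, sigma=2.0)
one_step = lambda x: lssvm_predict(x, alpha, b, Xlag, sigma=2.0)
y_future = recurrent_forecast(y, n_steps=50, lags=10, predict_one=one_step)
```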
V.1 NARX (2)
Santa Fe Time-series competition
V.2 Recurrent models?
“How to learn recurrent dynamical models?”
- Training cost = Prediction cost?
- Non-parametric model class?
- Convex or non-convex?
- Hyper-parameters?
$$\hat{y}_t = f(\hat{y}_{t-1}, \hat{y}_{t-2}, \dots, \hat{y}_{t-l})$$
VI.0 References
- J.A.K. Suykens, T. Van Gestel, J. De Brabanter, B. De Moor & J. Vandewalle (2002), Least Squares Support Vector Machines, World Scientific.
- V. Vapnik (1995), The Nature of Statistical Learning Theory, Springer-Verlag.
- B. Schölkopf & A. Smola (2002), Learning with Kernels, MIT Press.
- T. Poggio & F. Girosi (1990), "Networks for approximation and learning", Proceedings of the IEEE, 78, 1481-1497.
- N. Cristianini & J. Shawe-Taylor (2000), An Introduction to Support Vector Machines, Cambridge University Press.
VI. Conclusions
“Non-linear Non-parametric learning as a generalized methodology”
- Non-parametric Learning
- Intuition & Formulations
- Hyper-parameters
- LS-SVMlab