Kernel Methods for Fusing Heterogeneous Data
Gunnar Rätsch

SLIDE 1


Kernel Methods for Fusing Heterogeneous Data

Gunnar Rätsch

Friedrich Miescher Laboratory, Max Planck Society, Tübingen, Germany

Pre-conference Course, Bio-IT World Europe, Hannover, Germany

October 4, 2010

Friedrich Miescher Laboratory of the Max Planck Society
SLIDE 2

Roadmap

• Support Vector Machines (SVMs)
• Kernels and the “trick”
• Kernels for non-vectorial data
• Heterogeneous data integration
• Examples
• Software

[Figure: number of SVM-related publications in PubMed per year, 2000 to 2010*]

Slides and additional material available at: http://tinyurl.com/dfbi2010

http://fml.mpg.de/raetsch/lectures/datafusion-bio-it-2010


SLIDE 6

Margin Maximization

Example: Recognition of Splice Sites

Given: potential acceptor splice sites (intron/exon boundaries)
Goal: a rule that distinguishes true sites from false ones
Approach: linear classifiers with large margin

SLIDE 7

Margin Maximization

SVMs: Maximize the margin! Why?

• Intuitively, it feels safest: for a small error in the separating hyperplane, we do not suffer too many mistakes.
• Empirically, it works well.
• Learning theory indicates that it is the right thing to do.

[Figure: candidate 'AG' sites plotted by GC content before vs. after the 'AG', separated by a maximum-margin hyperplane with normal vector w]

SLIDE 8

Support Vector Machines for Binary Classification

How to Maximize the Margin?

[Figure: the 'AG' scatter plot with the maximum-margin hyperplane, its normal vector w, the margin, and a slack ξ for an example violating the margin]

Maximize

$$\underbrace{\rho}_{=\,\text{margin}} - C \sum_{i=1}^{n} \xi_i$$

subject to

$$y_i \langle w, x_i \rangle \geq \rho - \xi_i, \quad \xi_i \geq 0 \quad \text{for all } i = 1, \ldots, n, \qquad \|w\| = 1.$$

Examples on the margin are called support vectors [Vapnik, 1995]; allowing the slacks $\xi_i$ gives the soft-margin SVM [Cortes and Vapnik, 1995].

The hyperplane depends on the examples only through scalar products, since distances between examples can be written as

$$d(x, x')^2 = \|x - x'\|^2 = \langle x, x \rangle - 2 \underbrace{\langle x, x' \rangle}_{\text{scalar product}} + \langle x', x' \rangle.$$
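To make the optimization concrete, here is a minimal runnable sketch on hypothetical two-dimensional "GC content" features; it uses scikit-learn's SVC (a LibSVM wrapper), which is an assumption of convenience rather than the software discussed later in this deck. The parameter C is the slack trade-off constant from the formulation above.

```python
# Minimal soft-margin SVM sketch on hypothetical toy 2D "GC content" data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Toy data: GC content before/after the 'AG' consensus (made-up clusters).
X_true = rng.normal(loc=[0.6, 0.4], scale=0.05, size=(50, 2))
X_decoy = rng.normal(loc=[0.4, 0.6], scale=0.05, size=(50, 2))
X = np.vstack([X_true, X_decoy])
y = np.array([1] * 50 + [-1] * 50)

# C trades margin width against the slack penalties (the xi_i above).
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print("number of support vectors:", len(clf.support_))
```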

SLIDE 12

Inflating the Feature Space

Recognition of Splice Sites

Given: potential acceptor splice sites (intron/exon boundaries)
Goal: a rule that distinguishes true sites from false ones

[Figure: candidate 'AG' sites plotted by GC content before vs. after the 'AG'; true and decoy sites overlap]

More realistic problem? Not linearly separable! Need nonlinear separation? Need more features?


SLIDE 15

Inflating the Feature Space

Nonlinear Separations

Linear separation might not be sufficient!
⇒ Map into a higher-dimensional feature space. Example [Schölkopf and Smola, 2002]:

$$\Phi : \mathbb{R}^2 \to \mathbb{R}^3, \quad (x_1, x_2) \mapsto (z_1, z_2, z_3) := \left(x_1^2, \; \sqrt{2}\, x_1 x_2, \; x_2^2\right)$$

[Figure: a data set that is not linearly separable in the input coordinates (x1, x2) becomes linearly separable after mapping to (z1, z2, z3)]

SLIDE 16

Kernel “Trick”

Example: $x \in \mathbb{R}^2$ and $\Phi(x) := (x_1^2, \sqrt{2}\, x_1 x_2, x_2^2)$   [Boser et al., 1992]

$$\langle \Phi(x), \Phi(\hat{x}) \rangle = \left\langle (x_1^2, \sqrt{2}\, x_1 x_2, x_2^2), \; (\hat{x}_1^2, \sqrt{2}\, \hat{x}_1 \hat{x}_2, \hat{x}_2^2) \right\rangle = \langle (x_1, x_2), (\hat{x}_1, \hat{x}_2) \rangle^2 = \langle x, \hat{x} \rangle^2 =: k(x, \hat{x})$$

The scalar product in feature space (here $\mathbb{R}^3$) can be computed in input space (here $\mathbb{R}^2$)! This also works for higher orders and dimensions:
⇒ relatively low-dimensional input spaces
⇒ very high-dimensional feature spaces
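A quick numeric check of the identity above (the names are local illustrations, not from the slides): the scalar product of the explicitly mapped vectors agrees with the squared input-space scalar product.

```python
import numpy as np

def phi(x):
    # Explicit feature map R^2 -> R^3 from the slide.
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x, xhat = np.array([1.0, 2.0]), np.array([3.0, -1.0])
lhs = phi(x) @ phi(xhat)   # scalar product computed in feature space
rhs = (x @ xhat) ** 2      # kernel evaluated directly in input space
print(lhs, rhs)            # both equal 1.0
```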

SLIDE 17

Kernel “Trick”

Common Kernels

• Polynomial: $k(x, \hat{x}) = (\langle x, \hat{x} \rangle + c)^d$
• Sigmoid: $k(x, \hat{x}) = \tanh(\kappa \langle x, \hat{x} \rangle + \theta)$
• RBF: $k(x, \hat{x}) = \exp\!\left(-\|x - \hat{x}\|^2 / (2\sigma^2)\right)$
• Convex combinations: $k(x, \hat{x}) = \beta_1 k_1(x, \hat{x}) + \beta_2 k_2(x, \hat{x})$

Notes:
• These kernels are good for real-valued examples.
• Kernels may be combined in the case of heterogeneous data.

[Vapnik, 1995, Müller et al., 2001, Schölkopf and Smola, 2002]
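For reference, a minimal NumPy sketch of these kernels evaluated on full data matrices (rows are examples); the function names and default parameter values are illustrative assumptions.

```python
import numpy as np

def polynomial_kernel(X, Y, c=1.0, d=3):
    return (X @ Y.T + c) ** d

def sigmoid_kernel(X, Y, kappa=0.01, theta=0.0):
    return np.tanh(kappa * (X @ Y.T) + theta)

def rbf_kernel(X, Y, sigma=1.0):
    # Squared distances via <x,x> - 2<x,y> + <y,y>, as on the earlier slide.
    sq = (X**2).sum(1)[:, None] - 2 * (X @ Y.T) + (Y**2).sum(1)[None, :]
    return np.exp(-sq / (2 * sigma**2))

X = np.random.default_rng(0).normal(size=(5, 2))
# A convex combination of two kernels is again a kernel.
K = 0.5 * polynomial_kernel(X, X) + 0.5 * rbf_kernel(X, X)
```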
SLIDE 19

Kernel “Trick”

Illustration of Common Kernels

Kernels for real-valued data

[Figure: example decision boundaries for (a) linear, (b) polynomial, and (c) Gaussian kernels]

SLIDE 20

Results for Running Example

GC-Content-based Splice Site Recognition

Results for the illustrative example:

Kernel               auROC
Linear               88.2%
Polynomial d = 3     91.4%
Polynomial d = 7     90.4%
Gaussian σ = 100     87.9%
Gaussian σ = 1       88.6%
Gaussian σ = 0.01    77.3%

SVM accuracy of acceptor site recognition using polynomial and Gaussian kernels with different degrees d and widths σ. Accuracy is measured by the area under the ROC curve (auROC), computed using five-fold cross-validation.

SLIDE 21

Results for Running Example

Kernel Summary

• Nonlinear separation ⇔ linear separation of nonlinearly Φ-mapped examples
• A mapping Φ defines a kernel via $k(x, \hat{x}) := \langle \Phi(x), \Phi(\hat{x}) \rangle$
• The choice of kernel has to match the data at hand
• The RBF kernel often works pretty well

SLIDE 22

Ideas

Kernels for Non-vectorial Data

Examples:
• Biological sequences (DNA, proteins, . . . )
• Natural language (abstracts, gene mentions, . . . )
• Graphs/networks (interactions, co-expression, . . . )
• Images (cell imaging, tissue classification, . . . )
• Structured data (gene structures, patient information, . . . )

General ideas:
• Kernels compute similarity between examples
• Good kernels use domain-specific knowledge


SLIDE 24

String Kernels

The String Kernel Recipe

General idea:
• Count substrings shared by two strings
• The greater the number of common substrings, the more similar the two sequences are deemed

Variations:
• Allow gaps or mismatches
• Include wildcards or physico-chemical properties
• Motif kernels
• Consider position in the sequence

[Haussler, 1999, Zien et al., 2000, Leslie et al., 2002, 2003, Liao and Noble, 2002, Leslie and Kuang, 2004]

SLIDE 25

String Kernels

Recognizing Genomic Signals

Discriminate true signal positions from all other positions:
• True sites: fixed window around a true site
• Decoy sites: all other consensus sites

SLIDE 26

String Kernels

Types of Signal Detection Problems

Problem categorization (based on the positional variability of motifs):
• Position-independent → motifs may occur anywhere; for instance, tissue classification using the promoter region
• Position-dependent → motifs are very stiff, almost always at the same position; for instance, splice site identification
• Mixture of position-dependent/-independent → variable, but still positional information; for instance, promoter identification

SLIDE 29

Position-Independent Kernels

Spectrum Kernel

To make use of position-independent motifs:
• Idea: like the bag-of-words kernel (cf. text classification), but for biological sequences
• Count the n-mers in sequence x and in sequence x′; the spectrum kernel is the sum over all n-mers of the product of their counts

Example, n = 3:

3-mer     AAA  AAC  ...  CCA  CCC  ...  TTT
# in x      2    4  ...    1    0  ...    3
# in x′     3    1  ...    0    0  ...    1

k(x, x′) = 2 · 3 + 4 · 1 + ... + 1 · 0 + 0 · 0 + ... + 3 · 1
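A minimal sketch of the spectrum kernel following this recipe; the function name and example sequences are illustrative, not from the slides.

```python
from collections import Counter

def spectrum_kernel(x: str, y: str, n: int = 3) -> int:
    # Count all n-mers in each sequence.
    cx = Counter(x[i:i + n] for i in range(len(x) - n + 1))
    cy = Counter(y[i:i + n] for i in range(len(y) - n + 1))
    # Sum of products of counts; n-mers missing from either sequence add 0.
    return sum(cx[kmer] * cy[kmer] for kmer in cx.keys() & cy.keys())

print(spectrum_kernel("GATTACAGATTACA", "CAGATTAG"))
```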

SLIDE 30

Position-Dependent Kernels

Weighted Degree Kernel

= Spectrum kernels for each position

To make use of position-dependent motifs:

$$k(x, x') = \sum_{k=1}^{d} \beta_k \sum_{l=1}^{L-k} \mathbb{1}\left(u_{k,l}(x) = u_{k,l}(x')\right)$$

where
• $L$ := length of the sequence $x$
• $d$ := maximal “match length” taken into account
• $u_{k,l}(x)$ := subsequence of length $k$ at position $l$ of sequence $x$

Example with degree $d = 3$: $k(x, x') = \beta_1 \cdot 21 + \beta_2 \cdot 8 + \beta_3 \cdot 4$

[Rätsch and Sonnenburg, 2004, Sonnenburg et al., 2007]
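A minimal sketch of the weighted degree kernel following the formula above; the default β_k weights shown are one common decaying choice from the literature and are an assumption, as is the example call.

```python
def wd_kernel(x: str, y: str, d: int = 3, betas=None) -> float:
    # Weighted degree kernel as in the formula above; the beta_k defaults
    # are a common decaying choice and can be replaced by any weights.
    assert len(x) == len(y), "the WD kernel compares fixed positions"
    L = len(x)
    betas = betas or [2.0 * (d - k + 1) / (d * (d + 1)) for k in range(1, d + 1)]
    total = 0.0
    for k in range(1, d + 1):
        # l runs over positions 1, ..., L - k (0-based below).
        matches = sum(x[l:l + k] == y[l:l + k] for l in range(L - k))
        total += betas[k - 1] * matches
    return total

print(wd_kernel("GATTACA", "GATTAGA"))
```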


SLIDE 32

Position-Dependent Kernels

Sequence-based Splice Site Recognition

Kernel            auROC
Best vectorial    91.4%
Spectrum ℓ = 1    94.0%
Spectrum ℓ = 3    96.4%
Spectrum ℓ = 5    94.5%
WD ℓ = 1          98.2%
WD ℓ = 3          98.7%
WD ℓ = 5          98.9%

Area under the ROC curve (auROC) of SVMs with the spectrum and weighted degree kernels on the acceptor splice site recognition task, for different substring lengths ℓ.

(Alternatives: mixed spectrum, WD with shifts, oligo kernel, . . . )

[Zien et al., 2000, Jaakkola et al., 2000, Tsuda et al., 2002, Liao and Noble, 2002, Meinicke et al., 2004, Vert et al., 2004, Rätsch et al., 2005, Schultheiss et al., 2008]

SLIDE 33

Kernels for Graphs and Images

Kernels on Graphs
Idea: two graphs are similar when they contain many similar paths or sub-graphs
[Ralaivola et al., 2005, Borgwardt et al., 2005]

Kernels on Images
Idea: two images are similar when they contain a similar distribution of interest points (pipeline: image ⇒ interest-point features {f1, . . . , fm} ⇒ SVM)
[Nowak et al., 2006, Gehler and Nowozin, 2009]


SLIDE 35

Idea

Combining Heterogeneous Data

Consider data from different domains, e.g. DNA strings, structure, binding energies, conservation, . . .

Several options [Lanckriet et al., 2004]:
• Define a new kernel based on both pieces of information, for instance based on prior knowledge, or by adding or multiplying existing kernels:
  $k(x, x') = k_1(x, x') + k_2(x, x')$ or $k(x, x') = k_1(x, x') \cdot k_2(x, x')$
• Train classifiers independently and combine them appropriately in a second step (“late integration”)
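A minimal sketch of the first option ("early integration") on precomputed Gram matrices, trained through scikit-learn's precomputed-kernel interface; the library choice and the placeholder matrices are assumptions of convenience, since the talk's own stack (Shogun/easysvm) appears later.

```python
import numpy as np
from sklearn.svm import SVC

def combine_kernels(K1, K2, mode="sum"):
    # Sums and (elementwise) products of valid kernels are valid kernels.
    return K1 + K2 if mode == "sum" else K1 * K2

# Placeholder Gram matrices standing in for, e.g., a sequence kernel and a
# structure kernel evaluated on the same n examples.
n = 20
rng = np.random.default_rng(0)
A, B = rng.normal(size=(n, 5)), rng.normal(size=(n, 4))
K_seq, K_struct = A @ A.T, B @ B.T
y = np.array([1] * (n // 2) + [-1] * (n // 2))

clf = SVC(kernel="precomputed").fit(combine_kernels(K_seq, K_struct), y)
```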


SLIDE 38

Multiple Kernel Learning

Combinations of Kernels: Multiple Kernel Learning

Possible solution: add the two kernels,
$$k(x, x') := k_{\mathrm{sequence}}(x, x') + k_{\mathrm{structure}}(x, x').$$

Better solution: mix the two kernels,
$$k(x, x') := (1 - t)\, k_{\mathrm{sequence}}(x, x') + t\, k_{\mathrm{structure}}(x, x'),$$
where $t$ is estimated from the training data.

In general: use the data to find the best convex combination,
$$k(x, x') = \sum_{p=1}^{K} \beta_p k_p(x, x').$$

Algorithms:
• SDP/QCQP [Lanckriet et al., 2004, Bach et al., 2004]
• SILP (part of the Shogun toolbox) [Sonnenburg et al., 2006a]
• SimpleMKL [Rakotomamonjy et al., 2008]
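For the two-kernel mixture, the weight t can be chosen by cross-validation; the sketch below is a simple stand-in for the full MKL solvers listed above (which optimize all β_p jointly with the SVM), and it again assumes scikit-learn's precomputed-kernel interface.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def best_mixture(K_seq, K_struct, y, grid=np.linspace(0.0, 1.0, 11)):
    # Score each convex combination (1 - t) K_seq + t K_struct by 5-fold CV;
    # scikit-learn slices precomputed Gram matrices on both axes for us.
    scores = [
        cross_val_score(SVC(kernel="precomputed"),
                        (1 - t) * K_seq + t * K_struct, y, cv=5).mean()
        for t in grid
    ]
    return grid[int(np.argmax(scores))]
```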


SLIDE 42

TSS Recognition

Example 1: Human Transcription Start Site Recognition

[Sonnenburg et al., 2006b]

SLIDE 43

Prior Knowledge

Detecting Transcription Start Sites (TSS)

• Polymerase binds to a rather unspecific region of ≈ [−20, +20] bp
• Upstream of the TSS: promoter containing transcription factor binding sites (TFBS)
• Downstream of the TSS: 5' UTR, and further downstream coding regions and introns (different statistics)
• The 3D structure of the promoter must allow the transcription factors to bind
⇒ Many weak features contribute to TSS recognition

SLIDE 44

Setting up the SVM

The Five Sub-kernels

1. TSS signal (including parts of the core promoter with TATA box)
   – use a Weighted Degree Shift kernel
2. CpG islands, distant enhancers, and TFBS upstream of the TSS
   – use a Spectrum kernel (large window upstream)
3. Model of the coding sequence and TFBS downstream of the TSS
   – use another Spectrum kernel (small window downstream)
4. Stacking energy of the DNA
   – use the stacking energy of dinucleotides with a linear kernel
5. Twistedness of the DNA
   – use the twist angle of dinucleotides with a linear kernel

Combine weak features to build a strong promoter predictor:

$$k(x, x') = k_{\mathrm{TSS}}(x, x') + k_{\mathrm{CpG}}(x, x') + k_{\mathrm{coding}}(x, x') + k_{\mathrm{energy}}(x, x') + k_{\mathrm{twist}}(x, x')$$


SLIDE 46

Results

State-of-the-art Performance (in 2006 and still today)

[Figure: Receiver Operating Characteristic curve and Precision Recall curve]

⇒ 35% true positives at a false positive rate of 1/1000 (the best other method finds about half as many, 18%)

SLIDE 47

Results

Contributions of the Kernels

Test performance of single kernel (green) and ensemble without the kernel (red).

[Figure: area under the ROC curve (in %) when using or removing single kernels: TSS WD-shift, promoter Spectrum, 1st-exon Spectrum, angles Linear]

⇒ Most important: Weighted Degree Shift kernel modeling the TSS signal

SLIDE 48

Genome Annotation

Example 2: Genome Annotation with Heterogeneous Data

Goal: predict genes, taking advantage of heterogeneous genomic information.
Preliminaries: signals (splice sites etc.) can be predicted using SVMs.
• How can the signal predictions be combined?
• How can, for instance, RNA-seq experimental data be integrated to increase accuracy?
More details in my talk tomorrow (11:30am, NGS Data Management).


SLIDE 51

Genome Annotation

Late Integration of Signal Predictions

[Figure: STEP 1, SVM signal site predictions (Acc, Don, TSS, TIS, Stop) along the genomic position; STEP 2, integration of the predictions into a score f(y|x) per gene model, so that the true gene model is separated from wrong gene models with a large margin]

• Accurate signal site predictions using SVMs
• An SVM-like learning technique performs the “late integration”

SLIDE 54

Genome Annotation

Integration of RNA-Seq Information

[Figure: the same picture with RNA-seq coverage and intron support from spliced reads added to the score f(y|x); the margin between the true and the wrong gene models becomes larger]

Exploit RNA-Seq coverage and splice junctions

SLIDE 56

Genome Annotation

Gene Finding + Heterogeneous Information

Study 1 (A. thaliana, [Behr et al., 2009]), transcript level (SN + SP)/2:

1  mGene (ab initio)                              73.3%
2  ... with DNA methylation (1 tissue)            76.1%
3  ... with nucleosome position predictions       78.0%
4  ... with RNA secondary structure predictions   76.7%

Study 2 (C. elegans, [Behr et al., 2010, in prep.]), transcript level (SN + SP)/2:

1  mGene (ab initio)                       45.0%
2  ... with mass spectra                   45.0%
3  ... with tiling arrays                  45.7%
4  ... with ESTs (influenced annotation)   56.5%
5  ... with RNA-seq                        55.8%


SLIDE 59

Free Software

Available SVM Packages

• 2-class classification (35 hits on http://mloss.org; package names sorted by popularity)
• Multi-class classification (7 hits on http://mloss.org)
• Regression (54 hits on http://mloss.org)

More can be found at http://www.kernel-machines.org.

SLIDE 60

Free Software

Easy-to-use Software

• Easysvm: an easy-to-use SVM toolbox based on Python and the Shogun toolbox, usable from the command line or within Python
• PyML: an easy-to-use Python-based SVM toolbox, usable from the command line or within Python
• Shogun toolbox: a powerful toolbox for large-scale data analysis, including many SVM implementations, with support for Python, R, Matlab, and Octave
• LibSVM: an SVM library with a graphical interface
• SVM-Light: an efficient implementation of SVMs in C, usable from the command line
• Galaxy web service: a web service for using SVMs, with predefined kernels for real-valued data and string classification (based on Easysvm): http://galaxy.fml.tuebingen.mpg.de

SLIDE 61

Shogun

Shogun Toolbox

The Shogun toolbox [Sonnenburg et al., 2010] is the result of many years of development and implements most techniques discussed so far.

What can it do? Types of problems:
• Clustering (no labels)
• Classification (binary labels)
• Regression (real-valued labels)
• Structured output learning (structured labels)

The main focus is on Support Vector Machines (SVMs); it also implements a number of other ML methods, like:
• Hidden Markov Models (HMMs)
• Linear Discriminant Analysis (LDA)
• Kernel perceptrons

http://shogun-toolbox.org
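As a flavor of the Python bindings, a Gaussian-kernel SVM might look roughly like the sketch below; the class names come from Shogun's historical "modshogun" module and changed across versions, so every call here is an assumption to be checked against the installed release, not a fixed API.

```python
# Rough sketch against Shogun's historical Python bindings ("modshogun");
# constructors varied across versions, so adapt as needed.
import numpy as np
from modshogun import RealFeatures, BinaryLabels, GaussianKernel, LibSVM

X = np.random.randn(2, 100)              # Shogun stores examples as columns
y = np.sign(np.random.randn(100))

feats = RealFeatures(X)
kernel = GaussianKernel(feats, feats, 1.0)    # last argument: kernel width
svm = LibSVM(1.0, kernel, BinaryLabels(y))    # C, kernel, labels
svm.train()
predictions = svm.apply(feats).get_labels()
```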

SLIDE 62

Features

Large-Scale SVM Implementations

• Different SVM solvers employ different strategies
• Provides a generic interface to 11 SVM solvers
• Established implementations for solving SVMs with kernels: LibSVM, SVMlight
• More recent developments, fast linear SVM solvers: LibLinear, SvmOCAS [Franc and Sonnenburg, 2009]
• Support for multi-threading
⇒ We have trained SVMs with up to 50 million training examples

SLIDE 63

Features

Large Scale Computations

[Figure: training time vs. sample size]


SLIDE 67

Features

Interoperability

Supports many programming languages:
• Core written in C++ (> 130,000 lines of code)
• Bindings to R, Python, Matlab, Octave
• More to come, e.g. Java

Supports many data formats:
• SVMlight, LibSVM, CSV, HDF5

Community:
• Documentation available, with many examples (> 600)
• Source code is freely available
• Debian package, MacOSX support
• Mailing list, public SVN repository (read-only)
• > 1000 installations

SLIDE 68

Shogun Summary

When is SHOGUN for you?

• You want to work with SVMs (11 solvers to choose from)
• You want to work with kernels (35 different kernels), especially string kernels and combinations of kernels
• You have large-scale computations to do (up to 50 million examples)
• You use one of the following languages: R, Python, Octave/MATLAB, C++

SLIDE 69

Shogun Summary

Thank you!

For more information, visit:

• Course material: http://tinyurl.com/dfbi2010
• Demonstration examples: http://svmcompbio.tuebingen.mpg.de
• Shogun: http://www.shogun-toolbox.org
• Other information about the group: http://fml.mpg.de/raetsch

Acknowledgements:

• Help with slides: Sören Sonnenburg, Cheng Soon Ong, Christian Widmer
• Unpublished gene finding work: Jonas Behr

Thank you for your attention!!


SLIDE 72

References I

F.R. Bach, G.R.G. Lanckriet, and M.I. Jordan. Multiple kernel learning, conic duality, and the SMO algorithm. In Proceedings of the Twenty-first International Conference on Machine Learning, 2004.

J. Behr, G. Schweikert, J. Cao, F. De Bona, G. Zeller, S. Laubinger, S. Ossowski, K. Schneeberger, D. Weigel, and G. Rätsch. RNA-seq and tiling arrays for improved gene finding. Presented at the CSHL Genome Informatics Meeting, July 2009.

K.M. Borgwardt, C.S. Ong, S. Schonauer, S.V.N. Vishwanathan, A.J. Smola, and H.-P. Kriegel. Protein function prediction via graph kernels. Bioinformatics, 21 Suppl 1:i47–56, June 2005. doi: 10.1093/bioinformatics/bti1007.

B.E. Boser, I.M. Guyon, and V.N. Vapnik. A training algorithm for optimal margin classifiers. In D. Haussler, editor, Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, pages 144–152, 1992.

C. Cortes and V.N. Vapnik. Support vector networks. Machine Learning, 20:273–297, 1995.

SLIDE 73

References II

V. Franc and S. Sonnenburg. Optimized cutting plane algorithm for large-scale risk minimization. Journal of Machine Learning Research, 10:2157–2192, 2009.

P.V. Gehler and S. Nowozin. Let the kernel figure it out: Principled learning of pre-processing for kernel classifiers. In Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, June 2009. URL http://www.cvpr2009.org/.

D. Haussler. Convolution kernels on discrete structures. Technical Report UCSC-CRL-99-10, Computer Science Department, UC Santa Cruz, 1999.

T.S. Jaakkola, M. Diekhans, and D. Haussler. A discriminative framework for detecting remote protein homologies. J. Comp. Biol., 7:95–114, 2000.

G.R.G. Lanckriet, T. De Bie, N. Cristianini, M.I. Jordan, and W.S. Noble. A statistical framework for genomic data fusion. Bioinformatics, 2004.

C. Leslie and R. Kuang. Fast string kernels using inexact matching for protein sequences. Journal of Machine Learning Research, 5:1435–1455, 2004.
SLIDE 74

References III

C. Leslie, E. Eskin, and W.S. Noble. The spectrum kernel: A string kernel for SVM protein classification. In Proceedings of the Pacific Symposium on Biocomputing, pages 564–575, 2002.

C. Leslie, E. Eskin, J. Weston, and W.S. Noble. Mismatch string kernels for discriminative protein classification. Bioinformatics, 20(4), 2003.

L. Liao and W.S. Noble. Combining pairwise sequence similarity and support vector machines. In Proc. 6th Int. Conf. Computational Molecular Biology, pages 225–232, 2002.

P. Meinicke, M. Tech, B. Morgenstern, and R. Merkl. Oligo kernels for datamining on biological sequences: a case study on prokaryotic translation initiation sites. BMC Bioinformatics, 5(169), 2004.

K.-R. Müller, S. Mika, G. Rätsch, K. Tsuda, and B. Schölkopf. An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks, 12(2):181–201, 2001.

E. Nowak, F. Jurie, and B. Triggs. Sampling strategies for bag-of-features image classification. In Proc. ECCV'06, volume 3954 of Lecture Notes in Computer Science, page 490, 2006.

SLIDE 75

References IV

A. Rakotomamonjy, F. Bach, S. Canu, and Y. Grandvalet. SimpleMKL. Journal of Machine Learning Research, 9:2491–2521, 2008.

L. Ralaivola, S.J. Swamidass, H. Saigo, and P. Baldi. Graph kernels for chemical informatics. Neural Networks, 18(8):1093–1110, October 2005. doi: 10.1016/j.neunet.2005.07.009.

G. Rätsch and S. Sonnenburg. Accurate splice site detection for Caenorhabditis elegans. In B. Schölkopf, K. Tsuda, and J.-P. Vert, editors, Kernel Methods in Computational Biology. MIT Press, 2004.

G. Rätsch, S. Sonnenburg, and B. Schölkopf. RASE: recognition of alternatively spliced exons in C. elegans. Bioinformatics, 21(Suppl. 1):i369–i377, June 2005.

B. Schölkopf and A.J. Smola. Learning with Kernels. MIT Press, Cambridge, MA, 2002.

S.J. Schultheiss, W. Busch, J.U. Lohmann, O. Kohlbacher, and G. Rätsch. KIRMES: Kernel-based identification of regulatory modules in euchromatic sequences. In German Conference on Bioinformatics, Lecture Notes in Informatics, pages 158–167, Heidelberg, 2008. GI, Springer Verlag. URL http://www.fml.tuebingen.mpg.de/raetsch/projects/kirmes.

SLIDE 76

References V

S. Sonnenburg, G. Rätsch, and K. Rieck. Large scale learning with string kernels. In L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, Large Scale Kernel Machines. MIT Press, 2007.

Sören Sonnenburg, Gunnar Rätsch, Christin Schäfer, and Bernhard Schölkopf. Large scale multiple kernel learning. Journal of Machine Learning Research, 7:1531–1565, July 2006a.

Sören Sonnenburg, Alexander Zien, and Gunnar Rätsch. ARTS: Accurate Recognition of Transcription Starts in Human. Bioinformatics, 22(14):e472–480, 2006b.

Sören Sonnenburg, Gunnar Rätsch, Sebastian Henschel, Christian Widmer, Jonas Behr, Alexander Zien, Fabio de Bona, Alexander Binder, Christian Gehl, and Vojtech Franc. The SHOGUN machine learning toolbox. Journal of Machine Learning Research, 2010 (accepted). URL http://www.shogun-toolbox.org.

K. Tsuda, M. Kawanabe, G. Rätsch, S. Sonnenburg, and K.-R. Müller. A new discriminative kernel from probabilistic models. Neural Computation, 14:2397–2414, 2002.

SLIDE 77

References VI

V.N. Vapnik. The Nature of Statistical Learning Theory. Springer Verlag, New York, 1995.

J.-P. Vert, H. Saigo, and T. Akutsu. Local alignment kernels for biological sequences. In B. Schölkopf, K. Tsuda, and J.-P. Vert, editors, Kernel Methods in Computational Biology. MIT Press, 2004.

A. Zien, G. Rätsch, S. Mika, B. Schölkopf, T. Lengauer, and K.-R. Müller. Engineering support vector machine kernels that recognize translation initiation sites. Bioinformatics, 16(9):799–807, September 2000.

SLIDE 78

Appendix

Evaluation Measures for Classification I

The contingency table / confusion matrix:
• TP, FP, FN, TN are the absolute counts of true positives, false positives, false negatives, and true negatives
• N: sample size
• N+ = FN + TP: number of positive examples
• N− = FP + TN: number of negative examples
• O+ = TP + FP: number of positive predictions
• O− = FN + TN: number of negative predictions

outputs \ labels    y = +1   y = −1   Σ
f(x) = +1           TP       FP       O+
f(x) = −1           FN       TN       O−
Σ                   N+       N−       N

SLIDE 79

Appendix

Evaluation Measures for Classification II

Several commonly used performance measures

Name                             Computation
Accuracy                         ACC = (TP + TN) / N
Error rate (1 − accuracy)        ERR = (FP + FN) / N
Balanced error rate              BER = (1/2) · (FN / (FN + TP) + FP / (FP + TN))
Weighted relative accuracy       WRACC = TP / (TP + FN) − FP / (FP + TN)
F1 score                         F1 = 2·TP / (2·TP + FP + FN)
Cross-correlation coefficient    CC = (TP·TN − FP·FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))
Sensitivity / recall             TPR = TP / N+ = TP / (TP + FN)
Specificity                      TNR = TN / N− = TN / (TN + FP)
1 − sensitivity                  FNR = FN / N+ = FN / (FN + TP)
1 − specificity                  FPR = FP / N− = FP / (FP + TN)
P.p.v. / precision               PPV = TP / O+ = TP / (TP + FP)
False discovery rate             FDR = FP / O+ = FP / (FP + TP)
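A minimal sketch computing the table's measures from raw counts (CC is also known as the Matthews correlation coefficient); the function name and the example counts are illustrative.

```python
import math

def classification_measures(TP, FP, FN, TN):
    # Directly mirrors the formulas in the table above.
    N = TP + FP + FN + TN
    return {
        "ACC": (TP + TN) / N,
        "ERR": (FP + FN) / N,
        "BER": 0.5 * (FN / (FN + TP) + FP / (FP + TN)),
        "WRACC": TP / (TP + FN) - FP / (FP + TN),
        "F1": 2 * TP / (2 * TP + FP + FN),
        "CC": (TP * TN - FP * FN)
              / math.sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN)),
        "TPR": TP / (TP + FN),
        "TNR": TN / (TN + FP),
        "PPV": TP / (TP + FP),
        "FDR": FP / (FP + TP),
    }

print(classification_measures(TP=80, FP=10, FN=20, TN=90))
```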

SLIDE 80

Appendix

Evaluation Measures for Classification III

[left] Receiver Operating Characteristic (ROC) Curve [right] Precision Recall Curve

[Figure: left, ROC curves (true positive rate vs. false positive rate); right, precision curves (positive predictive value vs. true positive rate); each panel compares the proposed method, FirstEF, Eponine, and McPromotor]

(Obtained by varying bias and recording TPR/FPR or PPV/TPR.)

Use a bias-independent scalar evaluation measure:
• Area under the ROC curve (auROC)
• Area under the Precision Recall curve (auPRC)
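A minimal sketch computing both scalar measures from real-valued SVM outputs, assuming scikit-learn's metrics module; average_precision_score is a common step-wise estimate of the auPRC.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

y_true = np.array([1, 1, -1, 1, -1, -1])
scores = np.array([2.3, 0.8, 0.1, -0.2, -0.7, -1.5])  # SVM decision values

print("auROC:", roc_auc_score(y_true, scores))
print("auPRC:", average_precision_score(y_true, scores))
```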

SLIDE 81

Appendix

Comparison of String Kernels

Kernel              lx = lx′   Pr(x|θ)   Positional?   Scope          Complexity
linear              no         no        yes           local          O(lx)
polynomial          no         no        yes           global         O(lx)
locality-improved   no         no        yes           local/global   O(l · lx)
sub-sequence        yes        no        yes           global         O(n · lx · lx′)
n-gram/Spectrum     yes        no        no            global         O(lx)
WD                  no         no        yes           local          O(lx)
WD with shifts      no         no        yes           local/global   O(s · lx)
Oligo               yes        no        yes           local/global   O(lx · lx′)
TOP                 yes/no     yes       yes/no        local/global   depends
Fisher              yes/no     yes       yes/no        local/global   depends