Support Vector Machines for Classification of Flow Data

Support Vector Machines for Classification of Flow Data
Funded by SBIR Grant # R43 RR024094-01A1
FlowCap 2010
John Quinn, Ph.D., Treestar, john@treestar.com

Our Objective
• Demonstrate that supervised training algorithms can effectively replicate user-created gates
  – Very useful for high-throughput settings
  – Can increase robustness
• We believe this will be the first application in which algorithmic gate placement becomes the norm.

Selected Algorithm
• Support Vector Machine (SVM)
  – Radial kernel
• Supervised linear classifier that solves an optimization problem to find the hyperplane(s) that separate classes with the maximum distance between classes
  – With a non-linear mapping, data that is not linearly separable can be classified

SVM Operation
Optimization:
• Determine which elements of the training data mark the boundary of maximum distance between two classes, i.e. the support vectors
[Figure: Class 1 and Class 2 with their support vectors, separated by the maximum separation D]

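SVM Operation
• Optimization problem
For data: $\{(\mathbf{x}_i, c_i)\},\ c_i \in \{-1, 1\},\ i = 1, \dots, n$
A hyperplane that separates any two classes can be defined as: $\mathbf{w} \cdot \mathbf{x} - b = 0$
For $c_i = 1$: $\mathbf{w} \cdot \mathbf{x}_i - b \ge 1$
For $c_i = -1$: $\mathbf{w} \cdot \mathbf{x}_i - b \le -1$
Knowing that the data points should be outside of the margin, we can impose the constraint: $c_i(\mathbf{w} \cdot \mathbf{x}_i - b) \ge 1$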

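SVM Operation
We know that the support vectors will have a perpendicular distance from the hyperplane of: $\frac{1}{\|\mathbf{w}\|}$ and $-\frac{1}{\|\mathbf{w}\|}$
The distance between SVs can then be expressed as: $D = \frac{2}{\|\mathbf{w}\|}$
So the optimization is the maximization of D, which is equivalent to the minimization of $\|\mathbf{w}\|$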
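The relationship between the weight vector and the separation D can be checked numerically. The following is a minimal sketch, assuming the common sign convention in which the hyperplane is w·x − b = 0 and the margin constraint is c_i(w·x_i − b) ≥ 1. The toy points, labels, and candidate hyperplanes are made up for illustration and are not taken from the flow data.

```python
import math

def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def margin_width(w):
    """Distance D = 2 / ||w|| between the two margin hyperplanes."""
    return 2.0 / math.sqrt(dot(w, w))

def satisfies_constraints(w, b, points, labels):
    """True if every point lies on or outside the margin: c_i (w.x_i - b) >= 1."""
    return all(c * (dot(w, x) - b) >= 1.0 for x, c in zip(points, labels))

# Hypothetical toy data: class -1 on the line x = 0, class +1 on the line x = 2.
points = [(0.0, 0.0), (0.0, 1.0), (2.0, 0.0), (2.0, 1.0)]
labels = [-1, -1, 1, 1]

# Two feasible candidate hyperplanes; the one with the smaller ||w|| has the wider margin.
wide_w, wide_b = (1.0, 0.0), 1.0
narrow_w, narrow_b = (2.0, 0.0), 2.0

print(satisfies_constraints(wide_w, wide_b, points, labels), margin_width(wide_w))
print(satisfies_constraints(narrow_w, narrow_b, points, labels), margin_width(narrow_w))
```

Both candidates separate the toy classes, but the first has the smaller norm and therefore the larger D, which is why the optimization is usually stated as minimizing the norm of w.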

SVM Operation
We then use the inequality, $c_i(\mathbf{w} \cdot \mathbf{x}_i - b) \ge 1$, as a constraint to fix a critical point and use Lagrange multipliers $\alpha_i$ to express w as a linear combination of the training vectors: $\mathbf{w} = \sum_{i=1}^{n} \alpha_i c_i \mathbf{x}_i$
The $N_{SV}$ support vectors are then the $\mathbf{x}_i$ associated with non-zero (strictly positive) Lagrange multipliers

SVM Operation
Once w is known, and the support vectors have been identified, b can be solved as: $b = \frac{1}{N_{SV}} \sum_{i=1}^{N_{SV}} (\mathbf{w} \cdot \mathbf{x}_i - c_i)$
If there are more than two classes, the operation remains the same, but the hyperplanes are determined either as one versus all or pairwise
• We chose a one versus all format
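The recovery of w and b, and the one versus all rule, can be sketched in a few lines. This is an illustrative sketch only, assuming the convention w·x − b = 0, so that each support vector satisfies w·x_i − b = c_i. The multipliers, points, and the per-class models `A` and `B` are hypothetical values chosen by hand; they are not the output of the Matlab toolbox used in the talk.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def weight_vector(alphas, labels, points):
    """w = sum_i alpha_i * c_i * x_i over the training vectors."""
    dim = len(points[0])
    w = [0.0] * dim
    for a, c, x in zip(alphas, labels, points):
        for k in range(dim):
            w[k] += a * c * x[k]
    return w

def bias(w, support_vectors, support_labels):
    """Average of (w.x_i - c_i) over the support vectors."""
    total = sum(dot(w, x) - c for x, c in zip(support_vectors, support_labels))
    return total / len(support_vectors)

def one_vs_all_predict(models, x):
    """Pick the class whose own hyperplane gives the largest score w.x - b.

    `models` maps a class ID to its (w, b) pair, one binary SVM per class.
    """
    return max(models, key=lambda cid: dot(models[cid][0], x) - models[cid][1])

# Hypothetical two-point problem: both points are support vectors.
points = [(0.0, 0.0), (2.0, 0.0)]
labels = [-1, 1]
alphas = [0.5, 0.5]  # hand-picked so that sum(alpha_i * c_i) == 0

w = weight_vector(alphas, labels, points)
b = bias(w, points, labels)
print(w, b)

# Hypothetical one versus all setup with two classes.
models = {"A": ((1.0, 0.0), 1.0), "B": ((-1.0, 0.0), -1.0)}
print(one_vs_all_predict(models, (3.0, 0.0)), one_vs_all_predict(models, (-1.0, 0.0)))
```

One versus all trains one binary classifier per class, so the number of classifiers grows linearly with the number of clusters, whereas a pairwise scheme grows quadratically.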

SVM Operation
• Data not linearly separable? Map it to a space where it is!
  – We assume that flow data will have a Gaussian distribution and selected a Gaussian mapping
[Figure: Input Space and Mapped Space]
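A Gaussian mapping is commonly applied through the radial basis function kernel K(x, y) = exp(−γ‖x − y‖²) rather than by computing the mapped space explicitly. The sketch below assumes that form. The width parameter `gamma`, the multipliers, and the bias are hand-picked illustration values, not trained results, and the one-dimensional data set is invented to show a case that no single threshold can separate.

```python
import math

def gaussian_kernel(x, y, gamma=1.0):
    """Radial (Gaussian) kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def kernel_decision(x, support_vectors, support_labels, alphas, b, gamma=1.0):
    """Sign of sum_i alpha_i * c_i * K(x_i, x) - b, returned as +1 or -1."""
    score = sum(
        a * c * gaussian_kernel(sv, x, gamma)
        for a, c, sv in zip(alphas, support_labels, support_vectors)
    )
    return 1 if score - b >= 0 else -1

# Hypothetical 1-D data: class +1 sits between two groups of class -1,
# so no single threshold on the input axis separates the classes.
support_vectors = [(0.0,), (-2.0,), (2.0,)]
support_labels = [1, -1, -1]
alphas = [2.0, 1.0, 1.0]  # illustration values only
b = 0.0

for x in [(0.0,), (2.0,), (-2.0,)]:
    print(x, kernel_decision(x, support_vectors, support_labels, alphas, b))
```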

Why use an SVM?
• SVMs are deterministic
• They find the global maximum and not a local maximum
  – If the training data are representative of the real data, you cannot do better.
• SVMs are fast
  – They solve a maximization problem, as opposed to doing an iterative fitting

Preprocessing
• To prepare the training data, we:
  – Normalized the data to a range of -1 to 1
  – Identified the training data set with the largest number of clusters
    • Used this data set as the reference set
  – Calculated the centroid of each cluster in the reference set
  – In all other training data, calculated the Euclidean distance of each cluster to the clusters in the reference set and assigned them cluster IDs matching the reference cluster with the smallest distance measure
  – Took a sample of each training data set and combined them into one training vector to present to the SVM
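The normalization and cluster-matching steps above can be sketched as follows. This is a minimal sketch under stated assumptions: normalization is taken to be per-parameter min-max scaling, each cluster in a non-reference file is represented by its centroid, and the function names and toy centroids are hypothetical rather than taken from the original implementation.

```python
import math

def normalize(values):
    """Min-max scale one parameter to the range -1 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to scale
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def centroid(points):
    """Mean position of a cluster, one coordinate per parameter."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))

def euclidean(u, v):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def match_clusters(reference_centroids, other_centroids):
    """Assign each cluster the ID of the nearest reference centroid.

    `reference_centroids` maps a cluster ID to its centroid in the reference set.
    """
    return [
        min(reference_centroids, key=lambda cid: euclidean(reference_centroids[cid], c))
        for c in other_centroids
    ]

# Hypothetical reference set with two clusters, and two clusters from another file.
reference = {1: (-0.5, -0.5), 2: (0.5, 0.5)}
others = [(0.4, 0.6), (-0.6, -0.4)]

print(normalize([0, 5, 10]))
print(centroid([(0.0, 0.0), (2.0, 2.0)]))
print(match_clusters(reference, others))
```

Nearest-centroid matching can assign two clusters in one file to the same reference ID; whether the original pipeline guarded against that is not stated in the slides.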

Algorithm choice
• Matlab has a free file share repository
  – Someone has already put almost any algorithm you can think of into code
• I used the SVM coded by Junshui Ma and Yi Zhao of Ohio St. University
  – It received 5 stars

Training Data
• Example training data
  – Showing parameters 1 & 2, and 3 & 4 of the stem cell data set

Results
Speed:

Data set    Training time   Classification time   Files
CFSE        4 sec           2 min 48 sec          13
DLBCL       5 sec           67 sec                30
GvHD        5 sec           38 sec                12
NDD         11 sec          27 min 28 sec         30
Stem cell   4 sec           19 sec                30

Room for improvement…
• SVMs are highly dependent on identifying a transform that maps the data to a linearly separable space.
• We could experiment with a number of different transforms

FlowCap Feedback
• What went well
  – Data easily available
  – Submission process easy
  – Questions answered immediately!
• What could be improved
  – Wider publicity, particularly outside of our domain

Questions?
