SLIDE 1

SIGCOMM MineNet06 1

SVM Learning of IP Address Structure for Latency Prediction

Rob Beverly, Karen Sollins and Arthur Berger

{rbeverly,sollins,awberger}@csail.mit.edu

SLIDE 2

SVM Learning of IP Address Structure for Latency Prediction

  • The case for Latency Prediction
  • The case for Machine Learning
  • Data and Methodology
  • Results
  • Going Forward

SLIDE 3

The Case for Latency Prediction

SLIDE 4

Latency Prediction (again?)

  • Significant prior work:
    – King [Gummadi 2002]
    – Vivaldi [Dabek 2004]
    – Meridian [Wong 2005]
    – Others: IDMaps, GNP, etc.
  • Prior methods:
    – Active queries
    – Synthetic coordinate systems
    – Landmarks
  • Our work seeks to provide an agent-centric (single-node) alternative

SLIDE 5

Why Predict Latency?

  • 1. Service Selection: balance load, optimize performance, P2P replication
  • 2. User-directed Routing: e.g. IPv6 with per-provider logical interfaces
  • 3. Resource Scheduling: Grid computing, etc.
  • 4. Network Inference: measure additional topological network properties

SLIDE 6

An Agent-Centric Approach

  • Hypothesis: two hosts within the same subnetwork likely have consistent congestion and latency
  • Registry allocation policies give network structure – but fragmented and discontinuous
  • Formulate as a supervised learning problem
  • Given latencies to a set of (random) destinations as training:
    – predict_latency(unseen IP)
    – error = |predict(IP) – actual(IP)|
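The error metric above can be sketched in Python (a hypothetical illustration; `predict_latency` stands in for whatever trained model the agent uses, and all names are illustrative):

```python
# Hypothetical sketch of the per-IP error metric from this slide.
def prediction_error(predict_latency, measured):
    """Per-IP absolute error: |predict(IP) - actual(IP)|."""
    return {ip: abs(predict_latency(ip) - actual)
            for ip, actual in measured.items()}

# Toy usage: a constant 100ms predictor against two measured latencies (ms).
errors = prediction_error(lambda ip: 100.0,
                          {"18.0.0.1": 90.0, "128.30.0.1": 120.0})
# errors == {"18.0.0.1": 10.0, "128.30.0.1": 20.0}
```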

SLIDE 7

The Case for Machine Learning

SLIDE 8

Why Machine Learning?

  • Internet-scale networks are:
    – Complex (high-dimensional)
    – Dynamic
  • Can accommodate and recover from infrequent errors in a probabilistic world
  • Traffic provides a large and continuous training base

SLIDE 9

Candidate Tool: Support Vector Machine

  • Supervised learning (but amenable to online learning)
  • Separates the training set into two classes in the most general way
  • Main insight: find the hyperplane separator that maximizes the minimum margin between the convex hulls of the classes
  • Second insight: if the data is not linearly separable, take it to a higher dimension
  • Result: generalizes well, is fast, and accommodates unknown data structure
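The max-margin idea can be sketched with scikit-learn (an assumed tool choice – the slides don't name an implementation; data and parameters are illustrative):

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters; a large C approximates a hard margin,
# so the fitted separator maximizes the minimum margin between classes.
X = np.array([[0, 0], [1, 1], [1, 0], [4, 4], [5, 5], [4, 5]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1000.0).fit(X, y)

# Only points on the margin become support vectors; interior points
# could move without changing the solution.
print(clf.support_vectors_)
```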

SLIDE 10

SVMs – Maximize Minimum Margin

[Figure: the simplest case – two classes in two dimensions, linearly separable; the legend marks positive examples, negative examples, and support vectors]

SLIDE 11

SVMs – Redefining the Margin

[Figure: a single new positive example redefines the margin]

SLIDE 12

Non-SV Points don’t affect solution


SLIDE 13

IP Latency Non-Linearity

[Figure: 10ms and 200ms examples plotted against IP bit 0 and IP bit 1]

SLIDE 14

Higher Dimensions for Non-Linearity

[Figure: two classes (10ms vs. 200ms examples) in two dimensions, NOT linearly separable]

SLIDE 15

Kernel Function Φ

[Figure: after the kernel mapping, the two classes (10ms vs. 200ms examples) in three dimensions become linearly separable]

SLIDE 16

Support Vector Regression

  • Same idea as classification
  • ε-insensitive loss function

[Figure: latency regressed against the first octet, with the ε-insensitive tube after kernel mapping Φ]
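A sketch of ε-insensitive support vector regression with scikit-learn (an assumed tool; data and parameter values are illustrative, not the authors'):

```python
import numpy as np
from sklearn.svm import SVR

# First octet of the IP as the single input feature; latency (ms) as the
# target. Training points within `epsilon` of the fitted function incur
# no loss and do not become support vectors.
X = np.array([[18], [18], [24], [128], [128], [192]], dtype=float)
y = np.array([10.0, 12.0, 15.0, 180.0, 200.0, 190.0])

model = SVR(kernel="rbf", C=100.0, epsilon=5.0).fit(X, y)
pred = model.predict([[24.0]])  # predicted latency for an unseen octet
```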

SLIDE 17

Data and Methodology

SLIDE 18

Data Set

  • 30,000 random hosts responding to ping
  • Average latency to each over 5 pings
  • Non-trivial distribution for learning

SLIDE 19

Methodology

  • Average of 5 experiments:
    – Randomly permute the data set
    – Split the data set into training / test points

[Figure: the (IP, latency) data set is permuted, then split into training and test portions]

SLIDE 20

Methodology

  • Average of 5 experiments:
    – Training data defines the SVM
    – Measure performance on (unseen) test points
    – Each bit of the IP is an input feature

[Figure: permuted (IP, latency) pairs are split into training and test sets; the SVM is built on training data and performance measured on test data]
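The "each bit of the IP is an input feature" step and the permute/split procedure can be sketched as follows (function names are illustrative):

```python
import random

def ip_to_bits(ip):
    """Turn a dotted-quad IPv4 address into 32 binary input features,
    most significant bit first."""
    n = 0
    for octet in ip.split("."):
        n = (n << 8) | int(octet)
    return [(n >> (31 - i)) & 1 for i in range(32)]

def permute_and_split(dataset, train_frac=0.8, seed=0):
    """Randomly permute the data set, then split into train / test."""
    data = list(dataset)
    random.Random(seed).shuffle(data)
    cut = int(len(data) * train_frac)
    return data[:cut], data[cut:]
```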

SLIDE 21

Results

SLIDE 22

Results

  • Spoiler: so, does it work?
  • Yes – within 30% for more than 75% of predictions
  • Performance varies with the selection of parameters (a multi-objective optimization problem):
    – Training size
    – Input dimension
    – Kernel

SLIDE 23

Training Size

SLIDE 24

Question: Are MSBs Better Predictors?

  • Determine error versus the number of most significant bits of the test input IPs

SLIDE 25

Question: Are MSBs Better Predictors?

  • Determine error versus the number of most significant bits of the test input IPs

Powerful result: the majority of the discriminatory power is in the first 8 bits
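One way to test this is to mask all but the k most significant feature bits before prediction – a sketch of an assumed mechanism (the slides don't show the exact procedure):

```python
def mask_msb(bits, k):
    """Zero out all but the k most significant bits of a feature vector,
    so only the leading address structure remains visible to the model."""
    return [b if i < k else 0 for i, b in enumerate(bits)]

# Keeping the first 8 bits retains only leading-octet structure.
assert mask_msb([1] * 32, 8) == [1] * 8 + [0] * 24
```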

SLIDE 26

Feature Selection

  • Use feature selection to determine which individual bits of the address contribute to the discriminatory power of the prediction

θᵢ ← argminⱼ V(f(θ, xⱼ), y)   ∀ xⱼ ∉ {θ₁, …, θᵢ₋₁}
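The rule above is a greedy forward search: at step i, add the not-yet-chosen feature xⱼ that minimizes the validation error V. A sketch, with `validation_error` as an assumed callback standing in for V(f(θ, xⱼ), y):

```python
def greedy_feature_selection(features, validation_error, k):
    """At each step, add the feature whose inclusion minimizes the
    validation error of the model trained on the chosen subset."""
    chosen = []
    for _ in range(k):
        remaining = [f for f in features if f not in chosen]
        best = min(remaining,
                   key=lambda f: validation_error(chosen + [f]))
        chosen.append(best)
    return chosen
```

Applied once per address bit, this produces a cumulative ranking of bit "features" by discriminatory power.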

SLIDE 27

Feature Selection

Note x-axis: cumulative bit “features”

SLIDE 28

Feature Selection

Matches Intuition and Registry Allocations

SLIDE 29

Performance

  • Given the empirically optimal training size and input features
  • How well can agents predict latency to unknown destinations?

SLIDE 30

Prediction Performance

[Figure: predicted vs. actual latency; the ideal line is shown]

SLIDE 31

Prediction Performance

Within 30% for >75% of predictions
SLIDE 33

Going Forward

SLIDE 34

Future Research

  • How agents select training data (random, BGP prefix, registry allocation, from TCP flows, etc.)
  • How performance decays over time and how often to retrain
  • Online, continuous learning

SLIDE 35

Summary - Questions?

  • Major results:
    – An agent-centric approach to latency prediction
    – Validation of SVMs and kernel functions as a means to learn on the basis of Internet addresses
    – Feature selection analysis of the informational content of IP addresses in predicting latency
    – Latency estimation accuracy within 30% of the true value for >75% of data points

SLIDE 36

Prediction Accuracy

Only ~25% of predictions are accurate to within 5% of the actual value

SLIDE 37

Prediction Accuracy

About 45% of predictions are accurate to within 10% of the true value
But >85% of predictions are accurate to within 50% of the true value

SLIDE 38

Prediction Accuracy

Bottom line: suitable for coarse-grained latency prediction