Virtual Landmarks for the Internet Liying Tang Mark Crovella - PowerPoint PPT Presentation

Virtual Landmarks for the Internet Liying Tang Mark Crovella Boston University Computer Science

Internet Distance Matters! • Useful for configuring – Content delivery networks – Peer to peer applications – Multiuser games – Overlay routing networks – Server selection

Estimating Distance without Measuring • Internet coordinates – An Internet “location” assigned to each node • Proposed by Ng and Zhang, IMW 2001 – Called “Global Network Positioning” (GNP) • What is “distance”? – In this work, minimum RTT – Corresponds to propagation delay in the absence of queueing/congestion – Assumed to be stable long enough to be worth estimating – Good first-order predictor of path performance

Internet Coordinates: The Basic Idea Assign each node a set of coordinates, such that Euclidean distance approximates “network distance” (minimum RTT) $d$. Example: $x_1 = (3,2,4)$, $x_2 = (-2,5,3)$, with $\|x_1 - x_2\| \approx d$

But … but … but … This can’t work! Internet distances are too irregular! The Internet has arbitrary connectivity with no obvious geometry! And assigning coordinates must be computationally very expensive!

Two Questions 1. Are Internet coordinate schemes really accurate when applied to large sets of measurements spanning the whole Internet? 2. Can Internet coordinates be assigned in a computationally efficient way?

The Embedding Problem • A metric space is a pair $(X, d)$ where $X$ is a set of points, and $d: X \times X \to \mathbb{R}$ is a metric, i.e., it is symmetric, positive definite, and satisfies the triangle inequality. • A Euclidean space $\mathbb{R}^n$ is a metric space $(Y, \delta)$ with $Y$ = a set of vectors and $\delta$ = the distance induced by the Euclidean norm • An embedding is a mapping $\phi: X \to \mathbb{R}^n$ • Given some $X$, $d$, and $n$, we seek an accurate embedding, i.e., a $\phi$ with $\delta(\phi(x_1), \phi(x_2)) \approx d(x_1, x_2)$ for all $x_1, x_2$ in $X$

Versions of the Embedding Problem • Finite Metric Space (graph) embeddings – N. Linial – Precise, algorithmic, worst-case • Distance geometry – X and d are taken from a known Euclidean space – Exact solution for φ from linear algebra • Multidimensional Scaling (MDS) – Using geometric embedding to approximate empirical measurements

Multidimensional Scaling (MDS) • The most general kind of embedding problem – Arose first in psychology • Treated as a nonlinear optimization, i.e., $\phi = \arg\min_f \sum_{x_1, x_2 \in X} \left( \delta(f(x_1), f(x_2)) - d(x_1, x_2) \right)^2$ • Method used in first Internet studies (GNP) • Solved approximately via iterative methods – slow, can be difficult to configure
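To make the "nonlinear optimization solved iteratively" point concrete, here is a minimal sketch that minimizes the squared-error objective above with plain gradient descent. The solver, step size, iteration count, and synthetic data are all assumptions chosen for illustration; this is not claimed to be the procedure used by GNP.

```python
import numpy as np

def stress(Y, D):
    """Sum over pairs i<j of (embedded distance - target distance)^2."""
    delta = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    iu = np.triu_indices(len(D), k=1)
    return float(np.sum((delta[iu] - D[iu]) ** 2))

def mds_gradient_descent(D, Y0, steps=500, lr=0.01):
    """Iteratively move the points in Y0 to reduce the stress against D."""
    Y = Y0.copy()
    for _ in range(steps):
        diff = Y[:, None, :] - Y[None, :, :]        # diff[i, j] = Y[i] - Y[j]
        delta = np.linalg.norm(diff, axis=-1)       # current embedded distances
        safe = np.where(delta > 0, delta, 1.0)      # avoid division by zero
        coef = (delta - D) / safe
        grad = 2.0 * np.sum(coef[:, :, None] * diff, axis=1)
        Y -= lr * grad
    return Y

# Synthetic target distances (hypothetical data): 20 points in the unit square.
rng = np.random.default_rng(0)
points = rng.random((20, 2))
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

Y0 = rng.normal(size=(20, 2))                       # random starting coordinates
Y = mds_gradient_descent(D, Y0)
print(f"stress before: {stress(Y0, D):.3f}, after: {stress(Y, D):.3f}")
```

Even in this toy form, the result depends on the starting point, the step size, and the number of iterations, which is the configuration burden the slide alludes to.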

A different method: Lipschitz embedding • Lipschitz embedding: a point’s coordinates are the distances to a fixed set of landmarks • Example with three landmarks: $x_1 = (1,4,9)$, $x_2 = (7,3,1)$
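As a small illustration of the definition above, the sketch below builds Lipschitz coordinates by reading each node's distances to the chosen landmarks out of a distance matrix. The function name, the 4-node matrix, and the choice of landmarks are hypothetical.

```python
def lipschitz_embedding(dist, landmarks):
    """Map every node to its vector of distances to the given landmark nodes.

    dist      -- square matrix (list of lists); dist[i][j] is the distance i -> j
    landmarks -- list of node indices acting as landmarks
    """
    return [[row[l] for l in landmarks] for row in dist]

# Hypothetical symmetric distance matrix for 4 nodes (e.g. minimum RTT in ms).
dist = [
    [0, 5, 2, 6],
    [5, 0, 4, 3],
    [2, 4, 0, 7],
    [6, 3, 7, 0],
]

coords = lipschitz_embedding(dist, landmarks=[0, 1])
for node, c in enumerate(coords):
    print(node, c)
```

No optimization is involved: the coordinates are the measurements themselves, which is why the method is fast and simple.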

Why does the Lipschitz embedding work? Recall that d obeys the triangle inequality… so for two nodes with coordinates $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ at distance $\Delta$ from each other, $|x_1 - x_2| \le \Delta$, $|y_1 - y_2| \le \Delta$, etc. …so, if nodes 1 and 2 are close, their coordinates are similar

Lipschitz embedding of Internet distance • Advantages: – Fast! – Simple! • Questions: – Triangle inequality doesn’t hold… does it matter? – What is the right number of dimensions? – How can we achieve low dimensional embedding? • More landmarks → generally better results • But … more landmarks → larger coordinate vectors – Most importantly … is it accurate?

Turning to the data
Dataset      Dimensions      # Measurements   Notes
GNP          19 × 869        16,511           50% in North America
RON1         13 × 13         169              Mostly US
RON2         15 × 15         225              Mostly US
NLANR AMP    116 × 116       13,456           Abilene-connected
Skitter      12 × 196,286    2,355,565        50% outside US, attempts to span IP space
Sockeye      11 × 156,359    1,719,949        Penultimate hop to a node in each live /24

First question: Triangle Inequality • CDF of $\min_k \left( d(i,k) + d(k,j) \right) / d(i,j)$ over all pairs $(i,j)$
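The ratio plotted on this slide can be computed directly from a distance matrix. The sketch below assumes a square, symmetric matrix and assumes that the intermediate node k ranges over nodes other than i and j; the slide does not state either detail. A ratio below 1 means some detour through k is shorter than the direct distance, i.e., a triangle inequality violation. The matrix values are hypothetical.

```python
def triangle_ratios(dist):
    """For each pair i<j, the best detour length divided by the direct distance.

    Returns the ratios sorted ascending, ready to be plotted as a CDF.
    Requires at least 3 nodes and nonzero off-diagonal distances.
    """
    n = len(dist)
    ratios = []
    for i in range(n):
        for j in range(i + 1, n):
            detour = min(dist[i][k] + dist[k][j]
                         for k in range(n) if k not in (i, j))
            ratios.append(detour / dist[i][j])
    return sorted(ratios)

# Hypothetical 4-node matrix; the pair (0, 1) violates the triangle inequality
# because going via node 2 costs 3 + 4 = 7, less than the direct distance 10.
dist = [
    [0, 10, 3, 8],
    [10, 0, 4, 5],
    [3, 4, 0, 6],
    [8, 5, 6, 0],
]

ratios = triangle_ratios(dist)
violations = sum(r < 1 for r in ratios)
print(ratios)
print(f"{violations} of {len(ratios)} pairs violate the triangle inequality")
```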

Next Question: How many dimensions? • Answer via Principal Component Analysis (PCA) • PCA: optimal linear projection from higher dimension to lower dimension – $\phi$ is a linear function, so equivalent to multiplying by a matrix $M$, i.e., $\phi(x_1) \equiv M x_1$ • Plot of error of projection, as a function of number of dimensions of projected points, is called a scree plot

Exploring Dimensionality via Scree Plots • Illustrative experiment: start with 250 points randomly scattered in an $n$-dimensional unit hypercube • Form the 250 × 250 distance matrix • Treat this matrix as a set of 250 points in 250-dimensional space, i.e., as a Lipschitz embedding. • What is the error of projecting these points to a low dimensional space?
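The experiment described above can be sketched in a few lines of NumPy. The slides do not say which error measure the scree plot uses; this sketch assumes it is the fraction of total variance left unexplained after keeping the top r principal components. The function name and the seed are hypothetical.

```python
import numpy as np

def scree_errors(n_true, n_points=250, seed=0):
    """Residual variance fraction after projecting onto the top r components.

    Entry r-1 of the result is the error when r dimensions are kept.
    """
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, n_true))            # unit hypercube samples
    # Full distance matrix: row i is the Lipschitz embedding of point i,
    # using every point as a landmark.
    D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    centered = D - D.mean(axis=0)                       # PCA works on centered data
    s = np.linalg.svd(centered, compute_uv=False)       # singular values
    var = s ** 2
    return 1.0 - np.cumsum(var) / var.sum()

errs = scree_errors(n_true=5)
for r in (1, 2, 5, 10, 20):
    print(f"r = {r:2d}: residual variance fraction = {errs[r - 1]:.4f}")
```

Running this for several values of `n_true` and plotting each curve against r gives the kind of scree plot the next slide refers to.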

Scree Plot Exposes Underlying Dimension

Scree Plots of Internet Data Datasets similar, and error dropoff sharp!

Last Question: Achieving Low Dimensional Embedding • Scree plots also tell us that we can use PCA to reduce dimensionality of Lipschitz embedding • i.e., let $x_1, x_2, x_3, \ldots$ each be a set of measurements to $n$ known landmarks – Treat each as a vector of length $n$ • Then there is an $r \times n$ matrix $M$ with $r \approx 8$, such that $\|M x_i - M x_j\| \approx \|x_i - x_j\|$ • $M$ is found easily using PCA • Call this method “virtual landmarks” – coordinates are linear combinations of distances to real landmarks
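One way to obtain such a matrix is to take the top r right singular vectors of the mean-centered measurement matrix, as in the sketch below. The function name and the synthetic measurements are assumptions made for illustration; the slides do not specify the exact PCA procedure. Centering is used only when fitting: it cancels in the difference $M x_i - M x_j$, so coordinates can be computed as $M x$.

```python
import numpy as np

def fit_virtual_landmarks(X, r):
    """Return an r x n projection matrix M from measurement vectors.

    X -- array of shape (num_nodes, n); row i holds node i's distances
         to the n real landmarks (its Lipschitz coordinates)
    r -- number of virtual landmarks (output dimensions)
    """
    centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return Vt[:r]                     # rows are orthonormal principal directions

# Synthetic stand-in for measurements (hypothetical): 200 nodes and
# 20 landmarks placed in a 3-dimensional space.
rng = np.random.default_rng(1)
nodes = rng.random((200, 3))
landmarks = rng.random((20, 3))
X = np.linalg.norm(nodes[:, None, :] - landmarks[None, :, :], axis=-1)

M = fit_virtual_landmarks(X, r=8)
reduced = np.linalg.norm(M @ X[0] - M @ X[1])   # distance in virtual-landmark space
full = np.linalg.norm(X[0] - X[1])              # distance in the full Lipschitz space
print(M.shape, f"reduced = {reduced:.4f}, full = {full:.4f}")
```

Because the rows of `M` are orthonormal, the reduced distance can never exceed the full one; how close the two are depends on how quickly the scree plot falls off.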

Summary: Implications for Lipschitz Embedding • Triangle Inequality violations not severe • Embeddings in 7 to 9 dimensions should be sufficient • PCA can provide dimensionality reduction of Lipschitz embedding • … so, is Lipschitz embedding accurate? Evaluate using relative error: $|\delta(\phi(x_1), \phi(x_2)) - d(x_1, x_2)| / d(x_1, x_2)$
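The relative error metric is straightforward to compute once coordinates are assigned. The sketch below evaluates it for a few node pairs and reports the fraction of pairs under a threshold, which is how results such as "90% of distances have relative error below 0.5" would be tallied. All coordinates and measured distances here are hypothetical.

```python
import math

def relative_error(coord_a, coord_b, measured):
    """|estimated - measured| / measured, with the estimate being the
    Euclidean distance between the two coordinate vectors."""
    estimated = math.sqrt(sum((a - b) ** 2 for a, b in zip(coord_a, coord_b)))
    return abs(estimated - measured) / measured

def fraction_below(errors, threshold):
    """Fraction of error values strictly below the threshold."""
    return sum(e < threshold for e in errors) / len(errors)

# (coordinates of node a, coordinates of node b, measured distance)
pairs = [
    ((3, 2, 4), (-2, 5, 3), 5.0),   # estimate is sqrt(35), about 5.92
    ((0, 0, 0), (3, 4, 0), 5.0),    # estimate is exactly 5.0
    ((1, 1, 1), (1, 1, 3), 8.0),    # estimate is 2.0, a poor prediction
]

errors = [relative_error(a, b, d) for a, b, d in pairs]
print([round(e, 3) for e in errors])
print(f"fraction with relative error < 0.5: {fraction_below(errors, 0.5):.2f}")
```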

Lipschitz embedding in 8 dimensions 90% of distances have r.e. less than 0.5 (Skitter: 90% have r.e. less than 0.34)

Virtual Landmarks compared to GNP • NLANR AMP dataset • GNP: 3,626 sec • VL: < 1 sec

Virtual Landmarks compared to GNP (2) • GNP dataset • GNP: 182 sec • VL: < 1 sec

Scaling Virtual Landmarks • So far we have assumed that each node needing coordinates uses measurements to the same set of landmarks – presents scaling problems • But this is not necessary – VL method removes dependence on specific landmarks • Different nodes can use different landmark sets – As long as transformation between different coordinate systems is known

Scaling via Spanners • Two landmark sets give two coordinate systems, $M_1 x_1$ and $M_2 x_2$ • Spanners determine their coordinates in both systems … so can compute transformation matrix $T_{21}$
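Given the spanners' coordinates in both systems, a transformation matrix can be estimated by least squares, as sketched below. Several things here are assumptions: that $T_{21}$ maps system-2 coordinates into system 1, that a purely linear map (no offset) is adequate, and that least squares is the fitting method. The slides only state that the matrix can be computed. The data is synthetic.

```python
import numpy as np

def estimate_transform(spanners_sys1, spanners_sys2):
    """Least-squares matrix T such that T @ c2 approximates c1 for each spanner.

    spanners_sys1 -- (num_spanners, r) coordinates of the spanners in system 1
    spanners_sys2 -- (num_spanners, r) coordinates of the same spanners in system 2
    """
    # lstsq solves spanners_sys2 @ W ~= spanners_sys1; row-wise that is c1 = W.T @ c2.
    W, *_ = np.linalg.lstsq(spanners_sys2, spanners_sys1, rcond=None)
    return W.T

# Synthetic check (hypothetical data): build system-1 coordinates from a known
# matrix and confirm that the estimate recovers it.
rng = np.random.default_rng(2)
T_true = rng.normal(size=(3, 3))
C2 = rng.normal(size=(10, 3))          # 10 spanners in system 2
C1 = C2 @ T_true.T                     # the same spanners in system 1

T21 = estimate_transform(C1, C2)
node_in_sys2 = rng.normal(size=3)      # a node that only knows system-2 coordinates
node_in_sys1 = T21 @ node_in_sys2      # its coordinates expressed in system 1
print(np.round(T21 - T_true, 10))
```

With real measurements the fit would be approximate rather than exact, and the number of spanners needs to be at least the coordinate dimension for the system to be determined.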

Accuracy using Spanners 5 replications, AMP dataset, 2 sets of 20 landmarks

Coordinate Schemes for the Internet • Virtual Landmarks (Lipschitz embedding combined with PCA) is a fast and accurate method for assigning Internet coordinates – Computation is scalable to millions of nodes – Measurement is scalable to millions of nodes • Internet distances are surprisingly amenable to geometric embedding – Dimension about 7 to 9 – Consistent over all datasets

Why do network coordinate schemes work?

Coordinate systems are powerful • Coordinate systems open the door to geometric approaches to Internet problems – Clustering – Partitioning • Potential to unify hybrid wired/wireless application configuration • Potential to optimize overlays, p2p, multicast, server selection, etc. • A new kind of “map” of the Internet
