Multidimensional Scaling Applied Multivariate Statistics – Spring 2013

Outline
- Fundamental Idea
- Classical Multidimensional Scaling
- Non-metric Multidimensional Scaling
Appl. Multivariate Statistics - Spring 2013

Basic Idea
How to represent in two dimensions?

Idea 1: Projection

Idea 2: Squeeze on table
Close points stay close

Which idea is better?

Idea of MDS
- Represent a high-dimensional point cloud in few (usually 2) dimensions, keeping distances between points similar
- Classical/Metric MDS: Use a clever projection (R: cmdscale)
- Non-metric MDS: Squeeze data on table, only conserve ranks (R: isoMDS)

Classical MDS
- Problem: Given Euclidean distances among points, recover the positions of the points!
- Example: Road distances between 21 European cities (almost Euclidean, but not quite) …

Classical MDS
- First try:

Classical MDS
- Can identify points only up to:
  - shift
  - rotation
  - reflection
- Flip axes:
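
The statement above can be checked numerically. The following is a minimal sketch in plain Python (the slides themselves refer to R); the point coordinates, the angle, the shift and the helper names are made up for illustration. It shifts, rotates and reflects a small configuration and confirms that all pairwise distances stay the same, which is why distances alone cannot determine these three things.

```python
import math

def pairwise_distances(points):
    """Return the matrix of Euclidean distances between 2-D points."""
    return [[math.dist(p, q) for q in points] for p in points]

def transform(points, angle, shift, reflect=False):
    """Optionally reflect (flip the x-axis), then rotate by `angle` and shift."""
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y in points:
        if reflect:
            x = -x
        out.append((c * x - s * y + shift[0], s * x + c * y + shift[1]))
    return out

# Hypothetical configuration, not taken from the slides.
points = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (1.0, 2.0)]
moved = transform(points, angle=0.7, shift=(10.0, -5.0), reflect=True)

d_before = pairwise_distances(points)
d_after = pairwise_distances(moved)
max_diff = max(abs(a - b)
               for row_a, row_b in zip(d_before, d_after)
               for a, b in zip(row_a, row_b))
print(max_diff)  # differences are at floating-point rounding level
```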

Classical MDS
- Another example: Air pollution in US cities
- Range of manu and popul is much bigger than range of wind
- Need to standardize to give every variable equal weight
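
A minimal sketch of the standardization step, in plain Python rather than R. The variable names manu, popul and wind come from the slide; the numbers are invented placeholders and are not the real air pollution data. Each variable is rescaled to mean 0 and standard deviation 1, so that no variable dominates the distances just because of its units.

```python
import statistics

def standardize(column):
    """Scale one variable to mean 0 and sample standard deviation 1 (z-scores)."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)  # sample standard deviation (divides by n - 1)
    return [(v - mean) / sd for v in column]

# Invented values for three cities; NOT the real air pollution data.
data = {
    "manu": [35.0, 1064.0, 207.0],
    "popul": [71.0, 1513.0, 335.0],
    "wind": [8.6, 10.6, 9.0],
}
scaled = {name: standardize(col) for name, col in data.items()}
for name, col in scaled.items():
    print(name, [round(v, 2) for v in col])
```

After this step, Euclidean distances between cities weigh all three variables on the same scale.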

Classical MDS

Classical MDS: Theory
- Input: Euclidean distances between n objects in q dimensions
- Output: Position of points up to rotation, reflection, shift
- Two steps:
  - Compute inner products matrix B from the distances
  - Compute positions from B

Classical MDS: Theory – Step 1
- $X$: $n \times q$ data matrix
- Inner products matrix: $B = XX^T$, with $b_{ij} = \sum_{k=1}^{q} x_{ik} x_{jk}$
- Connect to distance: $d_{ij}^2 = \sum_{k=1}^{q} (x_{ik} - x_{jk})^2 = \dots = b_{ii} + b_{jj} - 2 b_{ij}$
- Center points to avoid shift invariance: $\bar{x} = 0 \Rightarrow \sum_{i=1}^{n} x_{ik} = 0 \Rightarrow \sum_{i} b_{ij} = \sum_{j} b_{ij} = 0$
- Invert relationship (“doubly centered”): $b_{ij} = -\frac{1}{2} \left( d_{ij}^2 - d_{i\cdot}^2 - d_{\cdot j}^2 + d_{\cdot\cdot}^2 \right)$, where $d_{i\cdot}^2$, $d_{\cdot j}^2$ and $d_{\cdot\cdot}^2$ are the row, column and overall means of the squared distances
- (Hint for middle of page 108: Plug in (4.3) and the equations on top of page 108 to show that the expression involving the $d$’s is equal to $b_{ij}$)
- Thus, we obtained $B$ from the distance matrix
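
The double-centering formula can be tried out on a toy example. This is a sketch in plain Python under stated assumptions: the four points are invented and chosen to be centered already, and the function name `double_center` is hypothetical. It computes B from the distances alone and compares it with the inner products computed directly from the coordinates.

```python
import math

def double_center(D):
    """Turn a matrix of distances D into the inner-product matrix B."""
    n = len(D)
    sq = [[D[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    col_mean = [sum(sq[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - col_mean[j] + total_mean)
             for j in range(n)] for i in range(n)]

# Invented points, already centered (each coordinate sums to zero).
X = [(-2.0, -1.0), (2.0, -1.0), (2.0, 1.0), (-2.0, 1.0)]
D = [[math.dist(p, q) for q in X] for p in X]

B = double_center(D)
# Inner products computed directly from the coordinates (B = X X^T).
B_direct = [[sum(a * b for a, b in zip(p, q)) for q in X] for p in X]

max_diff = max(abs(B[i][j] - B_direct[i][j]) for i in range(4) for j in range(4))
print(max_diff)  # differences are at floating-point rounding level
```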

Classical MDS: Theory – Step 2
- Since $B = XX^T$, we need the “square root” of $B$
- $B$ is a symmetric and positive semidefinite $n \times n$ matrix
- Thus, $B$ can be diagonalized: $B = V \Lambda V^T$
  - $\Lambda$ is a diagonal matrix with $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_n$ on the diagonal (“eigenvalues”)
  - $V$ contains the normalized eigenvectors as columns
- Some eigenvalues will be zero; drop them: $B = V_1 \Lambda_1 V_1^T$
- Take “square root”: $X = V_1 \Lambda_1^{1/2}$
- Thus, we obtained the positions of the points from the distances between all points
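
The two steps can be put together in a few lines. This sketch assumes NumPy is available (the slides use R's cmdscale instead); the function name `positions_from_B`, the tolerance and the five points are illustrative choices, not part of the slides. It recovers a configuration from B and checks that its pairwise distances match the input distances.

```python
import numpy as np

def positions_from_B(B, tol=1e-9):
    """Recover coordinates X with X X^T = B for a symmetric PSD matrix B."""
    eigvals, eigvecs = np.linalg.eigh(B)       # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]          # sort so lambda_1 >= lambda_2 >= ...
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > tol * max(eigvals[0], 1.0)    # drop (numerically) zero eigenvalues
    # Scale each kept eigenvector by the square root of its eigenvalue.
    return eigvecs[:, keep] * np.sqrt(eigvals[keep]), eigvals

# Invented, centered 2-D points used only to generate a distance matrix.
X_true = np.array([[-2.0, -1.0], [2.0, -1.0], [2.0, 1.0], [-2.0, 1.0], [0.0, 0.0]])
n = len(X_true)
D = np.linalg.norm(X_true[:, None, :] - X_true[None, :, :], axis=2)

# Step 1: double centering, written in matrix form.
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J

# Step 2: positions from B.
X_rec, eigvals = positions_from_B(B)
D_rec = np.linalg.norm(X_rec[:, None, :] - X_rec[None, :, :], axis=2)
print(X_rec.shape)
print(np.allclose(D, D_rec))
```

The recovered coordinates need not equal the original ones; they agree only up to rotation and reflection, which is why the check compares distances rather than positions.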

Classical MDS: Low-dim representation
- Keep only the few (e.g. 2) largest eigenvalues and corresponding eigenvectors
- The resulting X will be the low-dimensional representation we were looking for
- Goodness of fit (GOF) if we reduce to m dimensions (should be at least 0.8): $\mathrm{GOF} = \frac{\sum_{i=1}^{m} \lambda_i}{\sum_{i=1}^{n} \lambda_i}$
- Finds “optimal” low-dim representation: Minimizes $S = \sum_{i=1}^{n} \sum_{j=1}^{n} \left( d_{ij}^2 - (d_{ij}^{(m)})^2 \right)$, where $d_{ij}^{(m)}$ is the distance between points i and j in the m-dimensional representation
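
As a small worked example of the GOF ratio, here is a sketch in plain Python. The eigenvalues are invented and the function name `goodness_of_fit` is hypothetical; the 0.8 threshold is the rule of thumb from the slide.

```python
def goodness_of_fit(eigenvalues, m):
    """Share of the eigenvalue sum captured by the m largest eigenvalues."""
    lam = sorted(eigenvalues, reverse=True)
    return sum(lam[:m]) / sum(lam)

# Invented eigenvalues of B (all non-negative, i.e. Euclidean input).
eigenvalues = [16.0, 4.0, 0.5, 0.0, 0.0]
for m in (1, 2, 3):
    gof = goodness_of_fit(eigenvalues, m)
    verdict = "ok" if gof >= 0.8 else "too low"
    print(m, round(gof, 3), verdict)
```

With these numbers, one dimension falls just short of the threshold, while two dimensions capture almost all of the eigenvalue sum.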

Classical MDS: Pros and Cons
+ Optimal for Euclidean input data
+ Still optimal if B has non-negative eigenvalues (pos. semidefinite)
+ Very fast
- No guarantees if B has negative eigenvalues

However, in practice, it is still used then. New measures for goodness of fit:
$\mathrm{GOF} = \frac{\sum_{i=1}^{m} \lambda_i^2}{\sum_{i=1}^{n} \lambda_i^2}$, $\mathrm{GOF} = \frac{\sum_{i=1}^{m} |\lambda_i|}{\sum_{i=1}^{n} |\lambda_i|}$, $\mathrm{GOF} = \frac{\sum_{i=1}^{m} \max(0, \lambda_i)}{\sum_{i=1}^{n} \max(0, \lambda_i)}$
Used in R function “cmdscale”
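
To see why alternative measures are needed once eigenvalues turn negative, the three formulas above can be evaluated on a toy spectrum. This is a sketch in plain Python with invented eigenvalues and hypothetical names; it only evaluates the formulas as written on the slide and makes no claim about the exact values that cmdscale reports.

```python
def gof_variants(eigenvalues, m):
    """Evaluate the three goodness-of-fit measures for the m largest eigenvalues."""
    lam = sorted(eigenvalues, reverse=True)
    top = lam[:m]
    return {
        "squared": sum(v ** 2 for v in top) / sum(v ** 2 for v in lam),
        "absolute": sum(abs(v) for v in top) / sum(abs(v) for v in lam),
        "positive_part": (sum(max(0.0, v) for v in top)
                          / sum(max(0.0, v) for v in lam)),
    }

# Invented eigenvalues of B, two of them negative (non-Euclidean input).
eigenvalues = [10.0, 5.0, 1.0, -1.0, -3.0]

plain = sum(eigenvalues[:2]) / sum(eigenvalues)  # plain ratio, for comparison
variants = gof_variants(eigenvalues, 2)
print(plain)
for name, value in variants.items():
    print(name, round(value, 4))
```

Here the plain ratio exceeds 1 because the negative eigenvalues shrink the denominator, whereas all three variants stay between 0 and 1.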

Non-metric MDS: Idea
- Sometimes, there is no strict metric on original points
- Example: How beautiful are these persons? (1: Not at all, 10: Very much)
[Figure: persons with example ratings on this scale]

Non-metric MDS: Idea
- Absolute values are not that meaningful
- Ranking is important
- Non-metric MDS finds a low-dimensional representation which respects the ranking of distances

Non-metric MDS: Theory
- $\delta_{ij}$ is the true dissimilarity, $d_{ij}$ is the distance in the representation
- Minimize STRESS ($\theta$ is an increasing function): $S = \frac{\sum_{i<j} \left( \theta(\delta_{ij}) - d_{ij} \right)^2}{\sum_{i<j} d_{ij}^2}$
- Optimize over both the positions of the points and $\theta$
- $\hat{d}_{ij} = \theta(\delta_{ij})$ is called “disparity”
- Solved numerically (isotonic regression); Classical MDS as starting value; very time consuming
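
For a fixed configuration, the best increasing function can be found by isotonic regression. The sketch below, in plain Python, uses the pool-adjacent-violators algorithm for that step and then evaluates STRESS exactly as written above (without a square root). The function names and the numbers are invented for illustration, and the outer optimization over the point positions is deliberately left out.

```python
def isotonic_fit(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge while the previous block's mean exceeds the current block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted

def stress(dissimilarities, distances):
    """STRESS for one configuration; both arguments list the same pairs i<j."""
    # Order the pairs by dissimilarity, then fit the distances monotonically.
    order = sorted(range(len(dissimilarities)), key=lambda k: dissimilarities[k])
    fitted = isotonic_fit([distances[k] for k in order])
    disparities = [0.0] * len(distances)
    for rank, k in enumerate(order):
        disparities[k] = fitted[rank]
    numerator = sum((dh - d) ** 2 for dh, d in zip(disparities, distances))
    denominator = sum(d ** 2 for d in distances)
    return numerator / denominator, disparities

# Invented example with three pairs; the distances violate the rank order.
dissimilarities = [1.0, 4.0, 6.0]
distances = [3.0, 2.0, 5.0]
value, disparities = stress(dissimilarities, distances)
print(value, disparities)
```

The first two distances are in the wrong order relative to the dissimilarities, so they are pooled into a common disparity; a configuration whose distances already follow the rank order would give a STRESS of zero.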

Non-metric MDS: Example for intuition (only)
- True points in high-dimensional space: A, B, C with $\delta_{AB} < \delta_{BC} < \delta_{AC}$
- Compute best representation
- Distances in the representation: AB = 2, BC = 3, AC = 5; STRESS = 19.7

Non-metric MDS: Example for intuition (only)
- True points in high-dimensional space: A, B, C with $\delta_{AB} < \delta_{BC} < \delta_{AC}$
- Compute best representation
- Distances in the representation: AB = 2, BC = 2.7, AC = 4.8; STRESS = 20.1

Non-metric MDS: Example for intuition (only)
- True points in high-dimensional space: A, B, C with $\delta_{AB} < \delta_{BC} < \delta_{AC}$
- Compute best representation; stop if minimal STRESS is found
- Distances in the representation: AB = 2, BC = 2.9, AC = 5.2; STRESS = 18.9
- We will finally represent the “transformed true distances” (called disparities) $\hat{d}_{AB} = 2$, $\hat{d}_{BC} = 2.9$, $\hat{d}_{AC} = 5.2$ instead of the true distances $\delta_{AB} = 2$, $\delta_{BC} = 3$, $\delta_{AC} = 5$

Non-metric MDS: Pros and Cons
+ Fulfills a clear objective without many assumptions (minimize STRESS)
+ Results don’t change with rescaling or monotonic variable transformation
+ Works even if you only have rank information
- Slow in large problems
- Usually only local (not global) optimum found
- Only gets ranks of distances right

Non-metric MDS: Example
- Do people in the same party vote alike?
- Data: for each pair of 15 congressmen, the number of votes (out of 19) on which they disagreed …
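
A dissimilarity matrix of this kind can be built by counting, for every pair of members, the votes on which they differ. The sketch below is plain Python with invented voting records for four hypothetical members and five votes; it is not the congressional data from the slides, and how the original data treats abstentions is not stated there.

```python
def disagreement_matrix(votes):
    """votes[i][k] is member i's vote on ballot k; count pairwise differences."""
    n = len(votes)
    return [[sum(a != b for a, b in zip(votes[i], votes[j])) for j in range(n)]
            for i in range(n)]

# Invented voting records (Y = yes, N = no); NOT the congressional data.
votes = {
    "Member A": "YYNYN",
    "Member B": "YYNNN",
    "Member C": "NNYYY",
    "Member D": "NNYNY",
}
names = list(votes)
D = disagreement_matrix([votes[name] for name in names])
for name, row in zip(names, D):
    print(name, row)
```

The resulting matrix is symmetric with a zero diagonal. Counts like these are meaningful mainly through their ordering, which is the situation non-metric MDS is designed for.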

Non-metric MDS: Example

Concepts to know
- Classical MDS:
  - Finds low-dim projection that respects distances
  - Optimal for Euclidean distances
  - No clear guarantees for other distances
  - Fast
- Non-metric MDS:
  - Squeezes data points on table
  - Respects only rankings of distances
  - (Locally) solves clear objective
  - Slow

R commands to know
- cmdscale: included in standard R distribution
- isoMDS: from package “MASS”
