Multivariate Fundamentals: Distance Non-metric Multidimensional Scaling (NMDS)

Objective: Group data points into classes of similar points based on a series of variables.
There are many types of multidimensional scaling; PCA is equivalent to classical (metric) multidimensional scaling on Euclidean distances.
The goal of NMDS is to represent the original position of data in multidimensional space as accurately as possible using a reduced number of dimensions that can be easily plotted and visualized (like PCA).
BUT (unlike PCA, which uses Euclidean distances) NMDS relies on the rank order of the distances for ordination (i.e. it is non-metric).
The use of distances avoids some of the issues associated with using predictor variables alone (e.g., sensitivity to transformation).
This makes for a much more flexible technique that accepts a variety of data types.
Contributors to the development of multidimensional scaling: Shepard 1962; Kruskal 1964; Torgerson & Meuser 1962; Guttman 1968.

The math behind NMDS
NMDS is an iterative procedure which takes place over several steps:
1. Define the original data point positions in multidimensional space
2. Specify the number of reduced dimensions you want (typically 2)
3. Construct an initial configuration of the data in 2 dimensions
4. Compare the distances in this initial 2D configuration against the distances calculated from the original data
5. Determine the stress on the data points
6. Correct the position of the points in 2D to minimize the stress for all points
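The six steps can be sketched in base R. This is only an illustration under stated assumptions, not the algorithm that any particular package uses: the data are four toy points (A to D) with three variables, the initial configuration comes from classical scaling (`cmdscale`), stress is computed as Kruskal's stress-1 with a monotone regression (`isoreg`), and the positions are adjusted with the general-purpose optimiser `optim`. The helper name `stress1` is made up for this sketch.

```r
# Step 1: original positions in multidimensional (here 3D) space
X <- rbind(A = c(0.9, 1.9, 1.5),
           B = c(1.7, 0.5, 1.6),
           C = c(3.0, 2.0, 3.1),
           D = c(1.9, 3.5, 3.0))
D_orig <- dist(X)                  # Euclidean distances between all pairs

# Step 2: number of reduced dimensions
k <- 2

# Step 3: initial configuration (assumption: classical scaling as a start)
conf0 <- cmdscale(D_orig, k = k)

# Steps 4-5: compare reduced distances with the original ones via stress.
# stress1() is a hypothetical helper implementing Kruskal's stress-1.
stress1 <- function(d_orig, d_red) {
  o <- as.vector(d_orig)
  r <- as.vector(d_red)
  ord <- order(o)                  # rank order of the original distances
  dhat <- numeric(length(r))
  dhat[ord] <- isoreg(r[ord])$yf   # monotone (isotonic) fit to the 2D distances
  sqrt(sum((r - dhat)^2) / sum(r^2))
}
s0 <- stress1(D_orig, dist(conf0))

# Step 6: move the points in 2D to reduce the stress
obj <- function(p) stress1(D_orig, dist(matrix(p, ncol = k)))
opt <- optim(as.vector(conf0), obj)
conf1 <- matrix(opt$par, ncol = k, dimnames = list(rownames(X), NULL))

c(initial = s0, optimised = opt$value)
```

Because only the rank order of the original distances enters `stress1`, any monotone rescaling of `D_orig` gives the same stress, which is what makes the procedure non-metric.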

The math behind NMDS
Consider a 3 variable analysis with 4 data points.

Data.ID  Variable1  Variable2  Variable3
A        0.9        1.9        1.5
B        1.7        0.5        1.6
C        3          2          3.1
D        1.9        3.5        3

Euclidean distance matrix (could be any distance matrix):

     A    B    C    D
A    0    1.6  2.6  2.4
B    1.6  0    2.5  3.3
C    2.6  2.5  0    1.7
D    2.4  3.3  1.7  0

[Figure: points A to D plotted in 3D on axes Variable 1, Variable 2 and Variable 3, and the same points plotted in 2D by distance]

When we compress our 3D image to 2D we cannot accurately plot the true distances. E.g. the distances between AD and BC are too big in the image.
The difference between the data point positions in 2D (or the # of dimensions we consider with NMDS) and the distance calculations (based on all variables) is the STRESS we are trying to minimize.
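As a quick check, the distance matrix for this example can be recomputed in base R from the tabulated coordinates. The object names `X` and `distMatrix` are chosen here for illustration. With these coordinates five of the six distances agree with the matrix on the slide; C to D comes out near 1.9 rather than 1.7, so one of the two tables was probably transcribed slightly differently.

```r
# Coordinates of the four data points (rows) on the three variables (columns)
X <- data.frame(Variable1 = c(0.9, 1.7, 3.0, 1.9),
                Variable2 = c(1.9, 0.5, 2.0, 3.5),
                Variable3 = c(1.5, 1.6, 3.1, 3.0),
                row.names = c("A", "B", "C", "D"))

# Pairwise Euclidean distances; any other distance measure could be used instead
distMatrix <- dist(X, method = "euclidean")

# Show as a full symmetric matrix, rounded to one decimal like the slide
m <- round(as.matrix(distMatrix), 1)
m
```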

NMDS optimizing stress
Stress: a value representing the difference between the distances in the reduced dimensions and the distances in the complete multidimensional space.
NMDS tries to reduce the stress as much as possible.
Think of optimizing stress as: "Pulling on all points a little bit so no single point is completely wrong; all points are a little off compared to the distances."
Ideally we want as little stress as possible.
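One common way to put a number on this is Kruskal's stress-1 (Kruskal 1964). It is given here as a reference formula only; the slides do not say which stress definition the software reports, so the exact form is an assumption. Here $d_{ij}$ is the distance between points $i$ and $j$ in the reduced configuration, and $\hat{d}_{ij}$ is the fitted value from a monotone regression of those distances on the original dissimilarities, so only the rank order of the original distances matters.

```latex
\[
S = \sqrt{\frac{\sum_{i<j} \left(d_{ij} - \hat{d}_{ij}\right)^{2}}
               {\sum_{i<j} d_{ij}^{2}}}
\]
```

$S = 0$ means the reduced configuration reproduces the rank order of the original distances perfectly; larger values mean more distortion.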

NMDS in R
To run NMDS you need to install the ecodist package.
NMDS in R:
    library(ecodist)
    nmds(distMatrix, mindim=n, maxdim=n)    (ecodist package)
distMatrix = distance matrix of your data rows based on your predictor variables. You need to calculate this before running the NMDS analysis.
mindim = minimum number of dimensions you want to use
maxdim = maximum number of dimensions you want to use
You can run NMDS with as many dimensions as you have predictor variables, BUT we are trying to reduce the dimensions so we can group data points.
Typically we want to set both of these values to 2 to simplify our output.
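A minimal end-to-end run might look like the following. The data set (every fifth row of R's built-in `iris` measurements), the seed and the value of `nits` are assumptions made for illustration, and the `conf` and `stress` components are used as described in the ecodist documentation; check `?nmds` for the version you have installed.

```r
library(ecodist)   # install.packages("ecodist") if it is not installed yet

# Example data (assumption): every 5th row of the built-in iris measurements
X <- iris[seq(1, nrow(iris), by = 5), 1:4]

# The distance matrix has to be calculated before calling nmds()
distMatrix <- dist(X)

# nmds() starts from random configurations, so fix the seed for repeatability
set.seed(42)
fit <- nmds(distMatrix, mindim = 2, maxdim = 2, nits = 5)

# Keep the configuration with the lowest stress as the 2D scores
scores <- fit$conf[[which.min(fit$stress)]]
head(scores)
```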

NMDS in R
Distance matrix: Mahalanobis distance is a good choice for correlated variables.
Scores: these are the data point positions after the points have been pulled from multiple dimensions into 2D (or the # of dimensions considered) to minimize the stress. These are the values we plot to look at which data points group together.
We can merge a class variable back in to see whether pre-determined groups actually group out together, or to see which groups we could potentially combine.
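The sketch below combines both points: a Mahalanobis distance matrix, then scores with a class variable merged back in. It assumes ecodist's `distance()` with `method = "mahalanobis"`, uses a subset of the built-in `iris` data as a stand-in for your own data, and the column names `NMDS1` and `NMDS2` are chosen here for illustration.

```r
library(ecodist)

# Example data (assumption): every 5th iris row, with Species as the class
keep   <- seq(1, nrow(iris), by = 5)
X      <- as.matrix(iris[keep, 1:4])
groups <- iris$Species[keep]

# Mahalanobis distance takes the correlation between variables into account
distMatrix <- distance(X, method = "mahalanobis")

set.seed(42)
fit <- nmds(distMatrix, mindim = 2, maxdim = 2, nits = 5)

# Scores of the lowest-stress configuration, with the class merged back in
scores <- as.data.frame(fit$conf[[which.min(fit$stress)]])
names(scores) <- c("NMDS1", "NMDS2")
scores$Species <- groups

# Colour the points by class to see whether the groups separate
plot(scores$NMDS1, scores$NMDS2, col = as.integer(scores$Species), pch = 19,
     xlab = "NMDS1", ylab = "NMDS2")
legend("topright", legend = levels(scores$Species),
       col = seq_along(levels(scores$Species)), pch = 19)
```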

NMDS in R
Stress: a value representing the difference between the distances in the reduced dimensions and the distances in the complete multidimensional space.
R will produce a list of stress values. In ecodist these are the final stress values of the repeated runs, each started from a different random configuration (the number of runs is set with nits), so the run with the lowest stress is the one to keep. The more complex your dataset, the more iterations (and time to run the analysis) are needed.
A single stress value is uninformative by itself, but you should check how the stress changes when you consider more dimensions (modify maxdim).

NMDS in R
Your data may NOT be viewable in 2D because of high stress.
Use the rationale: "Include dimensions until I don't gain a significant reduction in my stress value."
If stress is too high for 2D or 3D, NMDS might not be the best method, i.e. visualizing your data in fewer dimensions compromises the data too much.
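One way to apply this rationale is to fit NMDS separately for 1, 2, 3 and 4 dimensions and compare the lowest stress reached in each. The example data, seed and `nits` value are assumptions for illustration, and `best_stress` is a name chosen for this sketch.

```r
library(ecodist)

# Example data (assumption): every 5th row of the built-in iris measurements
X <- iris[seq(1, nrow(iris), by = 5), 1:4]
distMatrix <- dist(X)

set.seed(42)
dims <- 1:4

# For each number of dimensions, keep the lowest stress over the repeated runs
best_stress <- sapply(dims, function(k) {
  fit <- nmds(distMatrix, mindim = k, maxdim = k, nits = 5)
  min(fit$stress)
})

data.frame(dimensions = dims, stress = best_stress)

# Look for the point where adding a dimension stops reducing stress noticeably
plot(dims, best_stress, type = "b",
     xlab = "Number of dimensions", ylab = "Lowest stress")
```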

NMDS - Biplot
Data points are plotted by their scores in 2D.
The direction of each arrow (+/-) indicates the trend of the points: points lying towards the arrow have more of that variable.
The closeness of points indicates how similar they are.
It is up to you to determine where groupings should be made.

NMDS - Biplot
Once you decide on groups you can then use graphics to distinguish them easily.
We cover this in Lab 5.
