Computes multidimensional scaling and displays the output as a two-dimensional scatterplot.
Technical details
Description
Metric MDS minimizes the difference between distances in input and output spaces. Non-metric MDS aims to preserve the ranking of distances between input and output spaces. You can find out more about MDS on our blog here.
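As a rough illustration of the two approaches, the sketch below runs both on R's built-in eurodist distances. It assumes cmdscale (classical scaling, from the stats package) as a stand-in for the metric solution, which is not necessarily what this feature calls internally. isoMDS from MASS is used for the non-metric solution.

```r
library(MASS)  # provides isoMDS

d <- eurodist  # built-in 'dist' object: road distances between 21 European cities

# Metric (classical) MDS: 2-D coordinates whose distances approximate d
metric.fit <- cmdscale(d, k = 2)

# Non-metric MDS: aims to preserve the rank order of the distances
nonmetric.fit <- isoMDS(d, k = 2, trace = FALSE)

# Both give one row of 2-D coordinates per object
dim(metric.fit)
dim(nonmetric.fit$points)

# Stress of the non-metric solution (lower means the ranking is better preserved)
nonmetric.fit$stress
```

Either set of coordinates can be passed to plot() to draw the two-dimensional scatterplot.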
Inputs
Algorithm A choice between metric and non-metric multidimensional scaling. Two other dimension reduction techniques, PCA and t-SNE, are also available.
The input data can be provided via one of three options:
- Variables The variables or a question/variable set containing the variables that you would like to analyze. Cases with missing data are ignored.
- Distance matrix Select an existing distance matrix. This should be a symmetric matrix of distances, such as the output of Correlation - Distances.
- Paste or type distance matrix Opens up a blank spreadsheet into which tabular data can be manually entered or pasted.
Group variable A variable to categorize the output. If numeric, the data are shaded from light (lowest values) to dark (highest). If categorical, data points are colored according to their category. This option is only available if Variables are provided.
Create binary variables from unordered categories If selected, unordered categorical Variables with N categories are converted into N-1 binary indicator variables. Otherwise, each such variable is converted to a single numeric variable with integers representing the categories (as happens for ordered categories). This option is only available if Variables are provided.
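The two conversions can be sketched in base R as follows. The variable region is hypothetical, and treating the first category as the baseline that gets no indicator is an assumption of this sketch rather than documented behavior.

```r
# Hypothetical unordered categorical variable with N = 3 categories
region <- factor(c("north", "south", "west", "south", "north"))

# Option selected: N - 1 = 2 binary indicator columns
# (model.matrix adds an intercept column, which is dropped here)
indicators <- model.matrix(~ region)[, -1, drop = FALSE]
indicators

# Option not selected: one numeric variable holding integer category codes
codes <- as.integer(region)
codes
```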
Normalize variables For Variables input, whether to normalize the data.
- For t-SNE and MDS each variable is standardized to the range [0, 1].
- For PCA the correlation matrix is used rather than the covariance matrix.
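A minimal sketch of both kinds of normalization, using the built-in iris measurements purely as example data. The helper rescale01 is a hypothetical name introduced here, not part of this feature.

```r
x <- iris[, 1:4]  # built-in numeric data, for illustration only

# t-SNE / MDS case: rescale each variable to the range [0, 1]
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))
x01 <- as.data.frame(lapply(x, rescale01))
sapply(x01, range)

# PCA case: scale. = TRUE standardizes each variable, which is
# equivalent to analyzing the correlation matrix
pca.cor <- prcomp(x, scale. = TRUE)

# Without normalization, PCA works from the covariance matrix
pca.cov <- prcomp(x, scale. = FALSE)
```

The variances of the components in pca.cor match the eigenvalues of cor(x), while those in pca.cov match the eigenvalues of cov(x).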
Perplexity A parameter used by the t-SNE algorithm and related to the number of nearest neighbors considered when placing each data point. The typical useful range is from 5 to 50.
- Low values imply that the immediate local structure is most important.
- High values increase the impact of more distant neighbors and global structure.
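The effect of perplexity can be explored with a sketch like the one below. It assumes the Rtsne package, which this page does not name as the implementation used, and it is skipped if that package is not installed. The perplexity values 5 and 45 are arbitrary picks from the typical range.

```r
# Assumption: the Rtsne package is available; otherwise nothing is run
if (requireNamespace("Rtsne", quietly = TRUE)) {
  # Rtsne rejects duplicate rows by default, so remove them first
  x <- unique(as.matrix(iris[, 1:4]))

  set.seed(1)  # t-SNE starts from a random configuration
  low  <- Rtsne::Rtsne(x, dims = 2, perplexity = 5)   # emphasizes local structure
  high <- Rtsne::Rtsne(x, dims = 2, perplexity = 45)  # more weight on global structure

  # Each result holds one row of 2-D coordinates per case in $Y
  dim(low$Y)
  dim(high$Y)
}
```

Plotting low$Y and high$Y side by side shows how the layout changes with the setting.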
Output
Using Variables
If the input type is Variables, the probability that each point has the same class as its nearest neighbor is calculated. A further variable may be specified to classify the output cases into groups using the Group variable field.
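One way to compute such a nearest-neighbor figure is sketched below. It assumes that neighbors are found in the two-dimensional output space and that the figure is the share of points whose nearest neighbor has the same class; the page does not spell out either detail. The iris data and cmdscale are used only to produce example coordinates.

```r
# Example 2-D coordinates and a grouping variable
coords <- cmdscale(dist(iris[, 1:4]), k = 2)
group <- iris$Species

# Pairwise distances in the output space; a point is not its own neighbor
d <- as.matrix(dist(coords))
diag(d) <- Inf

# Index of each point's nearest neighbor
nearest <- apply(d, 1, which.min)

# Share of points whose nearest neighbor belongs to the same class
same.class <- mean(group == group[nearest])
same.class
```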
Using Metric
Using Non-metric
Using a Distance Matrix
Using Metric
Using Non-metric
Input example for distance matrix pasted in:
 | toast | butoast | engmuff | jdonut | cintoast | bluemuff | hrolls | toastmarm | butoastj | toastmarg | cinbun | danpastry | gdonut | cofcake |
butoast | 15 | | | | | | | | | | | | | |
engmuff | 25 | 15 | | | | | | | | | | | | |
jdonut | 3 | 24 | 22 | | | | | | | | | | | |
cintoast | 14 | 3 | 17 | 22 | | | | | | | | | | |
bluemuff | 24 | 17 | 2 | 21 | 19 | | | | | | | | | |
hrolls | 28 | 8 | 4 | 27 | 18 | 8 | | | | | | | | |
toastmarm | 7 | 7 | 20 | 11 | 6 | 18 | 23 | | | | | | | |
butoastj | 8 | 6 | 21 | 12 | 5 | 19 | 22 | 2 | | | | | | |
toastmarg | 16 | 2 | 16 | 25 | 4 | 18 | 9 | 8 | 7 | | | | | |
cinbun | 26 | 17 | 10 | 17 | 12 | 7 | 18 | 20 | 19 | 18 | | | | |
danpastry | 21 | 25 | 11 | 5 | 19 | 10 | 22 | 17 | 16 | 26 | 2 | | | |
gdonut | 20 | 18 | 24 | 2 | 23 | 22 | 25 | 11 | 12 | 17 | 4 | 11 | | |
cofcake | 16 | 22 | 11 | 13 | 21 | 7 | 21 | 21 | 20 | 23 | 6 | 7 | 11 | |
cornmuff | 27 | 11 | 3 | 26 | 16 | 4 | 5 | 25 | 24 | 12 | 12 | 16 | 24 | 16 |
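To reproduce this kind of input in plain R, a lower-triangular table has to be expanded into a full symmetric matrix first. The sketch below does this for the first five items of the example only. The zero diagonal and the use of cmdscale are assumptions made for illustration.

```r
items <- c("toast", "butoast", "engmuff", "jdonut", "cintoast")

# Start from zeros, so each item is at distance 0 from itself (assumed)
m <- matrix(0, 5, 5, dimnames = list(items, items))

# Lower-triangle values from the table, filled column by column
m[lower.tri(m)] <- c(15, 25, 3, 14,  # distances to toast
                     15, 24, 3,      # distances to butoast
                     22, 17,         # distances to engmuff
                     22)             # jdonut to cintoast

# Mirror the lower triangle to make the matrix symmetric
m <- m + t(m)
isSymmetric(m)

# Two-dimensional coordinates from the distance matrix
coords <- cmdscale(as.dist(m), k = 2)
coords
```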
Additional Properties
The output of this feature stores additional information. To inspect it, add a custom R code item below the analysis and run:
#change YourReferenceName to the reference name (under Properties > General) of your analysis
item = YourReferenceName
str(item)
Acknowledgements
Uses the isoMDS function from the R package MASS.
References
Analyzing Multivariate Data, by J. Lattin, J.D. Carroll, and P.E. Green, Brooks/Cole, 2003.
Method
- In Displayr: How to Create a Dimension Reduction Scatterplot
- In Q: Create > Dimension Reduction > Multidimensional Scaling (MDS)