Types of latent class analysis
There are two qualitatively different varieties of latent class analysis in widespread use in survey research:
- Latent class regression, where the purpose of the analysis is to identify segments that differ in their regression parameters. This model is most commonly used for creating segments from choice modeling data.
- Model-based clustering, where a set of numeric, categorical or ranking variables is used to create segments. This variant of latent class analysis is most commonly applied when creating segments from attitudinal data. It is essentially an improvement on cluster analysis, in that it can deal with multiple types of data (rather than just numeric data) and it automatically addresses missing values.
In a formal statistical sense, these are just applications of the same model. When running Latent Class Analysis or Mixed-Mode Cluster Analysis/Tree (via Create > Segments in Q, or Anything > Advanced Analysis > Cluster / Machine Learning in Displayr), Q/Displayr automatically chooses which of these models to run based on the data that is selected. If the data is an Experiment, such as a choice model, the analysis is latent class regression. If the data consists of numeric ratings, rankings, categorical variables or binary variables, the analysis is model-based clustering. If the data consists of both an experiment and, say, ratings, a latent class model using both types of data is estimated.
Mixture distribution
A latent class model assumes the existence of a latent categorical variable (i.e., it assumes that the population consists of a finite number of types of people). This is the default mixing distribution. Alternative distributions can be chosen via the Distribution setting.
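To make the idea of a latent categorical variable concrete, the following is a minimal sketch of the likelihood of one respondent's answers under a latent class model. It assumes binary items that are independent within each class. The function name, class sizes and item probabilities are illustrative assumptions and are not taken from Q/Displayr.

```python
def latent_class_likelihood(response, class_sizes, item_probs):
    """Likelihood of one respondent's binary answers under a latent class model.

    response:    list of 0/1 answers, one per item
    class_sizes: probability of belonging to each class (sums to 1)
    item_probs:  item_probs[k][j] = P(item j == 1 | class k)
    """
    total = 0.0
    for size, probs in zip(class_sizes, item_probs):
        # Within a class, items are assumed independent, so probabilities multiply.
        within_class = 1.0
        for answer, p in zip(response, probs):
            within_class *= p if answer == 1 else (1.0 - p)
        # The respondent's class is unobserved, so average over the classes.
        total += size * within_class
    return total


# Hypothetical two-class example with three binary items.
class_sizes = [0.6, 0.4]
item_probs = [[0.9, 0.8, 0.7],
              [0.2, 0.3, 0.1]]
lik = latent_class_likelihood([1, 1, 0], class_sizes, item_probs)
```

The class sizes play the role of the mixing distribution: each respondent is treated as belonging to exactly one class, but because the class is not observed, the likelihood is a weighted sum over classes.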
The standard alternative to the latent class model is to assume that a single multivariate normal distribution describes the population. This model is sometimes referred to in marketing research as 'hierarchical Bayes'. A generalization of this model is to estimate multiple multivariate normal distributions (i.e., one per segment). This model can, in theory, approximate any type of heterogeneity^{[1]} (i.e., it is a substantially more general model than 'hierarchical Bayes'). Q/Displayr fits this model when the user sets the Distribution to Multivariate Normal - Full Covariance; if a single segment is specified, the model is essentially the same as hierarchical Bayes.
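As an illustration of why a mixture of normals is more general than a single normal, the sketch below evaluates a mixture density. For simplicity it uses univariate normals rather than the multivariate normals described above, and all names and numbers are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, sds):
    """Density of a finite mixture of normals: a weighted sum of class densities."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))


# One class: the mixture reduces to a single normal distribution.
single = mixture_pdf(0.0, [1.0], [0.0], [1.0])

# Two well-separated classes: the density has two peaks, a shape that
# no single normal distribution can reproduce.
weights, means, sds = [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0]
at_centre = mixture_pdf(0.0, weights, means, sds)
at_peak = mixture_pdf(2.0, weights, means, sds)
```

With one class the density is that of a single normal distribution. With two classes the density at the centre is lower than at either class mean, which is the kind of heterogeneity a single normal cannot capture.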
Additional mixture models are available by setting Distribution. These constrain the properties of the covariance matrix (e.g., by assuming that the covariance matrix is identical across classes, diagonal, block diagonal or spherical).
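One way to see what these constraints buy is to count the free covariance parameters each one implies. The sketch below is a rough illustration only: the structure labels are hypothetical and are not the option names used in Q/Displayr. It assumes that each class has its own covariance matrix except in the "shared" case, and that a spherical matrix has a single variance per class.

```python
def covariance_parameter_count(n_vars, n_classes, structure, block_sizes=None):
    """Number of free covariance parameters under a given constraint.

    structure is one of the hypothetical labels:
    "full", "shared", "diagonal", "block_diagonal", "spherical".
    """
    # A symmetric n x n matrix has n(n+1)/2 distinct entries.
    full = n_vars * (n_vars + 1) // 2
    if structure == "full":            # unconstrained, one matrix per class
        return n_classes * full
    if structure == "shared":          # one matrix used by every class
        return full
    if structure == "diagonal":        # variances only, no covariances
        return n_classes * n_vars
    if structure == "block_diagonal":  # covariances only within blocks
        if block_sizes is None or sum(block_sizes) != n_vars:
            raise ValueError("block_sizes must sum to n_vars")
        per_class = sum(b * (b + 1) // 2 for b in block_sizes)
        return n_classes * per_class
    if structure == "spherical":       # a single variance per class
        return n_classes
    raise ValueError(f"Unknown structure: {structure!r}")


# Hypothetical example: 4 variables and 3 classes.
counts = {s: covariance_parameter_count(4, 3, s, block_sizes=[2, 2])
          for s in ("full", "shared", "diagonal", "block_diagonal", "spherical")}
```

The more constrained structures need far fewer parameters, which can matter when the sample is small relative to the number of variables.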
Response variable type (e.g., linear, categorical)
As in the rest of Q, the program automatically determines which type of model to estimate from the data's Question Types and Variable Types (in Q) or Structure (in Displayr).
How the data is set up in Q/Displayr | Statistical model |
---|---|
Question Type/Structure = Experiment, Dependent Variable's Variable Type/Structure = Numeric | Linear regression (e.g., latent class linear regression) |
Question Type/Structure = Experiment, Dependent Variable's Variable Type/Structure = Categorical/Nominal | Multinomial Logit (e.g., latent class logit) |
Question Type/Structure = Experiment, Dependent Variable's Variable Type/Structure = Ordered Categorical/Ordered Nominal | Rank-Ordered Logit with Ties |
Question Type/Structure = Pick Any/Binary - Multi | Multinomial |
Ranking | Rank-Ordered Logit Model with Ties |
Number/Numeric | Normal |
Number - Multi/Numeric - Multi | Multivariate Normal |
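The table above can be read as a simple lookup. The sketch below restates it in code purely as an illustration; the dictionary and function are hypothetical and are not part of Q/Displayr. Keys use the Q names, with the Displayr equivalents noted in comments.

```python
# (question type, dependent variable type) -> statistical model, as in the table.
# The dependent variable type only matters for Experiment data.
MODEL_BY_DATA = {
    ("Experiment", "Numeric"): "Linear regression",
    ("Experiment", "Categorical"): "Multinomial Logit",  # Nominal in Displayr
    ("Experiment", "Ordered Categorical"): "Rank-Ordered Logit with Ties",  # Ordered Nominal
    ("Pick Any", None): "Multinomial",                   # Binary - Multi in Displayr
    ("Ranking", None): "Rank-Ordered Logit Model with Ties",
    ("Number", None): "Normal",                          # Numeric in Displayr
    ("Number - Multi", None): "Multivariate Normal",     # Numeric - Multi in Displayr
}


def statistical_model(question_type, dependent_type=None):
    """Return the model listed in the table for the given data setup."""
    if question_type != "Experiment":
        dependent_type = None  # ignored outside of experiments
    key = (question_type, dependent_type)
    if key not in MODEL_BY_DATA:
        raise ValueError(f"No model listed for {key!r}")
    return MODEL_BY_DATA[key]
```

For example, `statistical_model("Experiment", "Categorical")` returns `"Multinomial Logit"`.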
Save Individual-Level Parameter Means and Standard Deviations
When a mixture model is created using the Experiment question type, an estimate of each parameter can be produced for each respondent. This is done in Q by right-clicking on the tree-like output and selecting Save Individual-Level Parameter Means and Standard Deviations. See Individual-Level Parameters for more information.
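A common way to obtain respondent-level estimates from a latent class model is to weight each class's parameter by the respondent's posterior probability of belonging to that class. The sketch below illustrates that general approach under that assumption; it is not necessarily the exact computation Q performs, and the function names and numbers are hypothetical.

```python
import math


def posterior_class_probs(class_sizes, class_likelihoods):
    """Bayes' rule: P(class | data) is proportional to class size x P(data | class)."""
    joint = [size * lik for size, lik in zip(class_sizes, class_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


def individual_parameter(posteriors, class_params):
    """Posterior mean and standard deviation of one parameter for one respondent."""
    mean = sum(w * b for w, b in zip(posteriors, class_params))
    variance = sum(w * (b - mean) ** 2 for w, b in zip(posteriors, class_params))
    return mean, math.sqrt(variance)


# Hypothetical respondent: two equally sized classes, with the respondent's
# choices three times as likely under class 1 as under class 2.
post = posterior_class_probs([0.5, 0.5], [0.03, 0.01])

# Hypothetical class-level values of a single parameter.
mean, sd = individual_parameter(post, [2.0, -1.0])
```

Here the respondent is assigned to class 1 with probability 0.75, so the individual-level mean (1.25) lies between the two class-level parameters, and the standard deviation reflects the remaining uncertainty about class membership.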
See also
- Statistical Model for Latent Class Analysis, Mixed-Mode Tree, and Mixed-Mode Cluster Analysis
- Segmentation
- Regression
- Experiments Specifications
References
Further reading: Latent Class Analysis Software
- [1] Kenneth Train (2009), Discrete Choice Methods with Simulation, Second edition, Cambridge University Press.