When a data file is loaded, the app will automatically identify the structure of the variables, group them, and assign variables to the correct variable set type (or question type). This includes automatically fixing a number of common problems in the data file, and adding labels to variable sets if appropriate. This article goes into detail about how that is done.
Displayr can instead use AI to group and name variable sets in a smarter, more sensible way. You can do this across all variables in the data set on the initial import by enabling Displayr AI and checking "Use Displayr AI to tidy variable set names". For example, the pane on the left below uses the standard grouping and labeling and the pane on the right uses Displayr AI:
Default settings
The starting assumption is that variables are categorical (Nominal/Pick One/single response). If the variable contains text information, it's assumed that the variable set should have a Text structure. Otherwise, it's assumed to have a Numeric structure.
Data file metadata
Better-quality data files contain metadata indicating certain properties of the data. For example, an SPSS .sav data file can contain Multiple Response Sets, which Q/Displayr interprets as instructions for grouping variables into questions or variable sets.
Pattern matching
When an SPSS data file is imported, internal checks assess how well the file has been set up. The better the setup, the less time you spend manipulating variable sets, and the easier and faster the analysis becomes. Sticking to the data file specifications is strongly recommended: it minimizes the work needed to get started and helps the imported variables be grouped and structured correctly from the outset.
Labeling
While the variable sets are being analyzed, their labels are also identified. Labels are determined using the following criteria:
- The label is taken directly from your data file if a variable is determined not to be related to other variables (e.g., Nominal, Number or Pick One variable sets).
- If your data file contains information about variable sets consisting of multiple variables (e.g., Binary - Multi or Pick Any), then the settings in the data file will be honored. This is particularly relevant to SPSS .sav files that contain information about multiple response sets (as SPSS calls them), which can have a defined label.
- Failing the above, the name is inferred from common structures in the labels. For example, if two variables are grouped into a variable set, and one has a label:
Q5a. Bank of America
and the other
Q5b. Citibank
then "Q5" is inferred as the common element and is used in the variable set label. If the variables can also be inferred to form a numeric sequence, e.g., "Q5_1" and "Q5_2", this is indicated by "Q5X", where X replaces the sequence of numbers.
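The inference described in the last criterion can be illustrated with a minimal Python sketch. This is not the app's actual implementation: the function name `infer_set_label`, the use of a shared prefix as the "common structure", and the set of separator characters that get stripped are all assumptions made for illustration.

```python
import os
import re


def infer_set_label(labels):
    """Guess a variable set label from its variables' labels (hypothetical helper)."""
    # Longest prefix shared by every label, compared character by character.
    prefix = os.path.commonprefix(labels)
    # What is left of each label once the shared prefix is removed.
    remainders = [label[len(prefix):] for label in labels]
    # Assumption: trailing separators are dropped, so "Q5_" becomes "Q5".
    stem = prefix.rstrip(" _.-:")
    if not stem:
        return None  # nothing in common, so no label can be inferred
    # If every remainder is purely numeric, mark the sequence with "X".
    if all(re.fullmatch(r"\d+", rest) for rest in remainders):
        return stem + "X"
    return stem


print(infer_set_label(["Q5a. Bank of America", "Q5b. Citibank"]))
print(infer_set_label(["Q5_1", "Q5_2"]))
```

Under these assumptions, the first call yields "Q5" and the second yields "Q5X", matching the two cases described above. Labels with no shared prefix return `None`, in which case a real importer would need some other fallback.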