Creates a chart showing the patterns of missing data, with shading indicating missing values. As the number of variables increases, the number of possible patterns explodes, making the chart very difficult to read. The most straightforward way to address this is to limit the number of variables that you chart. The font sizes are also reduced automatically to help fit more patterns, but a lot of zooming may still be required. The font size can be set manually in the code by modifying the value of cex.numbers, where 1 indicates a normal-sized font and smaller values indicate smaller fonts.
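To make the growth in patterns concrete: each variable is either missing or observed, so k variables allow up to 2^k combinations, capped by the number of observations in the data. The Python sketch below is illustrative only and is not part of the Displayr or VIM code. The function name max_patterns and the sample size of 700 are assumptions chosen for the example.

```python
def max_patterns(n_variables: int, n_observations: int) -> int:
    """Upper bound on the number of distinct missing-data patterns.

    Each variable is either missing or observed, giving 2**k combinations,
    but a data set cannot show more patterns than it has observations.
    """
    return min(2 ** n_variables, n_observations)


# Hypothetical sample size, used only to show how quickly the bound grows.
for k in (2, 5, 10, 20):
    print(k, max_patterns(k, n_observations=700))
```

With only a handful of variables the grid stays short, which is why limiting the variables charted is usually the most effective fix.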
Example
The columns at the top show the relative amount of missing values by variable. In this example, we can see that q3 has substantially more missing values than the other variables.
In the grid, each row represents a combination of missing values, with blue indicating a missing value. The first row shows that 2 observations are missing values on both q3 and q4. The remaining rows show that 336 observations have no missing values and 362 are missing data only for q3.
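The tally behind each row of the grid can be sketched as follows. This is a minimal illustration in Python, not the code Displayr runs (the chart itself comes from the aggr function in the VIM R package). The helper name missing_patterns is hypothetical, and the toy data is built only to mirror the counts quoted above; it is not the actual data set and includes only q3 and q4.

```python
from collections import Counter


def missing_patterns(rows, variables):
    """Count observations per missing-data pattern.

    rows: list of dicts mapping variable name -> value, where None
          (or an absent key) is treated as a missing value.
    Returns a Counter keyed by the tuple of variables that are missing.
    """
    counts = Counter()
    for row in rows:
        pattern = tuple(v for v in variables if row.get(v) is None)
        counts[pattern] += 1
    return counts


# Toy data (an assumption for illustration) mirroring the counts in the text.
variables = ["q3", "q4"]
rows = (
    [{"q3": None, "q4": None}] * 2      # missing on both q3 and q4
    + [{"q3": 1, "q4": 2}] * 336        # no missing values
    + [{"q3": None, "q4": 5}] * 362     # missing only on q3
)

patterns = missing_patterns(rows, variables)
for pattern, n in patterns.most_common():
    label = " and ".join(pattern) if pattern else "none missing"
    print(f"{n:>4}  {label}")
```

Each key in the result corresponds to one row of the grid, and its count is the number shown beside that row.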
Usage
To create this chart in Displayr, go to Anything > Data > Missing Data > Plot of Patterns (in Q, go to Automate > Browse Online Library > Missing Data > Plot of Patterns).
In the object inspector, under Inputs > Variables, select the variables you want to analyze, change any other settings, and click Calculate to run the function.
Options
Variables The variables to check for missing values. Each selected variable appears as a column in the chart.
Variable names Displays Variable Names in the output instead of labels.
Filter The data is automatically filtered using any applied filters before the chart is created.
Acknowledgements
This chart is from the aggr function of the VIM R package (Kowarik and Templ 2016).
Alexander Kowarik, Matthias Templ (2016). Imputation with the R Package VIM. Journal of Statistical Software, 74(7), 1-16.
Next
How to Check for Missing Data Using Plot by Patterns