Example
The following table shows raw data for responses from two survey questions which asked respondents how many text messages they send in a typical week, and how much they spend per month on their phone bill:
The first two columns show the original data, and respondents 12, 15, 21, and 22 have some missing values. The second two columns show the imputed versions of the same two variables, and those respondents have now been assigned values based on the imputation.
The settings for these new imputed variables are as follows:
This tells us that the new imputed data is derived from four variables in total:
- Text messages per week
- Average monthly bill
- Age
- Gender
The final two variables (Age and Gender) are used only to inform the imputation; imputed versions of them have not been added to the data set.
Usage
In Q:
- Select one or more variables or questions in the Variables and Questions tab.
- Select Automate > Browse Online Library > Create New Variables > Impute Missing Data.
- To change how the imputation is performed:
- Select one of the new imputed Questions in the Variables and Questions tab.
- Right-click and select Edit R Variable.
- Choose the desired options in the Inputs section on the right. These options are explained below.
- Click Update R Variable.
In Displayr:
- Select one or more variables in the Data Sets tree.
- Select Anything > Data > Variables > New > Ready-Made-New-Variables > Impute Missing Data.
If you wish to change how the imputation is performed, then:
- Select one of the new imputed variables in the Data Sets tree.
- Choose the desired options in the Inputs section. These options are explained below.
- Click Calculate.
Settings
The following settings are available for this tool:
Variables These are the variables which are imputed.
Auxiliary variables You can add additional variables to this drop-box to use the data from those variables in the imputation. Auxiliary variables are not themselves imputed.
Seed This is the random number seed used in the imputation. Changing this number will result in a different solution.
Method This option allows you to choose which imputation algorithm is used.
- Try mice The imputation will initially try to use the mice algorithm, and if this is not successful it will attempt to use the hot deck algorithm.
- Hot Deck Force the imputation to use only the hot deck algorithm.
- Mice Force the imputation to use only the mice algorithm.
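The behavior of the three Method options can be sketched in plain JavaScript. This is only an illustration of the fallback logic described above, not the tool's actual code: the function name imputeWithFallback and the imputers object are hypothetical, and the two imputers passed in below are toy stand-ins for the R packages.

```javascript
// Hypothetical sketch of the Method setting. `imputers.mice` and
// `imputers.hotDeck` are stand-ins supplied by the caller; the real tool
// runs R code, not these functions.
function imputeWithFallback(data, method, imputers) {
  if (method === "Hot Deck") {
    return { usedMethod: "hot deck", data: imputers.hotDeck(data) };
  }
  if (method === "Mice") {
    // No fallback: any error from mice is passed on to the caller.
    return { usedMethod: "mice", data: imputers.mice(data) };
  }
  if (method === "Try mice") {
    try {
      return { usedMethod: "mice", data: imputers.mice(data) };
    } catch (err) {
      // mice was not successful, so fall back to hot deck.
      return { usedMethod: "hot deck", data: imputers.hotDeck(data) };
    }
  }
  throw new Error("Unknown method: " + method);
}

// Toy stand-ins, for demonstration only.
const failingMice = () => { throw new Error("mice failed"); };
const fillWithZero = (values) => values.map((v) => (v === null ? 0 : v));

const result = imputeWithFallback([1, null, 3], "Try mice", {
  mice: failingMice,
  hotDeck: fillWithZero,
});
console.log(result.usedMethod); // hot deck
```

With Mice selected, the same failing stand-in would raise an error rather than fall back.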
Technical details
By default, data is imputed using the default settings from the mice R package, which employs Multivariate Imputation by Chained Equations (predictive mean matching) [1]. Care should be taken to ensure that variables have the correct variable type, as this has a big impact on this algorithm. Where a technical error is experienced using mice, the imputation is performed using hot-decking, via the hot.deck package in R.[2]
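To give a feel for what hot-decking does, the following is a minimal sketch of a generic nearest-donor hot deck in JavaScript. It is deliberately simplified and is not the algorithm implemented in the hot.deck package or in mice. The names hotDeckImpute and makeRandom, the example data, and the seed value are all hypothetical. The seed argument plays the same role as the Seed setting: it determines how ties between equally similar donors are broken.

```javascript
// Small seeded random number generator (a linear congruential generator),
// so that the same seed always yields the same imputed values.
function makeRandom(seed) {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Simplified hot deck: each missing value of `target` is copied from a
// donor row that differs on the fewest `matchOn` variables.
function hotDeckImpute(rows, target, matchOn, seed) {
  const random = makeRandom(seed);
  const donors = rows.filter((r) => r[target] !== null);
  if (donors.length === 0) throw new Error("No complete cases to donate values");
  return rows.map((row) => {
    if (row[target] !== null) return { ...row };
    // Distance = number of matching variables on which the donor differs.
    const distance = (d) => matchOn.filter((k) => d[k] !== row[k]).length;
    const best = Math.min(...donors.map(distance));
    const closest = donors.filter((d) => distance(d) === best);
    // Ties between equally close donors are broken at random.
    const donor = closest[Math.floor(random() * closest.length)];
    return { ...row, [target]: donor[target] };
  });
}

// Hypothetical data: the second respondent did not report text messages.
const survey = [
  { age: "18-24", gender: "F", texts: 120 },
  { age: "18-24", gender: "F", texts: null },
  { age: "55+", gender: "M", texts: 10 },
];
const completed = hotDeckImpute(survey, "texts", ["age", "gender"], 1223);
console.log(completed[1].texts); // 120
```

Here the second respondent receives the value of the first, who matches on both Age and Gender. Matching on exact equality is only sensible for categorical variables, which is one reason the variable type matters.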
When applied with regression, missing values in the outcome variable are excluded from the analysis after the imputation has been performed.[3]
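The "impute, then delete" step described above can be sketched as follows. This is an illustrative JavaScript fragment, not the code used by the regression tools: the function name dropImputedOutcomes and the data values are hypothetical. The idea is that every variable is imputed together, and then the cases whose outcome was originally missing are removed before the model is fitted.

```javascript
// Keep only the imputed rows whose outcome was observed in the original
// data. Rows are assumed to be in the same order in both arrays.
function dropImputedOutcomes(originalRows, imputedRows, outcome) {
  return imputedRows.filter((_, i) => originalRows[i][outcome] !== null);
}

// Hypothetical data: monthly bill is the outcome, text messages the predictor.
const observed = [
  { texts: 40, bill: 55 },
  { texts: null, bill: 30 },
  { texts: 75, bill: null },
];
const imputed = [
  { texts: 40, bill: 55 },
  { texts: 22, bill: 30 },
  { texts: 75, bill: 61 },
];

const forRegression = dropImputedOutcomes(observed, imputed, "bill");
console.log(forRegression.length); // 2
```

The second respondent is kept, with an imputed predictor value, because their outcome was observed. The third respondent is dropped because their outcome was imputed.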
Note that although imputation can reduce the bias of parameter estimates, it can create misleading statistical inference (e.g., as the simulated sample size is assumed to be the actual sample size in calculations).
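A small numeric illustration of this point, using made-up numbers: the standard error of a mean is computed as the standard deviation divided by the square root of the sample size, so treating imputed values as if they were real observations makes the estimate look more precise than the observed data justify. The function name standardErrorOfMean and the values below are hypothetical.

```javascript
// Standard error of the mean, using the sample (n - 1) variance.
function standardErrorOfMean(values) {
  const n = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / (n - 1);
  return Math.sqrt(variance / n);
}

// Four observed values, then the same data with two imputed values added.
const observedValues = [10, 20, 30, 40];
const completedValues = [10, 20, 30, 40, 20, 30];

const seObserved = standardErrorOfMean(observedValues);
const seCompleted = standardErrorOfMean(completedValues);
console.log(seCompleted < seObserved); // true
```

The completed data gives a smaller standard error even though no new information has been collected, which is why tests and confidence intervals computed on imputed data should be interpreted with caution.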
The new Questions are imputed jointly. This means that if you make changes to one of them then the others will also change.
There are some technical limitations on how you can change the new variables:
- You cannot add or remove variables from the Variables drop-box.
- You cannot change the order of variables in the Variables drop-box.
- If you wish to delete any of the imputed variables you must delete them all together because they are linked.
How to apply this QScript
- Start typing the name of the QScript into the Search features and data box in the top right of the Q window.
- Click on the QScript when it appears in the QScripts and Rules section of the search results.
OR
- Select Automate > Browse Online Library.
- Select this QScript from the list.
Customizing the QScript
This QScript is written in JavaScript and can be customized by copying and modifying the JavaScript.
Customizing QScripts in Q4.11 and more recent versions
- Start typing the name of the QScript into the Search features and data box in the top right of the Q window.
- Hover your mouse over the QScript when it appears in the QScripts and Rules section of the search results.
- Press Edit a Copy (bottom-left corner of the preview).
- Modify the JavaScript (see QScripts for more detail on this).
- Either:
- Run the QScript, by pressing the blue triangle button.
- Save the QScript and run it at a later time, using Automate > Run QScript (Macro) from File.
Customizing QScripts in older versions
- Contact support@q-researchsoftware.com to obtain a copy of the JavaScript code.
- Create a new text file, giving it a file extension of .QScript. See here for more information about how to do this.
- Modify the JavaScript (see QScripts for more detail on this).
- Run the file using Automate > Run QScript (Macro) from File.
See also
- QScript for more general information about QScripts.
- QScript Examples Library for other examples.
- Online JavaScript Libraries for the libraries of functions that can be used when writing QScripts.
- QScript Reference for information about how QScript can manipulate the different elements of a project.
- JavaScript for information about the JavaScript programming language.
- Table JavaScript and Plot JavaScript for tools for using JavaScript to modify the appearance of tables and charts.
References
- [1] Stef van Buuren and Karin Groothuis-Oudshoorn (2011), "mice: Multivariate Imputation by Chained Equations in R", Journal of Statistical Software, 45:3, 1-67.
- [2] Skyler J. Cranmer and Jeff Gill (2013), "We Have to Be Discrete About This: A Non-Parametric Imputation Technique for Missing Categorical Data", British Journal of Political Science, 43, 425-449.
- [3] Paul T. von Hippel (2007), "Regression With Missing Y's: An Improved Strategy for Analyzing Multiply Imputed Data".