Interpretation
Variable statistics measure the impact and significance of individual variables within a model, while overall statistics apply to the model as a whole. Both are shown in the regression output. The variables omitted from the stepwise regression are listed at the top of the output.
Variable statistics
Estimate: the magnitude of the coefficient indicates the size of the change in the dependent variable as the value of the independent variable changes. A positive number indicates a direct relationship (y increases as x increases), and a negative number indicates an inverse relationship (y decreases as x increases).
The coefficient is colored if the variable is statistically significant at the 5% level.
Standard Error: measures the precision of an estimate. The smaller the standard error, the more precise the estimate.
t-statistic: the estimate divided by the standard error. A larger magnitude (whether positive or negative) indicates greater significance of the variable. The values are highlighted based on their magnitude.
p-value: expresses the t-statistic as a probability. A p-value under 0.05 means that the variable is statistically significant at the 5% level; a p-value under 0.01 means that the variable is statistically significant at the 1% level. P-values under 0.05 are shown in bold.
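The relationship between the estimate, standard error, t-statistic and p-value can be sketched in a few lines. This is an illustration only, not the software's own calculation: the function name `t_and_p` and the input values are hypothetical, and the p-value uses a large-sample normal approximation rather than the exact t distribution.

```python
from statistics import NormalDist

def t_and_p(estimate, std_error):
    """Return (t, two-sided p). Assumption: normal approximation to the t distribution."""
    t = estimate / std_error
    # Two-sided tail probability of a value at least as extreme as |t|.
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p

# Hypothetical coefficient: estimate 2.5 with a standard error of 1.0.
t, p = t_and_p(estimate=2.5, std_error=1.0)
significant_5pct = p < 0.05
significant_1pct = p < 0.01
print(f"t = {t:.2f}, p = {p:.4f}, 5% level: {significant_5pct}, 1% level: {significant_1pct}")
```

With small samples the exact t distribution gives somewhat larger p-values than this approximation.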
Overall statistics
n: the sample size of the model.
R-squared: assesses the goodness of fit of the model. A larger number indicates that the model captures more of the variation in the dependent variable.
AIC: The Akaike Information Criterion is a measure of the relative quality of the model that balances goodness of fit against the number of parameters; lower values are better. It is the criterion used to determine whether a variable is included in a stepwise regression.
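As a rough illustration of the overall statistics, the sketch below computes R-squared and an AIC value from observed and fitted values. It assumes a linear model with Gaussian errors, for which AIC can be written (up to an additive constant) as n·log(RSS/n) + 2k, where k is the number of estimated coefficients. The function names and data are hypothetical.

```python
import math

def r_squared(y, fitted):
    """Share of the variation in y captured by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1.0 - ss_resid / ss_total

def gaussian_aic(y, fitted, n_params):
    """AIC up to an additive constant, assuming Gaussian errors."""
    n = len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return n * math.log(rss / n) + 2 * n_params

# Hypothetical observed and fitted values.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.2, 3.9, 4.9]

r2 = r_squared(y, fitted)
aic_two_params = gaussian_aic(y, fitted, n_params=2)
aic_three_params = gaussian_aic(y, fitted, n_params=3)
print(r2, aic_two_params, aic_three_params)
```

For the same fit, each extra parameter raises this AIC by 2, which is why a variable is only worth adding if it reduces the residual sum of squares enough to offset the penalty.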
Example
The following example applies a stepwise regression to a linear model. It uses a forward selection approach (Direction > Forward), which means the regression begins with no variables and tests the addition of each variable to build the model.
The stepwise regression includes fewer variables than the original linear model, omitting variables that do not provide statistically significant improvement.
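The forward selection idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used by the software: it fits ordinary least squares via the normal equations, scores each model with a Gaussian AIC (up to a constant), and stops when no addition lowers the AIC. All function names and the data are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def aic_of(columns, y):
    """Fit OLS with an intercept on the given predictor columns and return its AIC."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(k)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
              for row, yi in zip(design, y))
    return n * math.log(rss / n) + 2 * k

def forward_select(predictors, y, max_steps=1000):
    """Start from the intercept-only model; add the best variable while AIC improves."""
    selected = []
    best_aic = aic_of([], y)
    for _ in range(max_steps):
        candidates = [name for name in predictors if name not in selected]
        if not candidates:
            break
        scores = {name: aic_of([predictors[s] for s in selected + [name]], y)
                  for name in candidates}
        best = min(scores, key=scores.get)
        if scores[best] >= best_aic:
            break  # no candidate improves the model
        selected.append(best)
        best_aic = scores[best]
    return selected, best_aic

# Hypothetical data: y depends on x1; "noise" is unrelated to y.
predictors = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}
y = [2.1, 3.8, 6.15, 7.95, 10.1, 11.9, 14.05, 15.95]

selected, best_aic = forward_select(predictors, y)
print(selected, best_aic)
```

In this toy data set only `x1` is kept, because adding `noise` does not reduce the residual sum of squares enough to offset the AIC penalty for an extra coefficient.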
Create a Stepwise Regression Model in Displayr
- 1. Go to Anything > Advanced Analysis > Regression > Stepwise
- 2. Under Inputs > Regression model, select the model you want to apply stepwise to
- 3. [OPTIONAL] Under Inputs > Variables to always include, select any variables that must be included in the model
Create a Stepwise Regression Model in Q
- 1. Go to Create > Regression > Stepwise
- 2. Under Inputs > Regression model, select the model you want to apply stepwise to
- 3. [OPTIONAL] Under Inputs > Variables to always include, select any variables that must be included in the model
Object Inspector Options
Regression model: A regression R item, such as one produced by running Regression - Linear Regression. Compatible with all types of regression R items except unweighted Quasi-Poisson models and models estimated using partial data (pairwise correlations) or imputation of missing values.
Output type:
- Final: The non-detailed output of the regression model that was chosen as a result of the selection process. This is the default.
- Detailed: The detailed text output of the regression model that was chosen as a result of the selection process, as well as the initial and final model formulae, and an overview of which variables were added or removed at each step, with corresponding AIC values.
- All: Same as Detailed, plus complete information on each step of the selection process.
Direction:
- Forward: Forward selection of variables, starting from an empty model with only the intercept.
- Backward: Backward elimination of variables, starting from the original model. This is the default.
Variables to always include: The variables that should always be included in the selected model. These variables need to be in the original model. If a variable is not in the original model, it will be ignored, and a warning message will be displayed.
Maximum steps: The maximum number of steps to be considered.
Missing Data
The way missing data is treated depends on how missing data was treated in the original model. If the option to exclude cases with missing data was chosen, only cases with no missing data in any of the predictor variables will be used in the stepwise process, so that models are compared using the same set of cases. Stepwise regression is not compatible with models using partial data (pairwise correlations) or imputation of missing values.
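The reason for restricting to complete cases can be illustrated with a short sketch: every candidate model is then fitted on exactly the same rows, so their AIC values are comparable. The helper name `complete_cases`, the use of `None` to mark missing values, and the data are assumptions made for illustration; the outcome column is included in the check here as well.

```python
def complete_cases(rows, columns):
    """Keep only rows with no missing (None) value in any of the listed columns."""
    return [row for row in rows if all(row.get(c) is not None for c in columns)]

# Hypothetical data set with two incomplete cases.
rows = [
    {"y": 5.0, "x1": 1.0, "x2": 2.0},
    {"y": 6.0, "x1": None, "x2": 3.0},
    {"y": 7.0, "x1": 2.0, "x2": None},
    {"y": 8.0, "x1": 3.0, "x2": 4.0},
]

usable = complete_cases(rows, ["y", "x1", "x2"])
print(len(usable), "of", len(rows), "cases are used at every step")
```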
Acknowledgements
Uses the function stepAIC from the R package MASS.
References
Venables, W. N., & Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. New York, NY: Springer. ISBN: 0-387-95457-0.
Next
Displayr: How to Run a Stepwise Regression