Linear regression analysis is a statistical method used to model the relationship between a dependent variable and one or more independent variables. It is one of the most fundamental tools in statistical analysis and is widely used across academic research, business analytics, and scientific studies.
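For reference, the model being fitted can be written as a single equation. The notation below is an assumption chosen for illustration (it does not come from SPSS): Y is the dependent variable, X_1 to X_k are the independent variables, b_0 is the intercept, b_1 to b_k are the unstandardized coefficients, and e is the residual.

```latex
Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k + e
```

With a single predictor (k = 1) this is simple linear regression; with two or more it is multiple linear regression.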
When to Use Linear Regression?
- When you want to predict the value of a continuous dependent variable
- When you want to understand the relationship between two or more variables
- When you want to control for confounding variables
- When you want to test hypotheses about the effects of predictors
Assumptions of Linear Regression
Before interpreting a linear regression, check these assumptions (those about residuals can only be assessed after the model has been fitted):
- Linearity: The relationship between predictors and outcome is linear
- Independence: Observations are independent of each other
- Normality: Residuals are normally distributed
- Homoscedasticity: Residuals have constant variance
- No multicollinearity: Predictors are not highly correlated with each other
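One way to screen for multicollinearity outside SPSS is the variance inflation factor (VIF). The sketch below is a minimal Python illustration for the two-predictor case only, where VIF = 1 / (1 - r²) and r is the correlation between the two predictors. The function names and the data are hypothetical, made up for illustration. A common rule of thumb treats VIF values above roughly 5 to 10 as a warning sign.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical predictor values, for illustration only
hours_studied = [2, 4, 6, 8, 10, 12]
hours_slept = [9, 7, 8, 6, 7, 5]
print(vif_two_predictors(hours_studied, hours_slept))
```

With more than two predictors, each VIF comes from regressing that predictor on all the others, which is what the Collinearity diagnostics option in SPSS reports.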
Step-by-Step Guide in SPSS
- Open SPSS and load your data file
- Go to Analyze → Regression → Linear
- Move your dependent variable to the "Dependent" box
- Move your independent variable(s) to the "Independent(s)" box
- Select Method: Enter (to enter all variables simultaneously)
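- Click Statistics and check: Estimates, Model fit, Confidence intervals, Descriptives, Collinearity diagnostics (add R squared change only if you enter predictors in more than one block)
- Click Plots and add *ZPRED to X axis and *ZRESID to Y axis for a residual plot that helps you check linearity and homoscedasticity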
- Click OK to run the analysis
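If you want to cross-check the SPSS results by hand, the steps above can be approximated in a few lines of Python for the single-predictor case. This is a minimal sketch, not what SPSS runs internally: the function names and data are hypothetical, and the standardized values are computed under the assumption that *ZPRED is the predicted value standardized by its own mean and standard deviation and *ZRESID is the residual divided by the standard error of the estimate.

```python
import math

def fit_simple_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def standardized_plot_values(x, y):
    """Return (zpred, zresid) lists for a residual plot."""
    intercept, slope = fit_simple_regression(x, y)
    n = len(x)
    predicted = [intercept + slope * a for a in x]
    residuals = [b - p for b, p in zip(y, predicted)]
    mean_pred = sum(predicted) / n
    # Sample standard deviation of the predicted values
    sd_pred = math.sqrt(sum((p - mean_pred) ** 2 for p in predicted) / (n - 1))
    # Standard error of the estimate: n - 2 degrees of freedom with one predictor
    std_error = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    zpred = [(p - mean_pred) / sd_pred for p in predicted]
    zresid = [r / std_error for r in residuals]
    return zpred, zresid

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
intercept, slope = fit_simple_regression(x, y)
zpred, zresid = standardized_plot_values(x, y)
print(intercept, slope)
```

Plotting zresid against zpred should show a shapeless cloud around zero if linearity and homoscedasticity hold; a funnel or curve suggests a violation.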
Interpreting the Output
Key tables in SPSS linear regression output:
- Model Summary: R² tells you what proportion of variance in the dependent variable is explained by your predictors. Adjusted R² penalizes R² for the number of predictors, so it is the better figure for comparing models with different numbers of predictors.
- ANOVA table: Tests whether the overall model is statistically significant (F-statistic and p-value)
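- Coefficients table: Shows unstandardized (B) and standardized (β) coefficients, t-values, and p-values for each predictor. B is the expected change in the outcome for a one-unit increase in the predictor (holding other predictors constant); β expresses the same effect in standard deviation units, which makes predictors on different scales comparable.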
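To see where the Model Summary and ANOVA figures come from, the sketch below recomputes R², adjusted R², and the F-statistic from observed and predicted values. It is a minimal illustration: the function name and data are hypothetical, and the formulas assume the predictions come from an ordinary least squares model that includes an intercept.

```python
def model_summary(y, predicted, n_predictors):
    """Recompute R squared, adjusted R squared, and F from fitted values."""
    n = len(y)
    mean_y = sum(y) / n
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_residual = sum((b - p) ** 2 for b, p in zip(y, predicted))
    df_regression = n_predictors
    df_residual = n - n_predictors - 1
    r2 = 1 - ss_residual / ss_total
    adjusted_r2 = 1 - (1 - r2) * (n - 1) / df_residual
    # F = mean square regression / mean square residual
    f_stat = ((ss_total - ss_residual) / df_regression) / (ss_residual / df_residual)
    return {
        "r2": r2,
        "adjusted_r2": adjusted_r2,
        "f": f_stat,
        "df": (df_regression, df_residual),
    }

# Hypothetical data: y observed, predicted from the OLS fit y = 2.2 + 0.6 * x
y = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]
summary = model_summary(y, predicted, n_predictors=1)
print(summary)
```

The two degrees of freedom returned here are the ones that appear in parentheses after F when you report the result.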
Reporting Results in APA Format
Example: "A simple linear regression was conducted to predict [outcome] from [predictor]. The model was statistically significant, F(1, 98) = 24.56, p < .001, R² = .20. [Predictor] significantly predicted [outcome], β = .45, t(98) = 4.96, p < .001."
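If you report many models, a small helper can keep the formatting consistent. The sketch below is one possible approach in Python, with hypothetical function names: it drops the leading zero from statistics that cannot exceed 1 (p, R², β) and reports very small p-values as p < .001, as in the example above.

```python
def strip_leading_zero(value, decimals=2):
    """Format a number bounded by 1 without its leading zero, e.g. .20."""
    text = f"{value:.{decimals}f}"
    if text.startswith("0."):
        return text[1:]
    if text.startswith("-0."):
        return "-" + text[2:]
    return text

def apa_p(p):
    """Format a p-value: exact to three decimals, or 'p < .001'."""
    if p < 0.001:
        return "p < .001"
    return "p = " + strip_leading_zero(p, decimals=3)

def apa_model_fit(f_stat, df1, df2, p, r2):
    """Build the overall model fit portion of the results sentence."""
    return (
        f"F({df1}, {df2}) = {f_stat:.2f}, {apa_p(p)}, "
        f"R² = {strip_leading_zero(r2)}"
    )

# Values taken from the example sentence; the exact p-value is assumed
report = apa_model_fit(24.56, 1, 98, 0.00001, 0.20)
print(report)
```

A quick sanity check for simple regression: F should equal t squared and R² should equal β squared, up to rounding, which holds for the example above (4.96² ≈ 24.6 and .45² ≈ .20).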