Testing a Multiple Regression Model
We examined the relationship between life expectancy and income per person using a multiple linear regression framework. In addition to income, urban rate and internet use rate were included in the model to evaluate their independent associations with life expectancy and to assess potential confounding effects.
The analysis was performed in Python using the following syntax:
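A minimal sketch of this kind of analysis, assuming the statsmodels formula API, is given here. The column names (lifeexpectancy, incomeperperson, urbanrate, internetuserate), the helper functions, and the synthetic data are assumptions made for illustration; the synthetic values only make the example runnable and are not the data behind the reported results.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical column names; replace with those used in the actual data set.
COLUMNS = ["lifeexpectancy", "incomeperperson", "urbanrate", "internetuserate"]


def prepare(data: pd.DataFrame) -> pd.DataFrame:
    """Coerce the analysis columns to numeric and drop incomplete rows."""
    return data[COLUMNS].apply(pd.to_numeric, errors="coerce").dropna()


def fit_models(data: pd.DataFrame):
    """Fit the bivariate (income only) and the adjusted regression."""
    clean = prepare(data)
    bivariate = smf.ols("lifeexpectancy ~ incomeperperson", data=clean).fit()
    adjusted = smf.ols(
        "lifeexpectancy ~ incomeperperson + urbanrate + internetuserate",
        data=clean,
    ).fit()
    return bivariate, adjusted


def make_synthetic_data(n: int = 200, seed: int = 0) -> pd.DataFrame:
    """Made-up data in which income is related to life expectancy only
    through its correlation with internet use (arbitrary coefficients)."""
    rng = np.random.default_rng(seed)
    internet = rng.uniform(0, 100, n)
    urban = np.clip(0.5 * internet + rng.normal(30, 15, n), 0, 100)
    income = 0.3 * internet + rng.normal(0, 5, n)  # arbitrary units
    life = 55 + 0.2 * internet + 0.1 * urban + rng.normal(0, 5, n)
    return pd.DataFrame(
        {
            "lifeexpectancy": life,
            "incomeperperson": income,
            "urbanrate": urban,
            "internetuserate": internet,
        }
    )


df = make_synthetic_data()
bivariate, adjusted = fit_models(df)

# Compare the income coefficient before and after adjustment.
print("bivariate income coefficient:", bivariate.params["incomeperperson"])
print("adjusted income coefficient: ", adjusted.params["incomeperperson"])
print("adjusted R-squared of full model:", adjusted.rsquared_adj)
print(adjusted.summary())
```

Fitting the income-only model alongside the adjusted model makes it possible to compare the income coefficient directly, which is the comparison used to judge confounding.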
Results
Summary Interpretation
The multiple linear regression model indicated that urban rate (β = 0.08, p = .003) and internet use rate (β = 0.24, p < .001) were significantly and positively associated with life expectancy, while income per person was not statistically significant after adjustment (β = −0.05, p = .475). The model explained approximately 60.3% of the variability in life expectancy.
The positive association between income and life expectancy observed in the bivariate model (β = 0.55) was no longer present after controlling for urban rate and internet use rate (β = −0.05), a pattern consistent with confounding by these two variables.
The Q–Q plot indicates that the residuals are approximately normally distributed, with slight deviations at the extremes that are not severe enough to suggest a serious violation of the normality assumption. The standardized residuals plot shows no systematic pattern and very few extreme residuals, suggesting that the assumptions of linearity and homoscedasticity are reasonably met and that there are no serious outliers. The leverage plot indicates that no single observation has undue influence on the model estimates, suggesting that the regression results are not driven by influential observations.
Although we hypothesized that income per person would be positively associated with life expectancy, this hypothesis was not supported in the adjusted model. This suggests that the bivariate association between income and life expectancy is largely explained by differences in urbanization and access to the internet.
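The quantities behind the standardized residuals and leverage plots can also be computed directly. The sketch below uses plain NumPy; the function name, the made-up data, and the cut-offs (|standardized residual| > 2 and leverage > 2p/n, both common rules of thumb) are assumptions for illustration and are not taken from the original analysis.

```python
import numpy as np


def ols_diagnostics(X, y):
    """Return coefficients, leverage values, and standardized residuals.

    X is an (n, k) array of predictors WITHOUT an intercept column;
    the intercept is added here.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])
    p = design.shape[1]  # number of estimated parameters

    # Least-squares coefficients and raw residuals.
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta

    # Leverage is the diagonal of the hat matrix H = X (X'X)^-1 X'.
    xtx_inv = np.linalg.pinv(design.T @ design)
    leverage = np.einsum("ij,jk,ik->i", design, xtx_inv, design)

    # Internally studentized (standardized) residuals.
    sigma2 = resid @ resid / (n - p)
    std_resid = resid / np.sqrt(sigma2 * (1.0 - leverage))
    return beta, leverage, std_resid


# Made-up data with three predictors, used only to make the sketch runnable.
rng = np.random.default_rng(1)
n_obs = 150
predictors = rng.normal(size=(n_obs, 3))
outcome = 70 + predictors @ np.array([0.5, 1.0, 2.0]) + rng.normal(0, 3, n_obs)

beta, leverage, std_resid = ols_diagnostics(predictors, outcome)
n_params = beta.size

# Rule-of-thumb flags (assumed thresholds, not from the original report).
extreme = np.flatnonzero(np.abs(std_resid) > 2)
high_leverage = np.flatnonzero(leverage > 2 * n_params / n_obs)

print("observations with |standardized residual| > 2:", extreme.size)
print("observations with leverage > 2p/n:", high_leverage.size)
print("observations flagged on both:", np.intersect1d(extreme, high_leverage).size)
```

An observation that is flagged on both criteria is the kind of point a leverage plot is meant to expose; observations flagged on only one criterion usually have less effect on the estimates.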