Bayesian Linear Regression
Normal Linear Regression
It has been a while since my last post, as I was very busy with thesis stuff, and I've just finished the Bayesian course. Today I want to try something new: Bayesian Linear Regression, a model I just learned about in the online course I enrolled in lol
Data used: the simulated data contains 2000 records and 5 attributes: spending (our target attribute), age, gender (Male, Female), salary, and status (Single, Married)
Let's start with normal Linear Regression (MLE)
The model yields an adjusted R-squared of 0.55. The overall model is significant (F-statistic p-value < 0.05) and each variable is also significant (t-test p-value < 0.05)
Now let's look at the diagnostic results
1. Linearity and Heteroskedasticity
The residuals are randomly scattered around 0 on the y-axis, and the Tukey test shows no significant evidence of non-linearity.
2. Autocorrelation
Using the Durbin-Watson test, we can see that there is no evidence that the residuals are correlated with each other
3. Independence (multicollinearity)
Using VIF to check whether our input attributes are related to each other: since the VIF of each variable is < 5, we can conclude that there is no serious multicollinearity among the inputs
4. Normality
The Shapiro-Wilk test (W = 0.99; p-value = 0.53) and the histogram of the residuals are both consistent with normally distributed residuals. (I have never seen such a perfect normal distribution of residuals before lol)
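To make one of the checks above concrete, here is a minimal sketch of the Durbin-Watson statistic computed by hand. The residual lists are made up for illustration and are not the residuals of the model in this post. Values near 2 suggest no autocorrelation, values towards 0 suggest positive autocorrelation, and values towards 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical residuals, only to show how the statistic behaves.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # sign flips every step
trending = [1.0, 0.9, 0.8, -0.8, -0.9, -1.0]     # neighbours look alike

dw_alternating = durbin_watson(alternating)  # lands well above 2
dw_trending = durbin_watson(trending)        # lands well below 2
```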
As our model passes all the assumption checks, we can say that the Linear Regression model is good enough for prediction and interpretation
Y_hat = 391.7 + 3.633 * age - 40.80 * genderMale + 0.003946 * salary - 46.92 * statusSingle
We can say that
Older customers tend to spend more money than younger customers
Female customers spend more money than male customers
The higher the salary, the higher the spending
Married people tend to spend more money than single people
All of this holds under the assumption of a linear relationship
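As a quick illustration, the fitted equation can be turned into a small prediction function. The coefficients are copied from the equation above. The function name and the example customer (age 40, female, salary 50000, married) are made up for this sketch.

```python
# Coefficients copied from the fitted equation in this post.
COEF = {
    "intercept": 391.7,
    "age": 3.633,
    "genderMale": -40.80,
    "salary": 0.003946,
    "statusSingle": -46.92,
}


def predict_spending(age, is_male, salary, is_single):
    """Point prediction of spending from the classical linear model."""
    return (
        COEF["intercept"]
        + COEF["age"] * age
        + COEF["genderMale"] * (1 if is_male else 0)
        + COEF["salary"] * salary
        + COEF["statusSingle"] * (1 if is_single else 0)
    )


# Hypothetical customer: 40 years old, female, salary 50000, married.
example = predict_spending(age=40, is_male=False, salary=50_000, is_single=False)
# 391.7 + 3.633 * 40 + 0.003946 * 50000 = 734.32
```

Switching only `is_male` to `True` lowers the prediction by 40.80, which is exactly the "female customers spend more" statement in the list above.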
But we can also use Bayesian Linear Regression, which produces full posterior distributions for the parameters, allowing credible intervals and direct probability statements.
Bayesian Linear Regression
Let's start with the prior distributions (what we believe the parameter distributions look like)
As I don't know yet what my parameter distributions look like, I use weakly informative priors instead (normal with high variance). The target y follows a t-distribution, so the model can handle outliers better than it would with a normal distribution for y. We also have to set priors on the precision (the inverse variance; we use this instead of the variance because it has a closed form) and on nu, the degrees of freedom of the t-distribution.
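Written out, a model of this shape could look like the sketch below. The post does not list the exact hyperparameters, so the zero prior mean, the symbols \sigma_b^2, a, b, the Gamma choice for the precision, the unspecified prior p(\nu), and the coefficient numbering b_0 to b_4 are all assumptions for illustration.

```latex
\begin{aligned}
y_i &\sim t_{\nu}\left(\mu_i,\ \tau\right), \\
\mu_i &= b_0 + b_1\,\text{age}_i + b_2\,\text{genderMale}_i
        + b_3\,\text{salary}_i + b_4\,\text{statusSingle}_i, \\
b_j &\sim \mathcal{N}\left(0,\ \sigma_b^2\right)
        \quad \text{with } \sigma_b^2 \text{ large (weak prior)}, \\
\tau &\sim \text{Gamma}(a,\ b), \qquad \tau = 1/\sigma^2, \\
\nu &\sim p(\nu).
\end{aligned}
```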
Then I set the initial values for the simulation and run the MCMC with n.chain, which sets the number of independent chains (the chains do not depend on each other, so they can be handled by parallel workers when the setup supports it, and having several of them is what makes the convergence checks possible)
Also, we need to set the burn-in length and the number of iterations for each chain.
In this example, I use a burn-in of 1000 iterations and then run 5000 iterations
We can check the convergence of the model through convergence analysis
The trace plot shows no pattern = converged. If it shows a trend or a non-random pattern = we need to run longer
The Gelman-Rubin diagnostic also confirms the convergence of the model (potential scale reduction factor near 1 = converged)
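For intuition, here is a minimal sketch of the basic Gelman-Rubin calculation for one parameter. It compares the variance between chains with the variance within chains. The chains are simulated stand-ins, not the chains from this post, and library implementations may apply extra corrections.

```python
import math
import random
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W
    between = n * statistics.variance(chain_means)                       # B
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


rng = random.Random(0)
# Three hypothetical chains sampling the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# Three hypothetical chains stuck in different places (not converged).
stuck = [[rng.gauss(shift, 1.0) for _ in range(2000)] for shift in (0.0, 3.0, 6.0)]

rhat_mixed = gelman_rubin(mixed)  # close to 1
rhat_stuck = gelman_rubin(stuck)  # clearly above 1
```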
But upon checking the effective sample size, we might need to run longer, since the effective sizes of b[1], b[2] and b[4] are quite small.
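A rough way to see why autocorrelated chains have a small effective sample size is the sketch below. It uses a simple autocorrelation-sum estimator, which is not necessarily the estimator behind the numbers in this post, and the two chains are simulated for illustration.

```python
import random
import statistics


def effective_sample_size(draws):
    """Rough ESS: n / (1 + 2 * sum of autocorrelations), with the sum
    stopped at the first non-positive autocorrelation."""
    n = len(draws)
    mean = statistics.fmean(draws)
    centered = [x - mean for x in draws]
    var = sum(c * c for c in centered) / n
    rho_sum = 0.0
    for lag in range(1, n):
        acov = sum(centered[t] * centered[t + lag] for t in range(n - lag)) / n
        rho = acov / var
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


rng = random.Random(1)
# Hypothetical chain of independent draws.
iid = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# Hypothetical "sticky" chain: each draw depends strongly on the previous one.
sticky = [0.0]
for _ in range(1999):
    sticky.append(0.9 * sticky[-1] + rng.gauss(0.0, 1.0))

ess_iid = effective_sample_size(iid)        # close to the number of draws
ess_sticky = effective_sample_size(sticky)  # far fewer "independent" draws
```

Both chains have 2000 draws, but the sticky one carries much less information, which is why running more iterations helps in that case.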
The results of the model
The performance metrics suggest that classical LR is slightly better than Bayesian LR
This plot shows the observed vs fitted values with 95% credible intervals
The predictive means with 95% credible intervals also lead to a similar conclusion
Older customers tend to spend more money than younger customers
Female customers spend more money than male customers
The higher the salary, the higher the spending
Married people tend to spend more money than single people
None of the 95% credible intervals covers 0 = every variable has a credibly non-zero effect on the amount of money spent
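The "does the interval cover 0" check can be sketched as an equal-tailed interval taken from the sorted posterior draws of one coefficient. The draws below are simulated stand-ins, not the posterior from this post, and the function name is made up for the example.

```python
import random


def credible_interval(draws, level=0.95):
    """Equal-tailed interval read off the empirical quantiles of the draws."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1 - level) / 2
    lo = ordered[int(tail * (n - 1))]
    hi = ordered[int((1 - tail) * (n - 1))]
    return lo, hi


rng = random.Random(2)
# Hypothetical posterior draws for a single coefficient.
draws = [rng.gauss(3.6, 0.4) for _ in range(5000)]

low, high = credible_interval(draws)
excludes_zero = low > 0 or high < 0  # True when 0 lies outside the interval
```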
Here is a cool thing about Bayesian Linear Regression
Suppose we have 2 customers:
customer 1 -> age 30, male, income 75000, married
customer 2 -> age 35, female, income 70000, single
What's the probability that customer 2 spends more money than customer 1?
We can use Monte Carlo simulation to find the answer
And here's the result
We can conclude that there is a 76.36% chance that customer 2 spends more money than customer 1, so as a business manager we might prioritize customer 2, for example by giving them a coupon or suggesting a new product.
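The idea behind this Monte Carlo comparison can be sketched as follows: for every posterior draw of the coefficients, compute the expected spending of both customers and count how often customer 2 comes out ahead. The four draws below are hand-made to keep the arithmetic visible, so the 0.75 they give has nothing to do with the 76.36% above. The coefficient order in each draw is an assumption, and a posterior predictive version would also add a noise draw for each customer.

```python
def expected_spending(b, age, is_male, salary, is_single):
    """Linear predictor for one draw b = [intercept, age, genderMale, salary, statusSingle]."""
    return b[0] + b[1] * age + b[2] * is_male + b[3] * salary + b[4] * is_single


def prob_second_spends_more(draws, customer_1, customer_2):
    """Share of draws in which customer 2 has the higher expected spending."""
    wins = sum(
        expected_spending(b, **customer_2) > expected_spending(b, **customer_1)
        for b in draws
    )
    return wins / len(draws)


# The two customers from the post.
customer_1 = dict(age=30, is_male=1, salary=75_000, is_single=0)
customer_2 = dict(age=35, is_male=0, salary=70_000, is_single=1)

# Hand-made stand-in draws (hypothetical, NOT the posterior from this post).
draws = [
    [400.0, 4.0, -50.0, 0.003, -40.0],  # customer 2 ahead: 710 vs 695
    [400.0, 3.0, -45.0, 0.004, -50.0],  # customer 1 ahead: 745 vs 735
    [390.0, 4.0, -40.0, 0.002, -45.0],  # customer 2 ahead: 625 vs 620
    [395.0, 3.5, -60.0, 0.004, -45.0],  # customer 2 ahead: 752.5 vs 740
]

p = prob_second_spends_more(draws, customer_1, customer_2)  # 3 of 4 draws
```

With real MCMC output, `draws` would simply be the thousands of sampled coefficient vectors, and the same count gives the probability.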
To sum up, even though classical LR yields slightly better performance here, the interpretability of the Bayesian model is something that we can't ignore. Also, I believe that if we set more suitable priors that represent the prior knowledge we (may) have, the Bayesian model could perform better than classical LR
Link : https://github.com/Filimize/Film_S_blog/tree/main/2026_05_18_Bayesian_Linear_Regression

















