Loan Default Prediction:
Building a Loan Auto-Approval and Review System with Machine Learning
Applying for a loan can be a long, stressful process, not just for customers but also for loan officers who must carefully review applications one by one. At scale, this manual review process is both time-consuming and prone to human fatigue, which increases the risk of overlooking fraudulent or risky applications.
To solve this, I worked on a project that leverages machine learning to predict loan defaults and automatically decide whether an application can be auto-approved or should be sent for human review. The goal is simple: ease the workload on human reviewers while minimizing risks, creating a faster and more efficient loan approval process.
Problem Statement
Relying solely on humans slows down the process, while relying solely on machines introduces risk. This project provides a hybrid solution:
Low-risk customers are auto-approved.
Borderline or high-risk customers are flagged for human review.
This way, the system balances automation with human oversight.
Approach
The goal of this project is to predict whether a loan applicant will default. The system makes a binary classification: 0 means no default and 1 means default. These outputs map directly to decisions: a "0" leads to auto-approval, while a "1" sends the application for human review. This way, the process is faster for low-risk customers and safer for higher-risk ones.
1. Data Collection & Feature Engineering
I used loan applicant data from 2016 to 2017, including both numerical features (e.g., loan amount, term days, repayment delays, birthdate, longitude, latitude) and categorical features (e.g., bank account type, employment status). Features were selected and combined to reflect borrower behavior, demographics, and financial patterns that help predict the target (whether the borrower will default on loan payments).
I built a KMeans clustering pipeline to group customers into three clusters, then assigned each cluster a risk level (0 to 2) based on how likely its members were to default.
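To make the clustering step concrete, here is a minimal sketch of how cluster ids could be turned into risk levels. It assumes the KMeans step has already assigned each customer a cluster id, and that risk levels are obtained by ranking clusters on their observed default rate. The function name `risk_levels_from_clusters` and the toy data are illustrative, not taken from the project code.

```python
from collections import defaultdict

def risk_levels_from_clusters(cluster_ids, defaulted):
    """Map each cluster id to a risk level (0 = lowest observed default rate)."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for cluster, label in zip(cluster_ids, defaulted):
        totals[cluster] += 1
        defaults[cluster] += label
    # Observed default rate per cluster.
    rates = {c: defaults[c] / totals[c] for c in totals}
    # Rank clusters from safest to riskiest; the rank becomes the risk level.
    ordered = sorted(rates, key=rates.get)
    return {cluster: level for level, cluster in enumerate(ordered)}

# Toy data (made up): cluster ids as KMeans might assign them, plus the default flag.
cluster_ids = [2, 2, 0, 0, 0, 1, 1, 1, 2, 0]
defaulted = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

mapping = risk_levels_from_clusters(cluster_ids, defaulted)
risk_level = [mapping[c] for c in cluster_ids]
print(mapping)     # {1: 0, 0: 1, 2: 2}
print(risk_level)  # [2, 2, 1, 1, 1, 0, 0, 0, 2, 1]
```

In a setup like this, the default rates would normally be computed on the training split only, so that the risk level does not leak target information from the test set.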
I trained Logistic Regression, Random Forest, and XGBoost models to identify the most important features. From this, I selected the top 20 features to reduce noise and strengthen the base models. Finally, I added the risk level as an additional feature, giving a total of 21 features for the voting system.
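One way to combine importances from several models into a single shortlist is to average each feature's rank across the models. The sketch below shows that idea. The aggregation rule is an assumption (the write-up does not say how the three models' importances were merged), and the importance values are made up for illustration.

```python
def top_k_features(importances_by_model, k):
    """Pick the k features with the best average rank across models.

    Assumes every model scores the same set of features.
    """
    rank_sums = {}
    for importances in importances_by_model.values():
        # Rank 0 is the most important feature for this model. Absolute values
        # are used so signed coefficients (e.g. logistic regression) compare fairly.
        ordered = sorted(importances, key=lambda f: abs(importances[f]), reverse=True)
        for rank, feature in enumerate(ordered):
            rank_sums[feature] = rank_sums.get(feature, 0) + rank
    n_models = len(importances_by_model)
    avg_rank = {f: total / n_models for f, total in rank_sums.items()}
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(avg_rank, key=lambda f: (avg_rank[f], f))[:k]

# Hypothetical importance scores, for illustration only.
importances = {
    "logistic_regression": {"max_repayment_delay": 1.2, "credit_score": -0.9,
                            "age": 0.1, "loan_number": 0.4},
    "random_forest": {"max_repayment_delay": 0.35, "credit_score": 0.30,
                      "age": 0.10, "loan_number": 0.25},
    "xgboost": {"max_repayment_delay": 0.30, "credit_score": 0.40,
                "age": 0.05, "loan_number": 0.25},
}

selected = top_k_features(importances, k=2)
print(selected)  # ['max_repayment_delay', 'credit_score']
```

Ranking rather than averaging raw scores avoids mixing scales, since regression coefficients and tree-based importances are not directly comparable.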
Final features included:
Loan history (loan growth trend, last loan amount, loan number, average past loan amount, standard deviation of past loan amounts, average past term days, average interval between loans, average past payout time)
Financial ratios (credit score (0 to 5), debt-to-loan ratio, total due, average total due, risk level (0 to 2) from clustering)
Repayment behavior (percentage of overdue payments, maximum repayment delay)
Customer demographics (age, employment status, state location, bank name, bank account type)
2. Modeling with an Ensemble of Classifiers
Instead of relying on a single algorithm, I built a voting ensemble of:
Logistic Regression
Random Forest
XGBoost
LightGBM
CatBoost
Each model was tuned individually and given a custom decision threshold to account for class imbalance in the loan default data (roughly 78% non-default to 22% default). The ensemble then combines their votes to produce a final prediction.
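The voting step can be sketched as follows. The thresholds are the per-model values reported in the results section of this article. Two things are assumptions: that each threshold is applied to the model's predicted probability of default, and that the ensemble uses a simple majority (hard) vote. The function name `ensemble_vote` and the example probabilities are illustrative.

```python
# Per-model decision thresholds, as reported in the results section.
THRESHOLDS = {
    "logistic_regression": 0.81,
    "random_forest": 0.54,
    "xgboost": 0.33,
    "lightgbm": 0.56,
    "catboost": 0.62,
}

def ensemble_vote(default_probs, thresholds=THRESHOLDS):
    """Return (prediction, votes_for_default) for one applicant.

    default_probs maps model name -> predicted probability of default.
    Assumption: a model votes "default" when its probability meets its threshold.
    """
    votes = sum(
        1 for name, threshold in thresholds.items()
        if default_probs[name] >= threshold
    )
    # Assumption: simple majority decides the final class.
    prediction = 1 if votes > len(thresholds) / 2 else 0
    return prediction, votes

# Made-up probabilities for a single applicant.
probs = {
    "logistic_regression": 0.85,
    "random_forest": 0.60,
    "xgboost": 0.30,
    "lightgbm": 0.58,
    "catboost": 0.55,
}
print(ensemble_vote(probs))  # (1, 3)
```

With five voters a tie is impossible, which is one practical reason to keep the number of base models odd.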
3. Decision Layer: Auto-Approval vs Human Review
If the ensemble predicts non-default (0), the loan is auto-approved.
If it predicts default (1), the application is flagged for human review.
This ensures automation doesn't replace humans but instead augments them.
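The decision layer itself is a small mapping from the predicted class to a workflow action. A minimal sketch, where the action labels `auto_approve` and `human_review` are hypothetical names rather than identifiers from the project:

```python
from collections import Counter

def route_application(prediction):
    """Map the ensemble's class prediction to a workflow decision."""
    if prediction == 0:
        return "auto_approve"
    if prediction == 1:
        return "human_review"
    raise ValueError(f"expected 0 or 1, got {prediction!r}")

# Made-up batch of ensemble predictions.
predictions = [0, 0, 1, 0, 1]
decisions = [route_application(p) for p in predictions]

# Counting the decisions shows how much work lands on human reviewers.
workload = Counter(decisions)
print(workload["auto_approve"], workload["human_review"])  # 3 2
```

Rejecting any value other than 0 or 1 keeps a malformed model output from silently turning into an approval.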
4. Deployment with Streamlit
To make the system accessible, I built a Streamlit web app that:
Allows new and returning customers to apply for a loan.
Lets admin reviewers view predictions and model confidence.
Results
My objective was to minimize financial risk while correctly approving as many loans as possible. The model also shouldn't be too strict, since flagging too many applications would increase rather than reduce the workload on human reviewers. Since the dataset's target was imbalanced, I applied SMOTE and class weighting to regulate how the models penalize misclassifications. I benchmarked several machine learning models on precision, recall, F1-score, accuracy, and ROC-AUC to capture performance under class imbalance.
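To illustrate what class weighting does with a 78/22 split, the sketch below computes "balanced" weights, where each class is weighted by n_samples / (n_classes * class_count). This is the same formula scikit-learn uses for `class_weight="balanced"`. Whether the project used exactly this scheme is an assumption, and the label list is synthetic.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency.

    Uses n_samples / (n_classes * class_count), so a perfectly
    balanced dataset gets a weight of 1.0 for every class.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Synthetic labels matching the 78% / 22% split described in the article.
labels = [0] * 78 + [1] * 22
weights = balanced_class_weights(labels)
print({cls: round(w, 2) for cls, w in weights.items()})  # {0: 0.64, 1: 2.27}
```

With these weights, misclassifying a defaulter costs roughly 3.5 times as much as misclassifying a safe borrower, which pushes the models toward higher recall on the minority class.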
Logistic Regression
ROC-AUC: 0.71, Accuracy: 69%
Class Performance:
Non-default (0): Precision 0.87, Recall 0.72
Default (1): Precision 0.37, Recall 0.61
Confusion Matrix:
With a threshold of 0.81, the logistic regression model is strong at predicting non-default borrowers, meaning most approved loans are indeed safe. However, it is weaker at spotting risky borrowers (defaulters), so some customers who are likely to default may still get approved.
Random Forest
ROC-AUC: 0.68, Accuracy: 62%
Class Performance:
Non-default (0): Precision 0.85, Recall 0.62
Default (1): Precision 0.31, Recall 0.62
Confusion Matrix:
With a threshold of 0.54, the random forest model shows a balanced recall across both classes, meaning it is relatively better at catching risky borrowers (defaulters) than logistic regression. However, this comes at the cost of lower precision, so while more defaulters are flagged, some safe customers may also get flagged for review.
XGBoost
ROC-AUC: 0.65, Accuracy: 62%
Class Performance:
Non-default (0): Precision 0.85, Recall 0.62
Default (1): Precision 0.30, Recall 0.59
Confusion Matrix:
With a threshold of 0.33, the XGBoost model performs similarly to Random Forest, capturing a fair share of risky borrowers (defaulters) with moderate recall.
LightGBM
ROC-AUC: 0.69, Accuracy: 62%
Class Performance:
Non-default (0): Precision 0.85, Recall 0.70
Default (1): Precision 0.35, Recall 0.57
Confusion Matrix:
With a threshold of 0.56, the LightGBM model offers a more balanced trade-off between precision and recall compared to Random Forest and XGBoost. It is fairly strong at identifying safe borrowers while capturing more than half of risky borrowers.
CatBoost
ROC-AUC: 0.70, Accuracy: 62%
Class Performance:
Non-default (0): Precision 0.86, Recall 0.70
Default (1): Precision 0.35, Recall 0.59
Confusion Matrix:
With a threshold of 0.62, the CatBoost model shows strong performance in identifying non-default borrowers, similar to LightGBM, while offering slightly better recall for risky borrowers. This means it can catch more potential defaulters without significantly sacrificing accuracy, making it a reliable choice for balancing speed and risk.
ROC-AUC Curve
All models perform better than random guessing (ROC-AUC = 0.5), but Logistic Regression (0.71), CatBoost (0.70), and LightGBM (0.69) separate defaulters from safe borrowers better than Random Forest (0.68) and XGBoost (0.65).
Voting Ensemble (Final System)
Class Performance:
Non-default (0): Precision 0.85, Recall 0.70
Default (1): Precision 0.34, Recall 0.56
Confusion Matrix:
The Voting Ensemble combines all individual models, producing more stable predictions. Its performance does not drop significantly compared to the base models, making it effective for loan default prediction.
The idea is to send the 310 predicted defaults (204 + 106) for human review to sift out those who are truly eligible for loan approval. If reviewers approve all 204 eligible applicants among the 310 flagged and reject the other 106, then only 84 of the 768 approved loans (11%) actually default. This approach balances loan approval speed, human reviewer workload, and financial risk.
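The arithmetic behind these figures can be checked with a short script. The counts 204, 106, 84, and 768 are quoted from the paragraph above. The number of correctly auto-approved applicants (480) is not quoted anywhere; it is derived here as 768 - 204 - 84, and the variable names are illustrative.

```python
# Counts quoted in the text for the voting ensemble.
flagged_safe = 204      # flagged for review but would have repaid (false positives)
flagged_default = 106   # flagged for review and would default (true positives)
missed_default = 84     # auto-approved but default (false negatives)
total_approved = 768    # auto-approved loans plus the 204 cleared by reviewers

# Derived, not quoted: auto-approved applicants who repay (true negatives).
auto_approved_safe = total_approved - flagged_safe - missed_default

review_queue = flagged_safe + flagged_default
total = auto_approved_safe + flagged_safe + missed_default + flagged_default

default_precision = flagged_default / review_queue
default_recall = flagged_default / (flagged_default + missed_default)
nondefault_precision = auto_approved_safe / (auto_approved_safe + missed_default)
nondefault_recall = auto_approved_safe / (auto_approved_safe + flagged_safe)

# Share of applications that reach a human, and the default rate
# among approved loans if reviewers make perfect decisions.
review_share = review_queue / total
default_rate_after_review = missed_default / total_approved

print(f"default:     precision {default_precision:.2f}, recall {default_recall:.2f}")
print(f"non-default: precision {nondefault_precision:.2f}, recall {nondefault_recall:.2f}")
print(f"sent to review: {review_share:.0%}, defaults among approved: {default_rate_after_review:.0%}")
```

The derived precision and recall values round to the class performance reported for the ensemble (0.34 / 0.56 and 0.85 / 0.70), and about 35% of applications end up in the human review queue.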
Deep Neural Network
Class Performance:
Non-default (0): Precision 0.86, Recall 0.63
Default (1): Precision 0.32, Recall 0.64
Confusion Matrix:
The DNN model is fairly good at predicting non-default borrowers, so most approved loans are safe. Its ability to detect risky borrowers is moderate: it catches about 64% of defaulters (recall 0.64) but still misses the rest. Overall, the DNN did not significantly outperform the voting ensemble, meaning the simpler ensemble approach remains an effective and reliable choice for the auto-approval system.
Impact
By blending machine learning with human oversight, this system provides:
Faster loan approvals for customers.
Reduced workload for human reviewers.
Lower financial risk for lenders.
Instead of replacing humans, the model works alongside them, ensuring decisions are faster, fairer, and more accurate.
Future Improvements
Enhanced Data Collection: Gather more granular and correlated financial and behavioral data, such as income, payment frequency, employment history, and marital status, to capture richer borrower patterns.
Expanded Feature Engineering: Incorporate transaction-level features and design more sophisticated features, especially to improve the Deep Neural Network's performance.
Model Optimization: Explore advanced architectures and hyperparameter tuning for the DNN to better capture nonlinear relationships in borrower behavior.
The code and implementation details are available on my GitHub repo.