Scikit-Learn Industry Models - South Africa
Collection (8 items): Four sklearn GradientBoostingClassifier pipelines for banking, insurance, retail, and mining use cases, trained on South African data.
How to use ThabangTheActuaryCoder/banking-credit-scoring-model with scikit-learn:

from huggingface_hub import hf_hub_download
import joblib

# Only load pickle files from sources you trust.
# Read more: https://skops.readthedocs.io/en/stable/persistence.html
model = joblib.load(
    hf_hub_download("ThabangTheActuaryCoder/banking-credit-scoring-model", "sklearn_model.joblib")
)

A GradientBoostingClassifier pipeline for predicting credit default risk, trained on South African banking data.
This model is intended for educational and demonstration purposes as part of an end-to-end ML pipeline showcasing Databricks, MLflow, Azure ML, and Hugging Face Hub integration.
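
A joblib file is a pickle, and unpickling can execute arbitrary code, which is why the loading snippet warns about trusted sources. One common precaution, shown here as a minimal sketch and not as something this repository provides, is to compare the downloaded file's SHA-256 digest with a digest you recorded yourself before calling `joblib.load`. The helper names `sha256_of_file` and `verify_before_load` are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large model files are not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_before_load(path, expected_sha256):
    """Return `path` if its digest matches `expected_sha256`, else raise ValueError.

    `expected_sha256` is a digest you recorded from a copy you trust
    (hypothetical here; the model card does not publish one).
    """
    actual = sha256_of_file(path)
    if actual != expected_sha256.lower():
        raise ValueError(f"Checksum mismatch for {path}: got {actual}")
    return path
```

With a recorded digest in hand, the load step could become `joblib.load(verify_before_load(model_path, expected_digest))`. A matching digest only confirms the file is the one you reviewed earlier; it does not by itself make an unknown file safe.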
Model details:

| Property | Value |
|---|---|
| Classifier | GradientBoostingClassifier |
| Pipeline steps | preprocessor -> classifier |
| Training samples | 8,000 |
| Test samples | 2,000 |
| Target column | target |
| Created | 2026-06-16T15:37:09.866740+00:00 |

Performance metrics:

| Metric | Score |
|---|---|
| Accuracy | 0.8980 |
| Precision | 0.7823 |
| Recall | 0.6216 |
| F1 | 0.6928 |
| ROC AUC | 0.8918 |
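
The F1 score is the harmonic mean of precision and recall, so the value in the table can be recomputed from the two rows above it as a quick consistency check. The snippet below uses only the numbers reported in the table.

```python
# Values copied from the metrics table.
precision = 0.7823
recall = 0.6216

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.4f}")  # F1 = 0.6928
```

The recomputed value agrees with the reported F1. Recall is the lower of the two inputs, which means a larger share of actual defaults is missed than the 0.8980 accuracy alone would suggest.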

Input features:

Numeric: age, annual_income, employment_years, loan_amount, credit_score, num_late_payments, debt_to_income_ratio, num_open_accounts, months_since_last_delinquency

Categorical: education_level, employment_type, province
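
The card names the pipeline steps (preprocessor -> classifier) and the input columns, but not the transformers inside the preprocessor or the classifier's hyperparameters. The following is a minimal sketch of how a pipeline with this shape could be assembled. The choice of `StandardScaler` for numeric columns, `OneHotEncoder` for categorical columns, and default `GradientBoostingClassifier` settings are all assumptions, not the published configuration.

```python
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Column names taken from the feature lists in the model card.
NUMERIC = [
    "age", "annual_income", "employment_years", "loan_amount", "credit_score",
    "num_late_payments", "debt_to_income_ratio", "num_open_accounts",
    "months_since_last_delinquency",
]
CATEGORICAL = ["education_level", "employment_type", "province"]


def build_pipeline(random_state=0):
    """Assemble a preprocessor -> classifier pipeline (illustrative only)."""
    preprocessor = ColumnTransformer([
        # Assumption: numeric columns are standardised.
        ("num", StandardScaler(), NUMERIC),
        # Assumption: categories are one-hot encoded and unseen values ignored.
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ])
    return Pipeline([
        ("preprocessor", preprocessor),
        ("classifier", GradientBoostingClassifier(random_state=random_state)),
    ])
```

Calling `build_pipeline().fit(X_train, y_train)` on a DataFrame with these twelve columns would train such a pipeline. Keeping preprocessing inside the pipeline means the same transformations are applied at prediction time without extra code.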

Usage example:

import joblib
import pandas as pd
from huggingface_hub import hf_hub_download

# Download and load the model
model_path = hf_hub_download(
    repo_id="ThabangTheActuaryCoder/banking-credit-scoring-model",
    filename="credit_scoring_model.joblib",
)
model = joblib.load(model_path)

# Create a one-row sample input. The zeros are placeholders, not a realistic
# applicant; the categorical columns may expect the labels seen in training.
sample = pd.DataFrame([{
    "age": 0, "annual_income": 0, "employment_years": 0,
    "loan_amount": 0, "credit_score": 0, "num_late_payments": 0,
    "debt_to_income_ratio": 0, "num_open_accounts": 0,
    "months_since_last_delinquency": 0,
    "education_level": 0, "employment_type": 0, "province": 0,
}])

# Predict the class label and the per-class probabilities
prediction = model.predict(sample)
probabilities = model.predict_proba(sample)
print(f"Prediction: {prediction}, Probabilities: {probabilities}")
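
`predict_proba` returns one column per class, ordered as in `model.classes_`, so the default probability has to be picked out by class label instead of by a fixed position. The sketch below does that and applies a decision threshold. It assumes the default class is labelled `1`, which the card does not state, and the helper names are hypothetical.

```python
def positive_class_probability(model, X, positive_label=1):
    """Return the predicted probability of `positive_label` for each row of X.

    Assumption: `positive_label` (1 by default) marks a credit default.
    """
    # Find which predict_proba column corresponds to the positive class.
    classes = list(model.classes_)
    idx = classes.index(positive_label)
    return [float(row[idx]) for row in model.predict_proba(X)]


def flag_high_risk(probabilities, threshold=0.5):
    """Return True for each probability at or above `threshold`."""
    return [p >= threshold for p in probabilities]
```

For the loaded model this could be used as `flag_high_risk(positive_class_probability(model, sample), threshold=0.5)`. Lowering the threshold flags more applicants and would typically trade precision for recall.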