Banking Credit Scoring Model

A GradientBoostingClassifier pipeline for predicting credit default risk, trained on South African banking data.

Intended Use

This model is intended for educational and demonstration purposes as part of an end-to-end ML pipeline showcasing Databricks, MLflow, Azure ML, and Hugging Face Hub integration.

Model Details

| Property | Value |
| --- | --- |
| Classifier | GradientBoostingClassifier |
| Pipeline steps | preprocessor -> classifier |
| Training samples | 8,000 |
| Test samples | 2,000 |
| Target column | target |
| Created | 2026-06-16T15:37:09.866740+00:00 |

Evaluation Metrics

| Metric | Score |
| --- | --- |
| Accuracy | 0.8980 |
| Precision | 0.7823 |
| Recall | 0.6216 |
| F1 | 0.6928 |
| ROC AUC | 0.8918 |
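
The reported scores can be cross-checked against each other with a little arithmetic. The sketch below recomputes F1 from the reported precision and recall, then estimates the confusion-matrix counts implied by the rounded metrics and the 2,000 test samples. The counts are inferred values, not numbers read from the model's evaluation output, so treat them as an approximation.

```python
# Reported test-set metrics and test-set size from this model card.
accuracy, precision, recall = 0.8980, 0.7823, 0.6216
n_test = 2000

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Misclassified samples: FP + FN = (1 - accuracy) * n_test.
errors = (1 - accuracy) * n_test

# FP = TP * (1/precision - 1) and FN = TP * (1/recall - 1),
# so TP follows from the total error count.
fp_per_tp = 1 / precision - 1
fn_per_tp = 1 / recall - 1
tp = round(errors / (fp_per_tp + fn_per_tp))
fp = round(tp * fp_per_tp)
fn = round(tp * fn_per_tp)
tn = n_test - tp - fp - fn

print(f"F1 = {f1:.4f}")
print(f"Implied counts: TP={tp}, FP={fp}, FN={fn}, TN={tn}")
```

Under these assumptions the recomputed F1 agrees with the reported 0.6928, and the implied counts show more missed defaults (false negatives) than false alarms (false positives), which is what a recall below precision means in practice.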

Confusion Matrix

[Figure: confusion matrix]

ROC Curve

[Figure: ROC curve]

Feature Importance

[Figure: feature importance]

Features

Numeric: age, annual_income, employment_years, loan_amount, credit_score, num_late_payments, debt_to_income_ratio, num_open_accounts, months_since_last_delinquency

Categorical: education_level, employment_type, province
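
Because the pipeline expects exactly these twelve columns, it can help to validate a record before passing it to the model. The following is a minimal sketch of such a check. The helper name `validate_record` is hypothetical, and the rule that categorical features arrive as strings is an assumption, since this card does not state how the categories are encoded.

```python
NUMERIC_FEATURES = [
    "age", "annual_income", "employment_years", "loan_amount",
    "credit_score", "num_late_payments", "debt_to_income_ratio",
    "num_open_accounts", "months_since_last_delinquency",
]
CATEGORICAL_FEATURES = ["education_level", "employment_type", "province"]


def validate_record(record: dict) -> list:
    """Return a list of problems found in one input record (empty if none)."""
    expected = NUMERIC_FEATURES + CATEGORICAL_FEATURES
    problems = []

    # Every expected feature must be present, and nothing else.
    problems += [f"missing feature: {name}" for name in expected if name not in record]
    problems += [f"unexpected feature: {name}" for name in record if name not in expected]

    # Numeric features must be int or float (bool is rejected explicitly).
    for name in NUMERIC_FEATURES:
        if name in record:
            value = record[name]
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{name} should be numeric")

    # Assumption: categorical features are passed as strings.
    for name in CATEGORICAL_FEATURES:
        if name in record and not isinstance(record[name], str):
            problems.append(f"{name} should be a string")

    return problems
```

An empty list means the record has the expected shape. It does not confirm that a categorical label is one the model saw during training.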

Sample Usage

import joblib
from huggingface_hub import hf_hub_download
import pandas as pd

# Download and load the model
model_path = hf_hub_download(
    repo_id="ThabangTheActuaryCoder/banking-credit-scoring-model",
    filename="credit_scoring_model.joblib",
)
model = joblib.load(model_path)

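# Create a sample input. All values are illustrative placeholders, and the
# categorical labels are assumptions: use categories the model saw in training.
sample = pd.DataFrame([{
    "age": 35,
    "annual_income": 350000,
    "employment_years": 8,
    "loan_amount": 120000,
    "credit_score": 650,
    "num_late_payments": 1,
    "debt_to_income_ratio": 0.35,
    "num_open_accounts": 4,
    "months_since_last_delinquency": 12,
    "education_level": "Bachelor",
    "employment_type": "Permanent",
    "province": "Gauteng",
}])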

# Predict
prediction = model.predict(sample)
probabilities = model.predict_proba(sample)
print(f"Prediction: {prediction}, Probabilities: {probabilities}")
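
For a binary classifier, `predict` effectively labels a row as positive when its positive-class probability is at least 0.5. With a reported recall of 0.6216, a lower cutoff may be worth exploring if missed defaults are costlier than false alarms, at the price of lower precision. The sketch below applies a custom threshold to `predict_proba`-style output. The helper `flag_defaults` is hypothetical, the probabilities are made-up values, and the code assumes column index 1 is the default class, which should be confirmed against `model.classes_`.

```python
def flag_defaults(probabilities, threshold=0.5, positive_index=1):
    """Return 1 for rows whose positive-class probability meets the threshold."""
    return [int(row[positive_index] >= threshold) for row in probabilities]


# Made-up probabilities in the [P(class 0), P(class 1)] layout of predict_proba.
example_probabilities = [[0.90, 0.10], [0.65, 0.35], [0.20, 0.80]]

print(flag_defaults(example_probabilities))                 # [0, 0, 1]
print(flag_defaults(example_probabilities, threshold=0.3))  # [0, 1, 1]
```

Lowering the threshold from 0.5 to 0.3 flags the second row as well, which illustrates how recall can be traded against precision without retraining.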