Scikit-Learn Industry Models - South Africa (collection): four sklearn GradientBoostingClassifier pipelines for banking, insurance, retail, and mining use cases, trained on South African data.
How to use ThabangTheActuaryCoder/insurance-fraud-detection-model with Scikit-learn:

```python
from huggingface_hub import hf_hub_download
import joblib

# Only load pickle files from sources you trust.
# Read more: https://skops.readthedocs.io/en/stable/persistence.html
model = joblib.load(
    hf_hub_download("ThabangTheActuaryCoder/insurance-fraud-detection-model", "sklearn_model.joblib")
)
```

A GradientBoostingClassifier pipeline for detecting fraudulent insurance claims, trained on South African insurance data.
This model is intended for educational and demonstration purposes as part of an end-to-end ML pipeline showcasing Databricks, MLflow, Azure ML, and Hugging Face Hub integration.
| Property | Value |
|---|---|
| Classifier | GradientBoostingClassifier |
| Pipeline steps | preprocessor -> classifier |
| Training samples | 6,400 |
| Test samples | 1,600 |
| Target column | target |
| Created | 2026-06-16T15:37:40.689330+00:00 |

Evaluation metrics:

| Metric | Score |
|---|---|
| Accuracy | 0.9094 |
| Precision | 0.5385 |
| Recall | 0.7730 |
| F1 | 0.6348 |
| ROC AUC | 0.9418 |
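
The F1 score in the metrics table can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. A minimal sketch using only the numbers from the table; the variable names are illustrative:

```python
# Values copied from the metrics table
precision = 0.5385
recall = 0.7730

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.4f}")  # F1 = 0.6348
```

Assuming the positive class is fraud, these figures mean the model catches roughly three of every four fraudulent claims, while a little under half of the claims it flags are false alarms (1 - 0.5385 ≈ 0.46).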

Input features:

Numeric: claim_amount, policy_tenure_months, customer_age, num_prior_claims, premium_amount, days_to_report, witness_present, police_report_filed, vehicle_age_years

Categorical: incident_type, province
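
The card names the pipeline steps (preprocessor -> classifier) and the feature columns, but it does not say what the preprocessor does. As a rough sketch, one common way to assemble a pipeline of this shape is a ColumnTransformer that scales the numeric columns and one-hot encodes the categorical ones. The use of StandardScaler and OneHotEncoder, the transformer names `num` and `cat`, and the default hyperparameters are all assumptions here, not the published model's configuration.

```python
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Feature columns as listed in the model card
NUMERIC = [
    "claim_amount", "policy_tenure_months", "customer_age",
    "num_prior_claims", "premium_amount", "days_to_report",
    "witness_present", "police_report_filed", "vehicle_age_years",
]
CATEGORICAL = ["incident_type", "province"]

# Assumed preprocessing: scale numeric columns, one-hot encode categoricals
preprocessor = ColumnTransformer([
    ("num", StandardScaler(), NUMERIC),
    ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
])

# Step names match the "Pipeline steps" row of the model card
pipeline = Pipeline([
    ("preprocessor", preprocessor),
    ("classifier", GradientBoostingClassifier(random_state=0)),
])

print([name for name, _ in pipeline.steps])  # ['preprocessor', 'classifier']
```

Setting `handle_unknown="ignore"` lets such a pipeline score rows whose category was never seen in training instead of raising an error; whether the published model behaves this way is not stated in the card.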
Example usage:

```python
import joblib
import pandas as pd
from huggingface_hub import hf_hub_download

# Download and load the model
model_path = hf_hub_download(
    repo_id="ThabangTheActuaryCoder/insurance-fraud-detection-model",
    filename="fraud_detection_model.joblib",
)
model = joblib.load(model_path)

# Create a one-row sample input. The zeros are placeholders: replace them with real
# values, using category labels seen in training for incident_type and province.
sample = pd.DataFrame([{
    "claim_amount": 0, "policy_tenure_months": 0,
    "customer_age": 0, "num_prior_claims": 0,
    "premium_amount": 0, "days_to_report": 0,
    "witness_present": 0, "police_report_filed": 0,
    "vehicle_age_years": 0,
    "incident_type": 0, "province": 0,
}])

# Predict the class label and the per-class probabilities
prediction = model.predict(sample)
probabilities = model.predict_proba(sample)
print(f"Prediction: {prediction}, Probabilities: {probabilities}")
```
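
For a binary classifier, `predict` picks the class with the higher probability, which is equivalent to a 0.5 cutoff on the positive-class probability. Because the reported precision (0.5385) is lower than the recall (0.7730), a stricter cutoff may suit some review workflows. The sketch below applies a custom threshold to `predict_proba`-style output. The probability rows, the 0.7 threshold, and the helper name `flag_claims` are made up for illustration, and column 1 is assumed to be the fraud class.

```python
# Made-up stand-in for model.predict_proba(batch): one row per claim,
# columns are [P(class 0), P(class 1)]; class 1 is assumed to mean fraud
probabilities = [
    [0.95, 0.05],
    [0.40, 0.60],
    [0.20, 0.80],
]

def flag_claims(proba, threshold):
    """Return 1 for each row whose positive-class probability meets the threshold."""
    return [int(row[1] >= threshold) for row in proba]

print(flag_claims(probabilities, 0.5))  # [0, 1, 1]
print(flag_claims(probabilities, 0.7))  # [0, 0, 1]
```

Raising the threshold generally trades recall for precision. A suitable value depends on the relative cost of a missed fraud versus a false alarm and is best chosen on held-out validation data.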