House Prices - Tabular Models (CatBoost + XGBoost baseline)

Pre-trained baseline models for the t22000t/house-prices-tabular dataset, produced by the tabular-data-modelling-pipeline.

This is the v1 baseline drop - CatBoost + XGBoost trained with sensible defaults (no Optuna tuning) on an 80/20 random split. A follow-up release will add the six deep-learning architectures (CANN, CANN-GBM, FT-Transformer, TabM, LocalGLMnet, DRN) once they're retrained on this dataset.

Results

Model	Test Gini	Train Gini	Test MAE (USD)	Test RMSE (USD)	A/E ratio	Model size	Training time
CatBoost	0.2061	0.2203	16,868	27,063	1.025	1,041 trees	4.4 s
XGBoost	0.2049	0.2212	17,204	29,716	0.999	462 trees	0.3 s
Stacked ensemble (NNLS)	0.2049	0.2212	17,204	29,716	0.999	(2 weights)	-

Test set: 304 rows (20% of 1,460)
Target: SalePrice (USD)
Loss: Gamma deviance (gamma family, log link)
Target cap: 99.5th percentile = $555,355 (6 rows winsorised)
Random seed: 42
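
The metrics reported above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the pipeline's own code: it assumes Gini is the un-normalised Lorenz-curve Gini of actual prices ordered by predicted price, and that A/E is total actual over total predicted. The function names are hypothetical.

```python
import math

def lorenz_gini(actual, predicted):
    """Gini from the Lorenz curve of `actual`, ordered by ascending `predicted`.

    Assumption: un-normalised Gini, i.e. 1 - 2 * (area under the Lorenz curve).
    """
    order = sorted(range(len(actual)), key=lambda i: predicted[i])
    total = sum(actual)
    n = len(actual)
    cum = 0.0
    area = 0.0
    for i in order:
        prev = cum
        cum += actual[i] / total
        area += (prev + cum) / 2.0 / n  # trapezoid of width 1/n
    return 1.0 - 2.0 * area

def ae_ratio(actual, predicted):
    """Assumed definition: sum of actuals divided by sum of predictions."""
    return sum(actual) / sum(predicted)

def mean_gamma_deviance(actual, predicted):
    """Mean unit gamma deviance; requires strictly positive values."""
    return sum(
        2.0 * ((y - mu) / mu - math.log(y / mu))
        for y, mu in zip(actual, predicted)
    ) / len(actual)
```

Under this definition a model that ranks prices perfectly scores the Gini of the price distribution itself, which is why values well below 1 are expected here.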

The NNLS-stacked ensemble currently degenerates to XGBoost; with more diverse base learners (the upcoming DL drop) it will pick a non-trivial blend.
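
To illustrate how an NNLS stack can collapse onto one model: NNLS minimises the squared error of the weighted blend subject to every weight being non-negative, so when one base learner already explains the target best, the optimum can sit on the boundary where the other weight is exactly zero. The sketch below solves the two-learner case exactly by checking each active set. The function name `nnls_two` and the toy data are hypothetical, and the pipeline's actual solver is not described here.

```python
def nnls_two(p1, p2, y):
    """Exact non-negative least squares for two prediction columns.

    Enumerates the active sets: both weights zero, one weight free,
    or both free (kept only if the unconstrained solution is non-negative).
    """
    a11 = sum(a * a for a in p1)
    a22 = sum(b * b for b in p2)
    a12 = sum(a * b for a, b in zip(p1, p2))
    b1 = sum(a * t for a, t in zip(p1, y))
    b2 = sum(b * t for b, t in zip(p2, y))

    def sse(w):
        return sum((w[0] * a + w[1] * b - t) ** 2 for a, b, t in zip(p1, p2, y))

    candidates = [(0.0, 0.0)]
    if a11 > 0:
        candidates.append((max(b1 / a11, 0.0), 0.0))
    if a22 > 0:
        candidates.append((0.0, max(b2 / a22, 0.0)))
    det = a11 * a22 - a12 * a12
    if det > 1e-12 * a11 * a22:  # skip (near-)collinear columns
        w1 = (b1 * a22 - b2 * a12) / det
        w2 = (a11 * b2 - a12 * b1) / det
        if w1 >= 0 and w2 >= 0:
            candidates.append((w1, w2))
    return min(candidates, key=sse)

# Toy data: the second learner matches the target, the first does not.
y = [1.0, 2.0, 3.0, 4.0]
weak = [2.0, 1.0, 4.0, 3.0]
strong = [1.0, 2.0, 3.0, 4.0]
print(nnls_two(weak, strong, y))  # (0.0, 1.0)
```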

Files

File	What it is	Size
`catboost.cbm`	Trained CatBoost model (native format)	1.2 MB
`xgboost.json`	Trained XGBoost Booster (native JSON format)	1.3 MB
`evaluation_summary.csv`	Per-model train/test Gini, MAE, RMSE, A/E ratio, gamma deviance	315 B
`ensemble_weights.json`	NNLS-stacked weights over base predictions	53 B
`dashboard_dl_models.html`	Interactive Plotly dashboard (Lorenz curves, calibration deciles, ensemble plots)	4.6 MB
`figures/fig_dl_*.png`	Publication-quality figures matching the dashboard	~6 MB total
`model_summary.json`	Structured run record (config, metrics, timing)	3.2 KB

Loading and inference

CatBoost

from huggingface_hub import hf_hub_download
from catboost import CatBoostRegressor
import pandas as pd

path = hf_hub_download(
    repo_id="t22000t/house-prices-tabular-models",
    filename="catboost.cbm",
)
model = CatBoostRegressor()
model.load_model(path)

# Load the dataset and predict
df = pd.read_csv("hf://datasets/t22000t/house-prices-tabular/train.csv")
# Use only the columns the model was trained on (see model_summary.json)
features = [
    "LotArea", "YearBuilt", "YearRemodAdd", "TotalBsmtSF", "1stFlrSF",
    "2ndFlrSF", "GrLivArea", "FullBath", "BedroomAbvGr", "TotRmsAbvGrd",
    "GarageCars", "GarageArea", "OverallQual", "OverallCond",
    "MSZoning", "Street", "LotShape", "Neighborhood", "BldgType",
    "HouseStyle", "RoofStyle", "ExterQual", "Foundation", "Heating",
    "CentralAir", "KitchenQual", "SaleType", "SaleCondition",
]
preds = model.predict(df[features])

XGBoost

from huggingface_hub import hf_hub_download
import xgboost as xgb

path = hf_hub_download(
    repo_id="t22000t/house-prices-tabular-models",
    filename="xgboost.json",
)
booster = xgb.Booster()
booster.load_model(path)

# Predictions require the exact feature order used at training time;
# easiest path is to re-run the pipeline's preprocessing - see the
# GitHub repo for the full feature build code.
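
As a small illustration of the feature-order requirement, the helper below reorders dict-style rows into a fixed column order and fails loudly when a column is missing. It is a sketch only: `align_rows` is a hypothetical name, and the order would have to come from the pipeline's preprocessing or, if the names were stored with the model, from `booster.feature_names`. It does not reproduce the pipeline's encoding of categorical columns.

```python
def align_rows(rows, feature_order):
    """Return each row as a list of values in `feature_order`.

    Raises KeyError if any row lacks one of the required features,
    rather than silently predicting on misaligned columns.
    """
    missing = [c for c in feature_order if any(c not in r for r in rows)]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [[r[c] for c in feature_order] for r in rows]

# Hypothetical usage with two of the feature names listed above.
rows = [{"GrLivArea": 1710, "LotArea": 8450}]
print(align_rows(rows, ["LotArea", "GrLivArea"]))  # [[8450, 1710]]
```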

Stacked ensemble

import json
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="t22000t/house-prices-tabular-models",
    filename="ensemble_weights.json",
)
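with open(path) as f:
    weights = json.load(f)
# weights = {"catboost": 0.0, "xgboost": 1.0}  (NNLS picked XGBoost only)

# cb_pred and xgb_pred are the CatBoost and XGBoost predictions for the
# same rows, e.g. `preds` from the CatBoost example above.
ensemble_pred = weights["catboost"] * cb_pred + weights["xgboost"] * xgb_pred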

Training configuration

Setting	Value
Pipeline	tabular-data-modelling-pipeline v0.1.0
Architecture mix	CatBoost + XGBoost (DL models excluded from this drop)
Hyperparameters	Defaults (see `modelling/models/__init__.py`) - no Optuna tuning
Optimiser	CatBoost: ordered boosting; XGBoost: hist tree method
Family / link	Gamma / log
Train/test split	Random 80/20, seed 42
Cap percentile	99.5
CV folds	5 (for stability check)
Hardware	Apple M-series, CPU
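
The "Cap percentile" setting refers to winsorising the target's upper tail. A minimal sketch of that step follows, assuming a linear-interpolation percentile; the pipeline's exact percentile method and the rows it is computed on are not stated here, and `cap_at_percentile` is a hypothetical name.

```python
def cap_at_percentile(values, pct=99.5):
    """Clip values above the `pct` percentile down to that percentile.

    Returns (capped_values, cap, number_of_rows_capped).
    Assumption: percentile by linear interpolation between order statistics.
    """
    s = sorted(values)
    pos = (len(s) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    cap = s[lo] + (s[hi] - s[lo]) * (pos - lo)
    capped = [min(v, cap) for v in values]
    n_capped = sum(v > cap for v in values)
    return capped, cap, n_capped
```

Capping rather than dropping the extreme rows keeps them in training while limiting their influence on the gamma deviance loss.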

To reproduce exactly, run:

git clone https://github.com/timothy22000/tabular_data_modelling_pipeline
cd tabular_data_modelling_pipeline
pip install -e ".[gbm,viz]"
python scripts/download_data.py --dataset house_prices

OMP_NUM_THREADS=1 python train.py \
    --config configs/example_house_prices.py \
    --input data/house_prices.csv \
    --skip-tuning --skip-interpretability \
    --architectures catboost xgboost

(OMP_NUM_THREADS=1 is only needed on macOS arm64 to avoid an OpenMP conflict between XGBoost and Python's threading; Linux runs are unaffected.)

Limitations

Defaults only. No hyperparameter tuning was applied. Tuning would likely narrow the train-test gap and could lift Gini by an estimated 0.02-0.05.
GBM only. This drop omits the six DL architectures. CANN-GBM in particular would likely outperform raw XGBoost since it adds a neural residual on top of the GBM base. v2 will include these.
Random split, not stratified. SalePrice has a heavy right tail, so a split stratified on target quantiles would give a more representative test set. The random split is the pipeline's default behaviour and is kept here for reproducibility.
Trained on the training set only. The Kaggle competition's test.csv is unlabelled and not used here. To compare against the official leaderboard, train on the full train.csv and submit predictions for test.csv.
Gini scores look modest. Gini in the range 0.20-0.22 is reasonable for this dataset's modest signal-to-noise ratio. RMSLE is the conventional Kaggle leaderboard metric for House Prices, but the pipeline reports Gini and MAE for cross-comparability across architectures and datasets.
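
The quantile-stratified split mentioned in the limitations above could look roughly like this. It is a sketch, not a pipeline feature: the function name, the choice of ten bins, and the per-bin rounding are all assumptions.

```python
import random

def quantile_stratified_split(targets, test_frac=0.2, n_bins=10, seed=42):
    """Return (train_idx, test_idx), sampling test rows within target-quantile bins.

    Rows are ranked by target, cut into `n_bins` equal-count bins, and
    `test_frac` of each bin is drawn at random, so the test set covers
    the whole price range including the right tail.
    """
    rng = random.Random(seed)
    order = sorted(range(len(targets)), key=lambda i: targets[i])
    test_idx = []
    for b in range(n_bins):
        lo = b * len(order) // n_bins
        hi = (b + 1) * len(order) // n_bins
        bin_idx = order[lo:hi]
        k = round(test_frac * len(bin_idx))
        test_idx.extend(rng.sample(bin_idx, k))
    test_set = set(test_idx)
    train_idx = [i for i in range(len(targets)) if i not in test_set]
    return train_idx, sorted(test_idx)
```

Because each bin contributes its own share of test rows, the test-set price distribution tracks the full data more closely than a single random draw on a dataset this small.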

Intended use

Baseline for tabular DL research. Compare your new architecture against these numbers.
Teaching. Demonstrate a calibrated tabular pricing pipeline end to end.
Sanity check. Make sure your reimplementation of CatBoost/XGBoost on this data hits similar numbers.

Citation

@software{tabular_data_modelling_pipeline,
  author = {Mun, Timothy},
  title  = {tabular-data-modelling-pipeline},
  url    = {https://github.com/timothy22000/tabular_data_modelling_pipeline},
  year   = {2026}
}

@article{decock2011ames,
  author  = {De Cock, Dean},
  title   = {Ames, Iowa: Alternative to the Boston Housing Data},
  journal = {Journal of Statistics Education},
  volume  = {19},
  number  = {3},
  year    = {2011}
}

License

MIT for the model code and pipeline. The underlying dataset is distributed under Kaggle competition terms (free use with attribution).

📂 Dataset: t22000t/house-prices-tabular
📦 Pipeline: tabular-data-modelling-pipeline
🔒 Privacy Lab Space - anonymize tabular data + red-team it
