Historical Map Semantic Segmentation: Ensemble Checkpoints
Three U-Net + CBAM (EfficientNet-B5 encoder) checkpoints used as a 3-way
probability-averaging ensemble for 7-class semantic segmentation of historical
cartographic scans. Best Kaggle score: 0.77044 (score = 0.6 · mIoU + 0.4 · macro-F1).
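As a rough illustration of the score formula, the sketch below computes 0.6 · mIoU + 0.4 · macro-F1 from binary per-class masks with NumPy. The function name `competition_score`, the `eps` smoothing term, and the handling of classes that are absent from both prediction and ground truth are assumptions made for this example; the exact Kaggle evaluation may differ.

```python
import numpy as np

def competition_score(pred, target, eps=1e-7):
    """Hypothetical helper: 0.6 * mIoU + 0.4 * macro-F1.

    pred, target: boolean arrays of shape (C, H, W), one binary channel per class.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    # Per-class pixel counts, summed over the spatial axes.
    tp = (pred & target).sum(axis=(1, 2)).astype(np.float64)
    fp = (pred & ~target).sum(axis=(1, 2)).astype(np.float64)
    fn = (~pred & target).sum(axis=(1, 2)).astype(np.float64)

    # Assumption: eps avoids division by zero, so a class that is absent
    # from both pred and target contributes 0 to both averages.
    iou = tp / (tp + fp + fn + eps)
    f1 = 2 * tp / (2 * tp + fp + fn + eps)

    return 0.6 * iou.mean() + 0.4 * f1.mean()
```

A perfect prediction scores approximately 1.0; a prediction covering half of a single fully positive class has IoU 0.5 and F1 2/3.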
Code: https://github.com/VictorPachecoAznar/Comp1_RTCart
Files
| Path | Role | Trained on | Validated on | Val score |
|---|---|---|---|---|
| map2_specialist/map2_specialist.pth | map2-specialist | map2 only | map1 | 0.7233 |
| map1_specialist/map1_specialist.pth | map1-specialist | map1 only | map2 | 0.7029 |
| tile_cv_generalist/tile_cv_generalist.pth | tile-CV generalist (fold 1) | tiles from both maps | held-out fold | 0.8754 |
Each directory also includes the config.yaml used at training time.
Classes
["River", "Forest", "Lake", "Wetland", "Stream", "Building", "Road"]:
one binary channel per class.
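Because each class has its own binary channel, a predicted mask can be summarised per class by name. The sketch below assumes that the channel order matches the list order above (not confirmed here; check config.yaml or the GitHub repo), and `class_coverage` is a hypothetical helper written for this example.

```python
import numpy as np

# Assumption: channel i of the mask corresponds to CLASSES[i].
CLASSES = ["River", "Forest", "Lake", "Wetland", "Stream", "Building", "Road"]

def class_coverage(mask):
    """Return the fraction of pixels set in each class channel.

    mask: boolean array of shape (7, H, W).
    """
    mask = np.asarray(mask, dtype=bool)
    if mask.shape[0] != len(CLASSES):
        raise ValueError(f"expected {len(CLASSES)} channels, got {mask.shape[0]}")
    return {name: float(mask[i].mean()) for i, name in enumerate(CLASSES)}
```

Since the channels are independent, a pixel may be positive in more than one class, so the fractions need not sum to 1.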
Quick use
import torch
from huggingface_hub import hf_hub_download
# Pull one checkpoint
ckpt_path = hf_hub_download(
repo_id="Noe-B/historic-map-semantic-segmentation",
filename="map2_specialist/map2_specialist.pth",
)
# Load (requires the model definition from the GitHub repo)
ckpt = torch.load(ckpt_path, map_location="cpu")
state = ckpt["model_state"]
# from src.training.models import get_model
# model = get_model("unet_cbam", encoder_name="efficientnet-b5")
# model.load_state_dict(state)
For full inference (all 3 checkpoints, ensemble averaging, threshold 0.33),
see the 4_submit.py script in the GitHub repo.
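The ensemble step itself can be sketched independently of the model code: apply a sigmoid to each model's logits, average the probabilities, then threshold. This is a minimal NumPy illustration, not the logic of 4_submit.py; the function names and the choice of `>=` at the threshold are assumptions.

```python
import numpy as np

def sigmoid(x):
    # Numerically stable form: 1 / (1 + exp(-x)) == exp(-log(1 + exp(-x))).
    return np.exp(-np.logaddexp(0.0, -x))

def ensemble_predict(logits_per_model, threshold=0.33):
    """Average per-model probabilities and binarise.

    logits_per_model: sequence of arrays, each of shape (7, H, W).
    Returns (mean_prob, mask), both of shape (7, H, W).
    """
    probs = [sigmoid(np.asarray(l, dtype=np.float64)) for l in logits_per_model]
    mean_prob = np.mean(probs, axis=0)
    # Assumption: a pixel is positive when its mean probability
    # reaches the threshold (>=); the repo may use a strict >.
    return mean_prob, mean_prob >= threshold
```

With PyTorch models, the logits would typically be obtained under `torch.no_grad()` and moved to NumPy with `.cpu().numpy()` before being passed in.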
Input/output shape
- Input: RGB tile, (3, 768, 768), ImageNet-normalised (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
- Output: logits, (7, 768, 768); apply sigmoid then threshold (recommended 0.33)
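The input convention above can be sketched as follows. The example assumes the tile arrives as an 8-bit RGB array in height-width-channel layout and that pixel values are scaled to [0, 1] before normalisation, which is the usual ImageNet convention but is not stated here; `preprocess_tile` is a hypothetical helper.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess_tile(tile_rgb_uint8):
    """Convert a (768, 768, 3) uint8 RGB tile to a (3, 768, 768) float32 array."""
    tile = np.asarray(tile_rgb_uint8)
    if tile.shape != (768, 768, 3):
        raise ValueError(f"expected shape (768, 768, 3), got {tile.shape}")
    # Assumption: scale to [0, 1] first, then normalise per channel.
    x = tile.astype(np.float32) / 255.0
    x = (x - IMAGENET_MEAN) / IMAGENET_STD
    # Move channels first: (H, W, C) -> (C, H, W).
    return np.transpose(x, (2, 0, 1))
```

The result can be turned into a batch of one for a PyTorch model with `torch.from_numpy(np.ascontiguousarray(x)).unsqueeze(0)`.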
