TerraMind-NYC: AMD-trained TerraMind variants for NYC civic-tech

A family of NYC-specialized TerraMind 1.0 fine-tunes, all trained on AMD Instinct MI300X via AMD Developer Cloud during the AMD Developer Hackathon (2026-05-04 → 2026-05-10). Companion to msradam/TerraMind-base-Flood-AMD-reproduction (the Phase 1 IBM-Flood reproduction baseline on AMD).

This repo holds multiple checkpoints, each specializing TerraMind for a different NYC downstream task:

Checkpoints in this repo

File	Task	Test mIoU	Headline
`TerraMind_v1_base_NYC_LULC.safetensors`	NYC 5-class land cover	0.5253	NYC LULC, no TiM
`TerraMind_v1_base_NYC_TiM.safetensors`	Same task with TiM	0.5380	+1.27pp from TiM
`TerraMind_v1_base_NYC_Buildings.safetensors` (Phase 4)	NYC building footprint binary	(pending)	NYC DOITT footprints, real GT

All checkpoints are Apache 2.0 and use the TerraMind v1 base architecture (300M params, plus task heads and TiM tokenizers where relevant).
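For readers comparing the Test mIoU column, the metric is assumed here to be per-class intersection-over-union averaged over classes. The sketch below is illustrative only: `mean_iou` and the toy confusion matrix are not taken from this repo's evaluation code, and the row/column convention (rows are ground truth, columns are predictions) is an assumption.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix (rows: truth, cols: prediction)."""
    n = len(confusion)
    ious = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # truth k, predicted as another class
        fp = sum(confusion[r][k] for r in range(n)) - tp   # predicted k, truth is another class
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy 2-class example with made-up pixel counts.
toy = [[8, 2],
       [1, 9]]
toy_miou = mean_iou(toy)
print(round(toy_miou, 4))

# Gain reported in the table above, expressed in percentage points.
gain_pp = (0.5380 - 0.5253) * 100
print(f"{gain_pp:.2f} pp")
```

The last two lines reproduce the table's headline gain from the two reported mIoU values.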

Why three checkpoints

NYC civic tech needs different signals at different times:

NYC_LULC for structural land-cover context (developed % drives pluvial flood risk; green-space % is mitigation; water % is coastal proximity).
NYC_TiM when the LULC story matters more than throughput (TiM's intermediate-modality reasoning sharpens minority classes — water +2.45pp, herbaceous +2.39pp).
NYC_Buildings for fine-grained building footprint mapping with pixel-precise eval against NYC's authoritative DOITT records.

A consumer Riprap query for a single address may run one or all three, depending on what evidence the briefing needs.

Dataset shared across LULC / TiM / Buildings checkpoints

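All variants use the same 22 NYC parent chips from Major-TOM Core-S2L2A (CC-BY-SA-4.0) + 23 grid-aligned S1RTC chips. Each parent is sliced into 16 non-overlapping 256x256 sub-chips, and the 70/15/15 train/val/test split is stratified by parent, so sub-chips from one parent never land in different splits (no spatial leakage).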
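The slicing and parent-level split can be sketched as follows. This is a minimal illustration under stated assumptions, not the repo's preprocessing script: the function names, the placeholder parent IDs, the rounding rule, and the reuse of seed 42 for the split are all hypothetical.

```python
import random


def subchip_origins(tile=256, per_side=4):
    """Top-left (row, col) pixel offsets of a per_side x per_side grid of tiles."""
    return [(r * tile, c * tile) for r in range(per_side) for c in range(per_side)]


def split_by_parent(parent_ids, fractions=(0.70, 0.15, 0.15), seed=42):
    """Assign whole parents to train/val/test so sub-chips never cross splits."""
    ids = sorted(parent_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(fractions[0] * len(ids))
    n_val = round(fractions[1] * len(ids))
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],  # remainder goes to test
    }


parents = [f"parent_{i:02d}" for i in range(22)]  # placeholder IDs, not real chip names
splits = split_by_parent(parents)
origins = subchip_origins()
print(len(origins), {name: len(ids) for name, ids in splits.items()})
```

Because the split operates on parent IDs rather than sub-chips, every sub-chip inherits its parent's split. With only 22 parents, rounding means the realized proportions can only approximate 70/15/15.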

Labels:

LULC: ESA WorldCover 2021 v200 (CC-BY 4.0), collapsed to 5 macro-classes
Buildings: NYC DOITT Building Footprints (public domain), rasterized to chip grids in EPSG:32618
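Collapsing WorldCover to macro-classes amounts to a lookup from product class codes to a smaller label set. The class codes below follow the ESA WorldCover legend, but the grouping into 5 macro-classes is HYPOTHETICAL: the card does not list the mapping, so the groups, the class indices, and the ignore value are assumptions for illustration.

```python
# ESA WorldCover class codes and names (product legend).
WORLDCOVER = {
    10: "tree cover", 20: "shrubland", 30: "grassland", 40: "cropland",
    50: "built-up", 60: "bare / sparse vegetation", 70: "snow and ice",
    80: "permanent water bodies", 90: "herbaceous wetland",
    95: "mangroves", 100: "moss and lichen",
}

# HYPOTHETICAL grouping into 5 macro-classes; the repo's actual mapping may differ.
MACRO = {
    10: 0, 95: 0,                          # 0: tree
    20: 1, 30: 1, 40: 1, 90: 1, 100: 1,    # 1: herbaceous / low vegetation
    50: 2,                                 # 2: developed
    80: 3,                                 # 3: water
    60: 4, 70: 4,                          # 4: bare / other
}
IGNORE = 255  # assumed label for pixels with no valid WorldCover code


def collapse(label_rows):
    """Map a 2D grid of WorldCover codes to macro-class indices."""
    return [[MACRO.get(code, IGNORE) for code in row] for row in label_rows]


print(collapse([[50, 80], [10, 0]]))
```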

Architecture

Variant	Backbone	Decoder	Trainable
LULC	terramind_v1_base	UNetDecoder [512,256,128,64]	167M
TiM	terramind_v1_base_tim (intermediate LULC)	UNetDecoder [512,256,128,64]	348M
Buildings	terramind_v1_base	UNetDecoder [512,256,128,64]	167M

All multimodal: S2L2A (12 bands) + S1RTC (vv, vh) + DEM, 4 timesteps via temporal wrapper.
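To make the input sizes concrete, the sketch below tabulates per-modality batch shapes. Several things here are assumptions rather than facts from this card: the (batch, channels, time, height, width) layout, the DEM being a single band, and the DEM being repeated across the 4 timesteps. The batch size of 8 and the 256x256 chip size come from elsewhere in this card.

```python
BATCH, TIMESTEPS, HEIGHT, WIDTH = 8, 4, 256, 256
CHANNELS = {"S2L2A": 12, "S1RTC": 2, "DEM": 1}  # DEM as one band is an assumption


def batch_shapes():
    """Per-modality tensor shapes in an ASSUMED (B, C, T, H, W) layout."""
    return {name: (BATCH, c, TIMESTEPS, HEIGHT, WIDTH) for name, c in CHANNELS.items()}


def num_elements(shape):
    """Total number of values in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


shapes = batch_shapes()
for name, shape in shapes.items():
    mib = num_elements(shape) * 2 / 2**20  # 2 bytes per value in half precision
    print(name, shape, f"{mib:.0f} MiB")
```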

Training procedure


Framework	TerraTorch 1.2.7 + PyTorch Lightning 2.6.1
Hardware	1× AMD Instinct MI300X (192 GB HBM3)
Cloud	AMD Developer Cloud
ROCm	4.0.0+1a5c7ec
Precision	fp16-mixed
Optimizer	AdamW, lr 1e-5, ReduceLROnPlateau (factor 0.5, patience 2)
Batch	8
Max epochs	20
Random seed	42
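The scheduler settings (factor 0.5, patience 2) mean the learning rate is halved once the monitored metric has failed to improve for more than two consecutive epochs. The following is a simplified pure-Python model of that rule, not the PyTorch implementation: it assumes validation loss is the monitored quantity, treats any strict decrease as an improvement, and ignores the threshold, cooldown, and minimum-lr options. The loss values are made up.

```python
def plateau_schedule(val_losses, lr=1e-5, factor=0.5, patience=2):
    """Return the learning rate in effect after each epoch's validation loss."""
    best = float("inf")
    bad_epochs = 0
    history = []
    for loss in val_losses:
        if loss < best:
            best, bad_epochs = loss, 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:  # the (patience + 1)-th stalled epoch triggers a cut
            lr *= factor
            bad_epochs = 0
        history.append(lr)
    return history


losses = [0.90, 0.80, 0.81, 0.82, 0.83, 0.70]  # made-up validation losses
schedule = plateau_schedule(losses)
print(schedule)
```

With at most 20 epochs and a patience of 2, the rate can be halved at most a handful of times within a run.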

Wall-clock time per fine-tune on MI300X: ~10 min for LULC and Buildings, ~6 min for TiM (the TiM variant is deeper, but its frozen tokenizers shrink the trainable share of per-batch compute).

Riprap integration

Riprap (the parent NYC flood-exposure briefing system) uses these checkpoints in app/context/terramind_nyc.py to produce the structural land-cover context for any NYC address. The briefing cites concrete percentages:

"The 2.56 km tile around this address is 78% developed, 7% open water, 14% green space, with building density 32% [terramind_nyc]. Sentinel-2 imagery acquired 1 day ago, Sentinel-1 acquired 4 days ago, sourced from Element 84 / Microsoft Planetary Computer under the ESA Copernicus License."

Six numbers, three sources, all citable by Granite 4.1:8b's grounded synthesis pass with Mellea rejection sampling.
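The land-cover percentages in a briefing like the one quoted above can be derived by counting predicted classes over a tile. The sketch below is an assumption-laden illustration, not the code in app/context/terramind_nyc.py: the class indices, the mapping from classes to briefing categories, and the toy 4x4 prediction are all hypothetical. The tile edge follows from 256 pixels at 10 m per pixel.

```python
from collections import Counter

TILE_KM = 256 * 10 / 1000  # 256 px at 10 m/px, matching the 2.56 km tile in the briefing

# HYPOTHETICAL mapping from predicted class index to briefing category.
CATEGORY = {0: "green space", 1: "green space", 2: "developed", 3: "open water", 4: "other"}


def cover_percentages(label_rows):
    """Percentage of pixels per briefing category in a 2D grid of class indices."""
    counts = Counter(CATEGORY[v] for row in label_rows for v in row)
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


# Toy 4x4 prediction standing in for a 256x256 tile.
tile = [
    [2, 2, 2, 3],
    [2, 2, 2, 0],
    [2, 2, 1, 0],
    [2, 2, 2, 4],
]
pct = cover_percentages(tile)
print(TILE_KM, pct)
```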

Reproduction

Each variant has its own YAML in this repo; see the matching MODEL_CARD_*.md files for per-variant specifics.

Out of scope

Outside NYC bbox (-74.30 to -73.65 lon, 40.45 to 40.95 lat)
Property-level decisions: chip resolution is 10 m, and decisions are valid only at ~25 m granularity
Insurance / underwriting / navigation use
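A caller can guard against out-of-scope locations by checking coordinates against the bounding box listed above before running any checkpoint. This is a minimal sketch; the function name and the example coordinates are illustrative, and only the bounding-box values come from this card.

```python
# (min_lon, min_lat, max_lon, max_lat), from the scope statement above.
NYC_BBOX = (-74.30, 40.45, -73.65, 40.95)


def in_nyc_bbox(lon, lat, bbox=NYC_BBOX):
    """True if the WGS84 point (lon, lat) falls inside the supported bounding box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


print(in_nyc_bbox(-73.9857, 40.7484))  # approximate Midtown Manhattan coordinates
print(in_nyc_bbox(-71.06, 42.36))      # approximate Boston coordinates
```

The box is a coarse rectangle, so it also covers areas outside the five boroughs; a stricter check would need the actual city boundary.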

Honest limitations

22 parent chips is small. Larger-scale fine-tunes would calibrate numbers more tightly.
Single training run per variant. Run-to-run variance not characterized.
The TiM gain (+1.27pp) is modest compared with the 2-5pp claimed by IBM-ESA; the run was not exhaustively hyperparameter-tuned.

Citation

@misc{terramind-nyc-2026,
  title={TerraMind-NYC: AMD-trained TerraMind variants for NYC civic-tech},
  author={Rahman, Adam Munawar},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/msradam/TerraMind-base-Flood-NYC},
}

@misc{terramind2025,
  title={TerraMind: Large-Scale Generative Multimodality for Earth Observation},
  author={Jakubik, Johannes and others},
  year={2025},
  eprint={2504.11171},
  archivePrefix={arXiv},
}

License

Apache 2.0. Underlying datasets: Major-TOM Core (CC-BY-SA-4.0); ESA WorldCover 2021 (CC-BY-4.0); NYC DOITT Footprints (public domain).
