TerraMind-NYC: AMD-trained TerraMind variants for NYC civic-tech

A family of NYC-specialized TerraMind 1.0 fine-tunes, all trained on AMD Instinct MI300X via AMD Developer Cloud during the AMD Developer Hackathon (2026-05-04 → 2026-05-10). Companion to msradam/TerraMind-base-Flood-AMD-reproduction (the Phase 1 IBM-Flood reproduction baseline on AMD).

This repo holds three checkpoints, each specializing TerraMind for an NYC downstream task:

Checkpoints in this repo

| File | Task | Test mIoU | Headline |
|---|---|---|---|
| TerraMind_v1_base_NYC_LULC.safetensors | NYC 5-class land cover | 0.5253 | NYC LULC, no TiM |
| TerraMind_v1_base_NYC_TiM.safetensors | Same task with TiM | 0.5380 | +1.27pp from TiM |
| TerraMind_v1_base_NYC_Buildings.safetensors (Phase 4) | NYC building footprint binary | (pending) | NYC DOITT footprints, real GT |
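
For readers unfamiliar with the metric, Test mIoU is the mean over classes of intersection-over-union. The sketch below is a minimal pure-Python illustration, assuming flat lists of integer class ids. The helper name `mean_iou` is hypothetical and this is not TerraTorch's implementation, which may treat absent classes differently. The last line reproduces the +1.27pp figure from the two table values.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the labels."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and labels
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")

# TiM gain in percentage points, from the two mIoU values in the table.
tim_gain_pp = round((0.5380 - 0.5253) * 100, 2)
```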

All checkpoints are Apache 2.0 and use the TerraMind v1 base architecture (300M params, plus task heads and TiM tokenizers where relevant).

Why three checkpoints

NYC civic tech needs different signals at different times:

  • NYC_LULC for structural land-cover context (developed % drives pluvial flood risk; green-space % is mitigation; water % is coastal proximity).
  • NYC_TiM when the LULC story matters more than throughput (TiM's intermediate-modality reasoning sharpens minority classes — water +2.45pp, herbaceous +2.39pp).
  • NYC_Buildings for fine-grained building footprint mapping with pixel-precise eval against NYC's authoritative DOITT records.

A consumer Riprap query for a single address may run one or all three, depending on what evidence the briefing needs.

Dataset shared across LULC / TiM / Buildings checkpoints

All checkpoints use the same 22 NYC parent chips from Major-TOM Core-S2L2A (CC-BY-SA-4.0) plus 23 grid-aligned S1RTC chips. Each parent is sliced into 16 non-overlapping 256×256 sub-chips (2.56 km on a side at 10 m resolution), with a 70/15/15 split stratified by parent (no spatial leakage).
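
The slicing and parent-level split can be sketched as follows. This is an illustration under assumptions, not the repo's actual preprocessing: the function names, the top-left anchoring of the 4×4 grid, the reading of "stratified by parent" as "whole parents are assigned to one split", and the rounding of 70/15/15 to whole parents are all hypothetical.

```python
import random

TILE = 256  # sub-chip edge in pixels
GRID = 4    # 4 x 4 = 16 non-overlapping sub-chips per parent

def subchip_windows(tile=TILE, grid=GRID):
    """Return (row_off, col_off, height, width) windows for one parent chip."""
    return [(r * tile, c * tile, tile, tile)
            for r in range(grid) for c in range(grid)]

def split_parents(parent_ids, fractions=(0.70, 0.15, 0.15), seed=42):
    """Assign whole parents to splits so no parent contributes to two splits."""
    ids = sorted(parent_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(fractions[0] * len(ids))
    n_val = round(fractions[1] * len(ids))
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],  # remainder goes to test
    }

windows = subchip_windows()
splits = split_parents(range(22))
```

Under these assumptions, 22 parents yield 352 sub-chips in total, and every sub-chip inherits the split of its parent.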

Labels:

  • LULC: ESA WorldCover 2021 v200 (CC-BY 4.0), collapsed to 5 macro-classes
  • Buildings: NYC DOITT Building Footprints (public domain), rasterized to chip grids in EPSG:32618
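
Collapsing WorldCover to macro-classes amounts to a lookup from legend codes to a smaller set of ids. The card does not list the five macro-classes or the grouping, so the mapping below is purely hypothetical and only shows the mechanism. The WorldCover codes themselves come from the product legend; `IGNORE_INDEX` and the function name are assumptions.

```python
# ESA WorldCover legend codes -> HYPOTHETICAL macro-class ids (illustration only,
# not the grouping used for these checkpoints).
# 0 developed, 1 tree/woody, 2 herbaceous, 3 water, 4 bare/other
WORLDCOVER_TO_MACRO = {
    50: 0,                        # built-up
    10: 1, 20: 1, 95: 1,          # tree cover, shrubland, mangroves
    30: 2, 40: 2, 90: 2, 100: 2,  # grassland, cropland, herbaceous wetland, moss and lichen
    80: 3,                        # permanent water bodies
    60: 4, 70: 4,                 # bare / sparse vegetation, snow and ice
}
IGNORE_INDEX = 255  # assumed value for nodata or unrecognized codes

def collapse(label_grid):
    """Map a 2D grid (list of rows) of WorldCover codes to macro-class ids."""
    return [[WORLDCOVER_TO_MACRO.get(code, IGNORE_INDEX) for code in row]
            for row in label_grid]
```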

Architecture

| Variant | Backbone | Decoder | Trainable params |
|---|---|---|---|
| LULC | terramind_v1_base | UNetDecoder [512,256,128,64] | 167M |
| TiM | terramind_v1_base_tim (intermediate LULC) | UNetDecoder [512,256,128,64] | 348M |
| Buildings | terramind_v1_base | UNetDecoder [512,256,128,64] | 167M |

All multimodal: S2L2A (12 bands) + S1RTC (vv, vh) + DEM, 4 timesteps via temporal wrapper.

Training procedure

| Setting | Value |
|---|---|
| Framework | TerraTorch 1.2.7 + PyTorch Lightning 2.6.1 |
| Hardware | 1× AMD Instinct MI300X (192 GB HBM3) |
| Cloud | AMD Developer Cloud |
| ROCm | 4.0.0+1a5c7ec |
| Precision | fp16-mixed |
| Optimizer | AdamW, lr 1e-5, ReduceLROnPlateau (factor 0.5, patience 2) |
| Batch | 8 |
| Max epochs | 20 |
| Random seed | 42 |
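
To make the scheduler setting concrete, here is a simplified pure-Python model of reduce-on-plateau with factor 0.5 and patience 2, starting from lr 1e-5. It omits the threshold, cooldown, and minimum-lr options of PyTorch's `ReduceLROnPlateau`. The monitored metric is not stated in this card, so a validation loss is assumed, and the loss values are made up for illustration rather than taken from these runs.

```python
def plateau_schedule(val_losses, lr=1e-5, factor=0.5, patience=2):
    """Return the learning rate in effect after each epoch's validation loss."""
    best = float("inf")
    bad_epochs = 0
    lrs = []
    for loss in val_losses:
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:  # `patience` bad epochs are tolerated, the next one cuts lr
            lr *= factor
            bad_epochs = 0
        lrs.append(lr)
    return lrs

# Illustrative losses: improvement stalls after epoch 2, so lr halves at epoch 5.
lrs = plateau_schedule([0.90, 0.80, 0.81, 0.82, 0.83, 0.79])
```

With max epochs 20 and patience 2, the learning rate can be halved at most six times in a run under this simplified model, since each cut needs three consecutive non-improving epochs.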

Wall-clock time per fine-tune on MI300X: ~10 min for LULC and Buildings, ~6 min for TiM (a deeper architecture, but its frozen tokenizers reduce the compute per batch).

Riprap integration

Riprap (the parent NYC flood-exposure briefing system) uses these checkpoints in app/context/terramind_nyc.py to produce the structural land-cover context for any NYC address. The briefing cites concrete percentages:

"The 2.56 km tile around this address is 78% developed, 7% open water, 14% green space, with building density 32% [terramind_nyc]. Sentinel-2 imagery acquired 1 day ago, Sentinel-1 acquired 4 days ago, sourced from Element 84 / Microsoft Planetary Computer under the ESA Copernicus License."

Six numbers, three sources, all cite-able by Granite 4.1:8b's grounded synthesis pass with Mellea rejection sampling.
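
The land-cover percentages in a briefing like this are, in principle, class-pixel fractions over the tile, and 256 pixels at 10 m gives the 2.56 km edge quoted above. A minimal sketch, assuming a predicted mask as nested lists of class ids. The id-to-name mapping and the function name are hypothetical, and the actual logic in app/context/terramind_nyc.py is not shown in this card.

```python
from collections import Counter

TILE_KM = 256 * 10 / 1000  # 256 px at 10 m per pixel

def class_percentages(mask, names):
    """Percentage of tile pixels per named class, from a 2D mask of class ids."""
    counts = Counter(value for row in mask for value in row)
    total = sum(counts.values())
    return {name: 100.0 * counts.get(cid, 0) / total
            for cid, name in names.items()}

# Toy 2x2 mask with a hypothetical id-to-name mapping.
toy = class_percentages([[0, 0], [0, 1]], {0: "developed", 1: "open water"})
```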

Reproduction

Each variant has its own YAML in this repo; see the matching MODEL_CARD_*.md files for per-variant specifics.

Out of scope

  • Outside NYC bbox (-74.30 to -73.65 lon, 40.45 to 40.95 lat)
  • Property-level decisions: chip resolution is 10 m, and decisions are valid only at ~25 m granularity
  • Insurance / underwriting / navigation use
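
A caller can enforce the geographic limit with a simple bounds check before running any checkpoint. The sketch below uses the bbox values from the list above; the function and constant names are hypothetical and are not part of Riprap's API.

```python
# NYC bounding box from this card, in degrees (WGS84 lon/lat assumed).
NYC_BBOX = {"lon_min": -74.30, "lon_max": -73.65, "lat_min": 40.45, "lat_max": 40.95}

def in_nyc_bbox(lon, lat, bbox=NYC_BBOX):
    """True if the point falls inside the supported NYC bounding box."""
    return (bbox["lon_min"] <= lon <= bbox["lon_max"]
            and bbox["lat_min"] <= lat <= bbox["lat_max"])
```

For example, a point in Midtown Manhattan (-73.9857, 40.7484) passes the check, while Boston (-71.0589, 42.3601) does not.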

Honest limitations

  • 22 parent chips is small. Larger-scale fine-tunes would calibrate numbers more tightly.
  • Single training run per variant. Run-to-run variance not characterized.
  • TiM gain (+1.27pp) is modest compared to IBM-ESA's claimed 2-5pp; not exhaustively hyperparameter-tuned.

Citation

@misc{terramind-nyc-2026,
  title={TerraMind-NYC: AMD-trained TerraMind variants for NYC civic-tech},
  author={Rahman, Adam Munawar},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/msradam/TerraMind-base-Flood-NYC},
}

@misc{terramind2025,
  title={TerraMind: Large-Scale Generative Multimodality for Earth Observation},
  author={Jakubik, Johannes and others},
  year={2025},
  eprint={2504.11171},
}

License

Apache 2.0. Underlying datasets: Major-TOM Core (CC-BY-SA-4.0); ESA WorldCover 2021 (CC-BY-4.0); NYC DOITT Footprints (public domain).
