Add HF badge and update weights section with link

README.md (CHANGED)

@@ -1,248 +1,358 @@
- ---
- tags:
- - remote-sensing
- - semantic-segmentation
- - mamba
- - state-space-model
- - vmamba
- - mambavision
- - spatial-mamba
- - pytorch
- - benchmark
- - loveda
- - isprs-potsdam
- - domain-adaptation
- datasets:
- - LoveDA
- - ISPRS-Potsdam
- pipeline_tag: image-segmentation
- ---

- # Mamba-Segmentation

- | What | Status |
- |---|---|
- | Encoder backbone | **Swapped** per experiment – the ONLY variable |
- | Decoder | Fixed (lightweight U-Net, 256ch, MambaBlock2d) |
- | Loss | Fixed (Lovász-Softmax + Focal + Boundary) |
- | Training schedule | Fixed (50k iters, AdamW, poly decay) |
- | Augmentations | Fixed (random crop, flip, color jitter) |
- | Input resolution | Fixed (512×512) |
- | Feature interface | Fixed ({F1–F4} at strides {4, 8, 16, 32}) |

- #### MambaVision (hybrid Mamba + self-attention)
- | Checkpoint path | Training split |
- |---|---|
- | `Comparison_Experiments/mambavision_small_urbantrain_512/checkpoints/best.pth` | Urban→Rural |
- | `Comparison_Experiments/mambavision_base_512/checkpoints/best.pth` | All→All |
- | `Comparison_Experiments/mambavision_base_ruraltrain_512/checkpoints/best.pth` | Rural→Urban |
- | `Comparison_Experiments/mambavision_base_urbantrain_512/checkpoints/best.pth` | Urban→Rural |
- | `Comparison_Experiments/mambavision_large_512/checkpoints/best.pth` | All→All |
- | `Comparison_Experiments/mambavision_large_ruraltrain_512/checkpoints/best.pth` | Rural→Urban |
- | `Comparison_Experiments/mambavision_large_urbantrain_512/checkpoints/best.pth` | Urban→Rural |
- | `Comparison_Experiments/mambavision_large2_512/checkpoints/best.pth` | All→All |
- | `Comparison_Experiments/mambavision_large2_ruraltrain_512/checkpoints/best.pth` | Rural→Urban |
- | `Comparison_Experiments/mambavision_large2_urbantrain_512/checkpoints/best.pth` | Urban→Rural |

- #### VMamba (cross-scan 2D selective SSM)
- | Checkpoint path | Training split |
- |---|---|
- | `Comparison_Experiments/Vmamb_tiny_512/checkpoints/best.pth` | All→All |
- | `Comparison_Experiments/vmamba_tiny_ruraltrain_512/checkpoints/best.pth` | Rural→Urban |
- | `Comparison_Experiments/vmamba_tiny_urbantrain_512/checkpoints/best.pth` | Urban→Rural |
- | `Comparison_Experiments/Vmamb_small_512/checkpoints/best.pth` | All→All |
- | `Comparison_Experiments/Vmamb_small_512_2/checkpoints/best.pth` | All→All (run 2) |
- | `Comparison_Experiments/Vmamb_small_512_3/checkpoints/best.pth` | All→All (run 3) |
- | `Comparison_Experiments/vmamba_small_ruraltrain_512/checkpoints/best.pth` | Rural→Urban |
- | `Comparison_Experiments/vmamba_small_urbantrain_512/checkpoints/best.pth` | Urban→Rural |
- | `Comparison_Experiments/Vmamb_base_512/checkpoints/best.pth` | All→All |
- | `Comparison_Experiments/vmamba_base_ruraltrain_512/checkpoints/best.pth` | Rural→Urban |
- | `Comparison_Experiments/vmamba_base_urbantrain_512/checkpoints/best.pth` | Urban→Rural |

- #### VisionMamba / Vim (bidirectional Mamba)
- | Checkpoint path | Training split |
- |---|---|
- | `Comparison_Experiments/VisionMamba_tiny_512/checkpoints/best.pth` | All→All |
- | `Comparison_Experiments/visionmamba_tiny_ruraltrain_512/checkpoints/best.pth` | Rural→Urban |
- | `Comparison_Experiments/visionmamba_tiny_urbantrain_512/checkpoints/best.pth` | Urban→Rural |
- | `Comparison_Experiments/VisionMamba_small_512/checkpoints/best.pth` | All→All |
- | `Comparison_Experiments/visionmamba_small_ruraltrain_512/checkpoints/best.pth` | Rural→Urban |
- | `Comparison_Experiments/visionmamba_small_urbantrain_512/checkpoints/best.pth` | Urban→Rural |
- | `Comparison_Experiments/VisionMamba_base_512/checkpoints/best.pth` | All→All |
- | `Comparison_Experiments/visionmamba_base_ruraltrain_512/checkpoints/best.pth` | Rural→Urban |
- | `Comparison_Experiments/visionmamba_base_urbantrain_512/checkpoints/best.pth` | Urban→Rural |

- #### Spatial-Mamba (spatially-aware SSM)
- | Checkpoint path | Training split |
- |---|---|
- | `Comparison_Experiments/spatialmamba_tiny_512/checkpoints/best.pth` | All→All |
- | `Comparison_Experiments/spatialmamba_tiny_ruraltrain_512/checkpoints/best.pth` | Rural→Urban |
- | `Comparison_Experiments/spatialmamba_tiny_urbantrain_512/checkpoints/best.pth` | Urban→Rural |
- | `Comparison_Experiments/spatialmamba_small_512/checkpoints/best.pth` | All→All |
- | `Comparison_Experiments/spatialmamba_small_ruraltrain_512/checkpoints/best.pth` | Rural→Urban |
- | `Comparison_Experiments/spatialmamba_small_urbantrain_512/checkpoints/best.pth` | Urban→Rural |
- | `Comparison_Experiments/spatialmamba_base_512/checkpoints/best.pth` | All→All |
- | `Comparison_Experiments/spatialmamba_base_ruraltrain_512/checkpoints/best.pth` | Rural→Urban |
- | `Comparison_Experiments/spatialmamba_base_urbantrain_512/checkpoints/best.pth` | Urban→Rural |

- #### CNN & Transformer Baselines
- | Checkpoint path | Model |
- |---|---|
- | `Comparison_Experiments/cnn_deeplabv3p_r50_512/checkpoints/best.pth` | DeepLabv3+ ResNet-50, All→All |
- | `Comparison_Experiments/cnn_deeplabv3p_resnet50_ruraltrain_512/checkpoints/best.pth` | DeepLabv3+ ResNet-50, Rural→Urban |
- | `Comparison_Experiments/cnn_deeplabv3p_resnet50_urbantrain_512/checkpoints/best.pth` | DeepLabv3+ ResNet-50, Urban→Rural |
- | `Comparison_Experiments/cnn_unet_r50_512/checkpoints/best.pth` | U-Net ResNet-50, All→All |
- | `Comparison_Experiments/transformer_unetformer_r18_512/checkpoints/best.pth` | UNetFormer ResNet-18, All→All |
- | `Comparison_Experiments/transformerunetformer_resnet18_ruraltrain_512/checkpoints/best.pth` | UNetFormer ResNet-18, Rural→Urban |
- | `Comparison_Experiments/transformerunetformer_resnet18_urbantrain_512/checkpoints/best.pth` | UNetFormer ResNet-18, Urban→Rural |

- | Checkpoint path | Model |
- |---|---|
- | `Comparison_Experiments_ICPRS_potsdam/mambavision_tiny_512/checkpoints/best.pth` | MambaVision-Tiny |
- | `Comparison_Experiments_ICPRS_potsdam/mambavision_tiny2_512/checkpoints/best.pth` | MambaVision-Tiny2 |
- | `Comparison_Experiments_ICPRS_potsdam/mambavision_small_512/checkpoints/best.pth` | MambaVision-Small |
- | `Comparison_Experiments_ICPRS_potsdam/mambavision_base_512/checkpoints/best.pth` | MambaVision-Base |
- | `Comparison_Experiments_ICPRS_potsdam/mambavision_large_512/checkpoints/best.pth` | MambaVision-Large |
- | `Comparison_Experiments_ICPRS_potsdam/mambavision_large2_512/checkpoints/best.pth` | MambaVision-Large2 |
- | `Comparison_Experiments_ICPRS_potsdam/vmamba_tiny_512/checkpoints/best.pth` | VMamba-Tiny |
- | `Comparison_Experiments_ICPRS_potsdam/vmamba_small_512/checkpoints/best.pth` | VMamba-Small |
- | `Comparison_Experiments_ICPRS_potsdam/vmamba_base_512/checkpoints/best.pth` | VMamba-Base |
- | `Comparison_Experiments_ICPRS_potsdam/spatialmamba_tiny_512/checkpoints/best.pth` | Spatial-Mamba-Tiny |
- | `Comparison_Experiments_ICPRS_potsdam/spatialmamba_small_512/checkpoints/best.pth` | Spatial-Mamba-Small |
- | `Comparison_Experiments_ICPRS_potsdam/spatialmamba_base_512/checkpoints/best.pth` | Spatial-Mamba-Base |
- | `Comparison_Experiments_ICPRS_potsdam/cnn_deeplabv3p_r50_512/checkpoints/best.pth` | DeepLabv3+ ResNet-50 |
- | `Comparison_Experiments_ICPRS_potsdam/transformer_unetformer_r18_512/checkpoints/best.pth` | UNetFormer ResNet-18 |

- - [VMamba](https://github.com/MzeroMiko/VMamba) – Visual State Space Model
- - [MambaVision](https://github.com/NVlabs/MambaVision) – NVIDIA hybrid Mamba-Transformer
- - [Spatial-Mamba](https://github.com/EdwardChaworworrachat/SpatialMamba) – Spatially-aware Mamba
- - [LoveDA](https://github.com/Junjue-Wang/LoveDA) – Land-cover domain adaptation dataset
- - [ISPRS Potsdam](https://www.isprs.org/education/benchmarks/UrbanSemLab/) – Urban semantic labeling benchmark

- Built at the **University of Peradeniya**.

# Mamba-Segmentation

**Controlled Visual State-Space Backbone Benchmark with Domain-Shift & Boundary Analysis for Remote-Sensing Segmentation**

### The First Fair-Fight Benchmark for SSM vs. CNN vs. Transformer Backbones in Remote Sensing

[IGARSS 2026](https://2026.ieeeigarss.org/) ·
[Python](https://www.python.org/) ·
[PyTorch](https://pytorch.org/) ·
[License](LICENSE) ·
[Hugging Face](https://huggingface.co/dineth18/Mamba-Segmentation)

One pipeline. One decoder. One loss. One schedule. **Five backbone families.** The only variable is the encoder – so the results finally mean something. SSMs dominate, scaling plateaus early, domain transfer is asymmetric, and boundaries are where every model breaks.

Ready to see which backbone actually wins a fair fight? Let's go.

---

[Overview](#overview) • [Why Controlled?](#why-controlled-benchmarking-matters) • [Pipeline](#the-controlled-pipeline) • [Quick Start](#quick-start) • [Data](#data-preparation) • [Train & Eval](#train--evaluation) • [Analysis](#analysis-scripts) • [Results](#results) • [Acknowledgements](#acknowledgements) • [Cite](#citation)

---

## Overview

Remote-sensing segmentation benchmarks have a fatal flaw: they change the backbone **and** the decoder **and** the loss **and** the schedule **and** the augmentations – all at once. The resulting numbers tell you who tuned harder, not which backbone is better.

**Mamba-Segmentation fixes this:**

- **Fixed lightweight U-Net decoder** – identical decoder across all experiments
- **Fixed TriBraid loss** (Lovász + Focal + Boundary) – same optimization objective for every backbone
- **Fixed training protocol** – 50k iterations, AdamW, poly LR, 512×512 crops, same augmentations
- **Standardized feature interface** – {F1, F2, F3, F4} at strides {4, 8, 16, 32}
- **Five backbone families** – VMamba, MambaVision, Spatial-Mamba, CNN (DeepLabv3+), Transformer (UNetFormer)

**Outcome:** differences in results reflect backbone behavior. Nothing else.

<p align="center">
<img src="IGARSS%202026/Architecture.png" alt="Controlled Pipeline Architecture" width="100%">
</p>
<p align="center"><i>Lock the pipeline. Swap the backbone. Read the truth. Three SSM families (Spatial-Mamba, MambaVision, VMamba) share a single U-Net decoder and standardized feature interface {F1–F4}.</i></p>

---

## Why Controlled Benchmarking Matters

Every backbone paper ships its own decoder, its own training recipe, its own augmentation policy. You compare "Method A" to "Method B" – but you're really comparing two *entire pipelines*.

Mamba-Segmentation isolates the **one variable that matters:**

| What | Status |
|---|---|
| Encoder backbone | **Swapped** per experiment – the ONLY variable |
| Decoder architecture | Fixed (lightweight U-Net, 256ch, MambaBlock2d) |
| Loss function | Fixed (Lovász-Softmax + Focal + Boundary) |
| Training schedule | Fixed (50k iters, AdamW, poly decay) |
| Augmentations | Fixed (random crop, flip, color jitter) |
| Input resolution | Fixed (512×512) |
| Feature interface | Fixed ({F1–F4} at strides {4, 8, 16, 32}) |

When the results differ, you know *exactly* why.
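
Concretely, that fixed feature interface is a contract: whatever the backbone, the decoder receives four pyramid levels at strides {4, 8, 16, 32}. A minimal sketch of the contract (the class name, wrapper style, and assertions are illustrative assumptions; the real adapters live in each family's `encoders.py`):

```python
import torch.nn as nn

class EncoderInterface(nn.Module):
    """Hypothetical wrapper: the fixed decoder always sees {F1..F4}."""
    def __init__(self, backbone):
        super().__init__()
        self.backbone = backbone  # must yield 4 feature maps

    def forward(self, x):                  # x: (N, 3, 512, 512)
        f1, f2, f3, f4 = self.backbone(x)  # strides 4, 8, 16, 32
        for f, stride in zip((f1, f2, f3, f4), (4, 8, 16, 32)):
            assert f.shape[-1] == x.shape[-1] // stride
        return f1, f2, f3, f4
```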

---

## The Controlled Pipeline

```
Encoder:   swapped per experiment – the ONLY variable
Decoder:   fixed lightweight U-Net (256ch, MambaBlock2d, addition skips)
Interface: {F1, F2, F3, F4} at strides {4, 8, 16, 32}
Training:  50k iters · AdamW · poly LR decay · 512×512 crops · fixed augmentations
Loss:      L = L_lovász + L_focal + 0.5 × L_boundary
           ├─ Lovász-Softmax → direct IoU optimization
           ├─ Focal (γ=2.0)  → class imbalance handling
           └─ Boundary (2px) → edge penalty with warmup
```

**Backbone families tested:**

| Family | Backbones | Type |
|---|---|---|
| **VMamba** | Tiny, Small, Base | SSM – cross-scan 2D selective state-space |
| **MambaVision** | Tiny, Small, Base, Large, Large2 | SSM/Hybrid – Mamba + self-attention |
| **Spatial-Mamba** | Tiny, Small, Base | SSM – spatially-aware scanning |
| **DeepLabv3+** | ResNet-50 | CNN baseline |
| **UNetFormer** | ResNet-18 | Transformer baseline |

**Datasets:**
- **LoveDA** – All→All, Urban→Rural, Rural→Urban (source-only, zero adaptation)
- **ISPRS Potsdam** – high-resolution urban parsing (6-class)

---

## Quick Start

### 1. Clone & Install

```bash
git clone https://github.com/YOUR_USERNAME/Mamba-Segmentation
cd Mamba-Segmentation

conda create -n mamba-seg python=3.9 -y
conda activate mamba-seg

cd MambaVision && pip install -r requirements.txt
```

### 2. Grab Pre-trained Backbone Weights

> **All trained segmentation checkpoints are available on [Hugging Face](https://huggingface.co/dineth18/Mamba-Segmentation).** Download `best.pth` for any model directly from there.

| Backbone | Source | Location |
|---|---|---|
| VMamba (Tiny/Small/Base) | [VMamba repo](https://github.com/MzeroMiko/VMamba) | `VMamba/Vmamba_weights/ImageNet-1K/` |
| MambaVision (Tiny–Large2) | [NVIDIA MambaVision](https://github.com/NVlabs/MambaVision) | `MambaVision/weights/1k/` |
| Spatial-Mamba (Tiny/Small/Base) | [Spatial-Mamba repo](https://github.com/EdwardChaworworrachat/SpatialMamba) | `spatial-mamba/weights/imageNet1K/` |
| ResNet-50 / ResNet-18 | [torchvision](https://pytorch.org/vision/stable/models.html) | `weights/imagenet/` |

Set the weights path in each backbone's `config.py` – that's it.
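
To pull one of the trained segmentation checkpoints instead of training from scratch, `huggingface_hub` works directly. The filename below is one example path taken from the checkpoint tables and may differ from the Hub repo's actual layout, so browse the repo first:

```python
from huggingface_hub import hf_hub_download

ckpt = hf_hub_download(
    repo_id="dineth18/Mamba-Segmentation",
    filename="Comparison_Experiments/Vmamb_small_512/checkpoints/best.pth",
)
print(ckpt)  # local cached path to best.pth
```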

### 3. Configure Your Experiment

Each backbone family has its own directory with a standardized interface:

```
<ModelFamily>/
├── config.py          # ← edit DATA_ROOT / OUTPUT_DIR, or set env vars
├── config_icprs.py    # ← for ISPRS Potsdam experiments
├── train.py           # ← same training loop across all families
├── model.py
├── encoders.py
├── light_decoder.py   # ← THE fixed decoder (identical everywhere)
├── losses.py          # ← THE fixed loss (identical everywhere)
└── utils.py
```
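
To make `light_decoder.py`'s role concrete, here is the shape of the idea as a hedged sketch, not the repo's code: the four features are projected to a shared 256-channel width and fused top-down with addition skips. A plain conv stands in for `MambaBlock2d`, and the input channel counts are assumptions:

```python
import torch.nn as nn
import torch.nn.functional as F

class LightDecoderSketch(nn.Module):
    def __init__(self, in_chs=(96, 192, 384, 768), width=256, classes=7):
        super().__init__()
        self.proj = nn.ModuleList(nn.Conv2d(c, width, 1) for c in in_chs)
        self.block = nn.Conv2d(width, width, 3, padding=1)  # stand-in for MambaBlock2d
        self.head = nn.Conv2d(width, classes, 1)

    def forward(self, feats):  # feats = (F1, F2, F3, F4) at strides 4..32
        f1, f2, f3, f4 = (p(f) for p, f in zip(self.proj, feats))
        x = f4
        for skip in (f3, f2, f1):  # top-down fusion via addition skips
            x = F.interpolate(x, size=skip.shape[2:], mode="bilinear",
                              align_corners=False) + skip
            x = self.block(x)
        return self.head(x)  # logits at stride 4
```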

**Path configuration** – two approaches:

**Option A – environment variables (recommended):**
```bash
export LOVEDA_ROOT=/path/to/LoveDA          # for LoveDA experiments
export POTSDAM_ROOT=/path/to/ISPRS_Potsdam  # for Potsdam experiments
export OUTPUT_DIR=/path/to/output           # optional – defaults to Comparison_Experiments/
python train.py
```

**Option B – edit the config directly:**
Open `config.py` and change `DATA_ROOT` and `OUTPUT_DIR` near the top of the file.
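
Both options can coexist; a plausible top-of-`config.py` pattern (the env var names match Option A above, the default paths are placeholders):

```python
import os

# Option A wins when the variable is set; Option B edits the fallbacks.
DATA_ROOT = os.environ.get("LOVEDA_ROOT", "/path/to/LoveDA")
OUTPUT_DIR = os.environ.get("OUTPUT_DIR", "Comparison_Experiments")
```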

---

## Data Preparation

Plug-and-play support for **LoveDA** and **ISPRS Potsdam**.

<details>
<summary><b>LoveDA Layout</b></summary>

```
DATA_ROOT/
├── Train/
│   ├── Urban/
│   │   ├── images_png/
│   │   └── masks_png/
│   └── Rural/
│       ├── images_png/
│       └── masks_png/
├── Val/
│   ├── Urban/
│   │   ├── images_png/
│   │   └── masks_png/
│   └── Rural/
│       ├── images_png/
│       └── masks_png/
└── Test/
```

- **7 classes:** Background, Building, Road, Water, Barren, Forest, Agricultural
- **Resolution:** 1024×1024 (cropped to 512×512 during training)
- **Domains:** Urban and Rural – used for cross-domain evaluation

</details>

<details>
<summary><b>ISPRS Potsdam Layout</b></summary>

```
DATA_ROOT/
├── Images/
├── Labels/
└── splits/
    ├── train.txt
    ├── val.txt
    └── test.txt
```

- **6 classes:** Impervious, Building, Low Vegetation, Tree, Car, Clutter
- **Resolution:** 6000×6000 tiles (cropped to 512×512)

</details>

**Must-do:** Set `DATA_ROOT` in `config.py` (LoveDA) or `config_icprs.py` (Potsdam) to your local dataset path.
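
A quick sanity check before launching a 50k-iteration run can save a crash at iteration 0. This is a hypothetical helper (not part of the repo) that verifies the LoveDA tree above:

```python
from pathlib import Path

def check_loveda(root):
    """Raise early if DATA_ROOT does not match the expected LoveDA tree."""
    for split in ("Train", "Val"):
        for domain in ("Urban", "Rural"):
            for sub in ("images_png", "masks_png"):
                p = Path(root) / split / domain / sub
                assert p.is_dir(), f"missing {p}"

# check_loveda("/path/to/LoveDA")
```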

---

## Train & Evaluation

YAML-free, config-driven – clean and reproducible.

### Train

```bash
# LoveDA – pick any backbone family
cd MambaVision   # or VMamba/, spatial-mamba/, CNN_DeepLabv3p/, etc.
# edit config.py: set DATA_ROOT, OUTPUT_DIR, and backbone variant
python train.py

# ISPRS Potsdam
cd VMamba
# edit config_icprs.py: set DATA_ROOT and OUTPUT_DIR
python train.py
```

Checkpoints + TensorBoard logs land in `Comparison_Experiments/<experiment_name>/`.
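
The fixed schedule named earlier (50k iterations, AdamW, poly decay) is straightforward to express in PyTorch. The learning rate and decay power below are illustrative assumptions; the authoritative values live in each `config.py`:

```python
import torch

model = torch.nn.Conv2d(3, 7, 1)  # placeholder for the real segmentation model
optimizer = torch.optim.AdamW(model.parameters(), lr=6e-5, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lambda it: (1 - it / 50_000) ** 0.9  # poly LR decay
)
```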

### Efficiency Profiling

```bash
# Single model benchmark (FPS + peak VRAM)
python tools/benchmark_fps_mem.py \
    --model mambavision --variant base --device cuda:0

# Full sweep across all families
python tools/benchmark_fps_mem_total.py \
    --device cuda:0 --batch_size 1
```
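
What those tools measure, in miniature: synchronized wall-clock throughput plus peak allocated memory. A hedged sketch (warmup count and input size are assumptions; the real scripts also handle model construction and reporting):

```python
import time
import torch

@torch.no_grad()
def fps_and_vram(model, device="cuda:0", iters=100):
    x = torch.randn(1, 3, 512, 512, device=device)
    model = model.to(device).eval()
    for _ in range(10):            # warmup
        model(x)
    torch.cuda.synchronize(device)
    torch.cuda.reset_peak_memory_stats(device)
    t0 = time.time()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize(device)
    fps = iters / (time.time() - t0)
    vram_mib = torch.cuda.max_memory_allocated(device) / 2**20
    return fps, vram_mib
```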

---

## Analysis Scripts

Three diagnostic scripts that reproduce every analytical claim in the paper:

| Script | What It Measures | What It Tells You |
|---|---|---|
| `analysis/boundary_analysis.py` | Boundary vs. interior mIoU under domain shift | Boundary degradation is the dominant failure mode – not interior misclassification |
| `analysis/cross_domain_analysis.py` | U→R and R→U metrics for all families | Domain transfer asymmetry is backbone-agnostic – it's a data property |
| `analysis/rotation_analysis.py` | Prediction stability under 90°/180°/270° rotations | Tests whether SSM scan-order introduces orientation artifacts |

```bash
python analysis/boundary_analysis.py \
    --device cuda:0 --use_pretrained 1

python analysis/cross_domain_analysis.py \
    --device cuda:0 --use_pretrained 1

python analysis/rotation_analysis.py \
    --device cuda:0 --use_pretrained 1 \
    --pack_rotations 1 \
    --families mambavision,vmamba,spatialmamba
```
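
The core of the boundary/interior split is a mask over ground-truth class transitions. A hedged sketch (the 2 px tolerance mirrors the loss section; the authoritative protocol is in `analysis/boundary_analysis.py`):

```python
import numpy as np
from scipy import ndimage

def boundary_mask(label, radius=2):
    """True within `radius` px of a class transition in the label map."""
    edges = np.zeros(label.shape, dtype=bool)
    edges[:-1, :] |= label[:-1, :] != label[1:, :]   # vertical transitions
    edges[:, :-1] |= label[:, :-1] != label[:, 1:]   # horizontal transitions
    return ndimage.binary_dilation(edges, iterations=radius)
```

mIoU restricted to this mask versus its complement yields the boundary and interior scores, respectively.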

Results land in `analysis_outputs/` as CSV files ready for plotting.
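
And the rotation check, conceptually: predict, rotate the input by k·90°, predict again, rotate the second prediction back, and measure agreement. The names here are placeholders, not the script's API:

```python
import torch

@torch.no_grad()
def rotation_consistency(model, image, k=1):
    """Fraction of pixels whose predicted class survives a k*90° rotation."""
    pred = model(image).argmax(1)                      # (N, H, W)
    logits_rot = model(torch.rot90(image, k, dims=(2, 3)))
    pred_back = torch.rot90(logits_rot, -k, dims=(2, 3)).argmax(1)
    return (pred == pred_back).float().mean().item()
```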

---

## Results

Straight from the paper – reproducible out of the box.

Every row shares the same decoder, loss, optimizer, schedule, augmentations, and data splits. **The only variable is the encoder backbone.**

| Type | Backbone | LoveDA mIoU | U→R | R→U | Potsdam mIoU |
|---|---|---:|---:|---:|---:|
| CNN | DeepLabv3+ (controlled) | 43.01 | 30.36 | 39.98 | 75.09 |
| Transformer | UNetFormer (controlled) | 48.61 | 34.56 | 44.84 | 74.99 |
| **SSM** | **VMamba-Small** | **55.66** | **40.62** | 53.52 | **77.59** |
| **SSM** | **MambaVision-L** | 55.25 | 38.53 | **54.01** | 77.07 |
| SSM | Spatial-Mamba-B | 48.03 | 35.23 | 46.55 | 70.00 |

> **VMamba-Small. 55.66 mIoU. +7.05 over the best Transformer. +12.65 over the best CNN. Same decoder. Same training. No tricks.**

### Accuracy vs. Throughput

<p align="center">
<img src="IGARSS%202026/fps_vs_miou.png" alt="mIoU vs Inference Throughput" width="60%">
</p>
<p align="center"><i>mIoU (%) vs. inference throughput (FPS) for all SSM variants. VMamba holds near-peak accuracy across all sizes. MambaVision trades speed for capacity with diminishing returns. Spatial-Mamba sits in the lower tier.</i></p>

### Key Takeaways

**SSMs dominate the fair fight.** VMamba-Small beats UNetFormer by +7.05 and DeepLabv3+ by +12.65 on LoveDA – under identical conditions. This is the backbone, not the pipeline.

**Bigger ≠ better under a fixed decoder.** MambaVision-L carries far more parameters than VMamba-Small yet scores 55.25 vs. 55.66. Scaling the encoder past a threshold buys nothing when the decoder stays constant.

**Domain transfer is asymmetric – and backbone-agnostic.** Rural→Urban outperforms Urban→Rural by 10–15 points across every family. VMamba-Small: 53.52 R→U vs. 40.62 U→R. This is a data distribution property, not a model property.

**Boundaries are the unsolved failure mode.** Under domain shift, interior accuracy holds. Boundary accuracy collapses. Every backbone, every family, same story. Whoever cracks boundary sensitivity under distribution shift wins the next round.

### Qualitative Results – LoveDA

<p align="center">
<img src="IGARSS%202026/loveda_qualitative_detailed_enhanced.png" alt="LoveDA Qualitative Results" width="85%">
</p>
<p align="center"><i>Predictions + error maps (magenta = false positive, dark green = false negative) on LoveDA Urban and Rural scenes. VMamba-S and VMamba-B produce the cleanest boundaries; Spatial-Mamba-B shows the most false positives at class transitions.</i></p>

### Qualitative Results – ISPRS Potsdam

<p align="center">
<img src="IGARSS%202026/potsdam_qualitative_detailed_enhanced.png" alt="ISPRS Potsdam Qualitative Results" width="85%">
</p>
<p align="center"><i>Predictions + error maps on ISPRS Potsdam. All SSM variants handle large homogeneous regions well; errors concentrate at fine-grained boundaries (cars, narrow roads) – consistent with the boundary analysis findings.</i></p>

---

## Backbone Overview

| Backbone | Architecture | Key Idea | RS Segmentation Impact |
|---|---|---|---|
| **VMamba** | Cross-scan 2D selective SSM | Global spatial context with linear complexity via multi-directional scanning | Top performer: 55.66 LoveDA mIoU, strongest domain transfer |
| **MambaVision** | Hybrid Mamba + self-attention | Interleaves Mamba blocks (early stages) with attention (late stages) | Matches VMamba on Potsdam, but extra capacity doesn't help on LoveDA |
| **Spatial-Mamba** | Spatially-aware SSM | Explicit positional inductive biases in the state-space pathway | Beats the CNN baseline, but scan-order alone is insufficient without global modeling |
| **DeepLabv3+** | CNN (ResNet-50) | Atrous convolutions + ASPP for multi-scale context | Controlled CNN reference – 43.01 mIoU baseline |
| **UNetFormer** | Transformer (ResNet-18) | Efficient self-attention decoder for dense prediction | Controlled Transformer reference – 48.61 mIoU baseline |

---

## Acknowledgements

This work builds on prior advances in visual state-space models and remote-sensing segmentation. We gratefully acknowledge:

- **[VMamba](https://github.com/MzeroMiko/VMamba)** – Visual State Space Model backbone
- **[MambaVision](https://github.com/NVlabs/MambaVision)** – NVIDIA's hybrid Mamba-Transformer architecture
- **[Spatial-Mamba](https://github.com/EdwardChaworworrachat/SpatialMamba)** – Spatially-aware Mamba variant
- **[LoveDA](https://github.com/Junjue-Wang/LoveDA)** and **[ISPRS Potsdam](https://www.isprs.org/education/benchmarks/UrbanSemLab/)** dataset creators

---

## Citation

If Mamba-Segmentation fuels your research, please cite:

```bibtex
@article{wasalathilaka2026controlledbenchmark,
  title={A Controlled Benchmark of Visual State-Space Backbones with
         Domain-Shift and Boundary Analysis for Remote-Sensing
         Segmentation},
  author={Wasalathilaka, Nichula and Perea, Dineth and Samarakoon,
          Oshadha and Wijenayake, Buddhi and Godaliyadda, Roshan and
          Herath, Vijitha and Ekanayake, Parakrama},
  journal={IGARSS 2026},
  year={2026}
}
```

---

Built at the **University of Peradeniya**. Got inspired? Give us a ⭐