FLORO: A Multimodal Geospatial Foundation Model for Ecological Remote Sensing Across Sensors and Scales
FLORO is a multimodal geospatial foundation model designed to learn transferable representations from heterogeneous remote sensing observations. It was developed for ecological and environmental remote sensing applications where data often vary across sensors, spatial resolutions, spectral definitions, and available modalities.
Paper: arXiv:2605.28174
Project page: FLORO project page
Code: GitHub repository
Hugging Face: jorlrodriguezg/floro
Model description
FLORO is pretrained with masked autoencoding on multimodal Earth observation inputs. The released checkpoint is a Vision Transformer-based encoder trained with multispectral and elevation-related inputs.
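For orientation, the sizes implied by a Vision Transformer of this kind can be worked out from the constructor arguments shown under "Loading the checkpoint". The sketch below is plain arithmetic on those values. The `ViTShape` class is a hypothetical helper, not part of the FLORO package, and the parameter estimate counts only the transformer block weight matrices, so the real checkpoint will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTShape:
    # Values copied from the constructor call in "Loading the checkpoint".
    image_size: int = 256
    patch_size: int = 16
    d_model: int = 1024
    depth: int = 24
    num_heads: int = 16
    mlp_ratio: int = 4

    @property
    def grid(self) -> int:
        # Patches per side; the image must divide evenly into patches.
        assert self.image_size % self.patch_size == 0
        return self.image_size // self.patch_size

    @property
    def num_patches(self) -> int:
        # Assumes one token per patch position (an assumption for FLORO).
        return self.grid ** 2

    @property
    def head_dim(self) -> int:
        assert self.d_model % self.num_heads == 0
        return self.d_model // self.num_heads

    @property
    def approx_block_params(self) -> int:
        # Weight matrices only: 4*d^2 for attention (QKV + output projection)
        # plus 2*mlp_ratio*d^2 for the MLP. Biases, norms, and embeddings
        # are ignored.
        d = self.d_model
        return self.depth * (4 * d * d + 2 * self.mlp_ratio * d * d)


shape = ViTShape()
print(shape.grid, shape.num_patches, shape.head_dim)  # 16 256 64
print(f"{shape.approx_block_params / 1e6:.0f}M")      # 302M
```

These numbers match a standard ViT-Large layout. How FLORO combines tokens from its two input streams is not described here, so the actual sequence length may be larger than 256.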
The model is intended as a general-purpose encoder for ecological remote sensing tasks, including:
- semantic segmentation
- canopy height estimation
- biomass, nitrogen, and carbon regression
- scene classification
- spectral or modality reconstruction
Input representation
FLORO uses grouped input channels to represent heterogeneous optical and auxiliary remote sensing modalities. Rather than assuming a fixed sensor-specific band order, the model organizes information into modality groups and uses validity channels to indicate which groups are available or valid for a given sample.
Optical stream
The optical stream contains spectral reflectance information and corresponding validity indicators. It can represent visible, red-edge, near-infrared, and shortwave-infrared information depending on sensor availability.
| Group | Description |
|---|---|
| Blue, Green, Red | Visible optical bands |
| Red edge | Vegetation-sensitive red-edge band |
| Near infrared | NIR vegetation-sensitive band |
| Near infrared A | Additional NIR-like band when available |
| SWIR 1, SWIR 2 | Shortwave-infrared bands |
| Validity channels | Indicators for available or valid spectral groups |
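As a rough illustration of the grouped layout, the sketch below packs whatever bands a sensor provides into a fixed-size array and appends one validity plane per group. The group names, band order, zero-filling of missing groups, and the one-flag-per-group convention are all assumptions made for this example; the FLORO repository defines the actual layout. Under these assumptions the result has 8 band planes plus 5 validity planes, which is consistent with `multispectral_channels=13` in the loading example, although the card does not confirm that breakdown.

```python
import numpy as np

# Hypothetical group layout: names, band order, and the "one validity flag per
# group" convention are assumptions for illustration, not FLORO's documented API.
OPTICAL_GROUPS = {
    "visible": ["blue", "green", "red"],
    "red_edge": ["red_edge"],
    "nir": ["nir"],
    "nir_a": ["nir_a"],
    "swir": ["swir1", "swir2"],
}


def pack_optical(bands, height, width):
    """Stack reflectance bands and per-group validity planes into one array.

    `bands` maps band names to (height, width) reflectance arrays. A group is
    marked valid only when every band in it is present; missing groups are
    zero-filled so the channel layout stays fixed across sensors.
    """
    data, validity = [], []
    for members in OPTICAL_GROUPS.values():
        available = all(name in bands for name in members)
        for name in members:
            plane = bands[name] if available else np.zeros((height, width))
            data.append(np.asarray(plane, dtype=np.float32))
        validity.append(np.full((height, width), float(available), dtype=np.float32))
    return np.stack(data + validity, axis=0)


# A four-band RGB + NIR sensor with no red-edge or SWIR bands.
h, w = 4, 4
rgbn = {name: np.full((h, w), 0.1) for name in ["blue", "green", "red", "nir"]}
x = pack_optical(rgbn, h, w)
print(x.shape)      # (13, 4, 4)
print(x[8:, 0, 0])  # validity flags per group at one pixel
```

Keeping the channel count fixed and signalling absence through validity planes is what lets one set of encoder weights accept inputs from sensors with different band sets.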
Auxiliary stream
The auxiliary stream represents terrain and radar information.
| Group | Description |
|---|---|
| Elevation | DSM, DTM, DEM, or related terrain-derived information |
| SAR VV | Sentinel-1 vertical transmit / vertical receive backscatter |
| SAR VH | Sentinel-1 vertical transmit / horizontal receive backscatter |
| Validity channels | Indicators for available or valid auxiliary modalities |
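Validity can also vary within a tile, for example where a DEM has nodata pixels. The sketch below builds an auxiliary array with per-pixel validity planes. The function name, the `-9999` nodata value, the channel order, and the choice of one validity plane for elevation plus one shared by both SAR polarizations are assumptions for illustration. They yield 5 planes, which is consistent with `modalities_channels=5` in the loading example but is not confirmed by this card.

```python
import numpy as np


def pack_auxiliary(elevation=None, sar_vv=None, sar_vh=None, shape=(4, 4), nodata=-9999.0):
    """Build a 5-plane array: elevation, VV, VH, then two validity planes.

    Assumed layout (not confirmed by the model card): one validity plane for
    elevation and one shared by the two SAR polarizations. A pixel is valid
    when its modality is present, finite, and not equal to the nodata value.
    """
    def clean(plane):
        if plane is None:
            return np.zeros(shape, np.float32), np.zeros(shape, bool)
        plane = np.asarray(plane, dtype=np.float32)
        ok = np.isfinite(plane) & (plane != nodata)
        # Invalid pixels are zero-filled; the validity plane carries the mask.
        return np.where(ok, plane, 0.0).astype(np.float32), ok

    elev, elev_ok = clean(elevation)
    vv, vv_ok = clean(sar_vv)
    vh, vh_ok = clean(sar_vh)
    sar_ok = vv_ok & vh_ok  # SAR is valid only where both polarizations are
    return np.stack([elev, vv, vh, elev_ok, sar_ok]).astype(np.float32)


# A small DEM tile with one nodata pixel and no SAR acquisition.
dem = np.array([[120.0, 121.5], [-9999.0, 119.0]])
aux = pack_auxiliary(elevation=dem, shape=dem.shape)
print(aux.shape)     # (5, 2, 2)
print(aux[3])        # elevation validity, 0 at the nodata pixel
print(aux[4].sum())  # 0.0, since no SAR was provided
```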
Intended use
FLORO is intended for research in remote sensing, ecological monitoring, dryland vegetation mapping, environmental applications, and downstream adaptation of geospatial foundation models.
Example use cases include:
- adapting the encoder for semantic segmentation of high-resolution UAV or aerial imagery
- using FLORO features for canopy height or biomass regression
- evaluating transfer learning across sensors and spatial resolutions
- studying multimodal fusion between spectral and topographic information
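For the dense and scene-level use cases above, a common first step is to turn the encoder's patch tokens into either a spatial feature map or a pooled vector. The sketch below uses random arrays in place of real encoder output. It assumes tokens of shape (batch, patches, dim) in row-major patch order with no class token, which is an assumption about FLORO's output rather than documented behavior, and `tokens_to_feature_map` is a hypothetical helper.

```python
import numpy as np


def tokens_to_feature_map(tokens, grid_size):
    """Reshape (batch, grid*grid, dim) patch tokens to (batch, dim, grid, grid).

    Assumes row-major patch order and no class or register tokens; both are
    assumptions about the encoder output, not documented FLORO behavior.
    """
    batch, num_tokens, dim = tokens.shape
    if num_tokens != grid_size * grid_size:
        raise ValueError(f"expected {grid_size * grid_size} tokens, got {num_tokens}")
    fmap = tokens.reshape(batch, grid_size, grid_size, dim)
    return fmap.transpose(0, 3, 1, 2)


# Stand-in for encoder output: 2 images, a 16 x 16 patch grid, 1024-dim tokens.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((2, 256, 1024)).astype(np.float32)

# Dense tasks (segmentation, canopy height maps): channels-first feature map.
fmap = tokens_to_feature_map(tokens, grid_size=16)
print(fmap.shape)  # (2, 1024, 16, 16)

# Scene-level tasks (classification, plot-level biomass): mean-pool the tokens.
pooled = tokens.mean(axis=1)
print(pooled.shape)  # (2, 1024)
```

A segmentation or height decoder would then upsample the 16 x 16 map back to the input resolution, while a regression head would operate on the pooled vector.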
Limitations
FLORO was designed for ecological remote sensing and may require task-specific adaptation before deployment. Performance can depend on spatial resolution, band definitions, radiometric preprocessing, geographic region, and downstream decoder design.
The model should not be used as a standalone decision system for high-stakes environmental, legal, or policy decisions without validation on the target domain.
Loading the checkpoint
The Hugging Face release includes both checkpoint formats:
- checkpoints/floro_encoder_202603_ep150.safetensors: recommended for new PyTorch workflows.
- checkpoints/floro_encoder_202603_ep150.pth: original PyTorch checkpoint, kept for reproducibility and direct use with PANGAEA-style configurations.
To load the safetensors checkpoint into the encoder:
from floro import FLOROGeoEncoder
from safetensors.torch import load_file

# Build the encoder with the hyperparameters of the released checkpoint.
model = FLOROGeoEncoder(
    image_size=256,
    patch_size=16,
    multispectral_channels=13,
    modalities_channels=5,
    d_model=1024,
    depth=24,
    num_heads=16,
    mlp_ratio=4,
    pos_embed_type="absolute",
)

# strict=False tolerates keys that are missing from or extra in the checkpoint.
state_dict = load_file("checkpoints/floro_encoder_202603_ep150.safetensors")
model.load_state_dict(state_dict, strict=False)
model.eval()  # switch to inference mode
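Because the loading call passes `strict=False`, key mismatches between the model and the checkpoint do not raise an error, so it is worth checking them explicitly. The sketch below compares two sets of key names in plain Python. `compare_keys` is a hypothetical helper and the key names are made up for the example.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring PyTorch's terminology.

    missing: parameters the model expects that the checkpoint lacks (they keep
    their initial values). unexpected: checkpoint entries the model has no
    slot for (they are ignored).
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical key names for illustration only.
model_keys = ["patch_embed.weight", "blocks.0.attn.qkv.weight", "head.weight"]
checkpoint_keys = ["patch_embed.weight", "blocks.0.attn.qkv.weight", "decoder.weight"]

missing, unexpected = compare_keys(model_keys, checkpoint_keys)
print(missing)     # ['head.weight']
print(unexpected)  # ['decoder.weight']
```

With the real objects this would be `compare_keys(model.state_dict().keys(), state_dict.keys())`. PyTorch also reports the same information directly: `load_state_dict` returns a result with `missing_keys` and `unexpected_keys` fields.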
For PANGAEA benchmark runs, the encoder configs in pangaea-bench/configs/encoder/ point to the original .pth checkpoint:
encoder_weights: ./checkpoints/floro_encoder_202603_ep150.pth
download_url: https://huggingface.co/jorlrodriguezg/floro/resolve/main/checkpoints/floro_encoder_202603_ep150.pth
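The `download_url` above follows the usual Hugging Face pattern of repository id, `resolve`, revision, and file path. A minimal sketch of building such a URL is shown below. `hf_resolve_url` is a hypothetical helper written for this example, and the default revision `main` is taken from the URL in the config.

```python
from urllib.parse import quote


def hf_resolve_url(repo_id, filename, revision="main"):
    """Build a Hugging Face direct-download URL for a file in a model repo."""
    # Slashes in the file path are kept; other unsafe characters are escaped.
    return (
        f"https://huggingface.co/{repo_id}/resolve/"
        f"{quote(revision, safe='')}/{quote(filename)}"
    )


url = hf_resolve_url(
    "jorlrodriguezg/floro",
    "checkpoints/floro_encoder_202603_ep150.pth",
)
print(url)
```

In practice, `huggingface_hub.hf_hub_download(repo_id=..., filename=...)` fetches and caches the file without constructing the URL by hand.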
To convert the original PyTorch checkpoint to safetensors, run:
python scripts/convert_checkpoint_to_safetensors.py \
    --input checkpoints/floro_encoder_202603_ep150.pth \
    --output checkpoints/floro_encoder_202603_ep150.safetensors