FLORO: A Multimodal Geospatial Foundation Model for Ecological Remote Sensing Across Sensors and Scales

FLORO is a multimodal geospatial foundation model designed to learn transferable representations from heterogeneous remote sensing observations. It was developed for ecological and environmental remote sensing applications where data often vary across sensors, spatial resolutions, spectral definitions, and available modalities.

Paper: arXiv:2605.28174
Project page: FLORO project page
Code: GitHub repository
Hugging Face: jorlrodriguezg/floro

Model description

FLORO uses masked autoencoding pretraining with multimodal Earth observation inputs. The released checkpoint corresponds to a Vision Transformer-based encoder trained with multispectral and elevation-related inputs.
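
As a rough illustration of the masking step in masked autoencoding, the sketch below splits an image into a patch grid and hides a random subset of patches from the encoder. The `image_size` and `patch_size` values come from the loading example in this card. The 0.75 mask ratio and the uniform random sampling are assumptions borrowed from the generic MAE recipe, not confirmed FLORO settings, and the helper names are hypothetical.

```python
import random
from typing import List, Optional


def patch_grid(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side * side


def random_patch_mask(num_patches: int, mask_ratio: float,
                      seed: Optional[int] = None) -> List[bool]:
    """Per-patch mask where True means the patch is hidden from the encoder."""
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


num_patches = patch_grid(image_size=256, patch_size=16)
# ASSUMPTION: 0.75 is the common MAE default, not a documented FLORO value.
mask = random_patch_mask(num_patches, mask_ratio=0.75, seed=0)
visible = [i for i, hidden in enumerate(mask) if not hidden]
print(num_patches, sum(mask), len(visible))
```

With these settings the encoder would see 64 of the 256 patches, and the decoder would be trained to reconstruct the remaining 192.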

The model is intended as a general-purpose encoder for ecological remote sensing tasks, including:

  • semantic segmentation
  • canopy height estimation
  • biomass, nitrogen, and carbon regression
  • scene classification
  • spectral or modality reconstruction

Input representation

FLORO uses grouped input channels to represent heterogeneous optical and auxiliary remote sensing modalities. Rather than assuming a fixed sensor-specific band order, the model organizes information into modality groups and uses validity channels to indicate which groups are available or valid for a given sample.

Optical stream

The optical stream contains spectral reflectance information and corresponding validity indicators. It can represent visible, red-edge, near-infrared, and shortwave-infrared information depending on sensor availability.

| Group | Description |
| --- | --- |
| Blue, Green, Red | Visible optical bands |
| Red edge | Vegetation-sensitive red-edge band |
| Near infrared | NIR vegetation-sensitive band |
| Near infrared A | Additional NIR-like band when available |
| SWIR 1, SWIR 2 | Shortwave-infrared bands |
| Validity channels | Indicators for available or valid spectral groups |
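
The grouping idea can be sketched as follows: available bands are stacked in a fixed order, missing groups are zero-filled, and one validity plane per group records what was actually observed. The group names, the channel order, and the use of full validity planes are all assumptions made for illustration. The resulting 8 band planes plus 5 validity planes happen to match `multispectral_channels=13` in the loading example, but this card does not confirm that layout.

```python
# HYPOTHETICAL group layout; the real channel order is defined by the FLORO code.
OPTICAL_GROUPS = {
    "visible": ["blue", "green", "red"],
    "red_edge": ["red_edge"],
    "nir": ["nir"],
    "nir_a": ["nir_a"],
    "swir": ["swir1", "swir2"],
}


def pack_optical(bands, height, width):
    """Stack band planes, zero-fill missing groups, and append validity planes.

    `bands` maps a band name to a 2D list (height x width) of reflectance
    values. A group counts as valid only if all of its bands are present.
    """
    def plane(value):
        return [[value] * width for _ in range(height)]

    data, validity = [], []
    for names in OPTICAL_GROUPS.values():
        present = all(name in bands for name in names)
        for name in names:
            data.append(bands[name] if present else plane(0.0))
        validity.append(plane(1.0 if present else 0.0))
    return data + validity


# Example: a sensor that only records RGB and NIR (a 2 x 2 toy tile).
tile = [[0.1, 0.2], [0.3, 0.4]]
stack = pack_optical({"blue": tile, "green": tile, "red": tile, "nir": tile}, 2, 2)
flags = [p[0][0] for p in stack[8:]]  # one flag per group, read from each validity plane
print(len(stack), flags)
```

Zero-filling alone would be ambiguous, because a zero could also be a real measurement. The validity planes let the model separate "not observed" from "observed as zero".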

Auxiliary stream

The auxiliary stream represents terrain and radar information.

| Group | Description |
| --- | --- |
| Elevation | DSM, DTM, DEM, or related terrain-derived information |
| SAR VV | Sentinel-1 vertical transmit / vertical receive backscatter |
| SAR VH | Sentinel-1 vertical transmit / horizontal receive backscatter |
| Validity channels | Indicators for available or valid auxiliary modalities |
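
Radar backscatter is commonly converted from linear power to decibels with `10 * log10(sigma0)` before use. The sketch below pairs that conversion with a per-pixel validity flag. Whether FLORO expects decibel-scaled SAR input, and whether its validity channels are per pixel or per sample, is not stated in this card, so both are assumptions and the function names are hypothetical.

```python
import math


def backscatter_to_db(sigma0):
    """Convert linear backscatter to dB and return (value_db, valid_flag).

    Missing, NaN, or non-positive inputs have no logarithm, so they are
    zero-filled and flagged invalid.
    """
    if sigma0 is None or math.isnan(sigma0) or sigma0 <= 0.0:
        return 0.0, 0.0
    return 10.0 * math.log10(sigma0), 1.0


def sar_planes(grid):
    """Apply the conversion to a 2D list; return (db_plane, validity_plane)."""
    db_plane, valid_plane = [], []
    for row in grid:
        converted = [backscatter_to_db(value) for value in row]
        db_plane.append([db for db, _ in converted])
        valid_plane.append([valid for _, valid in converted])
    return db_plane, valid_plane


# ASSUMPTION: toy VV values in linear power units, with two unusable pixels.
vv_linear = [[0.1, 1.0], [0.0, None]]
vv_db, vv_valid = sar_planes(vv_linear)
print(vv_db)
print(vv_valid)
```

A linear value of 1.0 maps to 0 dB, which is also the fill value here. Only the validity plane distinguishes that real measurement from the two invalid pixels.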

Intended use

FLORO is intended for research in remote sensing, ecological monitoring, dryland vegetation mapping, environmental applications, and downstream adaptation of geospatial foundation models.

Example use cases include:

  • adapting the encoder for semantic segmentation of high-resolution UAV or aerial imagery
  • using FLORO features for canopy height or biomass regression
  • evaluating transfer learning across sensors and spatial resolutions
  • studying multimodal fusion between spectral and topographic information

Limitations

FLORO was designed for ecological remote sensing and may require task-specific adaptation before deployment. Performance can depend on spatial resolution, band definitions, radiometric preprocessing, geographic region, and downstream decoder design.

The model should not be used as a standalone decision system for high-stakes environmental, legal, or policy decisions without validation on the target domain.

Loading the checkpoint

The Hugging Face release includes the encoder checkpoint in two formats:

  • checkpoints/floro_encoder_202603_ep150.safetensors: recommended for new PyTorch workflows.
  • checkpoints/floro_encoder_202603_ep150.pth: original PyTorch checkpoint, kept for reproducibility and direct use with PANGAEA-style configurations.

Direct downloads:
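
Assuming both files follow the same `resolve/main` URL pattern as the `download_url` used in the PANGAEA encoder configs, the direct links can be built as in the sketch below. `resolve_url` and `download` are hypothetical helpers, not part of the FLORO code. The URL for the `.safetensors` file is inferred from that pattern rather than quoted from the release.

```python
from urllib.request import urlretrieve

REPO_ID = "jorlrodriguezg/floro"
CHECKPOINTS = {
    "safetensors": "checkpoints/floro_encoder_202603_ep150.safetensors",
    "pth": "checkpoints/floro_encoder_202603_ep150.pth",
}


def resolve_url(filename, repo_id=REPO_ID, revision="main"):
    """Build a direct-download URL for a file in a Hugging Face repository."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def download(fmt, dest):
    """Download one checkpoint format to `dest` (requires network access)."""
    path, _headers = urlretrieve(resolve_url(CHECKPOINTS[fmt]), dest)
    return path


# Print the candidate links without downloading anything.
for fmt, filename in CHECKPOINTS.items():
    print(fmt, resolve_url(filename))
```

Calling `download("safetensors", "checkpoints/floro_encoder_202603_ep150.safetensors")` would place the file at the path used by the loading code, provided the `checkpoints/` directory already exists.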

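# Load the safetensors checkpoint into the encoder.
from floro import FLOROGeoEncoder
from safetensors.torch import load_file

# Instantiate the encoder (ViT-Large-scale settings: 24 layers, width 1024, 16 heads).
model = FLOROGeoEncoder(
    image_size=256,
    patch_size=16,
    multispectral_channels=13,
    modalities_channels=5,
    d_model=1024,
    depth=24,
    num_heads=16,
    mlp_ratio=4,
    pos_embed_type="absolute",
)

# strict=False tolerates key mismatches, so report them instead of ignoring them.
state_dict = load_file("checkpoints/floro_encoder_202603_ep150.safetensors")
result = model.load_state_dict(state_dict, strict=False)
print("missing keys:", result.missing_keys)
print("unexpected keys:", result.unexpected_keys)
model.eval()  # switch to inference mode (disables dropout and similar layers)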

For PANGAEA benchmark runs, the encoder configs in pangaea-bench/configs/encoder/ point to the original .pth checkpoint:

encoder_weights: ./checkpoints/floro_encoder_202603_ep150.pth
download_url: https://huggingface.co/jorlrodriguezg/floro/resolve/main/checkpoints/floro_encoder_202603_ep150.pth

To convert the original PyTorch checkpoint to safetensors, run:

python scripts/convert_checkpoint_to_safetensors.py \
  --input checkpoints/floro_encoder_202603_ep150.pth \
  --output checkpoints/floro_encoder_202603_ep150.safetensors