LFM2.5-VL-450M VRSBench LoRA

Model Description

This is a fine-tuned version of LiquidAI's LFM2.5-VL-450M vision-language model, trained on the VRSBench dataset for general satellite image understanding. This serves as the base model for specialized satellite vision tasks.

The model can answer questions about satellite imagery, including:

  • Scene classification
  • Object detection and counting
  • Visual question answering about satellite images
  • General satellite image understanding

Training Details

VRSBench Training

  • Base Model: LFM2.5-VL-450M
  • Dataset: VRSBench (Vision Reasoning and Scene Understanding Benchmark)
  • Epochs: 1
  • Method: LoRA (r=16, alpha=32)
  • Hardware: Local GPU training (no Ray/distributed)

Derived Models

This model serves as the base for specialized satellite vision experts:

Model Dataset Task Accuracy/Performance
VRSBench + EuroSAT Terrain Expert EuroSAT Terrain Classification 97.52% accuracy
VRSBench + MADOS Maritime Expert MADOS Maritime Detection IoU@0.5: ~2%

Usage

With llama.cpp

# Download Q4_K_M quantized version (recommended)
wget https://huggingface.co/5ch4um1/lfm2.5-vrsbench-lora-450m/resolve/main/lfm2.5-vrsbench-lora-450m-q4_k_m.gguf

# Run inference
./llama-cli -m lfm2.5-vrsbench-lora-450m-q4_k_m.gguf \
  --image satellite_image.jpg \
  -p "Describe this satellite image in detail."

With Transformers

from transformers import AutoModelForVision2Seq, AutoProcessor
from PIL import Image

model = AutoModelForVision2Seq.from_pretrained(
    "5ch4um1/lfm2.5-vrsbench-lora-450m",
    torch_dtype="auto",
    device_map="auto"
)
processor = AutoProcessor.from_pretrained("5ch4um1/lfm2.5-vrsbench-lora-450m")

image = Image.open("satellite_image.jpg")
prompt = "What is shown in this satellite image?"

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(outputs[0], skip_special_tokens=True))

GGUF Quantizations

Version Size Description
F16 679 MB Full precision (16-bit)
Q8_0 362 MB 8-bit quantization
Q4_K_M 219 MB 4-bit quantization (recommended for most use cases)

Model Sources

Limitations

  • General satellite understanding model - not specialized for specific tasks
  • Performance varies depending on satellite image type and task
  • For specialized tasks (terrain, maritime), use the derived expert models listed above

Training Environment

  • Framework: Transformers + PEFT (LoRA)
  • Hardware: Local GPU (CUDA)
  • Training Scripts: Available in the cookbook repository
Downloads last month
371
Safetensors
Model size
0.4B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for 5ch4um1/lfm2.5-vrsbench-lora-450m

Quantized
(21)
this model
Quantizations
1 model