OVFruitQG
OVFruitQG is a PyTorch release for open-vocabulary fruit and vegetable quality and maturity assessment. The release includes model code, prompts, annotations, paper result tables, and trained checkpoints for the main method and supervised baselines.
Included Assets
OVFruitQG/
├── checkpoints/            # Released model checkpoints
├── configs/                # Split and baseline configs
├── data/                   # Annotation metadata and split protocol
├── models/                 # Public model implementations
├── prompts/                # Prompt bank
├── results/                # Paper table CSV files
├── scripts/                # Training/evaluation/table scripts
├── training/               # Training and metric utilities
├── OVfruitQG dataset.zip   # Dataset archive
├── requirements.txt
└── README.md
Large files such as *.pt checkpoints and the dataset zip should be stored with
Git LFS when this folder is uploaded to Hugging Face.
Installation
pip install -r requirements.txt
The main OVFruitQG model uses a frozen CLIP backbone through Hugging Face
transformers. If the CLIP checkpoint is not already cached, it may be
downloaded automatically by transformers.
Dataset
The dataset archive is provided as:
OVfruitQG dataset.zip
Unzip it before running training/evaluation scripts. Annotation files and
dataset notes are also provided under data/.
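A minimal sketch of the unzip step, using only the standard-library zipfile module. The helper name extract_dataset and the destination directory are assumptions for illustration; the release does not prescribe where the archive should be extracted, so use whatever path the training/evaluation scripts expect.

```python
import zipfile
from pathlib import Path


def extract_dataset(archive_path, dest_dir):
    """Extract a zip archive into dest_dir and return the archived member names.

    Hypothetical helper, not part of the release.
    """
    archive_path = Path(archive_path)
    dest_dir = Path(dest_dir)
    if not archive_path.is_file():
        raise FileNotFoundError(f"Archive not found: {archive_path}")
    # Create the destination if needed, then unpack every member into it.
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest_dir)
        return zf.namelist()
```

For example, extract_dataset("OVfruitQG dataset.zip", "dataset") unpacks the archive into a dataset/ folder; note the space in the archive file name, which must be quoted if the same step is done from a shell.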
The public split protocol is category-level and is described in:
data/split_protocol.md
No per-image train/validation/test split CSV files are included in this release.
Label Order
Quality labels:
healthy, rotten, moldy, bruised, cracked
Maturity labels:
unripe, ripe, overripe
The same orders are exported from models as QUALITY_CLASSES and
MATURITY_CLASSES.
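A short sketch of how a predicted class index maps back to a label name, assuming the index is the position in the orders listed above. The two lists are copied from this section so the snippet is self-contained; in the release the same lists can be imported from models. The helper decode_prediction is hypothetical.

```python
# Label orders copied from this README; the release exports the same
# orders from `models` as QUALITY_CLASSES and MATURITY_CLASSES.
QUALITY_CLASSES = ["healthy", "rotten", "moldy", "bruised", "cracked"]
MATURITY_CLASSES = ["unripe", "ripe", "overripe"]


def decode_prediction(quality_id, maturity_id):
    """Map predicted class indices to label names (hypothetical helper)."""
    return QUALITY_CLASSES[quality_id], MATURITY_CLASSES[maturity_id]


print(decode_prediction(1, 2))  # ('rotten', 'overripe')
```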
Checkpoints
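The release includes checkpoints for four category splits. In the file patterns below, * stands for the split number (for example, ResNet_split1.pt):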
| File pattern | Model |
|---|---|
| checkpoints/OVFruitQG_split*.pt | OVFruitQG / V3.1 |
| checkpoints/ResNet_split*.pt | Supervised ResNet50 |
| checkpoints/ViT_split*.pt | Supervised ViT-B/16 |
| checkpoints/ResNet_LDB_fast_split*.pt | V4.5 / ResNet LDB fast |
See checkpoints/checkpoint_manifest.csv for SHA256 hashes.
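One way to check a downloaded checkpoint against the manifest is to hash the file with hashlib and compare the hex digest to the recorded value. This is only a sketch: the manifest's column layout is not documented here, so the expected hash is passed in by hand rather than parsed from the CSV, and both helper names are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_manifest(path, expected_hex):
    """Compare a file's digest with a hash copied from the manifest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Reading in chunks keeps memory use flat even for large *.pt files. A mismatch usually points to an incomplete download or a Git LFS pointer file that was fetched instead of the real checkpoint.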
Load a Checkpoint
import torch
from PIL import Image
from torchvision import transforms

from models import build_model_for_checkpoint, load_checkpoint_into_model

# Build the architecture that matches the checkpoint, then load its weights.
checkpoint = "checkpoints/ResNet_split1.pt"
model = build_model_for_checkpoint(checkpoint)
load_checkpoint_into_model(model, checkpoint)
model.eval()

# Resize to 224x224 and normalize with the ImageNet mean and std.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open("path/to/crop.jpg").convert("RGB")
pixel_values = preprocess(image).unsqueeze(0)  # add a batch dimension

with torch.no_grad():
    outputs = model(pixel_values)

# Indices follow the orders listed under Label Order.
quality_id = outputs["quality_logits"].argmax(dim=-1).item()
maturity_id = outputs["maturity_logits"].argmax(dim=-1).item()
For OVFruitQG:
from models import build_model, load_checkpoint_into_model

model = build_model(
    "v3_1",
    model_version="v3_1",
    freeze_backbone=True,
    allow_backbone_fallback=False,
)
load_checkpoint_into_model(model, "checkpoints/OVFruitQG_split1.pt")
For V4.5:
from models import build_model_for_checkpoint, load_checkpoint_into_model
checkpoint = "checkpoints/ResNet_LDB_fast_split1.pt"
model = build_model_for_checkpoint(checkpoint)
load_checkpoint_into_model(model, checkpoint)
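To run one model family across all released splits, the checkpoint files can be collected by their file pattern before loading. The sketch below uses only pathlib; the helper name find_split_checkpoints is hypothetical and not part of the release.

```python
from pathlib import Path


def find_split_checkpoints(checkpoint_dir, prefix):
    """Return sorted paths matching <prefix>_split*.pt inside checkpoint_dir.

    Hypothetical helper; `prefix` is the part of the file name before
    "_split", e.g. "ResNet", "ViT", "OVFruitQG", or "ResNet_LDB_fast".
    """
    return sorted(Path(checkpoint_dir).glob(f"{prefix}_split*.pt"))
```

Because the pattern anchors on the full prefix, find_split_checkpoints("checkpoints", "ResNet") does not pick up the ResNet_LDB_fast files. Each returned path can be converted with str(path) and passed to build_model_for_checkpoint and load_checkpoint_into_model as in the snippets above.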
Results
Paper result tables are stored in results/, including:
- Table 5: main recognition results
- Table 6: prompt retrieval and matching
- Table 7: seen/unseen generalization
- Table 8: ablation study
- Table 9: efficiency analysis
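The CSV tables can be inspected with the standard-library csv module. This sketch makes no assumption about file or column names, which are not listed here; the path is supplied by the caller, and both helper names are hypothetical.

```python
import csv


def load_table(csv_path):
    """Read one result table into a list of row dictionaries keyed by header."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))


def column_names(rows):
    """Return the header names of a loaded table, or [] if it has no rows."""
    return list(rows[0].keys()) if rows else []
```

All values are returned as strings, so numeric columns need an explicit float() conversion before any comparison or averaging.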
License and Third-Party Models
The code is released under the MIT license. Third-party foundation models such as CLIP, OpenCLIP, and Grounding DINO are not redistributed unless their files are explicitly present in this folder. Obtain them from their official sources and comply with their original licenses.