OVFruitQG

OVFruitQG is a PyTorch release for open-vocabulary fruit and vegetable quality and maturity assessment. The release includes model code, prompts, annotations, paper result tables, and trained checkpoints for the main method and supervised baselines.

Included Assets

OVFruitQG/
├── checkpoints/                 # Released model checkpoints
├── configs/                     # Split and baseline configs
├── data/                        # Annotation metadata and split protocol
├── models/                      # Public model implementations
├── prompts/                     # Prompt bank
├── results/                     # Paper table CSV files
├── scripts/                     # Training/evaluation/table scripts
├── training/                    # Training and metric utilities
├── OVfruitQG dataset.zip        # Dataset archive
├── requirements.txt
└── README.md

Large files such as *.pt checkpoints and the dataset zip should be stored with Git LFS when this folder is uploaded to Hugging Face.

Installation

pip install -r requirements.txt

The main OVFruitQG model uses a frozen CLIP backbone loaded through Hugging Face transformers. If the CLIP checkpoint is not already in the local cache, transformers downloads it automatically on first use, so the first run needs network access.

Dataset

The dataset archive is provided as:

OVfruitQG dataset.zip

Unzip it before running training/evaluation scripts. Annotation files and dataset notes are also provided under data/.
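The extraction step can be scripted with the Python standard library. The sketch below is a minimal example, not part of the release: the helper name extract_dataset and the target folder "dataset" are assumptions, and the folder layout inside the archive is not described here, so the function only reports which members it extracted.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_dataset(archive: str = "OVfruitQG dataset.zip",
                    target_dir: str = "dataset") -> List[str]:
    """Extract the dataset archive and return the names of its members.

    `target_dir` is a hypothetical output folder; point it wherever the
    training/evaluation scripts expect the data.
    """
    archive_path = Path(archive)
    if not archive_path.is_file():
        raise FileNotFoundError(f"Dataset archive not found: {archive_path}")

    out_dir = Path(target_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # The archive name contains a space, so it is passed as a single path
    # object rather than built from a shell command line.
    with zipfile.ZipFile(archive_path) as zf:
        names = zf.namelist()
        zf.extractall(out_dir)
    return names
```

Calling extract_dataset() from the repository root unpacks the archive next to the code; pass explicit paths if the archive lives elsewhere.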

The public split protocol is category-level and is described in:

data/split_protocol.md

No per-image train/validation/test split CSV files are included in this release.
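Because only a category-level protocol is published, per-image partitions have to be derived locally. The sketch below assumes that a split designates a set of held-out categories and that every image record carries its category name; both are assumptions, and data/split_protocol.md remains the authoritative definition. The helper partition_by_category is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def partition_by_category(records: Iterable[Tuple[str, str]],
                          held_out: Set[str]) -> Dict[str, List[str]]:
    """Group image paths into 'seen' and 'unseen' by their category.

    records:  (image_path, category) pairs, e.g. read from the annotations.
    held_out: categories that the chosen split reserves as unseen
              (assumed meaning of "category-level"; check the protocol file).
    """
    partitions = defaultdict(list)
    for image_path, category in records:
        key = "unseen" if category in held_out else "seen"
        partitions[key].append(image_path)
    # Always return both keys, even if one partition is empty.
    return {"seen": partitions["seen"], "unseen": partitions["unseen"]}
```

Assigning whole categories rather than individual images keeps every image of a held-out category out of the seen partition.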

Label Order

Quality labels:

healthy, rotten, moldy, bruised, cracked

Maturity labels:

unripe, ripe, overripe

The models package exports the same orders as QUALITY_CLASSES and MATURITY_CLASSES, so a predicted class index maps to the label at that position in the list.
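As an illustration of how the label order is used, the sketch below turns predicted class indices into label names. To keep it self-contained it defines local copies of the two lists; in the release the same names can be imported from models. The helper decode_prediction is hypothetical and not part of the released code.

```python
# Local copies of the label orders listed above so the sketch runs standalone.
# In the release, import QUALITY_CLASSES and MATURITY_CLASSES from `models`.
QUALITY_CLASSES = ["healthy", "rotten", "moldy", "bruised", "cracked"]
MATURITY_CLASSES = ["unripe", "ripe", "overripe"]


def decode_prediction(quality_id: int, maturity_id: int) -> dict:
    """Map zero-based class indices to their label names (hypothetical helper)."""
    if not 0 <= quality_id < len(QUALITY_CLASSES):
        raise IndexError(f"quality_id out of range: {quality_id}")
    if not 0 <= maturity_id < len(MATURITY_CLASSES):
        raise IndexError(f"maturity_id out of range: {maturity_id}")
    return {
        "quality": QUALITY_CLASSES[quality_id],
        "maturity": MATURITY_CLASSES[maturity_id],
    }
```

For example, decode_prediction(1, 2) returns the labels "rotten" and "overripe".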

Checkpoints

The release covers four category splits. Checkpoints for each model follow these file patterns, where * stands for the split identifier:

| File pattern | Model |
| --- | --- |
| `checkpoints/OVFruitQG_split*.pt` | OVFruitQG / V3.1 |
| `checkpoints/ResNet_split*.pt` | Supervised ResNet50 |
| `checkpoints/ViT_split*.pt` | Supervised ViT-B/16 |
| `checkpoints/ResNet_LDB_fast_split*.pt` | V4.5 / ResNet LDB fast |

See checkpoints/checkpoint_manifest.csv for SHA256 hashes.
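A downloaded checkpoint can be checked against the manifest with hashlib. The sketch below is a minimal example under stated assumptions: the manifest column names "filename" and "sha256" are guesses, so adjust name_col and hash_col to the actual header of checkpoint_manifest.csv. Both helper functions are hypothetical.

```python
import csv
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_manifest(manifest_csv, root=".",
                            name_col: str = "filename",
                            hash_col: str = "sha256") -> Dict[str, bool]:
    """Compare each listed file with its recorded hash.

    The column names are assumptions about the manifest layout.
    Returns {file name: True if the file exists and its hash matches}.
    """
    results = {}
    with open(manifest_csv, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            file_path = Path(root) / row[name_col]
            expected = row[hash_col].strip().lower()
            results[row[name_col]] = (
                file_path.is_file() and sha256_of_file(file_path) == expected
            )
    return results
```

A False entry means the file is missing or its contents differ from the recorded hash, for example after an interrupted download or when only the Git LFS pointer file was fetched.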

Load a Checkpoint

import torch
from PIL import Image
from torchvision import transforms

from models import build_model_for_checkpoint, load_checkpoint_into_model

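# Build the architecture that matches the checkpoint, then load its weights.
checkpoint = "checkpoints/ResNet_split1.pt"
model = build_model_for_checkpoint(checkpoint)
load_checkpoint_into_model(model, checkpoint)
model.eval()  # inference mode: disables dropout and batch-norm updates

# Resize to 224x224 and normalize with the standard ImageNet mean and std.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Load one cropped image and add a batch dimension: shape (1, 3, 224, 224).
image = Image.open("path/to/crop.jpg").convert("RGB")
pixel_values = preprocess(image).unsqueeze(0)

with torch.no_grad():
    outputs = model(pixel_values)
    # Predicted indices follow QUALITY_CLASSES and MATURITY_CLASSES.
    quality_id = outputs["quality_logits"].argmax(dim=-1).item()
    maturity_id = outputs["maturity_logits"].argmax(dim=-1).item()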

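For the main OVFruitQG model (V3.1), build the model explicitly and then load the weights:

from models import build_model, load_checkpoint_into_model

model = build_model(
    "v3_1",
    model_version="v3_1",
    freeze_backbone=True,
    allow_backbone_fallback=False,
)
load_checkpoint_into_model(model, "checkpoints/OVFruitQG_split1.pt")
model.eval()  # switch to inference mode before predicting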

For V4.5 (ResNet LDB fast):

from models import build_model_for_checkpoint, load_checkpoint_into_model

checkpoint = "checkpoints/ResNet_LDB_fast_split1.pt"
model = build_model_for_checkpoint(checkpoint)
load_checkpoint_into_model(model, checkpoint)
model.eval()  # switch to inference mode before predicting

Results

Paper result tables are stored in results/, including:

Table 5: main recognition results
Table 6: prompt retrieval and matching
Table 7: seen/unseen generalization
Table 8: ablation study
Table 9: efficiency analysis
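The tables can be inspected programmatically with the standard csv module. The sketch below assumes each CSV file starts with a header row; the individual file names inside results/ are not listed here, so the code discovers them with a glob instead of hard-coding them. Both helpers are hypothetical.

```python
import csv
from pathlib import Path
from typing import Dict, List


def list_result_tables(results_dir: str = "results") -> List[Path]:
    """Return the CSV files found in the results directory, sorted by name."""
    return sorted(Path(results_dir).glob("*.csv"))


def load_table(csv_path) -> List[Dict[str, str]]:
    """Read one table into a list of row dictionaries keyed by the header row.

    Assumes the first line of the file is a header; values stay as strings.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Iterating over list_result_tables() and passing each path to load_table() gives one list of rows per paper table; convert numeric columns explicitly before comparing values.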

License and Third-Party Models

The code is released under the MIT license. Third-party foundation models such as CLIP, OpenCLIP, and Grounding DINO are not redistributed unless their files are explicitly present in this folder. Obtain them from their official sources and comply with their original licenses.
