JUSTCSL-OphthaVision-3B
Qwen2.5-VL-3B-Instruct fine-tuned with LoRA for 5-class diabetic retinopathy (DR) grading. The LoRA adapter has been merged into the base weights, so the checkpoint loads as a standalone model.
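For readers unfamiliar with what "merged" means here, the following is a minimal sketch of the standard LoRA merge, W' = W + (alpha / r) * B A, on a toy matrix. The card does not state the rank, alpha, or target modules used, so every shape and value below is hypothetical, and the helper names `matmul` and `merge_lora` are illustrative only. In practice a merge like this is typically done with PEFT's `merge_and_unload`; whether the author used it is not stated.

```python
# Toy illustration of merging a LoRA update into a dense weight matrix.
# All shapes and values are hypothetical; none are taken from this model.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(w, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * B @ A, the merged weight."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # (out, r) @ (r, in) -> (out, in)
    return [[wv + scale * dv for wv, dv in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Base weight W (2x3) and low-rank factors A (1x3), B (2x1), so rank = 1.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]
B = [[0.5], [-1.0]]

merged = merge_lora(W, A, B, alpha=2.0, rank=1)
print(merged)
```

After merging, inference needs only the single matrix per layer, which is why no adapter files or PEFT dependency are required to load the checkpoint.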
Metrics (Test Set)
| Metric | Value |
|---|---|
| Accuracy | 78.1% |
| Weighted F1 | 0.74 |
| Macro F1 | 0.45 |
| No DR F1 | 0.955 |
| Moderate F1 | 0.708 |
| Avg Latency (GPU) | 1.16s |
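
The gap between weighted F1 (0.74) and macro F1 (0.45) can be unpacked with a little arithmetic. Assuming macro F1 is the unweighted mean of the five per-class F1 scores (the usual definition, though the card does not spell it out), the two reported per-class values imply the average F1 of the three unreported classes (Mild, Severe, Proliferative DR). This is a back-of-the-envelope sketch, not a figure from the author:

```python
# Infer the mean F1 of the three classes the table does not report,
# assuming macro F1 = unweighted mean over all 5 classes.
NUM_CLASSES = 5
macro_f1 = 0.45
reported = {"No DR": 0.955, "Moderate": 0.708}

remaining_sum = macro_f1 * NUM_CLASSES - sum(reported.values())
remaining_mean = remaining_sum / (NUM_CLASSES - len(reported))
print(f"Implied mean F1 of unreported classes: {remaining_mean:.2f}")
```

Under that assumption the unreported classes average roughly 0.19 to 0.20 F1 once rounding of the macro score is taken into account, which suggests the headline accuracy is driven mainly by the No DR and Moderate classes.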
Usage
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
import torch
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "ottokevin/JUSTCSL-OphthaVision-3B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("ottokevin/JUSTCSL-OphthaVision-3B")
# Prepare image + prompt
from PIL import Image
image = Image.open("retina.png").convert("RGB")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Classify this retina image: 0=No DR, 1=Mild, 2=Moderate, 3=Severe, 4=Proliferative DR. Answer:"},
        ],
    },
]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], padding=True, return_tensors="pt").to(model.device)
generated = model.generate(**inputs, max_new_tokens=8, do_sample=False)
response = processor.tokenizer.decode(generated[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
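
The model returns free text, so downstream code usually needs to turn `response` into a grade. The sketch below assumes the reply contains a standalone digit from 0 to 4, matching the prompt above; the card does not document the exact output format. `parse_grade` and `DR_GRADES` are hypothetical helpers, not part of the model or of Transformers.

```python
import re

# Label names copied from the prompt used in the usage example.
DR_GRADES = {0: "No DR", 1: "Mild", 2: "Moderate", 3: "Severe", 4: "Proliferative DR"}

def parse_grade(response):
    """Return (grade, label) for the first standalone digit 0-4, or None."""
    match = re.search(r"\b[0-4]\b", response)
    if match is None:
        return None  # No usable grade in the reply; let the caller decide.
    grade = int(match.group())
    return grade, DR_GRADES[grade]

print(parse_grade("2"))
print(parse_grade("unclear"))
```

Returning `None` rather than a default grade keeps unparseable replies visible, which matters when the output feeds any kind of screening workflow.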
Model tree for ottokevin/JUSTCSL-OphthaVision-3B
Base model
Qwen/Qwen2.5-VL-3B-Instruct