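How to use MrEngineer/florence-2-vqa-lora with PEFT:
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Florence-2 ships its modeling code in the model repository, so loading it
# with Transformers 4.42 requires trust_remote_code=True.
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Florence-2-base", trust_remote_code=True
)
# Attach the LoRA adapter weights to the base model.
model = PeftModel.from_pretrained(base_model, "MrEngineer/florence-2-vqa-lora")
```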
Generative AI Radiology VLM (Florence-2)
This model is a LoRA adapter (parameter-efficient fine-tuning, PEFT) for Microsoft's Florence-2-base. It was trained on the VQA-RAD dataset to work as a generative vision-language model that answers free-form text questions about medical X-ray images.
Model Details
- Architecture: Vision Encoder + Text Decoder (Florence-2)
- Task: Medical Visual Question Answering (VQA)
- Fine-Tuning Technique: Low-Rank Adaptation (LoRA)
- Target Modules: q_proj, v_proj, o_proj
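For illustration, the adapter settings listed above could be expressed as a PEFT `LoraConfig`. Only the target modules come from this card; the rank, scaling factor, and dropout below are assumed placeholder values, not the ones used to train this adapter.

```python
from peft import LoraConfig

# Sketch of a LoRA configuration matching the modules named in this card.
lora_config = LoraConfig(
    r=8,                 # ASSUMPTION: rank is not stated in the model card
    lora_alpha=16,       # ASSUMPTION: scaling factor is not stated
    lora_dropout=0.05,   # ASSUMPTION: dropout is not stated
    bias="none",         # leave bias terms frozen
    # From the card: adapt the query, value, and output projections.
    target_modules=["q_proj", "v_proj", "o_proj"],
)
```

A config like this would typically be passed to `peft.get_peft_model` together with the base model before training, so that only the low-rank matrices injected into the listed projections are updated.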
Training Results
The model was fine-tuned for 3 epochs on an NVIDIA A100-40GB GPU using mixed precision (fp16). The training loss decreased steadily over the run, indicating that the model was adapting to the dataset's anatomical vocabulary and answer style.
Local Web UI (Gradio)
The repository includes an app.py script that loads these LoRA adapters and launches a local Gradio web UI for inference.
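The contents of app.py are not reproduced here, so the following is only a rough sketch of how such a script might be put together. The function names, the `<VQA>` prompt prefix, and the generation settings are all hypothetical; in particular, the prompt format used during fine-tuning is not stated in this card and may differ.

```python
BASE_ID = "microsoft/Florence-2-base"
ADAPTER_ID = "MrEngineer/florence-2-vqa-lora"
# ASSUMPTION: the prompt prefix used during fine-tuning is not documented.
TASK_PREFIX = "<VQA>"


def build_prompt(question: str, task_prefix: str = TASK_PREFIX) -> str:
    """Prepend the (assumed) task prefix to a whitespace-trimmed question."""
    return f"{task_prefix}{question.strip()}"


def load_model():
    """Load the base model, attach the LoRA adapter, return (model, processor)."""
    # Heavy imports are deferred so importing this module stays cheap.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoProcessor

    base = AutoModelForCausalLM.from_pretrained(BASE_ID, trust_remote_code=True)
    model = PeftModel.from_pretrained(base, ADAPTER_ID).eval()
    processor = AutoProcessor.from_pretrained(BASE_ID, trust_remote_code=True)
    return model, processor


def make_answer_fn(model, processor, max_new_tokens: int = 64):
    """Return a function mapping (PIL image, question) to an answer string."""
    import torch

    def answer(image, question):
        inputs = processor(
            text=build_prompt(question),
            images=image.convert("RGB"),
            return_tensors="pt",
        )
        with torch.no_grad():
            ids = model.generate(
                input_ids=inputs["input_ids"],
                pixel_values=inputs["pixel_values"],
                max_new_tokens=max_new_tokens,  # ASSUMPTION: arbitrary cap
            )
        return processor.batch_decode(ids, skip_special_tokens=True)[0].strip()

    return answer


def main():
    """Build the Gradio interface and start the local server."""
    import gradio as gr

    model, processor = load_model()
    demo = gr.Interface(
        fn=make_answer_fn(model, processor),
        inputs=[
            gr.Image(type="pil", label="Radiology image"),
            gr.Textbox(label="Question"),
        ],
        outputs=gr.Textbox(label="Answer"),
        title="Florence-2 VQA-RAD (LoRA)",
    )
    demo.launch()
```

Calling `main()` (for example from an `if __name__ == "__main__":` guard when saving this as a script) would download the base model and adapter on first use and then serve the UI on Gradio's default local address. Keeping prompt construction in its own function makes it easy to swap in the real prompt format once it is known.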
Framework versions
- PEFT 0.11.1
- Transformers 4.42.4
Base model
- microsoft/Florence-2-base
