DiffusionGemma-Medical-VQA-Finetune
Repository: gevaertlab/diffusiongemma-radiology-vqa (LoRA adapters, loadable with PEFT).
This repository contains LoRA finetunes of DiffusionGemma (an image-conditioned discrete-diffusion LLM) for radiology visual question answering (VQA). Each one is paired with an autoregressive Gemma-4 finetune that serves as a controlled baseline. The repository accompanies the paper "Discrete Diffusion Language Models for Interactive Radiology Report Drafting".
The datasets (VQA-RAD, SLAKE, VQA-Med) cover mixed modalities (X-ray, CT, MRI) and anatomy (head, chest, abdomen). Each subfolder holds the checkpoint with the best LLM-judge accuracy for its backbone/dataset cell.
Code: https://github.com/mxvp/discrete_diffusion_RRG
| subfolder | backbone | base model | dataset | LLM-judge acc |
|---|---|---|---|---|
| diffusion-vqarad | discrete-diffusion | google/diffusiongemma-26B-A4B-it | VQA-RAD | 0.649 |
| ar-vqarad | autoregressive | google/gemma-4-26B-A4B-it | VQA-RAD | 0.649 |
| diffusion-slake | discrete-diffusion | google/diffusiongemma-26B-A4B-it | SLAKE | 0.863 |
| ar-slake | autoregressive | google/gemma-4-26B-A4B-it | SLAKE | 0.817 |
| diffusion-vqamed | discrete-diffusion | google/diffusiongemma-26B-A4B-it | VQA-Med | 0.666 |
| ar-vqamed | autoregressive | google/gemma-4-26B-A4B-it | VQA-Med | 0.631 |
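The subfolder layout in the table can be used to pick the matching base model before attaching an adapter. The following is a minimal sketch, not the authors' loading code. The helper names `base_model_id` and `attach_adapter` are hypothetical, and it assumes each subfolder stores a standard PEFT adapter. This card does not say how to instantiate the base models, so the sketch expects an already-loaded base model from the caller.

```python
from typing import Any

REPO_ID = "gevaertlab/diffusiongemma-radiology-vqa"

DIFFUSION_BASE = "google/diffusiongemma-26B-A4B-it"
AR_BASE = "google/gemma-4-26B-A4B-it"

# Subfolder -> base model, copied from the table above.
BASE_MODEL_FOR_SUBFOLDER = {
    "diffusion-vqarad": DIFFUSION_BASE,
    "ar-vqarad": AR_BASE,
    "diffusion-slake": DIFFUSION_BASE,
    "ar-slake": AR_BASE,
    "diffusion-vqamed": DIFFUSION_BASE,
    "ar-vqamed": AR_BASE,
}


def base_model_id(subfolder: str) -> str:
    """Return the base model id that an adapter subfolder was trained on."""
    try:
        return BASE_MODEL_FOR_SUBFOLDER[subfolder]
    except KeyError:
        known = ", ".join(sorted(BASE_MODEL_FOR_SUBFOLDER))
        raise ValueError(
            f"Unknown subfolder {subfolder!r}; expected one of: {known}"
        ) from None


def attach_adapter(base_model: Any, subfolder: str) -> Any:
    """Attach the LoRA adapter stored in `subfolder` to a loaded base model.

    Assumption: `base_model` was loaded by the caller from
    `base_model_id(subfolder)`; how to load it is outside this sketch.
    """
    base_model_id(subfolder)  # validate the subfolder name first
    from peft import PeftModel  # imported lazily; requires `pip install peft`

    return PeftModel.from_pretrained(base_model, REPO_ID, subfolder=subfolder)
```

For example, `attach_adapter(base, "diffusion-slake")` would apply the SLAKE diffusion adapter, where `base` is a model the caller has already loaded from `base_model_id("diffusion-slake")`.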