DiffusionGemma finetunes for radiology VQA

This repository contains LoRA finetunes of DiffusionGemma, an image-conditioned discrete-diffusion LLM, for radiology visual question answering (VQA). Each finetune is paired with an autoregressive Gemma-4 finetune that serves as a controlled baseline. The repository accompanies the paper "Discrete Diffusion Language Models for Interactive Radiology Report Drafting".

The three datasets (VQA-RAD, SLAKE, VQA-Med) cover mixed modalities (X-ray, CT, MRI) and anatomy (head, chest, abdomen). Each table row reports the checkpoint that scored best under the LLM judge for that backbone and dataset.

Code: https://github.com/mxvp/discrete_diffusion_RRG
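To fetch a single adapter rather than the whole repository, something like the following sketch may work. It assumes that each subfolder name listed in this card is a top-level directory in the Hub repository `gevaertlab/diffusiongemma-radiology-vqa` holding that adapter's files; this layout is an assumption, not something the card states. The helper names `adapter_patterns` and `download_adapter` are hypothetical. Loading the adapter onto its base model is left to the code linked above.

```python
import os

REPO_ID = "gevaertlab/diffusiongemma-radiology-vqa"

# Subfolder names as listed in this card's results table.
SUBFOLDERS = (
    "diffusion-vqarad", "ar-vqarad",
    "diffusion-slake", "ar-slake",
    "diffusion-vqamed", "ar-vqamed",
)


def adapter_patterns(subfolder: str) -> list[str]:
    """Return glob patterns that select only the files of one adapter."""
    if subfolder not in SUBFOLDERS:
        raise ValueError(f"unknown subfolder {subfolder!r}; expected one of {SUBFOLDERS}")
    # Assumption: adapter files live under a directory named after the subfolder.
    return [f"{subfolder}/*"]


def download_adapter(subfolder: str) -> str:
    """Download one adapter subfolder and return its local path."""
    # Imported lazily so adapter_patterns works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    local_root = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=adapter_patterns(subfolder),
    )
    return os.path.join(local_root, subfolder)
```

For example, `download_adapter("diffusion-slake")` would return the local directory of the SLAKE diffusion adapter, provided the assumed layout holds.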

| subfolder | backbone | base model | dataset | LLM-judge accuracy |
|---|---|---|---|---|
| diffusion-vqarad | discrete-diffusion | google/diffusiongemma-26B-A4B-it | VQA-RAD | 0.649 |
| ar-vqarad | autoregressive | google/gemma-4-26B-A4B-it | VQA-RAD | 0.649 |
| diffusion-slake | discrete-diffusion | google/diffusiongemma-26B-A4B-it | SLAKE | 0.863 |
| ar-slake | autoregressive | google/gemma-4-26B-A4B-it | SLAKE | 0.817 |
| diffusion-vqamed | discrete-diffusion | google/diffusiongemma-26B-A4B-it | VQA-Med | 0.666 |
| ar-vqamed | autoregressive | google/gemma-4-26B-A4B-it | VQA-Med | 0.631 |
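
To compare the two backbones, the accuracies in the table above can be turned into per-dataset gaps. The short sketch below only does arithmetic on the reported numbers; the names `RESULTS` and `accuracy_gaps` are illustrative and not part of the released code.

```python
# (dataset, diffusion accuracy, autoregressive accuracy), copied from the table above.
RESULTS = [
    ("VQA-RAD", 0.649, 0.649),
    ("SLAKE", 0.863, 0.817),
    ("VQA-Med", 0.666, 0.631),
]


def accuracy_gaps(results):
    """Return {dataset: diffusion minus autoregressive accuracy}, rounded to 3 decimals."""
    return {name: round(diff - ar, 3) for name, diff, ar in results}


for name, gap in accuracy_gaps(RESULTS).items():
    print(f"{name}: {gap:+.3f}")
# VQA-RAD: +0.000
# SLAKE: +0.046
# VQA-Med: +0.035
```

On these figures the diffusion finetune ties the autoregressive baseline on VQA-RAD and leads by 0.046 on SLAKE and 0.035 on VQA-Med.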