LAPVQA - Radiology Report Generation (Native / End-to-end)

Part of the LAPVQA collection.

Description

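Radiology report generation (RRG) decoders trained end-to-end alongside their vision encoders, so each encoder listed below is fine-tuned rather than frozen. Each checkpoint is a Python dict with the keys {state_dict, vis_dim, d_model, num_layers, nhead, encoder, epoch, val_bleu4}.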
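Before building a head from a checkpoint, it can help to confirm that the dict carries every key listed in the description. The following is a minimal sketch and not part of the lapvqa package: `REQUIRED_KEYS` and `missing_keys` are hypothetical names, and the example values are made up for illustration. The function works on any plain dict, so it can be applied to the result of `torch.load`.

```python
# Hypothetical helper, not part of lapvqa: checks a loaded checkpoint dict
# against the key list given in the description.
REQUIRED_KEYS = (
    "state_dict", "vis_dim", "d_model", "num_layers",
    "nhead", "encoder", "epoch", "val_bleu4",
)


def missing_keys(ckpt):
    """Return the documented keys absent from ckpt, in documented order."""
    return [key for key in REQUIRED_KEYS if key not in ckpt]


# Illustrative stand-in for a loaded checkpoint; all values are made up.
example_ckpt = {
    "state_dict": {},
    "vis_dim": 1024,
    "d_model": 512,
    "num_layers": 6,
    "nhead": 8,
}
print(missing_keys(example_ckpt))
```

An empty list means the checkpoint has the documented layout; anything else names the fields to investigate before calling the constructor.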

| File | Encoder | vis_dim |
|---|---|---|
| clip-vit-l14.pt | CLIP ViT-L/14 (fine-tuned) | 1024 |
| siglip.pt | SigLIP (fine-tuned) | 1152 |
| florence2.pt | Florence-2 (fine-tuned) | 1024 |
| coca.pt | CoCa (fine-tuned) | 768 |
| mae-vit-l16.pt | MAE ViT-L/16 (fine-tuned) | 1024 |
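The table can double as a sanity check at load time: the `vis_dim` stored in a checkpoint should match the value listed for its file. A minimal sketch, assuming the files keep the names shown above; `EXPECTED_VIS_DIM` and `check_vis_dim` are hypothetical names and not part of lapvqa.

```python
from pathlib import Path

# Values copied from the table above; the helper itself is hypothetical.
EXPECTED_VIS_DIM = {
    "clip-vit-l14.pt": 1024,
    "siglip.pt": 1152,
    "florence2.pt": 1024,
    "coca.pt": 768,
    "mae-vit-l16.pt": 1024,
}


def check_vis_dim(path, ckpt):
    """Raise ValueError if ckpt['vis_dim'] disagrees with the table entry."""
    name = Path(path).name  # ignore any directory prefix
    if name not in EXPECTED_VIS_DIM:
        raise ValueError(f"unknown checkpoint file: {name}")
    expected = EXPECTED_VIS_DIM[name]
    actual = ckpt["vis_dim"]
    if actual != expected:
        raise ValueError(f"{name}: vis_dim is {actual}, expected {expected}")
    return expected
```

Calling `check_vis_dim("mae-vit-l16.pt", ckpt)` right after loading would surface a renamed or mismatched file before the head is constructed.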

Results (MIMIC-CXR test set, MAE-ViT-L/16)

| BLEU-4 | ROUGE-L | RadGraph-s |
|---|---|---|
| 0.032 | 0.164 | 0.195 |

Loading

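import torch
from lapvqa.rrg.heads import ReportGenerationHead

# Load the checkpoint dict onto the CPU; it holds the weights and hyperparameters.
ckpt = torch.load("mae-vit-l16.pt", map_location="cpu")

# Rebuild the head from the hyperparameters stored in the checkpoint.
head = ReportGenerationHead(
    vis_dim    = ckpt["vis_dim"],
    d_model    = ckpt["d_model"],
    num_layers = ckpt["num_layers"],
    nhead      = ckpt["nhead"],
)

# Restore the trained weights and switch to evaluation mode for inference.
head.load_state_dict(ckpt["state_dict"])
head.eval()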
