Part of the LAPVQA collection: chest X-ray models comprising pre-trained encoders and task heads for VQA, DiffVQA, RRG, detection, and grounding on MIMIC-CXR.
A DiffVQA head trained on top of the frozen LAPVQA captioning-pretrained encoder (`lapvqa-pretrain-captioning`). The checkpoint is a plain `DiffVQAHead` state dict (`vis_dim=1024`).
| BLEU-4 | ROUGE-2 | RadGraph-s | BERTScore F1 |
|---|---|---|---|
| 0.468 | 0.562 | 0.303 | 0.938 |
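For readers unfamiliar with the overlap metrics above, here is a minimal sketch of how a ROUGE-2 F1 score can be computed for a single answer pair. It is a simplified illustration that assumes lowercased whitespace tokenization and no stemming; this card does not state which implementation produced the reported numbers, so the helper names `bigrams` and `rouge2_f1` are hypothetical and this is not the evaluation script behind the table.

```python
from collections import Counter


def bigrams(text):
    """Return a multiset of adjacent token pairs (lowercased, whitespace-split)."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge2_f1(candidate, reference):
    """Bigram-overlap F1 between a generated answer and a reference answer."""
    cand, ref = bigrams(candidate), bigrams(reference)
    # Counter intersection keeps the minimum count per bigram (clipped matches).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Toy strings for illustration only; they are not taken from the dataset.
reference = "the left pleural effusion has increased"
candidate = "the left pleural effusion has decreased"
score = rouge2_f1(candidate, reference)
```

In this toy pair four of the five bigrams match, so the score is about 0.8 even though the final word reverses the finding. That is one reason overlap metrics are usually read alongside clinically oriented scores such as the RadGraph column.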
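```python
import torch
from lapvqa.diffvqa.model import DiffVQAHead

# Load the checkpoint on CPU; it is a plain state dict, so no unwrapping is needed.
ckpt = torch.load("pretrain-captioning_best.pt", map_location="cpu")

# Build the head with the vis_dim this checkpoint was trained with (1024).
head = DiffVQAHead(vis_dim=1024)
head.load_state_dict(ckpt)
head.eval()  # switch to evaluation mode for inference

# Pair with encoder_final.pt from lapvqa-pretrain-captioning.
```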