AdaRAG-CT

Beyond the Embedding Bottleneck: Adaptive Retrieval-Augmented 3D CT Report Generation (ECCV 2026)

📄 arXiv · 💻 GitHub · 🤗 Models & Data

Contrastive 3D CT embeddings concentrate 90% of their variance in just 2 of 512 dimensions, and scaling the LLM from 8B to 70B gives no gain: the bottleneck is visual, not generative. AdaRAG-CT compensates by retrieving organ-indexed report sentences and adaptively injecting them during generation, lifting Clinical F1 from 0.420 (CT-Agent) to 0.480.
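One way to check a variance-concentration claim like the one above on any embedding matrix is to look at the PCA spectrum. The following is a minimal sketch, not the authors' analysis code: the function names are hypothetical, and synthetic data stands in for real CT embeddings of shape (n_samples, 512).

```python
import numpy as np

def explained_variance_ratio(embeddings: np.ndarray) -> np.ndarray:
    """Fraction of total variance along each principal axis, in descending order."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Squared singular values of the centered matrix are proportional to PCA variances.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

def dims_for_variance(embeddings: np.ndarray, threshold: float = 0.90) -> int:
    """Smallest number of principal components whose cumulative variance reaches threshold."""
    cumulative = np.cumsum(explained_variance_ratio(embeddings))
    count = int(np.searchsorted(cumulative, threshold)) + 1
    # Guard against floating-point round-off when threshold is (almost) 1.0.
    return min(count, cumulative.size)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in (assumption): two dominant directions plus small noise in 512 dims.
    scales = np.full(512, 0.01)
    scales[:2] = 1.0
    fake_embeddings = rng.standard_normal((1000, 512)) * scales
    print("components needed for 90% variance:", dims_for_variance(fake_embeddings))
```

To run this on real data, replace `fake_embeddings` with an array loaded from the released embeddings; the on-disk format is not described here, so the loading step is left out.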

Results (CT-RATE validation)

| Model | Params | Clin-F1 | BLEU-4 | ROUGE-L | LLaMA score |
|---|---|---|---|---|---|
| CT-CHAT (repro.) | 8B | 0.224 | 0.188 | 0.303 | 6.73 |
| CT-CHAT (repro.) | 70B | 0.161 | 0.182 | 0.321 | 6.02 |
| BTB3D | 8B | 0.258 | 0.213 | - | - |
| CT-Agent | - | 0.420 | 0.231 | 0.490 | - |
| Base (ViSD-Boost + CT-CLIP) | 8B | 0.455 | 0.205 | 0.315 | 7.30 |
| AdaRAG-CT | 8B | 0.480 | 0.242 | 0.354 | 7.75 |
| Base (ViSD-Boost + CT-CLIP) | 70B | 0.405 | 0.213 | 0.334 | 7.10 |
| AdaRAG-CT | 70B | 0.426 | 0.250 | 0.361 | 7.53 |

CT-CHAT is reproduced under our unified evaluation protocol. Base (ViSD-Boost + CT-CLIP) is our base model: it takes organ-level ViSD-Boost embeddings plus whole-volume CT-CLIP embeddings as visual input.
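For readers comparing rows, the snippet below works out the absolute and relative Clinical F1 gains. It is plain arithmetic on the Clin-F1 values reported in the results table; the helper name `gains` is illustrative only.

```python
def gains(baseline: float, improved: float) -> tuple[float, float]:
    """Return (absolute gain, relative gain in percent)."""
    absolute = improved - baseline
    return absolute, 100.0 * absolute / baseline

# Clinical F1 values copied from the results table.
comparisons = {
    "AdaRAG-CT 8B vs CT-Agent": (0.420, 0.480),
    "AdaRAG-CT 8B vs Base 8B": (0.455, 0.480),
    "AdaRAG-CT 70B vs Base 70B": (0.405, 0.426),
}

for name, (baseline, improved) in comparisons.items():
    absolute, relative = gains(baseline, improved)
    print(f"{name}: +{absolute:.3f} absolute, +{relative:.1f}% relative")
```

The three comparisons come out to roughly +14.3%, +5.5%, and +5.2% relative Clinical F1, respectively.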

Install

conda create -n adaragct python=3.12 && conda activate adaragct
pip install -r requirements.txt

Data & Checkpoints

All weights and retrieval data are on 🤗 HuggingFace. There are four checkpoints: Base 8B/70B (full) and AdaRAG-CT 8B/70B (LoRA adapter + projector, loaded on top of the matching base).

After downloading, place the data/ folder and the checkpoint folders under results/ at the repo root, matching the paths in the commands below, e.g. results/base_8b/checkpoint, results/adaragct_8b/checkpoint_step_2000. (The released predictions/metrics/logs under results/ already come with this repo.)
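A quick way to confirm the layout before running anything is a small path check. This is a hypothetical helper, not part of the repo; it covers only the directories named in this README, so extend the list for the 70B checkpoints as needed.

```python
from pathlib import Path

# Directories named in this README; the 70B equivalents are not listed here.
EXPECTED_DIRS = [
    "data",
    "results/base_8b/checkpoint",
    "results/adaragct_8b/checkpoint_step_2000",
]

def missing_paths(repo_root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_dir()]

if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing directories:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Layout looks complete.")
```

Run it from the repo root; an empty result means the example commands in this README should find their inputs.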

Evaluate

Score the released predictions directly:

python -m adaragct.eval.cal_metrics results/adaragct_8b/infer_step_2000.jsonl --output metrics.json

Add --compute-llama-score to also compute the LLaMA score, and --bootstrap 1000 to report 95% confidence intervals.
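How cal_metrics implements --bootstrap is not described here. As a rough illustration of what a 95% bootstrap interval means, the sketch below applies a percentile bootstrap to the mean of per-case scores. This is an assumption for illustration: the real script may resample cases and recompute an aggregate metric instead, and the function name is hypothetical.

```python
import numpy as np

def bootstrap_ci(per_case_scores, n_resamples=1000, confidence=0.95, seed=0):
    """Percentile bootstrap: return (mean, lower bound, upper bound)."""
    scores = np.asarray(per_case_scores, dtype=float)
    rng = np.random.default_rng(seed)
    n = scores.size
    # Each row holds the indices of one resample of n cases drawn with replacement.
    indices = rng.integers(0, n, size=(n_resamples, n))
    resampled_means = scores[indices].mean(axis=1)
    alpha = (1.0 - confidence) / 2.0
    lower, upper = np.quantile(resampled_means, [alpha, 1.0 - alpha])
    return float(scores.mean()), float(lower), float(upper)

if __name__ == "__main__":
    # Toy per-case scores (assumption); real scores would come from the metric script.
    toy_scores = np.random.default_rng(42).uniform(0.2, 0.8, size=300)
    mean, lower, upper = bootstrap_ci(toy_scores, n_resamples=1000)
    print(f"mean={mean:.3f}, 95% CI=[{lower:.3f}, {upper:.3f}]")
```

With 1000 resamples, the 2.5th and 97.5th percentiles of the resampled statistic bound the interval; more resamples mainly reduce noise in those two endpoints.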

Or generate predictions, then score:

# AdaRAG-CT (adaptive retrieval)
python -m adaragct.inference.inference_rag --checkpoint results/adaragct_8b/checkpoint_step_2000 --output pred.jsonl

# Base / no-retrieval (same checkpoint, retrieval disabled)
python -m adaragct.inference.inference_rag --checkpoint results/adaragct_8b/checkpoint_step_2000 --no-rag --output base_pred.jsonl

python -m adaragct.eval.cal_metrics pred.jsonl --output metrics.json

Key inference flags: --no-rag (disable retrieval), --text2text (Text2Text retrieval pipeline), --oracle (oracle context), --top-k, --max-retrievals.

On-the-fly retrieval encodes each generated probe sentence with a fine-tuned text encoder. The code and indices ship under data/retrieval/, and microsoft/BiomedVLP-CXR-BERT-specialized is downloaded automatically on first run. For a download-free run, use --oracle (precomputed contexts). To reproduce the exact paper numbers, score the released predictions in results/.
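Conceptually, the retrieval step ranks indexed report sentences by similarity to the encoded probe sentence and keeps the top k. The sketch below shows that ranking with cosine similarity in NumPy. It is an illustration under assumptions: the function name is hypothetical, the similarity measure and index format used by the repo are not stated here, and toy vectors replace the real text-encoder outputs.

```python
import numpy as np

def top_k_sentences(query_vec, index_vecs, sentences, k=3):
    """Return the k (sentence, cosine similarity) pairs most similar to query_vec."""
    query = query_vec / np.linalg.norm(query_vec)
    index = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    similarities = index @ query
    k = min(k, len(sentences))
    # Negate so argsort orders from most to least similar.
    order = np.argsort(-similarities)[:k]
    return [(sentences[i], float(similarities[i])) for i in order]

if __name__ == "__main__":
    # Toy 3-d "embeddings" (assumption); a real index would hold encoder outputs.
    sentences = [
        "No pleural effusion.",
        "Mild cardiomegaly.",
        "Liver is unremarkable.",
    ]
    index_vecs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
    probe_vec = np.array([0.9, 0.3, 0.0])
    for sentence, score in top_k_sentences(probe_vec, index_vecs, sentences, k=2):
        print(f"{score:.3f}  {sentence}")
```

An organ-indexed store could keep one such (vectors, sentences) pair per organ and call the function on the pair for the organ being described.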

Train

AdaRAG-CT trains a [RAG] trigger token on top of a frozen base model via LoRA, mixing oracle and retrieved contexts. Every required input (base checkpoint, CT embeddings, and the precomputed oracle/retrieval contexts) is provided in the HuggingFace bundle, so training runs directly with no extra preprocessing:
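# The exact scheme for mixing oracle and retrieved contexts is set by the training code and configs and is not spelled out in this README. One simple possibility, shown purely as an assumption, is to pick the oracle context for each training example with a fixed probability; `choose_context` and `p_oracle` are hypothetical names.

```python
import random

def choose_context(oracle_context, retrieved_context, p_oracle, rng):
    """Return the oracle context with probability p_oracle, else the retrieved one."""
    return oracle_context if rng.random() < p_oracle else retrieved_context

if __name__ == "__main__":
    rng = random.Random(0)
    oracle = ["oracle context sentence"]        # placeholder content
    retrieved = ["retrieved context sentence"]  # placeholder content
    picks = [choose_context(oracle, retrieved, 0.5, rng) for _ in range(1000)]
    share = sum(pick is oracle for pick in picks) / len(picks)
    print(f"share of oracle contexts: {share:.2f}")
```

# Exposing the model to both kinds of context during training is one way to keep it from relying on contexts that are cleaner than what the retriever returns at inference time.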

# 8B
python -m adaragct.train.train_rag --config configs/adaragct_8b.yaml
# 70B
python -m adaragct.train.train_rag --config configs/adaragct_70b.yaml

Each config points to data/ (embeddings, oracle_context_top3.jsonl, retrieval_context_top3.jsonl) and a base checkpoint under results/. base_8b.yaml is the 8B base-model reference config.

Citation

@misc{liang2026embeddingbottleneckadaptiveretrievalaugmented,
      title={Beyond the Embedding Bottleneck: Adaptive Retrieval-Augmented 3D CT Report Generation},
      author={Renjie Liang and Yiling Ma and Yang Xing and Zhengkang Fan and Jinqian Pan and Chengkun Sun and Li Li and Kuang Gong and Jie Xu},
      year={2026},
      eprint={2603.15822},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.15822},
}

Acknowledgements

CT-RATE / CT-CLIP · ViSD-Boost · LLaVA · Self-RAG
