ReCaRe: Domain-Adapted Dense Retrievers (Table 4)

Fine-tuned dense retriever checkpoints for the ReCaRe benchmark (kasys/ReCaRe), reproducing the domain-adaptation results (Table 4) of the ReCaRe CIKM 2026 Resource paper. These let third parties reproduce the evaluation without re-running the (expensive) training phase.

Contents (20 checkpoints)

5 base models × 2 tasks (rat2rev, rev2rev) × 2 languages (en, ja), each in a subfolder named <model>_<task>_<lang>:

| Base model | Tuning | Saved weights | Per-ckpt size |
|---|---|---|---|
| mdpr (castorini/mdpr-tied-pft-msmarco) | full FT | model.safetensors | ~683 MB |
| mcontriever (facebook/mcontriever) | full FT | model.safetensors | ~683 MB |
| me5-base (intfloat/multilingual-e5-base) | full FT | model.safetensors | ~1.1 GB |
| bge-m3 (BAAI/bge-m3) | PEFT LoRA adapter | adapter_model.safetensors + adapter_config.json | ~49 MB |
| jina-v3 (jinaai/jina-embeddings-v3) | native task-LoRA fine-tune | model.safetensors (full custom model) | ~1.1 GB |
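
The naming scheme implies 5 × 2 × 2 = 20 subfolders. A minimal sketch that enumerates them, assuming the short model names in the table (mdpr, mcontriever, me5-base, bge-m3, jina-v3) are exactly the prefixes used in the repository. Only bge-m3_rat2rev_en is confirmed by the download example in this card, and checkpoint_names is a hypothetical helper, not part of the released code:

```python
from itertools import product

# Assumption: these short names match the subfolder prefixes in the repo.
MODELS = ["mdpr", "mcontriever", "me5-base", "bge-m3", "jina-v3"]
TASKS = ["rat2rev", "rev2rev"]
LANGS = ["en", "ja"]


def checkpoint_names():
    """Return every <model>_<task>_<lang> subfolder name."""
    return [f"{m}_{t}_{l}" for m, t, l in product(MODELS, TASKS, LANGS)]


names = checkpoint_names()
print(len(names), names[:2])
```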

Each subfolder also ships its tokenizer and a checkpoint_meta.json with the training hyperparameters (tuning_method, learning_rate, epochs, seed, temperature, output_alias, …).
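
To check which hyperparameters a checkpoint was trained with, the metadata file can be read directly. The sketch below assumes checkpoint_meta.json is a flat JSON object that uses the key names listed above. The value types are not documented here, so the code does not interpret them, and load_checkpoint_meta and summarize_meta are hypothetical helper names:

```python
import json
from pathlib import Path

# Keys named in this card; the real file may contain more.
DOCUMENTED_KEYS = ("tuning_method", "learning_rate", "epochs",
                   "seed", "temperature", "output_alias")


def load_checkpoint_meta(ckpt_dir):
    """Read checkpoint_meta.json from a checkpoint subfolder."""
    with open(Path(ckpt_dir) / "checkpoint_meta.json", encoding="utf-8") as f:
        return json.load(f)


def summarize_meta(meta, keys=DOCUMENTED_KEYS):
    """Keep only the documented keys that are actually present."""
    return {k: meta[k] for k in keys if k in meta}
```

For example, summarize_meta(load_checkpoint_meta(path_to_subfolder)) returns a small dict that is convenient to log next to evaluation results.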

The checkpoints come in two save formats, adapter-only and full model, depending on the tuning method (see checkpoint_meta.json → tuning_method):

  • bge-m3 uses a standard PEFT LoRA adapter, so only the adapter is saved (adapter_model.safetensors); load it on top of BAAI/bge-m3.
  • jina-v3 fine-tunes Jina's built-in task LoRA and is saved as a full custom model via save_pretrained() (model.safetensors).
  • mdpr / mcontriever / me5-base are full fine-tunes (model.safetensors).
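
A loader can tell the two formats apart from the files in the subfolder. The following is a sketch under stated assumptions, not the loading code used by kasys-lab/ReCaRe: detect_save_format and load_checkpoint are hypothetical helpers, the pooling and normalization used by each retriever are not handled, and the jina-v3 checkpoint is assumed to need trust_remote_code=True because the base model relies on custom modeling code.

```python
from pathlib import Path


def detect_save_format(ckpt_dir):
    """Return 'adapter' for a PEFT adapter folder, 'full' for a full model."""
    ckpt_dir = Path(ckpt_dir)
    if (ckpt_dir / "adapter_config.json").is_file():
        return "adapter"
    if (ckpt_dir / "model.safetensors").is_file():
        return "full"
    raise FileNotFoundError(f"No known weight files in {ckpt_dir}")


def load_checkpoint(ckpt_dir, base_model="BAAI/bge-m3", trust_remote_code=False):
    """Load a checkpoint with transformers, plus peft for adapter folders.

    Imports are deferred so detect_save_format works without these libraries.
    """
    from transformers import AutoModel

    if detect_save_format(ckpt_dir) == "adapter":
        from peft import PeftModel

        # Adapter-only checkpoint: load the base model, then attach the adapter.
        base = AutoModel.from_pretrained(base_model)
        return PeftModel.from_pretrained(base, str(ckpt_dir))

    # Full checkpoint: pass trust_remote_code=True for jina-v3 (assumption).
    return AutoModel.from_pretrained(str(ckpt_dir),
                                     trust_remote_code=trust_remote_code)
```

Checking for adapter_config.json first keeps the decision independent of the model name, so the same function covers all 20 subfolders.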

Reproduction

The released code repo kasys-lab/ReCaRe fetches these into the layout its evaluation expects (results/dense_finetune/<model>/<task>_<lang>/best) and runs Phase 3 of scripts/run_domain_adaptation.sh (encode adapted corpus → evaluate on test → aggregate), so you can skip Phase 2 (training).
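
The mapping from a subfolder name to that layout can be sketched as follows. This is only an illustration of the path convention: the repository's own scripts perform the actual fetch, target_dir is a hypothetical helper, and it assumes the <model> directory uses the same short name as the subfolder prefix.

```python
from pathlib import Path


def target_dir(name, root="results/dense_finetune"):
    """Map '<model>_<task>_<lang>' to '<root>/<model>/<task>_<lang>/best'."""
    # Split from the right: model names contain hyphens but the last two
    # underscore-separated fields are always the task and the language.
    model, task, lang = name.rsplit("_", 2)
    return Path(root) / model / f"{task}_{lang}" / "best"


print(target_dir("bge-m3_rat2rev_en").as_posix())
# results/dense_finetune/bge-m3/rat2rev_en/best
```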

Manual download of a single checkpoint:

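from pathlib import Path
from huggingface_hub import snapshot_download

# snapshot_download returns the local snapshot root, so append the subfolder.
root = snapshot_download("kasys/ReCaRe-domain-adaptation",
                         allow_patterns="bge-m3_rat2rev_en/*")
ckpt = Path(root) / "bge-m3_rat2rev_en"
# -> .../bge-m3_rat2rev_en/  (point run-finetuned-dense at it)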

License & citation

CC BY 4.0. Derived from the public base models above and the kasys/ReCaRe benchmark. Cite the ReCaRe resource paper and kasys/ReCaRe (DOI 10.57967/hf/8642).
