ReCaRe — Domain-Adapted Dense Retrievers (Table 4)

Fine-tuned dense retriever checkpoints for the ReCaRe benchmark (kasys/ReCaRe), reproducing the domain-adaptation results (Table 4) of the ReCaRe CIKM 2026 Resource paper. These let third parties reproduce the evaluation without re-running the (expensive) training phase.

Contents (20 checkpoints)

5 base models × 2 tasks (rat2rev, rev2rev) × 2 languages (en, ja), each in a subfolder named <model>_<task>_<lang>:

| Base model | Tuning | Saved weights | Per-checkpoint size |
| --- | --- | --- | --- |
| `mdpr` (`castorini/mdpr-tied-pft-msmarco`) | full FT | `model.safetensors` | ~683 MB |
| `mcontriever` (`facebook/mcontriever`) | full FT | `model.safetensors` | ~683 MB |
| `me5-base` (`intfloat/multilingual-e5-base`) | full FT | `model.safetensors` | ~1.1 GB |
| `bge-m3` (`BAAI/bge-m3`) | PEFT LoRA adapter | `adapter_model.safetensors` + `adapter_config.json` | ~49 MB |
| `jina-v3` (`jinaai/jina-embeddings-v3`) | native task-LoRA fine-tune | `model.safetensors` (full custom model) | ~1.1 GB |

Each subfolder also ships its tokenizer and a checkpoint_meta.json with the training hyperparameters (tuning_method, learning_rate, epochs, seed, temperature, output_alias, …).
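
As a rough illustration of the naming scheme, the sketch below enumerates the 20 subfolder names from the model aliases, tasks, and languages listed above, and reads a checkpoint's checkpoint_meta.json. The helper names checkpoint_names and read_meta are hypothetical and are not part of the released code; the sketch also assumes the subfolder was already downloaded to a local directory.

```python
import json
from itertools import product
from pathlib import Path

# Aliases, tasks, and languages as listed in this model card.
MODELS = ["mdpr", "mcontriever", "me5-base", "bge-m3", "jina-v3"]
TASKS = ["rat2rev", "rev2rev"]
LANGS = ["en", "ja"]


def checkpoint_names():
    """Hypothetical helper: build every <model>_<task>_<lang> subfolder name."""
    return [f"{m}_{t}_{l}" for m, t, l in product(MODELS, TASKS, LANGS)]


def read_meta(ckpt_dir):
    """Hypothetical helper: load checkpoint_meta.json from a local checkpoint folder."""
    with open(Path(ckpt_dir) / "checkpoint_meta.json", encoding="utf-8") as f:
        return json.load(f)
```

For example, read_meta(path).get("tuning_method") would return the tuning method recorded for that checkpoint, if the key is present.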

Two save formats (see checkpoint_meta.json → tuning_method):

- Adapter only: bge-m3 uses a standard PEFT LoRA adapter, so only the adapter is saved (adapter_model.safetensors plus adapter_config.json); load it on top of BAAI/bge-m3.
- Full model (model.safetensors): mdpr, mcontriever, and me5-base are full fine-tunes. jina-v3 fine-tunes Jina's built-in task LoRA and is saved as a full custom model via save_pretrained().
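
The difference between the save formats can be sketched as follows. This is a minimal illustration, not the loader used by the released code: detect_save_format and load_encoder are hypothetical names, the format is inferred from which files are present rather than from checkpoint_meta.json, and it is an assumption that the adapter was trained on top of a plain AutoModel and that the jina-v3 checkpoint needs trust_remote_code=True.

```python
from pathlib import Path


def detect_save_format(ckpt_dir):
    """Hypothetical helper: 'adapter' for a PEFT adapter folder, 'full' otherwise."""
    ckpt_dir = Path(ckpt_dir)
    if (ckpt_dir / "adapter_config.json").is_file():
        return "adapter"
    if (ckpt_dir / "model.safetensors").is_file():
        return "full"
    raise FileNotFoundError(f"No recognised weights in {ckpt_dir}")


def load_encoder(ckpt_dir, base_model="BAAI/bge-m3", trust_remote_code=False):
    """Hypothetical helper: return (tokenizer, model) for a local checkpoint folder."""
    # Imported lazily so that format detection works without these packages.
    from transformers import AutoModel, AutoTokenizer

    ckpt_dir = str(ckpt_dir)
    tokenizer = AutoTokenizer.from_pretrained(
        ckpt_dir, trust_remote_code=trust_remote_code
    )
    if detect_save_format(ckpt_dir) == "adapter":
        from peft import PeftModel

        # Assumption: the LoRA adapter attaches to the plain base encoder.
        base = AutoModel.from_pretrained(base_model)
        model = PeftModel.from_pretrained(base, ckpt_dir)
    else:
        # Assumption: pass trust_remote_code=True for the jina-v3 custom model.
        model = AutoModel.from_pretrained(
            ckpt_dir, trust_remote_code=trust_remote_code
        )
    return tokenizer, model.eval()
```

Pooling and normalisation differ between base models and are not covered here; the evaluation scripts in the code repo remain the reference for reproducing Table 4.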

Reproduction

The released code repo kasys-lab/ReCaRe fetches these into the layout its evaluation expects (results/dense_finetune/<model>/<task>_<lang>/best) and runs Phase 3 of scripts/run_domain_adaptation.sh (encode adapted corpus → evaluate on test → aggregate), so you can skip Phase 2 (training).
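
To make the layout concrete, the sketch below maps a subfolder name from this repository to the directory the evaluation expects. The function eval_path is a hypothetical illustration, not the repo's own fetch logic; it relies on the observation that none of the model aliases or task names listed above contain an underscore.

```python
from pathlib import Path


def eval_path(subfolder, root="results/dense_finetune"):
    """Hypothetical helper: <model>_<task>_<lang> -> <root>/<model>/<task>_<lang>/best."""
    # Split from the right so hyphenated aliases such as me5-base stay intact.
    model, task, lang = subfolder.rsplit("_", 2)
    return Path(root) / model / f"{task}_{lang}" / "best"


print(eval_path("bge-m3_rat2rev_en").as_posix())
# results/dense_finetune/bge-m3/rat2rev_en/best
```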

Manual download of a single checkpoint:

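from pathlib import Path
from huggingface_hub import snapshot_download
# snapshot_download returns the snapshot root, so append the subfolder name.
root = snapshot_download("kasys/ReCaRe-domain-adaptation",
                         allow_patterns="bge-m3_rat2rev_en/*")
ckpt = Path(root) / "bge-m3_rat2rev_en"
# -> .../bge-m3_rat2rev_en/  (point run-finetuned-dense at it)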

License & citation

CC BY 4.0. Derived from the public base models above and the kasys/ReCaRe benchmark. Cite the ReCaRe resource paper and kasys/ReCaRe (DOI 10.57967/hf/8642).
