Article • Training and Finetuning Embedding Models with Sentence Transformers v3 • 5 days ago • 59
🦢 SWIM-IR Dataset Collection: 29 million Synthetic Wikipedia-based Multilingual Retrieval Training Pairs. • 4 items • Updated Apr 28 • 6
Leveraging LLMs for Synthesizing Training Data Across Many Languages in Multilingual Dense Retrieval Paper • 2311.05800 • Published Nov 10, 2023 • 2
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models Paper • 2104.08663 • Published Apr 17, 2021 • 3