A DPR model trained for NeuCLIR. It is based on an XLMR-Large language model pretrained with C3 and fine-tuned with multilingual translate-train (MTT), using MS-MARCO English queries paired with documents translated into Chinese, Persian, and Russian.
The translations are available as neuMARCO on ir-datasets.
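As a rough illustration of how a DPR-style bi-encoder ranks documents across languages, the sketch below scores precomputed document vectors against a query vector by inner product. The vectors, document IDs, and the `rank_documents` helper are all hypothetical placeholders, not output of this model, and the assumption that this checkpoint scores by plain inner product is not confirmed by the card.

```python
from typing import Dict, List, Tuple


def dot(u: List[float], v: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_documents(
    query_vec: List[float], doc_vecs: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Return (doc_id, score) pairs sorted by descending inner product."""
    scored = [(doc_id, dot(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for encoder outputs (assumption: in practice the
# English query and the Chinese, Persian, and Russian documents would all be
# embedded by the same multilingual encoder into one shared space).
query_vec = [1.0, 0.0, 1.0]
doc_vecs = {
    "zh_doc": [0.5, 0.0, 0.5],
    "fa_doc": [1.0, 1.0, 1.0],
    "ru_doc": [0.0, 2.0, 0.0],
}

ranking = rank_documents(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(doc_id, score)
```

Because all documents share one vector space, a single ranked list can mix languages without a separate merging step.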
Please cite the following papers if you use this model:
@inproceedings{sigir2022c3,
author = {Eugene Yang and Suraj Nair and Ramraj Chandradevan and Rebecca Iglesias-Flores and Douglas W. Oard},
title = {C3: Continued Pretraining with Contrastive Weak Supervision for Cross Language Ad-Hoc Retrieval},
booktitle = {Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) (Short Paper)},
year = {2022},
url = {https://arxiv.org/abs/2204.11989}
}
@inproceedings{ecir2023mlir,
title = {Neural Approaches to Multilingual Information Retrieval},
author = {Dawn Lawrie and Eugene Yang and Douglas W. Oard and James Mayfield},
booktitle = {Proceedings of the 45th European Conference on Information Retrieval (ECIR)},
year = {2023},
url = {https://arxiv.org/abs/2209.01335}
}