A DPR model trained for NeuCLIR. It is based on an XLMR-Large language model with C3 continued pretraining, and was trained with Multilingual Translate-Train (MTT) on MS MARCO English queries paired with documents translated into Chinese, Persian, and Russian. The translations are available as neuMARCO in ir-datasets.
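
To browse the translated training passages, a minimal sketch along the following lines may help. It assumes the `ir_datasets` Python package is installed and that the translations are exposed under the identifiers `neumarco/zh`, `neumarco/fa`, and `neumarco/ru`. The helper names `neumarco_id` and `peek_docs` are hypothetical and are not part of this model or of `ir_datasets`.

```python
import itertools

# Language codes assumed to match the neuMARCO subsets in ir_datasets.
NEUMARCO_LANGS = {"zh": "Chinese", "fa": "Persian", "ru": "Russian"}


def neumarco_id(lang, split="train"):
    """Build an ir_datasets identifier such as 'neumarco/zh/train' (hypothetical helper)."""
    if lang not in NEUMARCO_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return f"neumarco/{lang}/{split}"


def peek_docs(lang, n=3):
    """Return (doc_id, text prefix) for the first n translated passages.

    Imports ir_datasets lazily; the first call downloads the collection.
    """
    import ir_datasets  # third-party package, assumed installed

    dataset = ir_datasets.load(neumarco_id(lang))
    docs = itertools.islice(dataset.docs_iter(), n)
    return [(doc.doc_id, doc.text[:80]) for doc in docs]


if __name__ == "__main__":
    # Print the identifiers only; no download is triggered here.
    for code, name in NEUMARCO_LANGS.items():
        print(name, neumarco_id(code))
```

Calling `peek_docs("zh")` would fetch the Chinese collection on first use, so it is left out of the `__main__` block above.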

Please cite the following papers if you use this model:

    @inproceedings{yang2022c3,
        author = {Eugene Yang and Suraj Nair and Ramraj Chandradevan and Rebecca Iglesias-Flores and Douglas W. Oard},
        title = {C3: Continued Pretraining with Contrastive Weak Supervision for Cross Language Ad-Hoc Retrieval},
        booktitle = {Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) (Short Paper)},
        year = {2022},
        url = {https://arxiv.org/abs/2204.11989}
    }

    @inproceedings{lawrie2023neural,
        author = {Dawn Lawrie and Eugene Yang and Douglas W. Oard and James Mayfield},
        title = {Neural Approaches to Multilingual Information Retrieval},
        booktitle = {Proceedings of the 45th European Conference on Information Retrieval (ECIR)},
        year = {2023},
        url = {https://arxiv.org/abs/2209.01335}
    }
