An XLM-RoBERTa-base model trained on the mMARCO Japanese dataset with the ANCE warmup script. The base checkpoint is k-ush/xlm-roberta-base-ance-warmup, so this model was trained on both English and Japanese data. I uploaded the checkpoint at 50k steps because MRR@100 had decreased by the 60k checkpoint (MRR@100 rerank/full: 0.242/0.182).
I formatted the Japanese mMARCO dataset for ANCE. The dataset preparation script is available on GitHub: https://github.com/argonism/JANCE/blob/master/data/gen_jp_data.py
Evaluation results during training on the mMARCO Japanese dev set:
- Reranking MRR: 0.2421
- Full ranking MRR: 0.1902
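A minimal usage sketch with the Transformers library. It assumes the standard ANCE convention of using the `[CLS]` (first-token) embedding with dot-product scoring; the model ID below is the base checkpoint named above and should be replaced with this repository's ID:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative model ID (the base checkpoint); swap in this model's repo ID.
model_name = "k-ush/xlm-roberta-base-ance-warmup"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

def encode(texts):
    # Tokenize a batch of texts and take the [CLS] embedding as the dense vector.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    return out.last_hidden_state[:, 0]

query_emb = encode(["日本の首都はどこですか"])
doc_embs = encode(["東京は日本の首都です", "大阪は日本の都市です"])

# Dot-product relevance scores: shape (num_queries, num_docs).
scores = query_emb @ doc_embs.T
```

In a full-ranking setup the document vectors would be pre-encoded and indexed (e.g. with FAISS) rather than scored on the fly as shown here.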