---
datasets:
- Tevatron/msmarco-passage
---

Trained with Tevatron `reranker` branch; script:

```
epoch=3
bs=32
gradient_accumulation_steps=8
real_bs=$(( $bs / $gradient_accumulation_steps ))

CUDA_VISIBLE_DEVICES=0 python examples/reranker/reranker_train.py \
  --output_dir reranker_xlmr.bs-$bs.epoch-$epoch \
  --model_name_or_path xlm-roberta-large \
  --save_steps 20000 \
  --dataset_name Tevatron/msmarco-passage \
  --fp16 \
  --per_device_train_batch_size $real_bs \
  --gradient_accumulation_steps $gradient_accumulation_steps \
  --train_n_passages 8 \
  --learning_rate 5e-6 \
  --q_max_len 16 \
  --p_max_len 128 \
  --num_train_epochs $epoch \
  --logging_steps 500 \
  --dataloader_num_workers 4 \
  --overwrite_output_dir
```
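
The batch-size arithmetic in the script above can be checked with a minimal sketch. The values of `bs`, `gradient_accumulation_steps`, and `train_n_passages` are copied from the script. The names `n_gpus`, `effective_bs`, and `passages_per_step` are hypothetical and introduced only for illustration. The sketch assumes a single GPU (as suggested by `CUDA_VISIBLE_DEVICES=0`) and assumes that `--train_n_passages` is the number of passages grouped with each query.

```shell
#!/usr/bin/env bash
# Values copied from the training script.
bs=32
gradient_accumulation_steps=8
train_n_passages=8

# Assumption: one visible GPU, as suggested by CUDA_VISIBLE_DEVICES=0.
n_gpus=1

# Per-device batch size, computed the same way as in the training script.
real_bs=$(( bs / gradient_accumulation_steps ))

# Queries contributing to each optimizer update.
effective_bs=$(( real_bs * gradient_accumulation_steps * n_gpus ))

# Assumption: each query is grouped with train_n_passages passages,
# so this is the number of query-passage pairs in one forward pass.
passages_per_step=$(( real_bs * train_n_passages ))

echo "per_device_train_batch_size=$real_bs"
echo "effective_batch_size=$effective_bs"
echo "passages_per_forward_pass=$passages_per_step"
```

Bash arithmetic expansion uses integer division, so `bs` should be a multiple of `gradient_accumulation_steps`. Otherwise the effective batch size ends up smaller than the `bs` value recorded in the output directory name.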