---
datasets:
- Tevatron/msmarco-passage
---

Trained with the `reranker` branch of Tevatron.

Training script:

```
epoch=3
bs=32                          # effective batch size per optimizer step (single GPU)
gradient_accumulation_steps=8
real_bs=$(( $bs / $gradient_accumulation_steps ))  # per-device batch size: 32 / 8 = 4

CUDA_VISIBLE_DEVICES=0 python examples/reranker/reranker_train.py \
  --output_dir reranker_xlmr.bs-$bs.epoch-$epoch \
  --model_name_or_path xlm-roberta-large \
  --save_steps 20000 \
  --dataset_name Tevatron/msmarco-passage \
  --fp16 \
  --per_device_train_batch_size $real_bs \
  --gradient_accumulation_steps $gradient_accumulation_steps \
  --train_n_passages 8 \
  --learning_rate 5e-6 \
  --q_max_len 16 \
  --p_max_len 128 \
  --num_train_epochs $epoch \
  --logging_steps 500 \
  --dataloader_num_workers 4 \
  --overwrite_output_dir
```
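
The script derives the per-device batch size by integer division, so a `bs` that is not a multiple of `gradient_accumulation_steps` would be silently rounded down. The following is a minimal sketch of a guard that could sit before the training command. It reuses the variable names and values from the script; `effective_bs` is a name introduced here for illustration, and the check itself is an assumption, not part of the original training setup.

```bash
#!/usr/bin/env bash
# Sketch: derive the per-device batch size and fail fast on uneven division.
bs=32                          # target effective batch size (value from the script)
gradient_accumulation_steps=8  # value from the script

# Integer division truncates, so reject values that do not divide evenly.
if (( bs % gradient_accumulation_steps != 0 )); then
  echo "bs=$bs is not divisible by gradient_accumulation_steps=$gradient_accumulation_steps" >&2
  exit 1
fi

real_bs=$(( bs / gradient_accumulation_steps ))

# effective_bs is a hypothetical helper variable: the batch size seen by
# each optimizer step on a single GPU.
effective_bs=$(( real_bs * gradient_accumulation_steps ))

echo "per_device_train_batch_size=$real_bs effective_batch_size=$effective_bs"
```

With the values used here, each forward pass processes 4 training examples and gradients are accumulated over 8 passes, giving an effective batch size of 32 on one GPU.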