---
language:
- en
inference: false
---

# XLM-RoBERTa Base ANCE Warmup

This is an XLM-RoBERTa Base model trained with the ANCE warmup script. In the warmup script, `RobertaForSequenceClassification` is replaced with `XLMRobertaForSequenceClassification`. The model was trained for 60k steps. The training arguments are below:

```text
data_dir: ../data/raw_data/
train_model_type: rdot_nll
model_name_or_path: xlm-roberta-base
task_name: msmarco
output_dir:
config_name:
tokenizer_name:
cache_dir:
max_seq_length: 128
do_train: True
do_eval: False
evaluate_during_training: True
do_lower_case: False
log_dir: ../logs/
eval_type: full
optimizer: lamb
scheduler: linear
per_gpu_train_batch_size: 32
per_gpu_eval_batch_size: 32
gradient_accumulation_steps: 1
learning_rate: 0.0002
weight_decay: 0.0
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 2.0
max_steps: -1
warmup_steps: 1000
logging_steps: 1000
logging_steps_per_eval: 20
save_steps: 30000
eval_all_checkpoints: False
no_cuda: False
overwrite_output_dir: True
overwrite_cache: False
seed: 42
fp16: True
fp16_opt_level: O1
expected_train_size: 35000000
load_optimizer_scheduler: False
local_rank: 0
server_ip:
server_port:
n_gpu: 1
device: cuda:0
output_mode: classification
num_labels: 2
train_batch_size: 32
```

# Eval Result

```text
Reranking/Full ranking mrr: 0.27380855732933/0.24284821712830248
{"learning_rate": 0.00019460324719871943, "loss": 0.0895877162806064, "step": 60000}
```

# Usage

```python
from transformers import XLMRobertaForSequenceClassification, XLMRobertaTokenizer

repo = "k-ush/xlm-roberta-base-ance-warmup"
# Load the warmed-up checkpoint and its tokenizer from the Hugging Face Hub.
model = XLMRobertaForSequenceClassification.from_pretrained(repo)
tokenizer = XLMRobertaTokenizer.from_pretrained(repo)
```
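
Since the card only shows how to load the checkpoint, here is a minimal scoring sketch. It assumes the two-label classification head (`num_labels: 2` in the training args) scores query-passage relevance and that label index 1 corresponds to "relevant"; this labeling convention, along with the example query and passages, is an assumption for illustration, not something stated by the card.

```python
import torch
from transformers import XLMRobertaForSequenceClassification, XLMRobertaTokenizer

repo = "k-ush/xlm-roberta-base-ance-warmup"
model = XLMRobertaForSequenceClassification.from_pretrained(repo)
tokenizer = XLMRobertaTokenizer.from_pretrained(repo)
model.eval()

query = "what is the capital of france"
passages = [
    "Paris is the capital and most populous city of France.",
    "The Great Barrier Reef is the world's largest coral reef system.",
]

# Encode each (query, passage) pair as one sequence, matching the
# max_seq_length: 128 used during warmup training.
inputs = tokenizer(
    [query] * len(passages),
    passages,
    padding=True,
    truncation=True,
    max_length=128,
    return_tensors="pt",
)

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (num_passages, 2)

# Assumption: label index 1 is the "relevant" class, so its softmax
# probability can be used as a reranking score.
scores = logits.softmax(dim=-1)[:, 1]
for passage, score in sorted(zip(passages, scores.tolist()), key=lambda x: -x[1]):
    print(f"{score:.4f}  {passage}")
```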