mhqa-cross-encoder-reranker

This model is a fine-tuned version of xlm-roberta-base; the training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 0.0111
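
Since the card does not yet show how to run the model, here is a minimal reranking sketch. It rests on several assumptions that the card does not confirm: that the checkpoint loads with AutoModelForSequenceClassification, that it emits one relevance logit per (query, passage) pair, and that the repo id is Goodnight7/mhqa-cross-encoder-reranker. The helper names make_hf_scorer and rerank are illustrative only.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Assumed repo id (owner/name); adjust if the checkpoint lives elsewhere.
MODEL_ID = "Goodnight7/mhqa-cross-encoder-reranker"

ScoreFn = Callable[[str, Sequence[str]], List[float]]


def make_hf_scorer(model_id: str = MODEL_ID) -> ScoreFn:
    """Build a scorer backed by the checkpoint (downloads weights when called).

    Assumption: the head is a sequence-classification head whose first
    logit can be read as a relevance score.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    def score(query: str, passages: Sequence[str]) -> List[float]:
        # A cross-encoder reads the query and the passage as one paired input.
        batch = tokenizer(
            [query] * len(passages),
            list(passages),
            padding=True,
            truncation=True,
            max_length=512,
            return_tensors="pt",
        )
        with torch.no_grad():
            logits = model(**batch).logits  # shape: (n_passages, num_labels)
        return logits[:, 0].tolist()

    return score


def rerank(
    query: str,
    passages: Sequence[str],
    score_fn: ScoreFn,
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Sort passages by descending score; optionally keep only the top_k."""
    if not passages:
        return []
    scores = score_fn(query, passages)
    ranked = sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]
```

A call would look like rerank("who wrote Hamlet?", candidates, make_hf_scorer(), top_k=5). Keeping the scorer separate from the sorting logic means the ranking step can be checked with a stub scorer before any weights are downloaded.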

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 32
  • seed: 42
  • optimizer: fused PyTorch AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 2.0
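
The values listed above could be expressed as a Trainer configuration roughly as follows. This is a sketch, not the author's training script: the argument names follow the TrainingArguments API as known from Transformers 4.x and may differ in the version listed under Framework versions, the card's batch sizes are assumed to be per-device values, and the output directory name is a placeholder.

```python
# Hyperparameters copied from the card, keyed by the TrainingArguments
# names they are assumed to map to.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # card: train_batch_size
    "per_device_eval_batch_size": 32,   # card: eval_batch_size
    "seed": 42,
    "optim": "adamw_torch_fused",       # card: OptimizerNames.ADAMW_TORCH_FUSED
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 2.0,            # card: num_epochs
}


def build_training_args(output_dir: str = "mhqa-cross-encoder-reranker"):
    """Create TrainingArguments from the values above.

    The import is deferred so the dictionary can be inspected without
    transformers installed. output_dir is a placeholder name.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

The card does not list warmup, weight decay, or gradient accumulation, so those are left at the library defaults here.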

Training results

Training Loss | Epoch | Step | Validation Loss
0.0223 | 0.0474 | 250 | 0.0165
0.0237 | 0.0949 | 500 | 0.0162
0.0233 | 0.1423 | 750 | 0.0147
0.0236 | 0.1897 | 1000 | 0.0166
0.0176 | 0.2371 | 1250 | 0.0157
0.0153 | 0.2846 | 1500 | 0.0140
0.0230 | 0.3320 | 1750 | 0.0175
0.0174 | 0.3794 | 2000 | 0.0158
0.0153 | 0.4269 | 2250 | 0.0153
0.0187 | 0.4743 | 2500 | 0.0140
0.0157 | 0.5217 | 2750 | 0.0138
0.0203 | 0.5692 | 3000 | 0.0144
0.0154 | 0.6166 | 3250 | 0.0134
0.0158 | 0.6640 | 3500 | 0.0133
0.0192 | 0.7114 | 3750 | 0.0127
0.0212 | 0.7589 | 4000 | 0.0160
0.0143 | 0.8063 | 4250 | 0.0131
0.0113 | 0.8537 | 4500 | 0.0125
0.0140 | 0.9012 | 4750 | 0.0127
0.0129 | 0.9486 | 5000 | 0.0126
0.0163 | 0.9960 | 5250 | 0.0122
0.0148 | 1.0434 | 5500 | 0.0123
0.0136 | 1.0909 | 5750 | 0.0120
0.0140 | 1.1383 | 6000 | 0.0122
0.0153 | 1.1857 | 6250 | 0.0128
0.0135 | 1.2332 | 6500 | 0.0122
0.0139 | 1.2806 | 6750 | 0.0133
0.0147 | 1.3280 | 7000 | 0.0115
0.0133 | 1.3755 | 7250 | 0.0121
0.0110 | 1.4229 | 7500 | 0.0116
0.0130 | 1.4703 | 7750 | 0.0118
0.0138 | 1.5177 | 8000 | 0.0122
0.0119 | 1.5652 | 8250 | 0.0115
0.0089 | 1.6126 | 8500 | 0.0113
0.0110 | 1.6600 | 8750 | 0.0113
0.0138 | 1.7075 | 9000 | 0.0119
0.0140 | 1.7549 | 9250 | 0.0116
0.0116 | 1.8023 | 9500 | 0.0111
0.0132 | 1.8497 | 9750 | 0.0114
0.0119 | 1.8972 | 10000 | 0.0113
0.0131 | 1.9446 | 10250 | 0.0113
0.0124 | 1.9920 | 10500 | 0.0113
0.0124 | 2.0 | 10542 | 0.0113
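
The step and epoch columns above also allow a rough back-of-the-envelope reading of the run. The sketch below derives the steps per epoch, a range for the training-set size, and the learning rate implied by a linear schedule. It assumes a single device, no gradient accumulation, no dropped last batch, and zero warmup steps; none of these is stated in the card, and the function names are illustrative.

```python
TOTAL_STEPS = 10542      # final step in the results table
NUM_EPOCHS = 2.0         # num_epochs from the hyperparameters
TRAIN_BATCH_SIZE = 16    # train_batch_size from the hyperparameters
PEAK_LR = 2e-5           # learning_rate from the hyperparameters

STEPS_PER_EPOCH = TOTAL_STEPS / NUM_EPOCHS  # 5271 optimizer steps per epoch


def epoch_at(step: int) -> float:
    """Fractional epoch reached after a given optimizer step."""
    return step / STEPS_PER_EPOCH


def approx_train_examples() -> tuple:
    """Inclusive (min, max) training-set size consistent with the step count.

    Assumes one device, no gradient accumulation, and a final batch that
    may be partial but is not dropped.
    """
    n = int(STEPS_PER_EPOCH)
    return ((n - 1) * TRAIN_BATCH_SIZE + 1, n * TRAIN_BATCH_SIZE)


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = TOTAL_STEPS - step
    return PEAK_LR * max(0.0, remaining / max(1, TOTAL_STEPS - warmup_steps))
```

Under these assumptions, epoch_at(250) rounds to the 0.0474 shown in the first row, and the training set would hold between 84,321 and 84,336 examples. The lowest validation loss in the table, 0.0111 at step 9500, matches the evaluation loss reported at the top of the card, while the final step reports 0.0113.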

Framework versions

  • Transformers 5.10.2
  • PyTorch 2.12.0+cu130
  • Datasets 5.0.0
  • Tokenizers 0.22.2

Model size: 0.3B params (Safetensors, tensor type F32)