mhqa-cross-encoder-reranker

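This model is a fine-tuned version of xlm-roberta-base. The training dataset is not specified (the auto-generated card records it as None). At the end of training (epoch 2.0, step 10542) it reaches a validation loss of 0.0113 on the evaluation set.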

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 32
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 2.0
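
As a rough illustration of what lr_scheduler_type: linear implies for the learning rate, the sketch below decays the peak rate of 2e-05 linearly to zero over the 10542 optimizer steps reported in the training log. The card lists no warmup, so zero warmup steps is an assumption here, and the function name linear_lr is hypothetical rather than part of any library.

```python
PEAK_LR = 2e-05      # learning_rate from the card
TOTAL_STEPS = 10542  # final step in the training log
WARMUP_STEPS = 0     # assumption: the card does not list any warmup


def linear_lr(step: int,
              peak_lr: float = PEAK_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr during warmup (unused when warmup_steps is 0).
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 2500, 5271, 10000, 10542):
        print(f"step {step:>5}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate is half its peak at the end of the first epoch (step 5271) and reaches zero at the final step.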

Training Loss	Epoch	Step	Validation Loss
0.0223	0.0474	250	0.0165
0.0237	0.0949	500	0.0162
0.0233	0.1423	750	0.0147
0.0236	0.1897	1000	0.0166
0.0176	0.2371	1250	0.0157
0.0153	0.2846	1500	0.0140
0.0230	0.3320	1750	0.0175
0.0174	0.3794	2000	0.0158
0.0153	0.4269	2250	0.0153
0.0187	0.4743	2500	0.0140
0.0157	0.5217	2750	0.0138
0.0203	0.5692	3000	0.0144
0.0154	0.6166	3250	0.0134
0.0158	0.6640	3500	0.0133
0.0192	0.7114	3750	0.0127
0.0212	0.7589	4000	0.0160
0.0143	0.8063	4250	0.0131
0.0113	0.8537	4500	0.0125
0.0140	0.9012	4750	0.0127
0.0129	0.9486	5000	0.0126
0.0163	0.9960	5250	0.0122
0.0148	1.0434	5500	0.0123
0.0136	1.0909	5750	0.0120
0.0140	1.1383	6000	0.0122
0.0153	1.1857	6250	0.0128
0.0135	1.2332	6500	0.0122
0.0139	1.2806	6750	0.0133
0.0147	1.3280	7000	0.0115
0.0133	1.3755	7250	0.0121
0.0110	1.4229	7500	0.0116
0.0130	1.4703	7750	0.0118
0.0138	1.5177	8000	0.0122
0.0119	1.5652	8250	0.0115
0.0089	1.6126	8500	0.0113
0.0110	1.6600	8750	0.0113
0.0138	1.7075	9000	0.0119
0.0140	1.7549	9250	0.0116
0.0116	1.8023	9500	0.0111
0.0132	1.8497	9750	0.0114
0.0119	1.8972	10000	0.0113
0.0131	1.9446	10250	0.0113
0.0124	1.9920	10500	0.0113
0.0124	2.0	10542	0.0113
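
The Step and Epoch columns allow a back-of-the-envelope estimate of the training set size. The sketch below assumes a single device and no gradient accumulation, so that each optimizer step consumes at most train_batch_size = 16 examples. Neither assumption is confirmed by the card, and the variable names are illustrative.

```python
TOTAL_STEPS = 10542  # final step in the log
NUM_EPOCHS = 2       # num_epochs from the card
BATCH_SIZE = 16      # train_batch_size; assumed to be the effective batch size

steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# The last batch of an epoch may be partial, so the size is only bounded.
max_examples = steps_per_epoch * BATCH_SIZE
min_examples = (steps_per_epoch - 1) * BATCH_SIZE + 1

# Sanity check against the log: step 250 is reported at epoch 0.0474.
epoch_at_250 = round(250 / steps_per_epoch, 4)

print(f"steps per epoch: {steps_per_epoch}")                            # 5271
print(f"training examples: {min_examples} to {max_examples}")           # 84321 to 84336
print(f"epoch at step 250: {epoch_at_250}")                             # 0.0474
```

If training actually used several devices or gradient accumulation, the estimate scales up by the corresponding factor.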

Safetensors

Model size: 0.3B params
Tensor type: F32
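
The reported model size and tensor type give a rough figure for the size of the weights. The sketch below treats the rounded "0.3B params" value as exact, which is an assumption, so the result is only an approximation and covers the weights alone, not activations or optimizer state.

```python
PARAMS = 0.3e9       # "0.3B params" as reported; a rounded figure
BYTES_PER_PARAM = 4  # F32 stores each parameter in 32 bits = 4 bytes

weight_bytes = PARAMS * BYTES_PER_PARAM

print(f"~{weight_bytes / 1e9:.1f} GB of weights")
print(f"~{weight_bytes / 2**30:.2f} GiB of weights")
```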
