trainer_output

This model is a fine-tuned version of AlexeySorokin/ossbert-onc-unlab-from_multilingual-bs64-5epochs on an unknown dataset. Its results on the evaluation set are listed per epoch in the training results table below.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 25
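
The linear schedule listed above can be illustrated with a small sketch. It assumes no warmup (the card lists none), 546 optimizer steps per epoch (taken from the step column of the results table), and that the schedule was planned over the full 25 epochs. The function name `linear_lr` and the constants are illustrative, not part of the original training script.

```python
# Minimal sketch of a linear learning-rate decay, under the assumptions stated above.
BASE_LR = 5e-05          # learning_rate from the card
STEPS_PER_EPOCH = 546    # from the results table (step 546 at epoch 1.0)
NUM_EPOCHS = 25          # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
WARMUP_STEPS = 0         # assumption: the card does not mention warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Return the learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Linear ramp-up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 5, 10, 14, 25):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.3e}")
```

Under these assumptions, the learning rate at the last logged step (7644, epoch 14) would still be a little under half of its initial value, since the table ends before the planned 25 epochs.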

Training Loss	Epoch	Step	Validation Loss	Lemma accuracy	Sentence accuracy (lemmas)
0.7949	1.0	546	0.2852	95.1919	58.3486
0.2481	2.0	1092	0.1930	96.8156	68.0734
0.1518	3.0	1638	0.1619	97.1177	71.7431
0.1048	4.0	2184	0.1460	97.4701	73.9450
0.0777	5.0	2730	0.1206	97.9862	77.7982
0.0568	6.0	3276	0.1289	97.8855	77.2477
0.0440	7.0	3822	0.1267	98.0994	79.8165
0.0329	8.0	4368	0.1295	98.1498	79.6330
0.0240	9.0	4914	0.1255	98.1372	79.4495
0.0199	10.0	5460	0.1257	98.3638	81.2844
0.0114	11.0	6006	0.1293	98.3134	80.9174
0.0079	12.0	6552	0.1302	98.2756	80.1835
0.0056	13.0	7098	0.1335	98.2756	80.3670
0.0055	14.0	7644	0.1396	98.1750	78.8991
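
To read the table above more easily, the following sketch parses the logged rows and picks the epoch with the lowest validation loss and the epochs with the highest accuracies. The column keys (`val_loss`, `lemma_acc`, `sent_acc`) and the helper `parse_log` are illustrative names introduced here; the numbers are copied from the table.

```python
# Sketch: find the best epochs in the training log copied from the table above.
COLUMNS = ("train_loss", "epoch", "step", "val_loss", "lemma_acc", "sent_acc")

LOG = """\
0.7949 1.0 546 0.2852 95.1919 58.3486
0.2481 2.0 1092 0.1930 96.8156 68.0734
0.1518 3.0 1638 0.1619 97.1177 71.7431
0.1048 4.0 2184 0.1460 97.4701 73.9450
0.0777 5.0 2730 0.1206 97.9862 77.7982
0.0568 6.0 3276 0.1289 97.8855 77.2477
0.044 7.0 3822 0.1267 98.0994 79.8165
0.0329 8.0 4368 0.1295 98.1498 79.6330
0.024 9.0 4914 0.1255 98.1372 79.4495
0.0199 10.0 5460 0.1257 98.3638 81.2844
0.0114 11.0 6006 0.1293 98.3134 80.9174
0.0079 12.0 6552 0.1302 98.2756 80.1835
0.0056 13.0 7098 0.1335 98.2756 80.3670
0.0055 14.0 7644 0.1396 98.1750 78.8991
"""


def parse_log(text):
    """Turn whitespace-separated log rows into a list of dicts keyed by COLUMNS."""
    rows = []
    for line in text.strip().splitlines():
        values = [float(v) for v in line.split()]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


rows = parse_log(LOG)
best_loss = min(rows, key=lambda r: r["val_loss"])
best_lemma = max(rows, key=lambda r: r["lemma_acc"])
best_sent = max(rows, key=lambda r: r["sent_acc"])

if __name__ == "__main__":
    print("lowest validation loss:", best_loss["epoch"], best_loss["val_loss"])
    print("highest lemma accuracy:", best_lemma["epoch"], best_lemma["lemma_acc"])
    print("highest sentence accuracy:", best_sent["epoch"], best_sent["sent_acc"])
```

In this log the validation loss bottoms out at epoch 5 while both accuracy metrics peak at epoch 10, and the training loss keeps falling afterwards, which is a pattern often associated with overfitting. Which checkpoint was actually saved is not stated in the card.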

Model size: 0.2B params (Safetensors)
Tensor type: F32
