metadata

license: mit
base_model: ai-forever/ruElectra-medium
tags:
  - generated_from_trainer
metrics:
  - accuracy
  - recall
  - precision
  - f1
model-index:
  - name: training_results
    results: []

training_results

This model is a fine-tuned version of ai-forever/ruElectra-medium. The training dataset is not specified. It achieves the following results on the evaluation set at the last logged checkpoint (epoch 38, step 3800):

Loss: 2.6537
Accuracy: 0.6901
Recall: 0.6451
Precision: 0.6599
F1: 0.6390
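
The card does not say how recall, precision, and F1 were averaged across classes. The arithmetic check below is only a sketch: it compares the reported F1 with the harmonic mean of the reported precision and recall. A gap between the two is consistent with per-class averaging (for example macro averaging), but the averaging mode is an assumption here, not something the card confirms.

```python
# Values copied from the evaluation results above.
precision = 0.6599
recall = 0.6451
reported_f1 = 0.6390

# F1 of a single precision/recall pair is their harmonic mean.
harmonic_f1 = 2 * precision * recall / (precision + recall)

# A gap between the two hints that F1 was averaged over classes
# (e.g. macro averaging); this is an inference, not stated in the card.
gap = harmonic_f1 - reported_f1

print(f"harmonic mean of P and R: {harmonic_f1:.4f}")
print(f"reported F1:              {reported_f1:.4f}")
print(f"gap:                      {gap:.4f}")
```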

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
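
To make the schedule concrete, here is a minimal sketch of how a linear decay would play out with these settings. It assumes zero warmup steps and a decay to zero at the final step (neither is stated in the card), and it takes 100 optimizer steps per epoch from the step column of the results table. `linear_lr` is a hypothetical helper, not a library function.

```python
BASE_LR = 5e-05          # learning_rate from the list above
NUM_EPOCHS = 100         # num_epochs from the list above
STEPS_PER_EPOCH = 100    # read off the results table (step 100 at epoch 1.0)
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate after `step` optimizer steps under linear warmup and decay.

    Hypothetical helper; warmup_steps=0 is an assumption, not a card value.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to BASE_LR during warmup.
        return BASE_LR * step / max(1, warmup_steps)
    # Linear decay from BASE_LR down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


for step in (0, 500, 3800, TOTAL_STEPS):
    print(step, linear_lr(step))
```

Under these assumptions the rate at step 3800, the last row logged in the results table, would still be about 62% of its initial value.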

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	Recall	Precision	F1
No log	1.0	100	1.3590	0.5643	0.3617	0.3821	0.3270
No log	2.0	200	0.9903	0.6637	0.5263	0.5238	0.5058
No log	3.0	300	0.9370	0.6842	0.5254	0.5367	0.5185
No log	4.0	400	0.9366	0.7047	0.5982	0.5655	0.5675
0.9611	5.0	500	1.0894	0.6901	0.5707	0.5656	0.5529
0.9611	6.0	600	1.1565	0.7018	0.5834	0.5569	0.5649
0.9611	7.0	700	1.1471	0.7076	0.5887	0.5565	0.5687
0.9611	8.0	800	1.2477	0.7281	0.6326	0.7122	0.6341
0.9611	9.0	900	1.3606	0.7310	0.6556	0.7163	0.6484
0.1529	10.0	1000	1.7044	0.6725	0.6059	0.6230	0.5964
0.1529	11.0	1100	1.5851	0.7193	0.6600	0.6571	0.6548
0.1529	12.0	1200	1.7624	0.6959	0.6463	0.6714	0.6457
0.1529	13.0	1300	1.9156	0.6988	0.6312	0.6636	0.6360
0.1529	14.0	1400	1.8304	0.7251	0.6525	0.6899	0.6586
0.0417	15.0	1500	1.9549	0.7164	0.6442	0.6758	0.6485
0.0417	16.0	1600	1.9306	0.7398	0.6569	0.7047	0.6639
0.0417	17.0	1700	2.1130	0.6959	0.6591	0.6904	0.6556
0.0417	18.0	1800	1.9658	0.7368	0.6312	0.7479	0.6545
0.0417	19.0	1900	2.0108	0.7281	0.6497	0.7180	0.6605
0.0149	20.0	2000	2.0183	0.7368	0.6757	0.7038	0.6832
0.0149	21.0	2100	2.1543	0.7222	0.7085	0.6745	0.6824
0.0149	22.0	2200	1.9347	0.7485	0.6518	0.7867	0.6722
0.0149	23.0	2300	1.8752	0.7690	0.6852	0.7686	0.7024
0.0149	24.0	2400	2.0048	0.7544	0.6834	0.7379	0.6966
0.0111	25.0	2500	2.0534	0.7515	0.6635	0.7640	0.6841
0.0111	26.0	2600	2.0457	0.7368	0.6503	0.6918	0.6586
0.0111	27.0	2700	2.1561	0.7368	0.6657	0.6990	0.6678
0.0111	28.0	2800	2.1431	0.7398	0.6590	0.6734	0.6604
0.0111	29.0	2900	2.3783	0.7135	0.6544	0.6643	0.6509
0.0103	30.0	3000	2.3847	0.7251	0.6368	0.7351	0.6597
0.0103	31.0	3100	2.2030	0.7427	0.7017	0.7082	0.7023
0.0103	32.0	3200	2.4123	0.7368	0.6679	0.6974	0.6697
0.0103	33.0	3300	2.2644	0.7398	0.6760	0.7428	0.6902
0.0103	34.0	3400	2.3744	0.7339	0.6847	0.7080	0.6800
0.0135	35.0	3500	2.1573	0.7485	0.6933	0.6932	0.6867
0.0135	36.0	3600	2.1728	0.7515	0.6649	0.7606	0.6802
0.0135	37.0	3700	2.0993	0.7719	0.6859	0.7705	0.6972
0.0135	38.0	3800	2.6537	0.6901	0.6451	0.6599	0.6390
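
The headline numbers come from the last logged row rather than the best one, so it can help to scan the table for the strongest checkpoint under each metric. The sketch below does this over a small excerpt of rows copied from the table. The `Row` type and the `best_by` helper are hypothetical names introduced for illustration.

```python
from typing import Callable, List, NamedTuple


class Row(NamedTuple):
    epoch: int
    val_loss: float
    accuracy: float
    f1: float


# Excerpt of the table above (epochs 4, 23, 31, 37, 38), not the full log.
ROWS: List[Row] = [
    Row(4, 0.9366, 0.7047, 0.5675),
    Row(23, 1.8752, 0.7690, 0.7024),
    Row(31, 2.2030, 0.7427, 0.7023),
    Row(37, 2.0993, 0.7719, 0.6972),
    Row(38, 2.6537, 0.6901, 0.6390),
]


def best_by(rows: List[Row], key: Callable[[Row], float],
            higher_is_better: bool = True) -> Row:
    """Return the row with the best value of `key` (hypothetical helper)."""
    pick = max if higher_is_better else min
    return pick(rows, key=key)


lowest_loss = best_by(ROWS, lambda r: r.val_loss, higher_is_better=False)
best_accuracy = best_by(ROWS, lambda r: r.accuracy)
best_f1 = best_by(ROWS, lambda r: r.f1)

print("lowest validation loss: epoch", lowest_loss.epoch)
print("highest accuracy:       epoch", best_accuracy.epoch)
print("highest F1:             epoch", best_f1.epoch)
```

On the full table the same rows come out on top: validation loss is lowest at epoch 4, accuracy peaks at epoch 37, and F1 peaks at epoch 23.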

Framework versions

Transformers 4.34.0
PyTorch 2.1.0+cu121
Datasets 2.14.5
Tokenizers 0.14.1