ruRoberta-large_pos

This model is a fine-tuned version of ai-forever/ruRoberta-large. The training dataset is not specified. It achieves the following results on the evaluation set:

Loss: 0.5140
Precision: 0.5566
Recall: 0.5871
F1: 0.5714
Accuracy: 0.8981
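
As a quick consistency check on the figures above, the reported F1 can be recomputed from the reported precision and recall. This is a minimal sketch that assumes F1 here is the usual harmonic mean of the two; the card does not say how the metrics were computed, and the helper name `f1_score` is chosen only for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for the evaluation set in this card.
reported_precision = 0.5566
reported_recall = 0.5871

f1 = f1_score(reported_precision, reported_recall)
print(f"{f1:.4f}")
```

The recomputed value agrees with the reported F1 of 0.5714 to four decimal places. The zero guard covers the case where both precision and recall are 0.0.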

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
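
The settings above can be sketched as plain Python to show how the learning rate would evolve under a linear schedule. Several things here are assumptions rather than facts from the card: the dictionary keys use the keyword names of `transformers.TrainingArguments`, and the mapping from the card's field names to them is a guess; no warmup is assumed because none is listed; and the figure of 50 optimizer steps per epoch is read off the training results table. `linear_lr` is a hypothetical helper that mirrors the usual linear warmup/decay rule, not the library's own code.

```python
# Keyword names follow transformers.TrainingArguments (assumed mapping).
hyperparameters = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100,
}

# Assumption: the results table logs step 50 at epoch 1.0.
STEPS_PER_EPOCH = 50


def linear_lr(step: int, base_lr: float, total_steps: int, warmup_steps: int = 0) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = hyperparameters["num_train_epochs"] * STEPS_PER_EPOCH

# Learning rate at the start, at step 1750 (epoch 35), and at the planned end.
for step in (0, 1750, total_steps):
    lr = linear_lr(step, hyperparameters["learning_rate"], total_steps)
    print(step, lr)
```

Under these assumptions the schedule spans 5000 steps, so at step 1750 the learning rate would have decayed to 65% of its initial value.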

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
No log	1.0	50	0.6582	0.0	0.0	0.0	0.7628
No log	2.0	100	0.5705	0.0118	0.0173	0.0140	0.7783
No log	3.0	150	0.4784	0.0277	0.0501	0.0356	0.8028
No log	4.0	200	0.4043	0.0784	0.1329	0.0986	0.8323
No log	5.0	250	0.3553	0.1545	0.2697	0.1965	0.8523
No log	6.0	300	0.4051	0.2312	0.2601	0.2448	0.8692
No log	7.0	350	0.3351	0.3456	0.3796	0.3618	0.8901
No log	8.0	400	0.2774	0.3344	0.3911	0.3606	0.8974
No log	9.0	450	0.3010	0.3819	0.5048	0.4349	0.9022
0.3753	10.0	500	0.2892	0.4114	0.4875	0.4462	0.9051
0.3753	11.0	550	0.2773	0.3707	0.5222	0.4336	0.9076
0.3753	12.0	600	0.3447	0.4706	0.5549	0.5093	0.9076
0.3753	13.0	650	0.3312	0.4317	0.5356	0.4781	0.9073
0.3753	14.0	700	0.2870	0.4818	0.6378	0.5489	0.9132
0.3753	15.0	750	0.3944	0.4443	0.5992	0.5103	0.9024
0.3753	16.0	800	0.3599	0.4319	0.6416	0.5163	0.9018
0.3753	17.0	850	0.3568	0.4560	0.6397	0.5325	0.9042
0.3753	18.0	900	0.4296	0.4674	0.5241	0.4941	0.9106
0.3753	19.0	950	0.3939	0.4617	0.5453	0.5	0.9137
0.0842	20.0	1000	0.3882	0.5109	0.5434	0.5266	0.9066
0.0842	21.0	1050	0.3870	0.5311	0.6243	0.5740	0.9075
0.0842	22.0	1100	0.4163	0.4252	0.6628	0.5181	0.8925
0.0842	23.0	1150	0.4097	0.4577	0.5010	0.4784	0.9004
0.0842	24.0	1200	0.3709	0.5482	0.6031	0.5743	0.9161
0.0842	25.0	1250	0.3366	0.5088	0.6647	0.5764	0.9141
0.0842	26.0	1300	0.4558	0.6132	0.6108	0.6120	0.9171
0.0842	27.0	1350	0.4982	0.5720	0.5896	0.5806	0.9102
0.0842	28.0	1400	0.3998	0.5615	0.6513	0.6030	0.9178
0.0842	29.0	1450	0.5028	0.5620	0.6551	0.6050	0.9108
0.0476	30.0	1500	0.3672	0.5739	0.6435	0.6067	0.9117
0.0476	31.0	1550	0.4520	0.5330	0.6532	0.5870	0.9084
0.0476	32.0	1600	0.5027	0.5628	0.6127	0.5867	0.9101
0.0476	33.0	1650	0.4461	0.4581	0.6108	0.5235	0.9087
0.0476	34.0	1700	0.4407	0.4726	0.5992	0.5285	0.9070
0.0476	35.0	1750	0.4512	0.5211	0.5241	0.5226	0.9082
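
To read the table above more easily, the logged epochs can be ranked by validation loss and by F1. The sketch below copies the Epoch, Validation Loss, and F1 columns from the table; the variable names are illustrative, and the card does not say which checkpoint was actually kept.

```python
# (epoch, validation_loss, f1) copied from the training results table.
results = [
    (1, 0.6582, 0.0), (2, 0.5705, 0.0140), (3, 0.4784, 0.0356),
    (4, 0.4043, 0.0986), (5, 0.3553, 0.1965), (6, 0.4051, 0.2448),
    (7, 0.3351, 0.3618), (8, 0.2774, 0.3606), (9, 0.3010, 0.4349),
    (10, 0.2892, 0.4462), (11, 0.2773, 0.4336), (12, 0.3447, 0.5093),
    (13, 0.3312, 0.4781), (14, 0.2870, 0.5489), (15, 0.3944, 0.5103),
    (16, 0.3599, 0.5163), (17, 0.3568, 0.5325), (18, 0.4296, 0.4941),
    (19, 0.3939, 0.5), (20, 0.3882, 0.5266), (21, 0.3870, 0.5740),
    (22, 0.4163, 0.5181), (23, 0.4097, 0.4784), (24, 0.3709, 0.5743),
    (25, 0.3366, 0.5764), (26, 0.4558, 0.6120), (27, 0.4982, 0.5806),
    (28, 0.3998, 0.6030), (29, 0.5028, 0.6050), (30, 0.3672, 0.6067),
    (31, 0.4520, 0.5870), (32, 0.5027, 0.5867), (33, 0.4461, 0.5235),
    (34, 0.4407, 0.5285), (35, 0.4512, 0.5226),
]

# Epoch with the highest F1 and epoch with the lowest validation loss.
best_by_f1 = max(results, key=lambda row: row[2])
best_by_loss = min(results, key=lambda row: row[1])

print("best F1:", best_by_f1)
print("lowest validation loss:", best_by_loss)
```

Among the 35 logged epochs, F1 peaks at epoch 26 (0.6120) while validation loss is lowest at epoch 11 (0.2773), so the two criteria would select different checkpoints.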

Framework versions

Transformers 4.38.2
PyTorch 2.1.2
Datasets 2.1.0
Tokenizers 0.15.2

DimasikKurd/ruRoberta-large_pos
