sbert_large_nlu_ru_pos

This model is a fine-tuned version of ai-forever/sbert_large_nlu_ru on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4870
  • Precision: 0.5717
  • Recall: 0.6050
  • F1: 0.5879
  • Accuracy: 0.9001
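
The card does not say how the model is meant to be called. The `_pos` suffix and the token-level precision/recall/F1/accuracy metrics suggest a token-classification (POS-tagging) head, so here is a minimal inference sketch under that assumption; the repository id is the one shown on the model page, and the example sentence is arbitrary:

```python
# Minimal inference sketch. Assumption: the checkpoint exposes a token-classification
# head and loads with AutoModelForTokenClassification; this is not stated in the card.
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_id = "DimasikKurd/sbert_large_nlu_ru_pos"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

tagger = pipeline(
    "token-classification",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",  # merge sub-word pieces back into whole words
)
print(tagger("Мама мыла раму."))  # list of dicts: word, entity_group, score, start/end
```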

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
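
These values map directly onto transformers.TrainingArguments. The sketch below shows the corresponding configuration, assuming the run used the Trainer API; the output directory is a placeholder, and the 50-step evaluation interval is read off the results table below:

```python
# Sketch of the reported hyperparameters as TrainingArguments.
# output_dir is a placeholder; the card does not name one.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="sbert_large_nlu_ru_pos",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="steps",  # the results table logs an evaluation every 50 steps
    eval_steps=50,
    logging_steps=500,  # training loss appears every 500 steps ("No log" before that)
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-8 matches the Trainer's default
# optimizer settings, so no explicit optimizer override is needed.
```

Passed to a Trainer together with a tokenized dataset and a token-classification model initialized from ai-forever/sbert_large_nlu_ru, these arguments should reproduce the schedule described above.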

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.09  | 50   | 0.6457          | 0.0       | 0.0    | 0.0    | 0.7571   |
| No log        | 2.17  | 100  | 0.5343          | 0.0458    | 0.0463 | 0.0461 | 0.7998   |
| No log        | 3.26  | 150  | 0.3732          | 0.1121    | 0.1486 | 0.1278 | 0.8512   |
| No log        | 4.35  | 200  | 0.3237          | 0.2713    | 0.3436 | 0.3032 | 0.8778   |
| No log        | 5.43  | 250  | 0.2921          | 0.3412    | 0.4189 | 0.3761 | 0.8935   |
| No log        | 6.52  | 300  | 0.2778          | 0.4079    | 0.5386 | 0.4642 | 0.9011   |
| No log        | 7.61  | 350  | 0.2989          | 0.4301    | 0.4807 | 0.4540 | 0.9012   |
| No log        | 8.70  | 400  | 0.2617          | 0.4489    | 0.5676 | 0.5013 | 0.9083   |
| No log        | 9.78  | 450  | 0.3645          | 0.4661    | 0.5174 | 0.4904 | 0.9050   |
| 0.3288        | 10.87 | 500  | 0.3305          | 0.5297    | 0.6023 | 0.5637 | 0.9126   |
| 0.3288        | 11.96 | 550  | 0.3256          | 0.5544    | 0.6004 | 0.5765 | 0.9093   |
| 0.3288        | 13.04 | 600  | 0.3275          | 0.4330    | 0.5927 | 0.5004 | 0.9093   |
| 0.3288        | 14.13 | 650  | 0.4194          | 0.5017    | 0.5618 | 0.5301 | 0.9123   |
| 0.3288        | 15.22 | 700  | 0.3667          | 0.5275    | 0.6100 | 0.5658 | 0.9138   |
| 0.3288        | 16.30 | 750  | 0.4694          | 0.5117    | 0.6351 | 0.5668 | 0.9087   |
| 0.3288        | 17.39 | 800  | 0.4007          | 0.5381    | 0.6139 | 0.5735 | 0.9098   |
| 0.3288        | 18.48 | 850  | 0.3834          | 0.5264    | 0.5965 | 0.5593 | 0.9103   |
| 0.3288        | 19.57 | 900  | 0.4039          | 0.5061    | 0.6371 | 0.5641 | 0.9078   |
| 0.3288        | 20.65 | 950  | 0.5111          | 0.5850    | 0.6042 | 0.5945 | 0.9107   |
| 0.0507        | 21.74 | 1000 | 0.5454          | 0.5699    | 0.5985 | 0.5838 | 0.9124   |
| 0.0507        | 22.83 | 1050 | 0.4575          | 0.5668    | 0.6139 | 0.5894 | 0.9148   |
| 0.0507        | 23.91 | 1100 | 0.3752          | 0.5281    | 0.6178 | 0.5694 | 0.9126   |
| 0.0507        | 25.00 | 1150 | 0.5141          | 0.6074    | 0.6332 | 0.6200 | 0.9159   |
| 0.0507        | 26.09 | 1200 | 0.4203          | 0.5464    | 0.6371 | 0.5882 | 0.9134   |
| 0.0507        | 27.17 | 1250 | 0.4810          | 0.5150    | 0.6313 | 0.5672 | 0.9115   |
| 0.0507        | 28.26 | 1300 | 0.4972          | 0.5560    | 0.5753 | 0.5655 | 0.9116   |
| 0.0507        | 29.35 | 1350 | 0.6118          | 0.5439    | 0.6216 | 0.5802 | 0.9127   |
| 0.0507        | 30.43 | 1400 | 0.5298          | 0.4354    | 0.6371 | 0.5172 | 0.8847   |
| 0.0507        | 31.52 | 1450 | 0.5129          | 0.5771    | 0.6216 | 0.5985 | 0.9132   |
| 0.0234        | 32.61 | 1500 | 0.5165          | 0.5395    | 0.6332 | 0.5826 | 0.9068   |
| 0.0234        | 33.70 | 1550 | 0.4776          | 0.5110    | 0.6255 | 0.5625 | 0.9095   |
| 0.0234        | 34.78 | 1600 | 0.3794          | 0.5156    | 0.6699 | 0.5827 | 0.9117   |
| 0.0234        | 35.87 | 1650 | 0.4895          | 0.6074    | 0.6332 | 0.6200 | 0.9165   |
| 0.0234        | 36.96 | 1700 | 0.5130          | 0.6317    | 0.6158 | 0.6237 | 0.9137   |
| 0.0234        | 38.04 | 1750 | 0.5138          | 0.6143    | 0.6120 | 0.6132 | 0.9103   |
| 0.0234        | 39.13 | 1800 | 0.5555          | 0.5579    | 0.6602 | 0.6048 | 0.9044   |
| 0.0234        | 40.22 | 1850 | 0.3895          | 0.5055    | 0.6197 | 0.5568 | 0.9107   |
| 0.0234        | 41.30 | 1900 | 0.4607          | 0.5936    | 0.6429 | 0.6172 | 0.9101   |
| 0.0234        | 42.39 | 1950 | 0.3913          | 0.5654    | 0.6429 | 0.6016 | 0.9091   |
| 0.0259        | 43.48 | 2000 | 0.3646          | 0.5797    | 0.6602 | 0.6173 | 0.9091   |
| 0.0259        | 44.57 | 2050 | 0.5094          | 0.6579    | 0.6274 | 0.6423 | 0.9191   |
| 0.0259        | 45.65 | 2100 | 0.4718          | 0.5996    | 0.6158 | 0.6076 | 0.9124   |
| 0.0259        | 46.74 | 2150 | 0.5557          | 0.5855    | 0.6409 | 0.6120 | 0.9056   |
| 0.0259        | 47.83 | 2200 | 0.5481          | 0.6018    | 0.6332 | 0.6171 | 0.9106   |
| 0.0259        | 48.91 | 2250 | 0.5198          | 0.5535    | 0.6486 | 0.5973 | 0.9104   |
| 0.0259        | 50.00 | 2300 | 0.4876          | 0.6282    | 0.6197 | 0.6239 | 0.9098   |
| 0.0259        | 51.09 | 2350 | 0.4904          | 0.5352    | 0.5135 | 0.5241 | 0.8984   |
| 0.0259        | 52.17 | 2400 | 0.4268          | 0.5639    | 0.6390 | 0.5991 | 0.9080   |
| 0.0259        | 53.26 | 2450 | 0.4759          | 0.5695    | 0.5772 | 0.5733 | 0.9057   |
| 0.0221        | 54.35 | 2500 | 0.5927          | 0.6129    | 0.5869 | 0.5996 | 0.9017   |
| 0.0221        | 55.43 | 2550 | 0.4404          | 0.4917    | 0.6274 | 0.5513 | 0.8964   |
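
The card does not state how precision, recall, F1, and accuracy were computed. Token-classification cards produced by the Trainer typically report seqeval-style scores, where precision/recall/F1 are span-level and accuracy is token-level. Below is a hedged sketch of such a compute_metrics function; seqeval and the placeholder tag set are assumptions, not documented here:

```python
# Hypothetical compute_metrics in the style of the HF token-classification examples.
# seqeval and the IOB-style placeholder label_list are assumptions, not from the card.
import numpy as np
from seqeval.metrics import accuracy_score, f1_score, precision_score, recall_score

label_list = ["O", "B-TAG", "I-TAG"]  # placeholder; the real tag set is undocumented

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)

    # Drop padded/special positions, which the Trainer marks with label id -100.
    true_tags = [[label_list[l] for l in row if l != -100] for row in labels]
    pred_tags = [
        [label_list[p] for p, l in zip(p_row, l_row) if l != -100]
        for p_row, l_row in zip(predictions, labels)
    ]

    return {
        "precision": precision_score(true_tags, pred_tags),
        "recall": recall_score(true_tags, pred_tags),
        "f1": f1_score(true_tags, pred_tags),
        "accuracy": accuracy_score(true_tags, pred_tags),
    }
```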

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.2
  • Datasets 2.1.0
  • Tokenizers 0.15.2