sbert_large_nlu_ru_neg

This model is a fine-tuned version of ai-forever/sbert_large_nlu_ru on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7106
  • Precision: 0.5205
  • Recall: 0.5700
  • F1: 0.5442
  • Accuracy: 0.8956
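
Since the description sections below are still placeholders, here is a minimal usage sketch. It assumes the checkpoint is a token-classification fine-tune (the precision/recall/F1/accuracy metric set matches seqeval-style token-classification evaluation); the repository id is a hypothetical placeholder, not a confirmed path.

```python
# Minimal sketch, assuming a token-classification head; the repo id is a
# hypothetical placeholder.
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

repo_id = "your-username/sbert_large_nlu_ru_neg"  # placeholder, not the published path

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForTokenClassification.from_pretrained(repo_id)

# aggregation_strategy="simple" merges subword pieces into word-level spans
tagger = pipeline(
    "token-classification",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",
)

print(tagger("Пример русского предложения для разметки."))
```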

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
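
For reproducibility, the settings above map onto transformers `TrainingArguments` roughly as follows. This is a hedged sketch: the card does not include the training script, so `output_dir` is a placeholder, and the 50-step evaluation cadence and 500-step logging cadence are inferred from the results table below.

```python
# A sketch mirroring the hyperparameters listed above; dataset, model, and
# metric wiring are omitted because the card does not specify them.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="sbert_large_nlu_ru_neg",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=4,   # card's train_batch_size: 4
    per_device_eval_batch_size=8,    # card's eval_batch_size: 8
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08, as listed above
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",
    eval_steps=50,     # inferred: the table evaluates every 50 steps
    logging_steps=500, # inferred: training loss appears every 500 steps ("No log" before)
)
```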

Training results

| Training Loss | Epoch   | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-------:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.0870  | 50   | 0.6440          | 0.0       | 0.0    | 0.0    | 0.7571   |
| No log        | 2.1739  | 100  | 0.5237          | 0.0317    | 0.0579 | 0.0410 | 0.8069   |
| No log        | 3.2609  | 150  | 0.3775          | 0.1163    | 0.1544 | 0.1327 | 0.8514   |
| No log        | 4.3478  | 200  | 0.3368          | 0.2292    | 0.3031 | 0.2610 | 0.8769   |
| No log        | 5.4348  | 250  | 0.3055          | 0.3066    | 0.3475 | 0.3258 | 0.8929   |
| No log        | 6.5217  | 300  | 0.2919          | 0.3814    | 0.5463 | 0.4492 | 0.8989   |
| No log        | 7.6087  | 350  | 0.2798          | 0.4372    | 0.5039 | 0.4682 | 0.9055   |
| No log        | 8.6957  | 400  | 0.2730          | 0.3934    | 0.5560 | 0.4608 | 0.9071   |
| No log        | 9.7826  | 450  | 0.3021          | 0.4666    | 0.5656 | 0.5113 | 0.9101   |
| 0.3321        | 10.8696 | 500  | 0.3249          | 0.4664    | 0.6023 | 0.5257 | 0.9110   |
| 0.3321        | 11.9565 | 550  | 0.3317          | 0.5316    | 0.5849 | 0.5570 | 0.9113   |
| 0.3321        | 13.0435 | 600  | 0.3352          | 0.4984    | 0.5946 | 0.5423 | 0.9127   |
| 0.3321        | 14.1304 | 650  | 0.3651          | 0.5079    | 0.5579 | 0.5317 | 0.9157   |
| 0.3321        | 15.2174 | 700  | 0.3856          | 0.4670    | 0.6004 | 0.5253 | 0.9083   |
| 0.3321        | 16.3043 | 750  | 0.4087          | 0.4905    | 0.5985 | 0.5391 | 0.9139   |
| 0.3321        | 17.3913 | 800  | 0.4108          | 0.5058    | 0.5869 | 0.5433 | 0.9113   |
| 0.3321        | 18.4783 | 850  | 0.3900          | 0.5597    | 0.6429 | 0.5984 | 0.9172   |
| 0.3321        | 19.5652 | 900  | 0.4572          | 0.5567    | 0.6158 | 0.5848 | 0.9168   |
| 0.3321        | 20.6522 | 950  | 0.4945          | 0.5952    | 0.5734 | 0.5841 | 0.9121   |
| 0.0516        | 21.7391 | 1000 | 0.5660          | 0.5835    | 0.5463 | 0.5643 | 0.9066   |
| 0.0516        | 22.8261 | 1050 | 0.4464          | 0.5307    | 0.6178 | 0.5709 | 0.9160   |
| 0.0516        | 23.9130 | 1100 | 0.5044          | 0.5696    | 0.6081 | 0.5882 | 0.9130   |
| 0.0516        | 25.0    | 1150 | 0.4807          | 0.5682    | 0.6274 | 0.5963 | 0.9151   |
| 0.0516        | 26.0870 | 1200 | 0.5006          | 0.5615    | 0.6525 | 0.6036 | 0.9157   |
| 0.0516        | 27.1739 | 1250 | 0.5228          | 0.6008    | 0.5985 | 0.5996 | 0.9127   |
| 0.0516        | 28.2609 | 1300 | 0.5091          | 0.5193    | 0.5965 | 0.5553 | 0.9117   |
| 0.0516        | 29.3478 | 1350 | 0.5135          | 0.6036    | 0.6409 | 0.6217 | 0.9177   |
| 0.0516        | 30.4348 | 1400 | 0.5183          | 0.5742    | 0.6351 | 0.6031 | 0.9157   |
| 0.0516        | 31.5217 | 1450 | 0.5202          | 0.5722    | 0.6506 | 0.6089 | 0.9106   |
| 0.0256        | 32.6087 | 1500 | 0.5170          | 0.5836    | 0.6602 | 0.6196 | 0.9174   |
| 0.0256        | 33.6957 | 1550 | 0.4348          | 0.6067    | 0.6313 | 0.6187 | 0.9215   |
| 0.0256        | 34.7826 | 1600 | 0.5070          | 0.6143    | 0.6120 | 0.6132 | 0.9156   |
| 0.0256        | 35.8696 | 1650 | 0.5840          | 0.6525    | 0.5907 | 0.6201 | 0.9121   |
| 0.0256        | 36.9565 | 1700 | 0.5587          | 0.5941    | 0.6274 | 0.6103 | 0.9124   |
| 0.0256        | 38.0435 | 1750 | 0.4073          | 0.5159    | 0.6564 | 0.5777 | 0.9117   |
| 0.0256        | 39.1304 | 1800 | 0.4428          | 0.6180    | 0.6371 | 0.6274 | 0.9166   |
| 0.0256        | 40.2174 | 1850 | 0.4775          | 0.5797    | 0.6390 | 0.6079 | 0.9199   |
| 0.0256        | 41.3043 | 1900 | 0.4121          | 0.5920    | 0.6274 | 0.6092 | 0.9171   |
| 0.0256        | 42.3913 | 1950 | 0.4683          | 0.6136    | 0.6467 | 0.6297 | 0.9179   |
| 0.0231        | 43.4783 | 2000 | 0.4961          | 0.6390    | 0.5946 | 0.6160 | 0.9137   |
| 0.0231        | 44.5652 | 2050 | 0.6040          | 0.6242    | 0.5483 | 0.5838 | 0.9031   |
| 0.0231        | 45.6522 | 2100 | 0.5498          | 0.6458    | 0.5985 | 0.6212 | 0.9121   |
| 0.0231        | 46.7391 | 2150 | 0.4636          | 0.6049    | 0.6236 | 0.6141 | 0.9212   |
| 0.0231        | 47.8261 | 2200 | 0.4797          | 0.6340    | 0.6120 | 0.6228 | 0.9142   |
| 0.0231        | 48.9130 | 2250 | 0.5335          | 0.5134    | 0.6680 | 0.5805 | 0.9061   |
| 0.0231        | 50.0    | 2300 | 0.5348          | 0.6167    | 0.6120 | 0.6143 | 0.9075   |
| 0.0231        | 51.0870 | 2350 | 0.4871          | 0.6144    | 0.6429 | 0.6283 | 0.9085   |
| 0.0231        | 52.1739 | 2400 | 0.4767          | 0.5335    | 0.6757 | 0.5963 | 0.9082   |
| 0.0231        | 53.2609 | 2450 | 0.4494          | 0.5895    | 0.6486 | 0.6176 | 0.9109   |
| 0.0225        | 54.3478 | 2500 | 0.5282          | 0.5310    | 0.6448 | 0.5824 | 0.9088   |
| 0.0225        | 55.4348 | 2550 | 0.4321          | 0.5714    | 0.6332 | 0.6007 | 0.9148   |
| 0.0225        | 56.5217 | 2600 | 0.4822          | 0.6179    | 0.6274 | 0.6226 | 0.9105   |
| 0.0225        | 57.6087 | 2650 | 0.4360          | 0.5578    | 0.6429 | 0.5973 | 0.9150   |
| 0.0225        | 58.6957 | 2700 | 0.5101          | 0.6215    | 0.5927 | 0.6067 | 0.9083   |
| 0.0225        | 59.7826 | 2750 | 0.4751          | 0.5327    | 0.6602 | 0.5897 | 0.9069   |
| 0.0225        | 60.8696 | 2800 | 0.4942          | 0.6471    | 0.5946 | 0.6197 | 0.9065   |
| 0.0225        | 61.9565 | 2850 | 0.3628          | 0.4646    | 0.6332 | 0.5359 | 0.8957   |
| 0.0225        | 63.0435 | 2900 | 0.4447          | 0.6152    | 0.6236 | 0.6194 | 0.9098   |
| 0.0225        | 64.1304 | 2950 | 0.4965          | 0.5624    | 0.6525 | 0.6041 | 0.9130   |
| 0.0285        | 65.2174 | 3000 | 0.5616          | 0.5649    | 0.6216 | 0.5919 | 0.9082   |
| 0.0285        | 66.3043 | 3050 | 0.7228          | 0.6500    | 0.5019 | 0.5664 | 0.8881   |
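
Two patterns in the log are worth noting: training loss falls to roughly 0.02–0.05 after epoch 20 while validation loss trends upward, and validation F1 peaks near 0.63 around epochs 39–52 rather than at the end of the logged run (which stops at epoch 66.3 of the configured 100). If the run is repeated, selecting the checkpoint with the best F1 may give better eval metrics than the final one. Below is a hedged sketch of such checkpoint selection; it is not part of the original training setup and assumes a `compute_metrics` function that reports an "f1" key, as seqeval-based setups typically do.

```python
# Sketch of best-checkpoint selection, not part of the original run: keep the
# checkpoint with the highest validation F1 and let the Trainer restore it at
# the end of training. Assumes compute_metrics returns an "f1" key.
from transformers import TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="sbert_large_nlu_ru_neg",  # placeholder
    evaluation_strategy="steps",
    eval_steps=50,
    save_strategy="steps",
    save_steps=50,                 # checkpoint cadence must match eval cadence
    load_best_model_at_end=True,   # reload the best checkpoint when training ends
    metric_for_best_model="f1",
    greater_is_better=True,
)

# Pass the callback to the Trainer alongside the usual model/dataset arguments:
#   Trainer(..., args=training_args,
#           callbacks=[EarlyStoppingCallback(early_stopping_patience=10)])
early_stop = EarlyStoppingCallback(early_stopping_patience=10)
```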

Framework versions

  • Transformers 4.40.1
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1