Visualize in Weights & Biases

qwen2.5-3b-sft3-25-1

This model is a fine-tuned version of Qwen/Qwen2.5-3B on the hZzy/SFT_new_full2 dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0161
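
For reference, the checkpoint can be loaded with the standard transformers API. The following is a minimal sketch, assuming the model is published on the Hugging Face Hub under hZzy/qwen2.5-3b-sft3-25-1; the prompt and generation settings are illustrative only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hZzy/qwen2.5-3b-sft3-25-1"  # assumed Hub id; use a local path if needed

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # the card reports FP16 tensors
    device_map="auto",          # requires accelerate; drop for CPU-only use
)

inputs = tokenizer("Write a short haiku about autumn.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```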

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

The model was fine-tuned on the hZzy/SFT_new_full2 dataset (see above); details of the data composition and evaluation split are not provided.

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an illustrative configuration sketch follows the list):

  • learning_rate: 1e-06
  • train_batch_size: 10
  • eval_batch_size: 10
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 320
  • total_eval_batch_size: 40
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
  • mixed_precision_training: Native AMP
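
For readers reproducing this setup, the list above maps onto a Hugging Face TrainingArguments configuration roughly as follows. This is an illustrative sketch, not the actual training script: output_dir and all omitted logging/evaluation settings are assumptions, and the 4-GPU distributed launch is handled externally (e.g. via torchrun or accelerate). Note that the total train batch size of 320 follows from 10 per device × 4 GPUs × 8 gradient-accumulation steps.

```python
from transformers import TrainingArguments

# Illustrative mapping of the listed hyperparameters; values not in the list
# above (e.g. output_dir) are assumptions.
training_args = TrainingArguments(
    output_dir="qwen2.5-3b-sft3-25-1",
    learning_rate=1e-6,
    per_device_train_batch_size=10,   # train_batch_size
    per_device_eval_batch_size=10,    # eval_batch_size
    gradient_accumulation_steps=8,
    num_train_epochs=10,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,                 # lr_scheduler_warmup_ratio
    seed=42,
    fp16=True,                        # mixed_precision_training: Native AMP
)
# Effective train batch size: 10 per device x 4 GPUs x 8 accumulation steps = 320.
# Effective eval batch size:  10 per device x 4 GPUs = 40.
```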

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.8314        | 0.2439 | 5    | 2.8128          |
| 2.8289        | 0.4878 | 10   | 2.8041          |
| 2.803         | 0.7317 | 15   | 2.7433          |
| 2.7473        | 0.9756 | 20   | 2.6599          |
| 2.66          | 1.2195 | 25   | 2.6077          |
| 2.5889        | 1.4634 | 30   | 2.5283          |
| 2.5215        | 1.7073 | 35   | 2.4741          |
| 2.4692        | 1.9512 | 40   | 2.4300          |
| 2.4251        | 2.1951 | 45   | 2.3937          |
| 2.3861        | 2.4390 | 50   | 2.3560          |
| 2.3446        | 2.6829 | 55   | 2.3176          |
| 2.3067        | 2.9268 | 60   | 2.2865          |
| 2.2708        | 3.1707 | 65   | 2.2475          |
| 2.2329        | 3.4146 | 70   | 2.2131          |
| 2.1949        | 3.6585 | 75   | 2.1857          |
| 2.1584        | 3.9024 | 80   | 2.1624          |
| 2.1389        | 4.1463 | 85   | 2.1421          |
| 2.118         | 4.3902 | 90   | 2.1239          |
| 2.0959        | 4.6341 | 95   | 2.1074          |
| 2.0727        | 4.8780 | 100  | 2.0930          |
| 2.066         | 5.1220 | 105  | 2.0805          |
| 2.0432        | 5.3659 | 110  | 2.0699          |
| 2.0314        | 5.6098 | 115  | 2.0608          |
| 2.0182        | 5.8537 | 120  | 2.0530          |
| 2.0105        | 6.0976 | 125  | 2.0461          |
| 1.9967        | 6.3415 | 130  | 2.0403          |
| 1.9982        | 6.5854 | 135  | 2.0354          |
| 1.9881        | 6.8293 | 140  | 2.0313          |
| 1.9934        | 7.0732 | 145  | 2.0278          |
| 1.978         | 7.3171 | 150  | 2.0254          |
| 1.9713        | 7.5610 | 155  | 2.0229          |
| 1.9737        | 7.8049 | 160  | 2.0209          |
| 1.9619        | 8.0488 | 165  | 2.0194          |
| 1.968         | 8.2927 | 170  | 2.0183          |
| 1.9616        | 8.5366 | 175  | 2.0174          |
| 1.9669        | 8.7805 | 180  | 2.0168          |
| 1.9642        | 9.0244 | 185  | 2.0164          |
| 1.9643        | 9.2683 | 190  | 2.0162          |
| 1.9597        | 9.5122 | 195  | 2.0161          |
| 1.9592        | 9.7561 | 200  | 2.0161          |

Framework versions

  • Transformers 4.42.0
  • PyTorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.19.1
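
To match the training environment, the listed versions can be verified at runtime. A minimal check, assuming the four packages are importable:

```python
# Quick sanity check of the environment against the versions listed above.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.42.0",
    "torch": "2.6.0+cu124",
    "datasets": "3.2.0",
    "tokenizers": "0.19.1",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    status = "OK" if installed[name] == want else f"mismatch (got {installed[name]})"
    print(f"{name}: {status}")
```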