fine_tuned_squad_callback10

This model is a fine-tuned version of Qwen/Qwen2-1.5B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1183
  • Accuracy: 0.9656
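
The task and training data are not documented here; judging only by the accuracy metric, a classification-style head is a plausible but unconfirmed guess. A minimal loading sketch under that assumption, using the repo id this checkpoint is published under:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Repo id as published on the Hub; the sequence-classification head is an
# assumption, since the card does not document the task or dataset.
model_id = "EndOfLe/fine_tuned_squad_callback10"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("Example input text", return_tensors="pt")
logits = model(**inputs).logits
print(logits)
```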

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3
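
Taken together, these settings correspond to a TrainingArguments configuration along the lines of the sketch below. The output directory name and the 100-step evaluation cadence are assumptions (the cadence is inferred from the step column in the results table); the dataset and any callbacks are not documented in this card.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="fine_tuned_squad_callback10",  # assumed; not stated in the card
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",          # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    eval_strategy="steps",        # assumed from the per-100-step results below
    eval_steps=100,
)
```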

Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy |
|:-------------:|:------:|:----:|:---------------:|:--------:|
| 0.8142        | 0.0249 | 100  | 0.3328          | 0.8705   |
| 0.4483        | 0.0497 | 200  | 0.2505          | 0.9276   |
| 0.3808        | 0.0746 | 300  | 0.2715          | 0.9267   |
| 0.2638        | 0.0994 | 400  | 0.3570          | 0.9116   |
| 0.3363        | 0.1243 | 500  | 0.3385          | 0.9284   |
| 0.2347        | 0.1491 | 600  | 0.3153          | 0.9273   |
| 0.2882        | 0.1740 | 700  | 0.1504          | 0.9516   |
| 0.1782        | 0.1989 | 800  | 0.1403          | 0.9611   |
| 0.2897        | 0.2237 | 900  | 0.3369          | 0.9424   |
| 0.2760        | 0.2486 | 1000 | 0.1714          | 0.9595   |
| 0.1409        | 0.2734 | 1100 | 0.1756          | 0.9527   |
| 0.1726        | 0.2983 | 1200 | 0.1371          | 0.9664   |
| 0.2029        | 0.3231 | 1300 | 0.3187          | 0.9223   |
| 0.1869        | 0.3480 | 1400 | 0.1917          | 0.9561   |
| 0.2551        | 0.3729 | 1500 | 0.1410          | 0.9592   |
| 0.1249        | 0.3977 | 1600 | 0.2447          | 0.9547   |
| 0.1784        | 0.4226 | 1700 | 0.1548          | 0.9687   |
| 0.1567        | 0.4474 | 1800 | 0.2113          | 0.9625   |
| 0.1863        | 0.4723 | 1900 | 0.1238          | 0.9723   |
| 0.2032        | 0.4971 | 2000 | 0.2280          | 0.9516   |
| 0.1610        | 0.5220 | 2100 | 0.1819          | 0.9536   |
| 0.1687        | 0.5469 | 2200 | 0.1034          | 0.9757   |
| 0.1196        | 0.5717 | 2300 | 0.0857          | 0.9807   |
| 0.1407        | 0.5966 | 2400 | 0.0824          | 0.9827   |
| 0.1028        | 0.6214 | 2500 | 0.1338          | 0.9757   |
| 0.1257        | 0.6463 | 2600 | 0.0872          | 0.9776   |
| 0.1226        | 0.6711 | 2700 | 0.1050          | 0.9799   |
| 0.1249        | 0.6960 | 2800 | 0.0902          | 0.9776   |
| 0.0763        | 0.7209 | 2900 | 0.1054          | 0.9787   |
| 0.1250        | 0.7457 | 3000 | 0.1131          | 0.9765   |
| 0.1257        | 0.7706 | 3100 | 0.2562          | 0.9547   |
| 0.1630        | 0.7954 | 3200 | 0.1519          | 0.9746   |
| 0.1246        | 0.8203 | 3300 | 0.1513          | 0.9729   |
| 0.1358        | 0.8451 | 3400 | 0.1183          | 0.9656   |

Framework versions

  • Transformers 4.49.0
  • PyTorch 2.6.0+cu126
  • Datasets 3.3.2
  • Tokenizers 0.21.0