sparql-qwen

This model is a fine-tuned version of Qwen/Qwen2.5-Coder-0.5B-Instruct on the arrow dataset. At the last logged evaluation (step 21000, epoch 4.4007) it reaches a validation loss of 0.5436 on the evaluation set.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 10
mixed_precision_training: Native AMP
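
To make the learning-rate settings above concrete, here is a minimal pure-Python sketch of a linear schedule with warmup: the rate climbs linearly from 0 to learning_rate over lr_scheduler_warmup_steps, then decays linearly to 0 at the end of training. The function name linear_schedule_lr is hypothetical, and total_steps=47720 is an assumption, not a value from this card: it is inferred from the logged epoch fractions (about 4772 optimizer steps per epoch) multiplied by num_epochs: 10.

```python
def linear_schedule_lr(step, base_lr=3e-4, warmup_steps=100, total_steps=47_720):
    """Learning rate at `step` for linear warmup followed by linear decay to zero.

    base_lr and warmup_steps come from the hyperparameter list;
    total_steps is an assumed value (about 4772 steps per epoch x 10 epochs).
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 50, 100, 21_000, 47_720):
        print(f"step {step:>6}: lr = {linear_schedule_lr(step):.6f}")
```

Under these assumptions the rate peaks at 0.0003 at step 100 and is still a bit over half of that at step 21000, the last step logged in this card.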

Training Loss	Epoch	Step	Validation Loss
0.7618	0.1048	500	0.7231
0.7397	0.2096	1000	0.6676
0.7213	0.3143	1500	0.6440
0.7047	0.4191	2000	0.6283
0.6905	0.5239	2500	0.6181
0.6822	0.6287	3000	0.6081
0.6651	0.7334	3500	0.6007
0.662	0.8382	4000	0.5938
0.6535	0.9430	4500	0.5889
0.562	1.0478	5000	0.5846
0.4974	1.1526	5500	0.5820
0.5317	1.2573	6000	0.5778
0.572	1.3621	6500	0.5743
0.5167	1.4669	7000	0.5718
0.5479	1.5717	7500	0.5692
0.5368	1.6764	8000	0.5659
0.5622	1.7812	8500	0.5643
0.5146	1.8860	9000	0.5621
0.509	1.9908	9500	0.5602
0.5536	2.0956	10000	0.5589
0.5035	2.2003	10500	0.5592
0.5399	2.3051	11000	0.5567
0.5247	2.4099	11500	0.5553
0.5365	2.5147	12000	0.5549
0.4425	2.6194	12500	0.5545
0.4761	2.7242	13000	0.5524
0.5368	2.8290	13500	0.5509
0.5214	2.9338	14000	0.5494
0.519	3.0386	14500	0.5496
0.5606	3.1433	15000	0.5492
0.5362	3.2481	15500	0.5476
0.5275	3.3529	16000	0.5476
0.5159	3.4577	16500	0.5464
0.5171	3.5624	17000	0.5461
0.5242	3.6672	17500	0.5454
0.5053	3.7720	18000	0.5445
0.512	3.8768	18500	0.5441
0.5259	3.9816	19000	0.5428
0.4363	4.0863	19500	0.5437
0.4784	4.1911	20000	0.5440
0.4703	4.2959	20500	0.5448
0.4467	4.4007	21000	0.5436
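
The last rows of the table suggest that validation loss stopped improving around step 19000 while training loss kept falling. A small sketch of how one might pick the best checkpoint from such a log is shown below. The helper names best_checkpoint and evals_since_best are hypothetical, and only the final seven evaluation rows are copied in for brevity.

```python
# (step, validation_loss) pairs copied from the last rows of the table above.
eval_log = [
    (18000, 0.5445),
    (18500, 0.5441),
    (19000, 0.5428),
    (19500, 0.5437),
    (20000, 0.5440),
    (20500, 0.5448),
    (21000, 0.5436),
]


def best_checkpoint(log):
    """Return the (step, loss) pair with the lowest validation loss."""
    return min(log, key=lambda row: row[1])


def evals_since_best(log):
    """Count evaluations logged after the best one (a simple patience counter)."""
    best_step, _ = best_checkpoint(log)
    return sum(1 for step, _ in log if step > best_step)


if __name__ == "__main__":
    step, loss = best_checkpoint(eval_log)
    print(f"best validation loss {loss} at step {step}")
    print(f"evaluations since best: {evals_since_best(eval_log)}")
```

On these rows the lowest validation loss is 0.5428 at step 19000, with four later evaluations that do not beat it. Whether that justifies stopping early depends on a patience threshold that this card does not specify.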