metadata

license: other
base_model: Qwen/Qwen1.5-4B
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: >-
      find_marker_both_sent_train_400_eval_40_random_permute_rerun_4_Qwen_Qwen1.5-4B_3e-4_lora
    results: []
library_name: peft

find_marker_both_sent_train_400_eval_40_random_permute_rerun_4_Qwen_Qwen1.5-4B_3e-4_lora

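This model is a PEFT fine-tune (LoRA, judging by the model name) of Qwen/Qwen1.5-4B; the training dataset is not recorded in this card. The results below are from the final evaluation (step 5850, epoch 49.8403) on the evaluation set: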

Loss: 0.3920
Accuracy: 0.7643

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 1
eval_batch_size: 2
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 8
total_train_batch_size: 32
total_eval_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant
lr_scheduler_warmup_ratio: 0.05
num_epochs: 50.0
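
The two "total" batch sizes in the list above follow from the per-device values, the device count, and gradient accumulation. A minimal sketch of that arithmetic; the variable names are illustrative and not taken from the training script:

```python
# Values copied from the hyperparameter list above.
per_device_train_batch_size = 1   # train_batch_size
per_device_eval_batch_size = 2    # eval_batch_size
num_devices = 4
gradient_accumulation_steps = 8

# One optimizer step sees every device's micro-batch,
# repeated once per gradient accumulation step.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)

# Evaluation does not accumulate gradients, so only the device count multiplies.
total_eval_batch_size = per_device_eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)
```

This reproduces total_train_batch_size: 32 and total_eval_batch_size: 8 as listed.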

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
1.6149	0.9968	117	1.2999	0.6755
0.862	1.9936	234	0.6614	0.7318
0.3519	2.9989	352	0.3529	0.7612
0.2097	3.9957	469	0.3055	0.7604
0.1746	4.9925	586	0.2799	0.7659
0.1507	5.9979	704	0.2721	0.7651
0.1431	6.9947	821	0.2592	0.7674
0.1381	8.0	939	0.2589	0.7667
0.1337	8.9968	1056	0.2509	0.7679
0.1292	9.9936	1173	0.2452	0.7682
0.1232	10.9989	1291	0.2604	0.7656
0.1214	11.9957	1408	0.2679	0.7653
0.119	12.9925	1525	0.2421	0.7681
0.1165	13.9979	1643	0.2545	0.7654
0.1161	14.9947	1760	0.2666	0.7630
0.1193	16.0	1878	0.2661	0.7645
0.1264	16.9968	1995	0.2994	0.7626
0.1201	17.9936	2112	0.2607	0.7647
0.1144	18.9989	2230	0.2665	0.7655
0.1147	19.9957	2347	0.2606	0.7646
0.1143	20.9925	2464	0.2834	0.7645
0.1105	21.9979	2582	0.2843	0.7645
0.1103	22.9947	2699	0.2959	0.7639
0.1081	24.0	2817	0.3331	0.7640
0.1093	24.9968	2934	0.3566	0.7640
0.1086	25.9936	3051	0.2995	0.7630
0.1124	26.9989	3169	0.2889	0.7624
0.1169	27.9957	3286	0.3392	0.7630
0.1225	28.9925	3403	0.2916	0.7633
0.1179	29.9979	3521	0.2572	0.7645
0.1139	30.9947	3638	0.3382	0.7635
0.1141	32.0	3756	0.3028	0.7635
0.1119	32.9968	3873	0.3388	0.7637
0.1124	33.9936	3990	0.3304	0.7636
0.1089	34.9989	4108	0.3556	0.7641
0.1095	35.9957	4225	0.3314	0.7641
0.1082	36.9925	4342	0.3770	0.7640
0.1071	37.9979	4460	0.3392	0.7645
0.1076	38.9947	4577	0.3363	0.7640
0.1074	40.0	4695	0.3731	0.7629
0.1289	40.9968	4812	0.3028	0.7634
0.1264	41.9936	4929	0.3093	0.7639
0.1126	42.9989	5047	0.3074	0.7643
0.1122	43.9957	5164	0.3375	0.7646
0.1096	44.9925	5281	0.3388	0.7645
0.1077	45.9979	5399	0.3173	0.7644
0.1063	46.9947	5516	0.3343	0.7643
0.1086	48.0	5634	0.3137	0.7644
0.1052	48.9968	5751	0.3941	0.7645
0.1078	49.8403	5850	0.3920	0.7643
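
The headline numbers in this card come from the last row of the table, which is not the best row. A small sketch, assuming the table is available as tab-separated text, of how the lowest validation loss and highest accuracy can be located. Only four rows are copied here to keep the example short; the name `TABLE` and the helper variables are illustrative.

```python
import csv
import io

# Header plus four rows copied from the table above (tab-separated).
# Paste the full table here to analyse every epoch.
TABLE = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tAccuracy\n"
    "1.6149\t0.9968\t117\t1.2999\t0.6755\n"
    "0.1292\t9.9936\t1173\t0.2452\t0.7682\n"
    "0.119\t12.9925\t1525\t0.2421\t0.7681\n"
    "0.1078\t49.8403\t5850\t0.3920\t0.7643\n"
)

# Parse every cell as a float, keyed by the column header.
rows = [
    {key: float(value) for key, value in row.items()}
    for row in csv.DictReader(io.StringIO(TABLE), delimiter="\t")
]

best_loss = min(rows, key=lambda r: r["Validation Loss"])
best_acc = max(rows, key=lambda r: r["Accuracy"])
final = rows[-1]

# Average number of optimizer steps per epoch, from the final row.
steps_per_epoch = final["Step"] / final["Epoch"]

print("lowest validation loss at step", int(best_loss["Step"]))
print("highest accuracy at step", int(best_acc["Step"]))
print("steps per epoch (approx.)", round(steps_per_epoch, 2))
```

In the full table the lowest validation loss (0.2421) is at step 1525 and the highest accuracy (0.7682) at step 1173. Validation loss at the final step (0.3920) is noticeably higher, while training loss stays near 0.11.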

Framework versions

PEFT 0.5.0
Transformers 4.40.2
PyTorch 2.3.0
Datasets 2.19.1
Tokenizers 0.19.1
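
To compare a local environment against the versions listed above, something like the following can be used. This is a sketch using only the standard library; the PyPI distribution names (`peft`, `transformers`, `torch`, `datasets`, `tokenizers`) are assumed to correspond to the frameworks listed, and `version_mismatches` is a hypothetical helper, not part of any of these libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above, keyed by assumed PyPI distribution name.
EXPECTED = {
    "peft": "0.5.0",
    "transformers": "4.40.2",
    "torch": "2.3.0",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def version_mismatches(expected):
    """Return {package: (expected, installed_or_None)} for each differing package."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        # Local build tags such as "2.3.0+cu121" are treated as matching "2.3.0".
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for package, (wanted, installed) in version_mismatches(EXPECTED).items():
    print(f"{package}: expected {wanted}, found {installed}")
```

An empty output means the installed versions match the ones this model was trained with.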