
find_marker_both_sent_train_400_eval_40_random_permute_rerun_4_Qwen_Qwen1.5-4B_3e-4_lora

This model is a LoRA (PEFT) fine-tuned version of Qwen/Qwen1.5-4B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3920
  • Accuracy: 0.7643

Model description

More information needed

Intended uses & limitations

More information needed
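
How to use

This repository contains a PEFT (LoRA) adapter rather than full model weights, so it is loaded on top of the Qwen/Qwen1.5-4B base model. A minimal loading sketch, assuming the peft and transformers versions listed under Framework versions below; the prompt and generation settings are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen1.5-4B"
adapter_id = "tyzhu/find_marker_both_sent_train_400_eval_40_random_permute_rerun_4_Qwen_Qwen1.5-4B_3e-4_lora"

tokenizer = AutoTokenizer.from_pretrained(base_id)
# device_map="auto" requires the accelerate package; drop it to load on CPU.
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Attach the LoRA adapter weights from this repository to the base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

# Illustrative prompt only; the card does not document the task format.
inputs = tokenizer("Example prompt", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```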

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 1
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 50.0
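
For reference, a sketch of how these settings map onto Hugging Face TrainingArguments. This is a reconstruction from the list above, not the original training script: the output path is a placeholder, and the dataset and LoRA configuration are not documented on this card. The per-device batch sizes combine with 4 devices and 8 accumulation steps to give the total train batch size of 32 (and total eval batch size of 8).

```python
from transformers import TrainingArguments

# Sketch reconstructed from the hyperparameter list above; not the original script.
training_args = TrainingArguments(
    output_dir="out",                # placeholder, not from the card
    learning_rate=3e-4,
    per_device_train_batch_size=1,   # x 4 GPUs x 8 accumulation steps = 32 total
    per_device_eval_batch_size=2,    # x 4 GPUs = 8 total
    gradient_accumulation_steps=8,
    seed=42,
    lr_scheduler_type="constant",
    warmup_ratio=0.05,               # listed on the card; unused by a constant scheduler
    num_train_epochs=50.0,
)
```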

Training results

| Training Loss | Epoch   | Step | Validation Loss | Accuracy |
|:-------------:|:-------:|:----:|:---------------:|:--------:|
| 1.6149        | 0.9968  | 117  | 1.2999          | 0.6755   |
| 0.862         | 1.9936  | 234  | 0.6614          | 0.7318   |
| 0.3519        | 2.9989  | 352  | 0.3529          | 0.7612   |
| 0.2097        | 3.9957  | 469  | 0.3055          | 0.7604   |
| 0.1746        | 4.9925  | 586  | 0.2799          | 0.7659   |
| 0.1507        | 5.9979  | 704  | 0.2721          | 0.7651   |
| 0.1431        | 6.9947  | 821  | 0.2592          | 0.7674   |
| 0.1381        | 8.0     | 939  | 0.2589          | 0.7667   |
| 0.1337        | 8.9968  | 1056 | 0.2509          | 0.7679   |
| 0.1292        | 9.9936  | 1173 | 0.2452          | 0.7682   |
| 0.1232        | 10.9989 | 1291 | 0.2604          | 0.7656   |
| 0.1214        | 11.9957 | 1408 | 0.2679          | 0.7653   |
| 0.119         | 12.9925 | 1525 | 0.2421          | 0.7681   |
| 0.1165        | 13.9979 | 1643 | 0.2545          | 0.7654   |
| 0.1161        | 14.9947 | 1760 | 0.2666          | 0.7630   |
| 0.1193        | 16.0    | 1878 | 0.2661          | 0.7645   |
| 0.1264        | 16.9968 | 1995 | 0.2994          | 0.7626   |
| 0.1201        | 17.9936 | 2112 | 0.2607          | 0.7647   |
| 0.1144        | 18.9989 | 2230 | 0.2665          | 0.7655   |
| 0.1147        | 19.9957 | 2347 | 0.2606          | 0.7646   |
| 0.1143        | 20.9925 | 2464 | 0.2834          | 0.7645   |
| 0.1105        | 21.9979 | 2582 | 0.2843          | 0.7645   |
| 0.1103        | 22.9947 | 2699 | 0.2959          | 0.7639   |
| 0.1081        | 24.0    | 2817 | 0.3331          | 0.7640   |
| 0.1093        | 24.9968 | 2934 | 0.3566          | 0.7640   |
| 0.1086        | 25.9936 | 3051 | 0.2995          | 0.7630   |
| 0.1124        | 26.9989 | 3169 | 0.2889          | 0.7624   |
| 0.1169        | 27.9957 | 3286 | 0.3392          | 0.7630   |
| 0.1225        | 28.9925 | 3403 | 0.2916          | 0.7633   |
| 0.1179        | 29.9979 | 3521 | 0.2572          | 0.7645   |
| 0.1139        | 30.9947 | 3638 | 0.3382          | 0.7635   |
| 0.1141        | 32.0    | 3756 | 0.3028          | 0.7635   |
| 0.1119        | 32.9968 | 3873 | 0.3388          | 0.7637   |
| 0.1124        | 33.9936 | 3990 | 0.3304          | 0.7636   |
| 0.1089        | 34.9989 | 4108 | 0.3556          | 0.7641   |
| 0.1095        | 35.9957 | 4225 | 0.3314          | 0.7641   |
| 0.1082        | 36.9925 | 4342 | 0.3770          | 0.7640   |
| 0.1071        | 37.9979 | 4460 | 0.3392          | 0.7645   |
| 0.1076        | 38.9947 | 4577 | 0.3363          | 0.7640   |
| 0.1074        | 40.0    | 4695 | 0.3731          | 0.7629   |
| 0.1289        | 40.9968 | 4812 | 0.3028          | 0.7634   |
| 0.1264        | 41.9936 | 4929 | 0.3093          | 0.7639   |
| 0.1126        | 42.9989 | 5047 | 0.3074          | 0.7643   |
| 0.1122        | 43.9957 | 5164 | 0.3375          | 0.7646   |
| 0.1096        | 44.9925 | 5281 | 0.3388          | 0.7645   |
| 0.1077        | 45.9979 | 5399 | 0.3173          | 0.7644   |
| 0.1063        | 46.9947 | 5516 | 0.3343          | 0.7643   |
| 0.1086        | 48.0    | 5634 | 0.3137          | 0.7644   |
| 0.1052        | 48.9968 | 5751 | 0.3941          | 0.7645   |
| 0.1078        | 49.8403 | 5850 | 0.3920          | 0.7643   |

Framework versions

  • PEFT 0.5.0
  • Transformers 4.40.2
  • PyTorch 2.3.0
  • Datasets 2.19.1
  • Tokenizers 0.19.1