find_marker_both_sent_train_400_eval_40_first_permute_Qwen_Qwen1.5-4B_3e-4_lora

This model is a fine-tuned version of Qwen/Qwen1.5-4B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3150
  • Accuracy: 0.7659

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a code sketch reconstructing this configuration follows the list:

  • learning_rate: 0.0003
  • train_batch_size: 1
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 50.0
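
These values map directly onto a Transformers TrainingArguments object. The sketch below is a hedged reconstruction of that setup; the LoRA-specific settings (rank, alpha, target modules) and the output path are not recorded in this card, so the values shown for them are illustrative placeholders only.

```python
from transformers import TrainingArguments
from peft import LoraConfig

# Reconstruction of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="qwen1.5-4b-lora",      # placeholder; not stated in this card
    learning_rate=3e-4,
    per_device_train_batch_size=1,     # 1 per device x 4 GPUs x 8 accumulation = 32 total
    per_device_eval_batch_size=2,      # 2 per device x 4 GPUs = 8 total
    gradient_accumulation_steps=8,
    num_train_epochs=50.0,
    # The card lists a constant schedule together with a 0.05 warmup ratio;
    # if warmup was actually applied, the matching Transformers name is
    # "constant_with_warmup".
    lr_scheduler_type="constant",
    warmup_ratio=0.05,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

# LoRA settings are not documented in this card; these are common defaults,
# shown only to make the sketch complete.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
```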

Training results

| Training Loss | Epoch   | Step | Validation Loss | Accuracy |
|:-------------:|:-------:|:----:|:---------------:|:--------:|
| 1.5824        | 0.9933  | 130  | 1.1797          | 0.6851   |
| 0.7977        | 1.9943  | 261  | 0.5359          | 0.7429   |
| 0.3387        | 2.9952  | 392  | 0.3361          | 0.7614   |
| 0.1537        | 3.9962  | 523  | 0.2855          | 0.7653   |
| 0.1389        | 4.9971  | 654  | 0.2712          | 0.7666   |
| 0.1383        | 5.9981  | 785  | 0.2502          | 0.7676   |
| 0.1252        | 6.9990  | 916  | 0.2457          | 0.7684   |
| 0.122         | 8.0     | 1047 | 0.2310          | 0.7694   |
| 0.1169        | 8.9933  | 1177 | 0.2316          | 0.7689   |
| 0.1167        | 9.9943  | 1308 | 0.2311          | 0.7699   |
| 0.1161        | 10.9952 | 1439 | 0.2159          | 0.7708   |
| 0.1126        | 11.9962 | 1570 | 0.2188          | 0.7694   |
| 0.1088        | 12.9971 | 1701 | 0.2270          | 0.7661   |
| 0.1104        | 13.9981 | 1832 | 0.2181          | 0.7677   |
| 0.1076        | 14.9990 | 1963 | 0.2135          | 0.7680   |
| 0.1069        | 16.0    | 2094 | 0.2219          | 0.7670   |
| 0.1048        | 16.9933 | 2224 | 0.2298          | 0.7668   |
| 0.1044        | 17.9943 | 2355 | 0.2341          | 0.7666   |
| 0.1061        | 18.9952 | 2486 | 0.2628          | 0.7660   |
| 0.1104        | 19.9962 | 2617 | 0.2712          | 0.7651   |
| 0.1111        | 20.9971 | 2748 | 0.2921          | 0.7652   |
| 0.1102        | 21.9981 | 2879 | 0.2700          | 0.7660   |
| 0.1049        | 22.9990 | 3010 | 0.2905          | 0.7662   |
| 0.1024        | 24.0    | 3141 | 0.2852          | 0.7664   |
| 0.1079        | 24.9933 | 3271 | 0.2418          | 0.7653   |
| 0.1066        | 25.9943 | 3402 | 0.2759          | 0.7662   |
| 0.1054        | 26.9952 | 3533 | 0.2958          | 0.7656   |
| 0.105         | 27.9962 | 3664 | 0.3109          | 0.7663   |
| 0.1066        | 28.9971 | 3795 | 0.3062          | 0.7660   |
| 0.1048        | 29.9981 | 3926 | 0.2714          | 0.7660   |
| 0.1043        | 30.9990 | 4057 | 0.2821          | 0.7662   |
| 0.1039        | 32.0    | 4188 | 0.2961          | 0.7661   |
| 0.1055        | 32.9933 | 4318 | 0.2942          | 0.7662   |
| 0.1045        | 33.9943 | 4449 | 0.3152          | 0.7659   |
| 0.1045        | 34.9952 | 4580 | 0.2828          | 0.7666   |
| 0.1038        | 35.9962 | 4711 | 0.2355          | 0.7662   |
| 0.102         | 36.9971 | 4842 | 0.2926          | 0.7664   |
| 0.103         | 37.9981 | 4973 | 0.2825          | 0.7660   |
| 0.1061        | 38.9990 | 5104 | 0.2899          | 0.7663   |
| 0.1064        | 40.0    | 5235 | 0.2930          | 0.7660   |
| 0.105         | 40.9933 | 5365 | 0.2806          | 0.7657   |
| 0.1038        | 41.9943 | 5496 | 0.2973          | 0.7664   |
| 0.1016        | 42.9952 | 5627 | 0.3379          | 0.7662   |
| 0.1046        | 43.9962 | 5758 | 0.3200          | 0.7655   |
| 0.1039        | 44.9971 | 5889 | 0.3151          | 0.7652   |
| 0.107         | 45.9981 | 6020 | 0.2969          | 0.7658   |
| 0.1059        | 46.9990 | 6151 | 0.3146          | 0.7659   |
| 0.1058        | 48.0    | 6282 | 0.3070          | 0.7656   |
| 0.103         | 48.9933 | 6412 | 0.3060          | 0.7660   |
| 0.1012        | 49.6657 | 6500 | 0.3150          | 0.7659   |

Framework versions

  • PEFT 0.5.0
  • Transformers 4.40.2
  • Pytorch 2.3.0
  • Datasets 2.19.1
  • Tokenizers 0.19.1
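
Using the adapter

Because this repository contains a PEFT (LoRA) adapter rather than full model weights, it must be loaded on top of the Qwen/Qwen1.5-4B base model. A minimal loading sketch follows; the adapter repo id shown is a placeholder, not the real location of this adapter.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and tokenizer the adapter was trained against.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-4B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-4B")

# "your-username/adapter-repo" is a hypothetical id; substitute wherever
# this adapter is actually hosted.
model = PeftModel.from_pretrained(base, "your-username/adapter-repo")
```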