
lmind_nq_train6000_eval6489_v1_doc_qa_v3_5e-5_lora2

This model is a fine-tuned version of meta-llama/Llama-2-7b-hf on the tyzhu/lmind_nq_train6000_eval6489_v1_doc_qa_v3 dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 6.7918
  • Accuracy: 0.1930
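Usage is not documented on this card, so the following is a minimal loading sketch rather than confirmed instructions. It assumes the repository hosts a PEFT LoRA adapter for meta-llama/Llama-2-7b-hf under the repo id tyzhu/lmind_nq_train6000_eval6489_v1_doc_qa_v3_5e-5_lora2; the prompt format is illustrative only.

```python
# Hedged sketch: assumes this repo contains a PEFT LoRA adapter
# trained on top of meta-llama/Llama-2-7b-hf.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"
adapter_id = "tyzhu/lmind_nq_train6000_eval6489_v1_doc_qa_v3_5e-5_lora2"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA weights

# Illustrative NQ-style prompt; the actual training prompt format is not reported.
inputs = tokenizer(
    "Question: who wrote the declaration of independence?\nAnswer:",
    return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```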

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a configuration sketch reproducing these settings follows the list:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 50.0
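The settings above map onto a Hugging Face TrainingArguments roughly as sketched below. This is an assumption-laden reconstruction, not the original training script; in particular, the LoRA rank, alpha, dropout, and target modules are not reported on this card, so the LoraConfig values are placeholders.

```python
# Hedged sketch of the reported hyperparameters; NOT the original training script.
from transformers import TrainingArguments
from peft import LoraConfig

training_args = TrainingArguments(
    output_dir="lmind_nq_train6000_eval6489_v1_doc_qa_v3_5e-5_lora2",
    learning_rate=5e-5,
    per_device_train_batch_size=2,   # train_batch_size: 2
    per_device_eval_batch_size=2,    # eval_batch_size: 2
    gradient_accumulation_steps=4,   # with 4 GPUs -> total train batch size 32
    seed=42,
    lr_scheduler_type="constant",    # as reported; note that a plain constant
                                     # schedule in Transformers ignores warmup_ratio
    warmup_ratio=0.05,               # lr_scheduler_warmup_ratio: 0.05
    num_train_epochs=50.0,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

# Placeholder adapter config: rank, alpha, dropout, and target modules
# are assumptions; the card does not report them.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
```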

Training results

| Training Loss | Epoch | Step  | Accuracy | Validation Loss |
|:-------------:|:-----:|:-----:|:--------:|:---------------:|
| 1.3891        | 1.0   | 529   | 0.6138   | 1.3015          |
| 1.3633        | 2.0   | 1058  | 0.6166   | 1.2855          |
| 1.2929        | 3.0   | 1587  | 0.6177   | 1.2954          |
| 1.2361        | 4.0   | 2116  | 0.6045   | 1.3489          |
| 1.1856        | 5.0   | 2645  | 0.6125   | 1.3968          |
| 1.1098        | 6.0   | 3174  | 0.6115   | 1.4721          |
| 1.0753        | 7.0   | 3703  | 0.6076   | 1.5798          |
| 1.0048        | 8.0   | 4232  | 0.6084   | 1.6042          |
| 0.9456        | 9.0   | 4761  | 0.5977   | 1.6843          |
| 0.8766        | 10.0  | 5290  | 0.6051   | 1.7829          |
| 0.8273        | 11.0  | 5819  | 0.6043   | 1.8060          |
| 0.7755        | 12.0  | 6348  | 0.6019   | 1.8729          |
| 0.715         | 13.0  | 6877  | 0.6017   | 1.9620          |
| 0.6804        | 14.0  | 7406  | 0.6009   | 2.0030          |
| 0.6277        | 15.0  | 7935  | 0.5998   | 2.0528          |
| 0.5733        | 16.0  | 8464  | 0.6012   | 2.0475          |
| 0.5409        | 17.0  | 8993  | 0.5749   | 2.0920          |
| 0.5024        | 18.0  | 9522  | 0.5986   | 2.1207          |
| 0.4699        | 19.0  | 10051 | 0.5993   | 2.1108          |
| 0.4367        | 20.0  | 10580 | 0.6005   | 2.1089          |
| 0.857         | 21.0  | 11109 | 0.5983   | 2.0215          |
| 3.7434        | 22.0  | 11638 | 0.2233   | 10.1186         |
| 7.7259        | 23.0  | 12167 | 0.1986   | 7.5379          |
| 4.2204        | 24.0  | 12696 | 0.5345   | 2.1568          |
| 0.7385        | 25.0  | 13225 | 0.5963   | 1.8229          |
| 1.1473        | 26.0  | 13754 | 0.5788   | 1.7570          |
| 2.0182        | 27.0  | 14283 | 0.5573   | 1.7293          |
| 2.2707        | 28.0  | 14812 | 0.4956   | 2.7017          |
| 4.1792        | 29.0  | 15341 | 0.3070   | 5.8288          |
| 7.7703        | 30.0  | 15870 | 0.1922   | 7.6619          |
| 7.7034        | 31.0  | 16399 | 0.1913   | 7.7003          |
| 7.9533        | 32.0  | 16928 | 0.1899   | 7.8667          |
| 7.8634        | 33.0  | 17457 | 0.1897   | 7.8134          |
| 7.8584        | 34.0  | 17986 | 0.1882   | 7.6760          |
| 7.824         | 35.0  | 18515 | 0.1888   | 7.7083          |
| 7.7446        | 36.0  | 19044 | 0.1888   | 7.6626          |
| 7.6708        | 37.0  | 19573 | 0.1886   | 7.5529          |
| 7.6733        | 38.0  | 20102 | 0.1903   | 7.5704          |
| 7.6271        | 39.0  | 20631 | 0.1949   | 7.5363          |
| 7.5886        | 40.0  | 21160 | 0.2130   | 7.4684          |
| 7.5514        | 41.0  | 21689 | 0.2077   | 7.4223          |
| 7.5205        | 42.0  | 22218 | 0.1946   | 7.3508          |
| 7.4577        | 43.0  | 22747 | 0.1951   | 7.1785          |
| 7.5021        | 44.0  | 23276 | 0.2092   | 6.6226          |
| 7.1133        | 45.0  | 23805 | 0.1994   | 6.4100          |
| 6.9682        | 46.0  | 24334 | 0.2250   | 6.3553          |
| 6.8891        | 47.0  | 24863 | 0.2224   | 6.3128          |
| 6.8621        | 48.0  | 25392 | 0.23     | 6.2465          |
| 6.8176        | 49.0  | 25921 | 0.2561   | 6.1966          |
| 6.9473        | 50.0  | 26450 | 0.1930   | 6.7918          |
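Validation loss bottoms out at epoch 2 (1.2855) and diverges sharply from epoch 22 onward. Assuming the reported loss is the mean per-token cross-entropy in nats, it converts to perplexity via exp(loss), as in this quick sketch:

```python
# Hedged helper: converts a mean cross-entropy loss to perplexity,
# assuming the reported loss is per-token cross-entropy in nats.
import math

def perplexity(loss: float) -> float:
    return math.exp(loss)

print(perplexity(1.2855))  # best validation loss (epoch 2)   -> ~3.62
print(perplexity(6.7918))  # final validation loss (epoch 50) -> ~890
```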

Framework versions

  • Transformers 4.34.0
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.14.1
