lmind_nq_train6000_eval6489_v1_doc_qa_v3_1e-4_lora2

This model is a fine-tuned version of meta-llama/Llama-2-7b-hf on the tyzhu/lmind_nq_train6000_eval6489_v1_doc_qa_v3 dataset. After the final epoch (epoch 50, step 26450), it achieves the following results on the evaluation set:

  • Loss: 1.9525
  • Accuracy: 0.5902

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 50.0
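
The aggregate batch sizes in the list follow from the per-device values. The sketch below recomputes them, assuming the usual relationship (per-device batch size × number of devices × gradient accumulation steps); the variable names are illustrative and the step count is taken from the last row of the training results.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 2             # per-device training batch size
eval_batch_size = 2              # per-device evaluation batch size
num_devices = 4
gradient_accumulation_steps = 4

# Effective batch sizes (assumed formula: per-device * devices * accumulation).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices  # no accumulation at eval time

# The training results end at step 26450 after 50 epochs.
total_steps = 26450
num_epochs = 50
steps_per_epoch = total_steps // num_epochs

print(total_train_batch_size)  # 32
print(total_eval_batch_size)   # 8
print(steps_per_epoch)         # 529
```

Both computed totals match the reported total_train_batch_size and total_eval_batch_size, and 529 steps per epoch matches the step column of the training results.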

Training results

| Training Loss | Epoch | Step  | Accuracy | Validation Loss |
|---------------|-------|-------|----------|-----------------|
| 1.3835        | 1.0   | 529   | 0.6151   | 1.2937          |
| 1.3363        | 2.0   | 1058  | 0.6172   | 1.2863          |
| 1.226         | 3.0   | 1587  | 0.6164   | 1.3074          |
| 1.122         | 4.0   | 2116  | 0.5989   | 1.3926          |
| 1.0259        | 5.0   | 2645  | 0.6117   | 1.4826          |
| 0.8978        | 6.0   | 3174  | 0.6105   | 1.5410          |
| 0.8122        | 7.0   | 3703  | 0.6067   | 1.6852          |
| 0.7029        | 8.0   | 4232  | 0.6074   | 1.6874          |
| 0.6218        | 9.0   | 4761  | 0.6057   | 1.7531          |
| 0.5283        | 10.0  | 5290  | 0.6071   | 1.7846          |
| 0.4912        | 11.0  | 5819  | 0.6066   | 1.7853          |
| 0.404         | 12.0  | 6348  | 0.6057   | 1.8682          |
| 0.3401        | 13.0  | 6877  | 0.6056   | 1.9170          |
| 0.2908        | 14.0  | 7406  | 0.6049   | 1.9692          |
| 0.2613        | 15.0  | 7935  | 0.6056   | 2.0265          |
| 0.3068        | 16.0  | 8464  | 0.6051   | 2.0003          |
| 0.2314        | 17.0  | 8993  | 0.5904   | 2.0362          |
| 0.219         | 18.0  | 9522  | 0.6053   | 2.0412          |
| 0.2194        | 19.0  | 10051 | 0.6034   | 2.0586          |
| 0.2198        | 20.0  | 10580 | 0.6036   | 2.0877          |
| 0.3243        | 21.0  | 11109 | 0.6022   | 2.0553          |
| 0.4775        | 22.0  | 11638 | 0.6014   | 1.9758          |
| 0.2684        | 23.0  | 12167 | 0.6036   | 2.0437          |
| 2.0747        | 24.0  | 12696 | 0.1914   | 7.5035          |
| 0.9811        | 25.0  | 13225 | 0.5881   | 1.7852          |
| 0.5439        | 26.0  | 13754 | 0.5954   | 1.8359          |
| 0.5659        | 27.0  | 14283 | 0.5845   | 1.9092          |
| 0.3486        | 28.0  | 14812 | 0.6024   | 1.8739          |
| 0.2834        | 29.0  | 15341 | 0.6021   | 1.9266          |
| 0.2764        | 30.0  | 15870 | 0.6001   | 1.9530          |
| 0.2739        | 31.0  | 16399 | 0.6037   | 1.8998          |
| 0.2819        | 32.0  | 16928 | 0.6019   | 1.9407          |
| 0.26          | 33.0  | 17457 | 0.6029   | 1.9166          |
| 0.2837        | 34.0  | 17986 | 0.6031   | 1.8685          |
| 0.2572        | 35.0  | 18515 | 0.6024   | 1.9152          |
| 0.2277        | 36.0  | 19044 | 0.6013   | 1.9697          |
| 0.2148        | 37.0  | 19573 | 0.576    | 1.9482          |
| 0.205         | 38.0  | 20102 | 0.6023   | 1.9606          |
| 0.2421        | 39.0  | 20631 | 0.6011   | 1.9484          |
| 0.2082        | 40.0  | 21160 | 0.6029   | 1.9348          |
| 0.2364        | 41.0  | 21689 | 0.6028   | 1.9490          |
| 0.331         | 42.0  | 22218 | 0.5911   | 1.9153          |
| 0.2749        | 43.0  | 22747 | 0.5990   | 1.8960          |
| 0.251         | 44.0  | 23276 | 0.5985   | 1.8958          |
| 0.3465        | 45.0  | 23805 | 0.5968   | 1.8921          |
| 0.2817        | 46.0  | 24334 | 0.5998   | 1.9187          |
| 0.3276        | 47.0  | 24863 | 0.5945   | 1.9200          |
| 0.4979        | 48.0  | 25392 | 0.5899   | 1.8942          |
| 0.4234        | 49.0  | 25921 | 0.5918   | 1.9040          |
| 0.4576        | 50.0  | 26450 | 0.5902   | 1.9525          |
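
The headline metrics come from the final epoch, but the table shows validation loss at its lowest early in training and rising as training loss falls, a pattern consistent with overfitting. The sketch below is one way to pick out the best and worst epochs by validation loss. It uses a hand-copied subset of rows (the full table gives the same best and worst epochs), and the tuple layout and variable names are illustrative assumptions. The card does not say which checkpoint was actually published.

```python
# (epoch, accuracy, validation_loss) rows copied from the training results.
# Only a subset is included to keep the example short.
rows = [
    (1, 0.6151, 1.2937),
    (2, 0.6172, 1.2863),
    (3, 0.6164, 1.3074),
    (24, 0.1914, 7.5035),   # transient spike in both training and validation loss
    (25, 0.5881, 1.7852),
    (50, 0.5902, 1.9525),   # final epoch, the one reported at the top of the card
]

# Select the epochs with the lowest and highest validation loss.
best = min(rows, key=lambda r: r[2])
worst = max(rows, key=lambda r: r[2])

print(f"best epoch:  {best[0]} (accuracy={best[1]}, val_loss={best[2]})")
print(f"worst epoch: {worst[0]} (accuracy={worst[1]}, val_loss={worst[2]})")
```

On these rows, epoch 2 has the lowest validation loss (1.2863) and also the highest accuracy (0.6172), while epoch 24 is the outlier.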

Framework versions

  • Transformers 4.34.0
  • PyTorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.14.1
