lmind_nq_train6000_eval6489_v1_doc_qa_v3_3e-4_lora2

This model is a fine-tuned version of meta-llama/Llama-2-7b-hf on an unknown dataset. It achieves the following results on the evaluation set at the end of training (epoch 50):

  • Accuracy: 0.1684
  • Loss: nan

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 50.0
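
The two "total" batch sizes in the list follow from the per-device values. A minimal sketch of that arithmetic, with variable names chosen here for illustration (they are not Trainer arguments); the steps-per-epoch figure of 529 is taken from the training results table:

```python
# Values copied from the hyperparameter list above.
per_device_train_batch_size = 2
per_device_eval_batch_size = 2
num_devices = 4
gradient_accumulation_steps = 4
num_epochs = 50

# Effective batch per optimizer step:
# per-device batch x number of devices x gradient accumulation steps.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)

# Evaluation does not accumulate gradients, so only the device count multiplies.
total_eval_batch_size = per_device_eval_batch_size * num_devices

# The results table reports 529 optimizer steps per epoch.
steps_per_epoch = 529
total_steps = steps_per_epoch * num_epochs

print(total_train_batch_size)  # 32
print(total_eval_batch_size)   # 8
print(total_steps)             # 26450
```

The computed values agree with total_train_batch_size, total_eval_batch_size, and the final step count reported in this card.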

Training results

Training Loss  Epoch  Step   Accuracy  Validation Loss
1.3822         1.0    529    0.6172    1.2977
1.2744         2.0    1058   0.6032    1.3745
1.1768         3.0    1587   0.6157    1.3319
0.9247         4.0    2116   0.6102    1.4367
1.1836         5.0    2645   0.5569    1.9168
2.035          6.0    3174   0.5377    2.0794
3.7483         7.0    3703   0.4881    2.6723
7.127          8.0    4232   0.1922    7.0410
7.5321         9.0    4761   0.1941    6.6488
7.3806         10.0   5290   0.2197    6.8427
7.8159         11.0   5819   0.2197    6.8836
7.975          12.0   6348   0.2197    6.8763
7.9902         13.0   6877   0.2197    6.8726
7.8585         14.0   7406   0.2195    6.8236
7.3449         15.0   7935   0.1922    7.1997
7.3133         16.0   8464   0.1869    6.7455
7.305          17.0   8993   0.1869    6.7454
7.7463         18.0   9522   0.1870    8.8319
9.9696         19.0   10051  0.1692    10.0702
9.9845         20.0   10580  0.1692    10.0702
9.9502         21.0   11109  0.1692    10.0702
9.9726         22.0   11638  0.1692    10.0702
9.9648         23.0   12167  0.1692    10.0702
9.9579         24.0   12696  0.1692    10.0702
9.9519         25.0   13225  0.1692    10.0702
9.9849         26.0   13754  0.1692    10.0702
9.9591         27.0   14283  0.1692    10.0702
9.9701         28.0   14812  0.1692    10.0702
9.998          29.0   15341  0.1692    10.0702
9.9878         30.0   15870  0.1692    10.0702
9.9882         31.0   16399  0.1692    10.0702
9.9741         32.0   16928  0.1692    10.0702
9.9545         33.0   17457  0.1692    10.0702
9.9538         34.0   17986  0.1692    10.0702
9.995          35.0   18515  0.1692    10.0702
9.974          36.0   19044  0.1692    10.0702
9.9763         37.0   19573  0.1692    10.0702
9.991          38.0   20102  0.1692    10.0702
9.9502         39.0   20631  0.1692    10.0702
9.9284         40.0   21160  0.1692    10.0702
12.7665        41.0   21689  0.1747    9.6482
1855.3142      42.0   22218  0.1684    nan
0.0            43.0   22747  0.1684    nan
0.0            44.0   23276  0.1684    nan
0.0            45.0   23805  0.1684    nan
0.0            46.0   24334  0.1684    nan
0.0            47.0   24863  0.1684    nan
0.0            48.0   25392  0.1684    nan
0.0            49.0   25921  0.1684    nan
0.0            50.0   26450  0.1684    nan
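
The log above can be scanned programmatically for the best checkpoint and for the point where the loss stops being a finite number. The following is a minimal sketch, not part of the original training code: the helper names `best_epoch` and `first_non_finite_epoch` are hypothetical, and only a subset of the rows is copied in to keep the example short.

```python
import math

# (epoch, training_loss, accuracy, validation_loss)
# Selected rows copied from the training results table above.
rows = [
    (1, 1.3822, 0.6172, 1.2977),
    (2, 1.2744, 0.6032, 1.3745),
    (3, 1.1768, 0.6157, 1.3319),
    (4, 0.9247, 0.6102, 1.4367),
    (7, 3.7483, 0.4881, 2.6723),
    (8, 7.127, 0.1922, 7.0410),
    (19, 9.9696, 0.1692, 10.0702),
    (41, 12.7665, 0.1747, 9.6482),
    (42, 1855.3142, 0.1684, float("nan")),
    (50, 0.0, 0.1684, float("nan")),
]


def best_epoch(rows):
    """Return the epoch with the lowest finite validation loss."""
    finite = [r for r in rows if math.isfinite(r[3])]
    return min(finite, key=lambda r: r[3])[0]


def first_non_finite_epoch(rows):
    """Return the first epoch whose validation loss is NaN or infinite, else None."""
    for epoch, _train_loss, _accuracy, val_loss in rows:
        if not math.isfinite(val_loss):
            return epoch
    return None


print(best_epoch(rows))              # 1
print(first_non_finite_epoch(rows))  # 42
```

On these rows the lowest validation loss occurs at epoch 1, and the validation loss first becomes nan at epoch 42, so the final-epoch metrics reported at the top of this card describe the model after that point.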

Framework versions

  • Transformers 4.34.0
  • PyTorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.14.1

Model tree for tyzhu/lmind_nq_train6000_eval6489_v1_doc_qa_v3_3e-4_lora2

Base model: meta-llama/Llama-2-7b-hf (this model is a fine-tune of it)