
Meta-Llama-3-8B_AviationQA-cosine

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6061
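
Because the framework list below includes PEFT, the fine-tuned weights are presumably published as an adapter rather than a full checkpoint. A minimal loading sketch, assuming a hypothetical hub path for the adapter (substitute the actual repository id):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "meta-llama/Meta-Llama-3-8B"
ADAPTER_ID = "<user>/Meta-Llama-3-8B_AviationQA-cosine"  # hypothetical hub path

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(
    BASE_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the fine-tuned adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base, ADAPTER_ID)

prompt = "What instrument measures an aircraft's altitude?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```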

Model description

This is a PEFT adapter for meta-llama/Meta-Llama-3-8B. As the model name indicates, it was fine-tuned for aviation question answering (AviationQA) using a cosine learning-rate schedule; see the training hyperparameters below.

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 3
  • eval_batch_size: 6
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 6
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 3
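
For reference, the listed values map onto transformers TrainingArguments roughly as follows. This is a reconstruction, not the original training script, and output_dir is a placeholder:

```python
from transformers import TrainingArguments

# Reconstruction of the hyperparameters listed above (not the original
# training script). The stated Adam betas/epsilon match the AdamW defaults.
training_args = TrainingArguments(
    output_dir="Meta-Llama-3-8B_AviationQA-cosine",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=6,
    seed=42,
    gradient_accumulation_steps=2,  # 3 x 2 = total train batch size of 6
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=3,
)
```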

Training results

Training Loss   Epoch    Step   Validation Loss
0.7872          0.0590     50   0.7652
0.7373          0.1181    100   0.7328
0.7242          0.1771    150   0.7182
0.7143          0.2361    200   0.7107
0.73            0.2952    250   0.7046
0.7159          0.3542    300   0.6973
0.7211          0.4132    350   0.6921
0.7096          0.4723    400   0.6873
0.6845          0.5313    450   0.6824
0.7251          0.5903    500   0.6783
0.6685          0.6494    550   0.6720
0.697           0.7084    600   0.6667
0.7006          0.7674    650   0.6639
0.6952          0.8264    700   0.6618
0.6649          0.8855    750   0.6596
0.6877          0.9445    800   0.6553
0.6673          1.0035    850   0.6531
0.6611          1.0626    900   0.6487
0.6971          1.1216    950   0.6452
0.6652          1.1806   1000   0.6423
0.645           1.2397   1050   0.6397
0.6494          1.2987   1100   0.6388
0.6623          1.3577   1150   0.6359
0.6552          1.4168   1200   0.6334
0.6465          1.4758   1250   0.6297
0.6495          1.5348   1300   0.6285
0.6521          1.5939   1350   0.6272
0.6505          1.6529   1400   0.6261
0.6773          1.7119   1450   0.6238
0.6487          1.7710   1500   0.6225
0.639           1.8300   1550   0.6208
0.6465          1.8890   1600   0.6194
0.6528          1.9481   1650   0.6182
0.6265          2.0071   1700   0.6164
0.6161          2.0661   1750   0.6137
0.6236          2.1251   1800   0.6118
0.6371          2.1842   1850   0.6111
0.6294          2.2432   1900   0.6093
0.6257          2.3022   1950   0.6087
0.6204          2.3613   2000   0.6081
0.6133          2.4203   2050   0.6073
0.6108          2.4793   2100   0.6068
0.622           2.5384   2150   0.6066
0.6233          2.5974   2200   0.6064
0.6183          2.6564   2250   0.6063
0.6237          2.7155   2300   0.6062
0.6388          2.7745   2350   0.6062
0.6236          2.8335   2400   0.6062
0.6236          2.8926   2450   0.6062
0.6205          2.9516   2500   0.6061
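
Assuming the reported loss is the usual mean token-level cross-entropy, the final validation loss of 0.6061 corresponds to a perplexity of exp(0.6061) ≈ 1.83.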

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1