---
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - generated_from_trainer
model-index:
  - name: MSc_llama3_finetuned_model_secondData
    results: []
library_name: peft
---

MSc_llama3_finetuned_model_secondData

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5909

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 250

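For reference, the values above map roughly onto the following Trainer configuration. This is a minimal sketch, assuming the run used the Hugging Face Trainer with a PEFT adapter; the actual training script, dataset preparation, and LoRA settings are not documented in this card.

```python
# Hypothetical reconstruction of the training setup from the hyperparameters listed above.
# Only the TrainingArguments values come from this card; everything else is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="MSc_llama3_finetuned_model_secondData",
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,   # effective train batch size: 8 x 8 = 64
    max_steps=250,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    seed=42,
    optim="adamw_torch",             # Adam with betas=(0.9, 0.999), epsilon=1e-8 (library defaults)
)
```
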
Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.3698        | 1.36  | 10   | 2.0432          |
| 1.3777        | 2.71  | 20   | 1.0067          |
| 0.8126        | 4.07  | 30   | 0.7822          |
| 0.6642        | 5.42  | 40   | 0.7281          |
| 0.5708        | 6.78  | 50   | 0.7218          |
| 0.5062        | 8.14  | 60   | 0.7360          |
| 0.4379        | 9.49  | 70   | 0.7781          |
| 0.3924        | 10.85 | 80   | 0.8310          |
| 0.3435        | 12.2  | 90   | 0.8856          |
| 0.3041        | 13.56 | 100  | 1.0389          |
| 0.2787        | 14.92 | 110  | 1.0664          |
| 0.2553        | 16.27 | 120  | 1.1655          |
| 0.2388        | 17.63 | 130  | 1.2397          |
| 0.2288        | 18.98 | 140  | 1.2049          |
| 0.2128        | 20.34 | 150  | 1.2746          |
| 0.2081        | 21.69 | 160  | 1.3889          |
| 0.1998        | 23.05 | 170  | 1.3942          |
| 0.1909        | 24.41 | 180  | 1.4383          |
| 0.188         | 25.76 | 190  | 1.5012          |
| 0.1841        | 27.12 | 200  | 1.5246          |
| 0.18          | 28.47 | 210  | 1.5528          |
| 0.1794        | 29.83 | 220  | 1.5662          |
| 0.1773        | 31.19 | 230  | 1.5788          |
| 0.1751        | 32.54 | 240  | 1.5889          |
| 0.1756        | 33.9  | 250  | 1.5909          |

Framework versions

  • PEFT 0.4.0
  • Transformers 4.38.2
  • Pytorch 2.4.0+cu121
  • Datasets 2.13.1
  • Tokenizers 0.15.2
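
Because this is a PEFT adapter rather than a full model, inference requires loading the base model first and then attaching the adapter. A minimal sketch, assuming the adapter is published as Casper0508/MSc_llama3_finetuned_model_secondData (the repository id is inferred, not stated in this card):

```python
# Minimal inference sketch; the adapter repository id is an assumption.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "Casper0508/MSc_llama3_finetuned_model_secondData"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

inputs = tokenizer("Hello, how can I help you today?", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```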