---
library_name: peft
tags:
  - llama-factory
  - lora
  - generated_from_trainer
base_model: meta-llama/Meta-Llama-3-8B
model-index:
  - name: LLaMA3_ei_oc_mixed_train
    results: []
---

LLaMA3_ei_oc_mixed_train

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B on the emollms_ei_oc_mixed dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0802

Model description

More information needed

Intended uses & limitations

More information needed
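
Because this checkpoint is a PEFT (LoRA) adapter rather than a full model, it is presumably loaded on top of the meta-llama/Meta-Llama-3-8B base weights. The snippet below is a minimal loading sketch, assuming the adapter is hosted under the repo id Holmeister/LLaMA3_ei_oc_mixed_train (an assumption) and that you have access to the gated base model:

```python
# Minimal loading sketch; the adapter repo id below is assumed, not confirmed by the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B"
adapter_id = "Holmeister/LLaMA3_ei_oc_mixed_train"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the LoRA adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

inputs = tokenizer("Example prompt:", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```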

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 3.0
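
These settings map roughly onto the Hugging Face transformers TrainingArguments shown below. This is a reconstruction for reference only, not the exact LLaMA-Factory configuration used for the run; output_dir is a placeholder.

```python
# Approximate reconstruction of the training setup using transformers TrainingArguments.
# Values are copied from the hyperparameter list above; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="LLaMA3_ei_oc_mixed_train",  # placeholder path
    learning_rate=3e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,          # 32 x 8 = 256 effective train batch size
    num_train_epochs=3.0,
    lr_scheduler_type="cosine",
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```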

Training results

Training Loss | Epoch  | Step | Validation Loss
0.3319        | 0.3320 | 10   | 0.1265
0.113         | 0.6639 | 20   | 0.0951
0.0961        | 0.9959 | 30   | 0.0864
0.0908        | 1.3278 | 40   | 0.0838
0.0846        | 1.6598 | 50   | 0.0816
0.0806        | 1.9917 | 60   | 0.0802
0.0756        | 2.3237 | 70   | 0.0810
0.0751        | 2.6556 | 80   | 0.0805
0.0719        | 2.9876 | 90   | 0.0806

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1