
LLama3_deneme

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B on the emollms_ei_oc_mixed dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0802
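Since the framework versions below include PEFT and the card is flagged as an adapter, inference likely requires loading the base model and then applying this adapter. A minimal loading sketch, assuming that setup; the adapter repo id "your-username/LLama3_deneme" is a placeholder, not the real path:

```python
# Sketch: load meta-llama/Meta-Llama-3-8B and apply this PEFT adapter on top.
# "your-username/LLama3_deneme" below is a placeholder repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B"
adapter_id = "your-username/LLama3_deneme"  # placeholder, replace with the actual repo

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the fine-tuned adapter

inputs = tokenizer("Example prompt", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```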

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 3.0
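For concreteness, a hedged sketch of how the values above map onto transformers.TrainingArguments; the output_dir and anything not listed in the card are placeholder assumptions:

```python
# Sketch only: mirrors the listed hyperparameters in transformers.TrainingArguments.
# output_dir is a placeholder; model/dataset wiring is omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./LLama3_deneme",    # placeholder path
    learning_rate=3e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,   # 32 * 8 = 256 effective train batch
    lr_scheduler_type="cosine",
    num_train_epochs=3.0,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```

Note that the listed total_train_batch_size of 256 follows from per_device_train_batch_size 32 × gradient_accumulation_steps 8 on a single device.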

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.3319        | 0.3320 | 10   | 0.1265          |
| 0.113         | 0.6639 | 20   | 0.0951          |
| 0.0961        | 0.9959 | 30   | 0.0864          |
| 0.0908        | 1.3278 | 40   | 0.0838          |
| 0.0846        | 1.6598 | 50   | 0.0816          |
| 0.0806        | 1.9917 | 60   | 0.0802          |
| 0.0756        | 2.3237 | 70   | 0.0810          |
| 0.0751        | 2.6556 | 80   | 0.0805          |
| 0.0719        | 2.9876 | 90   | 0.0806          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1