llama3-lora-codigopenal-dir

This model is a LoRA adapter, trained with PEFT, for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6629
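
Because the weights are published as an adapter rather than a full model, they have to be attached to the base model before use. The following is a minimal sketch of one way to do that with Transformers and PEFT. The adapter path is a hypothetical placeholder (the card does not state the full repository id), and the function name `load_adapter` is only illustrative.

```python
# Base model named in this card.
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
# Hypothetical placeholder: replace with the real local path or Hub id.
ADAPTER_PATH = "path/to/llama3-lora-codigopenal-dir"


def load_adapter(adapter_path=ADAPTER_PATH, base_model_id=BASE_MODEL_ID):
    """Load the base model and attach the LoRA adapter on top of it."""
    # Heavy imports live inside the function so that defining it does not
    # require transformers/peft to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # Wrap the base model with the adapter weights stored at adapter_path.
    model = PeftModel.from_pretrained(base_model, adapter_path)
    return tokenizer, model
```

Calling `load_adapter()` downloads the base model, which is gated on the Hub and needs roughly 16 GB of memory in half precision, so it is best run on a machine with a suitable GPU.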

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 1399
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 1000
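
The batch-size and scheduler settings above interact, so a small worked sketch may help. It assumes a single training device (consistent with 8 × 4 = 32) and assumes the cosine schedule follows the usual linear-warmup-then-half-cosine shape; the function name `lr_at` is illustrative and not part of any library.

```python
import math

LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 4
NUM_DEVICES = 1  # assumption: the card does not state the device count
WARMUP_RATIO = 0.1
TRAINING_STEPS = 1000

# Effective number of examples per optimizer step.
TOTAL_TRAIN_BATCH_SIZE = (
    TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES
)

# Number of optimizer steps spent warming up.
WARMUP_STEPS = math.ceil(TRAINING_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at a given optimizer step (warmup, then cosine decay)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TRAINING_STEPS - WARMUP_STEPS)
    # Half-cosine decay from the peak down to 0.
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("total_train_batch_size:", TOTAL_TRAIN_BATCH_SIZE)
    for step in (0, 50, WARMUP_STEPS, 240, 550, TRAINING_STEPS):
        print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at the end of warmup and is still close to its peak at step 240, the last step logged in the results table.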

Training results

Training Loss   Epoch     Step   Validation Loss
1.4127          3.6364    20     1.4021
1.3428          7.2727    40     1.2777
1.1822          10.9091   60     1.1052
0.9983          14.5455   80     0.9440
0.825           18.1818   100    0.7987
0.7081          21.8182   120    0.7390
0.6527          25.4545   140    0.7078
0.6046          29.0909   160    0.6855
0.566           32.7273   180    0.6699
0.5268          36.3636   200    0.6610
0.4891          40.0      220    0.6568
0.4519          43.6364   240    0.6629
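
One way to read these results is to look for the step with the lowest validation loss. The sketch below copies the logged values from the table and picks that row; the names `RESULTS` and `best_row` are illustrative only.

```python
# (step, training_loss, validation_loss), copied from the results table.
RESULTS = [
    (20, 1.4127, 1.4021),
    (40, 1.3428, 1.2777),
    (60, 1.1822, 1.1052),
    (80, 0.9983, 0.9440),
    (100, 0.825, 0.7987),
    (120, 0.7081, 0.7390),
    (140, 0.6527, 0.7078),
    (160, 0.6046, 0.6855),
    (180, 0.566, 0.6699),
    (200, 0.5268, 0.6610),
    (220, 0.4891, 0.6568),
    (240, 0.4519, 0.6629),
]

# Row with the smallest validation loss.
best_row = min(RESULTS, key=lambda row: row[2])
final_row = RESULTS[-1]

if __name__ == "__main__":
    print("best step:", best_row[0], "validation loss:", best_row[2])
    print("final step:", final_row[0], "validation loss:", final_row[2])
```

In the logged data the validation loss is lowest at step 220 (0.6568) and rises slightly at step 240 (0.6629) while the training loss keeps falling, which may indicate the start of overfitting.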

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.1
  • Pytorch 2.1.0+cu118
  • Datasets 2.19.1
  • Tokenizers 0.19.1