cls_finred_llama3_v3

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4113

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 2
  • mixed_precision_training: Native AMP
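The batch-size figures above are related by simple arithmetic, which the following minimal sketch makes explicit. The dictionary keys follow the parameter names of `transformers.TrainingArguments`, so it could be passed as `TrainingArguments(output_dir=..., **training_kwargs)`. The single-device default in `effective_batch_size` is an assumption (the card does not state the device count), and the mixed-precision flag is left out because the card does not say whether fp16 or bf16 was used.

```python
# Hyperparameters copied from the list above. Keys use the
# transformers.TrainingArguments parameter names.
training_kwargs = {
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "warmup_ratio": 0.03,
    "num_train_epochs": 2,
}


def effective_batch_size(kwargs, num_devices=1):
    """Examples consumed per optimizer step.

    num_devices=1 is an assumption; the card does not report it.
    """
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


# Compare the result with the reported total_train_batch_size.
print(effective_batch_size(training_kwargs))
```

With one device this reproduces the reported total_train_batch_size of 8 (2 × 4). Note that in Transformers the plain `constant` schedule does not apply warmup, so the warmup ratio of 0.03 likely had no effect on this run.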

Training results

Training Loss   Epoch    Step   Validation Loss
0.7177          0.1116   20     0.6751
0.6323          0.2232   40     0.6166
0.6119          0.3347   60     0.5802
0.5471          0.4463   80     0.5532
0.5299          0.5579   100    0.5321
0.5265          0.6695   120    0.5062
0.5306          0.7810   140    0.4888
0.5094          0.8926   160    0.4764
0.4769          1.0042   180    0.4640
0.342           1.1158   200    0.4644
0.3271          1.2273   220    0.4534
0.342           1.3389   240    0.4448
0.3659          1.4505   260    0.4395
0.3159          1.5621   280    0.4284
0.3356          1.6736   300    0.4248
0.3476          1.7852   320    0.4165
0.3168          1.8968   340    0.4113
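A few quantities can be derived from the log above. The sketch below, with hypothetical helper names, estimates the number of optimizer steps per epoch from the epoch and step columns, finds the logged step with the lowest validation loss, and computes the gap between validation and training loss at that step.

```python
# (training_loss, epoch, step, validation_loss), copied from the table above.
rows = [
    (0.7177, 0.1116, 20, 0.6751),
    (0.6323, 0.2232, 40, 0.6166),
    (0.6119, 0.3347, 60, 0.5802),
    (0.5471, 0.4463, 80, 0.5532),
    (0.5299, 0.5579, 100, 0.5321),
    (0.5265, 0.6695, 120, 0.5062),
    (0.5306, 0.7810, 140, 0.4888),
    (0.5094, 0.8926, 160, 0.4764),
    (0.4769, 1.0042, 180, 0.4640),
    (0.342, 1.1158, 200, 0.4644),
    (0.3271, 1.2273, 220, 0.4534),
    (0.342, 1.3389, 240, 0.4448),
    (0.3659, 1.4505, 260, 0.4395),
    (0.3159, 1.5621, 280, 0.4284),
    (0.3356, 1.6736, 300, 0.4248),
    (0.3476, 1.7852, 320, 0.4165),
    (0.3168, 1.8968, 340, 0.4113),
]


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    _, epoch, step, _ = rows[-1]
    return step / epoch


def best_checkpoint(rows):
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def train_val_gap(row):
    """Validation loss minus training loss for one logged row."""
    return row[3] - row[0]


best = best_checkpoint(rows)
print(round(steps_per_epoch(rows)), best[2], round(train_val_gap(best), 4))
```

The estimate comes to about 179 optimizer steps per epoch, and the lowest validation loss is the final logged value at step 340, which matches the loss reported at the top of the card. Training loss drops sharply just after the first epoch boundary (0.4769 at step 180 to 0.342 at step 200), while validation loss continues to decline more slowly.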

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
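
Since the card lists PEFT among the framework versions, the weights are presumably published as an adapter to be applied on top of the base model. The following is a minimal loading sketch under that assumption. The `ADAPTER_ID` value is a hypothetical placeholder (the card does not give the repository namespace), and access to the base model may require accepting its license on the Hub.

```python
BASE_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
# Hypothetical placeholder: replace with the actual adapter repository id.
ADAPTER_ID = "your-namespace/cls_finred_llama3_v3"


def load_adapter(base_id=BASE_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the fine-tuned PEFT adapter.

    Imports are local so that defining this function does not require
    transformers or peft to be installed.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `load_adapter()` downloads the full 8B base model, so it is best run on a machine with enough memory for it.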