---
license: llama3
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
base_model: meta-llama/Meta-Llama-3-8B-Instruct
datasets:
  - generator
model-index:
  - name: cls_finred_llama3_v3
    results: []
---

# cls_finred_llama3_v3

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the generator dataset. It achieves the following results on the evaluation set:

- Loss: 0.4113
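
This repository holds a PEFT (LoRA) adapter rather than full model weights, so it must be loaded on top of the base model. The sketch below shows one way to do that; the adapter repository id `Sorour/cls_finred_llama3_v3` and the example prompt are assumptions not stated on this card, and access to the gated Llama 3 base weights is required.

```python
# Minimal inference sketch for this adapter. Assumptions (not confirmed by the
# card): the adapter lives at "Sorour/cls_finred_llama3_v3", and you have
# accepted the Meta Llama 3 license to download the gated base weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "Sorour/cls_finred_llama3_v3"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter
model.eval()

# Llama 3 Instruct expects its chat template; the prompt is a placeholder.
messages = [{"role": "user", "content": "Extract the financial relation: ..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    out = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```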

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch reconstructing this configuration follows the list):

- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 2
- mixed_precision_training: Native AMP
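
For reproducibility, the sketch below maps these hyperparameters onto a TRL `SFTTrainer` setup consistent with the `trl`/`sft` tags above. It is a reconstruction, not the author's actual script: the dataset, its text field, the LoRA settings, and the sequence length are all assumptions (the card only names a "generator" dataset and does not list the PEFT config); only the `TrainingArguments` values are taken from the card.

```python
# Hedged reconstruction of the training configuration from the hyperparameters
# above. Dataset, LoRA ranks, and max_seq_length are assumptions.
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# Placeholder data: a JSONL file with a "text" column (assumed format).
train_dataset = load_dataset("json", data_files="train.jsonl")["train"]

peft_config = LoraConfig(  # assumed LoRA settings; not stated on the card
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"
)

args = TrainingArguments(
    output_dir="cls_finred_llama3_v3",
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,   # total train batch size 2 * 4 = 8
    seed=42,
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    num_train_epochs=2,
    fp16=True,                       # "Native AMP" mixed precision (fp16 assumed)
)

trainer = SFTTrainer(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    args=args,
    train_dataset=train_dataset,
    peft_config=peft_config,
    dataset_text_field="text",       # assumed column name
    max_seq_length=1024,             # assumed
)
trainer.train()
```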

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.7177        | 0.1116 | 20   | 0.6751          |
| 0.6323        | 0.2232 | 40   | 0.6166          |
| 0.6119        | 0.3347 | 60   | 0.5802          |
| 0.5471        | 0.4463 | 80   | 0.5532          |
| 0.5299        | 0.5579 | 100  | 0.5321          |
| 0.5265        | 0.6695 | 120  | 0.5062          |
| 0.5306        | 0.7810 | 140  | 0.4888          |
| 0.5094        | 0.8926 | 160  | 0.4764          |
| 0.4769        | 1.0042 | 180  | 0.4640          |
| 0.342         | 1.1158 | 200  | 0.4644          |
| 0.3271        | 1.2273 | 220  | 0.4534          |
| 0.342         | 1.3389 | 240  | 0.4448          |
| 0.3659        | 1.4505 | 260  | 0.4395          |
| 0.3159        | 1.5621 | 280  | 0.4284          |
| 0.3356        | 1.6736 | 300  | 0.4248          |
| 0.3476        | 1.7852 | 320  | 0.4165          |
| 0.3168        | 1.8968 | 340  | 0.4113          |

### Framework versions

- PEFT 0.11.1
- Transformers 4.41.1
- PyTorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1
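
To verify a local environment against these pins, a quick check (assuming all five packages are installed):

```python
# Print installed versions to compare with the list above.
import datasets, peft, tokenizers, torch, transformers

for mod in (peft, transformers, torch, datasets, tokenizers):
    print(f"{mod.__name__}: {mod.__version__}")
```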