
results_1011

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9956
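
Since this is a PEFT adapter trained on top of meta-llama/Llama-3.2-3B-Instruct (see the Framework versions section below), a minimal loading sketch might look like the following. The adapter repository id Jsoo/results_1011, the bfloat16 dtype, and the chat-style prompt are illustrative assumptions, not documented usage.

```python
# Minimal usage sketch (assumed, not taken from the training code):
# load the base model and apply this PEFT adapter on top of it.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_ID = "meta-llama/Llama-3.2-3B-Instruct"
ADAPTER_ID = "Jsoo/results_1011"  # this repository

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(
    BASE_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER_ID)
model.eval()

# Build an instruction-style prompt with the base model's chat template.
messages = [{"role": "user", "content": "Summarize what a PEFT adapter is."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```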

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 3
  • eval_batch_size: 6
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 24
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 20
  • mixed_precision_training: Native AMP
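
As a reproduction aid, the sketch below maps the hyperparameters above onto transformers.TrainingArguments. The output directory, the evaluation cadence (inferred from the 100-step intervals in the results table), and the fp16 choice for Native AMP are assumptions; the original dataset and the PEFT/LoRA adapter configuration are not documented in this card.

```python
# Sketch only: TrainingArguments approximating the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="results_1011",         # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=6,
    gradient_accumulation_steps=8,     # 3 * 8 = 24 effective train batch size
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=20,
    fp16=True,                         # "Native AMP"; bf16 is also possible
    optim="adamw_torch",               # matches betas=(0.9, 0.999), eps=1e-8
    eval_strategy="steps",             # assumed from the 100-step eval table
    eval_steps=100,
    logging_steps=100,
)
```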

Training results

Training Loss   Epoch    Step   Validation Loss
2.6788          0.3901    100   2.2881
2.4361          0.7801    200   2.2154
2.3903          1.1702    300   2.1747
2.3166          1.5602    400   2.1358
2.2868          1.9503    500   2.1058
2.2048          2.3403    600   2.0800
2.1999          2.7304    700   2.0613
2.1711          3.1204    800   2.0471
2.1038          3.5105    900   2.0329
2.1115          3.9005   1000   2.0185
2.0859          4.2906   1100   2.0129
2.0455          4.6806   1200   2.0084
2.0338          5.0707   1300   2.0022
1.9991          5.4608   1400   2.0011
1.9948          5.8508   1500   1.9966
1.9480          6.2409   1600   1.9977
1.9773          6.6309   1700   1.9909
1.9228          7.0210   1800   1.9915
1.8997          7.4110   1900   1.9947
1.9212          7.8011   2000   1.9868
1.8786          8.1911   2100   2.0092
1.8762          8.5812   2200   2.0070
1.8724          8.9712   2300   2.0023
1.8604          9.3613   2400   1.9978
1.8436          9.7513   2500   1.9956

Framework versions

  • PEFT 0.12.0
  • Transformers 4.45.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.20.1