
test_llama_lora_last_qkvo

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0667
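Since this is a PEFT LoRA adapter for meta-llama/Llama-3.2-1B (see the framework versions below), it can presumably be loaded for inference as in the following minimal sketch. This assumes the adapter is published under the repo id steve329/test_llama_lora_last_qkvo shown in the model tree at the end of this card, and standard PEFT/Transformers usage; it is not an official usage snippet from the author.

```python
# Minimal sketch: load the LoRA adapter on top of the Llama-3.2-1B base model.
# Assumes the adapter repo id steve329/test_llama_lora_last_qkvo (from the
# model tree below) and standard PEFT/Transformers usage; access to the gated
# meta-llama base weights is required.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-1B"
adapter_id = "steve329/test_llama_lora_last_qkvo"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```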

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 2000
  • mixed_precision_training: Native AMP
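For reference, these settings correspond roughly to the Trainer configuration below. This is a hedged sketch, not the author's actual training script: output_dir is a placeholder, and "Native AMP" is mapped to fp16=True, which is one common way that log line arises.

```python
# Sketch of TrainingArguments matching the hyperparameters listed above.
# Only the listed values come from this card; everything else is a placeholder.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="test_llama_lora_last_qkvo",  # placeholder path
    learning_rate=5e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=2000,
    fp16=True,  # "Native AMP" mixed precision (assumption: fp16 rather than bf16)
)
```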

Training results

| Training Loss | Epoch | Step | Validation Loss |
|--------------:|------:|-----:|----------------:|
| 0.8797        | 0.05  | 100  | 1.5707          |
| 0.2457        | 0.1   | 200  | 1.6468          |
| 1.6187        | 0.15  | 300  | 1.3860          |
| 1.4357        | 1.002 | 400  | 1.3559          |
| 0.4507        | 1.052 | 500  | 1.2522          |
| 0.1815        | 1.102 | 600  | 1.2921          |
| 1.0317        | 1.152 | 700  | 1.1859          |
| 1.1400        | 2.004 | 800  | 1.1689          |
| 0.4077        | 2.054 | 900  | 1.1580          |
| 0.1499        | 2.104 | 1000 | 1.1854          |
| 0.8753        | 2.154 | 1100 | 1.1333          |
| 1.1555        | 3.006 | 1200 | 1.1166          |
| 0.3652        | 3.056 | 1300 | 1.1025          |
| 0.1340        | 3.106 | 1400 | 1.1285          |
| 0.8517        | 3.156 | 1500 | 1.1059          |
| 1.0572        | 4.008 | 1600 | 1.0951          |
| 0.3787        | 4.058 | 1700 | 1.0598          |
| 0.1383        | 4.108 | 1800 | 1.0708          |
| 0.7534        | 4.158 | 1900 | 1.0642          |
| 1.0492        | 5.01  | 2000 | 1.0667          |
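For context, causal-LM cross-entropy loss converts to perplexity via exp(loss), so the final validation loss of 1.0667 corresponds to a perplexity of about 2.91 on the (unspecified) evaluation set:

```python
import math

# Perplexity = exp(mean cross-entropy loss); final eval loss from the table above.
print(math.exp(1.0667))  # ≈ 2.91
```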

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.1
  • PyTorch 2.3.0+cu118
  • Datasets 3.0.2
  • Tokenizers 0.20.1

Model tree for steve329/test_llama_lora_last_qkvo

  • Base model: meta-llama/Llama-3.2-1B (this model is a LoRA adapter on top of it)