llama381binstruct_summarize_short

This model is a fine-tuned version of NousResearch/Meta-Llama-3.1-8B-Instruct on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 2.4145
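
Assuming this is the mean token cross-entropy in nats (the usual Trainer eval loss), it corresponds to an evaluation perplexity of exp(2.4145) ≈ 11.2.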

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch of how they could map onto a training script follows the list:

  • learning_rate: 0.0002
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 30
  • training_steps: 500
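
The card does not include the training script, so the sketch below is a reconstruction: it shows one plausible way these hyperparameters map onto a TRL SFTTrainer run. The LoRA settings, dataset files, and the choice of TRL itself are assumptions, not facts from the card.

```python
# Hypothetical reconstruction: the actual script, LoRA settings, and
# dataset preprocessing are not published with this card.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

base_id = "NousResearch/Meta-Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Illustrative LoRA config; the card does not state r, alpha, or dropout.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

args = SFTConfig(
    output_dir="llama381binstruct_summarize_short",
    learning_rate=2e-4,              # from the card
    per_device_train_batch_size=1,   # from the card
    per_device_eval_batch_size=8,    # from the card
    seed=42,                         # from the card
    adam_beta1=0.9,                  # from the card
    adam_beta2=0.999,                # from the card
    adam_epsilon=1e-8,               # from the card
    lr_scheduler_type="linear",      # from the card
    warmup_steps=30,                 # from the card
    max_steps=500,                   # from the card
    eval_strategy="steps",
    eval_steps=25,                   # matches the 25-step cadence in the results table
    dataset_text_field="text",       # assumes a plain-text "text" column
)

# Placeholder files; the card only says the data came from a "generator" dataset.
data = load_dataset("json", data_files={"train": "train.jsonl", "eval": "eval.jsonl"})

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=data["train"],
    eval_dataset=data["eval"],
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```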

Training results

Training Loss   Epoch     Step   Validation Loss
1.4526          2.7778     25    1.3994
0.3502          5.5556     50    1.6518
0.0835          8.3333     75    2.0132
0.0223          11.1111   100    2.0395
0.0092          13.8889   125    2.2195
0.0055          16.6667   150    2.1321
0.0018          19.4444   175    2.2957
0.0012          22.2222   200    2.3537
0.0009          25.0      225    2.3708
0.0008          27.7778   250    2.3797
0.0009          30.5556   275    2.3879
0.0008          33.3333   300    2.3933
0.0008          36.1111   325    2.3986
0.0009          38.8889   350    2.4036
0.0008          41.6667   375    2.4068
0.0006          44.4444   400    2.4093
0.0007          47.2222   425    2.4115
0.0006          50.0      450    2.4136
0.0006          52.7778   475    2.4146
0.0006          55.5556   500    2.4145

Framework versions

  • PEFT 0.13.1
  • Transformers 4.45.2
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.0
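
The card ships no usage example. Below is a minimal inference sketch using PEFT's AutoPeftModelForCausalLM; the prompt format is an assumption, since the template used during fine-tuning is not documented.

```python
# Minimal inference sketch: load the adapter on top of the base model.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "BaselMousi/llama381binstruct_summarize_short"
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id, torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Meta-Llama-3.1-8B-Instruct")

# The exact prompt used in training is unknown; a plain chat-formatted
# instruction is assumed here.
messages = [{"role": "user", "content": "Summarize the following text:\n..."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Because this is a LoRA adapter, the weights can also be folded into the base model with model.merge_and_unload() for standalone deployment.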

Model tree for BaselMousi/llama381binstruct_summarize_short

  • Base model: NousResearch/Meta-Llama-3.1-8B-Instruct
  • This model: PEFT adapter on the base model