
peft-dialogue-summary-training-1716826281

This model is a fine-tuned PEFT adapter of ibm-granite/granite-8b-code-instruct; the training dataset is not recorded in this card. It achieves the following results on the evaluation set:

  • Loss: 1.5039
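
Because this is a PEFT adapter rather than a standalone model, it has to be loaded on top of the base checkpoint. Below is a minimal loading sketch using transformers and peft; the adapter id is taken from the card title and is an assumption, so substitute the actual Hub path:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "ibm-granite/granite-8b-code-instruct"
adapter_id = "peft-dialogue-summary-training-1716826281"  # assumed repo id; use the real Hub path

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the PEFT weights

prompt = "Summarize the following dialogue:\nA: Hi, are we still on for lunch?\nB: Yes, see you at noon."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```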

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after the list):

  • learning_rate: 0.0002
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1
  • training_steps: 400
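
In transformers terms, these settings map onto a TrainingArguments object roughly like the one below. The output directory name is illustrative, and the Adam settings listed above correspond to the library's default optimizer:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="peft-dialogue-summary-training",  # illustrative name
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,  # 1 device batch x 4 steps = total train batch size 4
    lr_scheduler_type="linear",
    warmup_steps=1,
    max_steps=400,
    # "Adam with betas=(0.9,0.999) and epsilon=1e-08" matches the default
    # AdamW optimizer, so no explicit optim override is needed here.
)
```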

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 7.3086        | 0.1453 | 25   | 5.5472          |
| 5.237         | 0.2907 | 50   | 4.5187          |
| 4.1279        | 0.4360 | 75   | 3.6791          |
| 3.601         | 0.5814 | 100  | 3.0487          |
| 3.1216        | 0.7267 | 125  | 2.6284          |
| 3.0769        | 0.8721 | 150  | 2.3158          |
| 2.3403        | 1.0174 | 175  | 2.1901          |
| 1.926         | 1.1628 | 200  | 2.2728          |
| 1.6723        | 1.3081 | 225  | 2.1936          |
| 1.7322        | 1.4535 | 250  | 2.0048          |
| 1.7326        | 1.5988 | 275  | 1.8762          |
| 1.6822        | 1.7442 | 300  | 1.8119          |
| 1.603         | 1.8895 | 325  | 1.6480          |
| 1.4208        | 2.0349 | 350  | 1.5662          |
| 1.2048        | 2.1802 | 375  | 1.5147          |
| 1.1107        | 2.3256 | 400  | 1.5039          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
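
A quick way to confirm that an environment matches these pins at runtime:

```python
import datasets, peft, tokenizers, torch, transformers

# Expected versions from this card; a mismatch may change results.
for name, module, expected in [
    ("PEFT", peft, "0.11.1"),
    ("Transformers", transformers, "4.41.1"),
    ("PyTorch", torch, "2.3.0+cu121"),
    ("Datasets", datasets, "2.19.1"),
    ("Tokenizers", tokenizers, "0.19.1"),
]:
    print(f"{name}: {module.__version__} (expected {expected})")
```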