base_model: yhavinga/ul2-large-dutch
library_name: peft
license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: ul2-large-dutch-finetuned-oba-book-search
    results: []

ul2-large-dutch-finetuned-oba-book-search

This model is a fine-tuned version of yhavinga/ul2-large-dutch. The training dataset is not specified in this card. It achieves the following results on the evaluation set (final evaluation, step 11600):

  • Loss: 4.4663
  • Top-5-accuracy: 0.4179
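The card does not say how Top-5-accuracy is computed. As a rough illustration only, the sketch below uses the common definition: a prediction counts as correct when the target appears among the first five ranked candidates. The function name `top_k_accuracy` and its inputs are hypothetical and are not taken from the training code.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int = 5,
) -> float:
    """Fraction of examples whose target is among the first k candidates.

    ranked_predictions[i] is a list of candidates for example i, best first.
    This is a generic definition, assumed here for illustration.
    """
    if len(ranked_predictions) != len(targets):
        raise ValueError("ranked_predictions and targets must have equal length")
    if not targets:
        return 0.0
    hits = sum(
        target in candidates[:k]
        for candidates, target in zip(ranked_predictions, targets)
    )
    return hits / len(targets)
```

Under this definition a score of 0.4179 would mean the target was among the top five candidates for roughly 42% of evaluation examples.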

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
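To make the `linear` scheduler setting concrete, here is a minimal sketch of a linear decay from the listed learning rate of 0.03 down to zero. It assumes no warmup, since the card lists none, and it estimates the total step count from the last logged row (step 11600 at epoch 0.9836). Both are assumptions, and `linear_lr` is a hypothetical helper rather than the trainer's own code.

```python
def linear_lr(
    step: int,
    total_steps: int,
    base_lr: float = 0.03,
    warmup_steps: int = 0,
) -> float:
    """Learning rate at a given step under linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Rough estimate of steps in one epoch, derived from the last logged row.
# This is an approximation, not a value reported by the card.
EST_TOTAL_STEPS = round(11600 / 0.9836)

if __name__ == "__main__":
    for step in (0, 2000, 6000, 11600):
        print(step, linear_lr(step, EST_TOTAL_STEPS))
```

With these assumptions the learning rate is already close to zero by the final logged evaluation, which is worth keeping in mind when reading the last rows of the training log.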

Training results

| Training Loss | Epoch | Step | Validation Loss | Top-5-accuracy |
|---|---|---|---|---|
| 6.6127 | 0.0170 | 200 | 4.9957 | 0.0199 |
| 6.3546 | 0.0339 | 400 | 4.6613 | 0.0796 |
| 6.3036 | 0.0509 | 600 | 4.7338 | 0.0 |
| 6.0421 | 0.0678 | 800 | 4.7298 | 0.0 |
| 6.0575 | 0.0848 | 1000 | 4.5229 | 0.0597 |
| 5.9802 | 0.1017 | 1200 | 4.6572 | 0.0398 |
| 5.8962 | 0.1187 | 1400 | 4.5695 | 0.0199 |
| 5.7797 | 0.1357 | 1600 | 4.5181 | 0.0199 |
| 5.7508 | 0.1526 | 1800 | 4.5089 | 0.0995 |
| 5.6505 | 0.1696 | 2000 | 4.5137 | 0.0199 |
| 5.705 | 0.1865 | 2200 | 4.4988 | 0.0796 |
| 5.6986 | 0.2035 | 2400 | 4.4908 | 0.0199 |
| 5.6822 | 0.2205 | 2600 | 4.4318 | 0.0199 |
| 5.6889 | 0.2374 | 2800 | 4.5502 | 0.0199 |
| 5.674 | 0.2544 | 3000 | 4.4749 | 0.0199 |
| 5.682 | 0.2713 | 3200 | 4.5109 | 0.0199 |
| 5.6252 | 0.2883 | 3400 | 4.5060 | 0.0796 |
| 5.4972 | 0.3052 | 3600 | 4.4417 | 0.1194 |
| 5.478 | 0.3222 | 3800 | 4.4351 | 0.0597 |
| 5.5038 | 0.3392 | 4000 | 4.4616 | 0.0 |
| 5.6091 | 0.3561 | 4200 | 4.4631 | 0.0995 |
| 5.4895 | 0.3731 | 4400 | 4.4339 | 0.0796 |
| 5.6013 | 0.3900 | 4600 | 4.4692 | 0.0995 |
| 5.4743 | 0.4070 | 4800 | 4.5731 | 0.0796 |
| 5.4131 | 0.4239 | 5000 | 4.5185 | 0.0995 |
| 5.4779 | 0.4409 | 5200 | 4.4901 | 0.0995 |
| 5.5093 | 0.4579 | 5400 | 4.5193 | 0.0995 |
| 5.527 | 0.4748 | 5600 | 4.5322 | 0.1194 |
| 5.5443 | 0.4918 | 5800 | 4.5358 | 0.0796 |
| 5.557 | 0.5087 | 6000 | 4.5574 | 0.0995 |
| 5.5324 | 0.5257 | 6200 | 4.4957 | 0.2786 |
| 5.4958 | 0.5426 | 6400 | 4.5618 | 0.1592 |
| 5.4376 | 0.5596 | 6600 | 4.4751 | 0.1194 |
| 5.5136 | 0.5766 | 6800 | 4.4994 | 0.1393 |
| 5.4284 | 0.5935 | 7000 | 4.5029 | 0.2587 |
| 5.4333 | 0.6105 | 7200 | 4.4864 | 0.2189 |
| 5.3516 | 0.6274 | 7400 | 4.5141 | 0.1990 |
| 5.4294 | 0.6444 | 7600 | 4.4527 | 0.1990 |
| 5.4383 | 0.6614 | 7800 | 4.4698 | 0.0199 |
| 5.3333 | 0.6783 | 8000 | 4.4169 | 0.2189 |
| 5.3708 | 0.6953 | 8200 | 4.4541 | 0.2587 |
| 5.3639 | 0.7122 | 8400 | 4.4613 | 0.2587 |
| 5.3746 | 0.7292 | 8600 | 4.4467 | 0.2786 |
| 5.3916 | 0.7461 | 8800 | 4.4134 | 0.4378 |
| 5.3416 | 0.7631 | 9000 | 4.4772 | 0.4179 |
| 5.3148 | 0.7801 | 9200 | 4.4603 | 0.3980 |
| 5.3646 | 0.7970 | 9400 | 4.4700 | 0.3582 |
| 5.2917 | 0.8140 | 9600 | 4.4439 | 0.3781 |
| 5.386 | 0.8309 | 9800 | 4.4418 | 0.3582 |
| 5.3907 | 0.8479 | 10000 | 4.4431 | 0.4179 |
| 5.4036 | 0.8648 | 10200 | 4.4557 | 0.3980 |
| 5.3439 | 0.8818 | 10400 | 4.4373 | 0.4179 |
| 5.2866 | 0.8988 | 10600 | 4.4616 | 0.4179 |
| 5.3447 | 0.9157 | 10800 | 4.4669 | 0.3980 |
| 5.3031 | 0.9327 | 11000 | 4.4639 | 0.4179 |
| 5.4083 | 0.9496 | 11200 | 4.4726 | 0.4179 |
| 5.2586 | 0.9666 | 11400 | 4.4771 | 0.3980 |
| 5.3988 | 0.9836 | 11600 | 4.4663 | 0.4179 |
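The headline results come from the last evaluation (step 11600), but that row is not the best one in the log. The sketch below shows one way to pick a checkpoint from logged evaluations. It uses a handful of rows copied from the training results above. The `EvalRow` type and helper names are hypothetical, and whether a checkpoint was actually saved at each step is not stated in the card.

```python
from typing import Iterable, NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    top5: float


# A subset of rows copied from the training results (not the full log).
ROWS = [
    EvalRow(200, 4.9957, 0.0199),
    EvalRow(2600, 4.4318, 0.0199),
    EvalRow(8000, 4.4169, 0.2189),
    EvalRow(8800, 4.4134, 0.4378),
    EvalRow(11600, 4.4663, 0.4179),
]


def best_by_loss(rows: Iterable[EvalRow]) -> EvalRow:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def best_by_top5(rows: Iterable[EvalRow]) -> EvalRow:
    """Return the row with the highest top-5 accuracy."""
    return max(rows, key=lambda r: r.top5)
```

In the full log, step 8800 has both the lowest validation loss (4.4134) and the highest Top-5-accuracy (0.4378), slightly better than the final evaluation on both measures.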

Framework versions

  • PEFT 0.11.0
  • Transformers 4.44.2
  • PyTorch 1.13.0+cu116
  • Datasets 3.0.0
  • Tokenizers 0.19.1