---
base_model: yhavinga/ul2-large-dutch
library_name: peft
license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: ul2-large-dutch-finetuned-oba-book-search
    results: []
---

# ul2-large-dutch-finetuned-oba-book-search

This model is a fine-tuned version of yhavinga/ul2-large-dutch (the training dataset is not documented in this card). It achieves the following results on the evaluation set:

- Loss: 4.5755
- Top-5-accuracy: 0.0579
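
The card does not document how Top-5 accuracy is computed. A minimal sketch of the usual definition (the fraction of examples whose true label is among the five highest-scoring predictions); the function name and toy data below are illustrative, not from the training code:

```python
def top_k_accuracy(logits, labels, k=5):
    """Fraction of examples whose true label is among the k highest-scoring predictions.

    logits: per-example lists of class scores; labels: true class indices.
    """
    hits = 0
    for scores, label in zip(logits, labels):
        # indices of the k largest scores for this example
        top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)

# toy example: first true label is the top-1 prediction (hit),
# second true label has the lowest score (miss at k=5)
logits = [[0.1, 0.9, 0.3, 0.2, 0.4, 0.5, 0.0],
          [0.9, 0.8, 0.7, 0.6, 0.5, 0.1, 0.4]]
labels = [1, 5]
print(top_k_accuracy(logits, labels, k=5))  # 0.5
```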

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.03
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
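
With `lr_scheduler_type: linear` and no warmup steps listed, the learning rate decays linearly from the initial 0.03 to zero over training. A minimal sketch that mirrors (but does not call) `transformers.get_linear_schedule_with_warmup`; the total-step count is an inference from the training log, not a documented value:

```python
def linear_lr(step, total_steps, base_lr=0.03, warmup_steps=0):
    """Linear schedule: ramp up over warmup_steps, then decay linearly to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    # fraction of post-warmup training still remaining
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)

# inferred: ~1551 optimizer steps per epoch * 5 epochs (see the training log)
total = 7755
print(linear_lr(1551, total))  # lr after one of five epochs: 4/5 of 0.03
```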

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Top-5-accuracy |
|:-------------:|:------:|:----:|:---------------:|:--------------:|
| 7.9158        | 0.1289 | 200  | 5.3305          | 0.0            |
| 7.0161        | 0.2579 | 400  | 4.8351          | 0.0            |
| 6.3673        | 0.3868 | 600  | 4.6915          | 0.0579         |
| 6.1376        | 0.5158 | 800  | 4.7811          | 0.0289         |
| 6.1629        | 0.6447 | 1000 | 4.7614          | 0.0            |
| 5.9541        | 0.7737 | 1200 | 4.6734          | 0.0289         |
| 5.8968        | 0.9026 | 1400 | 4.7609          | 0.0289         |
| 5.9555        | 1.0316 | 1600 | 4.5714          | 0.0289         |
| 5.8876        | 1.1605 | 1800 | 4.7200          | 0.0579         |
| 5.7377        | 1.2895 | 2000 | 4.6012          | 0.0289         |
| 5.7385        | 1.4184 | 2200 | 4.5199          | 0.0289         |
| 5.7584        | 1.5474 | 2400 | 4.5996          | 0.0579         |
| 5.7681        | 1.6763 | 2600 | 4.6556          | 0.0289         |
| 5.7317        | 1.8053 | 2800 | 4.6396          | 0.0289         |
| 5.6363        | 1.9342 | 3000 | 4.5867          | 0.0579         |
| 5.7462        | 2.0632 | 3200 | 4.5472          | 0.0289         |
| 5.6963        | 2.1921 | 3400 | 4.5598          | 0.0289         |
| 5.588         | 2.3211 | 3600 | 4.5316          | 0.0289         |
| 5.5463        | 2.4500 | 3800 | 4.5661          | 0.0289         |
| 5.5491        | 2.5790 | 4000 | 4.5478          | 0.0289         |
| 5.5445        | 2.7079 | 4200 | 4.5253          | 0.0289         |
| 5.5136        | 2.8369 | 4400 | 4.5313          | 0.0289         |
| 5.5705        | 2.9658 | 4600 | 4.5677          | 0.0289         |
| 5.4956        | 3.0948 | 4800 | 4.5268          | 0.0289         |
| 5.4799        | 3.2237 | 5000 | 4.5313          | 0.0289         |
| 5.4992        | 3.3527 | 5200 | 4.5403          | 0.0289         |
| 5.5742        | 3.4816 | 5400 | 4.5124          | 0.0289         |
| 5.4864        | 3.6106 | 5600 | 4.5527          | 0.0579         |
| 5.4896        | 3.7395 | 5800 | 4.5582          | 0.0289         |
| 5.5396        | 3.8685 | 6000 | 4.5680          | 0.0579         |
| 5.4413        | 3.9974 | 6200 | 4.5579          | 0.0579         |
| 5.4534        | 4.1264 | 6400 | 4.5684          | 0.0579         |
| 5.5199        | 4.2553 | 6600 | 4.5726          | 0.0579         |
| 5.5298        | 4.3843 | 6800 | 4.5883          | 0.0579         |
| 5.4346        | 4.5132 | 7000 | 4.5885          | 0.0289         |
| 5.5098        | 4.6422 | 7200 | 4.5895          | 0.0579         |
| 5.489         | 4.7711 | 7400 | 4.5682          | 0.0579         |
| 5.4055        | 4.9001 | 7600 | 4.5755          | 0.0579         |
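
The Epoch and Step columns are mutually consistent with a fixed number of optimizer steps per epoch, which also bounds the training-set size. A quick check; the dataset-size figure is an inference that assumes train_batch_size=16 and no gradient accumulation:

```python
# each logged row should satisfy step ≈ epoch * steps_per_epoch
rows = [(0.1289, 200), (2.5790, 4000), (4.9001, 7600)]  # (epoch, step) samples from the table
estimates = [step / epoch for epoch, step in rows]
steps_per_epoch = round(sum(estimates) / len(estimates))
print(steps_per_epoch)       # ~1551 optimizer steps per epoch
print(steps_per_epoch * 16)  # ~24,816 training examples (assuming batch size 16, no accumulation)
```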

### Framework versions

- PEFT 0.11.0
- Transformers 4.44.2
- PyTorch 1.13.0+cu116
- Datasets 3.0.0
- Tokenizers 0.19.1