ul2-large-dutch-finetuned-oba-book-search

This model is a fine-tuned version of yhavinga/ul2-large-dutch. The training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 3.8688
  • Top-5-accuracy: 4.1194
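
To make the loss figure easier to interpret, it can be converted to a perplexity. The sketch below assumes the reported loss is a mean token-level cross-entropy in nats, which this card does not confirm; if the loss is computed differently, the conversion does not apply.

```python
import math

# Final evaluation loss as reported in this card.
eval_loss = 3.8688

# Assumption: the loss is a mean per-token cross-entropy measured in nats,
# in which case perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"evaluation loss: {eval_loss}")
print(f"perplexity (under the assumption above): {perplexity:.1f}")
```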

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.6
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 3
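
As an illustration of what the listed schedule implies, the following sketch computes the learning rate at a given step for a cosine schedule with linear warmup, using the common half-cosine formulation. The function name `cosine_lr_with_warmup` is hypothetical, and `total_steps=35_000` is an assumption rounded from the last logged step; the card does not state the exact total.

```python
import math


def cosine_lr_with_warmup(step, base_lr=0.6, warmup_steps=1000, total_steps=35_000):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero.

    base_lr and warmup_steps are taken from the hyperparameter list;
    total_steps is an assumed value, not one reported by the card.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, from 0 to 1.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (0, 500, 1000, 18_000, 35_000):
    print(step, round(cosine_lr_with_warmup(step), 4))
```

Under these assumptions the rate peaks at the end of warmup (step 1000) and decays smoothly to zero by the final step.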

Training results

Training Loss  Epoch   Step   Validation Loss  Top-5-accuracy
6.4431         0.0424  500    4.7239           0.0796
6.4068         0.0848  1000   5.1338           0.0398
5.7971         0.1272  1500   4.6127           0.0199
5.452          0.1696  2000   4.5181           0.1194
5.3971         0.2120  2500   4.5498           0.1393
5.2693         0.2544  3000   4.3622           0.1393
5.2788         0.2968  3500   4.3456           0.1990
5.2129         0.3392  4000   4.3400           0.2388
5.133          0.3815  4500   4.3021           0.2786
5.0346         0.4239  5000   4.2458           0.9751
5.113          0.4663  5500   4.2746           0.7363
5.1276         0.5087  6000   4.2369           0.9552
5.0586         0.5511  6500   4.1962           1.8706
4.9369         0.5935  7000   4.1843           2.9254
4.9152         0.6359  7500   4.1641           3.0846
4.9369         0.6783  8000   4.1089           3.7413
4.9185         0.7207  8500   4.1150           3.6418
4.8469         0.7631  9000   4.0996           3.6418
4.8854         0.8055  9500   4.0817           3.5821
4.8362         0.8479  10000  4.0456           4.2587
4.7867         0.8903  10500  4.0699           3.9204
4.7926         0.9327  11000  4.0692           3.3831
4.7933         0.9751  11500  4.0356           3.1642
4.793          1.0175  12000  4.0607           2.6667
4.7664         1.0599  12500  4.0430           3.5622
4.7409         1.1023  13000  4.0239           3.8806
4.7558         1.1446  13500  4.0134           3.7413
4.7642         1.1870  14000  3.9884           3.9403
4.7298         1.2294  14500  4.0087           3.6219
4.7433         1.2718  15000  3.9809           4.0995
4.6858         1.3142  15500  3.9984           4.2985
4.7023         1.3566  16000  3.9655           4.0199
4.6963         1.3990  16500  3.9798           4.1791
4.7239         1.4414  17000  4.0001           4.0597
4.7312         1.4838  17500  3.9532           4.0796
4.6408         1.5262  18000  3.9487           4.2388
4.669          1.5686  18500  3.9303           4.1990
4.6589         1.6110  19000  3.9346           4.1393
4.6887         1.6534  19500  3.9563           3.9403
4.5856         1.6958  20000  3.9374           4.2786
4.6744         1.7382  20500  3.9157           4.0995
4.6395         1.7806  21000  3.9279           4.1393
4.6191         1.8230  21500  3.9259           3.8408
4.6256         1.8654  22000  3.9215           3.9005
4.5945         1.9077  22500  3.9214           4.0796
4.6325         1.9501  23000  3.9076           3.8607
4.6476         1.9925  23500  3.8955           4.0199
4.6362         2.0349  24000  3.8923           4.0398
4.5991         2.0773  24500  3.8923           4.3383
4.6189         2.1197  25000  3.8800           4.0
4.5933         2.1621  25500  3.8869           3.8806
4.6165         2.2045  26000  3.8918           4.0398
4.5998         2.2469  26500  3.8819           3.9602
4.5827         2.2893  27000  3.8848           3.9204
4.528          2.3317  27500  3.8847           3.9005
4.5685         2.3741  28000  3.8879           3.9204
4.5698         2.4165  28500  3.8739           3.9801
4.5472         2.4589  29000  3.8761           4.0398
4.5605         2.5013  29500  3.8753           4.0398
4.5329         2.5437  30000  3.8791           4.0796
4.5687         2.5861  30500  3.8698           4.0
4.5716         2.6285  31000  3.8659           4.0995
4.547          2.6708  31500  3.8713           4.0597
4.6466         2.7132  32000  3.8729           4.0995
4.5963         2.7556  32500  3.8698           4.1194
4.629          2.7980  33000  3.8703           4.1194
4.5859         2.8404  33500  3.8699           4.1194
4.6239         2.8828  34000  3.8688           4.1393
4.5052         2.9252  34500  3.8688           4.1393
4.5933         2.9676  35000  3.8688           4.1194
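
The log above is whitespace-separated, so it can be parsed with a few lines of Python to compare checkpoints. The sketch below uses five rows copied from the table; the helper name `parse_rows` and the dictionary keys are hypothetical, and the "best" rows it reports are only the best among the sampled rows.

```python
# A few rows copied verbatim from the training log, in the column order:
# training loss, epoch, step, validation loss, top-5 accuracy.
SAMPLE_ROWS = """\
6.4431 0.0424 500 4.7239 0.0796
4.8362 0.8479 10000 4.0456 4.2587
4.5991 2.0773 24500 3.8923 4.3383
4.5716 2.6285 31000 3.8659 4.0995
4.5933 2.9676 35000 3.8688 4.1194
"""


def parse_rows(text):
    """Parse whitespace-separated log rows into dictionaries.

    Lines that do not have exactly five fields (such as a header) are skipped.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue
        train_loss, epoch, step, val_loss, top5 = parts
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "top5": float(top5),
        })
    return rows


rows = parse_rows(SAMPLE_ROWS)

# Lowest validation loss and highest top-5 accuracy among the sampled rows.
best_loss = min(rows, key=lambda r: r["val_loss"])
best_top5 = max(rows, key=lambda r: r["top5"])

print("lowest validation loss at step", best_loss["step"], "->", best_loss["val_loss"])
print("highest top-5 accuracy at step", best_top5["step"], "->", best_top5["top5"])
```

Note that the two criteria pick different steps in this sample, so the choice of checkpoint depends on which metric matters for the application.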

Framework versions

  • PEFT 0.11.0
  • Transformers 4.44.2
  • Pytorch 1.13.0+cu116
  • Datasets 3.0.0
  • Tokenizers 0.19.1
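
Since the framework list includes PEFT, the fine-tuned weights are presumably distributed as an adapter on top of the base model. The following is a minimal loading sketch under several assumptions that this card does not confirm: that the base model loads as a sequence-to-sequence model, that the adapter repository id is `esahit/ul2-large-dutch-finetuned-oba-book-search`, and that queries are passed as plain text. The helper names `load_model` and `generate` are hypothetical.

```python
BASE_MODEL_ID = "yhavinga/ul2-large-dutch"  # base model named in this card
# Assumption: adapter repository id, inferred from the model name and its hub page.
ADAPTER_ID = "esahit/ul2-large-dutch-finetuned-oba-book-search"


def load_model(base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the PEFT adapter (downloads weights)."""
    # Imported here so that defining the helpers does not require the libraries.
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    # Assumption: the base checkpoint is an encoder-decoder (seq2seq) model.
    base_model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model


def generate(tokenizer, model, query, max_new_tokens=32):
    """Generate text for a query; the expected input format is not documented."""
    inputs = tokenizer(query, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `load_model()` and then `generate(tokenizer, model, ...)` with a Dutch search query would be the natural way to try this, though the prompt format used during fine-tuning is not documented here, so outputs may not be meaningful without it.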