---
base_model: yhavinga/ul2-large-dutch
library_name: peft
license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: ul2-large-dutch-finetuned-oba-book-search
    results: []
---

# ul2-large-dutch-finetuned-oba-book-search

This model is a fine-tuned version of [yhavinga/ul2-large-dutch](https://huggingface.co/yhavinga/ul2-large-dutch) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 5.6126
- Top-5 accuracy: 0.0

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
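
A linear scheduler decays the learning rate from its base value toward zero over the total number of optimizer steps. The following sketch reproduces that schedule in plain Python under one assumption the card does not state: zero warmup steps. The total-step count of roughly 3880 is inferred from the results table below (step 3800 corresponds to epoch ~4.90 of 5).

```python
# Sketch of the linear learning-rate schedule implied by
# lr_scheduler_type: linear. Assumes zero warmup steps, which
# this card does not state explicitly.

BASE_LR = 1e-3  # learning_rate from the hyperparameter list

def linear_lr(step: int, total_steps: int, base_lr: float = BASE_LR) -> float:
    """Linearly decay the learning rate from base_lr to 0 over total_steps."""
    remaining = max(0.0, float(total_steps - step))
    return base_lr * remaining / total_steps

# ~3880 optimizer steps for 5 epochs at the logged pace
# (step 3800 corresponds to epoch ~4.8969 in the results table).
total = 3880
print(linear_lr(0, total))      # 0.001 (full base LR at the start)
print(linear_lr(total, total))  # 0.0 (fully decayed at the end)
```

Halfway through training (step 1940 here), the rate is exactly half the base value, 0.0005.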

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Top-5 Accuracy |
|:-------------:|:------:|:----:|:---------------:|:--------------:|
| 8.4166        | 0.2577 | 200  | 5.9848          | 0.0            |
| 8.297         | 0.5155 | 400  | 5.9446          | 0.0            |
| 8.0509        | 0.7732 | 600  | 5.8986          | 0.0            |
| 8.1095        | 1.0309 | 800  | 5.8153          | 0.0            |
| 7.9101        | 1.2887 | 1000 | 5.7811          | 0.0            |
| 8.0255        | 1.5464 | 1200 | 5.7496          | 0.0            |
| 8.0218        | 1.8041 | 1400 | 5.7238          | 0.0            |
| 8.0497        | 2.0619 | 1600 | 5.7016          | 0.0            |
| 8.1829        | 2.3196 | 1800 | 5.6813          | 0.0            |
| 8.0591        | 2.5773 | 2000 | 5.6719          | 0.0            |
| 8.0816        | 2.8351 | 2200 | 5.6573          | 0.0            |
| 7.9825        | 3.0928 | 2400 | 5.6475          | 0.0            |
| 8.1364        | 3.3505 | 2600 | 5.6383          | 0.0            |
| 7.9707        | 3.6082 | 2800 | 5.6298          | 0.0            |
| 7.9173        | 3.8660 | 3000 | 5.6232          | 0.0            |
| 8.0502        | 4.1237 | 3200 | 5.6226          | 0.0            |
| 8.1764        | 4.3814 | 3400 | 5.6163          | 0.0            |
| 7.9046        | 4.6392 | 3600 | 5.6141          | 0.0            |
| 7.7162        | 4.8969 | 3800 | 5.6126          | 0.0            |
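
The validation loss in the table falls monotonically even though top-5 accuracy never moves off 0.0. The quick check below verifies the trend from the logged values and converts the final loss to a perplexity, under the assumption (not stated in this card) that the loss is a mean token-level cross-entropy.

```python
import math

# Validation losses copied from the results table above.
eval_losses = [5.9848, 5.9446, 5.8986, 5.8153, 5.7811, 5.7496, 5.7238,
               5.7016, 5.6813, 5.6719, 5.6573, 5.6475, 5.6383, 5.6298,
               5.6232, 5.6226, 5.6163, 5.6141, 5.6126]

# The loss decreases strictly at every logged evaluation step.
assert all(a > b for a, b in zip(eval_losses, eval_losses[1:]))

# If the loss is a mean token-level cross-entropy, the final
# perplexity would be exp(5.6126) ~ 274. This is an assumption:
# the card does not say how the loss is computed.
ppl = math.exp(eval_losses[-1])
print(round(ppl))
```

A steadily falling loss with a flat 0.0 top-5 accuracy usually means the metric, not the optimization, is the thing to inspect first.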

### Framework versions

- PEFT 0.11.0
- Transformers 4.44.2
- PyTorch 1.13.0+cu116
- Datasets 3.0.0
- Tokenizers 0.19.1