ft-wmt14

This model is a fine-tuned version of google-t5/t5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7607
  • BLEU: 23.421
  • Gen Len: 27.6243

Model description

More information needed

Intended uses & limitations

More information needed
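
Pending fuller documentation, the checkpoint can be used like any T5 translation model. The sketch below is a minimal example, assuming the model is published under the hypothetical hub id `your-username/ft-wmt14` and targets English-to-German (the WMT14 pair the name suggests); adjust both to match the actual checkpoint.

```python
# Minimal inference sketch; the hub id and language pair below are
# assumptions, not confirmed by this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "your-username/ft-wmt14"  # hypothetical id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints expect a task prefix before the source sentence.
text = "translate English to German: The house is wonderful."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```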

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch reproducing this setup follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adafactor
  • lr_scheduler_type: linear
  • training_steps: 100000
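
A `Seq2SeqTrainer` configuration matching these values might look like the sketch below; the dataset, preprocessing, and metric wiring are omitted because the card does not specify them, and the `output_dir` name is an assumption.

```python
# Hedged training-setup sketch mirroring the hyperparameters listed above.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-small")

args = Seq2SeqTrainingArguments(
    output_dir="ft-wmt14",            # assumed name
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,    # effective train batch size 16
    max_steps=100_000,
    seed=42,
    optim="adafactor",
    lr_scheduler_type="linear",
    predict_with_generate=True,       # needed for BLEU / Gen Len
    evaluation_strategy="steps",
    eval_steps=10_000,                # matches the eval cadence below
)
# A Seq2SeqTrainer would then be built from `args`, the model, a data
# collator, and the (unspecified) train/eval datasets.
```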

Training results

| Training Loss | Epoch  | Step   | Validation Loss | BLEU    | Gen Len |
|---------------|--------|--------|-----------------|---------|---------|
| 1.7882        | 0.2778 | 10000  | 1.9278          | 19.7853 | 28.4147 |
| 1.6619        | 0.5556 | 20000  | 1.8710          | 21.3803 | 27.6670 |
| 1.6007        | 0.8333 | 30000  | 1.8397          | 22.2715 | 27.3170 |
| 1.5269        | 1.1111 | 40000  | 1.8205          | 21.9329 | 27.7040 |
| 1.4980        | 1.3889 | 50000  | 1.8134          | 22.4836 | 27.6300 |
| 1.4801        | 1.6667 | 60000  | 1.7941          | 22.7270 | 27.5820 |
| 1.4620        | 1.9444 | 70000  | 1.7766          | 23.0372 | 27.5903 |
| 1.4182        | 2.2222 | 80000  | 1.7724          | 23.6231 | 27.4233 |
| 1.4079        | 2.5000 | 90000  | 1.7663          | 23.2604 | 27.7623 |
| 1.4037        | 2.7778 | 100000 | 1.7607          | 23.4210 | 27.6243 |
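
The BLEU figures above are most likely corpus BLEU as computed by sacrebleu through the `evaluate` library, with Gen Len the mean length of the generated sequences; that interpretation is an assumption, since the card does not state the metric setup. A minimal sketch:

```python
# Hedged sketch of the presumed metric computation; in practice the decoded
# predictions and references would come from generating over the eval set.
import evaluate
import numpy as np

bleu = evaluate.load("sacrebleu")
predictions = ["Das Haus ist wunderbar."]   # decoded model outputs
references = [["Das Haus ist wunderbar."]]  # one reference list per prediction

result = bleu.compute(predictions=predictions, references=references)
gen_len = np.mean([len(p.split()) for p in predictions])  # rough length proxy
print(f"BLEU: {result['score']:.4f}, Gen Len: {gen_len:.4f}")
```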

Framework versions

  • Transformers 4.40.0
  • Pytorch 2.2.2+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1