
al-wmt14

This model is a fine-tuned version of google-t5/t5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7427
  • BLEU: 51.2627
  • Gen Len: 26.713
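
For reference, a minimal inference sketch with the Transformers library. The repo id `al-wmt14` and the English-to-German task prefix are assumptions; the card does not state the full hub path or the language pair.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repo id: the card names the model "al-wmt14" but does not
# give the full hub path.
model_id = "al-wmt14"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints usually expect a task prefix; English-to-German is assumed
# here, since the card does not state the language pair.
text = "translate English to German: The house is wonderful."
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```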

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 100000
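
These settings map directly onto `Seq2SeqTrainingArguments`. A sketch follows; `output_dir` and the evaluation/save cadence are assumptions not stated in the card, while the remaining values come from the list above.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="al-wmt14",           # assumption
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,   # total train batch size: 8 * 2 = 16
    lr_scheduler_type="linear",
    max_steps=100_000,               # training_steps
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",     # assumption: eval every 10k steps, as in the table below
    eval_steps=10_000,
    save_steps=10_000,
    predict_with_generate=True,      # required to compute BLEU and Gen Len
)
```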

Training results

| Training Loss | Epoch  | Step   | Validation Loss | BLEU    | Gen Len |
|:-------------:|:------:|:------:|:---------------:|:-------:|:-------:|
| 0.8503        | 0.2778 | 10000  | 0.9749          | 41.49   | 27.2803 |
| 0.7082        | 0.5556 | 20000  | 0.8905          | 44.7235 | 26.6963 |
| 0.6407        | 0.8333 | 30000  | 0.8530          | 46.6914 | 27.019  |
| 0.573         | 1.1111 | 40000  | 0.8260          | 47.4882 | 26.6827 |
| 0.5438        | 1.3889 | 50000  | 0.8017          | 48.472  | 26.8617 |
| 0.5263        | 1.6667 | 60000  | 0.7810          | 49.0812 | 26.8817 |
| 0.5091        | 1.9444 | 70000  | 0.7654          | 49.9355 | 26.7853 |
| 0.4699        | 2.2222 | 80000  | 0.7605          | 50.3601 | 26.72   |
| 0.4597        | 2.5    | 90000  | 0.7488          | 50.92   | 26.8803 |
| 0.454         | 2.7778 | 100000 | 0.7427          | 51.2627 | 26.713  |
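
The card does not include the metric code. A common pattern for producing a BLEU / Gen Len pair like the table's uses the `evaluate` library's sacrebleu metric inside `compute_metrics`; this is a sketch of that pattern, not the author's verified setup.

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

# Assumption: sacrebleu via the evaluate library, the usual choice for
# translation fine-tuning; the card does not confirm this.
bleu = evaluate.load("sacrebleu")
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Labels use -100 for ignored positions; swap in pad tokens before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(predictions=decoded_preds,
                          references=[[label] for label in decoded_labels])
    # Gen Len: mean length of generated sequences, excluding padding.
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```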

Framework versions

  • Transformers 4.40.0
  • Pytorch 2.2.2+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1