
t5-wmt16-ro-en

This model is a fine-tuned version of t5-small on the wmt16 ro-en dataset. It achieves the following results on the evaluation set (a usage sketch follows the metric list):

  • Loss: 1.3574
  • Bleu: 27.1318
  • Gen Len: 42.5798
  • Loss Smallest Subnet: 1.3574
  • Bleu Smallest Subnet: 27.1318
  • Gen Len Smallest Subnet: 42.5798
  • Loss Random Subnet: 1.3574
  • Bleu Random Subnet: 27.1318
  • Gen Len Random Subnet: 42.5798
  • Loss Sum: 4.0723
  • Bleu Sum: 81.3954
  • Gen Len Sum: 127.7394
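
The card does not include a usage snippet, so here is a minimal inference sketch. It assumes the checkpoint is loaded from wonjeongho/t5-wmt16-ro-en (the Hub id this card belongs to) and follows the usual T5 task-prefix convention; the prefix and translation direction are assumptions, not documented in this card:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed Hub id; swap in a local path if you trained the model yourself.
model_id = "wonjeongho/t5-wmt16-ro-en"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 is text-to-text and is normally steered with a task prefix. The prefix
# below assumes a Romanian-to-English direction, matching the "ro-en" name.
text = "translate Romanian to English: Acesta este un test."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```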

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent Seq2SeqTrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 12
  • eval_batch_size: 24
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 48
  • total_eval_batch_size: 96
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5.0
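
For reference, these settings correspond roughly to the transformers Seq2SeqTrainingArguments below. This is a hedged sketch rather than the author's actual script; output_dir and predict_with_generate are assumptions, and the per-device batch sizes yield the listed totals of 48 and 96 once multiplied across the 4 GPUs.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="t5-wmt16-ro-en",       # assumed output directory name
    learning_rate=5e-5,
    per_device_train_batch_size=12,    # 12 x 4 GPUs = 48 total
    per_device_eval_batch_size=24,     # 24 x 4 GPUs = 96 total
    seed=42,
    num_train_epochs=5.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,                    # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,        # assumed; needed for Bleu / Gen Len
)
```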

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len | Loss Smallest Subnet | Bleu Smallest Subnet | Gen Len Smallest Subnet | Loss Random Subnet | Loss Sum | Bleu Random Subnet | Bleu Sum | Gen Len Random Subnet | Gen Len Sum |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5967 | 1.0 | 12715 | 1.3820 | 26.593 | 42.4422 | 1.3820 | 26.593 | 42.4422 | 1.3820 | 4.1461 | 26.593 | 79.779 | 42.4422 | 127.3266 |
| 0.5768 | 2.0 | 25430 | 1.3728 | 26.6191 | 42.6738 | 1.3728 | 26.6191 | 42.6738 | 1.3728 | 4.1184 | 26.6191 | 79.8573 | 42.6738 | 128.0214 |
| 0.5663 | 3.0 | 38145 | 1.3616 | 26.9203 | 42.5298 | 1.3616 | 26.9203 | 42.5298 | 1.3616 | 4.0849 | 26.9203 | 80.7609 | 42.5298 | 127.5894 |
| 0.5523 | 4.0 | 50860 | 1.3570 | 27.0195 | 42.5203 | 1.3570 | 27.0195 | 42.5203 | 1.3570 | 4.0709 | 27.0195 | 81.0585 | 42.5203 | 127.5609 |
| 0.5436 | 5.0 | 63575 | 1.3574 | 27.1318 | 42.5798 | 1.3574 | 27.1318 | 42.5798 | 1.3574 | 4.0723 | 27.1318 | 81.3954 | 42.5798 | 127.7394 |
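
The metric code itself is not part of this card. The sketch below shows the standard sacreBLEU recipe for a Seq2SeqTrainer-style evaluation, which plausibly produced the Bleu and Gen Len columns; it uses load_metric from the Datasets version listed under Framework versions, and the Gen Len definition (mean non-pad token count) is an assumption.

```python
import numpy as np
from datasets import load_metric
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("wonjeongho/t5-wmt16-ro-en")
metric = load_metric("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Labels are padded with -100 for the loss; restore pad ids before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = metric.compute(predictions=decoded_preds,
                            references=[[label] for label in decoded_labels])
    # Gen Len as the mean number of non-pad tokens in the generations.
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id)
                       for p in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```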

Framework versions

  • Transformers 4.20.1
  • Pytorch 1.8.0
  • Datasets 2.4.0
  • Tokenizers 0.12.1