---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- wmt16
metrics:
- bleu
model-index:
- name: t5-small-finetuned-de-to-en
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: wmt16
      type: wmt16
      args: de-en
    metrics:
    - name: Bleu
      type: bleu
      value: 11.3921
---

# t5-small-finetuned-de-to-en

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the wmt16 dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

- Loss: 1.8219
- Bleu: 11.3921
- Gen Len: 17.2471
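
A minimal inference sketch, not part of the original card: the repo id `marciovbarbosa/t5-small-finetuned-de-to-en` is inferred from this card's author and model name, and the `translate German to English:` task prefix is an assumption based on the usual t5-small fine-tuning setup.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Repo id inferred from this card's author and model name (assumption).
model_id = "marciovbarbosa/t5-small-finetuned-de-to-en"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints are usually fine-tuned with a task prefix; this one is assumed.
text = "translate German to English: Guten Morgen, wie geht es dir?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```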

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
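
Although the card leaves this section blank, the metadata above points at the WMT16 German-English pairs; a sketch of loading them with 🤗 Datasets:

```python
from datasets import load_dataset

# WMT16 de-en, as referenced in the card metadata; each example holds a
# {'de': ..., 'en': ...} translation pair.
raw = load_dataset("wmt16", "de-en")
print(raw)  # train / validation / test splits
print(raw["train"][0]["translation"])
```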

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the Trainer sketch after this list):

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
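
As an illustration only, the listed values map onto `Seq2SeqTrainingArguments` roughly as below; the author's actual training script is not part of this card, and `output_dir`, `evaluation_strategy`, and `predict_with_generate` are assumptions consistent with the per-epoch results table.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned-de-to-en",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    evaluation_strategy="epoch",   # assumed: the results below are per epoch
    predict_with_generate=True,    # assumed: needed to report Bleu / Gen Len
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default,
# so no extra optimizer arguments are required.
```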

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| No log        | 1.0   | 272   | 2.1014          | 5.5136  | 17.4975 |
| 2.5302        | 2.0   | 544   | 2.0258          | 7.4515  | 17.3941 |
| 2.5302        | 3.0   | 816   | 1.9866          | 8.3061  | 17.3441 |
| 2.3778        | 4.0   | 1088  | 1.9602          | 8.9169  | 17.3588 |
| 2.3778        | 5.0   | 1360  | 1.9382          | 9.3651  | 17.3204 |
| 2.2676        | 6.0   | 1632  | 1.9215          | 9.6428  | 17.3588 |
| 2.2676        | 7.0   | 1904  | 1.9067          | 9.8039  | 17.3418 |
| 2.2096        | 8.0   | 2176  | 1.8984          | 9.8545  | 17.3264 |
| 2.2096        | 9.0   | 2448  | 1.8883          | 10.03   | 17.3278 |
| 2.1501        | 10.0  | 2720  | 1.8797          | 10.2398 | 17.3358 |
| 2.1501        | 11.0  | 2992  | 1.8738          | 10.3086 | 17.3258 |
| 2.1025        | 12.0  | 3264  | 1.8677          | 10.3851 | 17.3181 |
| 2.0638        | 13.0  | 3536  | 1.8623          | 10.489  | 17.3014 |
| 2.0638        | 14.0  | 3808  | 1.8574          | 10.4969 | 17.3204 |
| 2.034         | 15.0  | 4080  | 1.8528          | 10.7067 | 17.3178 |
| 2.034         | 16.0  | 4352  | 1.8493          | 10.6867 | 17.3408 |
| 1.9852        | 17.0  | 4624  | 1.8473          | 10.8333 | 17.3198 |
| 1.9852        | 18.0  | 4896  | 1.8429          | 10.8907 | 17.3001 |
| 1.9646        | 19.0  | 5168  | 1.8405          | 10.9049 | 17.3154 |
| 1.9646        | 20.0  | 5440  | 1.8385          | 10.9549 | 17.3124 |
| 1.9264        | 21.0  | 5712  | 1.8361          | 11.0046 | 17.3068 |
| 1.9264        | 22.0  | 5984  | 1.8338          | 11.1415 | 17.2954 |
| 1.9161        | 23.0  | 6256  | 1.8333          | 11.1041 | 17.2938 |
| 1.882         | 24.0  | 6528  | 1.8323          | 11.0801 | 17.2651 |
| 1.882         | 25.0  | 6800  | 1.8309          | 11.157  | 17.2921 |
| 1.8751        | 26.0  | 7072  | 1.8290          | 11.1713 | 17.2951 |
| 1.8751        | 27.0  | 7344  | 1.8279          | 11.2006 | 17.2861 |
| 1.8425        | 28.0  | 7616  | 1.8267          | 11.1761 | 17.2658 |
| 1.8425        | 29.0  | 7888  | 1.8278          | 11.148  | 17.2841 |
| 1.8306        | 30.0  | 8160  | 1.8261          | 11.1765 | 17.2748 |
| 1.8306        | 31.0  | 8432  | 1.8255          | 11.2723 | 17.2454 |
| 1.8229        | 32.0  | 8704  | 1.8247          | 11.2715 | 17.2621 |
| 1.8229        | 33.0  | 8976  | 1.8231          | 11.2896 | 17.2698 |
| 1.7975        | 34.0  | 9248  | 1.8245          | 11.322  | 17.2491 |
| 1.7919        | 35.0  | 9520  | 1.8238          | 11.3854 | 17.2711 |
| 1.7919        | 36.0  | 9792  | 1.8237          | 11.3304 | 17.2634 |
| 1.7781        | 37.0  | 10064 | 1.8225          | 11.3184 | 17.2644 |
| 1.7781        | 38.0  | 10336 | 1.8230          | 11.3382 | 17.2651 |
| 1.7819        | 39.0  | 10608 | 1.8228          | 11.3656 | 17.2658 |
| 1.7819        | 40.0  | 10880 | 1.8221          | 11.3934 | 17.2544 |
| 1.7592        | 41.0  | 11152 | 1.8223          | 11.3625 | 17.2421 |
| 1.7592        | 42.0  | 11424 | 1.8221          | 11.4068 | 17.2511 |
| 1.7529        | 43.0  | 11696 | 1.8224          | 11.4199 | 17.2541 |
| 1.7529        | 44.0  | 11968 | 1.8224          | 11.4051 | 17.2561 |
| 1.7482        | 45.0  | 12240 | 1.8223          | 11.4195 | 17.2504 |
| 1.7461        | 46.0  | 12512 | 1.8220          | 11.3873 | 17.2497 |
| 1.7461        | 47.0  | 12784 | 1.8220          | 11.4214 | 17.2431 |
| 1.739         | 48.0  | 13056 | 1.8218          | 11.3972 | 17.2441 |
| 1.739         | 49.0  | 13328 | 1.8219          | 11.3952 | 17.2457 |
| 1.7362        | 50.0  | 13600 | 1.8219          | 11.3921 | 17.2471 |
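
The Bleu column above appears to be on the 0-100 scale; a sketch of computing such a score, assuming the `sacrebleu` wrapper that the translation example scripts of this Transformers/Datasets generation used (the card's metric type only says `bleu`):

```python
from datasets import load_metric

# Hypothetical prediction/reference pair; in practice predictions come from
# model.generate over the wmt16 validation set.
metric = load_metric("sacrebleu")
predictions = ["The weather is nice today."]
references = [["The weather is good today."]]  # one list of references per example
result = metric.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```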

### Framework versions

- Transformers 4.12.5
- Pytorch 1.10.0+cu111
- Datasets 1.16.1
- Tokenizers 0.10.3