metadata

tags:
  - generated_from_trainer
datasets:
  - ccmatrix
metrics:
  - bleu
model-index:
  - name: t5-small_de-finetuned-en-to-it
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: ccmatrix
          type: ccmatrix
          config: en-it
          split: train[3000:12000]
          args: en-it
        metrics:
          - name: Bleu
            type: bleu
            value: 6.7338

t5-small_de-finetuned-en-to-it

This model is a fine-tuned version of din0s/t5-small-finetuned-en-to-de, trained on the en-it configuration of the ccmatrix dataset. After 40 epochs of training, it achieves the following results on the evaluation set:

Loss: 2.3480
Bleu: 6.7338
Gen Len: 61.3273

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

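The metadata lists the en-it configuration of ccmatrix with the split train[3000:12000]. More information needed on how the evaluation set was drawn.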

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 96
eval_batch_size: 96
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 40
mixed_precision_training: Native AMP
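
As a rough illustration of what these settings imply, the sketch below computes the learning rate at a given optimizer step under a linear schedule, and relates the batch size to the number of steps per epoch. It assumes zero warmup steps and a decay to zero at the final step, neither of which is stated here, and it takes the figure of 94 steps per epoch from the training results. The helpers `linear_lr`, `steps_for` and `max_examples_per_epoch` are hypothetical names, not part of any library.

```python
import math

PEAK_LR = 2e-05        # learning_rate from the list above
BATCH_SIZE = 96        # train_batch_size from the list above
NUM_EPOCHS = 40        # num_epochs from the list above
STEPS_PER_EPOCH = 94   # step count reported after epoch 1 in the training results
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup, then linear decay to zero (warmup_steps=0 is an assumption)."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


def steps_for(num_examples, batch_size=BATCH_SIZE):
    """Optimizer steps per epoch, assuming one device and no gradient accumulation."""
    return math.ceil(num_examples / batch_size)


def max_examples_per_epoch(steps_per_epoch=STEPS_PER_EPOCH, batch_size=BATCH_SIZE):
    """Upper bound on training examples per epoch under the same assumptions."""
    return steps_per_epoch * batch_size


# Learning rate at the start, a quarter, half and end of training
for step in (0, 940, 1880, TOTAL_STEPS):
    print(step, linear_lr(step))

# A 9000-example slice would need this many steps per epoch at batch size 96
print(steps_for(9000))
```

Under these assumptions, a 9000-example slice such as train[3000:12000] gives 94 steps per epoch, which is consistent with the step counts in the training results. This is an inference from the numbers, not something the training log confirms.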

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
No log	1.0	94	3.1064	2.9057	47.5067
No log	2.0	188	2.9769	2.7484	76.9273
No log	3.0	282	2.9015	3.0624	79.8873
No log	4.0	376	2.8444	3.2959	78.276
No log	5.0	470	2.7989	3.6694	74.6013
3.3505	6.0	564	2.7564	3.8098	74.3247
3.3505	7.0	658	2.7212	3.9596	72.554
3.3505	8.0	752	2.6886	4.2231	70.7673
3.3505	9.0	846	2.6572	4.1466	72.0113
3.3505	10.0	940	2.6294	4.2696	71.1647
3.0254	11.0	1034	2.6064	4.6375	67.7707
3.0254	12.0	1128	2.5838	4.7208	68.6707
3.0254	13.0	1222	2.5614	4.9191	68.5767
3.0254	14.0	1316	2.5427	4.9837	66.3867
3.0254	15.0	1410	2.5241	5.1011	66.7667
2.8789	16.0	1504	2.5093	5.283	64.944
2.8789	17.0	1598	2.4919	5.3205	65.738
2.8789	18.0	1692	2.4788	5.3046	65.3207
2.8789	19.0	1786	2.4651	5.5282	64.9407
2.8789	20.0	1880	2.4532	5.6745	63.0873
2.8789	21.0	1974	2.4419	5.7073	63.4973
2.7782	22.0	2068	2.4308	5.8513	62.8813
2.7782	23.0	2162	2.4209	5.8267	64.1033
2.7782	24.0	2256	2.4124	5.8534	64.2993
2.7782	25.0	2350	2.4037	6.0406	63.8313
2.7782	26.0	2444	2.3964	6.1517	63.4213
2.7116	27.0	2538	2.3897	6.2175	63.0573
2.7116	28.0	2632	2.3836	6.2551	62.876
2.7116	29.0	2726	2.3777	6.4412	62.4167
2.7116	30.0	2820	2.3717	6.4604	62.1087
2.7116	31.0	2914	2.3673	6.5471	62.1373
2.6662	32.0	3008	2.3634	6.5296	62.2533
2.6662	33.0	3102	2.3596	6.6623	61.276
2.6662	34.0	3196	2.3564	6.6591	61.392
2.6662	35.0	3290	2.3539	6.7201	61.0827
2.6662	36.0	3384	2.3516	6.675	61.3173
2.6662	37.0	3478	2.3500	6.6894	61.3507
2.6411	38.0	3572	2.3488	6.6539	61.5253
2.6411	39.0	3666	2.3482	6.7135	61.3733
2.6411	40.0	3760	2.3480	6.7338	61.3273
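
To make the trend in these results easier to check, the sketch below takes a hand-copied subset of rows and reports the best epoch by BLEU, whether validation loss keeps falling, and the perplexity implied by the final loss. The perplexity figure assumes the reported loss is a mean per-token cross-entropy in nats, which is not stated here. The helper names are hypothetical.

```python
import math

# (epoch, validation_loss, bleu) copied from selected rows of the results above
ROWS = [
    (1, 3.1064, 2.9057),
    (10, 2.6294, 4.2696),
    (20, 2.4532, 5.6745),
    (30, 2.3717, 6.4604),
    (35, 2.3539, 6.7201),
    (39, 2.3482, 6.7135),
    (40, 2.3480, 6.7338),
]


def best_by_bleu(rows):
    """Return the row with the highest BLEU score."""
    return max(rows, key=lambda row: row[2])


def loss_is_non_increasing(rows):
    """True if validation loss never rises from one listed row to the next."""
    return all(later[1] <= earlier[1] for earlier, later in zip(rows, rows[1:]))


def perplexity(loss):
    """exp(loss); assumes the loss is mean per-token cross-entropy in nats."""
    return math.exp(loss)


best_epoch, best_loss, best_bleu = best_by_bleu(ROWS)
print("best epoch by BLEU:", best_epoch, "BLEU:", best_bleu)
print("validation loss non-increasing:", loss_is_non_increasing(ROWS))
print("perplexity at final loss:", round(perplexity(ROWS[-1][1]), 2))
```

In the full results, validation loss falls at every epoch while BLEU fluctuates slightly near the end (for example 6.7201 at epoch 35 and 6.675 at epoch 36). The final epoch has both the lowest loss and the highest BLEU.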

Framework versions

Transformers 4.22.1
PyTorch 1.12.1
Datasets 2.5.1
Tokenizers 0.11.0
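
When trying to reproduce these results, it can help to compare the local environment with the versions listed above. The sketch below uses only the standard library and assumes the PyPI distribution names `transformers`, `torch`, `datasets` and `tokenizers`. The helpers `installed_version` and `compare_versions` are hypothetical names.

```python
from importlib import metadata

# Versions listed above, keyed by PyPI distribution name ("torch" is PyTorch)
EXPECTED = {
    "transformers": "4.22.1",
    "torch": "1.12.1",
    "datasets": "2.5.1",
    "tokenizers": "0.11.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected=EXPECTED):
    """Map each package name to (expected, installed, matches)."""
    report = {}
    for name, want in expected.items():
        have = installed_version(name)
        # A local build tag such as "1.12.1+cu113" is treated as matching "1.12.1"
        matches = have is not None and have.split("+")[0] == want
        report[name] = (want, have, matches)
    return report


for name, (want, have, ok) in compare_versions().items():
    print(f"{name}: expected {want}, installed {have}, match={ok}")
```

A mismatch does not necessarily prevent the model from loading; the check only shows where the environment differs from the one recorded here.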