---
license: apache-2.0
base_model: google-t5/t5-small
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: ft-wmt14
    results: []
---

# ft-wmt14

This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.6254
- BLEU: 26.3346
- Gen Len: 26.6907

## Model description

More information needed

## Intended uses & limitations

More information needed
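
The card leaves this section blank, but the t5-small base and the BLEU metric suggest a machine-translation use, and the name hints at WMT14. Below is a minimal inference sketch; the hub id `lilferrit/ft-wmt14` and the English-to-German task prefix are both unconfirmed assumptions:

```python
# Minimal inference sketch. The hub id and the language pair are
# assumptions, not confirmed by this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "lilferrit/ft-wmt14"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints expect a task prefix; English->German is a guess
# based on the WMT14 name, so adjust for the actual language pair.
text = "translate English to German: The house is wonderful."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, num_beams=4, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```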

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 100000
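
These values map directly onto the `Seq2SeqTrainingArguments` consumed by the Hugging Face Trainer. The sketch below reconstructs that configuration; only the listed values come from this card, while `output_dir` and the evaluation cadence are assumptions (the latter inferred from the 10000-step intervals in the results table below):

```python
# Sketch of the training configuration implied by the list above.
# Values not in the card (output_dir, evaluation cadence) are assumed.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ft-wmt14",           # assumed
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # total train batch size: 16
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    max_steps=100_000,
    evaluation_strategy="steps",     # assumed; results appear every 10000 steps
    eval_steps=10_000,
    predict_with_generate=True,      # required for BLEU / Gen Len at eval time
)
```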

### Training results

| Training Loss | Epoch  | Step   | Validation Loss | BLEU    | Gen Len |
|:-------------:|:------:|:------:|:---------------:|:-------:|:-------:|
| 2.3103        | 0.0355 | 10000  | 1.8454          | 22.058  | 27.7263 |
| 2.2141        | 0.0710 | 20000  | 1.7811          | 23.339  | 26.7147 |
| 2.176         | 0.1065 | 30000  | 1.7361          | 24.3234 | 27.125  |
| 2.139         | 0.1419 | 40000  | 1.7131          | 25.0888 | 26.8213 |
| 2.1084        | 0.1774 | 50000  | 1.6874          | 24.9992 | 26.824  |
| 2.0826        | 0.2129 | 60000  | 1.6685          | 25.7297 | 26.62   |
| 2.068         | 0.2484 | 70000  | 1.6485          | 25.9031 | 26.685  |
| 2.05          | 0.2839 | 80000  | 1.6371          | 26.143  | 26.8693 |
| 2.0331        | 0.3194 | 90000  | 1.6311          | 26.3038 | 26.5183 |
| 2.0273        | 0.3549 | 100000 | 1.6254          | 26.3346 | 26.6907 |
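
The BLEU and Gen Len columns match the metrics produced by the stock Transformers translation fine-tuning recipe, which scores generations with the `evaluate` sacrebleu wrapper. A sketch of that `compute_metrics` shape, not confirmed as the exact code used for this run:

```python
# Sketch of how BLEU and Gen Len are commonly computed in the standard
# Transformers translation example; not confirmed as the setup used here.
import numpy as np
import evaluate

bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds, tokenizer):
    preds, labels = eval_preds
    # Replace label padding (-100) before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Gen Len: mean generated length in tokens, excluding padding.
    gen_len = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    )
    return {"bleu": result["score"], "gen_len": gen_len}
```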

### Framework versions

- Transformers 4.40.0
- Pytorch 2.2.2+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1