metadata
license: apache-2.0
base_model: t5-base
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: t5-base-finetuned-c_zh-to-m_zh
  results: []
t5-base-finetuned-c_zh-to-m_zh
This model is a fine-tuned version of t5-base. The training dataset is not documented in this card. After the final epoch (epoch 15, step 3510), it achieves the following results on the evaluation set:
- Loss: 0.1259
- Bleu: 86.7906
- Gen Len: 7.5595
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 15
- mixed_precision_training: Native AMP
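The values listed above determine the learning-rate schedule and give a rough bound on the training-set size. The following is a minimal sketch, assuming zero warmup steps, a single device, and no gradient accumulation; none of these settings is stated in this card. The total of 3510 optimizer steps is taken from the training results table, and `linear_lr` is a hypothetical helper written for illustration, not a library function.

```python
BASE_LR = 2e-05        # learning_rate from this card
NUM_EPOCHS = 15        # num_epochs from this card
TOTAL_STEPS = 3510     # final step in the training results table
TRAIN_BATCH_SIZE = 16  # train_batch_size from this card


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear decay to zero.

    Assumption: warmup_steps is 0 (the card lists no warmup setting).
    """
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# Assumption: one device, no gradient accumulation, last partial batch kept.
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1

if __name__ == "__main__":
    for step in (0, 1755, 3510):
        print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
    print(f"steps per epoch: {steps_per_epoch}")
    print(f"training examples: between {min_examples} and {max_examples}")
```

Under these assumptions the learning rate falls from 2e-05 at step 0 to half that value at step 1755 and reaches zero at step 3510, and the training split would hold between 3729 and 3744 examples.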
Training results
Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
---|---|---|---|---|---|
No log | 1.0 | 234 | 0.1445 | 86.8464 | 7.5788 |
No log | 2.0 | 468 | 0.1357 | 86.5266 | 7.6088 |
0.194 | 3.0 | 702 | 0.1317 | 86.8389 | 7.5756 |
0.194 | 4.0 | 936 | 0.1302 | 87.0248 | 7.5648 |
0.1428 | 5.0 | 1170 | 0.1372 | 85.5082 | 7.686 |
0.1428 | 6.0 | 1404 | 0.1297 | 86.3936 | 7.612 |
0.1328 | 7.0 | 1638 | 0.1273 | 86.7919 | 7.5745 |
0.1328 | 8.0 | 1872 | 0.1266 | 86.7919 | 7.5745 |
0.129 | 9.0 | 2106 | 0.1262 | 86.9787 | 7.5606 |
0.129 | 10.0 | 2340 | 0.1256 | 86.882 | 7.5616 |
0.1262 | 11.0 | 2574 | 0.1259 | 86.9757 | 7.5616 |
0.1262 | 12.0 | 2808 | 0.1255 | 86.8843 | 7.5595 |
0.1262 | 13.0 | 3042 | 0.1257 | 86.9322 | 7.5584 |
0.1262 | 14.0 | 3276 | 0.1256 | 87.0674 | 7.5563 |
0.1238 | 15.0 | 3510 | 0.1259 | 86.7906 | 7.5595 |
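The summary metrics at the top of this card match the epoch 15 row, which is not the best row on either metric. The sketch below copies the epoch, validation loss, and BLEU columns from the table and picks the best epoch for each. It only reads the reported numbers; whether earlier checkpoints were saved or published is not stated in this card.

```python
# (epoch, validation_loss, bleu) copied from the training results table
RESULTS = [
    (1, 0.1445, 86.8464),
    (2, 0.1357, 86.5266),
    (3, 0.1317, 86.8389),
    (4, 0.1302, 87.0248),
    (5, 0.1372, 85.5082),
    (6, 0.1297, 86.3936),
    (7, 0.1273, 86.7919),
    (8, 0.1266, 86.7919),
    (9, 0.1262, 86.9787),
    (10, 0.1256, 86.882),
    (11, 0.1259, 86.9757),
    (12, 0.1255, 86.8843),
    (13, 0.1257, 86.9322),
    (14, 0.1256, 87.0674),
    (15, 0.1259, 86.7906),
]

# Lowest validation loss and highest BLEU across all epochs.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_bleu = max(RESULTS, key=lambda row: row[2])
final = RESULTS[-1]

if __name__ == "__main__":
    print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
    print(f"highest BLEU:           epoch {best_by_bleu[0]} ({best_by_bleu[2]})")
    print(f"final epoch:            epoch {final[0]} (loss {final[1]}, BLEU {final[2]})")
```

Validation loss is lowest at epoch 12 (0.1255) and BLEU is highest at epoch 14 (87.0674). The gaps to the final epoch are small (0.0004 in loss, about 0.28 BLEU), so the run had largely plateaued by epoch 10.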
Framework versions
- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1