pakawadeep/mt5-small-finetuned-ctfl-augmented_1
This model is a fine-tuned version of google/mt5-small on an unknown dataset. After the final training epoch (epoch 29), it reports the following training loss, validation loss, and ROUGE results:
- Train Loss: 0.7355
- Validation Loss: 0.9162
- Train Rouge1: 7.8147
- Train Rouge2: 1.2871
- Train Rougel: 7.9562
- Train Rougelsum: 7.9208
- Train Gen Len: 11.9554
- Epoch: 29
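The card does not document the task or the expected input format, so the following is only a minimal sketch of how a checkpoint like this might be loaded for generation. It assumes the repository ships TensorFlow weights (the card lists TensorFlow 2.15.0) and that `sentencepiece` is installed for the mT5 tokenizer. The helper name `generate` and the `max_new_tokens=16` default are illustrative choices, not values taken from the card.

```python
MODEL_ID = "pakawadeep/mt5-small-finetuned-ctfl-augmented_1"


def generate(text: str, max_new_tokens: int = 16) -> str:
    """Run one input string through the model and return the decoded output.

    Hypothetical helper: the input format the model expects is not documented.
    """
    # Imported lazily so this file can be read or imported without TensorFlow.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize to TensorFlow tensors; truncate inputs that exceed the model limit.
    inputs = tokenizer(text, return_tensors="tf", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` downloads the weights on first use. Given the low ROUGE scores reported above, outputs should be checked carefully before relying on them.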
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
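As a rough illustration, the optimizer settings listed above could be reconstructed with the `AdamWeightDecay` class from Transformers. This is a sketch under assumptions: the original training script is not part of this card, the function name `build_optimizer` is hypothetical, and the legacy `decay` entry (0.0) is left out because it has no effect at that value.

```python
# Values copied from the hyperparameter list in this card.
OPTIMIZER_CONFIG = {
    "learning_rate": 2e-05,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
    "weight_decay_rate": 0.01,
}


def build_optimizer(config=OPTIMIZER_CONFIG):
    """Build an AdamWeightDecay optimizer from the listed settings.

    Hypothetical helper; requires TensorFlow and Transformers to be installed.
    """
    # Imported lazily so the config can be inspected without TensorFlow.
    from transformers import AdamWeightDecay

    return AdamWeightDecay(**config)
```

The returned optimizer can be passed to `model.compile(optimizer=...)` on a Keras model. Training ran in float32, so no mixed-precision policy is needed to match the listed setup.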
Training results
Train Loss | Validation Loss | Train Rouge1 | Train Rouge2 | Train Rougel | Train Rougelsum | Train Gen Len | Epoch |
---|---|---|---|---|---|---|---|
8.5523 | 2.2860 | 1.6090 | 0.0 | 1.6604 | 1.6663 | 15.7871 | 0 |
4.1420 | 1.7857 | 4.6205 | 0.4950 | 4.7030 | 4.6865 | 11.6188 | 1 |
2.9923 | 1.7701 | 6.5417 | 0.8251 | 6.6832 | 6.5771 | 11.5545 | 2 |
2.4291 | 1.6654 | 7.7086 | 2.0792 | 7.7086 | 7.7086 | 11.6832 | 3 |
2.0333 | 1.5535 | 8.2037 | 2.0792 | 8.2037 | 8.2037 | 11.8465 | 4 |
1.7636 | 1.4446 | 8.2037 | 2.0792 | 8.2037 | 8.2037 | 12.0050 | 5 |
1.5480 | 1.3806 | 8.2037 | 2.0792 | 8.2037 | 8.2037 | 11.9059 | 6 |
1.3908 | 1.3196 | 8.7871 | 2.3102 | 8.7871 | 8.7871 | 11.9455 | 7 |
1.2889 | 1.2768 | 8.7871 | 2.3102 | 8.7871 | 8.7871 | 11.9604 | 8 |
1.2174 | 1.2264 | 8.9109 | 2.2772 | 8.9109 | 8.9816 | 11.9406 | 9 |
1.1568 | 1.1894 | 8.9816 | 2.4752 | 8.9816 | 8.9816 | 11.9455 | 10 |
1.1170 | 1.1385 | 8.9109 | 2.2772 | 8.9109 | 8.9816 | 11.9703 | 11 |
1.0717 | 1.1276 | 8.9109 | 2.2772 | 8.9109 | 8.9816 | 11.9653 | 12 |
1.0374 | 1.0783 | 8.9109 | 2.2772 | 8.9109 | 8.9816 | 11.9901 | 13 |
1.0088 | 1.0528 | 8.9109 | 2.2772 | 8.9109 | 8.9816 | 11.9802 | 14 |
0.9806 | 1.0489 | 8.9109 | 2.2772 | 8.9109 | 8.9816 | 11.9802 | 15 |
0.9553 | 1.0280 | 8.6987 | 1.7822 | 8.6987 | 8.6987 | 11.9752 | 16 |
0.9314 | 1.0122 | 8.6987 | 1.7822 | 8.6987 | 8.6987 | 11.9851 | 17 |
0.9104 | 0.9970 | 8.4335 | 1.2871 | 8.4512 | 8.4512 | 11.9554 | 18 |
0.8892 | 0.9831 | 8.4335 | 1.2871 | 8.4512 | 8.4512 | 11.9653 | 19 |
0.8675 | 0.9631 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9653 | 20 |
0.8559 | 0.9618 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9703 | 21 |
0.8352 | 0.9623 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9802 | 22 |
0.8225 | 0.9465 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9703 | 23 |
0.8059 | 0.9465 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9653 | 24 |
0.7960 | 0.9388 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9703 | 25 |
0.7766 | 0.9386 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9703 | 26 |
0.7609 | 0.9429 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9653 | 27 |
0.7519 | 0.9256 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9653 | 28 |
0.7355 | 0.9162 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9554 | 29 |
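The table shows validation loss still falling at epoch 29, while ROUGE-1 and ROUGE-2 peak at epoch 10 and decline afterwards. The short sketch below makes that comparison explicit using the validation loss, ROUGE-1, and ROUGE-2 columns copied from the table. It only illustrates how one might pick an epoch; the card does not say whether per-epoch checkpoints were saved.

```python
# (epoch, validation_loss, rouge1, rouge2), copied from the table above.
ROWS = [
    (0, 2.2860, 1.6090, 0.0),    (1, 1.7857, 4.6205, 0.4950),
    (2, 1.7701, 6.5417, 0.8251), (3, 1.6654, 7.7086, 2.0792),
    (4, 1.5535, 8.2037, 2.0792), (5, 1.4446, 8.2037, 2.0792),
    (6, 1.3806, 8.2037, 2.0792), (7, 1.3196, 8.7871, 2.3102),
    (8, 1.2768, 8.7871, 2.3102), (9, 1.2264, 8.9109, 2.2772),
    (10, 1.1894, 8.9816, 2.4752), (11, 1.1385, 8.9109, 2.2772),
    (12, 1.1276, 8.9109, 2.2772), (13, 1.0783, 8.9109, 2.2772),
    (14, 1.0528, 8.9109, 2.2772), (15, 1.0489, 8.9109, 2.2772),
    (16, 1.0280, 8.6987, 1.7822), (17, 1.0122, 8.6987, 1.7822),
    (18, 0.9970, 8.4335, 1.2871), (19, 0.9831, 8.4335, 1.2871),
    (20, 0.9631, 7.8147, 1.2871), (21, 0.9618, 7.8147, 1.2871),
    (22, 0.9623, 7.8147, 1.2871), (23, 0.9465, 7.8147, 1.2871),
    (24, 0.9465, 7.8147, 1.2871), (25, 0.9388, 7.8147, 1.2871),
    (26, 0.9386, 7.8147, 1.2871), (27, 0.9429, 7.8147, 1.2871),
    (28, 0.9256, 7.8147, 1.2871), (29, 0.9162, 7.8147, 1.2871),
]

# Lowest validation loss versus highest ROUGE-1 and ROUGE-2.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_rouge1 = max(ROWS, key=lambda row: row[2])
best_by_rouge2 = max(ROWS, key=lambda row: row[3])

print(f"Lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"Highest ROUGE-1: epoch {best_by_rouge1[0]} ({best_by_rouge1[2]})")
print(f"Highest ROUGE-2: epoch {best_by_rouge2[0]} ({best_by_rouge2[3]})")
```

The two criteria disagree (epoch 29 by loss, epoch 10 by ROUGE), so the final checkpoint is not necessarily the best one for a ROUGE-based evaluation.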
Framework versions
- Transformers 4.41.2
- TensorFlow 2.15.0
- Datasets 2.20.0
- Tokenizers 0.19.1