
pakawadeep/mt5-small-finetuned-ctfl-augmented_1

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Train Loss: 0.7355
  • Validation Loss: 0.9162
  • Train Rouge1: 7.8147
  • Train Rouge2: 1.2871
  • Train Rougel: 7.9562
  • Train Rougelsum: 7.9208
  • Train Gen Len: 11.9554
  • Epoch: 29
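The checkpoint can be loaded with the Transformers TensorFlow classes. A minimal sketch follows; the expected input format and any task prompt are not documented in this card, so the input string below is only a placeholder:

```python
from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

model_id = "pakawadeep/mt5-small-finetuned-ctfl-augmented_1"

# Download the tokenizer and TensorFlow weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input: the training data and task format are not described
# in this card, so adapt this to your actual use case.
inputs = tokenizer("example input text", return_tensors="tf")
outputs = model.generate(**inputs, max_length=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```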

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
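The serialized optimizer entry above corresponds to Transformers' `AdamWeightDecay` class for TensorFlow. A sketch of how it could be reconstructed from those hyperparameters (the original training script is not part of this card, so this is an assumption, not the author's code):

```python
from transformers import AdamWeightDecay

# Rebuild the optimizer from the serialized config listed above.
# (Sketch only: taken from the hyperparameter dump, not the training script.)
optimizer = AdamWeightDecay(
    learning_rate=2e-05,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-07,
    amsgrad=False,
    weight_decay_rate=0.01,
)
```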

Training results

| Train Loss | Validation Loss | Train Rouge1 | Train Rouge2 | Train Rougel | Train Rougelsum | Train Gen Len | Epoch |
|:----------:|:---------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:-----:|
| 8.5523 | 2.2860 | 1.6090 | 0.0 | 1.6604 | 1.6663 | 15.7871 | 0 |
| 4.1420 | 1.7857 | 4.6205 | 0.4950 | 4.7030 | 4.6865 | 11.6188 | 1 |
| 2.9923 | 1.7701 | 6.5417 | 0.8251 | 6.6832 | 6.5771 | 11.5545 | 2 |
| 2.4291 | 1.6654 | 7.7086 | 2.0792 | 7.7086 | 7.7086 | 11.6832 | 3 |
| 2.0333 | 1.5535 | 8.2037 | 2.0792 | 8.2037 | 8.2037 | 11.8465 | 4 |
| 1.7636 | 1.4446 | 8.2037 | 2.0792 | 8.2037 | 8.2037 | 12.0050 | 5 |
| 1.5480 | 1.3806 | 8.2037 | 2.0792 | 8.2037 | 8.2037 | 11.9059 | 6 |
| 1.3908 | 1.3196 | 8.7871 | 2.3102 | 8.7871 | 8.7871 | 11.9455 | 7 |
| 1.2889 | 1.2768 | 8.7871 | 2.3102 | 8.7871 | 8.7871 | 11.9604 | 8 |
| 1.2174 | 1.2264 | 8.9109 | 2.2772 | 8.9109 | 8.9816 | 11.9406 | 9 |
| 1.1568 | 1.1894 | 8.9816 | 2.4752 | 8.9816 | 8.9816 | 11.9455 | 10 |
| 1.1170 | 1.1385 | 8.9109 | 2.2772 | 8.9109 | 8.9816 | 11.9703 | 11 |
| 1.0717 | 1.1276 | 8.9109 | 2.2772 | 8.9109 | 8.9816 | 11.9653 | 12 |
| 1.0374 | 1.0783 | 8.9109 | 2.2772 | 8.9109 | 8.9816 | 11.9901 | 13 |
| 1.0088 | 1.0528 | 8.9109 | 2.2772 | 8.9109 | 8.9816 | 11.9802 | 14 |
| 0.9806 | 1.0489 | 8.9109 | 2.2772 | 8.9109 | 8.9816 | 11.9802 | 15 |
| 0.9553 | 1.0280 | 8.6987 | 1.7822 | 8.6987 | 8.6987 | 11.9752 | 16 |
| 0.9314 | 1.0122 | 8.6987 | 1.7822 | 8.6987 | 8.6987 | 11.9851 | 17 |
| 0.9104 | 0.9970 | 8.4335 | 1.2871 | 8.4512 | 8.4512 | 11.9554 | 18 |
| 0.8892 | 0.9831 | 8.4335 | 1.2871 | 8.4512 | 8.4512 | 11.9653 | 19 |
| 0.8675 | 0.9631 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9653 | 20 |
| 0.8559 | 0.9618 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9703 | 21 |
| 0.8352 | 0.9623 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9802 | 22 |
| 0.8225 | 0.9465 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9703 | 23 |
| 0.8059 | 0.9465 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9653 | 24 |
| 0.7960 | 0.9388 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9703 | 25 |
| 0.7766 | 0.9386 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9703 | 26 |
| 0.7609 | 0.9429 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9653 | 27 |
| 0.7519 | 0.9256 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9653 | 28 |
| 0.7355 | 0.9162 | 7.8147 | 1.2871 | 7.9562 | 7.9208 | 11.9554 | 29 |

Framework versions

  • Transformers 4.41.2
  • TensorFlow 2.15.0
  • Datasets 2.20.0
  • Tokenizers 0.19.1
