license: apache-2.0
base_model: google/mt5-base
tags:
  - generated_from_keras_callback
model-index:
  - name: pakawadeep/mt5-base-finetuned-ctfl-augmented_1
    results: []

pakawadeep/mt5-base-finetuned-ctfl-augmented_1

This model is a fine-tuned version of google/mt5-base. The fine-tuning dataset is not recorded. At the final logged epoch (epoch 28) it reports the following training and validation metrics:

  • Train Loss: 0.3208
  • Validation Loss: 0.7929
  • Train Rouge1: 8.9816
  • Train Rouge2: 1.2871
  • Train Rougel: 8.9463
  • Train Rougelsum: 8.9816
  • Train Gen Len: 11.8960
  • Epoch: 28

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
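To make the optimizer settings concrete, the following is a minimal sketch of one Adam update with decoupled weight decay on a single scalar weight, using the values listed above. It is an illustration written in plain Python, not the library's implementation: the function name `adam_weight_decay_step` and the example weight and gradient are hypothetical, and the exact placement of `epsilon` and bias correction in the real optimizer may differ. `decay` is 0.0 and `amsgrad` is False in the list above, so neither is modeled here.

```python
import math

# Hyperparameters copied from the list above.
LEARNING_RATE = 2e-05
BETA_1 = 0.9
BETA_2 = 0.999
EPSILON = 1e-07
WEIGHT_DECAY_RATE = 0.01


def adam_weight_decay_step(w, grad, m, v, step):
    """Apply one illustrative update to a scalar weight (hypothetical helper).

    Returns the new weight and the updated first and second moment estimates.
    """
    # Decoupled weight decay: shrink the weight directly, independent of the gradient.
    w = w - LEARNING_RATE * WEIGHT_DECAY_RATE * w
    # Exponential moving averages of the gradient and the squared gradient.
    m = BETA_1 * m + (1.0 - BETA_1) * grad
    v = BETA_2 * v + (1.0 - BETA_2) * grad * grad
    # Bias correction compensates for the zero-initialized moments.
    m_hat = m / (1.0 - BETA_1 ** step)
    v_hat = v / (1.0 - BETA_2 ** step)
    # Adam step scaled by the learning rate.
    w = w - LEARNING_RATE * m_hat / (math.sqrt(v_hat) + EPSILON)
    return w, m, v


# Example values (assumed for illustration): weight 1.0, gradient 0.5, first step.
new_w, new_m, new_v = adam_weight_decay_step(w=1.0, grad=0.5, m=0.0, v=0.0, step=1)
print(new_w)
```

With a learning rate of 2e-05 and a weight decay rate of 0.01, the decay term moves the weight by roughly 2e-07 per step, about one hundredth of the size of the gradient-driven step in this example.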

Training results

| Train Loss | Validation Loss | Train Rouge1 | Train Rouge2 | Train Rougel | Train Rougelsum | Train Gen Len | Epoch |
|---|---|---|---|---|---|---|---|
| 5.3770 | 2.6665 | 4.5262 | 0.6931 | 4.5733 | 4.5733 | 8.9356 | 0 |
| 2.7256 | 2.0063 | 5.6931 | 1.3201 | 5.6518 | 5.6931 | 10.2277 | 1 |
| 2.0053 | 1.4899 | 7.7086 | 2.1782 | 7.7086 | 7.7086 | 11.3465 | 2 |
| 1.5782 | 1.2268 | 7.7086 | 2.1782 | 7.7086 | 7.7086 | 11.8168 | 3 |
| 1.3143 | 1.1257 | 8.6987 | 2.1782 | 8.6987 | 8.4866 | 11.9257 | 4 |
| 1.1311 | 1.0411 | 8.9816 | 2.2772 | 8.9109 | 8.9109 | 11.9406 | 5 |
| 1.0120 | 0.9954 | 8.9816 | 2.2772 | 8.9109 | 8.9109 | 11.9406 | 6 |
| 0.9320 | 0.9375 | 8.9816 | 2.2772 | 8.9109 | 8.9109 | 11.9208 | 7 |
| 0.8538 | 0.8867 | 8.9816 | 2.2772 | 8.9109 | 8.9109 | 11.8911 | 8 |
| 0.7999 | 0.8593 | 8.8166 | 1.7822 | 8.7459 | 8.7459 | 11.8861 | 9 |
| 0.7562 | 0.8440 | 8.5573 | 1.2871 | 8.4866 | 8.5337 | 11.8812 | 10 |
| 0.7106 | 0.8085 | 8.5573 | 1.2871 | 8.4866 | 8.5337 | 11.8812 | 11 |
| 0.6685 | 0.8044 | 7.9562 | 0.7921 | 7.8147 | 7.9562 | 11.9059 | 12 |
| 0.6377 | 0.7867 | 8.4512 | 1.2871 | 8.4158 | 8.4512 | 11.8762 | 13 |
| 0.6067 | 0.7731 | 8.2980 | 0.7921 | 8.2096 | 8.2862 | 11.8960 | 14 |
| 0.5826 | 0.7593 | 8.2980 | 0.7921 | 8.2096 | 8.2862 | 11.8861 | 15 |
| 0.5533 | 0.7656 | 8.4512 | 1.2871 | 8.4158 | 8.4512 | 11.9010 | 16 |
| 0.5286 | 0.7657 | 8.4512 | 1.2871 | 8.4158 | 8.4512 | 11.8812 | 17 |
| 0.5049 | 0.7674 | 8.4512 | 1.2871 | 8.4158 | 8.4512 | 11.8465 | 18 |
| 0.4800 | 0.7591 | 8.4512 | 1.2871 | 8.4158 | 8.4512 | 11.8663 | 19 |
| 0.4593 | 0.7637 | 8.4512 | 1.2871 | 8.4158 | 8.4512 | 11.8663 | 20 |
| 0.4362 | 0.7757 | 8.4512 | 1.2871 | 8.4158 | 8.4512 | 11.8762 | 21 |
| 0.4185 | 0.7640 | 8.9816 | 1.2871 | 8.9463 | 8.9816 | 11.8812 | 22 |
| 0.4001 | 0.7496 | 8.9816 | 1.2871 | 8.9463 | 8.9816 | 11.8762 | 23 |
| 0.3826 | 0.7498 | 8.9816 | 1.2871 | 8.9463 | 8.9816 | 11.8515 | 24 |
| 0.3682 | 0.7646 | 8.9816 | 1.2871 | 8.9463 | 8.9816 | 11.8861 | 25 |
| 0.3525 | 0.7656 | 8.9816 | 1.2871 | 8.9463 | 8.9816 | 11.8762 | 26 |
| 0.3352 | 0.7774 | 9.0877 | 1.3861 | 8.9816 | 9.0347 | 11.9010 | 27 |
| 0.3208 | 0.7929 | 8.9816 | 1.2871 | 8.9463 | 8.9816 | 11.8960 | 28 |
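The per-epoch losses above can be inspected with a few lines of Python. The sketch below copies the Train Loss and Validation Loss columns from the results and reports the epoch with the lowest validation loss and the gap between the two losses at the final epoch. The variable names are illustrative, and reading the widening gap as a sign of overfitting is an interpretation, not something the card states.

```python
# Train Loss and Validation Loss columns copied from the results above, epochs 0 to 28.
train_loss = [
    5.3770, 2.7256, 2.0053, 1.5782, 1.3143, 1.1311, 1.0120, 0.9320, 0.8538, 0.7999,
    0.7562, 0.7106, 0.6685, 0.6377, 0.6067, 0.5826, 0.5533, 0.5286, 0.5049, 0.4800,
    0.4593, 0.4362, 0.4185, 0.4001, 0.3826, 0.3682, 0.3525, 0.3352, 0.3208,
]
val_loss = [
    2.6665, 2.0063, 1.4899, 1.2268, 1.1257, 1.0411, 0.9954, 0.9375, 0.8867, 0.8593,
    0.8440, 0.8085, 0.8044, 0.7867, 0.7731, 0.7593, 0.7656, 0.7657, 0.7674, 0.7591,
    0.7637, 0.7757, 0.7640, 0.7496, 0.7498, 0.7646, 0.7656, 0.7774, 0.7929,
]

# Epoch whose validation loss is lowest (list index equals epoch number).
best_epoch = min(range(len(val_loss)), key=lambda epoch: val_loss[epoch])
best_val = val_loss[best_epoch]

# Difference between validation and train loss at the last logged epoch.
final_gap = val_loss[-1] - train_loss[-1]

print(f"lowest validation loss: {best_val:.4f} at epoch {best_epoch}")
print(f"validation minus train loss at epoch {len(val_loss) - 1}: {final_gap:.4f}")
```

In these logs the validation loss bottoms out at epoch 23 and drifts upward afterwards while the train loss keeps falling, so an earlier checkpoint may generalize slightly better than the final one if such a checkpoint is available.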

Framework versions

  • Transformers 4.41.2
  • TensorFlow 2.15.0
  • Datasets 2.20.0
  • Tokenizers 0.19.1