metadata
license: apache-2.0
base_model: google/mt5-large
tags:
  - generated_from_keras_callback
model-index:
  - name: pakawadeep/mt5-large-finetuned-ctfl-augmented_2
    results: []

pakawadeep/mt5-large-finetuned-ctfl-augmented_2

This model is a fine-tuned version of google/mt5-large on an unknown dataset. It achieves the following training and evaluation results at the last logged epoch:

  • Train Loss: 0.1781
  • Validation Loss: 0.7498
  • Train Rouge1: 8.8402
  • Train Rouge2: 1.1881
  • Train Rougel: 8.8048
  • Train Rougelsum: 8.7164
  • Train Gen Len: 11.8812
  • Epoch: 22
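A minimal loading sketch, assuming the checkpoint is published on the Hugging Face Hub under the model name above with TensorFlow weights and a compatible tokenizer. The helper `generate_text` and the default `max_new_tokens=16` are illustrative choices, not part of the original card (the limit is picked only because the logged generation length is about 12 tokens), and the expected input format is not documented here.

```python
MODEL_ID = "pakawadeep/mt5-large-finetuned-ctfl-augmented_2"


def generate_text(text: str, max_new_tokens: int = 16) -> str:
    """Run one input string through the fine-tuned model (hypothetical helper)."""
    # Imported lazily so the module can be imported without TensorFlow installed.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    # Assumption: tokenizer files and TF weights are available under MODEL_ID.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(text, return_tensors="tf")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("...")` downloads the full mt5-large-sized checkpoint on first use, so it needs substantial disk space and memory.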

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
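The logged optimizer entry is a Python dict literal, so it can be turned back into constructor arguments. The sketch below is one possible way to do that, assuming the `AdamWeightDecay` class from Transformers' TensorFlow utilities. Dropping the `name` and `decay` keys is an assumption of this sketch (`decay` is logged as 0.0, so omitting it should not change behavior), and `build_optimizer` is a hypothetical helper.

```python
import ast

# Optimizer entry copied verbatim from the hyperparameter list above.
LOGGED_OPTIMIZER = (
    "{'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, "
    "'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, "
    "'weight_decay_rate': 0.01}"
)

config = ast.literal_eval(LOGGED_OPTIMIZER)

# Assumption: 'name' only identifies the class and 'decay' is a legacy Keras
# argument that is inactive at 0.0, so neither is passed on.
optimizer_kwargs = {k: v for k, v in config.items() if k not in ("name", "decay")}


def build_optimizer(kwargs):
    """Hypothetical helper: needs transformers with TensorFlow installed."""
    from transformers import AdamWeightDecay

    return AdamWeightDecay(**kwargs)
```

With TensorFlow available, `build_optimizer(optimizer_kwargs)` would return an optimizer that can be passed to `model.compile`.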

Training results

| Train Loss | Validation Loss | Train Rouge1 | Train Rouge2 | Train Rougel | Train Rougelsum | Train Gen Len | Epoch |
|:----------:|:---------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:-----:|
| 3.9587 | 1.9327 | 2.7783 | 0.2200 | 2.7524 | 2.7558 | 11.6436 | 0 |
| 1.8568 | 1.4440 | 6.6832 | 1.3201 | 6.6007 | 6.4769 | 11.7376 | 1 |
| 1.5929 | 1.2365 | 6.2235 | 1.0891 | 6.2235 | 6.2235 | 11.6089 | 2 |
| 1.3718 | 1.0833 | 7.7086 | 1.5842 | 7.4965 | 7.4965 | 11.9406 | 3 |
| 1.0395 | 0.9417 | 7.4257 | 1.8812 | 7.4257 | 7.4022 | 11.9703 | 4 |
| 0.8993 | 0.8573 | 8.5337 | 1.8812 | 8.4394 | 8.4158 | 11.9059 | 5 |
| 0.7896 | 0.7923 | 8.6987 | 1.7822 | 8.6987 | 8.6987 | 11.9851 | 6 |
| 0.7050 | 0.7375 | 8.4866 | 1.2871 | 8.4512 | 8.4512 | 11.9307 | 7 |
| 0.6377 | 0.7065 | 8.4866 | 1.2871 | 8.4512 | 8.4512 | 11.9158 | 8 |
| 0.5803 | 0.6809 | 8.4866 | 1.2871 | 8.4512 | 8.4512 | 12.0 | 9 |
| 0.5351 | 0.6758 | 8.4866 | 1.2871 | 8.4512 | 8.4512 | 11.9802 | 10 |
| 0.4957 | 0.6585 | 8.3274 | 1.1881 | 8.2921 | 8.2390 | 11.9653 | 11 |
| 0.4498 | 0.6436 | 8.4866 | 1.2871 | 8.4512 | 8.4512 | 11.9752 | 12 |
| 0.4093 | 0.6456 | 8.4866 | 1.2871 | 8.4512 | 8.4512 | 11.9406 | 13 |
| 0.3752 | 0.6300 | 8.3628 | 0.8416 | 8.2508 | 8.2390 | 11.9505 | 14 |
| 0.3427 | 0.6404 | 8.3628 | 0.8416 | 8.2508 | 8.2390 | 11.9604 | 15 |
| 0.3113 | 0.6443 | 8.6987 | 0.5941 | 8.5809 | 8.5868 | 11.9109 | 16 |
| 0.2826 | 0.6459 | 8.6987 | 0.5941 | 8.5809 | 8.5868 | 11.9406 | 17 |
| 0.2565 | 0.6555 | 8.6987 | 0.5941 | 8.5809 | 8.5868 | 11.9455 | 18 |
| 0.2347 | 0.6815 | 8.8402 | 1.1881 | 8.8048 | 8.7164 | 11.8911 | 19 |
| 0.2141 | 0.6884 | 8.6987 | 0.5941 | 8.5809 | 8.5868 | 11.8911 | 20 |
| 0.1942 | 0.7286 | 8.8402 | 1.1881 | 8.8048 | 8.7164 | 11.9307 | 21 |
| 0.1781 | 0.7498 | 8.8402 | 1.1881 | 8.8048 | 8.7164 | 11.8812 | 22 |
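
In the results above, validation loss stops improving partway through training while train loss keeps falling, which may indicate overfitting in the later epochs. The short sketch below locates the epoch with the lowest validation loss and the final train/validation gap, using only the logged values. The variable names are illustrative, and nothing here implies that early stopping or checkpoint selection was used for this model.

```python
# Loss values copied from the training results above, indexed by epoch (0-22).
train_loss = [
    3.9587, 1.8568, 1.5929, 1.3718, 1.0395, 0.8993, 0.7896, 0.7050,
    0.6377, 0.5803, 0.5351, 0.4957, 0.4498, 0.4093, 0.3752, 0.3427,
    0.3113, 0.2826, 0.2565, 0.2347, 0.2141, 0.1942, 0.1781,
]
val_loss = [
    1.9327, 1.4440, 1.2365, 1.0833, 0.9417, 0.8573, 0.7923, 0.7375,
    0.7065, 0.6809, 0.6758, 0.6585, 0.6436, 0.6456, 0.6300, 0.6404,
    0.6443, 0.6459, 0.6555, 0.6815, 0.6884, 0.7286, 0.7498,
]

# Epoch whose validation loss is lowest.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# Difference between validation and train loss at the last logged epoch.
final_gap = val_loss[-1] - train_loss[-1]

print(f"lowest validation loss: {val_loss[best_epoch]:.4f} at epoch {best_epoch}")
print(f"final validation/train gap: {final_gap:.4f}")
# prints:
# lowest validation loss: 0.6300 at epoch 14
# final validation/train gap: 0.5717
```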

Framework versions

  • Transformers 4.41.2
  • TensorFlow 2.15.0
  • Datasets 2.20.0
  • Tokenizers 0.19.1