pakawadeep/mt5-large-finetuned-ctfl-augmented

This model is a fine-tuned version of google/mt5-large on an unknown dataset. After the final training epoch (epoch 23), it reports the following training and validation results:

  • Train Loss: 0.3089
  • Validation Loss: 0.6932
  • Train Rouge1: 8.6987
  • Train Rouge2: 1.2871
  • Train Rougel: 8.8402
  • Train Rougelsum: 8.9993
  • Train Gen Len: 11.9208
  • Epoch: 23

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
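As a rough sketch, the optimizer settings listed above could be rebuilt as follows. The AdamWeightDecay class is the TensorFlow optimizer shipped with Transformers; the helper names optimizer_kwargs and build_optimizer are hypothetical and chosen only for illustration. Dropping the 'name' and 'decay' keys is an assumption: 'decay' is 0.0 here, and whether the constructor accepts it depends on the installed Keras optimizer version.

```python
# Optimizer settings copied from the hyperparameter list above.
OPTIMIZER_CONFIG = {
    "name": "AdamWeightDecay",
    "learning_rate": 2e-05,
    "decay": 0.0,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
    "weight_decay_rate": 0.01,
}


def optimizer_kwargs(config):
    """Return constructor keyword arguments (hypothetical helper).

    Assumption: 'name' and 'decay' are dropped. 'decay' is 0.0 in this
    card, so omitting it does not change the learning-rate behaviour.
    """
    skipped = {"name", "decay"}
    return {key: value for key, value in config.items() if key not in skipped}


def build_optimizer(config=OPTIMIZER_CONFIG):
    """Build the optimizer (hypothetical helper, needs TensorFlow installed)."""
    # Imported lazily so the config can be inspected without TensorFlow.
    from transformers import AdamWeightDecay

    return AdamWeightDecay(**optimizer_kwargs(config))
```

Calling build_optimizer() returns an optimizer instance that could be passed to model.compile(optimizer=...) on a TensorFlow model. The card does not record batch size, learning-rate schedule, or warmup, so this sketch covers only the values shown.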

Training results

| Train Loss | Validation Loss | Train Rouge1 | Train Rouge2 | Train Rougel | Train Rougelsum | Train Gen Len | Epoch |
|------------|-----------------|--------------|--------------|--------------|-----------------|---------------|-------|
| 7.1544 | 4.7384 | 0.2888 | 0.0 | 0.2888 | 0.2475 | 16.6436 | 0 |
| 5.1911 | 2.1738 | 1.4851 | 0.3438 | 1.5242 | 1.5572 | 12.8564 | 1 |
| 3.6169 | 1.7018 | 5.5693 | 0.5776 | 5.5281 | 5.6931 | 11.6931 | 2 |
| 2.8957 | 1.4685 | 5.8581 | 0.8251 | 5.8581 | 5.9818 | 11.0347 | 3 |
| 1.9769 | 1.2807 | 6.6832 | 1.8152 | 6.8069 | 6.8688 | 11.4505 | 4 |
| 1.5772 | 1.1149 | 6.5064 | 1.1881 | 6.7185 | 6.7185 | 11.6485 | 5 |
| 1.3661 | 0.9914 | 8.4158 | 2.3762 | 8.4158 | 8.5809 | 11.8762 | 6 |
| 1.2399 | 0.8926 | 7.9915 | 2.1287 | 8.0269 | 8.2037 | 11.9604 | 7 |
| 1.0788 | 0.8530 | 8.4158 | 2.1287 | 8.6987 | 8.6987 | 11.9901 | 8 |
| 0.9825 | 0.8069 | 8.8637 | 2.3762 | 8.9345 | 9.0288 | 11.9653 | 9 |
| 0.9078 | 0.7803 | 8.4866 | 1.8812 | 8.6987 | 8.7341 | 11.9653 | 10 |
| 0.8409 | 0.7522 | 8.4866 | 1.8812 | 8.6987 | 8.7341 | 11.9802 | 11 |
| 0.7715 | 0.7171 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9851 | 12 |
| 0.7063 | 0.7045 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9505 | 13 |
| 0.6558 | 0.6797 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9554 | 14 |
| 0.6074 | 0.6651 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9455 | 15 |
| 0.5571 | 0.6555 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9356 | 16 |
| 0.5126 | 0.6531 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9257 | 17 |
| 0.4744 | 0.6481 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9406 | 18 |
| 0.4356 | 0.6521 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9406 | 19 |
| 0.3982 | 0.6618 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9307 | 20 |
| 0.3667 | 0.6628 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9208 | 21 |
| 0.3371 | 0.6723 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9307 | 22 |
| 0.3089 | 0.6932 | 8.6987 | 1.2871 | 8.8402 | 8.9993 | 11.9208 | 23 |
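The results above show training loss falling every epoch while validation loss stops improving partway through. The short sketch below, using only the loss columns copied from the table, locates the epoch with the lowest validation loss. The names HISTORY, best_epoch, and epochs_since_best are hypothetical and exist only for this illustration.

```python
# (epoch, train_loss, validation_loss) rows copied from the training results.
HISTORY = [
    (0, 7.1544, 4.7384), (1, 5.1911, 2.1738), (2, 3.6169, 1.7018),
    (3, 2.8957, 1.4685), (4, 1.9769, 1.2807), (5, 1.5772, 1.1149),
    (6, 1.3661, 0.9914), (7, 1.2399, 0.8926), (8, 1.0788, 0.8530),
    (9, 0.9825, 0.8069), (10, 0.9078, 0.7803), (11, 0.8409, 0.7522),
    (12, 0.7715, 0.7171), (13, 0.7063, 0.7045), (14, 0.6558, 0.6797),
    (15, 0.6074, 0.6651), (16, 0.5571, 0.6555), (17, 0.5126, 0.6531),
    (18, 0.4744, 0.6481), (19, 0.4356, 0.6521), (20, 0.3982, 0.6618),
    (21, 0.3667, 0.6628), (22, 0.3371, 0.6723), (23, 0.3089, 0.6932),
]


def best_epoch(history):
    """Return the (epoch, train_loss, val_loss) row with the lowest validation loss."""
    return min(history, key=lambda row: row[2])


def epochs_since_best(history):
    """Count how many epochs were run after the best validation epoch."""
    return history[-1][0] - best_epoch(history)[0]


if __name__ == "__main__":
    epoch, train_loss, val_loss = best_epoch(HISTORY)
    print(f"lowest validation loss {val_loss} at epoch {epoch}")
    print(f"epochs trained after that point: {epochs_since_best(HISTORY)}")
```

On these numbers the lowest validation loss is 0.6481 at epoch 18, and it rises over the five epochs that follow while training loss keeps dropping. That pattern is commonly read as the onset of overfitting, although the card gives no information about which checkpoint was published.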

Framework versions

  • Transformers 4.38.2
  • TensorFlow 2.15.0
  • Datasets 2.18.0
  • Tokenizers 0.15.2