mt5-small_test

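This model is a fine-tuned version of google/mt5-small. The training dataset is not specified (the auto-generated card records it as "None"). On the evaluation set, the last logged checkpoint (step 1225) achieves the following results: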

  • Loss: 0.7284
  • Rouge1: 43.3718
  • Rouge2: 37.5973
  • Rougel: 42.0502
  • Rougelsum: 42.0648
  • Bleu: 32.8345
  • Gen Len: 12.6063
  • Meteor: 0.3949
  • True negatives: 70.2115
  • False negatives: 11.206
  • Cosine Sim: 0.7485
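
The card does not say which implementation produced the ROUGE scores above. As a rough illustration of what Rouge1 measures, the following is a minimal sketch of a unigram-overlap F1 score. The function name `rouge1_f1`, the lowercasing, and the whitespace tokenization are assumptions made for this example; standard ROUGE packages apply their own tokenization and optional stemming, so their numbers will differ. The 0-100 scaling is also an assumption, inferred from the magnitudes reported in this card.

```python
from collections import Counter


def rouge1_f1(reference: str, prediction: str) -> float:
    """Unigram-overlap F1 between two strings, scaled to 0-100.

    Hypothetical helper for illustration only: it lowercases and splits
    on whitespace, which is simpler than real ROUGE tokenization.
    """
    ref_counts = Counter(reference.lower().split())
    pred_counts = Counter(prediction.lower().split())
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & pred_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Five of six unigrams match in each direction, so precision = recall = 5/6.
score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(round(score, 2))  # 83.33
```

Rouge2 follows the same idea with bigrams, and RougeL replaces the n-gram overlap with the longest common subsequence.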

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 9
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
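
Two of the values above follow from the others, and a short sketch may make the arithmetic concrete. The effective batch size is the per-device batch size times the gradient accumulation steps, and 16 x 8 = 128 matches total_train_batch_size if a single device is assumed. The linear scheduler decays the learning rate from its initial value toward zero over the whole run. The helper names below are hypothetical; the zero warmup and the total of 3500 optimizer steps (175 steps per epoch, as logged in the results table, times the 20 configured epochs) are assumptions, since the card does not state either.

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, base_lr: float, total_steps: int, warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Values from the hyperparameter list; num_devices=1 is an assumption.
total_batch = effective_batch_size(per_device=16, accumulation_steps=8)
print(total_batch)  # 128

# Assumed schedule length: 175 steps per epoch x 20 epochs, no warmup.
STEPS_PER_EPOCH = 175
TOTAL_STEPS = STEPS_PER_EPOCH * 20
lr_at_last_logged_step = linear_lr(step=1225, base_lr=0.001, total_steps=TOTAL_STEPS)
print(f"{lr_at_last_logged_step:.5f}")  # 0.00065
```

Under these assumptions the learning rate at the last logged step (1225) would still be about 65% of its initial value.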

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bleu | Gen Len | Meteor | True negatives | False negatives | Cosine Sim |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3.1455 | 1.0 | 175 | 0.9832 | 18.7269 | 15.517 | 18.22 | 18.223 | 7.0634 | 7.6229 | 0.1626 | 74.6828 | 57.1687 | 0.3949 |
| 1.1623 | 1.99 | 350 | 0.8542 | 38.7603 | 32.7237 | 37.3447 | 37.3752 | 27.4323 | 12.5135 | 0.3487 | 60.0 | 15.942 | 0.6992 |
| 0.9431 | 2.99 | 525 | 0.8017 | 41.5759 | 35.6108 | 40.2536 | 40.2695 | 30.7994 | 12.8117 | 0.3755 | 61.2689 | 12.3447 | 0.7304 |
| 0.8119 | 3.98 | 700 | 0.7787 | 43.5881 | 37.4245 | 42.1096 | 42.1248 | 32.9646 | 13.2176 | 0.3947 | 59.1541 | 9.5238 | 0.7582 |
| 0.7235 | 4.98 | 875 | 0.7477 | 43.4069 | 37.2246 | 41.8444 | 41.8616 | 32.9345 | 13.116 | 0.3946 | 63.0816 | 9.8085 | 0.7561 |
| 0.6493 | 5.97 | 1050 | 0.7266 | 40.4506 | 35.0072 | 39.1206 | 39.1181 | 29.0601 | 11.748 | 0.3687 | 75.5287 | 17.2101 | 0.7071 |
| 0.5871 | 6.97 | 1225 | 0.7284 | 43.3718 | 37.5973 | 42.0502 | 42.0648 | 32.8345 | 12.6063 | 0.3949 | 70.2115 | 11.206 | 0.7485 |
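
The headline results at the top of this card correspond to the last logged row (step 1225), which is not the best row on every metric. The sketch below copies three columns from the training results and picks the best checkpoint under two criteria. The variable names are hypothetical, and the choice of validation loss and Rouge1 as selection criteria is an assumption for illustration, not something the card specifies.

```python
# (step, validation_loss, rouge1), copied from the training results.
rows = [
    (175, 0.9832, 18.7269),
    (350, 0.8542, 38.7603),
    (525, 0.8017, 41.5759),
    (700, 0.7787, 43.5881),
    (875, 0.7477, 43.4069),
    (1050, 0.7266, 40.4506),
    (1225, 0.7284, 43.3718),
]

# Lower validation loss is better; higher Rouge1 is better.
best_by_loss = min(rows, key=lambda row: row[1])
best_by_rouge1 = max(rows, key=lambda row: row[2])

print(best_by_loss[0])    # 1050
print(best_by_rouge1[0])  # 700
```

The two criteria disagree here: step 1050 has the lowest validation loss (0.7266) but a noticeably lower Rouge1 than step 700, so the checkpoint to prefer depends on which metric matters for the intended use.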

Framework versions

  • Transformers 4.31.0
  • Pytorch 2.0.1+cu118
  • Datasets 2.13.1
  • Tokenizers 0.13.3