MT5-large_NO-idun-20epoch

This model is a fine-tuned version of google/mt5-large on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6704
  • Rouge1: 41.2841
  • Rouge2: 17.1062
  • RougeL: 27.4493
  • RougeLsum: 37.2798
  • Gen Len: 113.9043
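
These ROUGE values are on a 0–100 scale. For context, metrics in this style are commonly computed with the Hugging Face evaluate library; the sketch below uses placeholder strings, since the actual evaluation set is not published:

```python
import evaluate

# Load the ROUGE metric; requires `pip install evaluate rouge_score`.
rouge = evaluate.load("rouge")

# Placeholder predictions/references; the real evaluation data is unknown.
predictions = ["et kort generert sammendrag av artikkelen"]
references = ["et kort referansesammendrag av artikkelen"]

scores = rouge.compute(predictions=predictions, references=references)
# `scores` holds rouge1/rouge2/rougeL/rougeLsum as fractions in [0, 1];
# multiply by 100 to match the scale reported in this card.
print({k: round(v * 100, 4) for k, v in scores.items()})
```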

Model description

More information needed

Intended uses & limitations

More information needed
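
The NO in the model name and the ROUGE/Gen Len metrics above suggest Norwegian abstractive summarization, but the task is not documented. Under that assumption, a minimal inference sketch:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "Akselssss/MT5-large_NO-idun-20epoch"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Assumption: the input is a Norwegian document to be summarized.
text = "Her kommer en lengre norsk artikkel som skal oppsummeres ..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# max_new_tokens is chosen to roughly match the reported Gen Len (~114 tokens).
summary_ids = model.generate(**inputs, num_beams=4, max_new_tokens=128)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```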

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
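
The training script itself is not published. As a rough reconstruction, the values above correspond to a Seq2SeqTrainingArguments configuration along these lines (output_dir and predict_with_generate are assumptions, not documented settings):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="MT5-large_NO-idun-20epoch",  # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=4,  # 2 per device * 4 steps = total batch size 8
    lr_scheduler_type="linear",
    num_train_epochs=20,
    predict_with_generate=True,  # assumed; needed to compute ROUGE during eval
)
# The default AdamW optimizer matches the card's betas=(0.9, 0.999), eps=1e-8.
```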

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| No log        | 0.98  | 46   | 6.3605          | 26.7151 | 6.6971  | 17.105  | 23.7109   | 127.0    |
| No log        | 1.99  | 93   | 4.7689          | 32.3429 | 12.2149 | 20.2642 | 28.985    | 127.0    |
| No log        | 2.99  | 140  | 1.8494          | 37.8255 | 14.0518 | 22.4306 | 33.4714   | 124.9255 |
| No log        | 4.0   | 187  | 1.7294          | 39.5672 | 16.4066 | 24.4606 | 35.4055   | 121.1702 |
| No log        | 4.98  | 233  | 1.6796          | 39.5901 | 16.6044 | 25.6316 | 35.5093   | 120.5532 |
| No log        | 5.99  | 280  | 1.6557          | 39.8141 | 15.6699 | 24.8691 | 36.0578   | 123.5745 |
| No log        | 6.99  | 327  | 1.6525          | 40.0304 | 16.6229 | 25.7054 | 36.3012   | 121.0638 |
| No log        | 8.0   | 374  | 1.6484          | 40.5564 | 16.0763 | 26.0131 | 36.1736   | 119.8936 |
| No log        | 8.98  | 420  | 1.6499          | 39.9522 | 16.6648 | 26.419  | 35.9155   | 118.9468 |
| No log        | 9.99  | 467  | 1.6494          | 41.0085 | 17.1259 | 27.041  | 36.9109   | 115.8085 |
| 3.1043        | 10.99 | 514  | 1.6485          | 41.5339 | 17.5085 | 27.6923 | 37.2051   | 115.8936 |
| 3.1043        | 12.0  | 561  | 1.6488          | 40.3393 | 16.453  | 26.8152 | 36.3384   | 113.4787 |
| 3.1043        | 12.98 | 607  | 1.6485          | 42.0494 | 17.8355 | 27.9197 | 37.9283   | 115.8617 |
| 3.1043        | 13.99 | 654  | 1.6533          | 40.7634 | 16.8655 | 26.8984 | 36.5803   | 114.6809 |
| 3.1043        | 14.99 | 701  | 1.6570          | 41.6789 | 17.5072 | 27.7933 | 37.4503   | 114.1596 |
| 3.1043        | 16.0  | 748  | 1.6594          | 41.5489 | 17.2787 | 27.7975 | 37.2948   | 113.7447 |
| 3.1043        | 16.98 | 794  | 1.6643          | 41.3929 | 17.0913 | 27.4552 | 37.2221   | 113.3936 |
| 3.1043        | 17.99 | 841  | 1.6658          | 41.4336 | 16.9364 | 27.4426 | 37.1709   | 113.1915 |
| 3.1043        | 18.99 | 888  | 1.6699          | 41.5935 | 17.1928 | 27.2885 | 37.2653   | 113.6170 |
| 3.1043        | 19.68 | 920  | 1.6704          | 41.2841 | 17.1062 | 27.4493 | 37.2798   | 113.9043 |

Framework versions

  • Transformers 4.32.1
  • PyTorch 2.3.0+cu121
  • Datasets 2.12.0
  • Tokenizers 0.13.2