MT5_large_NO_CNN-idun-final

This model is a fine-tuned version of google/mt5-large. The card does not identify the fine-tuning dataset. On the evaluation set, the final checkpoint (epoch 5, step 30590) achieves the following results:

  • Loss: 1.8492
  • Rouge1: 31.9047
  • Rouge2: 12.0487
  • Rougel: 21.7323
  • Rougelsum: 29.4557
  • Gen Len: 98.7777

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
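
The relationship between these values can be checked with a little arithmetic. The sketch below is illustrative only: it assumes a single training device (consistent with 2 × 4 = 8) and no warmup steps, since the card lists neither a device count nor a warmup setting. The helper `linear_lr` is a hypothetical stand-in for a linear decay schedule, not the exact scheduler code that was run. The total step count is taken from the last row of the training results.

```python
# Values copied from the hyperparameter list in this card.
learning_rate = 5e-05
train_batch_size = 2
gradient_accumulation_steps = 4
num_epochs = 5
total_steps = 30590  # final step reported in the training results

# Assumption: one device, so the effective batch size is a simple product.
num_devices = 1
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)
steps_per_epoch = total_steps // num_epochs


def linear_lr(step, base_lr=learning_rate, total=total_steps, warmup=0):
    """Hypothetical linear schedule: optional warmup, then decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(total_train_batch_size)  # 8
print(steps_per_epoch)         # 6118

# Learning rate at the end of each epoch under the assumptions above.
for epoch in range(1, num_epochs + 1):
    print(epoch, linear_lr(epoch * steps_per_epoch))
```

Under these assumptions the learning rate falls linearly from 5e-05 at step 0 to zero at step 30590, so later epochs train with progressively smaller updates.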

Training results

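| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|---------------|-------|-------|-----------------|---------|---------|---------|-----------|----------|
| 2.246         | 1.0   | 6118  | 1.9772          | 30.9142 | 11.28   | 21.0914 | 28.4499   | 95.2396  |
| 2.0899        | 2.0   | 12236 | 1.9070          | 31.3973 | 11.6219 | 21.3122 | 28.9365   | 100.0907 |
| 1.9736        | 3.0   | 18354 | 1.8716          | 31.5752 | 11.7955 | 21.4748 | 29.1354   | 100.4973 |
| 1.9189        | 4.0   | 24472 | 1.8547          | 31.9802 | 12.183  | 21.8505 | 29.5171   | 98.4764  |
| 1.8697        | 5.0   | 30590 | 1.8492          | 31.9047 | 12.0487 | 21.7323 | 29.4557   | 98.7777  |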
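
The per-epoch numbers can be compared programmatically. The sketch below copies the validation loss and two ROUGE columns from the results above and ranks the epochs by two different criteria. The card does not say which criterion, if any, was used to select the published checkpoint; the headline results simply match the epoch 5 row. The dictionary keys are illustrative names, not identifiers from the training code.

```python
# Per-epoch evaluation results copied from the training results in this card.
rows = [
    {"epoch": 1, "val_loss": 1.9772, "rouge1": 30.9142, "rouge2": 11.28},
    {"epoch": 2, "val_loss": 1.9070, "rouge1": 31.3973, "rouge2": 11.6219},
    {"epoch": 3, "val_loss": 1.8716, "rouge1": 31.5752, "rouge2": 11.7955},
    {"epoch": 4, "val_loss": 1.8547, "rouge1": 31.9802, "rouge2": 12.183},
    {"epoch": 5, "val_loss": 1.8492, "rouge1": 31.9047, "rouge2": 12.0487},
]

# Lower validation loss is better; higher ROUGE is better.
best_by_loss = min(rows, key=lambda r: r["val_loss"])
best_by_rouge1 = max(rows, key=lambda r: r["rouge1"])

# Gap between the two candidates on the ROUGE-1 metric.
rouge1_gap = best_by_rouge1["rouge1"] - best_by_loss["rouge1"]

print("lowest validation loss: epoch", best_by_loss["epoch"])
print("highest Rouge1: epoch", best_by_rouge1["epoch"])
print("Rouge1 gap:", round(rouge1_gap, 4))
```

Validation loss keeps improving through epoch 5, while the ROUGE scores peak at epoch 4 by a small margin, so the choice of checkpoint depends on which metric matters for the intended use.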

Framework versions

  • Transformers 4.32.1
  • PyTorch 2.3.0+cu121
  • Datasets 2.12.0
  • Tokenizers 0.13.2