longt5_xl_sfd_bp_15

This model is a fine-tuned version of google/long-t5-tglobal-xl on the learn3r/summ_screen_fd_bp dataset. It achieves the following results on the evaluation set (these are the values logged at the final training step, 210):

  • Loss: 2.5840
  • Rouge1: 29.7482
  • Rouge2: 12.0072
  • Rougel: 21.348
  • Rougelsum: 28.5849
  • Gen Len: 503.5769
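
For readers who want to try the checkpoint, the following is a minimal loading sketch, not usage documented by the author. It assumes the repository id learn3r/longt5_xl_sfd_bp_15 shown on this page, that the repository ships a tokenizer, and the standard Transformers seq2seq auto classes. The length limits max_input_tokens and max_new_tokens are hypothetical values, since the card does not state them.

```python
MODEL_ID = "learn3r/longt5_xl_sfd_bp_15"  # repository id as shown on this page


def summarize(text, max_input_tokens=4096, max_new_tokens=512):
    """Summarize one transcript.

    Both length limits are assumptions for illustration, not values from the card.
    """
    # Imported lazily so this module can be imported without the heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    # Truncate long inputs to the assumed limit and return PyTorch tensors.
    inputs = tokenizer(
        text, return_tensors="pt", truncation=True, max_length=max_input_tokens
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling summarize(transcript) downloads the weights on first use, so expect a large download and high memory use for a model of this size.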

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • num_epochs: 15.0
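
The relationship between the batch-size settings above can be checked with a short calculation. This is a sketch under one assumption: a single training device, which the card does not state. The sequence count is approximate because the last accumulation window of an epoch may be partial.

```python
train_batch_size = 8
gradient_accumulation_steps = 32
num_devices = 1  # assumption: the card does not state the device count

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

final_step = 210  # last optimizer step listed in the training results
# Approximate number of training sequences processed over the whole run.
sequences_seen = final_step * total_train_batch_size

print(total_train_batch_size)  # 256
print(sequences_seen)          # 53760
```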

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|----------|
| 2.5763        | 0.97  | 14   | 2.5415          | 10.6052 | 1.4494  | 10.4593 | 10.4801   | 509.6479 |
| 1.8998        | 1.95  | 28   | 1.7398          | 16.7989 | 4.1457  | 16.4049 | 15.1803   | 511.0    |
| 1.6403        | 2.99  | 43   | 1.5457          | 18.4716 | 5.4633  | 17.1393 | 16.9242   | 511.0    |
| 1.5012        | 3.97  | 57   | 1.5736          | 18.2259 | 5.3524  | 17.0162 | 16.7948   | 511.0    |
| 1.248         | 4.94  | 71   | 1.5482          | 20.8275 | 6.7412  | 18.0859 | 19.3113   | 511.0    |
| 1.0176        | 5.98  | 86   | 1.6254          | 21.1937 | 6.8813  | 18.411  | 19.8577   | 510.6775 |
| 0.8472        | 6.96  | 100  | 1.6212          | 26.1873 | 9.1581  | 20.393  | 24.1393   | 479.9704 |
| 0.7242        | 8.0   | 115  | 1.7231          | 23.5881 | 7.8961  | 18.7014 | 22.2999   | 506.9112 |
| 0.5876        | 8.97  | 129  | 1.9401          | 32.1851 | 12.6426 | 22.8358 | 30.6718   | 451.6982 |
| 0.4756        | 9.95  | 143  | 1.9001          | 31.353  | 12.994  | 23.1542 | 29.8375   | 455.5947 |
| 0.4042        | 10.99 | 158  | 2.1295          | 28.6425 | 11.8399 | 21.3847 | 27.0508   | 497.5355 |
| 0.3292        | 11.97 | 172  | 2.2441          | 31.8393 | 13.1308 | 22.135  | 30.5866   | 478.8107 |
| 0.2812        | 12.94 | 186  | 2.3464          | 34.4102 | 14.3607 | 23.8634 | 32.9732   | 429.9911 |
| 0.2443        | 13.98 | 201  | 2.2003          | 34.8239 | 14.8042 | 25.2438 | 33.0469   | 392.5385 |
| 0.1958        | 14.61 | 210  | 2.5840          | 29.7482 | 12.0072 | 21.348  | 28.5849   | 503.5769 |
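
The reported evaluation numbers come from the last logged step, which is not the best row by either validation loss or ROUGE. A small sketch that compares the logged checkpoints, with the (step, validation loss, Rouge1) values copied from the training results, makes this visible. The variable names are illustrative only.

```python
# (step, validation_loss, rouge1) copied from the training results.
rows = [
    (14, 2.5415, 10.6052), (28, 1.7398, 16.7989), (43, 1.5457, 18.4716),
    (57, 1.5736, 18.2259), (71, 1.5482, 20.8275), (86, 1.6254, 21.1937),
    (100, 1.6212, 26.1873), (115, 1.7231, 23.5881), (129, 1.9401, 32.1851),
    (143, 1.9001, 31.353), (158, 2.1295, 28.6425), (172, 2.2441, 31.8393),
    (186, 2.3464, 34.4102), (201, 2.2003, 34.8239), (210, 2.5840, 29.7482),
]

lowest_loss = min(rows, key=lambda r: r[1])   # row with the lowest validation loss
best_rouge1 = max(rows, key=lambda r: r[2])   # row with the highest Rouge1
final = rows[-1]                              # row reported at the top of the card

print("lowest validation loss:", lowest_loss)  # (43, 1.5457, 18.4716)
print("highest Rouge1:", best_rouge1)          # (201, 2.2003, 34.8239)
print("final step:", final)                    # (210, 2.584, 29.7482)
```

Validation loss bottoms out at step 43 while Rouge1 peaks at step 201, so the choice of checkpoint depends on which metric matters. The card does not say whether intermediate checkpoints were saved.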

Framework versions

  • Transformers 4.38.1
  • Pytorch 2.2.1+cu121
  • Datasets 2.17.1
  • Tokenizers 0.15.2

Model size

  • Parameters: 2.85B
  • Tensor type: F32
  • Format: Safetensors