mlong-t5-tglobal-base

This model is a fine-tuned version of agemagician/mlong-t5-tglobal-base on the HeSum dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1091
  • Rouge1: 31.6099
  • Rouge2: 12.9182
  • RougeL: 23.8053
  • RougeLSum: 25.5362
  • Gen Len: 59.758
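
For readers unfamiliar with the metrics above, the following is a minimal sketch of how a ROUGE-N F1 score can be computed from n-gram overlap. It is an illustration only: it assumes plain whitespace tokenization and a 0-100 scale, whereas the scores reported here were presumably produced by a standard ROUGE implementation whose tokenization and settings are not documented in this card. The function names `ngrams` and `rouge_n_f1` are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1 on a 0-100 scale.

    Assumption: whitespace tokenization, no stemming or normalization.
    """
    ref = ngrams(reference.split(), n)
    cand = ngrams(candidate.split(), n)
    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(round(rouge_n_f1(reference, candidate, n=1), 2))
print(round(rouge_n_f1(reference, candidate, n=2), 2))
```

Rouge1 and Rouge2 correspond to n=1 and n=2; RougeL and RougeLSum are based on the longest common subsequence instead and are not covered by this sketch.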

Model description

More information needed

Intended uses & limitations

More information needed
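
Intended uses are not documented here. Since the checkpoint is a sequence-to-sequence model evaluated with ROUGE, a summarization call might look like the sketch below. The repository id is taken from this page; the input length limit, `max_new_tokens`, and `num_beams` values are assumptions chosen for illustration, not settings confirmed by the authors.

```python
MODEL_ID = "biunlp/LongMt5-HeSum"  # repository id as shown on this page


def summarize(text, max_input_tokens=4096, max_new_tokens=128):
    """Generate a summary for `text`.

    Assumptions: the truncation length and generation settings below are
    illustrative defaults, not values documented in the model card.
    """
    # Imported lazily so that defining the function does not require the library.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` downloads the weights on first use, so it needs network access and enough memory for a 592M-parameter model in F32.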

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
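
To make the `lr_scheduler_type: linear` setting concrete, here is a small sketch of a linear decay schedule starting from the listed learning rate of 5e-05. It assumes no warmup (none is listed above) and a hypothetical total of 15000 optimizer steps (30 epochs at the 500 steps per epoch seen in the early rows of the results table); the actual total used in training is not stated in this card.

```python
BASE_LR = 5e-05        # learning_rate from the list above
TOTAL_STEPS = 15000    # assumption: 30 epochs x 500 steps per epoch


def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR):
    """Learning rate at `step` under linear decay to zero, with no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps)


for step in (0, 500, 7500, 15000):
    print(step, linear_lr(step))
```

Under these assumptions the rate falls by the same amount every step and reaches zero at the final step.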

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLSum |
|---------------|-------|-------|-----------------|---------|---------|---------|-----------|
| No log        | 1.0   | 500   | 2.2709          | 20.5043 | 8.1518  | 16.9526 | 17.5001   |
| 2.8714        | 2.0   | 1000  | 2.2022          | 21.4051 | 8.7445  | 17.7534 | 18.3191   |
| 2.8714        | 3.0   | 1500  | 2.1608          | 21.6609 | 9.1753  | 18.0374 | 18.6176   |
| 2.5137        | 4.0   | 2000  | 2.1555          | 21.6818 | 9.1814  | 18.0382 | 18.6198   |
| 2.5137        | 5.0   | 2500  | 2.1462          | 21.9708 | 9.2033  | 18.3919 | 18.9535   |
| 2.3717        | 6.0   | 3000  | 2.1258          | 22.0583 | 9.2987  | 18.4379 | 19.0322   |
| 2.3717        | 7.0   | 3500  | 2.1278          | 21.8245 | 9.0474  | 18.1979 | 18.8038   |
| 2.2633        | 8.0   | 4000  | 2.1207          | 21.6273 | 8.8847  | 18.024  | 18.6049   |
| 2.2633        | 9.0   | 4500  | 2.1180          | 22.2004 | 9.6253  | 18.6373 | 19.1721   |
| 2.1886        | 10.0  | 5000  | 2.1220          | 22.1619 | 9.6206  | 18.5069 | 19.0856   |
| 2.1886        | 11.0  | 5500  | 2.1161          | 22.1518 | 9.4522  | 18.4695 | 19.0552   |
| 2.1144        | 12.0  | 6000  | 2.1103          | 22.0395 | 9.4185  | 18.4314 | 19.0305   |
| 2.1144        | 13.0  | 6500  | 2.1150          | 22.2404 | 9.4722  | 18.5482 | 19.1747   |
| 2.054         | 14.0  | 7000  | 2.1091          | 22.1466 | 9.3434  | 18.3443 | 18.9233   |
| 2.0526        | 15.0  | 8000  | 2.1580          | 30.4149 | 2.0774  | 22.9493 | 24.4478   |
| 2.1236        | 16.0  | 16000 | 2.1621          | 31.3101 | 13.3237 | 23.8249 | 25.526    |
| 2.0776        | 17.0  | 24000 | 2.1607          | 30.9902 | 12.3753 | 23.0243 | 24.8308   |
| 1.9843        | 18.0  | 32000 | 2.1553          | 32.0603 | 13.4985 | 24.0775 | 25.9692   |
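
When reading a log like the one above, it can help to check which epoch is best under each criterion. The sketch below copies the epoch, validation loss, and Rouge1 columns from the table and selects the row with the lowest validation loss and the row with the highest Rouge1. Which criterion the authors used to pick the final checkpoint is not stated, so this is only a reading aid.

```python
# (epoch, validation_loss, rouge1), copied from the training results table.
rows = [
    (1, 2.2709, 20.5043), (2, 2.2022, 21.4051), (3, 2.1608, 21.6609),
    (4, 2.1555, 21.6818), (5, 2.1462, 21.9708), (6, 2.1258, 22.0583),
    (7, 2.1278, 21.8245), (8, 2.1207, 21.6273), (9, 2.1180, 22.2004),
    (10, 2.1220, 22.1619), (11, 2.1161, 22.1518), (12, 2.1103, 22.0395),
    (13, 2.1150, 22.2404), (14, 2.1091, 22.1466), (15, 2.1580, 30.4149),
    (16, 2.1621, 31.3101), (17, 2.1607, 30.9902), (18, 2.1553, 32.0603),
]

# Lowest validation loss and highest Rouge1 need not fall on the same epoch.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge1 = max(rows, key=lambda r: r[2])

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
```

The lowest validation loss is at epoch 14 (2.1091), while the highest Rouge1 is at epoch 18 (32.0603).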

Framework versions

  • Transformers 4.38.2
  • Pytorch 1.13.1+cu117
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model size: 592M params (Safetensors, tensor type F32)

Model repository: biunlp/LongMt5-HeSum (fine-tuned from agemagician/mlong-t5-tglobal-base)