---
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: long_t5
  results: []
---

# long_t5

This model is a fine-tuned version of google/long-t5-tglobal-base on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.5158
- Rouge1: 0.5214
- Rouge2: 0.3347
- Rougel: 0.4751
- Rougelsum: 0.4746
- Gen Len: 25.9513
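The Rouge1 and Rouge2 scores above measure unigram and bigram overlap (F1) between generated and reference summaries. The card's numbers come from the standard `rouge` metric, but as an illustration of what is being computed, here is a minimal ROUGE-N F1 sketch (a simplified re-implementation, not the exact scorer used, which also applies stemming and other normalization):

```python
from collections import Counter

def rouge_n_f1(candidate: str, reference: str, n: int = 1) -> float:
    """Illustrative ROUGE-N F1: clipped n-gram overlap between two texts."""
    def ngrams(text: str, n: int) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # n-grams shared by both, clipped
    if not cand or not ref or overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge_n_f1("the cat sat", "the cat ran", n=1)` shares two of three unigrams on each side, giving an F1 of 2/3; Rougelsum differs from Rougel only in how multi-sentence summaries are split before the longest-common-subsequence computation.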

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
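With `lr_scheduler_type: linear`, the learning rate decays linearly from 2e-05 toward zero over the run (32000 total optimizer steps, per the step column below). A minimal sketch of the schedule, assuming zero warmup steps since none are listed in the card:

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-05,
              warmup_steps: int = 0) -> float:
    """Linear schedule: ramp up over warmup_steps, then decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# 1600 steps per epoch x 20 epochs = 32000 total steps
total_steps = 32000
```

So training starts at 2e-05, passes 1e-05 at the halfway point (epoch 10, step 16000), and reaches zero at step 32000.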

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 2.232         | 1.0   | 1600  | 1.6810          | 0.4704 | 0.2861 | 0.4256 | 0.4251    | 26.6112 |
| 2.0229        | 2.0   | 3200  | 1.6167          | 0.4859 | 0.2991 | 0.4412 | 0.4407    | 26.1006 |
| 1.9239        | 3.0   | 4800  | 1.5805          | 0.4924 | 0.3049 | 0.4475 | 0.4468    | 26.8169 |
| 1.8454        | 4.0   | 6400  | 1.5669          | 0.4968 | 0.3093 | 0.4517 | 0.4511    | 25.925  |
| 1.7626        | 5.0   | 8000  | 1.5432          | 0.4973 | 0.3132 | 0.453  | 0.4525    | 26.4362 |
| 1.6995        | 6.0   | 9600  | 1.5352          | 0.5045 | 0.3188 | 0.4596 | 0.459     | 26.1219 |
| 1.682         | 7.0   | 11200 | 1.5255          | 0.5066 | 0.3198 | 0.4613 | 0.4609    | 26.1581 |
| 1.6286        | 8.0   | 12800 | 1.5210          | 0.5113 | 0.3245 | 0.4663 | 0.466     | 26.1725 |
| 1.593         | 9.0   | 14400 | 1.5195          | 0.5102 | 0.3235 | 0.464  | 0.4638    | 25.8944 |
| 1.5784        | 10.0  | 16000 | 1.5166          | 0.5133 | 0.3265 | 0.4665 | 0.4661    | 25.685  |
| 1.5615        | 11.0  | 17600 | 1.5135          | 0.5161 | 0.3284 | 0.47   | 0.4695    | 25.8681 |
| 1.5391        | 12.0  | 19200 | 1.5106          | 0.5156 | 0.3303 | 0.4703 | 0.4701    | 26.1781 |
| 1.5077        | 13.0  | 20800 | 1.5095          | 0.5177 | 0.3317 | 0.4724 | 0.4721    | 26.0456 |
| 1.4923        | 14.0  | 22400 | 1.5163          | 0.5185 | 0.3321 | 0.4728 | 0.4723    | 26.17   |
| 1.4545        | 15.0  | 24000 | 1.5128          | 0.5181 | 0.3337 | 0.4727 | 0.4724    | 25.8219 |
| 1.4489        | 16.0  | 25600 | 1.5135          | 0.5209 | 0.3349 | 0.4744 | 0.4743    | 26.0369 |
| 1.4481        | 17.0  | 27200 | 1.5153          | 0.5218 | 0.3349 | 0.4751 | 0.4748    | 26.1744 |
| 1.4287        | 18.0  | 28800 | 1.5134          | 0.521  | 0.335  | 0.4752 | 0.4747    | 25.9525 |
| 1.389         | 19.0  | 30400 | 1.5155          | 0.5212 | 0.3348 | 0.4756 | 0.4751    | 26.0369 |
| 1.4215        | 20.0  | 32000 | 1.5158          | 0.5214 | 0.3347 | 0.4751 | 0.4746    | 25.9513 |
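Note that validation loss bottoms out around epoch 13 (1.5095) while ROUGE keeps inching upward through epoch 20, so the "best" checkpoint depends on which criterion you optimize. A small sketch of that selection, using a few rows copied from the table above:

```python
# (epoch, validation_loss, rouge1) -- values copied from the training results table
rows = [
    (12, 1.5106, 0.5156),
    (13, 1.5095, 0.5177),
    (14, 1.5163, 0.5185),
    (19, 1.5155, 0.5212),
    (20, 1.5158, 0.5214),
]

best_by_loss = min(rows, key=lambda r: r[1])    # epoch with lowest val loss
best_by_rouge = max(rows, key=lambda r: r[2])   # epoch with highest Rouge1
```

Here `best_by_loss` picks epoch 13 and `best_by_rouge` picks epoch 20; the card reports the final (epoch-20) checkpoint's metrics.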

### Framework versions

- Transformers 4.41.2
- PyTorch 2.3.1+cu118
- Datasets 2.20.0
- Tokenizers 0.19.1