LongT5-Large-NSPCC

This model is a fine-tuned version of google/long-t5-tglobal-large on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3710
  • Rouge1: 0.4978
  • Rouge2: 0.2091
  • Rougel: 0.2874
  • Rougelsum: 0.2871
  • Gen Len: 251.3511
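
A minimal sketch of loading this checkpoint for generation with the Transformers library. It rests on assumptions this card does not confirm: that the repository ID is `scott156/LongT5-Large-NSPCC-V2`, that the task is summarization (suggested by the ROUGE metrics above but not stated), and that no input prefix is required. The `summarize` helper and the `MAX_NEW_TOKENS` value are hypothetical names chosen for illustration.

```python
# Assumed repository ID; adjust if the checkpoint lives elsewhere.
MODEL_ID = "scott156/LongT5-Large-NSPCC-V2"

# Hypothetical generation cap, picked close to the reported Gen Len (~251 tokens).
MAX_NEW_TOKENS = 256


def summarize(text: str, model_id: str = MODEL_ID,
              max_new_tokens: int = MAX_NEW_TOKENS) -> str:
    """Generate an output sequence for `text` with the fine-tuned checkpoint."""
    # Imported lazily so this module loads even without transformers/torch installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize to PyTorch tensors and run default (greedy) generation.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("...long document text...")` downloads the weights on first use. The decoding settings behind the reported scores are not documented here, so outputs may not reproduce them.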

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 8
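
The batch-size and scheduler settings above interact, so a small pure-Python sketch may help. It assumes a single device (consistent with 1 × 4 = 4), a total of 1504 optimizer steps (the last step in the training results), warmup steps rounded up from the ratio, and the standard half-cosine decay to zero. The function names are illustrative, not part of any library.

```python
import math

LEARNING_RATE = 3e-4      # learning_rate
PER_DEVICE_BATCH = 1      # train_batch_size
GRAD_ACCUM_STEPS = 4      # gradient_accumulation_steps
WARMUP_RATIO = 0.03       # lr_scheduler_warmup_ratio
TOTAL_STEPS = 1504        # assumption: final optimizer step reported in the results


def effective_batch_size(n_devices: int = 1) -> int:
    """Examples consumed per optimizer step (n_devices=1 is an assumption)."""
    return PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * n_devices


def warmup_steps(total_steps: int = TOTAL_STEPS, ratio: float = WARMUP_RATIO) -> int:
    """Number of linear warmup steps, rounded up."""
    return math.ceil(total_steps * ratio)


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    warm = warmup_steps(total_steps)
    if step < warm:
        return LEARNING_RATE * step / max(1, warm)
    progress = (step - warm) / max(1, total_steps - warm)
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))
```

Under these assumptions the learning rate rises linearly for the first 46 steps, peaks at 0.0003, and decays to zero by step 1504.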

Training results

| Training Loss | Epoch  | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len  |
|---------------|--------|------|-----------------|--------|--------|--------|-----------|----------|
| 6.0401        | 0.9960 | 188  | 2.7089          | 0.2766 | 0.0617 | 0.1655 | 0.1657    | 151.7021 |
| 2.4805        | 1.9974 | 377  | 1.8809          | 0.382  | 0.1178 | 0.2092 | 0.2092    | 211.1809 |
| 1.8093        | 2.9987 | 566  | 1.5769          | 0.4356 | 0.1527 | 0.2409 | 0.2409    | 246.1277 |
| 1.4653        | 4.0    | 755  | 1.4359          | 0.4661 | 0.1722 | 0.26   | 0.2603    | 245.0851 |
| 1.2626        | 4.9960 | 943  | 1.3908          | 0.4829 | 0.1931 | 0.2717 | 0.2717    | 239.8617 |
| 1.117         | 5.9974 | 1132 | 1.3724          | 0.4864 | 0.1988 | 0.2804 | 0.2804    | 244.4255 |
| 1.0404        | 6.9987 | 1321 | 1.3714          | 0.4914 | 0.2007 | 0.2826 | 0.2821    | 248.6915 |
| 1.0065        | 7.9682 | 1504 | 1.3710          | 0.4978 | 0.2091 | 0.2874 | 0.2871    | 251.3511 |
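
To read the trend in these results, the short sketch below computes the per-epoch drop in validation loss, the checkpoint with the lowest validation loss, and the gap between validation and training loss. The numbers are copied from the results above; the helper names are illustrative only.

```python
# (step, training_loss, validation_loss, rouge1), copied from the training results.
ROWS = [
    (188, 6.0401, 2.7089, 0.2766),
    (377, 2.4805, 1.8809, 0.382),
    (566, 1.8093, 1.5769, 0.4356),
    (755, 1.4653, 1.4359, 0.4661),
    (943, 1.2626, 1.3908, 0.4829),
    (1132, 1.117, 1.3724, 0.4864),
    (1321, 1.0404, 1.3714, 0.4914),
    (1504, 1.0065, 1.3710, 0.4978),
]


def validation_loss_deltas(rows):
    """Decrease in validation loss between consecutive evaluations."""
    losses = [row[2] for row in rows]
    return [round(prev - cur, 4) for prev, cur in zip(losses, losses[1:])]


def best_checkpoint(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def train_val_gap(row):
    """Validation loss minus training loss for one evaluation row."""
    return round(row[2] - row[1], 4)
```

The deltas shrink from 0.828 after the second epoch to 0.0004 after the last, so validation loss has largely plateaued by epochs 7 and 8 even though ROUGE-1 still rises slightly. The final row has the lowest validation loss, with a validation-minus-training gap of 0.3645.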

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model size: 783M params (Safetensors, tensor type F32)