long-t5-tglobal-xl-16384-booksci-summary-plos-10k

This model is a fine-tuned version of pszemraj/long-t5-tglobal-xl-16384-book-summary on the pszemraj/scientific_lay_summarisation-plos-norm dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5041
  • Rouge1: 44.3203
  • Rouge2: 11.0576
  • Rougel: 22.7584
  • Rougelsum: 40.1462
  • Gen Len: 256.66

Model description

Another experiment in further fine-tuning booksum-based models. This one was fine-tuned on the PLOS subset of the lay-summaries dataset for about 10k input examples, which makes it roughly comparable to the checkpoint fine-tuned on the ELIFE subset for two epochs (also around 10k examples).
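
A minimal usage sketch, assuming the checkpoint loads through the standard transformers seq2seq classes. The generation settings and the 16384-token input cap (read off the model name) are illustrative assumptions, not values documented in this card, and the `summarize` helper is hypothetical.

```python
# Hypothetical usage sketch; the generation settings below are assumptions.
MODEL_ID = "pszemraj/long-t5-tglobal-xl-16384-booksci-summary-plos-10k"
MAX_INPUT_TOKENS = 16384  # assumed from the "16384" in the model name

DEFAULT_GEN_KWARGS = {
    "max_new_tokens": 512,      # assumption, not taken from the card
    "num_beams": 4,             # assumption
    "no_repeat_ngram_size": 3,  # assumption
}


def summarize(text, model_id=MODEL_ID, **gen_kwargs):
    """Return a lay summary of `text`; extra kwargs override the defaults."""
    # Imported lazily so this module can be read without loading the XL weights.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    )
    settings = {**DEFAULT_GEN_KWARGS, **gen_kwargs}
    output_ids = model.generate(**inputs, **settings)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` downloads the XL checkpoint on first use, so a GPU with ample memory is likely needed in practice.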

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 165
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.02
  • num_epochs: 1.0
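
The relationship between these settings can be sketched in a few lines. The effective batch size follows directly from the list above (assuming a single device, which is consistent with 1 x 8 = 8). The schedule function mirrors the usual linear-warmup-then-cosine-decay shape; the total of roughly 1250 optimizer steps is an assumption inferred from the step and epoch columns of the training results, and `lr_at` is a hypothetical helper, not the trainer's own code.

```python
import math

TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 8
NUM_DEVICES = 1  # assumption: single device

# Examples consumed per optimizer step.
effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def lr_at(step, total_steps, base_lr=3e-5, warmup_ratio=0.02):
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1250  # assumed total optimizer steps for one epoch
    for s in (0, 10, 25, 625, 1250):
        print(s, lr_at(s, total))
```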

Training results

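| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| 1.7715        | 0.28  | 350  | 1.5310          | 43.4729 | 10.4616 | 22.1928 | 39.505    | 260.87  |
| 1.9307        | 0.56  | 700  | 1.5102          | 44.1634 | 10.9336 | 22.3896 | 40.2939   | 253.58  |
| 1.2981        | 0.84  | 1050 | 1.5046          | 44.2728 | 10.8455 | 22.4122 | 40.3019   | 261.29  |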
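
As a rough cross-check of the "about 10k examples" figure in the model description, the step and epoch columns can be combined with the total train batch size of 8. This is a back-of-the-envelope sketch: the epoch values are rounded to two decimals, so the results are approximate.

```python
EFFECTIVE_BATCH = 8  # total_train_batch_size from the hyperparameters

# (epoch, step) pairs taken from the training results
checkpoints = [(0.28, 350), (0.56, 700), (0.84, 1050)]


def examples_seen(step, batch=EFFECTIVE_BATCH):
    """Number of training examples consumed after `step` optimizer steps."""
    return step * batch


# Each row implies the same number of optimizer steps per epoch.
steps_per_epoch = [round(step / epoch) for epoch, step in checkpoints]

# Examples in one full epoch, estimated from the first row.
examples_per_epoch = examples_seen(steps_per_epoch[0])

if __name__ == "__main__":
    print(steps_per_epoch, examples_per_epoch)
```

Every row gives about 1250 steps per epoch, which at 8 examples per step comes to roughly 10,000 examples for the single training epoch.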