long-t5-tglobal-xl-16384-booksci-summary-v1

This model is a fine-tuned version of pszemraj/long-t5-tglobal-xl-16384-book-summary on the pszemraj/scientific_lay_summarisation-elife-norm dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7518
  • Rouge1: 47.4591
  • Rouge2: 12.7287
  • Rougel: 21.5549
  • Rougelsum: 44.8709
  • Gen Len: 384.39
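
For readers unfamiliar with the metrics, the sketch below shows the idea behind ROUGE-1 and ROUGE-2: an F1 score over overlapping word n-grams, scaled to 0-100 as the reported values appear to be. It is a simplified illustration, not the evaluation code used for this model. The function names are hypothetical, tokenization is plain whitespace splitting, and standard ROUGE implementations add steps such as stemming (and use longest common subsequence for ROUGE-L), so these numbers will not reproduce the scores above.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1 on a 0-100 scale (hypothetical helper).

    Assumption: lowercase + whitespace tokenization, no stemming.
    """
    ref = ngram_counts(reference.lower().split(), n)
    cand = ngram_counts(candidate.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100.0 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat lay on the mat"
    print(round(rouge_n_f1(ref, hyp, n=1), 2))
    print(round(rouge_n_f1(ref, hyp, n=2), 2))
```

ROUGE-2 is typically much lower than ROUGE-1 for long abstractive summaries, because matching word pairs is a stricter test than matching single words.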

Model description

An experiment in further fine-tuning a BookSum model on a second dataset. Compare its output with either the starting checkpoint (named above) or the variant fine-tuned only on the scientific lay summaries.

Intended uses & limitations

More information needed

Training and evaluation data

Training and evaluation used the pszemraj/scientific_lay_summarisation-elife-norm dataset. Inputs were truncated to 16384 tokens and target summaries to 1024 tokens.
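
The 16384-token input limit and 1024-token output limit can be mirrored at inference time. The following is a minimal sketch, assuming the `transformers` library with PyTorch is installed and there is enough memory for an XL-sized checkpoint. The `summarize` helper is hypothetical, and the generation settings (`num_beams`, `no_repeat_ngram_size`) are illustrative assumptions rather than values taken from this card.

```python
MODEL_ID = "pszemraj/long-t5-tglobal-xl-16384-booksci-summary-v1"
MAX_INPUT_TOKENS = 16384   # input truncation length stated in this card
MAX_OUTPUT_TOKENS = 1024   # target truncation length stated in this card


def summarize(text, max_new_tokens=MAX_OUTPUT_TOKENS):
    """Summarize `text` with the fine-tuned checkpoint (hypothetical helper)."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate the input to the same length used during fine-tuning.
    inputs = tokenizer(
        text,
        max_length=MAX_INPUT_TOKENS,
        truncation=True,
        return_tensors="pt",
    )
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=4,              # assumption: not specified in this card
        no_repeat_ngram_size=3,   # assumption: not specified in this card
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the checkpoint on first use.
    print(summarize("Paste the full text of a scientific article here."))
```

Text beyond the input limit is silently dropped by the tokenizer, so very long articles may need to be split or shortened before summarization.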

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 878
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.02
  • num_epochs: 2.0
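
To make the listed values concrete, the sketch below works out the effective batch size and the shape of a cosine schedule with linear warmup. It assumes a single device, 1086 total optimizer steps (the final step in the training results), warmup steps rounded up from `warmup_ratio * total_steps`, and decay to zero at the last step. The function names are hypothetical, and this is a plain-Python approximation of the schedule shape, not the training code itself.

```python
import math

LEARNING_RATE = 3e-5
WARMUP_RATIO = 0.02
TOTAL_STEPS = 1086       # assumption: final step from the training results
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 8


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         accum=GRAD_ACCUM_STEPS,
                         num_devices=1):
    """Examples consumed per optimizer step (assumes one device by default)."""
    return per_device * accum * num_devices


def lr_at(step, total_steps=TOTAL_STEPS, peak=LEARNING_RATE,
          warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 11, 22, 543, 1086):
        print(step, lr_at(step))
```

Under these assumptions the warmup covers the first 22 steps, after which the rate falls smoothly from 3e-05 toward zero over the remaining steps.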

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1.9629 | 1.0 | 543 | 1.7637 | 46.6926 | 12.4769 | 21.4364 | 44.4329 | 381.23 |
| 1.8555 | 2.0 | 1086 | 1.7518 | 47.4591 | 12.7287 | 21.5549 | 44.8709 | 384.39 |